coremltools-8.0/LICENSE.txt
---------------------------

Copyright © 2020-2023, Apple Inc. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

3. Neither the name of the copyright holder(s) nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

coremltools-8.0/MANIFEST.in
---------------------------

include README.md

coremltools-8.0/NOTICE.txt
--------------------------

Copyright © 2020-2023, Apple Inc. All rights reserved.

This project contains content adapted from kmeans1d (https://github.com/dstein64/kmeans1d), the license for which follows:

MIT License

Copyright (c) 2019 Daniel Steinberg

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
This project contains content in the files coremltools/optimize/torch/layerwise_compression/_quant.py, coremltools/optimize/torch/layerwise_compression/algorithms.py, and coremltools/optimize/torch/layerwise_compression/layerwise_compressor.py which are adapted from gptq (https://github.com/IST-DASLab/gptq/). It also contains content in the file coremltools/optimize/torch/layerwise_compression/algorithms.py which is adapted from sparsegpt (https://github.com/IST-DASLab/sparsegpt). The license for these follows:

Apache License 2.0

Copyright 2023 IST Austria Distributed Algorithms and Systems Lab

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

This project contains content in the files coremltools/optimize/torch/quantization/modules/conv_transpose.py and coremltools/optimize/torch/quantization/modules/conv_transpose_fused.py which are adapted from pytorch (https://github.com/pytorch/). The license for these follows:

Copyright (c) 2016 Facebook, Inc (Adam Paszke)
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

3. Neither the names of Facebook, Deepmind Technologies, NYU, NEC Laboratories America and IDIAP Research Institute nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

coremltools-8.0/PKG-INFO
------------------------

Metadata-Version: 2.1
Name: coremltools
Version: 8.0
Summary: Community Tools for Core ML
Home-page: https://github.com/apple/coremltools
Author: Apple Inc.
Author-email: coremltools@apple.com
License: BSD
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Software Development
License-File: LICENSE.txt
License-File: NOTICE.txt
Requires-Dist: numpy>=1.14.5
Requires-Dist: protobuf>=3.1.0
Requires-Dist: sympy
Requires-Dist: tqdm
Requires-Dist: packaging
Requires-Dist: attrs>=21.3.0
Requires-Dist: cattrs
Requires-Dist: pyaml

coremltools
===========

`Core ML `_ is an Apple framework that allows developers to easily integrate machine learning (ML) models into apps. Core ML is available on iOS, iPadOS, watchOS, macOS, and tvOS. Core ML introduces a public file format (.mlmodel) for a broad set of ML methods including deep neural networks (convolutional and recurrent), tree ensembles (boosted trees, random forest, decision trees), and generalized linear models. Core ML models can be directly integrated into apps within Xcode.

:code:`coremltools` is a Python package for creating, examining, and testing models in the .mlmodel format. In particular, it can be used to:

- Convert trained models from popular machine learning tools into Core ML format (.mlmodel).
- Write models to Core ML format with a simple API.
- Make predictions using the Core ML framework (on select platforms) to verify conversion.

More Information
----------------

- `coremltools user guide and examples `_
- `Core ML framework documentation `_
- `Machine learning at Apple `_

License
-------

Copyright (c) 2020, Apple Inc. All rights reserved.

Use of this source code is governed by the `3-Clause BSD License `_ that can be found in the LICENSE.txt file.

coremltools-8.0/README.md
-------------------------

[![Build Status](https://img.shields.io/gitlab/pipeline/coremltools1/coremltools/main)](https://gitlab.com/coremltools1/coremltools/-/pipelines?page=1&scope=branches&ref=main) [![PyPI Release](https://img.shields.io/pypi/v/coremltools.svg)](#) [![Python Versions](https://img.shields.io/pypi/pyversions/coremltools.svg)](#)

[Core ML Tools](https://apple.github.io/coremltools/docs-guides/source/overview-coremltools.html)
=======================

![Core ML Tools logo](docs/logo.png)

Use [Core ML Tools](https://apple.github.io/coremltools/docs-guides/source/overview-coremltools.html) (*coremltools*) to convert machine learning models from third-party libraries to the Core ML format. This Python package contains the supporting tools for converting models from training libraries such as the following:

* [TensorFlow 1.x](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf)
* [TensorFlow 2.x](https://www.tensorflow.org/api_docs)
* [PyTorch](https://pytorch.org/)
* Non-neural network frameworks:
  * [scikit-learn](https://scikit-learn.org/stable/)
  * [XGBoost](https://xgboost.readthedocs.io/en/latest/)
  * [LibSVM](https://www.csie.ntu.edu.tw/~cjlin/libsvm/)

With coremltools, you can:

* Convert trained models to the Core ML format.
* Read, write, and optimize Core ML models.
* Verify conversion/creation (on macOS) by making predictions using Core ML.

After conversion, you can integrate the Core ML models with your app using Xcode.

## Install 8.0 Beta

The [coremltools version 8 beta 2](https://github.com/apple/coremltools/releases/tag/8.0b2) is now out. To install, run the following command in your terminal:

```shell
pip install coremltools==8.0b2
```

## Install Version 7.2

To install the latest non-beta version, run the following command in your terminal:

```shell
pip install -U coremltools
```

## Core ML

[Core ML](https://developer.apple.com/documentation/coreml) is an Apple framework to integrate machine learning models into your app. Core ML provides a unified representation for all models. Your app uses Core ML APIs and user data to make predictions, and to fine-tune models, all on the user’s device.

Core ML optimizes on-device performance by leveraging the CPU, GPU, and Neural Engine while minimizing its memory footprint and power consumption. Running a model strictly on the user’s device removes any need for a network connection, which helps keep the user’s data private and your app responsive.

## Resources

To install coremltools, see [Installing Core ML Tools](https://apple.github.io/coremltools/docs-guides/source/installing-coremltools.html). For more information, see the following:

* [Release Notes](https://github.com/apple/coremltools/releases/)
* [Guide and examples](https://apple.github.io/coremltools/docs-guides/index.html)
* [API Reference](https://apple.github.io/coremltools/index.html)
* [Core ML Specification](https://apple.github.io/coremltools/mlmodel/index.html)
* [Building from Source](BUILDING.md)
* [Contribution Guidelines](CONTRIBUTING.md)

coremltools-8.0/coremltools/__init__.py
---------------------------------------

# Copyright (c) 2017, Apple Inc. All rights reserved.
#
# Use of this source code is governed by a BSD-3-clause license that can be
# found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause

"""
Core ML is an Apple framework which allows developers to simply and easily integrate
machine learning (ML) models into apps running on Apple devices (including iOS, watchOS,
macOS, and tvOS). Core ML introduces a public file format (.mlmodel) for a broad set of
ML methods including deep neural networks (both convolutional and recurrent), tree
ensembles with boosting, and generalized linear models. Models in this format can be
directly integrated into apps through Xcode.

Coremltools is a python package for creating, examining, and testing models in the
.mlpackage and .mlmodel formats. In particular, it can be used to:

* Convert existing models to .mlpackage or .mlmodel formats from popular machine
  learning tools including: PyTorch, TensorFlow, scikit-learn, XGBoost and libsvm.
* Express models in .mlpackage and .mlmodel formats through a simple API.
* Make predictions with .mlpackage and .mlmodel files (on macOS).
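A typical workflow looks roughly like the following. This is an illustrative sketch
only (it mirrors the examples in the ``convert()`` docstring further below, assumes
torch and torchvision are installed, and uses a hypothetical output file name):

    import coremltools as ct
    import torch
    import torchvision

    # Trace a PyTorch model so it can be converted.
    torch_model = torchvision.models.mobilenet_v2().eval()
    example_input = torch.rand(1, 3, 224, 224)
    traced_model = torch.jit.trace(torch_model, example_input)

    # Convert to an ML program (the default target) and save it as an .mlpackage.
    mlmodel = ct.convert(
        traced_model,
        inputs=[ct.TensorType(name="input", shape=example_input.shape)],
    )
    mlmodel.save("MobileNetV2.mlpackage")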
For more information: http://developer.apple.com/documentation/coreml """ from enum import Enum as _Enum from logging import getLogger as _getLogger from .version import __version__ _logger = _getLogger(__name__) # This is the basic Core ML specification format understood by iOS 11.0 SPECIFICATION_VERSION = 1 # New versions for iOS 11.2 features. Models which use these features should have these # versions, but models created from this coremltools which do not use the features can # still have the basic version. _MINIMUM_CUSTOM_LAYER_SPEC_VERSION = 2 _MINIMUM_FP16_SPEC_VERSION = 2 # New versions for iOS 12.0 features. Models which use these features should have these # versions, but models created from this coremltools which do not use the features can # still have the basic version. _MINIMUM_CUSTOM_MODEL_SPEC_VERSION = 3 _MINIMUM_QUANTIZED_MODEL_SPEC_VERSION = 3 _MINIMUM_FLEXIBLE_SHAPES_SPEC_VERSION = 3 # New versions for iOS 13.0. _MINIMUM_NDARRAY_SPEC_VERSION = 4 _MINIMUM_NEAREST_NEIGHBORS_SPEC_VERSION = 4 _MINIMUM_LINKED_MODELS_SPEC_VERSION = 4 _MINIMUM_UPDATABLE_SPEC_VERSION = 4 _SPECIFICATION_VERSION_IOS_13 = 4 # New versions for iOS 14.0 _SPECIFICATION_VERSION_IOS_14 = 5 # New versions for iOS 15.0 _SPECIFICATION_VERSION_IOS_15 = 6 # New versions for iOS 16.0 _SPECIFICATION_VERSION_IOS_16 = 7 # New versions for iOS 17.0 _SPECIFICATION_VERSION_IOS_17 = 8 # New versions for iOS 18.0 _SPECIFICATION_VERSION_IOS_18 = 9 class ComputeUnit(_Enum): ''' The set of processing-unit configurations the model can use to make predictions. ''' ALL = 1 # Allows model to use all compute units available, including the neural engine. CPU_AND_GPU = 2 # Allows model to use both the CPU and GPU, but not the neural engine. CPU_ONLY = 3 # Limits model to only use the CPU. CPU_AND_NE = 4 # Allows model to use both the CPU and neural engine, but not the GPU. # Only available on macOS >= 13.0 class ReshapeFrequency(_Enum): ''' https://developer.apple.com/documentation/coreml/mlreshapefrequencyhint?language=objc ''' Frequent = 1 Infrequent = 2 class SpecializationStrategy(_Enum): ''' The optimization strategy for the model specialization. https://developer.apple.com/documentation/coreml/mlspecializationstrategy?language=objc ''' # The strategy that works well for most applications. Default = 1 # Prefer the prediction latency at the potential cost of specialization time, memory footprint, # and the disk space usage of specialized artifacts. FastPrediction = 2 # A dictionary that maps the CoreML model specification version to the MLProgram/MIL opset string _OPSET = { _SPECIFICATION_VERSION_IOS_13: "CoreML3", _SPECIFICATION_VERSION_IOS_14: "CoreML4", _SPECIFICATION_VERSION_IOS_15: "CoreML5", _SPECIFICATION_VERSION_IOS_16: "CoreML6", _SPECIFICATION_VERSION_IOS_17: "CoreML7", _SPECIFICATION_VERSION_IOS_18: "CoreML8", } # Default specification version for each backend _LOWEST_ALLOWED_SPECIFICATION_VERSION_FOR_NEURALNETWORK = _SPECIFICATION_VERSION_IOS_13 _LOWEST_ALLOWED_SPECIFICATION_VERSION_FOR_MILPROGRAM = _SPECIFICATION_VERSION_IOS_15 # expose sub packages as directories from . 
import converters, models, optimize, proto # expose unified converter in coremltools package level from .converters import ClassifierConfig from .converters import ColorLayout as colorlayout from .converters import EnumeratedShapes, ImageType, RangeDim, Shape, StateType, TensorType, convert from .converters.mil._deployment_compatibility import AvailableTarget as target from .converters.mil.mil.passes.defs import quantization as transform from .converters.mil.mil.passes.defs.quantization import ComputePrecision as precision from .converters.mil.mil.passes.pass_pipeline import PassPipeline from .models import utils from .models.ml_program import compression_utils try: from . import libcoremlpython except: pass # Time profiling for functions in coremltools package, decorated with @profile import os as _os import sys as _sys from .converters._profile_utils import _profiler _ENABLE_PROFILING = _os.environ.get("ENABLE_PROFILING", False) if _ENABLE_PROFILING: _sys.setprofile(_profiler) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2055464 coremltools-8.0/coremltools/_deps/0000755000000000000000000000000014672075535016133 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/_deps/__init__.py0000644000000000000000000001542214672066616020250 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause """ List of all external dependencies for this package. Imported as optional includes """ import platform as _platform import re as _re import sys as _sys from packaging.version import Version from coremltools import _logger as logger _HAS_KMEANS1D = True try: from . import kmeans1d as _kmeans1d except: _kmeans1d = None _HAS_KMEANS1D = False def _get_version(version): # matching 1.6.1, and 1.6.1rc, 1.6.1.dev version_regex = r"^\d+\.\d+\.\d+" version = _re.search(version_regex, str(version)).group(0) return Version(version) def _warn_if_above_max_supported_version(package_name, package_version, max_supported_version): if _get_version(package_version) > Version(max_supported_version): logger.warning( "%s version %s has not been tested with coremltools. You may run into unexpected errors. " "%s %s is the most recent version that has been tested." % (package_name, package_version, package_name, max_supported_version) ) # --------------------------------------------------------------------------------------- _IS_MACOS = _sys.platform == "darwin" _MACOS_VERSION = () if _IS_MACOS: ver_str = _platform.mac_ver()[0] MACOS_VERSION = tuple([int(v) for v in ver_str.split(".")]) MSG_ONLY_MACOS = "Only supported on macOS" # --------------------------------------------------------------------------------------- _HAS_SKLEARN = True _SKLEARN_VERSION = None _SKLEARN_MIN_VERSION = "0.17" _SKLEARN_MAX_VERSION = "1.5.1" def __get_sklearn_version(version): # matching 0.15b, 0.16bf, etc version_regex = r"^\d+\.\d+" version = _re.search(version_regex, str(version)).group(0) return Version(version) try: import sklearn _SKLEARN_VERSION = __get_sklearn_version(sklearn.__version__) if _SKLEARN_VERSION < Version( _SKLEARN_MIN_VERSION ) or _SKLEARN_VERSION > Version(_SKLEARN_MAX_VERSION): _HAS_SKLEARN = False logger.warning( ( "scikit-learn version %s is not supported. Minimum required version: %s. " "Maximum required version: %s. 
" "Disabling scikit-learn conversion API." ) % (sklearn.__version__, _SKLEARN_MIN_VERSION, _SKLEARN_MAX_VERSION) ) except: _HAS_SKLEARN = False MSG_SKLEARN_NOT_FOUND = "Sklearn not found." # --------------------------------------------------------------------------------------- _HAS_LIBSVM = True try: from libsvm import svm except: _HAS_LIBSVM = False MSG_LIBSVM_NOT_FOUND = "Libsvm not found." # --------------------------------------------------------------------------------------- _HAS_XGBOOST = True _XGBOOST_MAX_VERSION = "1.4.2" try: import xgboost _warn_if_above_max_supported_version("XGBoost", xgboost.__version__, _XGBOOST_MAX_VERSION) except: _HAS_XGBOOST = False # --------------------------------------------------------------------------------------- _HAS_TF = True _HAS_TF_1 = False _HAS_TF_2 = False _TF_1_MIN_VERSION = "1.12.0" _TF_1_MAX_VERSION = "1.15.4" _TF_2_MIN_VERSION = "2.1.0" _TF_2_MAX_VERSION = "2.12.0" try: import tensorflow tf_ver = _get_version(tensorflow.__version__) # TensorFlow if tf_ver < Version("2.0.0"): _HAS_TF_1 = True if tf_ver >= Version("2.0.0"): _HAS_TF_2 = True if _HAS_TF_1: if tf_ver < Version(_TF_1_MIN_VERSION): logger.warning( ( "TensorFlow version %s is not supported. Minimum required version: %s ." "TensorFlow conversion will be disabled." ) % (tensorflow.__version__, _TF_1_MIN_VERSION) ) _warn_if_above_max_supported_version("TensorFlow", tensorflow.__version__, _TF_1_MAX_VERSION) elif _HAS_TF_2: if tf_ver < Version(_TF_2_MIN_VERSION): logger.warning( ( "TensorFlow version %s is not supported. Minimum required version: %s ." "TensorFlow conversion will be disabled." ) % (tensorflow.__version__, _TF_2_MIN_VERSION) ) _warn_if_above_max_supported_version("TensorFlow", tensorflow.__version__, _TF_2_MAX_VERSION) except: _HAS_TF = False _HAS_TF_1 = False _HAS_TF_2 = False MSG_TF1_NOT_FOUND = "TensorFlow 1.x not found." MSG_TF2_NOT_FOUND = "TensorFlow 2.x not found." # --------------------------------------------------------------------------------------- _HAS_TORCH = True _TORCH_MAX_VERSION = "2.4.0" _HAS_TORCH_EXPORT_API = False _CT_OPTIMIZE_TORCH_MIN_VERSION = "2.1.0" _IMPORT_CT_OPTIMIZE_TORCH = False try: import torch _warn_if_above_max_supported_version("Torch", torch.__version__, _TORCH_MAX_VERSION) torch_version = _get_version(torch.__version__) if torch_version >= Version("2.1.0"): _HAS_TORCH_EXPORT_API = True if torch_version >= Version(_CT_OPTIMIZE_TORCH_MIN_VERSION): _IMPORT_CT_OPTIMIZE_TORCH = True else: logger.warning( ( f"Minimum required torch version for importing coremltools.optimize.torch is {_CT_OPTIMIZE_TORCH_MIN_VERSION}. " f"Got torch version {torch_version}." ) ) except: _HAS_TORCH = False MSG_TORCH_NOT_FOUND = "PyTorch not found." MSG_TORCH_EXPORT_API_NOT_FOUND = "Torch.Export API not found." _HAS_TORCH_VISION = True try: import torchvision except: _HAS_TORCH_VISION = False MSG_TORCH_VISION_NOT_FOUND = "TorchVision not found." _HAS_TORCH_AUDIO = True try: import torchaudio except: _HAS_TORCH_AUDIO = False MSG_TORCH_AUDIO_NOT_FOUND = "TorchAudio not found." _HAS_EXECUTORCH = True try: import executorch except: _HAS_EXECUTORCH = False MSG_EXECUTORCH_NOT_FOUND = "Executorch not found." _HAS_TORCHAO = True try: import torchao except: _HAS_TORCHAO = False MSG_TORCHAO_NOT_FOUND = "Torchao not found." 
# --------------------------------------------------------------------------------------- try: import scipy except: _HAS_SCIPY = False else: _HAS_SCIPY = True # --------------------------------------------------------------------------------------- try: import transformers except: _HAS_HF = False else: _HAS_HF = True # General utils def version_ge(module, target_version): """ Example usage: >>> import torch # v1.5.0 >>> version_ge(torch, '1.6.0') # False """ return Version(module.__version__) >= Version(target_version) def version_lt(module, target_version): """See version_ge""" return Version(module.__version__) < Version(target_version) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2055464 coremltools-8.0/coremltools/converters/0000755000000000000000000000000014672075535017233 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/__init__.py0000644000000000000000000000073714672066616021353 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause # expose directories as imports from . import libsvm, sklearn, xgboost from ._converters_entry import convert from .mil import ( ClassifierConfig, ColorLayout, EnumeratedShapes, ImageType, RangeDim, Shape, StateType, TensorType, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/_converters_entry.py0000644000000000000000000013575114672066616023373 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import collections import gc import os from typing import List, Optional, Text, Union from coremltools import ( _LOWEST_ALLOWED_SPECIFICATION_VERSION_FOR_MILPROGRAM, _LOWEST_ALLOWED_SPECIFICATION_VERSION_FOR_NEURALNETWORK, ) from coremltools import ComputeUnit as _ComputeUnit from coremltools import __version__ as _ct_version from coremltools import _logger as logger from coremltools._deps import _HAS_TF_1, _HAS_TF_2, _HAS_TORCH, _HAS_TORCH_EXPORT_API from coremltools.converters._profile_utils import _profile from coremltools.converters.mil._deployment_compatibility import ( AvailableTarget, check_deployment_compatibility, ) from coremltools.converters.mil.converter import mil_convert from coremltools.converters.mil.input_types import ( ClassifierConfig, EnumeratedShapes, ImageType, InputType, RangeDim, Shape, StateType, TensorType, ) from coremltools.converters.mil.mil import Program, types from coremltools.converters.mil.mil.passes.defs.quantization import ComputePrecision as precision from coremltools.converters.mil.mil.passes.defs.quantization import FP16ComputePrecision from coremltools.converters.mil.mil.passes.graph_pass import PassOption as _PassOption from coremltools.converters.mil.mil.passes.pass_pipeline import PassPipeline from coremltools.models import _METADATA_SOURCE, _METADATA_SOURCE_DIALECT, _METADATA_VERSION from coremltools.models.utils import _MLPACKAGE_EXTENSION if _HAS_TF_1: import tensorflow as tf from coremltools.converters.mil.frontend.tensorflow.load import TF1Loader if _HAS_TF_2: import tensorflow as tf from coremltools.converters.mil.frontend.tensorflow2.load import TF2Loader if _HAS_TORCH: import torch from coremltools.converters.mil.frontend.torch.load import is_torch_model if _HAS_TORCH_EXPORT_API: from torch.export import ExportedProgram @_profile def convert( model, source="auto", inputs=None, outputs=None, classifier_config=None, minimum_deployment_target=None, convert_to=None, compute_precision=None, skip_model_load=False, compute_units=_ComputeUnit.ALL, package_dir=None, debug=False, pass_pipeline: Optional[PassPipeline] = None, states=None, ): """ Convert a TensorFlow or PyTorch model to the Core ML model format as either a neural network or an `ML program `_. Some parameters and requirements differ for TensorFlow and PyTorch conversions. Parameters ---------- model : TensorFlow 1, TensorFlow 2, or PyTorch model in one of the following formats: * TensorFlow versions 1.x - Frozen `tf.Graph `_ - Frozen graph (``.pb``) file path - `tf.keras.Model `_ - `HDF5 `_ file path (``.h5``) - `SavedModel `_ directory path * TensorFlow versions 2.x - `tf.keras.Model `_ - `HDF5 file path `_ (``.h5``) - `SavedModel `_ directory path - A `concrete function `_ - A `GraphDef `_ * PyTorch - TorchScript Models: - A `TorchScript `_ object - Path to a ``.pt`` file - Torch Exported Models: - An `ExportedProgram `_ object with ``EDGE`` dialect. source : str (optional) One of [``auto``, ``tensorflow``, ``pytorch``, ``milinternal``]. ``auto`` determines the framework automatically for most cases. Raises ``ValueError`` if it fails to determine the source framework. inputs : list of ``TensorType`` or ``ImageType`` * If you specify ``dtype`` with ``TensorType`` or ``ImageType``, it will be applied to the input of the converted model. 
For example, the following code snippet will produce a Core ML model with float 16 typed inputs. .. sourcecode:: python import coremltools as ct mlmodel = ct.convert( keras_model, inputs=[ct.TensorType(dtype=np.float16)], minimum_deployment_target=ct.target.macOS13, ) * The following code snippet will produce a Core ML model with the ``GRAYSCALE_FLOAT16`` input image type: .. sourcecode:: python import coremltools as ct # H : image height, W: image width mlmodel = ct.convert( torch_model, inputs=[ ct.ImageType(shape=(1, 1, H, W), color_layout=ct.colorlayout.GRAYSCALE_FLOAT16) ], minimum_deployment_target=ct.target.macOS13, ) * TensorFlow 1 and 2 (including tf.keras): - The ``inputs`` parameter is optional. If not provided, the inputs are placeholder nodes in the model (if the model is a frozen graph) or function inputs (if the model is a ``tf.function``). - If ``inputs`` is provided, it must be a flat list. - The ``inputs`` must correspond to all or some of the placeholder nodes in the TF model. - If ``name`` is specified with ``TensorType`` and ``ImageType``, it must correspond to a placeholder op in the TF graph. The input names in the converted Core ML model can later be modified using the ``ct.utils.rename_feature`` API. - If ``dtype`` is not specified, it defaults to the ``dtype`` of the inputs in the TF model. - For ``minimum_deployment_target >= ct.target.macOS13``, and with ``compute_precision`` in float 16 precision. When ``inputs`` not provided or ``dtype`` not specified, the float 32 inputs default to float 16. * PyTorch: - TorchScript Models: - The ``inputs`` parameter is required. - Number of elements in ``inputs`` must match the number of inputs of the PyTorch model. - ``inputs`` may be a nested list or tuple. - ``TensorType`` and ``ImageType`` must have the ``shape`` specified. - If the ``name`` argument is specified with ``TensorType`` or ``ImageType``, the converted Core ML model will have inputs with the same name. - If ``dtype`` is missing: * For ``minimum_deployment_target <= ct.target.macOS12``, it defaults to float 32. * For ``minimum_deployment_target >= ct.target.macOS13``, and with ``compute_precision`` in float 16 precision. It defaults to float 16. - Torch Exported Models: - The ``inputs`` parameter is not supported. - The ``inputs`` parameter is inferred from the Torch `ExportedProgram `_. outputs : list of ``TensorType`` or ``ImageType`` (optional) * If you specify ``dtype`` with ``TensorType`` or ``ImageType``, it will be applied to the output of the converted model. For example, to produce float 16 typed inputs and outputs: .. sourcecode:: python import coremltools as ct mlmodel = ct.convert( keras_model, inputs=[ct.TensorType(dtype=np.float16)], outputs=[ct.TensorType(dtype=np.float16)], minimum_deployment_target=ct.target.macOS13, ) * To produce image inputs and outputs: .. sourcecode:: python import coremltools as ct # H: image height, W: image width mlmodel = ct.convert( torch_model, inputs=[ct.ImageType(shape=(1, 3, H, W), color_layout=ct.colorlayout.RGB)], outputs=[ct.ImageType(color_layout=ct.colorlayout.RGB)], minimum_deployment_target=ct.target.macOS13, ) * TensorFlow 1 and 2 (including tf.keras): - If ``outputs`` is not specified, the converter infers outputs from the sink nodes in the graph. - If specified, the ``name`` with ``TensorType`` or ``ImageType`` must correspond to a node in the TF graph. In this case, the model will be converted up to that node. 
- For ``minimum_deployment_target >= ct.target.macOS13``, and with ``compute_precision`` in float 16 precision. If ``dtype`` not specified, the outputs inferred of type float 32 default to float 16. * PyTorch: TorchScript Models - If specified, the length of the list must match the number of outputs returned by the PyTorch model. - If ``name`` is specified, it is applied to the output names of the converted Core ML model. - For ``minimum_deployment_target >= ct.target.macOS13``, and with ``compute_precision`` in float 16 precision. - If ``dtype`` not specified, the outputs inferred of type float 32 default to float 16. * PyTorch: Torch Exported Models: - The ``outputs`` parameter is not supported. - The ``outputs`` parameter is inferred from Torch `ExportedProgram `_. classifier_config : ClassifierConfig class (optional) The configuration if the MLModel is intended to be a classifier. minimum_deployment_target : coremltools.target enumeration (optional) A member of the ``coremltools.target`` enum. The value of this parameter determines the type of the model representation produced by the converter. To learn about the differences between ML programs and neural networks, see `ML Programs `_. - The converter produces a neural network (``neuralnetwork``) if: .. sourcecode:: python minimum_deployment_target <= coremltools.target.iOS14/ coremltools.target.macOS11/ coremltools.target.watchOS7/ coremltools.target.tvOS14: - The converter produces an ML program (``mlprogram``) if: .. sourcecode:: python minimum_deployment_target >= coremltools.target.iOS15/ coremltools.target.macOS12/ coremltools.target.watchOS8/ coremltools.target.tvOS15: - If neither the ``minimum_deployment_target`` nor the ``convert_to`` parameter is specified, the converter produces an ML program model type with as minimum of a deployment target as possible. - If this parameter is specified and ``convert_to`` is also specified, they must be compatible. The following are examples of invalid values: .. sourcecode:: python # Invalid: convert_to="mlprogram", minimum_deployment_target=coremltools.target.iOS14 # Invalid: convert_to="neuralnetwork", minimum_deployment_target=coremltools.target.iOS15 convert_to : str (optional) Must be one of [``'mlprogram'``, ``'neuralnetwork'``, ``'milinternal'``]. The value of this parameter determines the type of the model representation produced by the converter. To learn about the differences between ML programs and neural networks, see `ML Programs `_. - ``'mlprogram'`` : Returns an MLModel (``coremltools.models.MLModel``) containing a MILSpec.Program proto, which is the Core ML program format. The model saved from this returned object is executable on iOS15, macOS12, watchOS8, and tvOS15. - ``'neuralnetwork'``: Returns an MLModel (``coremltools.models.MLModel``) containing a NeuralNetwork proto, which is the original Core ML format. The model saved from this returned object is executable either on iOS13/macOS10.15/watchOS6/tvOS13 and newer, or on iOS14/macOS11/watchOS7/tvOS14 and newer, depending on the layers used in the model. - ``'milinternal'``: Returns an MIL program object (``coremltools.converters.mil.Program``). An MIL program is primarily used for debugging and inspection. It can be converted to an MLModel for execution by using one of the following: .. 
sourcecode:: python ct.convert(mil_program, convert_to="neuralnetwork") ct.convert(mil_program, convert_to="mlprogram") - If neither the ``minimum_deployment_target`` nor the ``convert_to`` parameter is specified, the converter produces the ML programs model type with as minimum of a deployment target as possible. compute_precision : coremltools.precision enumeration or ct.transform.FP16ComputePrecision() (optional) Use this argument to control the storage precision of the tensors in the ML program. Must be one of the following. - ``coremltools.precision.FLOAT16`` enum: The following transform is applied to produce a float 16 program; that is, a program in which all the intermediate float tensors are of type float 16 (for ops that support that type). .. sourcecode:: python coremltools.transform.FP16ComputePrecision(op_selector= lambda op:True) The above transform iterates through all the ops, looking at each op's inputs and outputs. If they are of type float 32, ``cast`` ops are injected to convert those tensors (also known as `vars`) to type float 16. Similarly, int32 vars will also be cast to int16. - ``coremltools.precision.FLOAT32`` enum: No transform is applied. The original float32 tensor dtype in the source model is preserved. Opt into this option if the default converted model is displaying numerical precision issues. - ``coremltools.transform.FP16ComputePrecision(op_selector=...)`` Use this option to control which tensors are cast to float 16. Before casting the inputs/outputs of any op from float32 to float 16, the op_selector function is invoked on the op object. This function must return a boolean value. By default it returns ``True`` for every op, but you can customize this. For example: .. sourcecode:: python coremltools.transform.FP16ComputePrecision(op_selector= lambda op: op.op_type != "linear") The above casts all the float32 tensors to be float 16, except the input/output tensors to any ``linear`` op. See more examples below. - ``None``: The default - When ``convert_to="mlprogram"``, the ``compute_precision`` parameter defaults to ``coremltools.precision.FLOAT16``. - When ``convert_to="neuralnetwork"``, the ``compute_precision`` parameter needs to be ``None`` and has no meaning. - For example, you can customize the float 16 precision transform to prevent casting all the ``real_div`` ops in the program to float 16 precision: .. sourcecode:: python def skip_real_div_ops(op): if op.op_type == "real_div": return False return True model = ct.convert( source_model, compute_precision=ct.transform.FP16ComputePrecision(op_selector=skip_real_div_ops), minimum_deployment_target=ct.target.iOS15, ) skip_model_load : bool Set to ``True`` to prevent coremltools from calling into the Core ML framework to compile and load the model, post-conversion. In that case, the returned model object cannot be used to make a prediction, but can be used to save with ``model.save()``. This flag may be used to convert to a newer model type on an older Mac, which may raise a runtime warning if done without turning this flag on. Example: Use this flag to suppress a runtime warning when converting to an ML program model on macOS 11, since an ML program can only be compiled and loaded from macOS12+. Defaults to ``False``. compute_units: coremltools.ComputeUnit The set of processing units the model can use to make predictions. After conversion, the model is loaded with the provided set of compute units and returned. 
An enum with the following possible values: * ``coremltools.ComputeUnit.ALL``: Use all compute units available, including the neural engine. * ``coremltools.ComputeUnit.CPU_ONLY``: Limit the model to only use the CPU. * ``coremltools.ComputeUnit.CPU_AND_GPU``: Use both the CPU and GPU, but not the neural engine. * ``coremltools.ComputeUnit.CPU_AND_NE``: Use both the CPU and neural engine, but not the GPU. Available only for macOS >= 13.0. package_dir : str Post conversion, the model is saved at a temporary location and loaded to form the MLModel object ready for prediction. * If ``package_dir`` is provided, model will be saved at this location rather than creating a temporary directory. * If not ``None``, this must be a path to a directory with the extension ``.mlpackage``. debug : bool This flag should generally be ``False`` except for debugging purposes. Setting this flag to ``True`` produces the following behavior: * For Torch conversion, it will print the list of supported and unsupported ops found in the model if conversion fails due to an unsupported op. * For Tensorflow conversion, it will cause to display extra logging and visualizations. pass_pipeline : PassPipeline Manage graph passes. You can control which graph passes to run and the order of the graph passes. You can also specify options for each pass. See the details in the docstring of PassPipeline (``coremltools/converters/mil/mil/passes/pass_pipeline.py``). * To avoid fusing the ``conv`` and ``batchnorm`` ops, skip the corresponding pass as shown in the following example: .. sourcecode:: python pipeline = ct.PassPipeline() pipeline.remove_passes({"common::fuse_conv_batchnorm"}) mlmodel = ct.convert(model, pass_pipeline=pipeline) * To avoid folding too-large ``const`` ops that lead to a large model, set pass option as shown in the following example: .. sourcecode:: python pipeline = ct.PassPipeline() pipeline.set_options("common::const_elimination", {"skip_const_by_size": "1e6"}) mlmodel = ct.convert(model, pass_pipeline=pipeline) We also provide a set of predefined pass pipelines that you can directly call. * To avoid running all graph pass, you can use: .. sourcecode:: python mlmodel = ct.convert(model, pass_pipeline=ct.PassPipeline.EMPTY) * To only run the cleanup graph passes, like constant_elimination, dead_code_elimination, etc. You can use: .. sourcecode:: python mlmodel = ct.convert(model, pass_pipeline=ct.PassPipeline.CLEANUP) * To convert a source model with sparse weights to a sparse format Core ML model, you can use: .. sourcecode:: python mlmodel = ct.convert(model, pass_pipeline=ct.PassPipeline.DEFAULT_PRUNING) * To convert a source model with palettized weights to a compressed format Core ML model, you can use: .. sourcecode:: python mlmodel = ct.convert(model, pass_pipeline=ct.PassPipeline.DEFAULT_PALETTIZATION) states: Create a stateful ``mlprogram`` model by providing the ``StateType`` in the ``states`` argument (for details see `MIL Input Types `_). The stateful model is useful when converting a large language model with KV-Cache. The name of ``StateType`` must match the key of the PyTorch ``named_buffers()`` method in the source traced model. The following example converts a torch model with a buffer called ``state_1``. .. 
sourcecode:: python class UpdateBufferModel(torch.nn.Module): def __init__(self): super(UpdateBufferModel, self).__init__() self.register_buffer( "state_1", torch.tensor(np.array([0, 0, 0], dtype=np.float32)) ) def forward(self, x): # In place update of the model state self.state_1.add_(x) return self.state_1 model = UpdateBufferModel() traced_model = torch.jit.trace(model, torch.tensor([1, 2, 3], dtype=torch.float32)) inputs = [ ct.TensorType(shape=(1, 2)), ] states = [ ct.StateType( wrapped_type=ct.TensorType( shape=(1, 2), ), name="state_1", ), ] mlmodel = ct.convert( traced_model, inputs=inputs, states=states, minimum_deployment_target=ct.target.iOS18, ) Returns ------- model : ``coremltools.models.MLModel`` or ``coremltools.converters.mil.Program`` A Core ML MLModel object or MIL program object (see ``convert_to``). Examples -------- TensorFlow 1, 2 (``model`` is a frozen graph): >>> with tf.Graph().as_default() as graph: >>> x = tf.placeholder(tf.float32, shape=(1, 2, 3), name="input") >>> y = tf.nn.relu(x, name="output") Automatically infer inputs and outputs: >>> mlmodel = ct.convert(graph) >>> test_input = np.random.rand(1, 2, 3) - 0.5 >>> results = mlmodel.predict({"input": test_input}) >>> print(results['output']) TensorFlow 2 (``model`` is a tf.Keras model path): >>> x = tf.keras.Input(shape=(32,), name='input') >>> y = tf.keras.layers.Dense(16, activation='softmax')(x) >>> keras_model = tf.keras.Model(x, y) >>> keras_model.save(h5_path) >>> mlmodel = ct.convert(h5_path) >>> test_input = np.random.rand(2, 32) >>> results = mlmodel.predict({'input': test_input}) >>> print(results['Identity']) PyTorch: TorchScript Models: >>> model = torchvision.models.mobilenet_v2() >>> model.eval() >>> example_input = torch.rand(1, 3, 256, 256) >>> traced_model = torch.jit.trace(model, example_input) >>> input = ct.TensorType(name='input_name', shape=(1, 3, 256, 256)) >>> mlmodel = ct.convert(traced_model, inputs=[input]) >>> results = mlmodel.predict({"input": example_input.numpy()}) >>> print(results['1651']) # 1651 is the node name given by PyTorch's JIT For more options see `Conversion Options `_. """ _check_deployment_target(minimum_deployment_target) outputs_as_strings, outputs_as_tensor_or_image_types = _validate_outputs_argument(outputs) exact_source = _determine_source(model, source, outputs_as_strings, outputs_as_tensor_or_image_types, outputs) source_dialect = _determine_source_dialect(model, exact_source) exact_target = _determine_target(convert_to, minimum_deployment_target) _validate_conversion_arguments( model, exact_source, exact_target, inputs, outputs_as_tensor_or_image_types, classifier_config, compute_precision, exact_target, minimum_deployment_target, ) need_fp16_cast_pass = _need_fp16_cast_pass(compute_precision, exact_target) if pass_pipeline is None: pass_pipeline = PassPipeline() if not need_fp16_cast_pass: pass_pipeline.remove_passes({"common::add_fp16_cast", "common::add_int16_cast"}) if isinstance(compute_precision, FP16ComputePrecision): # For backward compatibility with the `op_selector` param in FP16ComputePrecision. 
pass_pipeline._pass_options["common::add_fp16_cast"] = [ _PassOption(option_name="op_selector", option_val=compute_precision.op_selector) ] if package_dir is not None: _, ext = os.path.splitext(package_dir) if ext != _MLPACKAGE_EXTENSION: raise ValueError( f"`package_dir` must have extension {_MLPACKAGE_EXTENSION} (not {ext})" ) specification_version = minimum_deployment_target.value if minimum_deployment_target is not None else None if specification_version is None: specification_version = _set_default_specification_version(exact_target) use_default_fp16_io = ( specification_version is not None and specification_version >= AvailableTarget.iOS16 and need_fp16_cast_pass ) # Verify the inputs cannot contains state if states is None: states = [] _verify_inputs_doesnot_contains_states(inputs) # states can only passed if the source is pytorch if len(states) > 0 and exact_source != "pytorch": raise ValueError("'states' can only be passed with pytorch source model.") mlmodel = mil_convert( model, convert_from=exact_source, convert_to=exact_target, inputs=inputs, outputs=outputs_as_tensor_or_image_types, # None or list[ct.ImageType/ct.TensorType] classifier_config=classifier_config, skip_model_load=skip_model_load, compute_units=compute_units, package_dir=package_dir, debug=debug, specification_version=specification_version, main_pipeline=pass_pipeline, use_default_fp16_io=use_default_fp16_io, states=states, ) if exact_target == "mlprogram" and mlmodel._input_has_infinite_upper_bound(): raise ValueError( "For mlprogram, inputs with infinite upper_bound is not allowed. Please set upper_bound" ' to a positive value in "RangeDim()" for the "inputs" param in ct.convert().' ) if exact_target == 'milinternal': return mlmodel # Returns the MIL program if minimum_deployment_target is not None: check_deployment_compatibility( spec=mlmodel.get_spec(), representation=exact_target, deployment_target=minimum_deployment_target, ) gc.collect() mlmodel = _record_build_metadata(mlmodel, exact_source, source_dialect=source_dialect) return mlmodel def _need_fp16_cast_pass( compute_precision: Optional[Union[precision, FP16ComputePrecision]], convert_to: Text ) -> bool: if convert_to not in ("mlprogram", "neuralnetwork", "milinternal", "milpython"): raise NotImplementedError(f"Backend converter {convert_to} not implemented") if compute_precision is None: return convert_to != "neuralnetwork" elif compute_precision == precision.FLOAT32: return False elif compute_precision == precision.FLOAT16 or isinstance( compute_precision, FP16ComputePrecision ): return True else: raise ValueError(f"Invalid value of the argument 'compute_precision': {compute_precision}") def _set_default_specification_version(target) -> Optional[AvailableTarget]: if target == "neuralnetwork": return _LOWEST_ALLOWED_SPECIFICATION_VERSION_FOR_NEURALNETWORK elif target == "mlprogram": return _LOWEST_ALLOWED_SPECIFICATION_VERSION_FOR_MILPROGRAM elif target in ("milinternal", "milpython"): return None else: raise NotImplementedError("Backend converter {} not implemented".format(target)) def _check_deployment_target(minimum_deployment_target): if minimum_deployment_target is not None and not isinstance( minimum_deployment_target, AvailableTarget ): msg = ( "Unrecognized value of argument 'minimum_deployment_target': {}. " "It needs to be a member of 'coremltools.target' enumeration. 
" "For example, coremltools.target.iOS13" ) raise TypeError(msg.format(minimum_deployment_target)) def _verify_inputs_doesnot_contains_states( inputs: List[InputType], ) -> None: """ Verify that StateType is not present in the inputs. """ if inputs is None: return for val in inputs: if isinstance(val, StateType): raise ValueError("'inputs' cannot contain an instance of StateType.") def _validate_outputs_argument(outputs): """ - validate properties that the "outputs" argument must satisfy, for instance, it should either be a list of ct.ImageType/ct.TensorType or a list of strings, etc. - return : tuple - (outputs_as_strings, outputs_as_tensor_or_image_types) - outputs_as_strings: list[str] - outputs_as_tensor_or_image_types : list[ct.ImageType] or list[ct.TensorType] """ if outputs is None: return None, None else: if not isinstance(outputs, list): raise ValueError('"outputs" must be of type list') if len(outputs) == 0: return None, None if not all(map(lambda t: isinstance(t, (ImageType, str, TensorType)), outputs)): raise ValueError('Elements in "outputs" must be ct.TensorType or ct.ImageType or str') msg_inconsistent_types = 'all elements of "outputs" must either be of type str ' \ 'or of types ct.ImageType/ct.TensorType' if isinstance(outputs[0], str): # if one of the elements is a string, all elements must be strings if not all([isinstance(t, str) for t in outputs]): raise ValueError(msg_inconsistent_types) return outputs, [TensorType(name=name) for name in outputs] if isinstance(outputs[0], InputType): if not all([isinstance(t, TensorType) or isinstance(t, ImageType) for t in outputs]): raise ValueError(msg_inconsistent_types) if any([t.shape is not None for t in outputs]): msg = "The 'shape' argument must not be specified for the outputs, since it is " \ "automatically inferred from the input shapes and the ops in the model" raise ValueError(msg) for out_ in outputs: if isinstance(out_, TensorType): if out_.default_value is not None: raise ValueError( "The 'default_value' argument must not be specified for the outputs" ) if isinstance(out_, ImageType): if out_.scale != 1.0: raise ValueError("'scale' must be 1.0 for a output of ImageType") if not (out_.bias is None or out_.bias == 0.0 or out_.bias == [0.0, 0.0, 0.0]): raise ValueError("'bias' must be None or 0 for an output of ImageType") if out_.channel_first is not None: raise ValueError("'channel_first' must be None for an output of ImageType") output_names = [t.name for t in outputs] # verify that either all of the entries in output_names is "None" or none of them is "None" msg_consistent_names = 'Either none or all the outputs must have the "name" argument specified' if output_names[0] is None and not all([name is None for name in output_names]): raise ValueError(msg_consistent_names) if output_names[0] is not None and not all([name is not None for name in output_names]): raise ValueError(msg_consistent_names) if output_names[0] is not None: if len(set(output_names)) != len(output_names): raise ValueError("Duplicate names provided in 'outputs'") if output_names[0] is None: return None, outputs else: return output_names, outputs def _validate_conversion_arguments( model, exact_source, exact_target, inputs, outputs, classifier_config, compute_precision, convert_to, minimum_deployment_target, ) -> None: """ Validate and process model, inputs, classifier_config based on `exact_source` (which cannot be `auto`) and `exact_target`. 
""" def raise_if_duplicated(input_list): # Detect duplicated inputs input_names = [t.name for t in input_list if t.name is not None] dups = [ item for item, count in collections.Counter(input_names).items() if count > 1 ] if len(dups) > 0: raise ValueError("Duplicated inputs: {}".format(dups)) def _flatten_list(_inputs): ret = [] for _input in _inputs: if isinstance(_input, (list, tuple)): ret.extend(_flatten_list(_input)) elif isinstance(_input, InputType): ret.append(_input) else: raise ValueError( "Unknown type {} for flattening into InputType.".format( type(_input) ) ) return ret flat_inputs = None if inputs is not None: if not isinstance(inputs, list): raise ValueError("`inputs` must be of type list") # get flattened inputs flat_inputs = _flatten_list(inputs) for flat_input in flat_inputs: if not isinstance(flat_input, InputType): raise ValueError("inputs must be a list of type ct.TensorType or ct.ImageType") if flat_input.dtype == types.fp16: if not ( minimum_deployment_target is not None and minimum_deployment_target >= AvailableTarget.iOS16 ): raise TypeError( "float16 dtype for inputs is only supported for deployment " "target >= iOS16/macOS13/watchOS9/tvOS16" ) if exact_target == "mlprogram": err_msg_infinite_bound = ( "For mlprogram, inputs with infinite upper_bound is not allowed. Please set upper_bound" ' to a positive value in "RangeDim()" for the "inputs" param in ct.convert().' ) if inputs is not None: for flat_input in _flatten_list(inputs): tensor_shapes: List[Optional[Shape]] = ( flat_input.shape.shapes if isinstance(flat_input.shape, EnumeratedShapes) else [flat_input.shape] ) for tensor_shape in tensor_shapes: if tensor_shape is not None: for shape in tensor_shape.shape: if isinstance(shape, RangeDim) and shape.upper_bound < 0: raise ValueError(err_msg_infinite_bound) if outputs is not None: for t in outputs: if t.dtype == types.fp16: if not ( minimum_deployment_target is not None and minimum_deployment_target >= AvailableTarget.iOS16 ): raise TypeError( "float16 dtype for outputs is only supported for deployment " "target >= iOS16/macOS13/watchOS9/tvOS16" ) if classifier_config is not None: if not isinstance(classifier_config, ClassifierConfig): raise ValueError("`classifier_config` must be of type ClassifierConfig") if convert_to.lower() == "neuralnetwork" and compute_precision is not None: raise ValueError( "compute_precision is only supported for mlprogram target and must be " "None if target=='neuralnetwork'. Note that target may be implicitly set " "depending on the minimum_deployment_target. See " "minimum_deployment_target for more details." 
) if compute_precision is not None: if compute_precision not in [precision.FLOAT32, precision.FLOAT16]: if not isinstance(compute_precision, FP16ComputePrecision): raise ValueError( "'compute_precision' must be either coremltools.precision.FLOAT32 " "or coremltools.precision.FLOAT16 or of type " "coremltools.transform.FP16ComputePrecision()" ) if exact_source in {"tensorflow", "tensorflow2"}: if exact_source == "tensorflow" and not _HAS_TF_1: raise ValueError( 'Converter was called with source="tensorflow", but missing ' "tensorflow package" ) if inputs is not None: raise_if_duplicated(inputs) if inputs is not None and not all([isinstance(_input, InputType) for _input in inputs]): raise ValueError("Input should be a list of TensorType or ImageType") elif exact_source == "pytorch": if _HAS_TORCH_EXPORT_API and isinstance(model, ExportedProgram): if model.dialect not in ("ATEN", "EDGE"): raise NotImplementedError( f"Conversion for models with only ATEN or EDGE dialect is supported/tested. Provided Dialect: {model.dialect}" ) # TODO: rdar://115845792 ([Executorch] Handle user provided inputs/outputs in the convert API) if inputs is not None: raise AssertionError("'inputs' argument should be None for ExportedProgram") if outputs is not None: raise AssertionError("'outputs' argument should be None for ExportedProgram") else: if is_torch_model(model): if inputs is None: raise ValueError( 'Expected argument "inputs" for TorchScript models not provided' ) raise_if_duplicated(flat_inputs) if inputs is not None and not all( [isinstance(_input, InputType) for _input in flat_inputs] ): raise ValueError( "Input should be a list/tuple (or nested lists/tuples) of TensorType or ImageType" ) else: raise TypeError( "Model must either be a TorchScript object (or .pt or .pth file) or an ExportedProgram object (if using torch.export based API), received: {}".format( type(model) ) ) elif exact_source == "milinternal": if not isinstance(model, Program): raise ValueError( "Converter was asked to convert MIL input, but input is not a MIL " "program!" ) def _determine_source_dialect(model, exact_source): source_dialect = None if exact_source == "pytorch": if _HAS_TORCH_EXPORT_API and isinstance(model, ExportedProgram): return f"TorchExport::{model.dialect}" else: return "TorchScript" return source_dialect def _determine_source(model, source, output_names, outputs_as_tensor_or_image_types, output_argument_as_specified_by_user) -> str: """ Infer source (which can be auto) to the precise framework. """ source = source.lower() if source not in {"auto", "tensorflow", "pytorch", "milinternal"}: raise ValueError( f'Unrecognized value of argument "source": {source}. It must be one of ["auto", "tensorflow", "pytorch", "milinternal"].' 
) # Determine tensorflow version if source == "tensorflow" and _HAS_TF_2: return "tensorflow2" if source != 'auto': return source # Determine `auto` source if source == "auto" and _HAS_TF_1: try: loader = TF1Loader(model, outputs=outputs_as_tensor_or_image_types) loader._graph_def_from_model(output_names=output_names) return "tensorflow" except: pass if source == "auto" and _HAS_TF_2: try: loader = TF2Loader(model, outputs=outputs_as_tensor_or_image_types) loader._graph_def_from_model(output_names=output_names) return "tensorflow2" except: pass if source == "auto" and _HAS_TORCH: if _HAS_TORCH_EXPORT_API and isinstance(model, ExportedProgram): return "pytorch" if is_torch_model(model): # validate that the outputs passed by the user are of type ImageType/TensorType if output_argument_as_specified_by_user is not None and not all( [ isinstance(t, TensorType) or isinstance(t, ImageType) for t in output_argument_as_specified_by_user ] ): raise ValueError( '"outputs" must be a list of type ct.TensorType or ct.ImageType ' "for pytorch conversion" ) return "pytorch" if source == "auto" and isinstance(model, Program): return "milinternal" msg = ( "Unable to determine the type of the model, i.e. the source framework. " 'Please provide the value of argument "source", from one of ' '["tensorflow", "pytorch", "milinternal"]. Note that model conversion requires the ' "source package that generates the model. Please make sure you have " "the appropriate version of source package installed. E.g., if you're " "converting model originally trained with TensorFlow 1.14, make sure " "you have `tensorflow==1.14` installed." ) raise ValueError(msg) def _determine_target(convert_to, minimum_deployment_target) -> str: """ Infer the precise backend target, which could be one of ``milinternal``, ``neuralnetwork`` or ``mlprogram`` """ if minimum_deployment_target is None and convert_to is None: logger.warning( "When both 'convert_to' and 'minimum_deployment_target' not specified, " "'convert_to' is set to \"mlprogram\" and 'minimum_deployment_target' is set to " "ct.target.iOS15 (which is same as ct.target.macOS12). " "Note: the model will not run on systems older than iOS15/macOS12/watchOS8/tvOS15. " "In order to make your model run on older system, please set the 'minimum_deployment_target' to iOS14/iOS13. " "Details please see the link: https://apple.github.io/coremltools/docs-guides/source/target-conversion-formats.html" ) if minimum_deployment_target is not None: if convert_to == "mlprogram" and minimum_deployment_target < AvailableTarget.iOS15: raise ValueError( f"When 'convert_to' is {convert_to}, the minimum deployment target " f"must be at least iOS15/macOS12/watchOS8/tvOS15" ) if convert_to == "neuralnetwork" and minimum_deployment_target >= AvailableTarget.iOS15: raise ValueError( f"If minimum deployment target is iOS15/macOS12/watchOS8/tvOS15 or " f"higher, then 'convert_to' cannot be {convert_to}. 
It must be " f"'mlprogram'" ) if convert_to is not None: return convert_to else: if minimum_deployment_target is None: return "mlprogram" elif minimum_deployment_target <= AvailableTarget.iOS14: return "neuralnetwork" else: return "mlprogram" def _get_metadata_from_mlmodel(mlmodel): # Copy from source mlmodel if metadata info exists src_pkg_version = mlmodel.user_defined_metadata[_METADATA_SOURCE] coremltools_version = mlmodel.user_defined_metadata[_METADATA_VERSION] src_dialect = ( None if _METADATA_SOURCE_DIALECT not in mlmodel.user_defined_metadata else mlmodel.user_defined_metadata[_METADATA_SOURCE_DIALECT] ) src_pkg_version_list = src_pkg_version.split("==") if len(src_pkg_version_list) == 0: src_pkg, pkg_ver = None, None elif len(src_pkg_version_list) == 1: src_pkg, pkg_ver = src_pkg_version_list[0], "" elif len(src_pkg_version_list) == 2: src_pkg, pkg_ver = src_pkg_version_list else: raise AssertionError("Unable to parse src_pkg_version") build_info = { "coremltools-version": _ct_version if not coremltools_version else coremltools_version } if src_pkg is not None and pkg_ver is not None: build_info['coremltools-component-' + src_pkg] = str(pkg_ver) if src_dialect is not None: build_info["coremltools-source-dialect"] = src_dialect return build_info def _record_build_metadata(mlmodel, exact_source, source_dialect=None): # recording metadata: coremltools version, source framework and version if exact_source in {"tensorflow", "tensorflow2"} and (_HAS_TF_1 or _HAS_TF_2): src_pkg_version = "tensorflow=={0}".format(tf.__version__) elif exact_source == "pytorch" and _HAS_TORCH: src_pkg_version = "torch=={0}".format(torch.__version__) elif exact_source == 'milinternal': src_pkg_version = "milinternal" else: raise ValueError('Unsupported source {}'.format(exact_source)) mlmodel.user_defined_metadata[_METADATA_SOURCE] = src_pkg_version mlmodel.user_defined_metadata[_METADATA_VERSION] = _ct_version if source_dialect is not None: mlmodel.user_defined_metadata[_METADATA_SOURCE_DIALECT] = source_dialect build_info = _get_metadata_from_mlmodel(mlmodel) mlmodel._set_build_info_mil_attributes(build_info) return mlmodel ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/_profile_utils.py0000644000000000000000000000450714672066616022632 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os import time _FUNCTION_PROFILE_REGISTRY = {} # str -> list (function name to time stack) _ENABLE_PROFILING = os.environ.get("ENABLE_PROFILING", False) def _profile(_f=None): def func_wrapper(func): f_name = f"{func.__module__}.{func.__name__}" if f_name in _FUNCTION_PROFILE_REGISTRY: raise ValueError(f"Function {f_name} is already registered for profiling.") _FUNCTION_PROFILE_REGISTRY[f_name] = [] return func if _f is None: return func_wrapper return func_wrapper(_f) _INITIAL_CALL = True def _pr_color(skk, color="94m", end="\n"): print("\033[{} {}\033[00m".format(color, skk), end=end) def _profiler(frame, event, arg, indent=[0]): if frame.f_globals.get("__name__", None) is None: return package_name = __name__.split(".")[0] function_name = f"{frame.f_globals['__name__']}.{frame.f_code.co_name}" profile_function = ( package_name in str(frame) and function_name in _FUNCTION_PROFILE_REGISTRY ) if event == "call" and profile_function: global _INITIAL_CALL if _INITIAL_CALL: _INITIAL_CALL = False print("\n" * 2) indent[0] += 3 _pr_color( "{} call {} {}".format( "=" * indent[0] + ">", function_name.split(".")[-1], " (" + ".".join(function_name.split(".")[2:-1]) + ")", ) ) start_time = time.clock() _FUNCTION_PROFILE_REGISTRY[function_name].append(start_time) elif event == "return" and profile_function: duration = time.clock() - _FUNCTION_PROFILE_REGISTRY[function_name][-1] duration = round(duration) _pr_color( "{} exit {} {} ".format( "<" + "=" * indent[0], function_name.split(".")[-1], " (" + ".".join(function_name.split(".")[2:-1]) + ")", ), end="", ) _pr_color(": Time spent {} seconds ".format(duration,), color="91m") indent[0] -= 3 _FUNCTION_PROFILE_REGISTRY[function_name].pop() return _profiler ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2055464 coremltools-8.0/coremltools/converters/libsvm/0000755000000000000000000000000014672075535020527 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/libsvm/__init__.py0000644000000000000000000000650714672066616022650 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ..._deps import _HAS_LIBSVM from . import _libsvm_converter, _libsvm_util if _HAS_LIBSVM: from libsvm import svmutil as _svmutil def convert( model, input_names="input", target_name="target", probability="classProbability", input_length="auto", ): """ Convert a LIBSVM model to Core ML format. Parameters ---------- model: a libsvm model (C-SVC, nu-SVC, epsilon-SVR, or nu-SVR) or string path to a saved model. input_names: str | [str] Name of the input column(s). If a single string is used (the default) the input will be an array. The length of the array will be inferred from the model, this can be overridden using the 'input_length' parameter. target: str Name of the output column. probability: str Name of the output class probability column. Only used for C-SVC and nu-SVC that have been trained with probability estimates enabled. input_length: int Set the length of the input array. This parameter should only be used when the input is an array (i.e. when 'input_name' is a string). 
Returns ------- model: MLModel Model in Core ML format. Examples -------- .. sourcecode:: python # Make a LIBSVM model >>> import svmutil >>> problem = svmutil.svm_problem([0,0,1,1], [[0,1], [1,1], [8,9], [7,7]]) >>> libsvm_model = svmutil.svm_train(problem, svmutil.svm_parameter()) # Convert using default input and output names >>> import coremltools >>> coreml_model = coremltools.converters.libsvm.convert(libsvm_model) # Save the CoreML model to a file. >>> coreml_model.save('./my_model.mlmodel') # Convert using user specified input names >>> coreml_model = coremltools.converters.libsvm.convert(libsvm_model, input_names=['x', 'y']) """ if not (_HAS_LIBSVM): raise RuntimeError("libsvm not found. libsvm conversion API is disabled.") if isinstance(model, str): libsvm_model = _libsvm_util.load_model(model) else: libsvm_model = model if not isinstance(libsvm_model, _svmutil.svm_model): raise TypeError( "Expected 'model' of type '%s' (got %s)" % (_svmutil.svm_model, type(libsvm_model)) ) if not isinstance(target_name, str): raise TypeError( "Expected 'target_name' of type str (got %s)" % type(libsvm_model) ) if input_length != "auto" and not isinstance(input_length, int): raise TypeError( "Expected 'input_length' of type int, got %s" % type(input_length) ) if input_length != "auto" and not isinstance(input_names, str): raise ValueError( "'input_length' should not be used unless the input will be only one array." ) if not isinstance(probability, str): raise TypeError( "Expected 'probability' of type str (got %s)" % type(probability) ) return _libsvm_converter.convert( libsvm_model, input_names, target_name, input_length, probability ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/libsvm/_libsvm_converter.py0000644000000000000000000001604414672066616024630 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools import __version__ as ct_version from coremltools.models import _METADATA_SOURCE, _METADATA_VERSION from ... import SPECIFICATION_VERSION from ..._deps import _HAS_LIBSVM def _infer_min_num_features(model): # find the largest index of all the support vectors max_index = 0 for i in range(model.l): j = 0 while model.SV[i][j].index != -1: cur_last_index = model.SV[i][j].index j += 1 if cur_last_index > max_index: max_index = cur_last_index return max_index def convert(libsvm_model, feature_names, target, input_length, probability): """ Convert a support vector machine (SVM) model to the protobuf spec. Supports: * C-SVC * nu-SVC * Epsilon-SVR * nu-SVR Parameters ---------- model_path: libsvm_model Libsvm representation of the model. feature_names : [str] | str Names of each of the features. target: str Name of the predicted class column. probability: str Name of the class probability column. Only used for C-SVC and nu-SVC. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ if not (_HAS_LIBSVM): raise RuntimeError("libsvm not found. 
libsvm conversion API is disabled.") from libsvm import svm as _svm from ...models import MLModel from ...proto import Model_pb2 svm_type_enum = libsvm_model.param.svm_type # Create the spec export_spec = Model_pb2.Model() export_spec.specificationVersion = SPECIFICATION_VERSION if svm_type_enum == _svm.EPSILON_SVR or svm_type_enum == _svm.NU_SVR: svm = export_spec.supportVectorRegressor else: svm = export_spec.supportVectorClassifier # Set the features names inferred_length = _infer_min_num_features(libsvm_model) if isinstance(feature_names, str): # input will be a single array if input_length == "auto": print( "[WARNING] Inferring an input length of %d. If this is not correct," " use the 'input_length' parameter." % inferred_length ) input_length = inferred_length elif inferred_length > input_length: raise ValueError( "An input length of %d was given, but the model requires an" " input of at least %d." % (input_length, inferred_length) ) input = export_spec.description.input.add() input.name = feature_names input.type.multiArrayType.shape.append(input_length) input.type.multiArrayType.dataType = Model_pb2.ArrayFeatureType.DOUBLE else: # input will be a series of doubles if inferred_length > len(feature_names): raise ValueError( "%d feature names were given, but the model requires at" " least %d features." % (len(feature_names), inferred_length) ) for cur_input_name in feature_names: input = export_spec.description.input.add() input.name = cur_input_name input.type.doubleType.MergeFromString(b"") # Set target output = export_spec.description.output.add() output.name = target # Set the interface types if svm_type_enum == _svm.EPSILON_SVR or svm_type_enum == _svm.NU_SVR: export_spec.description.predictedFeatureName = target output.type.doubleType.MergeFromString(b"") nr_class = 2 elif svm_type_enum == _svm.C_SVC or svm_type_enum == _svm.NU_SVC: export_spec.description.predictedFeatureName = target output.type.int64Type.MergeFromString(b"") nr_class = len(libsvm_model.get_labels()) for i in range(nr_class): svm.numberOfSupportVectorsPerClass.append(libsvm_model.nSV[i]) svm.int64ClassLabels.vector.append(libsvm_model.label[i]) if probability and bool(libsvm_model.probA): output = export_spec.description.output.add() output.name = probability output.type.dictionaryType.MergeFromString(b"") output.type.dictionaryType.int64KeyType.MergeFromString(b"") export_spec.description.predictedProbabilitiesName = probability else: raise ValueError( "Only the following SVM types are supported: C_SVC, NU_SVC, EPSILON_SVR, NU_SVR" ) if libsvm_model.param.kernel_type == _svm.LINEAR: svm.kernel.linearKernel.MergeFromString( b"" ) # Hack to set kernel to an empty type elif libsvm_model.param.kernel_type == _svm.RBF: svm.kernel.rbfKernel.gamma = libsvm_model.param.gamma elif libsvm_model.param.kernel_type == _svm.POLY: svm.kernel.polyKernel.degree = libsvm_model.param.degree svm.kernel.polyKernel.c = libsvm_model.param.coef0 svm.kernel.polyKernel.gamma = libsvm_model.param.gamma elif libsvm_model.param.kernel_type == _svm.SIGMOID: svm.kernel.sigmoidKernel.c = libsvm_model.param.coef0 svm.kernel.sigmoidKernel.gamma = libsvm_model.param.gamma else: raise ValueError( "Unsupported kernel. The following kernel are supported: linear, RBF, polynomial and sigmoid." 
) # set rho # also set probA/ProbB only for SVC if svm_type_enum == _svm.C_SVC or svm_type_enum == _svm.NU_SVC: num_class_pairs = nr_class * (nr_class - 1) // 2 for i in range(num_class_pairs): svm.rho.append(libsvm_model.rho[i]) if bool(libsvm_model.probA) and bool(libsvm_model.probB): for i in range(num_class_pairs): svm.probA.append(libsvm_model.probA[i]) svm.probB.append(libsvm_model.probB[i]) else: svm.rho = libsvm_model.rho[0] # set coefficients if svm_type_enum == _svm.C_SVC or svm_type_enum == _svm.NU_SVC: for _ in range(nr_class - 1): svm.coefficients.add() for i in range(libsvm_model.l): for j in range(nr_class - 1): svm.coefficients[j].alpha.append(libsvm_model.sv_coef[j][i]) else: for i in range(libsvm_model.l): svm.coefficients.alpha.append(libsvm_model.sv_coef[0][i]) # set support vectors for i in range(libsvm_model.l): j = 0 cur_support_vector = svm.sparseSupportVectors.vectors.add() while libsvm_model.SV[i][j].index != -1: cur_node = cur_support_vector.nodes.add() cur_node.index = libsvm_model.SV[i][j].index cur_node.value = libsvm_model.SV[i][j].value j += 1 model = MLModel(export_spec) from libsvm import __version__ as libsvm_version libsvm_version = "libsvm=={0}".format(libsvm_version) model.user_defined_metadata[_METADATA_VERSION] = ct_version model.user_defined_metadata[_METADATA_SOURCE] = libsvm_version return model ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/libsvm/_libsvm_util.py0000644000000000000000000000171314672066616023573 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ..._deps import _HAS_LIBSVM def load_model(model_path): """Load a libsvm model from a path on disk. This currently supports: * C-SVC * NU-SVC * Epsilon-SVR * NU-SVR Parameters ---------- model_path: str Path on disk where the libsvm model representation is. Returns ------- model: libsvm_model A model of the libsvm format. """ if not (_HAS_LIBSVM): raise RuntimeError("libsvm not found. libsvm conversion API is disabled.") import os from svmutil import svm_load_model # From libsvm if not os.path.exists(model_path): raise IOError("Expected a valid file path. %s does not exist" % model_path) return svm_load_model(model_path) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2055464 coremltools-8.0/coremltools/converters/mil/0000755000000000000000000000000014672075535020014 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/__init__.py0000644000000000000000000000161114672066616022124 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from .mil import (SPACES, SUPPORT_FLOAT_TYPES, SUPPORT_INT_TYPES, Block, Builder, DefaultInputs, Function, InputSpec, InternalVar, ListInputType, ListVar, Operation, Placeholder, Program, Symbol, TupleInputType, Var, builder, curr_block, get_existing_symbol, get_new_symbol, get_new_variadic_symbol, mil_list, register_op) from .input_types import (ClassifierConfig, ColorLayout, EnumeratedShapes, ImageType, InputType, RangeDim, Shape, TensorType, StateType) from .frontend.tensorflow.tf_op_registry import register_tf_op from .frontend.torch import register_torch_op ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/_deployment_compatibility.py0000644000000000000000000001411014672066616025633 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from enum import IntEnum from coremltools import ( _SPECIFICATION_VERSION_IOS_13, _SPECIFICATION_VERSION_IOS_14, _SPECIFICATION_VERSION_IOS_15, _SPECIFICATION_VERSION_IOS_16, _SPECIFICATION_VERSION_IOS_17, _SPECIFICATION_VERSION_IOS_18, ) class AvailableTarget(IntEnum): # iOS versions iOS13 = _SPECIFICATION_VERSION_IOS_13 iOS14 = _SPECIFICATION_VERSION_IOS_14 iOS15 = _SPECIFICATION_VERSION_IOS_15 iOS16 = _SPECIFICATION_VERSION_IOS_16 iOS17 = _SPECIFICATION_VERSION_IOS_17 iOS18 = _SPECIFICATION_VERSION_IOS_18 # macOS versions (aliases of iOS versions) macOS10_15 = _SPECIFICATION_VERSION_IOS_13 macOS10_16 = _SPECIFICATION_VERSION_IOS_14 macOS11 = _SPECIFICATION_VERSION_IOS_14 macOS12 = _SPECIFICATION_VERSION_IOS_15 macOS13 = _SPECIFICATION_VERSION_IOS_16 macOS14 = _SPECIFICATION_VERSION_IOS_17 macOS15 = _SPECIFICATION_VERSION_IOS_18 # watchOS versions (aliases of iOS versions) watchOS6 = _SPECIFICATION_VERSION_IOS_13 watchOS7 = _SPECIFICATION_VERSION_IOS_14 watchOS8 = _SPECIFICATION_VERSION_IOS_15 watchOS9 = _SPECIFICATION_VERSION_IOS_16 watchOS10 = _SPECIFICATION_VERSION_IOS_17 watchOS11 = _SPECIFICATION_VERSION_IOS_18 # tvOS versions (aliases of iOS versions) tvOS13 = _SPECIFICATION_VERSION_IOS_13 tvOS14 = _SPECIFICATION_VERSION_IOS_14 tvOS15 = _SPECIFICATION_VERSION_IOS_15 tvOS16 = _SPECIFICATION_VERSION_IOS_16 tvOS17 = _SPECIFICATION_VERSION_IOS_17 tvOS18 = _SPECIFICATION_VERSION_IOS_18 # customized __str__ def __str__(self): original_str = super().__str__() new_str = original_str.replace(type(self).__name__, "coremltools.target") return new_str _get_features_associated_with = {} def register_with(name): def decorator(func): if name not in _get_features_associated_with: _get_features_associated_with[name] = func else: raise ValueError("Function is already registered with {}".format(name)) return func return decorator @register_with(AvailableTarget.iOS14) def iOS14Features(spec): features_list = [] if spec.WhichOneof("Type") == "neuralNetwork": nn_spec = spec.neuralNetwork elif spec.WhichOneof("Type") in "neuralNetworkClassifier": nn_spec = spec.neuralNetworkClassifier elif spec.WhichOneof("Type") in "neuralNetworkRegressor": nn_spec = spec.neuralNetworkRegressor else: raise ValueError("Invalid neural network specification for the model") # Non-zero default optional values for idx, input in enumerate(spec.description.input): value = 0 
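            # Take the largest default across the float/double/int default-value
            # fields; a non-zero default on an optional input is a feature that
            # requires iOS14 or newer.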
if input.type.isOptional: value = max(value, input.type.multiArrayType.floatDefaultValue) value = max(value, input.type.multiArrayType.doubleDefaultValue) value = max(value, input.type.multiArrayType.intDefaultValue) if value != 0: msg = "Support of non-zero default optional values for inputs." features_list.append(msg) break # Layers or modifications introduced in iOS14 new_layers = [ "oneHot", "cumSum", "clampedReLU", "argSort", "pooling3d", "convolution3d", "globalPooling3d", ] for layer in nn_spec.layers: layer_type = layer.WhichOneof("layer") msg = "" if layer_type in new_layers: msg = "{} {}".format(layer_type.capitalize(), "operation") if layer_type == "tile" and len(layer.input) == 2: msg = "Dynamic Tile operation" if layer_type == "upsample" and layer.upsample.linearUpsampleMode in [1, 2]: msg = "Upsample operation with Align Corners mode" if layer_type == "reorganizeData" and layer.reorganizeData.mode == 2: msg = "Pixel Shuffle operation" if layer_type == "sliceDynamic" and layer.sliceDynamic.squeezeMasks: msg = "Squeeze mask for dynamic slice operation" if layer_type == "sliceStatic" and layer.sliceDynamic.squeezeMasks: msg = "Squeeze mask for static slice operation" if layer_type == "concatND" and layer.concatND.interleave: msg = "Concat layer with interleave operation" if msg != "" and (msg not in features_list): features_list.append(msg) return features_list def check_deployment_compatibility(spec, representation, deployment_target): if not isinstance(deployment_target, AvailableTarget): raise TypeError( "Argument for deployment_target must be an enumeration from Enum class AvailableTarget" ) for any_target in AvailableTarget: if any_target > deployment_target and any_target in _get_features_associated_with: missing_features = _get_features_associated_with[any_target](spec) if missing_features: msg = ( "Provided minimum deployment target requires model to be of version {} but converted model " "uses following features which are available from version {} onwards. Please use a higher " "minimum deployment target to convert. \n ".format( deployment_target.value, any_target.value ) ) for i, feature in enumerate(missing_features): msg += " {}. {}\n".format(i + 1, feature) raise ValueError(msg) # Default exception throwing if not able to find the reason behind spec version bump if spec.specificationVersion > deployment_target.value: msg = ( "Provided deployment target requires model to be of version {} but converted model has version {} " "suitable for later releases".format( deployment_target.value, spec.specificationVersion, ) ) raise ValueError(msg) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2055464 coremltools-8.0/coremltools/converters/mil/backend/0000755000000000000000000000000014672075535021403 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/__init__.py0000644000000000000000000000033214672066616023512 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/backend_helper.py0000644000000000000000000000743414672066616024713 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.input_types import ColorLayout from coremltools.converters.mil.mil.passes.defs.preprocess import NameSanitizer from coremltools.proto import FeatureTypes_pb2 as ft def _get_probability_var_for_classifier(prog, classifier_config): ''' Return the var which will be used to construct the dictionary for the classifier. :param prog: mil program :param classifier_config: an instance of coremltools.ClassifierConfig class :return: var ''' block = prog.functions["main"] probability_var = None if classifier_config.predicted_probabilities_output is None \ or classifier_config.predicted_probabilities_output == "": # user has not indicated which tensor in the program to use as probabilities # (i.e which tensor to link to the classifier output) # in this case, attach the last non const op to the classify op for op in reversed(block.operations): if op.op_type != 'const' and len(op.outputs) == 1: probability_var = op.outputs[0] break if probability_var is None: raise ValueError("Unable to determine the tensor in the graph " "that corresponds to the probabilities for the classifier output") else: # user has indicated which tensor in the program to use as probabilities # (i.e which tensor to link to the classifier output) # Verify that it corresponds to a var produced in the program predicted_probabilities_output = NameSanitizer().sanitize_name(classifier_config.predicted_probabilities_output) for op in block.operations: for out in op.outputs: if out.name == predicted_probabilities_output: probability_var = out break if probability_var is None: msg = "'predicted_probabilities_output', '{}', provided in 'ClassifierConfig', does not exist in the MIL program." raise ValueError(msg.format(predicted_probabilities_output)) return probability_var def _get_colorspace_enum(color_layout): if color_layout == ColorLayout.GRAYSCALE: return ft.ImageFeatureType.ColorSpace.GRAYSCALE elif color_layout == ColorLayout.GRAYSCALE_FLOAT16: return ft.ImageFeatureType.ColorSpace.GRAYSCALE_FLOAT16 elif color_layout == ColorLayout.BGR: return ft.ImageFeatureType.ColorSpace.BGR else: return ft.ImageFeatureType.ColorSpace.RGB def _validate_image_input_output_shapes(color_layout, shape, name, is_input=True): io_str = "input" if is_input else "output" if len(shape) != 4: raise ValueError("Image {}, '{}', must have rank 4. Instead it has rank {}". format(io_str, name, len(shape))) if color_layout in (ColorLayout.BGR, ColorLayout.RGB): if shape[1] != 3 or shape[0] != 1: raise ValueError("Shape of the RGB/BGR image {}, '{}', must be of kind (1, 3, H, W), " "i.e., first two dimensions must be (1, 3), instead they are: {}". format(io_str, name, shape[:2])) elif color_layout in (ColorLayout.GRAYSCALE, ColorLayout.GRAYSCALE_FLOAT16): if shape[1] != 1 or shape[0] != 1: raise ValueError("Shape of the Grayscale image {}, '{}', must be of kind (1, 1, H, W), " "i.e., first two dimensions must be (1, 1), instead they are: {}". 
format(io_str, name, shape[:2])) else: raise KeyError("Unrecognized color_layout {}".format(color_layout)) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2095466 coremltools-8.0/coremltools/converters/mil/backend/mil/0000755000000000000000000000000014672075535022164 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/mil/__init__.py0000644000000000000000000000033214672066616024273 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/mil/helper.py0000644000000000000000000003154514672066616024025 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools import proto from coremltools.converters.mil.mil import types # For immediate values, those types are stored in bytes (MIL parser reads those types from bytes). IMMEDIATE_VALUE_TYPES_IN_BYTES = ( types.fp16, types.int4, types.int8, types.uint1, types.uint2, types.uint3, types.uint4, types.uint6, types.uint8, types.uint32, ) def create_valuetype_scalar(data_type): """ Return proto.MIL_pb2.ValueType with DataType set """ v_type = proto.MIL_pb2.ValueType() update_tensortype(v_type.tensorType, (), data_type) return v_type def update_listtype(l_type, length, elem_shape, dtype): """ Update in-place of l_type (ListType) to length and type. """ elem_type = create_valuetype_tensor(elem_shape, dtype) l_type.type.CopyFrom(elem_type) l_dim = l_type.length set_proto_dim(l_dim, length) def create_valuetype_list(length, elem_shape, dtype): """ Return proto.MIL_pb2.ValueType with List (ListType) set. length: length of list (int) """ v_type = proto.MIL_pb2.ValueType() update_listtype(v_type.listType, length, elem_shape, dtype) return v_type def create_valuetype_tensor(shape, data_type): """ Return proto.MIL_pb2.ValueType with tensor (TensorType) set. shape: list of ints """ v_type = proto.MIL_pb2.ValueType() update_tensortype(v_type.tensorType, shape, data_type) return v_type def set_proto_dim(proto_dim, dim): if isinstance(dim, (int, np.integer)): proto_dim.constant.size = dim else: dim_str = str(dim) if len(dim_str) > 0: if dim_str[0] == "*" or (len(dim_str) >= 3 and dim_str[0:3] == "..."): proto_dim.unknown.variadic = True return proto_dim.unknown.variadic = False def update_tensortype(t_type, shape, data_type): """ Update in-place of t_type (TensorType) to shape and data_type. """ t_type.dataType = data_type t_type.rank = len(shape) t_type.ClearField("dimensions") for s in shape: t_dim = t_type.dimensions.add() set_proto_dim(t_dim, s) def _tensor_field_by_type(tensor_val, builtin_type): """ Pick the field based on the builtin_type. The field is defined in TensorValue in ``mlmodel/format/MIL.proto``. The picked field need to be consistent with how it will be read by MIL. For example, int8 is serialized to ``bytes`` field while int16 is serialized to ``ints`` field. 
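    Rough summary of the mapping implemented below (illustrative only):

        types.bool                          -> tensor_val.bools.values
        fp16 / int4 / int8 / uint8 / uint32 -> tensor_val.bytes.values   (raw bytes)
        int64 / uint64                      -> tensor_val.longInts.values
        int16 / uint16 / int32              -> tensor_val.ints.values
        fp32 / fp64                         -> tensor_val.floats / tensor_val.doubles
        types.str                           -> tensor_val.strings.values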
""" if builtin_type == types.bool: return tensor_val.bools.values elif types.is_int(builtin_type): if builtin_type == types.int64 or builtin_type == types.uint64: return tensor_val.longInts.values if builtin_type in IMMEDIATE_VALUE_TYPES_IN_BYTES: return tensor_val.bytes.values if builtin_type == types.int16 or builtin_type == types.uint16: # TODO (rdar://111797203): Serialize to byte after MIL changes to read from byte field. return tensor_val.ints.values return tensor_val.ints.values elif types.is_float(builtin_type): if builtin_type == types.fp64: return tensor_val.doubles.values elif builtin_type == types.fp32: return tensor_val.floats.values elif builtin_type == types.fp16: return tensor_val.bytes.values else: raise TypeError( "Unsupported float dtype for MIL proto serialization: {}".format( types.builtin_to_string(builtin_type) ) ) elif builtin_type == types.str: return tensor_val.strings.values else: raise NotImplementedError("Unimplemented tensor type for: " + str(builtin_type)) def _set_empty_tensor_field_by_type(tensor_val, builtin_type): if builtin_type == types.bool: tensor_val.bools.SetInParent() elif types.is_int(builtin_type): if (builtin_type == types.int64 or builtin_type == types.uint64): tensor_val.longInts.SetInParent() elif builtin_type in IMMEDIATE_VALUE_TYPES_IN_BYTES: tensor_val.bytes.SetInParent() else: tensor_val.ints.SetInParent() elif types.is_float(builtin_type): if (builtin_type == types.fp64): tensor_val.doubles.SetInParent() elif (builtin_type == types.fp32): tensor_val.floats.SetInParent() elif (builtin_type == types.fp16): tensor_val.bytes.SetInParent() else: raise TypeError( "Unsupported float dtype for MIL proto serialization: {}".format( types.builtin_to_string(builtin_type) ) ) elif builtin_type == types.str: tensor_val.strings.SetInParent() else: raise NotImplementedError("Unimplemented tensor type for: " + str(builtin_type)) def create_tensor_value(np_tensor): """ Return TensorValue. """ builtin_type = types.numpy_type_to_builtin_type(np_tensor.dtype) value_type = create_valuetype_tensor(np_tensor.shape, types_to_proto_primitive(builtin_type)) val = proto.MIL_pb2.Value(type=value_type) t_val = val.immediateValue.tensor # Copy the tensor values from the input tensor t_field = _tensor_field_by_type(t_val, builtin_type) if 0 not in np_tensor.shape: if builtin_type == types.str: for x in np.nditer(np_tensor): t_field.append(x.encode("utf-8")) elif builtin_type in IMMEDIATE_VALUE_TYPES_IN_BYTES: val.immediateValue.tensor.bytes.values = types.type_mapping.np_val_to_py_type(np_tensor) else: for x in np_tensor.flatten(): t_field.append(types.type_mapping.np_val_to_py_type(x)) else: # This is an "empty" tensor (tensor with a dimension being size 0) _set_empty_tensor_field_by_type(t_val, builtin_type) return val def create_scalar_value(py_scalar): """ Return TensorValue (since there's no ScalarValue) """ # Create the "scalar" (rank 0) tensor builtin_type = types.type_to_builtin_type(type(py_scalar)) value_type = create_valuetype_scalar(types_to_proto_primitive(builtin_type)) val = proto.MIL_pb2.Value(type=value_type) t_val = val.immediateValue.tensor # Set the tensor value t_field = _tensor_field_by_type(t_val, builtin_type) if builtin_type in IMMEDIATE_VALUE_TYPES_IN_BYTES: # Serialize to bytes because MIL read them from the "bytes" field in TensorValue. 
val.immediateValue.tensor.bytes.values = types.type_mapping.np_val_to_py_type(py_scalar) else: if builtin_type == types.str: py_scalar = py_scalar.encode("utf-8") t_field.append(types.type_mapping.np_val_to_py_type(py_scalar)) return val def create_tuple_value(py_tuple): """ Return type of Tuple """ tp_val = proto.MIL_pb2.TupleValue() for t in py_tuple: item_val = tp_val.values.add() item_type = item_val.type # ValueType if isinstance(t, int): v = create_scalar_value(t) item_val.immediateValue.i = t item_type = v.type elif isinstance(t, np.ndarray): v = create_tensor_value(t) item_val.immediateValue.tensor.CopyFrom(v.immediateValue.tensor) item_type.tensorType.CopyFrom(v.type.tensorType) else: raise NotImplementedError() return tp_val def create_list_scalarvalue(py_list, np_type): """ Return a Value of type List, which holds scalar values """ builtin_type = types.numpy_type_to_builtin_type(np_type) value_type = create_valuetype_list(length=len(py_list), elem_shape=(), dtype=types_to_proto_primitive(builtin_type)) val = proto.MIL_pb2.Value(type=value_type) list_val = val.immediateValue.list for v in py_list: item_val = list_val.values.add() item_val.CopyFrom(create_scalar_value(v)) return val def create_file_value_tensor(file_name, offset, dim, data_type): """ Create a Value Type to store File Value """ val = proto.MIL_pb2.Value( blobFileValue=proto.MIL_pb2.Value.BlobFileValue(fileName=file_name, offset=offset), type=create_valuetype_tensor(dim, data_type), ) return val def types_to_proto_primitive(valuetype): if valuetype not in types.BUILTIN_TO_PROTO_TYPES: additional_error_msg = "" if valuetype in (types.complex64, types.complex128): additional_error_msg = ( "(MIL doesn't support complex data as model's output, please extract real and " "imaginary parts explicitly.) " ) raise ValueError( f"Unknown map from SSA type {valuetype} to Proto type. 
{additional_error_msg}" ) return types.BUILTIN_TO_PROTO_TYPES[valuetype] def _get_offset_by_writing_data(output_var, blob_writer): if output_var.dtype == types.int4: offset = blob_writer.write_int4_data(np.ascontiguousarray(output_var.val.flatten())) elif output_var.dtype == types.uint1: offset = blob_writer.write_uint1_data(np.ascontiguousarray(output_var.val.flatten())) elif output_var.dtype == types.uint2: offset = blob_writer.write_uint2_data(np.ascontiguousarray(output_var.val.flatten())) elif output_var.dtype == types.uint3: offset = blob_writer.write_uint3_data(np.ascontiguousarray(output_var.val.flatten())) elif output_var.dtype == types.uint4: offset = blob_writer.write_uint4_data(np.ascontiguousarray(output_var.val.flatten())) elif output_var.dtype == types.uint6: offset = blob_writer.write_uint6_data(np.ascontiguousarray(output_var.val.flatten())) elif output_var.val.dtype.kind == "f" and output_var.val.dtype.itemsize == 4: offset = blob_writer.write_float_data(np.ascontiguousarray(output_var.val.flatten())) elif output_var.val.dtype.kind == "f" and output_var.val.dtype.itemsize == 2: output_var_fp16_to_bytes_to_uint16 = np.frombuffer( output_var.val.flatten().tobytes(), np.uint16 ) offset = blob_writer.write_fp16_data( np.ascontiguousarray(output_var_fp16_to_bytes_to_uint16) ) elif output_var.val.dtype.kind == "u" and output_var.val.dtype.itemsize == 1: offset = blob_writer.write_uint8_data(np.ascontiguousarray(output_var.val.flatten())) elif output_var.val.dtype.kind == "i" and output_var.val.dtype.itemsize == 1: offset = blob_writer.write_int8_data(np.ascontiguousarray(output_var.val.flatten())) elif output_var.val.dtype.kind == "u" and output_var.val.dtype.itemsize == 2: offset = blob_writer.write_uint16_data(np.ascontiguousarray(output_var.val.flatten())) elif output_var.val.dtype.kind == "i" and output_var.val.dtype.itemsize == 2: offset = blob_writer.write_int16_data(np.ascontiguousarray(output_var.val.flatten())) elif output_var.val.dtype.kind == "i" and output_var.val.dtype.itemsize == 4: offset = blob_writer.write_int32_data(np.ascontiguousarray(output_var.val.flatten())) elif output_var.val.dtype.kind == "u" and output_var.val.dtype.itemsize == 4: offset = blob_writer.write_uint32_data(np.ascontiguousarray(output_var.val.flatten())) else: raise TypeError("Unsupported type, {}, for net buffer serialization.".format(output_var.val.dtype)) return offset def create_immediate_value(var): if types.is_tensor(var.sym_type): return create_tensor_value(var.val) elif types.is_list(var.sym_type): if var.elem_type == types.str: return create_list_scalarvalue(var.val, str) elif var.elem_type == types.int64: return create_list_scalarvalue(var.val, np.int64) else: raise NotImplementedError("List element type, {}, not supported yet.".format(var.sym_type.__type_info__())) else: return create_scalar_value(var.val) def cast_to_framework_io_dtype(var, is_output): if var.dtype == types.fp32: return proto.FeatureTypes_pb2.ArrayFeatureType.ArrayDataType.FLOAT32 elif var.dtype == types.int32: return proto.FeatureTypes_pb2.ArrayFeatureType.ArrayDataType.INT32 elif var.dtype == types.fp16: return proto.FeatureTypes_pb2.ArrayFeatureType.ArrayDataType.FLOAT16 else: ioname = "Output " if is_output else "Input " ioname2 = "outputs" if is_output else "inputs" raise NotImplementedError( ioname + var.name + " has data type " + types.builtin_to_string(var.dtype) + ". ML Program models only support fp32 and int32 " + ioname2 + "." 
) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/mil/load.py0000644000000000000000000013265114672066616023465 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os import warnings from collections import OrderedDict from typing import Any, Dict, List, Optional, Tuple, Union import numpy as np from coremltools import ( _OPSET, _SPECIFICATION_VERSION_IOS_15, _SPECIFICATION_VERSION_IOS_17, _SPECIFICATION_VERSION_IOS_18, ) from coremltools import _logger as logger from coremltools import proto from coremltools.converters.mil import mil from coremltools.converters.mil.backend.backend_helper import _get_probability_var_for_classifier from coremltools.converters.mil.backend.mil import helper from coremltools.converters.mil.backend.mil.helper import ( cast_to_framework_io_dtype, create_file_value_tensor, create_immediate_value, create_list_scalarvalue, create_scalar_value, create_valuetype_list, create_valuetype_scalar, create_valuetype_tensor, types_to_proto_primitive, ) from coremltools.converters.mil.backend.nn.load import _set_optional_inputs from coremltools.converters.mil.input_types import ( ClassifierConfig, EnumeratedShapes, ImageType, RangeDim, TensorType, ) from coremltools.converters.mil.mil import Block from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Function, Operation, Program, Var, mil_list, types from coremltools.converters.mil.mil.ops.registry import SSAOpRegistry from coremltools.converters.mil.mil.scope import ScopeInfo, ScopeSource from coremltools.converters.mil.mil.types.symbolic import any_symbolic, any_variadic, is_symbolic from coremltools.models.neural_network import flexible_shape_utils from coremltools.models.neural_network.flexible_shape_utils import ( NeuralNetworkImageSize, NeuralNetworkImageSizeRange, ) from coremltools.models.utils import _WEIGHTS_DIR_NAME, _WEIGHTS_FILE_NAME from ..backend_helper import _get_colorspace_enum, _validate_image_input_output_shapes try: from coremltools.libmilstoragepython import _BlobStorageWriter as BlobWriter except Exception as e: logger.warning(f"Fail to import BlobWriter from libmilstoragepython. {e}") BlobWriter = None def should_use_weight_file( val: Union[np.ndarray, np.generic], specification_version: Optional[int] = _SPECIFICATION_VERSION_IOS_15, ) -> bool: # additional dtype are supported >= iOS18 supported_dtypes = ["float16", "float32", "uint8", "int8"] if specification_version >= _SPECIFICATION_VERSION_IOS_18: supported_dtypes += ["uint16", "int16", "int32", "uint32"] return ( val is not None and isinstance(val, (np.ndarray, np.generic)) and val.size >= 10 and val.dtype in supported_dtypes ) class MILProtoExporter: """ An utility class to export a pymil program to milproto. """ def __init__( self, prog: Program, weights_dir: str, specification_version: int, ): self.prog = prog self.weights_dir = weights_dir self.specification_version = specification_version self.blob_writers = {} self.weight_id_to_file_value = {} # mapping from weight_id to file value self.prog.validate(check_essential_scope=True) @staticmethod def _get_valid_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]: """ Get a valid kwargs to initialize a MILProtoExporter object. 
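        Only the "prog", "weights_dir", and "specification_version" entries are
        kept; any other keys present in ``kwargs`` are dropped.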
""" return { "prog": kwargs["prog"], "weights_dir": kwargs["weights_dir"], "specification_version": kwargs["specification_version"], } def translate_program_attributes(self) -> Dict[str, Any]: """ Get the program attributes which need to be exported to mil proto. """ return {} def get_weight_path(self, op: Operation) -> str: """ Get the weight path for a constant operation. By default, the weight is saved in {weight_dir}/weight.bin """ assert ( op.op_type == "const" ), f"Expected op (op.name) be a const op. Got op_type of {op.op_type}." return os.path.join(self.weights_dir, _WEIGHTS_FILE_NAME) def get_blob_writer(self, weight_path: str) -> BlobWriter: """ Get a blob writer given a weight_path. """ if weight_path not in self.blob_writers: self.blob_writers[weight_path] = BlobWriter(weight_path) return self.blob_writers[weight_path] def create_file_value(self, var: Var) -> proto.MIL_pb2.Value: """ Returns the mil proto file value of a var. If weight_id is in self.weight_id_to_file_value, we return the value. """ def create_file_value_helper(): weight_path = self.get_weight_path(var.op) blob_writer = self.get_blob_writer(weight_path) offset = helper._get_offset_by_writing_data(var, blob_writer) weight_file_name = os.path.basename(weight_path) # Get proto type for the primitive if hasattr(var.sym_type, "get_primitive"): # tensor primitive = var.sym_type.get_primitive() else: # scalar primitive = var.sym_type proto_primitive = types_to_proto_primitive(primitive) return create_file_value_tensor( file_name=os.path.join( os.path.join("@model_path", _WEIGHTS_DIR_NAME), weight_file_name ), offset=offset, dim=var.val.shape, data_type=proto_primitive, ) # use the cached file value weight_id = var.op.weight_id if weight_id is None: return create_file_value_helper() if weight_id in self.weight_id_to_file_value: assert weight_id is not None, "invalid weight_id" return self.weight_id_to_file_value[weight_id] file_value = create_file_value_helper() self.weight_id_to_file_value[weight_id] = file_value return file_value def get_milproto_value(self, var: Var) -> proto.MIL_pb2.Value: """ Translate a pymil Var into milproto value. """ if should_use_weight_file(var.val, self.specification_version): return self.create_file_value(var) else: return create_immediate_value(var) @staticmethod def _get_input_dict(op: Operation) -> Dict[str, Any]: """ Given an op, returns a dict that maps the param name into the corresponding Var. """ return op.inputs @staticmethod def _get_attr_dict(op: Operation) -> Dict[str, Any]: """ Return the initial attribute dict for an op. """ return {"name": create_scalar_value(op.name)} def translate_const(self, op: Operation) -> proto.MIL_pb2.Operation: """ Translate constant operation. """ if len(op.outputs) != 1: raise AssertionError(f"const {op.name} must have 1 output, but got {len(op.outputs)}") output_var = op.outputs[0] value = self.get_milproto_value(output_var) return proto.MIL_pb2.Operation( type="const", attributes={"name": create_scalar_value(op.name), "val": value}, outputs=[ proto.MIL_pb2.NamedValueType( name=output_var.name, type=self.types_to_proto(output_var.sym_type) ) ], ) def translate_constexpr(self, op: Operation) -> proto.MIL_pb2.Operation: """ Translate constexpr operation. 
""" inputs = {} attributes = {"name": create_scalar_value(op.name)} if op.opset_version <= _SPECIFICATION_VERSION_IOS_17: attributes.update( {param_name: self.get_milproto_value(var) for param_name, var in op.inputs.items()} ) else: for param_name, var in op.inputs.items(): if var.op.op_type.startswith("constexpr_"): arguments = [proto.MIL_pb2.Argument.Binding(name=var.name)] else: arguments = [proto.MIL_pb2.Argument.Binding(value=self.get_milproto_value(var))] args = proto.MIL_pb2.Argument() args.arguments.extend(arguments) inputs[param_name] = args return proto.MIL_pb2.Operation( type=op.op_type, inputs=inputs, attributes=attributes, outputs=[ proto.MIL_pb2.NamedValueType( name=output_var.name, type=self.types_to_proto(output_var.sym_type) ) for output_var in op.outputs ], ) def create_valuetype_dict(self, key_type: type, value_type: type) -> proto.MIL_pb2.ValueType: """ Return proto.MIL_pb2.ValueType with dict (dictionaryType) set """ v_type = proto.MIL_pb2.ValueType() v_type.dictionaryType.keyType.CopyFrom(self.types_to_proto(key_type)) v_type.dictionaryType.valueType.CopyFrom(self.types_to_proto(value_type)) return v_type def types_to_proto(self, valuetype: type) -> proto.MIL_pb2.ValueType: """ Return proto.MIL_pb2.ValueType from PyMIL types. """ if types.is_tensor(valuetype): primitive = types_to_proto_primitive(valuetype.get_primitive()) return create_valuetype_tensor(valuetype.get_shape(), primitive) elif types.is_tuple(valuetype): v_type = proto.MIL_pb2.ValueType() t_type = v_type.tupleType for t in valuetype.T: new_v_type = t_type.types.add() new_v_type.CopyFrom(self.types_to_proto(t)) return v_type elif types.is_list(valuetype): elem = valuetype.T[0] length = valuetype.T[1] if types.is_tensor(elem): dtype = types_to_proto_primitive(elem.get_primitive()) elem_shape = elem.get_shape() elif types.is_scalar(elem): dtype = types_to_proto_primitive(valuetype) elem_shape = () elif types.is_str(elem): dtype = types_to_proto_primitive(elem) elem_shape = () else: raise NotImplementedError( "Only list of either tensors or scalars supported. " "Got element of type {}".format(elem.__type_info__()) ) return create_valuetype_list(length=length, elem_shape=elem_shape, dtype=dtype) elif types.is_dict(valuetype): return self.create_valuetype_dict(valuetype.T[0], valuetype.T[1]) elif types.is_state(valuetype): wrapped_type = valuetype.wrapped_type() v_type = proto.MIL_pb2.ValueType() v_type.stateType.wrappedType.CopyFrom(self.types_to_proto(wrapped_type)) return v_type else: return create_valuetype_scalar(types_to_proto_primitive(valuetype)) def translate_coreml_update_state_op(self, op: Operation) -> List[proto.MIL_pb2.Operation]: """ ``coreml_update_state`` is decomposed into ``write_state`` and ``read_state``. 
""" def get_input_binding(param_name: str) -> proto.MIL_pb2.Argument: arguments = [proto.MIL_pb2.Argument.Binding(name=op.inputs[param_name].name)] args = proto.MIL_pb2.Argument() args.arguments.extend(arguments) return args res = [] # write_state write_state_attrs = {"name": create_scalar_value(op.name + "_write_state")} write_state_inputs = { "input": get_input_binding("state"), "data": get_input_binding("value"), } res.append( proto.MIL_pb2.Operation( type="write_state", inputs=write_state_inputs, attributes=write_state_attrs, ) ) # If the coreml_update_state is not feed into any ops or is not block outputs, # we don't need the read_state op if len(op.outputs[0].child_ops) == 0 and len(op.outputs[0].consuming_blocks) == 0: return res # read_state read_state_attrs = {"name": create_scalar_value(op.name)} read_state_inputs = { "input": get_input_binding("state"), } outputs = [ proto.MIL_pb2.NamedValueType(name=v.name, type=self.types_to_proto(v.sym_type)) for v in op.outputs ] res.append( proto.MIL_pb2.Operation( type="read_state", inputs=read_state_inputs, attributes=read_state_attrs, outputs=outputs, ) ) return res def translate_generic_op( self, op: Operation, literal_params: Optional[List[str]] = None ) -> proto.MIL_pb2.Operation: """ Translate a generic pymil Operation. """ if literal_params is None: literal_params = [] inputs = {} for param_name, vars in self._get_input_dict(op).items(): if param_name.startswith("_"): continue if not isinstance(vars, (list, tuple)): vars = [vars] arguments = [] for _var in vars: binding = proto.MIL_pb2.Argument.Binding() # use const value literals if requested if param_name in literal_params: binding.value.CopyFrom(create_immediate_value(_var)) else: binding.name = _var.name arguments.append(binding) args = proto.MIL_pb2.Argument() args.arguments.extend(arguments) inputs[param_name] = args outputs = [ proto.MIL_pb2.NamedValueType(name=v.name, type=self.types_to_proto(v.sym_type)) for v in op.outputs ] blocks = None if len(op.blocks) > 0: blocks = [self.create_block(b) for b in op.blocks] op_type = op.op_type attr_dict = self._get_attr_dict(op) if op.op_type in SSAOpRegistry.custom_ops: op_type = "custom_layer" class_name = op.bindings.get("class_name", op.name) input_order = op.bindings.get("input_order", []) parameters = op.bindings.get("parameters", []) weights = op.bindings.get("weights", []) description = op.bindings.get("description", "") attr_dict["class_name"] = create_scalar_value(class_name) attr_dict["input_order"] = create_list_scalarvalue(input_order, str) attr_dict["parameters"] = create_list_scalarvalue(parameters, str) attr_dict["weights"] = create_list_scalarvalue(weights, str) attr_dict["description"] = create_scalar_value(description) return proto.MIL_pb2.Operation( type=op_type, blocks=blocks, inputs=inputs, attributes=attr_dict, outputs=outputs, ) def create_block(self, block: Block) -> proto.MIL_pb2.Block: """ Translate pymil Block. """ def feeds_to_only_constexprs(op: Operation) -> bool: return ( (op.op_type == "const") and len(op.outputs[0].child_ops) > 0 and all( (child_op.op_type.startswith("constexpr_")) for child_op in op.outputs[0].child_ops ) ) proto_ops = [] # Find the const op that generates classify's "label" / "class" string vec. classify_const_classes_op = None if len(block.operations) > 0: # Classify is always the last operation in the block. 
op = block.operations[-1] op_cls_name = type(op).__name__ if op_cls_name == "classify": classes_var = op.inputs["classes"] classify_const_classes_op = classes_var.op if len(classes_var.child_ops) != 1: raise ValueError( "Classify's labels/classes should be input to only 1 op (classify)." ) for op in block.operations: op_cls_name = type(op).__name__ if op_cls_name == "const": if feeds_to_only_constexprs(op): continue # Do not serialize the const op that creates the var bound to the classifier's "classes" param. # The variable's value will be bound directly to classify's "classes" param instead. if op != classify_const_classes_op: proto_ops.append(self.translate_const(op)) elif op_cls_name.startswith("constexpr_"): proto_ops.append(self.translate_constexpr(op)) elif op_cls_name == "classify": # Classify's "classes" param should be serialized as a value literal bound # directly to the param, rather than as a const-generated variable. proto_ops.append(self.translate_generic_op(op, ["classes"])) elif op_cls_name == "reshape_like": # The reshape_like should also be able to take value from a const op # This is a workaround solution # rdar://98689808 (Reshape_like should also accept const value from non literal input) literal_params = ["begins", "ends", "end_masks"] proto_ops.append(self.translate_generic_op(op, literal_params)) elif op_cls_name == "coreml_update_state": proto_ops.extend(self.translate_coreml_update_state_op(op)) else: proto_ops.append(self.translate_generic_op(op)) inputs = [] if not isinstance(block, Function): # Function is subclass of Block, but function's block has no input, # and hence skipping reading the block inputs. for var in block.inputs: proto_type = self.types_to_proto(var.sym_type) inputs.append(proto.MIL_pb2.NamedValueType(name=var.name, type=proto_type)) output_names = [v.name for v in block.outputs] return proto.MIL_pb2.Block(inputs=inputs, outputs=output_names, operations=proto_ops) def convert_function(self, function: Function, opset: str) -> proto.MIL_pb2.Function: """ Translate pymil Function. """ block = self.create_block(function) inputs = [] for name, var in function.inputs.items(): proto_type = self.types_to_proto(var.sym_type) inputs.append(proto.MIL_pb2.NamedValueType(name=name, type=proto_type)) return proto.MIL_pb2.Function( inputs=inputs, opset=opset, block_specializations={opset: block} ) def export(self) -> proto.MIL_pb2.Program: """ Export a pymil program into mil proto with the given specification version. """ if BlobWriter is None: raise RuntimeError("BlobWriter not loaded") function_protos = {} for func_name, func in self.prog.functions.items(): function_protos[func_name] = self.convert_function( func, _OPSET[self.specification_version] ) kwargs = { "version": 1, "functions": function_protos, } prog_attributes = self.translate_program_attributes() if len(prog_attributes) > 0: kwargs["attributes"] = prog_attributes return proto.MIL_pb2.Program(**kwargs) # Add a classify op to the output. # Replaces the original probabilities output (in the containing MIL block) # with the outputs of the classifier op. Returns the name of the original # probabilities output variable. 
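# A rough sketch of the rewrite performed by _add_classify_op (illustrative only;
# variable names below are made up):
#
#   before:  ... -> probs                block outputs: [probs, ...]
#   after:   ... -> probs -> classify(probabilities=probs, classes=<label const>)
#            block outputs: [classLabel, classLabel_probs, ...]
#
# where "classLabel" comes from classifier_config.predicted_feature_name, or the
# default name when none is given.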
def _add_classify_op(prog, classifier_config): ''' Add a "classify" op to the program, at the end of the main block ''' def remove_output(block, prob_var): for i in range(len(block.outputs)): if block.outputs[i] is prob_var: block.outputs.pop(i) if block in prob_var.consuming_blocks: prob_var.consuming_blocks.remove(block) break block = prog.functions["main"] message = "Class labels must be a list of integers / strings or a file path" classes_in = classifier_config.class_labels if isinstance(classes_in, str): import os if not os.path.isfile(classes_in): raise ValueError("Path to class labels (%s) does not exist." % classes_in) with open(classes_in, "r") as f: classes = f.read() classes = classes.splitlines() elif isinstance(classes_in, list): # list[int or str] classes = classes_in assert all([isinstance(x, (int, str)) for x in classes]), message else: raise ValueError(message) probability_var = _get_probability_var_for_classifier(prog, classifier_config) original_probability_var = probability_var # add the classify op now # we consider this step as a scope of coremltools graph pass with mb.scope(ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["add_classify_op"])): with block: # cast the int label to np.int64 if isinstance(classes[0], int): classes = [np.int64(x) for x in classes] classes_var = mb.const(val=mil_list(classes)) if probability_var.dtype != types.fp32: remove_output(block, probability_var) probability_var = mb.cast( x=probability_var, dtype="fp32", name=probability_var.name + "_cast_to_fp32" ) out = mb.classify(probabilities=probability_var, classes=classes_var) predicted_feature_name = ( "classLabel" if classifier_config.predicted_feature_name is None else classifier_config.predicted_feature_name ) out[0].name = predicted_feature_name out[1].name = predicted_feature_name + "_probs" # Remove probabilities from block outputs, replace with classify's outputs remove_output(block, probability_var) block.outputs[:0] = out out[0].consuming_blocks.append(block) out[1].consuming_blocks.append(block) # The new classifier op should have scope information Block._copy_scope_info(original_probability_var, out[0]) return out[0].name, out[1].name class CoreMLProtoExporter: """ An utility class to export a pymil program to coreml model. """ def __init__( self, prog: mil.Program, mil_proto: proto.MIL_pb2.Program, predicted_feature_name: str, predicted_probabilities_name: str, classifier_config: ClassifierConfig, convert_to: str, convert_from: str, export_multi_functions: bool, ): self.prog = prog self.mil_proto = mil_proto self.predicted_feature_name = predicted_feature_name self.predicted_probabilities_name = predicted_probabilities_name self.classifier_config = classifier_config self.convert_to = convert_to self.convert_from = convert_from self.export_multi_functions = export_multi_functions self.prog.validate(check_essential_scope=True) @staticmethod def _decouple_state_and_input( input_features: List[proto.Model_pb2.FeatureDescription], ) -> Tuple[List[proto.Model_pb2.FeatureDescription], List[proto.Model_pb2.FeatureDescription]]: """ Utils seperates state input from non-state input features. 
""" state_features = [] non_state_input_features = [] for input in input_features: if input.type.WhichOneof("Type") == "stateType": state_features.append(input) else: non_state_input_features.append(input) return state_features, non_state_input_features def get_func_input(self, func: mil.Function) -> List[proto.Model_pb2.FeatureDescription]: """ Utils to get function input feature description. """ input_types = func.input_types input_features = [] image_input_names = {} # these are the model inputs marked as image by the user input_shape_map = {} for input_type in input_types: if isinstance(input_type, ImageType): image_input_names[input_type.name] = input_type # error checking for input(s) marked as images if input_type.name not in list(func.inputs.keys()): raise ValueError( f"Provided image input '{input_type.name}' is not one of the inputs of the MIL program" ) if input_type.name is None: raise ValueError( 'Fail to auto-determine the input name. Please specify the "name" ' 'parameter when use "inputs" in ct.convert().' ) input_shape_map[input_type.name] = input_type for name, var in func.inputs.items(): input_feature_type = proto.FeatureTypes_pb2.FeatureType() is_input_shape_symbolic = False # error checking for input(s) marked as images # an image input must be of type tensor in program proto # (since an image type does not exist in MIL program) if name in image_input_names and not types.is_tensor(var.sym_type): raise ValueError( "For the image input, '{}', its type in the MIL program must be tensor. " "Instead it is {}.".format(name, var.sym_type.__type_info__()) ) if types.is_tensor(var.sym_type): shape = var.sym_type.get_shape() if any_variadic(shape): raise ValueError("Variable rank model inputs are not supported!") if any_symbolic(shape): is_input_shape_symbolic = True # We extract the default input shape given by user first if name in input_shape_map: shape = input_shape_map[name].shape.default else: logger.warning( "Input shape not fully specified by enumerated shapes or range dim! 1 will be used for dimension not specified instead." ) # If no input shape is provided (ex. 
auto conversion of -1 in Tensorflow) shape = [1 if is_symbolic(d) else d for d in shape] if name not in image_input_names: # make a feature type of Type "multiArrayType" array_type = proto.FeatureTypes_pb2.ArrayFeatureType( shape=shape, dataType=cast_to_framework_io_dtype(var, False) ) input_feature_type.multiArrayType.CopyFrom(array_type) else: # make a feature type of Type "imageType" input_type = image_input_names[name] _validate_image_input_output_shapes( input_type.color_layout, shape, name, is_input=True ) if not input_type.channel_first: raise ValueError( "Image input, '{}', must be in the channel_first format".format(name) ) clr_space = _get_colorspace_enum(input_type.color_layout) image_type = proto.FeatureTypes_pb2.ImageFeatureType( width=shape[-1], height=shape[-2], colorSpace=clr_space ) input_feature_type.imageType.CopyFrom(image_type) input_features.append( proto.Model_pb2.FeatureDescription(name=name, type=input_feature_type) ) elif types.is_scalar(var.sym_type): array_type = proto.FeatureTypes_pb2.ArrayFeatureType( shape=[1], dataType=cast_to_framework_io_dtype(var, False) ) input_feature_type.multiArrayType.CopyFrom(array_type) input_features.append( proto.Model_pb2.FeatureDescription(name=var.name, type=input_feature_type) ) elif types.is_state(var.sym_type): # shape for state input cannot be symbolic shape = var.sym_type.wrapped_type().get_shape() if any_variadic(shape): raise ValueError("Variable rank model states are not supported!") if any_symbolic(shape): raise ValueError("Flexible shape model states are not supported!") # Core ML only support fp16 for state if not var.dtype == types.fp16: raise ValueError( f"State only support fp16 dtype. Got input var {var.name} with dtype {types.builtin_to_string(var.dtype)}." ) # create the input feature type array_type = proto.FeatureTypes_pb2.ArrayFeatureType( shape=shape, dataType=cast_to_framework_io_dtype(var, False) ) state_feature_type = proto.FeatureTypes_pb2.StateFeatureType() state_feature_type.arrayType.CopyFrom(array_type) input_feature_type = proto.FeatureTypes_pb2.FeatureType() input_feature_type.stateType.CopyFrom(state_feature_type) # append feature to the input features list input_features.append( proto.Model_pb2.FeatureDescription(name=var.name, type=input_feature_type) ) else: raise NotImplementedError(f"Unsupported input type {var.sym_type}.") if not is_input_shape_symbolic: continue # Set symbolic shapes default_lower_bound = 1 default_upper_bound = default_lower_bound + 1 if self.convert_to == "mlprogram" else -1 default_bound_used = False input_type = input_shape_map.get(name, None) if isinstance(input_type, ImageType): if isinstance(input_type.shape, EnumeratedShapes): enumerated_shapes = [] for s in input_type.shape.shapes: enumerated_shapes.append( NeuralNetworkImageSize(height=s.shape[-2], width=s.shape[-1]) ) flexible_shape_utils._add_enumerated_image_sizes_for_feature( input_features[-1], sizes=enumerated_shapes ) else: img_range = NeuralNetworkImageSizeRange() H = input_type.shape.shape[-2] W = input_type.shape.shape[-1] if isinstance(H, RangeDim): img_range.add_height_range((H.lower_bound, H.upper_bound)) elif is_symbolic(H): img_range.add_height_range((default_lower_bound, default_upper_bound)) default_bound_used = True else: img_range.add_height_range((H, H)) if isinstance(W, RangeDim): img_range.add_width_range((W.lower_bound, W.upper_bound)) elif is_symbolic(W): img_range.add_width_range((default_lower_bound, default_upper_bound)) default_bound_used = True else: img_range.add_width_range((W, W)) 
flexible_shape_utils._update_image_size_range_for_feature( input_features[-1], img_range ) elif isinstance(input_type, TensorType): if isinstance(input_type.shape, EnumeratedShapes): flexible_shape_utils._add_multiarray_ndshape_enumeration_for_feature( input_features[-1], [tuple(s.shape) for s in input_type.shape.shapes] ) else: lb = [] ub = [] for s in input_type.shape.shape: if isinstance(s, RangeDim): lb.append(s.lower_bound) ub.append(s.upper_bound) elif is_symbolic(s): lb.append(default_lower_bound) ub.append(default_upper_bound) default_bound_used = True else: lb.append(s) ub.append(s) flexible_shape_utils._set_multiarray_ndshape_range_for_feature( input_features[-1], lower_bounds=lb, upper_bounds=ub ) elif input_type is None: sym_type = func.inputs[name].sym_type lb = [] ub = [] for s in sym_type.get_shape(): if is_symbolic(s): lb.append(default_lower_bound) ub.append(default_upper_bound) default_bound_used = True else: lb.append(s) ub.append(s) flexible_shape_utils._set_multiarray_ndshape_range_for_feature( input_features[-1], lower_bounds=lb, upper_bounds=ub ) if default_bound_used and self.convert_to == "mlprogram": warnings.warn( "Some dimensions in the input shape are unknown, hence they are set to flexible ranges " f"with lower bound and default value = {default_lower_bound}, and upper bound = " f"{default_upper_bound}. To set different values for the default shape and upper bound, " "please use the ct.RangeDim() method as described here: " "https://coremltools.readme.io/docs/flexible-inputs#set-the-range-for-each-dimension.", UserWarning, ) convert_from = self.convert_from if convert_from is not None and convert_from.startswith("tensorflow"): warnings.warn( 'There is "None" dim in TF input placeholder. Please consider specifying ' 'input shapes by using the "inputs" param in ct.convert().' ) return input_features def get_func_output(self, func: mil.Function) -> List[proto.Model_pb2.FeatureDescription]: """ Utils to get function output feature description. 
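# ---------------------------------------------------------------------------
# A minimal sketch (assumptions noted) of the user-facing API that exercises
# the symbolic-shape handling in get_func_input above. The toy PyTorch model
# is a placeholder; RangeDim / EnumeratedShapes are the public input types.
import torch

import coremltools as ct


class _Scale(torch.nn.Module):
    def forward(self, x):
        return x * 2.0


_traced = torch.jit.trace(_Scale().eval(), torch.rand(1, 3, 64, 64))

# RangeDim supplies the lower/upper bounds written into the multiArrayType
# (or imageType) shape range; an EnumeratedShapes input would instead be
# emitted as an enumerated list of allowed shapes.
_flexible_input = ct.TensorType(
    name="x",
    shape=(
        1,
        3,
        ct.RangeDim(lower_bound=32, upper_bound=128, default=64),
        ct.RangeDim(lower_bound=32, upper_bound=128, default=64),
    ),
)

_mlmodel = ct.convert(_traced, inputs=[_flexible_input], convert_to="mlprogram")
# ---------------------------------------------------------------------------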
""" output_types = func.output_types output_features = [] if output_types is not None and self.classifier_config is None: assert len(output_types) == len( func.outputs ), "number of mil program outputs do not match the number of outputs provided by the user" for i, var in enumerate(func.outputs): output_feature_type = proto.FeatureTypes_pb2.FeatureType() if types.is_tensor(var.sym_type) or types.is_primitive(var.sym_type): if output_types is not None and isinstance(output_types[i], ImageType): if not types.is_tensor(var.sym_type): raise ValueError( "Image output, '{}', is a scalar, but it should be a tensor of rank 4".format( var.name ) ) clr_space = _get_colorspace_enum(output_types[i].color_layout) shape = var.sym_type.get_shape() if any_variadic(shape): raise ValueError( "Variable rank model outputs, that are ImageTypes, are not supported" ) if any_symbolic(shape): # For flexible shape output, we set the imageSizeRange to [1, -1], # util this radar is fixed in CoreML: rdar://122895892 ([Bug] CoreML produce empty dictionary with image output with dynamic shape) image_type = proto.FeatureTypes_pb2.ImageFeatureType( width=1, height=1, colorSpace=clr_space ) image_type.imageSizeRange.widthRange.lowerBound = 1 image_type.imageSizeRange.widthRange.upperBound = -1 image_type.imageSizeRange.heightRange.lowerBound = 1 image_type.imageSizeRange.heightRange.upperBound = -1 else: image_type = proto.FeatureTypes_pb2.ImageFeatureType( width=shape[-1], height=shape[-2], colorSpace=clr_space ) _validate_image_input_output_shapes( output_types[i].color_layout, shape, var.name, is_input=False ) output_feature_type.imageType.CopyFrom(image_type) output_features.append( proto.Model_pb2.FeatureDescription(name=var.name, type=output_feature_type) ) else: dataType = None if self.classifier_config is None or var.name != self.predicted_feature_name: # Not a classifier output, make sure model output type matches with ML Program type. dataType = cast_to_framework_io_dtype(var, True) else: # Classifier outputs are set up separately, so default to fp32 for now. dataType = proto.FeatureTypes_pb2.ArrayFeatureType.ArrayDataType.FLOAT32 output_shape = ( None if any_symbolic(var.shape) or types.is_primitive(var.sym_type) else var.shape ) array_type = proto.FeatureTypes_pb2.ArrayFeatureType( shape=output_shape, dataType=dataType ) output_feature_type.multiArrayType.CopyFrom(array_type) output_features.append( proto.Model_pb2.FeatureDescription(name=var.name, type=output_feature_type) ) elif types.is_dict(var.sym_type): output_feature_type.dictionaryType.MergeFromString(b"") keytype, valtype = var.sym_type.T if types.is_str(keytype): output_feature_type.dictionaryType.stringKeyType.MergeFromString(b"") elif keytype == types.int64: output_feature_type.dictionaryType.int64KeyType.MergeFromString(b"") else: raise ValueError("Dictionary key type not supported.") output_features.append( proto.Model_pb2.FeatureDescription(name=var.name, type=output_feature_type) ) else: raise NotImplementedError(f"Unsupported output type {var.sym_type}.") return output_features def get_coreml_model( self, input: Dict[str, List[proto.Model_pb2.FeatureDescription]], output: Dict[str, List[proto.Model_pb2.FeatureDescription]], specification_version: int, ) -> proto.Model_pb2.Model: """ Utils to get a coreml model description. For the multifunction export, we utilize the FunctionDescription proto message. 
""" if self.export_multi_functions: # For multifunction export, we use the FunctionDescription if specification_version < _SPECIFICATION_VERSION_IOS_18: raise ValueError( "minimum_deployment_target for multi-functions export should be iOS18+." ) if self.classifier_config is not None: # TODO: This should be fixed in rdar://123660416 ([New Feature][Multi-functions] Enable classifier for multi-functions CoreML model) raise NotImplementedError("classifier model not supported in multi-functions export.") function_desc = [] for func_name in input.keys(): state_features, non_state_input_features = self._decouple_state_and_input( input[func_name] ) desc = proto.Model_pb2.FunctionDescription( name=func_name, input=non_state_input_features, output=output[func_name], state=state_features, ) function_desc.append(desc) desc = proto.Model_pb2.ModelDescription( functions=function_desc, defaultFunctionName=self.prog.default_function_name, ) else: # single function export input_features = input[self.prog.default_function_name] output_features = output[self.prog.default_function_name] state_features, non_state_input_features = self._decouple_state_and_input( input_features ) desc = proto.Model_pb2.ModelDescription( input=non_state_input_features, output=output_features, state=state_features, ) if self.classifier_config is not None: desc.predictedFeatureName = self.predicted_feature_name desc.predictedProbabilitiesName = self.predicted_probabilities_name # Manually edit output type of predictedFeatureName. # It doesn't use MLMultiArray and really uses a "primitive" type. for output in desc.output: if output.name == self.predicted_feature_name: if type(self.classifier_config.class_labels[0]) == int: output.type.int64Type.MergeFromString(b"") else: output.type.stringType.MergeFromString(b"") break # Create ML Model model = proto.Model_pb2.Model(description=desc, specificationVersion=specification_version) model.mlProgram.CopyFrom(self.mil_proto) return model def export( self, specification_version: Optional[int] = _SPECIFICATION_VERSION_IOS_15 ) -> proto.Model_pb2.Model: # get functions input / output description func_to_input = OrderedDict() func_to_output = OrderedDict() for name, func in self.prog.functions.items(): func_to_input[name] = self.get_func_input(func) func_to_output[name] = self.get_func_output(func) # create a coreml model with I/O description and mil proto model = self.get_coreml_model( func_to_input, func_to_output, specification_version, ) # Set optional inputs for main function if "main" in self.prog.functions: _set_optional_inputs(model, self.prog.functions["main"].input_types) return model def load( prog: Program, weights_dir: str, resume_on_errors: Optional[bool] = False, specification_version: Optional[int] = _SPECIFICATION_VERSION_IOS_15, **kwargs, ) -> proto.Model_pb2.Model: if prog.default_function_name not in prog.functions: raise ValueError(f"Default function {prog.default_function_name} not found in program") # if user has specified "ClassifierConfig", then add the "classify" op to the prog classifier_config = kwargs.get("classifier_config", None) predicted_feature_name, predicted_probabilities_name = None, None if classifier_config is not None: predicted_feature_name, predicted_probabilities_name = _add_classify_op( prog, classifier_config ) # convert pymil program into mil proto kwargs["prog"] = prog kwargs["weights_dir"] = weights_dir kwargs["specification_version"] = specification_version exporter_kwargs = MILProtoExporter._get_valid_kwargs(kwargs) mil_proto_exporter = 
MILProtoExporter( **exporter_kwargs, ) mil_proto = mil_proto_exporter.export() # return the model provided by users desc = kwargs.get("model_description", None) if desc and not isinstance(desc, proto.Model_pb2.ModelDescription): raise ValueError("Invalid model descriptor") if desc: if classifier_config is not None: raise AssertionError("Both model_description and classifier_config can't be provided") model = proto.Model_pb2.Model(description=desc, specificationVersion=specification_version) model.mlProgram.CopyFrom(mil_proto) return model # create a CoreML model protobuf coreml_proto_exporter = CoreMLProtoExporter( prog, mil_proto, predicted_feature_name, predicted_probabilities_name, classifier_config=kwargs.get("classifier_config", None), convert_to=kwargs.get("convert_to", None), convert_from=kwargs.get("convert_from", None), export_multi_functions=kwargs.get("export_multi_functions", False), ) return coreml_proto_exporter.export(specification_version) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2095466 coremltools-8.0/coremltools/converters/mil/backend/mil/passes/0000755000000000000000000000000014672075535023462 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/mil/passes/__init__.py0000644000000000000000000000054314672066616025575 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from . import (adjust_io_to_supported_types, fuse_activation_silu, insert_image_preprocessing_op, sanitize_name_strings) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/mil/passes/adjust_io_to_supported_types.py0000644000000000000000000002462014672066616032054 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import Set from coremltools import _logger as logger from coremltools.converters.mil._deployment_compatibility import AvailableTarget as target from coremltools.converters.mil.mil import Block from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types as types from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass # TODO: rdar://122845072 ([Infra] Refactor the transform_function_signatures, adjust_io_to_supported_types and update_output_dtypes using a shared graph pass) @register_pass(namespace="mil_backend") class adjust_io_to_supported_types(AbstractGraphPass): """ Converts all dtypes to types that are supported by the Core ML runtime. The runtime supports fp16, fp32, int16, uint16, int32, str, and bool variables. General rules: * Integer vars with unsupported types are replaced with int32 types. * All other types not in the list of runtime supported types are replaced with the fp32 dtype. No casts are inserted; the previous type is replaced. The assumption is that all remaining types are numerical and can be reasonably replaced with 32 bit float types. 
The "main" function has additional rules since its I/O is mapped to Core ML model I/O: * if function.opset_version < coremltools.target.iOS16, then: * Fp16 I/O is replaced with fp32 I/O. Casts (fp32 input -> fp16) are inserted at the beginning of the program to preserve 16 bit inputs. Casts (fp16 -> fp32 output) are inserted at the end of the program to preserve 16 bit computations. * All non-integer I/O that is not fp32 is replaced with fp32 I/O. A cast (prev input type -> fp32) is inserted at the beginning of the program to preserve non-fp32 inputs. A cast (prev type -> fp32 out) is inserted at the end of the program to preserve non-fp32 computations. The assumption is that all remaining types are numerical and it is valid to cast them to/from fp32. * The only exception: Int64 outputs are allowed for the classifier op. This is to keep consistency with the Core ML API, which uses 64 bit integers to represent classifier labels. ------ func main(bool x, int32 y, fp32 z) { bool out = logical_not(x) } -> (out, y, z) becomes func main(fp32 x, int32 y, fp32 z) { bool x_casted = cast(x) bool out__pre__output__fp32__cast = logical_not(x_casted) fp32 out = cast(out__pre__output__fp32__cast) } -> (out, y, z) ------ func not_main(bool x, int32 y, fp32 z) { bool out = logical_not(x) } -> (out, y, z) is unchanged. """ def apply(self, prog): for name, func in prog.functions.items(): is_main_funtion = name == "main" _adjust_io_to_supported_types(func, is_main_funtion) def _adjust_var_dtype_helper(var, dtype): if types.is_scalar(var.sym_type): var._sym_type = dtype else: var._sym_type = types.tensor(dtype, var.sym_type.get_shape()) def _get_io_supported_types(opset_version: target) -> Set[type]: """Get Core ML I/O supported data types based on opset version.""" supported_types = {types.fp32, types.int32} if opset_version is not None and opset_version >= target.iOS16: supported_types.add(types.fp16) return supported_types def _get_runtime_supported_types(opset_version: target) -> Set[type]: """Get Core ML Runtime supported data types based on opset version.""" supported_types = {types.fp16, types.fp32, types.int32, types.str, types.bool} if opset_version >= target.iOS17: supported_types.update({types.int8, types.uint8, types.int16, types.uint16}) return supported_types @block_context_manager def _adjust_main_inputs(func): """ Adjust the inputs in main func. If the input's dtype is not in Core ML I/O supported types, we do following steps: 1. Change the input's dtype to int32 or fp32 based on original dtype. 2. If the original dtype is supported in Core ML Runtime, we insert a cast op to cast the input from the changed dtype to the original dtype. """ _IO_SUPPORTED_TYPES = _get_io_supported_types(func.opset_version) _RUNTIME_SUPPORTED_TYPES = _get_runtime_supported_types(func.opset_version) for input_name, input_var in func.inputs.items(): if ( types.is_tensor(input_var.sym_type) or types.is_scalar(input_var.sym_type) ) and input_var.dtype not in _IO_SUPPORTED_TYPES: input_dtype_str = types.builtin_to_string(input_var.dtype) convert_to_dtype = types.int32 if types.is_int(input_var.dtype) else types.fp32 convert_to_dtype_str = types.builtin_to_string(convert_to_dtype) should_insert_cast = input_var.dtype in _RUNTIME_SUPPORTED_TYPES _adjust_var_dtype_helper(input_var, convert_to_dtype) logger.warning( f"\nInput '{input_var.name}' is of dtype {input_dtype_str}. The Core ML I/O does " f"not support this dtype (supported dtypes are: {_IO_SUPPORTED_TYPES}). 
Consider " f"setting `minimum_deployment_target` to a higher IOS version for more supported " f"dtypes. This input is changed to {convert_to_dtype_str}.\n" ) if not should_insert_cast: logger.warning( f"The original input dtype {input_dtype_str} is not supported in " f"Core ML Runtime (supported dtypes are: {_RUNTIME_SUPPORTED_TYPES}). Consider " f"setting `minimum_deployment_target` to a higher IOS version for more " f"supported dtypes. We just changed the dtype and won't insert any cast op." ) continue logger.warning( f"Trying to insert a cast op at the beginning of the program to convert " f"the input to the originally defined dtype ({input_dtype_str}).\n" ) try: first_op = func.operations[0] if len(func.operations) > 0 else None casted_input_var = mb.cast(x=input_var, dtype=input_dtype_str, before_op=first_op) # Use force replace as the `input_var.dtype` could be not subtype of the # `convert_to_dtype`. For example, int16 cast to int32. As it's only for input # dtype cast, this replace should be safe. func.replace_uses_of_var_after_op( anchor_op=casted_input_var.op, old_var=input_var, new_var=casted_input_var, force_replace=True, no_check_var_types=True, ) except Exception as e: logger.warning( f"Failed to insert the cast op.\n{e}\nThe dtype of the input " f"'{input_var.name}' is changed to {convert_to_dtype_str} without " f"inserting any cast op." ) @block_context_manager def _adjust_main_outputs(func): """Adjust the outputs in the main func to make sure they have Core ML I/O supported types.""" _IO_SUPPORTED_TYPES = _get_io_supported_types(func.opset_version) new_outputs = [] for output_var in func.outputs: output_type = output_var.sym_type # classify outputs contains type int64 output variables, which should not be casted. if ( (types.is_tensor(output_type) or types.is_scalar(output_type)) and output_var.dtype not in _IO_SUPPORTED_TYPES and output_var.op.op_type != "classify" ): output_dtype_str = types.builtin_to_string(output_var.dtype) target_dtype = "int32" if types.is_int(output_var.dtype) else "fp32" logger.warning( f"\nOutput '{output_var.name}' is of dtype {output_dtype_str}. The " f"Core ML runtime does not support outputs with this dtype (supported " f"dtypes are: {_IO_SUPPORTED_TYPES}). This output will changed to " f"{target_dtype} by adding a cast op at the end of the program.\n" ) if output_var.dtype == types.fp16: logger.warning( "fp16 dtype output is supported if function.opset_version is chosen to be at " "least iOS16/macOS13.\n" ) output_var_name = output_var.name output_var.set_name(f"{output_var_name}__pre__output__{target_dtype}__cast") old_output_var = output_var output_var = mb.cast(x=output_var, dtype=target_dtype) output_var.set_name(output_var_name) Block._copy_scope_info(old_output_var, output_var) new_outputs.append(output_var) func.set_outputs(new_outputs) def _adjust_func_inputs(func): """ Changes the dtype of the provided variable according to the rules outlined in the top level pass comment (see adjust_io_to_supported_types). 
""" _RUNTIME_SUPPORTED_TYPES = _get_runtime_supported_types(func.opset_version) for input_name, input_var in func.inputs.items(): if ( types.is_tensor(input_var.sym_type) or types.is_scalar(input_var.sym_type) ) and input_var.dtype not in _RUNTIME_SUPPORTED_TYPES: dtype_str = types.builtin_to_string(input_var.dtype) convert_to_dtype = types.int32 if types.is_int(input_var.dtype) else types.fp32 convert_to_dtype_str = types.builtin_to_string(convert_to_dtype) _adjust_var_dtype_helper(input_var, convert_to_dtype) logger.warning( f"Input '{input_var.name}' is of dtype {dtype_str}, which is not" f"supported by the Core ML runtime. This input will be changed to " f"{convert_to_dtype_str}. No cast will be inserted." ) def _adjust_io_to_supported_types(func, is_main): if is_main: _adjust_main_inputs(func) _adjust_main_outputs(func) else: _adjust_func_inputs(func) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/mil/passes/fuse_activation_silu.py0000644000000000000000000000521614672066616030257 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass def _match_pattern(op): if op.op_type == "sigmoid": # abort fusion if op output is also a block output if op.outputs[0] in op.enclosing_block.outputs: return None # find following op child_ops = op.outputs[0].child_ops if len(child_ops) == 1: mul_op_candidate = list(child_ops)[0] if mul_op_candidate.op_type != "mul": return None mul_inputs_actual = {mul_op_candidate.x.name, mul_op_candidate.y.name} mul_inputs_expect = {op.x.name, op.outputs[0].name} if mul_inputs_actual != mul_inputs_expect: return None return mul_op_candidate return None def _try_to_transform(sigmoid_op, mul_op, block): out_name = mul_op.outputs[0].name # create a new silu op x = mb.silu(x=sigmoid_op.x, name=out_name, before_op=sigmoid_op) mul_op.enclosing_block.replace_uses_of_var_after_op( anchor_op=mul_op, old_var=mul_op.outputs[0], new_var=x ) # Remove all the ops at once block.remove_ops([sigmoid_op, mul_op]) return True @block_context_manager def _fuse_activation_silu_block(block): fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: block_changed = True while block_changed: block_changed = _fuse_activation_silu_block(b) if len(op.blocks) > 0: continue mul_op = _match_pattern(op) if mul_op is not None: if _try_to_transform(op, mul_op, block): fusion_occurred = True return fusion_occurred @register_pass(namespace="mil_backend") class fuse_activation_silu(AbstractGraphPass): """ Fold x * sigmoid(x) into silu(x) Given: %1 = sigmoid(x=%0) %2 = mul(x=%0, y=%1) or mul(x=%1, y=%0) ... Result: %3 = silu(%0) ... 
""" def apply(self, prog): for f in prog.functions.values(): block_changed = True while block_changed: block_changed = _fuse_activation_silu_block(f) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/mil/passes/fuse_pow2_sqrt.py0000644000000000000000000000567114672066616027027 0ustar00rootroot# Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass def _match_pattern(op): pow_op, sqrt_op = None, None # check the current op is pow(2) or sqrt if op.op_type == "pow" and op.y.val == 2: pow_op = op if op.op_type == "sqrt": sqrt_op = op # check the children of the current op child_ops = op.outputs[0].child_ops # if the op output is a block output or there is more than one child, fast fail if op.outputs[0] in op.enclosing_block.outputs or len(child_ops) != 1: return None # if we have pow(2), check for sqrt if pow_op and child_ops[0].op_type == "sqrt": sqrt_op = child_ops[0] # if we have sqrt, check for pow(2) elif sqrt_op and child_ops[0].op_type == "pow" and child_ops[0].y.val == 2: pow_op = child_ops[0] # if we don't have both ops, fast fail if not pow_op or not sqrt_op: return None # check that the two ops are connected if pow_op.outputs[0].name != sqrt_op.x.name and sqrt_op.outputs[0].name != pow_op.x.name: return None # return the other op return pow_op if pow_op != op else sqrt_op def _try_to_transform(op1, op2, block): # replace the pow2(x) --> sqrt(x) with identity(x) x = mb.identity(x=op1.x, name= op2.outputs[0].name, before_op=op1) # update the graph op2.enclosing_block.replace_uses_of_var_after_op( anchor_op=op2, old_var=op2.outputs[0], new_var=x ) # remove the ops block.remove_ops([op1, op2]) return True @block_context_manager def _fuse_pow2_sqrt(block): fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: block_changed = True while block_changed: block_changed = _fuse_pow2_sqrt(b) if len(op.blocks) > 0: continue op2 = _match_pattern(op) if op2 is not None: if _try_to_transform(op, op2, block): fusion_occurred = True return fusion_occurred @register_pass(namespace="mil_backend") class fuse_pow2_sqrt(AbstractGraphPass): """ Fold pow(x, 2) --> sqrt(x) into identity(x) Given: %1 = pow(x=%0, y=2) %2 = sqrt(x=%1) ... %1 = sqrt(x=%0) %2 = pow(x=%1, y=2) ... Result: %3 = identity(%0) ... """ def apply(self, prog): for f in prog.functions.values(): block_changed = True while block_changed: block_changed = _fuse_pow2_sqrt(f) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/mil/passes/insert_image_preprocessing_op.py0000644000000000000000000000656714672066616032161 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.input_types import ColorLayout, ImageType from coremltools.converters.mil.mil import Builder as mb # import mil internal ops to add it to the builder from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass from coremltools.converters.mil.mil.types import nptype_from_builtin @register_pass(namespace="mil_backend") class insert_image_preprocessing_ops(AbstractGraphPass): """ Insert preprocessing ops, right after the input if its of type Image """ def apply(self, prog): for f_name, f in prog.functions.items(): if f_name == 'main': _insert_image_preprocessing_ops(f, prog) @block_context_manager def _insert_image_preprocessing_ops(block, prog): input_types = list(prog.functions["main"].input_types) for input_type in input_types: if isinstance(input_type, ImageType): if input_type.name not in block.inputs: continue input_var = block.inputs[input_type.name] placeholder_op = block.placeholder_inputs[input_type.name] first_op = block.operations[0] old_var = placeholder_op.outputs[0] has_bias = np.any(np.array(input_type.bias) != 0) last_output = input_var input_nptype = nptype_from_builtin(type(last_output.dtype())) if input_type.scale != 1: last_output = mb.mul(x=last_output, y=np.array(input_type.scale, dtype=input_nptype), before_op=first_op, name=input_var.name + "__scaled__") if has_bias: if input_type.color_layout in (ColorLayout.GRAYSCALE, ColorLayout.GRAYSCALE_FLOAT16): last_output = mb.add(x=last_output, y=np.array(input_type.bias, dtype=input_nptype), before_op=first_op, name=input_var.name + "__biased__") else: if len(last_output.shape) == 3: last_output = mb.add(x=last_output, y=np.array(input_type.bias, dtype=input_nptype).reshape([3, 1, 1]), before_op=first_op, name=input_var.name + "__biased__") elif len(last_output.shape) == 4: last_output = mb.add(x=last_output, y=np.array(input_type.bias, dtype=input_nptype).reshape([1, 3, 1, 1]), before_op=first_op, name=input_var.name + "__biased__") else: raise TypeError("Unsupported rank for image input type.") if last_output != input_var: block.replace_uses_of_var_after_op(anchor_op=last_output.op, old_var=old_var, new_var=last_output) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/mil/passes/sanitize_name_strings.py0000644000000000000000000000237014672066616030435 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil.passes.defs.preprocess import NameSanitizer from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.pass_registry import register_pass @register_pass(namespace="mil_backend") class sanitize_name_strings(AbstractGraphPass): """ Sanitize the names of vars and ops to make sure that they are of the format as described in the NameSanitizer class, i.e. 
of the format [a-zA-Z_][a-zA-Z0-9_]* """ def apply(self, prog): for f in prog.functions.values(): sanitizer_vars = NameSanitizer(prefix="var_") sanitizer_ops = NameSanitizer(prefix="op_") # TODO: rdar://126498947 ([Infra] Investigate the name sanitizer on multifunction model) if "main" in prog.functions: NameSanitizer.sanitize_block( f, sanitizer_vars, sanitizer_ops, prog.functions["main"].input_types, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/mil/passes/test_passes.py0000644000000000000000000011503514672066616026376 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy import itertools import numpy as np import pytest import coremltools as ct from coremltools.converters.mil._deployment_compatibility import AvailableTarget as target from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import mil_list, types from coremltools.converters.mil.mil.passes.pass_registry import PASS_REGISTRY from coremltools.converters.mil.testing_utils import ( apply_pass_and_basic_check, assert_model_is_valid, get_op_types_in_program, ) class TestAdjustToSupportedTypes: def test_basic(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 1, 1, 1), dtype=types.bool), mb.TensorSpec(shape=(1, 1, 1, 1), dtype=types.int32), mb.TensorSpec(shape=(1, 1, 1, 1), dtype=types.fp32)]) def prog(x, y, z): out = mb.logical_not(x=x) return (out, y, z) prog.functions['not_main'] = copy.deepcopy(prog.functions['main']) prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "mil_backend::adjust_io_to_supported_types", skip_output_type_check=True, skip_input_type_check=True, ) # output dtype is modified """ Input graph: func main(bool x, int32 y, fp32 z) { bool out = logical_not(x) } -> (out, y, z) becomes func main(fp32 x, int32 y, fp32 z) { bool x_casted = cast(x) bool out__pre__output__fp32__cast = logical_not(x_casted) fp32 out = cast(out__pre__output__fp32__cast) } -> (out, y, z) """ assert get_op_types_in_program(prev_prog) == ['logical_not'] assert get_op_types_in_program(prog) == ['cast', 'logical_not', 'cast'] prev_inputs = list(prev_prog.functions['main'].inputs.items()) inputs = list(prog.functions['main'].inputs.items()) assert prev_inputs[0][1].name == inputs[0][1].name assert inputs[0][1].dtype == types.fp32 for i in range(1, len(inputs)): assert prev_inputs[i][1].name == inputs[i][1].name assert prev_inputs[i][1].dtype == inputs[i][1].dtype prev_outputs = prev_prog.functions['main'].outputs outputs = prog.functions['main'].outputs assert prev_outputs[0].name == outputs[0].name assert outputs[0].dtype == types.fp32 for i in range(1, len(outputs)): assert prev_outputs[i].name == outputs[i].name assert prev_outputs[i].dtype == outputs[i].dtype """ Input graph: func not_main(bool x, int32 y, fp32 z) { bool out = logical_not(x) } -> (out, y, z) is identical after the pass. 
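# ---------------------------------------------------------------------------
# A hedged, user-level sketch of the behavior verified by the float16 I/O
# tests below: with an iOS16+ deployment target, fp16 model I/O is preserved;
# for earlier targets the pass rewrites the I/O to fp32 and inserts casts.
# Converting a MIL program directly like this is an assumption made purely
# for illustration.
import coremltools as ct
from coremltools.converters.mil.mil import Builder as mb
from coremltools.converters.mil.mil import types


@mb.program(
    input_specs=[mb.TensorSpec(shape=(1, 4), dtype=types.fp16)],
    opset_version=ct.target.iOS16,
)
def _fp16_prog(x):
    return mb.relu(x=x)


_mlmodel = ct.convert(
    _fp16_prog,
    convert_to="mlprogram",
    minimum_deployment_target=ct.target.iOS16,
)
# ---------------------------------------------------------------------------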
""" assert get_op_types_in_program(prev_prog, 'not_main') == ['logical_not'] assert get_op_types_in_program(prog, 'not_main') == ['logical_not'] prev_inputs = list(prev_prog.functions['not_main'].inputs.items()) inputs = list(prog.functions['not_main'].inputs.items()) for i in range(0, len(inputs)): assert prev_inputs[i][1].name == inputs[i][1].name assert prev_inputs[i][1].dtype == inputs[i][1].dtype prev_outputs = prev_prog.functions['not_main'].outputs outputs = prog.functions['not_main'].outputs for i in range(0, len(outputs)): assert prev_outputs[i].name == outputs[i].name assert prev_outputs[i].dtype == outputs[i].dtype def test_int64_input(self): """ Input graph: func main(int64 x) { } -> (x) becomes func main(int32 x) { } -> (x) """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 1, 1, 1), dtype=types.int64)]) def prog(x): return x prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "mil_backend::adjust_io_to_supported_types", skip_output_type_check=True, skip_input_type_check=True, ) # output dtype is modified prev_inputs = list(prev_prog.functions['main'].inputs.items()) inputs = list(prog.functions['main'].inputs.items()) assert prev_inputs[0][1].name == inputs[0][1].name assert inputs[0][1].dtype == types.int32 def test_float64_input(self): """ Input graph: func main(float64 x) { } -> (x) becomes func main(float32 x) { } -> (x) """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 1, 1, 1), dtype=types.fp64)]) def prog(x): return x prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "mil_backend::adjust_io_to_supported_types", skip_output_type_check=True, skip_input_type_check=True, ) # output dtype is modified prev_inputs = list(prev_prog.functions['main'].inputs.items()) inputs = list(prog.functions['main'].inputs.items()) assert prev_inputs[0][1].name == inputs[0][1].name assert inputs[0][1].dtype == types.fp32 @pytest.mark.parametrize( "opset_version", [None, target.iOS13, target.iOS16], ) def test_float16_input_output(self, opset_version): """ Input graph: main(%x: (1, 1, 1, 1, fp16)(Tensor)) { block0() { %relu_0: (1, 1, 1, 1, fp16)(Tensor) = relu(x=%x, name="relu_0") } -> (%relu_0) } Output graph (if opset_version < ios16): main(%x: (1, 1, 1, 1, fp32)(Tensor)) { block0() { %cast_0: (1, 1, 1, 1, fp16)(Tensor) = cast(x=%x, dtype="fp16", name="cast_0") %relu_0__pre__output__fp32__cast: (1, 1, 1, 1, fp16)(Tensor) = relu(x=%cast_0, name="relu_0") %relu_0: (1, 1, 1, 1, fp32)(Tensor) = cast(x=%relu_0__pre__output__fp32__cast, dtype="fp32", name="cast_1") } -> (%relu_0) } Output graph (if opset_version >= ios16): same as the input graph """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 1, 1, 1), dtype=types.fp16)], opset_version=opset_version) def prog(x): return mb.relu(x=x) skip_type_check = opset_version in [None, ct.target.iOS13] prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "mil_backend::adjust_io_to_supported_types", skip_output_type_check=skip_type_check, skip_input_type_check=skip_type_check, ) prev_inputs = list(prev_block.inputs.items()) inputs = list(block.inputs.items()) prev_outputs = prev_block.outputs outputs = block.outputs assert prev_inputs[0][1].name == inputs[0][1].name assert outputs[0].name == prev_outputs[0].name if opset_version is None or opset_version < target.iOS16: assert get_op_types_in_program(prog) == ['cast', 'relu', 'cast'] assert inputs[0][1].dtype == types.fp32 assert outputs[0].dtype == types.fp32 else: assert get_op_types_in_program(prog) == ['relu'] assert inputs[0][1].dtype == types.fp16 assert 
block.outputs[0].dtype == types.fp16 def test_float16_input_output_with_opset_version_inference(self): """ Input graph: main(%x: (1, 1, 4, 4, fp16)(Tensor)) { block0() { %pixel_unshuffle_0: (1, 4, 2, 2, fp16)(Tensor) = pixel_unshuffle(x=%x, downscale_factor=2, name="pixel_unshuffle_0") } -> (%pixel_unshuffle_0) } This function would be inferred as an iOS16 function, and the graph pass should behave properly """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 1, 4, 4), dtype=types.fp16)]) def prog(x): x = mb.pixel_unshuffle(x=x, downscale_factor=np.uint32(2)) return x prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "mil_backend::adjust_io_to_supported_types" ) prev_inputs = list(prev_block.inputs.items()) inputs = list(block.inputs.items()) prev_outputs = prev_block.outputs outputs = block.outputs assert prev_inputs[0][1].name == inputs[0][1].name assert outputs[0].name == prev_outputs[0].name assert get_op_types_in_program(prog) == ['pixel_unshuffle'] assert inputs[0][1].dtype == types.fp16 assert block.outputs[0].dtype == types.fp16 def test_int8_input(self): """ Input graph: func main(int8 x) { } -> (x) becomes func main(int32 x) { } -> (x) """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 1, 1, 1), dtype=types.int8)]) def prog(x): return x prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "mil_backend::adjust_io_to_supported_types", skip_output_type_check=True, skip_input_type_check=True, ) # output dtype is modified prev_inputs = list(prev_prog.functions['main'].inputs.items()) inputs = list(prog.functions['main'].inputs.items()) assert prev_inputs[0][1].name == inputs[0][1].name assert inputs[0][1].dtype == types.int32 @pytest.mark.parametrize( "opset_version", [None, target.iOS17], ) def test_int16_input(self, opset_version): """ Input graph: func main(int16 x) { .... } -> (x) Before IOS17, it becomes func main(int32 x) { .... } -> (x) In IOS17+, it becomes func main(int32 x) { %cast_0: (1, 1, 1, 1, int16)(Tensor) = cast(x=%x, dtype="int16", name="cast_0") .... %cast_1: (1, 1, 1, 1, int32)(Tensor) = cast(x=%x, dtype="int32", name="cast_1") } -> (cast_1) because IOS17+ supports int16 in Runtime (but doesn't support int16 for I/O). """ @mb.program( input_specs=[mb.TensorSpec(shape=(1, 1, 1, 1), dtype=types.int16)], opset_version=opset_version, ) def prog(x): return x skip_type_check = opset_version is None prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "mil_backend::adjust_io_to_supported_types", skip_output_type_check=True, skip_input_type_check=True, ) # output dtype id modified prev_inputs = list(prev_block.inputs.items()) inputs = list(block.inputs.items()) prev_outputs = prev_block.outputs outputs = block.outputs assert prev_inputs[0][1].dtype == types.int16 assert prev_outputs[0].dtype == types.int16 assert inputs[0][1].dtype == types.int32 assert outputs[0].dtype == types.int32 assert prev_inputs[0][1].name == inputs[0][1].name assert outputs[0].name == prev_outputs[0].name if opset_version and opset_version >= target.iOS17: assert get_op_types_in_program(prog) == ["cast", "cast"] cast_ops = [op for op in prog["main"].operations if op.op_type != "const"] # The first cast is for int32 to int16. assert cast_ops[0].x.dtype == types.int32 assert cast_ops[0].outputs[0].dtype == types.int16 # The second cast is for int16 to int32. assert cast_ops[1].x.dtype == types.int16 assert cast_ops[1].outputs[0].dtype == types.int32 else: # Before IOS17, the int16 is not supported in Runtime, so there is no cast inserted. 
assert get_op_types_in_program(prog) == [] def test_subblock(self): """ Input graph: func main(float64 a, float32 b) { float64 out_0, float32 out_1 = while_loop(a, b, (float64 a, float32 b) { bool cond = less(a, b) } -> (cond) (float64 a, float32 b) { float64 temp = const(1) float64 out = add(a, b) } -> (out, b) ); } -> (out_0, out_1) becomes func main(float32 a, float32 b) { float32 out_0, float32 out_1 = while_loop(a, b, (float32 a, float32 b) { bool cond = less(a, b) } -> (cond) (float32 a, float32 b) { float32 temp = const(1) float32 out = add(a, b) } -> (out, b) ); } -> (out_0, out_1) """ pytest.xfail("fp64 dtype not supported in MIL") def body(a, b): return mb.add(x=a, y=np.float64(1)), b def cond(a, b): return mb.less(x=a, y=b) @mb.program(input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp64), mb.TensorSpec(shape=(1,), dtype=types.fp32)]) def prog(a, b): return mb.while_loop(_cond=cond, _body=body, loop_vars=(a, b)) prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "mil_backend::adjust_io_to_supported_types" ) prev_inputs = list(prev_prog.functions['main'].inputs.items()) inputs = list(prog.functions['main'].inputs.items()) for i in range(0, len(prev_inputs)): assert prev_inputs[i][1].name == inputs[i][1].name assert inputs[i][1].dtype == types.fp32 assert get_op_types_in_program(prev_prog) == ['while_loop'] assert get_op_types_in_program(prog) == ['while_loop'] def assert_block_inputs(prev_inputs, inputs): for i in range(0, len(prev_inputs)): assert prev_inputs[i].name == inputs[i].name assert inputs[i].dtype == types.fp32 subblocks = prog.functions["main"].operations[0].blocks prev_subblocks = prev_prog.functions["main"].operations[0].blocks for i in range(0, len(subblocks)): assert_block_inputs(prev_subblocks[i].inputs, subblocks[i].inputs) def test_adjust_cast(self): """ Input graph: func main(int32 x) { fp64 y = cast(x=x, dtype="fp64") } -> (y) becomes func main(int32 x) { fp32 y = cast(x=x, dtype="fp32") } -> (y) """ pytest.xfail("cast operation does not support casting to fp64") @mb.program(input_specs=[mb.TensorSpec(shape=(1, 1, 1, 1), dtype=types.int32)]) def prog(x): y = mb.cast(x=x, dtype="fp64") return y prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "mil_backend::adjust_io_to_supported_types" ) assert get_op_types_in_program(prev_prog) == ['cast'] assert get_op_types_in_program(prog) == ['cast'] prev_cast = prev_prog.functions['main'].operations[1] cast = prog.functions['main'].operations[2] assert prev_cast.dtype.val == "fp64" assert prev_cast.outputs[0].dtype == types.fp64 assert cast.dtype.val == "fp32" assert cast.outputs[0].dtype == types.fp32 def test_adjust_redundant_cast(self): """ Input graph: func main(int32 x) { int64 y = cast(x=x, dtype="int64") } -> (y) becomes func main(int32 x) { } -> (x) """ pytest.xfail("cast not supports dtype=`int64`") @mb.program(input_specs=[mb.TensorSpec(shape=(1, 1, 1, 1), dtype=types.int32)]) def prog(x): y = mb.cast(x=x, dtype="int64") return y prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "mil_backend::adjust_io_to_supported_types" ) assert get_op_types_in_program(prev_prog) == ['cast'] assert get_op_types_in_program(prog) == [] @staticmethod def test_classify_no_affected(): """ If the outputs are from a classify op, it should not be affected by this graph pass. 
""" @mb.program(input_specs=[mb.TensorSpec(shape=(3,))]) def prog(x): classes = [np.int64(x) for x in range(3)] classes_var = mb.const(val=mil_list(classes)) return mb.classify(probabilities=x, classes=classes_var) apply_pass_and_basic_check(prog, "mil_backend::adjust_io_to_supported_types") assert get_op_types_in_program(prog) == ["classify"] class TestImagePreprocessingPass: def test_program_grayscale(self): """ Input graph: main(x: ImageType(color_layout="G", channel_first=True)) { y1 = relu(x) y2 = relu(x) output = add(y1, y2) } [output] Output graph: main(x: ImageType(channel_first=True)) { y1 = relu(x) y2 = relu(x) output = add(y1, y2) } [output] """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 1, 20, 20))]) def prog(x): y1 = mb.relu(x=x) y2 = mb.relu(x=x) z = mb.add(x=y1, y=y2) return z prog.functions["main"].input_types = ( ct.ImageType(name="x", shape=[1, 1, 20, 20], color_layout="G", channel_first=True), ) prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "mil_backend::insert_image_preprocessing_ops" ) assert get_op_types_in_program(prev_prog) == ["relu", "relu", "add"] assert get_op_types_in_program(prog) == ["relu", "relu", "add"] def test_program_grayscale_with_scale(self): """ Input graph: main(x: ImageType(scale=2.0, color_layout="G", channel_first=True)) { y1 = relu(x) y2 = relu(x) output = add(y1, y2) } [output] Output graph: main(x: ImageType(channel_first=True)) { y = mul(x, 2) y1 = relu(y) y2 = relu(y) output = add(y1, y2) } [output] """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 1, 20, 20))]) def prog(x): y1 = mb.relu(x=x) y2 = mb.relu(x=x) z = mb.add(x=y1, y=y2) return z prog.functions["main"].input_types = ( ct.ImageType( name="x", shape=[1, 1, 20, 20], scale=2.0, color_layout="G", channel_first=True ), ) prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "mil_backend::insert_image_preprocessing_ops" ) assert get_op_types_in_program(prev_prog) == ["relu", "relu", "add"] assert get_op_types_in_program(prog) == ["mul", "relu", "relu", "add"] scale_op = prog.find_ops(op_type="mul", exactly_one=True)[0] assert scale_op.y.val == 2.0 def test_program_grayscale_with_bias(self): """ Input graph: main(x: ImageType(bias=2.0, color_layout="G", channel_first=True)) { y1 = relu(x) y2 = relu(x) output = add(y1, y2) } [output] Output graph: main(x: ImageType(channel_first=True)) { y = add(x, 2) y1 = relu(y) y2 = relu(y) output = add(y1, y2) } [output] """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 1, 20, 20))]) def prog(x): y1 = mb.relu(x=x) y2 = mb.relu(x=x) z = mb.add(x=y1, y=y2) return z prog.functions["main"].input_types = ( ct.ImageType( name="x", shape=[1, 1, 20, 20], bias=2.0, color_layout="G", channel_first=True ), ) prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "mil_backend::insert_image_preprocessing_ops" ) assert get_op_types_in_program(prev_prog) == ["relu", "relu", "add"] assert get_op_types_in_program(prog) == ["add", "relu", "relu", "add"] add_op = prog.find_ops(op_type="add", exactly_one=False)[0] assert add_op.y.val == 2.0 def test_program_grayscale_with_scale_bias(self): """ Input graph: main(x: ImageType(scale=2.0, bias=2.0, color_layout="G", channel_first=True)) { y1 = relu(x) y2 = relu(x) output = add(y1, y2) } [output] Output graph: main(x: ImageType(channel_first=True)) { y_scaled = mul(x, 2) y = add(y_scaled, 2) y1 = relu(y) y2 = relu(y) output = add(y1, y2) } [output] """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 1, 20, 20))]) def prog(x): y1 = mb.relu(x=x) y2 = mb.relu(x=x) z = 
mb.add(x=y1, y=y2) return z prog.functions["main"].input_types = ( ct.ImageType( name="x", shape=[1, 1, 20, 20], scale=2.0, bias=2.0, color_layout="G", channel_first=True, ), ) prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "mil_backend::insert_image_preprocessing_ops" ) assert get_op_types_in_program(prev_prog) == ["relu", "relu", "add"] assert get_op_types_in_program(prog) == ["mul", "add", "relu", "relu", "add"] scale_op = prog.find_ops(op_type="mul", exactly_one=True)[0] assert scale_op.y.val == 2.0 add_op = prog.find_ops(op_type="add", exactly_one=False)[0] assert add_op.y.val == 2.0 def test_program_rgb(self): """ Input graph: main(x: ImageType(color_layout="RGB", channel_first=True)) { y1 = relu(x) y2 = relu(x) output = add(y1, y2) } [output] Output graph: main(x: ImageType(channel_first=True)) { y1 = relu(x) y2 = relu(x) output = add(y1, y2) } [output] """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 3, 20, 20))]) def prog(x): y1 = mb.relu(x=x) y2 = mb.relu(x=x) z = mb.add(x=y1, y=y2) return z prog.functions["main"].input_types = ( ct.ImageType(name="x", shape=[1, 3, 20, 20], color_layout="RGB", channel_first=True), ) prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "mil_backend::insert_image_preprocessing_ops" ) assert get_op_types_in_program(prev_prog) == ["relu", "relu", "add"] assert get_op_types_in_program(prog) == ["relu", "relu", "add"] def test_program_rgb_scale_bias(self): """ Input graph: main(x: ImageType(color_layout="RGB", scale=2.0, bias=[1.0, 2.0, 3.0], channel_first=True)) { y1 = relu(x) y2 = relu(x) output = add(y1, y2) } [output] Output graph: main(x: ImageType(channel_first=True)) { y = mul(x, scale) y_bias = add(y, bias) y1 = relu(y_bias) y2 = relu(y_bias) output = add(y1, y2) } [output] """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 3, 20, 20))]) def prog(x): y1 = mb.relu(x=x) y2 = mb.relu(x=x) z = mb.add(x=y1, y=y2) return z prog.functions["main"].input_types = ( ct.ImageType( name="x", shape=[1, 3, 20, 20], scale=2.0, bias=[1.0, 2.0, 3.0], color_layout="RGB", channel_first=True, ), ) prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "mil_backend::insert_image_preprocessing_ops" ) assert get_op_types_in_program(prev_prog) == ["relu", "relu", "add"] assert get_op_types_in_program(prog) == ["mul", "add", "relu", "relu", "add"] scale_op = prog.find_ops(op_type="mul", exactly_one=True)[0] assert scale_op.y.val == 2.0 add_op = prog.find_ops(op_type="add", exactly_one=False)[0] assert np.all(add_op.y.val == np.array([1.0, 2.0, 3.0]).reshape([1, 3, 1, 1])) def test_program_bgr(self): """ Input graph: main(x: ImageType(color_layout="BGR", channel_first=True)) { y1 = relu(x) y2 = relu(x) output = add(y1, y2) } [output] Output graph: main(x: ImageType(channel_first=True)) { y1 = relu(x) y2 = relu(x) output = add(y1, y2) } [output] """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 3, 20, 20))]) def prog(x): y1 = mb.relu(x=x) y2 = mb.relu(x=x) z = mb.add(x=y1, y=y2) return z prog.functions["main"].input_types = ( ct.ImageType(name="x", shape=[1, 3, 20, 20], color_layout="BGR", channel_first=True), ) prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "mil_backend::insert_image_preprocessing_ops" ) assert get_op_types_in_program(prev_prog) == ["relu", "relu", "add"] assert get_op_types_in_program(prog) == ["relu", "relu", "add"] def test_program_bgr_scale_bias(self): """ Input graph: main(x: ImageType(color_layout="BGR", scale=2.0, bias=[1.0, 2.0, 3.0], channel_first=True)) { y1 = relu(x) y2 = 
relu(x) output = add(y1, y2) } [output] Output graph: main(x: ImageType(channel_first=True)) { y = mul(x, scale) y_bias = add(y, bias) y1 = relu(y_bias) y2 = relu(y_bias) output = add(y1, y2) } [output] """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 3, 20, 20))]) def prog(x): y1 = mb.relu(x=x) y2 = mb.relu(x=x) z = mb.add(x=y1, y=y2) return z prog.functions["main"].input_types = ( ct.ImageType( name="x", shape=[1, 3, 20, 20], scale=2.0, bias=[1.0, 2.0, 3.0], color_layout="BGR", channel_first=True, ), ) prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "mil_backend::insert_image_preprocessing_ops" ) assert get_op_types_in_program(prev_prog) == ["relu", "relu", "add"] assert get_op_types_in_program(prog) == ["mul", "add", "relu", "relu", "add"] scale_op = prog.find_ops(op_type="mul", exactly_one=True)[0] assert scale_op.y.val == 2.0 add_op = prog.find_ops(op_type="add", exactly_one=False)[0] assert np.all(add_op.y.val == np.array([1.0, 2.0, 3.0]).reshape([1, 3, 1, 1])) @pytest.mark.parametrize( "scale_type, bias_type", itertools.product([np.float32, np.int32], [np.float32, np.int32]) ) def test_scale_bias_types(self, scale_type, bias_type): """ Input graph: main(x: ImageType(color_layout="RGB", scale=2.0, bias=[1.0, 2.0, 3.0], channel_first=True)) { y1 = relu(x) y2 = relu(x) output = add(y1, y2) } [output] Output graph: main(x: ImageType(channel_first=True)) { y = mul(x, scale) y_bias = add(y, bias) y1 = relu(y_bias) y2 = relu(y_bias) output = add(y1, y2) } [output] """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 3, 20, 20))]) def prog(x): y1 = mb.relu(x=x) y2 = mb.relu(x=x) z = mb.add(x=y1, y=y2) return z prog.functions["main"].input_types = ( ct.ImageType( name="x", shape=[1, 3, 20, 20], scale=scale_type(2.0), bias=np.array([1, 2, 3]).astype(bias_type), color_layout="RGB", channel_first=True, ), ) prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "mil_backend::insert_image_preprocessing_ops" ) assert get_op_types_in_program(prev_prog) == ["relu", "relu", "add"] assert get_op_types_in_program(prog) == ["mul", "add", "relu", "relu", "add"] scale_op = prog.find_ops(op_type="mul", exactly_one=True)[0] assert scale_op.y.dtype() == prog.functions["main"].inputs["x"].dtype() add_op = prog.find_ops(op_type="add", exactly_one=False)[0] assert add_op.y.dtype() == prog.functions["main"].inputs["x"].dtype() class TestSanitizerPass: def test_sanitize_numeric_var_names(self): """ Input: main(%x: (1, 3, 20, fp32)(Tensor)) { block0() { %var_1!: (1, 3, 20, fp32)(Tensor) = relu(x=%x, name="var_1!") %1: (1, 3, 20, fp32)(Tensor) = relu(x=%x, name="1") %3: (1, 3, 20, fp32)(Tensor) = add(x=%Var_1!, y=%1, name="3") } -> (%3) } Output: main(%x: (1, 3, 20, fp32)(Tensor)) { block0() { %var_1_: (1, 3, 20, fp32)(Tensor) = relu(x=%x, name="var_1_") %var_1: (1, 3, 20, fp32)(Tensor) = relu(x=%x, name="op_1") %var_3: (1, 3, 20, fp32)(Tensor) = add(x=%var_1_, y=%var_1, name="op_3") } -> (%var_3) } """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 3, 20))]) def prog(x): y1 = mb.relu(x=x, name = "var_1!") y2 = mb.relu(x=x, name = "1") z = mb.add(x=y1, y=y2, name = "3") return z PASS_REGISTRY["mil_backend::sanitize_name_strings"](prog) block = prog.functions["main"] assert block.find_ops(op_type="relu")[0].outputs[0].name == "var_1_" assert block.find_ops(op_type="relu")[1].outputs[0].name == "var_1" assert prog["main"].outputs[0].name == "var_3" assert block.find_ops(op_type="relu")[0].name == "var_1_" assert block.find_ops(op_type="relu")[1].name == "op_1" assert 
block.find_ops(op_type="add")[0].name == "op_3" def test_sanitize_var_names_with_two_functions(self): """ Input: main(%x: (1, 3, 20, fp32)(Tensor)) { block0() { %var_1!: (1, 3, 20, fp32)(Tensor) = relu(x=%x, name="var_1!") } -> (%var_1!) } main_2(%x: (1, 3, 20, fp32)(Tensor)) { block0() { %var_1!: (1, 3, 20, fp32)(Tensor) = relu(x=%x, name="var_1!") } -> (%var_1!) } Output: main(%x: (1, 3, 20, fp32)(Tensor)) { block0() { %var_1!: (1, 3, 20, fp32)(Tensor) = relu(x=%x, name="var_1_") } -> (%var_1_) } main_2(%x: (1, 3, 20, fp32)(Tensor)) { block0() { %var_1!: (1, 3, 20, fp32)(Tensor) = relu(x=%x, name="var_1_") } -> (%var_1_) } """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 3, 20))]) def prog(x): z = mb.relu(x=x, name = "var_1!") return z @mb.program(input_specs=[mb.TensorSpec(shape=(1, 3, 20))]) def prog_2(x): z = mb.relu(x=x, name = "var_1!") return z prog.add_function("main_2", prog_2.functions["main"]) PASS_REGISTRY["mil_backend::sanitize_name_strings"](prog) block = prog.functions["main"] assert block.find_ops(op_type="relu")[0].outputs[0].name == "var_1_" assert prog["main"].outputs[0].name == "var_1_" assert block.find_ops(op_type="relu")[0].name == "var_1_" block = prog.functions["main_2"] assert block.find_ops(op_type="relu")[0].outputs[0].name == "var_1_" assert prog["main"].outputs[0].name == "var_1_" assert block.find_ops(op_type="relu")[0].name == "var_1_" class TestPassFuseActivationSiLU: """ Input graph: input --> sigmoid --> mul --> output Output graph: input --> silu --> output """ @pytest.mark.skipif(ct.utils._macos_version() < (12, 0), reason="mlprogram predict available only on macOS12+") @pytest.mark.parametrize( "reverse_order", itertools.product([True, False]), ) def test_0(self, reverse_order): x_shape = tuple(np.random.randint(low=1, high=4, size=5)) @mb.program(input_specs=[mb.TensorSpec(shape=x_shape)]) def program(x): sigmoid_x = mb.sigmoid(x=x) if not reverse_order: x = mb.mul(x=x, y=sigmoid_x) else: x = mb.mul(x=sigmoid_x, y=x) return x prev_prog, prev_block, block = apply_pass_and_basic_check( program, "mil_backend::fuse_activation_silu" ) assert get_op_types_in_program(prev_prog) == ["sigmoid", "mul"] assert get_op_types_in_program(program) == ["silu"] assert_model_is_valid( program=program, inputs={"x": x_shape}, backend=("mlprogram", "fp32"), expected_output_shapes={block.outputs[0].name: tuple(x_shape)}, ) class TestPassFusePow2Sqrt: """ Input graph: input --> pow(2) --> sqrt --> output Output graph: input --> output """ @pytest.mark.skipif(ct.utils._macos_version() < (12, 0), reason="mlprogram predict available only on macOS12+") @pytest.mark.parametrize( "reverse_order", itertools.product([True, False]), ) def test_fuse(self, reverse_order): x_shape = tuple(np.random.randint(low=1, high=4, size=5)) @mb.program(input_specs=[mb.TensorSpec(shape=x_shape)]) def program(x): if not reverse_order: x = mb.sqrt(x=mb.pow(x=x, y=2.0)) else: x = mb.pow(x=mb.sqrt(x=x), y=2.0) return x prev_prog, _, block = apply_pass_and_basic_check( program, "mil_backend::fuse_pow2_sqrt" ) assert set(get_op_types_in_program(prev_prog)) == set(("pow", "sqrt")) assert get_op_types_in_program(program) == ["identity"] assert_model_is_valid( program=program, inputs={"x": x_shape}, backend=("mlprogram", "fp32"), expected_output_shapes={block.outputs[0].name: tuple(x_shape)}, ) @pytest.mark.skipif(ct.utils._macos_version() < (12, 0), reason="mlprogram predict available only on macOS12+") @pytest.mark.parametrize( "reverse_order", itertools.product([True, False]), ) def 
test_illegal_pow(self, reverse_order): x_shape = tuple(np.random.randint(low=1, high=4, size=5)) @mb.program(input_specs=[mb.TensorSpec(shape=x_shape)]) def program(x): if not reverse_order: x = mb.sqrt(x=mb.pow(x=x, y=3.0)) else: x = mb.pow(x=mb.sqrt(x=x), y=3.0) return x prev_prog, _, block = apply_pass_and_basic_check( program, "mil_backend::fuse_pow2_sqrt" ) assert set(get_op_types_in_program(prev_prog)) == set(("pow", "sqrt")) assert set(get_op_types_in_program(program)) == set(("pow", "sqrt")) assert_model_is_valid( program=program, inputs={"x": x_shape}, backend=("mlprogram", "fp32"), expected_output_shapes={block.outputs[0].name: tuple(x_shape)}, ) @pytest.mark.skipif(ct.utils._macos_version() < (12, 0), reason="mlprogram predict available only on macOS12+") def test_no_pow(self): x_shape = tuple(np.random.randint(low=1, high=4, size=5)) @mb.program(input_specs=[mb.TensorSpec(shape=x_shape)]) def program(x): return mb.sqrt(x=x) prev_prog, _, block = apply_pass_and_basic_check( program, "mil_backend::fuse_pow2_sqrt" ) assert get_op_types_in_program(prev_prog) == ["sqrt"] assert get_op_types_in_program(program) == ["sqrt"] assert_model_is_valid( program=program, inputs={"x": x_shape}, backend=("mlprogram", "fp32"), expected_output_shapes={block.outputs[0].name: tuple(x_shape)}, ) @pytest.mark.skipif(ct.utils._macos_version() < (12, 0), reason="mlprogram predict available only on macOS12+") def test_no_sqrt(self): x_shape = tuple(np.random.randint(low=1, high=4, size=5)) @mb.program(input_specs=[mb.TensorSpec(shape=x_shape)]) def program(x): return mb.pow(x=x, y=2.0) prev_prog, _, block = apply_pass_and_basic_check( program, "mil_backend::fuse_pow2_sqrt" ) assert get_op_types_in_program(prev_prog) == ["pow"] assert get_op_types_in_program(program) == ["pow"] assert_model_is_valid( program=program, inputs={"x": x_shape}, backend=("mlprogram", "fp32"), expected_output_shapes={block.outputs[0].name: tuple(x_shape)}, ) @pytest.mark.skipif(ct.utils._macos_version() < (12, 0), reason="mlprogram predict available only on macOS12+") @pytest.mark.parametrize( "reverse_order", itertools.product([True, False]), ) def test_multiple_nodes(self, reverse_order): x_shape = tuple(np.random.randint(low=1, high=4, size=5)) @mb.program(input_specs=[mb.TensorSpec(shape=x_shape)]) def program(x): if not reverse_order: x = mb.mul(x=x, y=x) x = mb.pow(x=x, y=2.0) x = mb.sqrt(x=x) x = mb.reduce_argmax(x=x) x = mb.reshape(x=x, shape=[*x_shape[:-1]]) else: x = mb.mul(x=x, y=x) x = mb.sqrt(x=x) x = mb.pow(x=x, y=2.0) x = mb.reduce_argmax(x=x) x = mb.reshape(x=x, shape=[*x_shape[:-1]]) return x prev_prog, _, block = apply_pass_and_basic_check( program, "mil_backend::fuse_pow2_sqrt" ) assert set(get_op_types_in_program(prev_prog)) == set(("mul", "pow", "sqrt", "reduce_argmax", "reshape")) assert get_op_types_in_program(program) == ["mul", "identity", "reduce_argmax", "reshape"] assert_model_is_valid( program=program, inputs={"x": x_shape}, backend=("mlprogram", "fp32"), expected_output_shapes={block.outputs[0].name: tuple(x_shape[:-1])}, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/mil/test_helper.py0000644000000000000000000000236114672066616025056 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
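#
# A standalone sketch (not the library implementation) approximating the
# sanitization behavior that the TestNameSanitizer cases below expect: every
# character outside [A-Za-z0-9_] becomes "_", and a leading digit gets a "_"
# prefix so the result is a valid identifier.
import re


def _sanitize_name_sketch(name: str) -> str:
    # Replace characters that are not letters, digits, or underscores.
    sanitized = re.sub(r"[^a-zA-Z0-9_]", "_", name)
    # Names may not start with a digit; prefix them with an underscore.
    if sanitized and sanitized[0].isdigit():
        sanitized = "_" + sanitized
    return sanitized


assert _sanitize_name_sketch("dense_2_1/BiasAdd") == "dense_2_1_BiasAdd"
assert _sanitize_name_sketch("is8174 + 16") == "is8174___16"
assert _sanitize_name_sketch("1") == "_1"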
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil.passes.defs.preprocess import NameSanitizer as _NameSanitizer class TestNameSanitizer: def test_name_sanitizer(self): input_and_expected_strings = [("1", "_1"), ("abc", "abc"), ("*asdf", "_asdf"), ("*asd*f", "_asd_f"), ("0abc2", "_0abc2"), ("is8174 + 16", "is8174___16"), ("a:abc", "a_abc"), ("a.abc", "a_abc"), ("dense_2_1/BiasAdd", "dense_2_1_BiasAdd"), ("dense_2_1-BiasAdd", "dense_2_1_BiasAdd"), ("key:0", "key_0"), ] for i, in_and_out_str in enumerate(input_and_expected_strings): out = _NameSanitizer().sanitize_name(in_and_out_str[0]) assert out == in_and_out_str[1] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/mil/test_load.py0000644000000000000000000013622714672066616024527 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import math import os import platform import shutil import tempfile from typing import List, Union import numpy as np import pytest import coremltools as ct from coremltools import _SPECIFICATION_VERSION_IOS_18, proto from coremltools.converters.mil import mil from coremltools.converters.mil.converter import mil_convert as _mil_convert from coremltools.converters.mil.mil import get_new_symbol, types from coremltools.converters.mil.mil.builder import Builder as mb from coremltools.converters.mil.mil.ops.tests.iOS18.test_compression import ( TestConstexprLut as _TestConstexprLut, ) from coremltools.converters.mil.mil.program import Symbol from coremltools.converters.mil.mil.types.type_mapping import string_to_nptype from coremltools.models.utils import _macos_version class TestWeightFileSerialization: @staticmethod @pytest.mark.parametrize( "dtype, opset_version", itertools.product( ["fp16", "fp32", "uint8", "int8", "uint16", "int16", "int32", "uint32"], [ct.target.iOS16, ct.target.iOS18], ), ) def test_weight_serialization(dtype, opset_version): if dtype == "uint32": # There is a pass that casts the output to CoreML supported dtype. # uint32 will fail because `cast` op doesn't accept such input type. 
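# For the dtypes that are not skipped, the expectation checked below is:
# fp16/fp32/uint8/int8 consts are always written out to the weight file,
# while uint16/int16/int32 consts are only file-backed when the deployment
# target is iOS18 or later.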
pytest.skip("uint32 is not supported in `cast` op.") if dtype in ["uint8", "int8", "uint16", "int16"] and opset_version == ct.target.iOS16: # iOS16 doesn't support the above dtype either pytest.skip("dtype not support in iOS16") if dtype in ["fp16", "fp32", "uint8", "int8"]: should_serialize_weight = True else: should_serialize_weight = opset_version >= ct.target.iOS18 @mb.program(input_specs=[mb.TensorSpec((1,))], opset_version=opset_version) def prog(x): val = np.random.rand(1000).astype(string_to_nptype(dtype)) return mb.const(val=val), mb.add(x=x, y=1.0) # we don't want the const to be constant folding after casting pipeline = ct.PassPipeline() pipeline.set_options("common::const_elimination", {"skip_const_by_size": "-1"}) mlmodel = ct.convert( prog, minimum_deployment_target=opset_version, pass_pipeline=pipeline, ) saved_package_path = tempfile.mkdtemp(suffix=".mlpackage") mlmodel.save(saved_package_path) # check the weights are serialized as file value if ct.utils._macos_version() >= (15, 0): with tempfile.TemporaryDirectory() as serialize_dir: os.system(f"coremlcompiler compile {saved_package_path} {serialize_dir}") model_name_with_extension = os.path.basename(saved_package_path) model_name_wo_extension, _ = os.path.splitext(model_name_with_extension) mil_file = open( os.path.join(serialize_dir, f"{model_name_wo_extension}.mlmodelc", "model.mil") ) mil_txt = mil_file.read() if should_serialize_weight: assert f"tensor<{dtype}, [1000]>(BLOBFILE" in mil_txt else: assert f"tensor<{dtype}, [1000]>(BLOBFILE" not in mil_txt # cleanup shutil.rmtree(saved_package_path) class TestMILFlexibleShapes: @mb.program(input_specs=[mb.TensorSpec(shape=[1, 3, Symbol("H"), Symbol("W")])]) def basic_network(x): return mb.relu(x=x) def test_mil_enumerated_multiarray(self): enumerated_shapes = tuple([(1, 3, 10, 10), (1, 3, 10, 20), (1, 3, 10, 30)]) input_shape = [ct.TensorType(name="x", shape=ct.EnumeratedShapes(shapes=enumerated_shapes))] mlmodel = ct.convert( self.basic_network, source="milinternal", convert_to="mlprogram", inputs=input_shape ) input_spec = mlmodel.get_spec().description.input assert len(input_spec) == 1, "1 input expected, got {} instead".format(len(input_spec)) assert input_spec[0].name == "x", "input name in MLModel is {}, 'x' is expected".format( input_spec[0].name ) assert ( input_spec[0].type.WhichOneof("Type") == "multiArrayType" ), "Expected multiArrayType, got {}".format(input_spec[0].type.WhichOneof("Type")) assert ( input_spec[0].type.multiArrayType.WhichOneof("ShapeFlexibility") == "enumeratedShapes" ), "Expected enumeratedShapes in ShapeFlexibility" spec_default_shape = [s for s in input_spec[0].type.multiArrayType.shape] spec_enumerated_shapes = set() for enumerated in input_spec[0].type.multiArrayType.enumeratedShapes.shapes: spec_enumerated_shapes.add(tuple([s for s in enumerated.shape])) assert spec_default_shape == [ 1, 3, 10, 10, ], "Expected default shape to be [1, 3, 10, 10], got {} instead".format( str(spec_default_shape) ) assert spec_enumerated_shapes == set(enumerated_shapes), "Enumerated shape mismatch" def test_mil_enumerated_multiarray_with_default(self): enumerated_shapes = tuple([(1, 3, 10, 10), (1, 3, 10, 20), (1, 3, 10, 30)]) input_shape = [ ct.TensorType( name="x", shape=ct.EnumeratedShapes(shapes=enumerated_shapes, default=(1, 3, 10, 30)), ) ] mlmodel = ct.convert( self.basic_network, source="milinternal", convert_to="mlprogram", inputs=input_shape ) input_spec = mlmodel.get_spec().description.input assert len(input_spec) == 1, "1 input expected, got {} 
instead".format(len(input_spec)) assert input_spec[0].name == "x", "input name in MLModel is {}, 'x' is expected".format( input_spec[0].name ) assert ( input_spec[0].type.WhichOneof("Type") == "multiArrayType" ), "Expected multiArrayType, got {}".format(input_spec[0].type.WhichOneof("Type")) assert ( input_spec[0].type.multiArrayType.WhichOneof("ShapeFlexibility") == "enumeratedShapes" ), "Expected enumeratedShapes in ShapeFlexibility" spec_default_shape = [s for s in input_spec[0].type.multiArrayType.shape] spec_enumerated_shapes = set() for enumerated in input_spec[0].type.multiArrayType.enumeratedShapes.shapes: spec_enumerated_shapes.add(tuple([s for s in enumerated.shape])) assert spec_default_shape == [ 1, 3, 10, 30, ], "Expected default shape to be [1, 3, 10, 10], got {} instead".format( str(spec_default_shape) ) assert spec_enumerated_shapes == set(enumerated_shapes), "Enumerated shape mismatch" def test_mil_enumerated_image(self): enumerated_shapes = tuple([(1, 3, 10, 10), (1, 3, 10, 20), (1, 3, 10, 30)]) input_shape = [ct.ImageType(name="x", shape=ct.EnumeratedShapes(shapes=enumerated_shapes))] mlmodel = ct.convert( self.basic_network, source="milinternal", convert_to="mlprogram", inputs=input_shape ) input_spec = mlmodel.get_spec().description.input assert len(input_spec) == 1, "1 input expected, got {} instead".format(len(input_spec)) assert input_spec[0].name == "x", "input name in MLModel is {}, 'x' is expected".format( input_spec[0].name ) assert ( input_spec[0].type.WhichOneof("Type") == "imageType" ), "Expected imageType, got {}".format(input_spec[0].type.WhichOneof("Type")) assert ( input_spec[0].type.imageType.WhichOneof("SizeFlexibility") == "enumeratedSizes" ), "Expected enumeratedShapes in ShapeFlexibility" spec_H = input_spec[0].type.imageType.height spec_W = input_spec[0].type.imageType.width assert ( spec_H == 10 and spec_W == 10 ), "expected [H, W] == [10, 10], got [{}, {}] instead".format(spec_H, spec_W) spec_enumerated_shapes = set() for enumerated in input_spec[0].type.imageType.enumeratedSizes.sizes: spec_enumerated_shapes.add(tuple([1, 3, enumerated.height, enumerated.width])) assert spec_enumerated_shapes == set(enumerated_shapes), "Enumerated shape mismatch" def test_mil_enumerated_image_with_default(self): enumerated_shapes = tuple([(1, 3, 10, 10), (1, 3, 10, 20), (1, 3, 10, 30)]) input_shape = [ ct.ImageType( name="x", shape=ct.EnumeratedShapes(shapes=enumerated_shapes, default=(1, 3, 10, 30)), ) ] mlmodel = ct.convert( self.basic_network, source="milinternal", convert_to="mlprogram", inputs=input_shape ) input_spec = mlmodel.get_spec().description.input assert len(input_spec) == 1, "1 input expected, got {} instead".format(len(input_spec)) assert input_spec[0].name == "x", "input name in MLModel is {}, 'x' is expected".format( input_spec[0].name ) assert ( input_spec[0].type.WhichOneof("Type") == "imageType" ), "Expected imageType, got {}".format(input_spec[0].type.WhichOneof("Type")) assert ( input_spec[0].type.imageType.WhichOneof("SizeFlexibility") == "enumeratedSizes" ), "Expected enumeratedShapes in ShapeFlexibility" spec_H = input_spec[0].type.imageType.height spec_W = input_spec[0].type.imageType.width assert ( spec_H == 10 and spec_W == 30 ), "expected [H, W] == [10, 30], got [{}, {}] instead".format(spec_H, spec_W) spec_enumerated_shapes = set() for enumerated in input_spec[0].type.imageType.enumeratedSizes.sizes: spec_enumerated_shapes.add(tuple([1, 3, enumerated.height, enumerated.width])) assert spec_enumerated_shapes == 
set(enumerated_shapes), "Enumerated shape mismatch" def test_mil_ranged_multiarray(self): input_shape = [ct.TensorType(name="x", shape=(1, 3, 10, ct.RangeDim(10, 30)))] mlmodel = ct.convert( self.basic_network, source="milinternal", convert_to="mlprogram", inputs=input_shape ) input_spec = mlmodel.get_spec().description.input assert len(input_spec) == 1, "1 input expected, got {} instead".format(len(input_spec)) assert input_spec[0].name == "x", "input name in MLModel is {}, 'x' is expected".format( input_spec[0].name ) assert ( input_spec[0].type.WhichOneof("Type") == "multiArrayType" ), "Expected multiArrayType, got {}".format(input_spec[0].type.WhichOneof("Type")) assert ( input_spec[0].type.multiArrayType.WhichOneof("ShapeFlexibility") == "shapeRange" ), "Expected shapeRange in ShapeFlexibility" spec_default_shape = [s for s in input_spec[0].type.multiArrayType.shape] ranged_shapes = [(1, 1), (3, 3), (10, 10), (10, 30)] spec_ranged_shapes = [] for range_dim in input_spec[0].type.multiArrayType.shapeRange.sizeRanges: spec_ranged_shapes.append(tuple([range_dim.lowerBound, range_dim.upperBound])) assert spec_default_shape == [ 1, 3, 10, 10, ], "Expected default shape to be [1, 3, 10, 10], got {} instead".format( str(spec_default_shape) ) assert spec_ranged_shapes == ranged_shapes, "Enumerated shape mismatch" def test_mil_ranged_multiarray_with_default(self): input_shape = [ct.TensorType(name="x", shape=(1, 3, 10, ct.RangeDim(10, 30, default=20)))] mlmodel = ct.convert( self.basic_network, source="milinternal", convert_to="mlprogram", inputs=input_shape ) input_spec = mlmodel.get_spec().description.input assert len(input_spec) == 1, "1 input expected, got {} instead".format(len(input_spec)) assert input_spec[0].name == "x", "input name in MLModel is {}, 'x' is expected".format( input_spec[0].name ) assert ( input_spec[0].type.WhichOneof("Type") == "multiArrayType" ), "Expected multiArrayType, got {}".format(input_spec[0].type.WhichOneof("Type")) assert ( input_spec[0].type.multiArrayType.WhichOneof("ShapeFlexibility") == "shapeRange" ), "Expected shapeRange in ShapeFlexibility" spec_default_shape = [s for s in input_spec[0].type.multiArrayType.shape] ranged_shapes = [(1, 1), (3, 3), (10, 10), (10, 30)] spec_ranged_shapes = [] for range_dim in input_spec[0].type.multiArrayType.shapeRange.sizeRanges: spec_ranged_shapes.append(tuple([range_dim.lowerBound, range_dim.upperBound])) assert spec_default_shape == [ 1, 3, 10, 20, ], "Expected default shape to be [1, 3, 10, 20], got {} instead".format( str(spec_default_shape) ) assert spec_ranged_shapes == ranged_shapes, "Enumerated shape mismatch" def test_mil_ranged_image(self): input_shape = [ct.ImageType(name="x", shape=(1, 3, 10, ct.RangeDim(10, 30)))] mlmodel = ct.convert( self.basic_network, source="milinternal", convert_to="mlprogram", inputs=input_shape ) input_spec = mlmodel.get_spec().description.input assert len(input_spec) == 1, "1 input expected, got {} instead".format(len(input_spec)) assert input_spec[0].name == "x", "input name in MLModel is {}, 'x' is expected".format( input_spec[0].name ) assert ( input_spec[0].type.WhichOneof("Type") == "imageType" ), "Expected imageType, got {}".format(input_spec[0].type.WhichOneof("Type")) assert ( input_spec[0].type.imageType.WhichOneof("SizeFlexibility") == "imageSizeRange" ), "Expected imageSizeRange in ShapeFlexibility" spec_H = input_spec[0].type.imageType.height spec_W = input_spec[0].type.imageType.width assert ( spec_H == 10 and spec_W == 10 ), "expected [H, W] == [10, 10], got [{}, {}] 
instead".format(spec_H, spec_W) spec_H_range = [ input_spec[0].type.imageType.imageSizeRange.heightRange.lowerBound, input_spec[0].type.imageType.imageSizeRange.heightRange.upperBound, ] spec_W_range = [ input_spec[0].type.imageType.imageSizeRange.widthRange.lowerBound, input_spec[0].type.imageType.imageSizeRange.widthRange.upperBound, ] assert spec_H_range == [10, 10], "Ranged height mismatch" assert spec_W_range == [10, 30], "Ranged width mismatch" def test_mil_ranged_image_with_default(self): input_shape = [ct.ImageType(name="x", shape=(1, 3, 10, ct.RangeDim(10, 30, default=20)))] mlmodel = ct.convert( self.basic_network, source="milinternal", convert_to="mlprogram", inputs=input_shape ) input_spec = mlmodel.get_spec().description.input assert len(input_spec) == 1, "1 input expected, got {} instead".format(len(input_spec)) assert input_spec[0].name == "x", "input name in MLModel is {}, 'x' is expected".format( input_spec[0].name ) assert ( input_spec[0].type.WhichOneof("Type") == "imageType" ), "Expected imageType, got {}".format(input_spec[0].type.WhichOneof("Type")) assert ( input_spec[0].type.imageType.WhichOneof("SizeFlexibility") == "imageSizeRange" ), "Expected imageSizeRange in ShapeFlexibility" spec_H = input_spec[0].type.imageType.height spec_W = input_spec[0].type.imageType.width assert ( spec_H == 10 and spec_W == 20 ), "expected [H, W] == [10, 20], got [{}, {}] instead".format(spec_H, spec_W) spec_H_range = [ input_spec[0].type.imageType.imageSizeRange.heightRange.lowerBound, input_spec[0].type.imageType.imageSizeRange.heightRange.upperBound, ] spec_W_range = [ input_spec[0].type.imageType.imageSizeRange.widthRange.lowerBound, input_spec[0].type.imageType.imageSizeRange.widthRange.upperBound, ] assert spec_H_range == [10, 10], "Ranged height mismatch" assert spec_W_range == [10, 30], "Ranged width mismatch" class TestMILDefaultValues: @mb.program(input_specs=[mb.TensorSpec(shape=[1]), mb.TensorSpec(shape=[1])]) def basic_network(x, y): return mb.add(x=x, y=y, name="output") def test_mil_default_value_to_proto(self): program_input_spec = [ ct.TensorType(name="x", shape=[1], default_value=np.array([1.0]).astype(np.float32)), ct.TensorType(name="y", shape=[1]), ] mlmodel = ct.convert(self.basic_network, convert_to="mlprogram", inputs=program_input_spec) input_spec = mlmodel.get_spec().description.input assert len(input_spec) == 2, "2 input expected, got {} instead".format(len(input_spec)) assert input_spec[0].name == "x", "input name in MLModel is {}, 'x' is expected".format( input_spec[0].name ) assert ( input_spec[0].type.WhichOneof("Type") == "multiArrayType" ), "Expected multiArrayType, got {}".format(input_spec[0].type.WhichOneof("Type")) assert ( input_spec[0].type.multiArrayType.WhichOneof("defaultOptionalValue") == "floatDefaultValue" ), "Expected floatDefaultValue, got {} instead".format( input_spec[0].type.multiArrayType.WhichOneof("defaultOptionalValue") ) assert input_spec[0].type.multiArrayType.floatDefaultValue == 1.0 def test_mil_default_value_runtime(self): program_input_spec = [ ct.TensorType(name="x", shape=[1], default_value=np.array([1.0]).astype(np.float32)), ct.TensorType(name="y", shape=[1]), ] mlmodel = ct.convert(self.basic_network, convert_to="mlprogram", inputs=program_input_spec) if _macos_version() < (12, 0): # Can only get predictions for ml program on macOS 12+ return res = mlmodel.predict({"x": np.array([3.0]), "y": np.array([2.0])}) assert res["output"][0] == 5.0 res = mlmodel.predict({"y": np.array([2.0])}) assert res["output"][0] == 3.0 class 
TestMILProtoLoad: """Verify that the MIL Proto in mlmodel is correctly loaded in iOS18+.""" @staticmethod @pytest.mark.parametrize("opset_version", [ct.target.iOS17, ct.target.iOS18]) def test_constexpr_use_inputs_instead_of_attributes(opset_version): """Test the constexpr uses inputs instead of attributes starting from iOS18.""" @mb.program(input_specs=[], opset_version=ct.target.iOS17) def prog_ios17(): return mb.constexpr_lut_to_dense( lut=np.array([1.0, 2.0, 3.0, 4.0]), indices=np.array([10, 4]).astype(np.uint8), shape=np.array([5]).astype(np.uint32), ) @mb.program(input_specs=[], opset_version=ct.target.iOS18) def prog_ios18(): return mb.constexpr_lut_to_dense( indices=np.array([4, 8, 10, 13, 24, 5, 6, 9, 13, 31, 17, 7, 2, 8, 3, 1]) .reshape((2, 4, 2)) .astype(np.uint8), lut=_TestConstexprLut._generate_lut(shape=(1, 2, 1, 256, 3)), vector_axis=1, ) mlmodel = ct.convert( prog_ios17 if opset_version == ct.target.iOS17 else prog_ios18, convert_to="mlprogram", minimum_deployment_target=opset_version, ) # Iterates the milproto in mlmodel to make sure lut op uses inputs instead of attributes. mil = mlmodel.get_spec().mlProgram for function in mil.functions.values(): for block in function.block_specializations.values(): for op in block.operations: if op.type == "constexpr_lut_to_dense": # The "attributes" field has at least one value for "name". expected_attributes_num = 1 expected_inputs_num = 0 if opset_version >= ct.target.iOS18: # Since iOS18, constexpr ops use inputs instead of attributes in milproto. expected_inputs_num += 3 else: expected_attributes_num += 3 assert len(op.attributes.values()) == expected_attributes_num assert len(op.inputs.values()) == expected_inputs_num @staticmethod def test_constexpr_multiple_outputs(): """Starting from iOS18 there are constexpr ops that have multiple outputs.""" @mb.program(input_specs=[], opset_version=ct.target.iOS18) def prog(): return mb.constexpr_sparse_blockwise_shift_scale( data_mask=np.array([[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [1, 1, 0, 0]]).astype( types.np_uint1_dtype ), nonzero_data=np.array([10, 11, 3, 4, 5, 6, 7, 8, 9]).astype(np.int8), scale=np.array([[0.1, 0.2, 0.3, 0.4]]), offset=np.array([[1, 2, 3, 4]]).astype(np.int8), )[1] mlmodel = ct.convert( prog, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS18, ) mil = mlmodel.get_spec().mlProgram for function in mil.functions.values(): for block in function.block_specializations.values(): for op in block.operations: if op.type == "constexpr_sparse_blockwise_shift_scale": assert len(op.outputs) == 2 @staticmethod def test_sub_byte_immediate_value(): """ Test the sub-byte immediate value tensor is exported as packed bytes. The sub-byte file value is tested in `coremltools/test/blob/test_weights.py` which is not in the scope of this test. 
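For example, the int4 tensor [-8, 7] constructed below occupies two 4-bit
nibbles, so its immediate value is expected to serialize into exactly one
packed byte.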
""" @mb.program(input_specs=[], opset_version=ct.target.iOS18) def prog(): return mb.constexpr_blockwise_shift_scale( data=np.array([-8, 7]).reshape((1, 2, 1)).astype(types.np_int4_dtype), scale=np.array([4]).reshape((1, 1, 1)).astype(np.float16), offset=np.array([4]).reshape((1, 1, 1)).astype(types.np_int4_dtype), ) mlmodel = ct.convert( prog, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS18, ) mil = mlmodel.get_spec().mlProgram for function in mil.functions.values(): for block in function.block_specializations.values(): for op in block.operations: if op.type == "constexpr_blockwise_shift_scale": bytes_val = ( op.inputs["data"].arguments[0].value.immediateValue.tensor.bytes.values ) # The two 4-bit values should be packed into a single byte. assert len(bytes_val) == 1 @staticmethod def check_functions_description( mlmodel: ct.models.MLModel, expect_function_names: List[str], expected_default_function_name: str, ) -> None: spec = mlmodel.get_spec() desc = spec.description assert len(desc.functions) == len(expect_function_names) for i in range(len(expect_function_names)): assert desc.functions[i].name == expect_function_names[i] assert desc.defaultFunctionName == expected_default_function_name @staticmethod def convert_and_save(prog: mil.Program) -> str: mlmodel = _mil_convert( prog, convert_to="mlprogram", convert_from="milinternal", specification_version=_SPECIFICATION_VERSION_IOS_18, compute_units=ct.ComputeUnit.CPU_ONLY, export_multi_functions=True, skip_model_load=True, ) package_path = tempfile.mkdtemp(suffix=".mlpackage") mlmodel.save(package_path) return package_path @staticmethod def check_relu(model: Union[ct.models.MLModel, ct.models.CompiledMLModel]) -> None: x = np.array([-1.0, 0.0, 1.0], dtype=np.float32) y_relu = [0, 0, 1] y = model.predict({"x": x}) assert all(y["relu_0"] == y_relu) @staticmethod def check_sin(model: Union[ct.models.MLModel, ct.models.CompiledMLModel]) -> None: x = np.array([-1.0, 0.0, 1.0], dtype=np.float32) y_sin = list(map(math.sin, x)) y = model.predict({"x": x}) np.testing.assert_allclose(y["sin_0"], y_sin, rtol=5e-04, atol=5e-04) @staticmethod def check_cos(model: Union[ct.models.MLModel, ct.models.CompiledMLModel]) -> None: x = np.array([-1.0, 0.0, 1.0], dtype=np.float32) y_sin = list(map(math.cos, x)) y = model.predict({"x": x}) np.testing.assert_allclose(y["cos_0"], y_sin, rtol=5e-04, atol=5e-04) @pytest.mark.skipif(ct.utils._macos_version() < (15, 0), reason="Multi-function only supported on macOS 15+") def test_multi_functions(self): """ Test multi-functions program can be exported into multi-functions Core ML proto. 
""" @mb.function( input_specs=[mb.TensorSpec((3,))], opset_version=ct.target.iOS18, ) def func(x): return mb.relu(x=x) @mb.function( input_specs=[mb.TensorSpec((3,))], opset_version=ct.target.iOS18, ) def func_1(x): return mb.sin(x=x) @mb.function( input_specs=[mb.TensorSpec((3,))], opset_version=ct.target.iOS18, ) def func_2(x): return mb.cos(x=x) prog = mil.Program() prog.add_function("main", func) prog.add_function("sin", func_1) prog.add_function("cos", func_2) package_path = self.convert_and_save(prog) # Test the proto can be loaded back and validate the spec mlmodel = ct.models.MLModel(package_path, function_name="main") self.check_functions_description( mlmodel, expect_function_names=["main", "sin", "cos"], expected_default_function_name="main", ) # Validate MLModel predictions for all three functions self.check_relu(mlmodel) self.check_sin( ct.models.MLModel( package_path, function_name="sin", compute_units=ct.ComputeUnit.CPU_ONLY ) ) self.check_cos( ct.models.MLModel( package_path, function_name="cos", compute_units=ct.ComputeUnit.CPU_ONLY ) ) # Validate MLModel function_name property assert mlmodel.function_name == "main" assert ct.models.MLModel(package_path, function_name="sin").function_name == "sin" assert ct.models.MLModel(package_path, function_name="cos").function_name == "cos" # Invalid function_name with pytest.raises(ValueError, match="function_name invalid not found in the model"): mlmodel = ct.models.MLModel(package_path, function_name="invalid") # Validate CompiledMLModel predictions for all three functions compiled_path = mlmodel.get_compiled_model_path() self.check_relu(ct.models.CompiledMLModel(compiled_path, function_name="main")) self.check_sin(ct.models.CompiledMLModel(compiled_path, function_name="sin")) self.check_cos(ct.models.CompiledMLModel(compiled_path, function_name="cos")) # clean up shutil.rmtree(package_path) @pytest.mark.skipif(ct.utils._macos_version() < (15, 0), reason="Multi-function only supported on macOS 15+") def test_multi_functions_default_function(self): """ Test if no function_name passes to MLModel, default function name will be picked up. 
""" @mb.function( input_specs=[mb.TensorSpec((3,))], opset_version=ct.target.iOS18, ) def func(x): return mb.relu(x=x) @mb.function( input_specs=[mb.TensorSpec((3,))], opset_version=ct.target.iOS18, ) def func_1(x): return mb.sin(x=x) prog = mil.Program() prog.add_function("main_1", func) prog.add_function("sin", func_1) prog.default_function_name = "main_1" package_path = self.convert_and_save(prog) # With no function_name passed, mlmodel.function_name defaults to defaultFunctionName mlmodel = ct.models.MLModel(package_path) self.check_functions_description( mlmodel, expect_function_names=["main_1", "sin"], expected_default_function_name="main_1", ) assert mlmodel.function_name == "main_1" # Validate the prediction runs on default function self.check_relu(mlmodel) # Validate CompiledMLModel predictions for default function compiled_path = mlmodel.get_compiled_model_path() self.check_relu(ct.models.CompiledMLModel(compiled_path)) # clean up shutil.rmtree(package_path) @pytest.mark.skipif(ct.utils._macos_version() < (15, 0), reason="Multi-function only supported on macOS 15+") def test_single_function_in_multifunction_format(self): @mb.function( input_specs=[mb.TensorSpec((3,))], opset_version=ct.target.iOS18, ) def func(x): return mb.relu(x=x) prog = mil.Program() prog.add_function("main_1", func) prog.default_function_name = "main_1" package_path = self.convert_and_save(prog) # No function_name is passed, default function name is picked up mlmodel = ct.models.MLModel(package_path) self.check_functions_description( mlmodel, expect_function_names=["main_1"], expected_default_function_name="main_1", ) # Validate MLModel predictions self.check_relu(mlmodel) self.check_relu(ct.models.MLModel(package_path, function_name="main_1")) # Validate CompiledMLModel predictions compiled_path = mlmodel.get_compiled_model_path() self.check_relu(ct.models.CompiledMLModel(compiled_path)) self.check_relu(ct.models.CompiledMLModel(compiled_path, function_name="main_1")) # clean up shutil.rmtree(package_path) @pytest.mark.skipif(ct.utils._macos_version() < (15, 0), reason="Multi-function only supported on macOS 15+") def test_multi_functions_backward_compatibility(self): # Test the new MLModel class can load pre-iOS17 single function model @mb.program(input_specs=[mb.TensorSpec((3,))], opset_version=ct.target.iOS16) def prog(x): return mb.relu(x=x) mlmodel = ct.convert( prog, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS16, ) # Test the proto can be saved and loaded back package_path = tempfile.mkdtemp(suffix=".mlpackage") mlmodel.save(package_path) # Validate the MLModel predictions self.check_relu(ct.models.MLModel(package_path)) self.check_relu(ct.models.MLModel(package_path, function_name="main")) # Validate the MLModel function_name property assert ct.models.MLModel(package_path).function_name is None assert ct.models.MLModel(package_path, function_name="main").function_name == "main" # Other function_name will error out with pytest.raises( ValueError, match='function_name must be "main" for non multifunction model' ): mlmodel = ct.models.MLModel(package_path, function_name="invalid") # Validate the CompiledMLModel predictions compiled_path = mlmodel.get_compiled_model_path() self.check_relu(ct.models.CompiledMLModel(compiled_path)) self.check_relu(ct.models.CompiledMLModel(compiled_path, function_name="main")) # invalid function error at runtime with pytest.raises(RuntimeError): compiled_model = ct.models.CompiledMLModel(compiled_path, function_name="invalid") # clean up 
shutil.rmtree(package_path) @pytest.mark.skipif( ct.utils._macos_version() < (15, 0), reason="Tests are for deployment target iOS18/macos15" ) class TestStateModelLoad: """ Verify stateful model can be loaded via milproto. """ @staticmethod def verify_stateful_model(mlmodel, expected_output, input=None): def verify_numerical(mlmodel, state, expected_output, input=None): if input is None: input_dict = {} else: input_dict = {"y": input} output = mlmodel.predict(input_dict, state=state)["output"] np.testing.assert_allclose(expected_output, output, rtol=5e-04, atol=5e-04) # verify the model can be ran state_1 = mlmodel.make_state() verify_numerical(mlmodel, state_1, expected_output, input) verify_numerical(mlmodel, state_1, expected_output, input) # create a new state, and make sure the model can run prediction on both old and new state state_2 = mlmodel.make_state() verify_numerical(mlmodel, state_2, expected_output, input) verify_numerical(mlmodel, state_1, expected_output, input) def test_export_state_input_feature(self): """ Test milproto can export model with state type. """ @mb.program( input_specs=[ mb.StateTensorSpec((2, 3), dtype=types.fp16), ], opset_version=ct.target.iOS18, ) def prog(x): return mb.read_state(input=x, name="output") # verify the model can be converted mlmodel = ct.convert( prog, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS18, compute_units=ct.ComputeUnit.CPU_ONLY, ) # verify the state feature spec = mlmodel.get_spec() state = spec.description.state assert len(state) == 1 assert state[0].name == "x" assert state[0].type.WhichOneof("Type") == "stateType" assert state[0].type.stateType.WhichOneof("Type") == "arrayType" array_type = state[0].type.stateType.arrayType assert array_type.shape == [2, 3] assert array_type.dataType == proto.FeatureTypes_pb2.ArrayFeatureType.FLOAT16 # verify the model expected_output = np.zeros((2, 3)) self.verify_stateful_model(mlmodel, expected_output) def test_export_mixed_state_input_features(self): """ Test milproto can export model with states and inputs. """ @mb.program( input_specs=[ mb.StateTensorSpec((2, 3), dtype=types.fp16), mb.TensorSpec((2, 3), dtype=types.fp16), ], opset_version=ct.target.iOS18, ) def prog(x, y): x = mb.read_state(input=x) return mb.add(x=x, y=y, name="output") # verify the model can be converted mlmodel = ct.convert( prog, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS18, compute_units=ct.ComputeUnit.CPU_ONLY, ) # verify the state feature spec = mlmodel.get_spec() state = spec.description.state assert len(state) == 1 assert state[0].name == "x" assert state[0].type.WhichOneof("Type") == "stateType" assert state[0].type.stateType.WhichOneof("Type") == "arrayType" array_type = state[0].type.stateType.arrayType assert array_type.shape == [2, 3] assert array_type.dataType == proto.FeatureTypes_pb2.ArrayFeatureType.FLOAT16 # verify the input input = spec.description.input assert len(input) == 1 assert input[0].name == "y" assert input[0].type.WhichOneof("Type") == "multiArrayType" array_type = input[0].type.multiArrayType assert array_type.shape == [2, 3] assert array_type.dataType == proto.FeatureTypes_pb2.ArrayFeatureType.FLOAT16 # verify the model input = np.random.rand(2, 3) self.verify_stateful_model(mlmodel, input, input) def test_multi_functions_state_model(self): """ Make sure multi-functions Core ML models support state. 
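Each function declares its own state input (``x`` with shape (3,) for
``main`` and ``y`` with shape (2,) for ``func_1``), and ``make_state()`` on a
model loaded with a given ``function_name`` is expected to produce a state
buffer matching that function.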
""" @mb.function( input_specs=[mb.StateTensorSpec((3,), dtype=types.fp16)], opset_version=ct.target.iOS18, ) def func(x): return mb.read_state(input=x, name="output") @mb.function( input_specs=[mb.StateTensorSpec((2,), dtype=types.fp16)], opset_version=ct.target.iOS18, ) def func_1(y): return mb.read_state(input=y, name="output") prog = mil.Program() prog.add_function("main", func) prog.add_function("func_1", func_1) mlmodel = _mil_convert( prog, convert_to="mlprogram", convert_from="milinternal", specification_version=_SPECIFICATION_VERSION_IOS_18, compute_units=ct.ComputeUnit.CPU_ONLY, export_multi_functions=True, ) spec = mlmodel.get_spec() desc = spec.description assert len(desc.functions) == 2 assert desc.functions[0].name == "main" assert len(desc.functions[0].state) == 1 assert desc.functions[0].state[0].name == "x" assert desc.functions[1].name == "func_1" assert len(desc.functions[1].state) == 1 assert desc.functions[1].state[0].name == "y" # main function is the default function self.verify_stateful_model(mlmodel, np.zeros((3,))) # save the mlmodel on disk, and load "main" and "func_1" seperately package_path = tempfile.mkdtemp(suffix=".mlpackage") mlmodel.save(package_path) # test "main" function mlmodel_main = ct.models.MLModel( package_path, compute_units=ct.ComputeUnit.CPU_ONLY, function_name="main" ) self.verify_stateful_model(mlmodel_main, np.zeros((3,))) # test "func_1" function mlmodel_func_1 = ct.models.MLModel( package_path, compute_units=ct.ComputeUnit.CPU_ONLY, function_name="func_1" ) self.verify_stateful_model(mlmodel_func_1, np.zeros((2,))) # cleanup mlpackage shutil.rmtree(package_path) def test_export_coreml_update_state(self): """ The ``coreml_update_state`` dialect op is decomposed into: write_state -> read_state """ @mb.program( input_specs=[ mb.StateTensorSpec((2, 3), dtype=types.fp16), mb.TensorSpec((2, 3), dtype=types.fp16), ], opset_version=ct.target.iOS18, ) def prog(x, y): return mb.coreml_update_state(state=x, value=y, name="output") mlmodel = ct.convert( prog, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS18, compute_units=ct.ComputeUnit.CPU_ONLY, ) mil = mlmodel.get_spec().mlProgram for function in mil.functions.values(): for block in function.block_specializations.values(): ops = list(block.operations) assert ops[0].type == "write_state" assert len(ops[0].outputs) == 0 assert ops[1].type == "read_state" # verify the model input = np.random.rand(2, 3) self.verify_stateful_model(mlmodel, input, input) @staticmethod def test_invalid_state_input(): """ Test unsupported input state modes. """ # state only supports fp16 @mb.program( input_specs=[ mb.StateTensorSpec((2, 3), dtype=types.fp32), ], opset_version=ct.target.iOS18, ) def prog(x): return mb.read_state(input=x) with pytest.raises( ValueError, match="State only support fp16 dtype. 
Got input var x with dtype fp32.", ): mlmodel = ct.convert( prog, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS18, ) # state doesn't support flexible shape @mb.program( input_specs=[ mb.StateTensorSpec((2, get_new_symbol()), dtype=types.fp32), ], opset_version=ct.target.iOS18, ) def prog(x): return mb.read_state(input=x) with pytest.raises(ValueError, match="Flexible shape model states are not supported!"): mlmodel = ct.convert( prog, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS18, ) @staticmethod def test_coreml_update_state_lowering(): """ If the output of coreml_update_state is not a block output and it is not fed into any other ops, the op should be translated into a single write_state. """ @mb.program( input_specs=[ mb.StateTensorSpec((1,), dtype=types.fp16), mb.TensorSpec((1,), dtype=types.fp16), mb.TensorSpec((1,), dtype=types.fp16), ], opset_version=ct.target.iOS18, ) def prog(state, x, y): mb.coreml_update_state(state=state, value=x) mb.coreml_update_state(state=state, value=y) return x, mb.coreml_update_state(state=state, value=y) mlmodel = ct.convert( prog, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS18, skip_model_load=True, ) mil = mlmodel.get_spec().mlProgram for function in mil.functions.values(): for block in function.block_specializations.values(): ops = list(block.operations) expected_ops = [ "write_state", "write_state", "write_state", "read_state", ] assert [val.type for val in ops] == expected_ops @staticmethod def test_coreml_update_state_lowering_with_prefer_state_in_downstream(): @mb.program( input_specs=[ mb.StateTensorSpec((1,), dtype=types.fp16), mb.TensorSpec((1,), dtype=types.fp16), mb.TensorSpec((1,), dtype=types.fp16), mb.TensorSpec((1,), dtype=types.fp16), ], opset_version=ct.target.iOS18, ) def prog(state, x, y, z): # Although seemingly not used, graph pass prefer_state_in_downstream will # make its output as identiy.x mb.coreml_update_state(state=state, value=x) # If value only feeds into coreml_update_state, # the prefer_state_in_downstream has no affects mb.coreml_update_state(state=state, value=y) # This is the one that really is not used mb.coreml_update_state(state=state, value=z) return mb.identity(x=x), mb.coreml_update_state(state=state, value=y) mlmodel = ct.convert( prog, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS18, skip_model_load=True, ) mil = mlmodel.get_spec().mlProgram for function in mil.functions.values(): for block in function.block_specializations.values(): ops = list(block.operations) expected_ops = [ "write_state", "read_state", "write_state", "write_state", "identity", "write_state", "read_state", ] assert [val.type for val in ops] == expected_ops @staticmethod @pytest.mark.skipif(ct.utils._macos_version() < (15, 0), reason="State only supported on macOS 15+") def test_prediction_state(): """ Test prediction from a stateful model """ def extract_value(y): return list(y.values())[0][0] def test_state_model(mlmodel, multiplier): # Using first state state1 = mlmodel.make_state() for i in range(1, 5): y = mlmodel.predict({}, state=state1) assert extract_value(y) == multiplier * i # Use a new state state2 = mlmodel.make_state() for i in range(1, 5): y = mlmodel.predict({}, state=state2) assert extract_value(y) == multiplier * i # Go back to using the first state for i in range(5, 10): y = mlmodel.predict({}, state=state1) assert extract_value(y) == multiplier * i @mb.program( input_specs=[ mb.StateTensorSpec((1,), dtype=types.fp16), ], 
opset_version=ct.target.iOS18, ) def increment(x): # Read y = mb.read_state(input=x) # Update y = mb.add(x=y, y=np.array([1.0]).astype("float16")) # Write y = mb.coreml_update_state(state=x, value=y) # Return return y @mb.program( input_specs=[ mb.StateTensorSpec((1,), dtype=types.fp16), ], opset_version=ct.target.iOS18, ) def increment_by_2(x): # Read y = mb.read_state(input=x) # Update y = mb.add(x=y, y=np.array([1.0]).astype("float16")) # Write y = mb.coreml_update_state(state=x, value=y) # Update y = mb.add(x=y, y=np.array([1.0]).astype("float16")) # Write mb.coreml_update_state(state=x, value=y) # Return return y for model, multiplier in [(increment, 1), (increment_by_2, 2)]: mlmodel = ct.convert( model, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS18, ) # The test is failing on x86_64 machines # rdar://126957030 ([State][Bug][Intel] Stateful model prediction is wrong on Intel laptop) if platform.machine() == "arm64": test_state_model(mlmodel, multiplier) # save the model and load it back package_path = tempfile.mkdtemp(suffix=".mlpackage") mlmodel.save(package_path) # Load with CPU test_state_model( ct.models.MLModel(package_path, compute_units=ct.ComputeUnit.CPU_ONLY), multiplier ) # Load with ALL if platform.machine() == "arm64": test_state_model(ct.models.MLModel(package_path), multiplier) shutil.rmtree(package_path) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2095466 coremltools-8.0/coremltools/converters/mil/backend/nn/0000755000000000000000000000000014672075535022016 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/nn/__init__.py0000644000000000000000000000033214672066616024125 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/nn/load.py0000644000000000000000000003235514672066616023317 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
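# (Sketch of the state-prediction pattern exercised by the stateful tests
#  above; ``mlmodel`` stands for a converted stateful model such as the
#  ``increment`` program.)
#
#     state_1 = mlmodel.make_state()        # fresh fp16 state buffer
#     mlmodel.predict({}, state=state_1)    # first call: counter -> 1
#     mlmodel.predict({}, state=state_1)    # same state again: counter -> 2
#     state_2 = mlmodel.make_state()        # an independent state starts over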
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import coremltools as ct from coremltools.converters._profile_utils import _profile from coremltools.converters.mil.backend.backend_helper import _get_probability_var_for_classifier from coremltools.converters.mil.input_types import ( ColorLayout, EnumeratedShapes, ImageType, RangeDim, Shape, TensorType, ) from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.types.symbolic import any_symbolic, any_variadic, is_symbolic from coremltools.models import model, neural_network from coremltools.models.datatypes import Array from coremltools.models.neural_network import flexible_shape_utils from coremltools.models.neural_network.flexible_shape_utils import ( add_enumerated_image_sizes, add_multiarray_ndshape_enumeration, set_multiarray_ndshape_range, ) from ..backend_helper import _get_colorspace_enum, _validate_image_input_output_shapes from .op_mapping import convert_ops def _convert_to_image_input(proto, inputs, skip_model_load=False): tmp_model = model.MLModel(proto, skip_model_load=skip_model_load) for input_type in inputs: if isinstance(input_type, ImageType): if input_type.color_layout in (ColorLayout.GRAYSCALE, ColorLayout.GRAYSCALE_FLOAT16): gray_bias = input_type.bias red_bias, green_bias, blue_bias = 0.0, 0.0, 0.0 elif input_type.color_layout == ColorLayout.RGB: gray_bias = 0.0 red_bias, green_bias, blue_bias = input_type.bias elif input_type.color_layout == ColorLayout.BGR: gray_bias = 0.0 blue_bias, green_bias, red_bias = input_type.bias tmp_model = neural_network.utils.make_image_input( tmp_model, input_type.name, is_bgr=input_type.color_layout == ColorLayout.BGR, image_format="NCHW" if input_type.channel_first else "NHWC", red_bias=red_bias, green_bias=green_bias, blue_bias=blue_bias, gray_bias=gray_bias, scale=input_type.scale, ) return tmp_model.get_spec() def _convert_to_classifier(proto, classifier_config, skip_model_load=False): tmp_model = model.MLModel(proto, skip_model_load=skip_model_load) tmp_model = neural_network.utils.make_nn_classifier( tmp_model, classifier_config.class_labels, classifier_config.predicted_feature_name, classifier_config.predicted_probabilities_output, ) return tmp_model.get_spec() def _set_user_inputs(proto, inputs): for input_type in inputs: shape = input_type.shape if isinstance(shape, EnumeratedShapes): if isinstance(input_type, ImageType): default_height, default_width = 0, 0 for inp in proto.description.input: if inp.name == input_type.name: default_height = inp.type.imageType.height default_width = inp.type.imageType.width break image_sizes = [] if input_type.channel_first: for s in shape.shapes: if s.shape[-2] == default_height and s.shape[-1] == default_width: continue image_sizes.append( flexible_shape_utils.NeuralNetworkImageSize( height=s.shape[-2], width=s.shape[-1] ) ) else: for s in shape.shapes: if s.shape[-3] == default_height and s.shape[-2] == default_width: continue image_sizes.append( flexible_shape_utils.NeuralNetworkImageSize( height=s.shape[-3], width=s.shape[-2] ) ) add_enumerated_image_sizes( proto, input_type.name, sizes=image_sizes ) else: add_multiarray_ndshape_enumeration( proto, input_type.name, [tuple(s.shape) for s in shape.shapes] ) elif isinstance(shape, Shape): shape = shape.shape # This is shape in Shape if all( [ not isinstance(s, RangeDim) and not is_symbolic(s) and s > 0 for s in shape ] ): continue if 
isinstance(input_type, ImageType): img_range = flexible_shape_utils.NeuralNetworkImageSizeRange() if input_type.channel_first: H = shape[-2] W = shape[-1] else: H = shape[-3] W = shape[-2] if isinstance(H, RangeDim): img_range.add_height_range((H.lower_bound, H.upper_bound)) elif is_symbolic(H): img_range.add_height_range((1, -1)) else: img_range.add_height_range((H, H)) if isinstance(W, RangeDim): img_range.add_width_range((W.lower_bound, W.upper_bound)) elif is_symbolic(W): img_range.add_width_range((1, -1)) else: img_range.add_width_range((W, W)) flexible_shape_utils.update_image_size_range( proto, input_type.name, img_range ) else: lb = [] ub = [] for s in shape: if isinstance(s, RangeDim): lb.append(s.lower_bound) ub.append(s.upper_bound) elif is_symbolic(s): lb.append(1) ub.append(-1) else: lb.append(s) ub.append(s) set_multiarray_ndshape_range( proto, input_type.name, lower_bounds=lb, upper_bounds=ub ) def _set_symbolic_inputs(proto, symbolic_inputs): # Set symbolic input shapes by -1 inferred from graph for input_name, shape in symbolic_inputs.items(): lb = [1 if is_symbolic(d) else d for d in shape] ub = [-1 if is_symbolic(d) else d for d in shape] set_multiarray_ndshape_range( proto, input_name, lower_bounds=lb, upper_bounds=ub ) def _set_optional_inputs(proto, input_types): # Set default values for optional input_types default_map = {} for input_type in input_types: if not isinstance(input_type, TensorType): continue if input_type.default_value is not None: default_map[input_type.name] = input_type.default_value for idx, input in enumerate(proto.description.input): name = proto.description.input[idx].name if name in default_map: default_value = default_map[name] proto.description.input[idx].type.isOptional = True array_t = proto.description.input[idx].type.multiArrayType default_fill_val = default_value.flatten()[0] array_t.floatDefaultValue = default_fill_val if default_fill_val != 0 or list(default_value.shape) != \ array_t.shape: # promote spec version to 5 and set the default value proto.specificationVersion = max(proto.specificationVersion, ct._SPECIFICATION_VERSION_IOS_14) # array_t.shape is not empty. array_t.ClearField('shape') array_t.shape.extend(list(default_value.shape)) @_profile def load(prog, **kwargs): if "main" not in prog.functions: msg = "main function not found in program {}" raise ValueError(msg.format(prog)) if len(prog.functions) != 1: msg = ( "Program must have exactly one `main` function to " "convert to NN. Program: {}" ) raise ValueError(msg.format(prog)) input_types = prog.functions["main"].input_types output_types = prog.functions["main"].output_types v1_inputs = [] symbolic_inputs = {} for name, var in prog.functions["main"].inputs.items(): if types.is_tensor(var.sym_type): sym_shape = var.sym_type.get_shape() if any_variadic(sym_shape): raise NotImplementedError("Variadic rank is not supported") if any_symbolic(sym_shape): user_specified = False for input_type in input_types: if name == input_type.name: sym_shape = input_type.shape.default user_specified = True break # Use dummy static shape, and will set it later. 
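# (These 1s are placeholders: the symbolic dims recorded in symbolic_inputs
# are widened later by _set_symbolic_inputs into a flexible range with lower
# bound 1 and upper bound -1, i.e. unbounded.)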
shape = [1 if is_symbolic(d) else d for d in sym_shape] if not user_specified: symbolic_inputs[name] = sym_shape else: shape = sym_shape v1_inputs.append((name, Array(*shape))) elif types.is_scalar(var.sym_type): v1_inputs.append((name, Array(1))) else: raise NotImplementedError() v1_outputs = [] for var in prog.functions["main"].outputs: if types.is_tensor(var.sym_type) or types.is_primitive(var.sym_type): # Disregard the output types v1_outputs.append((var.name, None)) else: raise NotImplementedError() # create neural network builder builder = neural_network.NeuralNetworkBuilder( v1_inputs, v1_outputs, disable_rank5_shape_mapping=True, use_float_arraytype=True, ) # const in V2 are added lazily to V1 by each op whenever needed. # `const_context` stores the const names we've added so far and avoid # adding a const more than once. # const_context: list[set of str] (const name for v1 & v2 # (the same)). Note that in NN in outer layer is visible from the inner # layer, so the const_context is simply a stack of set. const_context = [] # Iterate through ops and add to builder convert_ops( const_context, builder, prog.functions["main"].operations, prog.functions["main"].outputs, ) proto = builder.spec # image input has_image_input = any([isinstance(s, ImageType) for s in input_types]) if has_image_input: proto = _convert_to_image_input(proto, input_types, skip_model_load=kwargs.get("skip_model_load", False)) # image output if output_types is not None: assert len(output_types) == len(prog.functions["main"].outputs), \ "number of mil program outputs do not match the number of outputs provided by the user" for i, output_proto_desc in enumerate(proto.description.output): output_var = prog.functions["main"].outputs[i] if isinstance(output_types[i], ImageType): if not types.is_tensor(var.sym_type): raise ValueError("Image output, '{}', is a scalar, but it should be a tensor of rank 4".format( var.name)) shape = var.sym_type.get_shape() if any_variadic(shape): raise ValueError("Variable rank model outputs, that are ImageTypes, are not supported") if any([is_symbolic(d) for d in shape]): raise NotImplementedError("Image output '{}' has symbolic dimensions in its shape". format(var.name)) _validate_image_input_output_shapes(output_types[i].color_layout, shape, var.name, is_input=False) clr_space = _get_colorspace_enum(output_types[i].color_layout) output_proto_desc.type.imageType.colorSpace = clr_space output_proto_desc.type.imageType.width = shape[-1] output_proto_desc.type.imageType.height = shape[-2] # classifier flag classifier_config = kwargs.get("classifier_config", None) if classifier_config is not None: # verify that classifier_config.predicted_probabilities_output if its exists. 
# And if its empty/None, fill it with the last non const op's output # this is done in "_get_probability_var_for_classifier()" probability_var = _get_probability_var_for_classifier(prog, classifier_config) if classifier_config.predicted_probabilities_output != probability_var.name: classifier_config.predicted_probabilities_output = probability_var.name # add classifier related fields to the proto spec proto = _convert_to_classifier(proto, classifier_config, skip_model_load=kwargs.get("skip_model_load", False)) _set_user_inputs(proto, input_types) _set_symbolic_inputs(proto, symbolic_inputs) _set_optional_inputs(proto, input_types) return proto ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/nn/mil_to_nn_mapping_registry.py0000644000000000000000000000134314672066616030012 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause MIL_TO_NN_MAPPING_REGISTRY = {} def register_mil_to_nn_mapping(func=None, override=False): def func_wrapper(_func): f_name = _func.__name__ if not override and f_name in MIL_TO_NN_MAPPING_REGISTRY: raise ValueError("MIL to NN mapping for MIL op {} is already registered.".format(f_name)) MIL_TO_NN_MAPPING_REGISTRY[f_name] = _func return _func if func is None: # decorator called without argument return func_wrapper return func_wrapper(func)././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/nn/op_mapping.py0000644000000000000000000037553614672066616024544 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as _np from tqdm import tqdm as _tqdm from coremltools import _logger as logger from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.ops.registry import SSAOpRegistry from coremltools.converters.mil.mil.types.symbolic import (any_symbolic, is_symbolic, is_variadic) from coremltools.converters.mil.mil.types.type_mapping import np_val_to_py_type from coremltools.models import neural_network as neural_network from coremltools.models.neural_network.quantization_utils import \ _convert_array_to_nbit_quantized_bytes from coremltools.proto import NeuralNetwork_pb2 from .mil_to_nn_mapping_registry import (MIL_TO_NN_MAPPING_REGISTRY, register_mil_to_nn_mapping) def convert_ops(const_context, builder, ops, outputs): """ const_context: list[set of str]: const name for v1 & v2 (the same) builder: neural_network.NeuralNetworkBuilder ops: list[Operation], usually from Block.operations. outputs: list[Var]. block outputs """ const_context.append(set()) custom_ops = SSAOpRegistry.custom_ops for op in _tqdm(ops, desc="Translating MIL ==> NeuralNetwork Ops", unit=" ops"): if op.op_type in custom_ops: mapper = MIL_TO_NN_MAPPING_REGISTRY["custom_op"] elif op.op_type in MIL_TO_NN_MAPPING_REGISTRY: mapper = MIL_TO_NN_MAPPING_REGISTRY[op.op_type] else: msg = ("Op {} is used in the source model. This op is not supported " "by the NeuralNetwork (compatibility with MacOS < 12, iOS < 15) model " "type. 
To successfully convert this model, convert to the ML Program " "model type (minimum target MacOS 12, iOS 15 and later).\n" "Use coremltools.convert(..., convert_to=\"mlprogram\") to convert to ML Program.\n" "block: {}") raise NotImplementedError(msg.format(op.op_type, op.enclosing_block)) # const is globally shared in nn. mapper(const_context, builder, op) for ov in outputs: # If block return value is a const, we need to add it. if ov.op is None: continue # placeholder if ov.op.op_type == "const": add_const(const_context, builder, ov.name, ov.val) const_context.pop() def make_input(const_context, builder, variables): """ Ensure that variables, if const, are added to builder. variables: list[Var] or Var or str. Inputs for an nn layer. Returns: list[str] or str: variables' names. """ if isinstance(variables, (list, tuple)): return [make_input(const_context, builder, v) for v in variables] if isinstance(variables, str): return variables v = variables # variables is Var if v.op is not None and v.op.op_type == "const" and v.name not in const_context[-1]: add_const(const_context, builder, v.name, v.val) return v.name def _convert_pool(const_context, builder, op, mode, exclude_padding_from_average=True): num_spatial_dimensions = len(op.kernel_sizes.val) op_pad = op.pad.val if op.pad_type.val == 'custom' \ else [0] * num_spatial_dimensions * 2 padding_type = op.pad_type.val.upper() same_padding_asymmetry_mode = "BOTTOM_RIGHT_HEAVY" if padding_type == "SAME_LOWER": if num_spatial_dimensions == 3: msg = "For the neuralnetwork backend, padding_mode ``same_lower`` is not supported for 3d pooling." raise ValueError(msg) padding_type = "SAME" same_padding_asymmetry_mode = "TOP_LEFT_HEAVY" if num_spatial_dimensions == 1: builder.add_expand_dims( name=op.name + "_expanded", input_name=op.x.name, output_name=op.name + "_expanded", axes=[-2], ) # nn's add_pool function does not support CUSTOM padding, # but VALID padding supports user-defined padding amounts. # Therefore we map CUSTOM padding to VALID padding. padding_type = "VALID" if padding_type == "CUSTOM" else padding_type builder.add_pooling( name=op.name, height=1, width=op.kernel_sizes.val[-1], stride_height=1, stride_width=op.strides.val[-1], layer_type=mode.upper(), padding_type="INCLUDE_LAST_PIXEL" if op.ceil_mode.val else padding_type, input_name=make_input(const_context, builder, op.name + "_expanded"), output_name=op.name + "_pool", exclude_pad_area=exclude_padding_from_average, padding_top=0, padding_bottom=0, padding_left=op_pad[0], padding_right=op_pad[1], is_global=False, same_padding_asymmetry_mode=same_padding_asymmetry_mode, ) builder.add_squeeze( name=op.name + "_squeeze", input_name=op.name + "_pool", output_name=op.outputs[0].name, axes=[-2], ) elif num_spatial_dimensions == 2: # nn's add_pool function does not support CUSTOM padding, # but VALID padding supports user-defined padding amounts. # Therefore we map CUSTOM padding to VALID padding. 
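# Illustrative sketch (not part of the converter; the blob/layer names and the
# builder setup below are hypothetical): the CUSTOM -> VALID mapping used in
# this branch simply passes the requested per-edge pad amounts to a "VALID"
# pooling layer. A standalone equivalent with the public builder API could be:
#
#     from coremltools.models import datatypes, neural_network
#     b = neural_network.NeuralNetworkBuilder(
#         [("x", datatypes.Array(3, 8, 8))], [("y", None)],
#         disable_rank5_shape_mapping=True)
#     b.add_pooling(name="pool", height=3, width=3,
#                   stride_height=1, stride_width=1,
#                   layer_type="AVERAGE", padding_type="VALID",
#                   input_name="x", output_name="y",
#                   padding_top=1, padding_bottom=1,
#                   padding_left=1, padding_right=1,
#                   exclude_pad_area=True, is_global=False)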
padding_type = "VALID" if padding_type == "CUSTOM" else padding_type builder.add_pooling( name=op.name, height=op.kernel_sizes.val[-2], width=op.kernel_sizes.val[-1], stride_height=op.strides.val[-2], stride_width=op.strides.val[-1], layer_type=mode.upper(), padding_type="INCLUDE_LAST_PIXEL" if op.ceil_mode.val else padding_type, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, exclude_pad_area=exclude_padding_from_average, padding_top=op_pad[0], padding_bottom=op_pad[1], padding_left=op_pad[2], padding_right=op_pad[3], is_global=False, same_padding_asymmetry_mode=same_padding_asymmetry_mode, ) elif num_spatial_dimensions == 3: builder.add_pooling3d( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, pooling_type=mode.upper(), kernel_depth=op.kernel_sizes.val[-3], kernel_height=op.kernel_sizes.val[-2], kernel_width=op.kernel_sizes.val[-1], stride_depth=op.strides.val[-3], stride_height=op.strides.val[-2], stride_width=op.strides.val[-1], padding_mode=op.pad_type.val, custom_padding_front=op_pad[0], custom_padding_back=op_pad[1], custom_padding_top=op_pad[2], custom_padding_bottom=op_pad[3], custom_padding_left=op_pad[4], custom_padding_right=op_pad[5], average_pooling_count_excludes_padding=exclude_padding_from_average, ) else: raise ValueError( "Unsupported number of spatial dimensions. Maximum is 3, but got %s" % num_spatial_dimensions ) def _try_convert_global_pool(const_context, builder, op, mode): """ Optional performance optimization pass that tries to lower spatial reduce_mean / reduce_max to global_avg_pool / global_max_pool. Return True if the lowering happened, otherwise return False to continue as normal reduction op. """ rank = op.x.rank if is_variadic(rank) or rank not in {4, 5}: return False keep_dims = op.keep_dims.val if keep_dims is False: return False axes = None if op.axes is not None and op.axes.val is not None: axes = op.axes.val else: axes = list(range(rank)) if tuple(op.outputs[0].shape[:-2]) != tuple(op.inputs["x"].shape[:-2]): return False if not all([s == 1 for s in op.outputs[0].shape[-2:]]): return False builder.add_pooling( name=op.name, height=0, width=0, stride_height=0, stride_width=0, layer_type=mode.upper(), padding_type="valid".upper(), input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, is_global=True, ) return True def add_const(const_context, builder, name, val): """ const_context (list of set of str): const names added to v1 builder. Const names are identical between v2 and v1 name (str): name of const. Should be the same for v1 and v2. val: np.ndarray No return values as `name` is the name of const in v1. Comment: we don't need to add scalar const as they are just fields in layer proto message in NN. If we really need a const scalar, we upcast it to rank-1. """ for const_set in const_context: if name in const_set: logger.warning("Const {} was already added.".format(name)) return if not isinstance(val, (_np.ndarray, _np.generic)): val = _np.array([val]) if val.dtype != float: # nn proto only supports float32 activation. 
(e.g., pred in cond op # needs to be converted to float) val = val.astype(float) rank = len(val.shape) if rank == 0: builder.add_load_constant_nd( name=name, output_name=name, constant_value=val.reshape([1]), shape=[1] ) else: builder.add_load_constant_nd( name=name, output_name=name, constant_value=val, shape=val.shape ) const_context[-1].add(name) logger.info("added const {} for builder {}".format(name, builder)) # Helper routines for recurrent layers def _expand_dim(builder, node_name, input_name, axes): builder.add_expand_dims( name=node_name, input_name=input_name, output_name=node_name, axes=axes ) def _squeeze(builder, node_name, input_name, axes): builder.add_squeeze( name=node_name, input_name=input_name, output_name=node_name, axes=axes ) def _split(x, sections, axis=0): if x is None: return None if x.shape[axis] % sections != 0: raise ValueError( "Cannot split axis {} into {} sections for input of shape {}".format( axis, sections, x.shape ) ) return _np.split(x, sections, axis=axis) @register_mil_to_nn_mapping def avg_pool(const_context, builder, op): _convert_pool( const_context=const_context, builder=builder, op=op, mode="average", exclude_padding_from_average=op.exclude_padding_from_average.val, ) @register_mil_to_nn_mapping def band_part(const_context, builder, op): builder.add_matrix_band_part( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, num_lower=op.lower.val, num_upper=op.upper.val, ) @register_mil_to_nn_mapping def batch_norm(const_context, builder, op): channels = op.x.shape[1] gamma = _np.array([1.0] * channels) if op.gamma is None else op.gamma.val beta = _np.array([0.0] * channels) if op.beta is None else op.beta.val x_name = make_input(const_context, builder, op.x) out_name = op.outputs[0].name is_batchnorm_1d = op.x.rank == 3 is_batchnorm_2d = op.x.rank == 4 is_batchnorm_3d = op.x.rank == 5 if is_batchnorm_1d: x_name = op.name + "_expanded" builder.add_expand_dims( name=x_name, input_name=op.x.name, output_name=x_name, axes=[-2], ) out_name += "_batch_norm" if is_batchnorm_1d or is_batchnorm_2d: # batch norm 1d / 2d builder.add_batchnorm( name=op.name, channels=channels, gamma=gamma, beta=beta, mean=op.mean.val, variance=op.variance.val, input_name=x_name, output_name=out_name, compute_mean_var=False, instance_normalization=False, epsilon=op.epsilon.val, ) elif is_batchnorm_3d: # batch norm 3d batch_size, channel, height, width, depth = op.x.shape assert not is_symbolic(channel), "Channel dimension must be known for batchnorm layer." 
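# Illustrative sketch (not part of the converter): when more than one of the
# dimensions below is symbolic, the 3-D case is expanded into broadcastable
# elementwise layers using the identity
#     y = gamma * (x - mean) / sqrt(var + eps) + beta
#       = (x + (-mean)) * (gamma / sqrt(var + eps)) + beta
# A quick numpy check of that identity (shapes here are hypothetical):
#
#     import numpy as np
#     x = np.random.rand(2, 3, 4, 5, 6).astype(np.float32)
#     gamma = np.random.rand(1, 3, 1, 1, 1)
#     beta = np.random.rand(1, 3, 1, 1, 1)
#     mean = x.mean(axis=(0, 2, 3, 4), keepdims=True)
#     var = x.var(axis=(0, 2, 3, 4), keepdims=True)
#     eps = 1e-5
#     ref = gamma * (x - mean) / np.sqrt(var + eps) + beta
#     dec = (x + (-mean)) * (gamma / np.sqrt(var + eps)) + beta
#     assert np.allclose(ref, dec, atol=1e-6)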
symbolic_num = sum([is_symbolic(x) for x in op.x.shape]) if symbolic_num > 1: gamma_expand = _np.expand_dims(gamma, axis=(0, 2, 3, 4)) beta_expand = _np.expand_dims(beta, axis=(0, 2, 3, 4)) mean_expand = _np.expand_dims(op.mean.val, axis=(0, 2, 3, 4)) var_expand = _np.expand_dims(op.variance.val, axis=(0, 2, 3, 4)) # compute batch norm 3d by decomposing it into elementwise operations negative_mean_name = op.name + "_negative_mean" add_const(const_context, builder, negative_mean_name, -mean_expand) numerator_name = op.name + "_numerator" builder.add_add_broadcastable( name=numerator_name, input_names=[x_name, negative_mean_name], output_name=numerator_name, ) var_expand = var_expand + op.epsilon.val denominator = _np.sqrt(var_expand) gamma_expand = gamma_expand / denominator gamma_name = op.name + "_gamma" add_const(const_context, builder, gamma_name, gamma_expand) mul_name = op.name + "_mul" builder.add_multiply_broadcastable( name=mul_name, input_names=[numerator_name, gamma_name], output_name=mul_name, ) beta_name = op.name + "_beta" add_const(const_context, builder, beta_name, beta_expand) builder.add_add_broadcastable( name=out_name, input_names=[mul_name, beta_name], output_name=out_name, ) else: is_batch_symbloic = is_symbolic(batch_size) is_height_symbolic = is_symbolic(height) is_width_symbolic = is_symbolic(width) is_depth_symbolic = is_symbolic(depth) if is_batch_symbloic: shape1 = [-1, channel, height * width, depth] shape2 = [-1, channel, height, width, depth] elif is_height_symbolic: shape1 = [batch_size, channel, -1, width*depth] shape2 = [batch_size, channel, -1, width, depth] elif is_width_symbolic: shape1 = [batch_size, channel, -1, height*depth] shape2 = [batch_size, channel, height, -1, depth] elif is_depth_symbolic: shape1 = [batch_size, channel, height * width, -1] shape2 = [batch_size, channel, height, width, -1] else: shape1 = [batch_size, channel, height*width, depth] shape2 = [batch_size, channel, height, width, depth] reshape_4d_name = op.name + "_reshape_4d" builder.add_reshape_static( name=reshape_4d_name, input_name=x_name, output_name=reshape_4d_name, output_shape=shape1, ) batchnorm_name = op.name + "_batchnorm_4d" builder.add_batchnorm( name=batchnorm_name, channels=channels, gamma=gamma, beta=beta, mean=op.mean.val, variance=op.variance.val, input_name=reshape_4d_name, output_name=batchnorm_name, compute_mean_var=False, instance_normalization=False, epsilon=op.epsilon.val, ) builder.add_reshape_static( name=out_name, input_name=batchnorm_name, output_name=out_name, output_shape=shape2, ) # Squeeze added `Width` dimension for 1d case if is_batchnorm_1d: x_name = op.name + "_squeeze" builder.add_squeeze( name=x_name, input_name=out_name, output_name=op.outputs[0].name, axes=[-2], ) @register_mil_to_nn_mapping def const(const_context, builder, op): # const in V2 are added to V1 lazily. 
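# Illustrative sketch (not part of the converter): because consts are emitted
# lazily, the `const` op itself is a no-op here. `make_input` above adds the
# constant the first time a consuming layer asks for it and records the name
# in `const_context`, so repeated uses do not create duplicate layers:
#
#     # weight_var is a hypothetical const Var
#     name = make_input(const_context, builder, weight_var)  # emits add_const(...)
#     name = make_input(const_context, builder, weight_var)  # cached, no new layer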
pass def conv_helper(const_context, builder, op): # v2 x: (n, C_in/groups, spatial_dims) x_name = make_input(const_context, builder, op.x) out_name = op.outputs[0].name is_conv1d = op.x.rank == 3 is_conv2d = op.x.rank == 4 is_conv3d = op.x.rank == 5 if not (is_conv1d or is_conv2d or is_conv3d): raise ValueError( "Input tensor rank '{}' is not one of '{}'.".format(op.x.rank, (3, 4, 5),) ) if is_conv1d: x_name = op.name + "_expand_dim" out_name += "_expanded" builder.add_expand_dims( name=x_name, input_name=op.x.name, output_name=x_name, axes=[-2], ) # `x_name` is guaranteed to be (n, C_in/groups, spatial_dims) for 1D and 2D convolution # W_v1 wil be np.ndarray (if W is const at compile time) or None # (if W is not known at compile time). weights = None input_names = [x_name] if op.weight.val is not None: # v2 convolution (conv3d) expects weights to have shape (C_out, C_in/groups, spatial_dims) # v1 convolution expects (H, W, C_in/groups, C_out) or (D, H, W, C_in/groups, C_out) weights = op.weight.val if is_conv1d: weights = _np.expand_dims(op.weight.val, -2) if is_conv1d or is_conv2d: weights = _np.transpose(weights, [2, 3, 1, 0]) else: # op.weight is not const at compile time. # When weight is dynamic, v1 convolution expects weight to be # (C_out, C_in/groups, H, W) # TODO 3D convolution doesn't support dynamic weights: if is_conv3d: raise ValueError("3D Convolution doesn't support dynamic weights.") weights_name = op.weight.name if is_conv1d: weights_name += "_expand_dim" builder.add_expand_dims( name=weights_name, input_name=op.weight.name, output_name=weights_name, axes=[-2], ) input_names.append(weights_name) # padding padding_mode = op.pad_type.val pad = {} if padding_mode == "custom": if is_conv1d: padding_mode = "valid" pad["padding_top"] = 0 pad["padding_bottom"] = 0 pad["padding_left"] = op.pad.val[0] pad["padding_right"] = op.pad.val[1] elif is_conv2d: padding_mode = "valid" pad["padding_top"] = op.pad.val[0] pad["padding_bottom"] = op.pad.val[1] pad["padding_left"] = op.pad.val[2] pad["padding_right"] = op.pad.val[3] else: pad["padding_front"] = op.pad.val[0] pad["padding_back"] = op.pad.val[1] pad["padding_top"] = op.pad.val[2] pad["padding_bottom"] = op.pad.val[3] pad["padding_left"] = op.pad.val[4] pad["padding_right"] = op.pad.val[5] same_padding_asymmetry_mode = "BOTTOM_RIGHT_HEAVY" if padding_mode == "same_lower": if is_conv3d: msg = "For the neuralnetwork backend, padding_mode ``same_lower`` is not supported for conv 3d." raise ValueError(msg) padding_mode = "same" same_padding_asymmetry_mode = "TOP_LEFT_HEAVY" has_bias = op.bias is not None groups = op.groups.val strides = op.strides.val.tolist() dilations = op.dilations.val.tolist() if is_conv1d: dilations = dilations[:-1] + [1] + dilations[-1:] strides = strides[:-1] + [1] + strides[-1:] if weights is not None and op.op_type == "conv_quantized": nbits = op.nbits.val weights = _convert_array_to_nbit_quantized_bytes(weights.flatten(), nbits).tobytes() quantization_type = op.quantization_type.val quant_bias = op.quant_bias.val quant_scale = op.quant_scale.val else: quantization_type = None nbits = None quant_bias = None quant_scale = None if is_conv1d or is_conv2d: if weights is None and has_bias: # weights are dynamic. # In this case, bias, if present, cannot be part of the conv op # it needs to be added separately via an add op out_name += "_without_bias" if weights is None and groups > 1: raise NotImplementedError("Convolution with dynamic weights and groups > 1 is not supported on the " "neuralnetwork backend. 
Please use the mlprogram backend " "(convert_to=\"mlprogram\")") builder.add_convolution( name=out_name, kernel_channels=op.weight.shape[1], output_channels=op.weight.shape[0], height= 1 if is_conv1d else op.weight.shape[2], width= op.weight.shape[2] if is_conv1d else op.weight.shape[3], stride_height=strides[0], stride_width=strides[1], border_mode=padding_mode, same_padding_asymmetry_mode=same_padding_asymmetry_mode, groups=groups, W=weights, b=op.bias.val if has_bias and weights is not None else None, has_bias=has_bias if weights is not None else False, is_deconv=False, input_name=input_names, output_name=out_name, dilation_factors=dilations, quantization_type=quantization_type, nbits=nbits, quant_bias=quant_bias, quant_scale=quant_scale, **pad # Python 2.7.16 will fail with a syntax error if a comma is included after `**pad` ) # add bias if weights are dynamic if weights is None and has_bias: Cout = op.weight.shape[0] assert op.bias.val.size == Cout, \ "size of bias for convolution must be same as the number of output channels" builder.add_load_constant_nd( name=op.name + '_constant_bias', output_name=op.name + "_constant_bias", constant_value=op.bias.val.reshape((Cout, 1, 1)), shape=(Cout, 1, 1) ) add_op_output_name = op.name + "_with_bias" if is_conv1d else op.outputs[0].name builder.add_add_broadcastable( name=add_op_output_name, input_names=[out_name, op.name + "_constant_bias"], output_name=add_op_output_name, ) if is_conv1d: out_name = add_op_output_name # Squeeze added `Width` dimension for 1d case if is_conv1d: x_name = op.name + "expand_dim" builder.add_squeeze( name=op.name, input_name=out_name, output_name=op.outputs[0].name, axes=[-2], ) if is_conv3d: builder.add_convolution3d( name=op.name, input_channels=op.weight.shape[1] * groups, output_channels=op.weight.shape[0], depth=op.weight.shape[2], height=op.weight.shape[3], width=op.weight.shape[4], W=op.weight.val, b=op.bias.val if has_bias else None, has_bias=has_bias, groups=groups, stride_depth=strides[0], stride_height=strides[1], stride_width=strides[2], dilation_depth=dilations[0], dilation_height=dilations[1], dilation_width=dilations[2], padding_mode=padding_mode, is_deconv=False, output_shape=None, input_name=input_names, output_name=out_name, **pad # Python 2.7.16 will fail with a syntax error if a comma is included after `**pad` ) @register_mil_to_nn_mapping def conv(const_context, builder, op): conv_helper(const_context, builder, op) @register_mil_to_nn_mapping() def conv_quantized(const_context, builder, op): conv_helper(const_context, builder, op) @register_mil_to_nn_mapping def cumsum(const_context, builder, op): input_names = make_input(const_context, builder, [op.x]) builder.add_cumsum( name=op.name, input_names=input_names, output_name=op.outputs[0].name, axis=op.axis.val, reverse=op.reverse.val, exclusive=op.exclusive.val, ) def _add_elementwise_unary( const_context, builder, op, mode, output_name=None, **kwargs ): output_name = output_name if output_name else op.outputs[0].name name = output_name if output_name else op.name if mode in ["sqrt", "rsqrt", "inverse", "power", "exp", "log", "abs", "threshold"]: builder.add_unary( name=name, input_name=make_input(const_context, builder, op.x), output_name=output_name, mode=mode, **kwargs ) else: add_func = getattr(builder, "add_" + mode, None) if add_func is None: logger.error( "Elementwise unary method {} not found in builder.".format(mode) ) add_func( name=name, input_name=make_input(const_context, builder, op.x), output_name=output_name, **kwargs ) def 
_add_elementwise_binary( const_context, builder, op, mode, output_name=None, **kwargs ): output_name = output_name if output_name else op.outputs[0].name name = output_name if output_name else op.name if mode in ["add", "multiply"]: params = {"name": name, "output_name": output_name, "mode": mode.upper()} if op.x.val is not None and op.x.rank == 0 and _np.isfinite(op.x.val): params["input_names"] = make_input(const_context, builder, [op.y]) val = op.x.val if not isinstance(op.x.val, _np.float16) else op.x.val.astype(_np.float32) params["alpha"] = np_val_to_py_type(val) builder.add_elementwise(**params) return elif op.y.val is not None and op.y.rank == 0 and _np.isfinite(op.y.val): params["input_names"] = make_input(const_context, builder, [op.x]) val = op.y.val if not isinstance(op.y.val, _np.float16) else op.y.val.astype(_np.float32) params["alpha"] = np_val_to_py_type(val) builder.add_elementwise(**params) return elif mode in ["equal", "not_equal"]: add_func = getattr(builder, "add_" + mode, None) params = {"name": name, "output_name": output_name} if op.x.val is not None and op.x.rank == 0 and _np.isfinite(op.x.val): params["input_names"] = make_input(const_context, builder, [op.y]) val = op.x.val if not isinstance(op.x.val, _np.float16) else op.x.val.astype(_np.float32) params["alpha"] = np_val_to_py_type(val) add_func(**params) return elif op.y.val is not None and op.y.rank == 0 and _np.isfinite(op.y.val): params["input_names"] = make_input(const_context, builder, [op.x]) val = op.y.val if not isinstance(op.y.val, _np.float16) else op.y.val.astype(_np.float32) params["alpha"] = np_val_to_py_type(val) add_func(**params) return elif mode in ["greater_than", "greater_equal", "less_than", "less_equal"]: params = {"name": name, "output_name": output_name} if op.x.val is not None and op.x.rank == 0 and _np.isfinite(op.x.val): params["input_names"] = make_input(const_context, builder, [op.y]) val = op.x.val if not isinstance(op.x.val, _np.float16) else op.x.val.astype(_np.float32) params["alpha"] = np_val_to_py_type(val) if "less" in mode: params["use_greater_than_equal"] = mode.endswith("_equal") builder.add_greater_than(**params) elif "greater" in mode: params["use_less_than_equal"] = mode.endswith("_equal") builder.add_less_than(**params) return elif op.y.val is not None and op.y.rank == 0 and _np.isfinite(op.y.val): params["input_names"] = make_input(const_context, builder, [op.x]) val = op.y.val if not isinstance(op.y.val, _np.float16) else op.y.val.astype(_np.float32) params["alpha"] = np_val_to_py_type(val) if "greater" in mode: params["use_greater_than_equal"] = mode.endswith("_equal") builder.add_greater_than(**params) elif "less" in mode: params["use_less_than_equal"] = mode.endswith("_equal") builder.add_less_than(**params) return if op.x.can_be_folded_to_const(): add_const(const_context, builder, op.x.name, op.x.val) if op.y.can_be_folded_to_const(): if mode == "pow": _add_elementwise_unary( const_context, builder, op, "power", output_name=output_name, alpha=op.y.val, ) return add_const(const_context, builder, op.y.name, op.y.val) if mode in {"add", "multiply", "max", "min"} and op.x.shape == op.y.shape: builder.add_elementwise( name=name, input_names=make_input(const_context, builder, [op.x, op.y]), output_name=output_name, mode=mode.upper(), ) return # the broadcast feature in the elementwise layer is hardcoded to 4D or less # for the 5d tensor, we need to use broadcasable layers instead. 
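# Illustrative sketch (not part of the converter): when a rank is 5 or the
# shapes do not fit one of the legacy broadcast patterns checked below, the
# conversion falls back to the generic *_broadcastable layers, e.g.
#
#     builder.add_add_broadcastable(
#         name="sum", input_names=["a", "b"], output_name="sum")
#
# where "sum", "a", and "b" are hypothetical blob names of any
# broadcast-compatible shapes.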
if mode in {"add", "multiply", "subtract"} and op.x.rank < 5 and op.y.rank < 5: shape_x = _np.array([1] * (5 - op.x.rank) + list(op.x.shape)) shape_y = _np.array([1] * (5 - op.y.rank) + list(op.y.shape)) internal_x = internal_y = None if all(shape_x == 1): internal_y = op.x internal_x = op.y elif all(shape_y == 1): internal_x = op.x internal_y = op.y for indices in ([1], [2], [3, 4], [2, 3, 4], [1, 2, 3, 4]): if indices == [1, 2, 3, 4] and mode == "multiply": # INTERNAL_MUL_XYKN not implemented continue if all(shape_x[indices] == shape_y[indices]): if all([True if i in indices else s == 1 for i, s in enumerate(shape_x)]): internal_y = op.x internal_x = op.y break if all([True if i in indices else s == 1 for i, s in enumerate(shape_y)]): internal_x = op.x internal_y = op.y break if internal_x is not None: if mode in {"add", "multiply"}: builder.add_elementwise( name=name, input_names=make_input(const_context, builder, [internal_x, internal_y]), output_name=output_name, mode=mode.upper(), ) elif mode == "subtract": builder.add_activation( name="_neg_y_" + name, input_name=make_input(const_context, builder, op.y), output_name="_neg_y_" + output_name, non_linearity="LINEAR", params=[-1, 0]) if op.x == internal_y: internal_x = "_neg_y_" + output_name else: internal_y = "_neg_y_" + output_name builder.add_elementwise( name=name, input_names=make_input(const_context, builder, [internal_x, internal_y]), output_name=output_name, mode="ADD", ) return if mode in {"add", "multiply", "max", "min"}: add_func = getattr(builder, "add_" + mode + "_broadcastable", None) if add_func is None: msg = "Element-wise binary method {} not found in builder." raise ValueError(msg.format(mode)) add_func( name=name, input_names=make_input(const_context, builder, [op.x, op.y]), output_name=output_name, **kwargs ) else: if mode in ["divide", "floor_div", "mod", "pow", "subtract"]: add_func = getattr(builder, "add_" + mode + "_broadcastable", None) elif mode == "less_equal": add_func = builder.add_less_than kwargs["use_less_than_equal"] = True elif mode == "greater_equal": add_func = builder.add_greater_than kwargs["use_greater_than_equal"] = True else: add_func = getattr(builder, "add_" + mode, None) if add_func is None: msg = "Element-wise binary method {} not found in builder." 
raise ValueError(msg.format(mode)) add_func( name=name, input_names=make_input(const_context, builder, [op.x, op.y]), output_name=output_name, **kwargs ) def _add_logical(const_context, builder, op, mode): input_names = [] input_names.append(make_input(const_context, builder, op.x)) if mode != "NOT": input_names.append(make_input(const_context, builder, op.y)) builder.add_logical( name=op.name, input_names=input_names, output_name=op.outputs[0].name, mode=mode ) @register_mil_to_nn_mapping def abs(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "abs") @register_mil_to_nn_mapping def acos(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "acos") @register_mil_to_nn_mapping def add(const_context, builder, op): _add_elementwise_binary(const_context, builder, op, "add") @register_mil_to_nn_mapping def asin(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "asin") @register_mil_to_nn_mapping def atan(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "atan") @register_mil_to_nn_mapping def atanh(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "atanh") @register_mil_to_nn_mapping def cast(const_context, builder, op): if op.dtype.val in ["int32", "int64"]: _add_elementwise_unary( const_context, builder, op, "floor", output_name=op.name + "_floor" ) _add_elementwise_unary( const_context, builder, op, "ceil", output_name=op.name + "_ceil" ) builder.add_greater_than( name=op.name + "_cond", input_names=[make_input(const_context, builder, op.x)], output_name=op.name + "_cond", alpha=0.0, ) builder.add_where_broadcastable( name=op.name, input_names=[op.name + i for i in ["_cond", "_floor", "_ceil"]], output_name=op.outputs[0].name, ) elif op.dtype.val in ["fp16", "fp32", "fp64"]: builder.add_activation( name=op.name, non_linearity="LINEAR", input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, params=[1.0, 0.0], ) elif op.dtype.val == "bool": builder.add_not_equal( name=op.name, input_names=op.x.name, output_name=op.outputs[0].name, alpha=0.0, ) else: raise NotImplementedError( "Parameter dtype of the cast operation can be one of the {}. 
" "Provided {}".format(["int32", "int64", "fp16", "fp32", "fp64"], op.dtype.val) ) @register_mil_to_nn_mapping def ceil(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "ceil") @register_mil_to_nn_mapping def clip(const_context, builder, op): _add_elementwise_unary( const_context, builder, op, "clip", min_value=op.alpha.val, max_value=op.beta.val, ) @register_mil_to_nn_mapping def cos(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "cos") @register_mil_to_nn_mapping def cosh(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "cosh") @register_mil_to_nn_mapping def einsum(const_context, builder, op): ''' MIL einsum is either - (B,C,H,W1) * (B,W1,H,W2) = (B,C,H,W2) or - (C,H,W1) * (W1,H,W2) = (C,H,W2) Hence to support it, first transpose the 2 inputs, so that the matrices to be multiplied are on the last 2 axes, then call bmm, and finally transpose the result again ''' rank = op.values[0].rank perm = [0, 2, 1, 3] if rank == 4 else [1, 0, 2] input_names = make_input(const_context, builder, op.values) output_name_1 = op.name + "_transpose_1" output_name_2 = op.name + "_transpose_2" builder.add_transpose(name=op.name + "_transpose_x", axes=perm, input_name=input_names[0], output_name=output_name_1 ) builder.add_transpose(name=op.name + "_transpose_y", axes=perm, input_name=input_names[1], output_name=output_name_2 ) builder.add_batched_mat_mul( name=op.name + "_batch_matmul", input_names=[output_name_1, output_name_2], output_name=op.outputs[0].name + "_pre_transpose" ) builder.add_transpose(name=op.name, axes=perm, input_name=op.outputs[0].name + "_pre_transpose", output_name=op.outputs[0].name ) @register_mil_to_nn_mapping def equal(const_context, builder, op): _add_elementwise_binary(const_context, builder, op, "equal") @register_mil_to_nn_mapping def exp(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "exp") @register_mil_to_nn_mapping def exp2(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "exp2") @register_mil_to_nn_mapping def floor(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "floor") @register_mil_to_nn_mapping def floor_div(const_context, builder, op): _add_elementwise_binary(const_context, builder, op, "floor_div") @register_mil_to_nn_mapping def greater(const_context, builder, op): _add_elementwise_binary(const_context, builder, op, "greater_than") @register_mil_to_nn_mapping def greater_equal(const_context, builder, op): _add_elementwise_binary(const_context, builder, op, "greater_equal") @register_mil_to_nn_mapping def inverse(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "inverse", epsilon=op.epsilon.val) @register_mil_to_nn_mapping def less(const_context, builder, op): _add_elementwise_binary(const_context, builder, op, "less_than") @register_mil_to_nn_mapping def less_equal(const_context, builder, op): _add_elementwise_binary(const_context, builder, op, "less_equal") @register_mil_to_nn_mapping def log(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "log", epsilon=op.epsilon.val) @register_mil_to_nn_mapping def logical_and(const_context, builder, op): _add_logical(const_context, builder, op, "AND") @register_mil_to_nn_mapping def logical_not(const_context, builder, op): _add_logical(const_context, builder, op, "NOT") @register_mil_to_nn_mapping def logical_or(const_context, builder, op): _add_logical(const_context, builder, 
op, "OR") @register_mil_to_nn_mapping def logical_xor(const_context, builder, op): _add_logical(const_context, builder, op, "XOR") @register_mil_to_nn_mapping def maximum(const_context, builder, op): _add_elementwise_binary(const_context, builder, op, "max") @register_mil_to_nn_mapping def minimum(const_context, builder, op): _add_elementwise_binary(const_context, builder, op, "min") @register_mil_to_nn_mapping def mod(const_context, builder, op): _add_elementwise_binary(const_context, builder, op, "mod") @register_mil_to_nn_mapping def mul(const_context, builder, op): _add_elementwise_binary(const_context, builder, op, "multiply") @register_mil_to_nn_mapping def not_equal(const_context, builder, op): _add_elementwise_binary(const_context, builder, op, "not_equal") @register_mil_to_nn_mapping def pow(const_context, builder, op): _add_elementwise_binary(const_context, builder, op, "pow") @register_mil_to_nn_mapping def real_div(const_context, builder, op): _add_elementwise_binary(const_context, builder, op, "divide") @register_mil_to_nn_mapping def round(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "round") @register_mil_to_nn_mapping def rsqrt(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "rsqrt", epsilon=op.epsilon.val) @register_mil_to_nn_mapping def sign(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "sign") @register_mil_to_nn_mapping def sin(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "sin") @register_mil_to_nn_mapping def sinh(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "sinh") @register_mil_to_nn_mapping def slice_by_index(const_context, builder, op): rank = op.x.rank stride = [1] * rank if op.stride is None else op.stride.val begin_mask = [False] * rank if op.begin_mask is None else op.begin_mask.val end_mask = [False] * rank if op.end_mask is None else op.end_mask.val squeeze_mask = [False] * rank if op.squeeze_mask is None else op.squeeze_mask.val if op.begin.val is not None and op.end.val is not None: # If only one dimension is sliced, we should use the slice layer instead of static_slice or dynamic_slice # In general, slice has a better performance. 
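# Illustrative sketch (not part of the converter): when exactly one of the
# last three axes is sliced non-trivially and nothing is squeezed, the cheaper
# `slice` layer is emitted instead of slice_static / slice_dynamic. For
# example, slicing a (C, H, W) blob as x[:, :, 2:5] maps to
#
#     builder.add_slice(name="crop_w", input_name="x", output_name="y",
#                       axis="width", start_index=2, end_index=5, stride=1)
#
# ("crop_w", "x", and "y" are hypothetical names.)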
begin = op.begin.val end = op.end.val slice_dim = [] for i in range(rank): if (not begin_mask[i] and begin[i] != 0) or \ (not end_mask[i] and end[i] != op.x.shape[i]) or \ stride[i] != 1: slice_dim.append(i) if len(slice_dim) == 1 and not any(squeeze_mask): dim = slice_dim[0] - rank if dim in [-3, -2, -1]: # get the axis, only channel, width, and depth dimension are supported axis = None if dim == -1: axis = "width" elif dim == -2: axis = "height" elif dim == -3: axis = "channel" start_index = 0 if begin_mask[dim] else begin[dim] end_index = op.x.shape[dim] if end_mask[dim] else end[dim] shape = op.x.shape if not is_symbolic(shape[dim]): if start_index < 0: start_index += shape[dim] if not is_symbolic(end_index) and start_index >= 0 and stride[dim] >= 1: builder.add_slice( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, axis=axis, start_index=start_index, end_index=end_index, stride=stride[dim], ) return # use add_slice_static builder.add_slice_static( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, begin_ids=op.begin.val, end_ids=op.end.val, strides=np_val_to_py_type(stride), begin_masks=np_val_to_py_type(begin_mask), end_masks=np_val_to_py_type(end_mask), squeeze_masks=np_val_to_py_type(squeeze_mask), ) else: builder.add_slice_dynamic( name=op.name, input_names=make_input(const_context, builder, [op.x, op.begin, op.end]), output_name=op.outputs[0].name, strides=np_val_to_py_type(stride), begin_masks=np_val_to_py_type(begin_mask), end_masks=np_val_to_py_type(end_mask), squeeze_masks=np_val_to_py_type(squeeze_mask), ) @register_mil_to_nn_mapping def slice_by_size(const_context, builder, op): """ If the inputs satisfy 1. op.x has static input shape for those dimension whose size is not -1 2. op.begin and op.size are both known during compile time we use add_slice_static directly Otherwise, build a block of ops achieving slice_by_size with dynamic input x and size. 
""" # The static case if op.begin.val is not None and op.size.val is not None: begin = op.begin.val size = op.size.val rank = op.x.rank end = [] for i in range(rank): if size[i] == -1: end.append(op.x.shape[i]) else: end.append(begin[i] + size[i]) if not any_symbolic(end): builder.add_slice_static( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, begin_ids=begin, end_ids=end, strides=[1] * rank, begin_masks=[False] * rank, end_masks=[False] * rank, squeeze_masks=[False] * rank, ) return # The dynamic case # get the end_index of input x # for instance, x with shape [2,3,4] results in [2,3,4] end_index_name = op.name + "_end_index" builder.add_get_shape( name=end_index_name, input_name=make_input(const_context, builder, op.x), output_name=end_index_name, ) # get the mask where size = -1 # for instance, size = [-1,1,2] results in [1,0,0] const_name = op.name + "_const_name" add_const(const_context, builder, const_name, _np.array([-1] * op.x.rank)) is_end_mask_name = op.name + "_is_end_mask" builder.add_equal( name=is_end_mask_name, input_names=make_input(const_context, builder, [const_name, op.size]), output_name=is_end_mask_name, ) # get the mask where size != -1 # for instance, size = [-1,1,2] results in [0,1,1] is_not_end_mask_name = op.name + "_is_not_end_mask" builder.add_not_equal( name=is_not_end_mask_name, input_names=make_input(const_context, builder, [const_name, op.size]), output_name=is_not_end_mask_name, ) # get the end index for dimensions i where size[i] = -1 # for size[i] != -1, just make it 0 # for instance, x with shape [2,3,4] and size = [-1,1,2] # results in [2,0,0] end_index_with_mask_name = op.name + "_end_index_with_mask" builder.add_elementwise( name=end_index_with_mask_name, input_names=[end_index_name, is_end_mask_name], output_name=end_index_with_mask_name, mode="MULTIPLY", ) # get the end index for dimension i where size[i] != -1 # for size[i] = 1, just make it 0 # for instance, x with shape [2,3,4], size = [-1,1,2], # begin = [0,1,1] results in [0,2,3] end_ids = op.name + "_end_ids" builder.add_elementwise( name=end_ids, input_names=make_input(const_context, builder, [op.begin, op.size]), output_name=end_ids, mode="ADD", ) end_index_without_mask_name = op.name + "_end_index_without_mask" builder.add_elementwise( name=end_index_without_mask_name, input_names=make_input(const_context, builder, [is_not_end_mask_name, end_ids]), output_name=end_index_without_mask_name, mode="MULTIPLY", ) # add two end index array together to get the final index final_end_index_name = op.name + "_final_index" builder.add_elementwise( name=final_end_index_name, input_names=make_input( const_context, builder, [end_index_with_mask_name, end_index_without_mask_name], ), output_name=final_end_index_name, mode="ADD", ) input_names = make_input( const_context, builder, [op.x, op.begin, final_end_index_name] ) builder.add_slice_dynamic( name=op.name, input_names=input_names, output_name=op.outputs[0].name ) @register_mil_to_nn_mapping def sqrt(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "sqrt") @register_mil_to_nn_mapping def square(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "power", alpha=2.0) @register_mil_to_nn_mapping def sub(const_context, builder, op): _add_elementwise_binary(const_context, builder, op, "subtract") @register_mil_to_nn_mapping def tan(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "tan") @register_mil_to_nn_mapping def 
threshold(const_context, builder, op): _add_elementwise_unary(const_context, builder, op, "threshold", alpha=op.alpha.val) @register_mil_to_nn_mapping def depth_to_space(const_context, builder, op): builder.add_reorganize_data( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, mode="DEPTH_TO_SPACE", block_size=op.block_size.val, ) @register_mil_to_nn_mapping def expand_dims(const_context, builder, op): builder.add_expand_dims( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, axes=op.axes.val, ) @register_mil_to_nn_mapping def fill(const_context, builder, op): if op.shape.val is None: builder.add_fill_dynamic( name=op.name, input_name=make_input(const_context, builder, op.shape), output_name=op.outputs[0].name, value=op.value.val, ) else: builder.add_fill_static( name=op.name, output_name=op.outputs[0].name, output_shape=op.shape.val, value=op.value.val, ) @register_mil_to_nn_mapping def random_bernoulli(const_context, builder, op): if op.shape.val is None: builder.add_random_bernoulli_dynamic( name=op.name, input_names=make_input(const_context, builder, [op.shape]), output_name=op.outputs[0].name, prob=op.prob.val, seed=op.seed.val, ) else: builder.add_random_bernoulli_static( name=op.name, output_name=op.outputs[0].name, output_shape=op.shape.val, prob=op.prob.val, seed=op.seed.val, ) @register_mil_to_nn_mapping def random_categorical(const_context, builder, op): builder.add_categorical_distribution( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, num_samples=op.size.val, is_logits=(op.mode.val == "logits"), seed=op.seed.val, ) @register_mil_to_nn_mapping def random_normal(const_context, builder, op): if op.shape.val is None: builder.add_random_normal_dynamic( name=op.name, input_names=make_input(const_context, builder, [op.shape]), output_name=op.outputs[0].name, mean=op.mean.val, stddev=op.stddev.val, seed=op.seed.val, ) else: builder.add_random_normal_static( name=op.name, output_name=op.outputs[0].name, output_shape=op.shape.val, mean=op.mean.val, stddev=op.stddev.val, seed=op.seed.val, ) @register_mil_to_nn_mapping def random_uniform(const_context, builder, op): if op.shape.val is None: builder.add_random_uniform_dynamic( name=op.name, input_names=make_input(const_context, builder, [op.shape]), output_name=op.outputs[0].name, minval=op.low.val, maxval=op.high.val, seed=op.seed.val, ) else: builder.add_random_uniform_static( name=op.name, output_name=op.outputs[0].name, output_shape=op.shape.val, minval=op.low.val, maxval=op.high.val, seed=op.seed.val, ) @register_mil_to_nn_mapping def gru(const_context, builder, op): make_input(const_context, builder, [op.x, op.initial_h]) # Input shape: [b, s, I] input_name = op.x.name # Shape: [b, H] initial_h = op.initial_h.name weight_ih = op.weight_ih.val weight_hh = op.weight_hh.val b = op.bias.val if op.bias is not None else None direction = op.direction.val output_sequence = op.output_sequence.val # Add expand dims for input, in _expand_dim(builder, input_name + "_expanded", input_name, [3, 4]) input_name += "_expanded" if direction not in {"forward", "reverse"}: raise ValueError( "Unknown direction {} for GRU layer. 
Supported are forward, reverse".format( direction ) ) # Expand initial_h _expand_dim(builder, initial_h + "_expanded", initial_h, [0, 3, 4]) initial_h += "_expanded" def roz_to_zro(x): if x is None: return None r, o, z = _split(x, sections=3, axis=0) return [z, r, o] # w_x: [H*I, H*I, H*I] # w_h: [H*H, H*H, H*H] # where, format is [Z, R, O] # Z: Update gate, R: Reset gate, O: Output gate w_x = roz_to_zro(weight_ih) w_h = roz_to_zro(weight_hh) # bias format: [3*H] b = roz_to_zro(b) input_size = w_x[0].shape[1] hidden_size = w_x[0].shape[0] # 2 outputs # Y : [s/1, b, h, 1, 1] # Y_h: [ 1, b, h, 1, 1] output_names = [_output.name + "_5d" for _output in op.outputs] builder.add_gru( name=op.name, W_h=w_h, W_x=w_x, b=b, hidden_size=hidden_size, input_size=input_size, input_names=[input_name, initial_h], output_names=output_names, inner_activation=op.recurrent_activation.val, activation=op.activation.val, output_all=output_sequence, reverse_input=(direction == "reverse"), ) # Squeeze Output # to output shape of [Seq Len or 1, Batch Size, Hidden Size] _squeeze(builder, op.outputs[0].name, output_names[0], axes=[3, 4]) # Squeeze Output H and Output C # to output shape of [Batch Size, Hidden Size] _squeeze(builder, op.outputs[1].name, output_names[1], axes=[0, 3, 4]) @register_mil_to_nn_mapping def squeeze(const_context, builder, op): axes = op.axes.val if op.axes is not None else None builder.add_squeeze( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, axes=axes, squeeze_all=axes is None, ) @register_mil_to_nn_mapping def topk(const_context, builder, op): builder.add_topk( name=op.name, input_names=make_input(const_context, builder, [op.x]), output_names=[output.name for output in op.outputs], k=op.k.val, axis=op.axis.val, use_bottom_k=op.ascending.val, ) @register_mil_to_nn_mapping def l2_pool(const_context, builder, op): _convert_pool(const_context=const_context, builder=builder, op=op, mode="l2") @register_mil_to_nn_mapping def linear(const_context, builder, op): out_channels, in_channels = op.weight.shape if op.x.rank and op.x.rank <= 3 and op.x.rank > 0: has_bias = op.bias is not None and op.bias.val is not None builder.add_inner_product( name=op.name, W=op.weight.val, b=op.bias.val if has_bias else None, input_channels=in_channels, output_channels=out_channels, has_bias=has_bias, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, ) else: builder.add_batched_mat_mul( name=op.name, input_names=make_input(const_context, builder, [op.x]), output_name=op.outputs[0].name, W=op.weight.val.T, bias=op.bias.val, weight_matrix_rows=in_channels, weight_matrix_columns=out_channels, ) @register_mil_to_nn_mapping def matmul(const_context, builder, op): weight = None rows, columns = 0, 0 if ( op.y.val is not None and op.y.rank == 2 and len(op.y.child_ops) == 1 and len(op.y.consuming_blocks) == 0 ): weight = op.y.val if op.transpose_y.val: weight = weight.transpose((1, 0)) rows, columns = weight.shape input_names = make_input(const_context, builder, [op.x]) if op.transpose_x.val: perm = [i for i in range(op.x.rank)] perm[-1], perm[-2] = perm[-2], perm[-1] name = op.name + "_x_transpose" builder.add_transpose( name=name, axes=perm, input_name=input_names[0], output_name=name ) input_names = [name] else: input_names = make_input(const_context, builder, [op.x, op.y]) builder.add_batched_mat_mul( name=op.name, input_names=input_names, output_name=op.outputs[0].name, transpose_a=op.transpose_x.val, transpose_b=op.transpose_y.val, 
W=weight, weight_matrix_rows=rows, weight_matrix_columns=columns, ) @register_mil_to_nn_mapping def max_pool(const_context, builder, op): _convert_pool(const_context=const_context, builder=builder, op=op, mode="max") @register_mil_to_nn_mapping def non_zero(const_context, builder, op): builder.add_where_nonzero( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, ) @register_mil_to_nn_mapping def lstm(const_context, builder, op): make_input(const_context, builder, [op.x, op.initial_h, op.initial_c]) # Input shape [b, s, I] input_name = op.x.name # Shape: [b, DIRECTION*H] initial_h = op.initial_h.name initial_c = op.initial_c.name wt_ih = op.weight_ih.val wt_hh = op.weight_hh.val b = op.bias.val if op.bias is not None else None direction = op.direction.val output_sequence = op.output_sequence.val peephole = op.peephole.val if op.peephole is not None else None # High enough clip value to be ineffective! clip = 500.0 if op.clip is None else op.clip.val # Add expand dims for input, in _expand_dim(builder, input_name + "_expanded", input_name, [3, 4]) input_name += "_expanded" if direction in {"forward", "reverse"}: # Expand initial_h and initial_c, # from shape (B, H) to shape (1, Batch, H, 1, 1). # Since initial_h and initial_c may get used in multiple places, # prepend input_name to avoid conflict _expand_dim(builder, input_name + initial_h + "_expanded", initial_h, [0, 3, 4]) initial_h = input_name + initial_h + "_expanded" # initial_c may have the same name as initial_h (e.g., same Var). # Append a different string to initial_c to avoid conflict _expand_dim(builder, input_name + initial_c + "_expanded2", initial_c, [0, 3, 4]) initial_c = input_name + initial_c + "_expanded2" # w_x: [H*I, H*I, H*I, H*I] # w_h: [H*H, H*H, H*H, H*H] # where format is, [input gate, forget gate, output gate, cell gate] w_x = _split(wt_ih, sections=4) w_h = _split(wt_hh, sections=4) # bias format: [4*H] b = _split(b, sections=4) # ifoz layout # peephole format: [3*H] # where format is, [input gate, forget gate, output gate] peephole = _split(peephole, sections=3) input_size = w_x[0].shape[1] hidden_size = w_h[0].shape[1] # 3 outputs # Y : [s/1, b, h, 1, 1] # Y_h: [ 1, b, h, 1, 1] # Y_c: [ 1, b, h, 1, 1] output_names = [_output.name + "_5d" for _output in op.outputs] builder.add_unilstm( name=op.name, W_h=w_h, W_x=w_x, b=b, hidden_size=hidden_size, input_size=input_size, input_names=[input_name, initial_h, initial_c], output_names=output_names, inner_activation=op.recurrent_activation.val, cell_state_update_activation=op.cell_activation.val, output_activation=op.activation.val, peep=peephole, output_all=output_sequence, cell_clip_threshold=clip, reverse_input=(direction == "reverse"), ) # Squeeze Output # to output shape of [Seq Len or 1, Batch Size, Hidden Size] _squeeze(builder, op.outputs[0].name, output_names[0], axes=[3, 4]) # Squeeze Output H and Output C # to output shape of [Batch Size, Hidden Size] _squeeze(builder, op.outputs[1].name, output_names[1], axes=[0, 3, 4]) _squeeze(builder, op.outputs[2].name, output_names[2], axes=[0, 3, 4]) elif direction == "bidirectional": # Expand initial_h and initial_c # Issue #810 num_layer = len(builder.layers) initial_h_expand = initial_h + "_expanded" + "_" + str(num_layer) # from shape (B, 2*H) to shape (1, Batch, 2*H, 1, 1) if initial_h_expand not in set(builder.layers): _expand_dim(builder, initial_h_expand, initial_h, [0, 3, 4]) initial_h = initial_h_expand # initial_h may have the same name as initial_c (e.g., 
same Var) initial_c_expand = initial_c + "_expanded2" + "_" + str(num_layer) if initial_c_expand not in set(builder.layers): _expand_dim(builder, initial_c_expand, initial_c, [0, 3, 4]) initial_c = initial_c_expand initial_h_f = initial_h + "_forward" initial_h_r = initial_h + "_reverse" initial_c_f = initial_c + "_forward" initial_c_r = initial_c + "_reverse" # split input_h and input_c into two parts builder.add_split_nd( name=op.name + "_split_h", input_name=initial_h, output_names=[initial_h_f, initial_h_r], axis=2, ) builder.add_split_nd( name=op.name + "_split_c", input_name=initial_c, output_names=[initial_c_f, initial_c_r], axis=2, ) wt_ih_back = op.weight_ih_back.val wt_hh_back = op.weight_hh_back.val # Get weights here # weight format: [I+H, 2*4*H] -> [I+H, 4*H (forward):4*H (backward)] hidden_size = wt_hh.shape[1] input_size = wt_ih.shape[1] # f_w_x and r_w_x: [H*I, H*I, H*I, H*I] # f_w_h and r_w_h: [H*H, H*H, H*H, H*H] # where format is, [input gate, forget gate, output gate, cell gate] w_x = _split(wt_ih, sections=4) w_h = _split(wt_hh, sections=4) r_w_x = _split(wt_ih_back, sections=4) r_w_h = _split(wt_hh_back, sections=4) # f_b and r_b format: [4*H] b_back = op.bias_back.val if op.bias_back is not None else None f_b, r_b = None, None if b is not None: f_b = _split(b, sections=4) if b_back is not None: r_b = _split(b_back, sections=4) # peephole format: [2*3*H] -> [3*H (forward) : 3*H (backward)] peephole_back = op.peephole_back.val if op.peephole_back is not None else None f_peephole, r_peephole = None, None if peephole is not None: f_peephole = _split(peephole, sections=3) if peephole_back is not None: r_peephole = _split(peephole_back, sections=3) output_names = [ op.outputs[0].name + "_5d", # Output Y [s/1, b, 2*h, 1, 1] op.outputs[1].name + "_5d_foward", # Output Y_h [ 1, b, h, 1, 1] op.outputs[2].name + "_5d_forward", # Output Y_c [ 1, b, h, 1, 1] op.outputs[1].name + "_5d_reverse", # Output Y_h_reverse [ 1, b, h, 1, 1] op.outputs[2].name + "_5d_reverse", ] # Output Y_c_reverse [ 1, b, h, 1, 1] builder.add_bidirlstm( name=op.name, W_h=w_h, W_x=w_x, b=f_b, W_h_back=r_w_h, W_x_back=r_w_x, b_back=r_b, hidden_size=hidden_size, input_size=input_size, input_names=[ input_name, initial_h_f, initial_c_f, initial_h_r, initial_c_r, ], output_names=output_names, inner_activation=op.recurrent_activation.val, cell_state_update_activation=op.cell_activation.val, output_activation=op.activation.val, peep=f_peephole, peep_back=r_peephole, output_all=output_sequence, cell_clip_threshold=clip, ) # Squeeze Output # to output shape of [Seq Len or 1, Batch Size, 2*Hidden Size] _squeeze(builder, op.outputs[0].name, output_names[0], axes=[3, 4]) # Output H is of format # 1, Batch_Size, Hidden_Size, 1, 1 # Concat to make it # 1, Batch_Size, 2*Hidden_Size, 1, 1 builder.add_elementwise( name=op.outputs[1].name + "_5d", input_names=[output_names[1], output_names[3]], output_name=op.outputs[1].name + "_5d", mode="CONCAT", ) # Output C is of format # 1, Batch_Size, Hidden_Size, 1, 1 builder.add_elementwise( name=op.outputs[2].name + "_5d", input_names=[output_names[2], output_names[4]], output_name=op.outputs[2].name + "_5d", mode="CONCAT", ) # Squeeze Output H and Output C # to output shape of [Batch Size, 2*Hidden Size] _squeeze( builder, op.outputs[1].name, op.outputs[1].name + "_5d", axes=[0, 3, 4] ) _squeeze( builder, op.outputs[2].name, op.outputs[2].name + "_5d", axes=[0, 3, 4] ) else: raise ValueError( "Unknown direction {} for LSTM layer. 
Supported are forward, reverse or bidirectional".format( direction ) ) @register_mil_to_nn_mapping def reshape(const_context, builder, op): if op.shape.val is None: builder.add_reshape_dynamic( name=op.name, input_names=make_input(const_context, builder, [op.x, op.shape]), output_name=op.outputs[0].name, ) elif -1 in op.shape.val and len(op.shape.val) == op.x.rank: # Support 0 in shape. builder.add_rank_preserving_reshape( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, output_shape=op.shape.val, ) else: if 0 in op.shape.val: # Does not support 0 in shape msg = "Use 0 in shape only if len(shape) == x.rank. Report bug." raise ValueError(msg) output_shape = (1,) if len(op.shape.val) == 0 or 0 in op.shape.shape else op.shape.val builder.add_reshape_static( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, output_shape=output_shape, ) @register_mil_to_nn_mapping def reduce_argmax(const_context, builder, op): builder.add_argmax( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, axis=op.axis.val, keepdims=op.keep_dims.val, ) @register_mil_to_nn_mapping def reduce_argmin(const_context, builder, op): builder.add_argmin( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, axis=op.axis.val, keepdims=op.keep_dims.val, ) def _reduce_axes(const_context, builder, builder_op, op): axes = op.axes.val if op.axes is not None else op.axes builder_op( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, axes=axes, keepdims=op.keep_dims.val, reduce_all=axes is None, ) @register_mil_to_nn_mapping def reduce_l1_norm(const_context, builder, op): _reduce_axes(const_context, builder, builder.add_reduce_l1, op) @register_mil_to_nn_mapping def reduce_l2_norm(const_context, builder, op): _reduce_axes(const_context, builder, builder.add_reduce_l2, op) @register_mil_to_nn_mapping def reduce_log_sum(const_context, builder, op): _reduce_axes(const_context, builder, builder.add_reduce_logsum, op) @register_mil_to_nn_mapping def reduce_log_sum_exp(const_context, builder, op): _reduce_axes(const_context, builder, builder.add_reduce_logsumexp, op) @register_mil_to_nn_mapping def reduce_max(const_context, builder, op): if not _try_convert_global_pool(const_context, builder, op, mode="max"): _reduce_axes(const_context, builder, builder.add_reduce_max, op) @register_mil_to_nn_mapping def reduce_mean(const_context, builder, op): if not _try_convert_global_pool(const_context, builder, op, mode="average"): _reduce_axes(const_context, builder, builder.add_reduce_mean, op) @register_mil_to_nn_mapping def reduce_min(const_context, builder, op): _reduce_axes(const_context, builder, builder.add_reduce_min, op) @register_mil_to_nn_mapping def reduce_prod(const_context, builder, op): _reduce_axes(const_context, builder, builder.add_reduce_prod, op) @register_mil_to_nn_mapping def reduce_sum(const_context, builder, op): _reduce_axes(const_context, builder, builder.add_reduce_sum, op) @register_mil_to_nn_mapping def reduce_sum_square(const_context, builder, op): _reduce_axes(const_context, builder, builder.add_reduce_sumsquare, op) @register_mil_to_nn_mapping def reverse(const_context, builder, op): reverse_dim = [False] * op.x.rank if op.axes is None: reverse_dim = [True] * op.x.rank else: for axis in op.axes.val: reverse_dim[axis] = True builder.add_reverse( name=op.name, 
input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, reverse_dim=reverse_dim, ) @register_mil_to_nn_mapping def reverse_sequence(const_context, builder, op): builder.add_reverse_sequence( name=op.name, input_names=make_input(const_context, builder, [op.x, op.lengths]), output_name=op.outputs[0].name, batch_axis=op.batch_axis.val, seq_axis=op.seq_axis.val, ) @register_mil_to_nn_mapping def rnn(const_context, builder, op): input_name = make_input(const_context, builder, op.x) # [b, s, I] initial_h = make_input(const_context, builder, op.initial_h) # [b, H] w_ih = op.weight_ih.val w_hh = op.weight_hh.val b = op.bias.val if op.bias is not None else None direction = op.direction.val output_sequence = op.output_sequence.val activation = op.activation.val # Add expand dims for input, in _expand_dim(builder, input_name + "_expanded", input_name, [3, 4]) input_name += "_expanded" if direction not in {"forward", "reverse"}: raise ValueError( "Unknown direction {} for RNN layer. Supported are forward and reverse".format( direction ) ) # Expand initial_h and initial_c _expand_dim(builder, initial_h + "_expanded", initial_h, [2, 3, 4]) initial_h += "_expanded" # w_x: (H, I) # w_h: (H, H) hidden_size = w_hh.shape[0] input_size = w_ih.shape[-1] # 3 outputs # Y : [s/1, b, h, 1, 1] # Y_h: [ 1, b, h, 1, 1] output_names = [_output.name + "_5d" for _output in op.outputs] builder.add_simple_rnn( name=op.name, W_h=w_hh, W_x=w_ih, b=b, hidden_size=hidden_size, input_size=input_size, input_names=[input_name, initial_h], output_names=output_names, activation=activation, output_all=output_sequence, reverse_input=(direction == "reverse"), ) # Squeeze Output # to output shape of [Seq Len or 1, Batch Size, Hidden Size] _squeeze(builder, op.outputs[0].name, output_names[0], [3, 4]) # Squeeze Output H and Output C # to output shape of [Batch Size, Hidden Size] _squeeze(builder, op.outputs[1].name, output_names[1], [0, 3, 4]) @register_mil_to_nn_mapping def select(const_context, builder, op): builder.add_where_broadcastable( name=op.name, input_names=make_input(const_context, builder, [op.cond, op.a, op.b]), output_name=op.outputs[0].name, ) @register_mil_to_nn_mapping def space_to_depth(const_context, builder, op): builder.add_reorganize_data( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, mode="SPACE_TO_DEPTH", block_size=op.block_size.val, ) @register_mil_to_nn_mapping def batch_to_space(const_context, builder, op): block_size = op.block_shape.val if block_size[0] != block_size[1]: raise ValueError("batch_to_space non-equal block shape is not supported in 'neuralnetwork' backend! Please change the convert_to to 'mlprogram'.") block_size = block_size[0] if block_size == 1: raise ValueError("batch_to_space block shape == 1 not supported in 'neuralnetwork' backend! 
Please change the convert_to to 'mlprogram'.") transpose_1_name = op.name + "_transpose_1" builder.add_transpose( name=transpose_1_name, input_name=make_input(const_context, builder, op.x), axes=[1, 0, 2, 3], output_name=transpose_1_name, ) depth_to_space_name = op.name + "_depth_to_space" builder.add_reorganize_data( name=depth_to_space_name, input_name=transpose_1_name, output_name=depth_to_space_name, mode="DEPTH_TO_SPACE", block_size=block_size, ) crop_name = op.name + "_crop" crops = op.crops.val builder.add_crop( name=crop_name, input_names=[depth_to_space_name], output_name=crop_name, offset=0, top=crops[0][0], bottom=crops[0][1], left=crops[1][0], right=crops[1][1], ) transpose_2_name = op.name + "_transpose_2" builder.add_transpose( name=transpose_2_name, input_name=crop_name, axes=[1, 0, 2, 3], output_name=op.outputs[0].name, ) @register_mil_to_nn_mapping def space_to_batch(const_context, builder, op): block_size = op.block_shape.val if block_size[0] != block_size[1]: raise ValueError("space_to_batch non-equal block shape is not supported in 'neuralnetwork' backend! Please change the convert_to to 'mlprogram'.") block_size = block_size[0] if block_size == 1: raise ValueError("space_to_batch block shape == 1 not supported in 'neuralnetwork' backend! Please change the convert_to to 'mlprogram'.") pad = op.paddings.val.flatten() left, right = pad[2], pad[3] top, bottom = pad[0], pad[1] pad_name = op.name + "_pad" builder.add_padding( name=pad_name, left=left, right=right, top=top, bottom=bottom, input_name=make_input(const_context, builder, op.x), output_name=pad_name, padding_type="constant", value=0., ) transpose_1_name = op.name + "_transpose_1" builder.add_transpose( name=transpose_1_name, input_name=pad_name, axes=[1, 0, 2, 3], output_name=transpose_1_name, ) space_to_depth_name = op.name + "_space_to_depth" builder.add_reorganize_data( name=space_to_depth_name, input_name=transpose_1_name, output_name=space_to_depth_name, mode="SPACE_TO_DEPTH", block_size=block_size, ) transpose_2_name = op.name + "_transpose_2" builder.add_transpose( name=transpose_2_name, input_name=space_to_depth_name, axes=[1, 0, 2, 3], output_name=op.outputs[0].name, ) @register_mil_to_nn_mapping def transpose(const_context, builder, op): builder.add_transpose( name=op.name, axes=op.perm.val, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, ) @register_mil_to_nn_mapping def gather(const_context, builder, op): is_embedding = False if op.x.val is not None: W = op.x.val if len(W.shape) == 2: if op.axis.val == 0 or op.axis.val == -2: if len(op.x.child_ops) == 1: # the constant feeding into the gather doesn't go to any other op is_embedding = True if is_embedding: """" The following: %3 = gather(%1, %2, axis=0) # %1 is a constant matrix of shape (vocab_size, embedding_size) can be mapped to: %2_e = expand_dims(%2, axis=-1) %3 = embeddingND(%2_e, weight=%1) """ builder.add_expand_dims( name=op.name + "_expand_dims", input_name=make_input(const_context, builder, op.indices), output_name=op.name + "_expand_dims", axes=[-1], ) builder.add_embedding_nd( name=op.name, input_name=op.name + "_expand_dims", output_name=op.outputs[0].name, vocab_size=W.shape[0], embedding_size=W.shape[1], W=_np.transpose(W), ) else: builder.add_gather( name=op.name, input_names=make_input(const_context, builder, [op.x, op.indices]), output_name=op.outputs[0].name, axis=op.axis.val, ) @register_mil_to_nn_mapping def scatter(const_context, builder, op): builder.add_scatter( name=op.name, 
input_names=make_input( const_context, builder, [op.data, op.indices, op.updates] ), output_name=op.outputs[0].name, axis=op.axis.val, mode=op.mode.val.upper(), ) @register_mil_to_nn_mapping def gather_along_axis(const_context, builder, op): builder.add_gather_along_axis( name=op.name, input_names=make_input(const_context, builder, [op.x, op.indices]), output_name=op.outputs[0].name, axis=op.axis.val, ) @register_mil_to_nn_mapping def scatter_along_axis(const_context, builder, op): builder.add_scatter_along_axis( name=op.name, input_names=make_input( const_context, builder, [op.data, op.indices, op.updates] ), output_name=op.outputs[0].name, axis=op.axis.val, mode=op.mode.val.upper(), ) @register_mil_to_nn_mapping def gather_nd(const_context, builder, op): builder.add_gather_nd( name=op.name, input_names=make_input( const_context, builder, [op.x, op.indices] ), output_name=op.outputs[0].name, ) @register_mil_to_nn_mapping def scatter_nd(const_context, builder, op): builder.add_scatter_nd( name=op.name, input_names=make_input( const_context, builder, [op.data, op.indices, op.updates], ), output_name=op.outputs[0].name, mode=op.mode.val.upper(), ) @register_mil_to_nn_mapping def silu(const_context, builder, op): ''' silu is: y = x * sigmoid(x) ''' inp = make_input(const_context, builder, op.x) builder.add_activation( name=op.name + "__silu_sigmoid__", non_linearity="SIGMOID", input_name=inp, output_name=op.name + "__silu_sigmoid__", ) builder.add_elementwise( name=op.name, input_names=[inp, op.name + "__silu_sigmoid__"], output_name=op.outputs[0].name, mode='MULTIPLY', ) @register_mil_to_nn_mapping def tile(const_context, builder, op): inputs = [make_input(const_context, builder, op.x)] if op.reps.val is None: inputs.append(op.reps.name) builder.add_tile( name=op.name, reps=op.reps.val, input_name=inputs, output_name=op.outputs[0].name, ) @register_mil_to_nn_mapping def tanh(const_context, builder, op): builder.add_activation( name=op.name, non_linearity="TANH", input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, ) @register_mil_to_nn_mapping def scaled_tanh(const_context, builder, op): builder.add_activation( name=op.name, non_linearity="SCALED_TANH", input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, params=[op.alpha.val, op.beta.val], ) @register_mil_to_nn_mapping def sigmoid(const_context, builder, op): builder.add_activation( name=op.name, non_linearity="SIGMOID", input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, ) @register_mil_to_nn_mapping def sigmoid_hard(const_context, builder, op): builder.add_activation( name=op.name, non_linearity="SIGMOID_HARD", input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, params=[op.alpha.val, op.beta.val], ) @register_mil_to_nn_mapping def erf(const_context, builder, op): builder.add_erf( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, ) @register_mil_to_nn_mapping def thresholded_relu(const_context, builder, op): builder.add_activation( name=op.name, non_linearity="THRESHOLDEDRELU", input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, params=op.alpha.val, ) @register_mil_to_nn_mapping def elu(const_context, builder, op): builder.add_activation( name=op.name, non_linearity="ELU", input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, params=op.alpha.val, ) @register_mil_to_nn_mapping def 
leaky_relu(const_context, builder, op): builder.add_activation( name=op.name, non_linearity="LEAKYRELU", input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, params=[op.alpha.val], ) @register_mil_to_nn_mapping def gelu(const_context, builder, op): builder.add_gelu( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, mode=op.mode.val, ) @register_mil_to_nn_mapping def softplus(const_context, builder, op): builder.add_activation( name=op.name, non_linearity="SOFTPLUS", input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, ) @register_mil_to_nn_mapping def softmax(const_context, builder, op): rank = op.x.rank if op.axis.val == -3 or op.axis.val > 0 and op.axis.val == rank - 3: builder.add_softmax( name=op.name, input_name=op.x.name, output_name=op.outputs[0].name, ) else: builder.add_softmax_nd( name=op.name, input_name=op.x.name, output_name=op.outputs[0].name, axis=op.axis.val, ) @register_mil_to_nn_mapping def softplus_parametric(const_context, builder, op): builder.add_activation( name=op.name, non_linearity="PARAMETRICSOFTPLUS", input_name=make_input(const_context, builder, op.x), input_shape=op.x.shape, input_rank=op.x.rank, output_name=op.outputs[0].name, params=[op.alpha.val, op.beta.val], ) @register_mil_to_nn_mapping def softsign(const_context, builder, op): builder.add_activation( name=op.name, non_linearity="SOFTSIGN", input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, ) @register_mil_to_nn_mapping def linear_activation(const_context, builder, op): builder.add_activation( name=op.name, non_linearity="LINEAR", input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, params=[op.alpha.val, op.beta.val], ) @register_mil_to_nn_mapping def relu(const_context, builder, op): builder.add_activation( name=op.name, non_linearity="RELU", input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, ) @register_mil_to_nn_mapping def clamped_relu(const_context, builder, op): builder.add_clamped_relu( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, alpha=op.alpha.val, beta=op.beta.val, ) @register_mil_to_nn_mapping def relu6(const_context, builder, op): builder.add_activation( name=op.name + "__relu6_relu__", input_name=make_input(const_context, builder, op.x), output_name=op.name + "__relu6_relu__", non_linearity="RELU", ) builder.add_activation( name=op.name + "__relu6_neg__", input_name=op.name + "__relu6_relu__", output_name=op.name + "__relu6_neg__", non_linearity="LINEAR", params=[-1, 0], ) builder.add_unary( name=op.name + "__relu6_threshold6__", input_name=op.name + "__relu6_neg__", output_name=op.name + "__relu6_threshold6__", mode="threshold", alpha=-6, ) builder.add_activation( name=op.name, input_name=op.name + "__relu6_threshold6__", output_name=op.outputs[0].name, non_linearity="LINEAR", params=[-1, 0], ) @register_mil_to_nn_mapping def prelu(const_context, builder, op): builder.add_activation( name=op.name, non_linearity="PRELU", input_name=make_input(const_context, builder, op.x), input_shape=op.x.shape, input_rank=op.x.rank, output_name=op.outputs[0].name, params=op.alpha.val, ) @register_mil_to_nn_mapping def pad(const_context, builder, op): if len(op.pad.shape) != 1: raise ValueError("Pad should be a 1D tensor.") pad = op.pad.val mode = op.mode.val constant_val = op.constant_val.val nn_mode_mapping = {"reflect": "reflection", 
"replicate": "replication"} mode = nn_mode_mapping.get(mode, mode) if pad is not None: missing_dims = op.x.rank - len(pad) // 2 pad = [0, 0] * missing_dims + list(pad) if pad is not None and op.x.rank > 1 and all(i == 0 for i in pad[:-4]): pad = pad[-4:] left, right = pad[2], pad[3] top, bottom = pad[0], pad[1] builder.add_padding( name=op.name, left=left, right=right, top=top, bottom=bottom, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, padding_type=mode, value=constant_val, ) elif mode == "constant": if pad is None: builder.add_constant_pad( name=op.name, input_names=make_input(const_context, builder, [op.x, op.pad]), output_name=op.outputs[0].name, value=constant_val ) else: builder.add_constant_pad( name=op.name, input_names=make_input(const_context, builder, [op.x]), output_name=op.outputs[0].name, value=constant_val, pad_amounts=pad, ) else: raise ValueError("Unsupported mode for Pad layer! {}".format(mode)) @register_mil_to_nn_mapping def instance_norm(const_context, builder, op): channels = op.x.shape[1] gamma = _np.array([1.0] * channels) if op.gamma is None else op.gamma.val beta = _np.array([0.0] * channels) if op.beta is None else op.beta.val x_name = make_input(const_context, builder, op.x) out_name = op.outputs[0].name if op.x.rank == 3: x_name = op.name + "_expanded" builder.add_expand_dims( name=x_name, input_name=op.x.name, output_name=x_name, axes=[-2], ) out_name += "_instance_norm" builder.add_batchnorm( name=op.name, channels=channels, gamma=gamma, beta=beta, input_name=x_name, output_name=out_name, compute_mean_var=True, instance_normalization=True, epsilon=op.epsilon.val, ) # Squeeze added `Height` dimension for 1d case if op.x.rank == 3: x_name = op.name + "_squeeze" builder.add_squeeze( name=x_name, input_name=out_name, output_name=op.outputs[0].name, axes=[-2], ) @register_mil_to_nn_mapping def l2_norm(const_context, builder, op): builder.add_l2_normalize( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, epsilon=op.epsilon.val, ) @register_mil_to_nn_mapping def layer_norm(const_context, builder, op): rank = op.x.rank input_shape = [-1 if is_symbolic(dim) else dim for dim in list(op.x.shape)] axes = list(range(op.x.rank)) if op.axes.val is None else op.axes.val axes = [axis+rank if axis < 0 else axis for axis in op.axes.val] epsilon = op.epsilon.val # if input shape = (X1, X2) or (X0, X1, X2), axes = [-1], X1 and X2 are known # then the following operations are performed # - reshape to (X1, 1, X2) / (X0, X1, 1, X2) # - apply MVN layer, which normalizes across last 2 dims # - apply scale layer # - reshape back to (X1, X2) / (X0, X1, X2) # Otherwise, we express the layer_norm as primitive operations if rank in [2, 3] and len(axes) == 1 and axes[0] == rank - 1 and input_shape.count(-1) < 2 \ and input_shape[-1] != -1 and input_shape[-2] != -1: reshaped_shape = input_shape[:] # Insert a singleton dimension in the 'height' position reshaped_shape.insert(-1, 1) # Scale layer can't take parameters of size [W], but can take [1, H, W], and H=1 in this case gamma = _np.ones((1, 1, reshaped_shape[-1])) if op.gamma is None else _np.expand_dims(op.gamma.val, axis=(0, 1)) beta = _np.zeros((1, 1, reshaped_shape[-1])) if op.beta is None else _np.expand_dims(op.beta.val, axis=(0, 1)) builder.add_reshape_static( name=op.name + "_reshape", input_name=make_input(const_context, builder, op.x), output_name=op.name + "_reshape", output_shape=reshaped_shape, ) builder.add_mvn( name=op.name + "_mvn", 
input_name=op.name + "_reshape", output_name=op.name + "_mvn", across_channels=False, normalize_variance=True, epsilon=epsilon, ) builder.add_scale( name=op.name + "_scale", input_name=op.name + "_mvn", output_name=op.name + "_scale", W=gamma, b=beta, has_bias=True, shape_scale=_np.shape(gamma), shape_bias=_np.shape(beta), ) builder.add_reshape_static( name=op.name, input_name=op.name + "_scale", output_name=op.outputs[0].name, output_shape=input_shape, ) else: # We don't meet the conditions for an MVN layer, so we use primitives mean_name = op.name + "_mean" builder.add_reduce_mean( name=mean_name, input_name=make_input(const_context, builder, op.x), output_name=mean_name, axes=axes, keepdims=True, reduce_all=False, ) sub_mean_name = op.name + "_sub_mean" builder.add_subtract_broadcastable( name=sub_mean_name, input_names=[op.x.name, mean_name], output_name=sub_mean_name, ) square_name = op.name + '_square' builder.add_unary( name=square_name, input_name=sub_mean_name, output_name=square_name, mode="power", alpha=2.0, ) square_sum_name = op.name + '_square_sum' builder.add_reduce_sum( name=square_sum_name, input_name=square_name, output_name=square_sum_name, axes=axes, keepdims=True, reduce_all=False, ) normalized_shape = [op.x.shape[i] if i in axes else 1 for i in range(rank)] if not any_symbolic(normalized_shape): div_prod_name = op.name + '_div_constant' add_const(const_context, builder, div_prod_name, _np.prod(normalized_shape)) else: raise NotImplementedError("dynamic shape input nor supported for layer_norm") div_square_sum_name = op.name + '_div_square_sum' builder.add_divide_broadcastable( name=div_square_sum_name, input_names=[square_sum_name, div_prod_name], output_name=div_square_sum_name ) epsilon_const_name = op.name + '_epsilon' add_const(const_context, builder, epsilon_const_name, epsilon) add_epsilon_name = op.name + '_add_epsilon' builder.add_elementwise( name=add_epsilon_name, input_names=[div_square_sum_name, epsilon_const_name], output_name=add_epsilon_name, mode="ADD", ) sqrt_name = op.name + '_sqrt' builder.add_unary( name=sqrt_name, input_name=add_epsilon_name, output_name=sqrt_name, mode="sqrt", ) div_name = op.name + '_divide' builder.add_divide_broadcastable( name=div_name, input_names=[sub_mean_name, sqrt_name], output_name=div_name ) gamma = _np.ones(normalized_shape) if op.gamma is None else _np.reshape(op.gamma.val, normalized_shape) beta = _np.zeros(normalized_shape) if op.beta is None else _np.reshape(op.beta.val, normalized_shape) gamma_name = op.name + '_gamma' beta_name = op.name + '_beta' add_const(const_context, builder, gamma_name, gamma) add_const(const_context, builder, beta_name, beta) mul_name = op.name + '_mul' builder.add_multiply_broadcastable( name=mul_name, input_names=[div_name, gamma_name], output_name=mul_name, ) builder.add_add_broadcastable( name=op.name, input_names=[mul_name, beta_name], output_name=op.outputs[0].name, ) @register_mil_to_nn_mapping def local_response_norm(const_context, builder, op): builder.add_lrn( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, alpha=op.alpha.val, beta=op.beta.val, local_size=op.size.val, k=op.k.val, ) @register_mil_to_nn_mapping def conv_transpose(const_context, builder, op): x_name = make_input(const_context, builder, op.x) out_name = op.outputs[0].name # Special handling for 1d conv transpose is_conv_transpose_1d = op.x.rank == 3 is_conv_transpose_2d = op.x.rank == 4 is_conv_transpose_3d = op.x.rank == 5 if is_conv_transpose_1d: x_name = op.name 
+ "_expand_dim" out_name = op.name + "_expanded" builder.add_expand_dims( name=x_name, input_name=op.x.name, output_name=x_name, axes=[-2] ) # Input names to be used input_names = [x_name] # Kernel shape: [C_in, C_out, D, H, W] weight = op.weight.val kernel_channels = weight.shape[0] output_channels = weight.shape[1] * op.groups.val if is_conv_transpose_1d: weight = _np.expand_dims(weight, -2) # pyMIL Deconvolution format: [C_in, C_out / groups, spatial_dims] # NN DeConvolution3D expects weights to have shape (C_out / groups, C_in, spatial_dims) # NN DeConvolution2D/1D expects (spatial_dims, C_in, C_out/groups) if is_conv_transpose_3d: weight = _np.transpose(weight, [1, 0, 2, 3, 4]) else: weight = _np.transpose(weight, [2, 3, 0, 1]) strides = op.strides.val.tolist() dilations = op.dilations.val.tolist() output_spatial_dims = list(op.outputs[0].shape[2:]) if is_conv_transpose_1d: dilations = dilations[:-1] + [1] + dilations[-1:] strides = strides[:-1] + [1] + strides[-1:] # Must be at least 2D output_spatial_dims = output_spatial_dims[:-1] + [1] + output_spatial_dims[-1:] if any_symbolic(output_spatial_dims): output_spatial_dims = None # padding padding_mode = op.pad_type.val pad = {} if padding_mode == "custom": if is_conv_transpose_1d: padding_mode = "valid" pad["padding_top"] = 0 pad["padding_bottom"] = 0 pad["padding_left"] = op.pad.val[0] # Left pad["padding_right"] = op.pad.val[1] # Right elif is_conv_transpose_2d: padding_mode = "valid" pad["padding_top"] = op.pad.val[0] # Top pad["padding_bottom"] = op.pad.val[1] # Bottom pad["padding_left"] = op.pad.val[2] # Left pad["padding_right"] = op.pad.val[3] # Right else: pad["padding_front"] = op.pad.val[0] # Front pad["padding_back"] = op.pad.val[1] # Back pad["padding_top"] = op.pad.val[2] # Top pad["padding_bottom"] = op.pad.val[3] # Bottom pad["padding_left"] = op.pad.val[4] # Left pad["padding_right"] = op.pad.val[5] # Right groups = op.groups.val has_bias = op.bias is not None if is_conv_transpose_3d: builder.add_convolution3d( name=op.name, input_channels=kernel_channels, output_channels=output_channels, depth=weight.shape[-3], height=weight.shape[-2], width=weight.shape[-1], W=weight, b=op.bias.val if has_bias else None, has_bias=has_bias, groups=groups, stride_depth=strides[0], stride_height=strides[1], stride_width=strides[2], dilation_depth=dilations[0], dilation_height=dilations[1], dilation_width=dilations[2], padding_mode=padding_mode, is_deconv=True, output_shape=output_spatial_dims, input_name=input_names, output_name=out_name, **pad ) else: builder.add_convolution( name=out_name, kernel_channels=kernel_channels, output_channels=output_channels, height=weight.shape[0], width=weight.shape[1], stride_height=strides[0], stride_width=strides[1], border_mode=padding_mode, groups=groups, W=weight, b=op.bias.val if has_bias else None, has_bias=has_bias, is_deconv=True, output_shape=output_spatial_dims, input_name=input_names, output_name=out_name, dilation_factors=dilations, **pad ) # Squeeze added `Height` dimension for 1d case if is_conv_transpose_1d: builder.add_squeeze( name=op.name, input_name=out_name, output_name=op.outputs[0].name, axes=[-2], ) @register_mil_to_nn_mapping def range_1d(const_context, builder, op): if op.start.val is not None and op.step.val is not None: inputs = [op.end] elif op.start.val is None and op.step.val is not None: inputs = [op.end, op.start] elif op.start.val is not None and op.step.val is None: inputs = [op.end, op.start, op.step] else: inputs = [op.end, op.start, op.step] 
builder.add_range_dynamic( name=op.name, output_name=op.outputs[0].name, input_names=make_input(const_context, builder, inputs), start=op.start.val if op.start.val is not None else 0, step=op.step.val if op.step.val is not None else 1, ) @register_mil_to_nn_mapping def one_hot(const_context, builder, op): if op.one_hot_vector_size.val is not None: inputs = [op.indices] else: inputs = [op.indices, op.one_hot_vector_size] builder.add_one_hot( name=op.name, input_names=make_input(const_context, builder, inputs), output_name=op.outputs[0].name, one_hot_vector_size=op.one_hot_vector_size.val, axis=op.axis.val, on_value=op.on_value.val, off_value=op.off_value.val, ) @register_mil_to_nn_mapping def non_maximum_suppression(const_context, builder, op): builder.add_nms( name=op.name, input_names=make_input(const_context, builder, [op.boxes, op.scores]), output_names=[op.outputs[i].name for i in range(4)], iou_threshold=op.iou_threshold.val, score_threshold=op.score_threshold.val, max_boxes=op.max_boxes.val, per_class_suppression=op.per_class_suppression.val, ) @register_mil_to_nn_mapping def flatten2d(const_context, builder, op): builder.add_flatten_to_2d( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, axis=op.axis.val, ) @register_mil_to_nn_mapping def shape(const_context, builder, op): builder.add_get_shape( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, ) def add_upsample_nn(const_context, builder, op, scale_factor_h, scale_factor_w): mode = "NN" linear_upsample_mode = "DEFAULT" if _np.abs(_np.round(scale_factor_h) - scale_factor_h) < 1e-4 and scale_factor_h >= 1 - 1e-4: scale_factor_h = int(scale_factor_h) else: logger.warning( f"Unsupported float type 'scale_factor_height' ({scale_factor_h}) for neuralnetwork. " "Falling back to bilinear interpolation." ) mode = "BILINEAR" linear_upsample_mode = "ALIGN_CORNERS_TRUE" if _np.abs(_np.round(scale_factor_w) - scale_factor_w) < 1e-4 and scale_factor_w >= 1 - 1e-4: scale_factor_w = int(scale_factor_w) else: logger.warning( f"Unsupported float type 'scale_factor_width' ({scale_factor_w}) for neuralnetwork. " "Falling back to bilinear interpolation." 
) mode = "BILINEAR" linear_upsample_mode = "ALIGN_CORNERS_TRUE" builder.add_upsample( name=op.name, scaling_factor_h=scale_factor_h, scaling_factor_w=scale_factor_w, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, mode=mode, linear_upsample_mode=linear_upsample_mode, ) @register_mil_to_nn_mapping def resize_nearest_neighbor(const_context, builder, op): Hout, Wout = op.target_size_height.val, op.target_size_width.val x_shape = op.x.shape Hin, Win = x_shape[-2], x_shape[-1] scale_factor_h = Hout / Hin if Hout % Hin == 0 else (Hout + 1e-4) / Hin scale_factor_w = Wout / Win if Wout % Win == 0 else (Wout + 1e-4) / Win add_upsample_nn(const_context, builder, op, scale_factor_h, scale_factor_w) @register_mil_to_nn_mapping def upsample_nearest_neighbor(const_context, builder, op): scale_factor_h = op.scale_factor_height.val scale_factor_w = op.scale_factor_width.val add_upsample_nn(const_context, builder, op, scale_factor_h, scale_factor_w) @register_mil_to_nn_mapping def upsample_bilinear(const_context, builder, op): builder.add_upsample( name=op.name, scaling_factor_h=op.scale_factor_height.val, scaling_factor_w=op.scale_factor_width.val, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, mode="BILINEAR", linear_upsample_mode="ALIGN_CORNERS_TRUE" if op.align_corners.val else "ALIGN_CORNERS_FALSE", ) @register_mil_to_nn_mapping def resize_bilinear(const_context, builder, op): grid_sampling_mode_map = { "STRICT_ALIGN_CORNERS": "STRICT_ALIGN_ENDPOINTS_MODE", "ALIGN_CORNERS": "ALIGN_ENDPOINTS_MODE", "DEFAULT": "UPSAMPLE_MODE", "OFFSET_CORNERS": "ROI_ALIGN_MODE" } if op.sampling_mode.val not in grid_sampling_mode_map: raise NotImplementedError( "Unsupported 'sampling_mode' ('{op.sampling_mode.val}') in neuralnetwork backend" ) builder.add_resize_bilinear( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, target_height=op.target_size_height.val, target_width=op.target_size_width.val, mode=grid_sampling_mode_map[op.sampling_mode.val], ) @register_mil_to_nn_mapping def cond(const_context, builder, op): true_block = op.blocks[0] false_block = op.blocks[1] branch_layer = builder.add_branch( name=op.name, input_name=make_input(const_context, builder, op.pred), ) true_builder = neural_network.NeuralNetworkBuilder( nn_spec=branch_layer.branch.ifBranch, disable_rank5_shape_mapping=True, use_float_arraytype=True, ) convert_ops(const_context, true_builder, true_block.operations, true_block.outputs) # Copy block output to cond op output. for block_out, op_out in zip(true_block.outputs, op.outputs): true_builder.add_copy( name=block_out.name + "_ret_copy", # No need to make_input for block_out which is guaranteed # to be a node input_name=block_out.name, output_name=op_out.name, ) false_builder = neural_network.NeuralNetworkBuilder( nn_spec=branch_layer.branch.elseBranch, disable_rank5_shape_mapping=True, use_float_arraytype=True, ) convert_ops( const_context, false_builder, false_block.operations, false_block.outputs ) for block_out, op_out in zip(false_block.outputs, op.outputs): false_builder.add_copy( name=block_out.name + "_ret_copy", input_name=block_out.name, output_name=op_out.name, ) @register_mil_to_nn_mapping def while_loop(const_context, builder, op): cond_block = op.blocks[0] body_block = op.blocks[1] # Assume that all loop vars aren't loop invariant (invariant loop vars # should've be optimized away in graph passes). 
for v_in, vx_in in zip(op.loop_vars, cond_block.inputs): assert v_in.name != vx_in.name, "Loop invariant detected in {}".format(op) builder.add_copy( name=vx_in.name + "_input_copy", input_name=make_input(const_context, builder, v_in), output_name=vx_in.name, ) loop_layer = builder.add_loop( name=op.name, # max_iterations=0 to use condition network. max_iterations=0, ) # Construct while_loop condition cond_builder = neural_network.NeuralNetworkBuilder( nn_spec=loop_layer.loop.conditionNetwork, disable_rank5_shape_mapping=True, use_float_arraytype=True, ) cond_builder.rank_dict = {k.name: builder.rank_dict[k.name] for k in cond_block.inputs} convert_ops( const_context, cond_builder, cond_block.operations, cond_block.outputs, ) loop_layer.loop.conditionVar = cond_block.outputs[0].name # while_loop body produces loop_vars body_builder = neural_network.NeuralNetworkBuilder( nn_spec=loop_layer.loop.bodyNetwork, disable_rank5_shape_mapping=True, use_float_arraytype=True, ) body_builder.rank_dict = {k.name: builder.rank_dict[k.name] for k in body_block.inputs} convert_ops( const_context, body_builder, body_block.operations, body_block.outputs, ) # Also assume all outputs are different from loop inputs (i.e., no loop # invariant.) for vx_in, vx_out in zip(body_block.inputs, body_block.outputs): if vx_in.name == vx_out.name: msg = "Loop invariant var {} detected in block {}" logger.warning(msg.format(vx_in.name, body_block.name)) continue body_builder.add_copy( name=vx_in.name + "_ret_copy", input_name=make_input(const_context, builder, vx_out), output_name=vx_in.name, ) @register_mil_to_nn_mapping def identity(const_context, builder, op): builder.add_copy( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, ) @register_mil_to_nn_mapping def concat(const_context, builder, op): # filter out input tensor with 0 size values = [] for v in op.values: if len(v.shape) > 0 and v.shape[op.axis.val] == 0: continue values.append(v) if len(values) == 0: raise NotImplementedError('0 size tensor unsupported.') if len(values) >= 2: rank = values[0].rank if op.interleave.val: builder.add_concat_nd( name=op.name, input_names=make_input(const_context, builder, values), output_name=op.outputs[0].name, axis=op.axis.val, interleave=True) elif rank >= 4 and (op.axis.val == -3 or op.axis.val > 0 and op.axis.val == rank - 3): builder.add_elementwise( name=op.name, input_names=make_input(const_context, builder, values), output_name=op.outputs[0].name, mode="CONCAT", ) else: builder.add_concat_nd( name=op.name, input_names=make_input(const_context, builder, values), output_name=op.outputs[0].name, axis=op.axis.val) else: builder.add_copy( name=op.name, input_name=make_input(const_context, builder, values[0]), output_name=op.outputs[0].name) @register_mil_to_nn_mapping def stack(const_context, builder, op): builder.add_stack( name=op.name, input_names=make_input(const_context, builder, op.values), output_name=op.outputs[0].name, axis=op.axis.val, ) @register_mil_to_nn_mapping def split(const_context, builder, op): split = op.sizes split = [size for size in split if size != 0] has_equal_splits = all([size == split[0] for size in split]) num_splits = len(split) output_names = [op.outputs[i].name for i in range(len(op.sizes)) if op.sizes[i] != 0] if has_equal_splits: builder.add_split_nd( name=op.name, input_name=make_input(const_context, builder, op.x), output_names=output_names, axis=op.axis.val, num_splits=num_splits) else: builder.add_split_nd( name=op.name, 
input_name=make_input(const_context, builder, op.x), output_names=output_names, axis=op.axis.val, split_sizes=list(split)) @register_mil_to_nn_mapping def argsort(const_context, builder, op): axis = op.x.rank + op.axis.val if op.axis.val < 0 else op.axis.val builder.add_argsort( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, axis=axis, descending=(not op.ascending.val), ) @register_mil_to_nn_mapping def pixel_shuffle(const_context, builder, op): builder.add_reorganize_data( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, mode="PIXEL_SHUFFLE", block_size=op.upscale_factor.val, ) @register_mil_to_nn_mapping def sliding_windows(const_context, builder, op): builder.add_sliding_windows( name=op.name, input_name=make_input(const_context, builder, op.x), output_name=op.outputs[0].name, axis=op.axis.val, window_size=op.size.val, step=op.stride.val, ) @register_mil_to_nn_mapping def crop(const_context, builder, op): builder.add_crop( name=op.name, input_names=[op.x.name], output_name=op.outputs[0].name, offset=0, left=op.crop_width.val[0], right=op.crop_width.val[1], top=op.crop_height.val[0], bottom=op.crop_height.val[1], ) @register_mil_to_nn_mapping def crop_resize(const_context, builder, op): grid_sampling_mode_map = { "STRICT_ALIGN_CORNERS": "STRICT_ALIGN_ENDPOINTS_MODE", "ALIGN_CORNERS": "ALIGN_ENDPOINTS_MODE", "DEFAULT": "UPSAMPLE_MODE", "OFFSET_CORNERS": "ROI_ALIGN_MODE", } if op.sampling_mode.val not in grid_sampling_mode_map: raise NotImplementedError( "Unsupported 'sampling_mode' ('{}') in neuralnetwork backend".format( op.sampling_mode.val ) ) mode = grid_sampling_mode_map[op.sampling_mode.val] input_expanded = op.name + "_x_expand" builder.add_expand_dims( name=input_expanded, input_name=make_input(const_context, builder, op.x), output_name=input_expanded, axes=[0], ) builder.add_crop_resize( name=op.name, input_names=make_input(const_context, builder, [input_expanded, op.roi]), output_name=op.outputs[0].name, target_height=op.target_height.val, target_width=op.target_width.val, mode=mode, normalized_roi=op.normalized_coordinates.val, box_indices_mode=op.box_coordinate_mode.val, spatial_scale=op.spatial_scale.val, ) @register_mil_to_nn_mapping def custom_op(const_context, builder, op): class_name = op.bindings.get("class_name", op.name) input_order = op.bindings.get("input_order", []) parameters = op.bindings.get("parameters", []) weights = op.bindings.get("weights", []) description = op.bindings.get("description", "") if len(input_order) == 0: raise ValueError("Inputs not provided for Custom Layer: {}".format(op.name)) # Get input names inputs = [op.inputs[_name] for _name in input_order] # Get output names output_names = [_output.name for _output in op.outputs] # Load custom params params = NeuralNetwork_pb2.CustomLayerParams() params.className = class_name params.description = description # Load parameters for _param in parameters: param = op.inputs[_param] param_val = param.val if types.is_bool(param.dtype): params.parameters[_param].boolValue = param_val elif types.is_int(param.dtype): params.parameters[_param].intValue = param_val elif types.is_float(param.dtype): params.parameters[_param].doubleValue = param_val elif types.is_str(param.dtype): params.parameters[_param].stringValue = param_val else: raise ValueError( "Unknown parameter type for custom layer- " "Op: {}, Parameter: {}, Type: {}".format(op.name, _param, param.dtype) ) # Load weights for _weight in weights: wt = 
params.weights.add() wt.floatValue.extend(map(float, _weight)) # Add a custom layer builder.add_custom( name=op.name, input_names=make_input(const_context, builder, inputs), output_names=output_names, custom_proto_spec=params, ) @register_mil_to_nn_mapping def make_list(const_context, builder, op): # Set a initial size size = op.init_length.val # set the dynamic dimensions to 1 for initialization # Ex: op.elem_shape = [i0, 128] will result in [1, 128] elem_shape = [1 if isinstance(dim_var.val, str) else dim_var.val for dim_var in op.elem_shape] if size is not None: array_size = size if size > 0 else 1 array_shape = [array_size] + elem_shape add_const( const_context, builder, op.outputs[0].name, val=_np.zeros(array_shape, dtype="float"), ) else: if len(elem_shape) > 0: node_es_name = op.name + "_element_shape" add_const( const_context, builder, node_es_name, val=_np.array(elem_shape, dtype="float"), ) # Concatenate list length of the input, should be a constant vector of size 1) with element shape node_arr_shape_name = op.name + "_arr_shape" builder.add_concat_nd( name=node_arr_shape_name, input_names=[op.init_length.name, node_es_name], output_name=node_arr_shape_name, axis=0, ) else: raise ValueError("elem_shape should have length > 0.") builder.add_fill_dynamic( name=op.name, input_name=node_arr_shape_name, output_name=op.outputs[0].name ) def _realloc_list(const_context, builder, ls_var, index_var, value_var, mode): # we do two things in this helper function # (1) # check if we need to re-initialize the tensorarray: # it happens when the elem_shape is runtime determined and the runtime shape is not equal to # the default shape. Ex: elem_shape is = [i0, 10] (initialized with [1, 10]) and at the runtime we get [2, 10]. # (2) # If index_var >= len(ls_var), reallocate the array and copy over existing # contents # index_var: str or Var # ls_var: Var # check if elem_shape is runtime-determined elem_shape = tuple(value_var.shape) has_dynamic_shape = any([is_symbolic(i) for i in elem_shape]) # get the fill shape of the tensor array # [length, elem_dim1, elem_dim2, ...] full_shape_name = ls_var.name + "_full_shape" builder.add_get_shape( name=full_shape_name, input_name=ls_var.name, # no need to make_input output_name=full_shape_name, ) # slice shape [length, elem_dim1, elem_dim2, ...] to get current length curr_len_name = ls_var.name + "_length" builder.add_slice_static( name=curr_len_name, input_name=full_shape_name, output_name=curr_len_name, begin_ids=[0], end_ids=[1], begin_masks=[False], end_masks=[False], strides=[1], ) value_elem_shape_name = ls_var.name + '_value_elem_shape' if has_dynamic_shape: # get elem_shape from value if it is runtime-determined # this is similar to what the backfill_make_list_elem_type tf graph pass does. 
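    # Added clarification: "list_scatter" updates several list positions at once, so
    # its value carries a leading batch axis; stripping that axis (the begin_ids=[1]
    # slice further below) recovers the per-element shape that the reallocated array
    # must use, whereas "list_write" passes a single element whose full shape is
    # already the element shape.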
# if mode == "list_write", elem_shape equal to value.shape, # if mode == "list_scatter", elem_shape equal to value.shape[1:] if mode == "list_write": builder.add_get_shape( name=value_elem_shape_name, input_name=make_input(const_context, builder, value_var), output_name=value_elem_shape_name, ) elif mode == "list_scatter": raw_value_elem_shape_name = ls_var.name + '_raw_value_elem_shape' builder.add_get_shape( name=raw_value_elem_shape_name, input_name=make_input(const_context, builder, value_var), output_name=raw_value_elem_shape_name, ) builder.add_slice_static( name=value_elem_shape_name, input_name=raw_value_elem_shape_name, output_name=value_elem_shape_name, begin_ids=[1], end_ids=[-1], begin_masks=[False], end_masks=[True], strides=[1], ) else: add_const(const_context, builder, value_elem_shape_name, _np.array(elem_shape)) # if elem_shape is runtime-determined, check if we need to re-initialize the array if has_dynamic_shape: # slice shape [length, elem_dim1, elem_dim2, ...] to get list elem_shape curr_elem_shape_name = ls_var.name + "_ls_elem_shape" builder.add_slice_static( name=curr_elem_shape_name, input_name=full_shape_name, output_name=curr_elem_shape_name, begin_ids=[1], end_ids=[-1], begin_masks=[False], end_masks=[True], strides=[1], ) # test if the runtime elem_shape from the list and value are equal not_equal_name = ls_var.name + '_elem_shape_not_equal' builder.add_not_equal( name=not_equal_name, input_names=[curr_elem_shape_name, value_elem_shape_name], output_name=not_equal_name, ) reduce_any_name = ls_var.name + '_reduce_any' builder.add_reduce_sum( name=reduce_any_name, input_name=not_equal_name, output_name=reduce_any_name, axes=[0], keepdims=False, reduce_all=True, ) # if the two elem_shape are different, then re initialize the list with elem_shape from the value re_initialize_condition_name = ls_var.name + "_condition_re_initialize" layer = builder.add_branch(name=re_initialize_condition_name, input_name=reduce_any_name) true_builder = neural_network.NeuralNetworkBuilder( nn_spec=layer.branch.ifBranch, disable_rank5_shape_mapping=True, use_float_arraytype=True, ) re_initialize_shape_name = ls_var.name + "_re_initialize_shape" true_builder.add_concat_nd( name=re_initialize_shape_name, input_names=[curr_len_name, value_elem_shape_name], output_name=re_initialize_shape_name, axis=0, ) re_initialize_name = ls_var.name + "_re_initialize" true_builder.add_fill_dynamic( name=re_initialize_name, input_name=re_initialize_shape_name, output_name=re_initialize_name, value=0.0, ) true_builder.add_copy( name=ls_var.name + "_re_initialize_assign", input_name=re_initialize_name, output_name=ls_var.name ) # after re-initialize the list, we now check if we need to reallocate the list # check if the index > curr_length is_growing_name = ls_var.name + "_is_growing" builder.add_greater_than( name=is_growing_name, input_names=make_input(const_context, builder, [index_var, curr_len_name]), output_name=is_growing_name, use_greater_than_equal=True, ) condition_name = ls_var.name + "_condition" layer = builder.add_branch(name=condition_name, input_name=is_growing_name) true_builder = neural_network.NeuralNetworkBuilder( nn_spec=layer.branch.ifBranch, disable_rank5_shape_mapping=True, use_float_arraytype=True, ) # alloc_length_name0 = index - list_length alloc_length_name0 = ls_var.name + "_extra_length0" true_builder.add_subtract_broadcastable( name=alloc_length_name0, input_names=make_input(const_context, builder, [index_var, curr_len_name]), output_name=alloc_length_name0, ) # 
alloc_length_name1 = index - list_length + 1 alloc_length_name1 = ls_var.name + "_extra_length1" true_builder.add_elementwise( name=alloc_length_name1, input_names=[alloc_length_name0], mode="ADD", output_name=alloc_length_name1, alpha=1, ) # alloc_shape_name = [alloc_length] + elem_shape alloc_shape_name = ls_var.name + "_alloc_shape" true_builder.add_concat_nd( name=alloc_shape_name, input_names=[alloc_length_name1, value_elem_shape_name], output_name=alloc_shape_name, axis=0, ) # new_alloc_name is np.zeros([alloc_length] + elem_shape) new_alloc_name = ls_var.name + "_alloc" true_builder.add_fill_dynamic( name=new_alloc_name, input_name=alloc_shape_name, output_name=new_alloc_name, value=0.0, ) # new_list_name is np.concat([old_list, new_alloc]) new_list_name = ls_var.name + "_new" true_builder.add_concat_nd( name=new_list_name, input_names=[ls_var.name, new_alloc_name], output_name=new_list_name, axis=0, ) # Copy new_list_name to ls_var.name true_builder.add_copy( name=ls_var.name + "_assign", input_name=new_list_name, output_name=ls_var.name ) @register_mil_to_nn_mapping def list_write(const_context, builder, op): _realloc_list(const_context, builder, op.ls, op.index, op.value, "list_write") # expanded_value_name is [1, op.value] expanded_value_name = op.ls.name + '_' + op.value.name + "_expanded" builder.add_expand_dims( name=expanded_value_name, input_name=make_input(const_context, builder, op.value), output_name=expanded_value_name, axes=[0], ) builder.add_scatter( name=op.name, input_names=make_input( const_context, builder, [op.ls, op.index, expanded_value_name] ), output_name=op.outputs[0].name, ) @register_mil_to_nn_mapping def list_gather(const_context, builder, op): builder.add_gather( name=op.name, input_names=make_input(const_context, builder, [op.ls, op.indices]), output_name=op.outputs[0].name, axis=0, ) @register_mil_to_nn_mapping def list_scatter(const_context, builder, op): max_idx_name = op.indices.name + "_max" builder.add_reduce_max( name=max_idx_name, axes=[0], keepdims=False, input_name=make_input(const_context, builder, op.indices), output_name=max_idx_name, ) _realloc_list(const_context, builder, op.ls, max_idx_name, op.value, "list_scatter") builder.add_scatter( name=op.name, input_names=make_input(const_context, builder, [op.ls, op.indices, op.value]), output_name=op.outputs[0].name, ) @register_mil_to_nn_mapping def list_read(const_context, builder, op): # gathered_name has shape [1] + elem_shape gathered_name = op.name + "_gathered" builder.add_gather( name=op.name, input_names=make_input(const_context, builder, [op.ls, op.index]), output_name=gathered_name, axis=0, ) # squeezed_name has shape elem_shape squeezed_name = op.name + "_squeezed" builder.add_squeeze( name=squeezed_name, input_name=gathered_name, output_name=op.outputs[0].name, axes=[0], ) @register_mil_to_nn_mapping def list_length(const_context, builder, op): # list_shape_name == [list_length] + elem_shape list_shape_name = op.ls.name + "_shape" builder.add_get_shape( name=list_shape_name, input_name=make_input(const_context, builder, op.ls), output_name=list_shape_name, ) # slice to get list_length builder.add_slice_static( name=op.name, input_name=list_shape_name, output_name=op.outputs[0].name, begin_ids=[0], end_ids=[1], begin_masks=[False], end_masks=[False], strides=[1], ) @register_mil_to_nn_mapping def _const_symbolic(const_context, builder, op): # do nothing pass ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2095466 
coremltools-8.0/coremltools/converters/mil/backend/nn/passes/0000755000000000000000000000000014672075535023314 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/nn/passes/__init__.py0000644000000000000000000000066014672066616025427 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from . import ( alert_return_type_cast, commingle_loop_vars, conv1d_decomposition, handle_return_inputs_as_outputs, handle_return_unused_inputs, handle_unused_inputs, mlmodel_passes, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/nn/passes/alert_return_type_cast.py0000644000000000000000000000324214672066616030450 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools import _logger as logger from coremltools.converters.mil.mil import Var, types from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.pass_registry import register_pass @register_pass(namespace="nn_backend") class alert_return_type_cast(AbstractGraphPass): """ prog: Program # NN always implicitly cast return types to fp32. Detect any return # types that are not builtin.fp32 and alert user of the implicit # casting. This pass must be at the end. Example: # # Given: # # main(%x: (2, 3, fp32)) { # block0() { # %shape_0: (2,i32)* = const(val=[4, 7]) # } -> (%shape_0) # } # # (Notice that %shape_0 is i32, not fp32) # # Result: # # The same program. # # Alert messages about %shape_0 being implicitly cast from i32 to fp32. # # Comment: This pass should do more proper casting as backend supports more types. """ def apply(self, prog): for f_name, f in prog.functions.items(): for v in f.outputs: if isinstance(v, Var) and v.dtype != types.fp32: msg = ( "Output var {} of type {} in function {} is " + "cast to type fp32" ) logger.warning( msg.format(v.name, types.builtin_to_string(v.dtype), f_name) ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/nn/passes/commingle_loop_vars.py0000644000000000000000000000465014672066616027731 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.pass_registry import register_pass def _commingle_loop_vars_block(block): for op in block.operations: for b in op.blocks: _commingle_loop_vars_block(b) if op.op_type != "while_loop": continue for block in op.blocks: for v_out, vx_in in zip(op.outputs, block.inputs): # Disable check as v_out is not visible in block. 
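            # Added clarification: replacing every use of the block input `vx_in`
            # with the loop output `v_out` (and then pointing the block inputs at the
            # op outputs) makes the condition and body read and write the same names
            # as the loop results. The program is no longer SSA, but this in-place
            # form is what the NN loop layer expects, as the pass docstring below
            # illustrates.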
block.replace_uses_of_var_after_op( anchor_op=None, old_var=vx_in, new_var=v_out, ) # replace block inputs block._block_inputs = op.outputs @register_pass(namespace="nn_backend") class commingle_loop_vars(AbstractGraphPass): """ prog: Program # NN backend expects output vars as loop vars. Example: # # Given: # main(%a: (1, 2, fp32), # %b: (1, 2, fp32)) { # block0() { # %loop:0: (1, 2, fp32), %loop:1: (1, 2, fp32) = \ # while_loop(loop_vars=(%a, %b)) # loop_cond(%a.x, %b.x) { # %cond_var: (bool) = some_op(x=%a.x, y=%b.x) # } -> (%cond_var) # loop_body(%a.x, %b.x) { # %add_0: (1, 2, fp32) = add(x=%a.x, y=%b.x) # } -> (%add_0, %b.x) # } -> (%loop:0, %loop:1) # } # # Result: # main(%a: (1, 2, fp32), # %b: (1, 2, fp32)) { # block0() { # %loop:0: (1, 2, fp32), %loop:1: (1, 2, fp32) = \ # while_loop(loop_vars=(%a, %b)) # loop_cond(%loop:0, %loop:1) { # %cond_var: (bool) = some_op(x=%loop:0, y=%loop:1) # } -> (%cond_var) # loop_body(%loop:0, %loop:1) { # %add_0: (1, 2, fp32) = add(x=%loop:0, y=%loop:1) # } -> (%add_0, %loop:1) # } -> (%loop:0, %loop:1) # } # # Comment: The resulting program is no longer SSA (multiple assignments on # %loop:0). """ def apply(self, prog): for f in prog.functions.values(): _commingle_loop_vars_block(f) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/nn/passes/conv1d_decomposition.py0000644000000000000000000000726514672066616030026 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import Block from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Operation from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass @register_pass(namespace="nn_backend") class decompose_conv1d(AbstractGraphPass): """ NeuralNetwork does not support conv1d natively, instead it decomposes conv1d into expand_dims -> conv2d -> squeeze Let us decompose conv1d for NN, so we may have a chance to optimize expand_dims -> conv2d -> squeeze Given: %2 = conv(%1), %1.rank = 3 ... Result: %3 = expand_dims(%1, axes=-2) %4 = conv(%3) %2 = squeeze(%4, axes=-2) ... 
""" def apply(self, prog): for f in prog.functions.values(): self._decompose_conv1d_block(f) @block_context_manager def _decompose_conv1d_block(self, block: Block): def help_decompose_conv1d_block(block: Block) -> bool: fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: block_changed = True while block_changed: block_changed = help_decompose_conv1d_block(b) # must be conv1d if op.op_type != "conv" or op.x.rank != 3: continue if self._try_apply_transform(op, block): fusion_occurred = True return fusion_occurred block_changed = True while block_changed: block_changed = help_decompose_conv1d_block(block) @staticmethod def _try_apply_transform(conv_op: Operation, block: Block) -> bool: # create `expand_dims` expand_out = mb.expand_dims(x=conv_op.x, axes=(-2,), before_op=conv_op) # prepare `conv2d` conv_kwargs = {"x": expand_out, "before_op": conv_op} # inherit `pad_type`, `groups`, `bias` from `conv1d` conv_kwargs["pad_type"] = conv_op.inputs["pad_type"].val conv_kwargs["groups"] = conv_op.inputs["groups"].val bias = conv_op.inputs.get("bias", None) if bias is not None: conv_kwargs["bias"] = bias # expand `weight`, `strides`, `pad`, `dilations` from `conv1d` conv_kwargs["weight"] = mb.expand_dims( x=conv_op.inputs["weight"], axes=(-2,), before_op=conv_op ) conv_kwargs["strides"] = (1, conv_op.inputs["strides"].val[-1]) conv_kwargs["pad"] = (0, 0, conv_op.inputs["pad"].val[-2], conv_op.inputs["pad"].val[-1]) conv_kwargs["dilations"] = (1, conv_op.inputs["dilations"].val[-1]) # compose `conv2d` conv_out = mb.conv(**conv_kwargs) # create `squeeze` squeeze_out = mb.squeeze( x=conv_out, axes=(-2,), name=conv_op.outputs[0].name, before_op=conv_op ) # try replacing `conv1d` output # with the new `expand_dims` -> `conv2d` -> `squeeze` output if conv_op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=conv_op, old_var=conv_op.outputs[0], new_var=squeeze_out ): # remove `conv1d` block.remove_ops([conv_op]) return True return False ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/nn/passes/handle_return_inputs_as_outputs.py0000644000000000000000000000415114672066616032411 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.pass_registry import register_pass def _handle_return_inputs_as_outputs_func(f): returned_inputs = [] for v_name, v in f.inputs.items(): if v not in f.outputs: continue returned_inputs.append(v) with f: for v in returned_inputs: # copy twice since NN layer cannot have input name == output name v_tmp = mb.identity(x=v, name=v.name + "_tmp") res = mb.identity(x=v_tmp, name=v.name) res.op.enclosing_block.replace_uses_of_var_after_op( anchor_op=res.op, old_var=v, new_var=res ) @register_pass(namespace="nn_backend") class handle_return_inputs_as_outputs(AbstractGraphPass): """ prog: Program # NN cannot handle returning input as output. Insert an identity op for # those cases. 
Example: # # Given: # main(%a: (1, 2, fp32), # %b: (1, 2, fp32)) { # block0() { # %mul_0_y_0: (i32)* = const(val=2) # %mul_0: (1, 2, fp64) = mul(x=%a, y=%mul_0_y_0) # } -> (%mul_0, %b) # } # # (Notice that %b is returned from input. This causes error in NN) # # Result: # main(%a: (1, 2, fp32), # %b: (1, 2, fp32)) { # block0() { # %mul_0_y_0: (i32)* = const(val=2) # %mul_0: (1, 2, fp64) = mul(x=%a, y=%mul_0_y_0) # %b_tmp: (1, 2, fp32) = identity(x=%b) # %b: (1, 2, fp32) = identity(x=%b_tmp) # } -> (%mul_0, %b) # } # # where identity is applied twice since NN layer cannot have # input name == output name """ def apply(self, prog): for f in prog.functions.values(): _handle_return_inputs_as_outputs_func(f) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/nn/passes/handle_return_unused_inputs.py0000644000000000000000000000404614672066616031511 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.pass_registry import register_pass def _handle_return_unused_inputs_func(f): returned_unused_inputs = filter(lambda x: x in f.outputs, list(f.inputs.values())) with f: for v in returned_unused_inputs: # copy twice since NN layer cannot have input name == output name v_tmp = mb.identity(x=v, name=v.name + "_tmp") res = mb.identity(x=v_tmp, name=v.name) res.op.enclosing_block.replace_uses_of_var_after_op( anchor_op=res.op, old_var=v, new_var=res ) @register_pass(namespace="nn_backend") class handle_return_unused_inputs(AbstractGraphPass): """ prog: Program # NN cannot handle returning input as output. Insert an identity op for # those cases. Example: # # Given: # main(%a: (1, 2, fp32), # %b: (1, 2, fp32)) { # block0() { # %mul_0_y_0: (i32)* = const(val=2) # %mul_0: (1, 2, fp64) = mul(x=%a, y=%mul_0_y_0) # } -> (%mul_0, %b) # } # # (Notice that %b is returned from input. This causes error in NN) # # Result: # main(%a: (1, 2, fp32), # %b: (1, 2, fp32)) { # block0() { # %mul_0_y_0: (i32)* = const(val=2) # %mul_0: (1, 2, fp64) = mul(x=%a, y=%mul_0_y_0) # %b_tmp: (1, 2, fp32) = identity(x=%b) # %b: (1, 2, fp32) = identity(x=%b_tmp) # } -> (%mul_0, %b) # } # # where identity is applied twice since NN layer cannot have # input name == output name """ def apply(self, prog): for f in prog.functions.values(): _handle_return_unused_inputs_func(f) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/nn/passes/handle_unused_inputs.py0000644000000000000000000000330714672066616030111 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import Block from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.pass_registry import register_pass def _handle_unused_inputs_func(f): unused_inputs = [v for v_name, v in f.inputs.items() if len(v.child_ops) == 0] with f: for v in unused_inputs: # copy the input v_tmp = mb.identity(x=v, name=v.name + "_tmp") Block._copy_scope_info(v, v_tmp) @register_pass(namespace="nn_backend") class handle_unused_inputs(AbstractGraphPass): """ prog: Program # NN doesn't allow unused inputs. Insert an identity op to consume # inputs (though its outputs are not used.). This pass must come after # dead code elimination as all inserted code are "dead code". Example: # # Given: # # main(%x: (2, 3, fp32)) { # block0() { # %shape_0_const: (2,i32)* = const(val=[4, 7]) # } -> (%shape_0_const) # } # # (Notice that input %x is not consumed. This causes error in NN.) # # Result: # # main(%x: (2, 3, fp32)) { # block0() { # %unused_var: (2, 3, fp32) = identity(x=%x) # %shape_0_const: (2,i32)* = const(val=[4, 7]) # } -> (%shape_0_const) # } """ def apply(self, prog): for f in prog.functions.values(): _handle_unused_inputs_func(f) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/nn/passes/mlmodel_passes.py0000644000000000000000000004407114672066616026703 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause def _get_nn_spec(spec): if spec.WhichOneof("Type") == "neuralNetwork": nn_spec = spec.neuralNetwork elif spec.WhichOneof("Type") == "neuralNetworkClassifier": nn_spec = spec.neuralNetworkClassifier elif spec.WhichOneof("Type") == "neuralNetworkRegressor": nn_spec = spec.neuralNetworkRegressor else: raise ValueError("Specification must contain a neural network") return nn_spec def _get_blob_out_degree(spec): """ Computes use count of every tensor/node in NN graph i.e. 
How many layers are using it as an input :param nn_spec : NeuralNetworkSpecification :returns use_count_dict : str -> int, a dictionary with node name as a key and it's use count as a value """ def _get_blob_out_degree_rec(nn_spec, out_degree): nn_layers = nn_spec.layers for layer in nn_layers: layer_type = layer.WhichOneof("layer") for inp in layer.input: out_degree[inp] = out_degree.get(inp, 0) + 1 if layer_type == "loop": out_degree[layer.loop.conditionVar] = ( out_degree.get(layer.loop.conditionVar, 0) + 1 ) _get_blob_out_degree_rec(layer.loop.conditionNetwork, out_degree) _get_blob_out_degree_rec(layer.loop.bodyNetwork, out_degree) elif layer_type == "branch": _get_blob_out_degree_rec(layer.branch.ifBranch, out_degree) _get_blob_out_degree_rec(layer.branch.elseBranch, out_degree) use_count_dict = {} # Collect variable use count recursively nn_spec = _get_nn_spec(spec) _get_blob_out_degree_rec(nn_spec, use_count_dict) # Network outputs are variable use network_outputs = _get_network_output(spec) for _output in network_outputs: use_count_dict[_output] = use_count_dict.get(_output, 0) + 1 return use_count_dict def _is_layer(nn_layer, layer_type): """ :param nn_layer : NN layer proto message :param layer_type : str Layer type to check against :returns True if nn_layer is of type `layer_type` otherwise False """ return nn_layer.WhichOneof("layer") == layer_type def _get_input(layer, index=0): """ :param layer : NN Layer Proto message :param index : Layer input index (Default 0) :returns name of input at provided index if present, otherwise None """ if len(layer.input) <= index: return None return layer.input[index] def _get_output(layer, index=0): """ :param layer : NN Layer Proto message :param index : Layer output index (Default 0) :returns name of output at provided index if present, otherwise None """ if len(layer.output) <= index: return None return layer.output[index] def _get_network_output(spec): """ :param spec : CoreML Specification :returns network output names """ network_output_names = [] for _out in spec.description.output: network_output_names.append(_out.name) return network_output_names def transform_conv_crop(spec): """ Transforms Conv -> Crop -> BN (if present) -> Activation (if present) into Conv -> BN (if present) -> Activation (if present) -> Crop This transformation will allow Conv -> BN -> Activation fusion by changing the position of the crop layer, which does not affect the computation """ # Collect metadata out_degree = _get_blob_out_degree(spec) network_output_names = _get_network_output(spec) nn_spec = _get_nn_spec(spec) nn_layers = nn_spec.layers for i in range(0, len(nn_layers) - 2): # If Convolution output is being using as a network output or more than one layers # that's acceptable if not _is_layer(nn_layers[i], "convolution"): continue # Output of Crop layer must not be network output or used by more than one layer if not ( _is_layer(nn_layers[i + 1], "crop") and _get_input(nn_layers[i + 1]) not in network_output_names and out_degree[_get_output(nn_layers[i + 1])] == 1 ): continue layer_to_shuffle_with = -1 # Output of Batchnorm layer must not be network output or used by more than one layer if ( _is_layer(nn_layers[i + 2], "batchnorm") and out_degree[_get_output(nn_layers[i + 2])] == 1 ): layer_to_shuffle_with = i + 2 # Output of Activation layer must not be network output or used by more than one layer if ( i + 3 < len(nn_layers) and _is_layer(nn_layers[i + 3], "activation") and out_degree[_get_output(nn_layers[i + 3])] == 1 ): layer_to_shuffle_with = i 
+ 3 if layer_to_shuffle_with == -1: continue # restructure crop layer # Conv ---> Crop ---> BN ---> Activation ---> Layer1 # In following three steps # 1. Conv --------------> BN ---> Activation ---> Layer1 # \ / # ---> Crop -- nn_layers[i].output[0] = nn_layers[i + 1].output[0] # 2. Conv ---> BN ---> Activation ---> Layer1 # \ / # -----------------Crop ---- nn_layers[i + 1].output[0] = nn_layers[layer_to_shuffle_with].output[0] # 3. Conv ---> BN ---> Activation ---> Crop ---> Layer1 nn_layers[layer_to_shuffle_with].output[0] = nn_layers[i + 1].input[0] # Add Crop layer at new position and remove from current position crop_layer = nn_layers[i + 1] nn_layers.remove(crop_layer) nn_layers.insert(layer_to_shuffle_with, crop_layer) def remove_disconnected_layers(spec): """ Removes layers from model specification if it's output is not connected or on path to the network output. """ def _remove_layers_from_spec(nn_spec, layers_to_delete): nn_layers = nn_spec.layers for _layer in layers_to_delete: nn_layers.remove(_layer) def _get_disconnected_layers_rec(nn_spec): """ - Iterates over layers in bottom-up fashion - Collect layers if it's output is not being used (marks and does lazy deletion) - Recursively iterates over NN Spec if layer is Loop or Branch """ def _decrease_input_degree(layer): """ Helper routine to reduce degree input nodes for given layer """ for _input in layer.input: out_degree[_input] -= 1 if out_degree[_input] == 0: del out_degree[_input] nn_layers = nn_spec.layers layers_to_delete = [] for _layer in reversed(nn_layers): layer_type = _layer.WhichOneof("layer") if layer_type == "loop": condition_net_layers_to_delete = _get_disconnected_layers_rec( _layer.loop.conditionNetwork ) body_net_layers_to_delete = _get_disconnected_layers_rec( _layer.loop.bodyNetwork ) _remove_layers_from_spec( _layer.loop.conditionNetwork, condition_net_layers_to_delete ) _remove_layers_from_spec( _layer.loop.bodyNetwork, body_net_layers_to_delete ) # NOTE: Debatable? 
# If condition network or bodyNetwork is empty, delete loop layer if ( len(_layer.loop.conditionNetwork.layers) == 0 or len(_layer.loop.bodyNetwork.layers) == 0 ): layers_to_delete.append(_layer) _decrease_input_degree(_layer) continue if layer_type == "branch": if_layers_to_delete = _get_disconnected_layers_rec( _layer.branch.ifBranch ) else_layers_to_delete = _get_disconnected_layers_rec( _layer.branch.elseBranch ) total_if_layers = len(_layer.branch.ifBranch.layers) total_else_layers = len(_layer.branch.elseBranch.layers) if ( len(if_layers_to_delete) != total_if_layers and len(else_layers_to_delete) != total_else_layers ): # If both branches are non-empty after dead-layer elimination # remove respective layers _remove_layers_from_spec( _layer.branch.ifBranch, if_layers_to_delete ) _remove_layers_from_spec( _layer.branch.elseBranch, else_layers_to_delete ) elif ( len(if_layers_to_delete) == total_if_layers and len(else_layers_to_delete) == total_else_layers ): # If both branches are empty after dead-layer elimination # remove branch layer altogether layers_to_delete.append(_layer) _decrease_input_degree(_layer) continue output_is_used = False for _output in _layer.output: # If output is used, cannot remove current layer if _output in out_degree: output_is_used = True break # If no output from current node is used # Remove the layer and decrement use count for all the inputs if not output_is_used: layers_to_delete.append(_layer) _decrease_input_degree(_layer) return layers_to_delete def _remove_disconnected_layers_rec(nn_spec): """ Entry point for removing disconnected layers """ layers_to_delete = _get_disconnected_layers_rec(nn_spec) # delete layers to be removed _remove_layers_from_spec(nn_spec, layers_to_delete) # Get the use count of each layer out_degree = _get_blob_out_degree(spec) nn_spec = _get_nn_spec(spec) # Initiate removal from high level Neural Network spec _remove_disconnected_layers_rec(nn_spec) def remove_redundant_transposes(spec): """ Removes layers from model specification that are back to back transposes that compose to the identity. """ def blob_name_to_layers(nn_layers): """ output_to_layers: {str: layer_proto_message} : {blob name: layers that it feeds into} input_to_parent_layers: {str: layer_proto_message} : {blob name: parent layers that feed in} """ output_to_layers = {} for layer in nn_layers: for input in layer.input: if input not in output_to_layers: output_to_layers[input] = [layer] else: output_to_layers[input].append(layer) input_to_parent_layers = {} for layer in nn_layers: for output in layer.output: if not layer.WhichOneof("layer") == "copy": assert output not in input_to_parent_layers, \ "'{}' blob is generated by more than 1 layers".format(output) input_to_parent_layers[output] = layer return input_to_parent_layers, output_to_layers def _delete_layers(nn_spec, layers_to_delete): """ Given a neural network spec and pairs of transposes to remove, rewire the network to bypass those transposes and remove them from the spec. """ nn_layers = nn_spec.layers _, output_to_layers = blob_name_to_layers(nn_layers) # First pass: rewire layers to bypass those that will be deleted. 
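# Each entry in layers_to_delete is a consecutive run of transpose layers that
# composes to the identity; bypassing it means pointing every consumer of the
# last transpose's output back at the first transpose's input, so the run no
# longer carries live data and can be dropped in the second pass below.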
for layers in layers_to_delete: start_layer = layers[0] end_layer = layers[-1] # Replace children's input by layer_start's input children = output_to_layers[end_layer.output[0]] for child in children: idx = [ i for i, input in enumerate(child.input) if input == end_layer.output[0] ] assert len(idx) == 1 idx = idx[0] child.input[idx] = start_layer.input[0] # Second pass: delete the layers. for layers in layers_to_delete: for layer in layers: nn_layers.remove(layer) def _find_redundant_transposes(nn_spec): """ Search the neural network spec for sequence of transposes that together are the identity, and return a list of those sequence. """ nn_layers = nn_spec.layers layers_to_delete = [] input_to_parent_layers, output_to_layers = blob_name_to_layers(nn_layers) for layer in nn_layers: # Only start with the last element of the transpose layers sequence if not layer.WhichOneof("layer") == "transpose": continue if ( layer.output[0] in output_to_layers and len(output_to_layers[layer.output[0]]) == 1 and output_to_layers[layer.output[0]][0].WhichOneof("layer") == "transpose" ): continue # Get the transpose layers sequence layers = [] cursor = layer while True: if cursor.output[0] in output_to_layers: layers.append(cursor) if cursor.input[0] not in input_to_parent_layers: break cursor = input_to_parent_layers[cursor.input[0]] if cursor.WhichOneof("layer") != "transpose": break if len(output_to_layers[cursor.output[0]]) != 1: break layers = layers[::-1] if len(layers) == 0: continue # Optimize for the number of layers which can be merged using dynamic programming def solve_dp(layers): """ The resulting dp[i] means the maximum length of transpose sequence resulting in identity starting at index i For example, dp[0] = 0 means there is no sequence starting at 0 results in identity dp[10] = 5 means the longest identity sequence starts at 10 is 5, so [layers[10],layer[11],..,layer[14]] is the longest identity sequence start at 10. # dic: {tuple:int} # key is the net transpose axes pattern starting from the first layer # value is the highest id of the layer which has this pattern # e.g. if dic[(1,2,0)] = 34, it means that starting from the 1st layer, # the net transpose pattern `(1,2,0)` is last seen at layer id 34. No layer after 34-th # layer will result in the net pattern `(1,2,0)` """ dim = len(layers[0].transpose.axes) dp = [0] * len(layers) dic = {} axes = list(range(dim)) dic[tuple(axes)] = 0 for i in range(len(layers)): axes = [axes[k] for k in layers[i].transpose.axes] key = tuple(axes) if key in dic: dp[dic[key]] = i - dic[key] + 1 dic[key] = i + 1 for i in range(len(layers) - 1, -1, -1): j = i + dp[i] if j < len(layers): dp[i] = dp[i] + dp[j] return dp dp = solve_dp(layers) """ Once we know the maximum identity sequence starts at each index, we solve for the maximum total node we can remove. I think there must be lots of different solution for this, but I use DP again. sol_num[i] keeps track of the maximum number of nodes can be remove after index i For example, if sol_num[10] = 5, this means after index 10, we can at most remove 5 nodes. sol_bt[i] keeps the first starting point of identity sequence which results in the optimal solution after index i. For example, if sol_num[10] = 12, means that in order to get rid of the maximum number of nodes after 10, the first starting point is index 12. After construct sol_num and sol_bt by dynamic programming, we backtrack for the optimal solution using sol_bt. 
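For example, if dp = [2, 0, 2, 0], the backward pass below yields
sol_num = [4, 2, 2, 0] and sol_bt = [0, 2, 2, None]; backtracking from
index 0 picks the length-2 identity sequence starting at 0, jumps to
index 2, picks the length-2 sequence there, and stops, so all four
transpose layers get marked for deletion.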
""" sol_num = [0] * len(dp) sol_bt = [None] * len(dp) if dp[-1] != 0: sol_num[-1] = dp[-1] sol_bt[-1] = len(dp) - 1 for i in range(len(sol_num) - 2, -1, -1): if dp[i] == 0: sol_num[i] = sol_num[i + 1] sol_bt[i] = sol_bt[i + 1] else: num = dp[i] j = i + dp[i] if j < len(sol_num): num += sol_num[j] if num > sol_num[i + 1]: sol_num[i] = num sol_bt[i] = i else: sol_num[i] = sol_num[i + 1] sol_bt[i] = sol_bt[i + 1] # Get layers to delete using sol_bt cursor = 0 while cursor < len(dp): if sol_bt[cursor] is None: break cursor = sol_bt[cursor] tmp = [layers[i] for i in range(cursor, cursor + dp[cursor])] layers_to_delete.append(tmp) cursor += dp[cursor] return layers_to_delete nn_spec = _get_nn_spec(spec) layers_to_delete = _find_redundant_transposes(nn_spec) if len(layers_to_delete) > 0: _delete_layers(nn_spec, layers_to_delete) print("{} transpose pairs deleted".format(len(layers_to_delete))) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/nn/passes/test_mlmodel_passes.py0000644000000000000000000011053014672066616027734 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy import unittest from sys import platform import numpy as np import coremltools.models.datatypes as datatypes from coremltools import ComputeUnit from coremltools._deps import _IS_MACOS from coremltools.converters.mil.backend.nn.passes.mlmodel_passes import ( remove_disconnected_layers, remove_redundant_transposes, transform_conv_crop) from coremltools.models import MLModel from coremltools.models import neural_network as neural_network from coremltools.models.neural_network.printer import print_network_spec from coremltools.models.utils import _macos_version DEBUG = False np.random.seed(10) class MLModelPassesTest(unittest.TestCase): def test_load_constant_remove(self): input_features = [("data", datatypes.Array(*(3, 4)))] output_features = [("out", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_activation("relu1", "RELU", "data", "relu1") builder.add_load_constant_nd( "const1", "c1", constant_value=np.ones((5,)), shape=(5,) ) builder.add_activation("relu2", "RELU", "relu1", "out") builder.add_load_constant_nd( "const2", "c2", constant_value=np.ones((5,)), shape=(5,) ) builder.add_load_constant_nd( "const3", "c3", constant_value=np.ones((5,)), shape=(5,) ) spec = builder.spec np.testing.assert_equal(5, len(spec.neuralNetwork.layers)) remove_disconnected_layers(spec) np.testing.assert_equal(2, len(spec.neuralNetwork.layers)) def test_dead_layer_remove(self): input_features = [("data", datatypes.Array(*(3, 4)))] output_features = [("out", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_activation("relu1", "RELU", "data", "relu1") builder.add_load_constant_nd( "const1", "c1", constant_value=np.ones((5,)), shape=(5,) ) builder.add_load_constant_nd( "const2", "c2", constant_value=np.ones((5,)), shape=(5,) ) builder.add_split_nd( "splitnd1", "const2", ["s1", "s2", "s3"], axis=0, num_splits=3 ) builder.add_squeeze("squeeze", "s1", "squeeze_out") builder.add_activation("relu4", "RELU", "s2", "relu4") builder.add_activation("relu5", "RELU", "relu4", "relu5") 
builder.add_load_constant_nd( "const3", "c3", constant_value=np.ones((5,)), shape=(5,) ) builder.add_activation("relu2", "RELU", "relu1", "out") spec = builder.spec np.testing.assert_equal(9, len(spec.neuralNetwork.layers)) remove_disconnected_layers(spec) np.testing.assert_equal(2, len(spec.neuralNetwork.layers)) def test_dead_layer_remove_branch(self): convergence_tolerance = 1e-8 input_features = [("input", datatypes.Array(*(2,)))] output_features = [("out", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) # add condition to break from the loop, if convergence criterion is met builder.add_less_than("cond", ["input"], "cond", alpha=convergence_tolerance) branch_layer = builder.add_branch("branch_layer", "cond") builder_ifbranch = neural_network.NeuralNetworkBuilder( nn_spec=branch_layer.branch.ifBranch ) builder_ifbranch.add_activation("relu1", "RELU", "input", "relu1_out") builder_ifbranch.add_activation("relu2_out", "RELU", "relu1_out", "relu2_out") builder_elsebranch = neural_network.NeuralNetworkBuilder( nn_spec=branch_layer.branch.elseBranch ) builder_elsebranch.add_activation("linear1", "LINEAR", "input", "linear1_out") builder_elsebranch.add_activation( "linear2", "LINEAR", "linear1_out", "relu2_out" ) builder.add_squeeze("out", "input", "out", squeeze_all=True) mlmodel = MLModel(builder.spec, compute_units=ComputeUnit.CPU_ONLY) data = np.random.rand(2,) data_dict = {"input": data} if _IS_MACOS: before_pass_out = mlmodel.predict(data_dict)["out"] if DEBUG: print( "\n mlmodel description before remove disconnected layers pass: \n" ) print_network_spec(builder.spec, style="coding") remove_disconnected_layers(builder.spec) if DEBUG: print( "\n mlmodel description after remove disconnected layers pass: \n" ) print_network_spec(builder.spec, style="coding") mlmodel = MLModel(builder.spec, compute_units=ComputeUnit.CPU_ONLY) after_pass_out = mlmodel.predict(data_dict)["out"] np.testing.assert_almost_equal(before_pass_out, after_pass_out, decimal=2) np.testing.assert_equal(len(builder.spec.neuralNetwork.layers), 1) def test_dead_layer_partial_branch(self): convergence_tolerance = 1e-8 input_features = [("input", datatypes.Array(*(2,)))] output_features = [("out", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) # add condition to break from the loop, if convergence criterion is met builder.add_less_than("cond", ["input"], "cond", alpha=convergence_tolerance) branch_layer = builder.add_branch("branch_layer", "cond") builder_ifbranch = neural_network.NeuralNetworkBuilder( nn_spec=branch_layer.branch.ifBranch ) builder_ifbranch.add_activation("relu1", "RELU", "input", "relu1_out") builder_ifbranch.add_activation("relu2_out", "RELU", "relu1_out", "relu2_out") builder_elsebranch = neural_network.NeuralNetworkBuilder( nn_spec=branch_layer.branch.elseBranch ) builder_elsebranch.add_activation("linear1", "LINEAR", "input", "linear1_out") builder_elsebranch.add_activation( "linear_red_1", "LINEAR", "input", "linear_red1_out" ) builder_elsebranch.add_activation( "linear_red_2", "LINEAR", "linear_red1_out", "linear_red2_out" ) builder_elsebranch.add_activation( "linear2", "LINEAR", "linear1_out", "relu2_out" ) builder.add_squeeze("out", "relu2_out", "out", squeeze_all=True) mlmodel = MLModel(builder.spec, compute_units=ComputeUnit.CPU_ONLY) if not _IS_MACOS: # Can not get predictions unless on macOS. 
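# Only the relu1 -> relu2 chain reaches the network output "out"; the constant,
# split, squeeze, and relu4/relu5 layers never do, so remove_disconnected_layers
# is expected to shrink the 9 layers down to 2.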
return data = np.random.rand(2,) data_dict = {"input": data} before_pass_out = mlmodel.predict(data_dict)["out"] if DEBUG: print("\n mlmodel description before remove disconnected layers pass: \n") print_network_spec(builder.spec, style="coding") old_spec = copy.copy(builder.spec) remove_disconnected_layers(builder.spec) if DEBUG: print("\n mlmodel description after remove disconnected layers pass: \n") print_network_spec(builder.spec, style="coding") mlmodel = MLModel(builder.spec, compute_units=ComputeUnit.CPU_ONLY) after_pass_out = mlmodel.predict(data_dict)["out"] np.testing.assert_almost_equal(before_pass_out, after_pass_out, decimal=2) np.testing.assert_equal( len(old_spec.neuralNetwork.layers[1].branch.ifBranch.layers), len(builder.spec.neuralNetwork.layers[1].branch.ifBranch.layers), ) np.testing.assert_equal( len(builder.spec.neuralNetwork.layers[1].branch.elseBranch.layers), 2 ) def test_conv_crop_bn_to_conv_bn_crop(self): input_features = [("data", datatypes.Array(1, 10, 10))] output_features = [("out", None)] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) W = np.ones((1, 2, 2, 2), dtype=np.float32) builder.add_convolution( name="conv", kernel_channels=1, output_channels=2, height=2, width=2, stride_height=1, stride_width=1, border_mode="valid", groups=1, W=W, b=None, has_bias=False, input_name="data", output_name="conv_out", ) builder.add_crop( name="crop", left=1, right=1, top=1, bottom=1, offset=0, input_names=["conv_out"], output_name="crop_out", ) builder.add_batchnorm( name="bn", channels=2, gamma=np.ones(2,).astype(np.float32), beta=np.ones(2,).astype(np.float32), mean=np.ones(2,).astype(np.float32), variance=np.ones(2,).astype(np.float32), input_name="crop_out", output_name="out", ) # Conv -> Crop -> BN spec = builder.spec.neuralNetwork np.testing.assert_equal("crop", spec.layers[1].WhichOneof("layer")) np.testing.assert_equal("batchnorm", spec.layers[2].WhichOneof("layer")) # Predict if _IS_MACOS: mlmodel = MLModel(builder.spec, dict, compute_units=ComputeUnit.CPU_ONLY) data = np.random.rand(1, 10, 10) data_dict = {"data": data} before_pass_out = mlmodel.predict(data_dict)["out"] # transform the pattern transform_conv_crop(builder.spec) # Conv -> BN -> Crop np.testing.assert_equal("batchnorm", spec.layers[1].WhichOneof("layer")) np.testing.assert_equal("crop", spec.layers[2].WhichOneof("layer")) if _IS_MACOS: # Predict mlmodel = MLModel(builder.spec, compute_units=ComputeUnit.CPU_ONLY) after_pass_out = mlmodel.predict(data_dict)["out"] np.testing.assert_almost_equal(before_pass_out, after_pass_out, decimal=3) def test_conv_crop_bn_relu_to_conv_bn_relu_crop(self): input_features = [("data", datatypes.Array(1, 10, 10))] output_features = [("out", None)] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) W = np.ones((1, 2, 2, 2), dtype=np.float32) builder.add_convolution( name="conv", kernel_channels=1, output_channels=2, height=2, width=2, stride_height=1, stride_width=1, border_mode="valid", groups=1, W=W, b=None, has_bias=False, input_name="data", output_name="conv_out", ) builder.add_crop( name="crop", left=1, right=1, top=1, bottom=1, offset=0, input_names=["conv_out"], output_name="crop_out", ) builder.add_batchnorm( name="bn", channels=2, gamma=np.ones(2,).astype(np.float32), beta=np.ones(2,).astype(np.float32), mean=np.ones(2,).astype(np.float32), variance=np.ones(2,).astype(np.float32), input_name="crop_out", output_name="bn_out", ) builder.add_activation( name="relu", non_linearity="RELU", 
input_name="bn_out", output_name="out" ) # Conv -> Crop -> BN -> ReLU spec = builder.spec.neuralNetwork np.testing.assert_equal("crop", spec.layers[1].WhichOneof("layer")) np.testing.assert_equal("batchnorm", spec.layers[2].WhichOneof("layer")) np.testing.assert_equal("activation", spec.layers[3].WhichOneof("layer")) # Predict if _IS_MACOS: mlmodel = MLModel(builder.spec, compute_units=ComputeUnit.CPU_ONLY) data = np.random.rand(1, 10, 10) data_dict = {"data": data} before_pass_out = mlmodel.predict(data_dict)["out"] # transform the pattern transform_conv_crop(builder.spec) # Conv -> BN -> ReLU -> Crop np.testing.assert_equal("batchnorm", spec.layers[1].WhichOneof("layer")) np.testing.assert_equal("activation", spec.layers[2].WhichOneof("layer")) np.testing.assert_equal("crop", spec.layers[3].WhichOneof("layer")) # Predict mlmodel = MLModel(builder.spec, compute_units=ComputeUnit.CPU_ONLY) if _IS_MACOS: after_pass_out = mlmodel.predict(data_dict)["out"] np.testing.assert_almost_equal(before_pass_out, after_pass_out, decimal=3) @unittest.skipIf( platform != "darwin" or _macos_version() < (10, 15), "Requires MacOS 10.15 or later" ) class Redundant_Transposees_Test(unittest.TestCase): def _test_builder(self, builder, input_shape, expected_layer_num=None): data = np.random.rand(*input_shape) # Mlmodel before mlmodel = MLModel(builder.spec, compute_units=ComputeUnit.CPU_ONLY) output_before = mlmodel.predict({"data": data})["out"] num_layers_before = len(builder.spec.neuralNetwork.layers) remove_redundant_transposes(builder.spec) layers = builder.spec.neuralNetwork.layers if expected_layer_num is None: self.assertTrue(len(layers) < num_layers_before) else: self.assertEqual(len(layers), expected_layer_num) # Mlmodel after mlmodel = MLModel(builder.spec, compute_units=ComputeUnit.CPU_ONLY) output_after = mlmodel.predict({"data": data})["out"] np.testing.assert_almost_equal(output_before, output_after, decimal=3) def test_output_edge_case(self): # For now for safety purpose, the node which are output shouldn't be merged input_shape = (1, 10, 5) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("out", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_transpose( name="first_transpose", axes=[2, 0, 1], input_name="data", output_name="first_transpose_out", ) builder.add_transpose( name="second_transpose", axes=[1, 2, 0], input_name="first_transpose_out", output_name="out", ) self._test_builder(builder, input_shape, 2) def test_output_edge_case_2(self): # For now for safety purpose, the node which are output shouldn't be merged input_shape = (1, 10, 5) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("out", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_transpose( name="ranspose", axes=[1, 2, 0], input_name="data", output_name="out" ) self._test_builder(builder, input_shape, 1) def test_remove_single_identity_transpose(self): # A single identity transpose (like 0,1,2) should also be removed input_shape = (1, 10, 5) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("out", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_transpose( name="uselss_transpose", axes=[0, 1, 2], input_name="data", output_name="useless_transpose_out", ) builder.add_activation( name="relu", 
non_linearity="RELU", input_name="useless_transpose_out", output_name="out", ) self._test_builder(builder, input_shape, 1) def test_remove_three_transpose(self): # Three transpose layer which can be removed input_shape = (1, 10, 5) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("out", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) transpose = [[2, 1, 0], [1, 0, 2], [2, 0, 1]] input_name = "data" for i, axes in enumerate(transpose): name = "transpose_" + str(i) output_name = name + "_out" builder.add_transpose( name=name, axes=axes, input_name=input_name, output_name=output_name ) input_name = output_name builder.add_activation( name="relu", non_linearity="RELU", input_name=input_name, output_name="out" ) self._test_builder(builder, input_shape, 1) def test_remove_thousands_identity_transpose(self): """ INPUT | v [t1] | v [t2] | v . . . | v [t1000] | v RELU tk are all identity Remove a sequence of 1000 identity transpose """ input_shape = (1, 10, 5) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("out", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) num_layers = 1000 input_name = "data" for i in range(num_layers): output_name = "layer_" + str(i) + "_output" name = "layer_" + str(i) builder.add_transpose( name=name, axes=[0, 1, 2], input_name=input_name, output_name=output_name, ) input_name = output_name builder.add_activation( name="relu", non_linearity="RELU", input_name=input_name, output_name="out" ) self._test_builder(builder, input_shape, 1) def test_remove_thousands_identity_transpose_with_activation_between(self): """ INPUT | v [t1] | v . . . [t500] | v RELU_1 | v . . . | v [t1000] | v RELU_2 tk are all identity Remove a sequence of 1000 identity transpose but with a RELU in the middle, the final output should be INPUT | v RELU_1 | v RELU_2 """ input_shape = (1, 10, 5) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("out", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) num_layers = 1000 input_name = "data" for i in range(num_layers): output_name = "layer_" + str(i) + "_output" name = "layer_" + str(i) builder.add_transpose( name=name, axes=[0, 1, 2], input_name=input_name, output_name=output_name, ) input_name = output_name if i == num_layers / 2: builder.add_activation( name="relu_inter", non_linearity="ReLU", input_name=input_name, output_name="relu_out", ) input_name = "relu_out" builder.add_activation( name="relu", non_linearity="RELU", input_name=input_name, output_name="out" ) self._test_builder(builder, input_shape, 2) def test_remove_thousands_random_transpose_layers(self): """ INPUT | v [t_0] | v [t_1] | v . . . 
| v [t_999] | v RELU tk are randomly generated, under this certain seed, the result should be INPUT | v [t_0] | v [t_1] | v RELU """ import random from itertools import permutations random.seed(1000) input_shape = (3, 10, 5) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("out", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) num_layers = 1000 dim = 3 input_name = "data" debug = [] for i in range(num_layers): axes = list(permutations(range(dim))) random.shuffle(axes) output_name = "layer_" + str(i) + "_output" name = "layer_" + str(i) debug.append(axes[0]) builder.add_transpose( name=name, axes=axes[0], input_name=input_name, output_name=output_name ) input_name = output_name builder.add_activation( name="relu", non_linearity="RELU", input_name=input_name, output_name="out" ) self._test_builder(builder, input_shape, None) def test_remove_thousands_random_transpose_layers_case_2(self): """ Same test as the previous one, but add more layers and dimension. """ import random from itertools import permutations random.seed(0) input_shape = (3, 10, 5, 2, 4) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("out", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) num_layers = 5000 dim = 5 input_name = "data" for i in range(num_layers): axes = list(permutations(range(dim))) random.shuffle(axes) output_name = "layer_" + str(i) + "_output" name = "layer_" + str(i) builder.add_transpose( name=name, axes=axes[0], input_name=input_name, output_name=output_name ) input_name = output_name builder.add_activation( name="relu", non_linearity="RELU", input_name=input_name, output_name="out" ) self._test_builder(builder, input_shape, None) def test_branch_structure(self): """ INPUT | v [t_0] | v [t_1] | v [t_3] --. | | v v [t_4] RELU_1 | v [t_5] | v RELU_2 t_0, t_1, t_3 can be merged. t_4, t_5 can be merged. The output should be INPUT | .------. | | v v RELU_2 RELU_1 """ input_shape = (1, 10, 5) input_features = [("data", datatypes.Array(1, 10, 5))] output_features = [("out", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) transpose = [[2, 1, 0], [2, 1, 0], [0, 1, 2], [2, 0, 1], [1, 2, 0]] input_name = "data" for i, axes in enumerate(transpose): name = "transpose_" + str(i) output_name = name + "_out" builder.add_transpose( name=name, axes=axes, input_name=input_name, output_name=output_name ) input_name = output_name builder.add_activation( name="relu", non_linearity="RELU", input_name=input_name, output_name="out" ) builder.add_activation( name="dumpy", non_linearity="RELU", input_name="transpose_2_out", output_name="dumpy", ) self._test_builder(builder, input_shape, 2) def test_branch_case_2(self): """ INPUT | v [t_0] --. | | v v [t_1] RELU_1 | v RELU_2 Even though t_0, t_1 can be merged, but there is a branch from t_0, so we shouldn't remove anything here. 
""" input_shape = (1, 10, 5) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("out", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) transpose = [[2, 1, 0], [2, 1, 0]] input_name = "data" for i, axes in enumerate(transpose): name = "transpose_" + str(i) output_name = name + "_out" builder.add_transpose( name=name, axes=axes, input_name=input_name, output_name=output_name ) input_name = output_name builder.add_activation( name="relu", non_linearity="RELU", input_name=input_name, output_name="out" ) builder.add_activation( name="dumpy", non_linearity="RELU", input_name="transpose_0_out", output_name="dumpy", ) self._test_builder(builder, input_shape, 4) def test_fork_structure_case_3(self): """ INPUT | v [t_0] | v [t_1]--. | | | v | RELU_1 | v [t_2]--. | | | v | RELU_2 [t_3] | v [t_4]--. | | | v | RELU_3 v RELU_4 Even though t_0, t_1 can be merged, t_2 is identity, t_3, t_4 can be merge, The final output should be INPUT | .------------.----------. | | | | v v v v RELU_1 RELU_2 RELU_3 RELU_4 """ input_shape = (1, 10, 5) input_features = [("data", datatypes.Array(1, 10, 5))] output_features = [("out", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) transpose = [[2, 1, 0], [2, 1, 0], [0, 1, 2], [2, 1, 0], [2, 1, 0]] input_name = "data" for i, axes in enumerate(transpose): name = "transpose_" + str(i) output_name = name + "_out" builder.add_transpose( name=name, axes=axes, input_name=input_name, output_name=output_name ) input_name = output_name builder.add_activation( name="relu", non_linearity="RELU", input_name=input_name, output_name="out" ) builder.add_activation( name="dumpy_1", non_linearity="RELU", input_name="transpose_1_out", output_name="dumpy_1", ) builder.add_activation( name="dumpy_2", non_linearity="RELU", input_name="transpose_2_out", output_name="dumpy_2", ) builder.add_activation( name="dumpy_4", non_linearity="RELU", input_name="transpose_4_out", output_name="dumpy_4", ) self._test_builder(builder, input_shape, 4) def test_fork(self): """ INPUT | .------.------. | | v v [t_1] [t_3] | | v v [t_2] [t_4] | | v v RELU_1 RELU_2 t_1,t_2 can be merged and t_3,t_4 can be merged. The result output would be INPUT | .------.------. | | v v RELU_1 RELU_2 """ input_shape = (1, 10, 5) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("out", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) transpose = [[2, 1, 0], [2, 1, 0]] input_name = "data" for i, axes in enumerate(transpose): name = "transpose_" + str(i) output_name = name + "_out" builder.add_transpose( name=name, axes=axes, input_name=input_name, output_name=output_name ) input_name = output_name builder.add_activation( name="relu", non_linearity="RELU", input_name=input_name, output_name="out" ) input_name = "data" for i, axes in enumerate(transpose): name = "transpose_branch_2_" + str(i) output_name = name + "_out" builder.add_transpose( name=name, axes=axes, input_name=input_name, output_name=output_name ) input_name = output_name builder.add_activation( name="relu_branch_2", non_linearity="RELU", input_name=input_name, output_name="out_branch_2", ) self._test_builder(builder, input_shape, 2) def test_fork_and_add(self): """ INPUT | .------.------. | | v v [t_1] [t_3] | | v v [t_2] [t_4] | | .-----. .-----. 
| | v v Add t_1,t_2 can be merged and t_3,t_4 can be merged. The result output would be INPUT | .------.------. | | .-----. .-----. | | v v Add """ input_shape = (1, 10, 5) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("out", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) transpose = [[2, 1, 0], [2, 1, 0]] input_name = "data" for i, axes in enumerate(transpose): name = "transpose_" + str(i) output_name = name + "_out" builder.add_transpose( name=name, axes=axes, input_name=input_name, output_name=output_name ) input_name = output_name input_1 = input_name input_name = "data" for i, axes in enumerate(transpose): name = "transpose_branch_2_" + str(i) output_name = name + "_out" builder.add_transpose( name=name, axes=axes, input_name=input_name, output_name=output_name ) input_name = output_name input_2 = input_name builder.add_add_broadcastable( name="add", input_names=[input_1, input_2], output_name="out" ) self._test_builder(builder, input_shape, 1) def test_transpose(self): def _build_and_test_network(input_size, transpose_layers, expected_layers): """ Helper function for testing transpose removal. Args: input_size: Size of the input network tensor. transpose_layers: Array of transpose axes definitions. expected_layers: Array of indices into transpose_layers indicating which of the transpose layers should be present after the graph pass. """ input_features = [("data", datatypes.Array(*input_size))] output_features = [("out", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features ) spec = builder.spec.neuralNetwork.layers last_layer = "data" for idx, axes in enumerate(transpose_layers): name = "t{}".format(idx) if idx == len(transpose_layers) - 1: output_name = "out" else: output_name = name + "_out" builder.add_transpose( name=name, axes=axes, input_name=last_layer, output_name=output_name ) last_layer = output_name spec = builder.spec.neuralNetwork # Check the network before the graph pass. for idx in range(len(transpose_layers)): np.testing.assert_equal( "transpose", spec.layers[idx].WhichOneof("layer") ) # Run the removal pass. remove_redundant_transposes(builder.spec) # Verify only the expected layers remain. np.testing.assert_equal(len(spec.layers), len(expected_layers)) for output_layer_idx, input_layer_idx in enumerate(expected_layers): np.testing.assert_equal( "transpose", spec.layers[output_layer_idx].WhichOneof("layer") ) np.testing.assert_array_equal( transpose_layers[input_layer_idx], spec.layers[output_layer_idx].transpose.axes, ) _build_and_test_network( input_size=[1, 10, 10], # These transposes are not inverses. transpose_layers=[[2, 0, 1], [2, 0, 1]], expected_layers=[0, 1], ) _build_and_test_network( input_size=[1, 1, 10, 10, 3], # First two are the identity, then an extra. transpose_layers=[[2, 4, 1, 0, 3], [3, 2, 0, 4, 1], [1, 0, 2, 3, 4]], expected_layers=[2], ) # A slightly more complicated test case where there are two transposes # in topological order, but are actually in parallel in the graph. builder = neural_network.NeuralNetworkBuilder( [("data", datatypes.Array(2, 4, 8))], [("out", None)] ) builder.add_transpose( name="t1", axes=[0, 2, 1], input_name="data", output_name="t1" ) builder.add_transpose( name="t2", axes=[0, 2, 1], input_name="data", output_name="t2" ) builder.add_stack(name="stack", input_names=["t1", "t2"], output_name="out") spec = builder.spec.neuralNetwork # Run the removal pass. 
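# t1 and t2 both read "data" directly and only feed the stack layer, so no
# back-to-back pair composes to the identity and the pass should leave all
# three layers in place.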
remove_redundant_transposes(builder.spec) # Verify nothing was removed. np.testing.assert_equal(len(spec.layers), 3) if __name__ == "__main__": RUN_ALL_TESTS = True if RUN_ALL_TESTS: unittest.main() else: suite = unittest.TestSuite() suite.addTest(MLModelPassesTest("test_load_constant_remove")) unittest.TextTestRunner().run(suite) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/backend/nn/passes/test_passes.py0000644000000000000000000001706614672066616026235 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy import itertools import numpy as np import pytest from coremltools.converters.mil import testing_reqs from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.passes.pass_registry import PASS_REGISTRY from coremltools.converters.mil.testing_utils import ( apply_pass_and_basic_check, assert_model_is_valid, assert_same_output_names, get_op_types_in_program, ) backends = testing_reqs.backends class TestConv1dDeompositionPasses: @pytest.mark.parametrize( "backend, has_strides, pad_type, has_pad, has_dilations, has_bias", itertools.product( backends, (True, False), ("valid", "custom", "same"), (True, False), (True, False), (True, False), ), ) def test_conv1d_decomposition( self, backend, has_strides, pad_type, has_pad, has_dilations, has_bias ): """ Input graph: input -> expand_dims -> conv2d -> squeeze -> out Output graph: input -> conv1d -> out """ N, L = 2, 8 C_in, C_out = 3, 4 K = 3 conv_kwargs = {"weight": np.random.rand(C_out, C_in, K), "pad_type": pad_type} if has_strides: conv_kwargs["strides"] = (2,) if has_pad: conv_kwargs["pad"] = (1, 1) if has_dilations: conv_kwargs["dilations"] = (2,) if has_bias: conv_kwargs["bias"] = np.random.rand(C_out) @mb.program(input_specs=[mb.TensorSpec(shape=(N, C_in, L))]) def prog(x): y = mb.conv(x=x, **conv_kwargs) return y assert get_op_types_in_program(prog) == ["conv"] prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "nn_backend::decompose_conv1d" ) assert get_op_types_in_program(prog) == ["expand_dims", "expand_dims", "conv", "squeeze"] prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::const_elimination") assert get_op_types_in_program(prog) == ["expand_dims", "conv", "squeeze"] # infer output shape strides = conv_kwargs["strides"] if has_strides else (1,) pad = conv_kwargs["pad"] if has_pad else (0, 0) dilations = conv_kwargs["dilations"] if has_dilations else (1,) L_out = None if pad_type == "valid": L_out = (L - dilations[-1] * (K - 1) - 1) // strides[-1] + 1 elif pad_type == "custom": L_out = (L + pad[-2] + pad[-1] - dilations[-1] * (K - 1) - 1) // strides[-1] + 1 elif pad_type == "same": L_out = np.ceil(L / strides[-1]) else: raise Exception("unsupported pad type") output_shape = (N, C_out, L_out) assert_model_is_valid( prog, {"x": (N, C_in, L)}, expected_output_shapes={block.outputs[0].name: output_shape}, backend=backend, ) @pytest.mark.parametrize("backend", backends) def test_conv1d_decomposition_dynamic_weight(self, backend): """ Input graph: input -> expand_dims -> conv2d -> squeeze -> out Output graph: input -> conv1d -> out """ N, L = 2, 9 C_in, C_out = 4, 3 K = 4 strides = (2,) pad = (1, 1) # MIL convolution with dynamic weights does not support dilations != 1 # see 
coremltools/coremltools/converters/mil/mil/ops/defs/iOS15/conv.py dilations = (1,) # infer L_out with pad_type fixed to custom L_out = (L + pad[-2] + pad[-1] - dilations[-1] * (K - 1) - 1) // strides[-1] + 1 conv_kwargs = { "strides": strides, "pad_type": "custom", "pad": pad, "dilations": dilations, } @mb.program( input_specs=[ mb.TensorSpec(shape=(N, C_in, L)), mb.TensorSpec(shape=(C_out, C_in, K)), ] ) def prog(x, weight): y = mb.conv(x=x, weight=weight, **conv_kwargs) return y assert get_op_types_in_program(prog) == ["conv"] prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "nn_backend::decompose_conv1d" ) assert get_op_types_in_program(prog) == ["expand_dims", "expand_dims", "conv", "squeeze"] prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::const_elimination") assert get_op_types_in_program(prog) == ["expand_dims", "expand_dims", "conv", "squeeze"] output_shape = (N, C_out, L_out) assert_model_is_valid( prog, {"x": (N, C_in, L), "weight": (C_out, C_in, K)}, expected_output_shapes={block.outputs[0].name: output_shape}, backend=backend, ) def test_commingle_loop_vars(): def body(a, b): # b is a loop invariant return mb.add(x=a, y=b), b def cond(a, b): a_mean = mb.reduce_mean(x=a, axes=[0, 1]) b_mean = mb.reduce_mean(x=b, axes=[0, 1]) return mb.less(x=a_mean, y=b_mean) @mb.program( input_specs=[mb.TensorSpec(shape=(1, 2)), mb.TensorSpec(shape=(1, 2)),] ) def prog(a, b): return mb.while_loop(_cond=cond, _body=body, loop_vars=(a, b)) while_op = prog.find_ops(op_type="while_loop", exactly_one=True)[0] assert while_op.blocks[0].inputs[0].name == "a_x0" assert while_op.blocks[0].inputs[1].name == "b_x0" prev_prog = copy.deepcopy(prog) PASS_REGISTRY["nn_backend::commingle_loop_vars"](prog) assert_same_output_names(prev_prog, prog) while_op = prog.find_ops(op_type="while_loop", exactly_one=True)[0] assert while_op.blocks[0].inputs[0].name == while_op.outputs[0].name assert while_op.blocks[0].inputs[1].name == while_op.outputs[1].name prog.validate() # The program is not ssa and thus cannot be converted def test_handle_return_inputs_as_outputs(): @mb.program( input_specs=[mb.TensorSpec(shape=(1, 2)), mb.TensorSpec(shape=(1, 2)),] ) def prog(a, b): return mb.mul(x=a, y=2.), b prev_main_output_names = [o.name for o in prog["main"].outputs] assert prog["main"].outputs[1].op is None # output comes from input prev_prog = copy.deepcopy(prog) PASS_REGISTRY["nn_backend::handle_return_inputs_as_outputs"](prog) assert_same_output_names(prev_prog, prog) assert prog["main"].outputs[1].op is not None # output comes from an op assert prog["main"].outputs[1].op.op_type == "identity" with pytest.raises(ValueError, match='used both as function\'s input and output'): # prog has input and output names 'b' that refer to different vars # This program can pass if we disable 'dedup_op_and_var_names' pass assert_model_is_valid(prog, {"a": (1, 2), "b": (1, 2)}) def test_handle_unused_inputs(): @mb.program( input_specs=[mb.TensorSpec(shape=(1, 2)),] ) def prog(unused_input): return mb.const(val=[3, 2]) prev_prog = copy.deepcopy(prog) PASS_REGISTRY["nn_backend::handle_unused_inputs"](prog) assert_same_output_names(prev_prog, prog) id_op = prog.find_ops(op_type="identity", exactly_one=True)[0] # Assert that input var is consumed by an identity op. 
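# The handle_unused_inputs pass is what inserts that identity, since
# "unused_input" had no consumers in the original program.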
assert id_op in prog["main"].inputs["unused_input"].child_ops assert_model_is_valid(prog, {"unused_input": (1, 2)}) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/conftest.py0000644000000000000000000000071414672066616022215 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause def pytest_make_parametrize_id(config, val, argname): ''' This function is a hook into pytest. It generates a user friendly string representation of the parameterized values. ''' return "{}={}".format(argname, str(val)) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/converter.py0000644000000000000000000003034214672066616022377 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import tempfile as _tempfile import warnings as _warnings from typing import Optional, Text, Tuple from coremltools.converters._profile_utils import _profile from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Program from coremltools.converters.mil.mil.types.symbolic import k_num_internal_syms, k_used_symbols from coremltools.models import MLModel from coremltools.models.model import _create_mlpackage from . import ImageType, InputType from .mil.passes.pass_pipeline import PassPipeline, PassPipelineManager class ConverterRegistry: frontends = {} backends = {} backend_alias_names = {} @staticmethod def frontend(converter): ConverterRegistry.frontends[converter.name] = converter return converter @staticmethod def backend(converter): ConverterRegistry.backends[converter.name] = converter if 'alias_names' in converter.__dict__: for name in converter.alias_names: ConverterRegistry.backend_alias_names[name] = converter.name return converter @ConverterRegistry.frontend class MILFrontend: name = "milinternal" def __call__(self, model, *args, **kwargs): specification_version = kwargs.get("specification_version", None) if specification_version is not None: max_opset_version, op = model._get_max_opset_version_and_op() if max_opset_version > specification_version: msg = ( "Please update the minimum_deployment_target to coremltools.target.{}," " since op {} is only available in opset coremltools.target.{} or newer." ).format(max_opset_version.name, op.op_type, max_opset_version.name) raise ValueError(msg) if "inputs" in kwargs and kwargs["inputs"] is not None: inputs = kwargs["inputs"] if not isinstance(inputs, (list, tuple)): raise ValueError( "Type of inputs should be list or tuple, got {} instead.".format( type(inputs) ) ) if not all([isinstance(i, InputType) for i in inputs]): raise ValueError( "Type of inputs should be list or tuple of TensorType or ImageType, got {} instead.".format( [type(i) for i in inputs] ) ) for idx, inp in enumerate(inputs): # We set the default image format in MIL as NCHW, since only NCHW is # natively supported by MIL ops (ex. Conv/Pool/etc.) 
if isinstance(inp, ImageType) and inputs[idx].channel_first is None: inputs[idx].channel_first = True model.functions["main"].set_input_types(tuple(inputs)) return model @ConverterRegistry.frontend class TensorFlowFrontend: name = "tensorflow" def __call__(self, *args, **kwargs): from .frontend.tensorflow.load import TF1Loader tf1_loader = TF1Loader(*args, **kwargs) return tf1_loader.load() @ConverterRegistry.frontend class TensorFlow2Frontend: name = "tensorflow2" def __call__(self, *args, **kwargs): from .frontend.tensorflow2.load import TF2Loader tf2_loader = TF2Loader(*args, **kwargs) return tf2_loader.load() @ConverterRegistry.frontend class TorchFrontend: name = "pytorch" def __call__(self, *args, **kwargs): from .frontend.torch.load import load return load(*args, **kwargs) @ConverterRegistry.backend class NNProtoBackend: name = "neuralnetwork" alias_names = [] def __call__(self, *args, **kwargs): from .backend.nn.load import load return load(*args, **kwargs) @ConverterRegistry.backend class MILProtoBackend: name = "mlprogram" alias_names = [] def __call__(self, *args, **kwargs): from .backend.mil.load import load as backend_load return backend_load(*args, **kwargs) def _reset_conversion_state(): ''' Reset any stateful properties/variables that are populated during conversion. ''' # Clear the "name_count" dict, # which is used to generate unique op names in the mil builder class. mb.name_count.clear() # Clear "k_used_symbols" dict, and the int counter "k_num_internal_syms" that are used to track symbolic names global k_used_symbols global k_num_internal_syms k_used_symbols.clear() k_num_internal_syms = 0 @_profile def mil_convert( model, convert_from, convert_to, compute_units, **kwargs ): """ Convert model from a specified frontend `convert_from` to a specified converter backend `convert_to`. Parameters ---------- model: TF, PyTorch, or `coremltools.converters.mil.Program`. See `coremltools.converters.convert` convert_from: str The value must be one of ['tensorflow', 'tensorflow2', 'pytorch', 'milinternal'] (aka name of a `ConverterRegistry.frontend`). compute_units: coremltools.ComputeUnit A enum with three possible values: - coremltools.ComputeUnit.ALL - use all compute units available, including the neural engine. - coremltools.ComputeUnit.CPU_ONLY - limit the model to only use the CPU. - coremltools.ComputeUnit.CPU_AND_GPU - use both the CPU and GPU, but not the neural engine. convert_to: str Value must be one of ['neuralnetwork', 'mlprogram', 'milinternal'] See `coremltools.converters.convert` Returns ------- model: `coremltools.models.MLModel` or `coremltools.converters.mil.Program` See `coremltools.converters.convert` """ return _mil_convert(model, convert_from, convert_to, ConverterRegistry, MLModel, compute_units, **kwargs) def _mil_convert( model, convert_from, convert_to, registry, modelClass, compute_units, **kwargs ): # Map "convert_to" values that correspond to the alias_names, to the actual supported registries if convert_to in registry.backend_alias_names: msg = "Please use '{}' instead of '{}' with the 'convert_to' argument. The latter will be removed in the future." 
_warnings.warn(msg.format(registry.backend_alias_names[convert_to], convert_to)) convert_to = registry.backend_alias_names[convert_to] if convert_to == 'mlprogram': # mil_convert_to_proto places weight files inside the weights_dir weights_dir = _tempfile.TemporaryDirectory() kwargs["weights_dir"] = weights_dir.name proto, mil_program = mil_convert_to_proto( model, convert_from, convert_to, registry, **kwargs ) _reset_conversion_state() if convert_to == 'milinternal': return mil_program # mil program elif convert_to == 'milpython': return proto # internal mil data structure elif convert_to == "mlprogram": package_path = _create_mlpackage( proto, kwargs.get("weights_dir"), kwargs.get("package_dir") ) return modelClass( package_path, is_temp_package=not kwargs.get("package_dir"), mil_program=mil_program, skip_model_load=kwargs.get("skip_model_load", False), compute_units=compute_units, ) return modelClass( proto, mil_program=mil_program, skip_model_load=kwargs.get("skip_model_load", False), compute_units=compute_units, ) def mil_convert_to_proto( model, convert_from, convert_to, converter_registry, main_pipeline=None, **kwargs ) -> Tuple[Optional[MLModel], Program]: """ Convert model to proto object. Parameters ---------- model: See `mil_convert` convert_from: See `mil_convert` convert_to: See `mil_convert` converter_registry: `ConverterRegistry` Available frontend and backend converters main_pipeline: `PassPipeline` The main pipeline with options set by users. """ frontend_converter_type = converter_registry.frontends.get(convert_from.lower()) if not frontend_converter_type: raise NotImplementedError( f'Frontend converter "{convert_from}" not implemented, must be ' f"one of: {list(converter_registry.frontends.keys())}" ) kwargs.setdefault("convert_from", convert_from) kwargs.setdefault("convert_to", convert_to) if main_pipeline is None: # If the client calls `mil_convert` directly, the `pass_pipeline` is None. To keep the # behaviour same as before, the quantization pass is removed in this situation. # TODO: rdar://106111553 ([Infra] Quantization Pass is skipped when `mil_convert` is called directly.) main_pipeline = kwargs.get("pass_pipeline", PassPipeline()) main_pipeline.remove_passes({"common::add_fp16_cast", "common::add_int16_cast"}) frontend_pipeline, backend_pipeline = _construct_other_pipelines( main_pipeline, convert_from, convert_to ) frontend_converter = frontend_converter_type() prog = frontend_converter(model, **kwargs) PassPipelineManager.apply_pipeline(prog, frontend_pipeline) PassPipelineManager.apply_pipeline(prog, main_pipeline) if convert_to == 'milinternal': return None, prog PassPipelineManager.apply_pipeline(prog, backend_pipeline) prog._check_early_error_out_for_invalid_program() backend_converter_type = converter_registry.backends.get(convert_to.lower()) if not backend_converter_type: raise NotImplementedError( f'Backend converter "{convert_to}" not implemented, must be ' f"one of: {list(converter_registry.backends.keys())}" ) backend_converter = backend_converter_type() out = backend_converter(prog, **kwargs) return out, prog def _construct_other_pipelines( main_pipeline: PassPipeline, convert_from: Text, convert_to: Text ) -> Tuple[PassPipeline, PassPipeline]: """ Construct other pipelines based on the main pipeline. 
It includes: - The frontend pipeline which will run in the frontend converter - The backend pipeline which will run in the backend converter As the main pipeline could have passes which also exists in the frontend/backend passes, we need to make sure the pass options are set properly in all pipelines. For example, if users set options to skip some vars in `const_elimination` pass, we want to make sure those vars are skipped not only in main_pipeline, but also in other pipelines wherever the `const_elimination` pass runs. TODO: rdar://106046237 ([Infra] Expose Backend and Frontend Pipeline to External Users) Currently users only control the passes in the main pipeline by passing `pass_pipeline` param. There are two reasons why we don't expose the frontend/backend pipelines at the current stage: - The frontend and backend specific passes need to be well documented. - The interface need more carefully design, as we don't want to provide too many params such as ct.convert(..., frontend_pipeline=xxx, backend_pipelien=xxx, main_pipeline=xxx) to overwhelm users. """ # Set the main pipeline options specified by the user in frontend/backend pipeline. frontend_pipeline = PassPipeline.get_pipeline(f"frontend_{convert_from.lower()}") frontend_pipeline.set_options_by_another_pipeline(main_pipeline) backend_pipeline = PassPipeline.get_pipeline(f"backend_{convert_to.lower()}") backend_pipeline.set_options_by_another_pipeline(main_pipeline) # If a pass is skipped in the main pipeline, we also skip it in the frontend/backend pipeline. default_main_pipeline = PassPipeline.get_pipeline("default") passes_skipped_in_main = set(default_main_pipeline.passes) - set(main_pipeline.passes) frontend_pipeline.remove_passes(passes_skipped_in_main) backend_pipeline.remove_passes(passes_skipped_in_main) return frontend_pipeline, backend_pipeline ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/debugging_utils.py0000644000000000000000000001535214672066616023547 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy from collections import OrderedDict from typing import List, Optional import coremltools as ct from coremltools.models import MLModel from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import PASS_REGISTRY from coremltools.converters.mil.frontend.milproto.load import \ load as milproto_to_pymil def extract_submodel( model: MLModel, outputs: List[str], inputs: Optional[List[str]] = None, function_name: str = "main" ) -> MLModel: """ This utility function lets you extract a submodel from a Core ML model. For a neural network model, the function extracts only in-memory Core ML models. You should always call this function for a model directly from :py:class:`~coremltools.converters._converters_entry.convert`. It is not allowed to load the model from disk and then call this API. For an ML program model, both cases (in-memory and from disk) are supported. Parameters ---------- model: MLModel The Core ML model from which the submodel is extracted. outputs: list[str] A list of names of Vars, which are the outputs of the extracted submodel. 
inputs: list[str] (Optional) A list of names of Vars, which are the inputs of the extracted submodel. If not provided, the inputs from the original model are used. function_name: str (Optional) Name of the function where the subgraph is extracted. Default is ``main``. Examples -------- Neural network: >>> from coremltools.converters.mil.debugging_utils import extract_submodel >>> mlmodel = ct.convert(model, convert_to="neuralnetwork") >>> outputs = ["output_0", "output_1"] >>> submodel = extract_submodel(mlmodel, outputs) ML program: >>> from coremltools.converters.mil.debugging_utils import extract_submodel >>> mlmodel = ct.convert(model, convert_to="mlprogram") >>> outputs = ["output_0", "output_1"] >>> >>> # Directly extract model in memory >>> submodel = extract_submodel(mlmodel, outputs) >>> >>> # Extract model loaded from disk >>> mlmodel.save("model.mlpackage") >>> mlmodel = coremltools.model.models.MLModel("model.mlpackage") >>> submodel = extract_submodel(mlmodel, outputs) """ def validate_inputs(func, input_vars): reachable_vars = set(input_vars) for op in func.operations: if op.op_type == "const": reachable_vars.add(op.outputs[0]) for op in func.operations: if all([x in reachable_vars for x in op.inputs.values()]): reachable_vars.update(op.outputs) for out in func.outputs: if out not in reachable_vars: raise ValueError(f"output {output} not reachable from inputs") @block_context_manager def replace_inputs(func, input_vars): func_inputs = {} for input in input_vars: name = input.name func_inputs[name] = mb.placeholder(input.shape, dtype=input.dtype) func.replace_uses_of_var_after_op( anchor_op=input.op, old_var=input, new_var=func_inputs[name].outputs[0], ) func._input_dict = OrderedDict() for k, v in func_inputs.items(): v.set_name(k) func._input_dict[k] = v.outputs[0] if not isinstance(outputs, (list, tuple)): raise ValueError(f"outputs must be of type list/tuple. Got {type(outputs)}.") for output in outputs: if not isinstance(output, str): raise ValueError(f"outputs must be a list of str. Got element {output} with type {type(output)}.") if outputs.count(output) > 1: raise ValueError(f"outputs must be a list of unique elements. 
'{output}' occurs {outputs.count(output)} times.") model_spec = model.get_spec() backend = "mlprogram" if model_spec.WhichOneof("Type") == "mlProgram" else "neuralnetwork" if backend == "neuralnetwork": if model._mil_program is None: raise ValueError("NeuralNetwork model loaded from the disk is not supported by the extract_submodel util.") program = model._mil_program else: assert backend == "mlprogram" if model._mil_program is None: program = milproto_to_pymil( model_spec=model_spec, specification_version=model_spec.specificationVersion, file_weights_dir=model.weights_dir, ) else: program = model._mil_program # extract subgraph prog = copy.deepcopy(program) func = prog.functions[function_name] vars = {} new_outputs = [] for op in func.operations: for o in op.outputs: if o.name in outputs: new_outputs.append(o) vars[o.name] = o if len(outputs) != len(new_outputs): new_outputs_names = [o.name for o in new_outputs] outputs_not_found = [name for name in outputs if name not in new_outputs_names] raise ValueError(f"outputs {outputs_not_found} not found in the function.") func.set_outputs(new_outputs) # Clean up the graph PASS_REGISTRY["common::dead_code_elimination"](prog) # If the inputs are provided, we subtract the subgraph starting from them if inputs is not None: if not isinstance(inputs, (list, tuple)): raise ValueError(f"inputs must be of type list/tuple. Got {type(inputs)}.") input_vars = [] for input in inputs: if not isinstance(input, str): raise ValueError(f"inputs must be a list of str. Got element {input} with type {type(input)}.") if inputs.count(input) > 1: raise ValueError(f"inputs must be a list of unique elements. '{input}' occurs {inputs.count(input)} times.") if input not in vars and input not in func.inputs: raise ValueError(f"input {input} not found in the function.") if input in vars: input_vars.append(vars[input]) if input in func.inputs: input_vars.append(func.inputs[input]) validate_inputs(func, input_vars) replace_inputs(func, input_vars) PASS_REGISTRY["common::dead_code_elimination"](prog) prog.skip_all_passes = True submodel = ct.convert(prog, convert_to=backend, compute_units=model.compute_unit) return submodel ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2095466 coremltools-8.0/coremltools/converters/mil/experimental/0000755000000000000000000000000014672075535022511 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/experimental/__init__.py0000644000000000000000000000033214672066616024620 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2135465 coremltools-8.0/coremltools/converters/mil/experimental/passes/0000755000000000000000000000000014672075535024007 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/experimental/passes/__init__.py0000644000000000000000000000033214672066616026116 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/experimental/passes/generic_conv_batchnorm_fusion.py0000644000000000000000000001366014672066616032450 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os import numpy as np from coremltools.converters.mil import Builder as mb from coremltools.converters.mil.experimental.passes.generic_pass_infrastructure import \ register_generic_pass """ Fuse the following batch_norm layer into conv and conv_transpose That is, convert conv + batch_norm to conv, by modifying the weight and bias in the conv layer Given: %2 = conv(%1) ... %3 = batch_norm(%2) ... Result: %3 = conv(%1) ... """ arbitrary_cin = 5 arbitrary_cout = 8 np.random.seed() arbitrary_input = (3, arbitrary_cin, 224, 224) arbitrary_weight = np.random.rand(arbitrary_cout, arbitrary_cin, 10, 10) arbitrary_mean= np.random.rand(arbitrary_cout) arbitrary_variance = np.random.rand(arbitrary_cout) if os.getenv("ENABLE_EXPERIMENTAL_PASSES") == "1": @mb.program(input_specs=[mb.TensorSpec(shape=arbitrary_input)]) def conv_batchnorm(x): conv = mb.conv(x=x, weight=arbitrary_weight, pad_type="valid", name="conv") batch_norm = mb.batch_norm(x=conv, mean=arbitrary_mean, variance=arbitrary_variance, name="batchnorm") return batch_norm if os.getenv("ENABLE_EXPERIMENTAL_PASSES") == "1": @mb.program(input_specs=[mb.TensorSpec(shape=arbitrary_input)]) def conv_transpose_batchorm(x): conv = mb.conv_transpose(x=x, weight=arbitrary_weight, pad_type="valid", name="conv") batch_norm = mb.batch_norm(x=conv, mean=arbitrary_mean, variance=arbitrary_variance, name="batchnorm") return batch_norm def var_constraints(pattern): return pattern.conv.weight.val is not None def transform_pattern(pattern): # get parameters from batch_norm layer gamma = pattern.batchnorm.gamma.val beta = pattern.batchnorm.beta.val mean = pattern.batchnorm.mean.val variance = pattern.batchnorm.variance.val epsilon = pattern.batchnorm.epsilon.val # get weight, bias and groups from conv layer conv_weight = pattern.conv.weight.val conv_bias = pattern.conv.bias groups = pattern.conv.groups.val # get type of the conv layer is_deconv = pattern.conv.op_type == 'conv_transpose' is_conv_1d = len(conv_weight.shape) == 3 # D_in denotes the spatial dimensions for conv kernel weight # for conv_transpose, conv_weight has shape [Cin, Cout / groups, *D_in] # for conv, conv_weight has shape [Cout, Cin / groups, *D_in] if is_deconv: Cout = conv_weight.shape[1] * groups Cin = conv_weight.shape[0] else: Cout = conv_weight.shape[0] Cin = conv_weight.shape[1] * groups # get the type of the conv weight conv_weight_type = conv_weight.dtype # create bias for conv if not exist if conv_bias is None: conv_bias = np.zeros(Cout) else: conv_bias = conv_bias.val conv_bias = conv_bias.astype(conv_weight_type) # get the original shape of weight and bias origin_weight_shape = conv_weight.shape origin_bias_shape = conv_bias.shape # update the weight for conv layer new_conv_weight = [] new_conv_bias = [] if is_deconv: conv_weight = np.transpose(conv_weight, [1, 0, 2] if is_conv_1d else [1, 0, 2, 3]) conv_weight = np.reshape(conv_weight, [Cout, Cin // 
groups] + list(conv_weight.shape[2:])) for i in range(Cout): # get batch norm parameters for each channel _gamma = gamma[i] _beta = beta[i] _mean = mean[i] _variance = variance[i] _scale = _gamma / np.sqrt(_variance + epsilon) # get conv weight and bias for each channel _conv_weight = conv_weight[i] _conv_bias = conv_bias[i] # update the conv weight and bias _conv_weight = _conv_weight * _scale _conv_bias = _scale * (_conv_bias - _mean) + _beta new_conv_weight.append(_conv_weight) new_conv_bias.append(_conv_bias) new_conv_weight = np.array(new_conv_weight).astype(conv_weight_type) new_conv_bias = np.array(new_conv_bias).astype(conv_weight_type) if is_deconv: new_conv_weight = np.reshape(new_conv_weight, [Cout // groups, Cin] + list(new_conv_weight.shape[2:])) new_conv_weight = np.transpose(new_conv_weight, [1, 0, 2] if is_conv_1d else [1, 0, 2, 3]) # make sure the updated weight and bias have the same shape as the original ones assert new_conv_weight.shape == origin_weight_shape, "conv weight should have the same shape before and after the fuse_conv_batchnorm pass." assert new_conv_bias.shape == origin_bias_shape, "conv bias should have the same shape before and after the fuse_conv_batchnorm pass." # create a new conv op with the new bias value, copying rest of the attributes out_name = pattern.batchnorm.outputs[0].name conv_kargs = {"weight": new_conv_weight, "bias": new_conv_bias, "name": out_name, "before_op": pattern.conv} for k, v in pattern.conv.inputs.items(): if k in ["weight", "bias"]: continue conv_kargs[k] = v if is_deconv: x = mb.conv_transpose(**conv_kargs) else: x = mb.conv(**conv_kargs) pattern.batchnorm.enclosing_block.replace_uses_of_var_after_op( anchor_op=pattern.batchnorm, old_var=pattern.batchnorm.outputs[0], new_var=x ) # Remove all the ops at once pattern.block.remove_ops(pattern.op_list()) if os.getenv('ENABLE_EXPERIMENTAL_PASSES') == '1': register_generic_pass( ops_arrangement=conv_batchnorm, var_constraints=var_constraints, transform_pattern=transform_pattern, pass_name="fuse_conv_batchnorm", namespace="common", ) register_generic_pass( ops_arrangement=conv_transpose_batchorm, var_constraints=var_constraints, transform_pattern=transform_pattern, pass_name="fuse_conv_batchnorm", namespace="common", ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/experimental/passes/generic_conv_bias_fusion.py0000644000000000000000000003035314672066616031407 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os import numpy as np from coremltools import _logger as logger from coremltools.converters.mil import Builder as mb from coremltools.converters.mil.experimental.passes.generic_pass_infrastructure import \ register_generic_pass from coremltools.converters.mil.mil import types """ Fold add/sub into bias of conv and conv_transpose That is, convert conv + add/sub to conv, when add/sub is adding a constant There are two main patterns supported now. The first one is: Pattern 1: Given: %2 = conv(%1) ... %3 = add(%2, constant) # where constant has shape (1,C,1)/(C,1) for 1d conv, (1,C,1,1)/(C,1,1) for 2d conv etc ... Result: %3 = conv(%1) ... The second one is: Pattern 2: Given: %2 = conv(%1) %3 = transpose(%2) ... %4 = add(%3, constant) # where constant has a broacasable shape ... 
Result: %2 = conv(%1) %4 = transpose(%2) ... When taking all of the conv/conv_tranpose, transpose/no transpose, and add/sub into account, We end up with a total of 8 patterns (2^3). These patterns are parameterized by the pattern_to_detect function below. """ arbitrary_cin = 5 arbitrary_cout = 8 arbitrary_scalar = 5 np.random.seed() arbitrary_perm = [0,1,2,3] arbitrary_input = (3, arbitrary_cin, 224, 224) arbitrary_weight = np.random.rand(arbitrary_cout, arbitrary_cin, 10, 10) def pattern_to_detect(conv_transpose, transpose, sub): """ Wrapper to create 8 patterns to detect for conciseness. """ @mb.program(input_specs=[mb.TensorSpec(shape=arbitrary_input)]) def conv_bias_pattern(x): if not conv_transpose: conv = mb.conv(x=x, weight=arbitrary_weight, pad_type="valid", name="conv") else: conv = mb.conv_transpose(x=x, weight=arbitrary_weight, pad_type="valid", name="conv") if transpose: transpose_layer = mb.transpose(x=conv, perm=arbitrary_perm, name="transpose") if sub: add_or_sub = mb.sub(x=transpose_layer if transpose else conv, y=arbitrary_scalar, name="add_or_sub") else: add_or_sub = mb.add(x=transpose_layer if transpose else conv, y=arbitrary_scalar, name="add_or_sub") return add_or_sub return conv_bias_pattern def var_constraints(pattern): bias_value = _get_bias_var(pattern).val rank = pattern.conv.x.rank is_bias_scalar = True if not isinstance(bias_value, np.ndarray) else False old_bias = pattern.conv.inputs.get("bias", None) old_bias_value = old_bias.val if old_bias is not None and old_bias.val is not None else None passed = True passed = passed and isinstance(bias_value, (np.ndarray, np.generic)) passed = passed and rank is not None passed = passed and (rank == 3 or rank == 4 or rank == 5) # check compatibility of bias value with the rank of the conv op # either bias value should be a scalar or: # rank=3 ==> (B,C,D), which means bias must be (1,C,1) or (C,1) # rank=4 ==> (B,C,D1,D2), which means bias must be (1,C,1,1) or (C,1,1) # rank=5 ==> (B,C,D1,D2,D3), which means bias must be (1,C,1,1,1) or (C,1,1,1) if not is_bias_scalar: # check that there is at most one dimension in the shape that is not 1 passed = passed and len(np.squeeze(bias_value).shape) <= 1 # check that addition is not happening on the batch dimension passed = passed and (len(bias_value) != rank or bias_value.shape[0] == 1) # check that last rank-2 entries in the shape vector are all 1s passed = passed and np.prod(bias_value.shape[-(rank - 2):]) == 1 bias_value = np.array([bias_value]) if is_bias_scalar else np.squeeze(bias_value) passed = passed and ( old_bias is not None or np.prod(bias_value.shape) != 1 or pattern.conv.weight.val is not None ) if old_bias is not None: try: new_bias_value = old_bias_value + bias_value except: return False return passed def var_constraints_tranpose(pattern): bias = pattern.add_or_sub.x.val if pattern.add_or_sub.x.val is not None else pattern.add_or_sub.y.val Cout = pattern.conv.outputs[0].shape[1] passed = True passed = passed and pattern.add_or_sub.x.val is not None or pattern.add_or_sub.y.val is not None passed = passed and _bias_mod_and_validity(bias, Cout, pattern) is not None return passed def transform_pattern(pattern): bias_value = _get_bias_var(pattern).val is_conv_op = (pattern.conv.op_type == "conv") is_bias_scalar = False if not isinstance(bias_value, np.ndarray): is_bias_scalar = True bias_value = np.array([bias_value]) if is_bias_scalar else np.squeeze(bias_value) if pattern.add_or_sub.op_type == "sub": bias_value *= -1 # everything looks good, now find the new updated 
bias old_bias = pattern.conv.inputs.get("bias", None) old_bias_value = None if old_bias is not None and old_bias.val is not None: old_bias_value = old_bias.val if old_bias is None: # need to create a fresh numpy array for bias if np.prod(bias_value.shape) == 1: # its a scalar bias # need to find the value of Cout to form a new bias # conv_transpose has weight format [K, C_out, spatial dims] # conv has weight format [C_out, K, spatial dims] Cout = pattern.conv.weight.val.shape[0 if is_conv_op else 1] new_bias_value = np.broadcast_to(bias_value, (Cout,)) else: new_bias_value = bias_value else: # just need to update the existing bias array new_bias_value = old_bias_value + bias_value # create a new conv op with the new bias value, copying rest of the attributes out_name = pattern.add_or_sub.outputs[0].name if new_bias_value.dtype != np.float32 and new_bias_value.dtype != np.float16: # cast the bias to match the weight type weight_np_type = types.nptype_from_builtin(pattern.conv.inputs["weight"].sym_type.get_primitive()) logger.warning("conv_bias_fusion pass: casting bias " "from {} to {} to match the dtype of the weight of the conv layer".format( new_bias_value.dtype, weight_np_type ) ) new_bias_value = new_bias_value.astype(weight_np_type) new_bias_var = mb.const(val=new_bias_value, before_op=pattern.conv) conv_kargs = {"bias": new_bias_var, "name": out_name, "before_op": pattern.conv} for k, v in pattern.conv.inputs.items(): if k == "bias": continue conv_kargs[k] = v if is_conv_op: x = mb.conv(**conv_kargs) else: x = mb.conv_transpose(**conv_kargs) pattern.add_or_sub.enclosing_block.replace_uses_of_var_after_op( anchor_op=pattern.add_or_sub, old_var=pattern.add_or_sub.outputs[0], new_var=x ) # Remove all the ops at once pattern.block.remove_ops(pattern.op_list()) def transform_transpose_pattern(pattern): is_deconv = pattern.conv.op_type == "conv_transpose" # get the bias bias = pattern.add_or_sub.x.val if pattern.add_or_sub.x.val is not None else pattern.add_or_sub.y.val is_first_input = pattern.add_or_sub.y.val is not None is_sub = pattern.add_or_sub.op_type == "sub" # get the conv bias/weight conv_shape = pattern.conv.outputs[0].shape Cout = conv_shape[1] conv_weight = pattern.conv.weight.val conv_weight_type = conv_weight.dtype conv_bias = np.zeros(Cout).astype(conv_weight_type) if pattern.conv.bias is None else pattern.conv.bias.val bias = _bias_mod_and_validity(bias, Cout, pattern) # compute the new bias if is_sub: if is_first_input: bias = -bias else: conv_bias = -conv_bias new_bias = conv_bias + bias # compute the new weight if is_sub and not is_first_input: new_weight = -conv_weight else: new_weight = conv_weight # create a new conv op with the new weight, bias value, copying rest of the attributes conv_kargs = {"weight": new_weight, "bias": new_bias, "before_op": pattern.conv} for k, v in pattern.conv.inputs.items(): if k in ["weight", "bias"]: continue conv_kargs[k] = v if is_deconv: x = mb.conv_transpose(**conv_kargs) else: x = mb.conv(**conv_kargs) # create a new transpose op out_name = pattern.add_or_sub.outputs[0].name tranpose_kargs = {"x": x, "name": out_name, "before_op": pattern.transpose} for k, v in pattern.transpose.inputs.items(): if k == "x": continue tranpose_kargs[k] = v x = mb.transpose(**tranpose_kargs) pattern.add_or_sub.enclosing_block.replace_uses_of_var_after_op( anchor_op=pattern.add_or_sub, old_var=pattern.add_or_sub.outputs[0], new_var=x ) # Remove all the ops at once pattern.block.remove_ops(pattern.op_list()) def _bias_mod_and_validity(bias, Cout, 
pattern): # check if the bias is compatible for fusion is_bias_scalar = True if isinstance(bias, np.ndarray): if bias.shape == (): bias = bias.tolist() elif np.prod(bias.shape) == 1: bias = np.squeeze(bias).tolist() else: is_bias_scalar = False if not is_bias_scalar: if np.prod(bias.shape) != Cout: return None rank = pattern.transpose.outputs[0].rank cout_dim = pattern.transpose.perm.val.tolist().index(1) - rank if bias.shape[cout_dim] != Cout: return None bias = np.reshape(bias, (Cout)) return bias def _get_bias_var(pattern): if pattern.add_or_sub.op_type == "sub": bias_var = pattern.add_or_sub.y else: bias_var = pattern.add_or_sub.x if pattern.add_or_sub.x.val is not None else pattern.add_or_sub.y return bias_var if os.getenv("ENABLE_EXPERIMENTAL_PASSES") == "1": # conv -> add register_generic_pass( ops_arrangement=pattern_to_detect(False, False, False), var_constraints=var_constraints, transform_pattern=transform_pattern, pass_name="fuse_conv_bias", namespace="common", ) # conv -> sub register_generic_pass( ops_arrangement=pattern_to_detect(False, False, True), var_constraints=var_constraints, transform_pattern=transform_pattern, pass_name="fuse_conv_bias", namespace="common", ) # conv_transpose -> add register_generic_pass( ops_arrangement=pattern_to_detect(True, False, False), var_constraints=var_constraints, transform_pattern=transform_pattern, pass_name="fuse_conv_bias", namespace="common", ) # conv_transpose -> sub register_generic_pass( ops_arrangement=pattern_to_detect(True, False, True), var_constraints=var_constraints, transform_pattern=transform_pattern, pass_name="fuse_conv_bias", namespace="common", ) # conv -> transpose -> add register_generic_pass( ops_arrangement=pattern_to_detect(False, True, False), var_constraints=var_constraints_tranpose, transform_pattern=transform_transpose_pattern, pass_name="fuse_conv_bias", namespace="common", ) # conv -> transpse -> sub register_generic_pass( ops_arrangement=pattern_to_detect(False, True, True), var_constraints=var_constraints_tranpose, transform_pattern=transform_transpose_pattern, pass_name="fuse_conv_bias", namespace="common", ) # conv_transpose -> transpose -> add register_generic_pass( ops_arrangement=pattern_to_detect(True, True, False), var_constraints=var_constraints_tranpose, transform_pattern=transform_transpose_pattern, pass_name="fuse_conv_bias", namespace="common", ) # conv_transpose -> transpose -> sub register_generic_pass( ops_arrangement=pattern_to_detect(True, True, True), var_constraints=var_constraints_tranpose, transform_pattern=transform_transpose_pattern, pass_name="fuse_conv_bias", namespace="common", ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/experimental/passes/generic_conv_scale_fusion.py0000644000000000000000000002057414672066616031564 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os import numpy as np from coremltools.converters.mil import Builder as mb from coremltools.converters.mil.experimental.passes.generic_pass_infrastructure import \ register_generic_pass """ Fold mul/div into conv/conv_transpose by updating the weight/bias of the convolution layers. 
The scale const can be a single number (scalar) or a vector with a broacasable shape, for instance, if the output of the conv/deconv layer is (B, Cout, H, W), const of shape (Cout, 1, 1) and (1, Cout, 1, 1) are allowed. Given: %2 = conv(%1) ... %3 = mul(%2, constant) # where constant is the scale constant ... Result: %3 = conv(%1) ... """ arbitrary_cin = 5 arbitrary_cout = 8 arbitrary_scalar = 5 np.random.seed() arbitrary_input = (3, arbitrary_cin, 224, 224) arbitrary_weight = np.random.rand(arbitrary_cout, arbitrary_cin, 10, 10) if os.getenv("ENABLE_EXPERIMENTAL_PASSES") == "1": @mb.program(input_specs=[mb.TensorSpec(shape=arbitrary_input)]) def conv_scale_mul(x): conv = mb.conv(x=x, weight=arbitrary_weight, pad_type="valid", name="conv") mul = mb.mul(x=conv, y=arbitrary_scalar, name="scale") return mul if os.getenv("ENABLE_EXPERIMENTAL_PASSES") == "1": @mb.program(input_specs=[mb.TensorSpec(shape=arbitrary_input)]) def conv_transpose_scale_mul(x): conv = mb.conv_transpose(x=x, weight=arbitrary_weight, pad_type="valid", name="conv") mul = mb.mul(x=conv, y=arbitrary_scalar, name="scale") return mul if os.getenv("ENABLE_EXPERIMENTAL_PASSES") == "1": @mb.program(input_specs=[mb.TensorSpec(shape=arbitrary_input)]) def conv_scale_div(x): conv = mb.conv(x=x, weight=arbitrary_weight, pad_type="valid", name="conv") real_div = mb.real_div(x=conv, y=arbitrary_scalar, name="scale") return real_div if os.getenv("ENABLE_EXPERIMENTAL_PASSES") == "1": @mb.program(input_specs=[mb.TensorSpec(shape=arbitrary_input)]) def conv_transpose_scale_div(x): conv = mb.conv_transpose(x=x, weight=arbitrary_weight, pad_type="valid", name="conv") real_div = mb.real_div(x=conv, y=arbitrary_scalar, name="scale") return real_div def _cin_cout(pattern): # D_in denotes the spatial dimensions for conv kernel weight # for conv_transpose, conv_weight has shape [Cin, Cout / groups, *D_in] # for conv, conv_weight has shape [Cout, Cin / groups, *D_in] is_deconv = pattern.conv.op_type == "conv_transpose" groups = pattern.conv.groups.val conv_weight = pattern.conv.weight.val if is_deconv: Cout = conv_weight.shape[1] * groups Cin = conv_weight.shape[0] else: Cout = conv_weight.shape[0] Cin = conv_weight.shape[1] * groups return Cin, Cout def _is_scalar(pattern): # for the scalar case, the scalar can be either # 1. a python int/float # 2. a 0d numpy array # 3. 
a 1d numpy array with shape (1,) scale_var = pattern.scale.x if pattern.scale.x.val is not None else pattern.scale.y scale = scale_var.val is_scalar = True if isinstance(scale, np.ndarray): if scale.shape == (): scale = scale.tolist() elif scale.shape == (1) or scale.shape == (1,): scale = scale[0] else: is_scalar = False return is_scalar def var_constraints(pattern): passed = True passed = passed and pattern.scale.x.val is not None or pattern.scale.y.val is not None passed = passed and pattern.conv.weight.val is not None is_scalar = _is_scalar(pattern) Cin, Cout = _cin_cout(pattern) scale_var = pattern.scale.x if pattern.scale.x.val is not None else pattern.scale.y scale = scale_var.val # for the vector scale case, check if the shape is broacastable if not is_scalar: conv_weight = pattern.conv.weight.val passed = passed and ( np.prod(scale.shape) == Cout or (len(scale.shape) == len(conv_weight.shape) and scale.shape[1] == Cout) or (len(scale.shape) == len(conv_weight.shape) - 1 and scale.shape[0] == Cout) ) return passed def transform_pattern(pattern): # get the scale scale_var = pattern.scale.x if pattern.scale.x.val is not None else pattern.scale.y scale = scale_var.val is_scalar = _is_scalar(pattern) # get weight and bias and groups from conv layer conv_weight = pattern.conv.weight.val conv_bias = pattern.conv.bias groups = pattern.conv.groups.val # get type of the conv layer is_deconv = pattern.conv.op_type == "conv_transpose" is_conv_1d = len(conv_weight.shape) == 3 Cin, Cout = _cin_cout(pattern) # transform the scale to 1./scale for the real_div case if pattern.scale.op_type == "real_div": scale = 1.0 / scale # get the type of the conv weight conv_weight_type = conv_weight.dtype # create bias for conv if not exist if conv_bias is None: conv_bias = np.zeros(Cout) else: conv_bias = conv_bias.val conv_bias = conv_bias.astype(conv_weight_type) # get the original shape of weight and bias origin_weight_shape = conv_weight.shape origin_bias_shape = conv_bias.shape # update the weight/bias for conv layer if is_scalar: new_conv_bias = np.array(conv_bias * scale).astype(conv_weight_type) new_conv_weight = np.array(conv_weight * scale).astype(conv_weight_type) else: scale = np.reshape(scale, (Cout)) new_conv_bias = np.array(conv_bias * scale).astype(conv_weight_type) new_conv_weight = [] if is_deconv: conv_weight = np.transpose(conv_weight, [1, 0, 2] if is_conv_1d else [1, 0, 2, 3]) conv_weight = np.reshape(conv_weight, [Cout, Cin // groups] + list(conv_weight.shape[2:])) for i in range(Cout): _conv_weight = conv_weight[i] * scale[i] new_conv_weight.append(_conv_weight) new_conv_weight = np.array(new_conv_weight).astype(conv_weight_type) if is_deconv: new_conv_weight = np.reshape(new_conv_weight, [Cout // groups, Cin] + list(new_conv_weight.shape[2:])) new_conv_weight = np.transpose(new_conv_weight, [1, 0, 2] if is_conv_1d else [1, 0, 2, 3]) # make sure the updated weight and bias have the same shape as the original ones assert new_conv_weight.shape == origin_weight_shape, "conv weight should have the same shape before and after the fuse_conv_scale pass." assert new_conv_bias.shape == origin_bias_shape, "conv bias should have the same shape before and after the fuse_conv_scale pass." 
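# ---------------------------------------------------------------------------
# A minimal numpy sketch (standalone, hypothetical shapes) of the identity the
# channel-wise folding above relies on: convolution is linear, so scaling the
# conv output per output channel equals scaling the weight rows and the bias.
# Shown for the simplest case of a 1x1 kernel, where conv reduces to an einsum
# over the channel axis:
#
#   import numpy as np
#   x = np.random.rand(2, 3, 8, 8)        # (B, Cin, H, W)
#   W = np.random.rand(4, 3)              # (Cout, Cin), 1x1 kernel
#   b = np.random.rand(4)                 # (Cout,)
#   s = np.random.rand(4)                 # per-output-channel scale
#   y      = np.einsum("oc,nchw->nohw", W, x) + b[None, :, None, None]
#   y_fold = np.einsum("oc,nchw->nohw", W * s[:, None], x) + (b * s)[None, :, None, None]
#   assert np.allclose(y * s[None, :, None, None], y_fold)
# ---------------------------------------------------------------------------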
# create a new conv op with the new weight, bias value, copying rest of the attributes out_name = pattern.scale.outputs[0].name conv_kargs = { "weight": new_conv_weight, "bias": new_conv_bias, "name": out_name, "before_op": pattern.conv, } for k, v in pattern.conv.inputs.items(): if k in ["weight", "bias"]: continue conv_kargs[k] = v if is_deconv: x = mb.conv_transpose(**conv_kargs) else: x = mb.conv(**conv_kargs) pattern.scale.enclosing_block.replace_uses_of_var_after_op( anchor_op=pattern.scale, old_var=pattern.scale.outputs[0], new_var=x ) # Remove all the ops at once pattern.block.remove_ops(pattern.op_list()) if os.getenv("ENABLE_EXPERIMENTAL_PASSES") == "1": register_generic_pass( ops_arrangement=conv_scale_mul, var_constraints=var_constraints, transform_pattern=transform_pattern, pass_name="fuse_conv_scale", namespace="common", ) register_generic_pass( ops_arrangement=conv_transpose_scale_mul, var_constraints=var_constraints, transform_pattern=transform_pattern, pass_name="fuse_conv_scale", namespace="common", ) register_generic_pass( ops_arrangement=conv_scale_div, var_constraints=var_constraints, transform_pattern=transform_pattern, pass_name="fuse_conv_scale", namespace="common", ) register_generic_pass( ops_arrangement=conv_transpose_scale_div, var_constraints=var_constraints, transform_pattern=transform_pattern, pass_name="fuse_conv_scale", namespace="common", ) ././@PaxHeader0000000000000000000000000000021700000000000010215 xustar00121 path=coremltools-8.0/coremltools/converters/mil/experimental/passes/generic_layernorm_instancenorm_pattern_fusion.py 22 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/experimental/passes/generic_layernorm_instancenorm_patter0000644000000000000000000005104714672066616033604 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os import numpy as np from coremltools.converters.mil import Builder as mb from coremltools.converters.mil.experimental.passes.generic_pass_infrastructure import \ register_generic_pass from coremltools.converters.mil.mil import get_new_symbol if os.getenv("ENABLE_EXPERIMENTAL_PASSES") == "1": shape = (get_new_symbol(), get_new_symbol(), get_new_symbol(), get_new_symbol()) def _check_reduce_op(reduce_op, mode="reduce_mean") -> bool: """ Check whether or not the reduction op satisfy following conditions: - Mode is expected. - Does not change rank (keep_dims is True). - Axes are known at compile time. 
:param reduce_op: reduce op to check on :param mode: reduce mode """ if reduce_op is None: return False if reduce_op.op_type != mode: return False if reduce_op.keep_dims.val is False: return False if reduce_op.axes is None or reduce_op.axes.val is None: return False return True if os.getenv("ENABLE_EXPERIMENTAL_PASSES") == "1": @mb.program(input_specs=[mb.TensorSpec(shape=shape)]) def instancenorm_or_layernorm(x): """ Identify the pattern: y = gamma * (x - mean) / sqrt(variance + epsilon) + beta y = x * [gamma * rsqrt(variance + eps)] + (beta - mean * [gamma * rsqrt(variance + eps)]) x --> main_reduce --> sub --> square --> reduce_mean_2 --> add(epsilon) --> rsqrt | | ^ | | | | V |----------------------- mul (gamma) | | | | | --------|--------- | | | | | | | V | |------------------------------------------------------------------> mul_3 | | | | V | |----------------------------------------------------------------> mul_2 | | V | sub (beta) --> add_2 --> [...] | ^ |------------------------------- This pattern corresponds to either layer_norm or instance_norm. It is instance_norm if all of the following are true: - input is rank 4 - axes of reduce_mean is [-2, -1] or [-3, -2] (when [-3, -2], a channel first to channel last transpose would be inserted) - gamma and beta are rank 1, after squeeze It is layer_norm if all of the following are true: - axes is either [-1] or [-1, -2] or [-1, -2, -3] and so on - rank of gamma and beta is equal to the length of the axes """ main_reduce = mb.reduce_mean(x=x, axes=[2, 3], keep_dims=True, name="main_reduce") sub = mb.sub(x=x, y=main_reduce, name="sub") square = mb.square(x=sub, name="square") reduce_mean_2 = mb.reduce_mean(x=square, axes=[2, 3], keep_dims=True, name="reduce_mean_2") add_epsilon = mb.add(x=reduce_mean_2, y=1e-5, name="add_epsilon") rsqrt = mb.rsqrt(x=add_epsilon, epsilon=1e-12, name="rsqrt") mul_gamma = mb.mul(x=rsqrt, y=np.random.rand(1, 5, 1, 1), name="mul_gamma") mul_2 = mb.mul(x=x, y=mul_gamma, name="mul_2") mul_3 = mb.mul(x=main_reduce, y=mul_gamma, name="mul_3") sub_beta = mb.sub(x=np.random.rand(1, 5, 1, 1), y=mul_3, name="sub_beta") add_2 = mb.add(x=sub_beta, y=mul_2, name="add_2") return add_2 if os.getenv("ENABLE_EXPERIMENTAL_PASSES") == "1": @mb.program(input_specs=[mb.TensorSpec(shape=shape)]) def instancenorm_2(x): """ Identify the pattern: y = (x - mean) / pow(variance + epsilon) * gamma + beta This pattern corresponds to, should be fused as instance_norm. All of the following must be satisfy: 1) Input is rank 4 tensor 2) Reduce operates on spatial dimensions axes=[-2, -1], or axes=[-3, -2] (a channel first to channel last transpose would be inserted in such case) 3) Gamma and beta are both shape (C,) after squeeze, where C is number of channels |----> sub0 ----------| const (0.5) | ^ | | | | V V x ---> main_reduce square --> mean1 --> add_eps ---> pow const_gamma const_beta | | | | | | V V V V |----> sub1 --------------------------------------> real_div --> mul_gamma --> add_beta --> ... 
""" main_reduce = mb.reduce_mean(x=x, axes=[2, 3], keep_dims=True, name="main_reduce") sub0 = mb.sub(x=x, y=main_reduce, name="sub0") sub1 = mb.sub(x=x, y=main_reduce, name="sub1") square = mb.square(x=sub0, name="square") mean1 = mb.reduce_mean(x=square, axes=[2, 3], keep_dims=True, name="mean1") add_epsilon = mb.add(x=mean1, y=1e-5, name="add_epsilon") pow = mb.pow(x=add_epsilon, y=0.5, name="pow") real_div = mb.real_div(x=sub1, y=pow, name="real_div") mul_gamma = mb.mul(x=np.random.rand(1, 5, 1, 1), y=real_div, name="mul_gamma") add_beta = mb.add(x=np.random.rand(1, 5, 1, 1), y=mul_gamma, name="add_beta") return add_beta if os.getenv("ENABLE_EXPERIMENTAL_PASSES") == "1": @mb.program(input_specs=[mb.TensorSpec(shape=shape)]) def instancenorm_3(x): """ Detect InstanceNorm pattern in TensorFlow-Addons. This pattern corresponds to, should be fused as instance_norm. All of the following must be satisfy: 1) Input is rank 4 tensor 2) Reduce operates on spatial dimensions axes=[-2, -1], or axes=[-3, -2] (a channel first to channel last transpose would be inserted in such case) 3) Gamma and beta are absent. Default values for gamma and beta would be used. |-------------------------------------------------------| | | | V x --> main_reduce square --> mean1 --> add_eps --> rsqrt --> mul2 --> mul_sub | | ^ | | | V | | | | --> sub -----------| | | | V V |--------------------------------------------------> mul1 -------------> add --> ... """ main_reduce = mb.reduce_mean(x=x, axes=[2, 3], keep_dims=True, name="main_reduce") sub = mb.sub(x=x, y=main_reduce, name="sub") square = mb.square(x=sub, name="square") mean1 = mb.reduce_mean(x=square, axes=[2, 3], keep_dims=True, name="mean1") add_epsilon = mb.add(x=mean1, y=1e-5, name="add_epsilon") # epsilon rsqrt = mb.rsqrt(x=add_epsilon, name="rsqrt") mul1 = mb.mul(x=rsqrt, y=x, name="mul1") mul2 = mb.mul(x=main_reduce, y=rsqrt, name="mul2") mul_sub = mb.mul(x=mul2, y=-1, name="mul_sub") add = mb.add(x=mul1, y=mul_sub, name="add") return add if os.getenv("ENABLE_EXPERIMENTAL_PASSES") == "1": @mb.program(input_specs=[mb.TensorSpec(shape=shape)]) def instancenorm_4(x): """ Identify the pattern: y = x * [gamma * rsqrt(variance + eps)] + (beta - mean * [gamma * rsqrt(variance + eps)]) This pattern corresponds to, should be fused as instance_norm. All of the following must be satisfy: 1) Input is rank 4 tensor 2) Reduce operates on spatial dimensions axes=[-2, -1], or axes=[-3, -2] (a channel first to channel last transpose would be inserted in such case) 3) Gamma and beta are both shape (C,) after squeeze, where C is number of channels |-----------| | V |------> mul_square1 -------------> sum1 -----> mul_mean1 | | | V x --> main_reduce --> mul_mean ==> mul_square --> sub_variance --> add_eps --> rsqrt | | | | | V | | mul_gamma | | | | | |----------------| | | | V | |--------------------------------------------+-------------> mul2 | V | |------------------------------------------------------------------> mul1 | | V | sub_beta --> add --> [...] 
| ^ |---------------------------| """ mul_square1 = mb.mul(x=x, y=x, name="mul_square1") main_reduce = mb.reduce_sum(x=x, axes=[2, 3], keep_dims=True, name="main_reduce") mul_mean = mb.mul(x=main_reduce, y=3.3333334e-05, name="mul_mean") # dummy value here mul_square = mb.mul(x=mul_mean, y=mul_mean, name="mul_square") sum1 = mb.reduce_sum(x=mul_square1, axes=[2, 3], keep_dims=True, name="sum1") mul_mean1 = mb.mul(x=sum1, y=8.333333e-06, name="mul_mean1") # dummy value here sub_variance = mb.sub(x=mul_mean1, y=mul_square, name="sub_variance") add_epsilon = mb.add(x=sub_variance, y=1e-5, name="add_epsilon") # epsilon rsqrt = mb.rsqrt(x=add_epsilon, name="rsqrt") mul_gamma = mb.mul(x=rsqrt, y=np.random.rand(1, 5, 1, 1), name="mul_gamma") mul1 = mb.mul(x=mul_gamma, y=x, name="mul1") mul2 = mb.mul(x=mul_mean, y=mul_gamma, name="mul2") sub_beta = mb.sub(x=np.random.rand(1, 5, 1, 1), y=mul2, name="sub_beta") add = mb.add(x=mul1, y=sub_beta, name="add") return add def instancenorm_1_constraints(pattern): passed = True passed = passed and _common_pattern1_constraints(pattern) passed = passed and _instancenorm_constraints(pattern) return passed def layernorm_1_constraints(pattern): passed = True passed = passed and _common_pattern1_constraints(pattern) passed = passed and _layernorm_constraints(pattern) return passed def instancenorm_2_constraints(pattern): epsilon_var = _get_var(pattern.add_epsilon, pattern.mean1) gamma_var = _get_var(pattern.mul_gamma, pattern.real_div) beta_var = _get_var(pattern.add_beta, pattern.mul_gamma) passed = True passed = passed and _check_reduce_op(pattern.main_reduce) passed = passed and pattern.sub0.x == pattern.root_var and pattern.sub0.y == pattern.main_reduce.outputs[0] passed = passed and pattern.sub1.x == pattern.root_var and pattern.sub1.y == pattern.main_reduce.outputs[0] passed = passed and _check_reduce_op(pattern.mean1) passed = passed and pattern.pow.y.val is not None and np.isclose(pattern.pow.y.val, 0.5) passed = passed and pattern.real_div.x == pattern.sub1.outputs[0] and pattern.real_div.y == pattern.pow.outputs[0] passed = passed and _general_constraints(pattern, epsilon_var, gamma_var, beta_var) passed = passed and _instancenorm_constraints(pattern) return passed def instancenorm_3_constraints(pattern): epsilon_var = _get_var(pattern.add_epsilon, pattern.mean1) gamma_var = mb.const( val=np.ones(shape=(1, pattern.root_var.shape[1], 1, 1)), name="gamma_var" ) beta_var = mb.const( val=np.zeros(shape=(1, pattern.root_var.shape[1], 1, 1)), name="_fuse_layernorm_or_instancenorm_beta", ) passed = True passed = passed and _check_reduce_op(pattern.main_reduce) passed = passed and pattern.sub.x == pattern.root_var and pattern.sub.y == pattern.main_reduce.outputs[0] passed = passed and _check_reduce_op(pattern.mean1) passed = passed and pattern.mul_sub.y.val is not None and pattern.mul_sub.y.val == -1 passed = passed and _general_constraints(pattern, epsilon_var, gamma_var, beta_var) passed = passed and _instancenorm_constraints(pattern) return passed def instancenorm_4_constraints(pattern): epsilon_var = _get_var(pattern.add_epsilon, pattern.sub_variance) gamma_var = _get_var(pattern.mul_gamma, pattern.rsqrt) beta_var = pattern.sub_beta.x passed = True passed = passed and _check_reduce_op(pattern.main_reduce, mode="reduce_sum") passed = passed and pattern.mul_mean.y.shape == () passed = passed and _check_reduce_op(pattern.sum1, "reduce_sum") passed = passed and pattern.mul_mean1.y.shape == () passed = passed and pattern.sub_variance.y == 
pattern.mul_square.outputs[0] passed = passed and pattern.sub_beta.y == pattern.mul2.outputs[0] passed = passed and _general_constraints(pattern, epsilon_var, gamma_var, beta_var) passed = passed and _instancenorm_constraints(pattern) return passed def _general_constraints(pattern, epsilon_var, gamma_var, beta_var): passed = True passed = passed and pattern.root_var.shape is not None passed = passed and epsilon_var.val is not None and len(epsilon_var.val.shape) == 0 passed = passed and gamma_var.val is not None passed = passed and beta_var.val is not None pattern.add_attribute("epsilon_var", epsilon_var) pattern.add_attribute("gamma_var", gamma_var) pattern.add_attribute("beta_var", beta_var) return passed def _common_pattern1_constraints(pattern): epsilon_var = _get_var(pattern.add_epsilon, pattern.reduce_mean_2) gamma_var = _get_var(pattern.mul_gamma, pattern.rsqrt) beta_var = pattern.sub_beta.x passed = True passed = passed and _check_reduce_op(pattern.main_reduce) passed = passed and _check_reduce_op(pattern.reduce_mean_2) passed = passed and pattern.sub.x == pattern.root_var and pattern.sub.y == pattern.main_reduce.outputs[0] passed = passed and pattern.sub_beta.y == pattern.mul_3.outputs[0] passed = passed and _general_constraints(pattern, epsilon_var, gamma_var, beta_var) return passed def _layernorm_constraints(pattern): rank, axes, negative_axes = _rank_and_axes(pattern) passed = True passed = passed and len(pattern.gamma_var.val.shape) == len(axes) passed = passed and len(pattern.beta_var.val.shape) == len(axes) passed = passed and negative_axes == list(range(-len(negative_axes), 0)) requires_rank4_transpose = False if rank == 4 and negative_axes == [-3, -2]: requires_rank4_transpose = True pattern.add_attribute("requires_rank4_transpose", requires_rank4_transpose) pattern.add_attribute("is_instancenorm", False) return passed def _instancenorm_constraints(pattern): rank, axes, negative_axes = _rank_and_axes(pattern) passed = True passed = passed and rank == 4 passed = passed and _check_axes_and_var_shape(negative_axes, pattern.gamma_var.shape) passed = passed and _check_axes_and_var_shape(negative_axes, pattern.beta_var.shape) requires_rank4_transpose = False if negative_axes == [-3, -2]: requires_rank4_transpose = True pattern.add_attribute("requires_rank4_transpose", requires_rank4_transpose) pattern.add_attribute("is_instancenorm", True) return passed def _rank_and_axes(pattern): rank = len(pattern.root_var.shape) axes = pattern.main_reduce.axes.val negative_axes = [a - rank if a >= 0 else a for a in axes] negative_axes.sort() return rank, axes, negative_axes def _get_var(operation1, operation2): return operation1.y if operation1.x == operation2.outputs[0] else operation1.x def _check_axes_and_var_shape(negative_axes, shape): if len(shape) == 1: return True if negative_axes == [-2, -1]: return shape[0] == 1 and shape[2] == 1 and shape[3] == 1 if negative_axes == [-3, -2]: return shape[0] == 1 and shape[1] == 1 and shape[2] == 1 return False def transform_pattern(pattern): """ Insert instance_norm / layer_norm and delete all ops. :param pattern: A pattern object that contains all relevant information. 
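The rewrite relies on the identity already noted in the pattern docstrings above: with a = gamma / sqrt(variance + epsilon), gamma * (x - mean) / sqrt(variance + epsilon) + beta == x * a + (beta - mean * a). A quick numpy check of that identity (illustrative only, shapes picked arbitrarily):

>>> import numpy as np
>>> x = np.random.rand(3, 8)
>>> gamma, beta = np.random.rand(8), np.random.rand(8)
>>> mean, var, eps = x.mean(axis=0), x.var(axis=0), 1e-5
>>> a = gamma / np.sqrt(var + eps)
>>> bool(np.allclose(gamma * (x - mean) / np.sqrt(var + eps) + beta, x * a + (beta - mean * a)))
True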
""" out_name = pattern.final_op.outputs[0].name axes = pattern.main_reduce.axes.val if pattern.requires_rank4_transpose: x = mb.transpose( x=pattern.main_reduce.x, perm=[0, 3, 1, 2], name=out_name + "_transpose_nhwc_nchw", before_op=pattern.final_op, ) if pattern.is_instancenorm: x = mb.instance_norm( x=x if pattern.requires_rank4_transpose else pattern.main_reduce.x, gamma=np.squeeze(pattern.gamma_var.val), beta=np.squeeze(pattern.beta_var.val), epsilon=pattern.epsilon_var, name=out_name + "_instancenorm" if pattern.requires_rank4_transpose else out_name, before_op=pattern.final_op, ) else: # is_layernorm x = mb.layer_norm( x=x if pattern.requires_rank4_transpose else pattern.main_reduce.x, axes=axes, gamma=pattern.gamma_var, beta=pattern.beta_var, epsilon=pattern.epsilon_var, name=out_name + "_layernorm" if pattern.requires_rank4_transpose else out_name, before_op=pattern.final_op, ) if pattern.requires_rank4_transpose: x = mb.transpose( x=x, perm=[0, 2, 3, 1], name=out_name + "_transpose_nchw_nhwc", before_op=pattern.final_op, ) pattern.final_op.enclosing_block.replace_uses_of_var_after_op( anchor_op=pattern.final_op, old_var=pattern.final_op.outputs[0], new_var=x ) # Remove all the ops at once pattern.block.remove_ops(pattern.op_list()) if os.getenv("ENABLE_EXPERIMENTAL_PASSES") == "1": register_generic_pass( ops_arrangement=instancenorm_or_layernorm, var_constraints=layernorm_1_constraints, transform_pattern=transform_pattern, pass_name="fuse_layernorm_or_instancenorm", namespace="common", ) register_generic_pass( ops_arrangement=instancenorm_or_layernorm, var_constraints=instancenorm_1_constraints, transform_pattern=transform_pattern, pass_name="fuse_layernorm_or_instancenorm", namespace="common", ) register_generic_pass( ops_arrangement=instancenorm_2, var_constraints=instancenorm_2_constraints, transform_pattern=transform_pattern, pass_name="fuse_layernorm_or_instancenorm", namespace="common", ) register_generic_pass( ops_arrangement=instancenorm_3, var_constraints=instancenorm_3_constraints, transform_pattern=transform_pattern, pass_name="fuse_layernorm_or_instancenorm", namespace="common", ) register_generic_pass( ops_arrangement=instancenorm_4, var_constraints=instancenorm_4_constraints, transform_pattern=transform_pattern, pass_name="fuse_layernorm_or_instancenorm", namespace="common", ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/experimental/passes/generic_linear_bias_fusion.py0000644000000000000000000001160214672066616031710 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os import numpy as np from coremltools.converters.mil import Builder as mb from coremltools.converters.mil.experimental.passes.generic_pass_infrastructure import \ register_generic_pass from coremltools.converters.mil.mil import get_new_symbol if os.getenv("ENABLE_EXPERIMENTAL_PASSES") == "1": arbitrary_shape = (get_new_symbol(), get_new_symbol()) np.random.seed() arbitrary_weight = np.random.rand(4, 3) arbitrary_bias = np.random.rand(4) if os.getenv("ENABLE_EXPERIMENTAL_PASSES") == "1": @mb.program(input_specs=[mb.TensorSpec(shape=arbitrary_shape)]) def pattern_add(x): """ Original: % 4 = linear(x= % 1, weight = % 2, bias = % 3) # %2 is a rank-2 const tensor (weight) # %3 is a rank-1 const tensor (bias) ... 
% 6 = add(x= % 4, y = % 5) # %5 is a const tensor with same shape as %3 Result: % 8 = linear(x= % 1, weight = % 2, bias = % 7) # where %7 is a new const tensor with value # %7 = %3 + %6 """ linear = mb.linear(x=x, weight=arbitrary_weight, bias=arbitrary_bias, name="linear") add_or_sub = mb.add(x=linear, y=arbitrary_bias, name="add_or_sub") return add_or_sub if os.getenv("ENABLE_EXPERIMENTAL_PASSES") == "1": @mb.program(input_specs=[mb.TensorSpec(shape=arbitrary_shape)]) def pattern_sub(x): """ Original: %4 = linear(x=%1, weight=%2, bias=%3) # %2 is a rank-2 const tensor (weight) # %3 is a rank-1 const tensor (bias) ... %6 = sub(x=%5, y=%4) # %5 is a const tensor with a broacasable shape with %3. i.e. if %3 has shape (Dout), %5 could be (1, Dout). Result: %9 = linear(x=%1, weight=%7, bias=%8) # where %7 is a new const tensor with value %7 = -%2 # %8 = %5 - %3 """ linear = mb.linear(x=x, weight=arbitrary_weight, bias=arbitrary_bias, name="linear") add_or_sub = mb.sub(x=linear, y=arbitrary_bias, name="add_or_sub") return add_or_sub def var_constraints(pattern): passed = True passed = passed and pattern.add_or_sub.x.val is not None or pattern.add_or_sub.y.val is not None is_sub, is_first_input = _get_is_sub_and_is_first_input(pattern) linear_bias, bias, Dout = _get_linear_bias_bias_Dout(pattern, is_first_input) # check if the shape is broadcasable passed = passed and np.prod(linear_bias.shape) == np.prod(bias.shape) passed = passed and bias.shape[-1] == Dout return passed def _get_is_sub_and_is_first_input(pattern): is_sub = pattern.add_or_sub.op_type == "sub" is_first_input = pattern.add_or_sub.x == pattern.linear.outputs[0] return is_sub, is_first_input def _get_linear_bias_bias_Dout(pattern, is_first_input): linear_bias = pattern.linear.bias.val bias = pattern.add_or_sub.y.val if is_first_input else pattern.add_or_sub.x.val Dout = linear_bias.shape[0] return linear_bias, bias, Dout def transform_pattern(pattern): is_sub, is_first_input = _get_is_sub_and_is_first_input(pattern) linear_bias, bias, Dout = _get_linear_bias_bias_Dout(pattern, is_first_input) bias = np.reshape(bias, (Dout,)) if is_sub and is_first_input: bias = -bias if is_sub and not is_first_input: linear_bias = -linear_bias new_bias = linear_bias + bias # compute the new weight if is_sub and not is_first_input: new_weight = -pattern.linear.weight.val else: new_weight = pattern.linear.weight.val # create a new linear op with the new weight, bias value, copying rest of the attributes out_name = pattern.add_or_sub.outputs[0].name linear_kargs = {"weight": new_weight, "bias": new_bias, "name": out_name, "before_op": pattern.linear} linear_kargs.update({k: v for k, v in pattern.linear.inputs.items() if k not in ["weight", "bias"]}) x = mb.linear(**linear_kargs) pattern.add_or_sub.enclosing_block.replace_uses_of_var_after_op( anchor_op=pattern.add_or_sub, old_var=pattern.add_or_sub.outputs[0], new_var=x ) # Remove all the ops at once pattern.block.remove_ops(pattern.op_list()) if os.getenv('ENABLE_EXPERIMENTAL_PASSES') == '1': register_generic_pass( ops_arrangement=pattern_add, var_constraints=var_constraints, transform_pattern=transform_pattern, pass_name="fuse_linear_bias", namespace="common", ) register_generic_pass( ops_arrangement=pattern_sub, var_constraints=var_constraints, transform_pattern=transform_pattern, pass_name="fuse_linear_bias", namespace="common", ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 
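# ---------------------------------------------------------------------------
# Standalone numpy sketch (hypothetical shapes, plain numpy rather than MIL) of
# the two rewrites used by the fuse_linear_bias patterns above:
#   linear(x, W, b) + c  ==  linear(x, W, b + c)
#   c - linear(x, W, b)  ==  linear(x, -W, c - b)
import numpy as np

def linear(x, W, b):
    # MIL-style linear layer: y = x @ W.T + b, with weight of shape (Dout, Din)
    return x @ W.T + b

x = np.random.rand(2, 3)
W = np.random.rand(4, 3)
b, c = np.random.rand(4), np.random.rand(4)

assert np.allclose(linear(x, W, b) + c, linear(x, W, b + c))
assert np.allclose(c - linear(x, W, b), linear(x, -W, c - b))
# ---------------------------------------------------------------------------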
coremltools-8.0/coremltools/converters/mil/experimental/passes/generic_pass_infrastructure.py0000644000000000000000000002411214672066616032163 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import warnings from functools import partial from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.scope import ScopeInfo, ScopeSource from ...mil.passes import pass_registry # IMPORTANT: List of assumptions we are making about the problem # 1) The user defined pattern has exactly one root variable, and one final output operation. As such, we will be searching for a singular # root variable in the larger program, and using that root variable as a starting point for our pattern matching. # And, we will only match one of the final operations for the larger program. # 2) The root variable in the larger program, where we start off the pattern matching, must have the same number of child ops as the # root variable in the user defined program # 3) The outputs of an operation are stored in identical, predictable order. The child operations of an operation are stored in a random order. class Pattern: """This class will have references to all the ops that we have captured in the main, larger program. Each captured op will be an attribute of this class. The attribute name will be the same name that the user defined in their pattern. So, if the user defines a pattern add(name = 'add_1') -> sub(name = 'sub_1), the pattern object will have the fields pattern.add_1, pattern.sub_1, which are references to the corresponding operations in the larger program. Minimum Attributes: root_var: which is the root variable of the first operation of the captured pattern (and corresponds to the user defined pattern’s root variable) final_op: the operation in the larger machine learning model that corresponds to the last operation in the user defined pattern. block: the block in the larger machine learning model where the pattern was found op_set: a set of all the operations captured from the larger machine learning model attribute_set: used for enforcing naming (ie, so the user doesn't overwrite any of the variables mentioned above) Setters set_root_var(root_var): sets the root_var attribute of the Pattern with the given root_var set_block(block): sets the block attribute of the Pattern with the given block set_final_op(op_name, final_op): adds the operation in question to the pattern and also sets it as the final_op Other Methods add_attribute(attribute_name, attribute): Adds an attribute to the pattern object. Can be useful for the user. 
Verifies name using the attribute set mentioned above add_op(op_name, op): Adds an operation to the pattern, as an attribute which can be accessed and as part of the op_set op_list(): converts the op_set to a list and returns it to make it easier for the user """ def __init__(self): self.root_var = None self.block = None self.final_op = None self.op_set = set() self.attribute_set = set(["root_var", "block", "final_op", "op_set", "attribute_set"]) def set_root_var(self, root_var): self.root_var = root_var def set_block(self, block): self.block = block def set_final_op(self, op_name, final_op): self.add_op(op_name, final_op) self.final_op = final_op def add_attribute(self, attribute_name, attribute): if attribute_name in self.attribute_set: raise NameError("Pattern " + attribute_name + " is being overwritten. " "Make sure every operation in your MIL pattern to detect " "has a unique name, and that no operation in it or an attribute you are setting is named " "root_var, block, final_op, op_set, or attribute_set.") setattr(self, attribute_name, attribute) def add_op(self, op_name, op): self.add_attribute(op_name, op) self.op_set.add(op) def op_list(self): return list(self.op_set) def _lists_op_equality(oplist1, oplist2): if (len(oplist1) != len(oplist2)): return False for i in range(len(oplist1)): if oplist1[i].op_type != oplist2[i].op_type: return False return True def _pattern_detected(pattern, program_op, pattern_op, program_root_var, pattern_root_var, block): # If the pattern_op is None, that means we are dealing with root_var checking (which don't have op_types or outputs) if pattern_op is not None and program_op.op_type != pattern_op.op_type: return False if pattern_op is not None and len(program_op.outputs) != len(pattern_op.outputs): return False for i in range(len(program_op.outputs) if pattern_op is not None else 1): output_same = False # ASSUMPTION: Assuming that the outputs of an operation are ordered in a particular way # So, two identical operations will have the same ordering of outputs. program_child_op_list = list(program_op.outputs[i].child_ops) if pattern_op is not None else program_root_var.child_ops pattern_child_op_list = list(pattern_op.outputs[i].child_ops) if pattern_op is not None else pattern_root_var.child_ops # Last op in the pattern if len(pattern_child_op_list) == 0: if pattern.final_op is not None and pattern.final_op != program_op: warnings.warn( "User defined pattern matched to more than one final operation. " "Skipped the pattern matching." 
) return False pattern.set_final_op(pattern_op.name, program_op) return True if len(program_child_op_list) != len(pattern_child_op_list): return False # Permuting the program child operations so that at least one of the permutations will be in # the exact same order as the pattern child operations op_combos = list(itertools.permutations(pattern_child_op_list)) for combo in op_combos: if _lists_op_equality(combo, program_child_op_list): truly_equal = True for i in range(len(combo)): truly_equal = truly_equal and _pattern_detected(pattern, program_child_op_list[i], combo[i], program_root_var, pattern_root_var, block) if truly_equal: # The operations in this sequence match perfectly with the pattern output_same = True break if output_same is False: return False if pattern_op is not None: pattern.add_op(pattern_op.name, program_op) return True # This function finds the root_variable in the program that matches with the root_variable in the pattern, # And then kicks off the pattern matching from there def _detect_pattern(program_op, ops_arrangement_root_var, block): # The goal of this function is to find the root variable of both operations program_op_inputs = program_op.get_flattened_inputs() for potential_program_root_variable in program_op_inputs: pattern = Pattern() pattern.set_block(block) if _pattern_detected(pattern, program_op, ops_arrangement_root_var.op, potential_program_root_variable, ops_arrangement_root_var, block): pattern.set_root_var(potential_program_root_variable) # check that none of the ops in this pattern is connected to the output # (except the last one) for op in pattern.op_list(): if op is not pattern.final_op: for out in op.outputs: if out in pattern.block.outputs: return False, None return True, pattern return False, None @block_context_manager def _fuse_one_block(block, ops_arrangement, var_constraints, transform_pattern): fusion_occurred = False for op in list(block.operations): for b in op.blocks: block_changed = True while block_changed: block_changed = _fuse_one_block(b, ops_arrangement, var_constraints, transform_pattern) ops_arrangement_root_var = list( list(ops_arrangement.functions.values())[0].inputs.values() )[0] fusion_occurred, pattern = _detect_pattern(op, ops_arrangement_root_var, block) if fusion_occurred: fusion_occurred &= var_constraints(pattern) if fusion_occurred: transform_pattern(pattern) return fusion_occurred return fusion_occurred def fuse_all_blocks(ops_arrangement, var_constraints, transform_pattern, prog): for f in prog.functions.values(): block_changed = True while block_changed: block_changed = _fuse_one_block(f, ops_arrangement, var_constraints, transform_pattern) class PassContainer(): def __init__(self, pass_name): self.pass_name = pass_name self.passes = [] def __call__(self, prog): if len(self.passes) == 0: raise ValueError("no pass functions associated with " + self.pass_name) with mb.scope(ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=[self.pass_name])): for one_pass in self.passes: one_pass(prog) prog.validate(check_essential_scope=True) def add(self, pass_function): self.passes.append(pass_function) def register_generic_pass(ops_arrangement, var_constraints, transform_pattern, pass_name, namespace): pass_function = partial(fuse_all_blocks, ops_arrangement, var_constraints, transform_pattern) pass_id = namespace + "::" + pass_name if pass_id not in pass_registry.PASS_REGISTRY or not isinstance(pass_registry.PASS_REGISTRY[pass_id], PassContainer): pass_registry.PASS_REGISTRY.passes[pass_id] = PassContainer(pass_name) 
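# ---------------------------------------------------------------------------
# Hypothetical usage sketch: the generic pattern files in this package call
# register_generic_pass at import time, gated on ENABLE_EXPERIMENTAL_PASSES,
# so one way to run such a pass on a pymil Program `prog` would be:
#
#   import os
#   os.environ["ENABLE_EXPERIMENTAL_PASSES"] = "1"   # must be set before the import below
#   from coremltools.converters.mil.experimental.passes import generic_conv_scale_fusion
#   from coremltools.converters.mil.mil.passes.pass_registry import PASS_REGISTRY
#   PASS_REGISTRY["common::fuse_conv_scale"](prog)   # invokes the registered PassContainer
#
# Each lookup resolves to the PassContainer stored in the registry above,
# whose __call__ runs every pass function added under that name and then
# validates the program.
# ---------------------------------------------------------------------------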
pass_registry.PASS_REGISTRY[pass_id].add(pass_function) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2135465 coremltools-8.0/coremltools/converters/mil/frontend/0000755000000000000000000000000014672075535021633 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/__init__.py0000644000000000000000000000041014672066616023737 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from . import tensorflow, tensorflow2, torch ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/_utils.py0000644000000000000000000006253714672066616023521 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import math as math from typing import List, Optional, Union import numpy as np from coremltools.converters.mil._deployment_compatibility import AvailableTarget as target from coremltools.converters.mil.input_types import InputType from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Operation, Var, types from coremltools.converters.mil.mil.block import is_current_opset_version_compatible_with from coremltools.converters.mil.mil.ops.defs._utils import ( parse_einsum_equation, promote_input_dtypes, ) from coremltools.converters.mil.mil.types.symbolic import any_symbolic, is_symbolic def value_at(x: Var, idx: int, name=None, before_op=None): """ input x: 1D tensor (vector). return value at index idx. x[idx]. Could specify the name of the returned MIL scalar tensor as well. """ assert x.rank == 1 args = { "x": x, "begin": [idx], "end": [0], "squeeze_mask": [True], } if name is not None: args["name"] = name if before_op is not None: args["before_op"] = before_op return mb.slice_by_index(**args) def _construct_gather_op( op_type: str, x: Var, indices: Var, axis: Var = None, name: str = None ) -> Var: """ This utility is a more general gather in the sense that: 1. Both mb.gather and mb.gather_nd are handled 2. 
x is allowed to be bool, while mb.gather and mb.gather_nd only allow float or int """ assert ( op_type in {"gather", "gather_nd"} ), f"This utility only handles gather or gather_nd, but got {op_type}" if op_type == "gather_nd": assert axis is None, "mb.gather_nd should not have input axis" # if is gathering bool: # cast bool input to a smallest supported dtype to gather, then cast back gather result # the back cast carries the specified name # else: # usual gather, and gather carries the specified name is_gathering_bool = x.dtype == types.bool if is_gathering_bool: gather_name_kwarg = {} cast_name_kwarg = {} if name is None else {"name": name} else: gather_name_kwarg = {} if name is None else {"name": name} if is_gathering_bool: work_dtype = "int8" if is_current_opset_version_compatible_with(target.iOS17) else "fp16" x = mb.cast(x=x, dtype=work_dtype) if op_type == "gather": result = mb.gather(x=x, indices=indices, axis=axis, **gather_name_kwarg) else: result = mb.gather_nd(x=x, indices=indices, **gather_name_kwarg) if is_gathering_bool: result = mb.cast(x=result, dtype="bool", **cast_name_kwarg) return result def _reverse_input_einsum_eq(equation: str) -> str: """ Reverse the input order of the einsum equation e.g.: input : "nchw,nwhu->nchu" returns : "nwhu,nchw->nchu" """ input_output_strings = equation.split('->') assert len(input_output_strings) == 2, "invalid equation" input_strings = input_output_strings[0].split(',') assert len(input_strings) == 2, "invalid equation" equation = input_strings[1] + ',' + input_strings[0] + '->' + input_output_strings[1] return equation def build_einsum_mil(vars: List[Var], equation: str, name: str) -> Var: """ Get MIL variables as input and build a variable using MIL builder, that contains the output of the einsum equation :param vars: - List[var] - list of input variables :param equation: - str - the einsum equation :param name: - str - name tp be assigned to the output var :return: - var - output var that contains the einsum result """ ## TODO: rdar://73851694 (Update einsum op translation to support generic cases) equation = equation.replace(" ", "") parsed_vectors = parse_einsum_equation(equation) if len(vars) != 2: return solve_generic_einsum(list(parsed_vectors), vars, name) equation_rev = _reverse_input_einsum_eq(equation) parsed_vectors_rev = parse_einsum_equation(equation_rev) def _swap(a, b): return b, a a_var, b_var = vars is_dynamic = any([any_symbolic(var.shape) for var in vars]) # list of equations supported for explicit mil translations vec_bnqd_bnkd_bnqk = ( [0, 1, 2, 3], [0, 1, 4, 3], [0, 1, 2, 4], ) # equation == "bnqd,bnkd->bnqk" vec_bhcq_bhck_bhqk = ( [0, 1, 2, 3], [0, 1, 2, 4], [0, 1, 3, 4], ) # equation == "bhcq,bhck->bhqk" vec_abc_cd_abd = ([0, 1, 2], [2, 3], [0, 1, 3]) # equation == "abc,cd->abd" vec_abc_cde_abde = ( [0, 1, 2], [2, 3, 4], [0, 1, 3, 4], ) # equation == "abc,cde->abde" vec_btnh_bfnh_bnft = ( [0, 1, 2, 3], [0, 4, 2, 3], [0, 2, 4, 1], ) # equation == "btnh,bfnh->bnft" vec_bnft_btnh_bfnh = ( [0, 1, 2, 3], [0, 3, 1, 4], [0, 2, 1, 4], ) # equation == "bnft,btnh->bfnh" vec_abcd_cde_abe = ( [0, 1, 2, 3], [2, 3, 4], [0, 1, 4], ) # equation == "abcd,cde->abe" vec_nchw_nwhu_nchu = ( [0, 1, 2, 3], [0, 3, 2, 4], [0, 1, 2, 4], ) # equation == "nchw,nwhu->nchu" vec_chw_whu_chu = ([0, 1, 2], [2, 1, 3], [0, 1, 3]) # equation == "chw,whu->chu" # add the op(s) corresponding to the equation if vec_bnqd_bnkd_bnqk in [parsed_vectors, parsed_vectors_rev]: if parsed_vectors_rev == vec_bnqd_bnkd_bnqk: a_var, b_var = _swap(a_var, b_var) 
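                # Illustrative shape check for this branch: with equation "bnqd,bnkd->bnqk",
                # a_var is (B, N, Q, D) and b_var is (B, N, K, D); matmul with transpose_y=True
                # contracts the shared last axis D, i.e. (B, N, Q, D) x (B, N, D, K) -> (B, N, Q, K),
                # which is exactly the requested output layout.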
x = mb.matmul(x=a_var, y=b_var, transpose_x=False, transpose_y=True, name=name) elif vec_bhcq_bhck_bhqk in [parsed_vectors, parsed_vectors_rev]: if parsed_vectors_rev == vec_bhcq_bhck_bhqk: a_var, b_var = _swap(a_var, b_var) x = mb.matmul(x=a_var, y=b_var, transpose_x=True, transpose_y=False, name=name) elif vec_abc_cd_abd in [parsed_vectors, parsed_vectors_rev]: if parsed_vectors_rev == vec_abc_cd_abd: a_var, b_var = _swap(a_var, b_var) x = mb.matmul(x=a_var, y=b_var, transpose_x=False, transpose_y=False, name=name) elif vec_abc_cde_abde in [parsed_vectors, parsed_vectors_rev] and not is_dynamic: if parsed_vectors_rev == vec_abc_cde_abde: a_var, b_var = _swap(a_var, b_var) x_1 = mb.reshape(x=a_var, shape=[a_var.shape[0] * a_var.shape[1], a_var.shape[2]]) x_2 = mb.reshape(x=b_var, shape=[b_var.shape[0], b_var.shape[1] * b_var.shape[2]]) x = mb.matmul(x=x_1, y=x_2, transpose_x=False, transpose_y=False) x = mb.reshape( x=x, shape=[a_var.shape[0], a_var.shape[1], b_var.shape[1], b_var.shape[2]], name=name ) elif vec_btnh_bfnh_bnft in [parsed_vectors, parsed_vectors_rev]: if parsed_vectors_rev == vec_btnh_bfnh_bnft: a_var, b_var = _swap(a_var, b_var) x_1 = mb.transpose(x=a_var, perm=[0, 2, 1, 3]) x_2 = mb.transpose(x=b_var, perm=[0, 2, 1, 3]) x = mb.matmul(x=x_2, y=x_1, transpose_x=False, transpose_y=True, name=name) elif vec_bnft_btnh_bfnh in [parsed_vectors, parsed_vectors_rev]: if parsed_vectors_rev == vec_bnft_btnh_bfnh: a_var, b_var = _swap(a_var, b_var) b_var = mb.transpose(x=b_var, perm=[0, 2, 1, 3]) x = mb.matmul(x=a_var, y=b_var, transpose_x=False, transpose_y=False) x = mb.transpose(x=x, perm=[0, 2, 1, 3], name=name) elif vec_abcd_cde_abe in [parsed_vectors, parsed_vectors_rev] and not is_dynamic: if parsed_vectors_rev == vec_abcd_cde_abe: a_var, b_var = _swap(a_var, b_var) x_1 = mb.reshape(x=a_var, shape=[a_var.shape[0], a_var.shape[1], a_var.shape[2] * a_var.shape[3]]) x_2 = mb.reshape(x=b_var, shape=[b_var.shape[0] * b_var.shape[1], b_var.shape[2]]) x = mb.matmul(x=x_1, y=x_2, transpose_x=False, transpose_y=False, name=name) elif vec_nchw_nwhu_nchu in [parsed_vectors, parsed_vectors_rev]: if parsed_vectors == vec_nchw_nwhu_nchu: x = mb.einsum(values=(a_var, b_var), equation=equation, name=name) else: x = mb.einsum(values=(b_var, a_var), equation=equation_rev, name=name) elif vec_chw_whu_chu in [parsed_vectors, parsed_vectors_rev]: if parsed_vectors == vec_chw_whu_chu: x = mb.einsum(values=(a_var, b_var), equation=equation, name=name) else: x = mb.einsum(values=(b_var, a_var), equation=equation_rev, name=name) else: x = solve_generic_einsum(list(parsed_vectors), [a_var, b_var], name) return x def is_symbolic_dim_in_prog(prog): ''' Takes in a MIL program object, checks if any of the tensors in it contain a symbolic dimension. Returns true if it does. 
:param prog: coremltools.converters.mil.Program :return: bool ''' def _does_block_contain_symbolic_shape(block): for op in block.operations: for b in op.blocks: if _does_block_contain_symbolic_shape(b): return True for out in op.outputs: if types.is_tensor(out.sym_type): shape = out.sym_type.get_shape() if any_symbolic(shape): return True elif types.is_scalar(out.sym_type) or types.is_str(out.sym_type): if is_symbolic(out.val): return True elif types.is_list(out.sym_type): if types.is_tensor(out.elem_type): if any_symbolic(out.elem_type.get_shape()): return True else: raise NotImplementedError("\'{}\' type in a list not handled".format(out.elem_type)) else: raise NotImplementedError("\'{}\' type is not handled".format(out.sym_type)) return False for f in prog.functions.values(): if _does_block_contain_symbolic_shape(f): return True return False def get_output_names(outputs) -> Optional[List[str]]: """ :param: list[ct.TensorType/ct.ImageType] :return: list[str] or None """ output_names = None if outputs is not None: assert all([isinstance(t, InputType) for t in outputs]), \ "outputs must be a list of ct.ImageType or ct.TensorType" output_names = [t.name for t in outputs] if all([name is None for name in output_names]): output_names = None return output_names # This is a workaround in Core ML for topk with dynamic `k`: # * Core ML topk supports only constant `k` # * Luckily, Core ML gather supports dynamic `end`, so we workaround by argsort then gather # This leads to a slightly different behaviour, though: top-k elements are always sorted def dynamic_topk( x: Var, k: Var, axis: int, ascending: Optional[bool] = False, name: Optional[str] = None ): assert k.val is None, "Please use mb.topk directly if k is compile time known" indices = mb.argsort(x=x, axis=axis, ascending=ascending) if name is None: values = mb.gather_along_axis(x=x, indices=indices, axis=axis) else: values = mb.gather_along_axis(x=x, indices=indices, axis=axis, name=name) k_indices = mb.range_1d(end=k, start=0, step=1) values = mb.gather(x=values, indices=k_indices, axis=axis) if name is None: indices = mb.gather(x=indices, indices=k_indices, axis=axis) else: indices = mb.gather(x=indices, indices=k_indices, axis=axis, name=name) return values, indices def solve_diagonal_einsum(parsed_vectors, vars): def solve_diagonal_einsum_one_step(parsed_vector, x): for i in range(len(parsed_vector)): for j in range(i + 1, len(parsed_vector)): if parsed_vector[i] != parsed_vector[j]: continue perm = list(range(len(parsed_vector))) duplicated_indices = [j for j in range(len(parsed_vector)) if parsed_vector[j] == parsed_vector[i]] for i, j in enumerate(duplicated_indices): perm[i], perm[j] = perm[j], perm[i] parsed_vector[i], parsed_vector[j] = parsed_vector[j], parsed_vector[i] dims = mb.shape(x=x) dim_length = value_at(dims, duplicated_indices[0]) indices = mb.range_1d(end=dim_length, start=0, step=1) indices = mb.stack(values=[indices] * len(duplicated_indices), axis=1) x = mb.transpose(x=x, perm=perm) x = mb.gather_nd(x=x, indices=indices) ret_parsed_vector = [parsed_vector[0]] + parsed_vector[len(duplicated_indices):] return ret_parsed_vector, x for i in range(len(vars)): while len(parsed_vectors[i]) != len(set(parsed_vectors[i])): parsed_vector, var = solve_diagonal_einsum_one_step(parsed_vectors[i], vars[i]) parsed_vectors[i] = parsed_vector vars[i] = var return tuple(parsed_vectors), vars def solve_sum_einsum(parsed_vectors, vars): """ Apply reduce_sum for axes before binary einsum calculation if enable. 
e.g.: input : "abce,acd->ae" returns : "ace,ac->ae" In this example, since each of those axes is only used by one var and does not appear in the output, axes `b` and `d` can be reduced before binary einsum. """ def solve_sum_einsum_one_step(src_axes, used_by_other_axes, x): dst_axes = [] for axis in src_axes: if axis not in used_by_other_axes: continue dst_axes.append(axis) summed_axis_indices = [i for i in range(len(src_axes)) if src_axes[i] not in dst_axes] if summed_axis_indices: x = mb.reduce_sum(x=x, axes=summed_axis_indices) return dst_axes, x ret_parsed_vectors = [] parsed_vectors = list(parsed_vectors) for i, var in enumerate(vars): used_by_other_axes = [] for j, parsed_vector in enumerate(parsed_vectors): if i != j: used_by_other_axes += parsed_vector dst_axes, var = solve_sum_einsum_one_step(parsed_vectors[i], used_by_other_axes, vars[i]) ret_parsed_vectors.append(dst_axes) vars[i] = var ret_parsed_vectors.append(parsed_vectors[-1]) return ret_parsed_vectors, vars def get_perm_transpose_einsum(src_axes: List[int], dst_axes: List[int]) -> List[int]: """ :param src_axes: list[int] :param dst_axes: list[int] :return: list[int] """ return [src_axes.index(s) for s in dst_axes] def solve_transpose_einsum(src_parsed_vector: List[int], dst_parsed_vector: List[int], var: Var, name: str) -> Var: return mb.transpose(x=var, perm=get_perm_transpose_einsum(src_parsed_vector, dst_parsed_vector), name=name) def solve_generic_einsum(parsed_vectors, vars, name) -> Var: """ :param parsed_vectors: list[list[int]] :param vars: - list[var] - input variables :param name: - str - name to be assigned to the output var :return: - var - output var that contains the einsum result """ parsed_vectors, vars = solve_diagonal_einsum(parsed_vectors, vars) parsed_vectors, vars = solve_sum_einsum(parsed_vectors, vars) if len(vars) == 1: return solve_transpose_einsum(parsed_vectors[0], parsed_vectors[1], vars[0], name) while len(vars) >= 2: out_vector = [] input_symbols = list(itertools.chain.from_iterable(parsed_vectors[:2])) for symbol in itertools.chain.from_iterable(parsed_vectors[2:]): if symbol in input_symbols and symbol not in out_vector: out_vector.append(symbol) temp_parsed_vectors = [parsed_vectors[0], parsed_vectors[1], out_vector] parsed_vectors[0] = out_vector parsed_vectors.pop(1) vars[0] = solve_binary_generic_einsum(temp_parsed_vectors, vars[0], vars[1], name if len(vars) == 2 else None) vars.pop(1) return vars[0] def solve_binary_generic_einsum(parsed_vectors, a_var, b_var, name) -> Var: def _concat_dims(dims, none_if_empty=False): if len(dims) == 0: if none_if_empty: return None else: return 1 return mb.concat(values=dims, axis=0) a_axes, b_axes, out_axes = parsed_vectors a_dims = mb.shape(x=a_var) b_dims = mb.shape(x=b_var) batched_axes = [] reduced_axes = [] a_unique_axes = [] b_unique_axes = [] batch_dims = [] reduce_dims = [] a_unique_dims = [] b_unique_dims = [] for i, a_axis in enumerate(a_axes): a_dim = value_at(a_dims, i) if a_axis in b_axes: if a_axis in out_axes: batched_axes.append(a_axis) batch_dims.append(a_dim) else: reduced_axes.append(a_axis) reduce_dims.append(a_dim) else: a_unique_axes.append(a_axis) a_unique_dims.append(a_dim) concat_batch_dims = _concat_dims(batch_dims, True) # if there is no dim to reduce, then add a dummy dim, # so mb.matmul will reduce the dummy dim to achieve outer product concat_reduce_dims = _concat_dims(reduce_dims) # if there is no dim of `a` remains, then add a dummy dim for `a` as a matrix dim, # otherwise mb.matmul may mistake the batch dim of `a` as 
the matrix dim concat_a_unique_dims = _concat_dims(a_unique_dims) for i, b_axis in enumerate(b_axes): b_dim = value_at(b_dims, i) if b_axis not in a_axes: b_unique_axes.append(b_axis) b_unique_dims.append(b_dim) # if there is no dim of `b` remains, then add a dummy dim for `b`, # otherwise mb.matmul may mistake the batch dim of `b` as a matrix dim concat_b_unique_dims = _concat_dims(b_unique_dims) a_transpose_axes = batched_axes + a_unique_axes + reduced_axes a = mb.transpose(x=a_var, perm=get_perm_transpose_einsum(a_axes, a_transpose_axes)) a_reshape_dims = _concat_dims( [mb.reduce_prod(x=x) for x in [concat_batch_dims, concat_a_unique_dims, concat_reduce_dims] if x is not None]) a = mb.reshape(x=a, shape=a_reshape_dims) b_transpose_axes = batched_axes + reduced_axes + b_unique_axes b = mb.transpose(x=b_var, perm=get_perm_transpose_einsum(b_axes, b_transpose_axes)) b_reshape_dims = _concat_dims( [mb.reduce_prod(x=x) for x in [concat_batch_dims, concat_reduce_dims, concat_b_unique_dims] if x is not None]) b = mb.reshape(x=b, shape=b_reshape_dims) ab = mb.matmul(x=a, y=b) concat_batch_dims = _concat_dims(batch_dims, True) concat_a_unique_dims = _concat_dims(a_unique_dims, True) concat_b_unique_dims = _concat_dims(b_unique_dims, True) ab_reshaped_dims = _concat_dims( [ x for x in [concat_batch_dims, concat_a_unique_dims, concat_b_unique_dims] if x is not None ], True, ) # Removes excessive dimensions for scalar output if ab_reshaped_dims is None: if name is None: return mb.squeeze(x=ab) else: return mb.squeeze(x=ab, name=name) # Reshape tensor output to specified output shape else: ab = mb.reshape(x=ab, shape=ab_reshaped_dims) ab_reshaped_axes = batched_axes + a_unique_axes + b_unique_axes if name is None: ab = mb.transpose(x=ab, perm=get_perm_transpose_einsum(ab_reshaped_axes, out_axes)) else: ab = mb.transpose(x=ab, perm=get_perm_transpose_einsum(ab_reshaped_axes, out_axes), name=name) return ab def _decompose_scaled_dot_product_attention( q: Var, k: Var, v: Var, mask: Var, name: str, scale: Optional[Var] = None, before_op: Optional[Operation] = None, ) -> Var: # scale the query input embed_size = q.shape[-1] if is_symbolic(embed_size): raise ValueError( "The embedding size, i.e. last dimension of the shape of query tensor" " cannot be symbolic, in scaled_dot_product_attention op" ) q, k, v = promote_input_dtypes([q, k, v]) if scale is None: multiplicative_scale_factor = 1 / math.sqrt(embed_size) if types.builtin_to_string(q.dtype) == "fp16": multiplicative_scale_factor = np.float16(multiplicative_scale_factor) else: multiplicative_scale_factor = scale q = mb.mul(x=q, y=multiplicative_scale_factor, before_op=before_op) # multiply query and key input tensors # shape of output: (target_seq, source_seq) or (B,...,target_seq, source_seq) attn_weights = mb.matmul(x=q, y=k, transpose_y=True, before_op=before_op) # add mask if applicable if mask is not None: attn_weights = mb.add(x=attn_weights, y=mask, before_op=before_op) # do softmax attn_weights_normalized = mb.softmax(x=attn_weights, axis=-1, before_op=before_op) # multiply attn_weights and value tensor res = mb.matmul(x=attn_weights_normalized, y=v, name=name, before_op=before_op) return res def _construct_constexpr_dequant_op( quantized_weights: np.ndarray, zero_point: Optional[Union[Var, np.ndarray, np.generic]], scale: Union[Var, np.ndarray, np.generic], axis: Optional[Union[Var, int]] = None, name: Optional[str] = None, before_op: Optional[Operation] = None, ) -> Var: """ Constructs the constexpr op to represent the quantized weight. 
Use constexpr_affine_dequantize for pre-iOS18 and constexpr_blockwise_shift_scale for others. """ if not is_current_opset_version_compatible_with(target.iOS18): # The constexpr_affine_dequantize op requires axis. if axis is None: # Infer the axis based on scale's shape. non_single_dim = [dim for dim, dim_size in enumerate(scale.shape) if dim_size > 1] if len(non_single_dim) > 2: raise ValueError( "The constexpr_affine_dequantize op doesn't support scale which " "have more than one non-single dimensions. Got scale with shape " f"{scale.shape}" ) # Empty non_single_dim means per-tensor quantization, just use a dummy axis. axis = 0 if len(non_single_dim) == 0 else non_single_dim[0] if isinstance(axis, int): axis = np.int32(axis) # The constexpr_affine_dequantize op requires zero_point. if zero_point is None: zero_point = np.zeros_like(scale).astype(quantized_weights.dtype) # The constexpr_affine_dequantize op requires scale and zero_point to have rank 0 or 1. if isinstance(scale, (np.ndarray, np.generic)): scale = np.squeeze(scale) if isinstance(zero_point, (np.ndarray, np.generic)): zero_point = np.squeeze(zero_point) if len(scale.shape) > 1 or len(zero_point.shape) > 1: raise ValueError( "The more fine-grained quantization (such as blockwise) is only supported since iOS18." "Please set minimum_deployment_target to iOS18 for using it." ) kwargs = { "quantized_data": quantized_weights, "zero_point": zero_point, "scale": scale, "axis": axis, } if name is not None: kwargs["name"] = name if before_op is not None: kwargs["before_op"] = before_op return mb.constexpr_affine_dequantize(**kwargs) # For iOS18 constexpr_blockwise_shift_scale op, the data/scale/offset need to have same rank. if len(quantized_weights.shape) != len(scale.shape): if axis is not None: target_shape = [1] * len(quantized_weights.shape) target_shape[axis] = quantized_weights.shape[axis] else: target_shape = list(scale.shape) + [1] * ( len(quantized_weights.shape) - len(scale.shape) ) if np.prod(scale.shape) != np.prod(target_shape): raise ValueError( "Unable to infer scale's shape. Please provide a scale that has the " "same rank as the weight." ) scale = scale.reshape(target_shape) # Check the value range to determine the true data type (such as int4/uint4). sub_byte_type = ( types.uint4 if types.numpy_type_to_builtin_type(quantized_weights.dtype).is_unsigned() else types.int4 ) sub_byte_range = types.type_mapping._TYPES_TO_RANGE[sub_byte_type] if ( np.max(quantized_weights) <= sub_byte_range.high and np.min(quantized_weights) >= sub_byte_range.low ): quantized_weights = quantized_weights.astype(types.nptype_from_builtin(sub_byte_type)) kwargs = { "data": quantized_weights, "scale": scale, } if zero_point is not None and np.any(zero_point): # Only pass the offset parameter when not all elements in `zero_point` are zeroes. zero_point = zero_point.reshape(scale.shape) # When zero_point is integer, it's required to have the same dtype as the quantized weight. 
if np.issubdtype(zero_point.dtype, np.integer): zero_point = zero_point.astype(quantized_weights.dtype) kwargs["offset"] = zero_point if name is not None: kwargs["name"] = name if before_op is not None: kwargs["before_op"] = before_op return mb.constexpr_blockwise_shift_scale(**kwargs) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2135465 coremltools-8.0/coremltools/converters/mil/frontend/milproto/0000755000000000000000000000000014672075535023500 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/milproto/__init__.py0000644000000000000000000000035314672066616025612 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from . import load ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/milproto/helper.py0000644000000000000000000000501514672066616025332 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.program import get_new_symbol def get_proto_dim(dim): if dim.WhichOneof("dimension") == "constant": return dim.constant.size else: if not dim.unknown.variadic: return get_new_symbol() raise NotImplementedError("Variadic dimensions not yet implemented.") def proto_to_types(valuetype): """ A helper function that maps the proto value type to PyMIL types. 
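    For example (illustrative, not an exhaustive mapping): a proto tensorType with
    dataType FLOAT32, rank 2 and constant dimensions (1, 3) maps to
    types.tensor(types.fp32, (1, 3)); a rank-0 tensorType maps back to the scalar
    builtin type itself; a listType wrapping such a tensor maps to a PyMIL list type
    created with dynamic_length=True.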
""" if valuetype.WhichOneof("type") == "tensorType": tensortype = valuetype.tensorType dtype = types.PROTO_TO_BUILTIN_TYPE[tensortype.dataType] if tensortype.rank < 0: raise ValueError("Negative or Dynamic ranks not supported") if tensortype.rank != len(tensortype.dimensions): raise ValueError("Rank doesn't match the number of dimensions") if tensortype.attributes != {}: raise ValueError("Attributes on tensorType not supported") shape = [] for i in range(tensortype.rank): shape.append(get_proto_dim(tensortype.dimensions[i])) # For the zero rank tensor, we always convert it back to scalar in PyMIL first if tensortype.rank == 0: return dtype return types.tensor(dtype, shape) elif valuetype.WhichOneof("type") == "listType": listtype = valuetype.listType elem_type = proto_to_types(listtype.type) if listtype.length.unknown: init_length = None else: init_length = listtype.length.constant.size # In the MIL proto, there is no such thing of "dynamic_length", hence we set it to True when # converting back to PyMIL return types.list(elem_type, init_length, dynamic_length=True) elif valuetype.WhichOneof("type") == "dictionaryType": dicttype = valuetype.dictionaryType keytype = proto_to_types(dicttype.keyType) valuetype = proto_to_types(dicttype.valueType) return types.dict(keytype, valuetype) elif valuetype.WhichOneof("type") == "stateType": wrapped_type = proto_to_types(valuetype.stateType.wrappedType) return types.state(wrapped_type) else: raise NotImplementedError("Types {} not yet implemented".format(valuetype.WhichOneof("type"))) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/milproto/load.py0000644000000000000000000005634314672066616025004 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os from typing import Tuple import numpy as np import coremltools.converters.mil.frontend.milproto.load from coremltools import _logger as logger from coremltools import proto from coremltools.converters.mil import mil from coremltools.converters.mil._deployment_compatibility import AvailableTarget as _target from coremltools.converters.mil.backend.mil import helper from coremltools.converters.mil.mil import Block from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import ( Function, ListVar, Placeholder, TupleInputType, Var, mil_list, types, ) from coremltools.converters.mil.mil.block import curr_block from coremltools.converters.mil.mil.ops.registry import SSAOpRegistry as _SSAOpRegistry from coremltools.converters.mil.mil.program import StateTensorPlaceholder from .helper import proto_to_types try: from coremltools.libmilstoragepython import _BlobStorageReader as BlobReader except Exception as e: logger.warning(f"Fail to import BlobReader from libmilstoragepython. {e}") BlobReader = None class TranscriptionContext: """ Holds shared variables needed for transcription. """ def __init__(self, weights_dir=""): self.name_to_var = {} # mapping from name -> var object self.blob_reader_from_filename = ( {} ) # mapping from filename -> BlobReader object self.weights_dir = weights_dir def register_var_with_name(self, name, var): var.name = name if name in self.name_to_var: # Overriding allow us to translate control flow blocks msg = "Var %s is added again. 
Overriding previous value" logger.info(msg % name) self.name_to_var[name] = var def get_var_from_name(self, name): if name not in self.name_to_var: raise KeyError("Var {} not found".format(name)) return self.name_to_var[name] def _load_tensorvalue(tensorvalue_spec): if not isinstance(tensorvalue_spec, proto.MIL_pb2.TensorValue): raise TypeError("Invalid TensorValue spec object") if tensorvalue_spec.WhichOneof("value") == "floats": return tensorvalue_spec.floats.values elif tensorvalue_spec.WhichOneof("value") == "ints": return tensorvalue_spec.ints.values elif tensorvalue_spec.WhichOneof("value") == "bools": return tensorvalue_spec.bools.values elif tensorvalue_spec.WhichOneof("value") == "strings": return tensorvalue_spec.strings.values elif tensorvalue_spec.WhichOneof("value") == "longInts": return tensorvalue_spec.longInts.values elif tensorvalue_spec.WhichOneof("value") == "doubles": return tensorvalue_spec.doubles.values elif tensorvalue_spec.WhichOneof("value") == "bytes": return tensorvalue_spec.bytes.values else: raise ValueError("Invalid dtype for TensorValue type") def _load_immediate_value(context: TranscriptionContext, immediatevalue_spec): if not isinstance(immediatevalue_spec, proto.MIL_pb2.Value.ImmediateValue): raise TypeError("Invalid ImmedidateValue spec object") if immediatevalue_spec.WhichOneof("value") == "tensor": return _load_tensorvalue(immediatevalue_spec.tensor) elif immediatevalue_spec.WhichOneof("value") == "list": return immediatevalue_spec.list.values elif immediatevalue_spec.WhichOneof("value") == "dictionary": result = {} for value in immediatevalue_spec.dictionary.values: result[_load_value(context, value.key)] = _load_value(context, value.value) return result else: raise NotImplementedError( "Immediate value type not supported yet." 
) def _load_file_value(context, filevalue_spec, dtype): if BlobReader is None: raise RuntimeError("BlobReader not loaded") if not isinstance(filevalue_spec, proto.MIL_pb2.Value.BlobFileValue): raise TypeError("Invalid BlobFileValue spec object") filename = os.path.join(context.weights_dir, filevalue_spec.fileName.split("/")[-1]) offset = filevalue_spec.offset if filename in context.blob_reader_from_filename: blob_reader = context.blob_reader_from_filename[filename] else: blob_reader = BlobReader(filename) context.blob_reader_from_filename[filename] = blob_reader if dtype == types.uint1: np_value = blob_reader.read_uint1_data(offset) elif dtype == types.uint2: np_value = blob_reader.read_uint2_data(offset) elif dtype == types.uint3: np_value = blob_reader.read_uint3_data(offset) elif dtype == types.uint4: np_value = blob_reader.read_uint4_data(offset) elif dtype == types.uint6: np_value = blob_reader.read_uint6_data(offset) elif dtype == types.uint8: np_value = blob_reader.read_uint8_data(offset) elif dtype == types.int4: np_value = blob_reader.read_int4_data(offset) elif dtype == types.int8: np_value = blob_reader.read_int8_data(offset) elif dtype == types.uint16: np_value = blob_reader.read_uint16_data(offset) elif dtype == types.int16: np_value = blob_reader.read_int16_data(offset) elif dtype == types.fp16: np_value_uint16 = blob_reader.read_fp16_data(offset) np_value = np.frombuffer(np_value_uint16.tobytes(), np.float16) elif dtype == types.fp32: np_value = blob_reader.read_float_data(offset) elif dtype == types.int32: np_value = blob_reader.read_int32_data(offset) elif dtype == types.uint32: np_value = blob_reader.read_uint32_data(offset) else: raise ValueError("Invalid dtype for blob file value type") return np_value def _restore_np_from_bytes_value(value: bytes, dtype: types, shape: Tuple[int]) -> np.ndarray: # Import _utils here to avoid circular import. from coremltools.optimize.coreml import _utils as optimize_utils if types.is_sub_byte(dtype) and isinstance(value, bytes): result = np.frombuffer(value, types.nptype_from_builtin(dtype)) # For sub-byte data, the np array restored from bytes is packed, so we need to unpack it. 
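        # Illustrative example (the exact bit layout is handled by
        # optimize_utils.restore_elements_from_packed_bits and is assumed here, not
        # specified in this file): a uint4 tensor of shape (2,) arrives as a single packed
        # byte holding both 4-bit elements; the unpacking below expands it back into two
        # separate values before the final reshape to `shape`.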
nbits = dtype.get_bitwidth() element_num = np.prod(shape) are_packed_values_signed = not dtype.is_unsigned() return optimize_utils.restore_elements_from_packed_bits( result, nbits, element_num, are_packed_values_signed ).reshape(shape) return np.frombuffer(value, types.nptype_from_builtin(dtype)).reshape(shape) def _load_value(context, value_spec): if not isinstance(value_spec, proto.MIL_pb2.Value): raise TypeError("Invalid Value spec object") if value_spec.docString: raise ValueError("Docstring would get lost in the process.") value_spec_type = value_spec.type.WhichOneof("type") if value_spec_type == "tensorType": valuetype = proto_to_types(value_spec.type) is_tensor = types.is_tensor(valuetype) dtype = valuetype if not is_tensor else valuetype.get_primitive() shape = () if not is_tensor else valuetype.get_shape() if value_spec.WhichOneof("value") == "immediateValue": value = _load_immediate_value(context, value_spec.immediateValue) else: value = _load_file_value(context, value_spec.blobFileValue, dtype) target_np_dtype = types.nptype_from_builtin(dtype) if dtype in helper.IMMEDIATE_VALUE_TYPES_IN_BYTES: value = _restore_np_from_bytes_value(value, dtype, shape).astype(target_np_dtype) elif dtype == types.str and shape == (): value = str(value[0]) elif dtype in ( types.fp32, types.str, types.bool, types.int16, types.uint16, types.int32, types.int64, ): value = np.array(value).astype(target_np_dtype).reshape(shape) else: raise ValueError("Invalid dtype for tensor value") elif value_spec_type == "dictionaryType": assert value_spec.WhichOneof("value") == "immediateValue", "dict must be immediate value" return _load_immediate_value(context, value_spec.immediateValue) else: raise NotImplementedError( f"Deserialization from milproto {value_spec_type} to pymil is not implemented yet" ) if not is_tensor and not isinstance(value, str): value = types.nptype_from_builtin(dtype)(value.item()) return value def _create_var_from_spec(spec): """ This helper function is used for creating PyMIL Var/ListVar from the proto spec. Mainly used for the construction of the control flow ops. """ assert isinstance(spec, proto.MIL_pb2.NamedValueType) sym_type = proto_to_types(spec.type) name = spec.name if types.is_list(sym_type): var = ListVar( name, elem_type=sym_type.T[0], init_length=sym_type.T[1], dynamic_length=sym_type.T[2]) else: var = Var(name, sym_type, None, op=None, op_output_idx=None) return var def _set_outer_op_for_nested_blocks(blocks, op): """ An ultility function that sets the outer_op of the blocks for control flow ops. """ for block in blocks: block.outer_op = op def _create_nested_blocks(context, op_spec): """ An utility function that creates nested blocks for control flow ops. """ if not op_spec.blocks: return [] blocks = [] for block_spec in op_spec.blocks: input_vars = [_create_var_from_spec(input) for input in block_spec.inputs] # add block input vars to the context for v in input_vars: context.register_var_with_name(v.name, v) # In pymil, the outer_op for a block can only be None if the block is a Function. # As the result, we use a dummy outer_op here for block creation, and set it to # the legit op later on in _set_outer_op_for_nested_blocks dummy = mb.const(val=0.) with Block(block_inputs=input_vars, outer_op=dummy._op, name=Block._get_new_name()) as block: _load_block(context, block_spec) blocks.append(block) return blocks def _set_inputs_for_control_flow_op(inputs, blocks, op_type): """ An utility function that set the dummy functional inputs and blocks inputs for control flow ops. 
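    For example, when op_type == "while_loop" the nested blocks recovered from the proto
    are attached through inputs["_existing_blocks"], while inputs["_cond"] and
    inputs["_body"] are set to dummy callables, because the original Python closures
    cannot be recovered from the serialized program.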
""" if op_type == "while_loop": def _dummy_cond(*loop_vars): return None def _dummy_body(*loop_vars): return None inputs["_existing_blocks"] = blocks inputs["_cond"] = _dummy_cond inputs["_body"] = _dummy_body elif op_type == "cond": def _dummy_true_fn(*loop_vars): return None def _dummy_false_fn(*loop_vars): return None inputs["_existing_blocks"] = blocks inputs["_true_fn"] = _dummy_true_fn inputs["_false_fn"] = _dummy_false_fn def _load_const_op(context, op_spec): inputs = {k: _load_value(context, v) for k, v in op_spec.attributes.items()} if len(op_spec.inputs) > 0: for param_name, argument in op_spec.inputs.items(): vars = [] for binding in argument.arguments: binding_type = binding.WhichOneof("binding") if binding_type == "name": vars.append(context.get_var_from_name(binding.name)) elif binding_type == "value": vars.append(_load_value(context, binding.value)) else: raise ValueError(f"Invalid binding_type {binding_type}") if len(vars) == 1: inputs[param_name] = vars[0] else: inputs[param_name] = vars output_var = getattr(mb, op_spec.type)(**inputs) if "val" in op_spec.attributes: if hasattr(op_spec.attributes["val"], "blobFileValue"): filevalue_spec = op_spec.attributes["val"].blobFileValue filename = filevalue_spec.fileName.split("/")[-1] if filename != "weight.bin": output_var.op.weight_key = filename.split(".")[0] if not isinstance(output_var, (tuple, list)): output_var = [output_var] if len(output_var) != len(op_spec.outputs): raise AssertionError( "Mismatch between number of outputs in operation specification vs PyMIL outputs" ) for spec, var in zip(op_spec.outputs, output_var): context.register_var_with_name(spec.name, var) def _load_operation(context: TranscriptionContext, op_spec: proto.MIL_pb2.Operation): if not isinstance(op_spec, proto.MIL_pb2.Operation): raise TypeError("Invalid Operation spec object") op_type = op_spec.type if op_type == "const" or "constexpr_" in op_type: if op_spec.blocks: raise ValueError("const / constexpr operation can't have any block") if op_type == "const" and op_spec.inputs: raise ValueError("const operation can't have any input") _load_const_op(context, op_spec) else: if op_type == "custom_layer": raise NotImplementedError( "Loading Custom Layer operation not yet implemented" ) # The conversion steps of an operation proto -> PyMIL operation are as following: # (i) Convert the input arguments: # In most of the cases, the input variable is already created beforehand, hence we can # directly access and get them through the TranscriptionContext. # There are cases, though, the inputs are literal value. This could happens in the classify op spec. # For that case, we directly create a constant variable. # (ii) Create nested blocks for control flow operations: # The Python functional input arguments for control flow ops cannot be recovered from milproto -> pymil conversion, # for instance, the _body, _cond for mb.while_loop and _true_fn, _false_fn for mb.cond are not invertible # Hence, here we directly create the nested blocks from the proto, and set them to mb.while_loop.blocks / mb.cond.blocks. # Note that, when creating a block, PyMIL required an outer_op, which should be the control flow operation itself. However, # in this approach we take, the outer_op hasn't been created at the time when the blocks produced. Here, we make a "dummy outer_op", # which could pass the check in PyMIL, also it could provide enough information (such as visible variables in the blocks etc.) # for the creation of the block. 
# (iii) Create PyMIL operation using inputs / blocks # Note that for the control flow cases, we create dummy functional inputs, and use the existing block to create the op. # (iv) Set the outer_op for control flow # Once the operation is created, we replace the dummy outer_op with the legit one, to make it a valid PyMIL program attrs = list(op_spec.attributes.items()) if len(attrs) > 0: if len(attrs) != 1 or attrs[0][0] != "name": raise ValueError("\"name\" is the only supported attribute for operation") inputs = {k: _load_value(context, v) for k, v in op_spec.attributes.items()} for param_name, argument in op_spec.inputs.items(): vars = [] for binding in argument.arguments: binding_type = binding.WhichOneof("binding") if binding_type == "name": vars.append(context.get_var_from_name(binding.name)) elif binding_type == "value": # We only support the list value for now (for the classifier use case) value_spec = binding.value assert value_spec.WhichOneof("value") == "immediateValue" assert value_spec.immediateValue.WhichOneof("value") == "list" list_value = _load_immediate_value(context, value_spec.immediateValue) values = [] for value_spec in list_value: values.append(_load_value(context, value_spec)) var = mb.const(val=mil_list(values)) vars.append(var) else: raise NotImplementedError("Binding {} not yet implemented".format(binding_type)) if op_type == "write_state": inputs[param_name] = vars[0] else: op_cls = _SSAOpRegistry._get_core_op_cls(op_type) if len(vars) == 1 and not isinstance( op_cls.input_spec.input_types[param_name], TupleInputType ): inputs[param_name] = vars[0] else: inputs[param_name] = vars blocks = _create_nested_blocks(context, op_spec) _set_inputs_for_control_flow_op(inputs, blocks, op_type) # write_state is translated into coreml_update_state if op_type == "write_state": new_inputs = { "state": inputs["input"], "value": inputs["data"], } getattr(mb, "coreml_update_state")(**new_inputs) return else: output_var = getattr(mb, op_type)(**inputs) if not isinstance(output_var, (tuple, list)): output_var = [output_var] if len(output_var) != len(op_spec.outputs): raise AssertionError( "Mismatch between number of outputs in operation specification vs PyMIL outputs" ) for spec, var in zip(op_spec.outputs, output_var): context.register_var_with_name(spec.name, var) pymil_type = var.sym_type proto_type = proto_to_types(spec.type) if not types.is_compatible_type(pymil_type, proto_type): # We allow a corner case where the pymil has an 0 rank tensor and the spec produces a scalar if types.is_tensor(pymil_type) and types.is_scalar(proto_type): if pymil_type.get_primitive() == proto_type: continue raise AssertionError( "Mismatch between var types in specification vs PyMIL" ) _set_outer_op_for_nested_blocks(blocks, output_var[0].op) def _load_block(context, block_spec): def _try_to_merge_state_ops(): """ We detect the pattern of: %1 = coreml_update_state(state=%state, value=%value) %2 = read_state(input=%state) and transform it into: %2 = coreml_update_state(state=%state, value=%value) """ block = curr_block() if len(block.operations) < 2: return op_1, op_2 = block.operations.end.prev.op, block.operations.end.op if op_1.op_type != "coreml_update_state" or op_2.op_type != "read_state": return if op_1.state != op_2.input: return var_1, var_2 = op_1.outputs[0], op_2.outputs[0] var_1.name = var_2.name context.register_var_with_name(var_1.name, var_1) block.remove_ops([op_2]) if not isinstance(block_spec, proto.MIL_pb2.Block): raise TypeError("Invalid Block spec object") if 
block_spec.attributes: raise ValueError("Attributes on block not supported") block_outputs = block_spec.outputs output_vars = [] for op_spec in block_spec.operations: _load_operation(context, op_spec) _try_to_merge_state_ops() for proto_output_name in block_outputs: output_vars.append(context.get_var_from_name(proto_output_name)) pymil_block = curr_block() pymil_block.set_outputs(output_vars) return pymil_block def _load_function(context, func_spec, spec_version): if not isinstance(func_spec, proto.MIL_pb2.Function): raise TypeError("Invalid Function spec object") if func_spec.attributes: raise ValueError("Attributes on functions not supported") func_inputs = {} for named_value_type in func_spec.inputs: name = named_value_type.name valuetype = proto_to_types(named_value_type.type) if types.is_tensor(valuetype): func_inputs[name] = Placeholder( sym_shape=valuetype.get_shape(), dtype=valuetype.get_primitive(), name=name ) elif types.is_state(valuetype): func_inputs[name] = StateTensorPlaceholder( sym_shape=valuetype.wrapped_type().get_shape(), dtype=valuetype.wrapped_type().get_primitive(), name=name, ) else: raise ValueError(f"Functions input of type {valuetype} not supported.") context.register_var_with_name(name, func_inputs[name].outputs[0]) opset = func_spec.opset if opset not in func_spec.block_specializations: raise ValueError("Missing block specialization for opset {}".format(opset)) with Function(func_inputs, opset_version=_target(spec_version)) as pymil_func: _load_block(context, func_spec.block_specializations[opset]) return pymil_func def _load_program_spec_attributes( context: TranscriptionContext, program_spec: proto.MIL_pb2.Program, pymil_program: mil.Program, ) -> None: for attr_name, attr_spec in program_spec.attributes.items(): # No need to load these attributes if attr_name in ("buildInfo",): pass else: raise ValueError(f"Invalid attribute {attr_name} for program") def load_mil_proto(program_spec, specification_version, file_weights_dir=""): """ Load in-memory Proto specification of MILSpec.Program(.Proto) object to PyMIL """ if not isinstance(program_spec, proto.MIL_pb2.Program): raise TypeError("Invalid Program spec object") if program_spec.docString: raise NotImplementedError("Docstring would be lost in the process") if program_spec.version != 1: raise ValueError("Invalid program version") context = TranscriptionContext(file_weights_dir) pymil_program = mil.Program() for func_name, func_spec in program_spec.functions.items(): pymil_program.add_function( func_name, _load_function(context, func_spec, specification_version) ) coremltools.converters.mil.frontend.milproto.load._load_program_spec_attributes(context, program_spec, pymil_program) return pymil_program def load(model_spec, specification_version, file_weights_dir="", **kwargs): """ Load in-memory Proto specification of Model(.Proto) object to PyMIL Set force_spec_version to force override the spec version. 
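    Illustrative usage (hypothetical variable names):

        spec = mlmodel.get_spec()
        prog = load(spec, spec.specificationVersion, file_weights_dir=mlmodel.weights_dir)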
""" if not isinstance(model_spec, proto.Model_pb2.Model): raise TypeError("Invalid Model sepc object") if specification_version < model_spec.specificationVersion: if not kwargs.get("force_spec_version", False): raise ValueError( "specification_version must be greater or equal to the input model spec version" ) if model_spec.WhichOneof("Type") != "mlProgram": raise ValueError("Only MIL proto based mlmodels can be loaded") return load_mil_proto(model_spec.mlProgram, specification_version, file_weights_dir) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/milproto/test_load.py0000644000000000000000000004215614672066616026040 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest import coremltools as ct from coremltools import _SPECIFICATION_VERSION_IOS_18, ComputeUnit from coremltools._deps import _HAS_TF_2, _HAS_TORCH from coremltools.converters._converters_entry import _get_metadata_from_mlmodel from coremltools.converters.mil import Builder as mb from coremltools.converters.mil.converter import mil_convert from coremltools.converters.mil.frontend.milproto.load import load as milproto_to_pymil if _HAS_TF_2: from coremltools.converters.mil.frontend.tensorflow.test.test_ops import TestTensorArray from coremltools.converters.mil.frontend.tensorflow.test.testing_utils import run_compare_tf from coremltools.converters.mil.mil import Program, types from coremltools.converters.mil.mil.ops.tests.testing_utils import compare_backend from coremltools.converters.mil.testing_utils import ( get_op_names_in_program, get_op_types_in_program, ) if _HAS_TORCH: import torch from coremltools.converters.mil.frontend.torch.test.test_torch_ops import TestScriptedModels def get_pymil_prog_from_mlmodel(mlmodel): model_spec = mlmodel.get_spec() return milproto_to_pymil( model_spec=model_spec, specification_version=model_spec.specificationVersion, file_weights_dir=mlmodel.weights_dir, ) def get_roundtrip_mlmodel(mlmodel): """ This utility function does the following roundtrip conversion: mlprogram proto -> pymil program -> mlprogram model """ pymil_prog = get_pymil_prog_from_mlmodel(mlmodel) # convert the pymil program to mlmodel model_spec = mlmodel.get_spec() roundtripped_mlmodel = mil_convert( pymil_prog, convert_to="mlprogram", convert_from="milinternal", compute_units=mlmodel.compute_unit, model_description=model_spec.description, specification_version=model_spec.specificationVersion, ) # set MIL program attributes build_info = _get_metadata_from_mlmodel(mlmodel) roundtripped_mlmodel._set_build_info_mil_attributes(build_info) return roundtripped_mlmodel def roundtrip_and_compare_mlmodel(mlmodel, input_dict): roundtripped_mlmodel = get_roundtrip_mlmodel(mlmodel) expected_outputs = mlmodel.predict(input_dict) compare_backend(roundtripped_mlmodel, input_dict, expected_outputs) class TestLoadAPIUsage: def test_mil_proto_to_pymil(self): # Define a PyMIL program @mb.program(input_specs=[mb.TensorSpec(shape=(1, 3, 100, 100)), ]) def prog(x): # MIL operation takes named inputs (instead of positional inputs). # Here `name` argument is optional. 
x = mb.relu(x=x, name='relu') x = mb.conv(x=x, weight=np.random.rand(10, 3, 2, 2), name="conv") x = mb.transpose(x=x, perm=[0, 3, 1, 2], name='transpose') x = mb.reduce_mean(x=x, axes=[2, 3], keep_dims=False, name='reduce') x = mb.log(x=x, name='log') return x # Convert it to MIL proto backed MLModel mlmodel = ct.convert(prog, convert_to="mlprogram", compute_units=ct.ComputeUnit.CPU_ONLY) # Load MLModel back to PyMIL loaded_pymil_prog = get_pymil_prog_from_mlmodel(mlmodel) # Assert that loaded PyMIL prog matches with defined PyMIL prog if get_op_types_in_program(loaded_pymil_prog) != get_op_types_in_program(prog): raise AssertionError("Mismatch between defined PyMIL prog and loaded PyMIL prog") def test_mil_proto_to_pymil_with_version_handling(self): # This test makes sure the correct version of the op is picked up during mil_proto -> pymil conversion # iOS15 version program with iOS13 version topk @mb.program(input_specs=[mb.TensorSpec(shape=(1, 1, 4, 4))], opset_version=ct.target.iOS15) def prog(x): x = mb.topk(x=x, k=1, axis=-1, ascending=True) return x iOS15_mlmodel = ct.convert( prog, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS15, compute_units=ct.ComputeUnit.CPU_ONLY, ) iOS15_pymil_prog = get_pymil_prog_from_mlmodel(iOS15_mlmodel) topk_op = iOS15_pymil_prog.functions["main"].find_ops(op_type="topk")[0] assert not hasattr(topk_op, "sort") # iOS16 version program with iOS16 version topk @mb.program(input_specs=[mb.TensorSpec(shape=(1, 1, 4, 4))], opset_version=ct.target.iOS16) def prog(x): x = mb.topk(x=x, k=1, axis=-1, ascending=True) return x iOS16_mlmodel = ct.convert( prog, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS16, compute_units=ct.ComputeUnit.CPU_ONLY, ) iOS16_pymil_prog = get_pymil_prog_from_mlmodel(iOS16_mlmodel) topk_op = iOS16_pymil_prog.functions["main"].find_ops(op_type="topk")[0] assert hasattr(topk_op, "sort") def test_mil_proto_preserving_ops_name(self): # This test is checking the route source_model -> MIL -> mil_prot -> pymil is preserving the op name # Define a PyMIL program @mb.program(input_specs=[mb.TensorSpec(shape=(1, 3, 100, 100)), ]) def prog(x): # MIL operation takes named inputs (instead of positional inputs). # Here `name` argument is optional. 
x = mb.relu(x=x, name='i_am_relu') x = mb.conv(x=x, weight=np.random.rand(10, 3, 2, 2), name="i_am_conv") x = mb.transpose(x=x, perm=[0, 3, 1, 2], name='i_am_transpose') x = mb.reduce_mean(x=x, axes=[2, 3], keep_dims=False, name='i_am_reduce_mean') x = mb.log(x=x, name='i_am_log') return x mlmodel = ct.convert(prog, convert_to="mlprogram", compute_units=ct.ComputeUnit.CPU_ONLY) op_names = get_op_names_in_program(mlmodel._mil_program, skip_const_ops=False) prog = get_pymil_prog_from_mlmodel(mlmodel) new_op_names = get_op_names_in_program(prog, skip_const_ops=False) assert op_names == new_op_names def test_mil_uint16(self): @mb.program( input_specs=[mb.TensorSpec(shape=(2, 2, 3))], opset_version=ct.target.iOS17, ) def prog(x): indices = np.array([[[1, 0], [0, 1]], [[1, 0], [0, 0]]], dtype=np.uint16) res = mb.gather(x=x, indices=indices, axis=2, batch_dims=2) return res mlmodel = ct.convert( prog, convert_to="mlprogram", compute_units=ct.ComputeUnit.CPU_ONLY, minimum_deployment_target=ct.target.iOS17, ) loaded_pymil_prog = get_pymil_prog_from_mlmodel(mlmodel) assert get_op_types_in_program(loaded_pymil_prog) == get_op_types_in_program(prog) @pytest.mark.parametrize( "immediate_value, dtype", itertools.product( (True, False), (types.int4, types.uint4, types.int8, types.uint8), ), ) def test_milproto_load_to_pymil_sub_byte(self, immediate_value: bool, dtype: types): """Test if value in milproto (especially sub-byte) could be corrected loaded into pymil.""" dtype_range = types.type_mapping.builtin_to_range(dtype) data_val = [dtype_range.low, dtype_range.high] if immediate_value: # Tensors with less than 10 elements will be stored as immediate values. data = np.array(data_val).reshape((1, 2, 1)) else: data = np.array(data_val * 20).reshape((1, 40, 1)) offset_val = dtype_range.high if dtype.is_unsigned() else -1 offset = np.array([offset_val]).reshape((1, 1, 1)) np_dtype = types.nptype_from_builtin(dtype) @mb.program(input_specs=[], opset_version=ct.target.iOS18) def prog(): return mb.constexpr_blockwise_shift_scale( data=data.astype(np_dtype), scale=np.array([4]).reshape((1, 1, 1)).astype(np.float16), offset=offset.astype(np_dtype), ) mlmodel = ct.convert( prog, convert_to="mlprogram", compute_units=ct.ComputeUnit.CPU_ONLY, minimum_deployment_target=ct.target.iOS18, ) pymil_prog: Program = milproto_to_pymil( model_spec=mlmodel.get_spec(), specification_version=ct.target.iOS18, file_weights_dir=mlmodel.weights_dir, ) assert get_op_types_in_program(pymil_prog) == get_op_types_in_program(prog) original_ops = mlmodel._mil_program.functions["main"].find_ops( op_type="constexpr_blockwise_shift_scale" ) load_back_ops = pymil_prog.functions["main"].find_ops( op_type="constexpr_blockwise_shift_scale" ) for (original_op, load_back_op) in zip(original_ops, load_back_ops): assert original_op.data.dtype == load_back_op.data.dtype assert original_op.offset.dtype == load_back_op.offset.dtype np.testing.assert_array_equal(original_op.data.val, load_back_op.data.val) np.testing.assert_array_equal(original_op.offset.val, load_back_op.offset.val) @pytest.mark.skipif(ct.utils._macos_version() < (12, 0), reason="mlprogram predict available only on macOS12+") class TestE2ENumericalCorrectness: @pytest.mark.skipif(not _HAS_TORCH, reason="requires torch") def test_elu(self): inputs = [ct.TensorType(name="data", shape=(2, 3, 1))] input_data = [torch.rand(*i.shape.to_list()) for i in inputs] torchmodel = torch.jit.trace(torch.nn.ELU(inplace=False), input_data) mlmodel = ct.convert(torchmodel, inputs=inputs, 
convert_to="mlprogram", compute_units=ComputeUnit.CPU_ONLY) input_values = { i.name: val.detach().numpy() for i, val in zip(inputs, input_data) } roundtrip_and_compare_mlmodel(mlmodel, input_values) @pytest.mark.skipif(not _HAS_TORCH, reason="requires torch") def test_linear(self): inputs = [ct.TensorType(name="data", shape=(10, 2))] input_data = [torch.rand(*i.shape.to_list()) for i in inputs] torchmodel = torch.jit.trace( torch.nn.Linear(in_features=2, out_features=3, bias=True), input_data ) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram", compute_units=ComputeUnit.CPU_ONLY) input_values = { i.name: val.detach().numpy() for i, val in zip(inputs, input_data) } roundtrip_and_compare_mlmodel(mlmodel, input_values) @pytest.mark.skipif(not _HAS_TORCH, reason="requires torch") def test_conv(self): inputs = [ct.TensorType(name="data", shape=(5, 10, 4, 4))] input_data = [torch.rand(*i.shape.to_list()) for i in inputs] torchmodel = torch.jit.trace( torch.nn.Conv2d(in_channels=10, out_channels=20, kernel_size=4), input_data ) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram", compute_units=ComputeUnit.CPU_ONLY) input_values = { i.name: val.detach().numpy() for i, val in zip(inputs, input_data) } roundtrip_and_compare_mlmodel(mlmodel, input_values) @pytest.mark.skipif(not _HAS_TORCH, reason="requires torch") def test_while_loop(self): model = TestScriptedModels.get_while_loop_model() model_spec = torch.jit.script(model) mlmodel = ct.convert(model_spec, inputs=[ct.TensorType(name="data", shape=model.input_size, dtype=np.float32)], convert_to="mlprogram", compute_units=ComputeUnit.CPU_ONLY ) input_values = {"data": np.array([10.])} roundtrip_and_compare_mlmodel(mlmodel, input_values) @pytest.mark.skipif(not _HAS_TORCH, reason="requires torch") def test_cond(self): model = TestScriptedModels.get_cond_model() model_spec = torch.jit.script(model) mlmodel = ct.convert(model_spec, inputs=[ct.TensorType(name="data", shape=(1,), dtype=np.float32)], convert_to="mlprogram", compute_units=ComputeUnit.CPU_ONLY ) roundtrip_and_compare_mlmodel(mlmodel, {"data": np.array([1.])}) roundtrip_and_compare_mlmodel(mlmodel, {"data": np.array([11.])}) def test_list(self): pytest.xfail( "Fix and re-enable this test: rdar://76293949 (TF2 unit test InvalidArgumentError)" ) model, inputs, outputs = TestTensorArray.get_dynamic_elem_shape_model() input_values = [np.random.rand(2, 3)] input_dict = dict(zip(inputs, input_values)) _, mlmodel, _, _ = run_compare_tf( model, input_dict, outputs, compute_unit=ct.ComputeUnit.CPU_ONLY, backend=("mlprogram", "fp16") ) roundtrip_and_compare_mlmodel(mlmodel, {"Placeholder": input_values[0]}) class TestStatefulModelLoad: @staticmethod def convert_and_load_back(prog): mlmodel = ct.convert( prog, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS18, ) return milproto_to_pymil( mlmodel.get_spec(), specification_version=_SPECIFICATION_VERSION_IOS_18, file_weights_dir=mlmodel.weights_dir, ) @staticmethod def check_update_prog(prog, output_name): # check i/o types assert len(prog.functions) == 1 func = prog.functions["main"] assert len(func.inputs) == 2 in_var = func.inputs["state_workaround"] assert types.is_state(in_var.sym_type) assert in_var.name == "state_workaround" assert in_var.shape == (2, 3) assert in_var.dtype == types.fp16 in_var_2 = func.inputs["x"] assert in_var_2.name == "x" assert in_var_2.shape == (2, 3) assert in_var_2.dtype == types.fp16 assert len(func.outputs) == 1 out_var = func.outputs[0] assert out_var.name == 
output_name assert out_var.shape == (2, 3) assert out_var.dtype == types.fp16 # check op get_op_types_in_program(prog) == ["coreml_update_state"] def test_load_read_state(self): @mb.program( input_specs=[ mb.StateTensorSpec((2, 3), dtype=types.fp16), ], opset_version=ct.target.iOS18, ) def prog(x): return mb.read_state(input=x, name="out") new_prog = self.convert_and_load_back(prog) # check i/o types assert len(new_prog.functions) == 1 func = new_prog.functions["main"] assert len(func.inputs) == 1 in_var = func.inputs["x"] assert types.is_state(in_var.sym_type) assert in_var.name == "x" assert in_var.shape == (2, 3) assert in_var.dtype == types.fp16 assert len(func.outputs) == 1 out_var = func.outputs[0] assert out_var.name == "out" assert out_var.shape == (2, 3) assert out_var.dtype == types.fp16 # check op get_op_types_in_program(new_prog) == ["read_state"] def test_load_coreml_update_state(self): @mb.program( input_specs=[ mb.StateTensorSpec((2, 3), dtype=types.fp16), mb.TensorSpec((2, 3), dtype=types.fp16), ], opset_version=ct.target.iOS18, ) def prog(state, x): return mb.coreml_update_state(state=state, value=x, name="out") new_prog = self.convert_and_load_back(prog) self.check_update_prog(new_prog, "out") def test_load_coreml_update_state_singular(self): @mb.program( input_specs=[ mb.StateTensorSpec((2, 3), dtype=types.fp16), mb.TensorSpec((2, 3), dtype=types.fp16), ], opset_version=ct.target.iOS18, ) def prog(state, x): mb.coreml_update_state(state=state, value=x) return x new_prog = self.convert_and_load_back(prog) self.check_update_prog(new_prog, "x") def test_load_state_complex(self): @mb.program( input_specs=[ mb.StateTensorSpec((2, 3), dtype=types.fp16), mb.TensorSpec((2, 3), dtype=types.fp16), ], opset_version=ct.target.iOS18, ) def prog(state, x): read_state = mb.read_state(input=state) add = mb.add(x=read_state, y=np.float16([0.1])) value = mb.coreml_update_state(state=state, value=add) add = mb.add(x=value, y=x) mb.coreml_update_state(state=state, value=add) return add new_prog = self.convert_and_load_back(prog) assert get_op_types_in_program(new_prog) == [ "read_state", "add", "coreml_update_state", "add", "coreml_update_state", ] ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2135465 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/0000755000000000000000000000000014672075535024035 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/__init__.py0000644000000000000000000000137214672066616026151 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import logging # suppress TensorFlow stdout prints import os from coremltools._deps import _HAS_TF if os.getenv("TF_SUPPRESS_LOGS", "1") == "1": os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3" # FATAL logging.getLogger("tensorflow").setLevel(logging.FATAL) register_tf_op = None if _HAS_TF: # Importing these causes them to register their ops from . 
import ops from .dialect_ops import (TfLSTMBase, tf_lstm_block, tf_lstm_block_cell, tf_make_list) from .tf_op_registry import register_tf_op ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/basic_graph_ops.py0000644000000000000000000002536714672066616027547 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause def connect_edge(g, source, dest): if isinstance(source, str): source = g[source] if isinstance(dest, str): dest = g[dest] source.outputs.append(dest.name) dest.inputs.append(source.name) def connect_edge_at_index(g, source, dest, idx): if isinstance(source, str): source = g[source] if isinstance(dest, str): dest = g[dest] source.outputs.insert(idx, dest.name) dest.inputs.insert(idx, source.name) def replace_source(g, source, dest, new_source): if isinstance(source, str): source = g[source] if isinstance(dest, str): dest = g[dest] if isinstance(new_source, str): new_source = g[new_source] dest_inputs = [] for inp in dest.inputs: if inp == source.name: dest_inputs.append(new_source.name) g[new_source.name].outputs.append(dest.name) else: dest_inputs.append(inp) dest.inputs = dest_inputs source.outputs = [i for i in g[source.name].outputs if i != dest.name] def replace_control_source(g, source, dest, new_source): if isinstance(source, str): source = g[source] if isinstance(dest, str): dest = g[dest] if isinstance(new_source, str): new_source = g[new_source] dest_inputs = [] for inp in dest.control_inputs: if inp == source.name: dest_inputs.append(new_source.name) g[new_source.name].control_outputs.append(dest.name) else: dest_inputs.append(inp) dest.control_inputs = dest_inputs source.control_outputs = [i for i in g[source.name].outputs if i != dest.name] def replace_dest(g, source, dest, new_dest): if isinstance(source, str): source = g[source] if isinstance(dest, str): dest = g[dest] if isinstance(new_dest, str): new_dest = g[new_dest] for idx, d in enumerate(source.outputs): if d == dest.name: source.outputs[idx] = new_dest.name new_dest.inputs = new_dest.inputs[:] + [source.name] dest.inputs = [i for i in dest.inputs if i != source.name] def replace_control_dest(g, source, dest, new_dest): if isinstance(source, str): source = g[source] if isinstance(dest, str): dest = g[dest] if isinstance(new_dest, str): new_dest = g[new_dest] for idx, d in enumerate(source.control_outputs): if d == dest.name: source.control_outputs[idx] = new_dest.name new_dest.control_inputs = new_dest.control_inputs[:] + [source.name] dest.control_inputs = [i for i in dest.control_inputs if i != source.name] def connect_dests(g, source, dests): for i in dests: connect_edge(g, source, i) def connect_sources(g, sources, dest): for i in sources: connect_edge(g, i, dest) def disconnect_edge(g, source, dest): if isinstance(source, str): source = g[source] if isinstance(dest, str): dest = g[dest] source.outputs = [i for i in source.outputs if i != dest.name] dest.inputs = [i for i in dest.inputs if i != source.name] def disconnect_control_edge(g, source, dest): if isinstance(source, str): source = g[source] if isinstance(dest, str): dest = g[dest] source.control_outputs = [i for i in source.control_outputs if i != dest.name] dest.control_inputs = [i for i in dest.control_inputs if i != source.name] def disconnect_vertex_outs(g, source): 
if isinstance(source, str): source = g[source] for out in source.outputs: g[out].inputs = [i for i in g[out].inputs if i != source.name] source.outputs = [] def disconnect_vertex_ins(g, dest): if isinstance(dest, str): dest = g[dest] for inp in dest.inputs: if isinstance(inp, str): innode = g[inp] else: innode = inp innode.outputs = [i for i in innode.outputs if i != dest.name] dest.inputs = [] def disconnect_vertex_control_ins(g, dest): if isinstance(dest, str): dest = g[dest] for inp in dest.control_inputs: if isinstance(inp, str): innode = g[inp] else: innode = inp innode.control_outputs = [i for i in innode.control_outputs if i != dest.name] dest.control_inputs = [] def disconnect_vertex_control_outs(g, source): if isinstance(source, str): source = g[source] for out in source.control_outputs: g[out].control_inputs = [i for i in g[out].control_inputs if i != source.name] source.control_outputs = [] def delete_node(g, node): if not isinstance(node, str): node = node.name disconnect_vertex_ins(g, node) disconnect_vertex_outs(g, node) disconnect_vertex_control_ins(g, node) disconnect_vertex_control_outs(g, node) del g[node] def replace_node(g, original_node, new_node): if isinstance(new_node, str): new_node = g[new_node] if not isinstance(original_node, str): original_node = original_node.name for o in list(g[original_node].control_outputs): replace_control_source(g, original_node, o, new_node) for o in list(g[original_node].outputs): replace_source(g, original_node, o, new_node) for i in list(g[original_node].control_inputs): replace_control_dest(g, i, original_node, new_node) for i in list(g[original_node].inputs): replace_dest(g, i, original_node, new_node) def fill_outputs(gd): """ Fills the output lists of of a graph of ParsedNode Takes a graph in "dict{str, ParsedNode}" form, and returns a new graph. """ # fill outputs for k, v in gd.items(): for i in v.inputs: gd[i].outputs.append(v.name) for i in v.control_inputs: gd[i].control_outputs.append(v.name) get_tuple_ops = ["Split", "SplitV", "LSTMBlock", "NonMaxSuppressionV5"] for k, v in gd.items(): if v.op in get_tuple_ops: outputs = [[out, int(gd[out].attr["index"])] for out in v.outputs] outputs.sort(key=lambda x: x[1]) gd[k].outputs = [out for [out, _] in outputs] return gd def check_connections(gd): """ Given a graph, checks that all - inputs/outputs are symmetric - control_inputs/control_outputs are symmetric - The graph does not reference vertices outside of the graph Takes a graph in "dict{str, ParsedNode}" form. Does not return, asserts false on failure. """ # check that inputs and outputs line up for k, v in gd.items(): for i in v.inputs: if isinstance(i, str): assert k in gd[i].outputs else: assert k in gd[i.name].outputs for i in v.outputs: inputs = [ inp if isinstance(inp, str) else inp.name for inp in gd[i].inputs ] assert k in inputs for i in v.control_inputs: if isinstance(i, str): assert k in gd[i].control_outputs else: assert k in gd[i.name].control_outputs for i in v.control_outputs: control_inputs = [ inp if isinstance(inp, str) else inp.name for inp in gd[i].control_inputs ] assert k in control_inputs def const_determined_nodes(gd, assume_variable_nodes=None): """ Given a graph, extract all nodes that only depends on const nodes. # TODO: extract nodes that depends on the "const part" of placeholders. 
""" if assume_variable_nodes is None: assume_variable_nodes = [] vis = {} def visit(node): # make sure node is a ParsedNode if isinstance(node, str): node = gd[node] if node.name in vis: return if "Const" in node.op: vis[node.name] = True elif "Variable" in node.op: vis[node.name] = False elif "Placeholder" in node.op: vis[node.name] = False # TF1 uses TensorArray* while TF2 uses TensorList* ops elif "TensorArray" in node.op or "TensorList" in node.op: vis[node.name] = False elif "function" in node.op: vis[node.name] = False elif "global" in node.op: vis[node.name] = False elif "FakeQuant" in node.op: vis[node.name] = False elif node.name in assume_variable_nodes: vis[node.name] = False else: ret = True vis[node.name] = False for innode in node.inputs: if isinstance(innode, str): inname = innode else: inname = innode.name if inname not in vis: visit(innode) if not vis[inname]: ret = False break vis[node.name] = ret for k, v in gd.items(): if k in vis: continue visit(k) ret = [] for k, v in vis.items(): if v: ret.append(k) return ret def topsort(graph): if len(graph) == 0: return [] inedge_count = {k: len(v.inputs) + len(v.control_inputs) for k, v in graph.items()} ret = [] curboundary = [k for k, v in inedge_count.items() if v == 0] nextboundary = [] if len(curboundary) == 0: raise ValueError("Graph is not a DAG!") while len(curboundary) > 0: ret.extend(curboundary) for b in curboundary: for o in graph[b].outputs + graph[b].control_outputs: inedge_count[o] -= 1 if inedge_count[o] == 0: nextboundary.append(o) curboundary = nextboundary nextboundary = [] if len(ret) != len(graph): raise ValueError("Graph is not a DAG!") return ret def simple_topsort(inputs): if len(inputs) == 0: return [] outputs = {k: [] for k in inputs} for k in inputs: for o in inputs[k]: outputs[o].append(k) inedge_count = {k: len(v) for k, v in inputs.items()} ret = [] curboundary = [k for k, v in inedge_count.items() if v == 0] nextboundary = [] if len(curboundary) == 0: raise ValueError("Graph is not a DAG!") while len(curboundary) > 0: ret.extend(curboundary) for b in curboundary: for o in outputs[b]: inedge_count[o] -= 1 if inedge_count[o] == 0: nextboundary.append(o) curboundary = nextboundary nextboundary = [] if len(ret) != len(inputs): raise ValueError("Graph is not a DAG!") return ret ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/convert_utils.py0000644000000000000000000001651114672066616027313 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from collections import defaultdict from tqdm import tqdm as _tqdm from coremltools import _logger as logger from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.types.symbolic import (any_variadic, is_symbolic) from coremltools.converters.mil.mil.var import ListVar from .basic_graph_ops import topsort from .tf_op_registry import _TF_OPS_REGISTRY def compatible_shapes(tf_shape, inf_shape): def compare_elem(dt, ds): if dt is None or dt < 0: return True elif dt == ds: return True elif is_symbolic(ds): if is_symbolic(dt) and dt != ds: logger.warning("Symbolic dim {} and {}".format(ds, dt) +\ " assumed to be equal") return True else: return False if tf_shape is None or any_variadic(inf_shape): return True else: return all(compare_elem(dt, ds) for dt, ds in zip(tf_shape, inf_shape)) def check_output_shapes(x, node): """ x: list[Var] or tuple[Var] node: ParsedTFNode """ if isinstance(x, ListVar): # No check on list. return if not isinstance(x, (list, tuple)): x = [x] tf_shapes = node.attr.get("_output_shapes", None) if tf_shapes is None: return inf_shapes = [] for y in x: if y is None: msg = "TF convert returns None type in TF node {}" raise TypeError(msg.format(node.name)) if types.is_tensor(y.sym_type): inf_shapes.append(list(y.shape)) elif types.is_scalar(y.sym_type): inf_shapes.append([]) else: msg = "Output type {} not understood" raise ValueError(msg.format(y)) for t, s in zip(tf_shapes, inf_shapes): if not compatible_shapes(t, s): msg = ( "Op {} ({}) type inference ({}) and TF output shape " + "({}) mismatch" ) raise ValueError(msg.format(node.name, node.op, s, t)) def connect_global_initializer(graph): # In TF1, variable initialization (from frozen graph) is done by a # DAG in main function that is disconnected from the rest of the main # function. For example: # # Initialization DAG (disconnected from Main DAG): # Const -> set_global(variable='v1') # # Main DAG: # Placeholder --- # | # get_global(variable='v1') ----> some_output # # (Note that in this example there's no loop or other function.) # # If the variable does not cross block boundary, we can always represent # `get_global` by the input to `set_global`, which may or may not be # Const, following the control dependency. # # Note that this is incorrect if global variable crosses, say, # while_loop block boundary, which needs a more complex resource inference # to support and is not supported in this function. # # Due to the lack of control depeendency between thhe two DAG, we could be # converting `set_global` after `get_global`, which makes it impossible to # perform eager type inference, as type information (e.g., tensor shape) # is only provided by `set_global` (whether setting it to a const or a # non-const). # # Here we remedy the simpler case: when `set_global` takes in a Const, # we assume it's initialization and thus must # run before get_global, i.e. all get_global(variable='v1') must be a # control_output of set_global(variable='v1') where set_global's input is # Const (with and control_inputs set symmetrically). Note that multiple # `get_global(variable='v1')` might have dependences among themselves, but # they should all take the constant `set_global(variable='v1')` as control # dependency. # Phase 1: Collect get_global nodes for each variable. 
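# Illustrative-only sketch of the per-dimension comparison used by
# `compatible_shapes` earlier in this module, with the symbolic-dim branch
# left out for brevity: unknown TF dims (None / negative) match anything,
# concrete dims must match exactly. The helper name is an assumption.
def _dims_compatible(tf_dim, inferred_dim):
    if tf_dim is None or tf_dim < 0:
        return True  # TF did not record this dim; anything is acceptable
    return tf_dim == inferred_dim

assert _dims_compatible(None, 5)
assert _dims_compatible(-1, 7)
assert _dims_compatible(3, 3)
assert not _dims_compatible(3, 4)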
# variable name to list[ParsedTFNode] var_to_get_global_nodes = defaultdict(list) for node in graph.values(): if node.op == "get_global": variable_name = node.attr["variable"] var_to_get_global_nodes[variable_name].append(node) # Phase 2: Find set_global with compile time values for node_name, node in graph.items(): if node.op != "set_global": continue input_name = node.inputs[0] input_node = graph[input_name] if input_node.op != "Const": continue variable_name = node.attr["variable"] for get_node in var_to_get_global_nodes[variable_name]: logger.info( "add {} as control inputs of {}".format(node_name, get_node.name) ) get_node.control_inputs.append(node_name) node.control_outputs.append(get_node.name) def convert_graph(context, graph, outputs=None): """ Construct Core ML ops corresponding to `graph`. Inputs: - context (TranscriptContext) - graph (dict of str -> ParsedTFNode): op name --> ParsedTFNode - outputs (list[str]): List of output names. If outputs is None, the last node graph (after topsort) must have op type return. Returns: list[Var]: the output Vars of the constructed Block. """ connect_global_initializer(graph) nodes = topsort(graph) if outputs is None: # infer outputs from return last_node = graph[nodes[-1]] if last_node.op != "return": msg = "Expect the last node in graph to be 'return'; Got {}" raise ValueError(msg.format(last_node.op)) second_last_node = graph[last_node.inputs[0]] if second_last_node.op == "make_tuple": outputs = second_last_node.inputs else: # single output function outputs = second_last_node.name # Translate the non-placeholder ops. num_nodes = len(nodes) for i, node_name in enumerate( _tqdm(nodes, desc="Converting TF Frontend ==> MIL Ops", unit=" ops") ): node = graph[node_name] if node.op == "return": continue logger.info( "[{}/{}] Converting {} op '{}'".format(i + 1, num_nodes, node.op, node.name) ) if node.op in ("NoOp", "Assert"): continue add_op = _TF_OPS_REGISTRY.get(node.op, None) if add_op is None: msg = "Conversion for TF op '{0}' not implemented.\n \n{1}".format( node.op, node.original_node ) raise NotImplementedError(msg) add_op(context, node) if len(node.outputs) > 0: # set_global / get_global / NoOp has no direct consumer / outputs x = context[node.name] check_output_shapes(x, node) output_is_list = isinstance(outputs, (tuple, list)) if not output_is_list: outputs = [outputs] output_vars = [] for output in outputs: x = context[output.split(":")[0]] if isinstance(x, (tuple, list)): idx = int(output.split(":")[1]) output_vars.append(x[idx]) else: output_vars.append(x) return output_vars if output_is_list else output_vars[0] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/converter.py0000644000000000000000000005634614672066616026434 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
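# A small illustrative sketch (toy names, not part of this module) of how
# `convert_graph` above resolves TF tensor names of the form "op_name:index"
# against ops that return a tuple of Vars: the text before ":" selects the op
# in the context and the integer after ":" indexes into its outputs.
def _select_output(context, tensor_name):
    op_name, _, index = tensor_name.partition(":")
    value = context[op_name]
    if isinstance(value, (tuple, list)):
        return value[int(index)] if index else value[0]
    return value

toy_context = {"while_0": ("exit_var", "exit_var_1"), "relu_0": "relu_var"}
assert _select_output(toy_context, "while_0:1") == "exit_var_1"
assert _select_output(toy_context, "relu_0") == "relu_var"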
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools import _logger as logger from coremltools.converters._profile_utils import _profile from coremltools.converters.mil import mil from coremltools.converters.mil._deployment_compatibility import AvailableTarget as _target from coremltools.converters.mil.input_types import ImageType, InputType, RangeDim from coremltools.converters.mil.input_types import Shape as InputShape from coremltools.converters.mil.input_types import TensorType, _get_shaping_class from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Function, get_new_symbol, types from coremltools.converters.mil.mil.types.symbolic import is_symbolic from coremltools.converters.mil.mil.var import Var from .._utils import get_output_names from .basic_graph_ops import simple_topsort from .convert_utils import convert_graph # TranscriptionContext maintains a map of tf_node.name --> ssa_var available # to the current TF --> tfssa transcription. class TranscriptionContext: def __init__(self, name=None): self.name = name if name is not None else "" self.context = {} self.graphs = {} # TF loops are represented as functions, so nested loops becomes # stacked functions. Stacked functions are translated to nested # blocks in Program, like # # while_loop(loop_vars=(%a, %b)) # cond_block1(%a.x, %b.x) { # ...some ops # } -> (%bool_var1) # body_block1(%a.x, %b.x) { # %ret_axx = while_loop(loop_vars=(%a.x,)) # cond_block2(%a.x.x) { # ...some ops # } -> (%bool_var2) # body_block2(%a.x.x) { # ...some ops # } -> (%new_a.x.x) # } -> (%ret_axx) # ....some ops using %ret_a # } -> (%ret_ax, %ret_bx) # # During the translation of cond_block2, we'd have func_input_stack # # (%a.x.x,) # (%a.x, %b.x) # # where [%a.x.x] would be unstacked once cond_block2 is done. self.func_input_stack = [] # list of tuple[Var] def add(self, tf_name, ssa_vars, is_new_var=True): """ ssa_vars: list[Var] / tuple[Var] (multiple outputs) or Var (single_output) is_new_var: True if ssa_vars are newly created for tf_name. """ if tf_name in self.context: # Overriding allow us to translate while_loop body twice (which is # needed to figure out shapes changes during iterates) msg = "TF var %s is added again. Overriding previous value" logger.info(msg % tf_name) if is_new_var and isinstance(ssa_vars, Var) and tf_name != ssa_vars.name: msg = ( "MIL op's name ({}) does not match TensorFlow's node name ({})." " Warning: Node added to context must have the same name as the name passed to context." 
) raise ValueError(msg.format(tf_name, ssa_vars.name)) self.context[tf_name] = ssa_vars def add_graph(self, graph_name, graph): self.graphs[graph_name] = graph def get_graph(self, graph_name): if graph_name not in self.graphs: msg = "Graph '{}' not found in: {}" raise KeyError(msg.format(graph_name, list(self.graphs.keys()))) return self.graphs[graph_name] def stack_func_inputs(self, inputs): self.func_input_stack.append(inputs) def unstack_func_inputs(self): if len(self.func_input_stack) == 0: raise ValueError("No func input available") self.func_input_stack.pop() def get_func_inputs(self): if len(self.func_input_stack) == 0: raise ValueError("No func input available") return self.func_input_stack[-1] def __getitem__(self, tf_name): if tf_name not in self.context: msg = "TF var {} not found in context {}" raise KeyError(msg.format(tf_name, self.name)) return self.context[tf_name] def __contains__(self, tf_name): return tf_name in self.context class TFConverter: def __init__( self, tfssa, inputs=None, outputs=None, opset_version=None, use_default_fp16_io=False ): """ tfssa: TensorFlow IR. inputs: list of TensorType or ImageType, optional, defaults to None. outputs: list[ct.InputType] or None list of either ct.TensorTypes or ct.ImageTypes (both of which are child classes of InputType) This is the value of the "outputs" argument, passed on by the user in "coremltools.convert" API. opset_version: An int represents the Core ML opset version. use_default_fp16_io (optional): bool. Defaults to False. When minimum_deployment_target set >= ct.target.iOS16 (the same as ct.target.macOS13), and the compute precision set to fp16, this flag is True. When True, fp32 i/o defaults to fp16. """ self.tfssa = tfssa self.global_type = {} self.inputs = None self.main_output_types = outputs self.opset_version = _target(opset_version) if opset_version is not None else None self.use_default_fp16_io = use_default_fp16_io output_names = get_output_names(outputs) main_func = tfssa.functions["main"] graph = main_func.graph # Get inputs dtype and shape defined in the tf graph tf_placeholder_names = [n for n in graph if graph[n].op == "Placeholder"] tf_input_dtype = {} tf_input_shape = {} image_input_names = [] inputs_with_defined_shape = [] if inputs is not None: # Special case: if there's only 1 input and 1 placeholder, we match them. 
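# Illustrative-only sketch (toy values, not part of this class) of the
# `func_input_stack` convention in TranscriptionContext above: the inputs of
# the function currently being translated sit on top of the stack, so nested
# while-loop blocks each resolve against their own input Vars.
func_input_stack = []
func_input_stack.append(("a.x", "b.x"))    # entered outer while_loop body
func_input_stack.append(("a.x.x",))        # entered nested while_loop body
assert func_input_stack[-1] == ("a.x.x",)  # innermost inputs win
func_input_stack.pop()                     # left the nested body
assert func_input_stack[-1] == ("a.x", "b.x")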
if len(tf_placeholder_names) == 1 and len(inputs) == 1: if inputs[0].name is None: inputs[0].name = tf_placeholder_names[0] for val in inputs: if isinstance(val, ImageType): image_input_names.append(val.name) if val.shape is not None: inputs_with_defined_shape.append(val.name) for inp in main_func.inputs: node = graph[inp] # Parse dtype from the tf graph dtype = node.attr["dtype"] if use_default_fp16_io and dtype == types.fp32 and inp not in image_input_names: dtype = types.fp16 tf_input_dtype[inp] = dtype # Parse shape from the tf graph if inp not in inputs_with_defined_shape: shape = self._get_placeholder_shape_from_tf_graph(tfgraph=graph, name=inp) shape = [get_new_symbol() if s is None or s == -1 else s for s in shape] shape = _get_shaping_class(shape) tf_input_shape[inp] = shape # Filter the inputs to only Placeholder names missing_placeholder_names = [] if inputs is not None: # Check inputs format if not isinstance(inputs, (list, tuple)): raise ValueError( "Type of inputs should be list or tuple, got {} instead.".format( type(inputs) ) ) if not all([isinstance(i, InputType) for i in inputs]): raise ValueError( "Type of inputs should be list or tuple of TensorType or ImageType, got {} instead.".format( [type(i) for i in inputs] ) ) for inp in inputs: # Check inputs existence if inp.name is None: raise ValueError( "Multiple inputs are found in graph, but no input name was provided" ) if inp.name not in tf_placeholder_names: raise ValueError( "Input ({}) provided is not found in given tensorflow graph. Placeholders in graph are: {}".format( inp.name, tf_placeholder_names ) ) # We fill in shapes and dtypes for user-specified input that doesn't set if inp.shape is None: inp.shape = tf_input_shape[inp.name] if inp.dtype is None: inp.dtype = tf_input_dtype[inp.name] # Extract placeholders that users didn't specify. user_input_names = [inp.name for inp in inputs] for name in tf_placeholder_names: if name not in user_input_names: missing_placeholder_names.append(name) else: inputs = [] missing_placeholder_names = tf_placeholder_names # name -> (shape, mil_type) mapping. shape has type list[int] added_inputs = {} for inp in main_func.inputs: if inp not in missing_placeholder_names: continue shape, dtype = tf_input_shape[inp], tf_input_dtype[inp] inputs.append(TensorType(name=inp, shape=shape, dtype=dtype)) added_inputs[inp] = (shape, dtype) if len(added_inputs) > 0: logger.info( "Adding Input not specified by users: '{}'".format( added_inputs) ) for idx, inp in enumerate(inputs): # We set the default image format in TF as NHWC, since NHWC is used # for TF unless GPU is specified as device. if isinstance(inp, ImageType) and inputs[idx].channel_first is None: inputs[idx].channel_first = False self.inputs = tuple(inputs) for inputtype in self.inputs: if not isinstance(inputtype.shape, InputShape): continue if any([isinstance(s, RangeDim) for s in inputtype.shape.shape]): continue if inputtype.name not in graph: raise ValueError( f"The input {inputtype.name} provided is not in graph." 
) node = graph[inputtype.name] shape = [-1 if is_symbolic(s) else s for s in inputtype.shape.shape] node.attr["_output_shapes"] = [shape] # list of length 1 # infer outputs if not provided self._validate_outputs(tfssa, output_names) output_names = main_func.outputs if output_names is None else output_names output_names = output_names if isinstance(output_names, (tuple, list)) else [output_names] output_names = [x if isinstance(x, str) else x.name for x in output_names] self.output_names = output_names # We would like a stack so that we run conversion sequentially. self.graph_stack = self._get_stack(tfssa, root="main") self.context = TranscriptionContext() def _get_placeholder_shape_from_tf_graph(self, tfgraph, name): error_message = "Unable to determine the shape of input: {}." \ " Please provide its shape during conversion, using \n" \ "'ct.convert(..., inputs=[ct.TensorType(name='{}', shape=(_FILL_ME_) ),])".format(name, name) if tfgraph[name].attr.get("shape", None) is not None: shape = tfgraph[name].attr["shape"] elif tfgraph[name].attr.get("_output_shapes", None) is not None: shape = tfgraph[name].attr["_output_shapes"][0] if shape is None: raise ValueError(error_message) else: raise ValueError(error_message) return shape def _get_stack(self, tfssa, root="main"): # We're trying to get a order of how to loop through the graphs. # This is NOT necessarily a DAG. dep = {x: [] for x in tfssa.functions} for fname in tfssa.functions: for node in tfssa.functions[fname].graph.values(): func_x, func_y = None, None if node.op == "while": func_x = node.attr["body_function"] func_y = node.attr["cond_function"] if func_x and fname not in dep[func_x]: dep[func_x].append(fname) if func_y and fname not in dep[func_y]: dep[func_y].append(fname) assert len(dep[root]) == 0 graph_stack = simple_topsort(dep) return graph_stack @staticmethod def _get_tensor_name(tensor): ret = None if isinstance(tensor, str): ret = tensor else: ret = tensor.name return ret.split(":")[0] def _validate_outputs(self, tfssa, outputs): if outputs is None: return outputs = outputs if isinstance(outputs, (tuple, list)) else [outputs] output_nodes = [] for f in tfssa.functions.values(): output_nodes += list(f.outputs) all_nodes = [] for f in tfssa.functions.values(): all_nodes += list(f.graph.keys()) for n in outputs: if self._get_tensor_name(n) not in output_nodes + all_nodes: raise KeyError('Output node name "{}" does exist.'.format(n)) def _validate_and_update_main_output_types(self, prog): assert isinstance(self.main_output_types, list) assert len(self.main_output_types) > 0 output_vars = prog.functions["main"].outputs output_vars_names = set([var.name for var in output_vars]) # validation if get_output_names(self.main_output_types) is None: # this is the case, where the user did not provide names for the outputs. # In this case, the outputs were inferred from the TF graph automatically. # There are two scenarios here: number of inferred outputs equal to 1 or greater than 1 if len(output_vars) == 1: if len(self.main_output_types) > 1: msg = "The list of ct.TensorType()/ct.ImageType() provided in the 'outputs' argument, does not " \ "have names. When more than 1 output is provided for tensorflow conversion, " \ "each entry in the outputs list must have the name specified as well, " \ "via the 'name' argument in ct.TensorType/ct.ImageType" raise ValueError(msg) else: # len(output_vars) > 1 # if there are more than 1 sink nodes (i.e. inferred outputs), the user must provide names # so that the output types can be correctly mapped. 
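# A minimal illustration (toy data, not part of this class) of the tensor-name
# normalization done by `_get_tensor_name` earlier in this class: TF tensor
# names may carry a ":<output-index>" suffix, while graph nodes are keyed by
# the bare op name, so the suffix is stripped before lookup.
def _strip_output_index(tensor_name):
    return tensor_name.split(":")[0]

assert _strip_output_index("while/Exit:0") == "while/Exit"
assert _strip_output_index("Placeholder") == "Placeholder"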
msg = "The list of ct.TensorType()/ct.ImageType() provided in the 'outputs' argument, does not " \ "have names. When names are not provided, the outputs are automatically inferred " \ "from the TF graph. There are {} outputs detected which are more than 1. " \ "In this case, to map the output types correctly, " \ "please provide names for each of the " \ "outputs. The output names inferred from the TF graph are: {} " raise ValueError(msg.format( len(output_vars), output_vars_names, )) else: # user provided output names. In this case, the appropriate tensors must have # been selected from the TF graph bases on the output names. # Verify that the names present in self.main_output_types match the output_vars_names (it should match). # Also, reconstruct the self.main_output_types list, in the same order of outputs as # present in the output_vars_names assert len(output_vars) == len(self.main_output_types), \ "this should match if the outputs were picked correctly from the TF graph" for out in self.main_output_types: if out.name not in output_vars_names: msg = "output name, '{}', not found in Tensorflow Graph. Available output names are: {}" raise KeyError(msg.format(out.name, output_vars_names)) name_to_output_type_map = {} for out in self.main_output_types: name_to_output_type_map[out.name] = out main_output_types = [] for out_var in output_vars: main_output_types.append(name_to_output_type_map[out_var.name]) self.main_output_types = main_output_types def check_placeholder_output(self, prog, outputs_name): """ Handle the cases where placeholder is output. There is a case where the program is like main(%Placeholder: (5,fp32)) { block3() { } -> (%Placeholder) } But self.output_names = ["Placeholder:0"] We need to change the block output to Placeholder:0 by inserting an identity """ block = prog["main"] input_name = [x.name for x in list(block.inputs.values())] with block: new_outputs = [] for output, output_name in zip(block.outputs, outputs_name): if output.name not in input_name or output.name == output_name: new_output = output else: new_output = mb.identity(x=output, name=output_name) new_outputs.append(new_output) block.set_outputs(new_outputs) def convert_main_graph(self, prog, graph): func_inputs = {} for input_type in self.inputs: dtype = input_type.dtype # int64 and fp64 are not supported, so they are mapped to int32 / fp32 accordingly if dtype == types.fp64: dtype = types.fp32 elif types.is_int(dtype): dtype = types.int32 func_inputs[input_type.name] = mb.placeholder( input_type.shape.symbolic_shape, dtype=dtype ) with Function(func_inputs, opset_version=self.opset_version) as ssa_func: # Get the input Var for name in func_inputs.keys(): input_var = ssa_func.inputs[name] if ( types.is_tensor(input_var.sym_type) or types.is_scalar(input_var.sym_type) ) and input_var.dtype == types.fp16: input_var = mb.cast(x=input_var, dtype="fp32", name=name) self.context.add(name, input_var) outputs = convert_graph(self.context, graph, self.output_names) ssa_func.set_outputs(outputs) prog.add_function("main", ssa_func) prog.functions["main"].set_input_types(self.inputs) # check duplicate output # Note: sometimes two outputs are pointing to the same Var, we should # create mb.identity for those cases block = prog["main"] with block: name_counts = {} new_outputs = [output for output in block.outputs] for i, v_o in enumerate(block.outputs): if v_o.name not in name_counts: name_counts[v_o.name] = 1 else: name_counts[v_o.name] += 1 new_name = v_o.name + "_duplicate_" + str(name_counts[v_o.name]) x = 
mb.identity(x=v_o, name=new_name) new_outputs[i] = x block.set_outputs(new_outputs) # Rename outputs to TF's name. This is needed when the last op doesn't # generate a new Var (e.g., get_tuple, Identity etc.), and thus the # last Var would have a different name than the last TF op's name. # # Example: # # TF code: # x = tf.placeholder(tf.float32, shape=(1,)) # y = tf.placeholder(tf.float32, shape=(1,)) # c = lambda i, j: \ # tf.less(tf.math.reduce_mean(i), tf.math.reduce_mean(j)) # b = lambda i, j: (tf.add(i, 1), j) # res = tf.while_loop(c, b, [x, y]) # # Resulting nodes (excluding the nodes in while loop cond & body): # # node name: Placeholder op type: Placeholder inputs: [] # node name: Placeholder_1 op type: Placeholder inputs: [] # node name: make_input_0 op type: make_tuple inputs: ['Placeholder', # 'Placeholder_1'] # node name: while_0 op type: while inputs: ['make_input_0'] # node name: while/Exit op type: get_tuple inputs: ['while_0'] # node name: while/Exit_1 op type: get_tuple inputs: ['while_0'] # # Observe that return node `while/Exit` is an output from get_tuple, # which in our translation simply unpack a python tuple of Vars # ('while_0:0', 'while_0:1') returned from while_0 SSA op. We need to # rename `while_0:0` to `while/Exit` in order for users to find the # output. # Note: only rename the output if the output is not Placeholder. input_names = [x.name for x in self.inputs] for v_o, out_name in zip(prog["main"].outputs, self.output_names): if v_o.name != out_name and v_o.name not in input_names: logger.info( "Renaming output var: '{}' -> '{}'".format(v_o.name, out_name) ) v_o.name = out_name self.check_placeholder_output(prog, self.output_names) # verify that if model output dtypes / names are provided by the user, they are valid if self.main_output_types is not None: self._validate_and_update_main_output_types(prog) if self.use_default_fp16_io: # get a list of names of fp32 output vars fp32_output_var_names = [ var.name for var in prog["main"].outputs if var.dtype == types.fp32 ] if self.main_output_types is not None: # set the dtype default to fp16 if main_output_types is provided for val in self.main_output_types: if ( val.name in fp32_output_var_names and isinstance(val, TensorType) and val.dtype is None ): val.dtype = types.fp16 else: # otherwise, we construct the main_output_types, to make every fp32 # output var fp16 main_output_types = [] for val in prog["main"].outputs: dtype = types.fp16 if val.name in fp32_output_var_names else None main_output_types.append(TensorType(name=val.name, dtype=dtype)) self.main_output_types = main_output_types prog.functions["main"].set_output_types(self.main_output_types) @_profile def convert(self): prog = mil.Program() if len(self.graph_stack) == 0: raise ValueError("At least one TF function must be present") if self.graph_stack[0] != "main": msg = "TF root graph must be named 'main'. Got {}" raise ValueError(msg.format(self.graph_stack[0])) graph = self.tfssa.functions["main"].graph for g_name in self.graph_stack[1:]: self.context.add_graph(g_name, self.tfssa.functions[g_name].graph) self.convert_main_graph(prog, graph) return prog ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/dialect_ops.py0000644000000000000000000001431214672066616026676 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
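# Illustrative sketch (toy names; an assumption, not this converter's API) of
# the duplicate-output handling in `convert_main_graph` above: when two model
# outputs point at the same Var, each repeat is given a uniquely suffixed name
# (via an identity copy in the real code) so output names stay distinct.
def _dedup_output_names(names):
    counts, result = {}, []
    for name in names:
        if name not in counts:
            counts[name] = 1
            result.append(name)
        else:
            counts[name] += 1
            result.append(name + "_duplicate_" + str(counts[name]))
    return result

assert _dedup_output_names(["out", "out", "aux"]) == ["out", "out_duplicate_2", "aux"]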
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import Operation, types from coremltools.converters.mil.mil.input_type import (DefaultInputs, InputSpec, TensorInputType) from coremltools.converters.mil.mil.ops.registry import SSAOpRegistry register_op = SSAOpRegistry.register_op # This file contains the TF dialect of SSA. Briefly, these ops are only # understandable in the TF frontend and not acceptable in the standard op set. # No backend would support any of the op here. These ops exist to facilitate # frontend SSA passes, but must be replaced with standard ops during SSA # passes. # All tf op must start with 'tf_' prefix. # # tf_make_list allows elem_shape to be unspecified. core op make_list does # not allow that. @register_op(namespace="tf") class tf_make_list(Operation): input_spec = InputSpec( init_length=TensorInputType(optional=True, type_domain=types.int32), dynamic_length=TensorInputType(optional=True, type_domain=types.bool), elem_shape=TensorInputType(const=True, optional=True, type_domain=types.int32), dtype=TensorInputType(const=True, optional=True, type_domain=types.str), ) def default_inputs(self): return DefaultInputs( init_length=1, dynamic_length=True, dtype="fp32", ) def type_inference(self): init_length = self.init_length.val if self.elem_shape is None or self.elem_shape.sym_val is None: return types.list( types.unknown, init_length=init_length, dynamic_length=self.dynamic_length.val, ) builtin_dtype = types.string_to_builtin(self.dtype.val) elem_type = types.tensor(builtin_dtype, self.elem_shape.sym_val) return types.list( elem_type, init_length=init_length, dynamic_length=self.dynamic_length.val ) class TfLSTMBase(Operation): """ Common LSTM inputs for BlockLSTMCell and BlockLSTM. 
""" input_spec = InputSpec( c_prev=TensorInputType(type_domain="T"), # [batch, hidden_dim] h_prev=TensorInputType(type_domain="T"), # [batch, hidden_dim] # weight: [input_dim + hidden_dim, 4*hidden_dim] (icfo layout) weight=TensorInputType(const=True, type_domain="T"), forget_bias=TensorInputType(const=True, optional=True, type_domain="T"), # cell_clip == None implies not using cell clip cell_clip=TensorInputType(const=True, optional=True, type_domain="T"), # If use_peephole == False, weight_peep_* is ignored use_peephole=TensorInputType(const=True, optional=True, type_domain=types.bool), weight_peep_i=TensorInputType(const=True, optional=True, type_domain="T"), # [hidden_dim,] weight_peep_f=TensorInputType(const=True, optional=True, type_domain="T"), # [hidden_dim,] weight_peep_o=TensorInputType(const=True, optional=True, type_domain="T"), # [hidden_dim,] bias=TensorInputType(const=True, type_domain="T"), # [4*hidden_dim] (icfo layout) ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): return DefaultInputs( forget_bias=1., use_peephole=False, ) def _check_peephole_weights(self): # Check weight_peep_* if self.use_peephole.val: if ( self.weight_peep_i is None or self.weight_peep_f is None or self.weight_peep_o is None ): raise ValueError( "weight_peep_* cannot be None when use_peephole is True" ) @register_op(namespace="tf") class tf_lstm_block_cell(TfLSTMBase): """ xh = [x, h_prev] [i, ci, f, o] = xh * w + b f = f + forget_bias if not use_peephole: wci = wcf = wco = 0 i = sigmoid(cs_prev .* wci + i) f = sigmoid(cs_prev .* wcf + f) ci = tanh(ci) cs = ci .* i + cs_prev .* f cs = clip(cs, cell_clip) o = sigmoid(cs * wco + o) co = tanh(cs) h = co .* o """ input_spec = ( InputSpec(x=TensorInputType(type_domain="T"),) + TfLSTMBase.input_spec # [batch, input_dim] ) def __init__(self, **kwargs): super(tf_lstm_block_cell, self).__init__(**kwargs) def type_inference(self): self._check_peephole_weights() # all return shapes are [batch, hidden_dim] ret_shape = self.c_prev.shape dtype = self.x.dtype # See # https://www.tensorflow.org/api_docs/python/tf/raw_ops/LSTMBlockCell # All returned shapes are [batch, hidden_dim] return ( types.tensor(dtype, ret_shape), # i types.tensor(dtype, ret_shape), # cs types.tensor(dtype, ret_shape), # f types.tensor(dtype, ret_shape), # o types.tensor(dtype, ret_shape), # ci types.tensor(dtype, ret_shape), # co types.tensor(dtype, ret_shape), ) # h @register_op(namespace="tf") class tf_lstm_block(TfLSTMBase): """ Apply LSTM to an input sequence """ input_spec = ( InputSpec( seq_len=TensorInputType(type_domain=types.int32), # int x=TensorInputType(type_domain="T"), # [padded_len, batch, input_dim] ) + TfLSTMBase.input_spec ) def type_inference(self): self._check_peephole_weights() padded_len = self.x.shape[0] ret_shape = [padded_len] + list(self.c_prev.shape) dtype = self.x.dtype # All returned shapes are [padded_len, b, hidden_dim] return ( types.tensor(dtype, ret_shape), # i types.tensor(dtype, ret_shape), # cs types.tensor(dtype, ret_shape), # f types.tensor(dtype, ret_shape), # o types.tensor(dtype, ret_shape), # ci types.tensor(dtype, ret_shape), # co types.tensor(dtype, ret_shape), ) # h ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/dot_visitor.py0000644000000000000000000001070014672066616026752 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import types class DotVisitor: """ Generates a dot description of a graph in dictionary form. """ def __init__(self, annotation=None): self.result = [] self.visited_memo = {} self.highlights = {} self.alternate_labeller = None self.annotation = annotation def labeller(self, labeller): self.alternate_labeller = labeller return self def highlight_nodes(self, nodeset, color="yellow"): for i in nodeset: self.highlights[i] = color return self def visit(self, graph, node, nodename_prefix=""): if node.name in self.visited_memo: return self # For printing datatype, breaks type if node.attr.get("symbolic_datatype", None) is not None: dtype = str(types.get_type_info(node.attr["symbolic_datatype"])) elif node.datatype is not None: dtype = str(types.get_type_info(node.datatype)) else: dtype = "Unknown" label = "" if self.alternate_labeller is not None: label = self.alternate_labeller(node) else: if len(node.outputs) == 0: label = "\\n{" + node.name + "}" if "Placeholder" in node.op: label = "\\n{" + node.name + "}" if node.op == "while": label = ( "\\n{body: " + node.attr["body_function"] + " cond:" + node.attr["cond_function"] + "}" ) if node.op == "function": label = "\\n{body: " + node.attr["function_name"] + "}" if node.op == "function_entry": label = "\\n{" + node.name + "}" label = node.op + ":" + dtype + label if node.name in self.highlights: self.result.append( '"' + nodename_prefix + node.name + '"' + '[label="' + label + '",fillcolor=%s,style=filled,fontcolor=%s]' % ( self.highlights[node.name], "violetred" if node.attr.get(self.annotation, False) else "black", ) ) else: self.result.append( '"' + nodename_prefix + node.name + '"' + '[label="' + label + '",fontcolor=%s]' % ("violetred" if node.attr.get(self.annotation, False) else "black") ) for i in node.inputs: input_name = i edge = ( '"' + nodename_prefix + input_name + '"' + " -> " + '"' + nodename_prefix + node.name + '"' ) self.result.append(edge) for i in node.control_inputs: input_name = i edge = ( '"' + nodename_prefix + input_name + '"' + " -> " + '"' + nodename_prefix + node.name + '"' ) edge = edge + " [style=dotted]" self.result.append(edge) self.visited_memo[node.name] = 1 for i in node.inputs: input_name = i if input_name[0] == "^": input_name = input_name[1:] assert input_name in graph self.visit(graph, graph[input_name], nodename_prefix) return self def visit_all(self, graph, nodename_prefix=""): for i in graph: self.visit(graph, graph[i], nodename_prefix) return self def get_result(self, graphtype="digraph", graph_name="g"): return ( graphtype + " " + graph_name + " {\n\t" + "\n\t".join(str(i) for i in self.result) + ';\n\tlabel="' + graph_name[8:] + '";\n\tfontsize=96;\n}' ) def __str__(self): return self.get_result() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/load.py0000644000000000000000000003107014672066616025327 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
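# A tiny, self-contained illustration (toy graph; not this module's API) of the
# kind of Graphviz DOT text that DotVisitor above assembles: one statement per
# node or edge, wrapped in a "digraph { ... }" block.
def _toy_dot(edges):
    lines = ['"%s" -> "%s"' % (src, dst) for src, dst in edges]
    return "digraph g {\n\t" + "\n\t".join(lines) + ";\n}"

print(_toy_dot([("Placeholder", "Relu"), ("Relu", "Identity")]))
# digraph g {
#     "Placeholder" -> "Relu"
#     "Relu" -> "Identity";
# }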
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import gc import os from tempfile import mktemp import tensorflow as tf from packaging.version import Version from tqdm import tqdm as _tqdm from coremltools import _logger as logger from coremltools._deps import _get_version from coremltools.converters._profile_utils import _profile from .._utils import get_output_names from .basic_graph_ops import fill_outputs from .converter import TFConverter from .parsed_tf_node import ParsedTFNode from .tf_graph_pass import (cond_to_where, constant_propagation, delete_asserts, delete_disconnected_nodes, delete_unnecessary_constant_nodes, functionalize_loops, fuse_dilation_conv, insert_get_tuple, quantization_pass, remove_variable_nodes, tensor_array_resource_removal) from .tfssa import NetworkEnsemble, SSAFunction class TFLoader: """Abstract class for TensorFlow model loader.""" def __init__(self, model, debug=False, **kwargs): """ TensorFlow model loader. Parameters ---------- model: TensorFlow model Model generated using TensorFlow. debug: bool, optional, defaults to False If true, display verbose logging and visualizations. kwargs: dict(str, Any), optional, defaults to None Dictionary of additional arguments. """ self.model = model self.debug = debug self.kwargs = kwargs self._graph_def = None self._tf_ssa = None @_profile def load(self): """Load TensorFlow model into MIL program.""" logger.info("Loading TensorFlow model '{}'".format(self.model)) outputs = self.kwargs.get("outputs", None) output_names = get_output_names(outputs) self._graph_def = self._graph_def_from_model(output_names) if self._graph_def is not None and len(self._graph_def.node) == 0: msg = "tf.Graph should have at least 1 node, Got empty graph." raise ValueError(msg) self._tf_ssa = self._tf_ssa_from_graph_def() del self._graph_def gc.collect() if self.debug: import graphviz dot_string = self._tf_ssa.get_dot_string( annotation=True, name_and_op_style=True, highlight_debug_nodes=[] ) graphviz.Source(dot_string).view( filename="/tmp/ssa_before_tf_passes", cleanup=True ) program = self._program_from_tf_ssa() logger.debug("program:\n{}".format(program)) return program # @abstractmethod def _graph_def_from_model(self, output_names=None): """Load TensorFlow model into GraphDef. Overwrite for different TF versions.""" pass # @abstractmethod def _tf_ssa_from_graph_def(self, fn_name="main"): """Load GraphDef and parse into NetworkEnsemble (TFSSA).""" pass # @abstractmethod def _program_from_tf_ssa(self): """Load NetworkEnsemble (TFSSA) and parse into MIL program.""" pass @staticmethod def extract_sub_graph(graph_def, outputs=None): """Extract sub-graph based on user-provided outputs.""" if outputs is None or len(outputs) == 0: return graph_def msg = "Extracting sub-graph based on outputs '{}' from the full model" logger.debug(msg.format(outputs)) outputs = outputs if isinstance(outputs, list) else [outputs] outputs = [i.split(":")[0] for i in outputs] if _get_version(tf.__version__) < Version("1.13.1"): return tf.graph_util.extract_sub_graph(graph_def, outputs) else: return tf.compat.v1.graph_util.extract_sub_graph(graph_def, outputs) class TF1Loader(TFLoader): def __init__(self, model, debug=False, **kwargs): """ TensorFlow 1.x model loader. 
Parameters ---------- model: Model created with TensorFlow 1.x One of the following model format: - TensorFlow tf.Graph object or frozen graph (.pb) file path - TensorFlow tf.keras.Model object or HDF5 (.h5) file path - TensorFlow SavedModel directory path debug: bool, optional. Defaults to False. This flag should generally be False except for debugging purposes for diagnosing conversion errors. Setting this flag to True will cause graph pass errors to be ignored, forcefully returning a NetworkEnsemble object. kwargs: dict(str, Any), optional Dictionary of additional arguments. """ TFLoader.__init__(self, model, debug, **kwargs) def _graph_def_from_model(self, output_names=None): """Overwrites TFLoader._graph_def_from_model()""" msg = "Expected model format: [tf.Graph | .pb | SavedModel | tf.keras.Model | .h5], got {}" if isinstance(self.model, tf.Graph) and hasattr(self.model, "as_graph_def"): graph_def = self.model.as_graph_def(add_shapes=True) return self.extract_sub_graph(graph_def, output_names) elif isinstance(self.model, tf.keras.Model): graph_def = self._from_tf_keras_model(self.model) return self.extract_sub_graph(graph_def, output_names) elif isinstance(self.model, str): if not os.path.exists(str(self.model)): raise ValueError('Input model "{}" does not exist'.format(self.model)) elif os.path.isfile(str(self.model)) and self.model.endswith(".pb"): if _get_version(tf.__version__) < Version("1.13.1"): with open(self.model, "rb") as f: gd = tf.GraphDef() gd.ParseFromString(f.read()) with tf.Graph().as_default() as graph: tf.import_graph_def(gd, name="") else: with tf.io.gfile.GFile(self.model, "rb") as f: gd = tf.compat.v1.GraphDef() gd.ParseFromString(f.read()) with tf.Graph().as_default() as graph: tf.graph_util.import_graph_def(gd, name="") graph_def = graph.as_graph_def(add_shapes=True) return self.extract_sub_graph(graph_def, output_names) elif os.path.isfile(str(self.model)) and self.model.endswith(".h5"): graph_def = self._from_tf_keras_model(self.model) return self.extract_sub_graph(graph_def, output_names) elif os.path.isdir(str(self.model)): graph_def = self._from_saved_model(self.model) return self.extract_sub_graph(graph_def, output_names) else: raise NotImplementedError(msg.format(self.model)) else: raise NotImplementedError(msg.format(self.model)) def _tf_ssa_from_graph_def(self, fn_name="main"): """Overwrites TFLoader._tf_ssa_from_graph_def()""" graph_dict = {} for node in self._graph_def.node: graph_dict[node.name] = ParsedTFNode(node) tensor_array_resource_removal(graph_dict) graph = insert_get_tuple(graph_dict) graph = fill_outputs(graph) delete_disconnected_nodes(graph) tf_ssa = NetworkEnsemble() tf_ssa.functions[fn_name] = SSAFunction(graph) return tf_ssa def _program_from_tf_ssa(self): """Overwrites TFLoader._mil_program_from_tf_ssa()""" # Applying frontend passes on TFSSA. Note that these are different from # passes applied to MIL in TF frontend. 
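# Illustrative-only sketch (generic names below are assumptions) of the graph
# pass pipeline pattern used in `_program_from_tf_ssa` here: each pass mutates
# the TFSSA in place and, in debug mode, a failing pass is logged and skipped
# instead of aborting the whole conversion.
def _run_passes(tf_ssa, passes, debug=False):
    for graph_pass in passes:
        try:
            graph_pass(tf_ssa)
        except Exception as e:
            if not debug:
                raise
            print("Ignoring failure in pass {}: {}".format(graph_pass.__name__, e))

_run_passes({"main": object()}, [lambda ssa: None], debug=True)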
tf_passes = [ delete_asserts, functionalize_loops, constant_propagation, delete_unnecessary_constant_nodes, # must come after constant_propagation quantization_pass, cond_to_where, remove_variable_nodes, fuse_dilation_conv, ] if self.debug: for tf_pass in _tqdm( tf_passes, desc="Running TensorFlow Graph Passes", unit=" passes" ): try: tf_pass(self._tf_ssa) except Exception as e: logger.exception('Exception in pass "{}": {}'.format(tf_pass, e)) logger.info("Ignoring exception and continuing to next pass") else: for tf_pass in _tqdm( tf_passes, desc="Running TensorFlow Graph Passes", unit=" passes" ): tf_pass(self._tf_ssa) if self.debug: import graphviz dot_string = self._tf_ssa.get_dot_string( annotation=True, name_and_op_style=True, highlight_debug_nodes=[] ) graphviz.Source(dot_string).view( filename="/tmp/ssa_after_tf_passes", cleanup=True ) converter = TFConverter( tfssa=self._tf_ssa, inputs=self.kwargs["inputs"], outputs=self.kwargs["outputs"], opset_version=self.kwargs["specification_version"], use_default_fp16_io=self.kwargs["use_default_fp16_io"], ) return converter.convert() @staticmethod def _from_saved_model(saved_model_dir): # must import here as tf.contrib is only available on TF 1.x from tensorflow.contrib.saved_model.python.saved_model import reader from tensorflow.python.tools import freeze_graph saved_model_tags = reader.get_saved_model_tag_sets(saved_model_dir)[0] if not saved_model_tags: msg = "Unsupported SavedModel directory format: no tag_sets available" raise NotImplementedError(msg) # get model outputs output_node_names = [] if _get_version(tf.__version__) < Version("1.13.1"): sess = tf.Session() else: sess = tf.compat.v1.Session() metagraph = tf.saved_model.loader.load( sess, saved_model_tags, saved_model_dir ) for sd in metagraph.signature_def.values(): output_node_names += [o.name.split(":")[0] for o in sd.outputs.values()] sess.close() # get frozen graph output_graph = mktemp() tf.compat.v1.reset_default_graph() if _get_version(tf.__version__) >= Version("1.13.1") else tf.reset_default_graph() freeze_graph.freeze_graph( input_graph=None, input_saver=None, input_binary=None, input_checkpoint=None, output_node_names=",".join(output_node_names), restore_op_name=None, filename_tensor_name=None, output_graph=output_graph, clear_devices=True, initializer_nodes="", variable_names_whitelist="", variable_names_blacklist="", input_meta_graph=None, input_saved_model_dir=saved_model_dir, saved_model_tags=",".join(saved_model_tags), ) if _get_version(tf.__version__) < Version("1.13.1"): graph_def = tf.GraphDef() with open(output_graph, "rb") as f: graph_def.ParseFromString(f.read()) graph_def = tf.graph_util.remove_training_nodes(graph_def) else: graph_def = tf.compat.v1.GraphDef() with open(output_graph, "rb") as f: graph_def.ParseFromString(f.read()) graph_def = tf.compat.v1.graph_util.remove_training_nodes(graph_def) with tf.Graph().as_default() as graph: tf.graph_util.import_graph_def(graph_def, name="") return graph.as_graph_def(add_shapes=True) @staticmethod def _from_tf_keras_model(keras_model): from tensorflow.python.framework.convert_to_constants import \ convert_variables_to_constants_v2 from tensorflow.python.keras.saving import saving_utils if not isinstance(keras_model, tf.keras.Model): keras_model = tf.keras.models.load_model(keras_model, None) tf.keras.backend.clear_session() tf.keras.backend.set_learning_phase(False) fn = saving_utils.trace_model_call(keras_model) cf = fn.get_concrete_function() try: frozen_fn = convert_variables_to_constants_v2(cf) return 
frozen_fn.graph.as_graph_def(add_shapes=True) except Exception: raise NotImplementedError("Unhandled tf.keras model format") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/naming_utils.py0000644000000000000000000000174114672066616027103 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause _varname_charset = set( [chr(i) for i in range(ord("A"), ord("Z") + 1)] + [chr(i) for i in range(ord("a"), ord("z") + 1)] + [chr(i) for i in range(ord("0"), ord("9") + 1)] + ["_"] ) def escape_name(name): ret = "".join([i if i in _varname_charset else "_" for i in name]) if ret.endswith("_"): return ret else: return ret + "_" def escape_fn_name(name): ret = "".join([i if i in _varname_charset else "_" for i in name]) ret = escape_name(name) if ret.startswith("f_"): return ret else: return "f_" + ret def normalize_names(names): if isinstance(names, str): return names.replace(":", "__").replace("/", "__") return [i.replace(":", "__").replace("/", "__") for i in names] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/ops.py0000644000000000000000000037300614672066616025221 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as _np import numpy as np from coremltools import _logger as logger from coremltools.converters.mil._deployment_compatibility import AvailableTarget as target from coremltools.converters.mil.frontend._utils import dynamic_topk from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.block import is_current_opset_version_compatible_with from coremltools.converters.mil.mil.ops.defs._utils import broadcast_shapes, promote_input_dtypes from coremltools.converters.mil.mil.types import builtin_to_string from coremltools.converters.mil.mil.types.symbolic import is_symbolic from .._utils import build_einsum_mil from .convert_utils import convert_graph from .tf_op_registry import register_tf_op def _adjust_min_max(min, max, num_bits=8): if (min <= max) and (max <= 0): min = (min - max) * 1.0 max = 0.0 elif (min >= 0) and (max >= min): max = (max - min) * 1.0 min = 0.0 else: scale = (max - min) / (2 ** num_bits - 1) min_adj = scale * round(min / scale) max_adj = max + min_adj - min min = min_adj max = max_adj return min, max def _is_scalar(type_): if type_ is None: return False result = types.is_int(type_) or types.is_float(type_) or types.is_bool(type_) if types.is_tensor(type_) and (len(type_.get_shape()) == 0): result = True return result def _transpose_NHWC_to_NCHW(x): return mb.transpose(x=x, perm=[0, 3, 1, 2]) def _transpose_NCHW_to_NHWC(x, node_name): return mb.transpose(x=x, perm=[0, 2, 3, 1], name=node_name) def _transpose_NDHWC_to_NCDHW(x): return mb.transpose(x=x, perm=[0, 4, 1, 2, 3]) def _transpose_NCDHW_to_NDHWC(x, node_name): return mb.transpose(x=x, perm=[0, 2, 3, 4, 1], name=node_name) def _check_axes_type(x): if x is None or x.val is None: return None if isinstance(x.val, _np.int32): return _np.array([x.val]) return x.val 
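# A worked illustration (toy numbers; the helper name is an assumption) of the
# range adjustment performed by `_adjust_min_max` above for FakeQuant ranges
# that straddle zero: the minimum is snapped onto the quantization grid so that
# 0.0 becomes exactly representable, and the maximum is shifted by the same
# amount, preserving the span.
def _snap_range_to_grid(lo, hi, num_bits=8):
    scale = (hi - lo) / (2 ** num_bits - 1)
    lo_adj = scale * round(lo / scale)
    return lo_adj, hi + lo_adj - lo

lo, hi = _snap_range_to_grid(-0.3, 1.0)
scale = (1.0 - (-0.3)) / 255
assert abs(lo / scale - round(lo / scale)) < 1e-9   # lo now sits on the grid
assert abs((hi - lo) - (1.0 - (-0.3))) < 1e-9       # span is preserved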
def _value_at(x, idx): """ input x: 1D tensor (vector). return value at index idx. x[idx]. """ assert x.rank == 1 return mb.slice_by_index(x=x, begin=[idx], end=[0], squeeze_mask=[True]) def _freq_to_mel(freq): return 1127.0 * _np.log(1 + freq / 700.0) def _get_MFCC_constants(spectrogram_N, sample_rate, upper_frequency_limit, lower_frequency_limit, filterbank_channel_count, dct_coefficient_count): """ params: spectrogram_N : int sample_rate: int upper_frequency_limit : int filterbank_channel_count : int dct_coefficient_count : int returns: array(shape: (spectrogram_N,)) array(shape: (spectrogram_N, filterbank_channel_count)) array(shape: (spectrogram_N, filterbank_channel_count)) array(shape: (filterbank_channel_count, dct_coefficient_count)) reference: https://github.com/tensorflow/tensorflow/blob/dec8e0b11f4f87693b67e125e67dfbc68d26c205/tensorflow/core/kernels/mfcc_mel_filterbank.cc """ center_frequencies = _np.zeros((filterbank_channel_count + 1)) mel_low = _freq_to_mel(lower_frequency_limit) mel_hi = _freq_to_mel(upper_frequency_limit) mel_span = mel_hi - mel_low mel_spacing = mel_span / (filterbank_channel_count + 1) for i in range(filterbank_channel_count + 1): center_frequencies[i] = mel_low + (mel_spacing * (i + 1)) hz_per_sbin = 0.5 * sample_rate / (spectrogram_N - 1) start_index = int(1.5 + (lower_frequency_limit / hz_per_sbin)) end_index = int(upper_frequency_limit / hz_per_sbin) band_mapper = _np.zeros((spectrogram_N)) channel = 0 for i in range(spectrogram_N): melf = _freq_to_mel(i * hz_per_sbin) if (i < start_index) or (i > end_index): band_mapper[i] = -2 else: while channel < filterbank_channel_count and center_frequencies[channel] < melf: channel += 1 band_mapper[i] = channel - 1 # Can be == -1 weights = _np.zeros((spectrogram_N)) for i in range(spectrogram_N): channel = int(band_mapper[i]) if (i < start_index) or (i > end_index): weights[i] = 0 else: if channel >= 0: weights[i] = (center_frequencies[channel + 1] - _freq_to_mel(i * hz_per_sbin)) / ( center_frequencies[channel + 1] - center_frequencies[channel]) else: weights[i] = (center_frequencies[0] - _freq_to_mel(i * hz_per_sbin)) / (center_frequencies[0] - mel_low) mat_spec_val = _np.zeros((spectrogram_N, filterbank_channel_count)) mat_weighted = _np.zeros((spectrogram_N, filterbank_channel_count)) for i in range(start_index, end_index + 1): # For each FFT bin channel = int(band_mapper[i]) if channel >= 0: mat_weighted[i, channel] = 1 # Right side of triangle, downward slope channel += 1 if channel < filterbank_channel_count: mat_weighted[i, channel] = -1 # Left side of triangle mat_spec_val[i, channel] = 1 # Left side of triangle # compute the dct matrix cosines = _np.zeros((filterbank_channel_count, dct_coefficient_count)) fnorm = _np.sqrt(2.0 / filterbank_channel_count) arg = _np.pi / filterbank_channel_count for i in range(filterbank_channel_count): for j in range(dct_coefficient_count): cosines[i, j] = fnorm * _np.cos(j * arg * (i + 0.5)) return weights, mat_weighted, mat_spec_val, cosines def _reshape_remaining_dimensions_to_canonical_shape(x, remaining_rank): # An utility function that reshape a tensor with shape [batch, spatial_dims, remaining_dim_1, ..., remaining_dim_N] # to [batch, spatial_dims, remaining_dim_1 * ... 
* remaining_dim_N] # For the special case where there is no remaining dimensions, we expand the last axis assert remaining_rank != 1 if remaining_rank == 0: return mb.expand_dims(x=x, axes=[-1]) else: x_shape = mb.shape(x=x) batch_and_spatial_shape = mb.slice_by_size(x=x_shape, begin=[0], size=[x.rank-remaining_rank]) reshape_shape = mb.concat(values=[batch_and_spatial_shape, [-1]], axis=0) return mb.reshape(x=x, shape=reshape_shape) def _reshape_remaining_dimension_to_original_shape(x, original_shape, remaining_rank): # An utility function that reshape the tensor with shape [batch_new, spatial_dims_new, remaining_dims] to the original # form, which is [batch_new, spatial_dims_new, remaining_dim_1, ..., remaining_dim_N] assert remaining_rank != 1 if remaining_rank == 0: return mb.squeeze(x=x, axes=[-1]) else: x_shape = mb.shape(x=x) spatial_rank = original_shape.shape[0] - remaining_rank - 1 batch_and_spatial_shape = mb.slice_by_size(x=x_shape, begin=[0], size=[1+spatial_rank]) remaining_shape = mb.slice_by_size(x=original_shape, begin=[1+spatial_rank], size=[-1]) reshape_shape = mb.concat(values=[batch_and_spatial_shape, remaining_shape], axis=0) return mb.reshape(x=x, shape=reshape_shape) @register_tf_op(tf_alias=["BiasAdd", "AddV2"]) def Add(context, node): x = context[node.inputs[0]] y = context[node.inputs[1]] x, y = promote_input_dtypes([x, y]) if "data_format" in node.attr and node.attr["data_format"] == "NCHW": if x.rank != 1 and y.rank != 1: raise AssertionError("Bias needs to have its rank equals to 1") bias, data = (y, x) if y.rank == 1 else (x, y) if not data.rank >= 3: raise AssertionError("Data needs to be of at least ranke 3") axes = [-(i + 1) for i in range(data.rank - 2)] x = data y = mb.expand_dims(x=bias, axes=axes, name=node.name + "_expanded_bias") x = mb.add(x=x, y=y, name=node.name) context.add(node.name, x) @register_tf_op def AddN(context, node): values = [context[name] for name in node.inputs] if len(values) == 1: Identity(context, node) return prev_var = values[0] for idx, var in enumerate(values[1:]): if var == values[-1]: x = mb.add(x=prev_var, y=var, name=node.name) else: prev_var = mb.add(x=prev_var, y=var, name=node.name + "_tmpAddN_" + str(idx)) context.add(node.name, x) @register_tf_op def Abs(context, node): x = context[node.inputs[0]] x = mb.abs(x=x, name=node.name) context.add(node.name, x) @register_tf_op def Acos(context, node): x = context[node.inputs[0]] x = mb.acos(x=x, name=node.name) context.add(node.name, x) @register_tf_op def All(context, node): x = context[node.inputs[0]] axes = _check_axes_type(context[node.inputs[1]]) keep_dims = node.attr.get("keep_dims", False) x = mb.cast(x=x, dtype="int32") x = mb.reduce_prod(x=x, axes=axes, keep_dims=keep_dims) x = mb.cast(x=x, dtype="bool", name=node.name) context.add(node.name, x) @register_tf_op def Any(context, node): x = context[node.inputs[0]] axes = _check_axes_type(context[node.inputs[1]]) keep_dims = node.attr.get("keep_dims", False) x = mb.cast(x=x, dtype="int32") x = mb.reduce_sum(x=x, axes=axes, keep_dims=keep_dims) x = mb.cast(x=x, dtype="bool", name=node.name) context.add(node.name, x) @register_tf_op def ArgMax(context, node): x = context[node.inputs[0]] axis = context[node.inputs[1]] x = mb.reduce_argmax(x=x, axis=axis, name=node.name) context.add(node.name, x) @register_tf_op def ArgMin(context, node): x = context[node.inputs[0]] axis = context[node.inputs[1]] x = mb.reduce_argmin(x=x, axis=axis, name=node.name) context.add(node.name, x) @register_tf_op def Asin(context, node): x = 
context[node.inputs[0]] x = mb.asin(x=x, name=node.name) context.add(node.name, x) @register_tf_op def Atan(context, node): x = context[node.inputs[0]] x = mb.atan(x=x, name=node.name) context.add(node.name, x) @register_tf_op def Atanh(context, node): x = context[node.inputs[0]] x = mb.atanh(x=x, name=node.name) context.add(node.name, x) @register_tf_op def AvgPool(context, node): x = context[node.inputs[0]] in_shape = x.sym_type.get_shape() d_rank = len(in_shape) - 2 data_format = node.attr.get("data_format", "NHWC") ksize = node.attr.get("ksize", None) kernel_sizes = _pool_pads_or_strides(ksize, data_format, d_rank) strides = node.attr.get("strides", None) if strides is not None: strides = _pool_pads_or_strides(strides, data_format, d_rank) pad_type = node.attr["padding"].lower() if data_format == "NHWC": x = _transpose_NHWC_to_NCHW(x) x = mb.avg_pool( x=x, kernel_sizes=kernel_sizes, strides=strides, pad_type=pad_type, exclude_padding_from_average=True, ) x = _transpose_NCHW_to_NHWC(x, node.name) else: x = mb.avg_pool( x=x, kernel_sizes=kernel_sizes, strides=strides, pad_type=pad_type, exclude_padding_from_average=True, name=node.name, ) context.add(node.name, x) @register_tf_op def AvgPool3D(context, node): x = context[node.inputs[0]] d_rank = x.rank - 2 data_format = node.attr.get("data_format", "NDHWC") ksize = node.attr.get("ksize", None) kernel_sizes = _pool_pads_or_strides(ksize, data_format, d_rank) strides = node.attr.get("strides", None) if strides is not None: strides = _pool_pads_or_strides(strides, data_format, d_rank) pad_type = node.attr["padding"].lower() if data_format == "NDHWC": x = _transpose_NDHWC_to_NCDHW(x) x = mb.avg_pool( x=x, kernel_sizes=kernel_sizes, strides=strides, pad_type=pad_type, exclude_padding_from_average=True, ) x = _transpose_NCDHW_to_NDHWC(x, node.name) else: x = mb.avg_pool( x=x, kernel_sizes=kernel_sizes, strides=strides, pad_type=pad_type, exclude_padding_from_average=True, name=node.name, ) context.add(node.name, x) @register_tf_op def BatchToSpaceND(context, node): # In tensorflow, the input tensor has the shape of (batch,) + spatial_shape + remaining_shape. # The shape is treated as a combination of 3 components: # 1. A single batch dimension # 2. Spatial dimensions, with a length spatial_rank, which could be neither 1 or 2. Also, spatial_rank # is equal to the length of block_shape # 3. Remaining dimensions, with a length remaining_rank # The logic of translating this op is as followed: # 1. We first reshape the input to a canonical shape (rolling the remaining shape dimensions into a # single dimension): (batch,) + spatial_shape + (R), where R = remaining_dim_1 * ... * remaining_dim_n # 2. We support rank 1 and rank 2 spatial shape: # (i) rank 1: We decompose the BatchToSpace into small basic ops. # (ii) rank 2: We directly use the built in batch_to_space op. # The output would have shape (batch_new,) + spatial_shape_new + (R) # 3. 
We transform the tensor back, by unrolling the remaining shape: (B_new,) + spatial_shape_new + remaining_shape x = context[node.inputs[0]] block_shape = context[node.inputs[1]].val crops = context[node.inputs[2]] original_shape = mb.shape(x=x) input_rank = x.rank spatial_rank = len(block_shape) remaining_rank = x.rank - 1 - spatial_rank has_non_unity_remaining_dims = remaining_rank != 1 if block_shape is None: raise NotImplementedError("Not support dynamic block_shape for BatchToSpaceND!") if crops.val is not None: is_static_crops = True crops = crops.val else: is_static_crops = False if has_non_unity_remaining_dims: # Reshape the input tensor to shape [batch, spatial_shape, remaining_dim_1 * ... * remaining_dim_N] x = _reshape_remaining_dimensions_to_canonical_shape(x, remaining_rank) if spatial_rank >= 3: raise NotImplementedError("Rank of spatial shape > 2 is not supported.") if spatial_rank == 2: # Tensor has shape [B, H, W, C], we can directly use the batch_to_space op by doing # [B, H, W, C] -> transpose -> [B, C, H, W] -> batch_to_space -> [B_new, C, H_new, W_new] -> # transpose -> [B_new, H_new, W_new, C] x = mb.transpose(x=x, perm=[0, 3, 1, 2]) if is_static_crops: x = mb.batch_to_space(x=x, block_shape=block_shape, crops=crops, name=node.name) else: x = mb.batch_to_space( x=x, block_shape=block_shape, crops=_np.zeros((2, 2), _np.int32), name=node.name ) # crop_height, crop_width = crops[0, :], crops[1, :] crop_height = mb.slice_by_index( x=crops, begin=[0, 0], end=[0, 0], begin_mask=[False, True], end_mask=[False, True], squeeze_mask=[True, False], ) crop_width = mb.slice_by_index( x=crops, begin=[1, 0], end=[0, 0], begin_mask=[False, True], end_mask=[False, True], squeeze_mask=[True, False], ) # Otherwise, we need to use slice_by_index to implement the crop a, b = _value_at(crop_height, 0), _value_at(crop_height, 1) c, d = _value_at(crop_width, 0), _value_at(crop_width, 1) shape = mb.shape(x=x) height, width = _value_at(shape, 2), _value_at(shape, 3) begin_idx_height, end_idx_height = a, mb.sub(x=height, y=b) begin_idx_width, end_idx_width = c, mb.sub(x=width, y=d) begin = mb.concat(values=[0, 0, begin_idx_height, begin_idx_width], axis=0) end = mb.concat(values=[0, 0, end_idx_height, end_idx_width], axis=0) begin_mask = [True, True, False, False] end_mask = [True, True, False, False] x = mb.slice_by_index( x=x, begin=begin, end=end, begin_mask=begin_mask, end_mask=end_mask ) x = mb.transpose(x=x, perm=[0, 2, 3, 1]) if spatial_rank == 1: # In this case, we decompose space_to_batch into small basic ops # [B, H, C] -> decomposite ops -> [B_new, H_new, C] # reshape input to [block_shape, B/block_shape, H, C] input_shape = mb.shape(x=x) block_shape = block_shape[0] batch_size = _value_at(input_shape, 0) spatial_size = _value_at(input_shape, 1) channel_size = _value_at(input_shape, 2) new_batch_size = mb.cast(x=mb.real_div(x=batch_size, y=block_shape), dtype="int32") reshape_values = [block_shape, new_batch_size, spatial_size, channel_size] reshape_shape = mb.concat(values=reshape_values, axis=0) x = mb.reshape(x=x, shape=reshape_shape, name=node.name) # permute the tensor to [B/block_shape, H, block_shape, C] x = mb.transpose(x=x, perm=[1, 2, 0, 3]) # reshape the tensor to [B/block_shape, H*block_shape, C] new_spatial_size = mb.cast(x=mb.mul(x=spatial_size, y=block_shape), dtype="int32") reshape_values = [new_batch_size, new_spatial_size, channel_size] reshape_shape = mb.concat(values=reshape_values, axis=0) x = mb.reshape(x=x, shape=reshape_shape) # crop the tensor to 
[B/block_shape, H*block_shape - crops[0][0] - crops[0][1], C] if is_static_crops: # If crops is known at compile time, we can directly call mb.crop x = mb.crop(x=x, crop_height=crops[0], crop_width=[0, 0]) else: # For the dynamic crops, we implement it with slice_by_index flatten_crops = mb.reshape(x=crops, shape=[-1]) a, b = _value_at(flatten_crops, 0), _value_at(flatten_crops, 1) shape = mb.shape(x=x) height = _value_at(shape, 1) begin_idx, end_idx = a, mb.sub(x=height, y=b) begin = mb.concat(values=[0, begin_idx, 0], axis=0) end = mb.concat(values=[0, end_idx, 0], axis=0) begin_mask = [True, False, True] end_mask = [True, False, True] x = mb.slice_by_index( x=x, begin=begin, end=end, begin_mask=begin_mask, end_mask=end_mask ) if has_non_unity_remaining_dims: # Reshape the tensor from shape [batch_new, spatial_shape_new, remaining_dim_1 * ... * remaining_dim_N] back to # shape [batch_new, spatial_shape_new, remaining_shape] x = _reshape_remaining_dimension_to_original_shape(x, original_shape, remaining_rank) context.add(node.name, mb.identity(x=x, name=node.name)) @register_tf_op def Ceil(context, node): x = context[node.inputs[0]] x = mb.ceil(x=x, name=node.name) context.add(node.name, x) @register_tf_op def Const(context, node): if node.value is None: raise ValueError("Const node '{}' cannot have no value".format(node.name)) x = mb.const(val=node.value.val, name=node.name) context.add(node.name, x) def _conv2d3d_strides_or_dilations(name, value, data_format, default_value=1): """Compute strides or dilation values for 2D and 3D convolutions.""" if value is None: value = default_value if not isinstance(value, (int, list)): raise ValueError("{} must be an int or list".format(name)) # Parse number of spatial dimensions from `data_format`, assuming N (batch) and C # (input channels) are present n_dims = len(data_format) - 2 if isinstance(value, int): return [value] * n_dims if len(value) == 1: return value * n_dims if len(value) == n_dims: return value if len(value) != n_dims + 2: raise ValueError( "{} must have length 1, {}, or {}".format(name, n_dims, n_dims + 2) ) if data_format == "NHWC": # Only support stride/dilation along N, C == 1 if not (value[0] == value[3] == 1): raise ValueError( "{} along N and C other than 1 not implemented".format(name) ) return value[1:3] elif data_format == "NCHW" or data_format == "NCDHW": if not (value[0] == value[1] == 1): raise ValueError( "{} along N and C other than 1 not implemented".format(name) ) return value[2:] # "NDHWC" if not (value[0] == value[4] == 1): raise ValueError("{} along N and C other than 1 not implemented".format(name)) return value[1:4] @register_tf_op def Cos(context, node): x = context[node.inputs[0]] x = mb.cos(x=x, name=node.name) context.add(node.name, x) @register_tf_op def Cosh(context, node): x = context[node.inputs[0]] x = mb.cosh(x=x, name=node.name) context.add(node.name, x) @register_tf_op def Cross(context, node): x = context[node.inputs[0]] y = context[node.inputs[1]] # last dim must be 3; other dims must match assert x.shape[1:] == y.shape[1:] assert x.shape[-1] == 3 x1 = mb.gather(x=x, indices=[1, 2, 0], axis=-1) x2 = mb.gather(x=x, indices=[2, 0, 1], axis=-1) y1 = mb.gather(x=y, indices=[1, 2, 0], axis=-1) y2 = mb.gather(x=y, indices=[2, 0, 1], axis=-1) z = mb.sub(x=mb.mul(x=x1, y=y2), y=mb.mul(x=x2, y=y1), name=node.name) context.add(node.name, z) @register_tf_op def Einsum(context, node): equation = node.attr["equation"] a = context[node.inputs[0]] b = context[node.inputs[1]] x = build_einsum_mil([a, b], 
equation, node.name) context.add(node.name, x) @register_tf_op def Equal(context, node): x = context[node.inputs[0]] y = context[node.inputs[1]] x = mb.equal(x=x, y=y, name=node.name) context.add(node.name, x) @register_tf_op def ExtractImagePatches(context, node): x = context[node.inputs[0]] sizes = node.attr.get("ksizes") strides = node.attr.get("strides") rates = node.attr.get("rates") padding = node.attr.get("padding") if x.rank != 4: raise ValueError("input for ExtractImagePatches should be a 4D tensor.") if not all([rate == 1 for rate in rates]): raise NotImplementedError( "only rates with all 1s is implemented for ExtractImagePatches." ) if len(sizes) != 4 or sizes[0] != 1 or sizes[3] != 1: raise ValueError( "ExtractImagePatches only supports sizes (4D tensor) with 1s for batch and channel dimensions." ) if len(sizes) != 4 or strides[0] != 1 or strides[3] != 1: raise ValueError( "ExtractImagePatches only supports strides (4D tensor) with 1s for batch and channel dimensions." ) if padding not in ["VALID", "SAME"]: raise ValueError("non-supported padding for ExtractImagePatches.") h, w = x.shape[1], x.shape[2] # padding for SAME mode if padding == "SAME": delta_h = h % strides[1] if h % strides[1] != 0 else strides[1] delta_w = w % strides[2] if w % strides[2] != 0 else strides[2] last_h = h - delta_h + 1 last_w = w - delta_w + 1 pad_h = max(0, last_h + sizes[1] - 1 - h) pad_w = max(0, last_w + sizes[2] - 1 - w) pad_h = [pad_h // 2, pad_h // 2 if pad_h % 2 == 0 else pad_h // 2 + 1] pad_w = [pad_w // 2, pad_w // 2 if pad_w % 2 == 0 else pad_w // 2 + 1] pad = _np.array([[0, 0], pad_h, pad_w, [0, 0]]).astype(_np.int32) pad = pad.reshape(-1) if not all(pad == 0): x = mb.pad(x=x, pad=pad, mode="constant", constant_val=0.0) h, w = x.shape[1], x.shape[2] # compute boxes batch = x.shape[0] boxes = [] h_index = list(range(0, h - sizes[1] + 1, strides[1])) w_index = list(range(0, w - sizes[2] + 1, strides[2])) for hi in h_index: for wi in w_index: boxes.append((hi, wi, hi + sizes[1] - 1, wi + sizes[2] - 1)) boxes = _np.array(boxes, dtype=_np.float32) box_indices = _np.arange(batch) box_indices = _np.tile(box_indices, (len(boxes), 1)) box_indices = _np.transpose(box_indices) box_indices = box_indices.reshape(-1, 1) boxes = _np.tile(boxes, (batch, 1)) x = _transpose_NHWC_to_NCHW(x) crop_resize_args = { "x": x, "target_height": sizes[1], "target_width": sizes[2], "normalized_coordinates": False, "spatial_scale": 1.0, "box_coordinate_mode": "CORNERS_HEIGHT_FIRST", "sampling_mode": "ALIGN_CORNERS", } if not is_current_opset_version_compatible_with(target.iOS17): # Before IOS17, boxes need to be shape [N,1,4,1,1] or [N,1,5,1,1]. boxes = _np.concatenate([box_indices, boxes], axis=1) boxes = boxes.reshape(boxes.shape[0], 1, boxes.shape[1], 1, 1) # Before IOS17, the input param is `roi` instead of `boxes`. crop_resize_args["roi"] = boxes x = mb.crop_resize(**crop_resize_args) # Before IOS17, the output has an extra dim at axis 1. x = mb.squeeze(x=x, axes=[1]) else: # At this point `boxes` has shape [N, 4], which is good enough for IOS17+. 
crop_resize_args["boxes"] = boxes box_indices = np.squeeze(box_indices, axis=-1) crop_resize_args["box_indices"] = box_indices x = mb.crop_resize(**crop_resize_args) x = _transpose_NCHW_to_NHWC(x, node_name=node.name + "_transpose_to_nhwc") x = mb.reshape(x=x, shape=(batch, len(h_index), len(w_index), -1), name=node.name) context.add(node.name, x) @register_tf_op def Exp(context, node): x = context[node.inputs[0]] x = mb.exp(x=x, name=node.name) context.add(node.name, x) @register_tf_op def Floor(context, node): x = context[node.inputs[0]] x = mb.floor(x=x, name=node.name) context.add(node.name, x) @register_tf_op def FloorDiv(context, node): x = context[node.inputs[0]] y = context[node.inputs[1]] x = mb.floor_div(x=x, y=y, name=node.name) context.add(node.name, x) @register_tf_op def Greater(context, node): x = context[node.inputs[0]] y = context[node.inputs[1]] x = mb.greater(x=x, y=y, name=node.name) context.add(node.name, x) @register_tf_op def GreaterEqual(context, node): x = context[node.inputs[0]] y = context[node.inputs[1]] x = mb.greater_equal(x=x, y=y, name=node.name) context.add(node.name, x) @register_tf_op def Less(context, node): x = context[node.inputs[0]] y = context[node.inputs[1]] x = mb.less(x=x, y=y, name=node.name) context.add(node.name, x) @register_tf_op def LessEqual(context, node): x = context[node.inputs[0]] y = context[node.inputs[1]] x = mb.less_equal(x=x, y=y, name=node.name) context.add(node.name, x) @register_tf_op def Log(context, node): x = context[node.inputs[0]] x = mb.log(x=x, name=node.name) context.add(node.name, x) @register_tf_op def Log1p(context, node): x = context[node.inputs[0]] x = mb.log(x=x, epsilon=1., name=node.name) context.add(node.name, x) @register_tf_op def LogicalAnd(context, node): x = context[node.inputs[0]] y = context[node.inputs[1]] x = mb.logical_and(x=x, y=y, name=node.name) context.add(node.name, x) @register_tf_op def LogicalNot(context, node): x = context[node.inputs[0]] x = mb.logical_not(x=x, name=node.name) context.add(node.name, x) @register_tf_op def LogicalOr(context, node): x = context[node.inputs[0]] y = context[node.inputs[1]] x = mb.logical_or(x=x, y=y, name=node.name) context.add(node.name, x) @register_tf_op def LogicalXor(context, node): x = context[node.inputs[0]] y = context[node.inputs[1]] x = mb.logical_xor(x=x, y=y, name=node.name) context.add(node.name, x) @register_tf_op def LRN(context, node): x = context[node.inputs[0]] depth_radius = node.attr.get("depth_radius") size = (depth_radius * 2) + 1 alpha = node.attr.get("alpha") * size beta = node.attr.get("beta") bias = node.attr.get("bias") x = _transpose_NHWC_to_NCHW(x) x = mb.local_response_norm(x=x, size=size, alpha=alpha, beta=beta, k=bias) x = _transpose_NCHW_to_NHWC(x, node.name) context.add(node.name, x) @register_tf_op def Maximum(context, node): x = context[node.inputs[0]] y = context[node.inputs[1]] x = mb.maximum(x=x, y=y, name=node.name) context.add(node.name, x) @register_tf_op def Minimum(context, node): x = context[node.inputs[0]] y = context[node.inputs[1]] x = mb.minimum(x=x, y=y, name=node.name) context.add(node.name, x) @register_tf_op def FloorMod(context, node): x = context[node.inputs[0]] y = context[node.inputs[1]] floor = mb.floor_div(x=x, y=y, name=node.name + "_floor_div") floor_mutiply = mb.mul(x=floor, y=y, name=node.name + "_multiply") x = mb.sub(x=x, y=floor_mutiply, name=node.name) context.add(node.name, x) @register_tf_op def Mul(context, node): x = context[node.inputs[0]] y = context[node.inputs[1]] x = mb.mul(x=x, y=y, 
name=node.name) context.add(node.name, x) @register_tf_op def Neg(context, node): x = context[node.inputs[0]] x, y = promote_input_dtypes([x, -1]) x = mb.mul(x=x, y=y, name=node.name) context.add(node.name, x) @register_tf_op def NotEqual(context, node): x = context[node.inputs[0]] y = context[node.inputs[1]] x, y = promote_input_dtypes([x, y]) x = mb.not_equal(x=x, y=y, name=node.name) context.add(node.name, x) @register_tf_op def Pow(context, node): x = context[node.inputs[0]] y = context[node.inputs[1]] x = mb.pow(x=x, y=y, name=node.name) context.add(node.name, x) @register_tf_op def DepthwiseConv2dNative(context, node): # [kH, kW, C_in, multiplier] W_hwim = context[node.inputs[1]] # m = multiplier # [kH, kW, 1, C_in * multiplier] shape_hw1o = list(W_hwim.shape[:2]) + [1, W_hwim.shape[2] * W_hwim.shape[3]] W_hw1o = mb.reshape(x=W_hwim, shape=shape_hw1o) # [C_in * multiplier, 1, kH, kW]. Note that C_in * multiplier = C_out in # MIL. C_in / groups = 1 in depthwise conv. W_o1hw = mb.transpose(x=W_hw1o, perm=[3, 2, 0, 1]) data_format = node.attr.get("data_format", "NHWC") HW_dilations = _conv2d3d_strides_or_dilations( "dilations", node.attr.get("dilations"), data_format ) HW_strides = _conv2d3d_strides_or_dilations( "strides", node.attr.get("strides"), data_format ) pad_type = node.attr.get("padding") if pad_type not in ["VALID", "SAME"]: raise ValueError("Invalid padding type for tf.nn.depthwise_conv2d") pad_type = pad_type.lower() x = context[node.inputs[0]] C_in = x.shape[-1] if data_format == "NHWC": x = _transpose_NHWC_to_NCHW(x) # Only the last op should have the same name as node.name conv_name = node.name + "x" if data_format == "NHWC" else node.name x = mb.conv( x=x, weight=W_o1hw, pad_type=pad_type, strides=HW_strides, dilations=HW_dilations, groups=C_in, name=conv_name, ) if data_format == "NHWC": x = _transpose_NCHW_to_NHWC(x, node.name) context.add(node.name, x) @register_tf_op def FakeQuantWithMinMaxVars(context, node): w = context[node.inputs[0]] min = context[node.inputs[1]].val max = context[node.inputs[2]].val num_bits = node.attr['num_bits'] narrow_range = node.attr['narrow_range'] min, max = _adjust_min_max(min, max, num_bits) if narrow_range: scale = (max-min) / (2 ** (num_bits) - 2) bias = min - scale else: scale = (max-min) / (2 ** (num_bits) - 1) bias = min w = mb.clip(x=w, alpha=min, beta=max) w = mb.sub(x=w, y=bias) x = mb.real_div(x=w, y=scale) x = mb.round(x=x) x = mb.mul(x=x, y=scale) x = mb.add(x=x, y=bias, name=node.name) context.add(node.name, x) @register_tf_op def Conv2D(context, node): if "quantize" in node.attr: quantization_type = "linear" min = node.attr['quantize_min'] max = node.attr['quantize_max'] nbits = node.attr['num_bits'] narrow_range = node.attr['narrow_range'] w = context[node.inputs[1]].sym_val min, max = _adjust_min_max(min, max, nbits) if narrow_range: quant_scale = (max - min) / (2 ** (nbits) - 2) quant_bias = (min-quant_scale) else: quant_scale = (max - min) / (2 ** (nbits) - 1) quant_bias = (min) w_clip = _np.clip(w, min, max) w_round = _np.round((w_clip-quant_bias)/quant_scale) W_hwio = w_round.astype(_np.uint8) if not isinstance(quant_scale, list) and not isinstance(quant_scale, tuple): quant_bias = [quant_bias] quant_scale = [quant_scale] else: quantization_type = None nbits = None quant_scale = None quant_bias = None W_hwio = context[node.inputs[1]] if quantization_type is not None: W_oihw = _np.transpose(W_hwio, axes=[3, 2, 0, 1]) else: W_oihw = mb.transpose(x=W_hwio, perm=[3, 2, 0, 1]) data_format = node.attr.get("data_format", 
"NHWC") HW_dilations = _conv2d3d_strides_or_dilations( "dilations", node.attr.get("dilations"), data_format ) HW_strides = _conv2d3d_strides_or_dilations( "strides", node.attr.get("strides"), data_format ) pad_type = node.attr.get("padding") pad_type = pad_type.lower() pad_type = "custom" if pad_type == "explicit" else pad_type assert pad_type in {"same", "valid", "custom"} x = context[node.inputs[0]] if data_format == "NHWC": x = _transpose_NHWC_to_NCHW(x) if pad_type == "custom": pad_val = node.attr["explicit_paddings"] pad_val = pad_val[2:-2] elif data_format == "NCHW" and pad_type == "custom": pad_val = node.attr["explicit_paddings"] pad_val = pad_val[4:] # Only the last op should have the same name as node.name conv_name = node.name + "x" if data_format == "NHWC" else node.name # get the groups from the weighs shape and the input shape _, in_channel, _, _ = x.shape _, weight_in_channel, _, _ = W_oihw.shape if in_channel % weight_in_channel != 0: raise ValueError("input channel should be divided by the weight channel.") groups = int(in_channel / weight_in_channel) if quantization_type is not None: x = mb.conv_quantized( x=x, weight=W_oihw, pad_type=pad_type, strides=HW_strides, dilations=HW_dilations, name=conv_name, quantization_type=quantization_type, nbits=nbits, quant_scale=quant_scale, quant_bias=quant_bias, groups=groups, ) elif pad_type == "custom": x = mb.conv( x=x, weight=W_oihw, pad_type=pad_type, strides=HW_strides, dilations=HW_dilations, pad=pad_val, groups=groups, name=conv_name, ) else: x = mb.conv( x=x, weight=W_oihw, pad_type=pad_type, strides=HW_strides, dilations=HW_dilations, groups=groups, name=conv_name, ) if data_format == "NHWC": x = _transpose_NCHW_to_NHWC(x, node.name) context.add(node.name, x) @register_tf_op def Conv3D(context, node): W_dhwio = context[node.inputs[1]] W_oidhw = mb.transpose(x=W_dhwio, perm=[4, 3, 0, 1, 2]) data_format = node.attr.get("data_format", "NDHWC") DHW_dilations = _conv2d3d_strides_or_dilations( "dilations", node.attr.get("dilations"), data_format ) DHW_strides = _conv2d3d_strides_or_dilations( "strides", node.attr.get("strides"), data_format ) pad_type = node.attr.get("padding") if not isinstance(pad_type, str): pad_type = "custom" raise NotImplementedError("Custom padding not implemented for TF") pad_type = pad_type.lower() x = context[node.inputs[0]] if data_format == "NDHWC": # Convert input to NCDHW x = _transpose_NDHWC_to_NCDHW(x) # Only the last op should have the same name as node.name conv_name = node.name + "x" if data_format == "NDHWC" else node.name _, in_channel, _, _, _ = x.shape _, weight_in_channel, _, _, _ = W_oidhw.shape if in_channel % weight_in_channel != 0: raise ValueError("input channel should be divided by the weight channel.") groups = int(in_channel / weight_in_channel) x = mb.conv( x=x, weight=W_oidhw, pad_type=pad_type, strides=DHW_strides, dilations=DHW_dilations, groups=groups, name=conv_name, ) if data_format == "NDHWC": # Convert input back to NDHWC (from NCDHW) x = _transpose_NCDHW_to_NDHWC(x, node.name) context.add(node.name, x) @register_tf_op def Conv3DBackpropInputV2(context, node): # Output shape: [N, D_out, H_out, W_out, C_out] output_shape = context[node.inputs[0]].val # Weight shape: [D, H, W, C_out, C_in] W_dhwoi = context[node.inputs[1]] W_iodhw = mb.transpose(x=W_dhwoi, perm=[4, 3, 0, 1, 2]) # Input shape: [N, D_in, H_in, W_in, C_in] x = context[node.inputs[2]] data_format = node.attr.get("data_format", "NDHWC") DHW_dilations = _conv2d3d_strides_or_dilations( "dilations", 
node.attr.get("dilations"), data_format ) DHW_strides = _conv2d3d_strides_or_dilations( "strides", node.attr.get("strides"), data_format ) pad_type = node.attr.get("padding", None) if pad_type is None: raise ValueError("Padding type not specified for op: {}".format(node.name)) if not isinstance(pad_type, str): pad_type = "custom" raise NotImplementedError("Custom padding not implemented for TF") pad_type = pad_type.lower() if data_format == "NDHWC": # Convert input to NCDHW x = _transpose_NDHWC_to_NCDHW(x) if output_shape is not None: output_shape = [output_shape[0], output_shape[4], output_shape[1], output_shape[2], output_shape[3]] # Only the last op should have the same name as node.name conv_name = node.name + "_x" if data_format == "NDHWC" else node.name # Pass output shape provided above x = mb.conv_transpose( x=x, weight=W_iodhw, pad_type=pad_type, strides=DHW_strides, output_shape=output_shape, dilations=DHW_dilations, name=conv_name, ) if data_format == "NDHWC": # Convert input back to NDHWC (from NCDHW) x = _transpose_NCDHW_to_NDHWC(x, node.name) context.add(node.name, x) @register_tf_op def DepthToSpace(context, node): x = context[node.inputs[0]] block_size = node.attr.get("block_size") data_format = node.attr.get("data_format", "NHWC") if data_format == "NHWC": x = _transpose_NHWC_to_NCHW(x) x = mb.depth_to_space(x=x, block_size=block_size) x = _transpose_NCHW_to_NHWC(x, node.name) else: x = mb.depth_to_space(x=x, block_size=block_size, name=node.name) context.add(node.name, x) @register_tf_op def EuclideanNorm(context, node): x = context[node.inputs[0]] axes = _check_axes_type(context[node.inputs[1]]) keep_dims = node.attr.get("keep_dims", False) x = mb.reduce_l2_norm(x=x, axes=axes, keep_dims=keep_dims, name=node.name) context.add(node.name, x) @register_tf_op def IdentityN(context, node): res = [mb.identity(x=context[x]) for x in node.inputs] context.add(node.name, res) @register_tf_op def ExpandDims(context, node): x = context[node.inputs[0]] axis = context[node.inputs[1]] if axis.op.op_type == "const" and (axis.val is not None and axis.val.size == 1): axis = axis.val[0] if axis.shape == (1,) else axis.val else: raise ValueError("Expand Dims: Invalid value for parameter axis") x = mb.expand_dims(x=x, axes=[axis], name=node.name) context.add(node.name, x) @register_tf_op(tf_alias=["FusedBatchNormV2", "FusedBatchNormV3"]) def FusedBatchNorm(context, node): # Get attributes data_format = node.attr.get("data_format", "NHWC") epsilon = node.attr.get("epsilon", None) # Get inputs x = context[node.inputs[0]] scale = context[node.inputs[1]] offset = context[node.inputs[2]] mean = context[node.inputs[3]] variance = context[node.inputs[4]] if data_format == "NHWC": # TF's FusedBatchNorm is only for 4D inputs x = _transpose_NHWC_to_NCHW(x) x = mb.batch_norm( x=x, mean=mean, variance=variance, gamma=scale, beta=offset, epsilon=epsilon ) x = _transpose_NCHW_to_NHWC(x, node.name + ":0") else: x = mb.batch_norm( x=x, mean=mean, variance=variance, gamma=scale, beta=offset, epsilon=epsilon, name=node.name + ":0", ) # Inference only batch norm does not have meaningful outputs for # batch_mean, batch_variance etc. 
context.add(node.name, [x, mean, variance]) @register_tf_op def Fill(context, node): shape = context[node.inputs[0]] value = context[node.inputs[1]] x = mb.fill(shape=shape, value=value, name=node.name) context.add(node.name, x) @register_tf_op(tf_alias=["ImageProjectiveTransformV3"]) def ImageProjectiveTransformV2(context, node): # Data shape format: [batch, height, width, channels] x = context[node.inputs[0]] # Transforms shape format: [batch, 8] or [1, 8] matrix, [a0, a1, a2, b0, b1, b2, c0, c1] transforms = context[node.inputs[1]] # 1-D Tensor [new_height, new_width] output_shape = context[node.inputs[2]] # For V3, there is an additional fill_value input if len(node.inputs) == 4: fill_value = context[node.inputs[3]].val if fill_value != 0.0: msg = ("fill_value {} not supported for tf ImageProjectiveTransformV2/V3 op {}. " "Only fill_value = 0.0 is supported.").format(fill_value, node.name) raise ValueError(msg) interpolation = node.attr.get("interpolation") if interpolation != "BILINEAR": msg = ("interpolation {} not supported for tf ImageProjectiveTransformV2/V3 op {}. " "Only interpolation = BILINEAR is supported.").format(interpolation, node.name) raise ValueError(msg) fill_mode = node.attr.get("fill_mode") if fill_mode != "CONSTANT": msg = ("fill_mode {} not supported for tf ImageProjectiveTransformV2/V3 op {}. " "Only fill_mode = CONSTANT is supported.").format(fill_mode, node.name) raise ValueError(msg) h_out = output_shape.val[0] w_out = output_shape.val[1] h_in = x.shape[1] w_in = x.shape[2] # Don't allow non-zero c0 or c1, check for each batch n_batch = transforms.val.shape[0] transform_matrix = [] for batch in range(n_batch): c0 = transforms.val[batch][6] c1 = transforms.val[batch][7] if not (c0 == c1 == 0.0): raise NotImplementedError( "'affine' op with 'transforms' contains non-zero " + "c0 or c1 is not supported, Got: {}".format( transforms ) ) # In the tensorflow affine transform function, the coordinate is in the original image size range, # i.e., for the input image, x is in range [0, W_in), and y is in range [0, H_in) # For the output image, x is in range [0, W_out), and y is in range [0, H_out) # However, the MIL affine op is in the normalized coordinate, in which x and y are both in range [-1, 1] # So we need to update the affine transformation matrix. 
# We have the following four equations: # (1) x_original_in = (x_normalized_in + 1) * (W_in - 1) / 2 # (2) y_original_in = (y_normalized_in + 1) * (H_in - 1) / 2 # (3) x_original_out = (x_normalized_out + 1) * (W_out - 1) / 2 # (4) y_original_out = (y_normalized_out + 1) * (H_out - 1) / 2 # The original transforms matrix is in the original coordinate: # (i) x_original_in = a * x_original_out + b * y_original_out + c # (ii) y_original_in = d * x_original_out + e * y_original_out + f # After plugging (1) - (4) into (i) (ii), we obtain the new transformation matrix in the normalized coordinate a, b, c, d, e, f = transforms.val[batch].tolist()[:6] new_a = a * (w_out - 1) / (w_in - 1) new_b = b * (h_out - 1) / (w_in - 1) new_c = (2 * c + a * (w_out - 1) + b * (h_out - 1)) / (w_in - 1) - 1 new_d = d * (w_out - 1) / (h_in - 1) new_e = e * (h_out - 1) / (h_in - 1) new_f = (2 * f + d * (w_out - 1) + e * (h_out - 1)) / (h_in - 1) - 1 transform_matrix.append([new_a, new_b, new_c, new_d, new_e, new_f]) transform_matrix = _np.array(transform_matrix) x = _transpose_NHWC_to_NCHW(x) x = mb.affine( x=x, transform_matrix=transform_matrix, output_height=output_shape.val[0], output_width=output_shape.val[1], sampling_mode="bilinear", padding_mode="constant", padding_value=0.0, coordinates_mode="normalized_minus_one_to_one", align_corners=True, name=node.name + "_affine", ) x = _transpose_NCHW_to_NHWC(x, node.name) context.add(node.name, x) @register_tf_op(tf_alias=["DivNoNan"]) def RealDiv(context, node): x = mb.cast(x=context[node.inputs[0]], dtype="fp32") y = mb.cast(x=context[node.inputs[1]], dtype="fp32") x = mb.real_div(x=x, y=y, name=node.name) context.add(node.name, x) @register_tf_op(tf_alias=["Addons>Resampler"]) def Resampler(context, node): # Data shape format: (Batch, Hin, Win, C) x = context[node.inputs[0]] # Warp shape format: (Batch, Hout, Wout, 2) warp = context[node.inputs[1]] # Handle rank-3 warp tensor is_rank3_warp = warp.rank == 3 if is_rank3_warp: # expand spatial dimension warp = mb.expand_dims(x=warp, axes=[1], name=warp.name + "_expand_dims") x = _transpose_NHWC_to_NCHW(x) x = mb.resample( x=x, coordinates=warp, sampling_mode="bilinear", padding_mode="constant", padding_value=0.0, coordinates_mode="unnormalized", align_corners=True, name=node.name + "_resample", ) x = _transpose_NCHW_to_NHWC( x, node.name + "_transpose" if is_rank3_warp else node.name ) if is_rank3_warp: # squeeze spatial dimension x = mb.squeeze(x=x, axes=[1], name=node.name) context.add(node.name, x) @register_tf_op def Rsqrt(context, node): x = context[node.inputs[0]] x = mb.rsqrt(x=x, name=node.name) context.add(node.name, x) @register_tf_op def Sub(context, node): x = context[node.inputs[0]] y = context[node.inputs[1]] x = mb.sub(x=x, y=y, name=node.name) context.add(node.name, x) @register_tf_op def StopGradient(context, node): Identity(context, node) @register_tf_op def Identity(context, node): x = context[node.inputs[0]] # In many cases we can skip and just make downstream ops reference the # pre-identity op. However, when identity is an output or pre-identity # is a placeholder, an identity op (implemented here as mb.mul(x, 1.0)) is required. if len(node.outputs) != 0 or x.op is not None: context.add(node.name, x, is_new_var=False) else: x = mb.mul(x=x, y=1.0, name=node.name) context.add(node.name, x) @register_tf_op def Print(context, node): Identity(context, node) @register_tf_op def Placeholder(context, node): # no-op as we add Placeholder separately. 
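# (Editor's note, added for clarity) Graph inputs are materialized as MIL
# placeholders by the conversion front end before the per-op translation in
# this file runs, so this handler intentionally emits nothing.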
pass def _pool_pads_or_strides(tf_spec, data_format, d_rank): if tf_spec is None: d_spec = [1] * d_rank elif not isinstance(tf_spec, list): d_spec = [tf_spec] * d_rank elif len(tf_spec) == 2: d_spec = tf_spec elif len(tf_spec) == 4: if data_format == "NHWC": d_spec = tf_spec[1:3] else: d_spec = tf_spec[2:] elif len(tf_spec) == 5: if data_format == "NDHWC": d_spec = tf_spec[1:4] else: # NCDHW d_spec = tf_spec[2:] else: raise ValueError("Unsupported tf_spec: %s" % tf_spec) return d_spec @register_tf_op(tf_alias=["BatchMatMul", "BatchMatMulV2"]) def MatMul(context, node): a = context[node.inputs[0]] b = context[node.inputs[1]] transpose_a = node.attr.get("adj_x", False) or node.attr.get("transpose_a", False) transpose_b = node.attr.get("adj_y", False) or node.attr.get("transpose_b", False) a, b = promote_input_dtypes([a, b]) x = mb.matmul( x=a, y=b, transpose_x=transpose_a, transpose_y=transpose_b, name=node.name ) context.add(node.name, x) @register_tf_op def MaxPool(context, node): x = context[node.inputs[0]] in_shape = x.sym_type.get_shape() d_rank = len(in_shape) - 2 data_format = node.attr.get("data_format", "NHWC") ksize = node.attr.get("ksize", None) kernel_sizes = _pool_pads_or_strides(ksize, data_format, d_rank) strides = node.attr.get("strides", None) if strides is not None: strides = _pool_pads_or_strides(strides, data_format, d_rank) pad_type = node.attr["padding"].lower() if data_format == "NHWC": x = _transpose_NHWC_to_NCHW(x) x = mb.max_pool( x=x, kernel_sizes=kernel_sizes, strides=strides, pad_type=pad_type ) x = _transpose_NCHW_to_NHWC(x, node.name) else: x = mb.max_pool( x=x, kernel_sizes=kernel_sizes, strides=strides, pad_type=pad_type, name=node.name, ) context.add(node.name, x) @register_tf_op def MaxPool3D(context, node): x = context[node.inputs[0]] d_rank = x.rank - 2 data_format = node.attr.get("data_format", "NDHWC") ksize = node.attr.get("ksize", None) kernel_sizes = _pool_pads_or_strides(ksize, data_format, d_rank) strides = node.attr.get("strides", None) if strides is not None: strides = _pool_pads_or_strides(strides, data_format, d_rank) pad_type = node.attr["padding"].lower() if data_format == "NDHWC": x = _transpose_NDHWC_to_NCDHW(x) x = mb.max_pool( x=x, kernel_sizes=kernel_sizes, strides=strides, pad_type=pad_type ) x = _transpose_NCDHW_to_NDHWC(x, node.name) else: x = mb.max_pool( x=x, kernel_sizes=kernel_sizes, strides=strides, pad_type=pad_type, name=node.name, ) context.add(node.name, x) @register_tf_op def MatrixBandPart(context, node): x = context[node.inputs[0]] lower = context[node.inputs[1]] upper = context[node.inputs[2]] x = mb.band_part(x=x, lower=lower, upper=upper, name=node.name) context.add(node.name, x) @register_tf_op def Max(context, node): x = context[node.inputs[0]] axes = _check_axes_type(context[node.inputs[1]]) keep_dims = node.attr.get("keep_dims", False) x = mb.reduce_max(x=x, axes=axes, keep_dims=keep_dims, name=node.name) context.add(node.name, x) @register_tf_op def Min(context, node): x = context[node.inputs[0]] axes = _check_axes_type(context[node.inputs[1]]) keep_dims = node.attr.get("keep_dims", False) x = mb.reduce_min(x=x, axes=axes, keep_dims=keep_dims, name=node.name) context.add(node.name, x) @register_tf_op def Prod(context, node): x = context[node.inputs[0]] axes = _check_axes_type(context[node.inputs[1]]) keep_dims = node.attr.get("keep_dims", False) x = mb.reduce_prod(x=x, axes=axes, keep_dims=keep_dims, name=node.name) context.add(node.name, x) @register_tf_op def Cast(context, node): type_map = { types.fp16: "fp16", 
types.float: "fp32", types.double: "fp32", types.int32: "int32", types.int64: "int32", } if node.attr["DstT"] not in type_map.keys(): raise NotImplementedError( "Cast: Provided destination type {} not " "supported.".format(types.get_type_info(node.attr["DstT"])) ) x = context[node.inputs[0]] dtype = type_map[node.attr["DstT"]] x = mb.cast(x=x, dtype=dtype, name=node.name) context.add(node.name, x) @register_tf_op def Round(context, node): x = context[node.inputs[0]] x = mb.round(x=x, name=node.name) context.add(node.name, x) @register_tf_op def Sign(context, node): x = context[node.inputs[0]] x = mb.sign(x=x, name=node.name) context.add(node.name, x) @register_tf_op def Sin(context, node): x = context[node.inputs[0]] x = mb.sin(x=x, name=node.name) context.add(node.name, x) @register_tf_op def Sinh(context, node): x = context[node.inputs[0]] x = mb.sinh(x=x, name=node.name) context.add(node.name, x) @register_tf_op def Slice(context, node): x = context[node.inputs[0]] begin = context[node.inputs[1]] size = context[node.inputs[2]] res = mb.slice_by_size(x=x, begin=begin, size=size, name=node.name) context.add(node.name, res) @register_tf_op def Sqrt(context, node): x = context[node.inputs[0]] x = mb.sqrt(x=x, name=node.name) context.add(node.name, x) @register_tf_op def Square(context, node): x = context[node.inputs[0]] x = mb.mul(x=x, y=x, name=node.name) context.add(node.name, x) def _softmax_cross_entropy_with_logits(feats, labels, name): # compute the log softmax y = mb.reduce_log_sum_exp(x=feats, axes=[-1], keep_dims=True) log_softmax = mb.sub(x=feats, y=y) loss = mb.mul(x=labels, y=log_softmax) loss = mb.mul(x=loss, y=-1.) loss = mb.reduce_sum(x=loss, axes=[-1], name=name) return loss @register_tf_op def SparseSoftmaxCrossEntropyWithLogits(context, node): feats = context[node.inputs[0]] labels = context[node.inputs[1]] class_nums = feats.shape[1] labels = mb.one_hot( indices=labels, one_hot_vector_size=class_nums, ) labels = mb.cast(x=labels, dtype="fp32") loss = _softmax_cross_entropy_with_logits(feats, labels, node.name) context.add(node.name, loss) @register_tf_op def SoftmaxCrossEntropyWithLogits(context, node): feats = context[node.inputs[0]] labels = context[node.inputs[1]] loss = _softmax_cross_entropy_with_logits(feats, labels, node.name) context.add(node.name, loss) @register_tf_op def StridedSlice(context, node): x = context[node.inputs[0]] begin = context[node.inputs[1]] end = context[node.inputs[2]] stride = context[node.inputs[3]] def bitmask_to_array(bit): if bit < 0: arr = _np.binary_repr(bit, width=8)[::-1] arr = [bool(int(x)) for x in list(arr)] if node.attr.get("ellipsis_mask", 0) != 0: # In case of non-zero ellipsis_mask, we compute the output rank to be the # max rank of all the masks. This doesn't work if we computed a mask of constant # width 8 here (since the max rank is then taken to be 8 wrongly). raise ValueError("Cannot figure out slice rank with negative mask values and " \ "non-zero ellipsis_mask") else: # This method prevents unnecessary padding of the bitmask when it is not negative. # It can be padded with any extra False values later, based on output rank. 
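# (Editor's worked example, added for clarity) e.g. shrink_axis_mask = 5
# (binary 101) unrolls to
#     bit = 5  ->  [True, False, True]
# i.e. axes 0 and 2 are squeezed; trailing False entries are appended later by
# `_pad_mask` once the effective output rank is known.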
arr = [] while bit > 0: if bit & 1: arr.append(True) else: arr.append(False) bit >>= 1 return arr begin_mask = bitmask_to_array(node.attr.get("begin_mask", 0)) end_mask = bitmask_to_array(node.attr.get("end_mask", 0)) squeeze_mask = bitmask_to_array(node.attr.get("shrink_axis_mask", 0)) ellipsis_mask = bitmask_to_array(node.attr.get("ellipsis_mask", 0)) new_axis_mask = bitmask_to_array(node.attr.get("new_axis_mask", 0)) def _pad_mask( x, begin, end, stride, begin_mask, end_mask, squeeze_mask, ellipsis_mask, new_axis_mask, ): # This function pad the masks, stride, begin and end to the same rank as the input tensor. if begin.rank != 1: raise ValueError( "begin should be 1-D tensor, got {}-D tensor instead".format(begin.rank) ) if end.rank != 1: raise ValueError( "end should be 1-D tensor, got {}-D tensor instead".format(end.rank) ) # check if inputs can be determined begin_cache = begin end_cache = end begin = [] if begin.val is None else begin.val.tolist() end = [] if end.val is None else end.val.tolist() stride = [] if stride is None else stride.val.tolist() # pad masks function new_dims = sum(i is True for i in new_axis_mask) if new_dims > 0: x_rank = x.rank + new_dims else: x_rank = x.rank def pad_array(arr, max_rank, idx, default_value): """ This function pads the arr to x_rank with default_value. idx is the index where ellipis_mask = True. max_rank is the maximum rank of the masks, stride, begin and end. """ mask = arr[:] mask += [default_value] * (x_rank - len(mask)) new_mask = [] for i in range(max_rank): num = 1 if i != idx else x_rank - max_rank + 1 new_mask += [mask[i]] * num return new_mask mask_list = [ begin_mask, end_mask, squeeze_mask, ellipsis_mask, new_axis_mask, stride, begin, end, ] max_rank = max([len(arr) for arr in mask_list]) # If ellipsis_mask is given, the last element of it would be True # Otherwise, we simply pad each mask by appending default value if ellipsis_mask != []: rank = max_rank idx = len(ellipsis_mask) - 1 else: rank = x_rank idx = -1 begin_mask = pad_array(begin_mask, rank, idx, False) end_mask = pad_array(end_mask, rank, idx, False) squeeze_mask = pad_array(squeeze_mask, rank, idx, False) ellipsis_mask = pad_array(ellipsis_mask, rank, idx, False) new_axis_mask = pad_array(new_axis_mask, rank, idx, False) stride = pad_array(stride, rank, idx, 1) # pad begin and end if they are determined during compile time if begin != []: begin = pad_array(begin, rank, idx, 0) if end != []: end = pad_array(end, rank, idx, 0) # make sure begin_mask, end_mask, and stride are consistent with ellipsis mask # begin_mask and end_mask should be True, and stride should be 1. for i, mask in enumerate(ellipsis_mask): if mask: begin_mask[i] = True end_mask[i] = True stride[i] = 1 # make sure begin_mask, end_mask, and stride are consistent with new axis mask # begin_mask and end_mask should be True, and stride should be 1. for i, mask in enumerate(new_axis_mask): if mask: begin_mask[i] = True end_mask[i] = True stride[i] = 1 # convert begin and end back to cache value if they are run-time determined if begin == []: begin = begin_cache if end == []: end = end_cache # check which mask is adding by our default value # This happens when the given index is less than the tensor rank, # for instance, indexing a 3D tensor A with A[:1, :1] is equivalent to # A[:1, :1, :]. 
In this case we should append True to begin_mask and end_mask if ellipsis_mask == [False] * x_rank: for i in range(max_rank, x_rank): begin_mask[i] = True end_mask[i] = True return begin, end, stride, begin_mask, end_mask, squeeze_mask, new_axis_mask begin, end, stride, begin_mask, end_mask, squeeze_mask, new_axis_mask = _pad_mask( x, begin, end, stride, begin_mask, end_mask, squeeze_mask, ellipsis_mask, new_axis_mask, ) if sum(i is True for i in new_axis_mask) > 0: axes = [i for i, val in enumerate(new_axis_mask) if val is True] x = mb.expand_dims(x=x, axes=axes, name=node.name + "_new_axes") x = mb.slice_by_index( x=x, name=node.name, begin=begin, end=end, stride=stride, begin_mask=begin_mask, end_mask=end_mask, squeeze_mask=squeeze_mask, ) context.add(node.name, x) @register_tf_op def Sum(context, node): x = context[node.inputs[0]] axes = _check_axes_type(context[node.inputs[1]]) keep_dims = node.attr.get("keep_dims", False) input_type = x.sym_type if _is_scalar(input_type): context.add(node.name, x, is_new_var=False) else: x = mb.reduce_sum(x=x, axes=axes, keep_dims=keep_dims, name=node.name) context.add(node.name, x) @register_tf_op def Tan(context, node): x = context[node.inputs[0]] x = mb.tan(x=x, name=node.name) context.add(node.name, x) @register_tf_op def get_tuple(context, node): x = context[node.inputs[0]] if not isinstance(x, (list, tuple)): # In some rare cases, the upstream op produces a single output x = [x] idx = node.attr["index"] if idx >= len(x): msg = "Index {} out of range, op '{}' only has {} outputs: {}" raise IndexError(msg.format(idx, node.inputs[0], len(x), [v.name for v in x])) context.add(node.name, x[idx], is_new_var=False) @register_tf_op def Mean(context, node): x = context[node.inputs[0]] axes = _check_axes_type(context[node.inputs[1]]) keep_dims = node.attr.get("keep_dims", False) x = mb.reduce_mean(x=x, axes=axes, keep_dims=keep_dims, name=node.name) context.add(node.name, x) @register_tf_op def MatrixDiag(context, node): x = context[node.inputs[0]] if x.rank != 1: raise NotImplementedError('Only support MatrixDiag op with input rank = 1.') length = mb.shape(x=x) x = mb.expand_dims(x=x, axes=[0]) reps = mb.concat(values=[length, [1]], axis=0) x = mb.tile(x=x, reps=reps) x = mb.band_part(x=x, lower=0, upper=0, name=node.name) context.add(node.name, x) @register_tf_op def MirrorPad(context, node): x = context[node.inputs[0]] pad = context[node.inputs[1]] constant_val = node.attr.get("constant_val", 0.0) if pad is None: raise ValueError("TF `paddings` in Pad op must be const.") mode = node.attr.get("mode", "reflect").lower() if mode == "symmetric": mode = "reflect" in_rank = len(x.sym_type.get_shape()) if in_rank > 5 or in_rank < 2: raise ValueError( "Unsupported Pad configuration with input rank {}!".format(str(in_rank)) ) if pad.val.shape != (in_rank, 2): raise ValueError("Padding must have length as input tensor rank.") pad = pad.val # get axis which is non zero non_zero_axis = [] for i in range(len(pad)): if not all(pad[i] == 0): non_zero_axis.append(i) if len(non_zero_axis) > 2: raise ValueError("Unsupported configuration for Pad layer!") # make padding a 2 x 2 tensor if len(non_zero_axis) < 2 if len(non_zero_axis) == 0: non_zero_axis = [0, 1] if len(non_zero_axis) == 1: if non_zero_axis[0] != len(pad) - 1: non_zero_axis.append(len(pad) - 1) else: non_zero_axis = [0, non_zero_axis[0]] # transpose the input such that the padding dim is the last two perm = [i for i in range(in_rank) if i not in non_zero_axis] + non_zero_axis x = mb.transpose(x=x, 
perm=perm, name=node.name + "_transpose_1") pad = pad[non_zero_axis, :] pad = pad.reshape(-1) x = mb.pad( x=x, pad=pad, name=node.name + "_pad", constant_val=constant_val, mode=mode ) inverse_perm = [-1] * len(perm) for i, index in enumerate(perm): inverse_perm[index] = i x = mb.transpose(x=x, perm=inverse_perm, name=node.name) context.add(node.name, x) @register_tf_op def Pad(context, node): x = context[node.inputs[0]] pad = context[node.inputs[1]] input_dtype = x.dtype mode = node.attr.get("mode", "constant").lower() if mode == "symmetric": mode = "reflect" constant_val = node.attr.get("constant_val", 0.0) constant_val = mb.const(val=constant_val) in_rank = len(x.sym_type.get_shape()) if in_rank > 5: raise ValueError("Unsupported Pad configuration!") if pad.val is None: pad = mb.reshape(x=pad, shape=[-1]) else: pad = pad.val.reshape(-1) x = mb.cast(x=x, dtype=builtin_to_string(constant_val.dtype)) x = mb.pad(x=x, pad=pad, mode=mode, constant_val=constant_val) x = mb.cast(x=x, dtype=builtin_to_string(input_dtype), name=node.name) context.add(node.name, x) @register_tf_op def PadV2(context, node): # compared to tf.raw_ops.Pad, tf.raw_ops.PadV2 allow constant values rather than 0. x = context[node.inputs[0]] pad = context[node.inputs[1]] constant_val = context[node.inputs[2]] if constant_val.shape != (): raise NotImplementedError( "TF `constant_values` in PadV2 op must be const scalar." ) in_rank = x.rank if in_rank > 5: raise ValueError("Unsupported Pad configuration!") if pad.val is None: pad = mb.reshape(x=pad, shape=[-1]) else: pad = pad.val.reshape(-1) constant_val = constant_val.val if constant_val == -_np.inf: INT_MIN = -_np.iinfo(_np.int64).max - 1 constant_val = float(INT_MIN) if constant_val == _np.inf: INT_MAX = _np.iinfo(_np.int64).max constant_val = float(INT_MAX) x = mb.pad(x=x, pad=pad, name=node.name, mode="constant", constant_val=constant_val) context.add(node.name, x) @register_tf_op def Relu(context, node): x = context[node.inputs[0]] x = mb.relu(x=x, name=node.name) context.add(node.name, x) @register_tf_op def Reciprocal(context, node): x = context[node.inputs[0]] x = mb.inverse(x=x, name=node.name) context.add(node.name, x) @register_tf_op def Relu6(context, node): x = context[node.inputs[0]] x = mb.relu6(x=x, name=node.name) context.add(node.name, x) @register_tf_op def Reshape(context, node): x = context[node.inputs[0]] new_shape = context[node.inputs[1]] x = mb.reshape(x=x, shape=new_shape, name=node.name) context.add(node.name, x) @register_tf_op(tf_alias=["ReverseV2"]) def Reverse(context, node): x = context[node.inputs[0]] axes = context[node.inputs[1]] x = mb.reverse(x=x, axes=axes, name=node.name) context.add(node.name, x) @register_tf_op def ReverseSequence(context, node): x = context[node.inputs[0]] lengths = context[node.inputs[1]] seq_axis = node.attr.get("seq_dim") batch_axis = node.attr.get("batch_dim") x = mb.reverse_sequence( x=x, lengths=lengths, seq_axis=seq_axis, batch_axis=batch_axis, name=node.name ) context.add(node.name, x) @register_tf_op def Transpose(context, node): x = context[node.inputs[0]] perm = context[node.inputs[1]] x = mb.transpose(x=x, perm=perm, name=node.name) context.add(node.name, x) @register_tf_op def Squeeze(context, node): x = context[node.inputs[0]] axes = node.attr.get("squeeze_dims", []) if axes == []: axes = None x = mb.squeeze(x=x, axes=axes, name=node.name) context.add(node.name, x) @register_tf_op def Multinomial(context, node): x = context[node.inputs[0]] size = context[node.inputs[1]] x = mb.random_categorical(x=x, 
size=size, name=node.name) context.add(node.name, x) @register_tf_op(tf_alias=["Elu"]) def ELU(context, node): x = context[node.inputs[0]] x = mb.elu(x=x, alpha=1.0, name=node.name) context.add(node.name, x) @register_tf_op(tf_alias=["Erf"]) def ERF(context, node): x = context[node.inputs[0]] x = mb.erf(x=x, name=node.name) context.add(node.name, x) @register_tf_op(tf_alias=["LeakyRelu"]) def LeakyReLU(context, node): x = context[node.inputs[0]] alpha = node.attr["alpha"] x = mb.leaky_relu(x=x, alpha=alpha, name=node.name) context.add(node.name, x) @register_tf_op def Selu(context, node): x = context[node.inputs[0]] x = mb.elu(x=x, alpha=1.6732632423543772) x = mb.mul(x=x, y=1.0507009873554805, name=node.name) context.add(node.name, x) @register_tf_op(tf_alias=["SelectV2"]) def Select(context, node): cond = context[node.inputs[0]] a = context[node.inputs[1]] b = context[node.inputs[2]] # broadcast vector type cond rank_cond = cond.rank rank_a = a.rank if rank_cond == 1 and rank_a > 1: axes = [-i - 1 for i in range(rank_a - rank_cond)] cond = mb.expand_dims(x=cond, axes=axes) if not types.is_bool(cond.dtype): # cond must be bool type cond = mb.cast(x=cond, dtype="bool") x = mb.select(cond=cond, a=a, b=b, name=node.name) context.add(node.name, x) @register_tf_op def Sigmoid(context, node): x = context[node.inputs[0]] x = mb.sigmoid(x=x, name=node.name) context.add(node.name, x) @register_tf_op def Softplus(context, node): x = context[node.inputs[0]] x = mb.softplus(x=x, name=node.name) context.add(node.name, x) @register_tf_op def Softsign(context, node): x = context[node.inputs[0]] x = mb.softsign(x=x, name=node.name) context.add(node.name, x) @register_tf_op def Softmax(context, node): logit = context[node.inputs[0]] axis = node.attr.get("axis") x = mb.softmax(x=logit, axis=axis, name=node.name) context.add(node.name, x) @register_tf_op def SpaceToBatchND(context, node): # In tensorflow, the input tensor has the shape of (batch,) + spatial_shape + remaining_shape. # The shape is treated as a combination of 3 components: # 1. A single batch dimension # 2. Spatial dimensions, with a length spatial_rank, which could be neither 1 or 2. Also, spatial_rank # is equal to the length of block_shape # 3. Remaining dimensions, with a length remaining_rank # The logic of translating this op is as followed: # 1. We first reshape the input to a canonical shape (rolling the remaining shape dimensions into a # single dimension): (batch,) + spatial_shape + (R), where R = remaining_dim_1 * ... * remaining_dim_n # 2. We support rank 1 and rank 2 spatial shape: # (i) rank 1: We decompose the SpaceToBatch into small basic ops. # (ii) rank 2: We directly use the built in space_to_batch op. # The output would have shape (batch_new,) + spatial_shape_new + (R) # 3. We transform the tensor back, by unrolling the remaining shape: (B_new,) + spatial_shape_new + remaining_shape x = context[node.inputs[0]] block_shape = context[node.inputs[1]].val paddings = context[node.inputs[2]] original_shape = mb.shape(x=x) input_rank = x.rank spatial_rank = len(block_shape) remaining_rank = x.rank - 1 - spatial_rank has_non_unity_remaining_dims = remaining_rank != 1 if block_shape is None: raise NotImplementedError("Not support dynamic block_shape for SpaceToBatchND!") if paddings.val is not None: is_static_paddings = True paddings = paddings.val else: is_static_paddings = False if has_non_unity_remaining_dims: # Reshape the input tensor to shape [batch, spatial_shape, remaining_dim_1 * ... 
* remaining_dim_N] x = _reshape_remaining_dimensions_to_canonical_shape(x, remaining_rank) if spatial_rank >= 3: raise NotImplementedError("Rank of spatial shape > 2 is not supported.") if spatial_rank == 2: # Tensor has shape [B, H, W, C], we can directly use the space_to_batch op by doing # [B, H, W, C] -> transpose -> [B, C, H, W] -> space_to_batch -> [B_new, C, H_new, W_new] -> # transpose -> [B_new, H_new, W_new, C] x = mb.transpose(x=x, perm=[0, 3, 1, 2]) if is_static_paddings: x = mb.space_to_batch(x=x, block_shape=block_shape, paddings=paddings) else: flatten_paddings = mb.reshape( x=paddings, shape=[ 4, ], ) flatten_paddings = mb.cast(x=flatten_paddings, dtype="int32") flatten_paddings = mb.concat(values=[[0, 0, 0, 0], flatten_paddings], axis=0) x = mb.pad(x=x, pad=flatten_paddings, mode="constant") x = mb.space_to_batch(x=x, block_shape=block_shape, paddings=_np.zeros((2, 2), _np.int32)) x = mb.transpose(x=x, perm=[0, 2, 3, 1]) if spatial_rank == 1: # In this case, we decompose space_to_batch into small basic ops # [B, H, C] -> decomposite ops -> [B_new, H_new, C] # expand padding to shape [3, 2] paddings = mb.cast(x=paddings, dtype="int32") values = [[[0, 0]], paddings, [[0, 0]]] paddings = mb.concat(values=values, axis=0) needs_paddings = not is_static_paddings or any(paddings.val.flatten()) if needs_paddings: flatten_paddings = mb.reshape(x=paddings, shape=[-1]) padded = mb.pad(x=x, pad=flatten_paddings, mode="constant") x = padded else: padded = x # padded_shape = [B, H_padded, C] padded_shape = mb.shape(x=padded) # reshape to [B, H_padded/block_shape, block_shape, C] block_shape = block_shape[0] batch_size = _value_at(padded_shape, 0) spatial_dim = mb.real_div(x=_value_at(padded_shape, 1), y=block_shape) spatial_dim = mb.cast(x=spatial_dim, dtype="int32") remain_dim = _value_at(padded_shape, 2) reshape_shape = mb.concat(values=[batch_size, spatial_dim, block_shape, remain_dim], axis=0) reshaped_padded = mb.reshape(x=padded, shape=reshape_shape) # permute the shape to: [block_shape, B, H_padded/block_shape, C] permuted_reshaped_padded = mb.transpose(x=reshaped_padded, perm=[2, 0, 1, 3]) # reshape the tensor to [block_shape * B, H_padded/block_shape, C] final_reshape_values = [mb.mul(x=batch_size, y=block_shape), spatial_dim, remain_dim] final_shape = mb.concat(values=final_reshape_values, axis=0) x = mb.reshape(x=permuted_reshaped_padded, shape=final_shape) if has_non_unity_remaining_dims: # Reshape the tensor from shape [batch_new, spatial_shape_new, remaining_dim_1 * ... 
* remaining_dim_N] back to # shape [batch_new, spatial_shape_new, remaining_shape] x = _reshape_remaining_dimension_to_original_shape(x, original_shape, remaining_rank) context.add(node.name, mb.identity(x=x, name=node.name)) @register_tf_op def SpaceToDepth(context, node): x = context[node.inputs[0]] block_size = node.attr.get("block_size") data_format = node.attr.get("data_format", "NHWC") if data_format == "NHWC": x = _transpose_NHWC_to_NCHW(x) x = mb.space_to_depth(x=x, block_size=block_size) x = _transpose_NCHW_to_NHWC(x, node.name) else: x = mb.space_to_depth(x=x, block_size=block_size, name=node.name) context.add(node.name, x) @register_tf_op def Tanh(context, node): x = context[node.inputs[0]] x = mb.tanh(x=x, name=node.name) context.add(node.name, x) @register_tf_op(tf_alias=["TopKV2"]) def TopK(context, node): x = context[node.inputs[0]] k = context[node.inputs[1]] if k.val is not None: sort = node.attr["sorted"] kwargs = {"x": x, "k": k, "axis": -1, "name": node.name} if is_current_opset_version_compatible_with(target.iOS16): kwargs["sort"] = sort elif not sort: raise ValueError("For opset <= iOS16, only sorted=True supported for the topk") context.add(node.name, mb.topk(**kwargs)) else: context.add(node.name, dynamic_topk(x, k, -1, name=node.name)) @register_tf_op(tf_alias=["InTopKV2"]) def InTopK(context, node): x = context[node.inputs[0]] target = context[node.inputs[1]] k = context[node.inputs[2]] _, class_num = x.shape if k.val is not None and not is_symbolic(class_num): k = min(k.val, class_num) _, indices = mb.topk(x=x, k=k, axis=-1) else: x_shape = mb.shape(x=x) class_num = mb.slice_by_index(x=x_shape, begin=(-1,), end=(-1,), squeeze_mask=(True,)) k = mb.minimum(x=k, y=class_num) _, indices = dynamic_topk(x, k, -1) target = mb.expand_dims(x=target, axes=[-1]) x = mb.equal(x=target, y=indices) x = mb.cast(x=x, dtype="fp32") x = mb.reduce_sum(x=x, axes=[-1], keep_dims=False) x = mb.cast(x=x, dtype="bool", name=node.name) context.add(node.name, x) @register_tf_op def Cumsum(context, node): x = context[node.inputs[0]] axis = context[node.inputs[1]] exclusive = node.attr.get("exclusive", False) reverse = node.attr.get("reverse", False) x = mb.cumsum(x=x, axis=axis, exclusive=exclusive, reverse=reverse, name=node.name) context.add(node.name, x) @register_tf_op def Gather(context, node): x = context[node.inputs[0]] indices = context[node.inputs[1]] axis = 0 x = mb.gather(x=x, indices=indices, axis=axis, name=node.name) context.add(node.name, x) def _perform_gather_with_batch_dims(x, indices, batch_dims, gather_func, func_args, name): """ An utility function to compute gather and gather_nd with batch_dims """ # (Step 1) # Reshape x, indices with shape # x: [batch_1, ..., batch_n, *remaining_x_shape] # indices: [batch_1, ..., batch_n, *remaing_indices_shape] # into shape # x_reshape: [prod(batch_1, ..., batch_n), *remaning_x_shape] # indices_reshape: [prod(batch_1, ..., batch_n), *remaning_indices_shape] msg = ("The implementation of gather/gather_nd for iOS15 and older is not efficient. Highly recommend " " set minimum_deployment_target=coremltools.target.iOS16 in the coremltools.convert() function." 
) logger.warning(msg) x_shape = mb.shape(x=x) indices_shape = mb.shape(x=indices) batch_shape = mb.gather(x=x_shape, indices=_np.array(range(batch_dims)), axis=0) batch_prod = mb.reduce_prod(x=batch_shape, axes=[0], keep_dims=True) x_remaining_shape = mb.gather(x=x_shape, indices=_np.array(range(batch_dims, x.rank)), axis=0) indices_remaining_shape = mb.gather(x=indices_shape, indices=_np.array(range(batch_dims, indices.rank)), axis=0) new_x_shape = mb.concat(values=[batch_prod, x_remaining_shape], axis=0) new_indices_shape = mb.concat(values=[batch_prod, indices_remaining_shape], axis=0) x_reshape = mb.reshape(x=x, shape=new_x_shape) indices_reshape = mb.reshape(x=indices, shape=new_indices_shape) # (Step 2) # We iterate through the batch dimension, and compute the gather individually for each batch # All results are stacked into a tensor with shape [prod(batch_1, ..., batch_n), *remaning_result_shape] res = [] if batch_prod.val is None: raise ValueError("batch dimension must be known at compile time") for i in range(batch_prod.val[0]): temp_x = mb.gather(x=x_reshape, indices=[i], axis=0) temp_indices = mb.gather(x=indices_reshape, indices=[i], axis=0) temp_x = mb.squeeze(x=temp_x, axes=[0]) temp_indices = mb.squeeze(x=temp_indices, axes=[0]) func_args.update({"x": temp_x, "indices": temp_indices}) temp = gather_func(**func_args) res.append(temp) res = mb.stack(values=res, axis=0) # (Step 3) # Finally, we reshape the result to shape [batch_1, ..., batch_n, *remaining_result_shape] res_shape = mb.shape(x=res) res_remaning_shape = mb.gather(x=res_shape, indices=_np.array(range(1, res_shape.shape[0])), axis=0) res_new_shape = mb.concat(values=[batch_shape, res_remaning_shape], axis=0) return mb.reshape(x=res, shape=res_new_shape, name=name) @register_tf_op def GatherV2(context, node): x = context[node.inputs[0]] indices = context[node.inputs[1]] axis = context[node.inputs[2]].val batch_dims = node.attr.get("batch_dims", 0) if is_current_opset_version_compatible_with(target.iOS16): # For iOS16 and above, we can directly use the batch_dims argument x = mb.gather(x=x, indices=indices, axis=axis, batch_dims=batch_dims, name=node.name) else: # For iOS15 or below, we have to manually compute it if batch_dims == 0: x = mb.gather(x=x, indices=indices, axis=axis, name=node.name) else: func_args = {"axis": axis - batch_dims} x = _perform_gather_with_batch_dims(x, indices, batch_dims, mb.gather, func_args, node.name) context.add(node.name, x) @register_tf_op def GatherNd(context, node): x = context[node.inputs[0]] indices = context[node.inputs[1]] batch_dims = node.attr.get("batch_dims", 0) if is_current_opset_version_compatible_with(target.iOS16): # For iOS16 and above, we can directly use the batch_dims argument x = mb.gather_nd(x=x, indices=indices, batch_dims=batch_dims, name=node.name) else: if batch_dims == 0: x = mb.gather_nd(x=x, indices=indices, name=node.name) else: x = _perform_gather_with_batch_dims(x, indices, batch_dims, mb.gather_nd, {}, node.name) context.add(node.name, x) @register_tf_op def Tile(context, node): x = context[node.inputs[0]] reps = context[node.inputs[1]] x = mb.tile(x=x, reps=reps, name=node.name) context.add(node.name, x) @register_tf_op def Where(context, node): if len(node.inputs) > 1: raise NotImplementedError('tf.where with x,y will be supported by ' 'MIL::select in the future') x = context[node.inputs[0]] x = mb.non_zero(x=x, name=node.name) context.add(node.name, x) @register_tf_op def SquaredDifference(context, node): x = context[node.inputs[0]] y = 
context[node.inputs[1]] x = mb.sub(x=x, y=y, name=node.name + '_sub') x = mb.square(x=x, name=node.name) context.add(node.name, x) @register_tf_op def Conv2DBackpropInput(context, node): # Output shape: [N, H_out, W_out, C_out] output_shape = context[node.inputs[0]].val # Weight shape: [H, W, C_out, C_in] W_hwoi = context[node.inputs[1]] W_iohw = mb.transpose(x=W_hwoi, perm=[3, 2, 0, 1]) # Input shape: [N, H_in, W_in, C_in] x = context[node.inputs[2]] data_format = node.attr.get("data_format", "NHWC") HW_dilations = _conv2d3d_strides_or_dilations( "dilations", node.attr.get("dilations"), data_format ) HW_strides = _conv2d3d_strides_or_dilations( "strides", node.attr.get("strides"), data_format ) pad_type = node.attr.get("padding") if not isinstance(pad_type, str): pad_type = "custom" raise NotImplementedError("Custom padding not implemented for TF") pad_type = pad_type.lower() # CoreML expects input to be in NCHW format # Transpose input to NCHW format if data_format == "NHWC": x = _transpose_NHWC_to_NCHW(x) if output_shape is not None: output_shape = [output_shape[0], output_shape[3], output_shape[1], output_shape[2]] # Only the last op should have the same name as node.name conv_name = node.name + "x" if data_format == "NHWC" else node.name # Pass output shape provided above x = mb.conv_transpose( x=x, weight=W_iohw, pad_type=pad_type, output_shape=output_shape, strides=HW_strides, dilations=HW_dilations, name=conv_name, ) # Convert NCHW output back to NHWC format if data_format == "NHWC": x = _transpose_NCHW_to_NHWC(x, node.name) context.add(node.name, x) @register_tf_op def Range(context, node): start = context[node.inputs[0]] end = context[node.inputs[1]] step = context[node.inputs[2]] x = mb.range_1d(start=start, end=end, step=step, name=node.name) context.add(node.name, x) @register_tf_op def RandomUniform(context, node): shape = context[node.inputs[0]] seed = node.attr["seed"] x = mb.random_uniform(shape=shape, seed=seed, name=node.name) context.add(node.name, x) @register_tf_op def RandomStandardNormal(context, node): shape = context[node.inputs[0]] seed = node.attr["seed"] x = mb.random_normal(shape=shape, seed=seed, name=node.name) context.add(node.name, x) @register_tf_op def OneHot(context, node): indices = context[node.inputs[0]] depth = context[node.inputs[1]] on_value = context[node.inputs[2]] off_value = context[node.inputs[3]] axis = node.attr.get("axis", -1) x = mb.one_hot( indices=indices, one_hot_vector_size=depth, axis=axis, on_value=on_value, off_value=off_value, name=node.name, ) context.add(node.name, x) def _get_non_maximum_supression(context, node, iou_threshold_override=None, score_threshold_override=None): """ The helper function returns the outputs from mb.non_maximum_suppression, along with the number of boxes and the maximum number of boxes. """ boxes = context[node.inputs[0]] scores = context[node.inputs[1]] max_boxes = context[node.inputs[2]] iou_threshold = iou_threshold_override or context[node.inputs[3]] score_threshold = score_threshold_override or context[node.inputs[4]] # The boxes' coordinates in Tensorflow is (y1, x1, y2, x2) where (y1, x1) and (y2, x2) are the # coordinates of diagonal pair of box corners. However, MIL NMS expects CENTER_SIZE_WIDTH_FIRST # format, which is (x, y, width, height) where (x, y) is the center coordinate. y1, x1, y2, x2 = mb.split(x=boxes, num_splits=4, axis=-1) # As the input coordinates could be any diagonal pair of box corners, it's not guaranteed that # x2 > x1 nor y2 > y1. 
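    # --- Illustrative sketch (not part of the converter) ---
    # A standalone NumPy reference for the corner -> center-size conversion done with MIL ops
    # below. All box values and variable names here are hypothetical, for demonstration only.
    import numpy as _np_demo
    corners_demo = _np_demo.array([[2.0, 1.0, 6.0, 5.0],   # (y1, x1, y2, x2) with y2 > y1, x2 > x1
                                   [6.0, 5.0, 2.0, 1.0]])  # the same box given as the opposite diagonal pair
    y1_d, x1_d, y2_d, x2_d = _np_demo.split(corners_demo, 4, axis=-1)
    width_d = _np_demo.abs(x2_d - x1_d)    # abs() makes the result independent of corner order
    height_d = _np_demo.abs(y2_d - y1_d)
    center_x_d = (x1_d + x2_d) / 2.0
    center_y_d = (y1_d + y2_d) / 2.0
    # Both rows map to the same CENTER_SIZE_WIDTH_FIRST box (x, y, w, h) = (3, 4, 4, 4).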
So we need to use abs to get width/height, and (x1+x2)/2 to get center. width = mb.abs(x=mb.sub(x=x2, y=x1)) height = mb.abs(x=mb.sub(x=y2, y=y1)) center_x = mb.real_div(x=mb.add(x=x1, y=x2), y=2.0) center_y = mb.real_div(x=mb.add(x=y1, y=y2), y=2.0) boxes = mb.concat(values=[center_x, center_y, width, height], axis=-1) if score_threshold.val == float("-inf"): # TensorFlow's default value for score_threshold, Core ML does not # have float('-inf') support, converted to minimum float32 instead score_threshold = -3.4e38 boxes = mb.expand_dims(x=boxes, axes=[0]) scores = mb.expand_dims(x=scores, axes=[0, -1]) coordinates, scores, indices, valid_outputs = mb.non_maximum_suppression( boxes=boxes, scores=scores, max_boxes=max_boxes, iou_threshold=iou_threshold, score_threshold=score_threshold, ) # The results from MIL NMS op are padded to max_boxes. We need to extract the valid part for TF. # Notice that the batch dim and class num dim also need to be squeezed. valid_outputs = mb.squeeze(x=valid_outputs, axes=[0]) range = mb.range_1d(end=valid_outputs, start=0, step=1) coordinates = mb.squeeze(x=coordinates, axes=[0]) valid_coordinates = mb.gather(x=coordinates, indices=range, axis=0) scores = mb.squeeze(x=scores, axes=[0, -1]) valid_scores = mb.gather(x=scores, indices=range, axis=0) indices = mb.squeeze(x=indices, axes=[0]) valid_indices = mb.cast( x=mb.gather(x=mb.cast(x=indices, dtype="fp32"), indices=range, axis=0), dtype="int32", name=node.name, ) return valid_coordinates, valid_scores, valid_indices, valid_outputs @register_tf_op(tf_alias=["NonMaxSuppressionV3"]) def NonMaxSuppression(context, node): _, _, valid_indices, valid_outputs = _get_non_maximum_supression(context, node) context.add(node.name, valid_indices) @register_tf_op def NonMaxSuppressionV5(context, node): """ Different from NonMaxSuppression/NonMaxSuppressionV3, which only returns the indices of the selected boxes, NonMaxSuppressionV5 returns all indices, scores and number of the selected boxes. """ soft_nms_sigma = context[node.inputs[5]].val iou_threshold_override = None score_threshold_override = None if soft_nms_sigma != 0: # fallback to "hard" NMS with sensible defaults iou_threshold_override = types.fp32(0.5) score_threshold_override = types.fp32(float("-inf")) logger.warning("NonMaxSuppressionV5 with soft_nms_sigma != 0 not supported. 
" "Setting soft_nms_sigma to zero.") _, valid_scores, valid_indices, valid_outputs = _get_non_maximum_supression( context, node, iou_threshold_override=iou_threshold_override, score_threshold_override=score_threshold_override ) res = [valid_indices, valid_scores, valid_outputs] context.add(node.name, res) @register_tf_op def Shape(context, node): x = context[node.inputs[0]] if types.is_complex(x.dtype): x = mb.complex_shape(x=x, name=node.name) else: x = mb.shape(x=x, name=node.name) context.add(node.name, x) @register_tf_op def ResizeNearestNeighbor(context, node): # "ResizeNearestNeighbor" op in TF is always in the channel last mode # instead of upsample factor, it uses output size, which is the second input x = context[node.inputs[0]] input_shape = x.shape # (N,Hin,Win,C) if len(input_shape) != 4: raise ValueError('"ResizeNearestNeighbor" op: input rank is not 4') if len(context[node.inputs[1]].shape) != 1: raise ValueError('"ResizeNearestNeighbor" op: the second input, must have rank 1') if context[node.inputs[1]].shape[0] != 2: raise ValueError( '"ResizeNearestNeighbor" op: the second input, which is the output size, must have 2 elements' ) Hout, Wout = None, None scaling_factor_h, scaling_factor_w = None, None target_shape = context[node.inputs[1]] if target_shape.val is None: if target_shape.op is not None and target_shape.op.op_type == "mul": scaling_factor_h = target_shape.op.y.val[0] scaling_factor_w = target_shape.op.y.val[1] elif not is_current_opset_version_compatible_with(target.iOS17): # For the dynamic input shape case before iOS17, # context[node.inputs[1]] need to be a mul(x=input_shape, y=scaling_factor) op. raise ValueError( "Cannot determine the scale factor for the resize layer. " "Please make sure the target size is known statically, or " "use mul op to get the target size. If the target size has to be dynamic, please" "set minimum_deployment_target to iOS17 during conversion." ) else: Hin, Win = input_shape[1], input_shape[2] Hout, Wout = target_shape.val scaling_factor_h = Hout / Hin if Hout % Hin == 0 else (Hout + 1e-4) / Hin scaling_factor_w = Wout / Win if Wout % Win == 0 else (Wout + 1e-4) / Win if ( scaling_factor_h is not None and scaling_factor_w is not None and scaling_factor_h < 1 and scaling_factor_w < 1 ): ResizeBilinear(context, node) return # first transpose to from channel last to channel first format for coreml x = _transpose_NHWC_to_NCHW(x) align_corners = node.attr.get("align_corners", False) half_pixel_centers = node.attr.get("half_pixel_centers", False) # add either the resize or the upsample layer if align_corners is False and half_pixel_centers is False: x = mb.upsample_nearest_neighbor( x=x, scale_factor_height=scaling_factor_h, scale_factor_width=scaling_factor_w, name=node.name + "_channel_first_upsample", ) elif align_corners is False and half_pixel_centers is True: # if output size can be determined at compile time, # we call the core op resize_nearest_neighbor, # otherwise we use upsample_nearest_neighbor for approximation. 
# rdar://75204549 (resize_nearest_neighbor need to support dynamic input shape) if Hout is not None and Wout is not None: x = mb.resize_nearest_neighbor( x=x, target_size_height=Hout, target_size_width=Wout, name=node.name + "_channel_first_resize", ) elif is_current_opset_version_compatible_with(target.iOS17): x = mb.resize( x=x, shape=target_shape, resized_dims=np.uint32(2), interpolation_mode="NEAREST_NEIGHBOR", name=node.name + "_channel_first_resize", ) else: logger.warning('Using upsample_nearest_neighbor to approximate resize_nearest_neighbor.') x = mb.upsample_nearest_neighbor( x=x, scale_factor_height=scaling_factor_h, scale_factor_width=scaling_factor_w, name=node.name + "_channel_first_upsample", ) else: raise NotImplementedError( "ResizeNearestNeighbor op with align_corners={}and half_pixel_centers={} not supported".format( align_corners, half_pixel_centers ) ) # transpose again x = _transpose_NCHW_to_NHWC(x, node.name) context.add(node.name, x) @register_tf_op def ResizeBilinear(context, node): # "ResizeBilinear" op in TF is always in the channel last mode # second input is the output size x = context[node.inputs[0]] input_shape = x.shape # (N,Hin,Win,C) if len(input_shape) != 4: raise ValueError('"ResizeBilinear" op: input rank is not 4') if len(context[node.inputs[1]].shape) != 1: raise ValueError('"ResizeBilinear" op: the second input, must have rank 1') if context[node.inputs[1]].shape[0] != 2: raise ValueError( '"ResizeBilinear" op: the second input, which is the output size, must have 2 elements' ) align_corners = node.attr.get("align_corners", False) half_pixel_centers = node.attr.get("half_pixel_centers", False) if align_corners and half_pixel_centers: # we should not come here since TF does not support align_corners=True and half_pixel_centers=True raise ValueError( '"ResizeBilinear" op: "align_corners" and "half_pixel_centers" are both True and this mode is not supported' ) # In iOS16, we can support dynamic shape + any combination of aligh_corners and half_pixel_centers, # if the output_shape comes from a pattern of input_shape * (h_scale, w_scale) if is_current_opset_version_compatible_with(target.iOS16) and context[node.inputs[1]].val is None: output_shape = context[node.inputs[1]] if output_shape.op is not None and output_shape.op.op_type == "mul": scale_factor_height = context[node.inputs[1]].op.y.val[0] scale_factor_width = context[node.inputs[1]].op.y.val[1] x = _transpose_NHWC_to_NCHW(x) x = mb.upsample_bilinear( x=x, scale_factor_height=scale_factor_height, scale_factor_width=scale_factor_width, align_corners=align_corners, half_pixel_centers=half_pixel_centers, ) x = _transpose_NCHW_to_NHWC(x, node.name) context.add(node.name, x) return # first transpose to from channel last to channel first format for coreml x = _transpose_NHWC_to_NCHW(x) # [half_pixel_centers = False] if not half_pixel_centers: sampling_mode = "STRICT_ALIGN_CORNERS" if align_corners else "DEFAULT" node_name = node.name + "_channel_first_resize_bilinear" target_size = context[node.inputs[1]] if target_size.val is not None: Hout, Wout = target_size.val if not ( isinstance(Hout, (_np.int32, _np.int64)) and isinstance(Wout, (_np.int32, _np.int64)) ): raise ValueError( '"ResizeBilinear" op: the second input, which is the output size, must have elements of type int32 or int64' ) x = mb.resize_bilinear( x=x, target_size_height=Hout, target_size_width=Wout, sampling_mode=sampling_mode, name=node_name, ) elif is_current_opset_version_compatible_with(target.iOS17): x = mb.resize( x=x, 
shape=target_size, resized_dims=np.uint32(2), sampling_mode=sampling_mode, name=node_name, ) else: raise ValueError( '"ResizeBilinear" op: the second input, which is the output size, must be known ' "statically. Consider setting minimum_deployment_target to iOS17 during conversion." ) # [align_corners = False, half_pixel_centers = True] elif not align_corners and half_pixel_centers: if context[node.inputs[1]].val is None: # for the dynamic input shape case, # context[node.inputs[1]] is a mul(x=input_shape, y=scaling_factor) op. if context[node.inputs[1]].op.op_type != "mul": raise NotImplementedError("Cannot determine the scale factor for the bilinear resize layer.") scale_factor_height = context[node.inputs[1]].op.y.val[0] scale_factor_width = context[node.inputs[1]].op.y.val[1] else: Hin, Win = input_shape[1], input_shape[2] Hout, Wout = context[node.inputs[1]].val # check if the output size divide the input size, # if not, then cast the scale factor to float type. scale_factor_height = Hout / Hin if Hout % Hin == 0 else (Hout + 1e-4) / Hin scale_factor_width = Wout / Win if Wout % Win == 0 else (Wout + 1e-4) / Win x = mb.upsample_bilinear( x=x, scale_factor_height=scale_factor_height, scale_factor_width=scale_factor_width, align_corners=False, name=node.name + "_channel_first_upsample_bilinear", ) # transpose again x = _transpose_NCHW_to_NHWC(x, node.name) context.add(node.name, x) @register_tf_op def make_tuple(context, node): res = tuple([context[in_name] for in_name in node.inputs]) context.add(node.name, res) @register_tf_op def function_entry(context, node): if context.get_func_inputs() is None: msg = ( "function_entry requires function inputs stored in " + "context.curr_func_inputs" ) raise ValueError(msg) context.add(node.name, context.get_func_inputs()) @register_tf_op(tf_alias=["while"]) def While(context, node): # TF while will never have break statement, because break can always be # transformed into while and condition. Example: # # while pred: # a = op1(...) # if a == 0: # break # b = op2(...) # # is equivalent to # # while pred and not break_a: # a = op1(...) # break_a = a == 0 # if not break_a: # b = op2(...) # node.inputs[0] == 'make_tuple_X' (always a make_tuple) loop_vars = context[node.inputs[0]] # python tuple of Vars cond_graph = context.get_graph(node.attr["cond_function"]) body_graph = context.get_graph(node.attr["body_function"]) def cond(*loop_vars): context.stack_func_inputs(loop_vars) # convert_graph uses context to convert cond_graph. During conversion # it constructs operations (mb.some_op). Note that cond(*loop_vars) is # only evaluated inside while_loop's type_inference(), not here. In # other words, we use python's deferred function evaluation to defer # the SSA block construction until inside while_loop Operation. res = convert_graph(context, cond_graph) # Done with translating the function context.unstack_func_inputs() return res def body(*loop_vars): context.stack_func_inputs(loop_vars) res = convert_graph(context, body_graph) # Done with translating the function context.unstack_func_inputs() return res x = mb.while_loop(_cond=cond, _body=body, loop_vars=loop_vars, name=node.name) # wraps x as tuple for get_tuple that always follow the while node. if not isinstance(x, (tuple, list)): x = (x,) context.add(node.name, x) @register_tf_op def iff(context, node): pred = context[node.inputs[0]] # this is always a tensor, as TF uses one iff op for each returned value. 
# # Example TF program: # # x = tf.placeholder(tf.float32, shape=(1,)) # y = tf.placeholder(tf.float32, shape=(1,)) # z = tf.multiply(x, y) # pred = tf.less(tf.math.reduce_mean(x), tf.math.reduce_mean(y)) # def true_fn(): return tf.add(x, z), x # def false_fn(): return tf.square(y), z # res = tf.cond(pred, true_fn, false_fn) # # There will be 2 iffs: # # iff('cond/pred_id', 'cond/Add', 'cond/Square') # iff('cond/pred_id', 'cond/Add/Switch', 'cond/Switch_1') # # where # 'cond/pred_id': pred # 'cond/Add': tf.add(x, z) # 'cond/Square': tf.square(y) # 'cond/Add/Switch': x # 'cond/Switch_1': z # # And both branches are executed, and one of the results will be # discarded at iff nodes. # # Note that the above program would translate to two cond ops, each with # two blocks. true_output_var = context[node.inputs[1]] false_output_var = context[node.inputs[2]] def true_fn(): return mb.identity(x=true_output_var) def false_fn(): return mb.identity(x=false_output_var) x = mb.cond(pred=pred, _true_fn=true_fn, _false_fn=false_fn, name=node.name) context.add(node.name, x) @register_tf_op def Concat(context, node): values = [context[input] for input in node.inputs[1:]] axis = context[node.inputs[0]] x = mb.concat(values=values, axis=axis, name=node.name) context.add(node.name, x) @register_tf_op def ConcatV2(context, node): values = [context[input] for input in node.inputs[:-1]] axis = context[node.inputs[-1]] x = mb.concat(values=values, axis=axis, name=node.name) context.add(node.name, x) @register_tf_op def Pack(context, node): values = [context[name] for name in node.inputs] axis = node.attr["axis"] if axis < 0: # TF axis = -1 creates new dim at the end axis += values[0].rank + 1 if len(values) == 1: # for example: # y = tf.raw_ops.Pack(values=[2], axis=0). # or y = tf.raw_ops.Pack(values=[tf.constant([1,2])], axis=0) input_type = values[0].sym_type if _is_scalar(input_type): x = mb.mul(x=_np.array([1], dtype=_np.int32), y=values[0], name=node.name) else: x = mb.expand_dims(x=values[0], axes=[axis], name=node.name) else: if all([_is_scalar(input.sym_type) for input in values]): x = mb.concat(values=values, axis=axis, name=node.name) else: x = mb.stack(values=values, axis=axis, name=node.name) context.add(node.name, x) @register_tf_op def Unpack(context, node): x = context[node.inputs[0]] axis = int(node.attr["axis"]) num_splits = node.attr.get("num", None) if num_splits is None: num_splits = x.shape[axis] if num_splits == 1: y = [x] else: y = mb.split(x=x, num_splits=num_splits, axis=axis, name=node.name + "_unsqueezed") output_vars = [] for i in range(num_splits): output_vars.append( mb.squeeze(x=y[i], axes=[axis], name=node.name + ":{}".format(i)) ) context.add(node.name, output_vars) @register_tf_op def Split(context, node): axis = context[node.inputs[0]] x = context[node.inputs[1]] if "num_split" not in node.attr: raise ValueError("num_splits not found in TF op {}".format(node.name)) num_splits = node.attr["num_split"] if num_splits == 1: if len(node.outputs) == 0: x = mb.mul(x=x, y=1.0, name=node.name) context.add(node.name, x) else: # Don't change tfssa. Just make downstream ops reference the pre-identity op. context.add(node.name, [x], is_new_var=False) else: x = mb.split(x=x, num_splits=num_splits, axis=axis, name=node.name) context.add(node.name, x) # TODO : If tf.split output is returned, there's no # get_tuple nodes. Some graph pass is needed. 
Example: # # x = tf.placeholder(tf.float32, shape=input_shape1) # res = tf.split(x, 3, axis=0) # # res are ['split:0', 'split:1', 'split'] # # but node.outputs == ['gto_1', 'gto_2', 'gto_3'] @register_tf_op def SplitV(context, node): x = context[node.inputs[0]] split_sizes = context[node.inputs[1]] axis = context[node.inputs[2]] if "num_split" not in node.attr: raise ValueError("num_splits not found in TF op {}".format(node.name)) num_splits = node.attr["num_split"] if num_splits == 1: Identity(context, node) else: x = mb.split( x=x, num_splits=num_splits, split_sizes=split_sizes, axis=axis, name=node.name, ) context.add(node.name, x) @register_tf_op def ScatterNd(context, node): indices = context[node.inputs[0]] updates = context[node.inputs[1]] shape = context[node.inputs[2]] x = mb.fill(shape=shape, value=types.nptype_from_builtin(updates.dtype)(0)) x = mb.scatter_nd(data=x, indices=indices, updates=updates, name=node.name) context.add(node.name, x) @register_tf_op def TensorScatterAdd(context, node): tensor, indices, updates, = [context[name] for name in node.inputs] output = mb.scatter_nd(data=tensor, indices=indices, updates=updates, mode="add", name=node.name) context.add(node.name, output) @register_tf_op def ZerosLike(context, node): x = context[node.inputs[0]] if x.rank == 0: np_type = types.nptype_from_builtin(x.sym_type) x = mb.const(val=np_type(0), name=node.name) else: np_type = types.nptype_from_builtin(x.sym_type.get_primitive()) x = mb.fill(shape=mb.shape(x=x), value=np_type(0), name=node.name) context.add(node.name, x) @register_tf_op def IsFinite(context, node): x = context[node.inputs[0]] # In floating-point arithmetic, symbolically, inf + anything = inf, # so we can detect if x is finite by x + y != x # # To avoid false alarm, i.e. x + y = x due to rounding error for small y, # here we use the fp16 max as y dtype = types.nptype_from_builtin(x.sym_type.get_primitive()) y_add = dtype(_np.finfo(_np.float16).max) x_plus = mb.add(x=x, y=y_add) result = mb.not_equal(x=x, y=x_plus, name=node.name) context.add(node.name, result) @register_tf_op def CropAndResize(context, node): x = context[node.inputs[0]] input_shape = x.shape # (B, h_in, w_in, C) if len(input_shape) != 4: raise ValueError( '"CropResize" op: expected input rank 4, got {}'.format(x.rank) ) Hin, Win = input_shape[1:3] const_box_info = True if context[node.inputs[1]].val is None or context[node.inputs[2]].val is None: const_box_info = False crop_size = context[node.inputs[3]].val method = node.attr.get("method", "bilinear") pad_value = node.attr.get("extrapolation_value", 0.0) # CoreML index information along with boxes if const_box_info: boxes = context[node.inputs[1]].val box_indices = context[node.inputs[2]].val if not is_current_opset_version_compatible_with(target.iOS17): # Before IOS17, CoreML expects boxes/ROI in [N, 1, 5, 1, 1] shape. box_indices = _np.expand_dims(box_indices, axis=1) boxes = _np.concatenate([box_indices, boxes], axis=1) boxes = boxes.reshape(boxes.shape[0], 1, boxes.shape[1], 1, 1) else: box_indices = context[node.inputs[2]] boxes = context[node.inputs[1]] if not is_current_opset_version_compatible_with(target.iOS17): # Before IOS17, CoreML expects ROI in [N, 1, 5, 1, 1] shape. 
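    # --- Illustrative sketch (not part of the converter) ---
    # How N boxes and their batch indices are packed into the pre-iOS17 ROI layout
    # [N, 1, 5, 1, 1], mirroring the constant-box branch above; the dynamic branch below builds
    # the same layout with MIL ops. Box values and names here are hypothetical.
    import numpy as _np_demo
    boxes_demo = _np_demo.array([[0.1, 0.2, 0.8, 0.9],
                                 [0.0, 0.0, 0.5, 0.5]], dtype=_np_demo.float32)   # [N, 4]
    box_indices_demo = _np_demo.array([0.0, 1.0], dtype=_np_demo.float32)         # [N]
    roi_demo = _np_demo.concatenate(
        [_np_demo.expand_dims(box_indices_demo, axis=1), boxes_demo], axis=1      # [N, 5]: index first
    ).reshape(2, 1, 5, 1, 1)                                                       # [N, 1, 5, 1, 1]
    assert roi_demo.shape == (2, 1, 5, 1, 1)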
if box_indices.dtype != boxes.dtype: box_indices = mb.cast(x=box_indices, dtype=types.builtin_to_string(boxes.dtype)) box_indices = mb.expand_dims(x=box_indices, axes=[1]) boxes = mb.concat(values=(box_indices, boxes), axis=1) # TODO: Dynamic rank: Use GetShape and select indices dynamically boxes = mb.reshape(x=boxes, shape=[boxes.shape[0], 1, boxes.shape[1], 1, 1]) # Get Height and Width of crop h_out, w_out = crop_size[0], crop_size[1] # TF `nearest` mode not supported method_map = {"bilinear": "ALIGN_CORNERS"} if method not in method_map: raise ValueError( "CropResize op: Unsupported method {}. Supports {}".format( method, method_map.keys() ) ) method = method_map[method] # TF input format: [B, h_in, w_in, C] # CoreML input format: [B, C, h_in, w_in] x = _transpose_NHWC_to_NCHW(x) # Crop Resize crop_resize_args = { "x": x, "target_height": h_out, "target_width": w_out, "normalized_coordinates": True, "spatial_scale": 1.0, "box_coordinate_mode": "CORNERS_HEIGHT_FIRST", "sampling_mode": method, } if is_current_opset_version_compatible_with(target.iOS16): crop_resize_args["pad_value"] = pad_value else: if pad_value != 0.0: raise ValueError( f"For iOS15 or older, only extrapolation_value=0.0 is supported or the tf CropAndResize op. Got {pad_value}" ) if not is_current_opset_version_compatible_with(target.iOS17): # Before IOS17, the input param is `roi` instead of `boxes`. crop_resize_args["roi"] = boxes else: crop_resize_args["boxes"] = boxes crop_resize_args["box_indices"] = box_indices x = mb.crop_resize(**crop_resize_args) if not is_current_opset_version_compatible_with(target.iOS17): # Before IOS17, the output has an extra dim at axis 1. # CoreML output format: [N, 1, C, h_out, w_out] # TF output format: [N, h_out, w_out, C] x = mb.squeeze(x=x, axes=[1]) x = _transpose_NCHW_to_NHWC(x, node.name) context.add(node.name, x) @register_tf_op def TensorArrayV3(context, node): if "infer_shape" in node.attr: if not node.attr["infer_shape"]: raise ValueError("Only fixed size TensorArray is supported") dynamic_length = node.attr.get("dynamic_size", True) elem_shape = node.attr.get("element_shape", None) size = node.attr.get("size", None) if size is None: size = context[node.inputs[0]] if size.val is None: init_length = size else: init_length = size.val if init_length == 0: # Dynamic list. 
Use 1 as init_length init_length = 1 builtin_dtype = node.attr["dtype"] dtype_str = types.builtin_to_string(builtin_dtype) if elem_shape is not None and -1 not in elem_shape: ls = mb.make_list( init_length=init_length, dtype=dtype_str, elem_shape=elem_shape, dynamic_length=dynamic_length, name=node.name, ) else: ls = mb.tf_make_list( init_length=init_length, dtype=dtype_str, dynamic_length=dynamic_length, name=node.name, ) context.add(node.name, ls) @register_tf_op def TensorArrayWriteV3(context, node): index = context[node.inputs[0]] new_val = context[node.inputs[1]] ls = context[node.inputs[2]] new_list = mb.list_write(ls=ls, index=index, value=new_val, name=node.name) context.add(node.name, new_list) @register_tf_op def TensorArraySizeV3(context, node): ls = context[node.inputs[0]] length = mb.list_length(ls=ls, name=node.name) context.add(node.name, length) @register_tf_op def TensorArrayGatherV3(context, node): indices = context[node.inputs[0]] ls = context[node.inputs[1]] tensor = mb.list_gather(ls=ls, indices=indices, name=node.name) context.add(node.name, tensor) @register_tf_op def TensorArrayReadV3(context, node): idx = context[node.inputs[0]] ls = context[node.inputs[1]] ls = mb.list_read(ls=ls, index=idx, name=node.name) context.add(node.name, ls) @register_tf_op def TensorArrayScatterV3(context, node): indices = context[node.inputs[0]] value = context[node.inputs[1]] ls = context[node.inputs[2]] ls = mb.list_scatter(ls=ls, indices=indices, value=value, name=node.name) context.add(node.name, ls) @register_tf_op def BroadcastTo(context, node): x = context[node.inputs[0]] shape = context[node.inputs[1]] if shape.val is None: # dynamic shape raise NotImplementedError("dynamic shape not yet supported") else: # static shape target_shape = tuple(shape.val) broadcast_shape = broadcast_shapes(x.shape, target_shape) if target_shape != broadcast_shape: msg = "shapes are not broadcastable: {} vs. {}" raise ValueError(msg.format(x.shape, target_shape)) target_rank = len(target_shape) if x.rank != target_rank: axes = [i for i in range(target_rank - x.rank)] x = mb.expand_dims(x=x, axes=axes) reps = [1] * target_rank for i in range(target_rank): reps[i] = target_shape[i] // x.shape[i] x = mb.tile(x=x, reps=reps, name=node.name) context.add(node.name, x) @register_tf_op def get_global(context, node): # Design comment: This is only works if variable doesn't cross block # boundary (e.g. 
while_loop, cond, function) variable_name = node.attr["variable"] x = context[variable_name] # This must've been set by set_global context.add(node.name, x, is_new_var=False) @register_tf_op def set_global(context, node): x = context[node.inputs[0]] variable_name = node.attr["variable"] context.add(variable_name, x, is_new_var=False) def _get_const_or_raise(variable): if variable.val is None: raise ValueError("Var {} must be const".format(variable.name)) return variable.val @register_tf_op def LSTMBlockCell(context, node): x = context[node.inputs[0]] # [batch, input_dim] c_prev = context[node.inputs[1]] # [b, hidden_dim] h_prev = context[node.inputs[2]] # [b, hidden_dim] # W layout is ifco W = context[node.inputs[3]] # [input_dim + hidden_dim, 4*hidden_dim] kwargs = {} use_peephole = node.attr["use_peephole"] if use_peephole: peep_i = context[node.inputs[4]] # [hidden_dim,] peep_f = context[node.inputs[5]] # [hidden_dim,] peep_o = context[node.inputs[6]] # [hidden_dim,] kwargs["weight_peep_i"] = peep_i kwargs["weight_peep_f"] = peep_f kwargs["weight_peep_o"] = peep_o bias = context[node.inputs[7]] # [4*hidden_dim,] forget_bias = node.attr["forget_bias"] cell_clip = None if node.attr["cell_clip"] is not None and node.attr["cell_clip"] > 0: cell_clip = node.attr["cell_clip"] res = mb.tf_lstm_block_cell( x=x, c_prev=c_prev, h_prev=h_prev, weight=W, bias=bias, forget_bias=forget_bias, cell_clip=cell_clip, use_peephole=use_peephole, name=node.name, **kwargs ) context.add(node.name, res) @register_tf_op(tf_alias=["BlockLSTMV2"]) def BlockLSTM(context, node): # BlockLSTM: https://www.tensorflow.org/api_docs/python/tf/raw_ops/BlockLSTM # BlockLSTMV2: https://www.tensorflow.org/api_docs/python/tf/raw_ops/BlockLSTMV2 seq_len = context[node.inputs[0]] # int x = context[node.inputs[1]] # [padded_len, batch, input_dim] init_c = context[node.inputs[2]] # [1, hidden_dim] init_h = context[node.inputs[3]] # [1, hidden_dim] # BlockLSTM: icfo format, BlockLSTMV2: ifco format weight = context[node.inputs[4]] # [input_dim + hidden_dim, 4*hidden_dim] kwargs = {} use_peephole = node.attr["use_peephole"] if use_peephole: peep_i = context[node.inputs[5]] # [hidden_dim,] peep_f = context[node.inputs[6]] # [hidden_dim,] peep_o = context[node.inputs[7]] # [hidden_dim,] kwargs["weight_peep_i"] = peep_i kwargs["weight_peep_f"] = peep_f kwargs["weight_peep_o"] = peep_o # BlockLSTM: icfo format, BlockLSTMV2: ifco format bias = context[node.inputs[8]] # [4*hidden_dim,] # forget bias is always 0 for BlockLSTMV2 forget_bias = 0.0 if node.op == "BlockLSTMV2" else node.attr["forget_bias"] cell_clip = None if node.attr["cell_clip"] is not None and node.attr["cell_clip"] > 0: cell_clip = node.attr["cell_clip"] if node.op == "BlockLSTMV2": # mb.tf_lstm_block takes weights and bias in icfo format # BlockLSTMV2's weights and bias are in ifco format # convert from ifco to icfo format w_i, w_f, w_c, w_o = mb.split(x=weight, num_splits=4, axis=-1) weight = mb.concat(values=(w_i, w_c, w_f, w_o), axis=1, name=weight.name) b_i, b_f, b_c, b_o = mb.split(x=bias, num_splits=4, axis=-1) bias = mb.concat(values=(b_i, b_c, b_f, b_o), axis=0, name=bias.name) res = mb.tf_lstm_block( seq_len=seq_len, x=x, c_prev=init_c, h_prev=init_h, weight=weight, bias=bias, forget_bias=forget_bias, cell_clip=cell_clip, use_peephole=use_peephole, name=node.name, **kwargs ) context.add(node.name, res) @register_tf_op def ClipByValue(context, node): x = context[node.inputs[0]] min_value = context[node.inputs[1]] max_value = context[node.inputs[2]] if 
min_value.val < max_value.val: x = mb.clip(x=x, alpha=min_value, beta=max_value, name=node.name) else: # When min >= max, TensorFlow sets all values to min. x = mb.fill(shape=mb.shape(x=x), value=min_value, name=node.name) context.add(node.name, x) @register_tf_op def Size(context, node): x = context[node.inputs[0]] x = mb.shape(x=x) x = mb.reduce_prod(x=x, axes=[0], name=node.name) context.add(node.name, x) @register_tf_op def LogSoftmax(context, node): x = context[node.inputs[0]] axis = node.attr.get('axis', -1) x_max = mb.reduce_max(x=x, axes=[axis], keep_dims=True) x_off = mb.sub(x=x, y=x_max) y = mb.reduce_log_sum_exp(x=x_off, axes=[axis], keep_dims=True) res = mb.sub(x=x_off, y=y, name=node.name) context.add(node.name, res) @register_tf_op def AudioSpectrogram(context, node): """ input shape: (Tin, channels) attributes: stride (int), window_size (int), magnitude_squared (bool) output shape : (channels, Tout, fout) where, Tout = floor((Tin - window_size)/stride + 1) fout = N / 2 + 1 where N = next_power_of_2(window_size) = 2 ^ ceil(log2(window_size)) reference: https://github.com/tensorflow/tensorflow/blob/dec8e0b11f4f87693b67e125e67dfbc68d26c205/tensorflow/core/kernels/spectrogram_op.cc """ x = context[node.inputs[0]] # (Tin, channels) if x.rank != 2: raise NotImplementedError("AudioSpectrogram op: rank of the input must be 2") if "magnitude_squared" not in node.attr: raise ValueError("AudioSpectrogram op: missing attribute: 'magnitude_squared'") if "stride" not in node.attr: raise ValueError("AudioSpectrogram op: missing attribute: 'stride'") if "window_size" not in node.attr: raise ValueError("AudioSpectrogram op: missing attribute: 'window_size'") magnitude_squared = node.attr["magnitude_squared"] stride = node.attr["stride"] window_size = node.attr["window_size"] N = 2 ** _np.ceil(_np.log2(window_size)) N = N.astype(_np.int32) fout = N / 2 + 1 fout = fout.astype(_np.int32) # construct constant for hann window tensor, of shape (window_size,) h = _np.arange(window_size) * ((2 * _np.pi) / window_size) h = 0.5 - 0.5 * _np.cos(h) # construct the constant DFT matrices k = _np.arange(fout).reshape(1, fout) # (1, fout) n = _np.arange(N).reshape(N, 1) # (N, 1) kn = _np.matmul(n, k) * (2 * _np.pi / N) # (N, fout) Re_DFT_matrix_const = _np.cos(kn) # (N, fout) Im_DFT_matrix_const = -_np.sin(kn) # (N, fout) # transpose input x = mb.transpose(x=x, perm=[1,0]) # (channels, Tin) # extract slices from the input x = mb.sliding_windows(x=x, axis=1, size=window_size, stride=stride) # (channels, Tout, window_size) # multiply with hann window x = mb.mul(x=x, y=h) # pad the last dimension to size N x = mb.pad(x=x, pad=[0,0,0,0,0,N - window_size], mode="constant", constant_val=0.0) # (channels, Tout, N) # multiply by DFT matrices re = mb.matmul(x=x, y=Re_DFT_matrix_const) # (channels, Tout, fout) im = mb.matmul(x=x, y=Im_DFT_matrix_const) # (channels, Tout, fout) # compute spectrogram re = mb.mul(x=re, y=re) im = mb.mul(x=im, y=im) if not magnitude_squared: y = mb.add(x=re, y=im) y = mb.sqrt(x=y, name=node.name) else: y = mb.add(x=re, y=im, name=node.name) context.add(node.name, y) @register_tf_op def Mfcc(context, node): """ inputs: - x : (channels, T, N) - sampling rate: int attributes: - upper_frequency_limit : int - lower_frequency_limit : int - filterbank_channel_count : int - dct_coefficient_count : int output shape: (channels, T, dct_coefficient_count) """ x = context[node.inputs[0]] # (channels, T, F) if x.rank != 3: raise NotImplementedError("Mfcc op: rank of the input must be 3") 
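    # --- Illustrative sketch (not part of the converter) ---
    # The spectrogram shape arithmetic from the AudioSpectrogram docstring above (the op whose
    # output feeds this Mfcc translation), worked through with hypothetical sizes
    # Tin=1000 samples, window_size=400, stride=160:
    #   N    = 2 ** ceil(log2(400))            = 512
    #   fout = N / 2 + 1                        = 257
    #   Tout = floor((1000 - 400) / 160 + 1)    = 4
    import math as _math_demo
    Tin_demo, window_demo, stride_demo = 1000, 400, 160
    N_demo = 2 ** _math_demo.ceil(_math_demo.log2(window_demo))
    fout_demo = N_demo // 2 + 1
    Tout_demo = (Tin_demo - window_demo) // stride_demo + 1
    assert (N_demo, fout_demo, Tout_demo) == (512, 257, 4)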
sampling_rate_var = context[node.inputs[1]] if sampling_rate_var.val is None: raise NotImplementedError("Mfcc op: dynamic sampling rate not supported") sample_rate = sampling_rate_var.val if is_symbolic(x.shape[2]): raise NotImplementedError("Mfcc op: the last dimension, i.e. spectrogram size of the input must be known") spectrogram_N = x.shape[2] upper_frequency_limit = node.attr.get("upper_frequency_limit", 4000) lower_frequency_limit = node.attr.get("lower_frequency_limit", 20) filterbank_channel_count = node.attr.get("filterbank_channel_count", 40) dct_coefficient_count = node.attr.get("dct_coefficient_count", 13) # get the constant weights, matrices for MFCC filterbank and for DCT # weights: (N,) # mat_weighted, mat_spec_val : (N, filterbank_channel_count) # cosines : (filterbank_channel_count, dct_coefficient_count) weights, mat_weighted, mat_spec_val, cosines = _get_MFCC_constants(spectrogram_N, sample_rate, upper_frequency_limit, lower_frequency_limit, filterbank_channel_count, dct_coefficient_count) spectogram_value = mb.sqrt(x=x) # (channels, T, N) weighted_spectogram_value = mb.mul(x=spectogram_value, y=weights) # (channels, T, N) x1 = mb.matmul(x=weighted_spectogram_value, y=mat_weighted) # (channels, T, filterbank_channel_count) x2 = mb.matmul(x=spectogram_value, y=mat_spec_val) # (channels, T, filterbank_channel_count) y = mb.add(x=x1, y=x2) # (channels, T, filterbank_channel_count) y = mb.log(x=y, epsilon=1e-12) y = mb.matmul(x=y, y=cosines, name=node.name) # (channels, T, dct_coefficient_count) context.add(node.name, y) @register_tf_op def Complex(context, node): real_part = context[node.inputs[0]] imag_part = context[node.inputs[1]] result = mb.complex(real_data=real_part, imag_data=imag_part, name=node.name) context.add(node.name, result) @register_tf_op def Real(context, node): input_data = context[node.inputs[0]] if types.is_complex(input_data.dtype): real_part = mb.complex_real(data=input_data, name=node.name) else: real_part = input_data context.add(node.name, real_part) @register_tf_op def Imag(context, node): input_data = context[node.inputs[0]] if types.is_complex(input_data.dtype): imag_part = mb.complex_imag(data=input_data, name=node.name) else: # According to the doc of tf.math.imag, it returns a tensor of all zeros if input is real. np_type = types.nptype_from_builtin(input_data.sym_type.get_primitive()) imag_part = mb.fill( shape=mb.shape(x=input_data), value=np_type(0), name=node.name ) context.add(node.name, imag_part) @register_tf_op def FFT(context, node): input_data = context[node.inputs[0]] fft_res = mb.complex_fft(data=input_data, name=node.name) context.add(node.name, fft_res) @register_tf_op def RFFT(context, node): input_data = context[node.inputs[0]] fft_length = context[node.inputs[1]] # The fft_length is an int32 tensor of shape [1] instead of an integer. To make it compatible # to complex_rfft (which use PyTorch's params as reference), we extract the value from tensor. rfft_res = mb.complex_rfft( data=input_data, n=mb.const(val=fft_length.val[0]), name=node.name ) context.add(node.name, rfft_res) @register_tf_op def IFFT(context, node): input_data = context[node.inputs[0]] ifft_res = mb.complex_ifft(data=input_data, name=node.name) context.add(node.name, ifft_res) @register_tf_op def IRFFT(context, node): input_data = context[node.inputs[0]] fft_length = context[node.inputs[1]] # The fft_length is an int32 tensor of shape [1] instead of an integer. 
To make it compatible # to complex_rfft (which use PyTorch's params as reference), we extract the value from tensor. irfft_res = mb.complex_irfft( data=input_data, n=mb.const(val=fft_length.val[0]), name=node.name ) context.add(node.name, irfft_res) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/parse.py0000644000000000000000000000776214672066616025535 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as _np from tensorflow.core.framework.types_pb2 import DataType from tensorflow.python.framework.dtypes import _TF_TO_NP from coremltools import _logger as logger from coremltools.converters.mil.mil import types def parse_type(t): mapping = { # bool DataType.DT_BOOL: types.bool, # floating point DataType.DT_HALF: types.fp16, DataType.DT_FLOAT: types.float, DataType.DT_DOUBLE: types.double, # int DataType.DT_INT8: types.int8, DataType.DT_INT16: types.int16, DataType.DT_INT32: types.int32, DataType.DT_INT64: types.int32, # unsigned int DataType.DT_UINT8: types.uint8, DataType.DT_UINT16: types.uint16, DataType.DT_UINT32: types.uint32, DataType.DT_UINT64: types.uint64, # string DataType.DT_STRING: types.str, } t = int(t) if t in mapping: return mapping[t] else: logger.info("Type %d cannot be mapped", t) return None def parse_shape(t): if t.unknown_rank: return None ret = [d.size for d in t.dim] return ret def parse_tensor(t): typ = parse_type(t.dtype) shape = parse_shape(t.tensor_shape) retval = None if len(t.half_val) > 0: retval = _np.array(t.half_val, dtype=_TF_TO_NP[t.dtype]) elif len(t.float_val) > 0: retval = _np.array(t.float_val, dtype=_TF_TO_NP[t.dtype]) elif len(t.double_val) > 0: retval = _np.array(t.double_val, dtype=_TF_TO_NP[t.dtype]) elif len(t.int_val) > 0: retval = _np.array(t.int_val, dtype=_TF_TO_NP[t.dtype]) elif len(t.int64_val) > 0: retval = _np.array(t.int64_val, dtype=_TF_TO_NP[t.dtype]) elif len(t.bool_val) > 0: retval = _np.array(t.bool_val, dtype=_TF_TO_NP[t.dtype]) elif hasattr(t, "uint32_val") and len(t.uint32_val) > 0: retval = _np.array(t.uint32_val, dtype=_TF_TO_NP[t.dtype]) elif hasattr(t, "uint64_val") and len(t.uint64_val) > 0: retval = _np.array(t.uint64_val, dtype=_TF_TO_NP[t.dtype]) if not t.tensor_shape.unknown_rank and len(shape) == 0: retobj = typ() if retval is not None: retobj.val = retval[0] else: rettype = types.tensor(typ, tuple(shape)) retobj = rettype() retobj.shape = shape if retval is not None: retobj.val = retval return retobj def parse_string(s): if isinstance(s, bytes): return s.decode("utf-8", errors="ignore") else: return s def parse_list(t): if len(t.s) > 0: return list(parse_string(s) for s in t.s) elif len(t.i) > 0: return list(t.i) elif len(t.f) > 0: return list(t.f) elif len(t.b) > 0: return list(t.b) elif len(t.type) > 0: return list(parse_type(z) for z in t.type) elif len(t.shape) > 0: return list(parse_shape(z) for z in t.shape) elif len(t.tensor) > 0: return list(parse_tensor(z) for z in t.tensor) else: return [] def parse_func(f): return f.name def parse_attr(attr): if attr.HasField("s"): return parse_string(attr.s) elif attr.HasField("i"): return attr.i elif attr.HasField("f"): return attr.f elif attr.HasField("b"): return attr.b elif attr.HasField("type"): return parse_type(attr.type) elif attr.HasField("shape"): return 
parse_shape(attr.shape) elif attr.HasField("tensor"): return parse_tensor(attr.tensor) elif attr.HasField("list"): return parse_list(attr.list) elif attr.HasField("func"): return parse_func(attr.func) elif attr.HasField("placeholder"): raise NotImplementedError("placeholder not yet implemented") raise ValueError("unintelligible TFNode attributes") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/parsed_tf_node.py0000644000000000000000000000624214672066616027367 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import types from .tfssa import ParsedNode class ParsedTFNode(ParsedNode): """ A parsed TensorFlow Node. name: The name of the node (str) op: The operation represented by the node (str) datatype: The type of the node. (type) value: The value of the node if available inputs: The list of nodes which are inputs to this node (list[str]) control_inputs: The list of nodes which have to be executed before this node (list[str]) attr: The attributes of the node outputs: The list of nodes which consume the result of this node (list[str]) control_outputs: The list of nodes which have to be executed after this node (list[str]) """ def __init__(self, tfnode=None): super(ParsedTFNode, self).__init__() self.original_node = tfnode if tfnode is not None: from .parse import parse_attr self.name = tfnode.name if tfnode.op == "PlaceholderWithDefault": self.op = "Placeholder" else: self.op = tfnode.op self.inputs = [x for x in tfnode.input if not x.startswith("^")] self.control_inputs = [x[1:] for x in tfnode.input if x.startswith("^")] self.attr = {k: parse_attr(v) for k, v in tfnode.attr.items()} def parse_from_attr(self): if "value" in self.attr: self.datatype = self.attr["value"].__class__ elif "_output_shapes" in self.attr: output_shapes = self.attr["_output_shapes"] if output_shapes[0] is not None and len(output_shapes[0]) > 0: if "dtype" in self.attr: rettype = types.tensor(self.attr["dtype"], tuple(output_shapes[0])) elif "T" in self.attr: rettype = types.tensor(self.attr["T"], tuple(output_shapes[0])) elif "Tparams" in self.attr: rettype = types.tensor( self.attr["Tparams"], tuple(output_shapes[0]) ) else: raise NotImplementedError( "Op-(%s) %s not implemented\nWith attribute:" + str(self.attr) % (self.op, self.name) ) self.datatype = rettype elif "dtype" in self.attr: self.datatype = self.attr["dtype"] elif "shape" in self.attr: shape = self.attr["shape"] assert "dtype" in self.attr if len(shape) == 0: self.datatype = self.attr["dtype"] else: self.datatype = types.tensor(self.attr["dtype"], shape) elif "dtype" in self.attr: self.datatype = self.attr["dtype"] def _copy_impl(self, dest): dest = super(ParsedTFNode, self)._copy_impl(dest) dest.original_node = self.original_node return dest def __copy__(self): return self._copy_impl(ParsedTFNode()) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2175465 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/ssa_passes/0000755000000000000000000000000014672075535026201 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 
coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/ssa_passes/__init__.py0000644000000000000000000000045414672066616030315 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from . import backfill_make_list_elem_type, expand_tf_lstm, tf_lstm_to_core_lstm ././@PaxHeader0000000000000000000000000000021100000000000010207 xustar00115 path=coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/ssa_passes/backfill_make_list_elem_type.py 22 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/ssa_passes/backfill_make_list_elem_ty0000644000000000000000000001127214672066616033444 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass from coremltools.converters.mil.mil.types.symbolic import is_symbolic from coremltools.converters.mil.mil.var import ListVar @register_pass(namespace="tensorflow") class backfill_make_list_elem_type(AbstractGraphPass): """ TF's TensorArrayV3 (represented as make_list in mil) doesn't necessarily contain elem shape/type, which is known when write is performed. We backfill elem type info to make_list Inputs: prog: Program """ def apply(self, prog): for f in prog.functions.values(): _backfill_make_list_elem_type_block(f) @block_context_manager def _backfill_make_list_elem_type_block(block): # shallow copy hides changes on f.operations during the loop for op in list(block.operations): for b in op.blocks: _backfill_make_list_elem_type_block(b) if op.op_type != "tf_make_list": continue if op.outputs[0].elem_type != types.unknown: # elem_type of the list is known continue list_var = op.outputs[0] elem_type = _infer_elem_type(list_var) # types.tensor if elem_type is None: msg = ( "No list_write or list_scatter op to infer make_list " + "'{}' element type. Block:\n{}" ) raise ValueError(msg.format(op.name, op.enclosing_block)) # elem_shape can be runtime-detemrined, which cannot be inferred here at this point, # so we add an internal _const_symbolic node to cover both static and dynamic cases. elem_shape = [dim.name if is_symbolic(dim) else dim for dim in elem_type.get_shape()] new_list = mb.make_list( init_length=op.init_length, dynamic_length=op.dynamic_length, elem_shape=tuple(elem_shape), dtype=op.inputs["dtype"], before_op=op, name=op.name, ) block.replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=new_list ) block.remove_ops([op]) def _infer_elem_type(list_var): """ Returns types.tensor. None if failed to infer element type. Example: Given: main(%update: (2,fp32)) { block0() { %list: List[unknown] = tf_make_list(...) # unknown elem type %while_loop_0:0: (i32), %while_loop_0:1: List[(2,fp32)] = while_loop(loop_vars=(...)) while_loop_0_body(...) 
{ %list_write_0: List[(2,fp32)] = list_write(index=..., ls=%list, value=%update) } -> (%add_0, %list_write_0) Result: main(%update: (2,fp32)) { block0() { %list: List[(2,fp32)] = tf_make_list(...) # Get the elem type from list_write %while_loop_0:0: (i32), %while_loop_0:1: List[(2,fp32)] = while_loop(loop_vars=(...)) while_loop_0_body(...) { %list_write_0: List[(2,fp32)] = list_write(index=..., ls=%list, value=%update) } -> (%add_0, %list_write_0) """ # Search for child op that have informative element types for o in list_var.child_ops: if o.op_type in ["list_write", "list_scatter"]: return o.outputs[0].elem_type if o.op_type == "while_loop": idx = list(o.loop_vars).index(list_var) block = o.blocks[0] # the corresponding Var in body block block_var = block.inputs[idx] elem_type = _infer_elem_type(block_var) if elem_type is not None: def _set_types_for_block_inputs(block): block_var = block.inputs[idx] new_block_var = ListVar(name=block_var.name, elem_type=elem_type, init_length=block_var.sym_type.T[1], dynamic_length=block_var.sym_type.T[2]) block._replace_var(block_var, new_block_var) _set_types_for_block_inputs(o.blocks[0]) # condition block _set_types_for_block_inputs(o.blocks[1]) # body block return elem_type # otherwise continue to other block_var (a list_var can be # passed into while_loop twice). return None ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/ssa_passes/expand_tf_lstm.py0000644000000000000000000001670414672066616031572 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools import _logger as logger from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.pass_registry import register_pass @register_pass(namespace="tensorflow") class expand_tf_lstm(AbstractGraphPass): """ Expand tf_lstm_block_cell to fine-grained SSA ops following: xh = [x, h_prev] [i, ci, f, o] = xh * w + b f = f + forget_bias if not use_peephole: wci = wcf = wco = 0 i = sigmoid(cs_prev .* wci + i) f = sigmoid(cs_prev .* wcf + f) ci = tanh(ci) cs = ci .* i + cs_prev .* f cs = clip(cs, cell_clip) o = sigmoid(cs * wco + o) co = tanh(cs) h = co .* o Inputs: prog: Program """ def apply(self, prog): for f in prog.functions.values(): _expand_tf_lstm_helper(f) def _expand_tf_lstm_helper(block): # shallow copy hides changes on f.operations during the loop for op in list(block.operations): for b in op.blocks: _expand_tf_lstm_helper(b) if op.op_type == "tf_lstm_block_cell": _expand_tf_lstm_block_cell(op) logger.info("Expanding {} (op_type: {})".format(op.name, op.op_type)) if op.op_type == "tf_lstm_block": # only cs, h are supported for now. Can be easily extended to other outputs at performance hit. 
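            # Note: the guard below expands tf_lstm_block only when none of the
            # per-gate sequence outputs (i, f, o, ci, co) are consumed -- i.e.
            # they have no child ops and are not consumed by any enclosing
            # block -- so that only the cell state (cs) and hidden state (h)
            # outputs need to be reproduced by the fine-grained expansion.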
i, cs, f, o, ci, co, h = op.outputs if all( map(lambda x: len(x.child_ops) == 0 and len(x.consuming_blocks) == 0, (i, f, o, ci, co) ) ): _expand_tf_lstm_block(op) logger.info("Expanding {} (op_type: {})".format(op.name, op.op_type)) def _lstm_cell_builder(op, x, h_prev, cs_prev, before_op=None): b = op.bias # [4*hidden_dim] forget_bias = op.forget_bias.val # python:float # xh = [x, h_prev] # xh shape: [b, input_dim+hidden_dim] xh = mb.concat(values=[x, h_prev], axis=-1, before_op=before_op) # w: [4*hidden_dim, input_dim + hidden_dim] (icfo layout) w = np.transpose(op.weight.val) # [i, ci, f, o] = xh * w + b. Shape is [b, 4*hidden_dim] icfo = mb.linear(x=xh, weight=w, bias=b, before_op=before_op) # i, ci, f, o shape: [b, hidden_dim] i, ci, f, o = mb.split(x=icfo, num_splits=4, axis=-1, before_op=before_op) if op.forget_bias.val != 0: f = mb.add(x=f, y=forget_bias, before_op=before_op) # note that .* means Hadamard product # i = sigmoid(cs_prev .* wci + i) # f = sigmoid(cs_prev .* wcf + f) if op.use_peephole.val: wci = op.weight_peep_i.val # [hidden_dim] wcf = op.weight_peep_f.val # [hidden_dim] x = mb.mul(x=cs_prev, y=wci, before_op=before_op) pre_i = mb.add(x=x, y=i, before_op=before_op) x = mb.mul(x=cs_prev, y=wcf, before_op=before_op) pre_f = mb.add(x=x, y=f, before_op=before_op) else: pre_i = i pre_f = f i = mb.sigmoid(x=pre_i, before_op=before_op) f = mb.sigmoid(x=pre_f, before_op=before_op) ci = mb.tanh(x=ci, before_op=before_op) # cs = ci .* i + cs_prev .* f x = mb.mul(x=ci, y=i, before_op=before_op) y = mb.mul(x=cs_prev, y=f, before_op=before_op) cs = mb.add(x=x, y=y, before_op=before_op) # cs = clip(cs, cell_clip) if op.cell_clip is not None: clip_val = op.cell_clip.val cs = mb.clip(x=cs, alpha=-clip_val, beta=clip_val, before_op=before_op) # o = sigmoid(cs * wco + o) if op.use_peephole.val: wco = op.weight_peep_o.val x = mb.mul(x=cs, y=wco, before_op=before_op) pre_o = mb.add(x=x, y=o, before_op=before_op) else: pre_o = o o = mb.sigmoid(x=pre_o, before_op=before_op) co = mb.tanh(x=cs, before_op=before_op) # h = co .* o h = mb.mul(x=co, y=o, before_op=before_op) return [i, cs, f, o, ci, co, h] def _expand_tf_lstm_block_cell(op): if op.op_type != "tf_lstm_block_cell": raise ValueError() with op.enclosing_block as block: x = op.x # [b, input_dim] h_prev = op.h_prev # [b, hidden_dim] cs_prev = op.c_prev # [b, hidden_dim] i, cs, f, o, ci, co, h = _lstm_cell_builder( op, x, h_prev, cs_prev, before_op=op ) # Replace all outputs new_outputs = [i, cs, f, o, ci, co, h] for old_v, new_v in zip(op.outputs, new_outputs): block.replace_uses_of_var_after_op( anchor_op=op, old_var=old_v, new_var=new_v ) block.remove_ops([op]) def _expand_tf_lstm_block(op): if op.op_type != "tf_lstm_block": raise ValueError() with op.enclosing_block as block: x = op.x # [s, b, input_dim] h_prev = op.h_prev # [b, hidden_dim] cs_prev = op.c_prev # [b, hidden_dim] # cond and body function gor the while_loop def cond(i, cs_list, h_list): return mb.less(x=i, y=length) def body(i, cs_list, h_list): xi = mb.gather(x=x, indices=i, axis=0) h_prev = mb.gather(x=h_list, indices=i, axis=0) cs_prev = mb.gather(x=cs_list, indices=i, axis=0) ig, cs, fg, og, ci, co, h = _lstm_cell_builder(op, xi, h_prev, cs_prev) counter = mb.add(x=i, y=1) return ( counter, mb.scatter(data=cs_list, indices=counter, updates=cs), mb.scatter(data=h_list, indices=counter, updates=h), ) # Allocate two lists: cs & h x_shape = mb.shape(x=x, before_op=op) length = mb.slice_by_index(x=x_shape, begin=[0], end=[1], before_op=op) h_shape = mb.shape(x=h_prev, 
before_op=op) list_shape = mb.concat(values=[length, h_shape], axis=0, before_op=op) cs_list = mb.fill(shape=list_shape, before_op=op) h_list = mb.fill(shape=list_shape, before_op=op) # append initial state at index 0 cs_prev = mb.expand_dims(x=cs_prev, axes=[0], before_op=op) cs_list = mb.concat(values=[cs_prev, cs_list], axis=0, before_op=op) h_prev = mb.expand_dims(x=h_prev, axes=[0], before_op=op) h_list = mb.concat(values=[h_prev, h_list], axis=0, before_op=op) _, cs_list, h_list = mb.while_loop( _cond=cond, _body=body, loop_vars=([0], cs_list, h_list), before_op=op ) # strip initial state or element at index 0 begin, end = [1, 0, 0], [0, 0, 0] begin_mask = [False, True, True] end_mask = [True, True, True] cs = mb.slice_by_index( x=cs_list, begin=begin, end=end, begin_mask=begin_mask, end_mask=end_mask, before_op=op, ) h = mb.slice_by_index( x=h_list, begin=begin, end=end, begin_mask=begin_mask, end_mask=end_mask, before_op=op, ) # Replace all outputs new_outputs = [cs, h] for old_v, new_v in zip( [ov for index, ov in enumerate(op.outputs) if index in [1, 6]], new_outputs ): block.replace_uses_of_var_after_op( anchor_op=op, old_var=old_v, new_var=new_v ) block.remove_ops([op]) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/ssa_passes/test_passes.py0000644000000000000000000000402714672066616031113 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy import pytest from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.passes.pass_registry import PASS_REGISTRY from coremltools.converters.mil.testing_utils import (assert_model_is_valid, assert_same_output_names) pytest.importorskip("tensorflow", minversion="1.15.0") def test_backfill_make_list_elem_type(): # The while_loop appends [1, 2]*i to `ls` for each iteration # i = 0, ... num_iters-1. 
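    # Expected behavior of the pass under test: the tf_make_list below is
    # created with an unknown element type; backfill_make_list_elem_type
    # should replace it with a make_list whose elem_type is inferred from the
    # (2,)-shaped tensors written into the list by list_write in the loop body.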
elem_shape = (2,) @mb.program( input_specs=[mb.TensorSpec(shape=elem_shape),] ) def prog(update): def body(i, ls): return mb.add(x=i, y=1), mb.list_write(ls=ls, index=i, value=update) def cond(i, ls): return mb.less(x=i, y=num_iters) i = 0 ls = mb.tf_make_list(init_length=1) num_iters = 3 _, final_tensor_list = mb.while_loop(_cond=cond, _body=body, loop_vars=(i, ls)) list_len = mb.list_length(ls=final_tensor_list) indices = mb.range_1d(start=0, end=list_len, step=1) return mb.list_gather(ls=final_tensor_list, indices=indices) # tf_make_list has no elem_type info make_list_op = prog.find_ops(op_type="tf_make_list", exactly_one=True)[0] assert make_list_op.outputs[0].elem_type == types.unknown prev_prog = copy.deepcopy(prog) PASS_REGISTRY["tensorflow::backfill_make_list_elem_type"](prog) assert_same_output_names(prev_prog, prog) prog.validate() # tf_make_list is replaced with make_list and should have elem_type now make_list_op = prog.find_ops(op_type="make_list", exactly_one=True)[0] assert make_list_op.outputs[0].elem_type.get_shape() == elem_shape assert_model_is_valid(prog, {"update": elem_shape}) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/ssa_passes/tf_lstm_to_core_lstm.py0000644000000000000000000003023214672066616032774 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools import _logger as logger from coremltools.converters.mil.mil import ( Block, Builder as mb, Operation, Var, ) from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass from coremltools.converters.mil.mil.types.symbolic import is_symbolic SUPPORTED_TF_LSTM_OPS = ["tf_lstm_block_cell", "tf_lstm_block"] @register_pass(namespace="tensorflow") class tf_lstm_to_core_lstm(AbstractGraphPass): """ Try to map TF dialect ops `tf_lstm_block` and `tf_lstm_block_cell` to `lstm` in the core op set if compatible. They are compatible if all of the following are satisfied: - If tf_lstm_block: only h output is consumed. tf_lstm_block has 7 sequence outputs: [i, cs, f, o, ci, co, h]. Each of them (e.g., i) has shape [seq_len, batch, hidden_dim] (see tf_lstm_block op doc string). core lstm only supports sequence output for hidden state h, and thus if any outputs other than `h` is consumed, we cannot convert to lstm in the core op set. - If tf_lstm_block_cell: only cs, h output (outputs[1], outputs[6]) are consumed. Similar to above. 
Inputs: prog: Program """ def apply(self, prog): for f in prog.functions.values(): _tf_lstm_to_core_lstm_block(f) @block_context_manager def _tf_lstm_to_core_lstm_block(block: Block): # shallow copy hides changes on f.operations during the loop for op in list(block.operations): for b in op.blocks: _tf_lstm_to_core_lstm_block(b) if op.op_type in SUPPORTED_TF_LSTM_OPS: if _try_replace_with_core_lstm(op): logger.info("Successfully map {} to lstm".format(op.op_type)) else: logger.info("Unable to map {} to lstm".format(op.op_type)) def _try_get_last_cell_state_in_tf_lstm_block(op: Operation) -> Var: """ Parameters ---------- op: Operation Must have op type "tf_lstm_block" Returns ------- Var, a var representing the last cell state in the lstm. None if check fails. One of the outputs of the op "tf_lstm_block" is the cell state (cs) which has shape [seq_len, batch, feat]. That is, it is the cell state tensor of the lstm, which includes all the time steps. This, normally, can not be mapped to the MIL lstm op's cell state output, since that op only returns the last time step of the cell state, which is a tensor of shape [batch, feat]. However, if the cell state output of "tf_lstm_block" is being sliced, before being used anywhere else, and sliced in such a way that it extracts just the last time step of the seq dimension, then it can indeed be mapped to MIL's lstm op. This utility function detects this condition. If true, it returns the var that corresponds to the rank 2 sliced cell state. In particular, the following pattern is detected: Input pattern: ..., cs, ... = tf_lstm_block(...) # [seq_len, batch, feat] extracted_cell_state = slice_by_index(x=cs, ...) # [batch, feat] or [1, batch, feat], such that seq dim. is sliced at the last time step out = op(extracted_cell_state) The "cs" var can be feeding into multiple "slice_by_index" ops, some of which slice it into [batch, feat] and some into [1, batch, feat] shaped tensors. This scenario is handled in the following manner: step 1: verify that the output "cs" only feeds into slice_by_index ops step 2: add a slice_by_index op to the graph, which slices the last time step and creates a tensor, "last_cs", of shape [batch, feat] step 3: add an expand_dims op to the graph which takes in "last_cs" and expands it to create a tensor, "expanded_last_cs", of shape [1, batch, feat] step 4: now, iterate over all the child ops of "cs". Each one of these will be of type "slice_by_index". Verify that they are slicing only the last time step. If not, exit out of the function by returning None. Once verified, replace its output var with either "last_cs" or "expanded_last_cs", depending on its shape. step 5: remove all the child ops of "cs". Return "last_cs" """ if op.op_type != "tf_lstm_block": raise ValueError("op must have type 'tf_lstm_block'. 
Got {}".format(op.op_type)) cs = op.outputs[1] if len(cs.child_ops) == 0 and len(cs.consuming_blocks) == 0: return cs if len(cs.consuming_blocks) > 1: return None if not all([child_op.op_type == "slice_by_index" for child_op in cs.child_ops]): return None child_ops = cs.child_ops[:] block = op.enclosing_block # extract the last time step of the cell states last_cs = mb.slice_by_index( x=cs, begin=[-1, 0, 0], end=[-1, 0, 0], begin_mask=[False, True, True], end_mask=[False, True, True], squeeze_mask=[True, False, False], before_op=child_ops[0], ) # this is of shape [batch, feat] expanded_last_cs = mb.expand_dims( x=last_cs, axes=[0], before_op=child_ops[0] ) # shape: [1, batch, feat] # for each child op, which is a "slice_by_index" op, verify the following conditions: # - input is a rank 3 tensor, of shape [seq_len, batch, feat] # - output is either a rank 2 tensor of shape [batch, feat] or rank 3 of shape [1, batch, feat] # - the first dimension is sliced with an index that is the last index, # so if its positive it should be of value, seq-1, or if negative, it should be -1 for slice_op in child_ops: # if any of the input arguments of the slice op is not compile time known, the check fails early for input in slice_op.inputs.values(): if input == slice_op.x: continue if input is None or input.val is None: return None x = slice_op.x out = slice_op.outputs[0] # check input rank if x.rank != 3: return None # check output rank and shape if out.rank not in (2, 3): return None if out.shape[-2:] != x.shape[-2:]: return None if out.rank == 3 and out.shape[0] != 1: return None # check that only the last time step is being extracted begin = slice_op.begin.val.tolist() end = slice_op.end.val.tolist() stride = slice_op.stride.val.tolist() begin_mask = slice_op.begin_mask.val.tolist() end_mask = slice_op.end_mask.val.tolist() squeeze_mask = slice_op.squeeze_mask.val.tolist() # the stride for the first dimension must be 1 if stride[0] != 1: return None # check if the first dimension is sliced exactly for the last time step if is_symbolic(x.shape[0]): """ When the first dimension is symbolic, we check for the following condition to be true: - begin[0] == -1 and begin_mask[0] == False If this condition is not met, we return None and exit """ if begin[0] != -1 or begin_mask[0]: return None else: time = x.shape[0] begin = [i + time if i < 0 else i for i in begin] end = [i + time if i < 0 else i for i in end] begin_time = 0 if begin_mask[0] else begin[0] end_time = time if end_mask[0] else end[0] if squeeze_mask[0]: if begin_time != time - 1: return None else: if end_time - begin_time != 1: return None if begin_time != time - 1: return None block.replace_uses_of_var_after_op( anchor_op=slice_op, old_var=slice_op.outputs[0], new_var=last_cs if len(out.shape) == 2 else expanded_last_cs, ) block.remove_ops(child_ops) return last_cs def _try_replace_with_core_lstm(op: Operation) -> bool: """ Inputs: op (Operation): op.op_type must be 'tf_lstm_block_cell' or `tf_lstm_block` Returns: True if op can be represented by mb.lstm op in SSA. 
False otherwise """ def _check_unsupported_outputs(unsupported_outputs): for ov in unsupported_outputs: if len(ov.child_ops) > 0 or len(ov.consuming_blocks) > 0: return False return True # Check for unsupported configuration : When peephole is present if op.use_peephole.val: return False # Check if the tf lstm op can be replaced with coreml lstm op # We check the following two conditions # (1) The outputs must not be (i, f, o, ci, co), since there is no corresponding outputs with the LSTM in Core ML # (2) For the tf_lstm_block op, only the last time step of cell state can be used # Here is an example of valid supported configuration: # _, cell_states, _, _, _, _, _, _ = tf_lstm_block.outputs # output = cell_states[-1, 1:2, :] # And here is an example that coreml cannot handle currently: # _, cell_states, _, _, _, _, _, _ = tf_lstm_block.outputs # output = cell_states[:2, :, :] i, cs, f, o, ci, co, h = op.outputs unsupported_outputs = [i, f, o, ci, co] if not _check_unsupported_outputs(unsupported_outputs): return False if op.op_type == "tf_lstm_block": cs = _try_get_last_cell_state_in_tf_lstm_block(op) if cs is None: return False # op is compatible with lstm mb_peep = None if op.use_peephole.val: mb_peep = np.stack( [op.weight_peep_i.val, op.weight_peep_f.val, op.weight_peep_o.val] ) # Set weights. The layout of the weight in TF1 is icfo (input, cell, forget, output gate). # Need to convert to ifoc for coreml tf_w = op.weight.val # [input_dim+hidden_dim, 4*hidden_dim] in icfo layout tf_w_i, tf_w_c, tf_w_f, tf_w_o = np.split(tf_w, 4, axis=1) w = np.concatenate([tf_w_i, tf_w_f, tf_w_o, tf_w_c], axis=1) w = np.transpose(w, [1, 0]) hidden_dim = w.shape[0] // 4 input_dim = w.shape[1] - hidden_dim # Split input and hidden weights w_ih, w_hh = np.split(w, [input_dim], axis=1) # Bias is icfo. Convert to ssa LSTM's ifoc layout tf_b = op.bias.val tf_b_i, tf_b_c, tf_b_f, tf_b_o = np.split(tf_b, 4, axis=0) tf_b_f += op.forget_bias.val # add forget bias to bias bias = np.concatenate([tf_b_i, tf_b_f, tf_b_o, tf_b_c], axis=0) cell_clip = None if op.cell_clip is None else op.cell_clip.val output_sequence = op.op_type == "tf_lstm_block" block = op.enclosing_block # x: [seq_len, batch, input_dim] if op.op_type == "tf_lstm_block_cell": x = mb.expand_dims(x=op.x, axes=[0], before_op=op) elif op.op_type == "tf_lstm_block": x = op.x else: raise ValueError("tf lstm op {} not supported. Only {} supported".format(op.op_type, SUPPORTED_TF_LSTM_OPS)) new_h_all, new_h, new_cs = mb.lstm( x=x, initial_c=op.c_prev, initial_h=op.h_prev, weight_ih=w_ih, weight_hh=w_hh, bias=bias, recurrent_activation="sigmoid", cell_activation="tanh", activation="tanh", peephole=mb_peep, clip=cell_clip, output_sequence=output_sequence, name=op.name, before_op=op, ) ops_to_remove = [op] block.replace_uses_of_var_after_op(anchor_op=op, old_var=cs, new_var=new_cs) if op.op_type == "tf_lstm_block_cell": block.replace_uses_of_var_after_op(anchor_op=op, old_var=h, new_var=new_h) elif op.op_type == "tf_lstm_block": block.replace_uses_of_var_after_op(anchor_op=op, old_var=h, new_var=new_h_all) if cs.op != op: ops_to_remove.append(cs.op) else: raise ValueError("tf lstm op {} not supported. 
Only {} supported".format(op.op_type, SUPPORTED_TF_LSTM_OPS)) block.remove_ops(ops_to_remove) return True ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2175465 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/test/0000755000000000000000000000000014672075535025014 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/test/__init__.py0000644000000000000000000000033214672066616027123 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/test/test_composite_ops.py0000644000000000000000000000455514672066616031321 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest from coremltools.converters.mil.frontend.tensorflow.test.testing_utils import ( TensorFlowBaseTest, make_tf_graph) # Importing _TF_OPS_REGISTRY to ensure `overriding` existing TF op does not break # testing of default op # pytest imports all the tests and hence overriding op invokes custom op which is not expected # In real usecase, importing following is not recommended!! from coremltools.converters.mil.frontend.tensorflow.tf_op_registry import ( _TF_OPS_REGISTRY, register_tf_op) from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.testing_reqs import backends, compute_units from coremltools.converters.mil.testing_utils import random_gen tf = pytest.importorskip("tensorflow") class TestCompositeOp(TensorFlowBaseTest): @pytest.fixture(scope="class") def create_custom_selu(self): default_selu = _TF_OPS_REGISTRY.get("Selu", None) @register_tf_op(tf_alias=[], override=True) def Selu(context, node): x = context[node.inputs[0]] alpha = 1.6732631921768188 lmda = 1.0507010221481323 out_elu = mb.elu(x=x, alpha=alpha) out = mb.mul(x=out_elu, y=lmda, name=node.name) context.add(node.name, out) yield _TF_OPS_REGISTRY["Selu"] = default_selu @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, backends, list(range(1, 5)) ), ) @pytest.mark.usefixtures("create_custom_selu") def test_selu(self, compute_unit, backend, rank): input_shape = np.random.randint(low=1, high=6, size=rank) @make_tf_graph([input_shape]) def build_model(x): return tf.keras.activations.selu(x) model, inputs, outputs = build_model input_values = [random_gen(input_shape, -10.0, 10.0)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/test/test_custom_ops.py0000644000000000000000000002547114672066616030631 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest from coremltools.converters.mil.frontend.tensorflow.test.testing_utils import ( TensorFlowBaseTest, make_tf_graph) # Importing _TF_OPS_REGISTRY to ensure `overriding` existing TF op does not break # testing of default op # pytest imports all the tests and hence overriding op invokes custom op which is not expected # In real usecase, importing following is not recommended!! from coremltools.converters.mil.frontend.tensorflow.tf_op_registry import ( _TF_OPS_REGISTRY, register_tf_op) from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Operation, types from coremltools.converters.mil.mil.input_type import (DefaultInputs, InputSpec, TensorInputType) from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.types.symbolic import is_symbolic from coremltools.converters.mil.testing_reqs import backends, compute_units from coremltools.converters.mil.testing_utils import random_gen tf = pytest.importorskip("tensorflow") class TestCustomMatMul: # Define SSA Custom Op for Sparse MatMul # This will map to `custom_op` in SSA with binding information # to bind input spec to the custom implementation @register_op(is_custom_op=True) class custom_sparse_matmul(Operation): # Defining input spec for current op input_spec = InputSpec( x=TensorInputType(type_domain="T"), y=TensorInputType(type_domain="T"), transpose_x=TensorInputType(const=True, optional=True, type_domain=types.bool), transpose_y=TensorInputType(const=True, optional=True, type_domain=types.bool), x_is_sparse=TensorInputType(const=True, optional=True, type_domain=types.bool), y_is_sparse=TensorInputType(const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": (types.fp16, types.fp32), } # Specifying binding for custom op for specifying inputs, # parameters required for creating custom op to be synced with Swift API bindings = { "class_name": "SparseMatMul", "input_order": ["x", "y"], "parameters": ["transpose_x", "transpose_y", "x_is_sparse", "y_is_sparse"], "description": "Custom Sparse MatMul Layer", } def default_inputs(self): return DefaultInputs( transpose_x=False, transpose_y=False, x_is_sparse=False, y_is_sparse=False, ) def type_inference(self): x_type = self.x.dtype x_shape = self.x.shape y_shape = self.y.shape # For illustration purpose, assuming getting valid shape # Ideally, should consider transpose_?, ?_is_sparse parameters into consideration # for computing output shape return types.tensor(x_type, [x_shape[0], y_shape[1]]) # TensorFlow Sparse Matmul Op @register_tf_op def SparseMatMul(context, node): a = context[node.inputs[0]] b = context[node.inputs[1]] transpose_a = node.attr.get("transpose_a", False) transpose_b = node.attr.get("transpose_b", False) a_is_sparse = node.attr.get("a_is_sparse", False) b_is_sparse = node.attr.get("b_is_sparse", False) x = mb.custom_sparse_matmul( x=a, y=b, transpose_x=transpose_a, transpose_y=transpose_b, x_is_sparse=a_is_sparse, y_is_sparse=b_is_sparse, name=node.name, ) context.add(node.name, x) @pytest.mark.parametrize( "compute_unit, backend, transpose_a, transpose_b," "a_is_sparse, b_is_sparse, b_is_const", itertools.product( compute_units, backends, [True, False], [True, False], [True, False], [True, False], [True, False], ), ) def test_tf( self, compute_unit, backend, 
transpose_a, transpose_b, a_is_sparse, b_is_sparse, b_is_const, ): if backend[0] == 'mlprogram': pytest.skip("Custom layer not supported with ML Program backend") rank = 2 input_shape = list(np.random.randint(low=3, high=100, size=1)) * rank if b_is_const: @make_tf_graph([input_shape]) def build_model(x): ref = tf.compat.v1.sparse_matmul( x, random_gen(input_shape), transpose_a=transpose_a, transpose_b=transpose_b, a_is_sparse=a_is_sparse, b_is_sparse=b_is_sparse, ) return ref input_values = [random_gen(input_shape, -1.0, 1.0)] else: @make_tf_graph([input_shape, input_shape]) def build_model(x, y): ref = tf.compat.v1.sparse_matmul( x, y, transpose_a=transpose_a, transpose_b=transpose_b, a_is_sparse=a_is_sparse, b_is_sparse=b_is_sparse, ) return ref input_values = [random_gen(input_shape, -1.0, 1.0), random_gen(input_shape, -1.0, 1.0)] model, inputs, outputs = build_model input_dict = dict(zip(inputs, input_values)) spec, _, _, _, _, _ = TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, frontend_only=True, backend=backend, ) layers = spec.neuralNetwork.layers assert layers[-1].custom is not None, "Expecting a custom layer" assert ( "SparseMatMul" == layers[-1].custom.className ), "Custom Layer class name mismatch" assert ( transpose_a == layers[-1].custom.parameters["transpose_x"].boolValue ), "Incorrect parameter value k" assert ( transpose_b == layers[-1].custom.parameters["transpose_y"].boolValue ), "Incorrect parameter value k" assert ( a_is_sparse == layers[-1].custom.parameters["x_is_sparse"].boolValue ), "Incorrect parameter value k" assert ( b_is_sparse == layers[-1].custom.parameters["y_is_sparse"].boolValue ), "Incorrect parameter value k" assert len(layers) == 2 if b_is_const else len(layers) == 1 class TestCustomTopK: @pytest.fixture(scope="class") def create_custom_TopK(self): # Defining SSA TopK Op @register_op(is_custom_op=True) class custom_topk(Operation): input_spec = InputSpec( x=TensorInputType(type_domain="T"), k=TensorInputType(const=True, optional=True, type_domain=types.int32), axis=TensorInputType(const=True, optional=True, type_domain=types.int32), sorted=TensorInputType(const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": (types.fp16, types.fp32), } bindings = { "class_name": "TopK", "input_order": ["x"], "parameters": ["k", "axis", "sorted"], "description": "Top K Custom layer", } def default_inputs(self): return DefaultInputs( k=1, axis=-1, sorted=False, ) def __init__(self, **kwargs): super(custom_topk, self).__init__(**kwargs) def type_inference(self): x_type = self.x.dtype x_shape = self.x.shape k = self.k.val axis = self.axis.val if not is_symbolic(x_shape[axis]) and k > x_shape[axis]: msg = "K={} is greater than size of the given axis={}" raise ValueError(msg.format(k, axis)) ret_shape = list(x_shape) ret_shape[axis] = k return types.tensor(x_type, ret_shape), types.tensor(types.int32, ret_shape) # Following logging is to ensure testing of TopK implemented in tf converter # default path is testing with appropriate conversion function # Log default tf topk default_tf_topk = _TF_OPS_REGISTRY.get("TopKV2", None) # Override TopK op with override=True flag @register_tf_op(tf_alias=["TopKV2"], override=True) def CustomTopK(context, node): x = context[node.inputs[0]] k = context[node.inputs[1]] sorted = node.attr.get("sorted", False) x = mb.custom_topk(x=x, k=k.val, axis=-1, sorted=sorted, name=node.name) context.add(node.name, x) yield _TF_OPS_REGISTRY["TopKV2"] = default_tf_topk 
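    # The fixture above temporarily overrides the stock "TopKV2" conversion
    # with CustomTopK (which emits the custom_topk op), yields control to the
    # test, and then restores the saved default entry in _TF_OPS_REGISTRY so
    # that subsequent tests use the built-in TopK conversion again.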
@pytest.mark.parametrize( "compute_unit, backend, rank, k", itertools.product( compute_units, backends, [rank for rank in range(1, 4)], [1, 2], ), ) @pytest.mark.usefixtures("create_custom_TopK") def test_tf(self, compute_unit, backend, rank, k): if backend[0] == 'mlprogram': pytest.skip("Custom layer not supported with ML Program backend") input_shape = np.random.randint(low=3, high=6, size=rank) @make_tf_graph([input_shape]) def build_model(x): ref = tf.math.top_k(x, k=k, sorted=True) return ref[1], ref[0] model, inputs, outputs = build_model input_values = [random_gen(input_shape, -1.0, 1.0)] input_dict = dict(zip(inputs, input_values)) spec, _, _, _, _, _ = TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, frontend_only=True, backend=backend, ) layers = spec.neuralNetwork.layers assert layers[-1].custom is not None, "Expecting a custom layer" assert ( "TopK" == layers[-1].custom.className ), "Custom Layer class name mismatch" assert ( k == layers[-1].custom.parameters["k"].intValue ), "Incorrect parameter value k" assert ( layers[-1].custom.parameters["sorted"].boolValue is True ), "Incorrect parameter value for Sorted" ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/test/test_graphs.py0000644000000000000000000000266714672066616027724 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest from coremltools.converters.mil.frontend.tensorflow.test.testing_utils import ( TensorFlowBaseTest, make_tf_graph) from coremltools.converters.mil.testing_reqs import backends, compute_units tf = pytest.importorskip("tensorflow") # TODO (rdar://103050703): Move it to test_ops because it only test tf ops instead of graphs. class TestTFGraphs(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_masked_input(self, compute_unit, backend): input_shape = [4, 10, 8] val = np.random.rand(*input_shape).astype(np.float32) @make_tf_graph([input_shape]) def build_model(input): sliced_input = input[..., 4] mask = tf.where(sliced_input > 0) masked_input = tf.gather_nd(input, mask) return masked_input model, inputs, outputs = build_model input_values = [val] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/test/test_load.py0000644000000000000000000004454514672066616027360 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os import shutil import tempfile import numpy as np import pytest import coremltools as ct import coremltools.converters as converter import coremltools.proto.FeatureTypes_pb2 as ft from coremltools import EnumeratedShapes, ImageType, RangeDim, TensorType from coremltools._deps import _HAS_TF_1, _IS_MACOS, MSG_TF1_NOT_FOUND from coremltools.converters.mil.frontend.tensorflow.converter import TFConverter from coremltools.converters.mil.frontend.tensorflow.test.testing_utils import ( TensorFlowBaseTest, get_tf_keras_io_names, make_tf_graph) from coremltools.converters.mil.testing_reqs import backends from coremltools.converters.mil.testing_utils import random_gen tf = pytest.importorskip("tensorflow") frontend = "tensorflow" class TestTfModelInputsOutputs(TensorFlowBaseTest): def setup(self): self.saved_model_dir = tempfile.mkdtemp() _, self.model_path_h5 = tempfile.mkstemp( suffix=".h5", prefix=self.saved_model_dir ) _, self.model_path_pb = tempfile.mkstemp( suffix=".pb", prefix=self.saved_model_dir ) def teardown(self): if os.path.exists(self.saved_model_dir): shutil.rmtree(self.saved_model_dir) @pytest.mark.parametrize( "backend", backends, ) def test_infer_inputs(self, backend): x_shape = (3, 4, 5) @make_tf_graph([x_shape]) def build_model(x): return tf.nn.relu(x) model, inputs, outputs = build_model if not isinstance(outputs, (tuple, list)): outputs = [outputs] output_names = [j if isinstance(j, str) else j.op.name for j in outputs] mlmodel = converter.convert(model, outputs=output_names, convert_to=backend[0]) assert mlmodel is not None input_values = [random_gen(x_shape, -10.0, 10.0)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf(model, input_dict, outputs) @pytest.mark.parametrize( "backend", backends, ) def test_infer_outputs(self, backend): x_shape = (3, 4, 5) @make_tf_graph([x_shape]) def build_model(x): return tf.nn.relu(x) model, inputs, outputs = build_model input_name = inputs[0] if isinstance(inputs[0], str) else inputs[0].op.name mlmodel = converter.convert( model, inputs=[TensorType(input_name, (3, 4, 5))], convert_to=backend[0] ) assert mlmodel is not None input_values = [random_gen(x_shape, -10.0, 10.0)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf(model, input_dict, outputs) @pytest.mark.parametrize( "backend", backends, ) def test_infer_inputs_and_outputs(self, backend): x_shape = (3, 4, 5) @make_tf_graph([x_shape]) def build_model(x): return tf.nn.relu(x) model, inputs, outputs = build_model mlmodel = converter.convert(model, convert_to=backend[0]) assert mlmodel is not None input_values = [random_gen(x_shape, -10.0, 10.0)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf(model, input_dict, outputs) @pytest.mark.parametrize( "backend", backends, ) def test_extract_sub_model(self, backend): x_shape = (3, 4, 5) y_shape = (3, 4, 5) @make_tf_graph([x_shape, y_shape]) def build_model(x, y): return tf.nn.relu(x), tf.math.add(x, y) model, inputs, outputs = build_model if isinstance(outputs[0], str): first_output_name = outputs[0] else: first_output_name = outputs[0].name.split(":")[0] mlmodel = converter.convert(model, outputs=[first_output_name], convert_to=backend[0]) assert mlmodel is not None @pytest.mark.parametrize( "backend", backends, ) def test_auto_image_nhwc_input_names(self, backend): x_shape = (4, 5, 3) 
if backend[0] == "neuralnetwork" else (1, 4, 5, 3) @make_tf_graph([x_shape]) def build_model(x): return tf.nn.relu(x) model, inputs, outputs = build_model mlmodel = converter.convert(model, inputs=[ImageType()], convert_to=backend[0]) assert mlmodel is not None @pytest.mark.parametrize( "backend", backends, ) def test_auto_image_nchw_input_names(self, backend): x_shape = (3, 4, 5) if backend[0] == "neuralnetwork" else (1, 3, 4, 5) @make_tf_graph([x_shape]) def build_model(x): return tf.nn.relu(x) model, inputs, outputs = build_model mlmodel = converter.convert( model, inputs=[ImageType(channel_first=True)], convert_to=backend[0] ) assert mlmodel is not None @pytest.mark.parametrize( "target", [ct.target.iOS13, ct.target.macOS10_15, ct.target.watchOS6, ct.target.tvOS13], ) def test_invalid_deployment_target_cumsum(self, target): x_shape = (3, 4, 5) @make_tf_graph([x_shape]) def build_model(x): return tf.math.cumsum(x, axis=-1, reverse=False, exclusive=False) model, inputs, outputs = build_model with pytest.raises(ValueError) as e: converter.convert(model, minimum_deployment_target=target) e.match( r"Provided minimum deployment target requires model to be of version 4 but converted model " r"uses following features which are available from version 5 onwards. " r"Please use a higher minimum deployment target to convert. \n 1. Cumsum operation\n" ) @pytest.mark.parametrize( "target", [ct.target.iOS14, ct.target.macOS10_16, ct.target.watchOS7, ct.target.tvOS14], ) def test_valid_deployment_target_cumsum(self, target): x_shape = (3, 4, 5) @make_tf_graph([x_shape]) def build_model(x): return tf.math.cumsum(x, axis=-1, reverse=False, exclusive=False) model, inputs, outputs = build_model # successful conversion converter.convert(model, minimum_deployment_target=target) @pytest.mark.parametrize( "backend", backends, ) def test_invalid_output_names(self, backend): x_shape = (3, 4, 5) @make_tf_graph([x_shape]) def build_model(x): return tf.nn.relu(x) model, inputs, outputs = build_model with pytest.raises(AssertionError) as e: converter.convert( model, source=frontend, outputs=["invalid_name"], convert_to=backend[0] ) e.match(r".* is not in graph") @pytest.mark.parametrize( "backend", backends, ) def test_missing_placeholder_shape(self, backend): x_shape = None # Missing Placeholder shape @make_tf_graph([x_shape]) def build_model(x): return tf.nn.relu(x) model, inputs, outputs = build_model with pytest.raises(ValueError) as e: converter.convert(model, source=frontend, convert_to=backend[0]) e.match(r"Unable to determine the shape of input .*") mlmodel = converter.convert( model, source=frontend, inputs=[ct.TensorType(shape=(1,))], convert_to=backend[0] ) assert mlmodel is not None @pytest.mark.skip(reason="Rank-0 input is not supported") @pytest.mark.parametrize( "backend", backends, ) def test_scalar_placeholder_shape(self, backend): x_shape = () # Scalar Placeholder Shape @make_tf_graph([x_shape]) def build_model(x): return tf.nn.relu(x) model, inputs, outputs = build_model mlmodel = converter.convert(model, source=frontend, convert_to=backend[0]) assert mlmodel is not None input_values = [random_gen(x_shape, -10.0, 10.0)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf(model, input_dict, outputs) @pytest.mark.parametrize( "backend", backends, ) def test_shaping_utils(self, backend): @make_tf_graph([(None, 4, 5)]) def build_flexible_model(x): return tf.nn.relu(x) model, inputs, outputs = build_flexible_model input_name = TFConverter._get_tensor_name(inputs[0]) 
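        # The blocks below exercise three flexible-shape configurations for the
        # same ReLU graph: an input whose shape is taken from the TF placeholder
        # (None, 4, 5) (or an upper-bounded RangeDim for the mlprogram backend),
        # an EnumeratedShapes input restricted to (3, 4, 5) and (4, 4, 5), and a
        # RangeDim(3, 5) input. In-range shapes should predict np.maximum(x, 0);
        # out-of-range shapes are expected to raise RuntimeError when run on macOS.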
output_name = TFConverter._get_tensor_name(outputs[0]) # static-Flexible shape if backend[0] == "neuralnetwork": inputs = [ # Use TF's input shapes (None, 4, 5) TensorType(name=input_name) ] else: inputs = [TensorType(name=input_name, shape=(RangeDim(upper_bound=3), 4, 5))] mlmodel = converter.convert( model, inputs=inputs, outputs=[output_name], convert_to=backend[0] ) assert mlmodel is not None input_values = [random_gen((3, 4, 5), -10.0, 10.0)] input_dict = {input_name: input_values[0]} if _IS_MACOS: ret = mlmodel.predict(input_dict) np.allclose(ret[output_name], np.maximum(input_values[0], 0.0)) # Enumerate shape inputs_shape = [TensorType(input_name, EnumeratedShapes(shapes=[(3, 4, 5), (4, 4, 5)]))] mlmodel = converter.convert( model, inputs=inputs_shape, outputs=[output_name], convert_to=backend[0] ) assert mlmodel is not None input_values = [random_gen((3, 4, 5), -10.0, 10.0)] input_dict = {input_name: input_values[0]} if _IS_MACOS: ret = mlmodel.predict(input_dict) np.allclose(ret[output_name], np.maximum(input_values[0], 0.0)) input_values = [random_gen((4, 4, 5), -10.0, 10.0)] input_dict = {input_name: input_values[0]} if _IS_MACOS: ret = mlmodel.predict(input_dict) np.allclose(ret[output_name], np.maximum(input_values[0], 0.0)) if _IS_MACOS: with pytest.raises(RuntimeError): input_values = [random_gen((5, 4, 5), -10.0, 10.0)] input_dict = {input_name: input_values[0]} ret = mlmodel.predict(input_dict) # Ranged shape inputs_shape = [TensorType(input_name, [RangeDim(3, 5), 4, 5])] mlmodel = converter.convert( model, inputs=inputs_shape, outputs=[output_name], convert_to=backend[0] ) assert mlmodel is not None input_values = [random_gen((3, 4, 5), -10.0, 10.0)] input_dict = {input_name: input_values[0]} if _IS_MACOS: ret = mlmodel.predict(input_dict) np.allclose(ret[output_name], np.maximum(input_values[0], 0.0)) input_values = [random_gen((4, 4, 5), -10.0, 10.0)] input_dict = {input_name: input_values[0]} if _IS_MACOS: ret = mlmodel.predict(input_dict) np.allclose(ret[output_name], np.maximum(input_values[0], 0.0)) if _IS_MACOS: with pytest.raises(RuntimeError): input_values = [random_gen((2, 4, 5), -10.0, 10.0)] input_dict = {input_name: input_values[0]} ret = mlmodel.predict(input_dict) @pytest.mark.parametrize( "backend", backends, ) def test_default_data_types(self, backend): @make_tf_graph([(2, 2)]) def build_model(x): return tf.nn.relu(x) model, inputs, outputs = build_model mlmodel = converter.convert(model, convert_to=backend[0]) assert mlmodel is not None spec = mlmodel.get_spec() # Defaults should be FLOAT32 instead of DOUBLE it = spec.description.input[0].type.multiArrayType.dataType assert it == ft.ArrayFeatureType.ArrayDataType.Value("FLOAT32") ot = spec.description.output[0].type.multiArrayType.dataType assert ot == ft.ArrayFeatureType.ArrayDataType.Value("FLOAT32") @pytest.mark.skipif(not _HAS_TF_1, reason=MSG_TF1_NOT_FOUND) class TestTf1ModelFormats: def setup(self): self.saved_model_dir = tempfile.mkdtemp() _, self.model_path_h5 = tempfile.mkstemp( suffix=".h5", prefix=self.saved_model_dir ) _, self.model_path_pb = tempfile.mkstemp( suffix=".pb", prefix=self.saved_model_dir ) def teardown(self): if os.path.exists(self.saved_model_dir): shutil.rmtree(self.saved_model_dir) @pytest.mark.parametrize( "backend", backends, ) def test_graph_def(self, backend): with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=(3, 4, 5)) out = tf.nn.relu(x) mlmodel = converter.convert( graph, inputs=[TensorType(x.op.name, (3, 4, 5))], outputs=[out.op.name], 
convert_to=backend[0], ) assert mlmodel is not None @pytest.mark.parametrize( "backend", backends, ) def test_graph_def_file(self, backend): with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=(3, 4, 5)) out = tf.nn.relu(x) tf.io.write_graph( graph, self.saved_model_dir, self.model_path_pb, as_text=False ) mlmodel = converter.convert( self.model_path_pb, inputs=[TensorType(x.op.name, (3, 4, 5))], outputs=[out.op.name], convert_to=backend[0], ) assert mlmodel is not None @pytest.mark.parametrize( "backend", backends, ) def test_saved_model_from_simple_save(self, backend): with tf.compat.v1.Session() as sess: x = tf.placeholder(shape=(1, 3, 5), dtype=tf.float32) y = tf.nn.relu(x) inputs = {"x": x} outputs = {"y": y} tf.compat.v1.saved_model.simple_save(sess, self.saved_model_dir, inputs, outputs) mlmodel = converter.convert(self.saved_model_dir, convert_to=backend[0]) assert mlmodel is not None @pytest.mark.parametrize( "backend", backends, ) def test_tf_keras(self, backend): keras_model = tf.keras.Sequential([tf.keras.layers.ReLU(input_shape=(4, 5), batch_size=3)]) input_names, output_names = get_tf_keras_io_names(keras_model) mlmodel = converter.convert( keras_model, inputs=[TensorType(input_names[0], (3, 4, 5))], outputs=["Identity"], source=frontend, convert_to=backend[0], ) assert mlmodel is not None @pytest.mark.parametrize( "backend", backends, ) def test_tf_keras_hdf5_file(self, backend): keras_model = tf.keras.Sequential([tf.keras.layers.ReLU(input_shape=(4, 5), batch_size=3)]) keras_model.save(self.model_path_h5) input_names, output_names = get_tf_keras_io_names(keras_model) mlmodel = converter.convert( self.model_path_h5, inputs=[TensorType(input_names[0], (3, 4, 5))], outputs=["Identity"], source=frontend, convert_to=backend[0], ) assert mlmodel is not None @pytest.mark.parametrize( "backend", backends, ) def test_model_metadata(self, backend): with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=(3, 4, 5)) out = tf.nn.relu(x) mlmodel = converter.convert( graph, inputs=[TensorType(x.op.name, (3, 4, 5))], outputs=[out.op.name], convert_to=backend[0], ) metadata_keys = mlmodel.get_spec().description.metadata.userDefined assert "com.github.apple.coremltools.version" in metadata_keys assert "com.github.apple.coremltools.source" in metadata_keys assert "tensorflow==1." 
in metadata_keys["com.github.apple.coremltools.source"] @pytest.mark.parametrize( "backend", backends, ) def test_invalid_format_none(self, backend): with pytest.raises(NotImplementedError) as e: converter.convert(None, source="tensorflow", convert_to=backend[0]) e.match(r"Expected model format: .* .pb") @pytest.mark.parametrize( "backend", backends, ) def test_invalid_format_invalid_extension(self, backend): _, invalid_filename = tempfile.mkstemp(suffix=".invalid", prefix=self.saved_model_dir) with pytest.raises(NotImplementedError) as e: converter.convert(invalid_filename, source="tensorflow", convert_to=backend[0]) e.match(r"Expected model format: .* .pb") @pytest.mark.parametrize( "backend", backends, ) def test_invalid_converter_source(self, backend): with pytest.raises(ValueError) as e: converter.convert(None, source="invalid", convert_to=backend[0]) expected_msg = r'Unrecognized value of argument "source": .*' e.match(expected_msg) def test_invalid_converter_minimum_deployment_flag(self): with pytest.raises(TypeError) as e: converter.convert( None, source="tensorflow", minimum_deployment_target="iOs14" ) expected_msg = ( "Unrecognized value of argument 'minimum_deployment_target': iOs14. " "It needs to be a member of 'coremltools.target' enumeration" ) e.match(expected_msg) def test_invalid_converter_target(self): with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=(3, 4, 5)) with pytest.raises(NotImplementedError) as e: converter.convert(graph, convert_to="invalid", source="tensorflow") e.match(r"Backend converter .* not implemented") @pytest.mark.parametrize( "backend", backends, ) def test_invalid_format_non_exist(self, backend): non_exist_filename = self.model_path_pb.replace(".pb", "_non_exist.pb") with pytest.raises(ValueError) as e: converter.convert(non_exist_filename, source="tensorflow", convert_to=backend[0]) e.match(r"Input model .* does not exist") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508431.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/test/test_ops.py0000644000000000000000000103362514672066617027241 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import math import os import platform import shutil import tempfile from typing import Optional import numpy as np import pytest from packaging.version import Version import coremltools as ct from coremltools import RangeDim, TensorType from coremltools._deps import _HAS_TF_1, _HAS_TF_2, MSG_TF1_NOT_FOUND, _get_version from coremltools.converters.mil.frontend.tensorflow.test.testing_utils import ( TensorFlowBaseTest, freeze_g, get_tf_node_names, layer_counts, load_tf_pb, make_tf_graph, ) from coremltools.converters.mil.mil import Operation, Program, types from coremltools.converters.mil.testing_reqs import backends, compute_units from coremltools.converters.mil.testing_utils import ( einsum_equations, gen_input_shapes_einsum, random_gen, ) from coremltools.models.utils import _is_macos, _macos_version tf = pytest.importorskip("tensorflow") PREBUILT_TF1_WHEEL_VERSION = "1.15.5" @pytest.mark.skipif(not _HAS_TF_1, reason=MSG_TF1_NOT_FOUND) class TestContribResampler(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, data_warp_shapes", itertools.product( compute_units, backends, [ # Data shape format: (Batch, Hin, Win, C) # Warp shape format: (Batch, Hout, Wout, 2) [(1, 3, 3, 1), (1, 3, 3, 2)], # no size change [(2, 5, 5, 3), (2, 3, 3, 2)], # down-sampling [(3, 6, 6, 1), (3, 8, 8, 2)], # up-sampling [(1, 3, 9, 1), (1, 19, 2)], # rank-3 warp tensor ], ), ) def test( self, compute_unit, backend, data_warp_shapes, ): if backend[0] == "neuralnetwork": pytest.skip("nn backend not supported") data_shape, warp_shape = data_warp_shapes @make_tf_graph([data_shape, warp_shape]) def build_model(x, warp): return tf.contrib.resampler.resampler(data=x, warp=warp) model, inputs, outputs = build_model # warp exceeding input sizes in order to test more padding modes input_values = [ random_gen(data_shape, -100, 100), random_gen(warp_shape, -15, 15), ] input_dict = dict(zip(inputs, input_values)) self.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestDebugging(TensorFlowBaseTest): """ TF converter does not handling debugging nodes, they are expected to be deleted by graph pass before op conversions in Grappler graph pass: debug_stripper. 
""" @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_assert(self, compute_unit, backend): input_shape = (1,) @make_tf_graph([input_shape]) def build_model(x): tf.debugging.Assert(True, [x]) return tf.nn.relu(x) model, inputs, outputs = build_model input_values = [random_gen(input_shape, 0, 1)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_check_numerics(self, compute_unit, backend): input_shape = (1,) @make_tf_graph([input_shape]) def build_model(x): tf.debugging.check_numerics(x, 'check') return tf.nn.relu(x) model, inputs, outputs = build_model input_values = [random_gen(input_shape, 0, 1)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_print(self, compute_unit, backend): input_shape = (1,) @make_tf_graph([input_shape]) def build_model(x): tf.raw_ops.Print(input=x, data=[x], message='[x]') return tf.nn.relu(x) model, inputs, outputs = build_model input_values = [random_gen(input_shape, 0, 1)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestPlaceholderAsOutput(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, backends, [rank for rank in range(6)] ), ) def test(self, compute_unit, backend, rank): if rank == 0: pytest.skip('Rank 0 not supported by CoreML runtime') input_shape = np.random.randint(low=1, high=4, size=rank) @make_tf_graph([input_shape, input_shape]) def build_model(x, y): return x, y, x + 1, x + y model, inputs, outputs = build_model input_values = [random_gen(input_shape, -1, 1), random_gen(input_shape, -1, 1)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestDuplicateOutputs(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, backends, [rank for rank in range(6)] ), ) def test(self, compute_unit, backend, rank): if rank == 0: pytest.skip('Rank 0 not supported by CoreML runtime') input_shape = np.random.randint(low=1, high=4, size=rank) @make_tf_graph([input_shape]) def build_model(x): b = tf.identity(x) c = tf.identity(x) d = b + c return b, c, d model, inputs, outputs = build_model input_values = [random_gen(input_shape, -1, 1), random_gen(input_shape, -1, 1)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestIdentity(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, backends, [rank for rank in range(6)] ), ) def test(self, compute_unit, backend, rank): if rank == 0: pytest.skip('Rank 0 not supported by CoreML runtime') input_shape = np.random.randint(low=1, high=4, size=rank) @make_tf_graph([input_shape]) def build_model(x): return x model, inputs, outputs = build_model input_values = [random_gen(input_shape, -1, 1)] input_dict = dict(zip(inputs, input_values)) 
TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestActivation(TensorFlowBaseTest): @staticmethod def run_compare_tf(model, input_dict, outputs, target_op: Optional[str] = None, **kwargs): """Override compare method for Activation ops tests, as we want to verify the mixed precision support for alpha/beta in IOS17 Activation Ops.""" results = TensorFlowBaseTest.run_compare_tf(model, input_dict, outputs, **kwargs) if target_op and kwargs.get("backend", (None, None))[1] == "fp16": prog: Program = results[1]._mil_program activation_op: Operation = prog.find_ops(op_type=target_op, exactly_one=True)[0] assert activation_op.x.dtype == types.fp16 # Before IOS17, both alpha and input/output are converted to fp16. # After IOS17, alpha is kept as fp32 because it supports mixed precision. expected_alpha_beta_dtype = types.fp16 if kwargs.get("minimum_deployment_target", None) == ct.target.iOS17: expected_alpha_beta_dtype = types.fp32 if hasattr(activation_op, "alpha"): assert activation_op.alpha.dtype == expected_alpha_beta_dtype if hasattr(activation_op, "beta"): assert activation_op.beta.dtype == expected_alpha_beta_dtype return results @pytest.mark.parametrize( "compute_unit, backend, rank, minimum_deployment_target", itertools.product( compute_units, backends, [rank for rank in range(1, 6)], [None, ct.target.iOS17], ), ) def test_elu(self, compute_unit, backend, rank, minimum_deployment_target): input_shape = np.random.randint(low=1, high=4, size=rank) @make_tf_graph([input_shape]) def build_model(x): return tf.nn.elu(x) model, inputs, outputs = build_model input_values = [random_gen(input_shape, -1, 1)] input_dict = dict(zip(inputs, input_values)) self.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, minimum_deployment_target=minimum_deployment_target, target_op="elu", ) @pytest.mark.parametrize( "compute_unit, backend, rank, minimum_deployment_target", itertools.product( compute_units, backends, [rank for rank in range(1, 6)], [None, ct.target.iOS17], ), ) def test_leaky_relu(self, compute_unit, backend, rank, minimum_deployment_target): input_shape = np.random.randint(low=1, high=4, size=rank) @make_tf_graph([input_shape]) def build_model(x): return tf.nn.leaky_relu(x, 0.2) model, inputs, outputs = build_model input_values = [random_gen(input_shape, -1, 1)] input_dict = dict(zip(inputs, input_values)) self.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, minimum_deployment_target=minimum_deployment_target, target_op="leaky_relu", ) @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product(compute_units, backends, [rank for rank in range(1, 6)]), ) def test_relu(self, compute_unit, backend, rank): input_shape = np.random.randint(low=1, high=4, size=rank) @make_tf_graph([input_shape]) def build_model(x): return tf.nn.relu(x) model, inputs, outputs = build_model input_values = [random_gen(input_shape, -10.0, 10)] input_dict = dict(zip(inputs, input_values)) self.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product(compute_units, backends, [rank for rank in range(1, 6)]), ) def test_relu6(self, compute_unit, backend, rank): input_shape = np.random.randint(low=1, high=4, size=rank) @make_tf_graph([input_shape]) def build_model(x): return tf.nn.relu6(x) model, inputs, outputs = build_model input_values = [random_gen(input_shape, -1, 
1)] input_dict = dict(zip(inputs, input_values)) self.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product(compute_units, backends, [rank for rank in range(1, 6)]), ) def test_sigmoid(self, compute_unit, backend, rank): input_shape = np.random.randint(low=1, high=4, size=rank) @make_tf_graph([input_shape]) def build_model(x): return tf.math.sigmoid(x) model, inputs, outputs = build_model input_values = [random_gen(input_shape, -1, 1)] input_dict = dict(zip(inputs, input_values)) self.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product(compute_units, backends, [rank for rank in range(1, 6)]), ) def test_softplus(self, compute_unit, backend, rank): input_shape = np.random.randint(low=1, high=4, size=rank) @make_tf_graph([input_shape]) def build_model(x): return tf.math.softplus(x) model, inputs, outputs = build_model input_values = [random_gen(input_shape, -1, 1)] input_dict = dict(zip(inputs, input_values)) self.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank_and_axes", itertools.product( compute_units, backends, [(rank, axis) for rank in range(1, 6) for axis in range(-1, rank)], ), ) def test_softmax(self, compute_unit, backend, rank_and_axes): rank, axis = rank_and_axes input_shape = np.random.randint(low=1, high=4, size=rank) @make_tf_graph([input_shape]) def build_model(x): return tf.nn.softmax(x, axis=axis) model, inputs, outputs = build_model input_values = [random_gen(input_shape, -1, 1)] input_dict = dict(zip(inputs, input_values)) self.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product(compute_units, backends, [rank for rank in range(1, 6)]), ) def test_softsign(self, compute_unit, backend, rank): input_shape = np.random.randint(low=1, high=4, size=rank) @make_tf_graph([input_shape]) def build_model(x): return tf.math.softsign(x) model, inputs, outputs = build_model input_values = [random_gen(input_shape, -1, 1)] input_dict = dict(zip(inputs, input_values)) self.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank, minimum_deployment_target", itertools.product( compute_units, backends, [rank for rank in range(1, 6)], [None, ct.target.iOS17], ), ) def test_selu(self, compute_unit, backend, rank, minimum_deployment_target): input_shape = np.random.randint(low=1, high=4, size=rank) @make_tf_graph([input_shape]) def build_model(x): return tf.nn.selu(x) model, inputs, outputs = build_model input_values = [random_gen(input_shape, -1.0, 1.0)] input_dict = dict(zip(inputs, input_values)) self.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, minimum_deployment_target=minimum_deployment_target, target_op="elu", ) class TestAddN(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, num_inputs", itertools.product( compute_units, backends, list(range(6)), [1, 3, 9], ), ) def test(self, compute_unit, backend, rank, num_inputs): if rank == 0: pytest.skip('Rank 0 not supported by CoreML runtime') input_shape = np.random.randint(low=1, high=4, size=rank) input_shapes = [input_shape[:] for _ in range(num_inputs)] 
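# tf.raw_ops.AddN sums all of its inputs elementwise and does not broadcast, so every entry in input_shapes above is a copy of the same randomly drawn shape.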
@make_tf_graph(input_shapes) def build_model(*inputs): return tf.raw_ops.AddN(inputs=inputs) model, inputs, outputs = build_model input_values = [random_gen(shape, -1, 1) for shape in input_shapes] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestAddOrdering(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test(self, compute_unit, backend): @make_tf_graph([(2, 3, 4), (2, 3, 4)]) def build_model(x, y): return tf.math.add(x, y) model, inputs, outputs = build_model input_values = [random_gen((2, 3, 4), -1, 1)] * 2 input_dict = dict(zip(inputs, input_values)) spec, _, _, _, _, _ = TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) if backend[0] == "neuralnetwork": nn_spec = spec.neuralNetwork if _HAS_TF_1: input_names = ["Placeholder", "Placeholder_1"] elif _HAS_TF_2: input_names = ["args_0", "args_1"] assert nn_spec.layers[0].input[0] == input_names[0] assert nn_spec.layers[0].input[1] == input_names[1] class TestGelu(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, mode", itertools.product( compute_units, backends, [rank for rank in range(2, 3)], ("tanh_approx", "exact_1", "exact_2", "exact_3") ), ) def test(self, compute_unit, backend, rank, mode): input_shape = np.random.randint(low=1, high=4, size=rank) @make_tf_graph([input_shape]) def build_model_tanh_approx(x): a = 0.5 * ( 1.0 + tf.tanh((math.sqrt(2 / math.pi) * (x + 0.044715 * tf.pow(x, 3)))) ) return a * x @make_tf_graph([input_shape]) def build_model_exact_1(x): return x * (0.5 * (1.0 + tf.math.erf(x / tf.math.sqrt(2.0)))) @make_tf_graph([input_shape]) def build_model_exact_2(x): return 0.5 * (x * (1.0 + tf.math.erf(x / tf.math.sqrt(2.0)))) @make_tf_graph([input_shape]) def build_model_exact_3(x): return (x * 0.5) * (1.0 + tf.math.erf(x / tf.math.sqrt(2.0))) if mode == "tanh_approx": build_model = build_model_tanh_approx elif mode == "exact_1": build_model = build_model_exact_1 elif mode == "exact_2": build_model = build_model_exact_2 elif mode == "exact_3": build_model = build_model_exact_3 else: raise ValueError("Unexpected mode for Gelu layer") model, inputs, outputs = build_model input_values = [random_gen(input_shape, -5, 5)] input_dict = dict(zip(inputs, input_values)) spec, mlmodel, _, _, _, _ = TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) assert TestGelu._op_count_in_mil_program(mlmodel, "gelu") == 1 assert TestGelu._op_count_in_mil_program(mlmodel, "erf") == 0 assert TestGelu._op_count_in_mil_program(mlmodel, "pow") == 0 assert TestGelu._op_count_in_mil_program(mlmodel, "tanh") == 0 class Testlog1p(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, backends, [1, 3, 5] ), ) def test(self, compute_unit, backend, rank): input_shape = np.random.randint(low=1, high=4, size=rank) @make_tf_graph([input_shape]) def build_model(x): return tf.math.log1p(x) model, inputs, outputs = build_model input_values = [random_gen(input_shape, 0.0, 2.0)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, minimum_deployment_target", itertools.product( compute_units, [None, ct.target.iOS17], ), ) def 
test_ios17_mixed_precision(self, compute_unit, minimum_deployment_target): input_shape = np.random.randint(low=1, high=4, size=2) @make_tf_graph([input_shape]) def build_model(x): return tf.math.log1p(x) model, inputs, outputs = build_model input_values = [random_gen(input_shape, 0.0, 2.0)] input_dict = dict(zip(inputs, input_values)) results = TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=("mlprogram", "fp16"), minimum_deployment_target=minimum_deployment_target, ) prog: Program = results[1]._mil_program log_op: Operation = prog.find_ops(op_type="log", exactly_one=True)[0] assert log_op.x.dtype == types.fp16 # Before IOS17, the epsilon param is converted to fp16. # After IOS17, the epsilon param is kept as fp32 because it supports mixed precision. if minimum_deployment_target is not None and minimum_deployment_target >= ct.target.iOS17: expected_epsilon_dtype = "fp32" else: expected_epsilon_dtype = "fp16" assert types.builtin_to_string(log_op.epsilon.dtype) == expected_epsilon_dtype class TestSelect(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, broadcast, dynamic", itertools.product( compute_units, backends, [rank for rank in range(1, 6)], [True, False], [True, False], ), ) def test_select(self, compute_unit, backend, rank, broadcast, dynamic): shape = np.random.randint(low=1, high=4, size=rank) cond_shape = np.array([shape[0]]) if broadcast else shape cond_val = np.random.randint(low=0, high=2, size=cond_shape).astype(bool) a_val = random_gen(shape=shape, rand_min=-1962.0, rand_max=0.0) b_val = random_gen(shape=shape, rand_min=0.0, rand_max=1964.0) if dynamic: cond_shape = [None] * len(cond_shape) + [tf.bool] a_shape = [None] * len(shape) + [tf.float32] b_shape = [None] * len(shape) + [tf.float32] else: cond_shape = cond_shape.tolist() + [tf.bool] a_shape = shape.tolist() + [tf.float32] b_shape = shape.tolist() + [tf.float32] @make_tf_graph([cond_shape, a_shape, b_shape]) def build_model_select(cond, a, b): return tf.raw_ops.Select(condition=cond, x=a, y=b) model, inputs, outputs = build_model_select inputs_dic = dict(zip(inputs, [cond_val, a_val, b_val])) TensorFlowBaseTest.run_compare_tf( model, inputs_dic, outputs, backend=backend, compute_unit=compute_unit, ) class TestWhere(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, backends, [rank for rank in range(1, 6)] ), ) def test_where_1_input(self, compute_unit, backend, rank): shape = np.random.randint(low=1, high=4, size=rank) cond_val = np.random.randint(low=-1, high=2, size=shape).astype(np.float32) @make_tf_graph([shape]) def build_model(condition): return tf.where(condition=condition) model, inputs, outputs = build_model inputs_dic = dict(zip(inputs, [cond_val])) TensorFlowBaseTest.run_compare_tf( model, inputs_dic, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, backends, [rank for rank in range(1, 6)] ), ) def test_where(self, compute_unit, backend, rank): shape = np.random.randint(low=1, high=4, size=rank) cond_val = np.random.randint(low=0, high=2, size=shape).astype(bool) x_val = random_gen(shape=shape, rand_min=-1962.0, rand_max=0.0) y_val = random_gen(shape=shape, rand_min=0.0, rand_max=1964.0) @make_tf_graph([[*shape, tf.bool], shape, shape]) def build_model(condition, x, y): return tf.where(condition=condition, x=x, y=y) model, inputs, outputs = build_model inputs_dic = 
dict(zip(inputs, [cond_val, x_val, y_val])) TensorFlowBaseTest.run_compare_tf( model, inputs_dic, outputs, compute_unit=compute_unit, backend=backend, ) class TestCast(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, dtype", itertools.product( compute_units, backends, list(range(1, 6)), ['int32', 'float64'] ), ) def test(self, compute_unit, backend, rank, dtype): shape = np.random.randint(low=1, high=3, size=rank) if backend[0] == "mlprogram" and dtype == "int32": pytest.xfail("rdar://78630549") @make_tf_graph([shape]) def build_model(x): y = tf.cast(x, dtype=dtype) y = tf.square(y) return y model, inputs, outputs = build_model min_range, max_range = -100, 100 input_values = [random_gen(shape, min_range, max_range)] # When using GPU with neuralnetwork backend, that uses FP16 precision, we make sure that # the input is not too close to its ceiling / floor, # for instance, 24.993 or -13.985 will not be allowed. if compute_unit != ct.ComputeUnit.CPU_ONLY and dtype == "int32": TOR_THRESHOLD = 0.03 value = input_values[0].flatten() for i, v in enumerate(value): while abs(math.ceil(v) - v) < TOR_THRESHOLD or abs(math.floor(v) - v) < TOR_THRESHOLD: v = random_gen((1,), min_range, max_range)[0] value[i] = v value = np.reshape(value, shape) input_values = [value] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestCond(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends,) ) def test_cond_naive(self, compute_unit, backend): if (backend[0] == "mlprogram" and backend[1] == "fp16"): pytest.xfail("rdar://96627246 (ConsTest unittest is failing)") @make_tf_graph([(1,), (1,)]) def build_model(x, y): return tf.cond(tf.constant(True), lambda: x + y, lambda: x * y) model, inputs, outputs = build_model input_values = [ np.array([1], dtype=np.float32), np.array([6], dtype=np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends,) ) def test_cond(self, compute_unit, backend): @make_tf_graph([(1,), (1,)]) def build_model(x, y): z = tf.multiply(x, y) pred = tf.less(tf.math.reduce_mean(x), tf.math.reduce_mean(y)) return tf.cond(pred, lambda: tf.add(x, z), lambda: tf.square(y)) model, inputs, outputs = build_model input_values = [ np.array([1], dtype=np.float32), np.array([2], dtype=np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends,) ) def test_cond_multi_returns(self, compute_unit, backend): @make_tf_graph([(1,), (1,)]) def build_model(x, y): z = tf.multiply(x, y) pred = tf.less(tf.math.reduce_mean(x), tf.math.reduce_mean(y)) def true_fn(): return tf.add(x, z), tf.math.multiply(x, z) def false_fn(): return tf.square(y), tf.sqrt(z) return tf.cond(pred, true_fn, false_fn) model, inputs, outputs = build_model input_values = [ np.array([1], dtype=np.float32), np.array([2], dtype=np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", 
itertools.product(compute_units, backends,) ) def test_cond_with_identity(self, compute_unit, backend): @make_tf_graph([(1,), (1,)]) def build_model(x, y): z = tf.multiply(x, y) pred = tf.less(tf.math.reduce_mean(x), tf.math.reduce_mean(y)) return tf.cond(pred, lambda: z, lambda: tf.square(y)) model, inputs, outputs = build_model input_values = [ np.array([1], dtype=np.float32), np.array([2], dtype=np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends,) ) def test_cond_multi_returns_with_identity(self, compute_unit, backend): @make_tf_graph([(1,), (1,)]) def build_model(x, y): z = tf.multiply(x, y) pred = tf.less(tf.math.reduce_mean(x), tf.math.reduce_mean(y)) def true_fn(): return tf.add(x, z), x def false_fn(): return tf.square(y), z return tf.cond(pred, true_fn, false_fn) model, inputs, outputs = build_model input_values = [ np.array([1], dtype=np.float32), np.array([2], dtype=np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends,) ) def test_cond_nested_0(self, compute_unit, backend): if backend == ("mlprogram", "fp16"): pytest.xfail("rdar://80660074 (Cond mlprogram FP16 tests falling in TF1 converter with numerical errors)") @make_tf_graph([(1,), (1,)]) def build_model(x, y): z = tf.multiply(x, y) t = tf.less(tf.math.reduce_mean(x), tf.math.reduce_mean(y)) f = tf.less(tf.math.reduce_mean(z), tf.math.reduce_mean(y)) inner_cond = tf.cond( f, lambda: tf.pow(x, y), lambda: tf.math.subtract(x, y) ) return tf.cond(t, lambda: inner_cond, lambda: tf.square(y)) model, inputs, outputs = build_model input_values = [ np.array([2], dtype=np.float32), np.array([3], dtype=np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends,) ) def test_cond_nested_1(self, compute_unit, backend): if backend == ("mlprogram", "fp16"): pytest.xfail("rdar://80660074 (Cond mlprogram FP16 tests falling in TF1 converter with numerical errors)") @make_tf_graph([(1,), (1,)]) def build_model(x, y): z = tf.multiply(x, y) t = tf.less(tf.math.reduce_mean(x), tf.math.reduce_mean(y)) f = tf.less(tf.math.reduce_mean(z), tf.math.reduce_mean(y)) cond_1 = tf.cond(f, lambda: tf.pow(x, y), lambda: tf.math.subtract(x, y)) cond_2 = tf.cond(t, lambda: tf.multiply(x, y), lambda: tf.math.mod(x, y)) cond_3 = tf.cond(f, lambda: tf.math.divide(x, y), lambda: cond_2) return tf.cond(t, lambda: cond_1, lambda: cond_3) model, inputs, outputs = build_model input_values = [ np.array([2], dtype=np.float32), np.array([3], dtype=np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestWhileLoop(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_while_loop_with_changing_shape(self, compute_unit, backend): @make_tf_graph([(2, 1), (2, 1)]) def build_model(x, y): c = lambda i, j: tf.less(tf.shape(j)[1], 5) b = lambda i, j: (i, tf.concat([i, j], 
axis=1)) return tf.while_loop(c, b, [x, y], shape_invariants=[x.get_shape(), tf.TensorShape([2, None])]) model, inputs, outputs = build_model input_values = [np.array([[1], [2]], dtype=np.float32), np.array([[1], [2]], dtype=np.float32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_while_loop_no_entry(self, compute_unit, backend): @make_tf_graph([(1,)]) def build_model(x): c = lambda i: tf.greater(tf.math.reduce_mean(i), 5) b = lambda i: i - 1 return tf.while_loop(c, b, [x]) model, inputs, outputs = build_model input_values = [np.array([5], dtype=np.float32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_while_loop_0(self, compute_unit, backend): @make_tf_graph([(1,)]) def build_model(x): c = lambda i: tf.greater(tf.math.reduce_mean(i), 5) b = lambda i: i - 1 return tf.while_loop(c, b, [x]) model, inputs, outputs = build_model input_values = [np.array([10], dtype=np.float32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_while_loop_1(self, compute_unit, backend): @make_tf_graph([(1,), (1,)]) def build_model(x, y): c = lambda i, j: tf.greater(tf.math.reduce_mean(i), tf.math.reduce_mean(j)) b = lambda i, j: (tf.add(i, 1), tf.square(j)) return tf.while_loop(c, b, [x, y]) model, inputs, outputs = build_model input_values = [ np.array([1], dtype=np.float32), np.array([2], dtype=np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_while_loop_2(self, compute_unit, backend): @make_tf_graph([(1,), (1, 2)]) def build_model(x, y): c = lambda i, j: tf.greater(tf.math.reduce_mean(i), 5) b = lambda i, j: (i - 3, j * 2) return tf.while_loop(c, b, [x, y]) model, inputs, outputs = build_model input_values = [ np.array([10], dtype=np.float32), np.array([[2, 3]], dtype=np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_while_loop_3(self, compute_unit, backend): @make_tf_graph([(1,), (1, 2), (1,)]) def build_model(x, y, z): c = lambda i, j, k: tf.greater( tf.math.reduce_mean(i), tf.math.reduce_mean(j) ) b = lambda i, j, k: (i / 3, j ** 2, k - 2) return tf.while_loop(c, b, [x, y, z]) model, inputs, outputs = build_model input_values = [ np.array([10], dtype=np.float32), np.array([[2, 3]], dtype=np.float32), np.array([5], dtype=np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_while_loop_4(self, compute_unit, backend): 
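# Four loop variables of different shapes are carried through the loop: the condition only compares the means of the first two, while the body updates all four (divide, square, subtract, modulo).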
@make_tf_graph([(1,), (1, 2), (1,), (2, 1)]) def build_model(x, y, z, m): c = lambda i, j, k, l: tf.greater( tf.math.reduce_mean(i), tf.math.reduce_mean(j) ) b = lambda i, j, k, l: (i / 3, j ** 2, k - 2, l % 2) return tf.while_loop(c, b, [x, y, z, m]) model, inputs, outputs = build_model input_values = [ np.array([10], dtype=np.float32), np.array([[2, 3]], dtype=np.float32), np.array([5], dtype=np.float32), np.array([[2], [3]], dtype=np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.skipif(_HAS_TF_2, reason="tf.function() error in TF2") @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends,) ) def test_nested_while_body(self, compute_unit, backend): @make_tf_graph([(1,), (1,)]) def build_model(x, y): # The following while loop: # # i, j = 0, 10 # while i < j: # while 2*i < i+2: # i += 1 # i += 2 def cond2(i): return tf.less(2 * tf.math.reduce_mean(i), tf.math.reduce_mean(i + 2)) def body2(i): return i + 1 def cond1(i, j): return tf.less(tf.math.reduce_mean(i), tf.math.reduce_mean(j)) def body1(i, j): new_i = tf.while_loop(cond2, body2, [i]) return new_i + 2, j return tf.while_loop(cond1, body1, [x, y]) model, inputs, outputs = build_model input_values = [ np.array([0], dtype=np.float32), np.array([10], dtype=np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends,) ) def test_nested_while_cond(self, compute_unit, backend): @make_tf_graph([(1,), (1,)]) def build_model(x, y): # The following while loop: # # def cond(i, j): # while 2*i < i+2: # i += 1 # return i < j # # i, j = 0, 10 # while cond(i, j): # i += 2 # j += 1 def cond2(i): return tf.less(2 * tf.math.reduce_mean(i), tf.math.reduce_mean(i + 2)) def body2(i): return i + 1 def cond1(i, j): new_i = tf.while_loop(cond2, body2, [i]) return tf.less(tf.squeeze(new_i), tf.squeeze(j)) def body1(i, j): return i + 2, j + 1 return tf.while_loop(cond1, body1, [x, y]) model, inputs, outputs = build_model input_values = [ np.array([0], dtype=np.float32), np.array([10], dtype=np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestConv(TensorFlowBaseTest): @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "conv_dim", # 1d or 2d conv "padding", "data_format", "HWkHkW", "strides", "dilations", "dynamic_weights", "batch_size", ] ), itertools.product( compute_units, backends, ["conv1d", "conv2d"], ["SAME", "VALID", [[2, 3], [3, 2]]], ["NHWC"], # NCHW not supported by TF. 
[(11, 12, 3, 2), (12, 11, 2, 3)], [(1, 1), (2, 3)], [(1, 1), (2, 3)], [True, False], [1, 3], ), ) def test( self, compute_unit, backend, conv_dim, padding, data_format, HWkHkW, strides, dilations, dynamic_weights, batch_size, ): H, W, kH, kW = HWkHkW N, C_in, C_out = batch_size, 2, 3 if data_format == "NHWC": input_shape = (N, W, C_in) if conv_dim == "conv1d" else (N, H, W, C_in) if isinstance(padding, list): padding = [[0, 0]] + padding + [[0, 0]] if conv_dim == "conv1d": data_format = "NWC" if isinstance(padding, list): # No explicit padding for conv1d in TF return else: # 'NCHW' input_shape = (N, C_in, W) if conv_dim == "conv1d" else (N, C_in, H, W) if isinstance(padding, list): padding = [[0, 0], [0, 0]] + padding if conv_dim == "conv1d": data_format = "NCW" if isinstance(padding, list): # No explicit padding for conv1d in TF return W_shape = (kW, C_in, C_out) if conv_dim == "conv1d" else (kH, kW, C_in, C_out) dilations = dilations[1] if conv_dim == "conv1d" else dilations strides = strides[1] if conv_dim == "conv1d" else strides # We do not support dynamic weight when dilations != 1. if dynamic_weights and dilations == (1, 1): @make_tf_graph([input_shape, W_shape]) def build_model_dynamic_weights(x, W): if conv_dim == "conv1d": conv = tf.nn.conv1d( x, W, stride=strides, padding=padding, dilations=dilations, data_format=data_format, ) else: conv = tf.nn.conv2d( x, W, strides=strides, padding=padding, dilations=dilations, data_format=data_format, ) return conv model, inputs, outputs = build_model_dynamic_weights input_values = [ random_gen(input_shape, -10.0, 10.0), random_gen(W_shape, -1.0, 1.0), ] input_dict = dict(zip(inputs, input_values)) else: @make_tf_graph([input_shape]) def build_model_static_weights(x): W = tf.constant(np.random.rand(*W_shape), tf.float32) if conv_dim == "conv1d": conv = tf.nn.conv1d( x, W, stride=strides, padding=padding, dilations=dilations, data_format=data_format, ) else: conv = tf.nn.conv2d( x, W, strides=strides, padding=padding, dilations=dilations, data_format=data_format, ) return conv model, inputs, outputs = build_model_static_weights input_values = [random_gen(input_shape, -10.0, 10.0)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestConv3d(TensorFlowBaseTest): @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "data_format", "input_size", "kernel_size", "strides", "dilations", "padding_type", "batch_size", ] ), itertools.product( compute_units, # compute_unit backends, ["NDHWC"], # NCDHW not supported by TF. 
[(7, 11, 13), (32, 16, 8)], # input_size [(1, 1, 1), (3, 3, 3), (1, 2, 3)], # kernel_size [(1, 1, 1), (2, 2, 2), (3, 2, 1)], # strides [ (1, 1, 1) ], # , (2, 2, 2), (2, 3, 1)], # dilations: dilations greater than 1 not supported on CPU ["SAME", "VALID"], # padding_type [1, 3], # batch_size ), ) def test_tf( self, compute_unit, backend, data_format, input_size, kernel_size, strides, dilations, padding_type, batch_size, ): C_in = np.random.randint(low=1, high=4) C_out = np.random.randint(low=1, high=(C_in + 1)) input_shape = [batch_size] + list(input_size) + [C_in] weights_shape = list(kernel_size) + [C_in, C_out] # TF1 and TF2 tf.nn.conv3d require dilations and strides to have length 5 or greater, with values of 1 for # indices 0 and 4 (batch and channel in NDHWC format) tf_strides = [1] + list(strides) + [1] tf_dilations = [1] + list(dilations) + [1] @make_tf_graph([input_shape]) def build_model_static_weights(x): W = tf.constant(np.random.rand(*weights_shape), tf.float32) return tf.nn.conv3d( x, W, strides=tf_strides, padding=padding_type, data_format=data_format, dilations=tf_dilations, ) model, inputs, outputs = build_model_static_weights input_values = [random_gen(input_shape, -10.0, 10.0)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, atol=1e-03, # default 1e-04 rtol=2e-03, # default 1e-05 ) class TestDepthwiseConv(TensorFlowBaseTest): @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "padding", "HWkHkW", "strides", "dilations", "dynamic_weights", "batch_size", ] ), itertools.product( compute_units, backends, ["SAME", "VALID"], [(11, 12, 3, 2), (12, 11, 2, 3)], # TF doesn't support non-square strides for depthwise # https://github.com/tensorflow/tensorflow/issues/33005 [(1, 1, 1, 1), (1, 2, 2, 1)], [ (1, 1), (2, 2), ], [True, False], [1, 3], ), ) def test_depthwise_conv( self, compute_unit, backend, padding, HWkHkW, strides, dilations, dynamic_weights, batch_size, ): if backend[0] == "mlprogram" and dilations == (1,1) and dynamic_weights and compute_unit != ct.ComputeUnit.CPU_ONLY: # in this case, there is a numerical mismatch on the GPU MIL backend. The GPU runtime tests are # tracked separately. 
return if np.sum(strides) != len(strides) and np.sum(dilations) != len(dilations): # TF doesn't compute correct output for non-one stride+dilation return H, W, kH, kW = HWkHkW N, C_in, C_out = batch_size, 2, 6 input_shape = (N, H, W, C_in) data_format = "NHWC" assert C_out % C_in == 0 multiplier = int(C_out / C_in) W_shape = (kH, kW, C_in, multiplier) def test_static_W(): W = np.random.rand(*W_shape).astype(np.float32) @make_tf_graph([input_shape]) def build_model_static_weights(x): return tf.nn.depthwise_conv2d( x, W, strides=strides, padding=padding, dilations=dilations, data_format=data_format, ) model, inputs, outputs = build_model_static_weights input_values = [(np.random.rand(*input_shape).astype(np.float32))] input_dict = dict(zip(inputs, input_values)) proto, _, _, _, _, _ = TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) if backend[0] == 'neuralnetwork': assert layer_counts(proto, "reorganizeData") == 0 def test_dynamic_W(): @make_tf_graph([input_shape, W_shape]) def build_model_dynamic_weights(x, W): return tf.nn.depthwise_conv2d( x, W, strides=strides, padding=padding, dilations=dilations, data_format=data_format, ) model, inputs, outputs = build_model_dynamic_weights input_values = [ (np.random.rand(*input_shape).astype(np.float32)), (np.random.rand(*W_shape).astype(np.float32)), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) if backend[0] == "neuralnetwork" and dynamic_weights: pytest.skip("dynamic conv with groups > 1 is not supported on the neuralnetwork backend") # We do not support dynamic weight when dilations != 1. test_dynamic_W() if dynamic_weights and dilations == (1, 1) else test_static_W() class TestSeparableConv(TensorFlowBaseTest): @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "padding", "HWkHkW", "strides", "dilations", "dynamic_weights", "batch_size", ] ), itertools.product( compute_units, backends, ["SAME", "VALID"], [(11, 12, 3, 2), (12, 11, 2, 3)], [(1, 1, 1, 1), (1, 2, 2, 1)], [(1, 1), (2, 2)], [True, False], [1, 3], ), ) def test_separable_conv( self, compute_unit, backend, padding, HWkHkW, strides, dilations, dynamic_weights, batch_size, ): if backend[0] == "mlprogram" and dilations == (1,1) and compute_unit != ct.ComputeUnit.CPU_ONLY: msg = "In this case, there is a numerical mismatch on the GPU MIL backend. The GPU runtime tests are tracked separately." 
pytest.skip(msg) H, W, kH, kW = HWkHkW N, C_in, C_out = batch_size, 2, 6 input_shape = (N, H, W, C_in) data_format = "NHWC" assert C_out % C_in == 0 multiplier = int(C_out / C_in) depthwise_filter_shape = (kH, kW, C_in, multiplier) pointwise_filter_shape = [1, 1, multiplier * C_in, C_out] if dilations != (1, 1): strides = (1, 1, 1, 1) def test_dynamic_W(): @make_tf_graph( [input_shape, depthwise_filter_shape, pointwise_filter_shape] ) def build_model_dynamic_weights(x, depthwise_filter, pointwise_filter): return tf.nn.separable_conv2d( x, depthwise_filter, pointwise_filter, strides=strides, padding=padding, dilations=dilations, data_format=data_format, ) model, inputs, outputs = build_model_dynamic_weights input_values = [ (np.random.rand(*input_shape).astype(np.float32)), (np.random.rand(*depthwise_filter_shape).astype(np.float32)), (np.random.rand(*pointwise_filter_shape).astype(np.float32)), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) def test_static_W(): depthwise_filter = np.random.rand(*depthwise_filter_shape).astype( np.float32 ) pointwise_filter = np.random.rand(*pointwise_filter_shape).astype( np.float32 ) @make_tf_graph([input_shape]) def build_model_static_weights(x): return tf.nn.separable_conv2d( x, depthwise_filter, pointwise_filter, strides=strides, padding=padding, dilations=dilations, data_format=data_format, ) model, inputs, outputs = build_model_static_weights input_values = [(np.random.rand(*input_shape).astype(np.float32))] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) test_static_W() if not any([True if d > 1 else False for d in dilations]): if backend[0] == "neuralnetwork": pytest.skip("dynamic conv with groups > 1 is not supported on the neuralnetwork backend") test_dynamic_W() class TestConvTranspose(TensorFlowBaseTest): @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "conv_dim", # 1d or 2d conv "padding", "data_format", "HWkHkW", "strides", "dilations", "dynamic", ] ), itertools.product( compute_units, backends, ["conv1d", "conv2d"], ["SAME", "VALID"], ["NHWC"], # NCHW not supported by TF [(12, 10, 2, 2), (4, 2, 2, 3), (7, 5, 3, 3)], [(1, 1), (1, 2)], [(1, 1)], # Dilation > 1 not supported by TF [True, False], ), ) def test_conv_transpose( self, compute_unit, backend, conv_dim, padding, data_format, HWkHkW, strides, dilations, dynamic, ): H, W, kH, kW = HWkHkW N, C_in, C_out = 1, 1, 2 if data_format == "NHWC": input_shape = (N, W, C_in) if conv_dim == "conv1d" else (N, H, W, C_in) if conv_dim == "conv1d": data_format = "NWC" else: # 'NCHW' pass w_shape = (kW, C_out, C_in) if conv_dim == "conv1d" else (kH, kW, C_out, C_in) # dynamic input shape tf_input_shape = list(input_shape) if dynamic: if data_format == "NHWC": tf_input_shape[1] = None tf_input_shape[2] = None elif data_format == "NWC": tf_input_shape[1] = None @make_tf_graph([tf_input_shape]) def build_model(x): Weight = tf.constant(np.random.rand(*w_shape), tf.float32) # get the dynamic height and width if dynamic: shape = tf.shape(x) if data_format == "NHWC": H, W = shape[1], shape[2] elif data_format == "NWC": W = shape[1] else: H, W = HWkHkW[:2] kH, kW = HWkHkW[2:] is_conv_2d = conv_dim == "conv2d" # compute the output shape, in both static / dynamic cases if padding == "SAME": oW = W * strides[1] if is_conv_2d: oH = H * strides[0] elif 
padding == "VALID": oW = (W - 1) * strides[1] + (kW - 1) * dilations[1] + 1 if is_conv_2d: oH = (H - 1) * strides[0] + (kH - 1) * dilations[0] + 1 if data_format == "NHWC": output_shape = [N, oH, oW, C_out] elif data_format == "NWC": output_shape = [N, oW, C_out] if conv_dim == "conv1d": return tf.nn.conv1d_transpose( x, Weight, output_shape=output_shape, strides=strides[1], padding=padding, dilations=dilations[1], data_format=data_format, ) elif conv_dim == "conv2d": return tf.nn.conv2d_transpose( x, Weight, output_shape=output_shape, strides=strides, padding=padding, dilations=dilations, data_format=data_format, ) model, inputs, outputs = build_model input_values = [(np.random.rand(*input_shape).astype(np.float32))] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "padding", "data_format", "DHWkDkHkW", "strides", "dilations", "dynamic", ] ), itertools.product( compute_units, backends, [ "SAME", "VALID" ], ["NDHWC"], [ (10, 12, 14, 2, 3, 5), (4, 6, 8, 2, 3, 1), (6, 8, 10, 3, 3, 3), (5, 7, 9, 2, 4, 2), ], [(1, 1, 1), (1, 2, 3)], [(1, 1, 1)], # Dilation > 1 not supported by TF [True, False], ), ) def test_conv3d_transpose( self, compute_unit, backend, padding, data_format, DHWkDkHkW, strides, dilations, dynamic, ): if _macos_version() < (12, 0) and strides == (1, 2, 3) and padding == "VALID": # Behavior changed in macOS 12 return D, H, W, kD, kH, kW = DHWkDkHkW N, C_in, C_out = 2, 1, 2 if data_format == "NDHWC": input_shape = (N, D, H, W, C_in) else: # 'NCDHW' pass tf_input_shape = list(input_shape) if dynamic: if data_format == "NDHWC": tf_input_shape[1] = None tf_input_shape[2] = None tf_input_shape[3] = None else: pass w_shape = (kD, kH, kW, C_out, C_in) @make_tf_graph([tf_input_shape]) def build_model(x): weight = tf.constant(np.random.rand(*w_shape), tf.float32) # get the depth, height and width if dynamic: shape = tf.shape(x) if data_format == "NDHWC": D, H, W = shape[1], shape[2], shape[3] else: pass else: D, H, W = DHWkDkHkW[:3] kD, kH, kW = DHWkDkHkW[3:] # compute the output shape if padding == "SAME": oD = D * strides[0] oH = H * strides[1] oW = W * strides[2] else: oD = (D - 1) * strides[0] + (kD - 1) * dilations[0] + 1 oH = (H - 1) * strides[1] + (kH - 1) * dilations[1] + 1 oW = (W - 1) * strides[2] + (kW - 1) * dilations[2] + 1 if data_format == "NDHWC": output_shape = [N, oD, oH, oW, C_out] else: pass return tf.nn.conv3d_transpose( x, weight, output_shape=output_shape, strides=strides, padding=padding, dilations=dilations, data_format=data_format, ) model, inputs, outputs = build_model input_values = [(np.random.rand(*input_shape).astype(np.float32))] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestElementWiseBinary(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, tf_op, broadcast_case", itertools.product( compute_units, backends, [0, 1, 2, 3, 4], [ tf.math.add, tf.math.floordiv, tf.math.floormod, tf.math.maximum, tf.math.minimum, tf.math.mod, tf.math.multiply, tf.math.pow, tf.math.truediv, tf.math.subtract, tf.math.squared_difference, ], [0, 1, 2, 3] ), ) def test_binary_math(self, compute_unit, backend, rank, tf_op, broadcast_case): if rank == 0 or broadcast_case == 0: pytest.skip("Rank-0 input is not supported") x_shape = y_shape = list(np.random.randint(low=2, 
high=4, size=rank)) # test broadcasting # 0 -> broadcast with one of the inputs is a 0-D tensor (scalar) # 1 -> broadcast with same rank, some of dimensions are size 1 # 2 -> broadcast with different rank, extra dimension with size 1 # 3 -> no broadcast, same type for both inputs if broadcast_case == 0: y_shape = [] elif broadcast_case == 1: y_shape = [1 if np.random.randint(2) == 0 else d for d in y_shape] elif broadcast_case == 2: y_shape = [1] + y_shape # randomly swap x and y if np.random.randint(2) == 0: x_shape, y_shape = y_shape, x_shape # lower precision input data for non-CPU tests dtype = np.float32 if compute_unit == ct.ComputeUnit.CPU_ONLY else np.float16 if tf_op in {tf.math.add, tf.math.subtract, tf.math.multiply}: x_val = random_gen(x_shape, -100, 100, dtype=dtype).astype(np.float32) y_val = random_gen(y_shape, -100, 100, dtype=dtype).astype(np.float32) elif tf_op in {tf.math.truediv, tf.math.floordiv, tf.math.floormod, tf.math.mod}: x_val = random_gen(x_shape, -100, 100, dtype=dtype).astype(np.float32) y_val = random_gen(y_shape, 1, 20, dtype=dtype).astype(np.float32) elif tf_op in {tf.math.maximum, tf.math.minimum}: x_val = random_gen(x_shape, -10, 10, dtype=dtype).astype(np.float32) y_val = random_gen(y_shape, -10, 10, dtype=dtype).astype(np.float32) elif tf_op in {tf.math.pow, tf.math.squared_difference}: x_val = random_gen(x_shape, -5, 5, dtype=np.int32).astype(np.float32) y_val = random_gen(y_shape, -5, 5, dtype=np.int32).astype(np.float32) else: raise NotImplementedError("input values needs to be defined") @make_tf_graph([x_shape, y_shape]) def build_model(x, y): return tf_op(x, y) model, inputs, outputs = build_model input_values = [x_val, y_val] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend, rank, tf_op, broadcast_case", itertools.product( compute_units, backends, [0, 1, 2, 3, 4], [ tf.equal, tf.not_equal, tf.greater, tf.greater_equal, tf.less, tf.less_equal, ], [0, 1, 2, 3], ), ) def test_binary_compare(self, compute_unit, backend, rank, tf_op, broadcast_case): if rank == 0 or broadcast_case == 0: pytest.skip("Rank-0 input is not supported") x_shape = y_shape = list(np.random.randint(low=2, high=4, size=rank)) # test broadcasting # 0 -> broadcast with one of the inputs is a 0-D tensor (scalar) # 1 -> broadcast with same rank, some of dimensions are size 1 # 2 -> broadcast with different rank, extra dimension with size 1 # 3 -> no broadcast, same type for both inputs if broadcast_case == 0: y_shape = [] elif broadcast_case == 1: y_shape = [1 if np.random.randint(2) == 0 else d for d in y_shape] elif broadcast_case == 2: y_shape = [1] + y_shape # randomly swap x and y if np.random.randint(2) == 0: x_shape, y_shape = y_shape, x_shape # lower precision input data for non-CPU tests dtype = np.float32 if compute_unit == ct.ComputeUnit.CPU_ONLY else np.float16 @make_tf_graph([x_shape, y_shape]) def build_model(x, y): return tf_op(x, y) model, inputs, outputs = build_model input_values = [ random_gen(x_shape, -5, 3, dtype=dtype).astype(np.float32), random_gen(y_shape, -5, 3, dtype=dtype).astype(np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank, tf_op, broadcast_case", itertools.product( compute_units, backends, [0, 1, 2, 3, 4], [ 
tf.math.logical_and, tf.math.logical_or, tf.math.logical_xor, ], [0, 1, 2, 3], ), ) def test_binary_logical(self, compute_unit, backend, rank, tf_op, broadcast_case): if rank == 0 or broadcast_case == 0: pytest.skip("Rank-0 input is not supported") x_shape = y_shape = list(np.random.randint(low=2, high=4, size=rank)) # test broadcasting # 0 -> broadcast with one of the inputs is a 0-D tensor (scalar) # 1 -> broadcast with same rank, some of dimensions are size 1 # 2 -> broadcast with different rank, extra dimension with size 1 # 3 -> no broadcast, same type for both inputs if broadcast_case == 0: y_shape = [] elif broadcast_case == 1: y_shape = [1 if np.random.randint(2) == 0 else d for d in y_shape] elif broadcast_case == 2: y_shape = [1] + y_shape # randomly swap x and y if np.random.randint(2) == 0: x_shape, y_shape = y_shape, x_shape @make_tf_graph([x_shape + [tf.bool], y_shape + [tf.bool]]) def build_model(x, y): return tf_op(x, y) model, inputs, outputs = build_model input_values = [ random_gen(x_shape, 0, 2, dtype=np.int32).astype(bool), random_gen(y_shape, 0, 2, dtype=np.int32).astype(bool), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestCross(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, backends, [2, 3, 4], ) ) def test(self, compute_unit, backend, rank): input_shape = list(np.random.randint(low=2, high=4, size=rank)) + [3] input_shapes = [input_shape, input_shape] @make_tf_graph(input_shapes) def build_model(x, y): return tf.linalg.cross(x, y) model, inputs, outputs = build_model input_values = [random_gen(shape, -1, 1) for shape in input_shapes] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestEinsum(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, equation, reverse_input_order", itertools.product( compute_units, backends, einsum_equations, [False, True], ) ) def test(self, compute_unit, backend, equation, reverse_input_order): input_shapes, _ = gen_input_shapes_einsum(equation, False, backend) if _HAS_TF_1: if len(set(input_shapes[0])) < len(input_shapes[0]) or len(set(input_shapes[1])) < len(input_shapes[1]): pytest.skip("tf1 does not support diagonal cases") if reverse_input_order: input_output_strings = equation.split('->') input_strings = input_output_strings[0].split(',') equation = input_strings[1] + ',' + input_strings[0] + '->' + input_output_strings[1] input_shapes = [input_shapes[1], input_shapes[0]] @make_tf_graph(input_shapes) def build_model(x, y): return tf.einsum(equation, x, y) model, inputs, outputs = build_model input_values = [ random_gen(input_shapes[0], -1, 1), random_gen(input_shapes[1], -1, 1), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestElementWiseUnary(TensorFlowBaseTest): _FP16_UNSUPPORTED = {'acos', 'asin', 'atan', 'atanh', 'cosh', 'sinh'} @pytest.mark.parametrize( "compute_unit, backend, rank, mode", itertools.product( compute_units, backends, [1, 2, 5], [ "abs", "acos", "asin", "atan", "atanh", "cast", "ceil", "clip", "cos", "cosh", "erf", "exp", "floor", "inverse", "log", "negative", "round", "rsqrt", "sign", "sin", "sinh", "sqrt", "square", "tan", "tanh", ], ), ) def test_unary(self, compute_unit, backend, 
rank, mode): _PREBUILD_WHEEL_SEGFAULTING_MODE = ["acos", "asin", "atan", "atanh", "cosh", "sinh"] if compute_unit != ct.ComputeUnit.CPU_ONLY and mode in self._FP16_UNSUPPORTED: return if _get_version(tf.__version__) == Version(PREBUILT_TF1_WHEEL_VERSION): if mode in _PREBUILD_WHEEL_SEGFAULTING_MODE: # we should re-enable these tests after this radar rdar://100735561 ([CI] Build a more stable TF1 Rosetta wheel for the lightning CI) is fixed pytest.skip("Prebuilt wheel segfaulting on several functions.") if _macos_version() < (13, 0): if backend == ("mlprogram", "fp16") and _is_macos(): pytest.skip("Requires macOS13 or greater") elif compute_unit != ct.ComputeUnit.CPU_ONLY: pytest.skip("GPU issue fixed in iOS16/macOS13") atol, rtol = 1e-4, 1e-5 input_shape = np.random.randint(low=2, high=4, size=rank) if backend == ("mlprogram", "fp16") and mode != "clip": # For the clip mode with tf.float16 as input, it seems like the tf graph is producing wrong results # It looks like a tensorflow bug, tracked by this radar: # rdar://96850184 (Tensor clip_by_value is producing wrong numerical outputs with tf.float16 type input) dtype = np.float16 tf_dtype = tf.float16 else: dtype = np.float32 tf_dtype = tf.float32 def cast_func(x): return tf.cast(x, dtype=tf.int32) def clip_func(x): return tf.clip_by_value(x, clip_value_min=0.0, clip_value_max=5.0) def _get_test(test_mode): if test_mode == "abs": res = tf.abs val = random_gen(input_shape, rand_min=-1, rand_max=1) elif test_mode == "acos": res = tf.acos val = random_gen(input_shape, rand_min=-1, rand_max=1) elif test_mode == "asin": res = tf.asin val = random_gen(input_shape, rand_min=-1, rand_max=1) elif test_mode == "atan": res = tf.atan val = random_gen(input_shape, rand_min=-100, rand_max=100) elif test_mode == "atanh": res = tf.atanh val = random_gen(input_shape, rand_min=-0.9, rand_max=0.9) elif test_mode == "cast": eps_from_int = 0.0 if compute_unit != ct.ComputeUnit.CPU_ONLY: eps_from_int = 0.1 res = cast_func val = random_gen( input_shape, rand_min=-10, rand_max=10, eps_from_int=eps_from_int, dtype=dtype, ) elif test_mode == "ceil": res = tf.math.ceil eps_from_int = 0.0 if compute_unit != ct.ComputeUnit.CPU_ONLY: eps_from_int = 0.1 val = random_gen( input_shape, rand_min=-100, rand_max=100, eps_from_int=eps_from_int, dtype=dtype, ) elif test_mode == "clip": if compute_unit != ct.ComputeUnit.CPU_ONLY: return None, None # clip does not support float16 res = clip_func val = random_gen(input_shape, rand_min=-5, rand_max=10) elif test_mode == "cos": res = tf.cos rand_range = 1000 if compute_unit != ct.ComputeUnit.CPU_ONLY: rand_range = 10 val = random_gen(input_shape, rand_min=-rand_range, rand_max=rand_range) elif test_mode == "cosh": res = tf.cosh val = random_gen(input_shape, rand_min=-4, rand_max=4) elif test_mode == "erf": res = tf.math.erf val = random_gen(input_shape, rand_min=1, rand_max=6) elif test_mode == "exp": if compute_unit != ct.ComputeUnit.CPU_ONLY: # We skip GPU here, since exp(1) already differs in backend. 
return None, None res = tf.exp val = random_gen(input_shape, rand_min=-4, rand_max=4) elif test_mode == "floor": res = tf.floor eps_from_int = 0.0 if compute_unit != ct.ComputeUnit.CPU_ONLY: eps_from_int = 0.1 val = random_gen( input_shape, rand_min=-100, rand_max=100, eps_from_int=eps_from_int, dtype=dtype, ) elif test_mode == "inverse": res = tf.math.reciprocal val = random_gen(input_shape, rand_min=0.1, rand_max=10) elif test_mode == "log": res = tf.math.log val = random_gen(input_shape, rand_min=0.2, rand_max=1000) elif test_mode == "negative": res = tf.math.negative val = random_gen(input_shape, rand_min=-100.0, rand_max=100.0) elif test_mode == "round": res = tf.round val = random_gen( input_shape, rand_min=-1000, rand_max=1000, dtype=dtype ) elif test_mode == "rsqrt": res = tf.math.rsqrt val = random_gen(input_shape, rand_min=0.5, rand_max=1000) elif test_mode == "sign": res = tf.sign val = random_gen(input_shape, rand_min=-5, rand_max=5) elif test_mode == "sin": res = tf.sin rand_range = 1000 if compute_unit != ct.ComputeUnit.CPU_ONLY: rand_range = 10 val = random_gen(input_shape, rand_min=-rand_range, rand_max=rand_range) elif test_mode == "sinh": res = tf.sinh val = random_gen(input_shape, rand_min=-10, rand_max=10) elif test_mode == "sqrt": res = tf.sqrt val = random_gen(input_shape, rand_min=0.5, rand_max=1000) elif test_mode == "square": res = tf.math.square val = random_gen(input_shape, rand_min=-5, rand_max=5) elif test_mode == "tan": res = tf.tan val = random_gen(input_shape, rand_min=-1000, rand_max=1000) elif test_mode == "tanh": res = tf.tanh val = random_gen(input_shape, rand_min=-1000, rand_max=1000) return res, val func, input_val = _get_test(mode) if func is None: return input_type = list(input_shape) + [tf_dtype] @make_tf_graph([input_type]) def build_model(x): return func(x) model, inputs, outputs = build_model input_dict = dict(zip(inputs, [input_val.astype(dtype)])) if mode == "inverse" or mode == "rsqrt": atol, rtol = 1e-2, 1e-3 TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, atol=atol, rtol=rtol, minimum_deployment_target=ct.target.iOS16 if backend == ("mlprogram", "fp16") else None, ) class TestImageResizing(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, input_shape, target_shape, align_corners, half_pixel_centers", itertools.product( compute_units, backends, [(1, 10, 20, 1), (2, 5, 1, 3)], [(25, 30), (2, 20)], [True, False], [True, False], ), ) def test_resize_bilinear( self, compute_unit, backend, input_shape, target_shape, align_corners, half_pixel_centers, ): if half_pixel_centers and align_corners: return @make_tf_graph([input_shape]) def build_model(x): return tf.raw_ops.ResizeBilinear( images=x, size=target_shape, half_pixel_centers=half_pixel_centers, align_corners=align_corners, ) model, inputs, outputs = build_model input_values = [random_gen(input_shape, -100, 100)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, input_shape, scale_factor, align_corners, half_pixel_centers", itertools.product( compute_units, backends, [(1, 10, 20, 1), (2, 5, 2, 3)], [(2, 3),], [True, False], [True, False], ), ) def test_ios16_resize_bilinear_dynamic_shape_by_upsample_bilinear( self, compute_unit, backend, input_shape, scale_factor, align_corners, half_pixel_centers, ): """ Since iOS16, dynamic shape is supported only if the 
output_shape comes from a pattern of ``input_shape * (h_scale, w_scale)``, which will be lowered to `upsample_bilinear` MIL op. """ if backend[0] == "neuralnetwork" or ct.utils._macos_version() < (13, 0): pytest.skip("half_pixel_centers only support for iOS16 upsample_bilinear layer") if half_pixel_centers and align_corners: pytest.skip("half_pixel_centers and align_corners cannot be both True") batch_dim, _, _, channel = input_shape h_factor, w_factor = scale_factor @make_tf_graph([(batch_dim, None, None, channel, tf.float32)]) def build_model(x): input_shape = tf.shape(x) target_shape = tf.math.multiply(input_shape[1:3], (h_factor, w_factor)) return tf.raw_ops.ResizeBilinear( images=x, size=target_shape, half_pixel_centers=half_pixel_centers, align_corners=align_corners, ) model, inputs, outputs = build_model input_values = [random_gen(input_shape, -1, 1)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, minimum_deployment_target=ct.target.iOS16, ) @pytest.mark.parametrize( "compute_unit, backend, input_shape, target_shape, align_corners", itertools.product( compute_units, backends, [(1, 10, 20, 1), (2, 5, 2, 3)], [(20, 60)], [True, False], ), ) def test_ios17_resize_bilinear_dynamic_shape( self, compute_unit, backend, input_shape, target_shape, align_corners, ): """ Since iOS17, dynamic shape is supported by lowering to `resize` MIL op. """ batch_dim, _, _, channel = input_shape @make_tf_graph([(batch_dim, None, None, channel, tf.float32), (2, tf.int32)]) def build_model(x, size): return tf.raw_ops.ResizeBilinear( images=x, size=size, half_pixel_centers=False, align_corners=align_corners, ) model, inputs, outputs = build_model input_values = [random_gen(input_shape, -1, 1), np.array(target_shape, dtype=np.int32)] input_dict = dict(zip(inputs, input_values)) # Before iOS17, the dynamic shape will error out. with pytest.raises( ValueError, match="the second input, which is the output size, must be known statically. " "Consider setting minimum_deployment_target to iOS17 during conversion.", ): TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) # Since iOS17, the dynamic shape will be handled correctly. 
TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, minimum_deployment_target=ct.target.iOS17, ) @pytest.mark.parametrize( "compute_unit, backend, input_shape, upsample_factor, data_format", itertools.product( compute_units, backends, [(1, 1, 1, 3), (1, 10, 5, 3)], [(1, 2), (4, 3)], ["channels_last", "channels_first"], ), ) def test_upsampling_2d( self, compute_unit, backend, input_shape, upsample_factor, data_format ): if data_format == "channels_last": input_shape = ( input_shape[0], input_shape[2], input_shape[3], input_shape[1], ) @make_tf_graph([input_shape]) def build_model(x): return tf.keras.layers.UpSampling2D( size=upsample_factor, data_format=data_format, interpolation="nearest" )(x) model, inputs, outputs = build_model input_values = [random_gen(input_shape, -100, 100)] input_dict = dict(zip(inputs, input_values)) spec = TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, )[0] # also check if the scale factor are integers if backend[0] == 'neuralnetwork': for layer in spec.neuralNetwork.layers: if layer.WhichOneof('layer') == "upsample": assert len(layer.upsample.fractionalScalingFactor) == 0 @pytest.mark.parametrize( "compute_unit, backend, input_shape, target_shape", itertools.product( compute_units, backends, [(1, 10, 20, 1), (2, 5, 2, 3)], [(20, 60)], ), ) def test_ios17_resize_nearest_neighbor_dynamic_shape( self, compute_unit, backend, input_shape, target_shape, ): """ Since iOS17, dynamic shape is supported by lowering to `resize` MIL op. """ batch_dim, _, _, channel = input_shape @make_tf_graph([(batch_dim, None, None, channel, tf.float32), (2, tf.int32)]) def build_model(x, size): return tf.raw_ops.ResizeNearestNeighbor( images=x, size=size, half_pixel_centers=True, align_corners=False, ) model, inputs, outputs = build_model input_values = [random_gen(input_shape, -1, 1), np.array(target_shape, dtype=np.int32)] input_dict = dict(zip(inputs, input_values)) # Before iOS17, the dynamic shape will error out. with pytest.raises( ValueError, match="Cannot determine the scale factor for the resize layer. " "Please make sure the target size is known statically, or " "use mul op to get the target size. If the target size has to be dynamic, please" "set minimum_deployment_target to iOS17 during conversion.", ): TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, minimum_deployment_target=ct.target.iOS17, ) @pytest.mark.parametrize( "compute_unit, backend, input_shape, num_of_crops, crop_size, method, dynamic, " "extrapolation_value, minimum_deployment_target", itertools.product( compute_units, backends, [(1, 64, 64, 1)], [1, 3, 5], [(2, 2), (1, 1), (4, 4), (128, 128)], ["bilinear"], [False, True], [0.0, 1.0], [None, ct.target.iOS17], ), ) def test_crop_and_resize( self, compute_unit, backend, input_shape, num_of_crops, crop_size, method, dynamic, extrapolation_value, minimum_deployment_target, ): if extrapolation_value != 0.0: if minimum_deployment_target is None or minimum_deployment_target < ct.target.iOS16: pytest.skip( "extrapolation_value (corresponds to `pad_value` in MIL crop_resize op) only " "supported in IOS16+." 
) # rdar://98749492 (crop_resize is unstable for cropping out of bound setting in fp16) if backend[0] == "mlprogram": backend = ("mlprogram", "fp32") # TODO(rdar://98749492): Once resolved, set crop_bias = 0.5 in order to test the crop outside the image crop_bias = 0.0 input = np.random.randn(*input_shape).astype(np.float32) boxes = np.random.uniform(size=(num_of_crops, 4)).astype(np.float32) + crop_bias box_indices = np.random.randint( size=(num_of_crops,), low=0, high=input_shape[0] ).astype(np.int32) def test_static(): @make_tf_graph([input_shape]) def build_model(x): return tf.raw_ops.CropAndResize( image=x, boxes=boxes, box_ind=box_indices, crop_size=crop_size, method=method, extrapolation_value=extrapolation_value, ) model, inputs, outputs = build_model input_values = [input] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, minimum_deployment_target=minimum_deployment_target, ) def test_dynamic(): @make_tf_graph([input_shape, boxes.shape, list(box_indices.shape) + [tf.int32]]) def build_model(x, boxes_pl, box_indices_pl): return tf.raw_ops.CropAndResize( image=x, boxes=boxes_pl, box_ind=box_indices_pl, crop_size=crop_size, method=method, extrapolation_value=extrapolation_value ) model, inputs, outputs = build_model input_values = [input, boxes, box_indices] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, minimum_deployment_target=minimum_deployment_target, ) test_dynamic() if dynamic else test_static() @pytest.mark.parametrize( "compute_unit, backend, width, height, strides, sizes, padding, minimum_deployment_target", itertools.product( compute_units, backends, [1, 3, 5], [2, 7, 12], [(1, 1), (2, 1), (3, 5)], [(1, 1), (1, 2), (5, 4)], ["VALID", "SAME"], [None, ct.target.iOS17], ), ) def test_extract_patches( self, compute_unit, backend, width, height, strides, sizes, padding, minimum_deployment_target, ): # TODO: theoretically, the current extractpatches code handle batch size rather than 1, # but there seems to have a bug in crop_resize when using GPU and batch_size > 1. # We should test batch_size > 1 after the issue is fixed. 
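        # Editor's note (illustrative, not from the original source): with "VALID" padding,
        # extract_image_patches yields
        #     out_dim = floor((in_dim - ksize) / stride) + 1
        # patches along each spatial axis, so the kernel sizes below are clipped to the image
        # height/width to keep the output non-empty. For example, height=7, ksize=5, stride=2
        # gives floor((7 - 5) / 2) + 1 = 2 patch rows.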
# input = np.random.rand(1, height, width, 128).astype(np.float32) if padding == "VALID": size_h = min(sizes[0], height) size_w = min(sizes[1], width) else: size_h = sizes[0] size_w = sizes[1] @make_tf_graph([input.shape]) def build_model(x): return tf.compat.v1.image.extract_image_patches( images=x, ksizes=[1, size_h, size_w, 1], strides=[1, strides[0], strides[1], 1], rates=[1, 1, 1, 1], padding=padding, ) model, inputs, outputs = build_model input_values = [input] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, minimum_deployment_target=minimum_deployment_target, ) class TestLinear(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, dim, transpose_a, transpose_b, use_constant", itertools.product( compute_units, backends, [2, 4, 8], [True, False], [True, False], [True, False], ), ) def test_matmul( self, compute_unit, backend, dim, transpose_a, transpose_b, use_constant ): shape_x = np.array([dim, dim * 2, dim * 4]) shape_y = np.array([dim * 4, dim * 2]) flip = (not transpose_a and transpose_b) or (transpose_a and not transpose_b) shape_y = np.flip(shape_y) if flip else shape_y if not use_constant: @make_tf_graph([shape_x, shape_y]) def build_model(x, y): return tf.linalg.matmul( x, y, transpose_a=transpose_a, transpose_b=transpose_b ) input_values = [ random_gen(shape=shape_x, rand_min=-100, rand_max=100), random_gen(shape=shape_y, rand_min=-1.0, rand_max=1.0), ] else: y = random_gen(shape=shape_y, rand_min=-1.0, rand_max=1.0) @make_tf_graph([shape_x]) def build_model(x): return tf.linalg.matmul( x, y, transpose_a=transpose_a, transpose_b=transpose_b ) input_values = [random_gen(shape=shape_x, rand_min=-100, rand_max=100)] model, inputs, outputs = build_model input_dict = dict(zip(inputs, input_values)) proto, _, _, _, _, _ = TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) for layer in proto.neuralNetwork.layers: if layer.WhichOneof("layer") == "batchedMatmul": wp = layer.batchedMatmul.weights if use_constant: assert len(wp.floatValue) != 0 else: assert len(wp.floatValue) == 0 class TestBatchNormalization(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, shape_mode, epsilon", itertools.product( compute_units, backends, [rank for rank in range(3, 6)], [True, False], [1e-1, 1e-10], ), ) def test_batch_norm(self, compute_unit, backend, rank, shape_mode, epsilon): input_shape = np.random.randint(low=1, high=4, size=rank) if shape_mode: # same shape with 1 for being normalized over attr_shape = list(input_shape) attr_shape[1] = 1 attr_shape[2] = 1 else: # 1D tensor of the same size as channel dimension attr_shape = [list(input_shape)[-1]] @make_tf_graph([input_shape, attr_shape, attr_shape, attr_shape, attr_shape]) def build_model(x, m, v, o, s): return tf.nn.batch_normalization( x, mean=m, variance=v, offset=o, scale=s, variance_epsilon=epsilon ) model, inputs, outputs = build_model input_values = [ random_gen(shape=input_shape, rand_min=-100.0, rand_max=100.0), random_gen(shape=attr_shape, rand_min=-1.0, rand_max=1.0), random_gen(shape=attr_shape, rand_min=0.0, rand_max=10.0), random_gen(shape=attr_shape, rand_min=1.0, rand_max=10.0), random_gen(shape=attr_shape, rand_min=-1.0, rand_max=1.0), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, atol=.2, rtol=1e-4, ) 
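    # Editor's sketch (illustrative only, not part of the original tests):
    # tf.nn.batch_normalization computes
    #     y = scale * (x - mean) / sqrt(variance + epsilon) + offset
    # which is the reference the tolerance check above compares against. A tiny
    # self-contained numpy equivalent, with made-up values:
    #
    #     import numpy as np
    #     x = np.random.rand(2, 3).astype(np.float32)
    #     mean, var, offset, scale, eps = 0.1, 0.5, 0.2, 1.5, 1e-3
    #     y = scale * (x - mean) / np.sqrt(var + eps) + offset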
@pytest.mark.parametrize( "compute_unit, backend, rank, shape_mode, epsilon, scale_after_normalization", itertools.product( compute_units, backends, [rank for rank in range(3, 6)], [True, False], [1e-1, 1e-10], [True, False], ), ) def test_batch_norm_with_global_normalization( self, compute_unit, backend, rank, shape_mode, epsilon, scale_after_normalization, ): input_shape = np.random.randint(low=1, high=4, size=rank) if shape_mode: # same shape with 1 for being normalized over attr_shape = list(input_shape) attr_shape[1] = 1 attr_shape[2] = 1 else: # 1D tensor of the same size as channel dimension attr_shape = [list(input_shape)[-1]] if scale_after_normalization: @make_tf_graph( [input_shape, attr_shape, attr_shape, attr_shape, attr_shape] ) def build_model(x, m, v, b, g): return tf.nn.batch_norm_with_global_normalization( x, mean=m, variance=v, beta=b, gamma=g, variance_epsilon=epsilon, scale_after_normalization=scale_after_normalization, ) else: @make_tf_graph([input_shape, attr_shape, attr_shape, attr_shape]) def build_model(x, m, v, b): return tf.nn.batch_norm_with_global_normalization( x, mean=m, variance=v, beta=b, gamma=None, variance_epsilon=epsilon, scale_after_normalization=scale_after_normalization, ) model, inputs, outputs = build_model input_values = [ random_gen(shape=input_shape, rand_min=-100.0, rand_max=100.0), random_gen(shape=attr_shape, rand_min=-1.0, rand_max=1.0), random_gen(shape=attr_shape, rand_min=0.0, rand_max=10.0), random_gen(shape=attr_shape, rand_min=1.0, rand_max=10.0), ] if scale_after_normalization: input_values.append( random_gen(shape=attr_shape, rand_min=-1.0, rand_max=1.0) ) input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, atol=0.2, rtol=1e-4, ) class TestNormalization(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, epsilon", itertools.product( compute_units, backends, [1e-1, 1e-10] ), ) def test_fused_batch_norm(self, compute_unit, backend, epsilon): if backend[0] == "neuralnetwork" and epsilon == 1e-10 and platform.machine() == "x86_64": pytest.xfail( "rdar://108739991 ([CI][TF] re-enable batch norm unittest failing in Intel machines)" ) # TensorFlow's FusedBatchNorm is only for 4D inputs input_shape = np.random.randint(low=1, high=4, size=4) attr_shape = [list(input_shape)[-1]] m = random_gen(shape=attr_shape, rand_min=-1.0, rand_max=1.0) v = random_gen(shape=attr_shape, rand_min=0.0, rand_max=10.0) o = random_gen(shape=attr_shape, rand_min=1.0, rand_max=10.0) s = random_gen(shape=attr_shape, rand_min=-1.0, rand_max=1.0) @make_tf_graph([input_shape]) def build_model(x): return tf.compat.v1.nn.fused_batch_norm( x, mean=m, variance=v, offset=o, scale=s, epsilon=epsilon, is_training=False, )[0] model, inputs, outputs = build_model input_values = [random_gen(shape=input_shape, rand_min=-100.0, rand_max=100.0)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, atol=1e-2, rtol=1e-3, ) class TestL2Normalization(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, axes, epsilon", itertools.product( compute_units, backends, [rank for rank in range(3, 6)], [(-1,), (-2,), (0, 1)], [1e-5, 1e-10], ), ) def test_l2_normalize(self, compute_unit, backend, rank, axes, epsilon): input_shape = np.random.randint(low=1, high=4, size=rank) @make_tf_graph([input_shape]) def build_model(x): return tf.math.l2_normalize(x, 
axis=axes, epsilon=epsilon) model, inputs, outputs = build_model input_values = [random_gen(input_shape, rand_min=-10, rand_max=10)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, atol=0.05, rtol=1e-4, ) class TestLocalResponseNormalization(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, size, alpha, beta, k", itertools.product( compute_units, backends, [1, 2, 3], [0.0001, 0.01], [0.75, 1.0], [1.0, 2.0], ), ) def test_local_response_normalization( self, compute_unit, backend, size, alpha, beta, k ): # TensorFlow's local_response_normalization only supports rank 4 input_shape = np.random.randint(low=3, high=4, size=4) @make_tf_graph([input_shape]) def build_model(x): return tf.nn.local_response_normalization( x, depth_radius=size, bias=k, alpha=alpha, beta=beta ) model, inputs, outputs = build_model input_values = [random_gen(shape=input_shape, rand_min=-100, rand_max=100)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, atol=1e-2, rtol=1e-3, ) class TestPool1d(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, kernel_sizes, strides, pad_type", itertools.product( compute_units, backends, [(1,)], [(1,), (2,)], ["same", "valid"], ), ) def test_avg_pool_1d(self, compute_unit, backend, kernel_sizes, strides, pad_type): input_shape = np.random.randint(low=2, high=4, size=3) @make_tf_graph([input_shape]) def build_model(x): return tf.nn.avg_pool1d( x, ksize=kernel_sizes[:], strides=strides[:], padding=pad_type.upper() ) model, inputs, outputs = build_model input_values = [random_gen(shape=input_shape, rand_min=-100, rand_max=100)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend, kernel_sizes, strides, pad_type", itertools.product( compute_units, backends, [(1,)], [(1,), (2,)], ["same", "valid"], ), ) def test_max_pool_1d(self, compute_unit, backend, kernel_sizes, strides, pad_type): input_shape = np.random.randint(low=2, high=4, size=3) @make_tf_graph([input_shape]) def build_model(x): return tf.nn.max_pool1d( x, ksize=kernel_sizes[:], strides=strides[:], padding=pad_type.upper() ) model, inputs, outputs = build_model input_values = [random_gen(shape=input_shape, rand_min=-100, rand_max=100)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestPool2d(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, kernel_sizes, strides, pad_type", itertools.product( compute_units, backends, [(1,), (2,), (1, 1), (1, 2), (2, 2)], [(1,), (2,), (1, 1), (1, 2), (2, 2)], ["same", "valid"], ), ) def test_avg_pool_2d(self, compute_unit, backend, kernel_sizes, strides, pad_type): input_shape = np.random.randint(low=2, high=4, size=4) @make_tf_graph([input_shape]) def build_model(x): return tf.nn.avg_pool( x, ksize=kernel_sizes[:], strides=strides[:], padding=pad_type.upper() ) model, inputs, outputs = build_model input_values = [random_gen(shape=input_shape, rand_min=-100, rand_max=100)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend, 
kernel_sizes, strides, pad_type", itertools.product( compute_units, backends, [(1,), (2,), (1, 1), (1, 2), (2, 2)], [(1,), (2,), (1, 1), (1, 2), (2, 2)], ["same", "valid"], ), ) def test_max_pool_2d(self, compute_unit, backend, kernel_sizes, strides, pad_type): input_shape = np.random.randint(low=2, high=4, size=4) @make_tf_graph([input_shape]) def build_model(x): return tf.nn.max_pool( x, ksize=kernel_sizes[:], strides=strides[:], padding=pad_type.upper() ) model, inputs, outputs = build_model input_values = [random_gen(shape=input_shape, rand_min=-100, rand_max=100)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestPool3d(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, kernel_sizes, strides, pad_type", itertools.product( compute_units, backends, [(1,), (2,), (1, 1, 1), (1, 2, 3), (2, 2, 3), (3, 3, 3)], [(1,), (2,), (1, 1, 1), (1, 2, 3), (2, 2, 3), (3, 3, 3)], ["same", "valid"], ), ) def test_avg_pool_3d(self, compute_unit, backend, kernel_sizes, strides, pad_type): input_shape = np.random.randint(low=3, high=4, size=5) if kernel_sizes[0] == 1 and pad_type == "same": pytest.xfail("rdar://81630684 (Pool3d with pad type == same fails from TF2.5 onwards)") @make_tf_graph([input_shape]) def build_model(x): return tf.nn.avg_pool3d( x, ksize=kernel_sizes[:], strides=strides[:], padding=pad_type.upper() ) model, inputs, outputs = build_model input_values = [random_gen(shape=input_shape, rand_min=-100, rand_max=100)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend, kernel_sizes, strides, pad_type", itertools.product( compute_units, backends, [(1,), (2,), (1, 1, 1), (1, 2, 3), (2, 2, 3), (3, 3, 3)], [(1,), (2,), (1, 1, 1), (1, 2, 3), (2, 2, 3), (3, 3, 3)], ["same", "valid"], ), ) def test_max_pool_3d(self, compute_unit, backend, kernel_sizes, strides, pad_type): input_shape = np.random.randint(low=3, high=4, size=5) if kernel_sizes[0] == 1 and pad_type == "same": pytest.xfail("rdar://81630684 (Pool3d with pad type == same fails from TF2.5 onwards)") @make_tf_graph([input_shape]) def build_model(x): return tf.nn.max_pool3d( x, ksize=kernel_sizes[:], strides=strides[:], padding=pad_type.upper() ) model, inputs, outputs = build_model input_values = [random_gen(shape=input_shape, rand_min=-100, rand_max=100)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestPrint(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, backends, [size for size in range(1, 5)], ), ) def test_print(self, compute_unit, backend, rank): shape = np.random.randint(low=1, high=4, size=rank).astype(np.int32) @make_tf_graph([shape]) def build_model(x): print_layer = tf.raw_ops.Print(input=x, data=[]) res = print_layer + 1 return res model, inputs, outputs = build_model input_value = [random_gen(shape=shape, rand_min=-100, rand_max=100)] input_dict = dict(zip(inputs, input_value)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestRandom(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, size, rank, constant", itertools.product( compute_units, backends, [1, 4], [1, 5], [True, False], ), ) def 
test_random_binomial(self, compute_unit, backend, size, rank, constant): if not constant and backend[0] != "neuralnetwork": return # dynamic input is only support in neuralnetwork backend shape = np.random.randint(low=1, high=4, size=rank).astype(np.int32) @make_tf_graph([shape]) def build_model(x): if constant: ref = tf.add(x, tf.keras.backend.random_binomial(shape=shape, p=1.0)) else: ref = tf.add( x, tf.keras.backend.random_binomial( shape=tf.raw_ops.Shape(input=x), p=1.0 ), ) return ref model, inputs, outputs = build_model input_value = [random_gen(shape=shape)] input_dict = dict(zip(inputs, input_value)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend, size", itertools.product( compute_units, backends, [1, 4] ), ) def test_random_categorical(self, compute_unit, backend, size): # TensorFlow's input is 2-D tensor with shape [batch_size, num_classes]. shape = np.random.randint(low=1, high=4, size=2) y_shape = (1,) @make_tf_graph([shape, y_shape]) def build_model(x, y): x = tf.random.categorical(x, size) x = tf.cast(x, dtype=tf.float32) return x * y model, inputs, outputs = build_model input_value = [np.zeros(shape).astype(np.float32), np.zeros(y_shape).astype(np.float32)] input_dict = dict(zip(inputs, input_value)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, mean, rank, constant", itertools.product( compute_units, backends, [0.0], [1, 5], [True, False], ), ) def test_random_normal(self, compute_unit, backend, mean, rank, constant): if not constant and backend[0] != "neuralnetwork": return # dynamic input is only support in neuralnetwork backend shape = np.random.randint(low=1, high=4, size=rank).astype(np.int32) @make_tf_graph([shape]) def build_model(x): if constant: ref = tf.add(x, tf.random.normal(shape=shape, mean=mean, stddev=0.0)) else: ref = tf.add( x, tf.random.normal( shape=tf.raw_ops.Shape(input=x), mean=mean, stddev=0.0 ), ) return ref model, inputs, outputs = build_model input_value = [random_gen(shape=shape)] input_dict = dict(zip(inputs, input_value)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, mean, rank, constant", itertools.product( compute_units, backends, [0.0], [1, 5], [True, False], ), ) def test_keras_random_normal(self, compute_unit, backend, mean, rank, constant): if not constant and backend[0] != "neuralnetwork": return # dynamic input is only support in neuralnetwork backend shape = np.random.randint(low=1, high=4, size=rank).astype(np.int32) @make_tf_graph([shape]) def build_model(x): if constant: ref = tf.add(x, tf.keras.backend.random_normal(shape=shape, mean=mean, stddev=0.0)) else: ref = tf.add( x, tf.keras.backend.random_normal( shape=tf.raw_ops.Shape(input=x), mean=mean, stddev=0.0 ), ) return ref model, inputs, outputs = build_model input_value = [random_gen(shape=shape)] input_dict = dict(zip(inputs, input_value)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, low, high, rank, constant", itertools.product( compute_units, backends, [0.0], [0.0], [1], [True, False], ), ) def test_random_uniform(self, compute_unit, backend, low, high, rank, constant): if not constant and backend[0] != "neuralnetwork": return # 
dynamic input is only support in neuralnetwork backend shape = np.random.randint(low=1, high=4, size=rank).astype(np.int32) @make_tf_graph([shape]) def build_model(x): if constant: ref = tf.add(x, tf.random.uniform(shape=shape, minval=low, maxval=high)) else: ref = tf.add( x, tf.random.uniform( shape=tf.raw_ops.Shape(input=x), minval=low, maxval=high ), ) return ref model, inputs, outputs = build_model input_value = [random_gen(shape=shape)] input_dict = dict(zip(inputs, input_value)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, low, high, rank, constant", itertools.product( compute_units, backends, [1.0], [1.0], [rank for rank in range(1, 6)], [True, False], ), ) def test_keras_random_uniform( self, compute_unit, backend, low, high, rank, constant ): if not constant and backend[0] != "neuralnetwork": return # dynamic input is only support in neuralnetwork backend shape = np.random.randint(low=1, high=4, size=rank).astype(np.int32) @make_tf_graph([shape]) def build_model(x): if constant: ref = tf.add(x, tf.keras.backend.random_uniform(shape=shape, minval=low, maxval=high)) else: ref = tf.add( x, tf.keras.backend.random_uniform( shape=tf.raw_ops.Shape(input=x), minval=low, maxval=high ), ) return ref model, inputs, outputs = build_model input_value = [random_gen(shape=shape)] input_dict = dict(zip(inputs, input_value)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.skipif(_macos_version() < (10, 16), reason="This only works for 'neuralnetwork' on macOS 11") class TestReduction(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank_and_axes, keep_dims, tf_op", itertools.product( compute_units, backends, [ (1, (-1,)), (2, (0,)), (2, (-1, 0)), (3, (1, -3)), (3, (-2,)), (3, (-3, -2, -1)), (4, (0, 1, 2)), (4, (-2, -1, 0)), (4, (1, -2)), (5, (-3, -1)), (5, (-2, -1)), (5, (-3, -2, -1)), (5, (0, -1, 1, -2)), (3, None), (5, None), (3, 1), ], [True, False], [ tf.reduce_all, tf.math.reduce_euclidean_norm, tf.reduce_max, tf.reduce_mean, tf.reduce_min, tf.reduce_prod, tf.reduce_sum, tf.reduce_any, tf.reduce_logsumexp, tf.math.argmax, tf.math.argmin, ], ), ) def test_reduction(self, compute_unit, backend, rank_and_axes, keep_dims, tf_op): rank, axes = rank_and_axes shape = np.random.randint(low=1, high=3, size=rank) def parse_axes(axes): if axes is None: axes = 0 elif isinstance(axes, (tuple, list)): axes = axes[0] return axes def test_tf_argmax(): @make_tf_graph([shape]) def build_model(x): return tf.math.argmax(x, axis=parse_axes(axes)) model, inputs, outputs = build_model input_values = [random_gen(shape, rand_min=-5.0, rand_max=5.0)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) def test_tf_argmin(): @make_tf_graph([shape]) def build_model(x): return tf.math.argmin(x, axis=parse_axes(axes)) model, inputs, outputs = build_model input_values = [random_gen(shape, rand_min=-5.0, rand_max=5.0)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) def test_tf_reduction(): if isinstance(axes, list) and axes and len(axes) == rank and not keep_dims: return input_type = list(shape) x_val = random_gen(shape=shape, rand_min=-5.0, rand_max=5.0) if tf_op in {tf.reduce_all, tf.reduce_any}: input_type += 
[tf.bool] x_val = np.random.randint(low=0, high=2, size=shape).astype(bool) elif tf_op in {tf.math.reduce_euclidean_norm}: x_val = random_gen(shape=shape, rand_min=0.0, rand_max=10.0) elif tf_op in {tf.reduce_prod}: x_val = random_gen(shape=shape, rand_min=1.0, rand_max=1.3) elif tf_op in {tf.reduce_logsumexp}: x_val = random_gen(shape=shape, rand_min=-5, rand_max=5) @make_tf_graph([input_type]) def build_model(x): ref = tf_op(x, axis=axes, keepdims=keep_dims) if tf_op == tf.reduce_any: ref = tf.cast(ref, tf.float32) return ref model, inputs, outputs = build_model input_dict = dict(zip(inputs, [x_val])) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) if tf_op in {tf.math.argmax}: test_tf_argmax() elif tf_op in {tf.math.argmin}: test_tf_argmin() else: test_tf_reduction() class TestGather(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rankX_rankIndices_axis, mode", itertools.product( compute_units, backends, [ (1, 2, -1), (2, 1, 0), (3, 2, -2), (2, 3, 1), (2, 2, 1), (1, 1, 0), (3, 3, -2), (3, 3, 2), (3, 3, 0), (1, 3, -1), (3, 1, 2), (3, 1, -1), ], ["Gather", "GatherV2", "gather"], ), ) def test_gather_function(self, compute_unit, backend, rankX_rankIndices_axis, mode): x_rank, indices_rank, axis = rankX_rankIndices_axis x_shape = np.random.randint(low=2, high=4, size=x_rank) indices_shape = np.random.randint(low=2, high=4, size=indices_rank) @make_tf_graph([x_shape, list(indices_shape) + [tf.int32]]) def build_model(x, indices): if mode == "Gather": res = tf.raw_ops.Gather(params=x, indices=indices) elif mode == "GatherV2": res = tf.raw_ops.GatherV2(params=x, indices=indices, axis=axis) elif mode == "gather": res = tf.gather(x, indices, axis=axis) return res model, inputs, outputs = build_model axis = 0 if mode == "Gather" else axis input_dict = {inputs[0]: np.random.rand(*x_shape).astype(np.float32), inputs[1]: np.random.randint(0, x_shape[axis], size=indices_shape, dtype=np.int32)} TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, mode", itertools.product( compute_units, backends, ["Gather", "GatherV2", "gather"], ), ) def test_gather_invalid_indices(self, compute_unit, backend, mode): """ This test is to verify that TensorFlow Gather op doesn't allow negative nor out-of-range indices, so don't need mb.select for IOS17 mb.gather when lowering TensorFlow gather op. Use TensorFlowBaseTest.run_compare_tf to make this test compatible with both TF1 and TF2. """ @make_tf_graph([[4, tf.int32]]) def build_model(indices): params = tf.constant([0.0, 1.0, 2.0, 3.0, 4.0, 5.0]) if mode == "Gather": res = tf.raw_ops.Gather(params=params, indices=indices) elif mode == "GatherV2": res = tf.raw_ops.GatherV2(params=params, indices=indices, axis=0) elif mode == "gather": res = tf.gather(params, indices) else: raise ValueError(f"Unsupported mode: {mode}") return res model, inputs, outputs = build_model with pytest.raises(tf.errors.InvalidArgumentError, match="-1 is not in \[0, 6\)"): # Negative indices will error out. input_dict = dict(zip(inputs, [np.array([2, 0, -1, 5], dtype=np.int32)])) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) with pytest.raises(tf.errors.InvalidArgumentError, match="6 is not in \[0, 6\)"): # Out-of-range indices will error out. 
input_dict = dict(zip(inputs, [np.array([2, 0, 1, 6], dtype=np.int32)])) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rankX_rankIndices_axis_batchdims, mode", itertools.product( compute_units, backends, [ (2, 2, 1, 0), (3, 2, 1, 1), (3, 3, 2, 0), (3, 3, 2, 1), (3, 3, 2, 2), ], ["GatherV2", "gather"], ), ) def test_gather_with_batch_dims(self, compute_unit, backend, rankX_rankIndices_axis_batchdims, mode): if _macos_version() < (13, 0) and backend[0] == 'mlprogram': pytest.skip("Requires macOS 13 or higher") x_rank, indices_rank, axis, batch_dims = rankX_rankIndices_axis_batchdims x_shape = np.random.randint(low=2, high=4, size=x_rank) indices_shape = np.random.randint(low=2, high=4, size=indices_rank) indices_shape[:batch_dims] = x_shape[:batch_dims] @make_tf_graph([x_shape, list(indices_shape) + [tf.int32]]) def build_model(x, indices): if mode == "GatherV2": res = tf.raw_ops.GatherV2(params=x, indices=indices, axis=axis, batch_dims=batch_dims) elif mode == "gather": res = tf.gather(x, indices, axis=axis, batch_dims=batch_dims) else: raise ValueError("Unsupported tf op {}".format(mode)) return res model, inputs, outputs = build_model axis = 0 if mode == "Gather" else axis input_dict = {inputs[0]: np.random.rand(*x_shape).astype(np.float32), inputs[1]: np.random.randint(0, x_shape[axis], size=indices_shape, dtype=np.int32)} TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, minimum_deployment_target=ct.target.iOS16 if backend[0] == "mlprogram" else None ) @pytest.mark.parametrize( "compute_unit, backend, rankX_rankIndices", itertools.product( compute_units, backends, [ (1, 2), (2, 2), (3, 2), (2, 3), (1, 4), (5, 2), (2, 5), (4, 3), (3, 4), (2, 4), (4, 2), (1, 5), ], ), ) def test_gather_nd(self, compute_unit, backend, rankX_rankIndices): x_rank, indices_rank = rankX_rankIndices x_shape = np.random.randint(low=2, high=4, size=x_rank) indices_shape = np.random.randint(low=2, high=4, size=indices_rank) indices_shape[-1] = np.random.randint(low=1, high=x_rank + 1) @make_tf_graph([x_shape, list(indices_shape) +[tf.int32]]) def build_model(x, indices): return tf.gather_nd(x, indices) model, inputs, outputs = build_model a = np.random.rand(*x_shape).astype(np.float32) indices_list = [] for i in range(indices_shape[-1]): indices_list.append( np.random.randint(0, x_shape[i], size=indices_shape[:-1]) ) input_dict = { inputs[0]: a, inputs[1]: np.stack(indices_list, axis=-1).astype(np.int32), } TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rankX_rankIndices_batchdims", itertools.product( compute_units, backends, [ (1, 2, 0), (2, 2, 1), (3, 5, 2), (5, 5, 3), ], ), ) def test_gather_nd_with_batch_dims(self, compute_unit, backend, rankX_rankIndices_batchdims): if _macos_version() < (13, 0) and backend[0] == 'mlprogram': pytest.skip("Requires macOS 13 or higher") x_rank, indices_rank, batch_dims = rankX_rankIndices_batchdims x_shape = np.random.randint(low=2, high=4, size=x_rank) indices_shape = np.random.randint(low=2, high=4, size=indices_rank) x_shape[:batch_dims] = indices_shape[:batch_dims] indices_shape[-1] = np.random.randint(low=1, high=x_rank + 1 - batch_dims) @make_tf_graph([x_shape, list(indices_shape) +[tf.int32]]) def build_model(x, indices): return tf.gather_nd(x, indices, batch_dims=batch_dims) model, inputs, 
outputs = build_model a = np.random.rand(*x_shape).astype(np.float32) indices_list = [] for i in range(indices_shape[-1]): indices_list.append( np.random.randint(0, x_shape[i+batch_dims], size=indices_shape[:-1]) ) input_dict = { inputs[0]: a, inputs[1]: np.stack(indices_list, axis=-1).astype(np.int32), } TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, minimum_deployment_target=ct.target.iOS16 if backend[0] == "mlprogram" else None ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_gather_nd_invalid_indices(self, compute_unit, backend): """ This test is to verify that TensorFlow GatherNd op doesn't allow negative nor out-of-range indices, so don't need mb.select for IOS17 mb.gather when lowering TensorFlow GatherNd op. Use TensorFlowBaseTest.run_compare_tf to make this test compatible with both TF1 and TF2. """ @make_tf_graph([[2, 2, tf.int32]]) def build_model(indices): params = tf.constant([[0.0, 1.0], [2.0, 3.0]]) return tf.gather_nd(params, indices) model, inputs, outputs = build_model with pytest.raises( tf.errors.InvalidArgumentError, match="\[1, -1\] does not index into param shape \[2,2\]", ): # Negative indices will error out. input_dict = dict(zip(inputs, [np.array([[0, 0], [1, -1]], dtype=np.int32)])) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) with pytest.raises( tf.errors.InvalidArgumentError, match="\[2, 0\] does not index into param shape \[2,2\]" ): # Out-of-range indices will error out. input_dict = dict(zip(inputs, [np.array([[2, 0], [1, 1]], dtype=np.int32)])) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestScatter(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, data_rank, indices_rank, minimum_deployment_target", itertools.product( compute_units, backends, list(range(1, 4)), list(range(2, 4)), [None, ct.target.iOS17], ), ) def test_scatter_nd_with_zeros( self, compute_unit, backend, data_rank, indices_rank, minimum_deployment_target ): shape = np.random.randint(low=2, high=4, size=data_rank).astype(np.int32) indices_shape = np.random.randint(low=2, high=4, size=indices_rank) indices_shape[-1] = np.random.randint(low=1, high=data_rank + 1) updates_shape = list(indices_shape[:-1]) + list(shape[indices_shape[-1] :]) updates = np.random.rand(*updates_shape).astype(np.int32) indices_list = [] for i in range(indices_shape[-1]): indices_list.append(np.random.randint(0, shape[i], size=indices_shape[:-1])) indices = np.stack(indices_list, axis=-1).astype(np.int32) @make_tf_graph( [list(indices.shape) + [tf.int32], updates_shape + [tf.int32], [data_rank, tf.int32]] ) def build_model(indices, updates, shape): return tf.raw_ops.ScatterNd(indices=indices, updates=updates, shape=shape) model, inputs, outputs = build_model input_values = [indices, updates, shape] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_scatter_nd_with_invalid_indices(self, compute_unit, backend): shape = np.random.randint(low=2, high=4, size=3).astype(np.int32) indices_shape = np.random.randint(low=2, high=4, size=3) indices_shape[-1] = np.random.randint(low=1, high=4) 
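        # Editor's note (illustrative, not part of the original test): for tf.raw_ops.ScatterNd
        # (and the tensor_scatter_nd_* ops) the updates tensor must satisfy
        #     updates.shape == indices.shape[:-1] + shape[indices.shape[-1]:]
        # e.g. indices.shape = (2, 3, 2) with shape = (4, 5, 6) requires
        # updates.shape = (2, 3) + (6,) = (2, 3, 6), which is what the next statement constructs.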
updates_shape = list(indices_shape[:-1]) + list(shape[indices_shape[-1] :]) updates = np.random.rand(*updates_shape).astype(np.int32) neg_indices_list = [] for i in range(indices_shape[-1]): neg_indices_list.append(np.random.randint(-shape[i], 0, size=indices_shape[:-1])) indices = np.stack(neg_indices_list, axis=-1).astype(np.int32) @make_tf_graph( [list(indices.shape) + [tf.int32], updates_shape + [tf.int32], [3, tf.int32]] ) def build_model(indices, updates, shape): return tf.raw_ops.ScatterNd(indices=indices, updates=updates, shape=shape) model, inputs, outputs = build_model # TensorFlow ScatterNd doesn't support negative indices. with pytest.raises(tf.errors.InvalidArgumentError, match="does not index into shape"): TensorFlowBaseTest.run_compare_tf( model, dict(zip(inputs, [indices, updates, shape])), outputs, compute_unit=compute_unit, backend=backend, ) out_of_range_indices_list = [] for i in range(indices_shape[-1]): out_of_range_indices_list.append( np.random.randint(shape[i], shape[i] * 2, size=indices_shape[:-1]) ) indices = np.stack(out_of_range_indices_list, axis=-1).astype(np.int32) # TensorFlow ScatterNd doesn't support out of range indices. with pytest.raises(tf.errors.InvalidArgumentError, match="does not index into shape"): TensorFlowBaseTest.run_compare_tf( model, dict(zip(inputs, [indices, updates, shape])), outputs, compute_unit=compute_unit, backend=backend, ) class TestTensorScatterAdd(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, tensor_rank, indices_rank, minimum_deployment_target", itertools.product( compute_units, backends, # updates_rank = indices_rank - 1 + tensor_rank - indices_shape[-1] <= tensor_rank + indices_rank - 2 # and Core ML only supports updates_rank < 6, # so we constrain tensor_rank + indices_rank - 2 < 6 [tensor_rank for tensor_rank in range(1, 5)], [indices_rank for indices_rank in range(2, 4)], [None, ct.target.iOS17], ), ) def test_scatter_add(self, compute_unit, backend, tensor_rank, indices_rank, minimum_deployment_target): # To avoid indexing out of bound: # tensor size for each dimension >= MIN_TENSOR_SIZE # index for each dimension < MIN_TENSOR_SIZE MIN_TENSOR_SIZE = 3 tensor_shape = np.random.randint(low=MIN_TENSOR_SIZE, high=9, size=tensor_rank) # indices shape constraint: 0 < indices_shape[-1] <= tensor_rank indices_shape = np.random.randint(low=1, high=tensor_rank + 1, size=indices_rank) # updates rank and shape are inferred from tensor and indices # reference https://www.tensorflow.org/api_docs/python/tf/compat/v1/scatter_nd_add updates_rank = indices_rank - 1 + tensor_rank - indices_shape[-1] updates_shape = [] for i in range(indices_rank - 1): updates_shape.append(indices_shape[i]) for i in range(indices_shape[-1], tensor_rank): updates_shape.append(tensor_shape[i]) updates_shape = np.array(updates_shape) @make_tf_graph([tensor_shape, list(indices_shape) + [tf.int32], updates_shape]) def build_model(tensor, indices, updates): return tf.tensor_scatter_nd_add(tensor, indices, updates) model, inputs, outputs = build_model input_values = [ random_gen(tensor_shape, rand_min=-1.0, rand_max=1.0), random_gen(indices_shape, rand_min=0, rand_max=MIN_TENSOR_SIZE, dtype=np.int32), random_gen(updates_shape, rand_min=-1.0, rand_max=1.0), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( 
compute_units, backends, ), ) def test_scatter_add_invalid_indices(self, compute_unit, backend): # To avoid indexing out of bound: # tensor size for each dimension >= MIN_TENSOR_SIZE # index for each dimension < MIN_TENSOR_SIZE MIN_TENSOR_SIZE = 3 tensor_rank = 3 indices_rank = 3 tensor_shape = np.random.randint(low=MIN_TENSOR_SIZE, high=9, size=tensor_rank) # indices shape constraint: 0 < indices_shape[-1] <= tensor_rank indices_shape = np.random.randint(low=1, high=tensor_rank + 1, size=indices_rank) updates_shape = [] for i in range(indices_rank - 1): updates_shape.append(indices_shape[i]) for i in range(indices_shape[-1], tensor_rank): updates_shape.append(tensor_shape[i]) updates_shape = np.array(updates_shape) @make_tf_graph([tensor_shape, list(indices_shape) + [tf.int32], updates_shape]) def build_model(tensor, indices, updates): return tf.tensor_scatter_nd_add(tensor, indices, updates) model, inputs, outputs = build_model # TensorFlow tensor_scatter_nd_add doesn't support negative indices. neg_indices = random_gen(indices_shape, rand_min=-3, rand_max=-1, dtype=np.int32) input_values = [ random_gen(tensor_shape, rand_min=-1.0, rand_max=1.0), neg_indices, random_gen(updates_shape, rand_min=-1.0, rand_max=1.0), ] with pytest.raises(tf.errors.InvalidArgumentError, match="does not index into shape"): TensorFlowBaseTest.run_compare_tf( model, dict(zip(inputs, input_values)), outputs, compute_unit=compute_unit, backend=backend, ) # TensorFlow tensor_scatter_nd_add doesn't support out of range indices. out_of_range_indices = random_gen(indices_shape, rand_min=10, rand_max=20, dtype=np.int32) input_values = [ random_gen(tensor_shape, rand_min=-1.0, rand_max=1.0), out_of_range_indices, random_gen(updates_shape, rand_min=-1.0, rand_max=1.0), ] with pytest.raises(tf.errors.InvalidArgumentError, match="does not index into shape"): TensorFlowBaseTest.run_compare_tf( model, dict(zip(inputs, input_values)), outputs, compute_unit=compute_unit, backend=backend, ) class TestSliceByIndex(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, masking_type", itertools.product( compute_units, backends, [rank for rank in range(1, 5)], ["none", "positive_mask", "negative_mask"] ), ) def test_slice_by_index_simple(self, compute_unit, backend, rank, masking_type): if backend[0] == "mlprogram": pytest.xfail( "rdar://109854221 ([Bug][Regression] slice_by_index is throwing exception through E5ML - Follow up radar)" ) if backend[0] == "neuralnetwork": pytest.xfail( "rdar://111134257 ([Bug][Regression] nnv1 slice_by_index unittests are failing)" ) input_shape = np.random.randint(low=2, high=4, size=rank) begin_val = np.array( [ np.random.randint(low=-input_shape[i], high=input_shape[i]) for i in range(rank) ] ).astype(np.int32) end_val = np.array( [ np.random.randint(low=-input_shape[i], high=input_shape[i]) for i in range(rank) ] ).astype(np.int32) stride_val = np.array( [ np.random.randint(low=-input_shape[i], high=input_shape[i]) for i in range(rank) ] ).astype(np.int32) if masking_type == "none": begin_mask = [False] * rank end_mask = [False] * rank squeeze_mask = [False] * rank else: begin_mask = np.array( [np.random.choice([True, False, False]) for i in range(rank)] ).astype(bool) end_mask = np.array( [np.random.choice([True, False, False]) for i in range(rank)] ).astype(bool) squeeze_flag = True # We do not squeeze to scalar in nn while squeeze_flag: squeeze_mask = np.array( [np.random.choice([True, False]) for i in range(rank)] ).astype(bool) for i in range(rank): if begin_mask[i] or 
end_mask[i]: squeeze_mask[i] = False for s in squeeze_mask: if not s: squeeze_flag = False for i in range(rank): if begin_mask[i] or end_mask[i]: stride = 0 while stride == 0: stride = np.random.randint(low=-input_shape[i], high=input_shape[i]) stride_val[i] = stride if not end_mask[i]: while True: end = np.random.randint( low=-input_shape[i], high=input_shape[i] ) normalized_end = input_shape[i] + end if end < 0 else end if normalized_end == 0 and stride_val[i] > 0: continue elif normalized_end == input_shape[i] - 1 and stride_val[i] < 0: continue else: end_val[i] = end break continue if squeeze_mask[i]: stride_val[i] = 1 while True: end = np.random.randint(low=-input_shape[i], high=input_shape[i]) normalized_end = input_shape[i] + end if end < 0 else end normalized_begin = ( input_shape[i] + begin_val[i] if begin_val[i] < 0 else begin_val[i] ) if normalized_end == normalized_begin: continue if begin_mask[i] or end_mask[i] or squeeze_mask[i]: stride = 1 elif normalized_end < normalized_begin: stride = -np.random.randint(low=1, high=input_shape[i]) else: stride = np.random.randint(low=1, high=input_shape[i]) end_val[i] = end stride_val[i] = stride break def _mask_to_bit(mask): ret = 0 for x in mask[::-1]: ret <<= 1 if x: ret += 1 if ret > 0 and masking_type == "negative_mask": ret = ret - 2**rank return ret @make_tf_graph( [ input_shape, list(begin_val.shape) + [tf.int32], list(end_val.shape) + [tf.int32], ] ) def build_model(x, begin, end): return tf.strided_slice( x, begin, end, stride_val, begin_mask=_mask_to_bit(begin_mask), end_mask=_mask_to_bit(end_mask), shrink_axis_mask=_mask_to_bit(squeeze_mask), ) model, inputs, outputs = build_model input_values = [ np.array(list(range(np.prod(input_shape)))) .reshape(input_shape) .astype(np.float32), begin_val, end_val, ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, testcase", itertools.product( compute_units, backends, # Change to slice representation for allowing iteration with a non-constant input [ ( slice(1, 2), slice(1, 2), slice(1, 2), ), # equivalent to [1:2, 1:2, 1:2] (slice(-3, -2), slice(-4, -3), slice(-5, -4)), (slice(0, -2), slice(0, -1), slice(-3, -2)), (slice(-1, 0, -2), slice(-1, 1, -1), slice(-1, -3, -3)), (slice(1, 2), slice(1, 3), slice(1, 4, 2)), (slice(None, 2), slice(1, 3), slice(None, 4, 2)), ( slice(None), slice(1, None), slice(None, 4, 2), ), # equivalent to [:,1:,:4:2] (slice(1, None, 1), 1, slice(None, 3, 2)), (slice(None), slice(None), slice(None)), (slice(1, 2), slice(1, 2), 1), (slice(1, 2), slice(None), slice(None)), (slice(None), slice(None), slice(None)), (slice(1, 2), slice(None), slice(1, 2)), (slice(None), slice(None), 1), (0, 0, slice(None)), (slice(1, 2)), (slice(1, 2), slice(1, 2)), (1), (slice(0, 3)), (slice(None)), (slice(None), slice(None), slice(None, None, -1)), ], ), ) def test_slice_by_index_from_scratch(self, compute_unit, backend, testcase): input_shape = np.array([3, 4, 5]) @make_tf_graph([input_shape]) def build_model(x): return x[testcase] model, inputs, outputs = build_model input_values = [ np.array(list(range(np.prod(input_shape)))) .reshape(input_shape) .astype(np.float32) ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, shape_and_slice", itertools.product( compute_units, 
backends, [ [[3], (slice(1, 2))], [[2,10], (slice(0, 2), slice(None, 8, 2))], [[2,3,4,5], (slice(None), slice(1, None, 3), slice(None), slice(0, 5))], [[2,3,4,5], (slice(0, None), slice(None), slice(2, None, 1), slice(None))], ], ), ) def test_slice_by_index_one_dimension(self, compute_unit, backend, shape_and_slice): input_shape, testcase = shape_and_slice @make_tf_graph([input_shape]) def build_model(x): return x[testcase] model, inputs, outputs = build_model input_values = [ np.array(list(range(np.prod(input_shape)))) .reshape(input_shape) .astype(np.float32) ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_slice_by_index_smoke(self, compute_unit, backend): input_shape = [1, 64, 2] x_val = np.random.rand(*input_shape).astype(np.float32) y_val = np.random.rand(*input_shape).astype(np.float32) @make_tf_graph([input_shape, input_shape]) def build_model(x, y): x_slice = x[:, :, 0] y_slice = y[:, :, 0] return (x_slice, y_slice) model, inputs, outputs = build_model input_values = [x_val, y_val] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.xfail(reason="ExpandDims exist mismatch", run=False) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_slice_by_index_with_new_axes(self, compute_unit, backend): input_shape = [4, 5, 64] val = np.random.rand(*input_shape).astype(np.float32) num_cases = 8 @make_tf_graph([input_shape] * num_cases) def build_model(*args): a, b, c, d, e, f, g, h = args slice_0 = a[:1, tf.newaxis, :3, :] slice_1 = b[:, tf.newaxis] slice_2 = c[..., tf.newaxis] slice_3 = d[..., tf.newaxis, :, 10] slice_4 = e[:, 2, tf.newaxis, ...] slice_5 = f[2, ..., :, tf.newaxis] slice_6 = g[tf.newaxis, ..., tf.newaxis] slice_7 = h[tf.newaxis, 2, tf.newaxis, ...] 
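        # Editor's note (illustrative, not part of the original test): tf.newaxis in a
        # subscript inserts a length-1 axis, equivalent to an ExpandDims at that position.
        # For the (4, 5, 64) inputs used here, e.g. b[:, tf.newaxis] has shape (4, 1, 5, 64)
        # and c[..., tf.newaxis] has shape (4, 5, 64, 1).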
return ( slice_0, slice_1, slice_2, slice_3, slice_4, slice_5, slice_6, slice_7, ) model, inputs, outputs = build_model input_values = [val] * num_cases input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestSliceBySize(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, single_size, dynamic_size", itertools.product( compute_units, backends, [rank for rank in range(1, 5)], [True, False], [True, False], ), ) def test_dynamic_slice_by_size( self, compute_unit, backend, rank, single_size, dynamic_size ): # Test for when either begin or size are runtime determines input_shape = np.random.randint(low=2, high=4, size=rank) begin_val = np.array( [np.random.randint(input_shape[i]) for i in range(rank)] ).astype(np.int32) size_val = np.array( [np.random.randint(input_shape[i] - begin_val[i]) + 1 for i in range(rank)] ) if single_size: for r in range(rank): size_val_r = np.array( [s if i == r else -1 for i, s in enumerate(size_val)] ).astype(np.int32) @make_tf_graph([input_shape, list(begin_val.shape) + [tf.int32]]) def build_model(x, begin): return tf.slice(x, begin, size_val_r) @make_tf_graph( [ input_shape, list(begin_val.shape) + [tf.int32], list(size_val_r.shape) + [tf.int32], ] ) def build_model_dynamic_size(x, begin, size): return tf.slice(x, begin, size) if dynamic_size: model, inputs, outputs = build_model_dynamic_size input_values = [ random_gen(input_shape, rand_min=-100, rand_max=100), begin_val, size_val_r, ] else: model, inputs, outputs = build_model input_values = [ random_gen(input_shape, rand_min=-100, rand_max=100), begin_val, ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) else: size_val = np.array( [s if np.random.randint(2) == 0 else -1 for s in size_val] ).astype(np.int32) @make_tf_graph([input_shape, list(begin_val.shape) + [tf.int32]]) def build_model(x, begin): return tf.slice(x, begin, size_val) @make_tf_graph( [ input_shape, list(begin_val.shape) + [tf.int32], list(size_val.shape) + [tf.int32], ] ) def build_model_dynamic_size(x, begin, size): return tf.slice(x, begin, size) if dynamic_size: model, inputs, outputs = build_model_dynamic_size input_values = [ random_gen(input_shape, rand_min=-100, rand_max=100), begin_val, size_val, ] else: model, inputs, outputs = build_model input_values = [ random_gen(input_shape, rand_min=-100, rand_max=100), begin_val, ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, begin_size", itertools.product( compute_units, backends, [ [[0, 1, 2], [1, 1, 1]], [[0, 0, 0], [-1, -1, -1]], [[0, 0, 1], [1, 2, -1]], [[0, 1, 2], [-1, -1, -1]], ] ), ) def test_static_slice_by_size( self, compute_unit, backend, begin_size ): # Test for when begin and size are both constant input_shape = [1, 2, 3] begin, size = begin_size tf_input_shape = input_shape.copy() for i in range(3): if np.random.randint(2) == 0: tf_input_shape[i] = None # We set the begin to 0 for the symbolic dimension, # since the default input shape will be 1 in this case, # we need to make sure that begin = 0 and size = 1 (unless size == -1) begin[i] = 0 if size[i] != -1: size[i] = 1 @make_tf_graph([tf_input_shape]) def build_model(x): return tf.slice(x, begin, size) model, inputs, outputs = 
build_model input_values = [random_gen(input_shape, rand_min=-2, rand_max=2)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestMatrixBandPart(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, lower_and_upper", itertools.product( compute_units, backends, [rank for rank in range(2, 6)], [(0, -1), (-1, 0), (0, 0)], ), ) def test_matrix_band_part(self, compute_unit, backend, rank, lower_and_upper): lower, upper = lower_and_upper shape = np.random.randint(low=3, high=4, size=rank) @make_tf_graph([shape]) def build_model(x): return tf.raw_ops.MatrixBandPart(input=x, num_lower=lower, num_upper=upper) model, inputs, outputs = build_model TensorFlowBaseTest.run_compare_tf( model, {inputs[0]: random_gen(shape, rand_min=-100, rand_max=100)}, outputs, compute_unit=compute_unit, backend=backend, ) class TestCumSum(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, reverse, exclusive", itertools.product( compute_units, backends, [rank for rank in range(1, 6)], [True, False], [True, False], ), ) def test_cumsum(self, compute_unit, backend, rank, reverse, exclusive): input_shape = np.random.randint(low=1, high=4, size=rank) for axis in range(-1, rank, 3): @make_tf_graph([input_shape]) def build_model(x): return tf.math.cumsum(x, axis=axis, reverse=reverse, exclusive=exclusive) model, inputs, outputs = build_model input_values = [random_gen(input_shape, rand_min=-10, rand_max=10)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf(model, input_dict, outputs, compute_unit=compute_unit, backend=backend) @pytest.mark.skipif(not _HAS_TF_1, reason=MSG_TF1_NOT_FOUND) class TestFakeQuant(TensorFlowBaseTest): @pytest.mark.parametrize( "num_bits, weight_boundaries, compute_unit, backend", itertools.product( [2, 8], # TensorFlow does not support 1-bit quantization [(0, 10), (-0.01, 0.02), (-101, 100)], compute_units, backends, ), ) def test_fake_quant_weight_quantization_with_conv(self, num_bits, weight_boundaries, compute_unit, backend): if backend[0] == 'mlprogram': pytest.skip("Not supported with ML Program backend") tf.reset_default_graph() filter_width = 1 filter_height = 1 spatial_size = 2 input_channels = 3 output_channels = 1 input_tensor = tf.placeholder(tf.float32, [1, spatial_size, spatial_size, input_channels], name='input') output_tensor = tf.placeholder(tf.float32, [1, spatial_size, spatial_size, output_channels], name='output') kernel_in = random_gen((filter_width, filter_height), weight_boundaries[0], weight_boundaries[1]) init = tf.constant_initializer(kernel_in) def model(x): with tf.compat.v1.variable_scope('quantized_model'): x = tf.layers.conv2d(x, filters=3, kernel_size=1, strides=1, kernel_initializer=init) return x with tf.compat.v1.variable_scope('quantize'): output = model(x=input_tensor) tf.contrib.quantize.experimental_create_training_graph(quant_delay=0, weight_bits=num_bits, activation_bits=num_bits) loss = tf.losses.mean_squared_error(labels=input_tensor, predictions=output) saver = tf.train.Saver() update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) with tf.control_dependencies(update_ops): optimizer = tf.train.AdamOptimizer().minimize(loss) checkpoint_dir = tempfile.mkdtemp() # Run training pass to retrieve the correct min and max in FakeQuant op (to avoid using default values) and # save dummy checkpoint. 
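        # Editor's note (illustrative, not from the original source): the tf.contrib.quantize
        # rewrite above inserts FakeQuant ops with per-layer min/max variables; the short
        # training loop below only exists so those variables are updated to realistic ranges
        # before the eval graph is frozen and handed to the converter.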
with tf.Session() as sess: tf.global_variables_initializer().run() for iter in range(1): image = np.random.rand(spatial_size, spatial_size, input_channels).astype(np.float32) * 255 label = np.random.rand(spatial_size, spatial_size, output_channels).astype(np.float32) * 255 training_loss, _ = sess.run([loss, optimizer], feed_dict={input_tensor: image[None, ...], output_tensor: label[None, ...]}) saver.save(sess=sess, save_path=os.path.join(checkpoint_dir, 'quantization')) with tf.Graph().as_default() as g: input_tensor = tf.placeholder(tf.float32, [1, spatial_size, spatial_size, input_channels], name='input') with tf.variable_scope('quantize'): output = model(x=input_tensor) # define eval graph, by quantizing the weights of the model with learned min/max values for each layer tf.contrib.quantize.experimental_create_eval_graph(input_graph=g, weight_bits=num_bits, activation_bits=num_bits) tmpdir = tempfile.mkdtemp() tf_graph_path = os.path.join(str(tmpdir), "tf_graph.pb") tf_graph_path_quantized = os.path.join(str(tmpdir), "frozen_graph_quantized.pb") with open(tf_graph_path, 'wb') as f: f.write(g.as_graph_def().SerializeToString()) freeze_g(input_graph=tf_graph_path, input_saver="", input_binary=True, input_checkpoint=os.path.join(checkpoint_dir, 'quantization'), output_node_names="quantize/quantized_model/conv2d/Conv2D", restore_op_name="save/restore_all", filename_tensor_name="save/Const:0", output_graph=tf_graph_path_quantized, clear_devices=True, initializer_nodes="") shutil.rmtree(checkpoint_dir) graph = load_tf_pb(tf_graph_path_quantized) tf.reset_default_graph() graphdef = tf.GraphDef() input_dict = {} with open(tf_graph_path_quantized, "rb") as f: graphdef.ParseFromString(f.read()) shutil.rmtree(tmpdir) with tf.Graph().as_default(), tf.Session(config=None) as sess: tf.graph_util.import_graph_def(graphdef, name='') input_dict[sess.graph.get_tensor_by_name('input:0')] = (np.random.rand(1, spatial_size, spatial_size, input_channels).astype(np.float32)) outputs = [] outputs.append(sess.graph.get_tensor_by_name('quantize/quantized_model/conv2d/Conv2D:0')) tf_outs = sess.run(outputs, feed_dict=input_dict) TensorFlowBaseTest.run_compare_tf( graph, input_dict, ["quantize/quantized_model/conv2d/Conv2D"], compute_unit=compute_unit, backend=backend, tf_outputs=tf_outs, rtol=0.005, ) class TestFill(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, value", itertools.product( compute_units, backends, [rank for rank in range(1, 6)], [-19.0, 0.0, 37.0], ), ) def test_fill(self, compute_unit, backend, rank, value): def test_tf_static(): shape = np.random.randint(low=1, high=3, size=rank) @make_tf_graph([shape]) def build_model(x): return tf.add( x, tf.fill(dims=np.array(shape, dtype=np.float32), value=value) ) model, inputs, outputs = build_model input_values = [np.random.rand(*shape).astype(np.float32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) def test_tf_dynamic(): shape = np.random.randint(low=1, high=3, size=rank) @make_tf_graph([(len(shape), tf.int32)]) def build_model(x): return tf.fill(dims=x, value=value) model, inputs, outputs = build_model input_values = [np.array(shape, dtype=np.int32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) test_tf_static() test_tf_dynamic() class TestNonMaximumSuppression(TensorFlowBaseTest): @pytest.mark.parametrize( 
",".join( [ "compute_unit", "backend", "num_boxes", "max_boxes", "iou_threshold", "score_threshold", "use_V5", ] ), itertools.product( compute_units, backends, [1, 5, 20, 1000], [1, 8, 100], [0.2, 0.8], [float("-inf"), -200.0, 200.0], [True, False], ), ) def test_non_max_suppression( self, compute_unit, backend, num_boxes, max_boxes, iou_threshold, score_threshold, use_V5, ): if _macos_version() >= (14, 0) and compute_unit == ct.ComputeUnit.CPU_ONLY and backend == ("neuralnetwork", "fp32"): pytest.xfail("rdar://118512264 Three specific instances are failing on at least early versions of macOS 14") if score_threshold > 100.0: pytest.xfail( "When score threshold is too high, TF will return empty result, while MIL " "will still keep the highest score box." ) if num_boxes >= 1000: pytest.xfail( "rdar://103891349 ([TensorFlow] [PyTorch] NMS discrepancy in Fp16 when " "number of boxes is large)" ) if backend[0] == "mlprogram": # force we are using fp16 for mlprogram, until this radar is fix: # rdar://109871491 ([Bug][CI][Regression] Numerical regression on E5ML for nms layers) backend = ("mlprogram", "fp32") if _HAS_TF_1 and score_threshold == -200 and backend[0] == "mlprogram": pytest.xfail( "rdar://111714405 ([Bug][Regression] Tensorflow nms layer unitests are failing)" ) boxes_val = random_gen(shape=(num_boxes, 4), rand_min=0, rand_max=32) # When the input score is too close, the returned index order is not guaranteed. # So instead of generating random scores by rand, use shuffle. scores_val = np.arange(num_boxes).astype(np.float32) np.random.shuffle(scores_val) @make_tf_graph([boxes_val.shape, scores_val.shape]) def build_model(boxes, scores): if use_V5: ret = tf.raw_ops.NonMaxSuppressionV5( boxes=boxes, scores=scores, max_output_size=max_boxes, iou_threshold=iou_threshold, score_threshold=score_threshold, soft_nms_sigma=0., ) else: ret = tf.image.non_max_suppression( boxes=boxes, scores=scores, max_output_size=max_boxes, iou_threshold=iou_threshold, score_threshold=score_threshold, ) return ret model, inputs, outputs = build_model input_dict = dict(zip(inputs, [boxes_val, scores_val])) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestOneHot(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank_and_axis, dynamic", itertools.product( compute_units, backends, [ (2, 0), (2, -1), (3, 3), (3, 0), (3, -2), (4, -4), (4, 1), (4, -1), (4, -2), (4, 3), ], [True, False], ), ) def test_one_hot(self, compute_unit, backend, rank_and_axis, dynamic): rank, axis = rank_and_axis depth, on_value, off_value = 30, 28.0, -4.0 x_shape = np.random.randint(low=2, high=4, size=rank) axis = (axis if axis >= -1 else axis + rank + 1) if not dynamic: @make_tf_graph([list(x_shape)+[tf.int32]]) def build_model(x): return tf.one_hot(x, axis=axis, depth=depth, on_value=on_value, off_value=off_value) model, inputs, outputs = build_model input_values = [np.random.randint(0, depth, size=x_shape).astype(np.int32)] input_dict = dict(zip(inputs, input_values)) else: # Dynamic Case with depth being an input @make_tf_graph([list(x_shape)+[tf.int32], [1, tf.int32]]) def build_model(x, depth_input): # tf.squeeze since CoreML input has to be rank 1~5. 
return tf.one_hot(x, axis=axis, depth=tf.squeeze(depth_input), on_value=on_value, off_value=off_value) model, inputs, outputs = build_model input_values = [np.random.randint(0, depth, size=x_shape).astype(np.int32), np.array([depth]).astype(np.int32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestSoftmaxCrossEntropyWithLogits(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, class_num", itertools.product( compute_units, backends, [1, 3], ), ) def test_sparse_softmax_cross_entropy_with_logits(self, compute_unit, backend, class_num): batch_size = 2 feature_shape = [batch_size, class_num] label_shape = [batch_size, tf.int32] @make_tf_graph([feature_shape, label_shape]) def build_model(feat, label): return tf.raw_ops.SparseSoftmaxCrossEntropyWithLogits(features=feat, labels=label)[0] model, inputs, outputs = build_model features = random_gen(feature_shape, rand_min=0, rand_max=1) labels = np.random.randint(low=0, high=class_num, size=(batch_size,), dtype=np.int32) input_values = [features, labels] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend, class_num", itertools.product( compute_units, backends, [1, 3], ), ) def test_softmax_cross_entropy_with_logits(self, compute_unit, backend, class_num): batch_size = 2 feature_shape = [batch_size, class_num] label_shape = [batch_size, class_num] @make_tf_graph([feature_shape, label_shape]) def build_model(feat, label): return tf.raw_ops.SoftmaxCrossEntropyWithLogits(features=feat, labels=label)[0] model, inputs, outputs = build_model input_values = [ random_gen(feature_shape, rand_min=0, rand_max=1), random_gen(label_shape, rand_min=0, rand_max=1), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestIdentityN(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_identity_n(self, compute_unit, backend): shape_1 = [1,] shape_2 = [3, 4] shape_3 = [5, 6, 7] @make_tf_graph([shape_1, shape_2, shape_3]) def build_model(x, y ,z): return tf.raw_ops.IdentityN(input=[x, y, z]) model, inputs, outputs = build_model input_values = [ random_gen(shape_1, rand_min=0, rand_max=1), random_gen(shape_2, rand_min=0, rand_max=1), random_gen(shape_3, rand_min=0, rand_max=1), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_identity_n_with_downstream_op(self, compute_unit, backend): shape = [3, 4] @make_tf_graph([shape]) def build_model(x): x = tf.identity_n(input=[x, x]) return tf.reduce_max(x, 1) model, inputs, outputs = build_model input_values = [np.random.rand(*shape).astype(np.float32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestPad(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, mode, dynamic, trial", itertools.product( compute_units, backends, [2, 3, 4], ['constant', 'reflect'], [True, False], list(range(10)), ), ) def test(self, compute_unit, 
backend, rank, mode, dynamic, trial): input_shape = np.random.randint(low=2, high=10, size=rank) min_input_dim_size = input_shape.min() padding_val = np.random.randint(low=0, high=min_input_dim_size, size=(rank, 2), dtype=np.int32) # Only constant mode supports padding across all dimensions # All other padding modes are only applied on two dimensions. perm = list(range(rank)) import random random.shuffle(perm) if mode != "constant": padding_val[perm[:-2]] = 0 tf_mode = mode.upper() if dynamic: if mode != "constant": return padding_shape = padding_val.shape @make_tf_graph([input_shape, list(padding_shape)+[tf.int32]]) def build_model(x, paddings): return tf.pad(x, paddings=paddings, mode=tf_mode) model, inputs, outputs = build_model input_values = [random_gen(input_shape, rand_min=0.2, rand_max=1000), padding_val] input_dict = dict(zip(inputs, input_values)) else: @make_tf_graph([input_shape]) def build_model(x): return tf.pad(x, paddings=padding_val, mode=tf_mode) model, inputs, outputs = build_model input_values = [random_gen(input_shape, rand_min=0.2, rand_max=1000)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestPadV2(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, constant_values, dynamic, trial", itertools.product( compute_units, backends, list(range(1, 6)), [0., 10, -1], [True], list(range(10)) ), ) def test(self, compute_unit, backend, rank, constant_values, dynamic, trial): input_shape = np.random.randint(low=2, high=10, size=rank) paddings = np.random.randint(low=2, high=5, size=2*rank).astype(np.int32) padding_val = paddings.reshape(-1,2) if dynamic: padding_shape = padding_val.shape @make_tf_graph([input_shape, list(padding_shape)+[tf.int32]]) def build_model(x, paddings): return tf.raw_ops.PadV2(input=x, paddings=paddings, constant_values=constant_values) model, inputs, outputs = build_model input_values = [random_gen(input_shape, rand_min=0.2, rand_max=1000), padding_val] input_dict = dict(zip(inputs, input_values)) else: @make_tf_graph([input_shape]) def build_model(x): return tf.raw_ops.PadV2(input=x, paddings=padding_val, constant_values=constant_values) model, inputs, outputs = build_model input_values = [random_gen(input_shape, rand_min=0.2, rand_max=1000)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestRange(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, params", itertools.product( compute_units, backends, [ (-10.4, 23, 12.2), (0, 10, 1), (50.5, 90.5, 1.5), (5, 8, 2), (5, 8, 98), (5, 8, 1.5), (10, 5, -0.6), (24, -65, -2), ], ), ) def test_range(self, compute_unit, backend, params): start, end, step = np.array(params).astype(np.float32) # CoreML requires rank-1~5 input. @make_tf_graph([[1, tf.float32]]) def build_model(limit): return tf.range(start=start, limit=tf.squeeze(limit), delta=step) model, inputs, outputs = build_model input_values = [np.array([end])] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) # CoreML requires rank-1~5 input. 
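# Same pattern as the case above, but here the step (delta) is the dynamically supplied value.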
@make_tf_graph([[1, tf.float32]]) def build_model(delta): return tf.range(start=start, limit=end, delta=tf.squeeze(delta)) model, inputs, outputs = build_model input_values = [np.array([step])] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) # CoreML requires rank-1~5 input. @make_tf_graph([[1, tf.float32]]) def build_model(begin): return tf.range(start=tf.squeeze(begin), limit=end, delta=step) model, inputs, outputs = build_model input_values = [np.array([start])] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestTile(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank_and_reps", itertools.product( compute_units, backends, [ (1, (2,)), (2, (1, 2)), (2, (2, 2)), (3, (3, 2, 1)), (3, (2, 1, 3)), (3, (2, 1, 1)), (4, (1, 3, 2, 1)), (4, (2, 1, 1, 2)), (5, (2, 1, 1, 3, 2)), (5, (1, 1, 2, 3, 2)), ], ), ) def test_tile(self, compute_unit, backend, rank_and_reps): rank, reps = rank_and_reps x_shape = np.random.randint(low=2, high=4, size=rank) @make_tf_graph([x_shape]) def build_model(x): return tf.tile(x, multiples=reps) model, inputs, outputs = build_model input_values = [random_gen(x_shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_tile_invalid(self, compute_unit, backend): """TF doesn't support tile where `multiples` have different length than x's rank.""" x_shape = (2, 3, 4) with pytest.raises(ValueError, match="Shape must be rank 3 but is rank 2"): @make_tf_graph([x_shape]) def build_model(x): return tf.tile(x, multiples=[1, 2]) model, inputs, outputs = build_model input_values = [random_gen(x_shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestDynamicTile(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product(compute_units, backends, [1, 2, 3, 4, 5]), ) def test_tile(self, compute_unit, backend, rank): x_shape = np.random.randint(low=2, high=4, size=rank) reps_val = np.random.randint(low=1, high=3, size=rank).astype(np.int32) @make_tf_graph([x_shape, [*reps_val.shape, tf.int32]]) def build_model(x, reps): return tf.tile(input=x, multiples=reps) model, inputs, outputs = build_model input_values = [random_gen(x_shape), reps_val] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestTopK(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, k, sort", itertools.product( compute_units, backends, [1, 3, 5], [1, 3, None], # None denotes dynamic k [True, False], ), ) def test_top_k(self, compute_unit, backend, rank, k, sort): if not sort and backend[0] == "neuralnetwork": pytest.skip("iOS16 version topk needed for sort = False") if not sort and _macos_version() < (13, 0): pytest.skip("New functionality in macOS13/iOS16") # TensorFlow only supports last dimension (axis = -1). 
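# For dynamic k (k=None), k is fed as an extra rank-1 graph input and indexed inside build_model below.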
shape = np.random.randint(low=3, high=4, size=rank) if k is None: @make_tf_graph([shape, (1, tf.int32)]) def build_model(x, k): ref = tf.math.top_k(x, k=k[0], sorted=sort) if not sort: ref = (tf.sort(ref[0]), tf.sort(ref[1])) return ref else: @make_tf_graph([shape]) def build_model(x): ref = tf.math.top_k(x, k=k, sorted=sort) if not sort: ref = (tf.sort(ref[0]), tf.sort(ref[1])) return ref model, inputs, outputs = build_model input_values = [random_gen(shape, rand_min=-100, rand_max=100)] if k is None: input_values.append(np.random.randint(low=1, high=shape[-1], size=1, dtype=np.int32)) input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, minimum_deployment_target=ct.target.iOS16 if not sort else None, ) @pytest.mark.parametrize( "compute_unit, backend, shape, k", itertools.product( compute_units, backends, [(1, 3), (1, 10), (3, 50)], [1, 3, 20, None], # None denotes dynamic k ), ) def test_in_top_k(self, compute_unit, backend, shape, k): # TensorFlow only supports last dimension (axis = -1). batch_size, class_num = shape if k is None: @make_tf_graph([shape, (batch_size, tf.int32), (1, tf.int32)]) def build_model(predictions, targets, k): return tf.math.in_top_k(predictions=predictions, targets=targets, k=k[0]) else: @make_tf_graph([shape, (batch_size, tf.int32)]) def build_model(predictions, targets): return tf.math.in_top_k(predictions=predictions, targets=targets, k=k) model, inputs, outputs = build_model pred_values = random_gen(shape, rand_min=-2, rand_max=2) target_values = np.random.randint(class_num, size=batch_size).astype(np.int32) input_values = [pred_values, target_values] if k is None: input_values.append(np.random.randint(low=1, high=shape[-1], size=1, dtype=np.int32)) input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank, dynamic", itertools.product( compute_units, backends, (1, 3, 5), (True, False), ), ) def test_sort(self, compute_unit, backend, rank, dynamic): """ tf.sort dispatches to tf.math.top_k, and k = size of the axis to be sorted """ if platform.machine() == "x86_64" and dynamic: pytest.xfail("rdar://135843153 ([Bug] Models failed on x86_64 platform)") # Here we test the conversion of tf.sort(x, axis=0) # If dynamic, we prepend None to x shape as the dynamic shape axis if rank == 5 and dynamic: rank -= 1 shape = tuple(np.random.randint(low=3, high=8, size=rank)) tf_input_shape = (None,) + shape if dynamic else shape @make_tf_graph([tf_input_shape]) def build_model(x): return tf.sort(x, axis=0) model, inputs, outputs = build_model if dynamic: input_values = [random_gen((5,) + shape, rand_min=-100, rand_max=100)] else: input_values = [random_gen(shape, rand_min=-100, rand_max=100)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestConcat(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, op_version, rank, num_inputs", itertools.product( compute_units, backends, ['v1', 'v2'], list(range(6)), list(range(1, 4)), ), ) def test_concat(self, compute_unit, backend, op_version, rank, num_inputs): import random for axis in range(-rank, rank): input_shape = np.random.randint(low=1, high=4, size=rank) input_shapes = [input_shape.copy() for _ in range(num_inputs)] concat_axis_value = 
np.random.randint(low=1, high=3, size=num_inputs) for i, v in enumerate(concat_axis_value): input_shapes[i][axis] = concat_axis_value[i] @make_tf_graph(input_shapes) def build_model(*inputs): # add 3 additional tensors that contain a dimension of size 0 zero_shape = input_shape.copy() zero_shape[axis] = 0 const = [tf.constant([], shape=zero_shape) for _ in range(3)] values = inputs + tuple(const) values = list(values) random.shuffle(values) values = tuple(values) if op_version == 'v1': # The tf functions now use ConcatV2 under the hood, so create the v1 Concat op via raw_ops here res = tf.raw_ops.Concat(concat_dim=axis, values=values) elif op_version == 'v2': res = tf.raw_ops.ConcatV2(values=values, axis=axis) return res model, inputs, outputs = build_model input_values = [random_gen(shape) for shape in input_shapes] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestSplit(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, dynamic", itertools.product( compute_units, backends, [1, 2, 3, 4], [True, False] ), ) def test_split(self, compute_unit, backend, rank, dynamic): if backend[0] == "mlprogram" and compute_unit != ct.ComputeUnit.CPU_ONLY and dynamic: pytest.xfail("rdar://97398133 (TestSplit::test_split is failing on mlprogram + GPU + dynamic combination)") if _macos_version() < (13, 0) and (dynamic or (backend[0] == "mlprogram" and compute_unit != ct.ComputeUnit.CPU_ONLY)): pytest.skip("Issue fixed in iOS16/macOS13") input_shape1 = np.random.randint(low=1, high=3, size=rank) for axis in range(-rank, rank, 2): for split_num in range(2, input_shape1[axis] + 1, 2): if input_shape1[axis] % split_num != 0: continue tf_input_shape = list(input_shape1) if dynamic: axis1 = np.random.randint(low=0, high=rank) tf_input_shape[axis1] = None @make_tf_graph([tf_input_shape]) def build_model(x): res = tf.split(x, split_num, axis=axis) # Comment: If tf.split output is returned, there are no # get_tuple nodes. Some graph pass is needed.
Example: # # x = tf.placeholder(tf.float32, shape=input_shape1) # res = tf.split(x, 3, axis=0) # # res are ['split:0', 'split:1', 'split'] # # but node.outputs == ['gto_1', 'gto_2', 'gto_3'] import random random.shuffle(res) return tuple(res) model, inputs, outputs = build_model input_values = [random_gen(input_shape1)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, sizes", itertools.product( compute_units, backends, [[1, 1, 2], [0, 2, 2], [1, 0, 3], [2, 0, 1, 1, 0]] ), ) def test_split_with_sizes(self, compute_unit, backend, sizes): input_shape = (4, 2) @make_tf_graph([input_shape]) def build_model(x): res = tf.split(x, sizes, axis=0) # split sizes can contain 0s, and we skip those in outputs return tuple([res[i] for i in range(len(sizes)) if sizes[i] != 0]) model, inputs, outputs = build_model input_values = [random_gen(input_shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_splitv(self, compute_unit, backend): input_shape = [3, 2, 1] @make_tf_graph([input_shape]) def build_model(x): res = tf.split(x, [1, 2], axis=0) return res[0], res[1] model, inputs, outputs = build_model input_values = [random_gen(input_shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestStack(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends, ) ) def test_stack(self, compute_unit, backend): input_shape1 = [3, 1, 1] input_shape2 = [3, 1, 1] @make_tf_graph([input_shape1, input_shape2]) def build_model(x, y): return [tf.stack((x, y), axis=0), tf.stack((y, x), axis=-1)] model, inputs, outputs = build_model input_values = [random_gen(input_shape1), random_gen(input_shape2)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestUnstack(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, shape", itertools.product( compute_units, backends, [[3, 1], [4, 3]] ), ) def test_unstack(self, compute_unit, backend, shape): @make_tf_graph([shape]) def build_model(x): return tf.unstack(x, axis=1) model, inputs, outputs = build_model input_values = [random_gen(shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, shape", itertools.product( compute_units, backends, [[3, 1], [4, 3]] ), ) def test_unstack_and_stack(self, compute_unit, backend, shape): @make_tf_graph([shape]) def build_model(x): x = tf.unstack(x, axis=1) return tf.stack(x) model, inputs, outputs = build_model input_values = [random_gen(shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestPack(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, num_inputs", itertools.product(compute_units, backends, list(range(5)), list(range(1, 5))), ) def test_pack(self, compute_unit, backend, rank, num_inputs): if 
rank == 0: pytest.skip('Rank 0 not supported by CoreML runtime') shape = np.random.randint(low=1, high=4, size=rank) input_shapes = [shape[:] for _ in range(num_inputs)] @make_tf_graph(input_shapes) def build_model(*inputs): return tf.raw_ops.Pack(values=inputs, axis=0) model, inputs, outputs = build_model input_values = [ random_gen(shape, rand_min=-1, rand_max=1) for shape in input_shapes ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestArgSort(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, axis, direction", itertools.product( compute_units, backends, [rank for rank in range(1, 6)], [-1, 0], ["ascending", "descending"], ), ) def test_argsort(self, compute_unit, backend, rank, axis, direction): shape = np.random.randint(low=1, high=4, size=rank) dtype = np.float32 tf_dtype = tf.float32 @make_tf_graph([list(shape) + [tf_dtype]]) def build_model(x): return tf.argsort(x, axis=axis, direction=direction.upper()) model, inputs, outputs = build_model input_values = np.arange(np.prod(shape)) np.random.shuffle(input_values) input_values = [np.reshape(input_values, shape).astype(dtype)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestDepthToSpace(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, input_shape, block_size", itertools.product( compute_units, backends, [(1, 1, 1, 16), (1, 1, 1, 32), (1, 3, 3, 16)], [2, 4], ), ) def test_depth_to_space(self, compute_unit, backend, input_shape, block_size): @make_tf_graph([input_shape]) def build_model(x): return tf.nn.depth_to_space(x, block_size) model, inputs, outputs = build_model input_values = [random_gen(input_shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestExpandDims(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank_and_axis", itertools.product( compute_units, backends, [ (rank, axis) for rank in range(1, 5) for axis in range(-rank - 1, rank + 1) ], ), ) def test_expand_dims(self, compute_unit, backend, rank_and_axis): rank, axis = rank_and_axis input_shape = np.random.randint(low=2, high=4, size=rank) @make_tf_graph([input_shape]) def build_model(x): return tf.expand_dims(x, axis=axis) model, inputs, outputs = build_model input_values = [np.random.rand(*input_shape).astype(np.float32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestReshape(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, minimum_deployment_target", itertools.product(compute_units, backends, [None, ct.target.iOS17]), ) def test_flatten(self, compute_unit, backend, minimum_deployment_target): shapes = [[2, 2], [3, 2, 1, 2], [2, 1, 4, 3]] for input_shape in shapes: @make_tf_graph([input_shape]) def build_model(x): return tf.keras.backend.flatten(x) model, inputs, outputs = build_model input_values = [np.random.rand(*input_shape).astype(np.float32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, input_shape, 
minimum_deployment_target", itertools.product( compute_units, backends, [ ([10, 10], [5, 20]), ([3, 4, 5, 6], [4, 5, 3, 6]), ([4, 4, 5, 6], [2, 2, -1]), ], [None, ct.target.iOS17], ), ) def test_reshape_static(self, compute_unit, backend, input_shape, minimum_deployment_target): @make_tf_graph([input_shape[0]]) def build_model(x): return tf.reshape(x, shape=input_shape[1]) model, inputs, outputs = build_model input_values = [np.random.rand(*input_shape[0]).astype(np.float32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, input_shape, minimum_deployment_target", itertools.product( compute_units, backends, [ ([10, 10], [5, 20]), ([3, 4, 5, 6], [4, 5, 3, 6]), ([4, 4, 5, 6], [2, 2, -1]), ([2, 3, 5, 3], [2, -1]), ], [None, ct.target.iOS17], ), ) def test_reshape_dynamic(self, compute_unit, backend, input_shape, minimum_deployment_target): @make_tf_graph([input_shape[0], (len(input_shape[1]), tf.int32)]) def build_model(x, y): return tf.reshape(x, shape=y) model, inputs, outputs = build_model input_values = [ np.random.rand(*input_shape[0]).astype(np.float32), np.array(input_shape[1], dtype=np.int32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, shape, minimum_deployment_target", itertools.product( compute_units, backends, [[1], [1, 1], [1, 1, -1], []], [None, ct.target.iOS17] ), ) def test_reshape_scalar(self, compute_unit, backend, shape, minimum_deployment_target): pytest.skip('Rank 0 not supported by CoreML runtime') input_shape = () @make_tf_graph([input_shape]) def build_model(x): return tf.raw_ops.Reshape(tensor=x, shape=shape) model, inputs, outputs = build_model input_values = [np.random.rand(*input_shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, minimum_deployment_target=minimum_deployment_target, ) class TestShape(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, backends, [rank for rank in range(1, 6)], ), ) def test_shape(self, compute_unit, backend, rank): shape = np.random.randint(low=3, high=4, size=rank) shape_holder = [None] * rank @make_tf_graph([shape_holder]) def build_model(x): return tf.shape(x) model, inputs, outputs = build_model input_values = [random_gen(shape, rand_min=-100, rand_max=100)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestMatrixDiag(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, length, dynamic", itertools.product( compute_units, backends, [length for length in range(1, 5)], [True, False] ), ) def test(self, compute_unit, backend, length, dynamic): if dynamic: input_shape = np.random.randint(low=1, high=4, size=length) a, b = np.prod(input_shape[:2]), np.prod(input_shape[2:]) size = np.array([a,b]).astype(np.int32) reshape_shape = [2] @make_tf_graph([input_shape, reshape_shape+[tf.int32]]) def build_model(x, reshape): x = tf.reshape(x, reshape) x = tf.reshape(x, [-1]) return tf.raw_ops.MatrixDiag(diagonal=x) model, inputs, outputs = build_model input_values = 
[random_gen(input_shape, -1, 1), size] else: input_shape = [length] @make_tf_graph([input_shape]) def build_model(x): return tf.raw_ops.MatrixDiag(diagonal=x) model, inputs, outputs = build_model input_values = [random_gen(input_shape, -1, 1)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestReverse(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank_and_axes", itertools.product( compute_units, backends, [ (1, (-1,)), (2, (0,)), (2, (-1, 0)), (3, (1, -3)), (3, (-2,)), (3, (0, 1, 2)), (4, (-2, -1, 0)), (4, (-1, -2)), (4, []), (5, (-3, -1, 3)), (5, (0, -1, 1, -2)), ], ), ) def test_reverse(self, compute_unit, backend, rank_and_axes): rank, axes = rank_and_axes shape = np.random.randint(low=1, high=4, size=rank) @make_tf_graph([shape]) def build_model(x): return tf.reverse(x, axis=axes) model, inputs, outputs = build_model input_values = [random_gen(shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestReverseSequence(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, backends, [rank for rank in range(2, 6)] ), ) def test_reverse_sequence(self, compute_unit, backend, rank): shape = np.random.randint(low=1, high=4, size=rank) seq_axis = np.random.randint(low=1, high=rank) batch_axis = np.random.randint(low=0, high=seq_axis) lengths = np.random.randint(low=0, high=shape[seq_axis], size=shape[batch_axis]) @make_tf_graph([shape]) def build_model(x): return tf.reverse_sequence( x, seq_lengths=lengths, seq_axis=seq_axis, batch_axis=batch_axis ) model, inputs, outputs = build_model input_values = [random_gen(shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestSpaceToDepth(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, input_shape, block_size", itertools.product( compute_units, backends, [(1, 6, 6, 1), (1, 12, 12, 1), (1, 6, 6, 3)], [2, 3], ), ) def test_space_to_depth(self, compute_unit, backend, input_shape, block_size): @make_tf_graph([input_shape]) def build_model(x): return tf.nn.space_to_depth(x, block_size) model, inputs, outputs = build_model input_values = [random_gen(input_shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestSqueeze(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank_and_axes", itertools.product( compute_units, backends, [ (2, (1,)), (2, (0,)), (3, (1,)), (3, (0, -1)), (3, []), (4, (-1, 2, 1)), (4, (0, 1)), (5, (3, 1, 2)), (5, (-1,)), ], ), ) def test_squeeze(self, compute_unit, backend, rank_and_axes): rank, axes = rank_and_axes x_shape = np.random.randint(low=2, high=4, size=rank) for axis in axes: x_shape[axis] = 1 @make_tf_graph([x_shape]) def build_model(x): return tf.squeeze(x, axis=axes) model, inputs, outputs = build_model input_values = [np.random.rand(*x_shape).astype(np.float32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestTranspose(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank_and_perm", itertools.product( compute_units, 
backends, [ (1, (0,)), (2, (1, 0)), (2, (0, 1)), (3, (0, 2, 1)), (3, (2, 1, 0)), (3, (2, 0, 1)), (4, (0, 3, 2, 1)), (4, (3, 0, 1, 2)), (5, (2, 3, 1, 0, 4)), (5, (3, 1, 0, 4, 2)), ], ), ) def test_transpose_1(self, compute_unit, backend, rank_and_perm): rank, perm = rank_and_perm x_shape = np.random.randint(low=1, high=4, size=rank) @make_tf_graph([x_shape]) def build_model(x): return tf.transpose(x, perm=perm) model, inputs, outputs = build_model input_values = [random_gen(x_shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, backends, [1, 2, 3, 4], ), ) def test_transpose_2(self, compute_unit, backend, rank): input_shape = np.random.randint(low=1, high=4, size=rank) perm = np.random.permutation(rank) def static_perm(): @make_tf_graph([input_shape]) def build_model(x): return tf.transpose(x, perm=perm) model, inputs, outputs = build_model input_values = [random_gen(input_shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) def dynamic_perm(): @make_tf_graph([input_shape, list(perm.shape) + [tf.int32]]) def build_model(x, tf_perm): return tf.transpose(x, perm=tf_perm) model, inputs, outputs = build_model input_values = [random_gen(input_shape), perm.astype(np.int32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) static_perm() # Note that TF supports dynamic perm in tf.transpose. with pytest.raises(ValueError, match=r".*must be const at compile time.*"): dynamic_perm() @pytest.mark.parametrize( "compute_unit, backend, rank_and_perm", itertools.product( compute_units, backends, [ (2, (0, 1)), (3, (0, 2, 1)), ], ), ) def test_transpose_after_another_op(self, compute_unit, backend, rank_and_perm): rank, perm = rank_and_perm x_shape = np.random.randint(low=1, high=4, size=rank) @make_tf_graph([x_shape]) def build_model(x): # Test transpose operations after another operation that may return symbolic value # in value_inference implementation (e.g. concat) - see issue #1556 x = tf.concat([x, x], axis=-1) return tf.transpose(x, perm=perm) model, inputs, outputs = build_model input_values = [random_gen(x_shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, backends, [1, 3], ), ) def test_redundant_transpose(self, compute_unit, backend, rank): import random random.seed(10) input_shape = np.random.randint(low=1, high=4, size=rank) num_layers = 30 perms = [] for _ in range(num_layers): perm = list(range(rank)) random.shuffle(perm) perms.append(perm) @make_tf_graph([input_shape]) def build_model(x): net = x for perm in perms: net = tf.transpose(net, perm=perm) return net model, inputs, outputs = build_model input_values = [random_gen(input_shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestSpaceToBatchND(TensorFlowBaseTest): # No direct mil smoke test since it's a TF op which is a composite of several ops. 
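# (SpaceToBatchND zero-pads the spatial dims per `paddings` and then moves block_shape-sized tiles from the spatial dims into the batch dim.)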
@pytest.mark.parametrize( "compute_unit, backend, input_shape, block_shape, paddings, dynamic_paddings", itertools.product( compute_units, backends, [(1, 4, 4, 1), (1, 4, 4, 3), (2, 4, 6, 1)], [[2, 2]], [[[0, 0], [0, 0]], [[1, 1], [0, 2]], [[4, 2], [4, 2]]], [True, False], ), ) def test_smoke( self, compute_unit, backend, input_shape, block_shape, paddings, dynamic_paddings ): paddings = np.array(paddings, dtype=np.int32) if dynamic_paddings: @make_tf_graph([input_shape, (2, 2, tf.int32)]) def build_model(x, paddings): return tf.raw_ops.SpaceToBatchND( input=x, block_shape=block_shape, paddings=paddings ) else: @make_tf_graph([input_shape]) def build_model(x): return tf.raw_ops.SpaceToBatchND( input=x, block_shape=block_shape, paddings=paddings ) model, inputs, outputs = build_model if dynamic_paddings: input_values = [random_gen(input_shape), paddings] else: input_values = [random_gen(input_shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, shape_block_paddings, dynamic_input, dynamic_paddings", itertools.product( compute_units, backends, [ [(1, 4, 6, 2, 2), [2, 3], [[2, 0], [3, 6]]], [(2, 4, 6, 1), [1, 2], [[2, 1], [3, 3]]], [(2, 4, 6, 1, 2), [2, 1], [[0, 0],[0, 0]]], [(2, 4, 6, 1, 2), [2], [[0, 0]]], ], [True, False], [True, False], ), ) def test_smoke_new_op( self, compute_unit, backend, shape_block_paddings, dynamic_input, dynamic_paddings ): input_shape, block_shape, paddings = shape_block_paddings paddings = np.array(paddings, dtype=np.int32) # The neuralnetwork backend doesn't support these tests if backend[0] == "neuralnetwork": return tf_input_shape = input_shape if not dynamic_input else [None] * len(input_shape) if dynamic_paddings: @make_tf_graph([tf_input_shape, (*paddings.shape, tf.int32)]) def build_model(x, paddings): return tf.raw_ops.SpaceToBatchND( input=x, block_shape=block_shape, paddings=paddings ) else: @make_tf_graph([tf_input_shape]) def build_model(x): return tf.raw_ops.SpaceToBatchND( input=x, block_shape=block_shape, paddings=paddings ) model, inputs, outputs = build_model if dynamic_paddings: input_values = [random_gen(input_shape), paddings] else: input_values = [random_gen(input_shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, input_block_rank, dynamic_input, dynamic_paddings", itertools.product( compute_units, backends, [(3, 1), (3, 2), (4, 1)], [True, False], [True, False], ), ) def test_programmatic( self, compute_unit, backend, input_block_rank, dynamic_input, dynamic_paddings ): input_rank, block_rank = input_block_rank # generate data input_shape = np.random.randint(low=1, high=4, size=input_rank) block_shape = np.random.randint(low=1, high=3, size=block_rank) if backend[0] == "neuralnetwork": if block_rank == 2 and block_shape[0] != block_shape[1]: pytest.skip("neuralnetwork backend doesn't support unequal block shape.") if block_shape[0] == 1: pytest.skip("neuralnetwork backend doesn't support unity block shape.") if input_block_rank == (4, 1) and dynamic_input and not dynamic_paddings: pytest.xfail("rdar://133558007 shape deduction failure") paddings = [] for i in range(block_rank): while True: temp = np.random.randint(low=0, high=10, size=2) if (np.sum(temp) + input_shape[i + 1]) % block_shape[i] == 0: paddings.append(temp) break 
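# The loop above rejection-samples each padding pair so that the padded spatial extent is divisible by the corresponding block size, as SpaceToBatchND requires.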
paddings = np.array(paddings, dtype=np.int32) tf_input_shape = input_shape if not dynamic_input else [None] * len(input_shape) if dynamic_paddings: @make_tf_graph([tf_input_shape, (*paddings.shape, tf.int32)]) def build_model(x, paddings): return tf.raw_ops.SpaceToBatchND( input=x, block_shape=block_shape, paddings=paddings ) else: @make_tf_graph([tf_input_shape]) def build_model(x): return tf.raw_ops.SpaceToBatchND( input=x, block_shape=block_shape, paddings=paddings ) model, inputs, outputs = build_model if dynamic_paddings: input_values = [random_gen(input_shape), paddings] else: input_values = [random_gen(input_shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestBatchToSpaceND(TensorFlowBaseTest): # No direct mil smoke test since it's a TF op which is a composite of several ops. @pytest.mark.parametrize( "compute_unit, backend, input_shape, block_size, crops, dynamic_crops", itertools.product( compute_units, backends, [(4, 4, 4, 1), (4, 4, 4, 3), (4, 4, 6, 1)], [[2, 2]], [[[0, 0], [0, 0]], [[1, 1], [0, 2]], [[4, 2], [4, 2]]], [True, False], ), ) def test_smoke(self, compute_unit, backend, input_shape, block_size, crops, dynamic_crops): if dynamic_crops: @make_tf_graph([input_shape, (2, 2, tf.int32)]) def build_model(x, y): return tf.raw_ops.BatchToSpaceND(input=x, block_shape=block_size, crops=y) else: @make_tf_graph([input_shape]) def build_model(x): return tf.raw_ops.BatchToSpaceND(input=x, block_shape=block_size, crops=crops) model, inputs, outputs = build_model if dynamic_crops: input_values = [random_gen(input_shape), np.array(crops, np.int32)] else: input_values = [random_gen(input_shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, input_block_rank, dynamic_input, dynamic_crops", itertools.product( compute_units, backends, [(3, 1), (3, 2), (4, 1), (4, 2)], [True, False], [True, False], ), ) def test_programmatic( self, compute_unit, backend, input_block_rank, dynamic_input, dynamic_crops ): if ( platform.machine() == "x86_64" and input_block_rank == (3, 1) and dynamic_input and not dynamic_crops ): pytest.xfail("rdar://135843153 ([Bug] Models failed on x86_64 platform)") input_rank, block_rank = input_block_rank # generate data input_shape = np.random.randint(low=1, high=4, size=input_rank) block_shape = np.random.randint(low=1, high=3, size=block_rank) if backend[0] == "neuralnetwork": if block_rank == 2 and block_shape[0] != block_shape[1]: pytest.skip("neuralnetwork backend doesn't support unequal block shape.") if block_shape[0] == 1: pytest.skip("neuralnetwork backend doesn't support unity block shape.") input_shape[0] = input_shape[0] * np.prod(block_shape) crops = [] for i in range(block_rank): while True: temp = np.random.randint(low=0, high=4, size=2) if np.sum(temp) < input_shape[i + 1] * block_shape[i]: crops.append(temp) break crops = np.array(crops, dtype=np.int32) tf_input_shape = [None] * input_rank if dynamic_input else input_shape if dynamic_crops: @make_tf_graph([tf_input_shape, (*crops.shape, tf.int32)]) def build_model(x, crops): return tf.raw_ops.BatchToSpaceND( input=x, block_shape=block_shape, crops=crops ) else: @make_tf_graph([tf_input_shape]) def build_model(x): return tf.raw_ops.BatchToSpaceND( input=x, block_shape=block_shape, crops=crops ) model, inputs, outputs = build_model 
if dynamic_crops: input_values = [random_gen(input_shape), crops] else: input_values = [random_gen(input_shape)] input_dict = dict(zip(inputs, input_values)) # Before rdar://93071454 (batch_to_space is error out in espresso for dynamic inputs cormel model) is fixed, # we need to specify the default shape for the dynamic model by setting inputs_for_conversion input_names = get_tf_node_names(inputs, mode="inputs") if dynamic_input: shape = tuple( [ RangeDim(default=dim, upper_bound=dim if backend[0] == "mlprogram" else -1) for dim in input_shape ] ) inputs_for_conversion = [TensorType(shape=shape, name=input_names[0], dtype=np.float32)] else: inputs_for_conversion = [ TensorType(shape=tuple(input_shape), name=input_names[0], dtype=np.float32) ] if dynamic_crops: inputs_for_conversion += [ TensorType(shape=crops.shape, name=input_names[1], dtype=np.int32) ] TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, inputs_for_conversion=inputs_for_conversion, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, shape_block_crops, dynamic_input, dynamic_crops", itertools.product( compute_units, backends, [ [(6, 4, 6, 2, 2), [2, 3], [[2, 0], [3, 6]]], [(4, 4, 6, 1), [1, 2], [[2, 1], [3, 3]]], [(4, 4, 6, 1, 2), [2, 1], [[0, 0],[0, 0]]], [(4, 4, 6, 1, 2), [2], [[0, 0]]], ], [True, False], [True, False], ), ) def test_smoke_new_op( self, compute_unit, backend, shape_block_crops, dynamic_input, dynamic_crops ): input_shape, block_shape, crops = shape_block_crops crops = np.array(crops, dtype=np.int32) # The neuralnetwork backend doesn't support these tests if backend[0] == "neuralnetwork": return tf_input_shape = input_shape if not dynamic_input else [None] * len(input_shape) if dynamic_crops: @make_tf_graph([tf_input_shape, (*crops.shape, tf.int32)]) def build_model(x, crops): return tf.raw_ops.BatchToSpaceND(input=x, block_shape=block_shape, crops=crops) else: @make_tf_graph([tf_input_shape]) def build_model(x): return tf.raw_ops.BatchToSpaceND(input=x, block_shape=block_shape, crops=crops) model, inputs, outputs = build_model # Before rdar://93071454 (batch_to_space is error out in espresso for dynamic inputs cormel model) is fixed, # we need to specify the default shape for the dynamic model by setting inputs_for_conversion input_names = get_tf_node_names(inputs, mode="inputs") if dynamic_input: shape = tuple( [ RangeDim(default=dim, upper_bound=dim if backend[0] == "mlprogram" else -1) for dim in input_shape ] ) inputs_for_conversion = [TensorType(shape=shape, name=input_names[0], dtype=np.float32)] else: inputs_for_conversion = [ TensorType(shape=tuple(input_shape), name=input_names[0], dtype=np.float32) ] if dynamic_crops: inputs_for_conversion += [ TensorType(shape=crops.shape, name=input_names[1], dtype=np.int32) ] input_values = [random_gen(input_shape), crops] else: input_values = [random_gen(input_shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, inputs_for_conversion=inputs_for_conversion, backend=backend, ) @pytest.mark.skipif(_HAS_TF_2, reason="Fix and re-enable this test: rdar://76293949 (TF2 unit test InvalidArgumentError)") class TestTensorArray(TensorFlowBaseTest): @staticmethod def get_dynamic_elem_shape_model(): elem_shape = (None, None) @make_tf_graph([elem_shape]) def build_model(x): ta = tf.TensorArray(dtype=tf.float32, size=0, dynamic_size=True) ta = ta.write(10, x) ta = ta.write(9, x) ta = ta.scatter([3], tf.expand_dims(x, 0)) ta = 
ta.scatter([8], tf.expand_dims(x, 0)) return ta.stack() return build_model @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends,) ) def test_tf_basic(self, compute_unit, backend): # TF1: TensorArrayV3, TensorArrayWriteV3, TensorArrayScatterV3, # TensorArraySizeV3, TensorArrayGatherV3 # TF2: TensorListReserve, TensorListLength, TensorListSetItem, # TensorListScatterIntoExistingList, TensorListStack, # TensorListResize elem_shape = (3, 2) @make_tf_graph([elem_shape]) def build_model(x): ta = tf.TensorArray(dtype=tf.float32, size=1, dynamic_size=True) ta = ta.write(2, x) # TensorArray has write-once semantics, and thus we write to a new # index # (https://www.tensorflow.org/api_docs/python/tf/TensorArray) # writing to out of bound index ta = ta.scatter([3], tf.expand_dims(x, 0)) # writing to in-bound index ta = ta.scatter([0], tf.expand_dims(x, 0)) return ta.stack() model, inputs, outputs = build_model input_values = [random_gen(elem_shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_tf_dynamic_elem_shape(self, compute_unit, backend): # TF1: TensorArrayV3, TensorArrayWriteV3, TensorArrayScatterV3, # TensorArraySizeV3, TensorArrayGatherV3 # TF2: TensorListReserve, TensorListLength, TensorListSetItem, # TensorListScatterIntoExistingList, TensorListStack, # TensorListResize model, inputs, outputs = TestTensorArray.get_dynamic_elem_shape_model() input_values = [random_gen((2, 3))] input_dict = dict(zip(inputs, input_values)) _, mlmodel, _, _, _, _ = TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend) # Once rdar://76293949 (TF2 unit test InvalidArgumentError) is fixed, the following milproto frontend tests should be removed from coremltools.converters.mil.frontend.milproto.test_load import \ roundtrip_and_compare_mlmodel if backend[0] != "mlprogram": pytest.skip("milproto front end only supported in mlprogram") roundtrip_and_compare_mlmodel(mlmodel, {"Placeholder": input_values[0]}) @pytest.mark.skip( reason="[NNv2 TensorArray scatter returns wrong result](rdar://63345281)" ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends,) ) def test_tf_while_loop(self, compute_unit, backend): @make_tf_graph([(3, 2)]) def build_model(x): def body(i, num_iters, array, update): return i + 1, num_iters, array.write(i, update), update def cond(i, num_iters, array, update): return i < num_iters i = 0 max_iters = 3 ta = tf.TensorArray(dtype=tf.float32, size=1, dynamic_size=True) _, _, new_ta, _ = tf.while_loop(cond, body, [i, max_iters, ta, x]) new_ta = new_ta.scatter([max_iters], tf.expand_dims(x, 0)) return new_ta.stack() model, inputs, outputs = build_model input_values = [random_gen(shape=(3, 2))] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestBroadcastTo(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, shapes, is_dynamic", itertools.product( compute_units, backends, [ ((2,), (2,)), ((1,), (10,)), ((3,), (3, 3)), ((1, 1), (1, 4)), ((1, 1, 5), (3, 4, 4, 4, 5)), ((3,), (1, 3, 2, 1, 3)), ((3, 5), (2, 3, 5)), ((1, 2), (2, 3, 1, 2)), ((1, 3, 1, 4), (8, 3, 32, 4)), ((2, 16), (3, 1, 4, 2, 16)), ], [False], ), ) def test(self, 
compute_unit, backend, shapes, is_dynamic): input_shape, output_shape = shapes if is_dynamic is False: @make_tf_graph([input_shape]) def build_model(x): return tf.broadcast_to(x, output_shape) else: # output / target shape is an input (placeholder) @make_tf_graph([input_shape, (len(output_shape), tf.int32)]) def build_model(x, shape): return tf.broadcast_to(x, shape) model, inputs, outputs = build_model if is_dynamic is False: input_values = [random_gen(input_shape)] else: input_values = [ random_gen(input_shape), np.array(output_shape, dtype=np.int32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.skipif(not _HAS_TF_1, reason=MSG_TF1_NOT_FOUND) class TestContribLSTMBlockCell(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, batch, return_hc_only, has_peephole, has_clip", itertools.product( compute_units, backends, [1, 2], [True, False], [True, False], [True, False], ), ) def test_tf_no_variable( self, compute_unit, batch, backend, return_hc_only, has_peephole, has_clip ): """ If return_hc_only == True, the op can be mapped to mb.lstm. Otherwise it has to be expanded. """ # _lstm_block_cell allows fine-grained control of W, peephole etc from tensorflow.contrib.rnn.python.ops.lstm_ops import _lstm_block_cell input_dim, hidden_dim = 2, 3 x_shape = (batch, input_dim) init_h = np.random.rand(batch, hidden_dim).astype(np.float32) init_c = np.random.rand(batch, hidden_dim).astype(np.float32) with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=x_shape) res = _lstm_block_cell( x, tf.constant(init_c), tf.constant(init_h), w=tf.constant( np.random.rand(input_dim + hidden_dim, 4 * hidden_dim).astype( np.float32 ) ), b=tf.constant(np.random.rand(4 * hidden_dim).astype(np.float32)), use_peephole=has_peephole, wci=tf.constant(np.random.rand(hidden_dim).astype(np.float32)), wcf=tf.constant(np.random.rand(hidden_dim).astype(np.float32)), wco=tf.constant(np.random.rand(hidden_dim).astype(np.float32)), forget_bias=np.random.rand(), cell_clip=np.random.rand() if has_clip else -1, ) if return_hc_only: # All other outputs aren't supported by mb.lstm. res = res[1], res[6] TensorFlowBaseTest.run_compare_tf( graph, {x: np.random.rand(*x_shape).astype(np.float32),}, res, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, batch", itertools.product(compute_units, backends, [1, 2],), ) def test_tf_lstm_block_cell(self, compute_unit, backend, batch): # tf.contrib.rnn.LSTMBlockCell runs a single step of an LSTM. It needs to be wrapped # inside a for loop to handle inputs with sequence length more than 1. 
In that case, use # tf.contrib.rnn.LSTMBlockFusedCell input_dim, hidden_dim = 2, 3 x_shape = (batch, input_dim) init_h = np.random.rand(batch, hidden_dim).astype(np.float32) init_c = np.random.rand(batch, hidden_dim).astype(np.float32) with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=x_shape) rnn_cell = tf.contrib.rnn.LSTMBlockCell( hidden_dim, use_peephole=True, forget_bias=np.random.rand() ) res = rnn_cell(x, (init_h, init_c)) cs_new, h_new = res[1][0], res[1][1] res = [h_new, cs_new] # shape of h_new, cs_new: (batch_dim, hidden_dim) TensorFlowBaseTest.run_compare_tf( graph, {x: np.random.rand(*x_shape).astype(np.float32),}, res, compute_unit=compute_unit, backend=backend, # variable needs to be frozen freeze_graph=True, ) @pytest.mark.parametrize( "compute_unit, backend, batch_size", itertools.product(compute_units, backends, [1, 2],), ) def test_tf_lstm_block_fused_cell(self, compute_unit, backend, batch_size): # tf.contrib.rnn.LSTMBlockFusedCell runs an LSTM over a sequence of inputs input_dim, hidden_dim = 4, 3 seq_length = 5 init_h = np.zeros((batch_size, hidden_dim)).astype(np.float32) init_c = np.zeros((batch_size, hidden_dim)).astype(np.float32) x_shape = (seq_length, batch_size, input_dim) with tf.Graph().as_default() as graph: lstm_cell = tf.contrib.rnn.LSTMBlockFusedCell( num_units=hidden_dim, forget_bias=2.0, cell_clip=None, use_peephole=False, ) x = tf.placeholder(tf.float32, shape=x_shape) # shape of output: (seq_length, batch_size, hidden_dim) # shape of output_state: Tuple of shape ((batch_size, hidden_dim), (batch_size, hidden_dim)) output, output_state = lstm_cell( inputs=x, initial_state=(init_c, init_h), ) output = tf.nn.relu(output) res = TensorFlowBaseTest.run_compare_tf( graph, {x: np.random.rand(*x_shape).astype(np.float32),}, output, compute_unit=compute_unit, backend=backend, # variable needs to be frozen freeze_graph=True, ) # check that the resulting program has the LSTM block as a fused op coreml_model = res[1] mil_prog = coreml_model._get_mil_internal() assert len(mil_prog.find_ops(op_type="lstm")) == 1 @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends,), ) def test_tf_multiple_lstm_block_fused_cell(self, compute_unit, backend): ''' Define a network with a stack of fused LSTM ops: %input (shape: (Seq, Batch, idim) == (5, 2, 4)) %x1 = LSTM(h=10) (%input) # shape = (5, 2, 10) %x2 = LSTM(h=20) (%x1) # shape = (5, 2, 20) %x3 = slice()(%x2) # shape = (1, 2, 20), to get the final seq value %x4 = reshape((1, -1)) (%x3) # shape = (1, 40) %x5 = Dense(h=3)(%x4) # shape = (1, 3) ''' input_dim = 4 seq_length = 5 batch_size = 2 x_shape = (seq_length, batch_size, input_dim) with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=x_shape) # shape = (5, 2, 4) lstm_cell_1 = tf.contrib.rnn.LSTMBlockFusedCell(num_units=10) x1, _ = lstm_cell_1(x, dtype=tf.float32) # shape = (5, 2, 10) lstm_cell_2 = tf.contrib.rnn.LSTMBlockFusedCell(num_units=20) x2 , _ = lstm_cell_2(x1, dtype=tf.float32) # shape = (5, 2, 20) x3 = tf.slice(x2, begin=[4, 0, 0], size=[1, 2, 20]) # shape = [1, 2, 20] x4 = tf.reshape(x3, shape=(1, -1)) # shape = [1, 40] x5 = tf.linalg.matmul(x4, tf.constant(np.arange(1, 40*3, dtype=np.float32), shape=[40, 3])) # shape: [1, 3] res = TensorFlowBaseTest.run_compare_tf( graph, {x: np.random.rand(*x_shape).astype(np.float32),}, x5, compute_unit=compute_unit, backend=backend, # variable needs to be frozen freeze_graph=True, ) # check that the resulting program has the LSTM block ops 
as fused ops coreml_model = res[1] mil_prog = coreml_model._get_mil_internal() assert len(mil_prog.find_ops(op_type="lstm")) == 2 @pytest.mark.skipif(not _HAS_TF_1, reason=MSG_TF1_NOT_FOUND) class TestVariable(TensorFlowBaseTest): @pytest.mark.xfail(reason="Investigate get_global ", run=False) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends,) ) def test_tf_no_variable(self, compute_unit, backend): with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=[1,], name="input") y = tf.Variable([1.0], dtype=tf.float32, name="y") # We set our assign op assign_op = tf.assign(y, y + 10) with tf.control_dependencies([assign_op]): res = tf.multiply(x, y, name="output") TensorFlowBaseTest.run_compare_tf( graph, {x: np.random.rand(1).astype(np.float32),}, res, compute_unit=compute_unit, backend=backend, ) class TestZerosLike(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, dynamic", itertools.product( compute_units, backends, [rank for rank in range(5)], [True, False], ), ) def test(self, compute_unit, backend, rank, dynamic): if rank == 0: pytest.skip('Rank 0 not supported by CoreML runtime') input_shape = np.random.randint(low=2, high=4, size=rank) input_value = random_gen(input_shape, rand_min=-1, rand_max=1) if dynamic: a, b = np.prod(input_shape[:2]), np.prod(input_shape[2:]) reshape_vals = np.array([a, b], dtype=np.int32) reshape_input_shape = np.array([2], dtype=np.int32) @make_tf_graph([input_shape, list(reshape_input_shape) + [tf.int32]]) def build_model(x, reshape): x = tf.reshape(x, shape=reshape) return tf.raw_ops.ZerosLike(x=x) model, inputs, outputs = build_model input_values = [input_value, reshape_vals] else: @make_tf_graph([input_shape]) def build_model(x): return tf.raw_ops.ZerosLike(x=x) model, inputs, outputs = build_model input_values = [input_value] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestIsFinite(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, dynamic", itertools.product(compute_units, backends, [rank for rank in range(1, 5)], [True, False]), ) def test(self, compute_unit, backend, rank, dynamic): def _generate_num_with_inf(input_shape): res = random_gen(input_shape, rand_min=-1, rand_max=1) random_map = np.random.choice([np.inf, -np.inf, 0], size=input_shape) if len(input_shape) == 0: return random_map.astype(np.float32) res[np.where(random_map == np.inf)] = np.inf res[np.where(random_map == -np.inf)] = -np.inf return res.astype(np.float32) input_shape = np.random.randint(low=2, high=4, size=rank) input_value = _generate_num_with_inf(input_shape) if dynamic: reshape_shape = [2, tf.int32] if len(input_shape) == 0: reshape_value = np.array([1, 1], dtype=np.int32) else: reshape_value = np.array( [input_shape[0], np.prod(input_shape[1:])], dtype=np.int32 ) @make_tf_graph([input_shape, reshape_shape]) def build_model(x, reshape): x = tf.reshape(x, reshape) x = tf.raw_ops.IsFinite(x=x) return tf.raw_ops.Cast(x=x, DstT=tf.float32) model, inputs, outputs = build_model input_values = [input_value, reshape_value] else: @make_tf_graph([input_shape]) def build_model(x): x = tf.raw_ops.IsFinite(x=x) return tf.raw_ops.Cast(x=x, DstT=tf.float32) model, inputs, outputs = build_model input_values = [input_value] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, backend=backend, 
compute_unit=compute_unit, ) class TestLogSoftMax(TensorFlowBaseTest): @pytest.mark.parametrize( 'compute_unit, backend', itertools.product( compute_units, backends, ), ) def test(self, compute_unit, backend): input_shape = (5, 20) input_value = random_gen(input_shape, rand_min=-1, rand_max=1) @make_tf_graph([input_shape]) def build_model(x): return tf.math.log_softmax(x) model, inputs, outputs = build_model input_values = [input_value] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( 'compute_unit, backend', itertools.product( compute_units, backends, ), ) def test_numerical_stability(self, compute_unit, backend): input_shape = (4,) input_value = np.array([10, 2, 10000, 4], dtype=np.float32) @make_tf_graph([input_shape]) def build_model(x): return tf.math.log_softmax(x) model, inputs, outputs = build_model input_values = [input_value] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestClipByValue(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, min_and_max, minimum_deployment_target", itertools.product( compute_units, backends, [rank for rank in range(5)], [(-1, 1), (-1, -1), (1, 2), (-3, -2)], [None, ct.target.iOS17], ), ) def test(self, compute_unit, backend, rank, min_and_max, minimum_deployment_target): if rank == 0: pytest.skip('Rank 0 not supported by CoreML runtime') input_shape = np.random.randint(low=2, high=4, size=rank) min_val, max_val = min_and_max input_value = random_gen(input_shape, rand_min=min_val-1, rand_max=max_val+1) @make_tf_graph([input_shape]) def build_model(x): return tf.raw_ops.ClipByValue(t=x, clip_value_min=min_val, clip_value_max=max_val) model, inputs, outputs = build_model input_values = [input_value] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, minimum_deployment_target=minimum_deployment_target, ) class TestSize(TensorFlowBaseTest): @pytest.mark.parametrize( 'compute_unit, backend, rank, dynamic', itertools.product( compute_units, backends, [rank for rank in range(5)], [True, False], ), ) def test(self, compute_unit, backend, rank, dynamic): if rank == 0: pytest.skip('Rank 0 not supported by CoreML runtime') input_shape = np.random.randint(low=2, high=4, size=rank) input_value = random_gen(input_shape, rand_min=-1, rand_max=1) if dynamic: a, b = np.prod(input_shape[:2]), np.prod(input_shape[2:]) reshape_vals = np.array([a,b], dtype=np.int32) reshape_input_shape = np.array([2], dtype=np.int32) @make_tf_graph([input_shape, list(reshape_input_shape)+[tf.int32]]) def build_model(x, reshape): x = tf.reshape(x, shape=reshape) return tf.raw_ops.Size(input=x) model, inputs, outputs = build_model input_values = [input_value, reshape_vals] else: @make_tf_graph([input_shape]) def build_model(x): return tf.raw_ops.Size(input=x) model, inputs, outputs = build_model input_values = [input_value] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestAudioSpectrogram(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, params, magnitude_squared", itertools.product( compute_units, backends, [ ((100, 2), 5, 10), ((50, 1), 18, 2), ((512, 1), 512, 320), ], [True, False], ), ) def 
test_audio_spectrogram(self, compute_unit, backend, params, magnitude_squared): input_shape = params[0] window_size = params[1] stride = params[2] @make_tf_graph([input_shape]) def build_model(x): y = tf.raw_ops.AudioSpectrogram(input=x, window_size=window_size, stride=stride, magnitude_squared=magnitude_squared) return y model, inputs, outputs = build_model input_values = [(2 * np.random.rand(*input_shape) - 1).astype(np.float32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestMfcc(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, params", itertools.product( compute_units, backends, [ ((100, 2), 5, 10, 8000, (40, 4000), 20, 13), ((50, 1), 18, 2, 4000, (20, 1500), 40, 26), ((512, 1), 512, 320, 16000, (20, 8000), 40, 26), ], ), ) def test_mfcc(self, compute_unit, backend, params): if backend == ("mlprogram", "fp16"): pytest.xfail("rdar://80660411 (MFCC FP16 unit tests failing in TF1 converter with numerical errors)") input_shape = params[0] window_size = params[1] stride = params[2] sample_rate = params[3] lower_frequency_limit, upper_frequency_limit = params[4] filterbank_channel_count = params[5] dct_coefficient_count = params[6] @make_tf_graph([input_shape]) def build_model(x): y = tf.raw_ops.AudioSpectrogram(input=x, window_size=window_size, stride=stride, magnitude_squared=True) y_out = tf.raw_ops.Mfcc(spectrogram=y, sample_rate=sample_rate, upper_frequency_limit=upper_frequency_limit, lower_frequency_limit=lower_frequency_limit, filterbank_channel_count=filterbank_channel_count, dct_coefficient_count=dct_coefficient_count) return y_out model, inputs, outputs = build_model input_values = [(2 * np.random.rand(*input_shape) - 1).astype(np.float32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestComplex(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, input_shape", # Placeholder doesn't support rank-0 input, so we don't use empty shape here. 
itertools.product(compute_units, backends, [[1], [2, 3], [4, 1, 5]]), ) def test_complex_basic(self, compute_unit, backend, input_shape): x_shape = input_shape y_shape = input_shape @make_tf_graph([x_shape, y_shape]) def build_model(x, y): complex_data = tf.complex(x, y) return tf.stack([tf.math.real(complex_data), tf.math.imag(complex_data)]) model, inputs, outputs = build_model input_values = [ np.random.rand(*x_shape).astype(np.float32), np.random.rand(*y_shape).astype(np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestReal(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, input_shape", itertools.product(compute_units, backends, [[1], [2, 3], [4, 1, 5]]), ) def test_real_real_input(self, compute_unit, backend, input_shape): @make_tf_graph([input_shape]) def build_model(x): return tf.math.real(x) model, inputs, outputs = build_model input_values = [np.random.rand(*input_shape).astype(np.float32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend, input_shape", itertools.product(compute_units, backends, [[1], [2, 3], [4, 1, 5]]), ) def test_real_complex_input(self, compute_unit, backend, input_shape): x_shape = input_shape y_shape = input_shape @make_tf_graph([x_shape, y_shape]) def build_model(x, y): return tf.math.real(tf.complex(x, y)) model, inputs, outputs = build_model input_values = [ np.random.rand(*x_shape).astype(np.float32), np.random.rand(*y_shape).astype(np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestImag(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, input_shape", itertools.product(compute_units, backends, [[1], [2, 3], [4, 1, 5]]), ) def test_imag_real_input(self, compute_unit, backend, input_shape): @make_tf_graph([input_shape]) def build_model(x): return x + tf.math.imag(x) model, inputs, outputs = build_model input_values = [np.random.rand(*input_shape).astype(np.float32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend, input_shape", itertools.product(compute_units, backends, [[1], [2, 3], [4, 1, 5]]), ) def test_imag_complex_input(self, compute_unit, backend, input_shape): x_shape = input_shape y_shape = input_shape @make_tf_graph([x_shape, y_shape]) def build_model(x, y): return tf.math.imag(tf.complex(x, y)) model, inputs, outputs = build_model input_values = [ np.random.rand(*x_shape).astype(np.float32), np.random.rand(*y_shape).astype(np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestFft(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, input_shape", itertools.product(compute_units, backends, [[1], [2, 3], [4, 1, 5]]), ) def test_fft_basic(self, compute_unit, backend, input_shape): # No need to test other parameter combinations because tf.signal.fft doesn't provide API to # control more fine-grained params such as "n,dim,norm" in PyTorch. 
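        # A Core ML model cannot return complex tensors directly (see
        # test_fft_directly_output_error below, which expects the error
        # "MIL doesn't support complex data as model's output"), so these tests
        # stack the real and imaginary parts into a single real-valued output:
        #     fft_res = tf.signal.fft(complex_data)
        #     return tf.stack([tf.math.real(fft_res), tf.math.imag(fft_res)])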
x_shape = input_shape y_shape = input_shape @make_tf_graph([x_shape, y_shape]) def build_model(x, y): complex_data = tf.complex(x, y) fft_res = tf.signal.fft(complex_data) return tf.stack([tf.math.real(fft_res), tf.math.imag(fft_res)]) model, inputs, outputs = build_model input_values = [ np.random.rand(*x_shape).astype(np.float32), np.random.rand(*y_shape).astype(np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_fft_directly_output_error(self, compute_unit, backend): x_shape = [2, 3] y_shape = [2, 3] @make_tf_graph([x_shape, y_shape]) def build_model(x, y): complex_data = tf.complex(x, y) return tf.signal.fft(complex_data) model, inputs, outputs = build_model input_values = [ np.random.rand(*x_shape).astype(np.float32), np.random.rand(*y_shape).astype(np.float32), ] input_dict = dict(zip(inputs, input_values)) with pytest.raises( ValueError, match="MIL doesn't support complex data as model's output" ): TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend, input_shape", itertools.product(compute_units, backends, [[1], [2, 3], [4, 1, 5]]), ) def test_fft_nested(self, compute_unit, backend, input_shape): x_shape = input_shape y_shape = input_shape @make_tf_graph([x_shape, y_shape]) def build_model(x, y): complex_data = tf.complex(x, y) fft_res1 = tf.signal.fft(complex_data) fft_res2 = tf.signal.fft(fft_res1) fft_res3 = tf.signal.fft(fft_res2) return tf.stack([tf.math.real(fft_res3), tf.math.imag(fft_res3)]) model, inputs, outputs = build_model input_values = [ np.random.rand(*x_shape).astype(np.float32), np.random.rand(*y_shape).astype(np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestRfft(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, fft_length, input_shape", # TF requires fft_length be an int32 tensor of shape [1] instead of an integer. 
itertools.product( compute_units, backends, [None, [1], [3], [5]], [[1], [2, 3], [4, 1, 5]] ), ) def test_rfft_basic(self, compute_unit, backend, fft_length, input_shape): @make_tf_graph([input_shape]) def build_model(x): rfft_res = tf.signal.rfft(x, fft_length=fft_length) return tf.stack([tf.math.real(rfft_res), tf.math.imag(rfft_res)]) model, inputs, outputs = build_model input_values = [ np.random.rand(*input_shape).astype(np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestIfft(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, input_shape", itertools.product(compute_units, backends, [[1], [2, 3], [4, 1, 5]]), ) def test_ifft_basic(self, compute_unit, backend, input_shape): x_shape = input_shape y_shape = input_shape @make_tf_graph([x_shape, y_shape]) def build_model(x, y): complex_input = tf.complex(x, y) ifft_res = tf.signal.ifft(complex_input) return tf.stack([tf.math.real(ifft_res), tf.math.imag(ifft_res)]) model, inputs, outputs = build_model input_values = [ np.random.rand(*x_shape).astype(np.float32), np.random.rand(*y_shape).astype(np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestIrfft(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, fft_length, input_shape", # TF requires fft_length be an int32 tensor of shape [1] instead of an integer. itertools.product( compute_units, backends, [None, [1], [3], [5]], [[6], [2, 3], [4, 1, 5]] ), ) def test_irfft_basic(self, compute_unit, backend, fft_length, input_shape): x_shape = input_shape y_shape = input_shape @make_tf_graph([x_shape, y_shape]) def build_model(x, y): complex_input = tf.complex(x, y) return tf.signal.irfft(complex_input, fft_length=fft_length) model, inputs, outputs = build_model input_values = [ np.random.rand(*x_shape).astype(np.float32), np.random.rand(*y_shape).astype(np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend, input_shape", itertools.product(compute_units, backends, [[6], [2, 3], [4, 1, 5]]), ) def test_fft_length_specify_by_shape(self, compute_unit, backend, input_shape): x_shape = input_shape y_shape = input_shape @make_tf_graph([x_shape, y_shape]) def build_model(x, y): complex_input = tf.complex(x, y) return tf.signal.irfft(complex_input, fft_length=[complex_input.shape[-1]]) model, inputs, outputs = build_model input_values = [ np.random.rand(*x_shape).astype(np.float32), np.random.rand(*y_shape).astype(np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/test/test_parse.py0000644000000000000000000001162514672066616027544 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
#
# Use of this source code is governed by a BSD-3-clause license that can be
# found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause

import unittest

import pytest

pytest.importorskip("tensorflow", minversion="1.15.0")
from tensorflow.core.framework import attr_value_pb2 as attr_value
from tensorflow.core.framework import tensor_shape_pb2 as tensor_shape
from tensorflow.core.framework import types_pb2 as types

import coremltools.converters.mil.frontend.tensorflow.parse as parse
from coremltools.converters.mil.mil import types as mil_types


class TestParse(unittest.TestCase):
    def test_parse_list(self):
        def compare(expected, lst, field_name):
            attr = attr_value.AttrValue()
            field = getattr(attr.list, field_name)
            field.extend(lst)
            actual = parse.parse_attr(attr)
            self.assertEqual(expected, actual)

        compare([1, 2, 3], [1, 2, 3], "i")
        compare(["foo", "bar"], [b"foo", b"bar"], "s")

    def test_parse_scalar(self):
        def compare(expected, val, field_name):
            a = attr_value.AttrValue()
            setattr(a, field_name, val)
            actual = parse.parse_attr(a)
            self.assertEqual(expected, actual)

        compare("a String", b"a String", "s")
        compare(55, 55, "i")
        compare(True, True, "b")

        attr = attr_value.AttrValue()
        attr.f = 12.3
        self.assertAlmostEqual(12.3, parse.parse_attr(attr), places=2)

    @staticmethod
    def _attr_with_shape(dims, unknown_rank=0):
        attr = attr_value.AttrValue()
        for (dim_size, dim_name) in dims:
            tf_dim = tensor_shape.TensorShapeProto.Dim()
            tf_dim.size = dim_size
            tf_dim.name = dim_name
            attr.shape.dim.append(tf_dim)
        attr.shape.unknown_rank = unknown_rank
        return attr

    def test_parse_shape(self):
        def compare(expected, dims, unknown_rank=0):
            attr = self._attr_with_shape(dims, unknown_rank)
            actual = parse.parse_attr(attr)
            self.assertEqual(expected, actual)

        compare(None, [], 5)
        compare([100], [(100, "outer")])
        compare([1, 2, 3], [(1, "outer"), (2, "middle"), (3, "inner")])

    def test_parse_tensor(self):
        # Zero-rank tensor
        attr = attr_value.AttrValue()
        attr.tensor.version_number = 1
        attr.tensor.dtype = types.DataType.DT_INT32
        t = parse.parse_attr(attr)
        self.assertTrue(isinstance(t, mil_types.int32))
        self.assertEqual(0, t.val)

        # Non-zero rank
        attr = attr_value.AttrValue()
        attr.tensor.version_number = 1
        attr.tensor.dtype = types.DataType.DT_INT32
        shaped_attr = self._attr_with_shape([(1, "outer"), (2, "middle"), (3, "inner")])
        attr.tensor.tensor_shape.dim.extend(shaped_attr.shape.dim)
        attr.tensor.int_val.extend([55, 56, 57])

        t = parse.parse_attr(attr)
        self.assertEqual([55, 56, 57], t.val.tolist())
        self.assertEqual("tensor", mil_types.get_type_info(t).name)
        # Note that the result of t.get_primitive() is a function that returns a type
        # rather than an instance of that type as it is when the tensor has rank zero.
self.assertTrue(isinstance(t.get_primitive()(), mil_types.int32)) self.assertEqual((1, 2, 3), t.get_shape()) def test_parse_type(self): def compare(expected, tf_type): attr = attr_value.AttrValue() attr.type = tf_type self.assertEqual(expected, parse.parse_attr(attr)) compare(None, types.DataType.DT_INVALID) compare(mil_types.float, types.DataType.DT_FLOAT) compare(mil_types.double, types.DataType.DT_DOUBLE) compare(mil_types.int32, types.DataType.DT_INT32) compare(mil_types.uint8, types.DataType.DT_UINT8) compare(mil_types.int16, types.DataType.DT_INT16) compare(mil_types.int8, types.DataType.DT_INT8) compare(mil_types.int8, types.DataType.DT_INT8) compare(mil_types.str, types.DataType.DT_STRING) compare(None, types.DataType.DT_COMPLEX64) compare(mil_types.int32, types.DataType.DT_INT64) compare(mil_types.bool, types.DataType.DT_BOOL) compare(None, types.DataType.DT_QINT8) compare(None, types.DataType.DT_QUINT8) compare(None, types.DataType.DT_QINT32) compare(None, types.DataType.DT_BFLOAT16) compare(None, types.DataType.DT_QINT16) compare(None, types.DataType.DT_QUINT16) compare(mil_types.uint16, types.DataType.DT_UINT16) compare(None, types.DataType.DT_COMPLEX128) compare(mil_types.fp16, types.DataType.DT_HALF) compare(None, types.DataType.DT_RESOURCE) compare(None, types.DataType.DT_VARIANT) compare(mil_types.uint32, types.DataType.DT_UINT32) compare(mil_types.uint64, types.DataType.DT_UINT64) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/test/test_parsed_tf_node.py0000644000000000000000000000421314672066616031401 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import unittest import pytest pytest.importorskip("tensorflow", minversion="1.15.0") from tensorflow.core.framework import node_def_pb2 as node_def from tensorflow.core.framework import tensor_shape_pb2 as tensor_shape from tensorflow.core.framework import types_pb2 as types from coremltools.converters.mil.frontend.tensorflow.parsed_tf_node import \ ParsedTFNode def _mock_tf_node(): tfnode = node_def.NodeDef() tfnode.name = "aNode" tfnode.op = "PlaceholderWithDefault" tfnode.input.extend(["anInput", "^aControlInput"]) tfnode.attr["dtype"].type = types.DataType.DT_INT32 dims = [(1, "outer"), (2, "middle"), (3, "inner")] for (dim_size, dim_name) in dims: tf_dim = tensor_shape.TensorShapeProto.Dim() tf_dim.size = dim_size tf_dim.name = dim_name tfnode.attr["shape"].shape.dim.append(tf_dim) return tfnode class TestParsedTFNode(unittest.TestCase): def test_init(self): parsed_node = ParsedTFNode(_mock_tf_node()) parsed_node.parse_from_attr() self.assertEqual("aNode", parsed_node.name) self.assertEqual("Placeholder", parsed_node.op) self.assertEqual(["anInput"], parsed_node.inputs) self.assertEqual(["aControlInput"], parsed_node.control_inputs) def test_copy(self): parsed_node = ParsedTFNode(_mock_tf_node()) parsed_node.parse_from_attr() copy = parsed_node.copy() self.assertTrue(isinstance(copy, type(parsed_node))) props = [ "name", "op", "datatype", "value", "inputs", "control_inputs", "outputs", "control_outputs", "attr", "original_node", ] for prop in props: self.assertEqual( getattr(parsed_node, prop), getattr(copy, prop), "Mismatch in property {}".format(prop), ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 
mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/test/test_tf_conversion_api.py0000644000000000000000000017760414672066616032153 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import os import tempfile import numpy as np import pytest import coremltools as ct from coremltools._deps import _HAS_TF_1, _HAS_TF_2, MSG_TF1_NOT_FOUND from coremltools.converters.mil.testing_reqs import backends, compute_units from coremltools.converters.mil.testing_utils import ( assert_cast_ops_count, assert_input_dtype, assert_ops_in_mil_program, assert_output_dtype, assert_prog_input_type, assert_prog_output_type, assert_spec_input_image_type, assert_spec_output_image_type, get_op_types_in_program, verify_prediction, ) from coremltools.proto import FeatureTypes_pb2 as ft from coremltools.test.api.test_api_examples import TestInputs as _TestInputs tf = pytest.importorskip("tensorflow") ################################################################################# # Note: all tests are also used as examples in https://coremltools.readme.io/docs # as a reference. # Whenever any of the following test fails, we should update API documentations ################################################################################# @pytest.mark.skipif(not _HAS_TF_1, reason=MSG_TF1_NOT_FOUND) @pytest.mark.skipif(ct.utils._macos_version() < (10, 15), reason='Model produces specification 4.') class TestTensorFlow1ConverterExamples: @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_convert_from_frozen_graph(tmpdir, backend): with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=(1, 2, 3), name="input") y = tf.nn.relu(x, name="output") mlmodel = ct.convert(graph, convert_to=backend[0], compute_units=ct.ComputeUnit.CPU_ONLY) test_input = np.random.rand(1, 2, 3) - 0.5 with tf.compat.v1.Session(graph=graph) as sess: expected_val = sess.run(y, feed_dict={x: test_input}) results = mlmodel.predict({"input": test_input}) np.testing.assert_allclose(results["output"], expected_val) @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_convert_from_frozen_graph_file(tmpdir, backend): # create the model to convert # write a toy frozen graph # Note that we usually needs to run freeze_graph() on tf.Graph() # skipping here as this toy model does not contain any variables with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=(1, 2, 3), name="input") y = tf.nn.relu(x, name="output") save_path = str(tmpdir) tf.io.write_graph(graph, save_path, "frozen_graph.pb", as_text=False) # Create a test sample # -0.5 to have some negative values test_input = np.random.rand(1, 2, 3) - 0.5 with tf.compat.v1.Session(graph=graph) as sess: expected_val = sess.run(y, feed_dict={x: test_input}) # The input `.pb` file is a frozen graph format that usually # generated by TensorFlow's utility function `freeze_graph()` pb_path = os.path.join(save_path, "frozen_graph.pb") # 3 ways to specify inputs: # (1) Fully specify inputs mlmodel = ct.convert( pb_path, # We specify inputs with name matching the placeholder name. 
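            # For a TF1 graph, that name is the placeholder's op name without the
            # ":0" tensor suffix, e.g. tf.placeholder(..., name="input") -> "input".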
inputs=[ct.TensorType(name="input", shape=(1, 2, 3))], outputs=["output"], convert_to=backend[0], ) # (2) Specify input TensorType without name (when there's only one # input) mlmodel = ct.convert( pb_path, # TensorType name is optional when there's only one input. inputs=[ct.TensorType(shape=(1, 2, 3))], outputs=["output"], convert_to=backend[0], ) # (3) Not specify inputs at all. `inputs` is optional for TF. When # inputs is not specified, convert() infers inputs from Placeholder # nodes. mlmodel = ct.convert( pb_path, outputs=["output"], convert_to=backend[0], compute_units=ct.ComputeUnit.CPU_ONLY, ) results = mlmodel.predict({"input": test_input}) np.testing.assert_allclose(results["output"], expected_val) suffix = ".mlmodel" if backend[0] == "neuralnetwork" else ".mlpackage" mlmodel_path = os.path.join(save_path, "model" + suffix) # Save the converted model mlmodel.save(mlmodel_path) results = mlmodel.predict({"input": test_input}) np.testing.assert_allclose(results["output"], expected_val, atol=1e-3) @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_convert_from_saved_model_dir(tmpdir, backend): # Sample input test_input = np.random.rand(1, 3, 5) - 0.5 # create the model to convert with tf.compat.v1.Session() as sess: x = tf.placeholder(shape=(1, 3, 5), dtype=tf.float32) y = tf.nn.relu(x) expected_val = sess.run(y, feed_dict={x: test_input}) # Save model as SavedModel inputs = {"x": x} outputs = {"y": y} save_path = str(tmpdir) tf.compat.v1.saved_model.simple_save(sess, save_path, inputs, outputs) # SavedModel directory generated by TensorFlow 1.x # when converting from SavedModel dir, inputs / outputs are optional mlmodel = ct.convert( save_path, convert_to=backend[0], compute_units=ct.ComputeUnit.CPU_ONLY ) # Need input output names to call mlmodel # x.name == 'Placeholder:0'. Strip out ':0' input_name = x.name.split(":")[0] results = mlmodel.predict({input_name: test_input}) # y.name == 'Relu:0'. 
output_name == 'Relu' output_name = y.name.split(":")[0] np.testing.assert_allclose(results[output_name], expected_val) @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_freeze_and_convert_matmul_graph(backend): # testing : https://coremltools.readme.io/docs/tensorflow-1#export-as-frozen-graph-and-convert graph = tf.Graph() with graph.as_default(): x = tf.placeholder(tf.float32, shape=[None, 20], name="input") W = tf.Variable(tf.truncated_normal([20, 10], stddev=0.1)) b = tf.Variable(tf.ones([10])) y = tf.matmul(x, W) + b output_names = [y.op.name] from tensorflow.python.tools.freeze_graph import freeze_graph model_dir = tempfile.TemporaryDirectory() graph_def_file = os.path.join(model_dir.name, "tf_graph.pb") checkpoint_file = os.path.join(model_dir.name, "tf_model.ckpt") frozen_graph_file = os.path.join(model_dir.name, "tf_frozen.pb") with tf.Session(graph=graph) as sess: # initialize variables sess.run(tf.global_variables_initializer()) # save graph definition somewhere tf.train.write_graph( sess.graph, model_dir.name, graph_def_file, as_text=False ) # save the weights saver = tf.train.Saver() saver.save(sess, checkpoint_file) # take the graph definition and weights # and freeze into a single .pb frozen graph file freeze_graph(input_graph=graph_def_file, input_saver="", input_binary=True, input_checkpoint=checkpoint_file, output_node_names=",".join(output_names), restore_op_name="save/restore_all", filename_tensor_name="save/Const:0", output_graph=frozen_graph_file, clear_devices=True, initializer_nodes="") print("Tensorflow frozen graph saved at {}".format(frozen_graph_file)) ct.convert(frozen_graph_file, convert_to=backend[0]) @staticmethod def test_convert_tf1_frozen_graph_to_milinternal(tmpdir): with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=(1, 2, 3), name="input") y = tf.nn.relu(x, name="output") model = ct.convert(graph, convert_to='milinternal') assert isinstance(model, ct.converters.mil.Program) @staticmethod def test_mil_op_names_consistency(tmpdir): ''' Test to make sure that when the same model is converted to MIL program, in the same session, it gives the same program, with the same op names ''' with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=(1, 5, 5, 3), name="input") conv = tf.nn.conv2d( x, filter = tf.constant(np.random.rand(1, 1, 3, 5), tf.float32), padding = "VALID", ) y = tf.nn.relu(conv, name="output") mil_prog1 = ct.convert(graph, convert_to='milinternal') # convert the same model again mil_prog2 = ct.convert(graph, convert_to='milinternal') # compare op names of the two programs np.testing.assert_array_equal(get_op_types_in_program(mil_prog1), get_op_types_in_program(mil_prog2)) ############################################################################### # Note: Stress tests for TF1 input / output types ############################################################################### @pytest.mark.skipif(ct.utils._macos_version() < (10, 15), reason='Model produces specification 4.') @pytest.mark.skipif(not _HAS_TF_1, reason=MSG_TF1_NOT_FOUND) class TestTf1Inputs(_TestInputs): @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_input_noname(backend): with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=(1, 2, 3), name="input") x1 = tf.placeholder(tf.float32, shape=(1, 2, 3), name="input_1") y = tf.nn.relu(x, name="output") y1 = tf.nn.relu(x1, name="output_1") with pytest.raises(ValueError) as e: model = ct.convert( graph, 
inputs=[ct.TensorType(shape=(1, 2, 3))], convert_to=backend[0], ) expected_error = "Multiple inputs are found in graph, but no input name was provided" assert expected_error == str(e.value) @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_input_wrongname(backend): with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=(1, 2, 3), name="input") x1 = tf.placeholder(tf.float32, shape=(1, 2, 3), name="input_1") y = tf.nn.relu(x, name="output") y1 = tf.nn.relu(x1, name="output_1") with pytest.raises(ValueError) as e: model = ct.convert( graph, inputs=[ct.TensorType(shape=(1, 2, 3), name="wrong_input")], convert_to=backend[0], ) expected_error = "Multiple inputs are found in graph, but no input name was provided" expected_error = "Input ({}) provided is not found in given tensorflow graph. Placeholders in graph are: {}".format("wrong_input", ["input", "input_1"]) assert expected_error == str(e.value) @pytest.mark.parametrize( "backend, compute_unit", itertools.product( backends, compute_units, ), ) def test_input_dynamic_without_inputs_param(self, backend, compute_unit): """The `inputs` param is not provided for a dynamic input (shape has `None`).""" with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=(None, None, 3), name="input") x1 = tf.placeholder(tf.float32, shape=(1, 2, 3), name="input_1") y = tf.nn.relu(x, name="output") y1 = tf.nn.relu(x1, name="output_1") convert_to = backend[0] if convert_to == "mlprogram": with pytest.warns( UserWarning, match="Some dimensions in the input shape are unknown, hence they are set to " "flexible ranges with lower bound and default value = 1, and upper bound = 2. " "To set different values for the default shape and upper bound, please use " "the ct.RangeDim.*", ): mlmodel = ct.convert( graph, convert_to=convert_to, compute_units=compute_unit, ) else: mlmodel = ct.convert( graph, convert_to=convert_to, compute_units=compute_unit, ) spec = mlmodel.get_spec() assert list(spec.description.input[0].type.multiArrayType.shape) == [1, 1, 3] assert ( spec.description.input[0].type.multiArrayType.shapeRange.sizeRanges[1].lowerBound == 1 ) assert ( spec.description.input[0].type.multiArrayType.shapeRange.sizeRanges[1].upperBound == -1 if convert_to == "neuralnetwork" else 2 ) @staticmethod @pytest.mark.parametrize( "backend", backends, ) @pytest.mark.skipif(not ct.utils._is_macos(), reason="test needs predictions") def test_tf_predict_input(backend): TestTf1Inputs._test_variant_input_type_prediction(tf.convert_to_tensor, backend[0]) @pytest.fixture def uint8_input_model(): if not _HAS_TF_1: pytest.skip(MSG_TF1_NOT_FOUND) with tf.Graph().as_default() as graph: x = tf.placeholder(tf.uint8, shape=[10, 20], name="input") out = tf.add(x, tf.constant(5, dtype=tf.uint8), name="output") return graph @pytest.fixture def int8_input_model(): if not _HAS_TF_1: pytest.skip(MSG_TF1_NOT_FOUND) with tf.Graph().as_default() as graph: x = tf.placeholder(tf.int8, shape=[10, 20], name="input") out = tf.add(x, tf.constant(5, dtype=tf.int8), name="output") return graph @pytest.fixture def int32_input_model(): if not _HAS_TF_1: pytest.skip(MSG_TF1_NOT_FOUND) with tf.Graph().as_default() as graph: x = tf.placeholder(tf.int32, shape=[10, 20], name="input") out = tf.add(x, tf.constant(5, dtype=tf.int32), name="output") return graph @pytest.fixture def int32_two_input_model(): if not _HAS_TF_1: pytest.skip(MSG_TF1_NOT_FOUND) with tf.Graph().as_default() as graph: x = tf.placeholder(tf.int32, shape=[10, 20], name="input1") y = 
tf.placeholder(tf.int32, shape=[10, 20], name="input2") out = tf.add(x, y, name="output") return graph @pytest.fixture def int32_two_output_model(): if not _HAS_TF_1: pytest.skip(MSG_TF1_NOT_FOUND) with tf.Graph().as_default() as graph: x = tf.placeholder(tf.int32, shape=[10, 20], name="input1") y = tf.placeholder(tf.int32, shape=[10, 20], name="input2") out1 = tf.add(x, 1, name="output1") out2 = tf.add(y, 1, name="output2") return graph @pytest.fixture def int32_float32_two_output_model(): if not _HAS_TF_1: pytest.skip(MSG_TF1_NOT_FOUND) with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=[10, 20], name="input1") y = tf.placeholder(tf.float32, shape=[10, 20], name="input2") x_add = tf.add(x, 1.0, name="output1") y_add = tf.add(y, 1.0) y_cast = tf.cast(y_add, dtype=tf.int32, name="output2") return graph @pytest.fixture def int32_float32_two_input_model(): if not _HAS_TF_1: pytest.skip(MSG_TF1_NOT_FOUND) with tf.Graph().as_default() as graph: x = tf.placeholder(tf.int32, shape=[10, 20], name="input1") y = tf.placeholder(tf.float32, shape=[10, 20], name="input2") x_cast = tf.cast(x, dtype=tf.float32) out = tf.add(x_cast, y, name="output") return graph @pytest.fixture def float32_input_model_add_op(): if not _HAS_TF_1: pytest.skip(MSG_TF1_NOT_FOUND) with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=[10, 20], name="input") out = tf.add(x, tf.constant(5.5, dtype=tf.float32), name="output") return graph @pytest.fixture def float32_input_model_relu_ops(): if not _HAS_TF_1: pytest.skip(MSG_TF1_NOT_FOUND) with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=[10, 20], name="input") x1 = tf.nn.relu(x) out = tf.nn.relu(x1, name="output") return graph @pytest.fixture def int64_input_model(): if not _HAS_TF_1: pytest.skip(MSG_TF1_NOT_FOUND) with tf.Graph().as_default() as graph: x = tf.placeholder(tf.int64, shape=[10, 20], name="input") out = tf.add(x, tf.constant(5, dtype=tf.int64), name="output") return graph @pytest.fixture def float32_two_input_model(): if not _HAS_TF_1: pytest.skip(MSG_TF1_NOT_FOUND) with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=[10, 20], name="input1") y = tf.placeholder(tf.float32, shape=[10, 20], name="input2") out = tf.add(x, y, name="output") return graph @pytest.fixture def float32_two_output_model(): if not _HAS_TF_1: pytest.skip(MSG_TF1_NOT_FOUND) with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=[10, 20], name="input") y = tf.nn.relu(x) out2 = tf.nn.relu6(x, name="output2") out1 = tf.nn.relu(y, name="output1") return graph @pytest.fixture def float64_input_model(): if not _HAS_TF_1: pytest.skip(MSG_TF1_NOT_FOUND) with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float64, shape=[10, 20], name="input") out = tf.add(x, tf.constant(5, dtype=tf.float64), name="output") return graph @pytest.fixture def rank3_input_model(): if not _HAS_TF_1: pytest.skip(MSG_TF1_NOT_FOUND) with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=[1, 10, 20], name="input") out = tf.add(x, tf.constant(5, dtype=tf.float32), name="output") return graph @pytest.fixture def rank4_input_model(): if not _HAS_TF_1: pytest.skip(MSG_TF1_NOT_FOUND) with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=[1, 10, 20, 3], name="input") out = tf.add(x, tf.constant(5, dtype=tf.float32), name="output") return graph @pytest.fixture def rank4_input_model_with_channel_first_output(): if not _HAS_TF_1: pytest.skip(MSG_TF1_NOT_FOUND) with 
tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=[1, 10, 20, 3], name="input") y = tf.add(x, tf.constant(5, dtype=tf.float32)) out = tf.transpose(y, perm=[0, 3, 1, 2], name="output") return graph @pytest.fixture def rank4_grayscale_input_model(): if not _HAS_TF_1: pytest.skip(MSG_TF1_NOT_FOUND) with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=[1, 10, 20, 1], name="input") out = tf.add(x, tf.constant(5, dtype=tf.float32), name="output") return graph @pytest.fixture def rank4_grayscale_input_model_with_channel_first_output(): if not _HAS_TF_1: pytest.skip(MSG_TF1_NOT_FOUND) with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=[1, 10, 20, 1], name="input") y = tf.add(x, tf.constant(5, dtype=tf.float32)) out = tf.transpose(y, perm=[0, 3, 1, 2], name="output") return graph @pytest.fixture def linear_model(): if not _HAS_TF_1: pytest.skip(MSG_TF1_NOT_FOUND) # this model will test the fuse_matmul_weight_bias pass with tf.Graph().as_default() as graph: x = tf.placeholder(tf.float32, shape=[1, 2], name="input") y = tf.matmul(x, tf.constant([1, 2], shape=(2, 4), dtype=tf.float32)) y = tf.add(y, tf.constant([1, 2, 3, 4], shape=(4,), dtype=tf.float32)) out = tf.nn.relu(y) return graph @pytest.mark.skipif(ct.utils._macos_version() < (13, 0), reason='Tests are for deployment target ios16/macos13') class TestInputOutputConversionAPI: def test_input_dtype_inferred(self, int32_input_model): # test that the input dtype is picked up from TF correctly mlmodel = ct.convert(int32_input_model, minimum_deployment_target=ct.target.macOS12) assert_input_dtype(mlmodel, expected_type_str="int32") verify_prediction(mlmodel) def test_unsupported_input_dtype_in_tf_graph_uint8(self, uint8_input_model): # test that no error is raised when no dtype is provided by the user, # and the TF graph's input dtype is not supported. # In this case, it will be mapped to the closest supported dtype mlmodel = ct.convert(uint8_input_model, minimum_deployment_target=ct.target.macOS12) assert_input_dtype(mlmodel, expected_type_str="int32") verify_prediction(mlmodel) def test_unsupported_input_dtype_in_tf_graph_int8(self, int8_input_model): # test that no error is raised when no dtype is provided by the user, # and the TF graph's input dtype is not supported. # In this case, it will be mapped to the closest supported dtype mlmodel = ct.convert(int8_input_model, minimum_deployment_target=ct.target.macOS12) assert_input_dtype(mlmodel, expected_type_str="int32") verify_prediction(mlmodel) def test_unsupported_input_dtype_in_tf_graph_int64(self, int64_input_model): # test that no error is raised when no dtype is provided by the user, # and the TF graph's input dtype is not supported. # In this case, it will be mapped to the closest supported dtype mlmodel = ct.convert(int64_input_model, minimum_deployment_target=ct.target.macOS12) assert_input_dtype(mlmodel, expected_type_str="int32") verify_prediction(mlmodel) def test_unsupported_input_dtype_in_tf_graph_fp64(self, float64_input_model): # test that no error is raised when no dtype is provided by the user, # and the TF graph's input dtype is not supported. 
# In this case, it will be mapped to the closest supported dtype mlmodel = ct.convert(float64_input_model, minimum_deployment_target=ct.target.macOS12) assert_input_dtype(mlmodel, expected_type_str="fp32") verify_prediction(mlmodel) def test_input_dtype_user_provided(self, int32_input_model): # test that provided dtype in the api overrides the input dtype in the TF model mlmodel = ct.convert(int32_input_model, inputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.macOS12) assert_input_dtype(mlmodel, expected_type_str="fp32") assert_output_dtype(mlmodel, expected_type_str="fp32") verify_prediction(mlmodel) def test_invalid_input_dtype(self, int32_input_model): # error should be raised if a dtype is provided by the user that is not supported with pytest.raises(TypeError, match="is unsupported for inputs/outputs of the model" ): mlmodel = ct.convert(int32_input_model, inputs=[ct.TensorType(dtype=np.int16)], minimum_deployment_target=ct.target.macOS12) with pytest.raises(TypeError, match="float16 dtype for inputs is only supported for deployment target >= iOS16/macOS13" ): mlmodel = ct.convert(int32_input_model, inputs=[ct.TensorType(dtype=np.float16)], minimum_deployment_target=ct.target.macOS12) def test_fp16_input_dtype(self, float32_input_model_add_op, float32_input_model_relu_ops, int32_input_model): """ Test that providing fp16 input dtype works with macOS13. """ mlmodel = ct.convert( float32_input_model_add_op, inputs=[ct.TensorType(dtype=np.float16)], outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.macOS13, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["add", "cast"]) assert_input_dtype(mlmodel, expected_type_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp32") verify_prediction(mlmodel) mlmodel = ct.convert( float32_input_model_relu_ops, inputs=[ct.TensorType(dtype=np.float16)], outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.macOS13, ) # Two consecutive relus are merged in the `merge_consecutive_relus` pass. 
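        # Hence a single "relu" op (plus the output cast) is expected below,
        # rather than one op per tf.nn.relu in the source graph.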
assert_ops_in_mil_program(mlmodel, expected_op_list=["relu", "cast"]) assert_input_dtype(mlmodel, expected_type_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp32") verify_prediction(mlmodel) mlmodel = ct.convert( int32_input_model, inputs=[ct.TensorType(dtype=np.float16)], outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.macOS13, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["add", "cast"]) assert_input_dtype(mlmodel, expected_type_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp32") verify_prediction(mlmodel) def test_fp16_input_dtype_fp32_precision(self, float32_input_model_add_op, float32_input_model_relu_ops, int32_input_model): """ Same test as test_fp16_input_dtype, but with Float32 precision """ mlmodel = ct.convert( float32_input_model_add_op, inputs=[ct.TensorType(dtype=np.float16)], outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.macOS13, compute_precision=ct.precision.FLOAT32, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["cast", "add"]) assert_input_dtype(mlmodel, expected_type_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp32") verify_prediction(mlmodel) mlmodel = ct.convert( float32_input_model_relu_ops, inputs=[ct.TensorType(dtype=np.float16)], outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.macOS13, compute_precision=ct.precision.FLOAT32, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["cast", "relu"]) assert_input_dtype(mlmodel, expected_type_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp32") def test_two_input_model(self, float32_two_input_model): # test forcing input type of "input1" to be int32 mlmodel = ct.convert( float32_two_input_model, inputs=[ct.TensorType(name="input1", dtype=np.int32)], outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.macOS12, ) assert_input_dtype(mlmodel, expected_type_str="int32", expected_name="input1") assert_input_dtype(mlmodel, expected_type_str="fp32", expected_name="input2") assert_output_dtype(mlmodel, expected_type_str="fp32") # test forcing both inputs to be int32 mlmodel = ct.convert(float32_two_input_model, inputs=[ct.TensorType(name="input1", dtype=np.int32), ct.TensorType(name="input2", dtype=np.int32), ], minimum_deployment_target=ct.target.macOS12) assert_input_dtype(mlmodel, expected_type_str="int32", expected_name="input1") assert_input_dtype(mlmodel, expected_type_str="int32", expected_name="input2") assert_output_dtype(mlmodel, expected_type_str="int32") # if names are not provided an error should be raised with pytest.raises(ValueError): mlmodel = ct.convert(float32_two_input_model, inputs=[ct.TensorType(dtype=np.int32), ct.TensorType(dtype=np.int32), ], minimum_deployment_target=ct.target.macOS12) # test forcing both inputs to be float16 mlmodel = ct.convert( float32_two_input_model, inputs=[ ct.TensorType(name="input1", dtype=np.float16), ct.TensorType(name="input2", dtype=np.float16), ], outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.macOS13, ) assert_input_dtype(mlmodel, expected_type_str="fp16", expected_name="input1") assert_input_dtype(mlmodel, expected_type_str="fp16", expected_name="input2") assert_output_dtype(mlmodel, expected_type_str="fp32") assert_cast_ops_count(mlmodel, expected_count=1) verify_prediction(mlmodel) def test_single_output_model(self, int32_input_model, float32_input_model_relu_ops): # test output type mlmodel = ct.convert(int32_input_model, 
minimum_deployment_target=ct.target.macOS12) assert_ops_in_mil_program(mlmodel, expected_op_list=["add"]) assert_output_dtype(mlmodel, expected_type_str="int32") # test that error is raised when an output of unknown name is provided with pytest.raises(Exception): # output name does not exist in the model mlmodel = ct.convert(int32_input_model, outputs=["z"], minimum_deployment_target=ct.target.macOS12) # test that error is raised when two outputs are provided without names with pytest.raises(ValueError, match=", does not have names"): mlmodel = ct.convert(int32_input_model, outputs=[ct.TensorType(dtype=np.float32), ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.macOS12) # test that an error is raised when shape is provided for the output with pytest.raises(ValueError): mlmodel = ct.convert(int32_input_model, outputs=[ct.TensorType(dtype=np.float32, shape=(10, 20))], minimum_deployment_target=ct.target.macOS12) # test that the output dtype provided by the user is applied during conversion mlmodel = ct.convert(int32_input_model, outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.macOS12) assert_output_dtype(mlmodel, expected_type_str="fp32", expected_name="Identity" if _HAS_TF_2 else "output") assert_ops_in_mil_program(mlmodel, expected_op_list=["add", "cast"]) # test that output dtype of float16 is rejected when deployment target is low with pytest.raises(TypeError, match="float16 dtype for outputs is only supported for deployment target >= iOS16/macOS13" ): ct.convert(float32_input_model_relu_ops, outputs=[ct.TensorType(dtype=np.float16)], minimum_deployment_target=ct.target.macOS12, ) # test that output type float16 is applied correctly mlmodel = ct.convert( float32_input_model_relu_ops, inputs=[ct.TensorType(name="input", dtype=np.float32)], outputs=[ct.TensorType(dtype=np.float16)], minimum_deployment_target=ct.target.macOS13, ) assert_output_dtype( mlmodel, expected_type_str="fp16", expected_name="Identity" if _HAS_TF_2 else "output" ) assert_ops_in_mil_program(mlmodel, expected_op_list=["cast", "relu"]) # test that input and output types float16 are applied correctly mlmodel = ct.convert(float32_input_model_relu_ops, inputs=[ct.TensorType(dtype=np.float16)], outputs=[ct.TensorType(dtype=np.float16)], minimum_deployment_target=ct.target.macOS13, ) assert_input_dtype(mlmodel, expected_type_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp16", expected_name="Identity" if _HAS_TF_2 else "output") assert_ops_in_mil_program(mlmodel, expected_op_list=["relu"]) verify_prediction(mlmodel) def test_multi_output_model(self, float32_two_output_model): # check that error is raised when only 1 output provided with pytest.raises(ValueError, match="please provide names for each of the outputs"): mlmodel = ct.convert(float32_two_output_model, outputs=[ct.TensorType(dtype=np.float16)], minimum_deployment_target=ct.target.macOS13, ) # check that error is raised when multiple outputs are provided without names with pytest.raises(ValueError, match="please provide names for each of the outputs"): mlmodel = ct.convert(float32_two_output_model, outputs=[ct.TensorType(dtype=np.float16), ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.macOS13, ) # set 1 output to float16 and the other to float32 output1_name = "Identity" if _HAS_TF_2 else "output1" output2_name = "Identity_1" if _HAS_TF_2 else "output2" mlmodel = ct.convert(float32_two_output_model, inputs=[ct.TensorType(dtype=np.float16)], 
outputs=[ct.TensorType(name=output2_name, dtype=np.float16), ct.TensorType(name=output1_name, dtype=np.float32)], minimum_deployment_target=ct.target.macOS13, ) assert_cast_ops_count(mlmodel, expected_count=1) assert_output_dtype(mlmodel, expected_type_str="fp16", expected_name=output2_name, index=0) assert_output_dtype(mlmodel, expected_type_str="fp32", expected_name=output1_name, index=1) assert_input_dtype(mlmodel, expected_type_str="fp16") verify_prediction(mlmodel) # in this case only the single output will be selected mlmodel = ct.convert(float32_two_output_model, inputs=[ct.TensorType(dtype=np.float16)], outputs=[ct.TensorType(name=output2_name, dtype=np.float16)], minimum_deployment_target=ct.target.macOS13, ) assert_cast_ops_count(mlmodel, expected_count=0) assert_output_dtype(mlmodel, expected_type_str="fp16", expected_name=output2_name, index=0) assert_input_dtype(mlmodel, expected_type_str="fp16") verify_prediction(mlmodel) def test_color_input(self, rank4_input_model, rank3_input_model): mlmodel = ct.convert( rank4_input_model, inputs=[ct.ImageType(color_layout=ct.colorlayout.RGB)], outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.macOS13, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["cast", "transpose", "add", "cast"]) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.RGB) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp32") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp32") verify_prediction(mlmodel) with pytest.raises(ValueError, match="must have rank 4"): mlmodel = ct.convert(rank3_input_model, inputs=[ct.ImageType(color_layout=ct.colorlayout.RGB)], minimum_deployment_target=ct.target.macOS12, ) def test_grayscale_input(self, rank4_input_model, rank3_input_model, rank4_grayscale_input_model): with pytest.raises(ValueError, match="must have rank 4"): mlmodel = ct.convert(rank3_input_model, inputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE)], minimum_deployment_target=ct.target.macOS13, ) # invalid shape with pytest.raises(ValueError): mlmodel = ct.convert(rank4_input_model, inputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE)], minimum_deployment_target=ct.target.macOS13, ) mlmodel = ct.convert( rank4_grayscale_input_model, inputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE)], outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.macOS13, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["cast", "transpose", "add", "cast"]) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp32") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp32") verify_prediction(mlmodel) with pytest.raises(TypeError, match="float16 dtype for inputs is only supported for deployment target >= iOS16/macOS13"): mlmodel = ct.convert(rank4_grayscale_input_model, inputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)], minimum_deployment_target=ct.target.macOS12, ) # test that grayscale_16 raises error when used with neural network with pytest.raises(TypeError, match="float16 dtype for inputs is only supported for deployment target >= iOS16/macOS13"): mlmodel = ct.convert(rank4_grayscale_input_model, inputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)], ) mlmodel = ct.convert(rank4_grayscale_input_model, inputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)], 
outputs=[ct.TensorType(dtype=np.float16)], minimum_deployment_target=ct.target.macOS13, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["transpose", "add"]) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE_FLOAT16) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp16") verify_prediction(mlmodel) def test_color_output(self, rank4_input_model, rank4_input_model_with_channel_first_output): # check that an error is raised if the output shape is not of form (1, 3, H, W) with pytest.raises(ValueError, match="Shape of the RGB/BGR image output,"): mlmodel = ct.convert(rank4_input_model, inputs=[ct.ImageType(color_layout=ct.colorlayout.RGB)], outputs=[ct.ImageType(color_layout=ct.colorlayout.RGB)], minimum_deployment_target=ct.target.macOS13, ) mlmodel = ct.convert(rank4_input_model_with_channel_first_output, inputs=[ct.ImageType(color_layout=ct.colorlayout.BGR)], outputs=[ct.ImageType(color_layout=ct.colorlayout.RGB)], minimum_deployment_target=ct.target.macOS13, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["cast", "add", "cast"]) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.BGR) assert_spec_output_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.RGB) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp32") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp32") verify_prediction(mlmodel) # check neural network conversion mlmodel = ct.convert( rank4_input_model_with_channel_first_output, inputs=[ct.ImageType(color_layout=ct.colorlayout.RGB)], outputs=[ct.ImageType(color_layout=ct.colorlayout.BGR)], convert_to="neuralnetwork", ) assert_ops_in_mil_program(mlmodel, expected_op_list=["add"]) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.RGB) assert_spec_output_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.BGR) verify_prediction(mlmodel) def test_grayscale_output(self, rank4_grayscale_input_model, rank4_grayscale_input_model_with_channel_first_output): # check that an error is raised if the output shape is not of form (1, 1, H, W) with pytest.raises(ValueError, match="Shape of the Grayscale image output,"): mlmodel = ct.convert( rank4_grayscale_input_model, inputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE)], outputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE)], convert_to="neuralnetwork", ) with pytest.raises(TypeError, match="float16 dtype for outputs is only supported for deployment target >= iOS16/macOS13"): mlmodel = ct.convert(rank4_grayscale_input_model_with_channel_first_output, outputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)], minimum_deployment_target=ct.target.macOS12, ) mlmodel = ct.convert( rank4_grayscale_input_model_with_channel_first_output, inputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE)], outputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE)], convert_to="neuralnetwork", ) assert_ops_in_mil_program(mlmodel, expected_op_list=["add"]) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE) assert_spec_output_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE) verify_prediction(mlmodel) mlmodel = ct.convert(rank4_grayscale_input_model_with_channel_first_output, inputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)], 
outputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)], minimum_deployment_target=ct.target.macOS13, ) assert_cast_ops_count(mlmodel, expected_count=0) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE_FLOAT16) assert_spec_output_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE_FLOAT16) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp16") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp16") verify_prediction(mlmodel) mlmodel = ct.convert(rank4_grayscale_input_model_with_channel_first_output, inputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE)], outputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)], minimum_deployment_target=ct.target.macOS13, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["cast", "add"]) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE) assert_spec_output_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE_FLOAT16) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp32") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp16") verify_prediction(mlmodel) def test_linear_model(self, linear_model): # this will test the fuse_matmul_weight_bias pass, when the inputs are of type float16 mlmodel = ct.convert(linear_model, inputs=[ct.TensorType(dtype=np.float16)], outputs=[ct.TensorType(dtype=np.float16)], minimum_deployment_target=ct.target.macOS13, ) assert_input_dtype(mlmodel, expected_type_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp16") assert_ops_in_mil_program(mlmodel, ["linear", "relu"]) verify_prediction(mlmodel) def test_default_input_dtype(self, int32_input_model, int32_two_input_model): """ If ``dtype`` is not specified, it defaults to the ``dtype`` of the inputs in the TF model. 
""" # Case 1: Single input model with no dtype specified mlmodel = ct.convert( int32_input_model, inputs=[ct.TensorType(shape=(10, 20))], minimum_deployment_target=ct.target.macOS13, ) assert_input_dtype(mlmodel, expected_type_str="int32") verify_prediction(mlmodel) # Case 2: two inputs model with dtype specified for the first input mlmodel = ct.convert( int32_two_input_model, inputs=[ ct.TensorType(name="input1", shape=(10, 20), dtype=np.float16), ct.TensorType(name="input2", shape=(10, 20)), ], minimum_deployment_target=ct.target.macOS13, ) assert_input_dtype(mlmodel, expected_type_str="fp16", index=0) assert_input_dtype(mlmodel, expected_type_str="int32", index=1) verify_prediction(mlmodel) # Case 3: two inputs model with dtype specified for the second input mlmodel = ct.convert( int32_two_input_model, inputs=[ ct.TensorType(name="input1", shape=(10, 20)), ct.TensorType(name="input2", shape=(10, 20), dtype=np.float16), ], minimum_deployment_target=ct.target.macOS13, ) assert_input_dtype(mlmodel, expected_type_str="int32", index=0) assert_input_dtype(mlmodel, expected_type_str="fp16", index=1) verify_prediction(mlmodel) # Case 4: two inputs model with no dtype specified for both inputs mlmodel = ct.convert( int32_two_input_model, inputs=[ ct.TensorType(name="input1", shape=(10, 20)), ct.TensorType(name="input2", shape=(10, 20)), ], minimum_deployment_target=ct.target.macOS13, ) assert_input_dtype(mlmodel, expected_type_str="int32", index=0) assert_input_dtype(mlmodel, expected_type_str="int32", index=1) verify_prediction(mlmodel) class TestiOS16DefaultIODtype: def test_iO16_default_fp16_input( self, float32_input_model_add_op, int32_input_model, ): """ With minimum_deployment_target set >= iOS16, if the compute precision is set to fp16. By default, a fp16 i/o model is produced for fp32 models. However, if the users specify the dtype, the converter is going to respect that. """ # Case 1: fp32 single input model mlmodel = ct.convert( float32_input_model_add_op, inputs=[ct.TensorType(shape=(10, 20))], minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="fp16") verify_prediction(mlmodel) # Case 2: fp32 single input model mlmodel = ct.convert( float32_input_model_add_op, minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="fp16") verify_prediction(mlmodel) # Case 3: int32 single input model. No change made. mlmodel = ct.convert( int32_input_model, inputs=[ct.TensorType(shape=(10, 20))], minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="int32") verify_prediction(mlmodel) def test_iO16_default_fp16_multiple_input( self, float32_two_input_model, int32_two_input_model, int32_float32_two_input_model, ): # Case 1: fp32 two inputs model. First input dtype missing mlmodel = ct.convert( float32_two_input_model, inputs=[ ct.TensorType(name="input1", shape=(10, 20)), ct.TensorType(name="input2", shape=(10, 20), dtype=np.float32), ], minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="fp16", index=0) assert_input_dtype(mlmodel, expected_type_str="fp32", index=1) verify_prediction(mlmodel) # Case 2: fp32 two inputs model. 
Second input dtype missing mlmodel = ct.convert( float32_two_input_model, inputs=[ ct.TensorType(name="input1", shape=(10, 20), dtype=np.float32), ct.TensorType(name="input2", shape=(10, 20)), ], minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="fp32", index=0) assert_input_dtype(mlmodel, expected_type_str="fp16", index=1) verify_prediction(mlmodel) # Case 3: fp32 two inputs model. Both dtype missing mlmodel = ct.convert( float32_two_input_model, inputs=[ ct.TensorType(name="input1", shape=(10, 20)), ct.TensorType(name="input2", shape=(10, 20)), ], minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="fp16", index=0) assert_input_dtype(mlmodel, expected_type_str="fp16", index=1) verify_prediction(mlmodel) # Case 4: fp32 two inputs model. inputs not given mlmodel = ct.convert( float32_two_input_model, minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="fp16", index=0) assert_input_dtype(mlmodel, expected_type_str="fp16", index=1) verify_prediction(mlmodel) # Case 5: fp32 two inputs model. Both dtype given mlmodel = ct.convert( float32_two_input_model, inputs=[ ct.TensorType(name="input1", shape=(10, 20), dtype=np.int32), ct.TensorType(name="input2", shape=(10, 20), dtype=np.float32), ], minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="int32", index=0) assert_input_dtype(mlmodel, expected_type_str="fp32", index=1) verify_prediction(mlmodel) # Case 6: int32 two inputs model. Both dtype missing. No change made. mlmodel = ct.convert( int32_two_input_model, inputs=[ ct.TensorType(name="input1", shape=(10, 20)), ct.TensorType(name="input2", shape=(10, 20)), ], minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="int32", index=0) assert_input_dtype(mlmodel, expected_type_str="int32", index=1) verify_prediction(mlmodel) # Case 7: mixed dtype model with two inputs. Both dtype missing. The fp32 input is cast to fp16. mlmodel = ct.convert( int32_float32_two_input_model, inputs=[ ct.TensorType(name="input1", shape=(10, 20)), ct.TensorType(name="input2", shape=(10, 20)), ], minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="int32", index=0) assert_input_dtype(mlmodel, expected_type_str="fp16", index=1) verify_prediction(mlmodel) def test_iO16_default_fp16_output( self, float32_input_model_add_op, int32_input_model, ): """ With minimum_deployment_target set >= iOS16, if the compute precision is set to fp16. By default, a fp16 i/o model is produced for fp32 models. However, if the users specify the dtype, the converter is going to respect that. """ # Case 1: fp32 single output model mlmodel = ct.convert( float32_input_model_add_op, minimum_deployment_target=ct.target.iOS16, ) assert_output_dtype(mlmodel, expected_type_str="fp16") verify_prediction(mlmodel) # Case 2: int32 single output model. No change made. 
mlmodel = ct.convert( int32_input_model, minimum_deployment_target=ct.target.iOS16, ) assert_output_dtype(mlmodel, expected_type_str="int32") verify_prediction(mlmodel) # Case 3: fp32 single output model, with dtype set by the user mlmodel = ct.convert( float32_input_model_add_op, outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.iOS16, ) assert_output_dtype(mlmodel, expected_type_str="fp32") verify_prediction(mlmodel) def test_iO16_default_fp16_multiple_output( self, float32_two_output_model, int32_two_output_model, int32_float32_two_output_model, ): output1_name = "Identity" if _HAS_TF_2 else "output1" output2_name = "Identity_1" if _HAS_TF_2 else "output2" # Case 1: fp32 two outputs model. First output dtype missing mlmodel = ct.convert( float32_two_output_model, outputs=[ ct.TensorType(name=output1_name), ct.TensorType(name=output2_name, dtype=np.float32), ], minimum_deployment_target=ct.target.iOS16, ) assert_output_dtype(mlmodel, expected_type_str="fp16", index=0) assert_output_dtype(mlmodel, expected_type_str="fp32", index=1) verify_prediction(mlmodel) # Case 2: fp32 two outputs model. Second output dtype missing mlmodel = ct.convert( float32_two_output_model, outputs=[ ct.TensorType(name=output1_name, dtype=np.int32), ct.TensorType(name=output2_name), ], minimum_deployment_target=ct.target.iOS16, ) assert_output_dtype(mlmodel, expected_type_str="int32", index=0) assert_output_dtype(mlmodel, expected_type_str="fp16", index=1) verify_prediction(mlmodel) # Case 3: fp32 two outputs model. Both output dtype missing mlmodel = ct.convert( float32_two_output_model, outputs=[ ct.TensorType(name=output1_name), ct.TensorType(name=output2_name), ], minimum_deployment_target=ct.target.iOS16, ) assert_output_dtype(mlmodel, expected_type_str="fp16", index=0) assert_output_dtype(mlmodel, expected_type_str="fp16", index=1) verify_prediction(mlmodel) # Case 4: fp32 two outputs model. outputs not set. mlmodel = ct.convert( float32_two_output_model, minimum_deployment_target=ct.target.iOS16, ) assert_output_dtype(mlmodel, expected_type_str="fp16", index=0) assert_output_dtype(mlmodel, expected_type_str="fp16", index=1) verify_prediction(mlmodel) # Case 5: int32 two outputs model. outputs not set. No change happens. mlmodel = ct.convert( int32_two_output_model, minimum_deployment_target=ct.target.iOS16, ) assert_output_dtype(mlmodel, expected_type_str="int32", index=0) assert_output_dtype(mlmodel, expected_type_str="int32", index=1) verify_prediction(mlmodel) # Case 6: int32 two outputs model. The first input is force set to fp32. # In this case, the first output is inferred as fp32 as well, so it defaults # to fp16. mlmodel = ct.convert( int32_two_output_model, inputs=[ ct.TensorType(name="input1", dtype=np.float32), ct.TensorType(name="input2"), ], minimum_deployment_target=ct.target.iOS16, ) assert_output_dtype(mlmodel, expected_type_str="fp16", index=0) assert_output_dtype(mlmodel, expected_type_str="int32", index=1) verify_prediction(mlmodel) # Case 7: int32 two outputs model. The second input is force set to fp16. # In this case, the second output is inferred as fp32 as well, so it defaults # to fp16. 
mlmodel = ct.convert( int32_two_output_model, inputs=[ ct.TensorType(name="input1"), ct.TensorType(name="input2", dtype=np.float16), ], minimum_deployment_target=ct.target.iOS16, ) assert_output_dtype(mlmodel, expected_type_str="int32", index=0) assert_output_dtype(mlmodel, expected_type_str="fp16", index=1) verify_prediction(mlmodel) # Case 8: two outputs model with int32/fp32. # In this case, the fp32 output defaults to fp16, while the int32 one remains unchanged. mlmodel = ct.convert( int32_float32_two_output_model, minimum_deployment_target=ct.target.iOS16, ) assert_output_dtype(mlmodel, expected_type_str="fp16", index=0) assert_output_dtype(mlmodel, expected_type_str="int32", index=1) verify_prediction(mlmodel) def test_iO17_default_fp32_io( self, int32_float32_two_input_model, int32_float32_two_output_model, ): """ With minimum_deployment_target set >= iOS16, and if the compute precision is set to fp32. By default, a fp32 i/o model is produced. """ # Example 1 mlmodel = ct.convert( int32_float32_two_input_model, compute_precision=ct.precision.FLOAT32, minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="int32", index=0) assert_input_dtype(mlmodel, expected_type_str="fp32", index=1) assert_output_dtype(mlmodel, expected_type_str="fp32", index=0) # Example 2 mlmodel = ct.convert( int32_float32_two_output_model, compute_precision=ct.precision.FLOAT32, minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="fp32", index=0) assert_input_dtype(mlmodel, expected_type_str="fp32", index=1) assert_output_dtype(mlmodel, expected_type_str="fp32", index=0) assert_output_dtype(mlmodel, expected_type_str="int32", index=1) def test_iO16_default_image_dtype_input( self, rank4_input_model, rank4_grayscale_input_model, ): """ We keep the input dtype for the image input model to fp32, unless it is GRAYSCALE_FLOAT16 """ # Example 1 mlmodel = ct.convert( rank4_input_model, inputs=[ct.ImageType(color_layout=ct.colorlayout.RGB)], minimum_deployment_target=ct.target.iOS16, ) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.RGB) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp32") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp16") verify_prediction(mlmodel) # Example 2 mlmodel = ct.convert( rank4_input_model, inputs=[ct.ImageType(color_layout=ct.colorlayout.BGR)], minimum_deployment_target=ct.target.iOS16, ) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.BGR) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp32") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp16") verify_prediction(mlmodel) # Example 3 mlmodel = ct.convert( rank4_grayscale_input_model, inputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE)], minimum_deployment_target=ct.target.iOS16, ) assert_spec_input_image_type( mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE ) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp32") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp16") verify_prediction(mlmodel) # Example 4 mlmodel = ct.convert( rank4_grayscale_input_model, inputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)], minimum_deployment_target=ct.target.iOS16, ) assert_spec_input_image_type( mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE_FLOAT16 ) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp16") 
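# A hedged, user-facing sketch of the behavior these tests pin down (not part
# of the test suite; "tf_model" is a placeholder for any float32 TF model):
# with minimum_deployment_target >= iOS16 and the default fp16 compute
# precision, tensor i/o defaults to fp16, while image inputs keep an fp32
# program dtype unless GRAYSCALE_FLOAT16 is requested explicitly.
#
#     import numpy as np
#     import coremltools as ct
#
#     # tensor inputs/outputs default to fp16...
#     mlmodel = ct.convert(tf_model, minimum_deployment_target=ct.target.iOS16)
#     # ...unless the user pins a dtype, which the converter respects:
#     mlmodel = ct.convert(
#         tf_model,
#         inputs=[ct.TensorType(dtype=np.float32)],
#         outputs=[ct.TensorType(dtype=np.float32)],
#         minimum_deployment_target=ct.target.iOS16,
#     )
#     # a GRAYSCALE_FLOAT16 image input gives an fp16 program input dtype:
#     mlmodel = ct.convert(
#         tf_model,
#         inputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)],
#         minimum_deployment_target=ct.target.iOS16,
#     )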
assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp16") verify_prediction(mlmodel) def test_iO16_default_image_dtype_output( self, rank4_input_model_with_channel_first_output, rank4_grayscale_input_model_with_channel_first_output, ): """ We keep the output dtype for the image input model to fp32, unless it is GRAYSCALE_FLOAT16 """ # Example 1 mlmodel = ct.convert( rank4_input_model_with_channel_first_output, outputs=[ct.ImageType(color_layout=ct.colorlayout.RGB)], minimum_deployment_target=ct.target.iOS16, ) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp16") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp32") assert_spec_output_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.RGB) verify_prediction(mlmodel) # Example 2 mlmodel = ct.convert( rank4_input_model_with_channel_first_output, outputs=[ct.ImageType(color_layout=ct.colorlayout.BGR)], minimum_deployment_target=ct.target.iOS16, ) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp16") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp32") assert_spec_output_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.BGR) verify_prediction(mlmodel) # Example 3 mlmodel = ct.convert( rank4_grayscale_input_model_with_channel_first_output, outputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE)], minimum_deployment_target=ct.target.iOS16, ) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp16") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp32") assert_spec_output_image_type( mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE ) verify_prediction(mlmodel) # Example 4 mlmodel = ct.convert( rank4_grayscale_input_model_with_channel_first_output, outputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)], minimum_deployment_target=ct.target.iOS16, ) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp16") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp16") assert_spec_output_image_type( mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE_FLOAT16 ) verify_prediction(mlmodel) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/test/testing_utils.py0000644000000000000000000003542014672066616030267 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os import tempfile import numpy as np import pytest import coremltools.models.utils as coremltoolsutils from coremltools._deps import _HAS_TF_2 from coremltools.converters.mil.testing_reqs import ct from coremltools.converters.mil.testing_utils import ( compare_backend, ct_convert, validate_minimum_deployment_target, ) tf = pytest.importorskip("tensorflow", minversion="1.15.0") from tensorflow.python.framework import dtypes from tensorflow.python.keras.saving import saving_utils as _saving_utils from tensorflow.python.tools.freeze_graph import freeze_graph as freeze_g def make_tf_graph(input_types): """ Decorator to help construct TensorFlow 1.x model. Parameters ---------- input_types: list of tuple or list of list List of input types. E.g. [(3, 224, 224, tf.int32)] represent 1 input, with shape (3, 224, 224), and the expected data type is tf.int32. 
The dtype is optional, in case it's missing, tf.float32 will be used. Returns ------- tf.Graph, list of str, list of str """ def wrapper(ops): with tf.Graph().as_default() as model: inputs = [] for input_type in input_types: input_type = tuple(input_type) if input_type is not None else None if input_type is not None and len(input_type) > 0 and isinstance(input_type[-1], dtypes.DType): shape, dtype = input_type[:-1], input_type[-1] else: shape, dtype = input_type, tf.float32 inputs.append(tf.placeholder(shape=shape, dtype=dtype)) outputs = ops(*inputs) return model, inputs, outputs return wrapper def get_tf_keras_io_names(model): """ Utility function to get tf.keras inputs/outputs names from a tf.keras model. Parameter --------- model: tf.keras.Model """ input_names, output_names = [], [] try: # The order of outputs in conc_func.structured_outputs is the same order # that Keras predicts in, which can be different from model.outputs input_signature = _saving_utils.model_input_signature( model, keep_original_batch_size=True ) fn = _saving_utils.trace_model_call(model, input_signature) conc_func = fn.get_concrete_function() for key in conc_func.structured_outputs: output_names.append(conc_func.structured_outputs[key].name.split(":")[0]) except: for o in model.outputs: output_names.append(o.name.split(":")[0].split("/")[-1]) for name in model.input_names: input_names.append(name.split(":")[0]) return input_names, output_names def get_tf_node_names(tf_nodes, mode="inputs"): """ Inputs: - tf_nodes: list[str]. Names of target placeholders or output variable. - mode: str. When mode == inputs, do the stripe for the input names, for instance 'placeholder:0' could become 'placeholder'. when model == 'outputs', we keep the origin suffix number, like 'bn:0' will still be 'bn:0'. Return a list of names from given list of TensorFlow nodes. Tensor name's postfix is eliminated if there's no ambiguity. Otherwise, postfix is kept """ if not isinstance(tf_nodes, list): tf_nodes = [tf_nodes] names = list() for n in tf_nodes: tensor_name = n if isinstance(n, str) else n.name if mode == "outputs": names.append(tensor_name) continue name = tensor_name.split(":")[0] if name in names: # keep postfix notation for multiple inputs/outputs names[names.index(name)] = name + ":" + str(names.count(name) - 1) names.append(tensor_name) else: names.append(name) return names def tf_graph_to_mlmodel( graph, feed_dict, output_nodes, frontend="tensorflow", backend=("neuralnetwork", "fp32"), compute_unit=ct.ComputeUnit.CPU_ONLY, inputs_for_conversion=None, minimum_deployment_target=None, ): """ Parameters ---------- graph: tf.Graph TensorFlow 1.x model in tf.Graph format. feed_dict: dict of {tf.placeholder -> np.array or python primitive) Dict of placeholder and value pairs representing inputs. output_nodes: tf.node or list[tf.node] List of names representing outputs. frontend: str Frontend to convert from. backend: str Backend to convert to. compute_unit: Enum[ct.ComputeUnit]. Compute unit for the coreml model inputs_for_conversion: list of coremltools.TensorType() or coremltools.ImageType() objects Defaults to None. It is passed as is to the "inputs" argument of the converter. minimum_deployment_target : coremltools.target enumeration It set the minimum_deployment_target argument in the coremltools.convert function. 
----------- Returns MLModel, Input Values, Output Names """ if isinstance(output_nodes, tuple): output_nodes = list(output_nodes) if not isinstance(output_nodes, list): output_nodes = [output_nodes] # Convert TF graph. input_names = get_tf_node_names(list(feed_dict.keys()), mode="inputs") output_names = get_tf_node_names(output_nodes, mode="outputs") input_values = {name: val for name, val in zip(input_names, feed_dict.values())} if inputs_for_conversion is None and backend[0] == "mlprogram": # As mlprogram by default use a small upper-bound for dynamic shapes, set a larger one here # to avoid test failures. has_dynamic_shape = False input_types = [] for input_placeholder in list(feed_dict.keys()): input_shape = [ ct.RangeDim(upper_bound=64) if dim.value is None else dim.value for dim in input_placeholder.shape ] input_types.append( ct.TensorType(name=input_placeholder.name.split(":")[0], shape=input_shape) ) if any([dim.value is None for dim in input_placeholder.shape]): has_dynamic_shape = True if has_dynamic_shape: inputs_for_conversion = input_types mlmodel = ct_convert( graph, inputs=inputs_for_conversion, outputs=output_names, source=frontend, convert_to=backend, compute_units=compute_unit, minimum_deployment_target=minimum_deployment_target, ) return mlmodel, input_values, output_names, output_nodes def load_tf_pb(pb_file): """ Loads a pb file to tf.Graph """ # We load the protobuf file from the disk and parse it to retrieve the # unsterilized graph_def with tf.io.gfile.GFile(pb_file, "rb") as f: graph_def = tf.compat.v1.GraphDef() graph_def.ParseFromString(f.read()) # Then, we import the graph_def into a new Graph and returns it with tf.Graph().as_default() as graph: # The name var will prefix every op/nodes in your graph # Since we load everything in a new graph, this is not needed tf.import_graph_def(graph_def, name="") return graph def run_compare_tf( graph, feed_dict, output_nodes, inputs_for_conversion=None, compute_unit=ct.ComputeUnit.CPU_ONLY, frontend_only=False, frontend="tensorflow", backend=("neuralnetwork", "fp32"), atol=1e-04, rtol=1e-05, freeze_graph=False, tf_outputs=None, minimum_deployment_target=None, ): """ Utility function to convert and compare a given TensorFlow 1.x model. Parameters ---------- graph: tf.Graph TensorFlow 1.x model in tf.Graph format. feed_dict: dict of (tf.placeholder, np.array) Dict of placeholder and value pairs representing inputs. output_nodes: tf.node or list[tf.node] List of names representing outputs. inputs_for_conversion: list of coremltools.TensorType() or coremltools.ImageType() objects Defaults to None. It is passed as is to the "inputs" argument of the converter. compute_unit: Enum[ct.ComputeUnit]. Compute unit for the coreml model frontend_only: bool If true, skip the prediction call, only validate conversion. frontend: str Frontend to convert from. backend: str Backend to convert to. atol: float The absolute tolerance parameter. rtol: float The relative tolerance parameter. freeze_graph: bool If True, use the "tensorflow.python.tools.freeze_graph" function to freeze the TF graph prior to conversion. This will ensure that all the variables in the graph have been converted to constants. tf_outputs: float or list[float] If present, use it as TensorFlow predictions minimum_deployment_target : coremltools.target enumeration It set the minimum_deployment_target argument in the coremltools.convert function. 
Return: Proto, mlmodel, input dictionary, prediction(if possible) """ if not isinstance(output_nodes, (tuple, list)): output_nodes = [output_nodes] if freeze_graph: with tempfile.TemporaryDirectory() as model_dir: graph_def_file = os.path.join(model_dir, "tf_graph.pb") checkpoint_file = os.path.join(model_dir, "tf_model.ckpt") static_model_file = os.path.join(model_dir, "tf_static.pb") with tf.Session(graph=graph) as sess: sess.run(tf.global_variables_initializer()) if tf_outputs is None: tf_outputs = sess.run(output_nodes, feed_dict=feed_dict) tf.train.write_graph(sess.graph, model_dir, graph_def_file, as_text=False) saver = tf.train.Saver() saver.save(sess, checkpoint_file) output_node_names = get_tf_node_names(output_nodes, mode="outputs") output_node_names = [name.split(":")[0] for name in output_node_names] output_op_names = ",".join(output_node_names) freeze_g( input_graph=graph_def_file, input_saver="", input_binary=True, input_checkpoint=checkpoint_file, output_node_names=output_op_names, restore_op_name="save/restore_all", filename_tensor_name="save/Const:0", output_graph=static_model_file, clear_devices=True, initializer_nodes="", ) graph = load_tf_pb(static_model_file) mlmodel, input_key_values, output_names, output_nodes = tf_graph_to_mlmodel( graph, feed_dict, output_nodes, frontend, backend, compute_unit=compute_unit, inputs_for_conversion=inputs_for_conversion, minimum_deployment_target=minimum_deployment_target ) if frontend_only or coremltoolsutils._macos_version() < (10, 13) \ or (mlmodel.is_package and coremltoolsutils._macos_version() < (12, 0)): return mlmodel._spec, mlmodel, input_key_values, None if tf_outputs is None: with tf.Session(graph=graph) as sess: sess.run(tf.global_variables_initializer()) tf_outputs = sess.run(output_nodes, feed_dict=feed_dict) expected_outputs = {name: val for name, val in zip(output_names, tf_outputs)} for k, v in input_key_values.items(): if isinstance(v, np.ndarray) and issubclass(v.dtype.type, np.integer): input_key_values[k] = v.astype(float) # Core ML only accepts floats pred = None if not coremltoolsutils._has_custom_layer(mlmodel._spec): pred = compare_backend( mlmodel, input_key_values, expected_outputs, atol=atol, rtol=rtol, also_compare_shapes=True, dtype=backend[1], ) else: print('Skipping model prediction as it has a custom nn layer!') return mlmodel._spec, mlmodel, input_key_values, pred def layer_counts(spec, layer_type): spec_type_map = { "neuralNetworkClassifier": spec.neuralNetworkClassifier, "neuralNetwork": spec.neuralNetwork, "neuralNetworkRegressor": spec.neuralNetworkRegressor, } nn_spec = spec_type_map.get(spec.WhichOneof("Type")) if nn_spec is None: raise ValueError("MLModel must have a neural network") n = 0 for layer in nn_spec.layers: if layer.WhichOneof("layer") == layer_type: n += 1 return n class TensorFlowBaseTest: testclassname='' testmodelname='' @pytest.fixture(autouse=True) def store_testname_with_args(self, request): TensorFlowBaseTest.testclassname = type(self).__name__ TensorFlowBaseTest.testmodelname = request.node.name @staticmethod def run_compare_tf(graph, feed_dict, output_nodes, inputs_for_conversion=None, compute_unit=ct.ComputeUnit.CPU_ONLY, frontend_only=False, frontend="tensorflow", backend=("neuralnetwork", "fp32"), atol=1e-04, rtol=1e-05, freeze_graph=False, tf_outputs=None, minimum_deployment_target=None): if minimum_deployment_target is not None: validate_minimum_deployment_target(minimum_deployment_target, backend) res = run_compare_tf(graph, feed_dict, output_nodes, 
inputs_for_conversion=inputs_for_conversion, compute_unit=compute_unit, frontend_only=frontend_only, frontend=frontend, backend=backend, atol=atol, rtol=rtol, freeze_graph=freeze_graph, tf_outputs=tf_outputs, minimum_deployment_target=minimum_deployment_target ) alist = [] if res is not None: alist = list(res) alist.append(TensorFlowBaseTest.testclassname) alist.append(TensorFlowBaseTest.testmodelname) return tuple(alist) @staticmethod def _op_count_in_mil_program(mlmodel, op_type): prog = mlmodel._mil_program return len(prog.find_ops(op_type=op_type)) if _HAS_TF_2: from coremltools.converters.mil.frontend.tensorflow2.test.testing_utils import ( TensorFlow2BaseTest, make_tf2_graph) from coremltools.converters.mil.frontend.tensorflow.test.testing_utils import TensorFlowBaseTest TensorFlowBaseTest.run_compare_tf = TensorFlow2BaseTest.run_compare_tf2 make_tf_graph = make_tf2_graph ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2175465 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/0000755000000000000000000000000014672075535026655 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/__init__.py0000644000000000000000000000151714672066616030772 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from .cond_to_where import cond_to_where from .constant_propagation import constant_propagation # graph passes from .delete_asserts import delete_asserts from .delete_constant import delete_unnecessary_constant_nodes # graphdef to tfssa from .delete_disconnected_nodes import delete_disconnected_nodes from .functionalize_loops import functionalize_loops from .fuse_dilation_conv import fuse_dilation_conv from .insert_get_tuple import insert_get_tuple from .quantization_pass import quantization_pass from .tensor_array_transform import tensor_array_resource_removal from .variable_node_transform import remove_variable_nodes ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/cond_to_where.py0000644000000000000000000001043114672066616032045 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools import _logger as logger from coremltools._deps import _HAS_TF_2 from ..basic_graph_ops import delete_node, disconnect_edge from .visitors import FindAllUpstreamTerminals def compute_max_rank(graph): # highly inefficient way to calculate the rank of every node ret = {} # begin at max rank for v in graph.keys(): if len(graph[v].inputs) == 0: ret[v] = 0 else: ret[v] = len(graph) changes = True while changes: changes = False for v in graph.keys(): if len(graph[v].inputs) > 0: rank = max(ret[i] for i in graph[v].inputs) + 1 if ret[v] != rank: changes = True ret[v] = rank return ret class CondToWhere: @staticmethod def _search(g, node_name): """ Find the nearest Switch nodes upstream of node_name. 
""" node = g[node_name] switches = ( FindAllUpstreamTerminals(lambda x: x.op == "Switch") .visit(g, node.name) .get_result() ) if len(switches) == 0: switches = ( FindAllUpstreamTerminals( lambda x: x.op == "Switch" or x.attr.get("was_switch") is not None ) .visit(g, node.name) .get_result() ) return switches @staticmethod def _fix_found_cond(g, merge, switches): """ Convert a Merge's Switch nodes to Identity ops and the Merge to iff. """ if g[switches[0]].op == "Switch": condition_input = g[switches[0]].inputs[1] else: condition_input = g[switches[0]].attr["was_switch"] # convert the merge to a select # TensorFlow seems to ensure the condition that the first # merge input is the True branch and the second merge input # is the false branch. # we convert switches to identity, detaching to switch condition for s in switches: if g[s].op == "Switch": g[s].op = "Identity" g[s].attr["was_switch"] = g[s].inputs[1] # detach input 1: the switch condition if g[s].inputs[0] == g[s].inputs[1]: g[s].inputs.pop() g[g[s].inputs[0]].outputs.pop() else: disconnect_edge(g, g[s].inputs[1], s) # build the final select g[merge].op = "iff" if not _HAS_TF_2: # swap true branch with false branch to get the right semantics for IFF g[merge].inputs[0], g[merge].inputs[1] = ( g[merge].inputs[1], g[merge].inputs[0], ) g[merge].inputs = [condition_input] + g[merge].inputs g[condition_input].outputs.append(merge) def cond_to_where(self, graph): stuff_done = False g = graph ranks = compute_max_rank(graph) merges = [a for a in g if g[a].op == "Merge"] merges = sorted(merges, key=lambda k: ranks[k]) if len(merges) == 0: return False for m in merges: logger.debug("Fixing cond at merge location: %s", m) switches = self._search(g, m) self._fix_found_cond(g, m, switches) stuff_done = True # delete the extra switches that seem to just lead to identities # which then lead nowhere but into control dependencies extra_switches = [a for a in g if g[a].op == "Switch"] for s in extra_switches: if all( [g[o].op == "Identity" and len(g[o].outputs) == 0 for o in g[s].outputs] ): nodes_to_delete = g[s].outputs + [s] for d in nodes_to_delete: delete_node(g, d) stuff_done = True return stuff_done def cond_to_where(tfssa): for k, v in tfssa.functions.items(): while True: stuff_done = CondToWhere().cond_to_where(v.graph) if not stuff_done: break ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/constant_propagation.py0000644000000000000000000001530114672066616033463 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import gc import tensorflow as tf from packaging.version import Version from coremltools import _logger as logger from coremltools._deps import _get_version from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.types.type_mapping import \ numpy_val_to_builtin_val from ..basic_graph_ops import const_determined_nodes def _get_const_nodes(fn): from tensorflow.core.framework import graph_pb2, node_def_pb2 new_graph = graph_pb2.GraphDef() constant_nodes = set() constant_node_num_outputs = {} generated_nodes = [k for k, v in fn.graph.items() if v.original_node is None] const_nodes_in_this_graph = const_determined_nodes(fn.graph, set(generated_nodes)) # we can only run TF on nodes with outputs since we must evaluate # tensors and not ops const_nodes_in_this_graph = [ i for i in const_nodes_in_this_graph if fn.graph[i].op != "NoOp" ] constant_nodes = constant_nodes.union(set(const_nodes_in_this_graph)) # topological sort const nodes topsort = [] topsort_set = set() while len(const_nodes_in_this_graph) > 0: for n in const_nodes_in_this_graph: input_names = fn.graph[n].inputs if len(set(input_names).difference(topsort_set)) == 0: topsort.append(n) topsort_set.add(n) const_nodes_in_this_graph = set(const_nodes_in_this_graph).difference( topsort_set ) for node in topsort: new_node = node_def_pb2.NodeDef() new_node.CopyFrom(fn.graph[node].original_node) if "_class" in new_node.attr: del new_node.attr["_class"] del new_node.input[:] new_node.input.extend(fn.graph[node].inputs) if "_output_shapes" in fn.graph[node].attr: constant_node_num_outputs[node] = len(fn.graph[node].attr["_output_shapes"]) else: constant_node_num_outputs[node] = 1 new_graph.node.extend([new_node]) del new_node gc.collect() return new_graph, list(constant_nodes), constant_node_num_outputs def _constant_propagation(fn, new_graph, constant_nodes, constant_node_num_outputs): try: if len(constant_nodes) > 0: with tf.Graph().as_default() as graph: tf.import_graph_def(new_graph, name="") # We're only making one call to `sess.run()` in order to compute constant values. # In this context, the default optimization settings make everything dramatically # slower and more memory-intensive. 
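# The branch below picks the ConfigProto API matching the installed TF version
# and disables graph optimizations (optimizer opt_level L0; on TF >= 1.13.1 the
# Grappler meta-optimizer is switched off as well), since the imported graph is
# executed exactly once to fold constants and optimization time would be wasted.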
if _get_version(tf.__version__) < Version("1.13.1"): session_config = tf.ConfigProto() session_config.graph_options.optimizer_options.opt_level = ( tf.OptimizerOptions.L0 ) sess = tf.Session(graph=graph, config=session_config) else: session_config = tf.compat.v1.ConfigProto() session_config.graph_options.optimizer_options.opt_level = ( tf.compat.v1.OptimizerOptions.L0 ) session_config.graph_options.rewrite_options.disable_meta_optimizer = ( True ) sess = tf.compat.v1.Session(graph=graph, config=session_config) query_list = list() control_flow_ops = list() for c in constant_nodes: for j in range(constant_node_num_outputs[c]): query = c + ":" + str(j) lower_query = query.lower() if "switch" in lower_query or "cond" in lower_query: control_flow_ops.append(query) else: query_list.append(query) result_list = sess.run(query_list) result = { query_list[i]: result_list[i] for i in range(len(query_list)) } # propagate switch one by one for op in control_flow_ops: try: res = sess.run([op]) result.update({op: res[0]}) except: logger.warning( '[Constant Propagation] Skip "dead" tensor: {}'.format( op ) ) result.update({op: None}) sess.close() for k, v in fn.graph.items(): if k in constant_node_num_outputs: if constant_node_num_outputs[k] == 1: result_entry = k + ":0" try: v.value, v.datatype = numpy_val_to_builtin_val( result[result_entry] ) except: logger.error(result_entry) logger.error(result[result_entry]) else: values = [ result[k + ":" + str(i)] for i in range(constant_node_num_outputs[k]) ] try: npval = [numpy_val_to_builtin_val(i) for i in values] v.datatype = types.tuple(tuple([val[1] for val in npval])) v.value = v.datatype() for idx, val in enumerate(npval): v.value.val[idx] = val[0] except: logger.error(values) for k, v in fn.graph.items(): if v.op == "get_tuple": inp = fn.graph[v.inputs[0]] idx = v.attr["index"] if inp.value is not None: v.value = inp.value.val[idx] v.datatype = inp.datatype.T[idx] except Exception as e: logger.exception("Constant Propagation pass failed: {}".format(e)) def constant_propagation(tfssa): # we are going to rely on the TensorFlow graph to perform constant # propagation. For each graph, we construct a new graph comprising # only a subset of nodes that are constant nodes. for f in tfssa.functions.values(): const_nodes_info = _get_const_nodes(f) _constant_propagation(f, *const_nodes_info) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/delete_asserts.py0000644000000000000000000000452214672066616032240 0ustar00rootroot# -*- coding: utf-8 -*- # Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import sys from coremltools import _logger as logger from ..basic_graph_ops import delete_node sys.setrecursionlimit(5000) # increase recursion limit to support convert large models def _all_assert_leaves(gdict, nodename, memo): """ Does the given node lead to only assertions? Args: gdict (dict): The node's graph. nodename (str): The name of the node to test. memo (dict): Storage for memoization. """ work = [nodename] while True: assert len(work) <= len(gdict) # If true, this algorithm is broken node = gdict[work.pop()] # Entries in memo have one of the following values for a given node: # None: the node is in the stack; this node is downstream. 
# True: the node is an assertion or leads only to assertions. # False: the node does not lead only to assertions. if not isinstance(memo.get(node.name), bool): memo[node.name] = None outputs = node.outputs if len(outputs) == 0: # Leaf node: stack shrinks memo[node.name] = node.op in ("Assert", "CheckNumerics") else: outputs_to_process = [n for n in outputs if n not in memo] if len(outputs_to_process) == 0: # Non-leaf node with fully processed outputs: stack shrinks memo[node.name] = all(memo[n] for n in outputs) else: # Non-leaf node with unprocess outputs: stack grows work.append(node.name) work.extend(outputs_to_process) if len(work) == 0: return memo[node.name] def delete_asserts(tfssa): """ Delete all nodes that lead only to assertions. """ delete_count = 0 for f in tfssa.functions.values(): memo = {} for n in f.graph: _all_assert_leaves(f.graph, n, memo) for m in memo: if memo[m]: delete_count += 1 delete_node(f.graph, m) logger.debug("%d assert nodes deleted", delete_count) return delete_count ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/delete_constant.py0000644000000000000000000000576314672066616032415 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools import _logger as logger from ..basic_graph_ops import check_connections, delete_node, disconnect_edge def convert_constant_nodes_to_const_ops(tfssa): """ Convert nodes with known constant value to Const nodes """ for fn_key in list(tfssa.functions.keys()): f = tfssa.functions[fn_key] for k in list(f.graph.keys()): v = f.graph.get(k, None) if v is None: continue if v.value is not None: v.op = "Const" # delete all upstream edges now that this is constant inv = v.inputs[:] for i in inv: curnode = i nextnode = v.name disconnect_edge(f.graph, curnode, nextnode) # keep deleting upwards as long as it is a chain while curnode is not None: prevnode = None if len(f.graph[curnode].outputs) == 0: if len(f.graph[curnode].inputs) == 1: prevnode = f.graph[curnode].inputs[0] delete_node(f.graph, curnode) curnode = prevnode def delete_nodes_with_only_constant_descendents(tfssa): # look for nodes whose value is known AND downstream values are known # and delete them delete_count = 0 for fn_key in list(tfssa.functions.keys()): f = tfssa.functions[fn_key] keys = list(f.graph.keys()) for k in keys: if k not in f.graph: continue to_delete = (f.graph[k].value is not None) and (k not in f.outputs) if to_delete: # check the outputs for o in f.graph[k].outputs: if f.graph[o].value is None: to_delete = False else: disconnect_edge(f.graph, k, o) if to_delete: delete_count += 1 delete_node(f.graph, k) # also delete all Const nodes with no descendants keys = list(f.graph.keys()) for k in keys: if k not in f.graph: continue if ( f.graph[k].op == "Const" and len(f.graph[k].outputs) == 0 and (k not in f.outputs) ): delete_count += 1 delete_node(f.graph, k) return delete_count def delete_unnecessary_constant_nodes(tfssa): delete_count = delete_nodes_with_only_constant_descendents(tfssa) for f in list(tfssa.functions.values()): check_connections(f.graph) convert_constant_nodes_to_const_ops(tfssa) logger.debug("%s nodes deleted", delete_count) return delete_count ././@PaxHeader0000000000000000000000000000021100000000000010207 xustar00115 
path=coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/delete_disconnected_nodes.py 22 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/delete_disconnected_nod0000644000000000000000000000123514672066616033425 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause def delete_disconnected_nodes(gd): # delete all nodes with no inputs and outputs empty_nodes = [] for k, v in gd.items(): if ( len(gd[k].inputs) == 0 and len(gd[k].outputs) == 0 and len(gd[k].control_inputs) == 0 and len(gd[k].control_outputs) == 0 and gd[k].op != "Placeholder" ): empty_nodes.append(k) for k in empty_nodes: del gd[k] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/functionalize_loops.py0000644000000000000000000004514714672066616033330 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools import _logger as logger from ..basic_graph_ops import (connect_dests, connect_edge, connect_sources, delete_node, disconnect_edge, replace_dest, replace_source) from ..parsed_tf_node import ParsedTFNode from ..tfssa import SSAFunction from .visitors import (FindAllReachableNodes, FindImmediateDownstreamNodes, FindImmediateUpstreamNodes, FindSubgraph) class FunctionalizeLoops: """ Turns while loops in TensorFlow dataflow graph into the functional form: while(cond_function, body_function) Usage: Given a graph in tfssa (the NetworkEnsemble defined in network.py) form: This will functionalize *ONE* loop in the main function. f = FunctionalizeLoops() ret = f.functionalize_loops(self, tfssa, "main") if ret is True, one loop has been functionalized, and the new functions added to tfssa. If False, there is no loop to functionalize. Generally, repeated calls to this will be necessary to catch all loops. Instead, use functionalize_loops. 
""" def __init__(self): self.exits = None self.merges = None self.enters = None self.constant_enters = None self.switches = None self.subgraph = None self.loopcond = None self.is_constant = None self.next_iterations = None self.cond = None self.body = None def _search(self, g, node): if not isinstance(node, ParsedTFNode): node = g[node] # we look for NextIteration nodes assert node.op == "Enter" frame_name = node.attr["frame_name"] logger.debug("Fixing frame name: %s", frame_name) # find all the enter args # this is basically the enter frame # functionalize_control_flow.cc:FunctionalizeControlFlow (1160-1196) self.enters = [ k for k, v in g.items() if v.attr.get("frame_name", "") == frame_name ] self.is_constant = [ bool(g[n].attr.get("is_constant", False)) for n in self.enters ] self.merges = ( FindImmediateDownstreamNodes(lambda x: x.op == "Merge") .visit_many(g, self.enters) .get_result() ) self.next_iterations = ( FindImmediateUpstreamNodes(lambda x: x.op == "NextIteration") .visit_many(g, self.merges) .get_result() ) self.switches = ( FindImmediateDownstreamNodes(lambda x: x.op == "Switch") .visit_many(g, self.merges) .get_result() ) self.exits = ( FindImmediateDownstreamNodes(lambda x: x.op == "Exit") .visit_many(g, self.switches) .get_result() ) self.loopcond = list( set( FindImmediateUpstreamNodes(lambda x: x.op == "LoopCond") .visit_many(g, self.switches) .get_result() ) ) self.subgraph = FindSubgraph(self.exits).visit_many(g, self.enters).get_result() self.cond = FindSubgraph(self.switches).visit_many(g, self.merges).get_result() self.body = ( FindSubgraph([node.name] + self.exits) .visit_many(g, self.switches) .get_result() ) # drop merges and switches from cond and body self.cond = [ i for i in self.cond if i not in (self.merges + self.switches + self.enters) ] self.body = ( [i for i in self.body if i not in ([node.name] + self.switches)] + [node.name] + self.switches + self.merges + self.enters ) # ok. we can now rebuild. 
def _fix_graph_invariants(self, g): import copy check = lambda x: x is not None and len(x) > 0 check(self.exits) check(self.merges) check(self.enters) check(self.switches) check(self.subgraph) check(self.cond) check(self.loopcond) assert len(self.loopcond) == 1 # maintain the invariant of a unique Enter node per argument # functionalize_control_flow.cc:FunctionalizeLoop (295) for i in copy.copy(self.enters): node = g[i] assert len(node.outputs) > 0 assert len(node.inputs) == 1 assert len(node.control_inputs) == 0 assert len(node.control_outputs) == 0 if len(node.outputs) == 1: continue node_output_copy = copy.copy(node.outputs) for j in range(1, len(node_output_copy)): # make a new enter node for each new_enter_node = copy.deepcopy(node) new_enter_node.inputs = [] new_enter_node.outputs = [] new_enter_node.name = node.name + "/trsplit%d" % (j) g[new_enter_node.name] = new_enter_node logger.debug("splitting %s", node.name) # connect the new node enter_output = node_output_copy[j] disconnect_edge(g, node.name, enter_output) connect_edge(g, new_enter_node.name, enter_output) connect_sources(g, node.inputs, new_enter_node.name) # insert into graph self.enters.append(new_enter_node.name) def functionalize_loops(self, tfssa, function_to_functionalize): g = tfssa.functions[function_to_functionalize].graph loopni = [a for a in g if g[a].op == "Enter"] if len(loopni) == 0: return False self._search(g, loopni[0]) self.constant_enters = [ self.enters[i] for i in range(len(self.enters)) if self.is_constant[i] ] self.enters = [ self.enters[i] for i in range(len(self.enters)) if not self.is_constant[i] ] self._fix_graph_invariants(g) # for each enter node, find the corresponding downstream merge node enter_corresponding_merge = [ FindImmediateDownstreamNodes(lambda x: x.op == "Merge") .visit(g, enter) .get_result()[0] for enter in self.enters ] merge_corresponding_ni = [ FindImmediateUpstreamNodes(lambda x: x.op == "NextIteration") .visit(g, merge) .get_result()[0] for merge in enter_corresponding_merge ] switch_corresponding_merge = [] for merge in enter_corresponding_merge: switch_after_merge = ( FindImmediateDownstreamNodes(lambda x: x.op == "Switch") .visit(g, merge) .get_result() ) if len(switch_after_merge) > 0: switch_corresponding_merge.append(switch_after_merge[0]) else: # There are some situations there is no switch not for a given # merge. While odd... its ok. 
we construct one # In this situation there is no Exit either, but it can be # constructed later on new_switch_node = ParsedTFNode() new_switch_node.op = "Switch" new_switch_node.name = tfssa._find_free_name("fake_switch_") g[new_switch_node.name] = new_switch_node connect_edge(g, merge, new_switch_node.name) connect_edge(g, self.loopcond[0], new_switch_node.name) switch_corresponding_merge.append(new_switch_node.name) exit_corresponding_switch = [] for switch in switch_corresponding_merge: res = ( FindImmediateDownstreamNodes(lambda x: x.op == "Exit") .visit(g, switch) .get_result() ) if len(res) > 0: exit_corresponding_switch.append(res[0]) else: new_exit_node = ParsedTFNode() new_exit_node.op = "Exit" new_exit_node.name = tfssa._find_free_name("fake_exit_") g[new_exit_node.name] = new_exit_node connect_edge(g, switch, new_exit_node.name) exit_corresponding_switch.append(new_exit_node.name) while_loop = ParsedTFNode() while_loop.op = "while" while_loop.name = tfssa._find_free_name("while_") g[while_loop.name] = while_loop # Build the Loop Condition # replace all enters with a single make_tuple # we replace merge with get_tuple and turn it into a function call # terminated with LoopCond make_inputs = ParsedTFNode() make_inputs.op = "make_tuple" make_inputs.name = tfssa._find_free_name("make_input_") g[make_inputs.name] = make_inputs for enter in self.enters: replace_dest(g, g[enter].inputs[0], enter, make_inputs.name) constant_base_index = len(make_inputs.inputs) for enter in self.constant_enters: replace_dest(g, g[enter].inputs[0], enter, make_inputs.name) connect_edge(g, make_inputs.name, while_loop.name) connect_dests(g, while_loop.name, exit_corresponding_switch) # build the cond function cond_body = ParsedTFNode() cond_body.op = "function_entry" cond_body.name = tfssa._find_free_name("cond_function_") cond_body.inputs = [] g[cond_body.name] = cond_body for merge_idx in range(len(enter_corresponding_merge)): merge = enter_corresponding_merge[merge_idx] switch = switch_corresponding_merge[merge_idx] enter_node = g[self.enters[merge_idx]] merge_node = g[merge] if switch is not None: switch_node = g[switch] else: switch_node = None merge_node.op = "get_tuple" merge_node.attr = {"index": merge_idx} # disconnect merge from switch # disconnect loopcond from switch disconnect_edge(g, enter_node.name, merge_node.name) if switch_node is not None: disconnect_edge(g, merge_node.name, switch_node.name) disconnect_edge(g, self.loopcond[0], switch_node.name) for i in merge_node.inputs[:]: disconnect_edge(g, i, merge_node.name) connect_edge(g, cond_body.name, merge_node.name) # delete get_tuple if it does nothing if len(merge_node.outputs) == 0: delete_node(g, merge) g[self.loopcond[0]].op = "return" # build the body function body = ParsedTFNode() body.op = "function_entry" body.name = tfssa._find_free_name("body_function_") body.inputs = [] g[body.name] = body for switch_idx in range(len(switch_corresponding_merge)): switch = switch_corresponding_merge[switch_idx] exit = exit_corresponding_switch[switch_idx] disconnect_edge(g, switch, exit) # replace switch with a get_tuple switch_node = g[switch] switch_node.op = "get_tuple" switch_node.attr = {"index": switch_idx} connect_edge(g, body.name, switch_node.name) # delete get_tuple if it does nothing if len(switch_node.outputs) == 0: delete_node(g, switch) # replace all next_iteration with a single make_tuple # we replace merge with get_tuple and turn it into a function call # terminated with LoopCond make_outputs = ParsedTFNode() make_outputs.op = 
"make_tuple" make_outputs.name = tfssa._find_free_name("make_output_") g[make_outputs.name] = make_outputs for ni in merge_corresponding_ni: connect_edge(g, g[ni].inputs[0], make_outputs.name) # connect constant enters to come from function # connect constant enters to exit for idx, enter in enumerate(self.constant_enters): for output in list(g[enter].outputs): if output not in self.cond and output not in self.body: cond_intersection = ( FindSubgraph(self.cond).visit(g, output).get_result() ) body_intersection = ( FindSubgraph(self.body).visit(g, output).get_result() ) if len(cond_intersection) > 0: cond_intersection.append(output) self.cond += cond_intersection if len(body_intersection) > 0: body_intersection.append(output) self.body += body_intersection get_tuple = ParsedTFNode() get_tuple.op = "get_tuple" get_tuple.name = tfssa._find_free_name("get_tuple_const_") get_tuple.attr = {"index": idx + constant_base_index} g[get_tuple.name] = get_tuple if output in self.cond: connect_edge(g, cond_body.name, get_tuple.name) elif output in self.body: connect_edge(g, body.name, get_tuple.name) replace_source(g, enter, output, get_tuple.name) # body must accept and return everything get_tuple = ParsedTFNode() get_tuple.op = "get_tuple" get_tuple.name = tfssa._find_free_name("get_tuple_const_") get_tuple.attr = {"index": idx + constant_base_index} g[get_tuple.name] = get_tuple connect_edge(g, body.name, get_tuple.name) connect_edge(g, get_tuple.name, make_outputs.name) assert len(g[make_outputs.name].inputs) == len(g[make_inputs.name].inputs) output_return = ParsedTFNode() output_return.op = "return" output_return.name = tfssa._find_free_name("body_return_") g[output_return.name] = output_return connect_edge(g, make_outputs.name, output_return.name) while_loop.attr["cond_function"] = cond_body.name while_loop.attr["body_function"] = body.name for i in self.enters: delete_node(g, i) for i in self.next_iterations: delete_node(g, i) for i in self.constant_enters: delete_node(g, i) for i in range(len(exit_corresponding_switch)): exit_node = exit_corresponding_switch[i] g[exit_node].op = "get_tuple" g[exit_node].attr = {"index": i} cond_function = ( FindSubgraph(self.loopcond[0]).visit(g, cond_body.name).get_result() ) cond_function = set(cond_function + [self.loopcond[0], cond_body.name]) body_function = ( FindSubgraph(output_return.name).visit(g, body.name).get_result() ) body_function = set(body_function + [body.name, output_return.name]) # trace input constants associated with the cond_graph # and the body_graph. These constants can only have one consumer # for now. Any more and we will either need to associate # it as an argument, or split the constant. 
cond_constants = ( FindImmediateUpstreamNodes(lambda x: x.op == "Const") .visit_many(g, cond_function) .get_result() ) body_constants = ( FindImmediateUpstreamNodes(lambda x: x.op == "Const") .visit_many(g, body_function) .get_result() ) # for const_node in cond_constants + body_constants: # assert(len(g[const_node].outputs) == 1) cond_function = cond_function.union(set(cond_constants)) body_function = body_function.union(set(body_constants)) downstream_cond = ( FindAllReachableNodes(lambda x: True) .visit_many(g, cond_function) .get_result() ) downstream_cond = set(downstream_cond) - cond_function if len(downstream_cond) > 0: logger.debug( "Disconnecting unused variables in condition function %s", downstream_cond, ) for i in downstream_cond: delete_node(g, i) downstream_body = ( FindAllReachableNodes(lambda x: True) .visit_many(g, body_function) .get_result() ) downstream_body = set(downstream_body) - body_function if len(downstream_body) > 0: logger.debug( "Disconnecting unused variables in body function %s", downstream_body ) for i in downstream_body: delete_node(g, i) cond_graph = {k: v for k, v in g.items() if k in cond_function} body_graph = {k: v for k, v in g.items() if k in body_function} g = { k: v for k, v in g.items() if k not in cond_function and k not in body_function } # localize control dependencies # In the main graph, reattach the control dependency to the while op for k, v in g.items(): for idx in range(len(v.control_inputs)): if v.control_inputs[idx] not in g: v.control_inputs[idx] = while_loop.name while_loop.control_outputs.append(k) for idx in range(len(v.control_outputs)): if v.control_outputs[idx] not in g: v.control_outputs[idx] = while_loop.name while_loop.control_inputs.append(k) # in the cond and body graphs, drop non-local control dependencies # entirely for graph in [cond_graph, body_graph]: for k, v in graph.items(): for idx in range(len(v.control_inputs) - 1, -1, -1): if v.control_inputs[idx] not in graph: v.control_inputs.pop(idx) for idx in range(len(v.control_outputs) - 1, -1, -1): if v.control_outputs[idx] not in graph: v.control_outputs.pop(idx) tfssa.functions[function_to_functionalize] = SSAFunction(g) tfssa.add_function(cond_body.name, SSAFunction(cond_graph)) tfssa.add_function(body.name, SSAFunction(body_graph)) return True def functionalize_loops(tfssa): """ Functionalize all loops in an tfssa """ done = False while not done: done = True for f in list(tfssa.functions.keys()): functionalize = FunctionalizeLoops() ret = functionalize.functionalize_loops(tfssa, f) if ret: done = False ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/fuse_dilation_conv.py0000644000000000000000000001454214672066616033107 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from ..basic_graph_ops import delete_node, replace_source from coremltools.converters.mil.mil.passes.defs.optimize_conv import fuse_dilated_conv _try_same = fuse_dilated_conv._uses_same_padding def _pattern_match_and_rewrite(gddict, conv_op): node = gddict[conv_op] channel_first = node.attr["data_format"].startswith("NC") if len(node.inputs) == 0 or len(node.outputs) == 0: return prev_node = gddict[node.inputs[0]] next_node = gddict[node.outputs[0]] expand_node = None squeeze_node = None # Check for Conv1D cases if prev_node.op == "ExpandDims": # All Conv1D has ExpandDims and Squeeze as pairs. if next_node.op != "Squeeze": return expand_node = prev_node squeeze_node = next_node if len(prev_node.inputs) == 0 or len(next_node.outputs) == 0: return prev_node = gddict[prev_node.inputs[0]] next_node = gddict[next_node.outputs[0]] # Check if Conv1D/Conv2D is surrounded by SpaceToBatchND and BatchToSpaceND if prev_node.op != "SpaceToBatchND" or next_node.op != "BatchToSpaceND": return else: stb_node = prev_node bts_node = next_node dilation_node = gddict[stb_node.inputs[1]] if dilation_node.value is None: return dilation_factor = dilation_node.value.val if gddict[bts_node.inputs[1]].value is None or np.any( dilation_factor != gddict[bts_node.inputs[1]].value.val ): # If SpaceToBatchND and BatchToSpaceND doesn't match, we do not fuse. return padding_node = gddict[stb_node.inputs[2]] if padding_node.value is None: return padding_val = padding_node.value.val.flatten() crop_node = gddict[bts_node.inputs[2]] if crop_node.value is None: return crop_val = crop_node.value.val.flatten() if expand_node: dilation_factor = [1] + list(dilation_factor) padding_val = [0, 0] + list(padding_val) crop_val = [0, 0] + list(crop_val) # Trying to inverse the logic of TF generating padding/cropping values for # SpaceToBatchND and BatchToSpaceND with different padding values in Conv2D. # Logic extracted from TF's builder at: # tensorflow/python/ops/nn_ops.py and tensorflow/python/ops/array_ops.py is_same = False if np.any(padding_val != 0): input_shape = gddict[stb_node.inputs[0]].attr.get("_output_shapes", None) if input_shape is None: input_shape = gddict[stb_node.inputs[0]].attr.get("shape", None) else: input_shape = input_shape[0] W_node = gddict[node.inputs[1]] W_shape = None if W_node.op != "Const" else W_node.datatype.get_shape() if input_shape is None or W_shape is None: return W_h, W_w = W_shape[0], W_shape[1] HW = input_shape[2:] if channel_first else input_shape[1:-1] if expand_node: HW = [1] + list(HW) is_same = _try_same( HW[0], HW[1], W_h, W_w, dilation_factor, padding_val, crop_val ) # Re-wiring the nodes to skip SpaceToBatchND. # We change BatchToSpaceND to Identity since it might be a terminate op. 
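# Sketch of the rewrite performed below (illustrative only):
#   before: x -> SpaceToBatchND -> [ExpandDims ->] (Depthwise)Conv2D [-> Squeeze] -> BatchToSpaceND -> y
#   after : x -> [ExpandDims ->] (Depthwise)Conv2D(dilations=...) [-> Squeeze] -> Identity -> y
# For example, a dilation_factor of [2, 2] becomes the dilations attribute
# [1, 1, 2, 2] for NCHW and [1, 2, 2, 1] for NHWC, matching the attribute
# rewrite further down in this function.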
deleted_nodes = set() if expand_node: replace_source(gddict, stb_node, expand_node, stb_node.inputs[0]) else: replace_source(gddict, stb_node, node, stb_node.inputs[0]) bts_node.op = "Identity" bts_node.attr = {} deleted_nodes.update(stb_node.inputs[1:]) deleted_nodes.update([stb_node.name]) deleted_nodes.update(bts_node.inputs[1:]) # Rewrite dilation attribute for (Depthwise)Conv2D dilation_val = ( [1, 1] + list(dilation_factor) if node.attr["data_format"] == "NCHW" else [1] + list(dilation_factor) + [1] ) node.attr["dilations"] = dilation_val # Rewrite padding attribute for (Depthwise)Conv2D # This is due to, TF always plug in VALID padding for Conv2D after # SpaceToBatchND. If, the original Conv2D is SAME padding, TF would # automatically insert padding, therefore, we set it as SAME over here. if is_same: node.attr["padding"] = "SAME" # Removing stale attributes for nodes. if expand_node and "_output_shapes" in expand_node.attr: del expand_node.attr["_output_shapes"] if squeeze_node and "_output_shapes" in squeeze_node.attr: del squeeze_node.attr["_output_shapes"] if "_output_shapes" in node.attr: del node.attr["_output_shapes"] if expand_node and "shape" in expand_node.attr: del expand_node.attr["shape"] if squeeze_node and "shape" in squeeze_node.attr: del squeeze_node.attr["shape"] if "shape" in node.attr: del node.attr["shape"] for d in deleted_nodes: delete_node(gddict, d) def _fuse_dilation_conv(gddict): """ A dilated convolution in older tensorflow versions might not be fused in the Conv2D or DepthwiseConv2D op, but represented with the following format: SpaceToBatchND -> (Depthwise)Conv2D -> BatchToSpaceND We try to fuse it back into (Depthwise)Conv2D with the dilation parameter set in attribute. There are several patterns that exist in tensorflow for breaking up dilation convolutions. We detect the following patterns: SpaceToBatchND -> ExpandDims -> Conv2D -> Squeeze -> BatchToSpaceND SpaceToBatchND -> Conv2D -> BatchToSpaceND The first case appears when Conv1D is used, TF expands/squeeze the inputs to conform Conv2D pattern. The second case is a basic Conv2D pattern. """ for name in list(gddict.keys()): if name not in gddict: # Node might have been removed from graph during fusion. continue node = gddict[name] if node.op in {"Conv2D", "DepthwiseConv2dNative"}: _pattern_match_and_rewrite(gddict, name) def fuse_dilation_conv(tfssa): """ Tensorflow decomposes Depthwise Convolution with dialtion into: SpaceToBatchND ---> Conv2D/DepthwiseConv2D ---> BatchToSpaceND We identify such pattern and use Conv2D/DepthwiseConv2D to represent it. """ for f in tfssa.functions.keys(): _fuse_dilation_conv(tfssa.functions[f].graph) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/insert_get_tuple.py0000644000000000000000000000650614672066616032612 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy from ..parsed_tf_node import ParsedTFNode def insert_get_tuple(gddict): """ TensorFlow uses input "nodename:i" to denote "get tuple i" from "nodename". Here we split it so that: node1:i -> node2 gets transformed into node1 -> get_tuple(i) --> node2 Takes a graph in "dict{str, ParsedTFNode}" form, and returns a new graph. 
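For example (illustrative node names): an input reference "split:1" on a
consumer node is rewritten to "gto_1", where "gto_1" is a newly inserted
ParsedTFNode with op "get_tuple", inputs ["split"] and attr {"index": 1}.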
We do not do this for control flow nodes(Switch, Enter, Exit, Merge LoopCond, NextIteration). For these nodes, we just convert node1:i -> node2 to node1 -> node2 """ retdict = {} get_tuple_op_var_index = 1 inserted_ops = {} def make_op(input_node, index, new_node_name, gto_make_op_cache): cache_key = ( input_node, index, ) if cache_key in gto_make_op_cache: return gto_make_op_cache[cache_key] inserted_op_name = new_node_name inserted_op = ParsedTFNode() inserted_op.name = inserted_op_name inserted_op.op = "get_tuple" inserted_op.inputs = [input_node] inserted_op.attr["index"] = index inserted_ops[inserted_op_name] = inserted_op gto_make_op_cache[cache_key] = inserted_op return inserted_op exclusions = [ "Switch", "Enter", "Exit", "Merge", "LoopCond", "NextIteration", "TensorArrayV3", "Const", ] inclusions = ["IdentityN", "Split", "SplitV", "LSTMBlockCell", "TopK", "TopKV2", "Unpack", "BlockLSTM", "BlockLSTMV2", "NonMaxSuppressionV5"] gto_make_op_cache = {} for name in list(gddict.keys()): new_node = ParsedTFNode() new_node = copy.deepcopy(gddict[name]) new_inputs = [] for idx in range(len(new_node.inputs)): if ":" in new_node.inputs[idx]: input_node, input_index = new_node.inputs[idx].split(":") else: input_node = new_node.inputs[idx] input_index = 0 if ( "_output_shapes" in gddict[input_node].attr and len(gddict[input_node].attr["_output_shapes"]) > 1 and gddict[input_node].op not in exclusions ) or (gddict[input_node].op in inclusions): get_tuple_node_name = "gto_%s" % (get_tuple_op_var_index) new_inputs.append( make_op( input_node, int(input_index), get_tuple_node_name, gto_make_op_cache, ).name ) get_tuple_op_var_index += 1 else: new_inputs.append(new_node.inputs[idx]) new_node.inputs = new_inputs retdict[name] = new_node for k, v in inserted_ops.items(): retdict[k] = v # Force fix up the remaining node names by dropping the : # for k, v in retdict.items(): for idx in range(len(v.inputs)): if ":" in v.inputs[idx]: nodename, nodeindex = v.inputs[idx].split(":") v.inputs[idx] = nodename return retdict ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/quantization_pass.py0000644000000000000000000000562614672066616033014 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ..basic_graph_ops import delete_node def delete_fakequant_node_and_repair_graph(g, node): inputs = node.inputs # Delete const inputs of the fakequant op for i in inputs: if g[i].op == 'Const': delete_node(g, i) else: non_const_input = i outputs = node.outputs # Append FakeQuant Op's outputs to its input node's outputs g[non_const_input].outputs = [i for i in g[non_const_input].outputs if i != node.name] g[non_const_input].outputs.extend(outputs) # Modify the FakeQuant op's outputs to set FakeQuant op's parent node as the new input. 
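# Illustrative example of the repair (hypothetical node names): for a chain
# conv1 -> FakeQuantWithMinMaxVars -> relu1, conv1.outputs now contains
# "relu1" and, once the loop below runs, relu1.inputs points at "conv1"
# directly; the FakeQuant node and its min/max Const inputs are then deleted.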
for i in outputs: for j in range(len(g[i].inputs)): if g[i].inputs[j] == node.name: g[i].inputs[j] = non_const_input delete_node(g, node) def quantization_pass_impl(fn): all_quantization_ops = [i for i in fn.graph.values() if "FakeQuant" in i.op] for node in all_quantization_ops: is_const_input = True for input in node.inputs: if fn.graph[input].op != 'Const': is_const_input = False if not is_const_input and ('weights_quant' not in input): # If activation quantization - # Delete the FakeQuant op and its const inputs, # Append FakeQuant Op's outputs to its input node's outputs, # Modify the FakeQuant op's outputs to reflect the 'new' input node. delete_fakequant_node_and_repair_graph(fn.graph, node) else: # If weight quantization - # Add attributes of the FakeQuant op to its output's attr dict for output in node.outputs: output_node = fn.graph[output] output_node.attr['quantize'] = True output_node.attr['num_bits'] = node.attr['num_bits'] output_node.attr['narrow_range'] = node.attr['narrow_range'] output_node.attr['quantize_min'] = fn.graph[node.inputs[1]].value.val output_node.attr['quantize_max'] = fn.graph[node.inputs[2]].value.val def quantization_pass(tfssa): """ Delete activation quantization ops and repair TF graph: If the FakeQuant op is not connected to constant inputs (which means that the op performs activation quantization) then delete that FakeQuant op and repair the graph. Edit weight quantization ops: If the FakeQuant op is connected to constant inputs then add its attributes to its output op so that parameters min, max, narrow_range, num_bits are available (in addition to weights) to downstream ops for denoting and supporting weight quantization. """ for v in tfssa.functions.values(): quantization_pass_impl(v) ././@PaxHeader0000000000000000000000000000020600000000000010213 xustar00112 path=coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/tensor_array_transform.py 22 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/tensor_array_transform.0000644000000000000000000000710114672066616033460 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause # A TensorArray is essentially a runtime vector with # # - an optional requirement "infer_shape" (True by default) that all Tensors # stored within the vector have the same size/shape (inferred by the # first element stored into the tensor) # - an optional "element_shape" which requires all elements to have this # exact shape. # - an optional "clear_after_read" (True by default) where read of an index # is destructive. (It doesn't *really* destroy, but just enables a particular # optimization where the tensor memory can be reused). # - An optional "dynamic_size" (False by default) where the vector is resized # automatically at runtime # # The way it works is rather odd. To enforce "control dependency" constraints, # a single float (flow) variable is passed between operations that write/read # the TensorArray. Additionally, a "Resource" variable is also passed along # which contains the actual handle to the TensorArray. # # The TensorArray can therefore also be passed around as as argument to while # loops. Thus unlike a global "Variable", this really is better thought of as # an additional type, a list[tensor]. 
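# A typical TF1-style construction that produces this pattern looks roughly
# like the following (illustrative, not executed here):
#
#     ta = tf.TensorArray(dtype=tf.float32, size=3)  # TensorArrayV3: resource + flow outputs
#     ta = ta.write(0, tf.constant([1.0, 2.0]))      # TensorArrayWriteV3: takes and returns flow
#     first = ta.read(0)                             # TensorArrayReadV3: consumes resource + flow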
# # See: # # https://github.com/tensorflow/tensorflow/blob/r1.6/tensorflow/python/ops/tensor_array_ops.py # https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/tensor_array.h # https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/tensor_array.cc # https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/ops/data_flow_ops.cc # https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/tensor_array_ops.cc # # The way we transform it is to introduce a new type. list[tensor] # The flow variable is the list[tensor] since that is consistently passed through # every operation. # The 'resource' edges then gets passed as void. # # We would like to delete the resource edges, but once too many graph passes are # performed, this becomes very difficult (since tuple shapes have to be updated). # The ideal is to perform the resource edge deletion *BEFORE* any additional # graph transformations. # The conversion of the flow variable to list[tensor] can be performed during # type inference. # # # After this op: # All nodes which take a TensorArray resource input will have the resource input # edge deleted. # # TensorArrayV3 op will only have 1 output, a flow variable. def tensor_array_resource_removal(gd): # this should be called *BEFORE* introduction of tuples, # and before output edges are added (for simplicity) for k, node in gd.items(): if node.op.startswith("TensorArray") and node.op != "TensorArrayV3": # generally the resource edge is the first edge # input is resource, indices, flow # output is generally flow node.inputs = node.inputs[1:] # TensorArrayV3 node outputs resource and flow # shift all flow reads from TensorArray to output 0 of TensorArray for i in range(len(node.inputs)): if ":" in node.inputs[i]: input_node, input_index = node.inputs[i].split(":") input_index = int(input_index) else: input_node = node.inputs[i] input_index = 0 if gd[input_node].op == "TensorArrayV3": if input_index == 1: node.inputs[i] = "%s" % input_node ././@PaxHeader0000000000000000000000000000020700000000000010214 xustar00113 path=coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/variable_node_transform.py 22 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/variable_node_transform0000644000000000000000000000552214672066616033471 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ..basic_graph_ops import delete_node, disconnect_vertex_ins # Variable nodes are not horribly complicated. # # There are Variable nodes which don't really do much on their own # # To initialize, there is an additional Assign op which is just dangling away # on one side which assigns from "Variable/initial_value". # # [Variable] --> Assign <-- Const (VariableName/initial_value) # | # | ... rest of graph ... # v # ... Assign <---- New Values # ... etc # # Reads of the variable go through an Identity node with the name # VariableName/read, and has attribute _class:loc:@VariableName. # # Writes of the variable go through an Assign nodes which take as input # one Variable and one value, and has attribute _class:loc:@VariableName. # Assign also returns the new value of the variable. 
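# Schematically (illustrative node names):
#
#   my_var (VariableV2) ----> my_var/read (Identity, _class:loc:@my_var)
#   my_var, new_value   ----> Assign (_class:loc:@my_var) --> updated value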
# # # # - We transform Variable to a function attribute # - We transform Assign ops to just "set_global" with attribute variable:VariableName # - We transform Read ops to just "get_global" with attribute variable:VariableName def remove_variable_node_impl(fn, tfssa): variables = [var for var in fn.graph.values() if var.op == "VariableV2"] assigns = [assign for assign in fn.graph.values() if assign.op == "Assign"] reads = [ read for read in fn.graph.values() if read.op == "Identity" and len(read.inputs) == 1 and fn.graph[read.inputs[0]].op == "VariableV2" ] # find the variable initial values variable_values = {} additional_nodes_to_delete = [] for v in variables: v.parse_from_attr() variable_values[v.name] = v.datatype() for node in fn.graph.values(): if ( node.op == "Assign" and node.inputs[0] == v.name and node.inputs[1] == v.name + "/initial_value" ): variable_values[v.name] = fn.graph[node.inputs[1]].value additional_nodes_to_delete += [node.name, node.inputs[1]] for r in reads: r.op = "get_global" r.attr["variable"] = r.inputs[0] disconnect_vertex_ins(fn.graph, r.name) # transform writes to set_global for r in assigns: r.op = "set_global" r.attr["variable"] = r.inputs[0] for var in variables: delete_node(fn.graph, var.name) for node in additional_nodes_to_delete: delete_node(fn.graph, node) for k, v in variable_values.items(): tfssa.variables[k] = v def remove_variable_nodes(tfssa): """ This should be performed after constant propagation pass. """ for v in tfssa.functions.values(): remove_variable_node_impl(v, tfssa) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/visitors.py0000644000000000000000000001443114672066616031114 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ..parsed_tf_node import ParsedTFNode class FindAllDownstreamTerminals: # Find all nodes matching a particular function # which is downstream reachable from a set of nodes. 
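# Example usage, mirroring how these visitors are chained elsewhere in the
# graph passes (predicate and node name are illustrative):
#
#   exits = (
#       FindAllDownstreamTerminals(lambda n: n.op == "Exit")
#       .visit(g, "while/Switch")
#       .get_result()
#   )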
def __init__(self, fn): self.result = [] self.fn = fn self.memo = {} def visit(self, g, node): if not isinstance(node, ParsedTFNode): node = g[node] if node.name in self.memo: return self self.memo[node.name] = 1 if self.fn(node): self.result.append(node.name) return self for i in node.outputs: self.visit(g, g[i]) return self def visit_many(self, g, nodes): for i in nodes: self.visit(g, i) return self def get_result(self): return self.result class FindAllReachableNodes: # Find all nodes reachable from a set of nodes which satisfy a criteria def __init__(self, fn): self.result = [] self.fn = fn self.memo = {} def visit(self, g, node): if not isinstance(node, ParsedTFNode): node = g[node] if node.name in self.memo: return self self.memo[node.name] = 1 if self.fn(node): self.result.append(node.name) for i in node.outputs: self.visit(g, g[i]) for i in node.inputs: self.visit(g, g[i]) return self def visit_many(self, g, nodes): for i in nodes: self.visit(g, i) return self def get_result(self): return self.result class FindImmediateUpstreamNodes: # Find all nodes matching a particular function which is immediately above a set of nodes def __init__(self, fn): self.result = [] self.fn = fn def visit(self, g, node): if not isinstance(node, ParsedTFNode): node = g[node] for i in node.inputs: if self.fn(g[i]): self.result.append(i) return self def visit_many(self, g, nodes): for i in nodes: self.visit(g, i) return self def get_result(self): return self.result class FindImmediateDownstreamNodes: # Find all nodes matching a particular function which is immediately above a set of nodes def __init__(self, fn): self.result = [] self.fn = fn def visit(self, g, node): if not isinstance(node, ParsedTFNode): node = g[node] for i in node.outputs: if self.fn(g[i]): self.result.append(i) return self def visit_many(self, g, nodes): for i in nodes: self.visit(g, i) self.result = list(set(self.result)) return self def get_result(self): return self.result class FindAllUpstreamTerminals: # Find the "upstream frontier" of nodes passing some predicate. # In other words, perform a pre-order traversal of a node and its inputs, collecting all nodes # passing a given predicate as we go along. Terminate the search along a given branch as soon # as a node is collected. 
def __init__(self, fn, control_dependencies=False): self.result = [] self.fn = fn self.control_dependencies = control_dependencies self.memo = {} def visit(self, g, node): if not isinstance(node, ParsedTFNode): node = g[node] if node.name in self.memo: return self self.memo[node.name] = 1 if self.fn(node): self.result.append(node.name) return self for i in node.inputs: self.visit(g, g[i]) if self.control_dependencies: for i in node.control_inputs: self.visit(g, g[i]) return self def visit_many(self, g, nodes): for i in nodes: self.visit(g, i) self.result = list(set(self.result)) return self def get_result(self): return self.result class FindSubgraph: # Find all nodes between a set of sources and a set of terminals # Sources are not returned, but reached terminals are returned def __init__(self, terminal_nodes): self.memo = {} self.terminal = terminal_nodes def visit_impl(self, g, node): if not isinstance(node, ParsedTFNode): node = g[node] if node.name in self.terminal: self.memo[node.name] = True return True if node.name in self.memo: return self.memo[node.name] # add self to memo first otherwise cycles will not terminate self.memo[node.name] = None reachable = None all_unreachable = True for i in node.outputs + node.control_outputs: visit_result = self.visit_impl(g, g[i]) if visit_result is True: reachable = True if visit_result is not False: all_unreachable = False if reachable: self.memo[node.name] = reachable elif all_unreachable: self.memo[node.name] = False else: self.memo[node.name] = None return reachable def visit(self, g, node): self.visit_impl(g, node) while True: if None in iter(self.memo.values()): revisit = [k for k, v in self.memo.items() if v is None] self.memo = {k: v for k, v in self.memo.items() if v is not None} for n in revisit: self.visit_impl(g, n) else: break return self def visit_many(self, g, nodes): for node in nodes: self.visit_impl(g, node) while True: if None in iter(self.memo.values()): revisit = [k for k, v in self.memo.items() if v is None] self.memo = {k: v for k, v in self.memo.items() if v is not None} for n in revisit: self.visit_impl(g, n) else: break return self def get_result(self): return [k for k, v in self.memo.items() if v] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/tf_op_registry.py0000644000000000000000000000335114672066616027450 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause _TF_OPS_REGISTRY = {} def register_tf_op(_func=None, tf_alias=None, override=False): """ Registration routine for TensorFlow operators _func: (TF conversion function) [Default=None] TF conversion function to register tf_alias: (List of string) [Default=None] All other TF operators that should also be mapped to current conversion routine. e.g. Sort aliased with SortV1, SortV2 All provided alias operators must not be registered previously. override: (Boolean) [Default=False] If True, overrides earlier registration i.e. specified operator and alias will start pointing to current conversion function. Otherwise, duplicate registration will error out. 
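Examples
--------
A minimal registration; the op name and body are illustrative only:

    @register_tf_op(tf_alias=["SomeOpV2"])
    def SomeOp(context, node):
        x = context[node.inputs[0]]
        context.add(node.name, x)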
""" def func_wrapper(func): f_name = func.__name__ if not override and f_name in _TF_OPS_REGISTRY: raise ValueError("TF op {} already registered.".format(f_name)) _TF_OPS_REGISTRY[f_name] = func # If tf_alias is provided, then all the functions mentioned as aliased # are mapped to current function if tf_alias is not None: for name in tf_alias: if not override and name in _TF_OPS_REGISTRY: msg = "TF op alias {} already registered." raise ValueError(msg.format(name)) _TF_OPS_REGISTRY[name] = func return func if _func is None: # decorator called without argument return func_wrapper return func_wrapper(_func) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow/tfssa.py0000644000000000000000000005106614672066616025537 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy from coremltools import _logger as logger from coremltools.converters.mil.mil import types from .basic_graph_ops import check_connections, const_determined_nodes from .dot_visitor import DotVisitor from .naming_utils import escape_fn_name class ParsedNode: """ Node class for the tfssa graph. name: The name of the node (str) op: The operation represented by the node (str) datatype: The type of the node. (type) value: The value of the node if available inputs: The list of nodes which are inputs to this node (list[str]) control_inputs: The list of nodes which have to be executed before this node (list[str]) attr: The attributes of the node outputs: The list of nodes which consume the result of this node (list[str]) control_outputs: The list of nodes which have to be executed after this node (list[str]) """ __slots__ = [ "name", "op", "datatype", "value", "inputs", "control_inputs", "outputs", "control_outputs", "attr", ] def __init__(self): self.name = None self.op = None self.datatype = None self.value = None self.inputs = [] self.outputs = [] self.control_inputs = [] self.control_outputs = [] self.attr = {} def __copy__(self): return self._copy_impl(ParsedNode()) def _copy_impl(self, dest): dest.name = self.name dest.op = self.op dest.datatype = self.datatype dest.value = copy.deepcopy(self.value) dest.inputs = self.inputs[:] dest.control_inputs = self.control_inputs[:] dest.outputs = self.outputs[:] dest.control_outputs = self.control_outputs[:] dest.attr = {k: copy.deepcopy(v) for k, v in self.attr.items()} return dest def copy(self): return self.__copy__() class SSAFunction: __slots__ = ["graph", "inputs", "input_types", "outputs", "output_types", "ret"] def __init__(self, gdict=None, inputs=None, outputs=None, ret=None): if gdict is None: gdict = {} self.graph = gdict self.inputs = [] if inputs is None else inputs self.outputs = [] if outputs is None else outputs self.input_types = [] self.output_types = [] # ret is a mapping from the output arg names from `signature` to the # outputs from `node_def` that should be returned by the function. # Only used in TF2 for getting indices when generating get_tuple ops # for control flow ops. Because the sub-graph's outputs and control # flow node's outputs mapping is defined in `ret` dict. See usages in # tf_graph_pass: rewrite_control_flow_functions for details. 
# https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/function.proto self.ret = [] if ret is None else ret check_connections(gdict) # respect TF inputs/outputs if given, otherwise, infer from the graph # in currently implementation: TF1 will always infer from graph. TF2, # on the other hand, respect the inputs/outputs provided. if len(self.inputs) == 0 or len(self.outputs) == 0: self.find_inputs_and_outputs() else: self.inputs, self.outputs = inputs, outputs self.filter_inputs_and_outputs() def find_inputs_and_outputs(self): # solve for input and output vars sorted_keys = sorted(self.graph.keys()) # we use function entry and exit points if available # otherwise we find graph entry and exit points enters = [ n.name for n in self.graph.values() if ("entry" in n.op or "Entry" in n.op) ] exits = [n.name for n in self.graph.values() if n.op in ("Return", "return")] if len(enters) > 0 or len(exits) > 0: assert len(enters) > 0 assert len(exits) > 0 self.inputs = enters self.input_types = [self.graph[v].datatype for v in self.inputs] self.outputs = exits self.output_types = [self.graph[v].datatype for v in self.outputs] else: for k in sorted_keys: v = self.graph[k] if len(v.inputs) == 0 and v.op not in ["Const", "get_global", "NoOp"]: self.inputs.append(k) self.input_types.append(v.datatype) elif len(v.inputs) != 0 and v.op == "Placeholder": assert len(v.inputs) == 1, "This is not a PlaceholderWithDefault!" self.inputs.append(k) self.input_types.append(v.datatype) if ( len(v.outputs) == 0 and len(v.control_outputs) == 0 and v.op != "set_global" ): self.outputs.append(k) self.output_types.append(v.datatype) def filter_inputs_and_outputs(self): """ Eliminate invalid input/output nodes in the given list. Should only be invoked if the self.inputs and self.outputs are both provided and we want to respect those when adding SSAFunctions. Only needed for TF2 for now because of the needs to parse multiple functions in graph. TF1 only has one "main" function. """ filtered_inputs = [] filtered_outputs = [] for k in self.inputs: if k not in self.graph.keys(): continue v = self.graph[k] if len(v.inputs) == 0 and v.op not in {"Const", "get_global", "NoOp"}: filtered_inputs.append(k) self.input_types.append(v.datatype) elif len(v.inputs) != 0 and v.op == "Placeholder": assert len(v.inputs) == 1, "This is not a PlaceholderWithDefault!" 
filtered_inputs.append(k) self.input_types.append(v.datatype) for k in self.outputs: if k not in self.graph.keys(): continue v = self.graph[k] filtered_outputs.append(k) self.output_types.append(v.datatype) self.inputs, self.outputs = filtered_inputs, filtered_outputs def __copy__(self): ret = SSAFunction() ret.inputs = self.inputs[:] ret.input_types = self.input_types[:] ret.outputs = self.outputs[:] ret.output_types = self.output_types[:] ret.graph = {k: copy.deepcopy(v) for k, v in self.graph.items()} return ret def copy(self): return self.__copy__() class NetworkEnsemble: __slots__ = ["functions", "variables", "global_resource"] def __init__(self, instance=None): self.functions = {} self.variables = {} self.global_resource = {} if isinstance(instance, NetworkEnsemble): self.functions = instance.functions self.variables = instance.variables self.global_resource = instance.global_resource elif instance is not None: raise ValueError( "Instance type {} not compatible with NetworkEnsemble".format( type(instance) ) ) def rename_function(self, src_func, tgt_func): """ Renames the function with function name (src_func) to (tgt_func) """ if src_func not in self.functions: logger.warning("Couldn't find function name (%s).", src_func) return if tgt_func in self.functions: logger.warning("(%s) already exists in some function name.", tgt_func) return self.functions[tgt_func] = self.functions.pop(src_func) logger.debug( "Successfully changed function name from (%s) to (%s)", src_func, tgt_func ) def rename_node(self, src_node, tgt_node): """ Rename the node with node name (src_node) to (tgt_node). Note that the name (tgt_node) cannot appear in the whole network, not only the function it lies in. """ in_ssa = False success = None for func, tfssa in self.functions.items(): if src_node in tfssa.graph: in_ssa = True if tgt_node in tfssa.graph: logger.warning( "(%s) already exists in function (%s).", tgt_node, func ) break success = func tfssa.graph[tgt_node] = tfssa.graph.pop(src_node) # Replace other nodes' output dependency for inp in tfssa.graph[tgt_node].inputs: for idx, out in enumerate(tfssa.graph[inp].outputs): if out == src_node: tfssa.graph[inp].outputs[idx] = tgt_node break # Replace other nodes' control output dependency for c_inp in tfssa.graph[tgt_node].control_inputs: for idx, c_out in enumerate(tfssa.graph[c_inp].control_outputs): if c_out == src_node: tfssa.graph[c_inp].control_outputs[idx] = tgt_node break # Replace other nodes' input dependency for out in tfssa.graph[tgt_node].outputs: for idx, inp in enumerate(tfssa.graph[out].inputs): if inp == src_node: tfssa.graph[out].inputs[idx] = tgt_node break # Replace other nodes' control input dependency for c_out in tfssa.graph[tgt_node].control_outputs: for idx, c_inp in enumerate(tfssa.graph[c_out].control_inputs): if c_inp == src_node: tfssa.graph[c_out].control_inputs[idx] = tgt_node break break if not in_ssa: logger.warning("Couldn't find (%s) in any functions", src_node) if success is not None: logger.debug( "Changed (%s) to (%s) in function (%s)", src_node, tgt_node, success ) def extract_subgraph(self, outputs, target_inputs=None, name=""): """Add a new SSAFunction to the current NetworkEnsemble to produce the given outputs. Args: outputs: The outputs the new function must produce. target_inputs: name: The name of the new function to create. If unspecified, a name will be generated by joining output names. Returns: The name of the new function. 
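Example (illustrative output/input names, `net` being a NetworkEnsemble):

    fn_name = net.extract_subgraph(["probabilities"], target_inputs=["input_ids"])
    subgraph = net.functions[fn_name]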
""" if not isinstance(outputs, list): raise TypeError("Expected a list of output names for subgraph extraction") if name == "": outputs.sort() name = escape_fn_name("_".join(outputs)) if target_inputs is None: target_inputs = [] def DFS_inputs(graph, node, vis): vis.add(node) if node in target_inputs: return [node] if ( len(graph[node].inputs) == 0 and len(graph[node].control_inputs) == 0 and graph[node].op != "Const" ): return [node] inputs = [] for i in graph[node].inputs + graph[node].control_inputs: if i in vis: continue inputs += DFS_inputs(graph, i, vis) return inputs def DFS_set_globals(graph, node, vis): vis.add(node) set_globals = [] if graph[node].op == "set_global": set_globals.append(node) for i in graph[node].outputs + graph[node].control_outputs: if i in vis: continue set_globals += DFS_set_globals(graph, i, vis) return set_globals for k in list(self.functions.keys()): v = self.functions[k] extract = [] for output in outputs: if output in v.graph: extract.append(output) if len(extract) == 0: continue incl_nodes = set() gdict = copy.deepcopy(v.graph) inputs = [] set_globals = [] for output in extract: inputs += DFS_inputs(gdict, output, incl_nodes) vis_nodes = set() for inp in inputs: set_globals += DFS_set_globals(gdict, inp, vis_nodes) for node in set_globals: inputs += DFS_inputs(gdict, node, incl_nodes) for new_k, new_v in v.graph.items(): if new_k not in incl_nodes: del gdict[new_k] continue if new_k in target_inputs: gdict[new_k].op = "Placeholder" gdict[new_k].inputs = [inp for inp in new_v.inputs if inp in incl_nodes] gdict[new_k].outputs = [ out for out in new_v.outputs if out in incl_nodes ] gdict[new_k].control_inputs = [ inp for inp in new_v.control_inputs if inp in incl_nodes ] gdict[new_k].control_outputs = [ out for out in new_v.control_outputs if out in incl_nodes ] for output in extract: old_name = "preIdentity_" + output output_node = copy.deepcopy(gdict[output]) output_node.op = "Identity" output_node.inputs = [old_name] output_node.control_inputs = [] output_node.outputs = [] output_node.control_outputs = [] for inp in gdict[output].inputs: for idx, out in enumerate(gdict[inp].outputs): if out == output: gdict[inp].outputs[idx] = old_name for inp in gdict[output].control_inputs: for idx, out in enumerate(gdict[inp].control_outputs): if out == output: gdict[inp].control_outputs[idx] = old_name for out in gdict[output].outputs: for idx, inp in enumerate(gdict[out].inputs): if inp == output: gdict[out].inputs[idx] = old_name for out in gdict[output].control_outputs: for idx, inp in enumerate(gdict[out].control_inputs): if inp == output: gdict[out].control_inputs[idx] = old_name gdict[output].outputs.append(output) gdict[output].name = old_name gdict[old_name] = gdict[output] gdict[output] = output_node self.functions[name] = SSAFunction(gdict) return name def delete_subgraph(self, name): """ Delete the SSAfunction with function_name. 
""" if name not in self.functions: logger.warning("(%s) not in NetworkEnsemble", name) return del self.functions[name] def __repr__(self): return str(self) def __str__(self): ret = "" for func, v in self.functions.items(): if func.startswith("body_function_") or func.startswith("f_body_function_"): continue elif func.startswith("cond_function_") or func.startswith( "f_cond_function_" ): continue ret += "Input Function Name: %s\n" % (func) ret += " Inputs:\n" for inp in v.inputs: ret += " %s\n" % (inp) ret += " Outputs:\n" for out in v.outputs: if out.startswith("fake_exit_"): continue ret += " %s\n" % (out) return ret def get_dot_string( self, name_and_op_style=False, annotation=False, highlight_debug_nodes=None ): """ Return the dot string that can be used to show the whole graph with dot. By default, the graph contains op and type. If name_and_op_style is set, the graph will contain the name of the node and the op instead. * Input nodes : yellow * constant nodes : azure * output nodes : goldenrod2 * nodes with variable shaped tensors : cyan * node names or op types that user wants to highlight: green Parameters ---------- name_and_op_style: bool If set, graph contains only the name and the op. annotation: bool Examples -------- >>> import graphviz >>> graphviz.Source(network.get_dot_string()).view() """ if highlight_debug_nodes is None: highlight_debug_nodes = [] function_names = sorted(self.functions.keys()) dotstring = "digraph g {\n" + "\tcompound=true;\n" # find all tensor nodes with unknown sizes ctr = 0 for k in function_names: const_nodes = const_determined_nodes(self.functions[k].graph) unknown_sized_tensor_ops = [] for v, n in self.functions[k].graph.items(): if n.datatype is None or ( n.datatype is not None and types.is_tensor(n.datatype) and ( len(n.datatype.get_shape()) == 0 or -1 in n.datatype.get_shape() ) ): unknown_sized_tensor_ops.append(v) if n.op in highlight_debug_nodes: highlight_debug_nodes.append(v) v = self.functions[k] vis = DotVisitor(annotation) vis.highlight_nodes(v.inputs, "yellow").highlight_nodes( const_nodes, "azure2" ).highlight_nodes(v.outputs, "goldenrod2").highlight_nodes( unknown_sized_tensor_ops, "cyan2" ) if len(highlight_debug_nodes) > 0: vis.highlight_nodes(highlight_debug_nodes, "green") if name_and_op_style: vis.labeller(lambda n: n.name + " (" + n.op + ")") res = vis.visit_all(v.graph, nodename_prefix=str(ctr)).get_result( "subgraph", "cluster_" + k.replace("/", "_") ) dotstring += "\n".join("\t" + r for r in res.split("\n")) + "\n" ctr += 1 dotstring += "}" return dotstring def add_function_with_prefix(self, fprefix, tfssa): assert isinstance(tfssa, SSAFunction) s = 0 while fprefix + str(s) in self.functions: s += 1 self.functions[fprefix + str(s)] = tfssa def add_function(self, f, tfssa): self.functions[f] = tfssa def __copy__(self): ret = self.__class__() ret.functions = self.functions ret.variables = self.variables ret.global_resource = self.global_resource return ret def __deepcopy__(self, memo): ret = self.__class__() ret.functions = {k: copy.copy(v) for k, v in self.functions.items()} ret.variables = {k: copy.copy(v) for k, v in self.variables.items()} ret.global_resource = {k: copy.copy(v) for k, v in self.global_resource.items()} return ret def copy(self): return self.__copy__() def _find_free_name(self, prefix): idx = 0 while True: name = prefix + str(idx) found = False for v in self.functions.values(): if name in v.graph: found = True break if found: idx += 1 else: return name def get_image_format(self): """ Iterates over graph and 
returns input format (`NCHW` or `NHWC`) if input is of type Image, otherwise `None` """ for fn_key in list(self.functions.keys()): graph = self.functions[fn_key].graph for name in graph: node = graph[name] if ( node.attr.get("data_format", None) == "NHWC" or node.attr.get("data_format") == "NHWC_format_inserted" ): return "NHWC" elif node.attr.get("data_format", None) == "NCHW": return "NCHW" return None ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2215466 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow2/0000755000000000000000000000000014672075535024117 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow2/__init__.py0000644000000000000000000000070714672066616026234 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ....._deps import _HAS_TF_2 if _HAS_TF_2: # importing these causes all its imports to be registered from coremltools.converters.mil.frontend.tensorflow.tf_op_registry import \ register_tf_op from . import ops ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow2/converter.py0000644000000000000000000000300414672066616026475 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.frontend.tensorflow.basic_graph_ops import \ simple_topsort from coremltools.converters.mil.frontend.tensorflow.converter import \ TFConverter class TF2Converter(TFConverter): def _get_stack(self, tfssa, root="main"): """ Overwrite TFConverter._get_stack() as TF2 generates different sub-graphs. """ # We're trying to get a order of how to loop through the graphs. # This is NOT necessarily a DAG. dep = {x: [] for x in tfssa.functions} for fname in tfssa.functions: for node in tfssa.functions[fname].graph.values(): func_x, func_y = None, None if node.op in {"StatelessIf", "If"}: func_x = node.attr.get("then_branch") func_y = node.attr.get("else_branch") elif node.op in {"StatelessWhile", "While"}: func_x = node.attr.get("body") func_y = node.attr.get("cond") if func_x and fname not in dep[func_x]: dep[func_x].append(fname) if func_y and fname not in dep[func_y]: dep[func_y].append(fname) assert len(dep[root]) == 0 graph_stack = simple_topsort(dep) return graph_stack ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow2/load.py0000644000000000000000000003471314672066616025420 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os.path as _os_path import tensorflow as _tf from packaging.version import Version from tensorflow.lite.python.util import \ get_grappler_config as _get_grappler_config from tensorflow.lite.python.util import \ run_graph_optimizations as _run_graph_optimizations from tensorflow.python.eager import context from tensorflow.python.framework import dtypes as _dtypes from tensorflow.python.framework.convert_to_constants import \ convert_variables_to_constants_v2 as _convert_variables_to_constants_v2 from tensorflow.python.framework.function_def_to_graph import \ function_def_to_graph as _function_def_to_graph from tensorflow.python.keras.saving import saving_utils as _saving_utils from tqdm import tqdm as _tqdm from coremltools import _logger as logger from coremltools._deps import _get_version from coremltools.converters.mil.frontend.tensorflow2.tf_graph_pass import ( flatten_sub_graph_namespaces, rewrite_control_flow_functions) from coremltools.converters.mil.frontend.tensorflow.basic_graph_ops import \ fill_outputs from coremltools.converters.mil.frontend.tensorflow.load import TFLoader from coremltools.converters.mil.frontend.tensorflow.parsed_tf_node import \ ParsedTFNode from coremltools.converters.mil.frontend.tensorflow.tf_graph_pass import ( constant_propagation, delete_disconnected_nodes, delete_unnecessary_constant_nodes, fuse_dilation_conv, insert_get_tuple, remove_variable_nodes, tensor_array_resource_removal) from coremltools.converters.mil.frontend.tensorflow.tfssa import ( NetworkEnsemble, SSAFunction) from coremltools.converters.mil.input_types import TensorType from .converter import TF2Converter class TF2Loader(TFLoader): """ There are the steps how the TF2Loader loads and converts the TF2 model 1. Get the concrete functions from the Keras model (only 1 concrete function is supported now) 2. Get the tensorflow graphdef from the concrete function by doing (a) calling tensorflow's convert_variables_to_constants_v2 API to freeze variables into constants (b) run grappler optimizations on the graphdef ("constfold", "dependency", "debug_stripper") 3. Extract sub graph based on "outputs" 4. Construct tfssa IR from graphdef 5. Run tfssa graph passes 6. Convert tfssa to program by TF2Converter """ def __init__(self, model, debug=False, **kwargs): """ TensorFlow 2.x model loader. Parameters ---------- model: Model created with TensorFlow 2.x One of the following model format: - TensorFlow tf.keras.Model object or HDF5 (.h5 or .hdf5) file path - TensorFlow SavedModel directory path - TensorFlow list of concrete functions(s) debug: bool, optional. Defaults to False. This flag should generally be False except for debugging purposes for diagnosing conversion errors. Setting this flag to True will cause graph pass errors to be ignored, forcefully returning a NetworkEnsemble object. kwargs: dict(str, Any), optional Dictionary of additional arguments. """ TFLoader.__init__(self, model, debug, **kwargs) """ tf_ssa graph passes Notes: - "flatten_while_loop_namespaces" should be after "constant_propagation" as it changes node names which constant propagation pass is relying on to perform session.run(), renamed nodes are not understandable for TF. 
""" self.tfssa_passes = [ constant_propagation, delete_unnecessary_constant_nodes, # delete_unnecessary_constant_nodes must come right after constant_propagation rewrite_control_flow_functions, flatten_sub_graph_namespaces, remove_variable_nodes, fuse_dilation_conv, ] def _get_concrete_functions_and_graph_def(self): if not isinstance(self.model, (list, str, _tf.keras.Model, _tf.compat.v1.GraphDef)): raise NotImplementedError( f"Expected model format: [SavedModel | concrete_function | " f"tf.keras.Model | .h5 | GraphDef], got {self.model}" ) cfs = [] if isinstance(self.model, list): cfs = self.model if isinstance(self.model, _tf.keras.Model): cfs = self._concrete_fn_from_tf_keras(self.model) elif isinstance(self.model, _tf.compat.v1.GraphDef): return None, self.model elif isinstance(self.model, str): if not _os_path.exists(self.model): raise ValueError(f'Input model "{self.model}" does not exist') elif _os_path.isfile(self.model) and ( self.model.endswith(".h5") or self.model.endswith(".hdf5") ): # Keep a reference to loaded model, or it errors out due to variables deletion, see # https://github.com/tensorflow/tensorflow/issues/37615#issuecomment-1552237114. keras_model = _tf.keras.models.load_model(self.model) cfs = self._concrete_fn_from_tf_keras(keras_model) elif _os_path.isdir(self.model): saved_model = _tf.saved_model.load(self.model) sv = saved_model.signatures.values() cfs = sv if isinstance(sv, list) else list(sv) else: raise ValueError( f"Input model path should be .h5/.hdf5 file or a directory, but " f"got {self.model}" ) graph_def = self._graph_def_from_concrete_fn(cfs) return cfs, graph_def def _graph_def_from_model(self, output_names=None): """Overwrites TFLoader._graph_def_from_model()""" cfs, graph_def = self._get_concrete_functions_and_graph_def() if isinstance(self.model, _tf.keras.Model) and self.kwargs.get("outputs", None) is None: # For the keras model, check if the outputs is provided by the user. 
# If not, we make sure the coreml model outputs order is the same as # the original keras model cf = cfs[0] output_names = [] for key in cf.structured_outputs: output_names.append(cf.structured_outputs[key].name.split(":")[0]) self.kwargs["outputs"] = [TensorType(name=name) for name in output_names] return self.extract_sub_graph(graph_def, output_names) def _tf_ssa_from_graph_def(self, fn_name="main"): """Overwrites TFLoader._tf_ssa_from_graph_def()""" with _tf.Graph().as_default() as tf_graph: _tf.graph_util.import_graph_def(self._graph_def, name="") # sub-graphs' input shapes are required for extracting sub-graphs sg_input_shapes = self._populate_sub_graph_input_shapes( tf_graph, tf_graph._functions ) # get graph_dict and sub-graphs' inputs / outputs graph_dict, inputs, outputs, ret = self._dict_from_graph_def( tf_graph, fn_name, sg_input_shapes ) tf_ssa = NetworkEnsemble() for name, graph in graph_dict.items(): tensor_array_resource_removal(graph) graph = insert_get_tuple(graph) graph = fill_outputs(graph) if name == "main": # skip for sub-graphs as input can be also output delete_disconnected_nodes(graph) tf_ssa.functions[name] = SSAFunction( graph, inputs=inputs[name], outputs=outputs[name], ret=ret[name] ) return tf_ssa def _run_tf_ssa_passes(self): tf_passes = self.tfssa_passes if self.debug: for tf_pass in _tqdm( tf_passes, desc="Running TensorFlow Graph Passes", unit=" passes" ): try: tf_pass(self._tf_ssa) except Exception as e: logger.exception('Exception in pass "{}": {}'.format(tf_pass, e)) logger.info("Ignoring exception and continuing to next pass") else: for tf_pass in _tqdm( tf_passes, desc="Running TensorFlow Graph Passes", unit=" passes" ): tf_pass(self._tf_ssa) if self.debug: import graphviz dot_string = self._tf_ssa.get_dot_string( annotation=True, name_and_op_style=True, highlight_debug_nodes=[] ) graphviz.Source(dot_string).view( filename="/tmp/ssa_after_tf_passes", cleanup=True ) def _program_from_tf_ssa(self): self._run_tf_ssa_passes() converter = TF2Converter( tfssa=self._tf_ssa, inputs=self.kwargs["inputs"], outputs=self.kwargs["outputs"], opset_version=self.kwargs["specification_version"], use_default_fp16_io=self.kwargs["use_default_fp16_io"], ) return converter.convert() def _populate_sub_graph_input_shapes(self, graph, graph_fns): """ Populate function (sub-graph) input shapes from control flow op's inputs Note that the functions (sub-graphs) are not nested but the control flow ops are nested. The input shapes are used to extract sub-graphs from the parent graph (as the input of function_def_to_graph). Parameter --------- graph: tf.Graph TensorFlow graph. graph_fns: list of graph functions. List of TensorFlow graph functions. Returns ------- sg_input_shapes: dict(str: list) Dictionary of function (sub-graph) name and input shape pairs. 
""" sg_input_shapes = {} sub_graphs = [] for op in graph.get_operations(): if op.type not in {"StatelessIf", "If", "StatelessWhile", "While"}: continue sg1, sg2 = None, None if op.type in {"StatelessIf", "If"}: sg1 = op.get_attr("then_branch").name sg2 = op.get_attr("else_branch").name if op.type in {"StatelessWhile", "While"}: sg1 = op.get_attr("cond").name sg2 = op.get_attr("body").name # memorize input shapes for sub-graph conversions op_input_shapes = [i.get_shape() for i in op.inputs] sg_input_shapes.update({sg1: op_input_shapes, sg2: op_input_shapes}) sub_graphs += [sg1, sg2] for name in sub_graphs: sg = graph_fns.get(name) fn_def = context.get_function_def(name) op_input_shapes = sg_input_shapes[name] op_input_shapes = op_input_shapes[-len(fn_def.signature.input_arg) :] fn_graph = _function_def_to_graph(fn_def, input_shapes=op_input_shapes) sg_input_shapes.update( self._populate_sub_graph_input_shapes(fn_graph, graph_fns) ) return sg_input_shapes @staticmethod def _dict_from_graph_def(graph, fn_name="main", sg_input_shapes=None): """ Loads a tf.Graph and transform it into dictionary of ParsedTFNodes. Potentially contains multiple functions, in such case, recursively resolve functions (sub-graphs). Parameters ---------- graph: tf.Graph TensorFlow graph. fn_name: str, optional, defaults to 'main' Function name of the graph. sg_input_shapes: dict(str: list) Dictionary of name and input shapes for functions / sub-graphs. Returns ------- dict(str: dict(str: ParsedTFNode)) Dictionary of function name and dictionary of node name and ParsedTFNode object. """ graph_dict = {fn_name: {}} graph_inputs = {fn_name: []} graph_outputs = {fn_name: []} graph_ret = {fn_name: {}} for op in graph.get_operations(): graph_dict[fn_name].update({op.name: ParsedTFNode(op.node_def)}) for name, sg in graph._functions.items(): sg_def = context.get_function_def(name) if name in sg_input_shapes: input_shapes = sg_input_shapes[name] input_shapes = input_shapes[-len(sg_def.signature.input_arg):] fn_graph = _function_def_to_graph(sg_def, input_shapes=input_shapes) graph_dict.update( TF2Loader._dict_from_graph_def(fn_graph, name, sg_input_shapes)[0] ) graph_inputs.update({name: [t.name.split(":")[0] for t in fn_graph.inputs]}) graph_outputs.update( {name: [t.name.split(":")[0] for t in fn_graph.outputs]} ) # ret is a mapping from the output arg names from `signature` to the # outputs from `node_def` that should be returned by the function. graph_ret.update({name: sg_def.ret}) return graph_dict, graph_inputs, graph_outputs, graph_ret @staticmethod def _concrete_fn_from_tf_keras(keras_model: _tf.keras.Model): input_signature = _saving_utils.model_input_signature( keras_model, keep_original_batch_size=True ) fn = _saving_utils.trace_model_call(keras_model, input_signature) return [fn.get_concrete_function()] def _graph_def_from_concrete_fn(self, cfs): if len(cfs) != 1: raise NotImplementedError("Only a single concrete function is supported.") if _get_version(_tf.__version__) >= Version("2.2.0"): frozen_fn = _convert_variables_to_constants_v2(cfs[0], lower_control_flow=False, aggressive_inlining=True) else: frozen_fn = _convert_variables_to_constants_v2(cfs[0], lower_control_flow=False) graph_def = frozen_fn.graph.as_graph_def(add_shapes=True) # run a Grappler's constant folding pass. 
fn_inputs = [t for t in frozen_fn.inputs if t.dtype != _dtypes.resource] grappler_optimizers_list = self._get_grappler_optimizers_list() graph_def = _run_graph_optimizations( graph_def, fn_inputs, frozen_fn.outputs, config=_get_grappler_config(grappler_optimizers_list), graph=frozen_fn.graph, ) return graph_def def _get_grappler_optimizers_list(self): return ["constfold", "dependency", "debug_stripper"] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow2/ops.py0000644000000000000000000002041414672066616025273 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as _np # TF 2.x now imports and registers all TF 1.x op against the new registry # (separated from TF 1.x registry). Overwrite might needed in case the op # semantics are different between TF 1.x and TF 2.x.< from coremltools.converters.mil.frontend.tensorflow.convert_utils import \ convert_graph from coremltools.converters.mil.frontend.tensorflow.ops import ( _transpose_NCDHW_to_NDHWC, _transpose_NCHW_to_NHWC, _transpose_NDHWC_to_NCDHW, _transpose_NHWC_to_NCHW) from coremltools.converters.mil.frontend.tensorflow.tf_op_registry import \ register_tf_op from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.types import builtin_to_string from coremltools.converters.mil.mil.types.symbolic import any_symbolic @register_tf_op(override=True, tf_alias=["FusedBatchNorm"]) def FusedBatchNormV3(context, node): # helper function that add the batch norm layer def _add_batch_norm(x, mean, variance, scale, offset, epsilon, name): if mean.shape[0] != 0 and variance.shape[0] != 0: # In this case, we can use the mb.batch_norm directly x = mb.batch_norm( x=x, mean=mean, variance=variance, gamma=scale, beta=offset, epsilon=epsilon, name=name ) else: # In this case, we need to manually compute the batch_norm axes = [axis for axis in range(x.rank) if axis != 1] mean = mb.reduce_mean(x=x, axes=axes, keep_dims=True) num = mb.sub(x=x, y=mean) square = mb.mul(x=num, y=num) variance = mb.reduce_mean(x=square, axes=axes, keep_dims=True) variance_add_epsilon = mb.add(x=variance, y=epsilon) sqrt = mb.sqrt(x=variance_add_epsilon) x = mb.real_div(x=num, y=sqrt) shape = [1] * x.rank shape[1] = -1 if any_symbolic(scale.shape) else scale.shape[0] scale_reshape = mb.reshape(x=scale, shape=shape) offset_reshape = mb.reshape(x=offset, shape=shape) x = mb.mul(x=x, y=scale_reshape) x = mb.add(x=x, y=offset_reshape, name=name) return x # Get attributes data_format = node.attr.get("data_format", "NHWC") epsilon = node.attr.get("epsilon", None) # Get inputs x = context[node.inputs[0]] scale = context[node.inputs[1]] offset = context[node.inputs[2]] mean = context[node.inputs[3]] variance = context[node.inputs[4]] input_dtype = x.dtype batch_norm_name = node.name + "_nchw" if data_format == "NHWC" else node.name if data_format == "NHWC": x = _transpose_NHWC_to_NCHW(x) elif data_format == "NDHWC": x = _transpose_NDHWC_to_NCDHW(x) x = mb.cast(x=x, dtype=builtin_to_string(mean.dtype)) x = _add_batch_norm(x, mean, variance, scale, offset, epsilon, batch_norm_name) if data_format == "NHWC": x = _transpose_NCHW_to_NHWC(x, node.name + "_to_NHWC") elif data_format == "NDHWC": x = _transpose_NCDHW_to_NDHWC(x, node.name + "_to_NDHWC") x = mb.cast(x=x, 
dtype=builtin_to_string(input_dtype), name=node.name) # Inference only batch norm does not have meaningful outputs for # batch_mean, batch_variance etc. context.add(node.name, x) @register_tf_op(tf_alias=["If"], override=True) def StatelessIf(context, node): pred = context[node.inputs[0]][0] then_graph = context.get_graph(node.attr.get("then_branch")) else_graph = context.get_graph(node.attr.get("else_branch")) def then_fn(): context.stack_func_inputs(context[node.inputs[0]]) then_output_var = convert_graph(context, then_graph) context.unstack_func_inputs() return then_output_var def else_fn(): context.stack_func_inputs(context[node.inputs[0]]) else_output_var = convert_graph(context, else_graph) context.unstack_func_inputs() return else_output_var x = mb.cond(pred=pred, _true_fn=then_fn, _false_fn=else_fn, name=node.name) # wraps x as tuple for get_tuple that always follow the cond node. x = (x,) if not isinstance(x, (tuple, list)) else x context.add(node.name, x) @register_tf_op(tf_alias=["While"], override=True) def StatelessWhile(context, node): # inputs are loop_counter, max_iterations, [loop_vars] loop_vars = context[node.inputs[0]][2:] cond_graph = context.get_graph(node.attr.get("cond")) body_graph = context.get_graph(node.attr.get("body")) def cond(*loop_vars): context.stack_func_inputs(loop_vars) cond_output_vars = convert_graph(context, cond_graph) context.unstack_func_inputs() return cond_output_vars def body(*loop_vars): context.stack_func_inputs(loop_vars) body_output_vars = convert_graph(context, body_graph) context.unstack_func_inputs() return body_output_vars x = mb.while_loop(_cond=cond, _body=body, loop_vars=loop_vars, name=node.name) # wraps x as tuple for get_tuple that always follow the while node. x = (x,) if not isinstance(x, (tuple, list)) else x context.add(node.name, x) @register_tf_op def TensorListFromTensor(context, node): value = context[node.inputs[0]] element_shape = context[node.inputs[1]] element_dtype = node.attr.get("element_dtype") dtype_str = builtin_to_string(element_dtype) length = mb.shape(x=value) length = mb.slice_by_index(x=length, begin=[0], end=[1], squeeze_mask=[True]) if element_shape is not None and all(_np.atleast_1d(element_shape.val) != -1): ls = mb.make_list(init_length=length, elem_shape=tuple(element_shape.val.tolist()), dtype=dtype_str) else: ls = mb.tf_make_list(init_length=length, dtype=dtype_str) indices = mb.range_1d(end=length, start=0, step=1) ls = mb.list_scatter(ls=ls, indices=indices, value=value, name=node.name) context.add(node.name, ls) @register_tf_op def TensorListGather(context, node): ls = context[node.inputs[0]] indices = context[node.inputs[1]] tensor = mb.list_gather(ls=ls, indices=indices, name=node.name) context.add(node.name, tensor) @register_tf_op def TensorListGetItem(context, node): ls = context[node.inputs[0]] index = context[node.inputs[1]] new_ls = mb.list_read(ls=ls, index=index, name=node.name) context.add(node.name, new_ls) @register_tf_op def TensorListLength(context, node): ls = context[node.inputs[0]] length = mb.list_length(ls=ls, name=node.name) context.add(node.name, length) @register_tf_op def TensorListReserve(context, node): element_shape = context[node.inputs[0]] num_elements = context[node.inputs[1]] element_dtype = node.attr.get("element_dtype") dtype = builtin_to_string(element_dtype) if element_shape is not None and all(_np.atleast_1d(element_shape.val) != -1): ls = mb.make_list( init_length=num_elements, elem_shape=tuple(element_shape.val.tolist()), dynamic_length=num_elements.val is 
None, dtype=dtype, name=node.name, ) else: ls = mb.tf_make_list(init_length=num_elements, dtype=dtype, dynamic_length=num_elements.val is None, name=node.name) context.add(node.name, ls) @register_tf_op def TensorListScatterIntoExistingList(context, node): ls = context[node.inputs[0]] value = context[node.inputs[1]] indices = context[node.inputs[2]] ls = mb.list_scatter(ls=ls, indices=indices, value=value, name=node.name) context.add(node.name, ls) @register_tf_op def TensorListSetItem(context, node): ls = context[node.inputs[0]] index = context[node.inputs[1]] value = context[node.inputs[2]] new_ls = mb.list_write(ls=ls, index=index, value=value, name=node.name) context.add(node.name, new_ls) @register_tf_op def TensorListStack(context, node): ls = context[node.inputs[0]] length = mb.list_length(ls=ls) indices = mb.range_1d(end=length, start=0, step=1) x = mb.list_gather(ls=ls, indices=indices, name=node.name) context.add(node.name, x) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2215466 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow2/ssa_passes/0000755000000000000000000000000014672075535026263 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow2/ssa_passes/__init__.py0000644000000000000000000000037514672066616030401 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from . import remove_vacuous_cond ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow2/ssa_passes/remove_vacuous_cond.py0000644000000000000000000001104514672066616032703 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools import _logger as logger from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass @block_context_manager def _remove_vacuous_cond_block(block): num_changes = 0 for op in list(block.operations): for b in op.blocks: num_changes += _remove_vacuous_cond_block(b) if op.op_type != "cond": continue then_ops = op.blocks[0].operations else_ops = op.blocks[1].operations if len(then_ops) > 1 or len(else_ops) > 1: continue # Pattern 1: dynamic length TensorList generates this pattern. See # conversion functions of TensorList* ops for details. TF2's graph # contains a tf.cond op with 2 sub-graphs. The condition is either # `less_equal` or `greater_equal` op. 1 sub-graph contains only an # identity op forwarding the original TensorList, another sub-graph # contains TensorListResize op to generate a new TensorList. But in # backend, list length is handled dynamically in list_write/scatter # and thus, the entire tf.cond and it's sub-graphs can be removed. 
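        # Sketch of the pattern handled below (names are illustrative):
        #
        #   %len  = list_length(ls=%list)
        #   %pred = less_equal(x=%len, y=%requested_size)   # or greater_equal
        #   %out  = cond(pred=%pred)   # both branches are empty after conversion
        #
        # Because list_write / list_scatter already grow the list on demand,
        # %out can be replaced directly by %list and the cond op removed.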
if len(then_ops) == 0 and len(else_ops) == 0: if op.pred.op.op_type not in {"less_equal", "greater_equal"}: continue # cond op must have pred pred_x = op.pred.op.x.op pred_y = op.pred.op.y.op if pred_x is None and pred_y is None: continue if op.pred.op.op_type == "less_equal": if pred_x.op_type != "list_length": continue new_var = pred_x.ls else: # op.pred.op.op_type == 'greather_equal': if pred_y.op_type != "list_length": continue new_var = pred_y.ls op.enclosing_block.replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=new_var ) block.remove_ops([op]) # rely on DCE to remove extra cond inputs num_changes += 1 # Pattern 2: both than and else branch contains exactly 1 identity op if len(then_ops) == 1 and len(then_ops) == 1: if then_ops[0].op_type != "identity" or else_ops[0].op_type != "identity": continue if then_ops[0].x != else_ops[0].x: continue new_var = mb.identity(x=then_ops[0].x, before_op=op, name=op.name) op.enclosing_block.replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=new_var ) block.remove_ops([op]) # rely on DCE to remove extra cond inputs num_changes += 1 return num_changes @register_pass(namespace="tensorflow2") class remove_vacuous_cond(AbstractGraphPass): """ Remove cond op and it's sub-graphs that produces identity on both then and else branch. One example use case is the TensorListReverse op, in Core ML, we dynamically resize in write operations, and thus, both branches of the cond op will be a skip (identity) op. Given: main(%a: (1, bool), %b: (2, 3, fp32)) { block0() { %squeeze_0: (bool) = squeeze(x=%a, name="squeeze_0") %cond_0: (2, 3, fp32) = cond(pred=%squeeze_0, name="cond_0") cond_0_true() { %identity_0: (2, 3, fp32) = identity(x=%b, name="identity_0") } -> (%identity_0) cond_0_false() { %identity_1: (2, 3, fp32) = identity(x=%b, name="identity_1") } -> (%identity_1) } -> (%cond_0) } Result: main(%a: (1, bool), %b: (2, 3, fp32)) { block0() { %squeeze_0: (bool) = squeeze(x=%a, name="squeeze_0") %cond_0: (2, 3, fp32) = identity(x=%b, name="cond_0") } -> (%cond_0) } """ def apply(self, prog): for f_name, f in prog.functions.items(): num_changes = _remove_vacuous_cond_block(f) msg = "remove_vacuous_cond: changed {} ops in function '{}'" logger.info(msg.format(num_changes, f_name)) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow2/ssa_passes/test_v2_passes.py0000644000000000000000000000347214672066616031607 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy import numpy as np from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.passes.pass_registry import PASS_REGISTRY from coremltools.converters.mil.testing_utils import (assert_model_is_valid, assert_same_output_names) np.random.seed(1984) validate_model = True def test_remove_vacuous_cond(): @mb.program( input_specs=[ mb.TensorSpec(shape=(1,), dtype=types.bool), mb.TensorSpec(shape=(2, 3)), ] ) def prog(a, b): def then_branch(): return mb.identity(x=b) def else_branch(): return mb.identity(x=b) pred = mb.squeeze(x=a) return mb.cond(pred=pred, _true_fn=then_branch, _false_fn=else_branch) cond_op = prog.find_ops(op_type="cond", exactly_one=True)[0] original_cond_op_name = cond_op.name assert len(cond_op.blocks[0].operations) == 1 assert len(cond_op.blocks[1].operations) == 1 assert cond_op.blocks[0].operations[0].op_type == "identity" assert cond_op.blocks[1].operations[0].op_type == "identity" prev_prog = copy.deepcopy(prog) PASS_REGISTRY["tensorflow2::remove_vacuous_cond"](prog) assert_same_output_names(prev_prog, prog) cond_op = prog.find_ops(op_type="cond") assert len(cond_op) == 0 identity_op = prog.find_ops(prefix=original_cond_op_name, exactly_one=True)[0] assert identity_op.op_type == "identity" if validate_model: assert_model_is_valid(prog, {"a": (1,), "b": (2, 3)}) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2215466 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow2/test/0000755000000000000000000000000014672075535025076 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow2/test/__init__.py0000644000000000000000000000033214672066616027205 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow2/test/test_tf2_conversion_api.py0000644000000000000000000005257414672066616032315 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import os import platform from os import chdir, getcwd from shutil import rmtree from tempfile import mkdtemp import numpy as np import pytest import coremltools as ct from coremltools.converters.mil.frontend.tensorflow.test.test_tf_conversion_api import ( TestInputOutputConversionAPI as TestTf2InputOutputConversionAPI, ) from coremltools.converters.mil.frontend.tensorflow.test.test_tf_conversion_api import ( TestiOS16DefaultIODtype as TestTf2iOS16DefaultIODtype, ) from coremltools.converters.mil.mil import types from coremltools.converters.mil.testing_reqs import backends # We need to keep this, other the pre-commit hook is going to remove the TestInputOutputConversionAPI, TestiOS16DefaultIODtype assert TestTf2InputOutputConversionAPI is not None assert TestTf2iOS16DefaultIODtype is not None tf = pytest.importorskip("tensorflow", minversion="2.1.0") import tensorflow as tf from tensorflow import keras from tensorflow.keras import layers @pytest.fixture def uint8_input_model(): x = tf.keras.Input(batch_input_shape=(10, 20), name="input", dtype=tf.uint8) out = tf.add(x, tf.constant(5, dtype=tf.uint8), name="output") return tf.keras.Model(inputs=x, outputs=out) @pytest.fixture def int8_input_model(): x = tf.keras.Input(batch_input_shape=(10, 20), name="input", dtype=tf.int8) out = tf.add(x, tf.constant(5, dtype=tf.int8), name="output") return tf.keras.Model(inputs=x, outputs=out) @pytest.fixture def int32_input_model(): x = tf.keras.Input(batch_input_shape=(10, 20), name="input", dtype=tf.int32) out = tf.add(x, tf.constant(5, dtype=tf.int32), name="output") return tf.keras.Model(inputs=x, outputs=out) @pytest.fixture def int32_two_input_model(): x = tf.keras.Input(batch_input_shape=(10, 20), name="input1", dtype=tf.int32) y = tf.keras.Input(batch_input_shape=(10, 20), name="input2", dtype=tf.int32) out = tf.add(x, y, name="output") return tf.keras.Model(inputs=[x, y], outputs=out) @pytest.fixture def int32_two_output_model(): x = tf.keras.Input(batch_input_shape=(10, 20), name="input1", dtype=tf.int32) y = tf.keras.Input(batch_input_shape=(10, 20), name="input2", dtype=tf.int32) out1 = tf.add(x, 1, name="output1") out2 = tf.add(y, 1, name="output2") return tf.keras.Model(inputs=[x, y], outputs=[out1, out2]) @pytest.fixture def int32_float32_two_output_model(): x = tf.keras.Input(batch_input_shape=(10, 20), name="input1", dtype=tf.float32) y = tf.keras.Input(batch_input_shape=(10, 20), name="input2", dtype=tf.float32) x_add = tf.add(x, 1.0, name="output1") y_add = tf.add(y, 1.0) y_cast = tf.cast(y_add, dtype=tf.int32, name="output2") return tf.keras.Model(inputs=[x, y], outputs=[x_add, y_cast]) @pytest.fixture def int32_float32_two_input_model(): x = tf.keras.Input(batch_input_shape=(10, 20), name="input1", dtype=tf.int32) y = tf.keras.Input(batch_input_shape=(10, 20), name="input2", dtype=tf.float32) x_cast = tf.cast(x, dtype=tf.float32) out = tf.add(x_cast, y, name="output") return tf.keras.Model(inputs=[x, y], outputs=out) @pytest.fixture def float32_input_model_add_op(): x = tf.keras.Input(batch_input_shape=(10, 20), name="input", dtype=tf.float32) out = tf.add(x, tf.constant(5.5, dtype=tf.float32), name="output") return tf.keras.Model(inputs=x, outputs=out) @pytest.fixture def float32_input_model_relu_ops(): x = tf.keras.Input(batch_input_shape=(10, 20), name="input", dtype=tf.float32) x1 = 
tf.keras.layers.ReLU()(x) out = tf.keras.layers.ReLU(name="output")(x1) return tf.keras.Model(inputs=x, outputs=out) @pytest.fixture def int64_input_model(): x = tf.keras.Input(batch_input_shape=(10, 20), name="input", dtype=tf.int64) out = tf.add(x, tf.constant(5, dtype=tf.int64), name="output") return tf.keras.Model(inputs=x, outputs=out) @pytest.fixture def float32_two_input_model(): x = tf.keras.Input(batch_input_shape=(10, 20), name="input1", dtype=tf.float32) y = tf.keras.Input(batch_input_shape=(10, 20), name="input2", dtype=tf.float32) out = tf.add(x, y, name="output") return tf.keras.Model(inputs=[x, y], outputs=out) @pytest.fixture def float32_two_output_model(): x = tf.keras.Input(batch_input_shape=(10, 20), name="input", dtype=tf.float32) y = tf.nn.relu(x) out2 = tf.nn.relu6(x, name="output2") out1 = tf.nn.relu(y, name="output1") return tf.keras.Model(inputs=x, outputs=[out1, out2]) @pytest.fixture def float64_input_model(): x = tf.keras.Input(batch_input_shape=(10, 20), name="input", dtype=tf.float64) out = tf.add(x, tf.constant(5, dtype=tf.float64), name="output") return tf.keras.Model(inputs=x, outputs=out) @pytest.fixture def rank3_input_model(): x = tf.keras.Input(batch_input_shape=(1, 10, 20), name="input", dtype=tf.float32) out = tf.add(x, tf.constant(5, dtype=tf.float32), name="output") return tf.keras.Model(inputs=x, outputs=out) @pytest.fixture def rank4_input_model(): x = tf.keras.Input(batch_input_shape=(1, 10, 20, 3), name="input", dtype=tf.float32) out = tf.add(x, tf.constant(5, dtype=tf.float32), name="output") return tf.keras.Model(inputs=x, outputs=out) @pytest.fixture def rank4_input_model_with_channel_first_output(): x = tf.keras.Input(batch_input_shape=(1, 10, 20, 3), name="input", dtype=tf.float32) y = tf.add(x, tf.constant(5, dtype=tf.float32)) out = tf.transpose(y, perm=[0, 3, 1, 2], name="output") return tf.keras.Model(inputs=x, outputs=out) @pytest.fixture def rank4_grayscale_input_model(): x = tf.keras.Input(batch_input_shape=(1, 10, 20, 1), name="input", dtype=tf.float32) out = tf.add(x, tf.constant(5, dtype=tf.float32), name="output") return tf.keras.Model(inputs=x, outputs=out) @pytest.fixture def rank4_grayscale_input_model_with_channel_first_output(): x = tf.keras.Input(batch_input_shape=(1, 10, 20, 1), name="input", dtype=tf.float32) y = tf.add(x, tf.constant(5, dtype=tf.float32)) out = tf.transpose(y, perm=[0, 3, 1, 2], name="output") return tf.keras.Model(inputs=x, outputs=out) @pytest.fixture def linear_model(): # this model will test the fuse_matmul_weight_bias pass x = tf.keras.Input(batch_input_shape=(1, 10), name="input", dtype=tf.float32) y = tf.keras.layers.Dense(4)(x) y = tf.add(y, tf.constant([1, 2, 3, 4], shape=(4,), dtype=tf.float32)) out = tf.nn.relu(y) return tf.keras.Model(inputs=x, outputs=out) ################################################################################# # Note: all tests are also used as examples in https://coremltools.readme.io/docs # as a reference. 
# Whenever any of the following test fails, we should update API documentations ################################################################################# class TestTensorFlow2ConverterExamples: def setup_class(self): self._cwd = getcwd() self._temp_dir = mkdtemp() # step into temp directory as working directory # to make the user-facing examples cleaner chdir(self._temp_dir) # create toy models for conversion examples # write a toy tf.keras HDF5 model tf_keras_model = tf.keras.Sequential( [ tf.keras.layers.Flatten(input_shape=(28, 28)), tf.keras.layers.Dense(128, activation=tf.nn.relu), tf.keras.layers.Dense(10, activation=tf.nn.softmax), ] ) tf_keras_model.save("./tf_keras_model.h5") # write a toy SavedModel directory tf_keras_model.save("./saved_model", save_format="tf") def teardown_class(self): chdir(self._cwd) if os.path.exists(self._temp_dir): rmtree(self._temp_dir) @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_convert_tf_keras_h5_file(backend): for file_extension in ("h5", "hdf5"): x = tf.keras.Input(shape=(32,), name="input") y = tf.keras.layers.Dense(16, activation="softmax")(x) keras_model = tf.keras.Model(x, y) temp_dir = mkdtemp() save_dir = str(temp_dir) path = os.path.join(save_dir, "tf_keras_model." + file_extension) keras_model.save(path) mlmodel = ct.convert(path, convert_to=backend[0]) test_input = np.random.rand(2, 32) expected_val = keras_model(test_input) results = mlmodel.predict({"input": test_input}) # We should check the numerical on Rosetta after the radar is fixed: # rdar://126185417 ([CI][TF] Two TF2 API testing is failing on Rosetta with numerical issues) if platform.machine() == "arm64": np.testing.assert_allclose(results["Identity"], expected_val, rtol=1e-2, atol=1e-2) @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_convert_tf_keras_model(backend): x = tf.keras.Input(shape=(32,), name="input") y = tf.keras.layers.Dense(16, activation="softmax")(x) keras_model = tf.keras.Model(x, y) mlmodel = ct.convert(keras_model, convert_to=backend[0]) test_input = np.random.rand(2, 32) expected_val = keras_model(test_input) results = mlmodel.predict({"input": test_input}) # We should check the numerical on Rosetta after the radar is fixed: # rdar://126185417 ([CI][TF] Two TF2 API testing is failing on Rosetta with numerical issues) if platform.machine() == "arm64": np.testing.assert_allclose(results["Identity"], expected_val, rtol=0.005) @staticmethod @pytest.mark.parametrize( "dtype", ['default', 'mil_type', 'np type']) def test_convert_tf_keras_applications_model(dtype): tf_keras_model = tf.keras.applications.MobileNet( weights="imagenet", input_shape=(224, 224, 3) ) # inputs / outputs are optional, we can get from tf.keras model # this can be extremely helpful when we want to extract sub-graphs input_name = tf_keras_model.inputs[0].name.split(":")[0] if dtype == 'default': dtype = None elif dtype == 'mil_type': dtype = types.fp32 else: dtype = np.float32 mlmodel = ct.convert( tf_keras_model, inputs=[ct.TensorType(shape=(1, 224, 224, 3), dtype=dtype)], convert_to="neuralnetwork", ) mlmodel.save("./mobilenet.mlmodel") @staticmethod def test_convert_from_saved_model_dir(): # SavedModel directory generated by TensorFlow 2.x mlmodel = ct.convert("./saved_model", convert_to="neuralnetwork") mlmodel.save("./model.mlmodel") @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_keras_custom_layer_model(backend): # testing : 
https://coremltools.readme.io/docs/tensorflow-2#conversion-from-user-defined-models class CustomDense(layers.Layer): def __init__(self, units=32): super(CustomDense, self).__init__() self.units = units def build(self, input_shape): self.w = self.add_weight( shape=(input_shape[-1], self.units), initializer="random_normal", trainable=True, ) self.b = self.add_weight( shape=(self.units,), initializer="random_normal", trainable=True ) def call(self, inputs): return tf.matmul(inputs, self.w) + self.b inputs = keras.Input((4,)) outputs = CustomDense(10)(inputs) model = keras.Model(inputs, outputs) ct.convert(model, convert_to=backend[0]) @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_concrete_function_conversion(backend): # testing : https://coremltools.readme.io/docs/tensorflow-2#conversion-from-user-defined-models @tf.function(input_signature=[tf.TensorSpec(shape=(6,), dtype=tf.float32)]) def gelu_tanh_activation(x): a = (np.sqrt(2 / np.pi) * (x + 0.044715 * tf.pow(x, 3))) y = 0.5 * (1.0 + tf.tanh(a)) return x * y conc_func = gelu_tanh_activation.get_concrete_function() mlmodel = ct.convert([conc_func], convert_to=backend[0]) @staticmethod def test_convert_tf2_keras(): x = tf.keras.Input(shape=(32,), name="input") y = tf.keras.layers.Dense(16, activation="softmax")(x) keras_model = tf.keras.Model(x, y) model = ct.convert(keras_model, convert_to='milinternal') assert isinstance(model, ct.converters.mil.Program) class TestTF2FlexibleInput: # Test examples in https://coremltools.readme.io/docs/flexible-inputs @staticmethod @pytest.mark.parametrize( "use_symbol, backend", itertools.product( [True, False], backends, ), ) def test_tf2keras_shared_range_dim(use_symbol, backend): input_dim = 3 # None denotes seq_len dimension x1 = tf.keras.Input(shape=(None,input_dim), name="seq1") x2 = tf.keras.Input(shape=(None,input_dim), name="seq2") y = x1 + x2 keras_model = tf.keras.Model(inputs=[x1, x2], outputs=[y]) # One RangeDim shared by two inputs upper_bound = -1 if backend[0] == "neuralnetwork" else 5 if use_symbol: seq_len_dim = ct.RangeDim(symbol="seq_len", upper_bound=upper_bound) else: # symbol is optional seq_len_dim = ct.RangeDim(upper_bound=upper_bound) seq1_input = ct.TensorType(name="seq1", shape=(1, seq_len_dim, input_dim)) seq2_input = ct.TensorType(name="seq2", shape=(1, seq_len_dim, input_dim)) mlmodel = ct.convert(keras_model, inputs=[seq1_input, seq2_input], convert_to=backend[0]) batch = 1 seq_len = 5 test_input_x1 = np.random.rand(batch, seq_len, input_dim).astype(np.float32) test_input_x2 = np.random.rand(batch, seq_len, input_dim).astype(np.float32) expected_val = keras_model([test_input_x1, test_input_x2]) if ct.utils._is_macos(): results = mlmodel.predict({ "seq1": test_input_x1, "seq2": test_input_x2}) np.testing.assert_allclose(results["Identity"], expected_val, rtol=1e-2, atol=1e-2) @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_tf2keras_incorrect_range_dim(backend): input_dim = 3 # None denotes seq_len dimension x1 = tf.keras.Input(shape=(None,input_dim), name="seq1") y = x1 + 1 keras_model = tf.keras.Model(inputs=[x1], outputs=[y]) # Incorrectly using -1 instead of ct.RangeDim # One RangeDim shared by two inputs with pytest.raises(ValueError, match=r"Can\'t convert to CoreML shaping"): seq1_input = ct.TensorType(name="seq1", shape=(1, -1, input_dim)) mlmodel = ct.convert(keras_model, inputs=[seq1_input], convert_to=backend[0]) @staticmethod @pytest.mark.parametrize( "use_symbol, backend", itertools.product( [True, False], 
backends, ), ) def test_tf2keras_outofbound_range_dim(use_symbol, backend): input_dim = 3 # None denotes seq_len dimension x = tf.keras.Input(shape=(None,input_dim), name="seq") y = x * 2 keras_model = tf.keras.Model(inputs=[x], outputs=[y]) if use_symbol: seq_len_dim = ct.RangeDim(symbol='sequence_len', lower_bound=3, upper_bound=5) else: seq_len_dim = ct.RangeDim(lower_bound=3, upper_bound=5) seq_input = ct.TensorType(name="seq", shape=(1, seq_len_dim, input_dim)) mlmodel = ct.convert(keras_model, inputs=[seq_input], convert_to=backend[0]) # seq_len is within bound batch = 1 seq_len = 3 test_input_x = np.random.rand(batch, seq_len, input_dim).astype(np.float32) expected_val = keras_model([test_input_x]) if ct.utils._is_macos(): results = mlmodel.predict({"seq": test_input_x}) np.testing.assert_allclose(results["Identity"], expected_val, rtol=1e-4, atol=1e-3) # seq_len below/above lower_bound/upper_bound with pytest.raises(RuntimeError, match=r"Size \(2\) of dimension \(1\) is not in allowed range \(3\.\.5\)"): seq_len = 2 test_input_x = np.random.rand(batch, seq_len, input_dim).astype(np.float32) results = mlmodel.predict({"seq": test_input_x}) with pytest.raises(RuntimeError, match=r"Size \(6\) of dimension \(1\) is not in allowed range \(3\.\.5\)"): seq_len = 6 test_input_x = np.random.rand(batch, seq_len, input_dim).astype(np.float32) results = mlmodel.predict({"seq": test_input_x}) @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_tf2_image_enumerated_shapes(backend): keras_model = tf.keras.applications.MobileNetV2( input_shape=(None, None, 3,), classes=1000, include_top=False, ) input_shapes = ct.EnumeratedShapes(shapes=[(1, 192, 192, 3), (1, 224, 224, 3)]) image_input = ct.ImageType(shape=input_shapes, bias=[-1,-1,-1], scale=1/127) model = ct.convert(keras_model, inputs=[image_input], convert_to=backend[0]) assert model is not None spec = model.get_spec() assert len(spec.description.input[0].type.imageType.enumeratedSizes.sizes) == 2 @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_tf2keras_enumerated_shapes(backend): input_shape = (28, 28, 3) # None denotes seq_len dimension x = tf.keras.Input(shape=input_shape, name="input") C_out = 2 kHkW = 3 y = tf.keras.layers.Conv2D(C_out, kHkW, activation='relu', input_shape=input_shape)(x) keras_model = tf.keras.Model(inputs=[x], outputs=[y]) # One RangeDim shared by two inputs shapes = [(1, 28, 28, 3), (1, 56, 56, 3)] enumerated_shapes = ct.EnumeratedShapes(shapes=shapes) tensor_input = ct.TensorType(name="input", shape=enumerated_shapes) mlmodel = ct.convert(keras_model, inputs=[tensor_input], convert_to=backend[0]) # Test (1, 28, 28, 3) shape test_input_x = np.random.rand(*shapes[0]).astype(np.float32) expected_val = keras_model([test_input_x]) if ct.utils._is_macos(): results = mlmodel.predict({ "input": test_input_x}) # rdar://101303143 ([CI] test_tf2keras_enumerated_shapes is getting some stochastic numerical issues on intel machines) # The tolerance is set a little bit big here. Need to investigate this issue if possible and lower the threshold down. 
np.testing.assert_allclose(results["Identity"], expected_val, atol=1e-2, rtol=3) # Test (1, 56, 56, 3) shape (can't verify numerical parity with Keras # which doesn't support enumerated shape) test_input_x = np.random.rand(*shapes[1]).astype(np.float32) results = mlmodel.predict({ "input": test_input_x}) # Test with a wrong shape with pytest.raises(RuntimeError, match=r"MultiArray Shape \(1 x 29 x 29 x 3\) was not in enumerated set of allowed shapes"): test_input_x = np.random.rand(1, 29, 29, 3).astype(np.float32) results = mlmodel.predict({ "input": test_input_x}) @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_tf2keras_optional_input(backend): input_dim = 3 # None denotes seq_len dimension x1 = tf.keras.Input(shape=(None,input_dim), name="optional_input") x2 = tf.keras.Input(shape=(None,input_dim), name="required_input") y = x1 + x2 keras_model = tf.keras.Model(inputs=[x1, x2], outputs=[y]) upper_bound = -1 if backend[0] == "neuralnetwork" else 2 seq_len_dim = ct.RangeDim(upper_bound=upper_bound) default_value = np.ones((1, 2, input_dim)).astype(np.float32) optional_input = ct.TensorType( name="optional_input", shape=(1, seq_len_dim, input_dim), default_value=default_value, ) required_input = ct.TensorType( name="required_input", shape=(1, seq_len_dim, input_dim), ) mlmodel = ct.convert( keras_model, inputs=[optional_input, required_input], convert_to=backend[0] ) batch = 1 seq_len = 2 test_input_x2 = np.random.rand(batch, seq_len, input_dim).astype(np.float32) expected_val = keras_model([default_value, test_input_x2]) if ct.utils._is_macos(): results = mlmodel.predict({"required_input": test_input_x2}) np.testing.assert_allclose(results["Identity"], expected_val, rtol=1e-2) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow2/test/test_v2_load.py0000644000000000000000000002336214672066616030043 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os import shutil import tempfile import pytest import coremltools.converters as converter from coremltools.converters.mil.frontend.tensorflow.test.test_load import \ frontend from coremltools.converters.mil.frontend.tensorflow.test.testing_utils import \ get_tf_keras_io_names from coremltools.converters.mil.input_types import TensorType from coremltools.converters.mil.testing_reqs import backends tf = pytest.importorskip("tensorflow", minversion="2.1.0") class TestTf2ModelFormats: def setup(self): self.saved_model_dir = tempfile.mkdtemp() _, self.model_path_h5 = tempfile.mkstemp( suffix=".h5", prefix=self.saved_model_dir ) def teardown(self): if os.path.exists(self.saved_model_dir): shutil.rmtree(self.saved_model_dir) @pytest.mark.parametrize( "backend", backends, ) def test_keras_model(self, backend): keras_model = tf.keras.Sequential( [tf.keras.layers.ReLU(input_shape=(4, 5), batch_size=3)] ) input_names, output_names = get_tf_keras_io_names(keras_model) mlmodel = converter.convert( keras_model, inputs=[TensorType(input_names[0], (3, 4, 5))], outputs=["Identity"], source=frontend, convert_to=backend[0], ) assert mlmodel is not None @pytest.mark.parametrize( "backend", backends, ) def test_keras_saved_model_file(self, backend): keras_model = tf.keras.Sequential( [ tf.keras.layers.Flatten(input_shape=(28, 28), batch_size=1), tf.keras.layers.Dense(10, activation=tf.nn.relu), ] ) keras_model.save(self.saved_model_dir, save_format="tf") mlmodel = converter.convert( self.saved_model_dir, outputs=["Identity"], source=frontend, convert_to=backend[0] ) assert mlmodel is not None @pytest.mark.parametrize( "backend", backends, ) def test_keras_h5_file(self, backend): keras_model = tf.keras.Sequential( [tf.keras.layers.ReLU(input_shape=(4, 5), batch_size=3)] ) input_names, output_names = get_tf_keras_io_names(keras_model) keras_model.save(self.model_path_h5, save_format="h5") mlmodel = converter.convert( self.model_path_h5, inputs=[TensorType(input_names[0], (3, 4, 5))], outputs=["Identity"], source=frontend, convert_to=backend[0], ) assert mlmodel is not None @pytest.mark.parametrize( "backend", backends, ) def test_keras_hdf5_file(self, backend): keras_model = tf.keras.Sequential( [tf.keras.layers.ReLU(input_shape=(4, 5), batch_size=3)] ) input_names, output_names = get_tf_keras_io_names(keras_model) keras_model.save(self.model_path_h5, save_format="h5") mlmodel = converter.convert( self.model_path_h5, inputs=[TensorType(input_names[0], (3, 4, 5))], outputs=["Identity"], source=frontend, convert_to=backend[0], ) assert mlmodel is not None @pytest.mark.parametrize( "backend", backends, ) def test_concrete_function_list_from_tf_low_level_api(self, backend): root = tf.train.Checkpoint() root.v1 = tf.Variable(3.0) root.v2 = tf.Variable(2.0) root.f = tf.function(lambda x: root.v1 * root.v2 * x) input_data = tf.constant(1.0, shape=[1, 1]) to_save = root.f.get_concrete_function(input_data) tf.saved_model.save(root, self.saved_model_dir, to_save) tf_model = tf.saved_model.load(self.saved_model_dir) concrete_func = tf_model.signatures[ tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY ] mlmodel = converter.convert( [concrete_func], outputs=["Identity"], source=frontend, convert_to=backend[0] ) assert mlmodel is not None @pytest.mark.parametrize( "backend", backends, ) def test_saved_model_list_from_tf_function(self, backend): class 
build_model(tf.Module): @tf.function( input_signature=[tf.TensorSpec(shape=[3, 4, 5], dtype=tf.float32)] ) def __call__(self, x): return tf.nn.relu(x) model = build_model() tf.saved_model.save(model, self.saved_model_dir) mlmodel = converter.convert( self.saved_model_dir, outputs=["Identity"], source=frontend, convert_to=backend[0] ) assert mlmodel is not None @pytest.mark.parametrize( "backend", backends, ) def test_concrete_function_list_from_tf_function(self, backend): class build_model(tf.Module): @tf.function( input_signature=[tf.TensorSpec(shape=[3, 4, 5], dtype=tf.float32)] ) def __call__(self, x): return tf.nn.relu(x) model = build_model() concrete_func = model.__call__.get_concrete_function() mlmodel = converter.convert( [concrete_func], outputs=["Identity"], source=frontend, convert_to=backend[0] ) assert mlmodel is not None @pytest.mark.parametrize( "backend", backends, ) def test_graphdef_from_tf_function(self, backend): class build_model(tf.Module): def __init__(self): self.dense = tf.keras.layers.Dense(256, activation="relu") input_signature = [ tf.TensorSpec(name="input", shape=( 128, 128), dtype=tf.float32), ] @tf.function(input_signature=input_signature) def call(self, x): x = self.dense(x) return x model = build_model() from tensorflow.python.framework.convert_to_constants import \ convert_variables_to_constants_v2 frozen_graph_func = convert_variables_to_constants_v2( model.call.get_concrete_function()) frozen_graph_def = frozen_graph_func.graph.as_graph_def() mlmodel = converter.convert(frozen_graph_def, convert_to=backend[0]) assert mlmodel is not None @pytest.mark.parametrize( "backend", backends, ) def test_model_metadata(self, backend): keras_model = tf.keras.Sequential( [tf.keras.layers.ReLU(input_shape=(4, 5), batch_size=3)] ) input_names, output_names = get_tf_keras_io_names(keras_model) mlmodel = converter.convert( keras_model, inputs=[TensorType(input_names[0], (3, 4, 5))], outputs=["Identity"], source=frontend, convert_to=backend[0], ) metadata_keys = mlmodel.get_spec().description.metadata.userDefined assert "com.github.apple.coremltools.version" in metadata_keys assert "com.github.apple.coremltools.source" in metadata_keys assert "tensorflow==2." 
in metadata_keys["com.github.apple.coremltools.source"] @pytest.mark.parametrize( "backend", backends, ) def test_invalid_format_none(self, backend): with pytest.raises(NotImplementedError, match="Expected model format: .* .h5"): converter.convert(None, source=frontend, convert_to=backend[0]) @pytest.mark.parametrize( "backend", backends, ) def test_invalid_format_invalid_extension(self, backend): _, invalid_filename = tempfile.mkstemp(suffix=".invalid", prefix=self.saved_model_dir) with pytest.raises( ValueError, match="Input model path should be .h5/.hdf5 file or a directory, but got .*.invalid", ): converter.convert(invalid_filename, source=frontend, convert_to=backend[0]) @pytest.mark.parametrize( "backend", backends, ) def test_invalid_format_multiple_concrete_functions(self, backend): class build_model(tf.Module): @tf.function( input_signature=[tf.TensorSpec(shape=[3, 4, 5], dtype=tf.float32)] ) def __call__(self, x): return tf.nn.relu(x) model = build_model() cf = model.__call__.get_concrete_function() with pytest.raises( NotImplementedError, match="Only a single concrete function is supported" ): converter.convert([cf, cf, cf], source=frontend, convert_to=backend[0]) @pytest.mark.parametrize( "backend", backends, ) def test_invalid_converter_type(self, backend): keras_model = tf.keras.Sequential( [tf.keras.layers.ReLU(input_shape=(4, 5), batch_size=3)] ) with pytest.raises(ValueError) as e: converter.convert(keras_model, source="invalid", convert_to=backend[0]) expected_msg = r'Unrecognized value of argument "source": .*' e.match(expected_msg) with pytest.raises(NotImplementedError) as e: converter.convert(keras_model, convert_to="invalid", source=frontend) e.match(r"Backend converter .* not implemented") @pytest.mark.parametrize( "backend", backends, ) def test_invalid_format_non_exist(self, backend): non_exist_filename = self.model_path_h5.replace(".h5", "_non_exist.h5") with pytest.raises(ValueError) as e: converter.convert(non_exist_filename, source=frontend, convert_to=backend[0]) e.match(r"Input model .* does not exist") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow2/test/test_v2_ops.py0000644000000000000000000006172314672066616027730 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest import coremltools as ct from coremltools.converters.mil import testing_reqs from coremltools.converters.mil.frontend.tensorflow.test.testing_utils import \ TensorFlowBaseTest from coremltools.converters.mil.frontend.tensorflow2.test.testing_utils import \ TensorFlow2BaseTest from coremltools.converters.mil.frontend.tensorflow2.test.testing_utils import \ make_tf2_graph as make_tf_graph from coremltools.converters.mil.testing_utils import random_gen TensorFlowBaseTest.run_compare_tf = TensorFlow2BaseTest.run_compare_tf2 tf = pytest.importorskip("tensorflow", minversion="2.1.0") backends = testing_reqs.backends compute_units = testing_reqs.compute_units class TestImageResample(TensorFlowBaseTest): @pytest.mark.skip( "TODO: rdar://100812753 ([TF] [Infra] TensorFlow Addons dylib issues in TF 2.10.0)" ) @pytest.mark.parametrize( "compute_unit, backend, data_warp_shapes", itertools.product( compute_units, backends, [ # Data shape format: (Batch, Hin, Win, C) # Warp shape format: (Batch, Hout, Wout, 2) [(1, 3, 3, 1), (1, 3, 3, 2)], # no size change [(2, 5, 5, 3), (2, 3, 3, 2)], # down-sampling [(3, 6, 6, 1), (3, 8, 8, 2)], # up-sampling ], ), ) def test_resample( self, compute_unit, backend, data_warp_shapes, ): if backend[0] == "neuralnetwork": pytest.skip("nn backend not supported") tfa = pytest.importorskip("tensorflow_addons") data_shape, warp_shape = data_warp_shapes @make_tf_graph([data_shape, warp_shape]) def build_model(x, warp): return tfa.image.resampler(data=x, warp=warp) model, inputs, outputs = build_model # warp exceeding input sizes in order to test more padding modes input_values = [ random_gen(data_shape, -100, 100), random_gen(warp_shape, -15, 15), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestImageTransform(TensorFlowBaseTest): @pytest.mark.skip( "TODO: rdar://73165549 (Add other mode in 'affine' to coremltools when backend is ready)" ) @pytest.mark.parametrize( "compute_unit, backend, transforms, interpolation, shapes", itertools.product( [True], backends, [ [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0], [1.0, 1.0, -250, 0.0, 1.0, 0.0, 0.0, 0.0], [1.25, -1.75, 25.0, -25.0, 1.5, -1.5, 0.0, 0.0], ], ["BILINEAR"], [ ((1, 2, 2, 1), None), ((2, 2, 2, 1), (2, 3)), ((3, 5, 5, 2), (4, 4)), ((1, 3, 3, 2), (6, 6)), ((3, 50, 50, 2), (20, 20)), ], ), ) def test(self, compute_unit, backend, transforms, interpolation, shapes): x_shape, output_shape = shapes if backend[0] == "neuralnetwork": pytest.skip("nn backend not supported") tfa = pytest.importorskip("tensorflow_addons") @make_tf_graph([x_shape]) def build_model(x): return tfa.image.transform( x, transforms=transforms, interpolation=interpolation, output_shape=output_shape, ) model, inputs, outputs = build_model input_values = [ random_gen(x_shape, -100, 100), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, InputShape_OutputShape, op", itertools.product( compute_units, backends, [ [(2, 5, 15, 3), (2, 5, 15, 3)], [(2, 4, 8, 5), (2, 2, 4, 5)], [(2, 4, 8, 3), (2, 9, 13, 3)], ], ["V2", "V3"], ), ) def test_affine_transform(self, compute_unit, backend, 
InputShape_OutputShape, op): if backend[0] == "neuralnetwork": pytest.skip("Affine op not available in the neuralnetwork backend") input_shape, output_shape = InputShape_OutputShape batch_size = input_shape[0] transforms = np.random.rand(batch_size, 8) - 0.05 transforms[:, 6:8] = 0 @make_tf_graph([input_shape]) def build_model(x): if op == "V2": return tf.raw_ops.ImageProjectiveTransformV2( images=x, transforms=transforms, fill_mode="CONSTANT", output_shape=(output_shape[0], output_shape[1]), interpolation="BILINEAR", ) elif op == "V3": return tf.raw_ops.ImageProjectiveTransformV3( images=x, transforms=transforms, fill_mode="CONSTANT", output_shape=(output_shape[0], output_shape[1]), interpolation="BILINEAR", fill_value=0.0, ) else: raise ValueError("tensorflow op {} not supported".format(op)) model, inputs, outputs = build_model input_values = [np.random.rand(*input_shape).astype(np.float32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestActivationSiLU(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, tf_op", itertools.product( compute_units, backends, list(range(1, 6)), [ tf.nn.swish, # TODO(yuduo): in TF 2.4.0+, it's renamed to tf.nn.silu, tf.keras.activations.swish, ], ), ) def test(self, compute_unit, backend, rank, tf_op): if backend[0] == "neuralnetwork": pytest.skip("nn backend not supported") x_shape = tuple(np.random.randint(low=1, high=4, size=rank)) @make_tf_graph([x_shape]) def build_model(x): return tf_op(x) model, inputs, outputs = build_model input_values = [ random_gen(x_shape, -100, 100), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestResizeNearestNeighbor(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, input_shape, target_shape, align_corners, half_pixel_centers", itertools.product( compute_units, backends, [(1, 10, 20, 1), (2, 5, 1, 3)], [(25, 30), (2, 20)], [False], [True, False], ), ) def test_raw_ops( self, compute_unit, backend, input_shape, target_shape, align_corners, half_pixel_centers, ): if align_corners is True and half_pixel_centers is True: return if backend[0] == "neuralnetwork": # neural network backend does not support fractional scale factors for nearest neighbor upsample op if target_shape[-1] % input_shape[-1] != 0: return if target_shape[-2] % input_shape[-2] != 0: return if backend[0] == "mlprogram" and compute_unit != ct.ComputeUnit.CPU_ONLY and not half_pixel_centers: pytest.xfail("rdar://97399545 (TestResizeNearestNeighbor failing on mlprogram + GPU + half_pixel_centers=False)") @make_tf_graph([input_shape]) def build_model(x): return tf.raw_ops.ResizeNearestNeighbor( images=x, size=target_shape, align_corners=align_corners, half_pixel_centers=half_pixel_centers, ) model, inputs, outputs = build_model input_values = [random_gen(input_shape, -100, 100)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, size", itertools.product(compute_units, backends, [(1, 1), (2, 3), (4, 1)]), ) def test_keras_layer(self, compute_unit, backend, size): if backend[0] == "neuralnetwork": pytest.skip("nn backend not supported") x_shape = tuple(np.random.randint(low=1, high=4, size=4)) @make_tf_graph([x_shape]) def build_model(x): 
return tf.keras.layers.UpSampling2D( size=size, interpolation="nearest", )(x) model, inputs, outputs = build_model input_values = [random_gen(x_shape, -100, 100)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, size, method", itertools.product( compute_units, backends, [(1, 1), (2, 3)], [tf.image.ResizeMethod.NEAREST_NEIGHBOR], ), ) def test_tf_image_resize(self, compute_unit, backend, size, method): if backend[0] == "mlprogram" and size == (1, 1): pytest.xfail("rdar://79699954 (Nearest neighbor resize numerical mismatch when output size is (1,1))") if backend[0] == "neuralnetwork": pytest.skip("nn backend not supported") x_shape = tuple(np.random.randint(low=1, high=3, size=4)) @make_tf_graph([x_shape]) def build_model(x): return tf.image.resize(x, size=size, method=method) model, inputs, outputs = build_model input_values = [ random_gen(x_shape, -100, 100), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, ) class TestNormalizationTF2(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, func, backend, epsilon", itertools.product( compute_units, [tf.raw_ops.FusedBatchNorm, tf.raw_ops.FusedBatchNormV3], backends, [1e-1, 1e-10] ), ) def test_fused_batch_norm(self, compute_unit, func, backend, epsilon): input_shape = np.random.randint(low=1, high=4, size=4) attr_shape = [list(input_shape)[-1]] m = random_gen(shape=attr_shape, rand_min=-1.0, rand_max=1.0) v = random_gen(shape=attr_shape, rand_min=0.0, rand_max=10.0) o = random_gen(shape=attr_shape, rand_min=1.0, rand_max=10.0) s = random_gen(shape=attr_shape, rand_min=-1.0, rand_max=1.0) @make_tf_graph([input_shape]) def build_model(x): return func( x=x, scale=s, offset=o, mean=m, variance=v, epsilon=epsilon, is_training=False, )[0] model, inputs, outputs = build_model input_values = [random_gen(shape=input_shape)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend, atol=1e-2, rtol=1e-3, ) class TestElementWiseBinaryTF2(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product(compute_units, backends, [rank for rank in range(1, 4)]), # False ) def test_add_v2(self, compute_unit, backend, rank): x_shape = list(np.random.randint(low=2, high=5, size=rank)) y_shape = x_shape[:] for i in range(rank): if np.random.randint(4) == 0: y_shape[i] = 1 if np.random.randint(2) == 0: y_shape = [1] + y_shape @make_tf_graph([x_shape, y_shape]) def build_model(x, y): return tf.raw_ops.AddV2(x=x, y=y) model, inputs, outputs = build_model input_values = [ np.random.randint(low=-1, high=1, size=x_shape).astype(np.float32), np.random.randint(low=-1, high=1, size=y_shape).astype(np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestControlFlowFromAutoGraph(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_if_unary_const(self, compute_unit, backend): @make_tf_graph([(1,)]) def build_model(x): if x > 0.5: y = x - 0.5 else: y = x + 0.5 return y model, inputs, outputs = build_model input_values = [np.array([0.7], dtype=np.float32)] input_dict = dict(zip(inputs, 
input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_if_unary_double_if_positive_else_square(self, compute_unit, backend): @make_tf_graph([(1,)]) def build_model(x): if x >= 0: out = x + x else: out = x * x return out model, inputs, outputs = build_model input_values = [np.array([2], dtype=np.float32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_if_binary_add_if_else_mul(self, compute_unit, backend): @make_tf_graph([(1,), (1,)]) def build_model(x, y): if x > y: out = x + x else: out = x * x return out model, inputs, outputs = build_model input_values = [ np.array([3], dtype=np.float32), np.array([7], dtype=np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_while_loop_square(self, compute_unit, backend): @make_tf_graph([(1,)]) def build_model(x): i = 0 while i < 10: x *= 2 i += 1 return x model, inputs, outputs = build_model input_values = [np.array([2.0], dtype=np.float32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_while_loop_power(self, compute_unit, backend): @make_tf_graph([(1,)]) def build_model(x): i = 0 while i < 3: x *= x i += 1 return x model, inputs, outputs = build_model input_values = [np.array([2.0], dtype=np.float32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_while_loop_nested_body(self, compute_unit, backend): @make_tf_graph([(1,)]) def build_model(x): i, j = 0, 10 while i < j: while 2 * i < i + 2: i += 1 x -= 1 i += 2 x *= 2 return x model, inputs, outputs = build_model input_values = [np.array([9.0], dtype=np.float32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.xfail(reason="rdar://76293949 (TF2 unit test InvalidArgumentError)", run=False) class TestTensorList(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, size_dynamic_shape", itertools.product( compute_units, backends, [ (1, True, None), (1, True, (1,)), (2, False, (1,)) ], ), ) def test_write_read_and_stack(self, compute_unit, backend, size_dynamic_shape): size, dynamic_size, element_shape = size_dynamic_shape @make_tf_graph([(1,), (1,)]) def build_model(x, y): ta = tf.TensorArray( tf.float32, size=size, dynamic_size=dynamic_size, element_shape=element_shape, ) ta = ta.write(0, x) ta = ta.write(1, y) return ta.read(0), ta.read(1), ta.stack() model, inputs, outputs = build_model input_values = [ np.array([3.14], dtype=np.float32), np.array([6.17], dtype=np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, 
input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend, size_dynamic_shape", itertools.product( compute_units, backends, [ (0, True, None), (1, True, (1,)), (3, False, (1,)) ], ), ) def test_unstack_and_read(self, compute_unit, backend, size_dynamic_shape): size, dynamic_size, element_shape = size_dynamic_shape @make_tf_graph([(3, 1)]) def build_model(x): ta = tf.TensorArray( tf.float32, size=size, dynamic_size=dynamic_size, element_shape=element_shape, ) ta = ta.unstack(x) return ta.read(0), ta.read(1), ta.read(2) model, inputs, outputs = build_model input_values = [np.array([[3.14], [6.17], [12.14]], dtype=np.float32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend, size_dynamic_shape", itertools.product( compute_units, backends, [ (2, True, None), (1, True, (1,)), (3, False, (1,)) ], ), ) def test_write_and_gather(self, compute_unit, backend, size_dynamic_shape): size, dynamic_size, element_shape = size_dynamic_shape @make_tf_graph([(1,), (1,)]) def build_model(x, y): ta = tf.TensorArray( tf.float32, size=size, dynamic_size=dynamic_size, element_shape=element_shape, ) ta = ta.write(0, x) ta = ta.write(1, y) return ta.gather(indices=[0, 1]) model, inputs, outputs = build_model input_values = [ np.array([3.14], dtype=np.float32), np.array([6.17], dtype=np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend, size_dynamic_shape", itertools.product( compute_units, backends, [ (2, True, None), (1, True, (1,)), (3, False, (1,)) ], ), ) def test_scatter_and_read(self, compute_unit, backend, size_dynamic_shape): size, dynamic_size, element_shape = size_dynamic_shape @make_tf_graph([(3, 1)]) def build_model(x): ta = tf.TensorArray( tf.float32, size=size, dynamic_size=dynamic_size, element_shape=element_shape, ) ta = ta.scatter(indices=[0, 1, 2], value=x) return ta.read(0), ta.read(1), ta.read(2) model, inputs, outputs = build_model input_values = [np.array([[3.14], [6.17], [12.14]], dtype=np.float32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) @pytest.mark.parametrize( "compute_unit, backend, size_dynamic_shape", itertools.product(compute_units, backends, [(2, False, (None, 8))]), ) def test_partial_element_shape(self, compute_unit, backend, size_dynamic_shape): size, dynamic_size, element_shape = size_dynamic_shape @make_tf_graph([(3, 1, 8)]) def build_model(x): ta = tf.TensorArray( tf.float32, size=size, dynamic_size=dynamic_size, element_shape=element_shape, ) ta = ta.scatter(indices=[0, 1, 2], value=x) return ta.read(0), ta.read(1), ta.read(2) model, inputs, outputs = build_model input_values = [np.random.rand(3, 1, 8).astype(np.float32)] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) class TestPartitionedCall(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_partitioned_call_optimized_to_add_op(self, compute_unit, backend): """ The PartitionedCall will be optimized to V2Add op in TF's internal optimization pass (see 
`_run_inline_graph_optimization`), so this test passes even when we haven't implemented the `PartitionedCall` op). """ x_shape = [2, 3] y_shape = [2, 3] @tf.function def simple_func(*args): output = [args[0] + args[1]] return output @make_tf_graph([x_shape, y_shape]) def build_model(x, y): return tf.raw_ops.PartitionedCall( args=[x, y], f=simple_func.get_concrete_function(tf.zeros(x_shape), tf.zeros(y_shape)), Tout=[tf.float32] ) model, inputs, outputs = build_model input_values = [ np.zeros(x_shape).astype(np.float32), np.zeros(y_shape).astype(np.float32), ] input_dict = dict(zip(inputs, input_values)) TensorFlowBaseTest.run_compare_tf( model, input_dict, outputs, compute_unit=compute_unit, backend=backend ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow2/test/test_v2_ops_tf_keras.py0000644000000000000000000017026714672066616031612 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import platform import random import numpy as np import pytest from packaging.version import Version import coremltools as ct from coremltools._deps import _get_version from coremltools.converters.mil import testing_reqs from coremltools.converters.mil.frontend._utils import is_symbolic_dim_in_prog from coremltools.converters.mil.frontend.tensorflow.test.testing_utils import ( TensorFlowBaseTest, ) from coremltools.converters.mil.frontend.tensorflow2.test.testing_utils import ( TensorFlow2BaseTest, ) from coremltools.converters.mil.testing_utils import get_op_types_in_program, random_gen from coremltools.models.utils import _macos_version TensorFlowBaseTest.run_compare_tf_keras = TensorFlow2BaseTest.run_compare_tf_keras backends = testing_reqs.backends compute_units = testing_reqs.compute_units tf = pytest.importorskip("tensorflow", minversion="2.1.0") import tensorflow as _tf # should be after pytest.importorskip checks from tensorflow.keras import Input from tensorflow.keras.layers import Conv2D, GlobalMaxPooling2D from tensorflow.keras.models import Model class TestActivation(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, op", itertools.product( compute_units, backends, [rank for rank in range(1, 6)], [ tf.keras.layers.ELU, tf.keras.layers.LeakyReLU, tf.keras.layers.ReLU, tf.keras.layers.PReLU, tf.keras.layers.Softmax, tf.keras.layers.ThresholdedReLU, ], ), ) def test_layer(self, compute_unit, backend, rank, op): shape = np.random.randint(low=2, high=4, size=rank) model = tf.keras.Sequential([op(batch_input_shape=shape)]) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, -10, 10)], compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank, op", itertools.product( compute_units, backends, [rank for rank in range(1, 6)], [ tf.keras.activations.elu, tf.keras.activations.exponential, tf.keras.activations.hard_sigmoid, tf.keras.activations.linear, tf.keras.activations.relu, tf.keras.activations.selu, tf.keras.activations.sigmoid, tf.keras.activations.softmax, tf.keras.activations.softplus, tf.keras.activations.softsign, tf.keras.activations.tanh, ], ), ) def test_activation(self, compute_unit, backend, rank, op): kwargs = ( {"atol": 1e-3, "rtol": 1e-4} if op == tf.keras.activations.exponential and compute_unit != 
ct.ComputeUnit.CPU_ONLY else {} ) if op == tf.keras.activations.softmax and rank == 1: return # skip apply softmax to a tensor that is 1D shape = np.random.randint(low=2, high=4, size=rank) model = tf.keras.Sequential( [tf.keras.layers.Activation(op, batch_input_shape=shape)] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, -10, 10)], compute_unit=compute_unit, backend=backend, **kwargs ) @pytest.mark.parametrize("backend", backends) def test_conv2d_prelu_fusion(self, backend): x_shape = (1, 10, 10, 32) x = tf.keras.Input(batch_input_shape=x_shape) # (B, H, W, C) x1 = tf.keras.layers.Conv2D(16, kernel_size=1)(x) x1 = tf.keras.layers.PReLU(alpha_initializer='glorot_uniform', shared_axes=[1, 2])(x1) x1 = tf.keras.layers.Conv2D(16, kernel_size=1)(x1) x1 = tf.keras.layers.PReLU(alpha_initializer='glorot_uniform', shared_axes=[1, 2])(x1) keras_model = tf.keras.Model(inputs=x, outputs=x1) res = TensorFlowBaseTest.run_compare_tf_keras( keras_model, [random_gen(x_shape, -1, 1)], compute_unit=ct.ComputeUnit.CPU_ONLY, backend=backend, ) coreml_model = res[1] mil_prog = coreml_model._get_mil_internal() # assert that "prelu" ops are present in the mil program, # which should be if "fuse_prelu" pass worked correctly assert len(mil_prog.find_ops(op_type="prelu")) == 2 assert "relu" not in get_op_types_in_program(mil_prog) class TestBinary(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, op", itertools.product( compute_units, backends, [rank for rank in range(2, 6)], [ tf.keras.layers.Add, tf.keras.layers.Average, tf.keras.layers.Subtract, tf.keras.layers.Maximum, tf.keras.layers.Minimum, ], ), ) def test(self, compute_unit, backend, rank, op): shape = np.random.randint(low=1, high=4, size=rank) input_x = tf.keras.layers.Input(batch_input_shape=tuple(shape)) input_y = tf.keras.layers.Input(batch_input_shape=tuple(shape)) out = op()([input_x, input_y]) model = tf.keras.Model(inputs=[input_x, input_y], outputs=out) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, -10, 10), random_gen(shape, -10, 10)], compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank, axes, normalize", itertools.product( compute_units, backends, [rank for rank in range(2, 3)], [-1,], [True, False], ), ) def test_dot(self, compute_unit, rank, backend, axes, normalize): shape = np.random.randint(low=2, high=4, size=rank) input_x = tf.keras.layers.Input(batch_input_shape=tuple(shape)) input_y = tf.keras.layers.Input(batch_input_shape=tuple(shape)) out = tf.keras.layers.Dot(axes=axes, normalize=normalize)([input_x, input_y]) model = tf.keras.Model(inputs=[input_x, input_y], outputs=out) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, -10, 10), random_gen(shape, -10, 10)], compute_unit=compute_unit, backend=backend, ) class TestConcatenate(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, axis", itertools.product( compute_units, backends, [rank for rank in range(5, 6)], [-1, -2], ), ) def test(self, compute_unit, backend, rank, axis): shape = np.random.randint(low=2, high=4, size=rank) inputs = [] for _ in range(2): inputs.append(tf.keras.layers.Input(batch_input_shape=tuple(shape))) out = tf.keras.layers.Concatenate(axis=axis)(inputs) model = tf.keras.Model(inputs=inputs, outputs=out) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape), random_gen(shape)], compute_unit=compute_unit, backend=backend, ) class TestConvolution(TensorFlowBaseTest): @pytest.mark.parametrize( 
",".join( [ "compute_unit", "backend", "op", "padding", "data_format", "spatial_dim_and_ks", "strides", "dilations", "batch_size", "groups", ] ), itertools.product( compute_units, backends, [ tf.keras.layers.Conv1D, tf.keras.layers.Conv2D, tf.keras.layers.Conv3D, ], ["same", "valid"], ["channels_last"], [ (2, 4, 4, 2, 2, 2), (3, 7, 5, 1, 3, 2) ], [ (1, 1, 1), (1, 2, 3), (1, 3, 2) ], [ (1, 1, 1), (2, 2, 2), ], [1, 3], [1, 2], ), ) def test_conv( self, compute_unit, backend, op, padding, data_format, spatial_dim_and_ks, strides, dilations, batch_size, groups, ): if _get_version(_tf.__version__) < Version("2.5.0") and groups != 1: pytest.skip("TF supports groupwise convolution only for version > tf.2.5.0-rc3") if _get_version(_tf.__version__) > Version("2.8.0") and groups != 1: pytest.xfail("rdar://100814590 ([TF] [Infra] TF 2.10.0 Uses Unimplemented " "PartitionedCall op for Groupwise Convolution)") if op == tf.keras.layers.Conv3D and groups != 1: pytest.xfail("rdar://81629932 (Conv3d with group > 1 tests failing in TF2.0 converter)") for i, stride in enumerate(strides): if stride > 1 and dilations[i] > 1: pytest.skip("TF does not support strides > 1 in conjunction with dilation_rate > 1") for d in dilations: if d > 1 and op == tf.keras.layers.Conv3D: pytest.skip("Dilations with Conv3D not supported yet, since SpaceToBatchND is " "only supported for ranks 3 or 4") s1, s2, s3, k1, k2, k3 = spatial_dim_and_ks c_in, c_out = 2, 4 input_shape = None kernel_size = None if op == tf.keras.layers.Conv1D: input_shape = (batch_size, s3, c_in) kernel_size = k3 strides = strides[2] dilations = dilations[2] elif op == tf.keras.layers.Conv2D: input_shape = (batch_size, s2, s3, c_in) kernel_size = (k2, k3) strides = (strides[1], strides[2]) dilations = dilations[1:] elif op == tf.keras.layers.Conv3D: input_shape = (batch_size, s1, s2, s3, c_in) kernel_size = (k1, k2, k3) model = tf.keras.Sequential( [ op( batch_input_shape=input_shape, filters=c_out, kernel_size=kernel_size, strides=strides, padding=padding.upper(), data_format=data_format, dilation_rate=dilations, groups=groups, ) ] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(input_shape, rand_min=-10, rand_max=10)], compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "op", "padding", "data_format", "spatial_dim_and_ks", "strides", "dilations", "batch_size", ] ), itertools.product( compute_units, backends, [ tf.keras.layers.LocallyConnected1D, tf.keras.layers.LocallyConnected2D, ], ["same", "valid"], ["channels_last"], [ (2, 4, 4, 2, 2, 2), (3, 7, 5, 1, 3, 2) ], [ (1, 1, 1), (1, 2, 3), (1, 3, 2) ], [ (1, 1, 1), (2, 2, 2), ], [1, 3], ), ) def test_conv_locally_connected( self, compute_unit, backend, op, padding, data_format, spatial_dim_and_ks, strides, dilations, batch_size, ): s1, s2, s3, k1, k2, k3 = spatial_dim_and_ks c_in, c_out = 2, 3 input_shape = None kernel_size = None if op in {tf.keras.layers.Conv1D, tf.keras.layers.LocallyConnected1D}: input_shape = (batch_size, s3, c_in) kernel_size = k3 strides = strides[2] dilations = dilations[2] elif op in {tf.keras.layers.Conv2D, tf.keras.layers.LocallyConnected2D}: input_shape = (batch_size, s2, s3, c_in) kernel_size = (k2, k3) strides = (strides[1], strides[2]) dilations = dilations[1:] elif op == tf.keras.layers.Conv3D: input_shape = (batch_size, s1, s2, s3, c_in) kernel_size = (k1, k2, k3) if op in { tf.keras.layers.LocallyConnected1D, tf.keras.layers.LocallyConnected2D, }: if padding != "valid": return # tf.keras only 
supports "valid" model = tf.keras.Sequential( [ op( batch_input_shape=input_shape, filters=c_out, kernel_size=kernel_size, strides=strides, padding=padding.upper(), data_format=data_format, ) ] ) else: model = tf.keras.Sequential( [ op( batch_input_shape=input_shape, filters=c_out, kernel_size=kernel_size, strides=strides, padding=padding.upper(), data_format=data_format, dilation_rate=dilations, ) ] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(input_shape, rand_min=-10, rand_max=10)], compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "op", "padding", "data_format", "spatial_dim_and_ks", "strides", "dilations", "batch_size", ] ), itertools.product( compute_units, backends, [tf.keras.layers.DepthwiseConv2D], ["same", "valid"], ["channels_last"], [(11, 12, 3, 2), (12, 11, 2, 3)], [(1, 1), (2, 2)], [(1, 1), (2, 2)], [1, 3], ), ) def test_depth_wise_conv( self, compute_unit, backend, op, padding, data_format, spatial_dim_and_ks, strides, dilations, batch_size, ): s1, s2, k1, k2 = spatial_dim_and_ks c_in = 2 if len(strides) != np.sum(strides) and len(dilations) != np.sum(dilations): # TF produces incorrect output for non-one strides + dilations return input_shape = (batch_size, s1, s2, c_in) model = tf.keras.Sequential( [ op( batch_input_shape=input_shape, kernel_size=(k1, k2), strides=strides, padding=padding.upper(), data_format=data_format, dilation_rate=dilations, ) ] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(input_shape, rand_min=-10, rand_max=10)], compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "padding", ] ), itertools.product( compute_units, backends, ["same", "valid"], ), ) def test_conv2d_padding_dynamic_input( self, compute_unit, backend, padding, ): if backend[0] == "mlprogram" and _macos_version() < (13, 0): pytest.skip("Error in declaring network.") # Test same padding input_layer = Input(batch_size=1, shape=(None, None, 1)) layer = Conv2D( filters=16, kernel_size=(3, 3), padding=padding, activation="relu" )(input_layer) output_layer = GlobalMaxPooling2D()(layer) model = Model(inputs=[input_layer], outputs=[output_layer]) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen((1, 80, 40, 1), rand_min=-10, rand_max=10)], inputs_for_conversion=[ ct.TensorType( shape=(1, ct.RangeDim(upper_bound=80), ct.RangeDim(upper_bound=80), 1) ) ], compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "op", "padding", "data_format", "spatial_dim_and_ks", "strides", "dilations", "batch_size", ] ), itertools.product( compute_units, backends, [tf.keras.layers.SeparableConv1D, tf.keras.layers.SeparableConv2D], ["same", "valid"], ["channels_last"], [ (14, 14, 2, 2), (11, 9, 3, 2), (12, 11, 2, 3) ], [ (1, 1), (2, 2), (3, 3) ], [(1, 1)], [1, 3], ), ) def test_separable_conv( self, compute_unit, backend, op, padding, data_format, spatial_dim_and_ks, strides, dilations, batch_size, ): s1, s2, k1, k2 = spatial_dim_and_ks c_in, c_out = 2, 3 input_shape = None kernel_size = None if op == tf.keras.layers.SeparableConv1D: input_shape = (batch_size, s2, c_in) kernel_size = k2 strides = strides[1] dilations = dilations[1] elif op == tf.keras.layers.SeparableConv2D: input_shape = (batch_size, s1, s2, c_in) kernel_size = (k1, k2) model = tf.keras.Sequential( [ op( batch_input_shape=input_shape, filters=c_out, kernel_size=kernel_size, strides=strides, padding=padding.upper(), 
data_format=data_format, dilation_rate=dilations, ) ] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(input_shape, rand_min=-10, rand_max=10)], compute_unit=compute_unit, backend=backend, ) class TestConvTranspose(TensorFlowBaseTest): @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "op", "padding", "data_format", "spatial_dim_and_ks", "output_padding", "strides", "dilations", "batch_size", ] ), itertools.product( compute_units, backends, [tf.keras.layers.Conv2DTranspose, tf.keras.layers.Conv3DTranspose], ["same", "valid"], ["channels_last"], [(7, 11, 12, 1, 2, 2), (9, 5, 7, 3, 3, 3)], [(1, 1, 1)], [(2, 2, 2), (2, 3, 3)], [(1, 1, 1)], # Dilation > 1 not supported by TF [1, 3], ), ) def test_conv_transpose( self, compute_unit, backend, op, padding, data_format, spatial_dim_and_ks, output_padding, strides, dilations, batch_size, ): if ( platform.machine() == "arm64" and backend == ("mlprogram", "fp16") and op == tf.keras.layers.Conv3DTranspose and padding == "valid" and spatial_dim_and_ks == (7, 11, 12, 1, 2, 2) and strides == (2, 3, 3) and batch_size == 3 ): pytest.xfail("rdar://98015195 ([M1 native tests] Some MIL unittests are failing M1 native)") s1, s2, s3, k1, k2, k3 = spatial_dim_and_ks c_in, c_out = 2, 3 input_shape = None kernel_size = None if op == tf.keras.layers.Conv2DTranspose: input_shape = (batch_size, s2, s3, c_in) kernel_size = (k2, k3) strides = (strides[1], strides[2]) dilations = dilations[1:] output_padding = (output_padding[1], output_padding[2]) elif op == tf.keras.layers.Conv3DTranspose: input_shape = (batch_size, s1, s2, s3, c_in) kernel_size = (k1, k2, k3) model = tf.keras.Sequential( [ op( batch_input_shape=input_shape, filters=c_out, kernel_size=kernel_size, strides=strides, padding=padding.upper(), output_padding=output_padding, data_format=data_format, dilation_rate=dilations, ) ] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(input_shape, rand_min=-10, rand_max=10)], compute_unit=compute_unit, backend=backend, ) class TestCropping(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, begin_end", itertools.product( compute_units, backends, [(0, 0), (1, 1), (1, 2), (2, 1), (2, 4), (3, 2)], ), ) def test_cropping_1d(self, compute_unit, backend, begin_end): shape = (1, 10, 3) model = tf.keras.Sequential( [tf.keras.layers.Cropping1D(batch_input_shape=shape, cropping=begin_end)] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-1, rand_max=1)], compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, begin_end1, begin_end2", itertools.product( compute_units, backends, [(0, 0), (1, 1), (2, 1)], [(0, 0), (1, 2), (4, 2)], ), ) def test_cropping_2d(self, compute_unit, backend, begin_end1, begin_end2): shape = (1, 10, 10, 3) model = tf.keras.Sequential( [ tf.keras.layers.Cropping2D( batch_input_shape=shape, cropping=(begin_end1, begin_end2) ) ] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-1, rand_max=1)], compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, begin_end1, begin_end2, begin_end3", itertools.product( compute_units, backends, [(0, 0), (1, 2), (2, 1)], [(1, 1), (1, 2), (4, 2)], [(0, 0), (1, 1), (2, 4)], ), ) def test_cropping_3d( self, compute_unit, backend, begin_end1, begin_end2, begin_end3 ): shape = (1, 10, 10, 10, 3) model = tf.keras.Sequential( [ tf.keras.layers.Cropping3D( batch_input_shape=shape, cropping=(begin_end1, begin_end2, begin_end3), ) ] ) 
TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-1, rand_max=1)], compute_unit=compute_unit, backend=backend, ) class TestDense(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, units, activation, use_bias", itertools.product( compute_units, backends, [rank for rank in range(2, 6)], [2, 4, 8], [tf.nn.relu, tf.nn.softmax, tf.nn.swish], [True, False], ), ) def test(self, compute_unit, backend, rank, units, activation, use_bias): shape = np.random.randint(low=2, high=4, size=rank) model = tf.keras.Sequential( [ tf.keras.layers.Dense( batch_input_shape=shape, units=units, activation=activation, use_bias=use_bias, ) ] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-10, rand_max=10)], compute_unit=compute_unit, backend=backend, ) class TestEmbedding(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, dims, batch_size, input_length", itertools.product( compute_units, backends, [(4, 1), (8, 3), (16, 5), (32, 7), (64, 9)], [1, 3, 5], [2, 4, 10], ), ) def test(self, compute_unit, backend, dims, batch_size, input_length): # input shape: 2D tensor (batch_size, input_length) # output shape: 3D tensor (batch_size, input_length, output_dim) shape = (batch_size, input_length) model = tf.keras.Sequential( [ tf.keras.layers.Embedding( batch_input_shape=shape, input_dim=dims[0], output_dim=dims[1], input_length=input_length, ) ] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=0, rand_max=dims[0])], compute_unit=compute_unit, backend=backend, atol=1e-3, rtol=1e-4, ) class TestFlatten(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, data_format", itertools.product( compute_units, backends, [rank for rank in range(1, 6)], ["channels_last", "channels_first"], ), ) def test(self, compute_unit, backend, rank, data_format): shape = np.random.randint(low=2, high=4, size=rank) model = tf.keras.Sequential( [tf.keras.layers.Flatten(batch_input_shape=shape, data_format=data_format,)] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-10, rand_max=10)], compute_unit=compute_unit, backend=backend, ) class TestMasking(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, mask_value, is_masked", itertools.product( compute_units, backends, [2, 5], [0, 0.4], [False, True], ), ) def test(self, compute_unit, backend, rank, mask_value, is_masked): shape = np.random.randint(low=2, high=4, size=rank) model = tf.keras.Sequential( [ tf.keras.layers.Masking( batch_input_shape=shape, mask_value=mask_value, ) ] ) input_value = random_gen(shape, -10, 10) if is_masked: input_value[:, 1] = mask_value TensorFlowBaseTest.run_compare_tf_keras( model, [input_value], compute_unit=compute_unit, backend=backend, ) class TestLambda(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, function", itertools.product( compute_units, backends, [rank for rank in range(1, 6)], [ lambda x: x + x, lambda x: x * 3.14 - 1.0, lambda x: np.sqrt(4) + x, lambda x: tf.math.abs(x), ], ), ) def test_unary(self, compute_unit, backend, rank, function): shape = np.random.randint(low=2, high=4, size=rank) model = tf.keras.Sequential( [tf.keras.layers.Lambda(batch_input_shape=shape, function=function,)] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-5, rand_max=5)], compute_unit=compute_unit, backend=backend, ) class TestBatchNormalization(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, 
backend, rank, axis, momentum, epsilon, mixed_precision", itertools.product( compute_units, backends, [rank for rank in range(1, 6)], [0, -1], [0.99, 0.85], [1e-2, 1e-5], [True, False], ), ) def test_batch_normalization( self, compute_unit, backend, rank, axis, momentum, epsilon, mixed_precision ): if backend[0] != "mlprogram" and mixed_precision: pytest.skip("neuralnetwork backend doesn't support fp16 computation.") if mixed_precision: tf.keras.mixed_precision.set_global_policy('mixed_float16') shape = np.random.randint(low=2, high=4, size=rank) model = tf.keras.Sequential( [ tf.keras.layers.BatchNormalization( batch_input_shape=shape, axis=axis, momentum=momentum, epsilon=epsilon, ) ] ) random_weights = np.random.rand(4, shape[axis]) model.layers[0].set_weights(random_weights) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-10, rand_max=10)], compute_unit=compute_unit, backend=backend, ) if mixed_precision: tf.keras.mixed_precision.set_global_policy(tf.keras.backend.floatx()) @pytest.mark.parametrize( "compute_unit, backend, rank_and_axis, momentum, epsilon, mixed_precision", itertools.product( compute_units, backends, [(4, 1), (4, -3)], [0.99, 0.85], [1e-2, 1e-5], [True, False], ), ) def test_fused_batch_norm_v3( self, compute_unit, backend, rank_and_axis, momentum, epsilon, mixed_precision ): if backend[0] != "mlprogram" and mixed_precision: pytest.skip("neuralnetwork backend doesn't support fp16 computation.") if mixed_precision: tf.keras.mixed_precision.set_global_policy('mixed_float16') rank, axis = rank_and_axis shape = np.random.randint(low=2, high=4, size=rank) model = tf.keras.Sequential( [ tf.keras.layers.BatchNormalization( batch_input_shape=shape, axis=axis, momentum=momentum, epsilon=epsilon, ) ] ) random_weights = np.random.rand(4, shape[axis]) model.layers[0].set_weights(random_weights) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-10, rand_max=10)], compute_unit=compute_unit, backend=backend, ) if mixed_precision: tf.keras.mixed_precision.set_global_policy(tf.keras.backend.floatx()) class TestInstanceNormalization(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, axis, epsilon, center, scale", itertools.product( compute_units, backends, [rank for rank in range(4, 5)], [-1], [1e-3, 1e-5], [True, False], [True, False], ), ) def test_instance_normalization( self, compute_unit, backend, rank, axis, epsilon, center, scale ): tensorflow_addons = pytest.importorskip("tensorflow_addons") from tensorflow_addons.layers import InstanceNormalization shape = np.random.randint(low=2, high=4, size=rank) model = tf.keras.Sequential( [ InstanceNormalization( batch_input_shape=shape, axis=axis, epsilon=epsilon, center=center, scale=scale, ) ] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-1, rand_max=1)], compute_unit=compute_unit, backend=backend, atol=1e-2, rtol=1e-3, ) class TestNormalization(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, axis, epsilon, dynamic", itertools.product( compute_units, backends, [rank for rank in range(3, 4)], [-1,], [1e-2, 1e-10], [True, False], ), ) def test_layer_normalization(self, compute_unit, backend, rank, axis, epsilon, dynamic): shape = np.random.randint(low=2, high=4, size=rank) keras_shape = shape.tolist() inputs_for_conversion = None if dynamic: keras_shape[0] = None if backend[0] == "mlprogram": inputs_for_conversion = [ ct.TensorType(shape=[ct.RangeDim(upper_bound=4)] + keras_shape[1:]) ] model = 
tf.keras.Sequential( [ tf.keras.layers.LayerNormalization( batch_input_shape=keras_shape, axis=axis, epsilon=epsilon, trainable=False ) ] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-100, rand_max=100)], inputs_for_conversion=inputs_for_conversion, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank, groups, axis, epsilon, center, scale", itertools.product( compute_units, backends, [rank for rank in range(4, 5)], [1, 2, 3], [-1], [1e-3, 1e-5], [True, False], [True, False], ), ) def test_group_normalization( self, compute_unit, backend, rank, groups, axis, epsilon, center, scale ): tensorflow_addons = pytest.importorskip("tensorflow_addons") from tensorflow_addons.layers import GroupNormalization shape = np.random.randint(low=2, high=4, size=rank) shape[-1] = shape[-1] * groups # groups must be a multiple of channels model = tf.keras.Sequential( [ GroupNormalization( batch_input_shape=shape, groups=groups, axis=axis, epsilon=epsilon, center=center, scale=scale, ) ] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-1, rand_max=1)], compute_unit=compute_unit, backend=backend, atol=1e-3, rtol=1e-4, ) class TestPadding(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, op, data_format, padding, mixed_precision", itertools.product( compute_units, backends, [ tf.keras.layers.ZeroPadding1D, tf.keras.layers.ZeroPadding2D, tf.keras.layers.ZeroPadding3D, ], ["channels_first", "channels_last"], [(1, 1, 1), (2, 2, 2), (3, 3, 3), (1, 3, 4), (2, 3, 5)], [True, False], ), ) def test(self, compute_unit, backend, op, data_format, padding, mixed_precision): if backend[0] != "mlprogram" and mixed_precision: pytest.skip("neuralnetwork backend doesn't support fp16 computation.") if mixed_precision: tf.keras.mixed_precision.set_global_policy("mixed_float16") shape = None kwargs = {} if op == tf.keras.layers.ZeroPadding1D: padding = padding[-1] shape = np.random.randint(low=2, high=4, size=3) elif op == tf.keras.layers.ZeroPadding2D: padding = padding[1:] kwargs = {"data_format": data_format} shape = np.random.randint(low=2, high=4, size=4) elif op == tf.keras.layers.ZeroPadding3D: kwargs = {"data_format": data_format} shape = np.random.randint(low=2, high=4, size=5) model = tf.keras.Sequential( [op(batch_input_shape=shape, padding=padding, **kwargs)] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-10, rand_max=10)], compute_unit=compute_unit, backend=backend, ) if mixed_precision: tf.keras.mixed_precision.set_global_policy(tf.keras.backend.floatx()) class TestPermute(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank_and_perm", itertools.product( compute_units, backends, [ (rank, perm) for rank in range(3, 6) for perm in list(itertools.permutations(range(rank)[1:])) ], ), ) def test(self, compute_unit, backend, rank_and_perm): rank, perm = rank_and_perm shape = np.random.randint(low=2, high=4, size=rank) model = tf.keras.Sequential( [tf.keras.layers.Permute(batch_input_shape=shape, dims=perm)] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-10, rand_max=10)], compute_unit=compute_unit, backend=backend, ) class TestGlobalPooling(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, op, data_format", itertools.product( compute_units, backends, [ tf.keras.layers.GlobalAveragePooling1D, tf.keras.layers.GlobalAveragePooling2D, tf.keras.layers.GlobalAveragePooling3D, 
tf.keras.layers.GlobalMaxPool1D, tf.keras.layers.GlobalMaxPool2D, tf.keras.layers.GlobalMaxPool3D, ], ["channels_first", "channels_last"], ), ) def test_global_pooling(self, compute_unit, backend, op, data_format): shape = None if op in { tf.keras.layers.GlobalAveragePooling1D, tf.keras.layers.GlobalMaxPool1D, }: shape = np.random.randint(low=2, high=4, size=3) elif op in { tf.keras.layers.GlobalAveragePooling2D, tf.keras.layers.GlobalMaxPool2D, }: shape = np.random.randint(low=2, high=4, size=4) elif op in { tf.keras.layers.GlobalAveragePooling3D, tf.keras.layers.GlobalMaxPool3D, }: shape = np.random.randint(low=2, high=4, size=5) model = tf.keras.Sequential( [op(batch_input_shape=shape, data_format=data_format)] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-10, rand_max=10)], compute_unit=compute_unit, backend=backend, ) class TestPooling(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, op, data_format, pool_size", itertools.product( compute_units, backends, [ tf.keras.layers.AveragePooling1D, tf.keras.layers.AveragePooling2D, tf.keras.layers.AveragePooling3D, tf.keras.layers.MaxPool1D, tf.keras.layers.MaxPool2D, tf.keras.layers.MaxPool3D, ], ["channels_first", "channels_last"], [(2, 2, 1), (2, 3, 2), (1, 2, 3)], ), ) def test_pooling(self, compute_unit, backend, op, data_format, pool_size): shape = None if op in {tf.keras.layers.AveragePooling1D, tf.keras.layers.MaxPool1D}: shape = np.random.randint(low=3, high=9, size=3) pool_size = pool_size[2] elif op in {tf.keras.layers.AveragePooling2D, tf.keras.layers.MaxPool2D}: if data_format == "channels_first": return # AvgPoolingOp only supports NHWC on CPU shape = np.random.randint(low=3, high=9, size=4) pool_size = pool_size[1:] elif op in {tf.keras.layers.AveragePooling3D, tf.keras.layers.MaxPool3D}: shape = np.random.randint(low=3, high=9, size=5) model = tf.keras.Sequential( [op(batch_input_shape=shape, pool_size=pool_size, data_format=data_format)] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-10, rand_max=10)], compute_unit=compute_unit, backend=backend, ) class TestRecurrent(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, units, activation, " "recurrent_activation, use_bias, return_sequences", itertools.product( compute_units, backends, [rank for rank in range(3, 4)], [1, 3], [None, tf.nn.tanh], [None, tf.nn.relu], [True, False], [True, False], ), ) def test_lstm( self, compute_unit, backend, rank, units, activation, recurrent_activation, use_bias, return_sequences, ): shape = np.random.randint(low=1, high=4, size=rank) model = tf.keras.Sequential( [ tf.keras.layers.LSTM( batch_input_shape=shape, units=units, activation=activation, recurrent_activation=recurrent_activation, use_bias=use_bias, return_sequences=return_sequences, ), ] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-1, rand_max=1)], compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_lstmcell(self, compute_unit, backend): shape = np.random.randint(low=1, high=4, size=3) model = tf.keras.Sequential( [ tf.keras.layers.RNN( batch_input_shape=shape, cell=tf.keras.layers.LSTMCell(units=3) ) ] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-1, rand_max=1)], compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def 
test_lstm_time_distributed_dense(self, compute_unit, backend): shape = list(np.random.randint(low=1, high=4, size=3)) k_in = tf.keras.layers.Input(batch_size=shape[0], shape=shape[1:]) lstm = tf.keras.layers.LSTM(units=32, return_sequences=True)(k_in) k_out = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(1))(lstm) model = tf.keras.Model(inputs=k_in, outputs=k_out) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-1, rand_max=1)], compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_lstm_dynamic_batch(self, compute_unit, backend): input_shape = (1, 1280) inp = tf.keras.layers.Input(shape=input_shape) out, hn, cn = tf.keras.layers.LSTM(512, return_sequences=True, return_state=True, recurrent_activation='sigmoid')(inp) model = tf.keras.models.Model(inputs=[inp], outputs=[out, hn, cn]) batch_size = 2 TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen((batch_size, 1, 1280), -1, 1),], compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_lstm_conversion_static_shapes(self, compute_unit, backend): ''' Test that intermediate tensor shapes are populated correctly by the converter. That is, there are no symbolic dimensions in the shapes, when conversion is performed with a fixed input shape, irrespective of the shape used in the source model definition. ''' def _get_keras_simple_lstm_model(input_shape): input = tf.keras.Input(batch_input_shape=input_shape) output = tf.keras.layers.LSTM(5)(input) keras_model = tf.keras.Model(inputs=input, outputs=output) return keras_model def _test_for_symbolic_shapes(keras_input_shape, input_shape_for_conversion, are_symbols_expected): keras_model = _get_keras_simple_lstm_model(keras_input_shape) res = TensorFlowBaseTest.run_compare_tf_keras( keras_model, [random_gen((1, 32, 10), -1, 1)], inputs_for_conversion=[ct.TensorType(shape=input_shape_for_conversion)], compute_unit=compute_unit, backend=backend, ) coreml_model = res[1] mil_prog = coreml_model._get_mil_internal() assert is_symbolic_dim_in_prog(mil_prog) == are_symbols_expected _test_for_symbolic_shapes(keras_input_shape=(1, 32, 10), input_shape_for_conversion=(1, 32, 10), are_symbols_expected=False) _test_for_symbolic_shapes(keras_input_shape=(None, 32, 10), input_shape_for_conversion=(1, 32, 10), are_symbols_expected=False) _test_for_symbolic_shapes(keras_input_shape=(None, None, 10), input_shape_for_conversion=(1, 32, 10), are_symbols_expected=False) _test_for_symbolic_shapes(keras_input_shape=(None, 32, 10), input_shape_for_conversion=(ct.RangeDim(1, 10), 32, 10), are_symbols_expected=True) if backend[0] != "mlprogram": # FIX ME: model load fails if backend is "mlprogram". rdar://84862138 _test_for_symbolic_shapes(keras_input_shape=(None, None, 10), input_shape_for_conversion=(ct.RangeDim(1, 10), ct.RangeDim(16, 64), 10), are_symbols_expected=True) @pytest.mark.parametrize( "compute_unit, tf_raw_lstm_op, is_flexible_input, batch_size, backend", itertools.product( compute_units, [ tf.raw_ops.BlockLSTMV2, tf.raw_ops.BlockLSTM, ], [False, True], [1, 2], backends, ), ) def test_lstm_block_fused_op( self, compute_unit, tf_raw_lstm_op, is_flexible_input, batch_size, backend ): """ Define a model with custom LSTM ops that uses tf.raw_ops.BlockLSTM / tf.raw_ops.BlockLSTMV2 and verify that it converts to a fused lstm op. 
%x (shape: (Seq, Batch, idim) == (seq_len, batch, 4)) %x1 = LSTM(h=10) (%input) # shape = (seq_len, batch, 10) %x2 = LSTM(h=20) (%x1) # shape = (seq_len, batch, 20) %x3 = slice()(%x2) # shape = (1, batch, 20), to get the final seq value %x4 = reshape((1, -1)) (%x3) # shape = (1, batch * 20) %x5 = Dense(h=3)(%x4) # shape = (1, 3) """ class CustomLSTM(tf.keras.layers.Layer): def __init__(self, num_units, max_seq_length, batch_size): super(CustomLSTM, self).__init__() self.hidden_dim = num_units self.seq_length = max_seq_length self.batch_size = batch_size def build(self, input_shape): input_dim = input_shape[-1] self.w = self.add_weight( shape=(input_dim + self.hidden_dim, 4 * self.hidden_dim), initializer="random_normal", trainable=True, ) self.b = self.add_weight(shape=(4 * self.hidden_dim,), initializer="random_normal", trainable=True) self.init_h = tf.constant(np.zeros((self.batch_size, self.hidden_dim)).astype(np.float32)) self.init_c = tf.constant(np.zeros((self.batch_size, self.hidden_dim)).astype(np.float32)) def call(self, inputs): _, output_state, _, _, _, _, output = tf_raw_lstm_op( seq_len_max=self.seq_length, x=inputs, cs_prev=self.init_c, h_prev=self.init_h, w=self.w, wci=tf.constant(np.zeros((self.hidden_dim)).astype(np.float32)), wcf=tf.constant(np.zeros((self.hidden_dim)).astype(np.float32)), wco=tf.constant(np.zeros((self.hidden_dim)).astype(np.float32)), b=self.b, ) return output, output_state input_dim = 4 seq_length = 5 batch_size = batch_size x_shape = (seq_length, batch_size, input_dim) hidden_dim_1 = 10 hidden_dim_2 = 20 x = tf.keras.Input(batch_input_shape=x_shape) # (seq_len, batch, 4) x1, output_states_1 = CustomLSTM(num_units=hidden_dim_1, max_seq_length=seq_length, batch_size=batch_size)(x) # (seq_len, batch, 10), (seq_len, batch, 10) x2, output_states_2 = CustomLSTM(num_units=hidden_dim_2, max_seq_length=seq_length, batch_size=batch_size)(x1) # (seq_len, batch, 20), (seq_len, batch 10) x3 = tf.slice(x2, begin=[4, 0, 0], size=[1, batch_size, 20]) # (1, batch, 20) x4 = tf.reshape(x3, shape=(1, -1)) # (1, batch * 20) x5 = tf.keras.layers.Dense(3)(x4) # (1, 3) # Test that we can fuse the lstm op if we have an output that only extract the information from the last cell state x6 = tf.keras.layers.ReLU()(output_states_1[4, :, :]) x7 = output_states_2[4:5, :, :] x8 = output_states_1[-1, :, :] x9 = tf.keras.layers.ReLU()(output_states_2[-1:, :, :]) outputs = [x5, x8, x9] if is_flexible_input else [x5, x6, x7, x8, x9] keras_model = tf.keras.Model(inputs=x, outputs=outputs) inputs = None if is_flexible_input: inputs = [ ct.TensorType( shape=(ct.RangeDim(seq_length, 20), batch_size, input_dim) ) ] res = TensorFlowBaseTest.run_compare_tf_keras( keras_model, [random_gen(x_shape, -1, 1)], compute_unit=compute_unit, backend=backend, inputs_for_conversion=inputs, ) coreml_model = res[1] mil_prog = coreml_model._get_mil_internal() # assert that "lstm" ops are present in the mil program assert len(mil_prog.find_ops(op_type="lstm")) == 2 class TestRepeatVector(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, n", itertools.product( compute_units, backends, [2, 3, 5, 7], ), ) def test(self, compute_unit, backend, n): # input shape 2D tensor (batch size, features) # output shape 3D tensor (batch size, n, features) shape = np.random.randint(low=1, high=4, size=2) model = tf.keras.Sequential( [tf.keras.layers.RepeatVector(batch_input_shape=shape, n=n)] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-10, rand_max=10)], 
compute_unit=compute_unit, backend=backend, ) class TestReshape(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, infer_shape", itertools.product( compute_units, backends, [rank for rank in range(1, 6)], [True, False], ), ) def test(self, compute_unit, backend, rank, infer_shape): shape = np.random.randint(low=2, high=4, size=rank) # target shape does not include the batch dimension target_shape = random.sample(list(shape[1:]), len(shape[1:])) if len(target_shape) > 0 and infer_shape: target_shape[-1] = -1 model = tf.keras.Sequential( [ tf.keras.layers.Reshape( batch_input_shape=shape, target_shape=target_shape ) ] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-10, rand_max=10)], compute_unit=compute_unit, backend=backend, ) class TestSkips(TensorFlowBaseTest): # ops in this class should be ignored / pass-through during conversion @pytest.mark.parametrize( "compute_unit, backend, skip_op", itertools.product( compute_units, backends, [ tf.keras.layers.Dropout, tf.keras.layers.AlphaDropout, tf.keras.layers.GaussianDropout, tf.keras.layers.SpatialDropout1D, tf.keras.layers.SpatialDropout2D, tf.keras.layers.SpatialDropout3D, ], ), ) def test_skip_dropout(self, compute_unit, backend, skip_op): shape = np.random.randint(low=1, high=4, size=5) if skip_op == tf.keras.layers.SpatialDropout1D: shape = shape[:3] elif skip_op == tf.keras.layers.SpatialDropout2D: shape = shape[:4] model = tf.keras.Sequential([skip_op(batch_input_shape=shape, rate=0.5)]) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-10, rand_max=10)], compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends,) ) def test_skip_noise(self, compute_unit, backend): shape = np.random.randint(low=1, high=4, size=5) model = tf.keras.Sequential( [ # GaussianNoise should do nothing in inference mode tf.keras.layers.GaussianNoise(batch_input_shape=shape, stddev=0.5) ] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-10, rand_max=10)], compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank, l1, l2", itertools.product( compute_units, backends, [rank for rank in range(5, 6)], [0.0, 0.5, 1.0], [0.0, 0.5, 1.0], ), ) def test_skip_regularization(self, compute_unit, backend, rank, l1, l2): shape = np.random.randint(low=2, high=4, size=rank) model = tf.keras.Sequential( [ tf.keras.layers.ActivityRegularization( batch_input_shape=shape, l1=l1, l2=l2 ) ] ) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-10, rand_max=10)], compute_unit=compute_unit, backend=backend, ) class TestUpSampling(TensorFlowBaseTest): @pytest.mark.parametrize( "compute_unit, backend, op, upsample_factor, data_format, interpolation, dynamic", itertools.product( compute_units, backends, [ tf.keras.layers.UpSampling1D, tf.keras.layers.UpSampling2D, tf.keras.layers.UpSampling3D, ], [(2, 2, 1), (4, 3, 2), (1, 2, 3)], ["channels_first", "channels_last"], ["nearest", "bilinear"], [True, False], ), ) def test( self, compute_unit, backend, op, upsample_factor, data_format, interpolation, dynamic ): kwargs = {} shape = None keras_shape = None if op == tf.keras.layers.UpSampling1D: shape = np.random.randint(low=2, high=4, size=3) keras_shape = np.copy(shape).tolist() if dynamic: keras_shape[1] = None upsample_factor = upsample_factor[2] elif op == tf.keras.layers.UpSampling2D: kwargs = {"data_format": data_format, 
"interpolation": interpolation} shape = np.random.randint(low=2, high=4, size=4) keras_shape = np.copy(shape).tolist() if dynamic: keras_shape[1] = keras_shape[2] = None upsample_factor = (upsample_factor[1], upsample_factor[2]) elif op == tf.keras.layers.UpSampling3D: kwargs = {"data_format": data_format} shape = np.random.randint(low=2, high=4, size=5) keras_shape = np.copy(shape).tolist() if dynamic: pytest.skip( "upsampling3D with dynamic input shape is not supported, since 6D tensors are produced in that case" ) inputs_for_conversion = None if backend[0] == "mlprogram" and dynamic: inputs_for_conversion = [ ct.TensorType(shape=[dim or ct.RangeDim(upper_bound=10) for dim in keras_shape]) ] model = tf.keras.Sequential( [op(batch_input_shape=keras_shape, size=upsample_factor, **kwargs)] ) spec = TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, rand_min=-10, rand_max=10)], inputs_for_conversion=inputs_for_conversion, compute_unit=compute_unit, backend=backend, )[0] # also check if the scale factor are integers if backend[0] == 'neuralnetwork': for layer in spec.neuralNetwork.layers: if layer.WhichOneof('layer') == "upsample": assert len(layer.upsample.fractionalScalingFactor) == 0 class TestGelu(TensorFlowBaseTest): @pytest.mark.skipif( _get_version(_tf.__version__) < Version("2.4.0"), reason="Gelu is a new layer for tf 2.4.0 and above." ) @pytest.mark.parametrize( "compute_unit, backend, rank, approximate", itertools.product( compute_units, backends, [rank for rank in range(1, 6)], [True, False], ), ) def test( self, compute_unit, backend, rank, approximate ): shape = np.random.randint(low=2, high=4, size=rank) input = tf.keras.layers.Input(batch_input_shape=tuple(shape)) out = tf.keras.activations.gelu(input, approximate=approximate) model = tf.keras.Model(inputs=[input], outputs=out) TensorFlowBaseTest.run_compare_tf_keras( model, [random_gen(shape, -10, 10)], compute_unit=compute_unit, backend=backend, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow2/test/testing_utils.py0000644000000000000000000002414114672066616030347 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np import pytest tf = pytest.importorskip("tensorflow", minversion="2.1.0") from tensorflow.python.framework import dtypes import coremltools as ct import coremltools.models.utils as coremltoolsutils from coremltools.converters.mil.frontend.tensorflow.test.testing_utils import ( TensorFlowBaseTest, get_tf_node_names, ) from coremltools.converters.mil.input_types import RangeDim, TensorType from coremltools.converters.mil.testing_utils import ( compare_backend, ct_convert, validate_minimum_deployment_target, ) from coremltools.models.utils import _macos_version def make_tf2_graph(input_types): """ Decorator to help construct TensorFlow 2.x model. Parameters ---------- input_types: list of tuple or list of list List of input types. E.g. [(3, 224, 224, tf.int32)] represent 1 input, with shape (3, 224, 224), and the expected data type is tf.int32. The dtype is optional, in case it's missing, tf.float32 will be used. 
Returns ------- list of ConcreteFunction, list of str, list of str """ def wrapper(ops): input_signature = [] for input_type in input_types: if input_type is not None and len(input_type) > 0 and isinstance(input_type[-1], dtypes.DType): shape, dtype = input_type[:-1], input_type[-1] else: shape, dtype = input_type, tf.float32 input_signature.append(tf.TensorSpec(shape=shape, dtype=dtype)) @tf.function(input_signature=input_signature) def tf2_model(*args): return ops(*args) concrete_func = tf2_model.get_concrete_function() inputs = get_tf_node_names( [t.name for t in concrete_func.inputs if t.dtype != dtypes.resource], mode="input", ) outputs = get_tf_node_names( [t.name for t in concrete_func.outputs], mode="output" ) return [concrete_func], inputs, outputs return wrapper def run_compare_tf2( model, input_dict, output_names, inputs_for_conversion=None, compute_unit=ct.ComputeUnit.CPU_ONLY, frontend_only=False, frontend="tensorflow", backend=("neuralnetwork", "fp32"), debug=False, atol=1e-04, rtol=1e-05, minimum_deployment_target=None, ): """ Parameters ---------- model: list of tf.ConcreteFunction List of TensorFlow 2.x concrete functions. input_dict: dict of (str, np.array) Dict of name and value pairs representing inputs. output_names: list of str List of output node names. inputs_for_conversion: list of coremltools.TensorType() or coremltools.ImageType() objects Defaults to None. It is passed as is to the "inputs" argument of the converter. compute_unit: Enum[ct.ComputeUnit] Compute unit for the coreml model frontend_only: bool If True, skip the prediction call, only validate conversion. frontend: str Frontend to convert from. backend: str Backend to convert to. debug: bool If True, print verbose information and plot intermediate graphs. atol: float The absolute tolerance parameter. rtol: float The relative tolerance parameter. minimum_deployment_target: coremltools.target enumeration The spec version for the mlmodel """ # Infinite upper-bound not allowed in mlprogram. 
symbolic_upper_bound = 20 if backend[0] == "mlprogram" else -1 inputs = [] if inputs_for_conversion is None: cf_inputs = [t for t in model[0].inputs if t.dtype != dtypes.resource] for t in cf_inputs: name = get_tf_node_names(t.name)[0] shape = [ RangeDim(upper_bound=symbolic_upper_bound) if s is None or s == -1 else s for s in list(t.get_shape()) ] inputs.append(TensorType(name=name, shape=shape, dtype=t.dtype.as_numpy_dtype)) else: inputs = inputs_for_conversion outputs = [] for t in output_names: name = get_tf_node_names(t)[0] outputs.append(name) # get TensorFlow 2.x output as reference and run comparison tf_input_values = [tf.constant(t) for t in input_dict.values()] tf_outputs = model[0](*tf_input_values) if isinstance(tf_outputs, (tuple, list)): ref = [t.numpy() for t in tf_outputs] else: ref = [tf_outputs.numpy()] expected_outputs = {n: v for n, v in zip(outputs, ref)} mlmodel = ct_convert( model, source=frontend, inputs=inputs, outputs=outputs, convert_to=backend, debug=debug, compute_units=compute_unit, minimum_deployment_target=minimum_deployment_target, ) for k,v in input_dict.items(): if isinstance(v, np.ndarray) and issubclass(v.dtype.type, np.integer): input_dict[k] = v.astype(float) # Core ML only accepts floats if frontend_only or _macos_version() < (10, 13) \ or (mlmodel.is_package and _macos_version() < (12, 0)): return mlmodel._spec, mlmodel, input_dict, None pred = None if not coremltoolsutils._has_custom_layer(mlmodel._spec): pred = compare_backend( mlmodel, input_dict, expected_outputs, atol=atol, rtol=rtol, also_compare_shapes=True, dtype=backend[1], ) else: print('Skipping model prediction as it has a custom nn layer!') return mlmodel._spec, mlmodel, input_dict, pred def run_compare_tf_keras( model, input_values, inputs_for_conversion=None, compute_unit=ct.ComputeUnit.CPU_ONLY, frontend_only=False, frontend="tensorflow", backend=("neuralnetwork", "fp32"), atol=1e-04, rtol=1e-05, ): """ Parameters ---------- model: TensorFlow 2.x model TensorFlow 2.x model annotated with @tf.function. input_values: list of np.array List of input values in the same order as the input signature. inputs_for_conversion: list of coremltools.TensorType() or coremltools.ImageType() objects Defaults to None. It is passed as is to the "inputs" argument of the converter. compute_unit: Enum[ct.ComputeUnit] Compute unit for the coreml model frontend_only: bool If True, skip the prediction call, only validate conversion. frontend: str Frontend to convert from. backend: str Backend to convert to. atol: float The absolute tolerance parameter. rtol: float The relative tolerance parameter. 
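Example
-------
An illustrative sketch only (the layer, shape, and values are arbitrary); it mirrors how the keras layer tests in this package invoke this helper through TensorFlowBaseTest.run_compare_tf_keras:

    shape = (1, 4)
    model = tf.keras.Sequential([tf.keras.layers.ReLU(batch_input_shape=shape)])
    run_compare_tf_keras(
        model,
        [np.random.rand(*shape).astype(np.float32)],
        compute_unit=ct.ComputeUnit.CPU_ONLY,
        backend=("neuralnetwork", "fp32"),
    )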
""" mlmodel = ct_convert(model, inputs=inputs_for_conversion, source=frontend, convert_to=backend, compute_units=compute_unit) # assumes conversion preserve the i/o names proto = mlmodel._spec inputs = [i.name.split(":")[0].strip() for i in model.inputs] outputs = [str(o.name) for o in proto.description.output] # get tf.keras model output as reference and run comparison keras_outputs = model(input_values) if not isinstance(keras_outputs, list): keras_outputs = [keras_outputs] ref = [output.numpy() for output in keras_outputs] expected_outputs = {n: v for n, v in zip(outputs, ref)} input_key_values = {n: v for n, v in zip(inputs, input_values)} if frontend_only or _macos_version() < (10, 13) \ or (mlmodel.is_package and _macos_version() < (12, 0)): return proto, mlmodel, input_key_values, None pred = None if not coremltoolsutils._has_custom_layer(proto): pred = compare_backend( mlmodel, input_key_values, expected_outputs, atol=atol, rtol=rtol, also_compare_shapes=True, dtype=backend[1], ) else: print('Skipping model prediction as it has a custom nn layer!') return proto, mlmodel, input_key_values, pred class TensorFlow2BaseTest(TensorFlowBaseTest): @staticmethod def run_compare_tf2( model, input_dict, output_names, inputs_for_conversion=None, compute_unit=ct.ComputeUnit.CPU_ONLY, frontend_only=False, frontend="tensorflow", backend=("neuralnetwork", "fp32"), debug=False, atol=1e-04, rtol=1e-05, minimum_deployment_target=None, ): if minimum_deployment_target is not None: validate_minimum_deployment_target(minimum_deployment_target, backend) res = run_compare_tf2( model, input_dict, output_names, inputs_for_conversion=inputs_for_conversion, compute_unit=compute_unit, frontend_only=frontend_only, frontend=frontend, backend=backend, debug=debug, atol=atol, rtol=rtol, minimum_deployment_target=minimum_deployment_target, ) alist = list(res) alist.append(TensorFlow2BaseTest.testclassname) alist.append(TensorFlow2BaseTest.testmodelname) return tuple(alist) @staticmethod def run_compare_tf_keras( model, input_values, inputs_for_conversion=None, compute_unit=ct.ComputeUnit.CPU_ONLY, frontend_only=False, frontend="tensorflow", backend=("neuralnetwork", "fp32"), atol=1e-04, rtol=1e-05 ): res = run_compare_tf_keras(model, input_values, inputs_for_conversion=inputs_for_conversion, compute_unit=compute_unit, frontend_only=frontend_only, frontend=frontend, backend=backend, atol=atol, rtol=rtol) alist = list(res) alist.append(TensorFlow2BaseTest.testclassname) alist.append(TensorFlow2BaseTest.testmodelname) return tuple(alist) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2215466 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow2/tf_graph_pass/0000755000000000000000000000000014672075535026737 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow2/tf_graph_pass/__init__.py0000644000000000000000000000056314672066616031054 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from .rewrite_control_flow_functions import (flatten_sub_graph_namespaces, rewrite_control_flow_functions) ././@PaxHeader0000000000000000000000000000021700000000000010215 xustar00121 path=coremltools-8.0/coremltools/converters/mil/frontend/tensorflow2/tf_graph_pass/rewrite_control_flow_functions.py 22 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/tensorflow2/tf_graph_pass/rewrite_control_flow_f0000644000000000000000000004716614672066616033445 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools import _logger as logger from coremltools.converters.mil.frontend.tensorflow.basic_graph_ops import ( connect_edge, connect_edge_at_index, delete_node, disconnect_edge, replace_dest, replace_node) from coremltools.converters.mil.frontend.tensorflow.parsed_tf_node import \ ParsedTFNode def _rename_node_in_fn(node, new_name, fn): """ Rename a node and all its connections. Parameters ---------- node: ParsedTFNode Node to rename. new_name: str New name of the node. fn: SSAFunction Function that contains graph to operate on. """ old_name = node.name node.name = new_name for i in node.inputs: idx = fn.graph[i].outputs.index(old_name) fn.graph[i].outputs[idx] = new_name if old_name in fn.graph[i].control_outputs: idx = fn.graph[i].control_outputs.index(old_name) fn.graph[i].control_outputs[idx] = new_name for o in node.outputs: idx = fn.graph[o].inputs.index(old_name) fn.graph[o].inputs[idx] = new_name if old_name in fn.graph[o].control_inputs: idx = fn.graph[o].control_inputs.index(old_name) fn.graph[o].control_inputs[idx] = new_name for i in node.control_inputs: if old_name in fn.graph[i].control_outputs: idx = fn.graph[i].control_outputs.index(old_name) fn.graph[i].control_outputs[idx] = new_name for o in node.control_outputs: if old_name in fn.graph[o].control_inputs: idx = fn.graph[o].control_inputs.index(old_name) fn.graph[o].control_inputs[idx] = new_name fn.graph[new_name] = fn.graph.pop(old_name) def _flatten_sub_graph_namespaces(tf_ssa, fn_name): """ A pass to flatten namespaces for sub-graphs of the control flow while_loop op. For example, the while_loop op has two sub-graphs, "cond" and "body"; all the nodes in those sub-graphs will be prefixed with the sub-graph's name. This pass is required for converting control flow v2 ops (enabled by default in TensorFlow 2.0+) as the original sub-graphs will contain duplicated names. Parameters ---------- tf_ssa: NetworkEnsemble An object that contains multiple functions / sub-graphs. fn_name: str Name of the function / sub-graph to operate on. 
""" count = 0 fn = tf_ssa.functions.get(fn_name) for name, node in fn.graph.copy().items(): if node.op not in {"StatelessWhile", "While", "StatelessIf", "If"}: continue if node.op in {"StatelessWhile", "While"}: sub_fn_names = [node.attr.get("cond"), node.attr.get("body")] else: sub_fn_names = [node.attr.get("then_branch"), node.attr.get("else_branch")] for sf_name in sub_fn_names: sf = tf_ssa.functions.get(sf_name) prefix = "{}/{}".format(node.name, sf_name) for old_name, n in sf.graph.copy().items(): _rename_node_in_fn(n, "{}/{}".format(prefix, old_name), sf) count += 1 ios = set(sf.inputs + sf.outputs) io_name_mappings = {n: "{}/{}".format(prefix, n) for n in ios} sf.inputs = [io_name_mappings[n] for n in sf.inputs] sf.outputs = [io_name_mappings[n] for n in sf.outputs] _flatten_sub_graph_namespaces(tf_ssa, sf_name) msg = "flatten_sub_graph_namespaces: {} nodes renamed in '{}'" logger.info(msg.format(count, sf_name)) def _insert_op(fn, op, name, attr=None): """ Create a node with given attributes, then insert to the target graph in given function. Parameters ---------- fn: SSAFunction Function that contains graph to operate on. op: str Type of the operation for the new node. name: str Name of the new node. attr: dict or None (optional) Attributes of the new node. Returns ------- node: ParsedTFNode New node object. """ node = ParsedTFNode() node.op = op node.name = name if attr is not None: node.attr = attr fn.graph[node.name] = node return node def _insert_function_entry(fn): return _insert_op(fn=fn, op="function_entry", name="entry") def _insert_return(fn): return _insert_op(fn=fn, op="return", name="return") def _insert_make_tuple(fn, name=None): name = "make_tuple" if name is None else name return _insert_op(fn=fn, op="make_tuple", name=name) def _insert_get_tuple(fn, name, idx): return _insert_op(fn=fn, op="get_tuple", name=name, attr={"index": idx}) def _rewrite_cond_functions(tf_ssa, fn): r""" Rewrite tf.cond's sub-graphs with get_tuple, make_tuple, function_entry and return ops. This rewrite is required in order to convert functional form control flow v2 nodes 'StatelessIf' and 'If'. Parameters ---------- tf_ssa: NetworkEnsemble An object that contains multiple functions / sub-graphs. fn: SSAFunction Function that contains graph to operate on. 
Examples -------- Input: Before pass "main" graph: [const/greater/y] ---------\ [placeholder/args_0] -> [greater] -> [if] -> [identity] \------------------/ \--> [identity] [placeholder/args_1] ----------------/ Before pass "then" graph: [const/sub/y] ---------------\ [placeholder/sub_args_0] -> [sub] [placeholder/sub_args_1] -> [identity] Before pass "else" graph: [const/add/y] ---------------\ [placeholder/add_args_0] -> [add] [const/mul/y] ---------------\ [placeholder/add_args_1] -> [mul] Output: After pass "main" graph: [const/greater/y] ---------\ [placeholder/args_0] -> [greater] -> [make_tuple] -> [if] -> [get_tuple] -> [identity] \---------------------/ \--> [get_tuple] -> [identity] [placeholder/args_1] -------------------/ After pass "then" graph: [const/sub/y] ---------------\ [entry] -> [get_tuple] -> [placeholder/sub_args_0] -> [sub] -> [make_tuple] -> [return] -> [get_tuple] -> [placeholder/sub_args_1] -----------------/ After pass "else" graph: [const/add/y] ---------------\ [entry] -> [get_tuple] -> [placeholder/add_args_0] -> [add] -> [make_tuple] -> [return] -> [get_tuple] -> [placeholder/add_args_1] -> [mul] --------/ [const/mul/y] ---------------/ """ for cond_name, cond_node in fn.graph.copy().items(): if cond_node.op not in {"StatelessIf", "If"}: continue then_fn_name = cond_node.attr.get("then_branch") else_fn_name = cond_node.attr.get("else_branch") msg = "Rewriting '{}' ({}) sub-graphs: then '{}', else '{}'" logger.info( msg.format(cond_node.name, cond_node.op, then_fn_name, else_fn_name) ) then_fn = tf_ssa.functions.get(then_fn_name) else_fn = tf_ssa.functions.get(else_fn_name) # insert function entry nodes then_entry = _insert_function_entry(then_fn) else_entry = _insert_function_entry(else_fn) # pack node inputs to a single tuple cond_input = _insert_make_tuple(fn, "make_tuple/{}".format(cond_name)) for ci in cond_node.inputs: disconnect_edge(fn.graph, ci, cond_node.name) connect_edge(fn.graph, ci, cond_input) connect_edge(fn.graph, cond_input, cond_node.name) # unpack node outputs to multiple get_tuples for i, co in enumerate(cond_node.outputs): # utilize FunctionDef's ret to make sure function outputs and # node outputs order matches when multiple outputs are there. # Fallback to use original cond_node.outputs order if fails. 
o_original = fn.graph[co].original_node if o_original: c_input = [n for n in o_original.input if str(n).startswith(cond_name)][ 0 ] if ":" in c_input: identity_postfix = "identity_{}".format(c_input.split(":")[-1]) else: # access identity "0" identity_postfix = "identity" identity_keys = [t for t in then_fn.ret.keys() if t.endswith(identity_postfix)] if len(identity_keys) != 1: raise NotImplementedError("Branch not found.") mapped_name = then_fn.ret[identity_keys[0]].split(":")[0] if mapped_name in then_fn.outputs: idx = then_fn.outputs.index(mapped_name) else: # in else_fn.outputs idx = else_fn.outputs.index(mapped_name) else: idx = i cond_output = _insert_get_tuple( fn, "get_tuple/{}/{}".format(idx, cond_name), idx ) edge_idx = fn.graph[co].inputs.index(cond_node.name) replace_dest(fn.graph, cond_node, co, cond_output) connect_edge_at_index(fn.graph, cond_output, co, edge_idx) # fetch inputs using get_tuple for then branch for i, ti in enumerate(then_fn.inputs): then_input = _insert_get_tuple( then_fn, "get_tuple/{}/{}".format(i, ti), i + 1 ) connect_edge(then_fn.graph, then_entry, then_input) replace_node(then_fn.graph, ti, then_input) delete_node(then_fn.graph, ti) # fetch inputs using get_tuple for else branch for i, ei in enumerate(else_fn.inputs): else_input = _insert_get_tuple( else_fn, "get_tuple/{}/{}".format(i, ei), i + 1 ) connect_edge(else_fn.graph, else_entry, else_input) replace_node(else_fn.graph, ei, else_input) delete_node(else_fn.graph, ei) # returns a tuple of value(s) as output for then branch then_output = _insert_make_tuple(then_fn) for to in then_fn.outputs: if to not in then_fn.graph.keys(): # from identity, map back to get_tuple node to = "get_tuple/{}/{}".format(then_fn.inputs.index(to), to) connect_edge(then_fn.graph, to, then_output.name) then_return = _insert_return(then_fn) connect_edge(then_fn.graph, then_output.name, then_return.name) # returns a tuple of value(s) as output for else branch else_output = _insert_make_tuple(else_fn) for eo in else_fn.outputs: if eo not in else_fn.graph.keys(): # from identity, map back to get_tuple node eo = "get_tuple/{}/{}".format(else_fn.inputs.index(eo), eo) connect_edge(else_fn.graph, eo, else_output.name) else_return = _insert_return(else_fn) connect_edge(else_fn.graph, else_output.name, else_return.name) def _eliminate_loop_cond_nodes(tf_ssa, fn): """ Eliminate loop condition nodes, such as loop_counters, max_iterations from the cond sub-graph and body sub-graph of tf.while_loop. Parameters ---------- tf_ssa: NetworkEnsemble An object that contains multiple functions / sub-graphs. fn: SSAFunction Function that contains graph to operate on. 
Examples -------- Input: Before pass "main" graph: [while/maximum_iterations] -----\ [while/loop_counter] -------> [while] --> [identity] [placeholder/args_0] ----------/ Before pass "cond" graph: [const/mean] -------\ [placeholder] --> [mean] --> [greater] [const/greater/y] --------------/ [while_maximum_iterations], [while_loop_counter] (not connected) Before pass "body" graph: [const/sub/y] ------\ [placeholder] ---> [sub] [const/add/y] ------------\ [while_loop_counter] --> [add] [while_maximum_iterations] (not connected) Output: After pass "main" graph: [placeholder/args_0] --> [while] --> [identity] After pass "cond" graph: [const/mean] -------\ [placeholder] --> [mean] --> [greater] [const/greater/y] --------------/ After pass "body" graph: [const/sub/y] ------\ [placeholder] ---> [sub] """ for name, node in fn.graph.copy().items(): if node.op not in {"StatelessWhile", "While"}: continue cond_fn = tf_ssa.functions.get(node.attr.get("cond")) body_fn = tf_ssa.functions.get(node.attr.get("body")) cond_lc_nodes = {cond_fn.inputs.pop(0), cond_fn.inputs.pop(0)} logger.info("Removing {} from cond graph".format(cond_lc_nodes)) for n in cond_lc_nodes: delete_node(cond_fn.graph, n) body_lc_nodes = {body_fn.inputs.pop(0), body_fn.inputs.pop(0)} q = list(body_lc_nodes) # delete entire sub-fn while len(q) > 0: n = body_fn.graph[q.pop(0)] for o in n.outputs: if o not in body_lc_nodes: q.append(o) body_lc_nodes.add(o) for i in body_fn.graph[o].inputs: if i not in body_lc_nodes: q.append(i) body_lc_nodes.add(i) # remove if in outputs for n in body_lc_nodes: if n in body_fn.outputs: msg = "Removing '{}' ({}) from body fn outputs" logger.info(msg.format(n, body_fn.graph[n].op)) body_fn.outputs.remove(n) logger.info("Removing {} from body graph".format(body_lc_nodes)) for n in body_lc_nodes: delete_node(body_fn.graph, n) def _rewrite_while_loop_functions(tf_ssa, fn): """ Rewrite tf.while_loop's sub-graphs with get_tuple, make_tuple, function_entry and return ops. This rewrite is required in order to convert functional form control flow v2 nodes 'StatelessWhile' and 'While'. Parameters ---------- tf_ssa: NetworkEnsemble An object that contains multiple functions / sub-graphs. fn: SSAFunction Function that contains graph to operate on. 
Example ------- Input: Before pass "main" graph: [placeholder/args_0] --> [while] --> [identity] Before pass "cond" graph: [const/mean] -------\ [placeholder] --> [mean] --> [greater] [const/greater/y] --------------/ Before pass "body" graph: [const/sub/y] ------\ [placeholder] ---> [sub] Output: After pass "main" graph: [placeholder/args_0] --> [make_tuple] --> [while] --> [get_tuple] --> [identity] After pass "cond" graph: [const/mean] ------\ [entry] -> [get_tuple] -> [placeholder] -> [mean] -> [greater] -> [make_tuple] -> [return] [const/greater/y] ------------/ After pass "body" graph: [const/sub/y] ----\ [entry] -> [get_tuple] -> [placeholder] -> [sub] -> [make_tuple] -> [return] """ for while_name, while_node in fn.graph.copy().items(): if while_node.op not in {"StatelessWhile", "While"}: continue cond_fn_name = while_node.attr.get("cond") body_fn_name = while_node.attr.get("body") msg = "Rewriting '{}' ({}) sub-graphs: cond '{}', body '{}'" logger.info( msg.format(while_node.name, while_node.op, cond_fn_name, body_fn_name) ) cond_fn = tf_ssa.functions.get(cond_fn_name) body_fn = tf_ssa.functions.get(body_fn_name) # insert function entry nodes cond_entry = _insert_function_entry(cond_fn) body_entry = _insert_function_entry(body_fn) # pack node inputs to a single tuple while_input_tuple = _insert_make_tuple(fn, "make_tuple/{}".format(while_name)) for wi in while_node.inputs: disconnect_edge(fn.graph, wi, while_node.name) connect_edge(fn.graph, wi, while_input_tuple) connect_edge(fn.graph, while_input_tuple, while_node.name) # unpack node outputs to multiple get_tuples for i, wo in enumerate(while_node.outputs): # utilize FunctionDef's ret to make sure function outputs and # node outputs order matches when multiple outputs are there. o_original = fn.graph[wo].original_node while_input = [ n for n in o_original.input if str(n).startswith(while_name) ][0] while_index = while_input.split(":")[-1] if while_index != 0: identity_postfix = "identity_{}".format(while_index) else: # access identity "0" identity_postfix = "identity" identity_keys = [t for t in body_fn.ret.keys() if t.endswith(identity_postfix)] if len(identity_keys) != 1: raise NotImplementedError("Branch not found.") mapped_name = body_fn.ret[identity_keys[0]].split(":")[0] idx = body_fn.outputs.index(mapped_name) loop_output = _insert_get_tuple( fn, "get_tuple/{}/{}".format(idx, while_input), idx ) edge_idx = fn.graph[wo].inputs.index(while_node.name) replace_dest(fn.graph, while_node, wo, loop_output) connect_edge_at_index(fn.graph, loop_output, wo, edge_idx) # fetch inputs using get_tuple for cond fn for i, ci in enumerate(cond_fn.inputs): cond_input = _insert_get_tuple(cond_fn, "get_tuple/{}/{}".format(i, ci), i) connect_edge(cond_fn.graph, cond_entry, cond_input) replace_node(cond_fn.graph, ci, cond_input) delete_node(cond_fn.graph, ci) # fetch inputs using get_tuple for body fn for i, bi in enumerate(body_fn.inputs): new_name = "get_tuple/{}/{}".format(i, bi) if bi in body_fn.outputs: # input is also an output body_fn.outputs[body_fn.outputs.index(bi)] = new_name body_input = _insert_get_tuple(body_fn, new_name, i) connect_edge(body_fn.graph, body_entry, body_input) replace_node(body_fn.graph, bi, body_input) delete_node(body_fn.graph, bi) # returns a tuple of value(s) as output for cond fn cond_output = _insert_make_tuple(cond_fn) for co in cond_fn.outputs: connect_edge(cond_fn.graph, co, cond_output.name) cond_return = _insert_return(cond_fn) connect_edge(cond_fn.graph, cond_output.name, cond_return.name) # returns a 
tuple of value(s) as output for body branch body_output = _insert_make_tuple(body_fn) for bo in body_fn.outputs: connect_edge(body_fn.graph, bo, body_output.name) body_return = _insert_return(body_fn) connect_edge(body_fn.graph, body_output.name, body_return.name) def rewrite_control_flow_functions(tf_ssa): for fn_name, fn in tf_ssa.functions.items(): _rewrite_cond_functions(tf_ssa, fn) for fn_name, fn in tf_ssa.functions.items(): _eliminate_loop_cond_nodes(tf_ssa, fn) _rewrite_while_loop_functions(tf_ssa, fn) def flatten_sub_graph_namespaces(tf_ssa): _flatten_sub_graph_namespaces(tf_ssa, fn_name="main") ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2255466 coremltools-8.0/coremltools/converters/mil/frontend/torch/0000755000000000000000000000000014672075535022752 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/__init__.py0000644000000000000000000000106214672066616025062 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools._deps import _HAS_TORCH register_torch_op = None if _HAS_TORCH: from . import ops, quantization_ops from .dialect_ops import (torch_tensor_assign, torch_upsample_bilinear, torch_upsample_nearest_neighbor) from .torch_op_registry import register_torch_op, is_torch_fx_node_supported ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/converter.py0000644000000000000000000020001014672066616025324 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import math from collections import OrderedDict from enum import Enum from typing import Dict, List, Optional, Tuple, Union import attrs import numpy as np import torch as torch from torch.jit._script import RecursiveScriptModule from coremltools import _logger as logger from coremltools._deps import _HAS_TORCH_EXPORT_API from coremltools.converters.mil import mil from coremltools.converters.mil._deployment_compatibility import AvailableTarget as _target from coremltools.converters.mil.frontend import _utils as frontend_utils from coremltools.converters.mil.input_types import ImageType, InputType, StateType, TensorType from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Function, Placeholder, Program, types from coremltools.converters.mil.mil.block import is_current_opset_version_compatible_with from coremltools.converters.mil.mil.scope import ScopeInfo, ScopeSource from coremltools.converters.mil.mil.types import builtin_to_string, is_float from coremltools.converters.mil.mil.types.symbolic import any_symbolic from coremltools.converters.mil.mil.var import Var from coremltools.optimize.coreml import _utils as optimize_utils from coremltools.optimize.coreml._quantization_passes import prune_weights from .exir_utils import WRAPPED_SCALAR_INPUT_SUFFIX from .internal_graph import InternalTorchIRGraph, InternalTorchIRNode from .ops import convert_nodes from .quantization_ops import _dequantized_weight from .torch_op_registry import _TORCH_OPS_REGISTRY from .torchir_passes import ( flatten_graph_input_values, flatten_graph_output_values, generate_tensor_assignment_ops, populate_native_const_model_hierarchy, remove_getattr_nodes, transform_inplace_ops, ) from .utils import ( NUM_TO_NUMPY_DTYPE, TORCH_DTYPE_TO_MIL_DTYPE, TORCH_DTYPE_TO_NUM, TORCH_EXPORT_BASED_FRONTENDS, TorchFrontend, ) if _HAS_TORCH_EXPORT_API: from torch.export import ExportedProgram # The compression info is stored in state_dict with the prefix, e.g. "dense2._COREML_n_bits". _COMPRESSION_INFO_PREFIX = "_COREML_" # TODO: Share the enum between cto.coreml and cto.torch (rdar://124409664). class CompressionType(Enum): PRUNING = 1 PALETTIZATION = 2 QUANTIZATION = 3 @attrs.define(kw_only=True) class CompressionInfo: """ This class stores the compression info carried by the traced torch model. """ # Quantization related fields. quantization_n_bits: Optional[int] = attrs.field( default=None, validator=attrs.validators.optional([attrs.validators.instance_of(int)]), converter=attrs.converters.optional(int), ) quantization_scale: Optional[torch.Tensor] = attrs.field( default=None, validator=attrs.validators.optional([attrs.validators.instance_of(torch.Tensor)]), ) zero_point: Optional[torch.Tensor] = attrs.field( default=None, validator=attrs.validators.optional([attrs.validators.instance_of(torch.Tensor)]), ) # Palettization related fields. 
lut: Optional[torch.Tensor] = attrs.field( default=None, validator=attrs.validators.optional([attrs.validators.instance_of(torch.Tensor)]), ) palettization_scale: Optional[torch.Tensor] = attrs.field( default=None, validator=attrs.validators.optional([attrs.validators.instance_of(torch.Tensor)]), ) vector_axis: Optional[int] = attrs.field( default=None, validator=attrs.validators.optional([attrs.validators.instance_of(int)]), converter=attrs.converters.optional(int), ) # Compression type indication fields. compression_type: Optional[List[int]] = attrs.field( default=None, converter=attrs.converters.optional(lambda tensor: tensor.tolist()), ) @quantization_n_bits.validator def check_n_bits(self, attribute, n_bits): if n_bits is not None and not 1 <= n_bits <= 8: raise ValueError(f"Only support quantization_n_bits between 1 and 8, but got {n_bits}") @compression_type.validator def check_compression_type(self, attribute, compression_type): if compression_type is not None: if not all(isinstance(type_val, int) for type_val in compression_type): raise ValueError( f"Only support int compression_type, but got {type(compression_type)}" ) def _convert_to_torch_inputtype(inputs: List[TensorType]) -> List[TensorType]: input_type = [] for _input in inputs: if isinstance(_input, (list, tuple)): input_type.append(_convert_to_torch_inputtype(_input)) elif isinstance(_input, InputType): if _input.shape is None: raise ValueError( "'shape' must be provided in the 'inputs' argument for pytorch conversion" ) input_type.append(_input) elif isinstance(_input, torch.Tensor): input_type.append( TensorType(shape=_input.shape, dtype=TORCH_DTYPE_TO_MIL_DTYPE[_input.dtype]) ) else: raise ValueError("Unknown type {} for conversion to InputType.".format(type(_input))) return input_type class QuantizationContext: """ Utilities to manage information pertaining to quantization of tensors in a PyTorch graph. This is necessary only for TorchScript (not ExecuTorch) """ def __init__(self, context: "TranscriptionContext") -> None: if context.frontend != TorchFrontend.TORCHSCRIPT: raise ValueError("QuantizationContext is necessary only for TorchScript") self._context = context # Maps var name to tuple of (torch dtype, scale, zero_point) # zero_point is in a NumPy dtype corresponding to torch one (for e.g. np.uint8 for torch.quint8). self._quant_param_map = {} # In MIL Programs, if a MIL op doesn't support quantized I/O but the PyTorch ops do, # we just use floating-point tensors after dequantization. This means that information about # what dtype (int8/uint8) quantized tensors had in the PyTorch graph is not carried into # in the MIL graph. # To simplify, we only support a single dtype for activation quantizations throughout the # incoming graph. # The other option is to remember dtypes across ops, including MIL ones that don't support # quantized I/O. We will need to be careful about edge cases like conflicting dtypes, etc. self._quant_dtype = None def add_quantization_info(self, name, torch_dtype, scale, zero_point, axis=None): """ Stores the quantization parameters (torch dtype, scale, zero_point) corresponding to a named var in the graph. zero_point should be in a NumPy dtype corresponding to torch one (for e.g. np.uint8 for torch.quint8). """ self._quant_param_map[name] = (torch_dtype, scale, zero_point, axis) def get_quantization_info(self, name: str) -> None: """ Retrieves the information added via add_quantization_info, if applicable. Returns None if quantization parameters could not be found. 
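For example (illustrative values): after calling add_quantization_info("x", torch.quint8, 0.1, 128), a subsequent get_quantization_info("x") returns (torch.quint8, 0.1, 128, None), and it returns None for any name that was never registered.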
""" if name not in self._quant_param_map: return None return self._quant_param_map[name] def maybe_handle_quantized_inputs(self, node: InternalTorchIRNode) -> None: """ If a node's op doesn't support quantized inputs but gets one, this will wire it to receive a dequantized version of it. """ op_type = node.kind if op_type in {"quantize_per_tensor", "dequantize"} or "quantized::" in op_type: # Op can handle quantized inputs. Nothing to do here. return for input in node.inputs: # In EXIR, input can be a literal and thus have no name if not isinstance(input, str) or self.get_quantization_info(input) is None: # Not a quantized tensor continue # We need a dequantized version of the input to feed to the op. dequantized_var, _ = self.get_dequantized_var(input) node.replace_name(input, dequantized_var.name) def get_quantized_per_tensor(self, name, torch_dtype, scale, zero_point, quantized_name): """ Quantizes the provided named var as per quantization params. zero_point will be cast to the appropriate dtype based on torch_dtype. """ if self._quant_dtype is None: self._quant_dtype = torch_dtype elif self._quant_dtype != torch_dtype: raise NotImplementedError( "Currently we only support a single activation dtype throughout the model" ) if torch_dtype == torch.quint8: zero_point = np.uint8(zero_point) output_dtype = "uint8" elif torch_dtype == torch.qint8: zero_point = np.int8(zero_point) output_dtype = "int8" else: raise ValueError(f"Invalid torch dtype for quantization: {torch_dtype}") if np.isscalar(zero_point): # MIL allows skipping zero_point if its zero. if zero_point == 0: zero_point = None # TODO (rdar://107718371): skip 128 for uint8 by switching to int8 result = mb.quantize( input=self._context[name], zero_point=zero_point, scale=scale, output_dtype=output_dtype ) self._context.add(result, quantized_name) self._context.quant_context.add_quantization_info( quantized_name, torch_dtype, scale, zero_point ) return result def get_dequantized_var(self, name: str, dequantized_name: str = None): """ Returns dequantized var & torch dtype corresponding to the named var. """ original_var = self._context[name] if is_float(original_var.dtype): # Input doesn't need dequantization. # This might happen if in the PyTorch graph the upstream nodes supported quantized inputs, # but MIL does not. In that case, we already dequantized the vars before feeding them to # the MIL op. if dequantized_name is not None: self._context.add(original_var, dequantized_name) return original_var, self._quant_dtype quant_params = self.get_quantization_info(name) if quant_params is None: raise ValueError( f"Could not find quantization parameters for quantized var {original_var.name}" ) torch_dtype, scale, zero_point, axis = quant_params # We add a new var corresponding to each dequantized value. # This ensures the atomicity of quantized op patterns in MIL. dequantized_var = mb.dequantize( input=original_var, scale=scale, zero_point=zero_point, axis=axis ) if dequantized_name is not None: dequantized_var_name = dequantized_name else: dequantized_var_name = dequantized_var.name self._context.add(dequantized_var, dequantized_var_name) return dequantized_var, torch_dtype class TranscriptionContext: """ Maintains a map from torch operations to their MIL values while building the graph. Can be used to process subgraphs recursively by pushing new context when stepping into a subgraph and popping that context when stepping out. 
""" def __init__( self, name: Optional[str] = None, frontend: TorchFrontend = TorchFrontend.TORCHSCRIPT, ) -> None: self.name = name if name else "" self.frontend = frontend self._current_graph = [{}] self._torch_graph = None if frontend == TorchFrontend.TORCHSCRIPT: self._quant_context = QuantizationContext(self) # Dict to map a var's name into its corresponding source state var. self.name_to_source_state = dict() @property def torch_graph(self): if self._torch_graph is None: raise ValueError("InternalTorchIRGraph not set yet on context") return self._torch_graph @property def quant_context(self) -> QuantizationContext: return self._quant_context @torch_graph.setter def torch_graph(self, graph: InternalTorchIRGraph): self._torch_graph = graph def prepare_for_conversion(self, node: InternalTorchIRNode) -> None: """ Perform any preparation necessary before node-specific frontend conversion is invoked. This utility check if the input is a function state input, and convert it into a tensor type. For instance, given the following torchscript graph: %x(state, fp16), %y(tensor, fp32) -> { %1 = add(%x, %y) } The graph is translated into: %x(state, fp16), %y(tensor, fp32) -> { %read_x = read_state(%x) %read_x_cast = cast(%read_x, "fp32") %1 = add(%read_x_cast, %y) } ``%read_x_cast`` is cached in ``name_to_source_state``, to make sure one state feeds into only one ``read_state`` op. """ # Only torch script needs to prepare if self.frontend != TorchFrontend.TORCHSCRIPT: return for val in node.inputs: if val is None: continue if val not in self: continue in_node = self[val] if in_node is None or not isinstance(in_node, Var): continue if types.is_state(in_node.sym_type): self.name_to_source_state[val] = self[val] assert ( in_node.op is None ), f"A state type var must come from a placeholder. Got parent op {in_node.op.op_type} instead." read_state = mb.read_state(input=in_node) read_state_fp32 = mb.cast(x=read_state, dtype="fp32") self.add(read_state_fp32, torch_name=val, override=True) return def process_inplace_op(self, node: InternalTorchIRNode) -> None: """ This utility: 1. adds ``mb.coreml_update_state`` after each torch inplace ops. 2. adjusts the dtype across state / tensor. In torch, inplaces ops have the following properties: 1. op type has the suffix of ``_``. For instance, ``add_``, ``mul_``, etc. 2. The op does an inplace update for the first input tensor. For instance, the following syntax of a TorchScript: %3 = add_(%1, %2) denotes an inplace ``add`` operation on the ``%1`` tensor. The memory buffer of ``%1`` is updated and returned as a reference ``%3``. Here are the steps what this utility does, lets use the above simple torch script as an example, after adding the ``add_`` in the context, we currently have a MIL graph as ``%3 = add(x=%1, y=%2)``: 1. Validate the first input (``%1``) comes from a state source by checking if the tensor's name ``1`` is in ``name_to_source_state``. If not, this utility does nothing. 2. Say ``name_to_source_state["1"] = %state``. ``%state, %3`` can potentially has different dtype. For instance, the user could specify ``%state`` in fp16, while the MIL program in the front end conversion stage is still in fp32. Hence we cast ``%3`` into ``%state``'s dtype: (%state: fp16) -> { ... %3_ = add(x=%1, y=%2) %3_cast = cast(x=%3_, dtype="fp16") } 3. Insert a ``coreml_update_state`` and cast the output back to ``%3``'s original dtype: (%state: fp16) -> { ... 
%3_ = add(x=%1, y=%2) %3_cast = cast(x=%3_, dtype="fp16") %3_update = coreml_update_state(state=%state, value=%3_cast) %3 = cast(x=%3_update, dtype="fp32") } 4. Set ``name_to_source_state["3"] = %state``, so the state chain can be used in the downstream. The below Torch Script model, (%state: fp16) -> { ... %3 = add_(%1, %2) %out = sub_(%3, %4) } will result in: (%state: fp16) -> { %1_ = read_state(%state) %1 = cast(x=%1_, dtype="fp32") %3_ = add(x=%1, y=%2) %3_cast = cast(x=%3_, dtype="fp16") %3_update = coreml_update_state(state=%state, value=%3_cast) %3 = cast(x=%3_update, dtype="fp32") %out_ = sub(x=%3, y=%4) %out_cast = cast(x=%out_, dtype="fp16") %out_update = coreml_update_state(state=%state, value=%out_cast) %out = cast(x=%out_update, dtype="fp32") } Please note that, the intermediate ``cast`` ops would be removed by the ``add_fp16_cast`` + ``cast_optimization`` graph passes: (%state: fp16) -> { %1 = read_state(%state) %3_ = add(x=%1, y=%2) %3 = coreml_update_state(state=%state, value=%3_) %out_ = sub(x=%3, y=%4) %out = coreml_update_state(state=%state, value=%out_) } """ assert self.frontend == TorchFrontend.TORCHSCRIPT, "Only torch script has no in-place op" if len(node.inputs) == 0: return if node.inputs[0] not in self.name_to_source_state: return source_state = self.name_to_source_state[node.inputs[0]] self.name_to_source_state[node.name] = source_state value_node = self[node.name] cast_value = mb.cast(x=value_node, dtype=builtin_to_string(source_state.dtype)) update = mb.coreml_update_state(state=source_state, value=cast_value) cast_update = mb.cast(x=update, dtype=builtin_to_string(value_node.dtype), name=node.name) self.add(cast_update, torch_name=node.name, override=True) def add(self, ssa_var: Var, torch_name: Optional[str] = None, override=False) -> None: """ Arguments: ssa_var: Variable to add to the graph being constructed. torch_name: Optional unique string identifier of the operation. If omitted, it will use @ssa_var.name. """ if torch_name is None: torch_name = ssa_var.name if torch_name in self._current_graph[-1] and not override: logger.warning(f"Torch var {torch_name} is added again.") return self._current_graph[-1][torch_name] = ssa_var def __getitem__(self, torch_name: str) -> Var: """ Lookup a name in the context. Note that since nested blocks must be able to access anything that was defined before them, we have to search all contexts for a name, starting with the most local scope. """ for idx in reversed(range(len(self._current_graph))): current_graph = self._current_graph[idx] if torch_name in current_graph: return self._current_graph[idx][torch_name] raise ValueError(f"Torch var {torch_name} not found in context {self.name}") def __contains__(self, torch_name): """Returns whether or not the torch var exist in context.""" for idx in reversed(range(len(self._current_graph))): current_graph = self._current_graph[idx] if torch_name in current_graph: return True return False def push(self, inputs=None): """ Add another frame to the context. Optionally provide a tuple of (name list, Var list) to populate the new context frame. """ self._current_graph.append({}) if inputs is not None: if len(inputs[0]) != len(inputs[1]): raise ValueError("name list and Var list must be the same length") for name, var in zip(inputs[0], inputs[1]): self.add(var, torch_name=name) def pop(self): """ Remove and discard the top context frame. 
""" self._current_graph = self._current_graph[:-1] def __str__(self): _str = "" for current_graph in reversed(self._current_graph): __str = "" for k, v in current_graph.items(): if hasattr(v, "shape_str"): shape_str = v.shape_str() elif hasattr(v, "sym_shape"): shape_str = v.sym_shape() else: shape_str = "None" __str += f"%{k} : {shape_str}\n" _str += __str + "\n" return _str def __repr__(self): return str(self) class TorchConverter: """ Class that handles conversion of pytorch models to the MIL format. Models passed to the @TorchConverter go from: Loaded-Torch Model -> Internal Graph -> PyMIL """ def __init__( self, loaded_model: Union[RecursiveScriptModule, "ExportedProgram"], inputs: Optional[List[TensorType]] = None, outputs: Optional[List[TensorType]] = None, cut_at_symbols: Optional[List[str]] = None, opset_version: Optional[int] = None, use_default_fp16_io: bool = False, states: Optional[List[StateType]] = None, ) -> None: """ Arguments: loaded_model: It could be one of the following: - In-memory TorchScript model of type torch.jit.ScriptModule - In-memory EXIR program of type ExportedProgram inputs: Input values and optional names. See kwarg in load.py for full description. outputs: List of outputs as ct.InputType. See kwarg in load.py for full description. cut_at_symbols: A list of internal symbol name strings. Graph conversion will terminate once these symbols have been generated. For debugging use only. See kwarg in load.py. opset_version: An int represents the Core ML opset version. use_default_fp16_io (optional): bool. Defaults to False. When minimum_deployment_target set >= ct.target.iOS16 (the same as ct.target.macOS13), and the compute precision set to fp16, this flag is True. When True, fp32 i/o defaults to fp16. """ self.use_default_fp16_io = use_default_fp16_io # process inputs if inputs is None: inputs = [] self.inputs = _convert_to_torch_inputtype(inputs) for idx, inp in enumerate(self.inputs): if isinstance(inp, ImageType) and self.inputs[idx].channel_first is None: self.inputs[idx].channel_first = True # process states if states is None: states = [] self.states = states if self.use_default_fp16_io: # If the input type is not specified by the user and use_default_fp16_io # is True. Make the default input type to fp16 self._adjust_default_input_to_fp16() self.outputs = outputs self.output_names = frontend_utils.get_output_names(self.outputs) self.opset_version = _target(opset_version) if opset_version is not None else None self._prog = mil.Program() if isinstance(loaded_model, torch.jit.ScriptModule): self.context = TranscriptionContext(frontend=TorchFrontend.TORCHSCRIPT) self.graph = InternalTorchIRGraph.from_torchscript( torchscript=loaded_model, inputs=self.inputs, cut_at_symbols=cut_at_symbols ) # TODO (rdar://106161395): Register Torch IR passes and unify them into the pass pipeline. # Apply Torch IR passes passes = [ transform_inplace_ops, flatten_graph_input_values, flatten_graph_output_values, remove_getattr_nodes, generate_tensor_assignment_ops, populate_native_const_model_hierarchy, ] for p in passes: p(self.graph) elif _HAS_TORCH_EXPORT_API and isinstance(loaded_model, ExportedProgram): if loaded_model.dialect == "ATEN": frontend = TorchFrontend.TORCHEXPORT elif loaded_model.dialect == "EDGE": frontend = TorchFrontend.EXECUTORCH else: raise NotImplementedError( "Conversion for models with only ATEN or EDGE dialect is supported/tested. 
" f"Provided Dialect: {loaded_model.dialect}" ) self.context = TranscriptionContext(frontend=frontend) self.graph = InternalTorchIRGraph.from_exir(exir=loaded_model) # For iOS 18+, create states for all mutable buffers if self.opset_version is not None and self.opset_version >= _target.iOS18: self.states = [] for name, tensor in self.graph.buffers.items(): dtype = NUM_TO_NUMPY_DTYPE[TORCH_DTYPE_TO_NUM[tensor.dtype]] # For now, we check state dtype here since we construct input from EXIR program # TODO (rdar://115845792): Once we support user inputs, # we can migrate this check to inputs validation if dtype != np.float16: logger.warning( "Core ML only supports fp16 states, " f"so buffer {name} has been cast to fp16" ) dtype = np.float16 state = StateType( wrapped_type=TensorType(shape=tensor.shape, dtype=dtype), name=name ) self.states.append(state) # For iOS 17 or earlier, make sure there is no mutable buffer # (We may workaround by out of place, i.e. have initial value as input # and mutated value as output. Let us see if there is such demand) else: if self.graph.buffers: raise ValueError("iOS 18+ is required to convert mutable buffer") else: raise ValueError( "Model should be an instance of either torch.jit.ScriptModule or ExportedProgram" ) self.context.torch_graph = self.graph self.inputs = list(self.graph.inputs.values()) self._validate_states() # Store the mapping from parameter name (such as "dense1.weight") to the compression info. self.param_to_compression_info: Dict[str, CompressionInfo] = dict() if self.opset_version is not None and self.opset_version >= _target.iOS16: # Notice that even the compression info in registered buffer is kept in self.graph, # we still want to explicitly construct it here, to make it useful for both TorchScript # and ExportedProgram. state_dict = loaded_model.state_dict self.param_to_compression_info = self._construct_compression_info( state_dict() if callable(state_dict) else state_dict ) if self.context.frontend in TORCH_EXPORT_BASED_FRONTENDS: # For EXIR, all param names are lifted as input names (in the format of `argx_x`), so we need to # change names accordingly to make sure the compression info could be found later. for ( arg_name, param_name, ) in loaded_model.graph_signature.inputs_to_parameters.items(): if param_name in self.param_to_compression_info: self.param_to_compression_info[arg_name] = self.param_to_compression_info[ param_name ] del self.param_to_compression_info[param_name] def _validate_states(self) -> None: """ Validate that the user provided states is consistent with the registered buffer in the torchscript model, and add states to inputs """ if len(self.states) > 0: for state in self.states: if state.name is None or state.name not in self.graph.buffers: raise ValueError( f"StateType named {state.name} not provided or " "not found in the source torch model. 
" "Please make sure the name in " "'ct.StateType(name=..., wrapped_type=ct.TensorType(...))' " f"match the 'named_buffers()' in the source torch model: {list(self.graph.buffers.keys())}" ) state_shape = tuple(state.shape.symbolic_shape) buffer_shape = tuple(self.graph.buffers[state.name].size()) # If Core ML state has fixed shape, then we make sure it matches torch buffer shape # Note: Although dynamic-shape state does not make sense at runtime, # for flexibility in graph manipulation, pymil allows symbolic-shape state if not any_symbolic(state_shape): if state_shape != buffer_shape: raise ValueError( f"StateType shape {state_shape} must match the torch buffer shape {buffer_shape}." ) if self.opset_version is None or self.opset_version < _target.iOS18: raise ValueError( "State model is supported only >= iOS18. " "Please update the minimum_deployment_target to at least coremltools.target.iOS18" ) self.inputs.extend(self.states) def _adjust_default_input_to_fp16(self) -> None: """ An utility function that sets the default input dtype to fp16 """ def _adjust_default_input_to_fp16_helper(inputs: InputType): assert isinstance(inputs, list), "inputs must be type of list" # Adjust inputs dtype to fp16 for val in inputs: if isinstance(val, (StateType, TensorType)) and val.dtype is None: val.dtype = types.fp16 _adjust_default_input_to_fp16_helper(self.inputs) _adjust_default_input_to_fp16_helper(self.states) def _adjust_default_output_to_fp16(self, graph_outputs): """ An utility function that sets the default outputs with inferred type fp32 to fp16. - If the inferred output dtype is fp32, and the user doesn't provide dtype, it defaults to fp16. - If the inferred output dtype is not fp32, nothing would change. """ if self.outputs is None: self.outputs = [] for val in graph_outputs: dtype = types.fp16 if val.dtype == types.fp32 else val.dtype self.outputs.append(TensorType(dtype=dtype)) else: for i, val in enumerate(self.outputs): if ( isinstance(val, TensorType) and val.dtype is None and graph_outputs[i].dtype == types.fp32 ): val.dtype = types.fp16 @staticmethod def _check_ops(graph): """ Returns the set of ops in @graph that are implemented, and the set for which no conversion function is registered. @graph can be either InternalTorchIRGraph or InternalTorchIRBlock. """ implemented_ops = set() missing_ops = set() for node in graph.nodes: _add_op = _TORCH_OPS_REGISTRY.get_func(node.kind) if _add_op is None: missing_ops.add(node.kind) else: implemented_ops.add(node.kind) for block in node.blocks: _impl, _miss = TorchConverter._check_ops(block) implemented_ops.update(_impl) missing_ops.update(_miss) return implemented_ops, missing_ops @staticmethod def _create_placeholder(_input: InputType) -> Placeholder: """ Converts an InputType into a Placeholder. 1. ``StateType`` into ``mb.state_tensor_placeholder``. 2. ``TensorType`` and ``ImageType`` into ``mb.placeholder``. """ shape = _input.shape.symbolic_shape dtype = _input.dtype # int64 and fp64 are not supported, so they are mapped to int32 / fp32 accordingly if dtype == types.int64: dtype = types.int32 elif dtype == types.fp64: dtype = types.fp32 if isinstance(_input, StateType): return mb.state_tensor_placeholder(shape, dtype=dtype) return mb.placeholder(shape, dtype=dtype) @staticmethod def _construct_compression_info( state_dict: Dict[str, torch.Tensor], ) -> Dict[str, CompressionInfo]: """ Construct compression info from the traced model's state_dict. 
The state_dict of the traced model is something like { 'dense1.weight': xxx, 'dense1.bias': xxx, 'dense1._COREML_/weight/quantization_n_bits': tensor(4), 'dense1._COREML_/weight/quantization_scale': xxx, 'dense1._COREML_/weight/zero_point': xxx, 'dense1._COREML_/weight/compression_type': tensor([3]), 'dense2.weight': xxx, ... } We extract the compression info and store it as a dict { 'dense1.weight': CompressionInfo(quantization_n_bits=4, quantization_scale=xxx, zero_point=xxx, compression_type=[QUANTIZATION]), 'dense2.weight': ... } """ compression_info = dict() for torch_key_name in state_dict.keys(): if f"{_COMPRESSION_INFO_PREFIX}/metadata_version" in torch_key_name: # TODO: rdar://124707382 ([Compression] Support versioning in CompressionInfo) continue if _COMPRESSION_INFO_PREFIX in torch_key_name: module_name = None buffer_name = torch_key_name if not torch_key_name.startswith(_COMPRESSION_INFO_PREFIX): module_name, buffer_name = torch_key_name.rsplit(".", 1) _, param_name, compression_key = buffer_name.rsplit("/", 2) if module_name: param_name = f"{module_name}.{param_name}" if param_name not in compression_info: compression_info[param_name] = CompressionInfo() setattr( compression_info[param_name], compression_key, state_dict[torch_key_name], ) return compression_info def _has_compression_info(self, param_name: str) -> bool: """Check if the parameter carries compression info.""" return param_name in self.param_to_compression_info @staticmethod def _interleave_repeat_scale_zp( weight: np.ndarray, scale: np.ndarray, zero_point: Optional[np.ndarray] ) -> Tuple[np.ndarray, Optional[np.ndarray]]: """ The scale and zero-point both have shape [.., block_num, ..], which means each scale is for one block. As weight has shape [.., block_num*block_size, ..], we need to interleave repeat them, so they can be applied to all blocks at once. """ scale_repeated = scale zero_point_repeated = zero_point for axis, weight_dim_size in enumerate(weight.shape): scale_dim_size = scale.shape[axis] if weight_dim_size != scale_dim_size and scale_dim_size != 1: # Only repeat axis where dim size is not 1, because 1 will be auto-broadcast by np. block_size = weight_dim_size // scale.shape[axis] scale_repeated = np.repeat(scale_repeated, block_size, axis=axis) if zero_point_repeated is not None: zero_point_repeated = np.repeat(zero_point_repeated, block_size, axis=axis) return scale_repeated, zero_point_repeated def _construct_quantization_op( self, weight: np.ndarray, compression_info: CompressionInfo, name: str, compressed_var: Optional[Var] = None, ) -> Var: """ The weight is constructed by `weight = scale * (quantized_data - zero_point)`. We need to restore the quantized_data to construct the quantization op. If compressed_var is not None, it's the var constructed by a previous compression function, which means this is a joint compression. For example, if the compression_info.compression_type is [CompressionType.PRUNING, CompressionType.QUANTIZATION], the compressed_var is the var produced by the pruning. 
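For example (illustrative numbers): with scale = 0.5 and zero_point = 2, a weight value of 1.5 is restored to quantized_data = round(1.5 / 0.5) + 2 = 5, which is consistent with weight = scale * (quantized_data - zero_point) = 0.5 * (5 - 2) = 1.5.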
""" if compression_info.quantization_n_bits is None: raise ValueError("quantization_n_bits must be specified in quantization.") if compression_info.quantization_scale is None: raise ValueError("quantization_scale must be specified in quantization.") scale = compression_info.quantization_scale.detach().numpy() zero_point: Optional[np.ndarray] = None if compression_info.zero_point is not None: zero_point = compression_info.zero_point.detach().numpy() # For conv/conv_transpose, the weight has rank=4, so we auto-expand scale and zero-point if # it only has two elements. if len(weight.shape) == 4 and len(scale.shape) == 2: scale = np.expand_dims(np.expand_dims(scale, axis=-1), axis=-1) if zero_point is not None: zero_point = np.expand_dims(np.expand_dims(zero_point, axis=-1), axis=-1) if compressed_var is not None and compressed_var.op.op_type == "constexpr_lut_to_dense": # The quantization on lut could lead to extra two dims at the end. if len(scale.shape) == len(weight.shape) + 2 and scale.shape[-2:] == (1, 1): scale = np.squeeze(np.squeeze(scale, axis=-1), axis=-1) if zero_point is not None: zero_point = np.squeeze(np.squeeze(zero_point, axis=-1), axis=-1) if len(weight.shape) != len(scale.shape): raise ValueError( f"In {name}, the `weight` should have same rank as `scale`, but got {weight.shape} vs {scale.shape}" ) if zero_point is not None: if len(weight.shape) != len(zero_point.shape): raise ValueError( f"In {name}, the `weight` should have same rank as `zero_point`, but got {weight.shape} vs {zero_point.shape}" ) scale_repeated, zero_point_repeated = self._interleave_repeat_scale_zp( weight, scale, zero_point ) quantized_data = np.round(weight / scale_repeated) if zero_point_repeated is not None: quantized_data += zero_point_repeated # Adjust dtype based on nbits. dtype_str_prefix = "int" if quantized_data.min() >= 0 and (zero_point is None or zero_point.min() >= 0): dtype_str_prefix = "uint" dtype_str = dtype_str_prefix + str(compression_info.quantization_n_bits) builtin_dtype = types.string_to_builtin(dtype_str) np_dtype = types.nptype_from_builtin(builtin_dtype) builtin_range = types.type_mapping.builtin_to_range(builtin_dtype) quantized_data = np.clip(quantized_data, builtin_range.low, builtin_range.high).astype( np_dtype ) if zero_point is not None: zero_point = zero_point.astype(np_dtype) if compressed_var is None: return frontend_utils._construct_constexpr_dequant_op( quantized_data, zero_point, scale, name=name ) else: # Specially handles joint compression, such as using sparse op if joint with pruning. if compressed_var.op.op_type == "constexpr_sparse_to_dense": mask, nonzero_data = mb.constexpr_sparse_blockwise_shift_scale( data_mask=compressed_var.op.mask, nonzero_data=quantized_data[compressed_var.op.mask.val != 0].flatten(), scale=scale, offset=zero_point, before_op=compressed_var.op, name=compressed_var.op.name + "_quantized", ) return mb.constexpr_sparse_to_dense(nonzero_data=nonzero_data, mask=mask, name=name) elif compressed_var.op.op_type == "constexpr_lut_to_dense": if not types.is_int(compressed_var.dtype): raise ValueError( "The joint palettization+quantization only supports lut with " f"int entries, but got {types.builtin_to_string(compressed_var.dtype)}" ) return mb.constexpr_blockwise_shift_scale( data=compressed_var, scale=scale, offset=zero_point, name=name, ) else: raise ValueError( "Unsupported joint compression combination. The quantization can only be joint " f"with pruning or palettization, but got {compressed_var.op.op_type}. 
Please check the value of " "'compression_type' in your registered buffers." ) def _construct_palettization_op( self, weight: np.ndarray, compression_info: CompressionInfo, name: str, compressed_var: Optional[Var] = None, ) -> Var: """ The weight is constructed by 2**nbits unique values in each group. When `palettization_scale` is provided, it means the weight has scales before got palettized. More specifically, the diagram is: lut(fp16) \ -> constexpr_lut_to_dense -> dense(fp16) -> constexpr_blockwise_shift_scale -> dense(fp16) indices / If compressed_var is not None, it's the var constructed by a previous compression function, which means this is a joint compression. For example, if the compression_info.compression_type is [CompressionType.PRUNING, CompressionType.PALETTIZATION], the compressed_var is the var produced by the pruning. """ if compression_info.lut is None: raise ValueError("Missing lut in compression info. Please register a buffer for lut.") lut = compression_info.lut.detach().numpy() if len(lut.shape) == len(weight.shape) + 1: # The last dim to indicate vector size is by default 1 for scalar palettization. lut = np.expand_dims(lut, axis=-1) if len(lut.shape) != len(weight.shape) + 2: raise ValueError( f"In {name}, The rank of lut is invalid. It should match weight's rank where" f"lut.rank == weight.rank + 2). Got lut.rank {len(lut.shape)} and weight.rank {len(weight.shape)}" ) num_palettes = lut.shape[-2] nbits = int(math.ceil(math.log2(num_palettes))) if 2**nbits != num_palettes: # Padding lut to make it has 2**nbits dim size on -2 axis. padding_shape = list(lut.shape) padding_shape[-2] = 2**nbits - num_palettes lut = np.concatenate([lut, np.zeros(padding_shape, dtype=lut.dtype)], axis=-2) num_palettes = lut.shape[-2] if compression_info.palettization_scale is not None: # The weight has scales, which means the palettization is on the pre-scale data. scale = compression_info.palettization_scale.detach().numpy() # For conv/conv_transpose, the weight has rank=4, so we auto-expand scale and zero-point if # it only has two elements. if len(weight.shape) == 4 and len(scale.shape) == 2: scale = np.expand_dims(np.expand_dims(scale, axis=-1), axis=-1) if len(scale.shape) != len(weight.shape): raise ValueError( f"In {name}, the scale should have the same rank as weight, but got " f"{scale.shape} vs {weight.shape}." ) weight = weight / scale vector_axis = compression_info.vector_axis if lut.shape[-1] > 1: if vector_axis is None: # The cto.torch uses 0 for vector axis. logger.warning( "It's recommended to provide vector_axis for vector palettization. " "Defaulting to axis zero." ) vector_axis = 0 indices = optimize_utils.find_indices_for_lut(weight, lut, vector_axis) if CompressionType.QUANTIZATION.value in compression_info.compression_type: # In joint palettization + quantization, the `lut` in the palettization op should be # quantized, so we calculate the quantized lut on-the-fly. tmp_quant_var = self._construct_quantization_op( lut, compression_info, name + "_tmp_quant" ) lut = tmp_quant_var.op.data.val if compressed_var is None: if is_current_opset_version_compatible_with(_target.iOS18): result = mb.constexpr_lut_to_dense( indices=indices, lut=lut, vector_axis=vector_axis, name=name ) else: if np.prod(lut.shape[:-2]) > 1: raise ValueError( "More than one look-up-table (lut) per tensor is only supported in iOS18+. " "Please set the minimum_deployment_target to iOS18 or later." 
) if lut.shape[-1] > 1: raise ValueError( "Vector palettization (lut last dim > 1) is only supported in iOS18+. " "Please set the minimum_deployment_target to iOS18 or later." ) # Convert iOS18 lut params to pre-iOS18 compatible format. lut = lut.reshape([num_palettes]) result = mb.constexpr_lut_to_dense( indices=optimize_utils.pack_elements_into_bits(indices, nbits), lut=lut, shape=np.uint32(indices.shape), name=name, ) else: # Specially handles joint compression, such as using sparse op if joint with pruning. if compressed_var.op.op_type == "constexpr_sparse_to_dense": mask, nonzero_data = mb.constexpr_lut_to_sparse( indices_mask=compressed_var.op.mask, indices_nonzero_data=indices[compressed_var.op.mask.val != 0].flatten(), lut=lut, vector_axis=vector_axis, before_op=compressed_var.op, name=compressed_var.op.name + "_palettized", ) result = mb.constexpr_sparse_to_dense( nonzero_data=nonzero_data, mask=mask, name=name ) else: raise ValueError( "Unsupported joint compression combination. The palettization can only be joint " f"with pruning, but got {compressed_var.op.op_type}. Please check the value of " "'compression_type' in your registered buffers." ) if compression_info.palettization_scale is not None: if not is_current_opset_version_compatible_with(_target.iOS18): raise ValueError( "The palettization with per-channel-scale is only supported in iOS18+. Please " "set the minimum_deployment_target to iOS18 or later." ) result = mb.constexpr_blockwise_shift_scale( data=result, scale=scale, offset=None, name=name ) return result @staticmethod def _construct_sparsification_op( weight: np.ndarray, compression_info: CompressionInfo, name: str, compressed_var: Optional[Var] = None, ) -> Var: sparse_params = prune_weights.compress_by_threshold( weight, threshold=np.finfo(np.float16).eps, minimum_sparsity_percentile=0 ) if sparse_params is None: raise ValueError( f"Unable to construct sparsified op. Please check if the weight {name} " "is sparse." ) if is_current_opset_version_compatible_with(_target.iOS18): sparse_params_ios18 = optimize_utils.ios16_sparse_params_to_ios18(sparse_params) return mb.constexpr_sparse_to_dense( nonzero_data=sparse_params_ios18.nonzero_data, mask=sparse_params_ios18.mask, name=name, ) else: return mb.constexpr_sparse_to_dense( nonzero_data=sparse_params.nonzero_data, mask=sparse_params.mask, shape=np.uint32(sparse_params.shape), name=name, ) def _construct_compression_op(self, val: np.ndarray, param_name: str) -> Var: """Construct the compression op based on the compression info.""" compression_info: CompressionInfo = self.param_to_compression_info[param_name] shared_msg = ( "There are coreml compression related buffers registered in the torch " f"model (with {_COMPRESSION_INFO_PREFIX} in the buffer's name) for {param_name}" ) if not compression_info.compression_type: raise ValueError( shared_msg + ", but the 'compression_type' is not set. Please set it to indicate " "the type of compression used on the weight." ) if len(compression_info.compression_type) > 3: raise ValueError( shared_msg + ", but the 'compression_type' has too many values. Support at most 3 " "values." ) if len(compression_info.compression_type) > 1: if not is_current_opset_version_compatible_with(_target.iOS18): raise ValueError( "The joint compression (more than one values in 'compression_type') is only " "supported in iOS18+. Please set minimum_deployment_target to iOS18 or later." 
) result: Optional[Var] = None for type_val in compression_info.compression_type: if type_val == CompressionType.QUANTIZATION.value: result = self._construct_quantization_op(val, compression_info, param_name, result) elif type_val == CompressionType.PALETTIZATION.value: result = self._construct_palettization_op(val, compression_info, param_name, result) else: assert type_val == CompressionType.PRUNING.value result = self._construct_sparsification_op( val, compression_info, param_name, result ) if result is None: raise AssertionError(shared_msg + f", but unable to compress weight {param_name}") return result def _add_const(self, name: str, val: Union[torch.Tensor, torch._C.ScriptObject]) -> None: """Create a const op and add it to the graph.""" if isinstance(val, torch.Tensor) and self._has_compression_info(name): try: compression_op = self._construct_compression_op(val.detach().numpy(), name) self.context.add(compression_op) return except NotImplementedError as e: logger.warning( "Failed to create a compression op based on the compression info " f"carried by {name} in the torch model. Ignored the compression info " f"and constructed a normal const. Detailed error message:\n{e}" ) if isinstance(val, torch._C.ScriptObject): logger.info(f"Encountered constant {name} of type _torch._C.ScriptObject") return elif isinstance(val, torch.Tensor) and val.is_quantized: const = _dequantized_weight(val.cpu(), name) self.context.add(const) return elif not isinstance(val, torch.Tensor): raise ValueError(f"unsupported class for {name} in PyTorch graph: {type(val)}") val = val.detach().cpu().numpy() # TODO (rdar://107718371): support uint8 activation quantization in torchscript # Some torchscript models store indices with uint8, which are unrelated to quantization and # need to be cast to int32 since many non-quantized Core ML ops do not support int8. # We need a way to distinguish whether an uint8 is quantization (so should be kept) # or not (so should be cast to int32). if self.context.frontend == TorchFrontend.TORCHSCRIPT and val.dtype == np.uint8: val = val.astype(np.int32) const = mb.const(val=val, name=name) self.context.add(const) def check_ops(self): """ Returns the set of ops in @self.graph that are implemented, and the set for which no conversion function is registered. """ return TorchConverter._check_ops(self.graph) def convert_const(self) -> None: for name, val in self.graph.params.items(): if self.context.frontend == TorchFrontend.TORCHSCRIPT: scope_name, scope_type = self.graph.params_scope[name] with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=scope_type), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=scope_name), ): self._add_const(name, val) elif self.context.frontend in TORCH_EXPORT_BASED_FRONTENDS: # Torch.Export has constants lifted as inputs, yet we have not sorted out # how to support IO metadata, so for now just put a dummy metadata # since inputs/constants will not contribute to debugging/profiling # TODO (rdar://125572392): Support torch.export IO metadata scopes = [ScopeInfo(source=ScopeSource.EXIR_STACK_TRACE, data=[None])] if self.context.frontend == TorchFrontend.EXECUTORCH: scopes.append(ScopeInfo(source=ScopeSource.EXIR_DEBUG_HANDLE, data=[None])) with mb.scope(*scopes): self._add_const(name, val) else: raise ValueError(f"Invalid PyTorch frontend {self.context.frontend}") def convert(self) -> Program: logger.info("Converting graph.") # Set SSA function input name to user defined name if provided. 
for index, (name, spec) in enumerate(self.graph.inputs.items()): if spec.name is not None: name = spec.name self.inputs[index].name = name # This will hold the converted model. prog = self._prog # Construct placeholder for input to SSA function ssa_func_inputs = OrderedDict() for spec in self.inputs: ssa_func_inputs[spec.name] = self._create_placeholder(spec) # Initialize the SSA for conversion with Function(ssa_func_inputs, opset_version=self.opset_version) as ssa_func: # Map internal @self.graph.inputs to user specified @ssa_func_inputs # If @self.graph.inputs == @ssa_func_inputs this just adds the inputs # to the context. # Convert input placeholders user_names = list(ssa_func_inputs.keys()) internal_names = list(self.graph.inputs.keys()) internal_names.extend(user_names[len(internal_names) :]) for torch_name, ssa_name in zip(internal_names, user_names): input_var = ssa_func.inputs[ssa_name] if self.context.frontend == TorchFrontend.TORCHSCRIPT: # To create fp16 Core ML model from fp32 torch model, we # 1. Cast input to fp32 (if specified fp16 input) # 2. Convert fp32 torch model to fp32 Core ML model # 3. Graph passes `add_fp16_cast` and `cast_optimization` # then cast fp32 Core ML model to fp16 # So here we perform the "cast input to fp32" step if ( types.is_tensor(input_var.sym_type) or types.is_scalar(input_var.sym_type) ) and input_var.dtype == types.fp16: # This cast should have placeholder scope with mb.scope( ScopeInfo( source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="placeholder" ), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=torch_name), ): input_var = mb.cast(x=input_var, dtype="fp32") elif self.context.frontend in TORCH_EXPORT_BASED_FRONTENDS: # EXIR has dtypes all determined, so for now we just stick to EXIR dtypes # TODO (rdar://115845792): Handle fp16 IO dtypes # When handle user provided IO dtypes, we will also need to handle IO metadata # TODO (rdar://125572392): Support torch.export IO metadata if ( input_var.dtype == types.fp16 and not is_current_opset_version_compatible_with(_target.iOS16) ): raise ValueError( "To use fp16 input, please set minimum deployment target to iOS16+" ) # Torch.export may produce scalar input, # which then gets wrapped as rank-1 size-1 tensor for Core ML residency # during our internal graph construction. 
# Here we squeeze it back to scalar if torch_name.endswith(WRAPPED_SCALAR_INPUT_SUFFIX): torch_name = torch_name[: -len(WRAPPED_SCALAR_INPUT_SUFFIX)] scopes = [ ScopeInfo( source=ScopeSource.EXIR_STACK_TRACE, data=f"unwrap_scalar_input_{torch_name}", ) ] if self.context.frontend == TorchFrontend.EXECUTORCH: scopes.append( ScopeInfo(source=ScopeSource.EXIR_DEBUG_HANDLE, data=[None]) ) with mb.scope(*scopes): input_var = mb.squeeze(x=input_var, name=torch_name) else: raise ValueError(f"Invalid PyTorch frontend {self.context.frontend}") self.context.add(input_var, torch_name=torch_name) # EXIR lifts buffer references as inputs, so we need to create them by reading states if self.context.frontend in TORCH_EXPORT_BASED_FRONTENDS: for ( input_name, buffer_name, ) in self.context.torch_graph.input_name_to_source_buffer_name.items(): buffer_var = self.context[buffer_name] scopes = [ ScopeInfo(source=ScopeSource.EXIR_STACK_TRACE, data=f"read_{buffer_name}") ] if self.context.frontend == TorchFrontend.EXECUTORCH: scopes.append(ScopeInfo(source=ScopeSource.EXIR_DEBUG_HANDLE, data=[None])) with mb.scope(*scopes): input_var = mb.read_state(input=buffer_var) # As of iOS 18, Core ML state can only be fp16 # In torch converter, we convert everything under fp32 # (then cast everything to fp16 if specified fp16 compute precision) # so we need to (temporarily) cast read result to fp32 input_var_fp32 = mb.cast(x=input_var, dtype="fp32", name=input_name) self.context.add(input_var_fp32) self.context.name_to_source_state[input_name] = buffer_var # Convert constants self.convert_const() # Add the rest of the operations has_states = len(getattr(self, "states", [])) > 0 convert_nodes(self.context, self.graph, early_exit=not has_states) # EXIR represents stateful execution as buffer mutation at output, # i.e. buffer.copy_(...) at the end of EXIR program, # so analogously we update state at the end of pymil function if self.context.frontend in TORCH_EXPORT_BASED_FRONTENDS: for ( output_name, buffer_name, ) in self.context.torch_graph.output_name_to_target_buffer_name.items(): output_var = self.context[output_name] buffer_var = self.context[buffer_name] scopes = [ ScopeInfo(source=ScopeSource.EXIR_STACK_TRACE, data=f"write_{buffer_name}") ] if self.context.frontend == TorchFrontend.EXECUTORCH: scopes.append(ScopeInfo(source=ScopeSource.EXIR_DEBUG_HANDLE, data=[None])) with mb.scope(*scopes): cast_value = mb.cast( x=output_var, dtype=builtin_to_string(buffer_var.dtype) ) mb.coreml_update_state(state=buffer_var, value=cast_value) graph_outputs = [self.context[name] for name in self.graph.outputs] # An output can be None when it's a None constant, which happens # in Fairseq MT. for g in graph_outputs: if g is None: logger.warning(f"Dropping output {g} which is None") graph_outputs = [g for g in graph_outputs if g is not None] # Output renaming occurs if self.outputs is not None: if len(self.outputs) != len(graph_outputs): raise ValueError( f"Number of outputs provided, {len(self.outputs)}, do not match the number of outputs detected in the model, {len(graph_outputs)}." ) if self.output_names: for index, var in enumerate(graph_outputs): if self.output_names[index] is not None: output_rename = self.output_names[index] var.name = output_rename ssa_func.set_outputs(graph_outputs) prog.add_function("main", ssa_func) if self.use_default_fp16_io: # If the output type is not specified by the user and use_default_fp16_io # is True. 
Make the default output type to fp16 self._adjust_default_output_to_fp16(graph_outputs) if self.outputs is not None: prog.functions["main"].set_output_types(self.outputs) prog.functions["main"].set_input_types(tuple(self.inputs)) # Make sure the prog is not missing any scope information essential_scope_sources = [] if self.context.frontend == TorchFrontend.TORCHSCRIPT: essential_scope_sources = [ ScopeSource.TORCHSCRIPT_MODULE_NAME, ScopeSource.TORCHSCRIPT_MODULE_TYPE, ] elif self.context.frontend in TORCH_EXPORT_BASED_FRONTENDS: essential_scope_sources = [ScopeSource.EXIR_STACK_TRACE] if self.context.frontend == TorchFrontend.EXECUTORCH: essential_scope_sources.append(ScopeSource.EXIR_DEBUG_HANDLE) else: raise ValueError(f"Invalid PyTorch frontend {self.context.frontend}") prog._add_essential_scope_source(essential_scope_sources) prog.validate(check_essential_scope=True) return prog ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/dialect_ops.py0000644000000000000000000001734714672066616025626 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import Operation, get_new_symbol, types from coremltools.converters.mil.mil.input_type import DefaultInputs, InputSpec, TensorInputType from coremltools.converters.mil.mil.ops.defs._utils import get_param_val, solve_slice_by_index_shape from coremltools.converters.mil.mil.ops.registry import SSAOpRegistry from coremltools.converters.mil.mil.types.symbolic import is_compatible_symbolic_vector register_op = SSAOpRegistry.register_op # This file contains the Torch dialect of SSA. Briefly, these ops are only # understandable in the Torch frontend and not acceptable in the standard op set. # No backend would support any of the op here. These ops exist to facilitate # frontend SSA passes, but must be replaced with standard ops during SSA # passes. # All torch op must start with 'torch_' prefix. # torch_upsample_nearest_neighbor is dealing with upsample layer which has flexible input shape, # and recompute_scale_factor is set to True in the original torch layer. @register_op(namespace="torch") class torch_upsample_nearest_neighbor(Operation): """ Upsample the spatial dimensions (last two dimensions) of the input by scale factors using nearest-neighbor interpolation. It corresponds to `torch.nn.functional.interpolate` function with `mode=nearest`, `recompute_scale_factor=True`, and input with flexible shape. source: https://pytorch.org/docs/stable/_modules/torch/nn/functional.html#interpolate Parameters ---------- x: tensor<[b, C, H1, W1],T> (Required) * Must be rank ``4``. output_height: i32 * Output height for the height dimension. output_width: i32 * Output width for the width dimension. Returns ------- tensor<[b, C, H2, W2],T> * Tensor with same type as the input. 
* ``H2`` = output_height * ``W2`` = output_width Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), output_height=TensorInputType(type_domain=types.int32), output_width=TensorInputType(type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32), } def type_inference(self): if self.x.rank != 4: raise ValueError( 'input to the "torch_upsample_nearest_neighbor" op must have rank 4' ) ret_shape = list(self.x.shape) ret_shape[2] = get_new_symbol() ret_shape[3] = get_new_symbol() return types.tensor(self.x.dtype, ret_shape) # torch_upsample_bilinear is dealing with upsample layer which has flexible input shape, # and recompute_scale_factor is set to True in the original torch layer. @register_op(namespace="torch") class torch_upsample_bilinear(Operation): """ Upsample the spatial dimensions (last two dimensions) of the input by scale factors using bilinear interpolation. It corresponds to `torch.nn.functional.interpolate` function with `mode=bilinear`, `recompute_scale_factor=True`, and input with flexible shape. source: https://pytorch.org/docs/stable/_modules/torch/nn/functional.html#interpolate Parameters ---------- x: tensor<[b, C, H1, W1], T> (Required) * Must be rank ``4``. output_height: i32 * Output height for the height dimension. output_width: i32 * Output width for the width dimension. align_corners: const * The `align_corners` parameter for the original torch op. Returns ------- tensor<[b, C, H2, W2], T> * Tensor with same type as the input. * ``H2`` = output_height * ``W2`` = output_width Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), output_height=TensorInputType(type_domain=types.int32), output_width=TensorInputType(type_domain=types.int32), align_corners=TensorInputType(const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): return DefaultInputs( align_corners=True, ) def type_inference(self): if self.x.rank != 4: raise ValueError( 'input to the "torch_upsample_bilinear" op must have rank 4' ) ret_shape = list(self.x.shape) ret_shape[2] = get_new_symbol() ret_shape[3] = get_new_symbol() return types.tensor(self.x.dtype, ret_shape) # torch_tensor_assign is dealing with the tensor assignment operation @register_op(namespace="torch") class torch_tensor_assign(Operation): """ Method for tensor value assignment via indexing and slicing. Suppose we have a tensor ``x``, this method achieves: ``x[begin[0]: end[0]: stride[0], begin[1]: end[1]: stride[1], ...] = value`` Parameters ---------- x: tensor<*?, T> (Required) * Input tensor updates: tensor<\*K, T> (Required) * Value tensor to be inserted * The shape of the updates tensor must match the slicing result of the input data ``x``. begin: tensor<[rank], i32> (Required) * Starting index for the dimension of slicing. end: tensor<[rank(x)], i32> (Required) * Ending index for the dimension of slicing. stride: tensor<[rank(x)], i32> (Optional) * Default as all ``1``s. * Stride for the dimension of slicing. begin_mask: tensor<[rank(x)], bool> (Optional) * Default to all ``False``. * If ``begin_mask[i]==True``, neglect ``begin[i]``, and set ``begin[i]`` to ``0``. end_mask: tensor<[rank(x)], bool> (Optional) * Default to all ``False``. * If ``end_mask[i]==True``, neglect ``end[i]``, and set ``end[i]`` to ``x.shape[i]``. squeeze_mask: tensor<[rank(x)], bool> (Optional) * Default to all ``False``.
* If ``squeeze_mask[i]==True``, neglect ``end[i]``, and do the pure index at ``begin[i]``. Returns ------- tensor<*?, T> - Scalar or tensor. Attributes ---------- T: fp16, fp32, i32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), updates=TensorInputType(type_domain="T"), begin=TensorInputType(type_domain=types.int32), end=TensorInputType(type_domain=types.int32), stride=TensorInputType(const=True, optional=True, type_domain=types.int32), begin_mask=TensorInputType(const=True, optional=True, type_domain=types.bool), end_mask=TensorInputType(const=True, optional=True, type_domain=types.bool), squeeze_mask=TensorInputType(const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), } def default_inputs(self): return DefaultInputs( stride=None, begin_mask=None, end_mask=None, squeeze_mask=None, ) def type_inference(self): # solve shape ret_shape = solve_slice_by_index_shape( self.x.shape, self.begin.val, self.end.val, get_param_val(self.stride), get_param_val(self.begin_mask), get_param_val(self.end_mask), get_param_val(self.squeeze_mask), ) if not is_compatible_symbolic_vector(ret_shape, self.updates.shape): raise ValueError( "The updates tensor should have shape {}. Got {}".format( ret_shape, self.updates.shape ) ) return self.x.sym_type ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/exir_utils.py0000644000000000000000000003247214672066616025523 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import Dict, List, Tuple import sympy import torch from coremltools import _logger as logger from coremltools.converters.mil.input_types import RangeDim, TensorType from coremltools.converters.mil.mil import types from .utils import TORCH_DTYPE_TO_MIL_DTYPE WRAPPED_SCALAR_INPUT_SUFFIX = "_wrapped_as_tensor_for_coreml" def _map_sympy_number_to_int(sympy_number: sympy.core.numbers.Number) -> int: MAX_DIM = 2147483647 if sympy_number == sympy.oo or sympy_number > MAX_DIM: return MAX_DIM else: return int(sympy_number) def _construct_ct_range_dim_from_torch_value_ranges( value_ranges, # torch.utils._sympy.value_ranges.ValueRanges ) -> RangeDim: if value_ranges.is_bool: raise NotImplementedError("Only non-bool torch value range handled yet") lower = _map_sympy_number_to_int(value_ranges.lower) upper = _map_sympy_number_to_int(value_ranges.upper) return RangeDim(lower_bound=lower, upper_bound=upper) def _construct_symbol_name_to_ct_range_dim_dict( exported_program, # torch.export.ExportedProgram ) -> Dict[str, RangeDim]: symbol_name_to_ct_range_dim = {} for symbol, value_ranges in exported_program.range_constraints.items(): symbol_name = str(symbol) symbol_name_to_ct_range_dim[symbol_name] = _construct_ct_range_dim_from_torch_value_ranges( value_ranges ) return symbol_name_to_ct_range_dim def _construct_ct_tensor_type_from_torch( name: str, tensor: torch.Tensor, symbol_name_to_ct_range_dim: Dict[str, RangeDim], ) -> TensorType: coreml_dtype = TORCH_DTYPE_TO_MIL_DTYPE[tensor.dtype] # TODO (rdar://115845792): Once we support user inputs, we can migrate this check to inputs validation if coreml_dtype == types.int16: coreml_dtype = types.int32 logger.warning( f"Core ML does not support int16 input, so input {name} has been cast to int32" ) shape = [] 
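# Walk the (possibly symbolic) torch shape: a symbolic size (e.g. "s0") is mapped to the
# RangeDim built from the EXIR range constraints above, while a concrete size is kept as a plain int.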
for size in tensor.shape: size_str = str(size) if size_str in symbol_name_to_ct_range_dim: shape.append(symbol_name_to_ct_range_dim[size_str]) else: shape.append(int(size)) if len(shape) == 0: shape = [1] logger.warning( "Core ML does not support scalar input, " f"so {name} has been wrapped as rank-1 size-1 tensor" ) name = name + WRAPPED_SCALAR_INPUT_SUFFIX return TensorType(name=name, dtype=coreml_dtype, shape=shape) def _construct_ct_input_types_from_torch_user_inputs( exported_program, # torch.export.ExportedProgram torch_user_inputs: Dict[str, torch.Tensor], ) -> List[TensorType]: ct_input_types = [] symbol_name_to_ct_range_dim = _construct_symbol_name_to_ct_range_dim_dict(exported_program) for name, tensor in torch_user_inputs.items(): ct_input_type = _construct_ct_tensor_type_from_torch( name, tensor, symbol_name_to_ct_range_dim ) ct_input_types.append(ct_input_type) return ct_input_types def _extract_placeholders_from_exir_program( exported_program # torch.export.ExportedProgram ) -> Dict[str, torch.fx.Node]: """ Given: exported_program: torch.export.ExportedProgram Return: placeholders: dictionary mapping names to placeholder nodes """ placeholders = {} for node in exported_program.graph_module.graph.nodes: if node.op == "placeholder": placeholders[node.name] = node return placeholders def _extract_parameters_from_exir_program( exported_program, # torch.export.ExportedProgram ) -> Dict[str, torch.Tensor]: """ Given: exported_program: torch.export.ExportedProgram Return: parameters: dictionary mapping names to torch parameter tensors """ parameters = {} for name, parameter in zip( exported_program.graph_signature.parameters, exported_program.parameters() ): if not isinstance(parameter, torch.Tensor): raise NotImplementedError( f"Only torch.Tensor parameter handled yet, but got {type(parameter)}" ) parameters[name] = parameter return parameters def _extract_buffers_from_exir_program( exported_program, # torch.export.ExportedProgram ) -> Dict[str, torch.Tensor]: """ Given: exported_program: torch.export.ExportedProgram Return: buffers: dictionary mapping names to torch buffer tensors """ buffers = {} for name, buffer in zip( exported_program.graph_signature.buffers, exported_program.buffers() ): if not isinstance(buffer, torch.Tensor): raise NotImplementedError( f"Only torch.Tensor buffer handled yet, but got {type(buffer)}" ) buffers[name] = buffer return buffers def _extract_inputs_from_exir_program( exported_program, # torch.export.ExportedProgram ) -> Tuple[List[TensorType], Dict[str, torch.Tensor], Dict[str, torch.Tensor], Dict[str, str],]: """ Extract "input"s from torch.export.ExportedProgram For easy delegation to different backends, EXIR lifts constants and buffer references also as inputs, so we will extract all user inputs and constants and mutable buffers EXIR has 2 types of constants: 1. parameters (e.g. weight of torch.nn.Linear) 2. constants (e.g. torch.tensor([0]) inside a torch.nn.Module) We consider buffers that are not mutated also as constant, e.g. batch norm running mean and variance are constant during inference Given: exported_program: torch.export.ExportedProgram Return: user_inputs: List[ct.TensorType] list of coremltools input tensor specifications constants: Dict[str, torch.Tensor] dictionary mapping variable names to torch constant tensors buffers: Dict[str, torch.Tensor] dictionary mapping torch mutable buffer names to tensors input_name_to_source_buffer_name: Dict[str, str] dictionary mapping input variable names to underlying mutable buffer names, i.e. 
these input variables are "read" from mutable buffer PS: Here is an example of buffers and input_name_to_source_buffer_name. Consider a toy model ``` class Model(torch.nn.Module): def __init__(self): super().__init__() self.register_buffer("state", torch.tensor([7, 5, 6], dtype=torch.float16)) def forward(self, x): ... ``` The EXIR program is ``` class GraphModule(torch.nn.Module): def forward(self, arg0_1: "f16[3]", arg1_1: "f16[3]"): ... Graph signature: ExportGraphSignature(input_specs=[InputSpec(kind=, arg=TensorArgument(name='arg0_1'), target='state', persistent=True), ... ``` We will have ``` buffers = {"state": torch.tensor([7, 5, 6], dtype=torch.float16)} input_name_to_source_buffer_name = {"arg0_1": "state"} ``` """ # prepare placeholder nodes and parameters and buffers into convenient dicts placeholders = _extract_placeholders_from_exir_program(exported_program) parameters = _extract_parameters_from_exir_program(exported_program) buffers = _extract_buffers_from_exir_program(exported_program) # loop over input specs and populate results torch_user_inputs = {} constants = {} input_name_to_source_buffer_name = {} for input_spec in exported_program.graph_signature.input_specs: if input_spec.kind == torch.export.graph_signature.InputKind.USER_INPUT: node = placeholders[input_spec.arg.name] if node.meta is None or "val" not in node.meta: raise ValueError( "Placeholder torch.fx.Node metadata val is required in Core ML conversion" ) val = node.meta["val"] if not isinstance(val, torch.Tensor): raise NotImplementedError( "Placeholder val must be a tensor or fake tensor, " f"but got type {type(val)}, value {str(val)}" ) torch_user_inputs[node.name] = val elif input_spec.kind == torch.export.graph_signature.InputKind.PARAMETER: constants[input_spec.arg.name] = parameters[input_spec.target] elif input_spec.kind == torch.export.graph_signature.InputKind.CONSTANT_TENSOR: constants[input_spec.arg.name] = exported_program.constants[input_spec.target] elif input_spec.kind == torch.export.graph_signature.InputKind.BUFFER: if input_spec.target in exported_program.graph_signature.buffers_to_mutate.values(): input_name_to_source_buffer_name[input_spec.arg.name] = input_spec.target else: constants[input_spec.arg.name] = buffers.pop(input_spec.target) else: raise NotImplementedError( "Only 4 types of inputs handled yet: user input, parameter, constant, buffer. " f"But got {input_spec.kind}" ) ct_input_types = _construct_ct_input_types_from_torch_user_inputs( exported_program, torch_user_inputs ) return ct_input_types, constants, buffers, input_name_to_source_buffer_name def _extract_outputs_from_exir_program( exported_program, # torch.export.ExportedProgram ) -> Tuple[List[str], Dict[str, str],]: """ Extract "outputs" from torch.export.ExportedProgram For easy delegation to different backends, EXIR lifts buffer mutations also as outputs, so we will extract all user outputs and buffer mutations Given: exported_program: torch.export.ExportedProgram Return: user_outputs: List[str] list of output names output_name_to_target_buffer_name: Dict[str, str] dictionary mapping output variable names to underlying mutable buffer names, i.e. 
these output variables are "written" to mutable buffer """ user_outputs = [] output_name_to_target_buffer_name = {} for output_spec in exported_program.graph_signature.output_specs: if output_spec.kind == torch.export.graph_signature.OutputKind.USER_OUTPUT: user_outputs.append(output_spec.arg.name) elif output_spec.kind == torch.export.graph_signature.OutputKind.BUFFER_MUTATION: output_name_to_target_buffer_name[output_spec.arg.name] = output_spec.target elif output_spec.kind == torch.export.graph_signature.OutputKind.USER_INPUT_MUTATION: raise ValueError( "Core ML cannot handle user input mutation, because Core ML distinguishes " "input (immutable) and state (mutable). You have 2 options:\n" "1. If mutation is intentional and necessary to carry over, then please " "replace input with buffer by your torch model.register_buffer then re-export\n" "2. If mutation is unnecessary, then please avoid it, e.g. by " "adding `input = input.clone()` at the 1st line of your torch model.forward method" ) else: raise NotImplementedError( "Only 2 types of outputs handled yet: user output, buffer mutation. " f"But got {output_spec.kind}" ) return user_outputs, output_name_to_target_buffer_name def extract_io_from_exir_program( exported_program, # torch.export.ExportedProgram ) -> Tuple[ List[TensorType], List[str], Dict[str, torch.Tensor], Dict[str, torch.Tensor], Dict[str, str], Dict[str, str], ]: """ Extract "input"s and "output"s from torch.export.ExportedProgram For easy delegation to different backends, EXIR lifts constants and buffer references also as inputs, buffer mutations also as outputs, so we will extract all user inputs / outputs, constants, buffers and their mutations Given: exported_program: torch.export.ExportedProgram Return: user_inputs: List[ct.TensorType] list of coremltools input tensor specifications user_outputs: List[str] list of output names constants: Dict[str, torch.Tensor] dictionary mapping variable names to torch constant tensors buffers: Dict[str, torch.Tensor] dictionary mapping torch mutable buffer names to tensors input_name_to_source_buffer_name: Dict[str, str] dictionary mapping input variable names to underlying mutable buffer names, i.e. these input variables are "read" from mutable buffer output_name_to_target_buffer_name: Dict[str, str] dictionary mapping output variable names to underlying mutable buffer names, i.e. these output variables are "written" to mutable buffer """ ( user_inputs, constants, buffers, input_name_to_source_buffer_name, ) = _extract_inputs_from_exir_program(exported_program) user_outputs, output_name_to_target_buffer_name = _extract_outputs_from_exir_program( exported_program ) return ( user_inputs, user_outputs, constants, buffers, input_name_to_source_buffer_name, output_name_to_target_buffer_name, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/internal_graph.py0000644000000000000000000005760714672066616026340 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from collections import OrderedDict from typing import Any, Dict, List, Optional, Tuple, Union import numpy as np import torch from coremltools import _logger as logger from coremltools.converters.mil.input_types import TensorType from .exir_utils import extract_io_from_exir_program from .torch_op_registry import _TORCH_OPS_REGISTRY from .torchscript_utils import _expand_and_optimize_ir from .utils import TORCH_DTYPE_TO_NUM, sanitize_op_kind def _make_ssa_name(name: Optional[Union[str, int]]) -> str: """ Converts a symbol name (string) into an SSA name, by prepending '%'. If the name is a parameter value (int), directly printing it without prepending '%'. Only used for pretty printing the graph. """ if name is None: return "None" if type(name) is int: return str(name) return "%" + name def _ssa_name_list(names: List[Optional[Union[str, int]]]) -> List[str]: """ Take a list of symbol names (strings) and return them as SSA names. Only used for pretty printing the graph. """ return [_make_ssa_name(x) for x in names] def _find_new_name(old_name: str, node_names: List[str]) -> str: """ Disambiguate a node's name from a list of existing node names by adding successively larger integers. """ count = 0 new_name = old_name + "." + str(count) if count != 0 else old_name while new_name in node_names: count += 1 new_name = old_name + "." + str(count) return new_name def _replace_in_list(ls: List[Any], old_val: Any, new_val: Any) -> None: """Helper function to replace a value in a list.""" try: idx = ls.index(old_val) except ValueError: pass else: ls[idx] = new_val class InternalTorchIRBlock: """ coremltools internal representation of a torch IR block. """ def __init__( self, parent: Optional["InternalTorchIRNode"] = None, nodes: Optional[List["InternalTorchIRNode"]] = None, inputs: Optional[List[str]] = None, outputs: Optional[List[str]] = None, ): """ Arguments: parent: The InternalTorchIRNode this block belongs to. nodes: list of InternalTorchIRNode in the block inputs: list of input symbols. outputs: list of output symbols. 
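Example (hypothetical, for illustration only): an empty block with one input and one output could be constructed manually as ``InternalTorchIRBlock(parent=None, nodes=[], inputs=["x"], outputs=["y"])``, which is handy for building test cases without going through torch.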
""" self.nodes = nodes self.inputs = inputs self.outputs = outputs self.parent = parent @classmethod def from_exir_block(cls, block, parent): raise NotImplementedError( "EXIR: Support for Ops containing blocks not implemented yet" ) # TODO: rdar://115846569 ([Executorch] Handle control flow ops from EXIR) @classmethod def from_torchscript_block(cls, block, parent): node_names = set() nodes = [] inputs = [] outputs = [] # Add inputs for inp in block.inputs(): inputs.append(inp.debugName()) # Add outputs for outp in block.outputs(): outputs.append(outp.debugName()) internal_block = cls(parent=parent, inputs=inputs, outputs=outputs, nodes=nodes) # Add nodes for raw_node in block.nodes(): new_node = InternalTorchIRNode.from_torchscript_node( node=raw_node, parent=internal_block ) if new_node.name == new_node.kind: new_node.name = _find_new_name(new_node.name, node_names) internal_block.nodes.append(new_node) node_names.add(new_node.name) return internal_block def __str__(self, indent=2): indent_str = " " * indent graph_str = "{}block({}):\n".format( indent_str, ", ".join(_ssa_name_list(self.inputs)) ) graph_str += "{}\n".format(indent_str).join( [x.__str__(indent=indent + 2) for x in self.nodes] ) graph_str += "\n{}return ({})".format( indent_str, ", ".join(_ssa_name_list(self.outputs)) ) return graph_str def __repr__(self): return str(self) def replace_name(self, old_name, new_name): """Replaces all instances of @old_name with @new_name in @self.""" # Replace graph inputs/outputs _replace_in_list(self.inputs, old_name, new_name) _replace_in_list(self.outputs, old_name, new_name) for node in self.nodes: node.replace_name(old_name, new_name) class InternalTorchIRNode: """ coremltools internal representation of a torch IR node. Can construct itself from a provided torchIR node or manually constructed with args for testing. See InternalTorchIRGraph for the motivation behind this structure. """ def __init__( self, kind: str, inputs: List[str], outputs: List[str], kwinputs: Optional[Dict[str, str]] = None, name: Optional[str] = None, parent: Optional[Union["InternalTorchIRGraph", "InternalTorchIRBlock"]] = None, attr: Optional[Dict[str, Any]] = None, blocks: Optional[List["InternalTorchIRBlock"]] = None, model_hierarchy: Optional[str] = None, meta: Optional[Dict] = None, ): """ Arguments: name: Name of the node. kind: the kind (op) of the node. inputs: list of input symbols. outputs: list of output symbols. kwinputs: dict of keyword input symbols. parent: The InternalTorchIRGraph/Block this node belongs to. attr: dict of named attributes. blocks: list of InternalTorchIRBlock. model_hierarchy: str represents TorchScript node's model hierarchy. 
meta: A dictionary of torch fx node metadata inherited from torch.fx.Node.meta """ if not name and not outputs: self.name = "" else: self.name = name if name else outputs[0] self.kind = kind self.inputs = inputs self.outputs = outputs self.kwinputs = kwinputs self.parent = parent self.attr = attr if attr is not None else {"value": None} self.blocks = blocks if blocks is not None else [] self.model_hierarchy = model_hierarchy self.meta = meta @classmethod def from_torchscript_node(cls, node, parent): inputs = [_input.debugName() for _input in node.inputs()] outputs = [output.debugName() for output in node.outputs()] kind = sanitize_op_kind(node.kind()) attr = {name: getattr(node, node.kindOf(name))(name) for name in node.attributeNames()} if "value" not in attr: attr["value"] = None # If the output is boolean, explicitly cast it so type inference # will work correctly. if len(outputs) == 1 and next(node.outputs()).type().str() == "bool": attr["value"] = bool(attr["value"]) # On rare occasions, a node has no outputs. In that case, the node's # name will be its kind. However, this no longer guarantees the node's # name is unique. It will be up to the graph constructing the node to # make sure names are unique. name = outputs[0] if len(outputs) > 0 else kind internal_node = cls( name=name, kind=kind, parent=parent, inputs=inputs, outputs=outputs, attr=attr, blocks=None, model_hierarchy=node.getModuleHierarchy(), ) internal_node.blocks = [ InternalTorchIRBlock.from_torchscript_block(block=b, parent=internal_node) for b in node.blocks() ] return internal_node @classmethod def from_exir_node(cls, node): def get_arguments(alist: List) -> Tuple: args = [] for i in alist: if isinstance(i, torch.fx.Node): args.append(i.name) elif isinstance(i, torch.fx.immutable_collections.immutable_list): args.append(get_arguments(i)) elif isinstance(i, (int, float, str)): args.append(i) # This is necessitated by backward compatibility: # * TorchScript used to store dtype as integers/enums # * Subsequently, we built our PyTorch converter based on numbered dtypes # * Now EXIR uses dtype directly... # * Until refactoring EXIR converter to be independent from TorchScript converter, # we have to map dtype to number ourselves # to leverage the existing TorchScript converter infra elif isinstance(i, torch.dtype): args.append(TORCH_DTYPE_TO_NUM[i]) elif ( isinstance(i, torch.device) or isinstance(i, torch.layout) or isinstance(i, torch.memory_format) ): # PyMIL graph does not care about these things pass elif i is None: args.append(None) else: raise AssertionError( f"Unhandled node type {type(i)}.
Node content is: {str(i)}" ) return tuple(args) try: kind = node.target.name() except: if callable(node.target): kind = node.target.__name__ else: kind = str(node.target) kind = sanitize_op_kind(kind) if not kind in _TORCH_OPS_REGISTRY: raise ValueError(f"Unsupported fx node {str(node)}, kind {kind}") # TODO (rdar://134015126) handle kwargs inputs = get_arguments(node.args) # TODO: rdar://115846125 ([Executorch] Handle Models/Layers with Multiple outputs) outputs = [node.name] kwinputs = {} for keyword, arg in node.kwargs.items(): if arg is not None: kwinputs[keyword] = get_arguments([arg]) if len(kwinputs) == 0: kwinputs = None name = node.name return cls( name=name, kind=kind, inputs=inputs, outputs=outputs, kwinputs=kwinputs, parent=None, attr=None, blocks=None, meta=node.meta, ) def __str__(self, indent=2): node_str = " " * indent + "{} = {}".format( ", ".join(_ssa_name_list(self.outputs)), self.kind ) node_str += "[{}]".format( ", ".join( ["{}={}".format(n, v) for n, v in self.attr.items() if v is not None] ) ) node_str += "({})".format(", ".join(_ssa_name_list(self.inputs))) for b in self.blocks: node_str += "\n" + b.__str__(indent=indent + 2) return node_str def __repr__(self): return str(self) def replace_name(self, old_name, new_name): """Replaces all instances of @old_name with @new_name in @self.""" _replace_in_list(self.inputs, old_name, new_name) _replace_in_list(self.outputs, old_name, new_name) if self.name == old_name: self.name = new_name for block in self.blocks: block.replace_name(old_name, new_name) def get_scope_info(self) -> Tuple[List[str], List[str]]: """ Get the scope information (``scope_name``, ``scope_type``) of a TorchScript node. In a TorchScript node, a model hierarchy is represented in a string of format: ``scope_name_1(scope_type_1).scope_name_2(scope_type_1).<...>.scope_name_n(scope_type_n)`` For instance, given a torch model: class SubModule(torch.nn.Module): def __init__(self): super().__init__() self.linear_1 = torch.nn.Linear(2, 3) def forward(self, x): x_1 = self.linear(x) x_2 = torch.relu(x_1) return x_2 class Model(torch.nn.Module): def __init__(self): super().__init__() self.submodule_1 = SubModule() def forward(self, x): return self.submodule_1(x) The model hierarchy of ``x_1`` is ``submodule_1(SubModule).linear_1(Linear)``, and ``x_2`` has ``submodule_1(SubModule)``. We consider the ``node.name`` as the most inner ``scope_name``, and ``node.kind`` (aten op type) as the most inner ``scope_type``. ``x_1`` results in: { "scope_name": ["submodule_1", "linear_1", "x_1"], "scope_type": ["SubModule", "Linear", "linear"], }, and ``x_2`` gets: { "scope_name": ["submodule_1", "x_2"], "scope_type": ["SubModule", "relu"], }. Note that, for the model weight const ops, the names are in the following format: "submodule_1.linear_1.weight", which would result in a long ``scope_name``: ``["submodule_1", "linear_1", "submodule_1.linear_1.weight"]``. 
This function does a special handling to trim it to: ``["submodule_1", "linear_1", "weight"]`` """ def _trim_scopename_for_weight(scope_names: List[str]) -> List[str]: weight_name = scope_names[-1] if scope_names[:-1] != weight_name.split(".")[:-1]: return scope_names scope_names[-1] = weight_name.split(".")[-1] return scope_names if self.model_hierarchy == "" or self.model_hierarchy is None: scopes = [] else: scopes = self.model_hierarchy.split(".") scope_names, scope_types = [], [] for val in scopes: if val == "": scope_names.append("UNKNOWN_SCOPE_NAME") scope_types.append("UNKNOWN_SCOPE_TYPE") continue if val.count("(") != 1 or val.count(")") != 1: raise ValueError(f"{val} is not a valid model hierarchy string.") lower_idx, upper_idx = val.index("("), val.index(")") scope_names.append(val[:lower_idx]) scope_types.append(val[lower_idx + 1 : upper_idx]) scope_names.append(self.name) scope_types.append(self.kind) if self.kind == "getattr": scope_names = _trim_scopename_for_weight(scope_names) return scope_names, scope_types class InternalTorchIRGraph: """ Core ML internal representation of a torch IR graph. A torch._C.Graph object is not an ideal structure to use in converting to CoreML. Conversion to an InternalTorchIRGraph is inserted between the original graph and the final Core ML model to address several issues: 1. A torch._C.graph is hard to work with. For example, its .inputs() and .outputs() functions return iterators, so the only way to determine the number of inputs/outputs is by counting to the end. There are other examples of why the torch structure is hard to work with, and this structure alleviates those issues. 2. torch._C.graph is an internal API and so we can't count on its stability. By inserting a layer in between, we can handle any changes to torch._C.graph here and isolate the ops code that processes the graph. 3. torch._C.graph does not expose a Python constructor. This makes it impossible to write unit tests that isolate specific ops since they have to come from actually converting a PyTorch graph. With an internal structure, we can directly build the test cases we need for unit testing. """ def __init__( self, params: Dict[str, np.ndarray], inputs: Dict[str, TensorType], outputs: List[str], nodes: Optional[List["InternalTorchIRNode"]] = None, buffers: Optional[Dict[str, torch.Tensor]] = None, input_name_to_source_buffer_name: Optional[Dict[str, str]] = None, output_name_to_target_buffer_name: Optional[Dict[str, str]] = None, ): """ Arguments: params: dict mapping parameter names to their numpy value. inputs: OrderedDict mapping input names to their input types. outputs: list[str], list of outputs from the graph. nodes: list of InternalTorchIRNode in the graph. buffers: Dict mapping torch model buffers to their names. input_name_to_source_buffer_name: Dict[str, str] (EXIR only) dictionary mapping input variable names to underlying mutable buffer names, i.e. these input variables are "read" from mutable buffer output_name_to_target_buffer_name: Dict[str, str] (EXIR only) dictionary mapping output variable names to underlying mutable buffer names, i.e. 
these output variables are "written" to mutable buffer """ self.nodes = nodes self.params = params self.inputs = inputs self.outputs = outputs self.buffers = buffers self.input_name_to_source_buffer_name = input_name_to_source_buffer_name self.output_name_to_target_buffer_name = output_name_to_target_buffer_name self.params_scope = {} @classmethod def from_torchscript(cls, torchscript, inputs=None, cut_at_symbols=None): """ Arguments: torchscript: TorchScript object representing the model to convert. inputs: A list of input types to the graph. cut_at_symbols: The list of desired outputs from the graph. Symbols must be present in the graph. For debugging use only. """ if not isinstance(torchscript, torch.jit.ScriptModule): raise AssertionError( f"Input should be an object of type torch.jit.ScriptModule. Provide: {type(torchscript)}" ) if hasattr(torchscript, "training") and torchscript.training: logger.warning( "Model is not in eval mode. " "Consider calling '.eval()' on your model prior to conversion" ) if type(torchscript) == torch.jit._script.RecursiveScriptModule: logger.warning( "Support for converting Torch Script Models is experimental. " "If possible you should use a traced model for conversion." ) nodes = [] inputs_name_to_type = OrderedDict() outputs = [] raw_graph, params, buffers = _expand_and_optimize_ir(torchscript) # Add inputs # The first element of the raw_graph.inputs() is the 'self' of the module, which is not used. graph_inputs = list(raw_graph.inputs())[1:] if len(graph_inputs) != len(inputs): raise ValueError( f"Number of TorchScript inputs ({len(graph_inputs)}) must match the user provided inputs ({len(inputs)})." ) for index, _input in enumerate(graph_inputs): name = _input.debugName() inputs_name_to_type[name] = inputs[index] # Add outputs, cutting if @cut_at_symbols is set output_names = cut_at_symbols if output_names is None: output_names = [x.debugName() for x in raw_graph.outputs()] for output in output_names: outputs.append(output) internal_graph = cls( nodes=nodes, params=params, inputs=inputs_name_to_type, outputs=outputs, buffers=buffers, ) # Add nodes node_names = set() for raw_node in raw_graph.nodes(): new_node = InternalTorchIRNode.from_torchscript_node( node=raw_node, parent=internal_graph ) if new_node.name == new_node.kind: new_node.name = _find_new_name(new_node.name, node_names) internal_graph.nodes.append(new_node) node_names.add(new_node.name) internal_graph._cache_model_hierarchy_for_params() return internal_graph def _cache_model_hierarchy_for_params(self) -> None: # We cache the model hierarchy information for model weights in self.params_scope, # since self.params doesn't contain the information. 
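# Roughly (hypothetical names), after this pass self.params_scope maps a weight name to the
# (scope_names, scope_types) pair returned by node.get_scope_info(), e.g.
#   self.params_scope["submodule_1.linear_1.weight"] ==
#       (["submodule_1", "linear_1", "weight"], ["SubModule", "Linear", "getattr"])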
def cache_model_hierarchy_block(block): for node in block.nodes: for b in node.blocks: cache_model_hierarchy_block(b) if node.name in self.params: self.params_scope[node.name] = node.get_scope_info() cache_model_hierarchy_block(self) @classmethod def from_exir(cls, exir): exported_program: torch.export.ExportedProgram = exir ( user_inputs, user_outputs, params, buffers, input_name_to_source_buffer_name, output_name_to_target_buffer_name, ) = extract_io_from_exir_program(exported_program) inputs = OrderedDict([(i.name, i) for i in user_inputs]) nodes = [] for node in exported_program.graph_module.graph.nodes: if node.op == "call_function": nodes.append(InternalTorchIRNode.from_exir_node(node=node)) elif node.op == "get_attr": name = node.target attr = exported_program.graph_module.__getattr__(name) # Only handle simple tensor attribute for now # There may be unconvertible advanced attributes, # e.g. higher-level callables such as "call_delegate" if not isinstance(attr, torch.Tensor): raise NotImplementedError("Only torch.Tensor attr handled yet") params[name] = attr elif node.op == "placeholder": continue elif node.op == "output": continue else: raise NotImplementedError(f"Nodes of type {node.op} not yet implemented") return cls( nodes=nodes, params=params, inputs=inputs, outputs=user_outputs, buffers=buffers, input_name_to_source_buffer_name=input_name_to_source_buffer_name, output_name_to_target_buffer_name=output_name_to_target_buffer_name, ) def __str__(self): graph_str = "graph(\n" graph_str += self._format_inputs(self.inputs, unpack=True) graph_str += self._format_inputs(self.params) graph_str += "):\n" graph_str += "\n".join([str(x) for x in self.nodes]) + "\n" graph_str += "return ({})".format(", ".join(_ssa_name_list(self.outputs))) return graph_str def _format_inputs(self, inputs, unpack=False): def tensor_str(x): try: return "Tensor{}".format( tuple(list(x.shape.shape if unpack else x.shape) + [str(x.dtype)]) ) except: return "Custom Params({})".format(type(x)) inp_str = "" for k, v in inputs.items(): if isinstance(v, (tuple, list)): shape_str = "({})".format(", ".join([tensor_str(x) for x in v])) else: shape_str = tensor_str(v) inp_str += " {} : {},\n".format(_make_ssa_name(k), shape_str) return inp_str def __repr__(self): return str(self) def replace_name(self, old_name, new_name): """Replaces all instances of @old_name with @new_name in @self.""" # Replace graph inputs/outputs _replace_in_list(self.inputs, old_name, new_name) _replace_in_list(self.outputs, old_name, new_name) for node in self.nodes: node.replace_name(old_name, new_name) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/load.py0000644000000000000000000001444714672066616024255 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os.path as _os_path from typing import List, Optional, Union import torch as _torch from torch.jit._script import RecursiveScriptModule from coremltools import _logger as logger from coremltools._deps import _HAS_EXECUTORCH, _HAS_TORCH_EXPORT_API from coremltools.converters.mil.frontend.torch.converter import TorchConverter from coremltools.converters.mil.input_types import StateType, TensorType from coremltools.converters.mil.mil.program import Program from .converter import TorchConverter from .utils import TorchFrontend if _HAS_TORCH_EXPORT_API: from torch.export import ExportedProgram if _HAS_EXECUTORCH: import executorch.exir def load( spec: Union[RecursiveScriptModule, "ExportedProgram", str], inputs: List[TensorType], specification_version: int, debug: bool = False, outputs: Optional[List[TensorType]] = None, cut_at_symbols: Optional[List[str]] = None, use_default_fp16_io: bool = False, states: Optional[List[StateType]] = None, **kwargs ) -> Program: """ Convert PyTorch model to mil CoreML format. Parameters ---------- spec: It could be one of the following: - String path to .pt file containing serialized torchscript model - In memory TorchScript model of type torch.jit.ScriptModule - In memory EXIR program of type ExportedProgram inputs: Can be a singular element or list of elements of the following form 1. Any subclass of InputType 2. torch.Tensor (only shape and dtype will be used) 3. list of (1. or 2.) Inputs are parsed in the flattened order that the model accepts them. If names are not specified: input keys for calling predict on the converted model will be internal symbols of the input to the graph. User can specify a subset of names. debug: bool, optional. Defaults to False. This flag should generally be False except for debugging purposes for diagnosing conversion errors. Setting this flag to True will print the list of supported and unsupported ops found in the model if conversion fails due to an unsupported op. outputs (optional): list[ct.InputType] or None list of either ct.TensorTypes or ct.ImageTypes (both of which are child classes of InputType) This is the value of the "outputs" argument, passed on by the user in "coremltools.convert" API. cut_at_symbols (optional): List of internal symbol name strings. Graph conversion will terminate once these symbols have been generated. For debugging use only. use_default_fp16_io (optional): bool. Defaults to False. When minimum_deployment_target set >= ct.target.iOS16 (the same as ct.target.macOS13), and the compute precision set to fp16, this flag is True. When True, fp32 i/o defaults to fp16. 
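Examples
--------
A minimal sketch of how this function is typically reached (illustrative model, names, and shapes only; users normally go through ``coremltools.convert`` rather than calling ``load`` directly):
```
import torch
import coremltools as ct

class Net(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x)

traced = torch.jit.trace(Net().eval(), torch.rand(1, 3))
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="x", shape=(1, 3))],
    convert_to="mlprogram",
)
```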
""" if _HAS_TORCH_EXPORT_API and isinstance(spec, ExportedProgram): # TODO: rdar://115845792 ([Executorch] Handle user provided inputs/outputs in the convert API) if states: raise AssertionError("'states' argument should be None for ExportedProgram") model = spec else: model = _torchscript_from_spec(spec) converter = TorchConverter( model, inputs, outputs, cut_at_symbols, specification_version, use_default_fp16_io, states, ) return _perform_torch_convert(converter, debug) def is_torch_model(model_spec: Union[str, RecursiveScriptModule]) -> bool: if isinstance(model_spec, str) and (model_spec.endswith(".pt") or model_spec.endswith(".pth")): # PyTorch file path return True elif isinstance(model_spec, _torch.jit.ScriptModule): # PyTorch object return True return False def _torchscript_from_spec(model_spec: Union[str, RecursiveScriptModule]) -> RecursiveScriptModule: if isinstance(model_spec, str) and (model_spec.endswith(".pt") or model_spec.endswith(".pth")): filename = _os_path.abspath(model_spec) try: return _torch.jit.load(filename) except Exception as e: logger.error("\n\nERROR - Could not load the PyTorch model. Got the following error:\n") raise e elif isinstance(model_spec, _torch.jit.ScriptModule): return model_spec else: raise TypeError( "A PyTorch model must either be a .pt or .pth file, or a TorchScript object. " f"Received: {type(model_spec)}" ) if _HAS_TORCH_EXPORT_API: def _torchexport_from_spec( model_spec: Union[str, ExportedProgram], frontend=TorchFrontend.TORCHEXPORT, ) -> ExportedProgram: # Load torch.export serialization if isinstance(model_spec, str) and model_spec.endswith(".pt2"): filename = _os_path.abspath(model_spec) try: model = _torch.export.load(filename) except Exception as e: logger.error( "\n\nERROR - Could not load the PyTorch model. Got the following error:\n" ) raise e elif isinstance(model_spec, ExportedProgram): model = model_spec else: raise TypeError( "A PyTorch model must either be a .pt2 file, or an ExportedProgram object. " f"Received: {type(model_spec)}" ) # To edge if edge dialect is desired if frontend == TorchFrontend.EXECUTORCH and model.dialect != "EDGE": model = executorch.exir.to_edge(model).exported_program() return model def _perform_torch_convert(converter: TorchConverter, debug: bool) -> Program: try: prog = converter.convert() except RuntimeError as e: if debug and "convert function" in str(e): implemented, missing = converter.check_ops() print("the following model ops are IMPLEMENTED:") print("\n".join([" " + str(x) for x in sorted(implemented)])) print("the following model ops are MISSING:") print("\n".join([" " + str(x) for x in sorted(missing)])) raise e return prog ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/ops.py0000644000000000000000000100246414672066616024134 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import builtins import math as _math import numbers import re from collections.abc import Iterable from typing import Dict, List, Optional, Tuple, Union import numpy as _np import numpy as np import torch from tqdm import tqdm as _tqdm from coremltools import _logger as logger from coremltools.converters.mil._deployment_compatibility import AvailableTarget as target from coremltools.converters.mil.frontend import _utils from coremltools.converters.mil.frontend.milproto.load import TranscriptionContext from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Symbol, types from coremltools.converters.mil.mil.block import is_current_opset_version_compatible_with from coremltools.converters.mil.mil.ops.defs._utils import ( MAX_SIZE_CONSTANT_FOLDING, promote_input_dtypes, solve_slice_by_index_shape, ) from coremltools.converters.mil.mil.scope import ScopeInfo, ScopeSource from coremltools.converters.mil.mil.types import is_bool, nptype_from_builtin from coremltools.converters.mil.mil.types.symbolic import any_symbolic, is_symbolic from coremltools.converters.mil.mil.types.type_mapping import builtin_to_string from coremltools.converters.mil.mil.var import ListVar, Var from .._utils import build_einsum_mil, value_at from .internal_graph import InternalTorchIRGraph, InternalTorchIRNode from .torch_op_registry import _TORCH_OPS_REGISTRY, register_torch_op from .utils import ( NUM_TO_DTYPE_STRING, NUM_TO_NUMPY_DTYPE, NUM_TO_TORCH_DTYPE, NUMPY_DTYPE_TO_TORCH_NUM, TORCH_DTYPE_TO_NUM, TORCH_EXPORT_BASED_FRONTENDS, TYPE_TO_DTYPE_STRING, TorchFrontend, dtype_to_32bit, ) # The pytorch args for many of the below ops were sourced from # https://github.com/pytorch/pytorch/blob/d971007c291c0ead1003d12cd553d18ddb582207/torch/csrc/jit/mobile/register_mobile_ops.cpp#L216 # Max int64 value. Used as a default value in many PyTorch functions. PYTORCH_DEFAULT_VALUE = 2**63 - 1 VALUE_CLOSE_TO_INFINITY = 1e+38 TORCH_STRING_ARGS = { # conv padding "same", "valid", # meshgrid indexing "ij", "xy", # pad mode "circular", "constant", "reflect", "replicate", } def _all_outputs_present(context, graph): """ Returns true if all the symbols in the graph's output list are present in context. """ for outp in graph.outputs: try: context[outp] except ValueError: return False return True def convert_nodes( context: TranscriptionContext, graph: InternalTorchIRGraph, early_exit: Optional[bool] = True, ) -> None: """ Iterate over the nodes of a graph or block and convert to MIL. Arguments: context: A TranscriptionContext object to pull node inputs and assign node outputs. graph: An InternalTorchIRGraph or InternalTorchIRBlock object. """ for node in _tqdm(graph.nodes, desc="Converting PyTorch Frontend ==> MIL Ops", unit=" ops"): try: convert_single_node(context, node) except Exception as e: scope_names = node.get_scope_info()[0] op_location = '/'.join(scope_names) logger.error(f"\n\nERROR - converting '{node.kind}' op (located at: '{op_location}'):\n") raise e # re-raise exception if early_exit and _all_outputs_present(context, graph): # We've generated all the outputs the graph needs, terminate conversion. break def convert_single_node(context: TranscriptionContext, node: InternalTorchIRNode) -> None: """ Converts a single lowered PyTorch op to MIL. 
Arguments: context: A TranscriptionContext object to pull node inputs and assign node outputs. node: lowered PyTorch op to convert. """ op_lookup = node.kind add_op = _TORCH_OPS_REGISTRY.get_func(op_lookup) if add_op is None: if re.match(r".*_dynamic", op_lookup): raise RuntimeError( f"PyTorch convert function for op '{op_lookup}' not implemented.\n" "Dynamic quantized models are not supported by Core ML.\n" "Please use static quantization or the APIs in coremltools.optimize to quantize/compress models." ) else: raise RuntimeError( f"PyTorch convert function for op '{op_lookup}' not implemented." ) logger.info("Converting op {} : {}".format(node.name, op_lookup)) scopes = [] if context.frontend == TorchFrontend.TORCHSCRIPT: scope_name, scope_type = node.get_scope_info() scopes = [ ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=scope_type), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=scope_name), ] elif context.frontend in TORCH_EXPORT_BASED_FRONTENDS: scopes = [ ScopeInfo(source=ScopeSource.EXIR_STACK_TRACE, data=[node.meta.get("stack_trace")]) ] if context.frontend == TorchFrontend.EXECUTORCH: scopes.append( ScopeInfo( source=ScopeSource.EXIR_DEBUG_HANDLE, data=[node.meta.get("debug_handle")] ) ) else: raise ValueError(f"Invalid PyTorch frontend {context.frontend}") with mb.scope(*scopes): if context.frontend == TorchFrontend.TORCHSCRIPT: context.quant_context.maybe_handle_quantized_inputs(node) context.prepare_for_conversion(node) add_op(context, node) if _TORCH_OPS_REGISTRY.is_inplace_op(op_lookup): context.process_inplace_op(node) def convert_block(context, block, inputs): """Convert a block (sub-graph) to MIL. Conversion happens within a new context frame. Arguments: context: A TranscriptionContext object to pull node inputs and assign node outputs. block: An InternalTorchIRBlock object. inputs: List of Vars from the outer context that map to the block's expected inputs. The number of inputs provided must match the number expected by the block. """ assert len(block.inputs) == len(inputs) # Start a new context frame. context.push((block.inputs, inputs)) # Add the block ops. convert_nodes(context, block) # Collect the block outputs. outputs = [context[outp] for outp in block.outputs] # Return to the previous context frame. context.pop() return outputs def _assert_torch_dtype_num_is_not_complex_number(num): # 9 is torch.complex64 or torch.cfloat # 10 is torch.complex128 or torch.cdouble assert num is None or num.val is None or num.val not in (9, 10), \ "This op does not support complex number dtype." def _get_bindings(context, alist) -> List[Var]: """ This utility is needed in order to handle following cases: With EXIR, - Some of the inputs can be literals (like axis, perms) and thus can be of types: list, int etc. 
- An Input Parameter of an op could be a list/tuple similar to our concat layer """ results = [] for i in alist: if isinstance(i, str): if i in context: results.append(context[i]) elif i in TORCH_STRING_ARGS: results.append(i) else: raise ValueError( f"Binding {i} is neither a name of existing var in context, " "nor a torch string argument" ) elif isinstance(i, (list, tuple)) and all(isinstance(j, int) for j in i): results.append(mb.const(val=i)) elif isinstance(i, (list, tuple)): results.append(_get_bindings(context, i)) elif isinstance(i, (int, float)): results.append(mb.const(val=i)) elif i is None: results.append(None) else: raise NotImplementedError(f"Binding of inputs of type {type(i)} not handled yet") return results def _get_inputs( context, node, expected: Union[int, List, Tuple, Dict[TorchFrontend, int]] = None, min_expected: Union[int, Dict[TorchFrontend, int]] = None, ) -> List[Var]: """ Look up a node's inputs in @context and return them as a list. If @expected is not None, also verifies the number of inputs matches the value of @expected. """ def check_if_number_of_inputs_expected(num_inputs: int, expected: Union[int, List, Tuple]) -> None: expected = [expected] if isinstance(expected, int) else expected if num_inputs not in expected: raise ValueError( f"node {node.name} ({node.kind}) got {num_inputs} input(s), expected {expected}" ) def check_if_number_of_inputs_more_than_min_expected(num_inputs: int, min_expected: int) -> None: if num_inputs < min_expected: raise ValueError( f"node {node.name} ({node.kind}) got {num_inputs} input(s), " f"expected minimum {min_expected} inputs" ) inputs = _get_bindings(context, node.inputs) if expected is not None: if isinstance(expected, dict): if context.frontend in expected: check_if_number_of_inputs_expected(len(inputs), expected[context.frontend]) else: check_if_number_of_inputs_expected(len(inputs), expected) if min_expected is not None: if isinstance(min_expected, dict): if context.frontend in min_expected: check_if_number_of_inputs_more_than_min_expected(len(inputs), min_expected[context.frontend]) else: check_if_number_of_inputs_more_than_min_expected(len(inputs), min_expected) return inputs def _get_kwinputs(context, node, keyword: str, default: Optional[List[Var]] = None) -> List[Var]: if node.kwinputs is None: return default else: bindings = node.kwinputs.get(keyword) if bindings is None: return default else: return _get_bindings(context, bindings) def _list_select(shape_var, index): """ Sometimes we need to select a specific item from a list. If that item is known at compile time, extract it as a const. Otherwise, if it's symbolic, use gather. """ if shape_var.can_be_folded_to_const(): res = mb.const(val=shape_var.val[index]) else: if is_current_opset_version_compatible_with(target.iOS17): # IOS17 `gather` requires non-negative indices. index = mb.select( cond=mb.greater_equal(x=index, y=0), a=index, b=mb.add(x=index, y=value_at(mb.shape(x=shape_var), 0)), ) res = mb.gather(x=shape_var, indices=index) return res def _is_const(var, optional=False): """ Check if a var is a const. It could be `const` or `constexpr_` ops. """ if optional and var is None: return True if isinstance(var, np.ndarray): return True return ( var is not None and isinstance(var, Var) and var.op is not None and ( var.op.op_type.startswith("constexpr_") or (var.op.op_type == "dequantize" and var.op.can_materialize_val()) or var.val is not None ) ) def _create_linear_layer(x, w, bias): """ Utility to translate linear layer.
Since the linear layer can only take `const` or `constexpr_` weight as input, for other cases, we implement the linear layer through matmul. For instance, given a torch model with an int8 weight: int8_weight -> transpose -> reshape -> linear If we directly use `mb.linear`, it is going to produce a compilation error at runtime. """ if _is_const(w) and _is_const(bias, optional=True): return mb.linear(x=x, weight=w, bias=bias) res = mb.matmul(x=x, y=w, transpose_y=True) if bias is not None: res = mb.add(x=res, y=bias) return res def _construct_constant(val, name): # Converter cannot handle torch tensors. if isinstance(val, torch.Tensor): val = val.cpu().numpy() # MIL casts ints to int32, which can't represent PyTorch's default value. # So we instead represent it with None, and any ops that might get the # value will check for None instead. if isinstance(val, int) and val == PYTORCH_DEFAULT_VALUE: val = None # Pytorch uses inf if val is not None and isinstance(val, numbers.Number) and _np.isinf(val): if val < 0: # neg inf # most negative number in fp32 val = -3.4e+38 else: # positive inf val = 3.4e+38 if val is None: return None else: return mb.const(val=val, name=name) @register_torch_op def native_dropout(context, node): if context.frontend in TORCH_EXPORT_BASED_FRONTENDS: inputs = _get_inputs(context, node, min_expected=2) context.add((inputs[0],), node.name) else: raise ValueError(f"native_dropout should only appear in EXIR, but got {context.frontend}") @register_torch_op def affine_grid_generator(context, node): # rdar://73165386 (Improve error handling of coremltools "affine" op PyTorch conversion.) affine_op_name = node.name theta, size, align_corners = _get_inputs(context, node, expected=3) # note: only add consts here as PyTorch uses affine_grid + grid_sampler together is_theta_const = theta.val is not None if is_theta_const: context.add(mb.const(val=theta.val, name="{}_theta".format(affine_op_name))) else: # theta is a dynamic input, keep track of its name context.add(mb.const(val=theta.name, name="{}_theta".format(affine_op_name))) context.add(mb.const(val=size.val, name="{}_size".format(affine_op_name))) context.add(mb.const(val=align_corners.val, name="{}_align_corners".format(affine_op_name))) @register_torch_op def grid_sampler(context, node): affine_op_name = node.inputs[1] # https://github.com/pytorch/pytorch/blob/00d432a1ed179eff52a9d86a0630f623bf20a37a/aten/src/ATen/native/GridSampler.h#L10-L11 m_mode = {0: "bilinear", 1: "nearest"} m_padding_mode = {0: "constant", 1: "border", 2: "reflection"} # add `resample` if grid/coordinates is in input, otherwise, # add `affine` to generate grid from `affine_grid_generator`.
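    # Illustrative mapping (assuming the usual PyTorch pairing of these two ops):
    #   grid = F.affine_grid(theta, size); out = F.grid_sample(x, grid)
    #       -> `affine_grid_generator` + `grid_sampler`, lowered to a single mb.affine below
    #   out = F.grid_sample(x, some_precomputed_grid)
    #       -> `grid_sampler` whose grid is already a Var in the context, lowered to mb.resample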
if affine_op_name in context: # add `resample` op inputs = _get_inputs(context, node, expected=5) sampling_mode = m_mode[inputs[2].val] padding_mode = m_padding_mode[inputs[3].val] align_corners = inputs[4].val # When align_corners=False, padding_mode is corresponding to Core ML's symmetric if padding_mode == "reflection" and align_corners is False: padding_mode = "symmetric" x = mb.resample( x=inputs[0], coordinates=inputs[1], sampling_mode=sampling_mode, padding_mode=padding_mode, padding_value=0.0, coordinates_mode="normalized_minus_one_to_one", align_corners=align_corners, name=node.name, ) context.add(x) else: # add `affine` op instead x = context[node.inputs[0]] # inputs from `affine_grid_generator` affine_theta = context["{}_theta".format(affine_op_name)] affine_size = context["{}_size".format(affine_op_name)] affine_align_corners = context["{}_align_corners".format(affine_op_name)] # affine_theta.val is either name string (dynamic input) or np.ndarray (static values) # see `affine_grid_generator` for details. is_theta_const = not isinstance(affine_theta.val, str) if is_theta_const: transform_matrix = _np.reshape(affine_theta.val, (affine_theta.shape[0], 6)) else: # theta is dynamic input, add `reshape` op to PyMIL transform_matrix = mb.reshape( x=context[affine_theta.val], shape=(-1, 6), name=node.name + "_theta_reshape", ) # inputs from `grid_sampler` sampling_mode = m_mode[context[node.inputs[2]].val] padding_mode = m_padding_mode[context[node.inputs[3]].val] align_corners = context[node.inputs[4]].val if sampling_mode != "bilinear": raise NotImplementedError("'sampling_mode' not supported.") if padding_mode != "constant": raise NotImplementedError("'padding_mode' not supported.") if affine_align_corners.val != align_corners: raise ValueError( "Op 'affine_grid_generator' and 'grid_sampler' must agree on 'align_corners'." 
) x = mb.affine( x=x, transform_matrix=transform_matrix, output_height=affine_size.val[2], output_width=affine_size.val[3], sampling_mode=sampling_mode, padding_mode=padding_mode, padding_value=0.0, coordinates_mode="normalized_minus_one_to_one", align_corners=align_corners, name=node.name, ) context.add(x) @register_torch_op def silu(context, node): inputs = _get_inputs(context, node, expected=1) x = mb.silu(x=inputs[0], name=node.name) context.add(x) @register_torch_op def constant(context, node): assert len(node.inputs) == 0 assert len(node.outputs) == 1 name = node.name val = node.attr["value"] const = _construct_constant(val, name) context.add(const, torch_name=name) @register_torch_op def cosine_similarity(context, node): inputs = _get_inputs(context, node, expected=4) dim = inputs[-2].val eps = inputs[-1].val xy = mb.mul(x=inputs[0], y=inputs[1]) sum_xy = mb.reduce_sum(x=xy, axes=[dim]) xx = mb.mul(x=inputs[0], y=inputs[0]) sum_xx = mb.reduce_sum(x=xx, axes=[dim]) yy = mb.mul(x=inputs[1], y=inputs[1]) sum_yy = mb.reduce_sum(x=yy, axes=[dim]) mul_sum_xy = mb.mul(x=sum_xx, y=sum_yy) div_12 = mb.maximum(x=mul_sum_xy, y=eps * eps) div_sqrt = mb.sqrt(x=div_12) cs = mb.real_div(x=sum_xy, y=div_sqrt, name=node.name) context.add(cs) @register_torch_op def selu(context, node): ALPHA = 1.6732632423543772 SCALE = 1.0507009873554805 x = _get_inputs(context, node, expected=1)[0] x = mb.elu(x=x, alpha=ALPHA) x = mb.mul(x=x, y=SCALE, name=node.name) context.add(x) @register_torch_op def dot(context, node): inputs = _get_inputs(context, node, expected=2) xy = mb.mul(x=inputs[0], y=inputs[1]) sum_xy = mb.reduce_sum(x=xy, axes=[0]) context.add(sum_xy, node.name) @register_torch_op def mv(context, node): inputs = _get_inputs(context, node, expected=2) expand = mb.expand_dims(x=inputs[1], axes=[-1], name=node.name + "_expanded") mv = mb.matmul(x=inputs[0], y=expand, name=node.name + "_mv") res = mb.squeeze(x=mv, axes=[-1], name=node.name) context.add(res) @register_torch_op def outer(context, node): inputs = _get_inputs(context, node, expected=2) x = mb.reshape(x=inputs[0], shape=[-1, 1]) y = mb.reshape(x=inputs[1], shape=[1, -1]) res = mb.matmul(x=x, y=y, name=node.name) context.add(res) @register_torch_op def cross(context, node): inputs = _get_inputs(context, node, expected=3) x = inputs[0] y = inputs[1] dim = inputs[2] x1 = mb.gather(x=x, indices=[1, 2, 0], axis=dim, name="x1") x2 = mb.gather(x=x, indices=[2, 0, 1], axis=dim, name="x2") y1 = mb.gather(x=y, indices=[1, 2, 0], axis=dim, name="y1") y2 = mb.gather(x=y, indices=[2, 0, 1], axis=dim, name="y2") m1 = mb.mul(x=x1, y=y2) m2 = mb.mul(x=x2, y=y1) z = mb.sub(x=m1, y=m2, name=node.name) context.add(z) @register_torch_op def frobenius_norm(context, node): x, dim, keep_dims = _get_inputs(context, node, expected=3) result = mb.reduce_l2_norm(x=x, axes=dim, keep_dims=keep_dims, name=node.name) context.add(result) @register_torch_op def norm(context, node): x, num, dim, keep_dims = _get_inputs(context, node, expected=4) assert x is not None and keep_dims is not None and num is not None and dim is not None temp = _vector_norm(x=x, order=num, dim=dim, keep_dims=keep_dims, name=node.name) context.add(temp) def _vector_norm(x, order, dim, keep_dims, name): # 0 norm is special if order.val == 0: # sum(x!=0) x = mb.cast(x=x, dtype="fp32") temp = mb.not_equal(x=x, y=0.) 
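        # Count the nonzero entries: cast the boolean mask to int32 and sum it,
        # e.g. x = [1., 0., -2.] -> mask [True, False, True] -> 0-"norm" of 2.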
temp = mb.cast(x=temp, dtype='int32') temp = mb.reduce_sum(x=temp, axes=dim, keep_dims=keep_dims, name=name) # infinity norm is special elif order.val > VALUE_CLOSE_TO_INFINITY: # max(abs(x)) temp = mb.abs(x=x) temp = mb.reduce_max(x=temp, axes=dim, keep_dims=keep_dims, name=name) # -infinity norm is special elif order.val < -VALUE_CLOSE_TO_INFINITY: # min(abs(x)) temp = mb.abs(x=x) temp = mb.reduce_min(x=temp, axes=dim, keep_dims=keep_dims, name=name) # Although 2 norm can fit in the general formula, # since it is very common, we have a tailored kernel for it elif order.val == 2: temp = mb.reduce_l2_norm(x=x, axes=dim, keep_dims=keep_dims, name=name) # use general formula to compute all other norms else: # sum(abs(x)^{order})^{(1 / order)} temp = mb.abs(x=x) x, y = promote_input_dtypes([temp, order.val]) temp = mb.pow(x=x, y=y) temp = mb.reduce_sum(x=temp, axes=dim, keep_dims=keep_dims) temp = mb.pow(x=temp, y=1.0 / order.val, name=name) return temp @register_torch_op def _weight_norm(context, node): v, g, dim = _get_inputs(context, node, expected=3) # Determine axes for L2 norm if dim.val == -1: axes = None else: axes = list(range(v.rank)) dim = dim.val if dim >= 0: axes.remove(dim) else: axes.remove(v.rank + dim) # Calculate L2 norm of v temp = mb.pow(x=v, y=2.) temp = mb.reduce_sum(x=temp, axes=axes, keep_dims=True) norm = mb.pow(x=temp, y=1./2) inverse_norm = mb.inverse(x=norm) direction = mb.mul(x=v, y=inverse_norm) result = mb.mul(x=g, y=direction, name=node.name) context.add(result) def _matrix_norm(x, order, dim, keep_dims, name): if order.val == 1: # max(sum(abs(x), dim=0)) temp = mb.abs(x=x) temp = mb.reduce_sum(x=temp, axes=[dim[0]], keep_dims=True) temp = mb.reduce_max(x=temp, axes=dim, keep_dims=keep_dims, name=name) elif order.val == -1: # min(sum(abs(x), dim=0)) temp = mb.abs(x=x) temp = mb.reduce_sum(x=temp, axes=[dim[0]], keep_dims=True) temp = mb.reduce_min(x=temp, axes=dim, keep_dims=keep_dims, name=name) elif order.val == "fro": # sum(x**2)**(1/2) temp = mb.reduce_l2_norm(x=x, axes=dim, keep_dims=keep_dims, name=name) elif order.val > VALUE_CLOSE_TO_INFINITY: # max(sum(abs(x), dim=1)) temp = mb.abs(x=x) temp = mb.reduce_sum(x=temp, axes=[dim[1]], keep_dims=True) temp = mb.reduce_max(x=temp, axes=dim, keep_dims=keep_dims, name=name) elif order.val < -VALUE_CLOSE_TO_INFINITY: # min(sum(abs(x), dim=1)) temp = mb.abs(x=x) temp = mb.reduce_sum(x=temp, axes=[dim[1]], keep_dims=True) temp = mb.reduce_min(x=temp, axes=dim, keep_dims=keep_dims, name=name) else: raise RuntimeError("Matrix norm is not defined for the current inputs") return temp @register_torch_op def narrow(context, node): x, dim, start, length = _get_inputs(context, node, expected=4) begin = [0] * len(x.shape) begin[dim.val] = start.val end = list(x.shape) end[dim.val] = start.val + length.val context.add( mb.slice_by_index(x=x, begin=begin, end=end, name=node.name) ) @register_torch_op def linalg_vector_norm(context, node): x, order, dim, keep_dims, _ = _get_inputs(context, node, expected=5) assert x is not None and keep_dims is not None and order is not None temp = _vector_norm(x=x, order=order, dim=dim, keep_dims=keep_dims, name=node.name) context.add(temp) @register_torch_op def linalg_matrix_norm(context, node): x, order, dim, keep_dims, _ = _get_inputs(context, node, expected=5) assert x is not None and keep_dims is not None and order is not None and dim is not None assert len(dim.val) == 2 temp = _matrix_norm(x=x, order=order, dim=dim.val, keep_dims=keep_dims, name=node.name) context.add(temp)
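# Rough dispatch sketch for torch.linalg.norm (illustrative, not exhaustive):
#   torch.linalg.norm(x)                      -> mb.reduce_l2_norm over all axes
#   torch.linalg.norm(x, ord=3, dim=[1])      -> _vector_norm: sum(|x|^3)^(1/3) along axis 1
#   torch.linalg.norm(x, ord=1, dim=(0, 1))   -> _matrix_norm: max column abs-sum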
@register_torch_op def linalg_norm(context, node): x, order, dim, keep_dims, _ = _get_inputs(context, node, expected=5) assert x is not None and keep_dims is not None if dim is None: dim = _np.arange(x.rank) else: dim = dim.val if order is None: temp = mb.reduce_l2_norm(x=x, axes=dim, keep_dims=keep_dims, name=node.name) elif len(dim) == 2: temp = _matrix_norm( x=x, order=order, dim=dim, keep_dims=keep_dims, name=node.name ) else: temp = _vector_norm(x=x, order=order, dim=dim, keep_dims=keep_dims, name=node.name) context.add(temp) @register_torch_op def hardswish(context, node): inputs = _get_inputs(context, node, expected=1) x = inputs[0] w = mb.thresholded_relu(x=x, alpha=-3.0) y = mb.sigmoid_hard( x=w, alpha=1.0 / 6, beta=0.5 ) # ``y = min(max(alpha * x + beta, -1), 1) result = mb.mul(x=w, y=y, name=node.name) context.add(result) @register_torch_op def reshape_as(context, node): inputs = _get_inputs(context, node, expected=2) x = inputs[0] ref = inputs[1] shape = mb.shape(x=ref) result = mb.reshape(x=x, shape=shape, name=node.name) context.add(result) @register_torch_op def unflatten(context, node): x, dim_var, unflattened_size_var = _get_inputs(context, node, expected=3) dim = dim_var.val if dim is None: raise ValueError("In 'unflatten' op, the 'dim' must be provided.") if dim < 0: dim += x.rank x_shape = mb.shape(x=x) pre_shape = mb.slice_by_index(x=x_shape, begin=[0], end=[dim]) post_shape = mb.slice_by_index(x=x_shape, begin=[dim + 1], end=[len(x.shape)]) target_shape = mb.concat(values=(pre_shape, unflattened_size_var, post_shape), axis=0) target_shape = mb.cast(x=target_shape, dtype="int32") y = mb.reshape(x=x, shape=target_shape, name=node.name) context.add(y) def _array_construct(context, node, array_type): assert len(node.outputs) == 1 inputs = _get_inputs(context, node) scalar_inputs = [ inp for inp in inputs if isinstance(inp, Var) and inp.can_be_folded_to_const() and len(inp.shape) == 0 ] if len(scalar_inputs) == len(inputs): # All the list items are compile-time scalar constants, so let's create # a new const that concatenates them. val = array_type([inp.val for inp in inputs]) const = mb.const(val=val, name=node.name) context.add(const) else: # If at least one input to the construct op is non-const, collect # the inputs and add them directly to the context. Ops that use this # node's output will take the list directly as input. 
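        # e.g. a ListConstruct whose items are runtime tensors typically feeds ops like
        # `cat` / `stack`, which then receive this Python list of Vars as their first input.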
context.add(array_type(inputs), node.name) @register_torch_op def tupleconstruct(context, node): _array_construct(context, node, array_type=tuple) @register_torch_op def listconstruct(context, node): _array_construct(context, node, array_type=list) @register_torch_op def eq(context, node): inputs = _get_inputs(context, node, expected=2) x = inputs[0] y = inputs[1] if is_bool(x.dtype): x = mb.cast(x=x, dtype="int32") if is_bool(y.dtype): y = mb.cast(x=y, dtype="int32") x, y = promote_input_dtypes([x, y]) equal_to = mb.equal(x=x, y=y, name=node.name) context.add(equal_to) @register_torch_op def ne(context, node): inputs = _get_inputs(context, node, expected=2) x = inputs[0] y = inputs[1] if is_bool(x.dtype): x = mb.cast(x=x, dtype="int32") if is_bool(y.dtype): y = mb.cast(x=y, dtype="int32") x, y = promote_input_dtypes([x, y]) equal_to = mb.not_equal(x=x, y=y, name=node.name) context.add(equal_to) @register_torch_op def le(context, node): inputs = _get_inputs(context, node, expected=2) x, y = promote_input_dtypes(inputs) less_equal = mb.less_equal(x=x, y=y, name=node.name) context.add(less_equal) @register_torch_op def lt(context, node): inputs = _get_inputs(context, node, expected=2) x, y = promote_input_dtypes(inputs) less = mb.less(x=x, y=y, name=node.name) context.add(less) @register_torch_op def ge(context, node): inputs = _get_inputs(context, node, expected=2) x, y = promote_input_dtypes(inputs) greater_equal = mb.greater_equal(x=x, y=y, name=node.name) context.add(greater_equal) @register_torch_op def gt(context, node): inputs = _get_inputs(context, node, expected=2) x, y = promote_input_dtypes(inputs[:2]) greater = mb.greater(x=x, y=y, name=node.name) context.add(greater) @register_torch_op(torch_alias=["t", "numpy_t"]) def transpose(context, node): assert len(node.outputs) == 1 inputs = _get_inputs(context, node) x = inputs[0] if len(node.inputs) == 1: # PyTorch has several transpose ops that can be emitted. This one is only # emitted when .t() is called on a tensor, which means it can only be # called on a matrix. 
if len(x.shape) > 2: raise ValueError("transpose without dims for rank > 2 is unsupported") res = mb.transpose(x=x, perm=[1, 0], name=node.name) else: assert len(inputs) == 3 ax0 = inputs[1].val ax1 = inputs[2].val perm = list(range(len(x.shape))) perm[ax0] = ax1 perm[ax1] = ax0 res = mb.transpose(x=x, perm=perm, name=node.name) context.add(res) @register_torch_op(torch_alias=["permute"]) def permute_copy(context, node): inputs = _get_inputs(context, node, expected=2) perm = mb.transpose(x=inputs[0], perm=inputs[1], name=node.name) context.add(perm) @register_torch_op def frac(context, node): # Frac(x) = x - floor(abs(x)) * sign(x) x = _get_inputs(context, node, expected=1)[0] floor_abs = mb.floor(x=mb.abs(x=x)) sign_abs_floor = mb.mul(x=floor_abs, y=mb.sign(x=x)) res = mb.sub(x=x, y=sign_abs_floor) context.add(res, torch_name=node.name) @register_torch_op def pixel_shuffle(context, node): inputs = _get_inputs(context, node, expected=2) perm = mb.pixel_shuffle(x=inputs[0], upscale_factor=inputs[1], name=node.name) context.add(perm) @register_torch_op def pixel_unshuffle(context, node): inputs = _get_inputs(context, node, expected=2) downscale_factor = _np.uint32(inputs[1].val) perm = mb.pixel_unshuffle(x=inputs[0], downscale_factor=downscale_factor, name=node.name) context.add(perm) @register_torch_op(torch_alias=["bmm", "mm"]) def matmul(context, node): inputs = _get_inputs(context, node, expected=2) if (len(inputs[1].shape) == 2 and len(inputs[0].shape) <= 3) and ( _is_const(inputs[1]) or inputs[1].is_descendant_of_const ): linear_x, weight = inputs transposed_weight = mb.transpose(x=weight, perm=(1, 0)) res = mb.linear(x=linear_x, weight=transposed_weight, name=node.name) else: x, y = promote_input_dtypes([inputs[0], inputs[1]]) res = mb.matmul(x=x, y=y, name=node.name) context.add(res) @register_torch_op def add(context, node): add_inputs = _get_inputs(context, node) assert len(node.outputs) == 1 # TODO (sberardi): 3rd param to aten::add is a scale factor, need to handle that. 
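    # e.g. torch.add(x, y, alpha=2) arrives as aten::add(x, y, 2); any alpha other than 1 is rejected below.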
# out=input+alpha x other # rdar://60175736 if len(add_inputs) > 2 and add_inputs[2].val != 1: raise ValueError("ADD does not support scale factor param") x, y = add_inputs[:2] if types.is_bool(x.dtype) and types.is_bool(y.dtype): add_node = mb.logical_or(x=x, y=y, name=node.name) elif types.is_complex(x.dtype) or types.is_complex(y.dtype): x_real = mb.complex_real(data=x) if types.is_complex(x.dtype) else x x_imag = mb.complex_imag(data=x) if types.is_complex(x.dtype) else 0.0 y_real = mb.complex_real(data=y) if types.is_complex(y.dtype) else y y_imag = mb.complex_imag(data=y) if types.is_complex(y.dtype) else 0.0 add_node = mb.complex(real_data=mb.add(x=x_real, y=y_real), imag_data=mb.add(x=x_imag, y=y_imag), name=node.name) else: x, y = promote_input_dtypes([x, y]) add_node = mb.add(x=x, y=y, name=node.name) context.add(add_node) @register_torch_op def cumsum(context, node): inputs = _get_inputs(context, node, expected=3) x = inputs[0] if is_bool(x.dtype): x = mb.cast(x=x, dtype='int32') res = mb.cumsum(x=x, axis=inputs[1], name=node.name) context.add(res) @register_torch_op def addmm(context, node): # addmm(Tensor x, Tensor mat1, Tensor mat2, Scalar beta=1, Scalar alpha=1) # output = beta * x + alpha * (mat1 @ mat2) assert len(node.outputs) == 1 inputs = _get_inputs(context, node, expected=[3, 4, 5]) x = inputs[0] mat1 = inputs[1] mat2 = inputs[2] beta = inputs[3] if len(inputs) > 3 else mb.const(val=1.0) alpha = inputs[4] if len(inputs) > 4 else mb.const(val=1.0) if beta.val != 1.0: # Apply beta scaling factor to the input. x = mb.mul(x=x, y=beta) matmul = mb.matmul(x=mat1, y=mat2) if alpha.val != 1.0: # Apply alpha scaling factor to the matrix multiplicaiton matmul = mb.mul(x=alpha, y=matmul) result = mb.add(x=x, y=matmul, name=node.name) context.add(result) @register_torch_op def linear(context, node): inputs = _get_inputs(context, node, expected=[2, 3]) x = inputs[0] W = inputs[1] bias = inputs[2] if len(node.inputs) == 3 else None if bias is not None: x, W, bias = promote_input_dtypes([x, W, bias]) else: x, W = promote_input_dtypes([x, W]) res = _create_linear_layer(x, W, bias) context.add(res, torch_name=node.name) @register_torch_op( torch_alias=[ "convolution", "conv1d", "conv2d", "conv3d", "conv1d.padding", "conv2d.padding", "conv3d.padding", "conv_transpose1d", "conv_transpose2d.input", "conv_transpose3d.input", ] ) def _convolution(context, node): default_torch_padding = "valid" if node.kind.endswith(".padding") else 0 def _parse_positional_args(context, node) -> Tuple[Var]: inputs = _get_inputs( context, node, min_expected={ TorchFrontend.TORCHSCRIPT: 7, TorchFrontend.TORCHEXPORT: 2, TorchFrontend.EXECUTORCH: 2, }, ) nargs = len(inputs) x = inputs[0] # PyTorch and MIL has same weight layout # Conv: [Cout, Cin, *D] # ConvTranspose: [Cin, Cout, *D] weight = inputs[1] x, weight = promote_input_dtypes([x, weight]) bias = inputs[2] if nargs > 2 else None stride = inputs[3] if nargs > 3 else 1 padding = inputs[4] if nargs > 4 else default_torch_padding if node.kind in ("_convolution", "convolution"): dilation = inputs[5] if nargs > 5 else 1 transposed = inputs[6].val if nargs > 6 else False out_padding = inputs[7] if nargs > 7 else 0 groups = inputs[8] if nargs > 8 else 1 elif re.match(r"conv_transpose[123]d.*", node.kind): out_padding = inputs[5] if nargs > 5 else 0 groups = inputs[6] if nargs > 6 else 1 dilation = inputs[7] if nargs > 7 else 1 transposed = True else: dilation = inputs[5] if nargs > 5 else 1 groups = inputs[6] if nargs > 6 else 1 transposed = False out_padding = 
0 return x, weight, bias, stride, padding, dilation, groups, transposed, out_padding def _parse_keyword_args( context, node, bias, stride, padding, dilation, groups, out_padding ) -> Tuple[Var]: # Only torch.export may have kwargs if context.frontend != TorchFrontend.TORCHEXPORT: return bias, stride, padding, dilation, groups, out_padding bias = _get_kwinputs(context, node, "bias", default=[bias])[0] stride = _get_kwinputs(context, node, "stride", default=[stride])[0] padding = _get_kwinputs(context, node, "padding", default=[padding])[0] dilation = _get_kwinputs(context, node, "dilation", default=[dilation])[0] groups = _get_kwinputs(context, node, "groups", default=[groups])[0] out_padding = _get_kwinputs(context, node, "out_padding", default=[out_padding])[0] return bias, stride, padding, dilation, groups, out_padding def _translate_torch_args(node, weight, stride, padding, dilation, groups, out_padding): spatial_rank = weight.rank - 2 # Core ML strides comes from torch stride if isinstance(stride, Var): stride = stride.val assert stride is not None, "torch conv stride must be constant" # Torch stride is an int (for all spatial dims) or an n-tuple of ints (one per spatial dim) # Core ML requires an n-tuple if isinstance(stride, int) or len(stride) == 1: strides = _np.array([np.squeeze(stride)] * spatial_rank) else: strides = stride # 1 is Core ML default value, so using None is preferred if _np.all(strides == 1): strides = None # Core ML pad_type and pad come from torch padding # For torch conv op .padding variants, torch padding is a string, # with possible values ("valid", "same") if node.kind.endswith(".padding"): pad_type = padding if isinstance(pad_type, Var): assert pad_type.val is not None pad_type = pad_type.val assert pad_type in ("valid", "same") # Core ML pad is None for pad_type "valid" / "same" pad = None # For other torch conv op variants, torch padding is # an int (for all spatial dims) or an n-tuple of ints (one per spatial dim) else: if isinstance(padding, Var): padding = padding.val assert padding is not None, "torch conv padding must be constant" # Core ML requires a (2 * n)-tuple, start and end for each spatial dim if isinstance(padding, int) or len(padding) == 1: pad = _np.array([np.squeeze(padding)] * (2 * spatial_rank)) else: assert len(padding) == spatial_rank pad = _np.repeat(padding, 2) # Create Core ML pad_type according to Core ML pad if _np.all(pad == 0): pad_type = "valid" # 0 is Core ML default value, so using None is preferred pad = None else: pad_type = "custom" # Core ML dilations comes from torch dilation if isinstance(dilation, Var): dilation = dilation.val assert dilation is not None, "torch conv dilation must be constant" # Torch dilation is an int (for all spatial dims) or an n-tuple of ints (one per spatial dim) # Core ML requires an n-tuple if isinstance(dilation, int) or len(dilation) == 1: dilations = _np.array([np.squeeze(dilation)] * spatial_rank) else: dilations = dilation # 1 is Core ML default value, so using None is preferred if _np.all(dilations == 1): dilations = None # Core ML groups is torch groups if isinstance(groups, Var): groups = groups.val assert groups is not None, "torch conv groups must be constant" # 1 is Core ML default value, so using None is preferred if groups == 1: groups = None if isinstance(out_padding, Var): out_padding = out_padding.val assert out_padding is not None, "torch out_padding must be constant" # 0 is Core ML default value, so using None is preferred if _np.all(out_padding == 0): out_padding = None return 
strides, pad_type, pad, dilations, groups, out_padding ( x, weight, bias, stride, padding, dilation, groups, transposed, out_padding, ) = _parse_positional_args(context, node) bias, stride, padding, dilation, groups, out_padding = _parse_keyword_args( context, node, bias, stride, padding, dilation, groups, out_padding ) strides, pad_type, pad, dilations, groups, out_padding = _translate_torch_args( node, weight, stride, padding, dilation, groups, out_padding ) kwargs = { "x": x, "weight": weight, "pad_type": pad_type, "name": node.name, } if bias is not None: kwargs["bias"] = bias if pad_type == "custom": kwargs["pad"] = pad if strides is not None: kwargs["strides"] = strides if dilations is not None: kwargs["dilations"] = dilations if groups is not None: kwargs["groups"] = groups if transposed is True: pad_len = 2 * (weight.rank - 2) # Transposed convolution # Handle output_padding using pre-pad or post-crop pre_pad = [0] * pad_len post_crop = [0] * pad_len if out_padding is not None and any(out_padding): output_padding = [0] * pad_len # output padding adds additional padding on one of the side of dimension # i.e. bottom from top-bottom, # right from left-right # back from front-back # Core ML padding structure is similar [top, bottom, left, right] # mapping output_padding to simplify further processing! # # For ConvTranspose2d: [bottom, right] -> [0, b, 0, r] output_padding = [0 if i % 2 == 0 else out_padding[i // 2] for i in range(pad_len)] if sum(pad) == 0 and any(output_padding): raise ValueError( "ConvTranspose configuration of padding=0 and output_padding > 0 not supported!" ) post_crop = pad.copy() pad *= 0 for i in range(0, pad_len): if post_crop[i] >= output_padding[i]: post_crop[i] -= output_padding[i] else: pre_pad[i] = output_padding[i] - post_crop[i] kwargs["pad"] = pre_pad if any(pre_pad): # Constant pad requires pad to be of length 2*input_rank pre_pad = [0] * 2 * (len(x.shape) - 2) + pre_pad x = mb.pad(x=x, pad=pre_pad) kwargs["x"] = x if any(post_crop): del kwargs["name"] conv = mb.conv_transpose(**kwargs) if any(post_crop): # TODO: rdar://65575826 (PyTorch converter: output_padding mapping to slice # instead of crop layer for 1 and 3D ConvTranspose) if len(post_crop) == 2 and conv.rank == 3: # Number of elements to crop from right = post_crop[-1]. # Since slicing supports negative indexing, end_id = -1 * post_crop[-1] conv = mb.slice_by_index( x=conv, begin=[0, 0, post_crop[0]], end=[0, 0, -1 * post_crop[-1]], begin_mask=[True, True, False], end_mask=[True, True, False], name=node.name, ) elif len(post_crop) == 4 and conv.rank == 4: conv = mb.crop( x=conv, crop_height=post_crop[:2], crop_width=post_crop[2:4], name=node.name, ) else: raise ValueError( "output_padding is supported only for ConvTranspose1D or ConvTranspose2D!" 
) else: # Normal convolution conv = mb.conv(**kwargs) context.add(conv) # Convolution with "same, valid" padding @register_torch_op def _convolution_mode(context, node): inputs = _get_inputs(context, node, expected=7) mode = inputs[4].val context.add( mb.conv( x=inputs[0], weight=inputs[1], bias=inputs[2], strides=inputs[3], pad_type=mode, dilations=inputs[5], groups=inputs[6], name=node.name, ) ) @register_torch_op(torch_alias=["_softmax"]) def softmax(context, node): inputs = _get_inputs(context, node) x = inputs[0] axis = inputs[1] res = mb.softmax(x=x, axis=axis, name=node.name) context.add(res) @register_torch_op def flatten(context, node): inputs = _get_inputs(context, node) x = inputs[0] dims = list(x.shape) start_val = inputs[1].val end_val = inputs[2].val start = len(dims) + start_val if start_val < 0 else start_val end = len(dims) + end_val if end_val < 0 else end_val if start > len(dims) or end > len(dims) or start < 0 or end < 0: raise ValueError( "Invalid start and end. (start, end) == ({}, {})".format(start, end_val) ) if start > end: raise ValueError( "Start must be before end. (start, end) == ({}, {})".format(start, end_val) ) x_shape = mb.shape(x=x) shape1 = mb.slice_by_index(x=x_shape, begin=[0], end=[start]) shape2 = mb.slice_by_index(x=x_shape, begin=[end + 1], end=[len(dims)]) flatten_dim = -1 if not any_symbolic(x.shape): flatten_dim = 1 for dim in dims[start: end + 1]: flatten_dim *= dim shape = mb.concat(values=(shape1, [flatten_dim], shape2), axis=0) shape = mb.cast(x=shape, dtype="int32") reshape = mb.reshape(x=x, shape=shape, name=node.name) context.add(reshape) @register_torch_op def _reshape_from_tensor(context, node): inputs = _get_inputs(context, node, expected=2) reshape = mb.reshape(x=inputs[0], shape=inputs[1], name=node.name) context.add(reshape) @register_torch_op def softsign(context, node): inputs = _get_inputs(context, node, expected=1) res = mb.softsign(x=inputs[0], name=node.name) context.add(res) @register_torch_op def relu(context, node): inputs = _get_inputs(context, node, expected=1) res = mb.relu(x=inputs[0], name=node.name) context.add(res) @register_torch_op def prelu(context, node): inputs = _get_inputs(context, node, expected=2) x = inputs[0] alpha = inputs[1] # In the MIL backend, it assumes that the inputs of prelu should have # at least rank 3, i.e. [batch, channel, spatial_dims*]. if x.rank >= 2: alpha = alpha.val alpha = _np.ones((x.shape[1],)) * alpha if x.rank <= 2: axes = [1, 2] if x.rank == 1 else [2] x = mb.expand_dims(x=x, axes=axes) x = mb.prelu(x=x, alpha=alpha) res = mb.squeeze(x=x, axes=axes, name=node.name) else: res = mb.prelu(x=x, alpha=alpha, name=node.name) context.add(res) @register_torch_op def linspace(context, node): inputs = _get_inputs(context, node, min_expected=3) start = inputs[0] end = inputs[1] nums = inputs[2] start = mb.cast(x=start, dtype="fp32") end = mb.cast(x=end, dtype="fp32") if start.can_be_folded_to_const() and end.can_be_folded_to_const() and nums.can_be_folded_to_const(): start_val = start.val end_val = end.val nums_val = nums.val if nums_val < MAX_SIZE_CONSTANT_FOLDING: res = mb.const(val=_np.linspace(start_val, end_val, nums_val), name=node.name) context.add(res) return if nums.val is None: msg = "Dynamic steps input for torch.linspace is not supported. 
Please use torch.arange instead" raise NotImplementedError(msg) else: if nums.val == 1: res = mb.expand_dims(x=start, axes=[0], name=node.name) else: # step = (end - start) / (nums - 1) x = mb.sub(x=end, y=start) y = mb.sub(x=nums, y=1) x = mb.cast(x=x, dtype="fp32") y = mb.cast(x=y, dtype="fp32") step = mb.real_div(x=x, y=y) # Note that the range_1d op excluded the end point, # so we have to add the end back to the resulting array. arange = mb.range_1d(end=end, start=start, step=step) new_end = mb.expand_dims(x=end, axes=[0]) res = mb.concat(values=[arange, new_end], axis=0, name=node.name) context.add(res) @register_torch_op def relu6(context, node): inputs = _get_inputs(context, node, expected=1) res = mb.relu6(x=inputs[0], name=node.name) context.add(res) @register_torch_op def einsum(context, node): if context.frontend == TorchFrontend.TORCHSCRIPT: vars = context[node.inputs[1]] vars = promote_input_dtypes(vars) equation = context[node.inputs[0]].val else: equation = node.inputs[0] if isinstance(equation, str) and equation in context: equation = context[equation].val tensor_names = node.inputs[1] if isinstance(tensor_names, str) and tensor_names in context: vars = context[tensor_names] else: assert isinstance(tensor_names, tuple) vars = [context[tensor_name] for tensor_name in tensor_names] x = build_einsum_mil(vars, equation, node.name) context.add(x) @register_torch_op def eye(context, node): # TODO: rdar://104400568 ([PyTorch] Use MIL ops to construct the eye matrix in order to avoid directly folding the input into a const) inputs = _get_inputs(context, node, expected=[5, 6]) if len(inputs) == 5: eye = _np.eye(inputs[0].val) if len(inputs) == 6: eye = _np.eye(inputs[0].val, inputs[1].val) eye = mb.const(val=eye, name=node.name) context.add(eye) @register_torch_op def elu(context, node): ## Torch port to ATen adds scale and input_scale which is set to 1 inputs = _get_inputs(context, node, expected=4) res = mb.elu(x=inputs[0], alpha=inputs[1], name=node.name) context.add(res) @register_torch_op def leaky_relu(context, node): inputs = _get_inputs(context, node, expected=2) res = mb.leaky_relu(x=inputs[0], alpha=inputs[1], name=node.name) context.add(res) @register_torch_op def rrelu(context, node): inputs = _get_inputs(context, node, expected=5) # Alpha in evaluation mode is just the average between upper and lower. lower_alpha = inputs[1] upper_alpha = inputs[2] alpha = (lower_alpha.val + upper_alpha.val) / 2 res = mb.leaky_relu(x=inputs[0], alpha=alpha, name=node.name) context.add(res) @register_torch_op def softplus(context, node): inputs = _get_inputs(context, node, expected=3) x = inputs[0] beta_ = inputs[1].val C = x.shape[1] alpha_br = _np.repeat(1.0 / beta_, C).astype('float32') beta_br = _np.repeat(beta_, C).astype('float32') res = mb.softplus_parametric(x=x, alpha=alpha_br, beta=beta_br, name=node.name) context.add(res) @register_torch_op def mish(context, node): inputs = _get_inputs(context, node, expected=1) x = inputs[0] softplus = mb.softplus(x=x) tanh = mb.tanh(x=softplus) res = mb.mul(x=x, y=tanh, name=node.name) context.add(res) def _adjust_pad_for_ceil_mode(input_shape, kernel_size, stride_sizes, pad_sizes): """ Given an input tensor and pooling parameters, add the extra input padding needed to replicate ceil_mode. MIL 3D pooling does not support ceil_mode natively, but we can workaround by padding the input appropriately. 
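    Worked example (hand-computed, for illustration): with in_dim=5, kernel_size=2, stride=2 and
    no padding, ceil_mode=True gives out_dim=3 while ceil_mode=False gives 2; the loop below then
    bumps the right padding by 1 so that the no-ceil formula also yields 3.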
PyTorch output size formula for pooling: (reference: https://github.com/pytorch/pytorch/blob/375c30a7177442fb9d6de7516a9ae4031ae324c4/aten/src/ATen/native/Pool.h#L28) When ceil mode is True: out_dim = floor((in_dim + pad_l + pad_r - kernel_size + (stride-1)) / stride) + 1 if (out_dim-1) * stride >= in_dim + pad_l and (pad_l > 0 or pad_r > 0): out_dim = out_dim - 1 When ceil mode is False: out_dim = floor((in_dim + pad_l + pad_r - kernel_size) / stride) + 1 # follow the approach here to calculate padding: # https://github.com/pytorch/pytorch/blob/edf751ca2fededecdd9366874c761431c0f61f01/aten/src/ATen/native/mkldnn/Pooling.cpp#L121 # which keeps increasing the pad_r value until the output size without the ceil mode matches that of the ceil mode """ def _calculate_pool_output_size(in_dim, kernel, stride, pad_l, pad_r, ceil_mode): if ceil_mode: out_dim = _math.floor((in_dim + pad_r + pad_l - kernel + stride - 1) / stride) + 1 if (out_dim - 1) * stride >= in_dim + pad_l and (pad_l > 0 or pad_r > 0): out_dim = out_dim - 1 else: out_dim = _math.floor((in_dim + pad_r + pad_l - kernel) / stride) + 1 return out_dim new_pad = pad_sizes.copy() for idx in range(len(input_shape)): if is_symbolic(input_shape[idx]): logger.warning( "pooling padding adjusted to support ceil_mode=True for a symbolic dimension. " "Output shape of the pool op may be wrong for certain input shapes." ) new_pad[2 * idx + 1] += stride_sizes[idx] - 1 else: out_dim_with_ceil_mode = _calculate_pool_output_size( input_shape[idx], kernel_size[idx], stride_sizes[idx], pad_sizes[2 * idx], pad_sizes[2 * idx + 1], True, ) is_equal = False while not is_equal: out_dim_without_ceil_mode = _calculate_pool_output_size( input_shape[idx], kernel_size[idx], stride_sizes[idx], new_pad[2 * idx], new_pad[2 * idx + 1], False, ) is_equal = True if out_dim_without_ceil_mode < out_dim_with_ceil_mode: new_pad[2 * idx + 1] += 1 is_equal = False return new_pad @register_torch_op( torch_alias=[ "max_pool2d", "max_pool3d", "max_pool1d_with_indices", "max_pool2d_with_indices", "max_pool3d_with_indices", ] ) def max_pool1d(context, node): inputs = _get_inputs(context, node, min_expected=3) x = inputs[0] kernel_sizes = inputs[1] strides = inputs[2] if strides.op.op_type == "const" and (not list(strides.val)): strides = mb.const(val=kernel_sizes.val, name=strides.name) pad_type = "custom" pad = np.array([0] * (kernel_sizes.shape[0] * 2)) if len(inputs) < 4 else _np.repeat(inputs[3].val, 2) dilation = np.array([1] * kernel_sizes.shape[0]) if len(inputs) < 5 else inputs[4].val ceil_mode = False if len(inputs) < 6 else inputs[5].val if _np.any(dilation > 1): # See: rdar://60633736 (Implement dilation for mil op max_pool) raise ValueError("@max_pool does not support dilation > 1") spatial_rank = len(pad) // 2 if spatial_rank > 2 and ceil_mode is True and list(strides.val) != [1] * len(strides.val): # since MIL does not support ceil_mode for 3D pool, # need to adjust padding values if ceil_mode is True # ceil_mode only makes a difference, though, if the strides are not 1 x_spatial_dimensions = x.shape[-spatial_rank:] pad = _adjust_pad_for_ceil_mode(x_spatial_dimensions, kernel_sizes.val, strides.val, pad) pool = mb.max_pool( x=x, kernel_sizes=kernel_sizes, strides=strides, pad_type=pad_type, pad=pad, name=node.name, ceil_mode=ceil_mode if spatial_rank <= 2 else False, ) if re.match(r"max_pool[123]d_with_indices", node.kind): # TODO(rdar://117038432) ([Executorch] Handle/Bind other outputs of `max_pool2d_with_indices` op during lowering) context.add((pool, None),
torch_name=node.name) else: context.add(pool) @register_torch_op def minimum(context, node): inputs = _get_inputs(context, node, expected=2) x, y = promote_input_dtypes(inputs) assert len(node.outputs) == 1 out = mb.minimum(x=x, y=y, name=node.name) context.add(out) @register_torch_op def clamp_min(context, node): inputs = _get_inputs(context, node, expected=2) x, y = inputs[0], inputs[1] assert x.dtype == y.dtype out = mb.maximum(x=x, y=y, name=node.name) context.add(out) @register_torch_op def maximum(context, node): inputs = _get_inputs(context, node, expected=2) x, y = promote_input_dtypes(inputs) assert len(node.outputs) == 1 out = mb.maximum(x=x, y=y, name=node.name) context.add(out) @register_torch_op(torch_alias=["truediv"]) def div(context, node): inputs = _get_inputs(context, node, expected=[2, 3]) x = mb.cast(x=inputs[0], dtype="fp32") y = mb.cast(x=inputs[1], dtype="fp32") if len(inputs) > 2 and inputs[2] is not None: rounding_mode = inputs[2].val if rounding_mode == "floor": # round towards negative infinity # e.g.: # values before floor: [2.6, -3.4, -3.6] # values after floor: [2, -4, -4] res = mb.floor_div(x=x, y=y, name=node.name) elif rounding_mode == "trunc": # round towards 0 # e.g.: # values before trunc: [2.6, -3.4, -3.6] # values after trunc: [2, -3, -3] z = mb.real_div(x=x, y=y) s = mb.sign(x=z) all_positive = mb.mul(x=z, y=s) all_positive_floor = mb.floor(x=all_positive) res = mb.mul(x=all_positive_floor, y=s, name=node.name) else: raise NotImplementedError( 'rounding mode "{}" not supported in the "div" op'.format(rounding_mode) ) else: res = mb.real_div(x=x, y=y, name=node.name) context.add(res) @register_torch_op(torch_alias=["floordiv"]) def floor_divide(context, node): inputs = _get_inputs(context, node, expected=2) inputs = promote_input_dtypes(inputs) div_res = mb.floor_div(x=inputs[0], y=inputs[1]) # Pytorch's floor_divide always returns fp32, even if the inputs are int res = mb.cast(x=div_res, dtype='fp32', name=node.name) context.add(res) @register_torch_op def true_divide(context, node): inputs = _get_inputs(context, node, expected=2) res = mb.real_div(x=inputs[0], y=inputs[1], name=node.name) context.add(res) @register_torch_op def mul(context, node): inputs = _get_inputs(context, node, expected=2) x, y = promote_input_dtypes(inputs) if types.is_bool(x.dtype) and types.is_bool(y.dtype): res = mb.logical_and(x=x, y=y, name=node.name) else: res = mb.mul(x=x, y=y, name=node.name) context.add(res) @register_torch_op def pow(context, node): inputs = _get_inputs(context, node, expected=2) x, y = promote_input_dtypes(inputs) res = mb.pow(x=x, y=y, name=node.name) context.add(res) @register_torch_op(torch_alias=["rsub"]) def sub(context, node): inputs = _get_inputs(context, node, expected=[2, 3]) assert len(node.outputs) == 1 if node.kind == "rsub": # rsub reverses the order of arguments y = inputs[0] x = inputs[1] else: x = inputs[0] y = inputs[1] if len(inputs) > 2: alpha = inputs[2].val # TODO (sberardi): 3rd param to aten::sub is a scale factor, need to handle that. 
# out=input-alpha x other # rdar://60175736 if alpha != 1: raise ValueError("SUB does not support scale factor param") x, y = promote_input_dtypes([x, y]) res = mb.sub(x=x, y=y, name=node.name) context.add(res) @register_torch_op( torch_alias=[ "mean.dim", "sum", "logsumexp", ] ) def mean(context, node): def _parse_positional_args(context, node) -> Tuple[Var]: inputs = _get_inputs(context, node, min_expected=1) nargs = len(inputs) x = inputs[0] dim = inputs[1] if nargs > 1 else None keepdim = inputs[2] if nargs > 2 else False return x, dim, keepdim x, dim, keepdim = _parse_positional_args(context, node) # torch.export may have kwargs if context.frontend == TorchFrontend.TORCHEXPORT: if keepdim == False: keepdim = _get_kwinputs(context, node, "keepdim", default=[keepdim])[0] if types.is_bool(x.dtype): # TODO: In the future when MIL op supports bool, we need to use curr_opset_version to decide # if we want to cast or not. x = mb.cast(x=x, dtype="fp32") kwargs = {"x": x, "name": node.name} # torch dim means Core ML axes if dim is not None: # Core ML axes needs to be a list, but if only one dim was specified in torch, # it will be constructed as an int, so we construct a new constant as a list if not isinstance(dim.val, _np.ndarray): axes = mb.const(val=[dim.val], name=dim.name + "_list") else: axes = dim.val kwargs["axes"] = axes # torch keepdim means Core ML keep_dims if keepdim != False: kwargs["keep_dims"] = keepdim if node.kind == "sum": res = mb.reduce_sum(**kwargs) elif node.kind == "logsumexp": res = mb.reduce_log_sum_exp(**kwargs) else: res = mb.reduce_mean(**kwargs) context.add(res) @register_torch_op(torch_alias=["squeeze.dim", "squeeze_copy.dim", "squeeze_copy.dims"]) def squeeze(context, node): inputs = _get_inputs(context, node) if len(inputs) == 1: res = mb.squeeze(x=inputs[0], name=node.name) elif len(inputs) == 2: dims = inputs[1].val try: dims = (int(dims),) except: pass res = mb.squeeze(x=inputs[0], axes=dims, name=node.name) context.add(res) @register_torch_op(torch_alias=["unsqueeze_copy"]) def unsqueeze(context, node): inputs = _get_inputs(context, node, expected=2) unsqueeze = mb.expand_dims(x=inputs[0], axes=[inputs[1].val], name=node.name) context.add(unsqueeze) @register_torch_op(torch_alias=["sym_size"]) def size(context, node): inputs = _get_inputs(context, node, expected=[1, 2]) x = inputs[0] # Get the shape of the tensor. if types.is_complex(x.dtype): size_node = mb.complex_shape(x=inputs[0], name=node.name + "_shape") else: size_node = mb.shape(x=inputs[0], name=node.name + "_shape") # Get the size of the tensor along the input dimension. if len(node.inputs) == 2: dim = inputs[1].val size_node = _list_select(size_node, dim) context.add(size_node, node.name) @register_torch_op def _shape_as_tensor(context, node): inputs = _get_inputs(context, node, expected=1) # Get the shape of the tensor. 
shape_node = mb.shape(x=inputs[0], name=node.name) context.add(shape_node, node.name) @register_torch_op(torch_alias=["view_copy", "_unsafe_view", "reshape"]) def view(context, node): inputs = _get_inputs(context, node, expected=2) x = inputs[0] shape = inputs[1] if isinstance(shape, Var) and np.prod(shape.shape) == 0: # Reshape to empty shape (works only for scalar) is a no op assert ( np.prod(x.shape) <= 1 ), "Reshape to empty shape works only for scalar and single-element tensor" context.add(mb.identity(x=x, name=node.name)) return if isinstance(shape, ListVar): length = mb.list_length(ls=shape) indices = mb.range_1d(start=0, end=length, step=1) shape = mb.list_gather(ls=shape, indices=indices) if isinstance(shape, list) and all( [isinstance(dim, Var) and len(dim.shape) == 0 for dim in shape] ): shape = mb.concat(values=shape, axis=0) shape = mb.cast(x=shape, dtype="int32") if types.is_complex(x.dtype): real, imag = (mb.reshape(x=x, shape=shape, name=node.name) for x in (mb.complex_real(data=x), mb.complex_imag(data=x))) view = mb.complex(real_data=real, imag_data=imag, name=node.name) else: view = mb.reshape(x=x, shape=shape, name=node.name) context.add(view) @register_torch_op(torch_alias=["constant_pad_nd"]) def pad(context, node): def _parse_positional_args(context, node) -> Tuple[Var]: inputs = _get_inputs( context, node, expected={TorchFrontend.TORCHSCRIPT: [3, 4]}, min_expected={TorchFrontend.TORCHEXPORT: 2, TorchFrontend.EXECUTORCH: 2}, ) nargs = len(inputs) if context.frontend == TorchFrontend.TORCHSCRIPT: assert (node.kind == "pad") == (nargs == 4) assert (node.kind == "constant_pad_nd") == (nargs == 3) x = inputs[0] pad = inputs[1] if pad.val is not None: pad = pad.val.reshape((-1, 2))[::-1].reshape(-1).tolist() missing_dims = x.rank - (len(pad) // 2) pad = [0, 0] * missing_dims + pad if node.kind == "pad": mode = "constant" if nargs > 2: if isinstance(inputs[2], str): mode = inputs[2] else: if isinstance(inputs[2], Var) and inputs[2].val is not None: mode = inputs[2].val else: raise ValueError( "if pad mode is specified, then it must either be a string, " "or a constant pymil variable" ) assert mode in ("circular", "constant", "reflect", "replicate") scalar_val = inputs[3] if nargs > 3 else 0.0 else: mode = "constant" scalar_val = inputs[2] if nargs > 2 else 0.0 if scalar_val is None: scalar_val = 0.0 elif isinstance(scalar_val, Var): assert scalar_val.val is not None scalar_val = float(scalar_val.val) return x, pad, mode, scalar_val x, pad, mode, scalar_val = _parse_positional_args(context, node) if types.is_complex(x.dtype): real, imag = (mb.pad(x=x, pad=pad, mode=mode, constant_val=scalar_val, name=node.name) for x in (mb.complex_real(data=x), mb.complex_imag(data=x))) res = mb.complex(real_data=real, imag_data=imag, name=node.name) else: x, scalar_val = promote_input_dtypes([x, scalar_val]) res = mb.pad(x=x, pad=pad, mode=mode, constant_val=scalar_val, name=node.name) context.add(res) @register_torch_op def adaptive_avg_pool1d(context, node): _adaptive_pool1d(context, node, mb.reduce_mean) @register_torch_op def adaptive_avg_pool2d(context, node): _adaptive_pool2d(context, node, mb.avg_pool, mb.reduce_mean) def _adaptive_pool1d(context, node, reduce_op): inputs = _get_inputs(context, node, expected=2) x = inputs[0] assert len(inputs[1].val) == 1 out_length = inputs[1].val[0] if len(x.shape) == 3: # 3D input begin_prefix = [0, 0] end_prefix = [x.shape[0], x.shape[1]] out_shape = [x.shape[0], x.shape[1], out_length] else: # 2D input assert len(x.shape) == 2 begin_prefix = 
[0] end_prefix = [x.shape[0]] out_shape = [x.shape[0], out_length] pool_results = [] for start, end in _get_kernel_indexes_1d_for_adaptive_pooling(x.shape[-1], out_length): cur_kernel = mb.slice_by_index( x=x, begin=begin_prefix + [start], end=end_prefix+[end], ) cur_result = reduce_op( x=cur_kernel, axes=[-1], keep_dims=True ) pool_results.append(cur_result) context.add( mb.reshape( x=mb.concat(values=pool_results, axis=-1), shape=out_shape, name=node.name, ) ) @register_torch_op def adaptive_max_pool1d(context, node): _adaptive_pool1d(context, node, mb.reduce_max) @register_torch_op def adaptive_max_pool2d(context, node): _adaptive_pool2d(context, node, mb.max_pool, mb.reduce_max) def _get_kernel_indexes_1d_for_adaptive_pooling( in_dimension: int, out_dimension: int) -> List[Tuple[int, int]]: results = [] for i in range(out_dimension): start = _math.floor(i * in_dimension / out_dimension) end = _math.ceil((i + 1) * in_dimension / out_dimension) results.append((start, end)) return results def _adaptive_pool2d_non_fixed_kernel_size_and_stride(x, output_shape, name, reduce_op): ''' If the input dimension is not evenly divisible by the output dimension, then the stride and kernel size used by PyTorch is not fixed. This is true for both the height and width dimension. ''' pool_results = [] for s2, e2 in _get_kernel_indexes_1d_for_adaptive_pooling(x.shape[2], output_shape[0]): for s3, e3 in _get_kernel_indexes_1d_for_adaptive_pooling(x.shape[3], output_shape[1]): cur_kernel = mb.slice_by_index( x=x, begin=[0, 0, s2, s3], end=[x.shape[0], x.shape[1], e2, e3], ) cur_result = reduce_op( x=cur_kernel, axes=[-2, -1], keep_dims=True ) pool_results.append(cur_result) return mb.reshape( x=mb.concat(values=pool_results, axis=-1), shape=[x.shape[0], x.shape[1], output_shape[0], output_shape[1]], name=name, ) def _adaptive_pool2d(context, node, pool_op, reduce_op): # Get input tensor and output shape inputs = _get_inputs(context, node, expected=2) x = inputs[0] output_shape = inputs[1].val assert isinstance(output_shape, _np.ndarray) and len(output_shape) == 2 output_shape = tuple(output_shape) if output_shape == (1, 1): # Represent (1,1) output size with global reduce op result = reduce_op(x=x, axes=[-2, -1], keep_dims=True, name=node.name) elif x.shape is None or any_symbolic(x.shape): raise ValueError( "Adaptive pooling is only supported when input tensor size is known or output size == (1,1). 
" "Received: input size == {}, output size == {}".format( x.shape_str(), output_shape, ) ) elif x.shape[-2] % output_shape[-2] == 0 and x.shape[-1] % output_shape[-1] == 0: # Stride and and kernel size is fixed strides = [ind // outd for ind, outd in zip(x.shape[-2:], output_shape)] kernel_sizes = [ ind - s * (outd - 1) for ind, outd, s in zip(x.shape[-2:], output_shape, strides) ] result = pool_op( x=x, kernel_sizes=kernel_sizes, strides=strides, pad_type="valid", name=node.name, ) else: result = _adaptive_pool2d_non_fixed_kernel_size_and_stride( x, output_shape, node.name, reduce_op ) context.add(result) @register_torch_op(torch_alias=["_native_batch_norm_legit_no_training"]) def batch_norm(context, node): inputs = _get_inputs(context, node, expected=[7, 9]) _input = inputs[0] weight = inputs[1] bias = inputs[2] running_mean = inputs[3] running_var = inputs[4] if len(inputs) == 9: # inputs skipped: # float momentum (6) # bool cudnn_enabled (8) training = inputs[5].val eps = inputs[7] # no: training, cudnn_enabled elif len(inputs) == 7: # inputs skipped: # float momentum (5) eps = inputs[6] training = False else: raise ValueError( f"BatchNorm: got {len(inputs)} inputs, expected 7 or 9" ) input_rank = _input.rank if input_rank < 2 or input_rank > 5: raise ValueError( "BatchNorm: Encountered invalid input rank during translation in torch frontend." ) # If training = True, the mean and variance of the current batch of data are used to normalize the input data. # If training = False, data statistics running_mean and running_var are used instead. # Note that, even in the evaluation mode (after calling model.eval()), the training parameter can still be true # and it just refers to a different computation as mentioned above. # helper functions for different type of batch norm def _add_batch_norm_dynamic(): x = _input if training or (running_mean is None) or (running_var is None): axes = [axis for axis in range(x.rank) if axis != 1] mean = mb.reduce_mean(x=x, axes=axes, keep_dims=True) num = mb.sub(x=x, y=mean) square = mb.mul(x=num, y=num) variance = mb.reduce_mean(x=square, axes=axes, keep_dims=True) shape = mb.shape(x=variance) else: shape = [1] * x.rank shape[1] = -1 if any_symbolic(running_mean.shape) else running_mean.shape[0] mean = mb.reshape(x=running_mean, shape=shape) num = mb.sub(x=x, y=mean) variance = mb.reshape(x=running_var, shape=shape) variance_add_epsilon = mb.add(x=variance, y=eps) sqrt = mb.sqrt(x=variance_add_epsilon) name = node.name if weight is None and bias is None else node.name + "_div" x = mb.real_div(x=num, y=sqrt, name=name) if weight is not None: weight_reshape = mb.reshape(x=weight, shape=shape) name = node.name if bias is None else node.name + "_mul" x = mb.mul(x=x, y=weight_reshape, name=name) if bias is not None: bias_reshape = mb.reshape(x=bias, shape=shape) x = mb.add(x=x, y=bias_reshape, name=node.name) return x def _add_batch_norm_1d(): # first expand the 3d tensor to 4d, and call the standard mb.batch_norm x = mb.expand_dims(x=_input, axes=[-1], name=node.name + "_rank2_expansion") bn = mb.batch_norm( x=x, mean=running_mean, variance=running_var, gamma=weight, beta=bias, epsilon=eps, name=node.name + "_batch_norm_1d", ) bn = mb.squeeze(x=bn, name=node.name, axes=[-1]) return bn def _add_batch_norm(): bn = mb.batch_norm( x=_input, mean=running_mean, variance=running_var, gamma=weight, beta=bias, epsilon=eps, name=node.name, ) return bn is_batch_norm_1d_rank_2 = input_rank == 2 if training or running_mean.val is None or running_var.val is None or weight is 
None or bias is None: bn = _add_batch_norm_dynamic() elif is_batch_norm_1d_rank_2: bn = _add_batch_norm_1d() else: bn = _add_batch_norm() if node.kind == "_native_batch_norm_legit_no_training": # TODO(rdar://117038279) ([Executorch] Handle/Bind other outputs of `_native_batch_norm_legit_no_training` op during lowering) bn = (bn, None, None) context.add(bn, torch_name=node.name) @register_torch_op def instance_norm(context, node): inputs = _get_inputs(context, node, expected=9) x = inputs[0] weight = inputs[1] bias = inputs[2] eps = inputs[7] x = mb.instance_norm( x=x, gamma=weight, beta=bias, epsilon=eps, name=node.name, ) context.add(x) @register_torch_op def group_norm(context, node): def _parse_positional_args(context, node) -> Tuple[Var]: inputs = _get_inputs( context, node, expected={TorchFrontend.TORCHSCRIPT: 6}, min_expected={TorchFrontend.TORCHEXPORT: 2, TorchFrontend.EXECUTORCH: 2}, ) nargs = len(inputs) x = inputs[0] num_groups = inputs[1].val weight = inputs[2] if nargs > 2 else None bias = inputs[3] if nargs > 3 else None eps = inputs[4].val if nargs > 4 else 1e-5 return x, num_groups, weight, bias, eps def _parse_keyword_args(context, node, weight, bias) -> Tuple[Var]: # Only torch.export may have kwargs if context.frontend != TorchFrontend.TORCHEXPORT: return weight, bias weight = _get_kwinputs(context, node, "weight", default=[weight])[0] bias = _get_kwinputs(context, node, "bias", default=[bias])[0] return weight, bias x, num_groups, weight, bias, eps = _parse_positional_args(context, node) weight, bias = _parse_keyword_args(context, node, weight, bias) n,c = x.shape[0],x.shape[1] # at minimum (N, C) required num_groups = builtins.min(num_groups,c) new_shape = [n, num_groups, c//num_groups] # optimization for non-symbolic shapes. This gets rid of 3 MIL ops that are required for dynamic shapes if not any_symbolic(x.shape[2:]): new_shape += [*x.shape[2:]] # adds remaining dims input_shape = [*x.shape] # n, c, * else: input_shape = mb.shape(x=x) input_shape_sliced = mb.slice_by_size(x=input_shape, begin=[2], size=[-1]) # x_shape[2:] new_shape = mb.concat(values=[new_shape, input_shape_sliced], axis=0) num_extra_axes = len(x.shape[2:]) axes_ = [int(i) for i in range(2, 2 + num_extra_axes + 1)] weight_shape, bias_shape = [1,c], [1,c] weight_shape += [1 for _ in range(num_extra_axes)] bias_shape += [1 for _ in range(num_extra_axes)] x = mb.reshape(x=x, shape=new_shape) mean = mb.reduce_mean(x=x, axes=axes_, keep_dims=True) var = _std(x, axes_, True, False, eps) x = mb.sub(x=x, y=mean) x = mb.real_div(x=x, y=var) x = mb.reshape(x=x, shape=input_shape) if weight is not None: weight = mb.reshape(x=weight, shape=weight_shape) x = mb.mul(x=x,y=weight) if bias is not None: bias = mb.reshape(x=bias, shape=bias_shape) x = mb.add(x=x, y=bias) context.add(x,node.name) @register_torch_op def embedding(context, node): inputs = _get_inputs(context, node) _input = inputs[0] indices = inputs[1] padding_idx = -1 scale_grad_by_freq = False sparse = False if len(inputs) >= 3: padding_idx = inputs[2].val if len(inputs) >= 4: scale_grad_by_freq = inputs[3].val if len(inputs) >= 5: sparse = inputs[4].val if padding_idx != -1 or scale_grad_by_freq or sparse: logger.warning( "Core ML embedding (gather) layer does not support any " "inputs besides the weights and indices. Those given " "will be ignored."
) indices = mb.cast(x=indices, dtype="int32") # Changing the axis from 0 is not an option in torch, so we don't expose it gather = mb.gather(x=_input, indices=indices, name=node.name) context.add(gather) @register_torch_op def hardtanh(context, node): inputs = _get_inputs(context, node, expected=3) _input = inputs[0] min_val = inputs[1].val max_val = inputs[2].val res = mb.clip(x=_input, alpha=min_val, beta=max_val, name=node.name) context.add(res) @register_torch_op(torch_alias=["concat"]) def cat(context, node): def is_tensor_empty(var: Var) -> bool: return np.any([size == 0 for size in var.shape]) def _parse_positional_args(context, node) -> Tuple[Var]: inputs = _get_inputs(context, node, min_expected=1) nargs = len(inputs) xs = inputs[0] # PyTorch can have empty tensor, which is then ignored # However, CoreML does not allow such empty tensor, so remove them now if np.any([is_tensor_empty(x) for x in xs]): xs = [x for x in xs if not is_tensor_empty(x)] dim = inputs[1] if nargs > 1 else 0 return xs, dim def _parse_keyword_args(context, node, dim) -> Var: # Only torch.export may have kwargs if context.frontend != TorchFrontend.TORCHEXPORT: return dim dim = _get_kwinputs(context, node, "dim", default=[dim])[0] return dim xs, dim = _parse_positional_args(context, node) dim = _parse_keyword_args(context, node, dim) concat = mb.concat(values=promote_input_dtypes(xs), axis=dim, name=node.name) context.add(concat) @register_torch_op def stack(context, node): def _parse_positional_args(context, node) -> Tuple[Var]: inputs = _get_inputs(context, node, min_expected=1) nargs = len(inputs) tensors = inputs[0] dim = inputs[1] if nargs > 1 else 0 return tensors, dim tensors, dim = _parse_positional_args(context, node) # torch.export may have kwargs if context.frontend == TorchFrontend.TORCHEXPORT: if dim == 0: dim = _get_kwinputs(context, node, "dim", default=[dim])[0] if isinstance(dim, Var): dim = dim.val if len(tensors) == 1: res = mb.expand_dims(x=tensors[0], axes=[dim], name=node.name) else: res = mb.stack(values=tensors, axis=dim, name=node.name) context.add(res) @register_torch_op def tile(context, node): x, dims = _get_inputs(context, node, expected=2) # The torch.tile only supports tuple of ints for "dims", not Tensor. So it will not be dynamic. if dims is None or dims.val is None: raise ValueError("The `dims` input for torch.tile must be static (tuple of ints).") dims_num = dims.shape[0] if dims_num < x.rank: # When the number of elements in dims is smaller than rank of x, ones are prepended. prepend_ones = np.array([1] * (x.rank - dims_num)) dims = mb.concat(values=(prepend_ones, dims), axis=0) res = mb.tile(x=x, reps=dims, name=node.name) context.add(res) @register_torch_op def item(context, node): inputs = _get_inputs(context, node, expected=1) if inputs[0].shape == (): # MIL ops that reduce already output a scalar, so no need to do # anything. res = inputs[0] elif _np.all([d == 1 for d in inputs[0].shape]): # Item only makes sense when called on a length 1 tensor. We use # reduce_max as a workaround for not having a way to extract a scalar # from a symbolic tensor. res = mb.reduce_max(x=inputs[0], name=node.name) else: raise ValueError("expected input to be a scalar or a length 1 tensor") context.add(res, node.name) def _cast(context, node, dtype, dtype_name): inputs = _get_inputs(context, node, expected=1) x = inputs[0] # Input must either be a scalar or a (1 x 1 x ... 
x 1) tensor if not (len(x.shape) == 0 or _np.all([d == 1 for d in x.shape])): raise ValueError("input to cast must be either a scalar or a length 1 tensor") if x.can_be_folded_to_const(): # If x is a compile-time constant, directly cast it to @dtype if it's # not one already. if not isinstance(x.val, dtype): res = mb.const(val=dtype(x.val), name=node.name) else: res = x elif len(x.shape) > 0: x = mb.squeeze(x=x, name=node.name + "_item") res = mb.cast(x=x, dtype=dtype_name, name=node.name) else: res = mb.cast(x=x, dtype=dtype_name, name=node.name) context.add(res, node.name) @register_torch_op(torch_alias=["bool"]) def _bool(context, node): _cast(context, node, bool, "bool") @register_torch_op(torch_alias=["int"]) def _int(context, node): _cast(context, node, int, "int32") @register_torch_op(torch_alias=["native_layer_norm"]) def layer_norm(context, node): def _parse_positional_args(context, node) -> Tuple[Var]: inputs = _get_inputs( context, node, expected={TorchFrontend.TORCHSCRIPT: [5, 6]}, min_expected={TorchFrontend.TORCHEXPORT: 2, TorchFrontend.EXECUTORCH: 2}, ) nargs = len(inputs) x, normalized_shape = inputs[:2] weight = inputs[2] if nargs > 2 else None bias = inputs[3] if nargs > 3 else None eps = inputs[4] if nargs > 4 else None return x, normalized_shape, weight, bias, eps x, normalized_shape, weight, bias, eps = _parse_positional_args(context, node) layer_norm = mb.layer_norm( x=x, axes=list(range(-len(normalized_shape.val), 0)), gamma=weight, beta=bias, epsilon=eps, name=node.name, ) if node.kind == "native_layer_norm": # TODO(rdar://117038370) ([Executorch] Handle/Bind other outputs of `native_layer_norm` op during lowering) context.add((layer_norm, None, None), torch_name=node.name) else: context.add(layer_norm) @register_torch_op def numtotensor(context, node): inputs = _get_inputs(context, node, expected=1) x = inputs[0] if x.shape != (): raise ValueError( "numtotensor expected scalar input, got tensor with shape {}".format( x.shape ) ) if x.can_be_folded_to_const(): res = mb.const(val=[x.val], name=node.name) context.add(res) else: context.add(x, node.name) def _ifzo_to_ifoz(weights, name): """ i, f, z, o -> i, f, o, z where weights_split[0] == i, etc. Used to transform lstm weights from pytorch to Core ML format """ split_size = weights.shape[0] // 4 weights_split = mb.split(x=weights, split_sizes=_np.array([split_size] * 4), axis=0) return mb.concat( values=[weights_split[0], weights_split[1], weights_split[3], weights_split[2]], axis=0, ) def _pytorch_hidden_to_coreml_milops(x, name): """ Used to transform lstm state values (hn, cn) from pytorch to Core ML format. """ split_size = x.shape[0] // 2 x_split = mb.split(x=x, split_sizes=_np.array([split_size] * 2), axis=0) x_concat = mb.concat( values=[x_split[0], x_split[1]], axis=2, ) # (4.) See docstring to @lstm return mb.squeeze(x=x_concat, axes=_np.array([0]), name=name) def _add_gru_layer(_input, h0, wi, wh, bi, bh, h_list_name, h_name): """ Add a single GRU layer. Please note that the Core ML GRU has different definition from Torch, so we cannot use mb.gru, and need to implement it with while loop. 
To be more specific, in Core ML: o_t = activation(W_{io} x_t + r_t * W_{ho} h_(t−1) + b_{o}) while torch has o_t = activation(W_{io} x_t + b_{io} + r_t * (W_{ho} h_(t−1) + b_{ho})) Inputs: _input : (seq_len, batch_size, input_dim) h0 : (1, batch_size, hidden_dim) wi : (3*hidden_dim, input_dim) for the first layer, else (3*hidden_dim, hidden_dim) wh : (3*hidden_dim, hidden_dim) bi : (3*hidden_dim) bh : (3*hidden_dim) Return: h_list : the list contains all hidden states for each time step with shape (seq_len, batch_size, hidden_dim) h : the last hidden state, with shape (1, batch_size, hidden_dim """ # split the weights and bias w_ir, w_iz, w_in = _np.split(wi, 3) w_hr, w_hz, w_hn = _np.split(wh, 3) b_ir, b_iz, b_in = _np.split(bi, 3) b_hr, b_hz, b_hn = _np.split(bh, 3) # allocate hlist # hlist : (seq_len, batch_size, hidden_dim) x_shape = mb.shape(x=_input) seq_len = mb.slice_by_index(x=x_shape, begin=[0], end=[1]) h_shape = mb.shape(x=h0) h_shape = mb.slice_by_index(x=h_shape, begin=[1], end=[3]) h_list_shape = mb.concat(values=[seq_len, h_shape], axis=0) h_list = mb.fill(shape=h_list_shape) # concate h0 to h_list # h_list: (seq_len + 1, batch_size, hidden_dim) h_list = mb.concat(values=[h0, h_list], axis=0) def cond(i, h_list): return mb.less(x=i, y=seq_len) def body(i, h_list): # slice for the x and state for time step i # the resulting shape: # xt : (batch_size, input_dim) # h_prev : (batch_size, hidden_dim) xt = mb.gather(x=_input, indices=i, axis=0) h_prev = mb.gather(x=h_list, indices=i, axis=0) xt = mb.squeeze(x=xt, axes=[0]) h_prev = mb.squeeze(x=h_prev, axes=[0]) # rt = sigmoid(wir * xt + whr * h_prev + bir + bhr) # rt : (batch_size, hidden_dim) rt_1 = mb.linear(x=xt, weight=w_ir, bias=b_ir) rt_2 = mb.linear(x=h_prev, weight=w_hr, bias=b_hr) rt = mb.add(x=rt_1, y=rt_2) rt = mb.sigmoid(x=rt) # zt = sigmoid(wiz * xt + whz * h_prev + biz + bhz) # zt : (batch_size, hidden_dim) zt_1 = mb.linear(x=xt, weight=w_iz, bias=b_iz) zt_2 = mb.linear(x=h_prev, weight=w_hz, bias=b_hz) zt = mb.add(x=zt_1, y=zt_2) zt = mb.sigmoid(x=zt) # nt = tanh(win * xt + bin + rt(whn * h_prev + bhn)) # nt : (batch_size, hidden_dim) nt_1 = mb.linear(x=xt, weight=w_in, bias=b_in) nt_2 = mb.linear(x=h_prev, weight=w_hn, bias=b_hn) nt_2 = mb.mul(x=rt, y=nt_2) nt = mb.add(x=nt_1, y=nt_2) nt = mb.tanh(x=nt) # h = (1-zt) * nt + zt* h_prev # h : (batch_size, hidden_dim) h_1 = mb.sub(x=1., y=zt) h_1 = mb.mul(x=h_1, y=nt) h_2 = mb.mul(x=zt, y=h_prev) h = mb.add(x=h_1, y=h_2) # update counter counter = mb.add(x=i, y=1) # update h and h_list h = mb.expand_dims(x=h, axes=[0]) h_list = mb.scatter(data=h_list, indices=counter, updates=h) return ( counter, h_list, ) _, h_list = mb.while_loop( _cond=cond, _body=body, loop_vars=([0], h_list), ) # slice h0 out of h_list h_list = mb.slice_by_index( x=h_list, begin=[1, 0, 0], end=[0, 0, 0], begin_mask=[False, True, True], end_mask=[True, True, True], name=h_list_name, ) # get the last state of h_list if seq_len.val is None or seq_len.val > 1: h = mb.slice_by_index( x=h_list, begin=[-1, 0, 0], end=[-2, 0, 0], begin_mask=[False, True, True], end_mask=[False, True, True], stride=[-1, 1, 1], name=h_name, ) else: h = h_list return h_list, h @register_torch_op def gru(context, node): inputs = _get_inputs(context, node, expected=9) _input = inputs[0] h0 = inputs[1] weights_list = inputs[2] has_bias = inputs[3].val num_layers = inputs[4].val dropout = inputs[5] bidirectional = inputs[7].val batch_first = inputs[8].val # For each layer of GRU, the layout of the weights list is [Wi, Wh, bi, 
bh] with has_bias == True, # and is [Wi, Wh] with bias == False. # If bidirectional == True, the list is double up, corresponding to forward and backward direction. expected_num_weights = 2 * num_layers * (int(has_bias) + 1) * (int(bidirectional) + 1) if len(weights_list) != expected_num_weights: raise ValueError( "Incorrect weights shape for gru layer: Expected: {}. Received {}".format( expected_num_weights, len(weights_list) ) ) # Transpose the input data to (seq_len, batch_size, input_dim) if batch_first == True if batch_first: _input = mb.transpose(x=_input, perm=[1, 0, 2]) # iterate through all the layers x = _input state_out_list = [] def _get_weights_and_bias(weights_list, index, num_layers, has_bias, bidirectional, mode): num_weights_per_layer = len(weights_list) // num_layers weights = weights_list[ num_weights_per_layer * index : num_weights_per_layer * (index + 1) ] if bidirectional: weights_f, weights_r = ( weights[: num_weights_per_layer // 2], weights[num_weights_per_layer // 2 :], ) assert len(weights_f) == len(weights_r) else: weights_f, weights_r = weights, [] if mode == "forward": weights = weights_f elif mode == "reverse": weights = weights_r wi, wh = weights[0].val, weights[1].val if has_bias: bi, bh = weights[2].val, weights[3].val else: hidden_dim = wh.shape[1] bi, bh = _np.zeros(3 * hidden_dim), _np.zeros(3 * hidden_dim) return wi, wh, bi, bh def _get_initial_state(h0, i, bidirectional, mode): if mode == "forward": return mb.slice_by_index( x=h0, begin=[(1 + int(bidirectional)) * i, 0, 0], end=[(1 + int(bidirectional)) * i + 1, 0, 0], begin_mask=[False, True, True], end_mask=[False, True, True], ) if mode == "reverse": assert bidirectional return mb.slice_by_index( x=h0, begin=[2 * i + 1, 0, 0], end=[2 * (i + 1), 0, 0], begin_mask=[False, True, True], end_mask=[False, True, True], ) seq_output_name = node.outputs[0] # output sequence name state_output_name = node.outputs[1] # output state name for i in range(num_layers): # get layer names x_name = seq_output_name + "_layer_" + str(i) if i < num_layers - 1 else seq_output_name h_name = state_output_name + '_layer_' + str(i) if num_layers > 0 else state_output_name if batch_first: x_name += "_tmp" if bidirectional: x_f_name = x_name + '_forward' h_f_name = h_name + '_forward' x_r_name = x_name + '_backward' h_r_name = h_name + '_backward' else: x_f_name = x_name h_f_name = h_name # forward direction x_f = x wi_f, wh_f, bi_f, bh_f = _get_weights_and_bias( weights_list, i, num_layers, has_bias, bidirectional, "forward" ) initial_h_f = _get_initial_state(h0, i, bidirectional, "forward") x_f, h_f = _add_gru_layer(x_f, initial_h_f, wi_f, wh_f, bi_f, bh_f, x_f_name, h_f_name) # reverse direction if bidirectional: x_r = mb.reverse(x=x, axes=[0]) wi_r, wh_r, bi_r, bh_r = _get_weights_and_bias( weights_list, i, num_layers, has_bias, bidirectional, "reverse" ) initial_h_r = _get_initial_state(h0, i, bidirectional, "reverse") x_r, h_r = _add_gru_layer( x_r, initial_h_r, wi_r, wh_r, bi_r, bh_r, x_r_name + "_reverse", h_r_name, ) x_r = mb.reverse(x=x_r, axes=[0], name=x_r_name) # concate output from forward and reverse direction x = mb.concat(values=[x_f, x_r], axis=2, name=x_name) h = mb.concat(values=[h_f, h_r], axis=0, name=h_name) else: x = x_f h = h_f state_out_list.append(h) # rnn output if batch_first: x = mb.transpose(x=x, perm=[1, 0, 2], name=seq_output_name) context.add(x, seq_output_name) # state output if len(state_out_list) > 1: h = mb.concat(values=state_out_list, axis=0, name=state_output_name) context.add(h, 
state_output_name) def _add_simple_rnn(context, node, activation): inputs = _get_inputs(context, node, expected=9) ''' Batch size: B Sequence length: S Input dimension: C Hidden dimension: H (1) _input : (B, S, C) if batch_first == True, else (S, B, C) (2) h0: (num_layers, B, H) ''' _input = inputs[0] h0 = inputs[1] weights_list = inputs[2] has_bias = inputs[3].val num_layers = inputs[4].val dropout = inputs[5] bidirectional = inputs[7].val batch_first = inputs[8].val # We only support uni-directional simple RNN now if bidirectional: raise NotImplementedError("Bidirectional simple RNN not supported.") expected_num_weights = 2 * num_layers * (int(has_bias) + 1) if len(weights_list) != expected_num_weights: raise ValueError( "Incorrect weights shape for lstm layer: Expected: {}. Received {}".format( expected_num_weights, len(weights_list) ) ) # Transpose the input data to (S, B, C) if batch_first == True if batch_first: _input = mb.transpose(x=_input, perm=[1, 0, 2]) state_out_list = [] out = _input for i in range(num_layers): if has_bias: weight_ih = weights_list[4 * i] weight_hh = weights_list[4 * i + 1] bias = mb.add(x=weights_list[4 * i + 2], y=weights_list[4 * i + 3]) else: weight_ih = weights_list[2 * i] weight_hh = weights_list[2 * i + 1] bias = None # get the initial state initial_h = mb.slice_by_index( x=h0, begin=[i, 0, 0], end=[0, 0, 0], stride=[1, 1, 1], begin_mask=[False, True, True], end_mask=[False, True, True], squeeze_mask=[True, False, False], ) # get the RNN output for each unit out, state = mb.rnn( x=out, initial_h=initial_h, weight_ih=weight_ih, weight_hh=weight_hh, bias=bias, output_sequence=True, activation=activation, ) # append state to lists which will stack later state_out_list.append(state) # rnn output output_name = node.outputs[0] if batch_first: out = mb.transpose(x=out, perm=[1, 0, 2], name=output_name) else: out = mb.identity(x=out, name=output_name) context.add(out, output_name) # stack the states into a single tensor state_output_name = node.outputs[1] if num_layers == 1: state = mb.expand_dims(x=state_out_list[0], axes=[0], name=state_output_name) else: state = mb.stack(values=state_out_list, axis=0, name=state_output_name) context.add(state, state_output_name) @register_torch_op def rnn_tanh(context, node): _add_simple_rnn(context, node, "tanh") @register_torch_op def rnn_relu(context, node): _add_simple_rnn(context, node, "relu") def _add_mil_lstm(input, initial_h, initial_c, weights, has_bias, bidirectional, name): """ Most of this code is to transform the tensors into a shape acceptable by the Core ML implementation of LSTM. For weights, biases, per direction, pytorch uses two tensors: (ii, if, ig, io) stacked on top of each other for each layer (tensor 1) and (hi, hf, hg, ho) stacked on top of each other for each layer (tensor 2). That is, (W_ii|W_if|W_ig|W_io), of shape (4*hidden_size, input_size) and (W_hi|W_hf|W_hg|W_ho), of shape (4*hidden_size, hidden_size). The Core ML LSTM op expects two tensors, weight and bias. So the tensors for weight and bias are separated from pytorch's @weights list (1.). For bias tensor, the Core ML LSTM op expects the form ii, if, io, ig and hi, hf, ho, hg, requiring the ifzo_to_ifoz function. Further adding input and hidden bias into one (2.). Similar to bias, input and hidden weight requires different layout. (3.) 
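For illustration only, here is a sketch of the gate reordering performed by _ifzo_to_ifoz (not code that the converter executes; assumes hidden_size == 2 and ignores the trailing input/hidden axis):

    >>> import numpy as np
    >>> w = np.arange(8)                 # gates stacked as i, f, z(g), o
    >>> i, f, z, o = np.split(w, 4)
    >>> np.concatenate([i, f, o, z])     # Core ML order: i, f, o, z(g)
    array([0, 1, 2, 3, 6, 7, 4, 5])
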
initial_h and initial_c are list of "num_layers" tensors, each of shape [n_directions, B, H], where n_directions = 1 or 2 whereas the shapes of the initial states to MIL's LSTM, BiLSTM must be [B, H] and [B, 2*H] respectively. This means we need to do the following transformations: - if its an LSTM (n_directions=1): squeeze the first dimension of initial_h/initial_c , before feeding it to MIL's LSTM - if its a BiLSTM (n_directions=2): - split the input, shape=(2, B, H), to get (1,B,H) and (1,B,H) - concatenate to get (1,B,2*H) - squeeze to get (B,2*H) """ if bidirectional: if has_bias: # (1.) biases = weights[2:4] + weights[6:8] weights = weights[0:2] + weights[4:6] # (2.) assert len(biases) == 4 for index in range(len(biases)): biases[index] = _ifzo_to_ifoz( biases[index], name="{}_lstm_bias_reshape_{}".format(name, index), ) f_b = mb.add(x=biases[0], y=biases[1], ) r_b = mb.add(x=biases[2], y=biases[3], ) # (3.) f_ih_w = _ifzo_to_ifoz( weights[0], name=name + "_lstm_forward_ih_weights_ifoz_to_ifzo", ) f_hh_w = _ifzo_to_ifoz( weights[1], name=name + "_lstm_forward_hh_weights_ifoz_to_ifzo", ) r_ih_w = _ifzo_to_ifoz( weights[2], name=name + "_lstm_reverse_ih_weights_ifoz_to_ifzo", ) r_hh_w = _ifzo_to_ifoz( weights[3], name=name + "_lstm_reverse_hh_weights_ifoz_to_ifzo", ) h = _pytorch_hidden_to_coreml_milops(initial_h, name=name + "_lstm_h0_reshaped") c = _pytorch_hidden_to_coreml_milops(initial_c, name=name + "_lstm_c0_reshaped") return mb.lstm(x=input, initial_h=h, initial_c=c, weight_ih=f_ih_w, weight_hh=f_hh_w, weight_ih_back=r_ih_w, weight_hh_back=r_hh_w, bias=(f_b if has_bias else None), bias_back=(r_b if has_bias else None), direction="bidirectional", output_sequence=True, name=name) else: if has_bias: # (1.) biases = weights[len(weights) // 2:] weights = weights[: len(weights) // 2] # (2.) b = mb.add(x=biases[0], y=biases[1], ) b = _ifzo_to_ifoz( b, name=name + "_lstm_bias_transformed", ) # (3.) f_ih_w = _ifzo_to_ifoz( weights[0], name=name + "_lstm_ih_weights_ifoz_to_ifzo", ) f_hh_w = _ifzo_to_ifoz( weights[1], name=name + "_lstm_hh_weights_ifoz_to_ifzo", ) h = mb.squeeze(x=initial_h, axes=_np.array([0]), name=name + "_lstm_h0_squeeze") c = mb.squeeze(x=initial_c, axes=_np.array([0]), name=name + "_lstm_c0_squeeze") return mb.lstm(x=input, initial_h=h, initial_c=c, weight_ih=f_ih_w, weight_hh=f_hh_w, bias=(b if has_bias else None), direction="forward", output_sequence=True, name=name) @register_torch_op def lstm(context, node): inputs = _get_inputs(context, node, expected=9) _input = inputs[0] # there are two cases here, # (1) the input tensor is a PackedSequence object, # in this case, the second input of the lstm layer is the batch_size (MIL Var). # (2) the input tensor is a normal tensor, # in this case, the second input is an array. # As the result, we can use the second input to identify which category the graph is. 
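# Illustrative sketch only (PyTorch-side usage, not executed by the converter);
# the two categories described above roughly correspond to:
#     rnn = torch.nn.LSTM(input_size, hidden_size, batch_first=True)
#     packed = torch.nn.utils.rnn.pack_padded_sequence(x, lengths, batch_first=True)
#     rnn(packed)   # case (1): the traced op carries batch_sizes as its second input
#     rnn(x)        # case (2): the second input is the (h0, c0) state pair instead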
has_batch_sizes = not isinstance(inputs[1], Iterable) if has_batch_sizes: batch_sizes = inputs[1] h0, c0 = inputs[2] weights_list = inputs[3] has_bias = inputs[4].val num_layers = inputs[5].val dropout = inputs[6] bidirectional = inputs[8].val # the output of the _pack_padded_sequence is always in the layout of batch first batch_first = True else: h0, c0 = inputs[1] weights_list = inputs[2] has_bias = inputs[3].val num_layers = inputs[4].val dropout = inputs[5] bidirectional = inputs[7].val batch_first = inputs[8].val ''' Torch LSTM layer's input shapes: (1) first input (Seq, B, C) : if batch_first = False (B, Seq, C) : if batch_first = True (2) & (3) initialization states (num_layers, B, H) : if bidirectional = False (num_layers * 2, B, H) : if bidirectional = True For the MIL LSTM layer, these are the input shapes: (1) first input: (Seq, B, C) this means, if batch_first=True, we need to insert a transpose op first (2) & (3) initialization states MIL's LSTM layer does not natively support the "num_layers" parameters. So, when num_layers > 1, we add multiple MIL LSTM ops in a sequence. Each of these LSTM ops will take in initialization states in the following shape: (B, H) if bidirectional = False (B, 2*H) if bidirectional = True ''' if batch_first: _input = mb.transpose(x=_input, perm=[1, 0, 2], name=_input.name + "_batch_first_transpose") expected_num_weights = 2 * num_layers * (int(bidirectional) + 1) * (int(has_bias) + 1) if len(weights_list) != expected_num_weights: raise ValueError( "Incorrect weights shape for lstm layer: Expected: {}. Received {}".format( expected_num_weights, len(weights_list) ) ) # shape of h0 and c0 are (num_layers * n_directions, B, H) if num_layers == 1: all_initial_h = [h0] # [(n_directions, B, H)] all_initial_c = [c0] # [(n_directions, B, H)] else: all_initial_h = mb.split( x=h0, num_splits=num_layers, axis=0 ) # [(n_directions, B, H)] all_initial_c = mb.split( x=c0, num_splits=num_layers, axis=0 ) # [(n_directions, B, H)] n_weights_per_layer = int(len(weights_list) / num_layers) x = _input h_out_list = [] c_out_list = [] for i in range(num_layers): if i < num_layers - 1: op_name = node.name + "_lstm_layer_{}".format(i) else: if batch_first: op_name = node.name + "_batch_first" else: op_name = node.name lstm_out = _add_mil_lstm( input=x, initial_h=all_initial_h[i], initial_c=all_initial_c[i], weights=weights_list[ i * n_weights_per_layer : (i + 1) * n_weights_per_layer ], has_bias=has_bias, bidirectional=bidirectional, name=op_name, ) # shape of lstm_out[0] == (S,B,H) if bidirectional = True else (S, B, 2*H) x = lstm_out[0] # shape of lstm_out[1] == (B,H) if bidirectional = False else (B, 2*H) h_out_list.append(lstm_out[1]) # shape of lstm_out[2] == (B,H) if bidirectional = False else (B, 2*H) c_out_list.append(lstm_out[2]) ''' For torch, these are the dimensions of the 3 output tensors: (1) output[0] : (Seq, B, H) if batch_first = False, bidirectional = False (Seq, B, 2*H) if batch_first = False, bidirectional = True (B, Seq, H) if batch_first = True, bidirectional = False (B, Seq, 2*H) if batch_first = True, bidirectional = True (2) & (3) these are the state outputs: (num_layers, B, H) if bidirectional = False (num_layers * 2, B, H) if bidirectional = True MIL lstm layer's output shapes: (1) output[0]: (Seq, B, H) if bidirectional = False (Seq, B, 2*H) if bidirectional = True This means we need a transpose op if batch_first is True (2) & (3) shapes of the state outputs: each MIL LSTM op will produce final state tensors with the following shape: (B, H) if 
bidirectional = False (B, 2*H) if bidirectional = True stack/expand the final state tensors to match the Torch output ''' for index, (name, output) in enumerate(zip(node.outputs, lstm_out)): if index > 0: # index > 0 ===> its one of the state outputs (h or c) if bidirectional: if num_layers == 1: out1, out2 = mb.split( x=output, num_splits=2, axis=1 ) # each output of shape [B, H] after the split final_out = mb.stack( values=[out1, out2], axis=0, name=name ) # [2, B, H] context.add(final_out, name) else: out_state_tensors_list = ( h_out_list if index == 1 else c_out_list ) # each tensor in the list is of shape (B, 2*H) list_of_tensors_to_stack = [] for i in range(num_layers): out1, out2 = mb.split( x=out_state_tensors_list[i], num_splits=2, axis=1 ) # each output of shape [B, H] after the split out = mb.stack(values=[out1, out2], axis=0) # [2, B, H] list_of_tensors_to_stack.append(out) final_out = mb.concat( values=list_of_tensors_to_stack, axis=0, name=name ) # output of shape (num_layers * 2, B, H) context.add(final_out, name) else: if num_layers == 1: unsqueeze = mb.expand_dims(x=output, axes=[0], name=name) context.add(unsqueeze, name) else: out = mb.stack( values=h_out_list if index == 1 else c_out_list, axis=0, name=name, ) context.add(out, name) else: if batch_first: output = mb.transpose(x=output, perm=[1, 0, 2], name=name) context.add(output, name) def _get_scales_from_output_size(output_size, input_shape): scales = [] if output_size is not None: # output_size will be either # (1) A list of Var, and each Var indicates the output size for that dimension # (2) A single Var which indicates the whole output size # (3) A numpy array if isinstance(output_size, list): output_size = [x.val for x in output_size] if isinstance(output_size, Var): output_size = [x for x in output_size.val] if isinstance(output_size, _np.ndarray): output_size = output_size.tolist() # output size is computed using the formula floor (scale * input_size) in Core ML (and PyTorch). # Thus, when computing the scales from the output size, we add a small positive constant to the output size # to make sure that the floor formula results in the correct output size and not 1 unit smaller. # For instance, if output size = 5 and input size = 2, then scale will be 2.5, which can get # represented as 2.49999 due to float precision issues, and this might resultin an output size of 4 # instead of 5, without the epsilon correction. if len(output_size) == 1: # 1d upsampling Hout = output_size[0] Hin = input_shape[-1] scales_h = Hout / Hin if Hout % Hin == 0 else (Hout + 1e-4) / Hin scales = scales_h elif len(output_size) == 2: # 2d upsampling Hout, Wout = output_size[0], output_size[1] Hin, Win = input_shape[-2], input_shape[-1] scales_h = Hout / Hin if Hout % Hin == 0 else (Hout + 1e-4) / Hin scales_w = Wout / Win if Wout % Win == 0 else (Wout + 1e-4) / Win scales = [scales_h, scales_w] else: msg = "Only 1d and 2d unsampling are supported." 
raise NotImplementedError(msg) return scales def _is_float_value(x, threshold=0.001): return x - _math.floor(x) > threshold @register_torch_op def upsample_linear1d(context, node): inputs = _get_inputs(context, node) x = inputs[0] output_size = inputs[1] align_corners = bool(inputs[2].val) scale = inputs[3] scale_factor = None if scale is not None and scale.val is not None and scale.shape == (1,): # Get the scale factor from provided inputs # This happens when recompute_scale_factor = False scale_factor = scale.val[0] # Currently, we are not supporting recompute_scale_factor = False, align_corners = False with float output size _, _, h = x.shape if not is_symbolic(h): # For the static input shape, we can compute the output size beforehand, and check if it is a float value output_size = h * scale_factor is_float = _is_float_value(output_size) else: # For the dynamic input shape, we check if the scale factor itself is float is_float = _is_float_value(scale_factor) if is_float and not align_corners: msg = ( "recompute_scale_factor = False, align_corners = False with float output size is " + "not supported for the upsample op {}".format(node.name) ) raise NotImplementedError(msg) elif isinstance(output_size, list): # When the input shape is dynamic and recompute_scale_factor = True, # we need to trace the graph to find the scale factor. x = mb.expand_dims(x=x, axes=[3]) x = mb.torch_upsample_bilinear( x=x, output_height=output_size[0], output_width=1, align_corners=align_corners, ) x = mb.squeeze(x=x, axes=[3], name=node.name) context.add(x) return elif output_size.val is not None: # Infer the scale factor from the provided output size scale_factor = _get_scales_from_output_size(output_size, x.shape) # Expand the input to a 4d tensor, and use MIL's upsample_bilinear op x = mb.expand_dims(x=x, axes=[3]) x = mb.upsample_bilinear( x=x, scale_factor_height=scale_factor, scale_factor_width=1., align_corners=align_corners, ) x = mb.squeeze(x=x, axes=[3], name=node.name) context.add(x) @register_torch_op(torch_alias=["upsample_bilinear2d.vec"]) def upsample_bilinear2d(context, node): inputs = _get_inputs(context, node) _input = inputs[0] output_size = inputs[1] align_corners = bool(inputs[2].val) scale_factors = inputs[3] scales_h, scales_w = None, None if ( scale_factors is not None and scale_factors.val is not None and scale_factors.rank == 1 and scale_factors.shape[0] == 2 ): # get scale factors from provided inputs # this happens when recompute_scale_factor = False scale_factors = scale_factors.val scales_h = scale_factors[0] scales_w = scale_factors[1] # currently, we are not supporting recompute_scale_factor = False, align_corners = False with float output size _, _, h, w = _input.shape if not is_symbolic(h) and not is_symbolic(w): # For the static input shape, we can compute the output size beforehand output_h = h * scales_h output_w = w * scales_w is_h_float = _is_float_value(output_h) is_w_float = _is_float_value(output_w) else: # For the dynamic input shape, we check if the scale factor itself is float is_h_float = _is_float_value(scales_h) is_w_float = _is_float_value(scales_w) if (is_h_float or is_w_float) and not align_corners: msg = ( "recompute_scale_factor = False, align_corners = False with float output size is " + "not supported for the upsample op {}".format(node.name) ) raise NotImplementedError(msg) elif ( isinstance(output_size, list) and output_size[0].val is None and output_size[1].val is None ): # the input shape is dynamic and recompute_scale_factor = True # need to trace 
the graph to find the scale factor # we define a torch front end op mb.torch_upsample_bilinear to resolve the const scaling factor torch_upsample_bilinear = mb.torch_upsample_bilinear( x=_input, output_height=output_size[0], output_width=output_size[1], align_corners=align_corners, name=node.name, ) context.add(torch_upsample_bilinear) return else: # infer scale factors from output sizes # This happens when recompute_scale_factor = True or the output_size is specified scales = _get_scales_from_output_size(output_size, _input.shape) if scales: scales_h, scales_w = scales if scales_h is None or scales_w is None: if len(inputs) == 5: # For torch==1.5.0, upsample_bilinear2d has 5 inputs. scales_h = inputs[3] scales_w = inputs[4] else: raise ValueError("Failed to infer scale factors from inputs.") upsample_bilinear = mb.upsample_bilinear( x=_input, scale_factor_height=scales_h, scale_factor_width=scales_w, align_corners=align_corners, name=node.name, ) context.add(upsample_bilinear) @register_torch_op def upsample_nearest1d(context, node): inputs = _get_inputs(context, node) x = inputs[0] output_size = inputs[1] scale = inputs[2] scale_factor = None if scale is not None and scale.val is not None and scale.shape == (1,): # Get the scale factor from provided inputs # This happens when recompute_scale_factor = False scale_factor = scale.val[0] elif isinstance(output_size, list): # When the input shape is dynamic and recompute_scale_factor = True, # we need to trace the graph to find the scale factor. x = mb.expand_dims(x=x, axes=[3]) x = mb.torch_upsample_nearest_neighbor( x=x, output_height=output_size[0], output_width=1, ) x = mb.squeeze(x=x, axes=[3], name=node.name) context.add(x) return else: # Infer scale factors from output sizes scale_factor = _get_scales_from_output_size(output_size, x.shape) x = mb.expand_dims(x=x, axes=[3]) x = mb.upsample_nearest_neighbor( x=x, scale_factor_height=scale_factor, scale_factor_width=1., ) x = mb.squeeze(x=x, axes=[3], name=node.name) context.add(x) @register_torch_op(torch_alias=["upsample_nearest2d.vec"]) def upsample_nearest2d(context, node): inputs = _get_inputs(context, node) _input = inputs[0] scales_h, scales_w = None, None output_size = inputs[1] scale_factors = inputs[2] if ( scale_factors is not None and isinstance(scale_factors, Var) and scale_factors.val is not None and scale_factors.rank == 1 and scale_factors.shape[0] == 2 ): # get scale factors from provided inputs scale_factors = scale_factors.val scales_h = scale_factors[0] scales_w = scale_factors[1] elif scale_factors is not None and isinstance(scale_factors, list) and len(scale_factors) == 2: # get scale factors from provided inputs scales_h = scale_factors[0] scales_w = scale_factors[1] elif ( isinstance(output_size, list) and output_size[0].val is None and output_size[1].val is None ): # the input shape is dynamic and recompute_scale_factor = True # need to trace the graph to find the scale factor # we define a torch front end op mb.torch_upsample_nearest_neighbor to resolve the const scaling factor torch_upsample_nearest2d = mb.torch_upsample_nearest_neighbor( x=_input, output_height=output_size[0], output_width=output_size[1], name=node.name, ) context.add(torch_upsample_nearest2d) return else: # infer scale factors from output sizes scales = _get_scales_from_output_size(output_size, _input.shape) if scales: scales_h, scales_w = scales if scales_h is None or scales_w is None: if len(inputs) == 5: # For torch==1.5.0, upsample_bilinear2d has 5 inputs. 
scales_h = inputs[3] scales_w = inputs[4] else: raise ValueError("Failed to infer scale factors from inputs.") upsample_nearest2d = mb.upsample_nearest_neighbor( x=_input, scale_factor_height=scales_h, scale_factor_width=scales_w, name=node.name, ) context.add(upsample_nearest2d) @register_torch_op(torch_alias=["listunpack"]) def tupleunpack(context, node): inputs = _get_inputs(context, node, expected=1) values = inputs[0] # Node input could have been turned into constant array in @tupleconstruct if not isinstance(values, (tuple, list)): if values.val is not None: values = values.val else: # The `values` could be a single Var with symbolic val. values = [values] if len(values) != len(node.outputs): raise ValueError(f"unpack node expected {len(node.outputs)} outputs, got {len(values)}") # @value is either a numpy primitive or a Var object for value, output in zip(values, node.outputs): if not isinstance(value, Var): value = _construct_constant(value, name=output) assert isinstance(value, Var) context.add(value, output) @register_torch_op def loop(context, node): """ In TorchIR, a loop looks like: %y_1, ..., %y_r = prim::Loop(%max_trip_count, %initial_condition, %x_1, ..., %x_r) block0(%i, %a_1, ..., %a_r): %b_1, ..., %b_m = some::node(%a_value_from_outer_block, %a_1) %iter_condition = some::other_node(%a_2) -> (%iter_condition, %b_1, ..., %b_r) This translates to pseudo code as: y_1, ..., y_r = x_1, ..., x_r condition = initial_condition i = 0 while condition and i < max_trip_count: a_1, ..., a_r = y_1, ..., y_r ############################################################ # Actual body of the loop b_1, ..., b_m = some::node(a_value_from_outside_of_the_loop, a_1) iter_condition = some::node(a_2) ############################################################ y_1, ..., y_r = b_1, ..., b_r condition = iter_condition i += 1 Which further translates to MIL while_loop as: loop_vars = (0, initial_condition, x_1, ..., x_r) _cond = { return (loop_vars[1] and loop_vars[0] < max_trip_count) } _body = { a_1, ..., a_r = loop_vars[2], ..., loop_vars[-1] b_1, ..., b_m = some::node(a_value_from_outside_of_the_loop, a_1) iter_condition = some::node(a_2) return (loop_vars[0] + 1, iter_condition, b_1, ..., b_r) } For loops pass True for %initial_condition and %iter_condition While loops set %max_trip_count to INT_MAX and %i is unused """ name = node.name # inputs[0]: max iter count # inputs[1]: initial condition # inputs[2]: block input 0 # ... # inputs[N+2]: block input N inputs = _get_inputs(context, node) max_iter_count = inputs[0] # Magic default signals this is a while-only loop, so no iteration count # is needed. has_iter_count = max_iter_count is not None # Create an interation count. This will only be used if this is a for loop. iter_count = mb.const(val=0, name=node.name + "_iter") # @loop_vars is tuple(iter_count, cond, inputs...) loop_vars = tuple([iter_count] + inputs[1:]) def _loop_cond(*loop_vars): cond = loop_vars[1] # Check the iteration count if we're keeping track. if has_iter_count: iter_count = loop_vars[0] iter_cond = mb.less( x=iter_count, y=max_iter_count, name=node.name + "_cond" ) return mb.logical_and(x=cond, y=iter_cond) else: return mb.identity(x=cond) def _shapes_are_equivalent(shape1, shape2): """ Compares two sets of tensor shapes and returns True if they are equivalent. That is, they are the same rank, and each dimension is the same or symbolic. """ if len(shape1) != len(shape2): return False # Each dimension must have the same integer length, or else be # symbolic. 
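# For example (illustration): shapes (2, s0, 4) and (2, s1, 4) are treated as
# equivalent when s0 and s1 are Symbols, while (2, 3) vs. (2, 4) is not.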
all_equivalent = [ s1 == s2 or (isinstance(s1, Symbol) and isinstance(s2, Symbol)) for s1, s2 in zip(shape1, shape2) ] return all_equivalent def _loop_body(*loop_vars): block = node.blocks[0] iter_var = loop_vars[0] inputs = (iter_var,) + loop_vars[2:] res = convert_block(context, block, inputs) for input_var, output_var in zip(loop_vars[2:], res[1:]): if not _shapes_are_equivalent(input_var.shape, output_var.shape): logger.warning( "detected change in shape of loop variable. this could lead to incorrect inference results!" ) logger.warning( "{}:{} -> {}:{}".format( input_var.name, input_var.shape, output_var.name, output_var.shape, ) ) # Update the iteration count if we're keeping track. if has_iter_count: iter_var = mb.add(x=iter_var, y=1, name=iter_var.name + "_inc") else: iter_var = mb.identity(x=iter_var) # Must return tuple with same length and types as @loop_vars. return tuple( [ iter_var, ] + res ) loop = mb.while_loop( _cond=_loop_cond, _body=_loop_body, loop_vars=loop_vars, name=name ) # Make sure the loop returned the expected number of outputs. Note that the # first two loop outputs are the iteration count and condition. assert len(loop) - 2 == len(node.outputs) for output_name, output_var in zip(node.outputs, loop[2:]): context.add(output_var, torch_name=output_name) @register_torch_op def _unique2(context, node): (x, sorted, return_inverse, return_counts) = _get_inputs(context, node, expected=4) # Unsupported case if sorted.val is not True: raise NotImplementedError("sorted=False not supported for unique op") x_flatten = mb.reshape(x=x, shape=[-1]) # Sort flattened input indices = mb.argsort(x=x_flatten, ascending=True) x_sorted = mb.gather_along_axis(x=x_flatten, indices=indices) # Subtract n_th+1 element from n_th element neg_inf = np.float32(-np.inf) x_sorted = mb.cast(x=x_sorted, dtype="fp32") x_sorted_shifted = mb.pad(x=x_sorted, pad=[1, 0], constant_val=neg_inf) x_sorted_padded = mb.pad(x=x_sorted, pad=[0, 1], mode="replicate") diff = mb.sub(x=x_sorted_padded, y=x_sorted_shifted) # Get non-zero element after subtraction to determine unique values non_zero_indices = mb.non_zero(x=diff) unique_values_unsqueeze = mb.gather(x=x_sorted, indices=non_zero_indices) unique_values = mb.squeeze(x = unique_values_unsqueeze) # Add unique values to output and see if we're done. 
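# Worked example of the sort/shift/subtract trick above (illustration only):
#   x                = [3, 1, 2, 3]
#   x_sorted         = [1, 2, 3, 3]
#   x_sorted_shifted = [-inf, 1, 2, 3, 3]   (front-padded with -inf)
#   x_sorted_padded  = [1, 2, 3, 3, 3]      (end-padded by replication)
#   diff             = [inf, 1, 1, 0, 0]
#   non_zero(diff)   = [0, 1, 2]  ->  gathering from x_sorted yields [1, 2, 3]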
context.add(unique_values, torch_name=node.outputs[0]) if return_counts.val is False and return_inverse.val is False: # only the unique values are needed return # Calculate a UxN boolean tensor, where: # U - number of unique values # N - number of input elements num_unique_values = mb.shape(x=unique_values) x_tile = mb.tile(x=x_flatten, reps=num_unique_values) tile_shape = mb.concat(values=(num_unique_values, mb.shape(x=x_flatten)), axis=0) x_tile = mb.reshape(x=x_tile, shape=tile_shape) unique_values_unsqueeze = mb.cast(x=unique_values_unsqueeze, dtype="int32") x_tile, unique_values_unsqueeze = promote_input_dtypes([x_tile, unique_values_unsqueeze]) diff = mb.sub(x=x_tile, y=unique_values_unsqueeze) bool_tensor = mb.logical_not(x=mb.cast(x=diff, dtype="bool")) if return_inverse.val is True: # Get indices range = mb.range_1d(start=0, end=mb.squeeze(x=num_unique_values), step=1) indices = mb.matmul(x=range, y=mb.cast(x=bool_tensor, dtype="int32")) indices = mb.reshape(x=indices, shape=mb.shape(x=x)) context.add(indices, torch_name=node.outputs[1]) if return_counts.val is True: # Get counts counts = mb.reduce_sum(x=mb.cast(x=bool_tensor, dtype='int32'), axes=(-1,)) context.add(counts, torch_name=node.outputs[2]) @register_torch_op(torch_alias=["if"]) def _if(context, node): """ In TorchIR, a conditional looks like: %y_1, ..., %y_r = prim::If(%condition) block0(): # TRUE BRANCH, never takes arguments, has to return r outputs %t_1, ..., %t_k = some::node(%a_value_from_outer_block) -> (%t_1, ..., %t_r) block1(): # FALSE BRANCH, never takes arguments, has to return r outputs %f_1, ..., %f_m = some::node(%a_value_from_outer_block) -> (%f_1, ..., %f_r) This translates to pseudo code as: if (condition): t_1, ..., t_k = some::node(a_value_from_outer_block) y_1, ..., y_r = t_1, ..., t_r else: f_1, ..., f_m = some::node(a_value_from_outer_block) y_1, ..., y_r = f_1, ..., f_r Which further translates to MIL cond as: _true = { t_1, ..., t_k = some::node(a_value_from_outer_block) return (t_1, ..., t_r) } _false = { f_1, ..., f_m = some::node(a_value_from_outer_block) return (f_1, ..., f_m) } """ name = node.name # inputs[0]: condition inputs = _get_inputs(context, node, expected=1) condition = inputs[0] assert len(node.blocks) == 2 true_block = node.blocks[0] false_block = node.blocks[1] def _true_path(): res = convert_block(context, true_block, []) return tuple(res) def _false_path(): res = convert_block(context, false_block, []) return tuple(res) cond = mb.cond( pred=condition, _true_fn=_true_path, _false_fn=_false_path, name=name ) # If the condition only returns one item, wrap it in a tuple. if not isinstance(cond, (tuple, list)): cond = (cond,) # Make sure the condition returned the expected number of outputs. 
assert len(cond) == len(node.outputs) for output_name, output_var in zip(node.outputs, cond): context.add(output_var, torch_name=output_name) @register_torch_op(torch_alias=["select_copy"]) def select(context, node): inputs = _get_inputs(context, node, expected=3) _input = inputs[0] dim = inputs[1].val index = inputs[2] assert dim.shape == () # NOTE: # Each index in @begin_array/@end_array corresponds to a dimension of @_input # Each val of those arrays corresponds to the start/end index to slice in that dimension rank = _input.rank begin_array = [0] * rank if index.val is None: # index value not known till runtime begin_array[dim] = index begin_array = mb.concat(values=begin_array, axis=0) else: # index value known now assert index.val.shape == () begin_array[dim] = index.val end_array = [s if isinstance(s, int) else 0 for s in _input.shape] end_mask = [True] * rank squeeze_mask = [False] * rank squeeze_mask[dim] = True if index.val != -1: if index.val is None: # index value not know till runtime temp = mb.add(x=index, y=1) end_array[dim] = temp end_array = mb.concat(values=end_array, axis=0) else: end_array[dim] = index.val + 1 end_mask[dim] = False slice_by_index = mb.slice_by_index( x=_input, begin=begin_array, end=end_array, end_mask=end_mask, squeeze_mask=squeeze_mask, name=node.name, ) context.add(slice_by_index) @register_torch_op def getitem(context, node): inputs = _get_inputs(context, node, expected=2) if inputs[1].val is None: raise AssertionError("Only static item selection supported") try: index = int(inputs[1].val) except: raise AssertionError( f"Index into python list/tuple needs to be integer. Provided value: {inputs[1].val}" ) if not isinstance(inputs[0], (list, tuple)): # For single object with index 0, return this object if index == 0: context.add(inputs[0], torch_name=node.name) return # Otherwise undefined else: raise AssertionError("Item selection is supported only on python list/tuple objects") out = inputs[0][index] if out is None: raise AssertionError( f"coremltools lowering didn't handle/bind value at index {index}. Please inspect the lowering of parent op for its return value" ) context.add(out, torch_name=node.name) @register_torch_op def type_as(context, node): inputs = _get_inputs(context, node, expected=2) if inputs[0].dtype == inputs[1].dtype: x = mb.identity(x=inputs[0], name=node.name) else: x = inputs[0] if inputs[1].dtype not in TYPE_TO_DTYPE_STRING: raise NotImplementedError( "Tensor type {} cast is not supported.".format(inputs[1].dtype) ) x = mb.cast(x=x, dtype=TYPE_TO_DTYPE_STRING[inputs[1].dtype], name=node.name) context.add(x) @register_torch_op def nonzero(context, node): inputs = _get_inputs(context, node, expected=1) x = inputs[0] nonzero = mb.non_zero(x=x, name=node.name) context.add(nonzero) def _get_slice_params(context, data, inputs): def _expand_list_to_rank_1(arr): """ We make the elements in begin and end rank 1, so the pattern of ``squeeze -> expand_dims`` can be removed by the ``fuse_squeeze_expand_dims`` graph pass. 
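For example (sketch): [0, idx_var] becomes [np.array([0]), mb.expand_dims(x=idx_var, axes=[0])] when idx_var is a rank-0 Var.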
""" for i, val in enumerate(arr): if isinstance(val, Var): if val.rank == 0: arr[i] = mb.expand_dims(x=val, axes=[0]) else: arr[i] = np.array([val]) return arr rank = data.rank begin = [0] * rank end = [0] * rank stride = [1] * rank begin_mask = [False] * rank end_mask = [False] * rank squeeze_mask = [False] * rank num_of_slice_set = len(inputs) // 3 for i in range(num_of_slice_set): if inputs[3 * i + 1] is None: # This is pure index select idx = context[inputs[3 * i]] if idx.val is not None: idx = idx.val begin[i] = idx squeeze_mask[i] = True else: # This is a slice begin_var = context[inputs[3 * i]] end_var = context[inputs[3 * i + 1]] stride_var = context[inputs[3 * i + 2]] if begin_var is None: begin_mask[i] = True else: begin[i] = begin_var if end_var is None: end_mask[i] = True else: end[i] = end_var if stride_var is None: stride[i] = 1 else: stride[i] = stride_var.val for i in range(num_of_slice_set, rank): begin_mask[i] = True end_mask[i] = True begin = _expand_list_to_rank_1(begin) eng = _expand_list_to_rank_1(end) begin = mb.concat(values=begin, axis=0) end = mb.concat(values=end, axis=0) return begin, end, stride, begin_mask, end_mask, squeeze_mask def _translate_torch_tensor_assign( x: Var, updates: Var, begin: Var, end: Var, stride=None, begin_mask=None, end_mask=None, squeeze_mask=None, name=None, ): translation_kwargs = {} if stride is not None: translation_kwargs["stride"] = stride if begin_mask is not None: translation_kwargs["begin_mask"] = begin_mask if end_mask is not None: translation_kwargs["end_mask"] = end_mask if squeeze_mask is not None: translation_kwargs["squeeze_mask"] = squeeze_mask if name is not None: translation_kwargs["name"] = name if is_current_opset_version_compatible_with(target.iOS18): # slice_update is not supporting scalar update at runtime. # Until this radar is fixed: rdar://128221986 ([Feature][Slice_update] The backend is not supporting scalar update for the slice_update op), # we have a workaround to expand scalar update to a 1-D tensor. if updates.rank == 0: # Since the workaround uses the compile-time value of begin and end, # so we do the validation first. is_begin_or_end_dynamic = False for var in [begin, end]: if isinstance(var, Var) and var.val is None: is_begin_or_end_dynamic = True if is_begin_or_end_dynamic or any_symbolic(x.shape): return mb.torch_tensor_assign( x=x, updates=updates, begin=begin, end=end, **translation_kwargs, ) # First pick up the ``dim`` in which ``squeeze_mask[dim] = True``, # and do the following transformation: # 1. set ``squeeze_mask[dim] = False`` # 2. set both ``begin_mask`` and ``end_mask`` to ``False`` # 3. 
make ``end = begin + 1`` dim = None for i, val in enumerate(squeeze_mask): if val is True: dim = i break squeeze_mask[dim] = False begin_mask = [False] * x.rank end_mask = [False] * x.rank if isinstance(begin, Var): begin = begin.val if isinstance(end, Var): end = end.val # convert negative indexes to positive indexes begin = [val if val >= 0 else val + x.shape[i] for i, val in enumerate(begin)] end = mb.add(x=begin, y=1) # expand updates to 1D tensor updates = mb.expand_dims(x=updates, axes=[0]) return mb.slice_update( x=x, update=updates, begin=begin, end=end, **translation_kwargs, ) return mb.torch_tensor_assign( x=x, updates=updates, begin=begin, end=end, **translation_kwargs, ) @register_torch_op def _internal_op_tensor_inplace_copy(context, node): data = context[node.inputs[0]] updates = context[node.inputs[1]] begin, end, stride, begin_mask, end_mask, squeeze_mask = _get_slice_params( context, data, node.inputs[2:] ) data, updates = promote_input_dtypes([data, updates]) updated_x = _translate_torch_tensor_assign( x=data, updates=updates, begin=begin, end=end, stride=stride, begin_mask=begin_mask, end_mask=end_mask, squeeze_mask=squeeze_mask, name=node.name, ) context.add(updated_x) @register_torch_op def _internal_op_tensor_inplace_fill(context, node): data = context[node.inputs[0]] fill_scalar = context[node.inputs[1]] if len(node.inputs) == 2 and fill_scalar.val is not None: shape = mb.shape(x=data) if isinstance(fill_scalar.val, _np.ndarray): fill = mb.fill(shape=shape, value=fill_scalar.val.item()) else: fill = mb.fill(shape=shape, value=fill_scalar) casted = mb.cast(x=fill, dtype=TYPE_TO_DTYPE_STRING[data.dtype], name=node.name) context.add(casted) return begin, end, stride, begin_mask, end_mask, squeeze_mask = _get_slice_params( context, data, node.inputs[2:] ) if begin.val is None or end.val is None or any_symbolic(data.shape): raise ValueError("_internal_op_tensor_inplace_fill does not support dynamic index") fill_shape = solve_slice_by_index_shape( data.shape, begin.val, end.val, stride, begin_mask, end_mask, squeeze_mask ) update_values = _np.full(fill_shape, fill_scalar.val) data, update_values = promote_input_dtypes([data, update_values]) updated_x = _translate_torch_tensor_assign( x=data, updates=update_values, begin=begin, end=end, stride=stride, begin_mask=begin_mask, end_mask=end_mask, squeeze_mask=squeeze_mask, name=node.name, ) context.add(updated_x) @register_torch_op def select_scatter(context, node): inputs = _get_inputs(context, node, expected=4) x = inputs[0] updates = inputs[1] dim = inputs[2].val if dim is None: raise ValueError("Only compile time known dim supported yet") index = inputs[3] # mb.torch_tensor_assign handles multi-dim slicing # so we need to create slice specifications for all other dimensions begin = [0] * x.rank begin[dim] = index begin = mb.concat(values=begin, axis=0) end = x.shape # and squeeze dim to do pure indexing on it squeeze_mask = [False] * x.rank squeeze_mask[dim] = True updated_x = _translate_torch_tensor_assign( x=x, updates=updates, begin=begin, end=end, squeeze_mask=squeeze_mask, name=node.name, ) context.add(updated_x) @register_torch_op def slice_scatter(context, node): inputs = _get_inputs(context, node, min_expected=2) x, updates = promote_input_dtypes(inputs[0:2]) # sanitize and validate dim dim = 0 if len(inputs) <= 2 else inputs[2].val if dim is None: raise ValueError("Only compile time known dim supported yet") if dim < 0: dim = dim + x.rank assert 0 <= dim and dim < x.rank, f"invalid dim: {dim}" # sanitize start 
start = 0 if len(inputs) <= 3 else inputs[3] if start is None: start = 0 # sanitize end if len(inputs) <= 4: end = x.shape[dim] else: end = inputs[4] if end is not None: end = mb.minimum(x=inputs[4], y=x.shape[dim]) else: end = x.shape[dim] # get step given different number of inputs step = 1 if len(inputs) <= 5 else inputs[5] # mb.torch_tensor_assign handles multi-dim slicing # so we need to pad start, end, step from scalar to x.rank starts = [0] * x.rank starts[dim] = start starts = mb.concat(values=starts, axis=0) ends = list(x.shape) ends[dim] = end ends = mb.concat(values=ends, axis=0) steps = [1] * x.rank steps[dim] = step steps = mb.concat(values=steps, axis=0) updated_x = _translate_torch_tensor_assign( x=x, updates=updates, begin=starts, end=ends, stride=steps, begin_mask=None, end_mask=None, squeeze_mask=None, name=node.name, ) context.add(updated_x) @register_torch_op def index_put(context, node): inputs = _get_inputs(context, node, min_expected=3) x = inputs[0] indices = inputs[1] values = inputs[2] accumulate = False if len(inputs) < 4 else inputs[3].val mode = "add" if accumulate else "update" assert isinstance(indices, list), "indices must be a list of tensors" # Usually indices is a list of non-None tensors, so we stack them and feed to mb.scatter_nd # However, when there exists a whole slice (i.e. :), that index is represented as None if any(map(lambda index: index is None, indices)): # We have 2 ways to translate such torch.index_put, both have pros and cons # 1. mb.scatter_nd # * pro: can handle accumulate or update # * con: can only have whole slice at last dimensions # 2. mb.torch_tensor_assign # * pro: can have whole slice at arbitrary dimension # * con: can only handle update # Here we use mb.torch_tensor_assign # TODO: explore how can we cover as many torch.index_put cases as possible if accumulate: raise NotImplementedError( "If there existed any whole slice (e.g. : in x[:, 0]), " "only torch.index_put(..., accumulate=False) handled yet" ) begin = [0] * x.rank end = list(x.shape) stride = [1] * x.rank begin_mask = [True] * x.rank end_mask = [True] * x.rank # note: in torch slice, an indexed dim becomes size 1, rather than squeezed, e.g. # x = torch.zeros((2, 3)) # y = x[:, 1] # we will get y.shape as (2, 1) is_dim_unity = [False] * x.rank for dim, index in enumerate(indices): if index is not None: if len(index.shape) > 0: index = mb.squeeze(x=index) begin[dim] = index end[dim] = mb.add(x=index, y=1) begin_mask[dim] = False end_mask[dim] = False is_dim_unity[dim] = True begin = mb.concat(values=begin, axis=0) end = mb.concat(values=end, axis=0) expected_values_shape = [] for dim in range(x.rank): expected_values_shape.append(1 if is_dim_unity[dim] else x.shape[dim]) expected_values_shape = tuple(expected_values_shape) if values.shape != expected_values_shape: values = _broadcast(values.name + "_broadcasted", values, expected_values_shape) updated_x = _translate_torch_tensor_assign( x=x, updates=values, begin=begin, end=end, stride=stride, begin_mask=begin_mask, end_mask=end_mask, squeeze_mask=[False] * x.rank, name=node.name, ) context.add(updated_x) return indices_type = indices[0].sym_type.get_primitive() if types.is_bool(indices_type): # indices assert len(indices) == 1, "Unsupported index_put_ usage." indices = indices[0] assert ( indices.shape == x.shape ), f"indices shape {indices.shape} must equal to input shape {x.shape} for index put operation." 
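# Illustration (sketch only) of the boolean-mask path handled below, e.g. for
#     x[mask] = values   with mask.shape == x.shape and mask.dtype == torch.bool
# the mask is cast to int32, mb.non_zero turns it into coordinate indices, and
# mb.scatter_nd writes the (possibly tiled) values at those coordinates.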
indices = mb.cast(x=indices, dtype="int32") indices = mb.non_zero(x=indices) # if the indices is all False, # we translate the op into identity if 0 in indices.shape: result = mb.identity(x=x, name=node.name) context.add(result) return # values if values.shape == (): values = mb.expand_dims(x=values, axes=[0]) if values.rank == 1 and values.shape[0] == 1: reps = value_at(mb.shape(x=indices), 0) reps = mb.expand_dims(x=reps, axes=[0]) values = mb.tile(x=values, reps=reps) elif types.is_int(indices_type): # indices if len(indices) > 1: indices = mb.stack(values=indices, axis=indices[0].rank) else: indices = mb.expand_dims(x=indices[0], axes=[-1]) # values expected_values_shape = indices.shape[:-1] + x.shape[indices.shape[-1] :] if values.shape != expected_values_shape: values = _broadcast(values.name + "_broadcasted", values, expected_values_shape) else: raise ValueError(f"Only bool and int index handled yet, but got {indices_type}") if is_current_opset_version_compatible_with(target.iOS17): # IOS17 `scatter_nd` behaviour is undefined for negative indices. cond = mb.greater_equal(x=indices, y=0) x_shape = mb.shape(x=x) indices_shape = mb.shape(x=indices) indices_last_dim = value_at(indices_shape, indices.rank - 1) indices_last_dim_expand = mb.expand_dims(x=indices_last_dim, axes=[0]) slice_shape = mb.slice_by_size(x=x_shape, begin=[0], size=indices_last_dim_expand) indices = mb.select( cond=cond, a=indices, b=mb.add(x=indices, y=slice_shape), ) result = mb.scatter_nd(data=x, indices=indices, updates=values, mode=mode, name=node.name) context.add(result) @register_torch_op(torch_alias=["_unsafe_index"]) def index(context, node): inputs = _get_inputs(context, node, expected=2) x = inputs[0] indices = inputs[1] rank = x.rank """ Case 1: A single boolean index selection Ex: a = torch.rand(2, 3, 4) b = torch.rand(3, 4) index = b > 0.1 c = a[:, b] For this case, the only non-None tensor is with dtype bool The true value indicates whether the element should be selected among the masked axes The output c is a tensor with shape (2, N), where N is the number of elements of b satisfying condition > 0.1 """ boolean_indices_axis = [] for i, index in enumerate(indices): if index is not None and types.is_bool(index.dtype): boolean_indices_axis.append(i) if len(boolean_indices_axis) == 1: # get the True element indices axis = boolean_indices_axis[0] axes = list(range(axis, axis + index.rank)) index = indices[axis] index = mb.non_zero(x=index) # transpose the masked axes to the beginning perm = axes + [i for i in range(rank) if i not in axes] x = mb.transpose(x=x, perm=perm) x = _utils._construct_gather_op("gather_nd", x, index) # transpose the tensor back perm_back = list(range(1, x.rank)) perm_back.insert(axis, 0) res = mb.transpose(x=x, perm=perm_back, name=node.name) context.add(res) return """ Case 2: Pure index selection Ex # 1 [Single dimension selection]: a = torch.rand(1,2,3,4) index = torch.tensor([0, 1]) b = a[:,:,:,index] In this case, indices is a list [None, None, None, [0, 1]]]. The None element means the corresponding dimension is masked. b has shape (1,2,3,2). 
Ex # 2 [Multiple disconnected dimensions selection]: a = torch.rand(1,2,3,4) index = torch.tensor([0, 1]) b = a[:,index,:,index] In this case, indices is a list [None, [0,1], None, [0,1]] b has shape (2,1,3), where b[0,:,:] = a[:,0,:,0] and b[1,:,:] = a[:,1,:,1] Ex # 3 [Multiple connected dimensions selection]: a = torch.rand(1,2,3,4) index_1 = torch.tensor([0, 1]) index_2 = torch.tensor([0, 1]) b = a[:,index_1,index_2,:] indices is a list [None, [0, 1], [0, 1], None] b has shape (1,2,4), where b[:,0,:] = a[:,0,0,:] and b[:,1,:] = a[:,1,1,:] Ex # 4 [Selection with boolean masks]: a = torch.rand(4,5) index_1 = [True, True, False, False] index_2 = [False, True, True, False, False] b = a[index_1, index_2] indices is a list [[True, True, False, False], [False, True, True, False, False]] In this case, index_1 and index_2 are interpreted as mask by indices of True, index_1 -> [0, 1] index_2 -> [1, 2] b has shape (2,), where b[0] = a[0, 1] and b[1] = a[1, 2] Ex # 5 [Broadcast selection]: a = torch.rand(1,2,3,4) index_1 = torch.tensor([0, 1]) index_2 = torch.tensor([0]) b = a[:,index_1,index_2,:] indices is a list [None, [0, 1], [0], None] In this case, index_2 is going to be broadcasted to [0, 0] b has shape (1,2,4), where b[:,0,:] = a[:,0,0,:] and b[:,1,:] = a[:,1,0,:] """ # get the index axes indices = indices + [None] * (x.rank - len(indices)) indices_axes = [] valid_indices = [] for i, index in enumerate(indices): if index is not None: indices_axes.append(i) valid_indices.append(index) # If all elements in indices is None, simpily return the original tensor. if len(indices_axes) == 0: x = mb.identity(x=x, name=node.name) context.add(x) return # convert all indices to int type for i, indice in enumerate(valid_indices): if indice is not None and types.is_bool(indice.dtype): indice = mb.non_zero(x=indice) indice = mb.squeeze(x=indice, axes=[1]) valid_indices[i] = indice # For the single index axis case, we can use mb.gather directly if len(indices_axes) == 1: axis = indices_axes[0] indices = valid_indices[0] if is_current_opset_version_compatible_with(target.iOS17): # IOS17 `gather` behaviour is undefined for negative indices. indices = mb.select( cond=mb.greater_equal(x=indices, y=0), a=indices, b=mb.add(x=indices, y=value_at(mb.shape(x=x), axis)), ) x = _utils._construct_gather_op("gather", x, indices, axis, name=node.name) context.add(x) return # For multiple index axes case, we delegate broadcast to np if there is no dynamic shape. 
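# Illustrative example (assumed): indices_axes = [1, 2] with index tensors of shapes (2,) and (1,),
# broadcast to the common shape (2,); after stacking along a new last axis the combined indices have
# shape (2, 2), which gather_nd consumes once the indexed axes have been transposed to the front.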
if all(not any_symbolic(idx.shape) for idx in valid_indices): broadcasted_shape = _np.broadcast_shapes(*[idx.shape for idx in valid_indices]) for i, index in enumerate(valid_indices): if (index.shape != broadcasted_shape) and index.val is not None: new_val = _np.broadcast_to(index.val, broadcasted_shape) valid_indices[i] = mb.const( val=new_val, name=index.name + "_broadcasted" ) valid_indices = [mb.cast(x=index, dtype="int32") for index in valid_indices] # First stack the index together indices_rank = valid_indices[0].rank indices = mb.stack(values=valid_indices, axis=indices_rank) # transpose the input tensor to gather the slicing index in front is_connected = True for i in range(1, len(indices_axes)): if indices_axes[i] != indices_axes[i - 1] + 1: is_connected = False break name = node.name + "_transpose" if is_connected else node.name perm = indices_axes + [axis for axis in range(x.rank) if axis not in indices_axes] x = mb.transpose(x=x, perm=perm) if is_current_opset_version_compatible_with(target.iOS17): # IOS17 `gather_nd` behaviour is undefined for negative indices. cond = mb.greater_equal(x=indices, y=0) x_shape = mb.shape(x=x) indices_shape = mb.shape(x=indices) indices_last_dim = value_at(indices_shape, indices.rank - 1) indices_last_dim_expand = mb.expand_dims(x=indices_last_dim, axes=[0]) slice_shape = mb.slice_by_size(x=x_shape, begin=[0], size=indices_last_dim_expand) indices = mb.select( cond=cond, a=indices, b=mb.add(x=indices, y=slice_shape), ) x = _utils._construct_gather_op("gather_nd", x, indices, name=name) # if the index axes are connect, we need to transpose it back if is_connected: new_dimensions = list(range(indices_axes[0], indices_axes[0] + indices_rank)) new_perm = new_dimensions + [ axis for axis in range(rank + indices_rank - len(indices_axes)) if axis not in new_dimensions ] perm_back = [new_perm.index(axis) for axis in range(len(new_perm))] x = mb.transpose(x=x, perm=perm_back, name=node.name) context.add(x) @register_torch_op def ones(context, node): inputs = _get_inputs( context, node, expected={TorchFrontend.TORCHSCRIPT: [5, 6]}, min_expected={TorchFrontend.TORCHEXPORT: 1, TorchFrontend.EXECUTORCH: 1}, ) size = inputs[0] # dtype = NUM_TO_TORCH_DTYPE[inputs[1].val] unused # layout = inputs[2] unused # device = inputs[3] unused # requires_grad = inputs[4] unused # out = inputs[5] unused if isinstance(size, list): size = mb.concat(values=size, axis=0) fill = mb.fill(shape=size, value=1.0, name=node.name) context.add(fill) @register_torch_op def ones_like(context, node): inputs = _get_inputs(context, node, expected=6) x = inputs[0] if is_current_opset_version_compatible_with(target.iOS16): fill = mb.fill_like(ref_tensor=x, value=1.0, name=node.name) else: size = mb.shape(x=x) # dtype = NUM_TO_TORCH_DTYPE[inputs[1].val] unused # layout = inputs[2] unused # device = inputs[3] unused # requires_grad = inputs[4] unused # out = inputs[5] unused fill = mb.fill(shape=size, value=1.0, name=node.name) context.add(fill) @register_torch_op def fill(context, node): inputs = _get_inputs(context, node, expected=2) shape = inputs[0].shape value = inputs[1].val result = mb.fill(shape=shape, value=value, name=node.name) context.add(result) def _make_fill_op(size, val, name): assert val is not None if isinstance(size, list): size = mb.concat(values=size, axis=0) if types.is_float(size.dtype): size = mb.cast(x=size, dtype="int32") fill = mb.fill(shape=size, value=val, name=name) return fill @register_torch_op def full(context, node): inputs = _get_inputs(context, node, 
min_expected=2) size = inputs[0] # dtype could be torch.dtype or an integer that maps to a numpy.dtype dtype = None if len(inputs) < 3 or inputs[2] is None: dtype = np.float32 elif isinstance(inputs[2].val, torch.dtype): dtype = NUM_TO_NUMPY_DTYPE[TORCH_DTYPE_TO_NUM[inputs[2].val]] elif isinstance(inputs[2].val, (int, np.generic)): dtype = NUM_TO_NUMPY_DTYPE[inputs[2].val] else: raise ValueError(f"unsupported type {type(inputs[2].val)}.") val = dtype(inputs[1].val) result = _make_fill_op(size, val, node.name) context.add(result) @register_torch_op def full_like(context, node): inputs = _get_inputs(context, node, min_expected=2) x = inputs[0] val = inputs[1].val if is_current_opset_version_compatible_with(target.iOS16): result = mb.fill_like(ref_tensor=x, value=val, name=node.name) else: size = mb.shape(x=inputs[0]) result = _make_fill_op(size, val, node.name) context.add(result) @register_torch_op def new_full(context, node): # The difference between "new_full" and "full" is that the "new_full" is called from # an existing tensor: tensor.new_full(size, fill_value), while the "full" is called # from the torch API: torch.full(size, fill_value). # But they are basically doing the same thing. inputs = _get_inputs(context, node) size = inputs[1] val = inputs[2].val result = _make_fill_op(size, val, node.name) context.add(result) @register_torch_op(torch_alias=["randint.low"]) def randint(context, node): def _parse_positional_args(context, node) -> Tuple[Var]: inputs = _get_inputs(context, node, min_expected=2) if context.frontend == TorchFrontend.TORCHSCRIPT or node.kind == "randint.low": low = mb.cast(x=inputs[0], dtype="fp32") high = mb.cast(x=inputs[1], dtype="fp32") shape = inputs[2].val else: assert node.kind == "randint" low = 0.0 high = mb.cast(x=inputs[0], dtype="fp32") shape = inputs[1].val return low, high, shape low, high, shape = _parse_positional_args(context, node) rand_uniform = mb.random_uniform(shape=shape, low=low, high=high) rand_int = mb.cast(x=rand_uniform, dtype="int32", name=node.name) context.add(rand_int) @register_torch_op def rand(context, node): shape, _, dtype, _, _ = _get_inputs(context, node) dtype = NUM_TO_DTYPE_STRING[TORCH_DTYPE_TO_NUM[dtype.val]] if dtype else "fp32" low, high = mb.cast(x=0.0, dtype=dtype), mb.cast(x=1.0, dtype=dtype) rand_uniform = mb.random_uniform(shape=shape, low=low, high=high) context.add(rand_uniform, node.name) @register_torch_op def randn(context, node): inputs = _get_inputs(context, node, expected=[5, 6]) shape = inputs[0] dtype = inputs[1] _assert_torch_dtype_num_is_not_complex_number(dtype) rand_normal = mb.random_normal(shape=shape) rand_fp32 = mb.cast(x=rand_normal, dtype="fp32", name=node.name) context.add(rand_fp32) @register_torch_op def randn_like(context, node): inputs = _get_inputs(context, node, expected=6) x = inputs[0] dtype = inputs[1] _assert_torch_dtype_num_is_not_complex_number(dtype) shape = mb.shape(x=x) rand_normal = mb.random_normal(shape=shape) rand_fp32 = mb.cast(x=rand_normal, dtype="fp32", name=node.name) context.add(rand_fp32) @register_torch_op def bitwise_not(context, node): inputs = _get_inputs(context, node) x = inputs[0] dtype = x.dtype if types.is_int(dtype): x = mb.add(x=x, y=1) x = mb.mul(x=x, y=-1, name=node.name) elif types.is_bool(dtype): x = mb.logical_not(x=x, name=node.name) else: raise ValueError("Not supported type {} found for 'bitwise_not' op".format(dtype)) context.add(x) @register_torch_op(torch_alias=["and"]) def bitwise_and(context, node): inputs = _get_inputs(context, node) 
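# Only the boolean case is lowered (to mb.logical_and); e.g. (assumed)
#     torch.bitwise_and(torch.tensor([True, False]), torch.tensor([True, True]))
# converts, whereas integer bitwise_and raises NotImplementedError below.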
input_dtypes = [i.dtype for i in inputs] if all(types.is_bool(input_dtype) for input_dtype in input_dtypes): logical_and(context, node) else: raise NotImplementedError( f"The `bitwise_and` op only supports boolean input, but get {input_dtypes}." ) @register_torch_op def logical_not(context, node): # There is an optional `out` parameter in torch.logical_not. inputs = _get_inputs(context, node, expected=[1, 2]) x = inputs[0] if not types.is_bool(x.dtype): x = mb.cast(x=x, dtype="bool") res = mb.logical_not(x=x, name=node.name) context.add(res) def _avg_pool(context, node, inputs): x = inputs[0] kernel_sizes = inputs[1] strides = kernel_sizes # default strides = kernel sizes if len(inputs) > 2: strides = inputs[2] # TorchScript may give us empty stride, in such case # we still default strides to kernel sizes, but name conform to TorchScript if strides.op.op_type == "const" and (not list(strides.val)): strides = mb.const(val=kernel_sizes.val, name=strides.name) pad_type = "custom" # Need to explicitly state L-R, T-B pad pad = None if len(inputs) < 4 else _np.repeat(inputs[3].val, 2) ceil_mode = False if len(inputs) < 5 else inputs[4].val include_pad = True if len(inputs) < 6 else inputs[5].val spatial_rank = 0 if pad is None else len(pad) // 2 if spatial_rank > 2 and ceil_mode is True and list(strides.val) != [1] * len(strides.val): # since MIL does not support ceil_mode for 3D pool, # need to adjust padding values if ceil_mode is True # ceil_mode only causes any difference though, if the strides are not 1 x_spatial_dimensions = x.shape[-spatial_rank:] new_pad = _adjust_pad_for_ceil_mode( x_spatial_dimensions, kernel_sizes.val, strides.val, pad ) if _np.sum(_np.abs(new_pad - pad)) > 1e-3: if include_pad: raise ValueError('pool3D with ceil mode=True and include_pad=True not supported') pad = new_pad pool = mb.avg_pool( x=x, kernel_sizes=kernel_sizes, strides=strides, pad_type=pad_type, pad=pad, name=node.name, exclude_padding_from_average=not include_pad, ceil_mode=ceil_mode if spatial_rank <= 2 else False, ) context.add(pool) @register_torch_op def avg_pool1d(context, node): inputs = _get_inputs( context, node, expected={TorchFrontend.TORCHSCRIPT : 6}, min_expected={TorchFrontend.TORCHEXPORT: 2, TorchFrontend.EXECUTORCH: 2}, ) _avg_pool(context, node, inputs) @register_torch_op def avg_pool2d(context, node): inputs = _get_inputs( context, node, min_expected={ TorchFrontend.TORCHSCRIPT: 6, TorchFrontend.TORCHEXPORT: 2, TorchFrontend.EXECUTORCH: 2, }, ) divisor_override = None if len(inputs) < 7 else inputs[6] if divisor_override is not None: raise ValueError("divisor_override is not supported for avg_pool2d") _avg_pool(context, node, inputs) @register_torch_op def avg_pool3d(context, node): inputs = _get_inputs( context, node, expected={TorchFrontend.TORCHSCRIPT : 7}, min_expected={TorchFrontend.TORCHEXPORT: 2, TorchFrontend.EXECUTORCH: 2}, ) divisor_override = inputs[6] if divisor_override is not None: raise ValueError("divisor_override is not supported for avg_pool3d") _avg_pool(context, node, inputs) @register_torch_op(torch_alias=["_log_softmax"]) def log_softmax(context, node): def _parse_positional_args(context, node) -> Tuple[Var]: inputs = _get_inputs(context, node, min_expected=2) nargs = len(inputs) x = inputs[0] axis = inputs[1] # input 2 is dtype, so we ignore return x, axis x, axis = _parse_positional_args(context, node) res = mb.softmax(x=x, axis=axis, name=node.name + "_softmax") res = mb.log(x=res, name=node.name) context.add(res) @register_torch_op(torch_alias=["nll_loss_nd"]) 
def nll_loss(context, node): inputs = _get_inputs(context, node, expected=5) x = inputs[0] target = inputs[1] weight = inputs[2] reduction = inputs[3] ignore_index = inputs[4] # mapping for reduction reduction_mapping = {0: "none", 1: "mean", 2: "sum"} reduction = reduction_mapping[reduction.val] # compute the weights loss batch_size = x.shape[0] class_num = x.shape[1] # only support weight and ignore_index both None if weight is not None: raise NotImplementedError("Only unity weight is supported for NLLLoss.") if ignore_index.val != -100: raise NotImplementedError("ignore index not supported for NLLLoss.") x = mb.cast(x=x, dtype="fp32") x = mb.mul(x=x, y=-1.) target = mb.cast(x=target, dtype="int32") labels = mb.one_hot(indices=target, one_hot_vector_size=class_num) labels = mb.cast(x=labels, dtype="fp32") loss = mb.mul(x=x, y=labels) loss = mb.reduce_sum(x=loss, axes=[1]) # reduction type if reduction == "none": out = mb.identity(x=loss, name=node.name) elif reduction == "sum": out = mb.reduce_sum(x=loss, axes=[0], keep_dims=False, name=node.name) elif reduction == "mean": out = mb.real_div(x=loss, y=_np.float32(batch_size)) out = mb.reduce_sum(x=out, axes=[0], keep_dims=False, name=node.name) else: raise NotImplementedError("Unsupported reduction type for NLLLoss.") context.add(out) @register_torch_op def sigmoid(context, node): inputs = _get_inputs(context, node, expected=1) res = mb.sigmoid(x=inputs[0], name=node.name) context.add(res) @register_torch_op def hardsigmoid(context, node): inputs = _get_inputs(context, node, expected=1) res = mb.sigmoid_hard(x=inputs[0], alpha=1.0 / 6, beta=0.5, name=node.name) context.add(res) @register_torch_op def gelu(context, node): inputs = _get_inputs(context, node) assert len(inputs) in (1, 2) mode = None if len(inputs) == 2: approximate = inputs[1].val if approximate == "tanh": mode = "TANH_APPROXIMATION" else: assert approximate == "none" res = mb.gelu(x=inputs[0], mode=mode, name=node.name) context.add(res) @register_torch_op(torch_alias=["_slice", "slice_copy"]) def slice(context, node): def _parse_positional_args(context, node) -> Tuple[Var]: inputs = _get_inputs( context, node, expected=(1, 2, 3, 4, 5), ) nargs = len(inputs) x = inputs[0] dim = inputs[1].val if nargs > 1 else 0 start = None if nargs > 2: start = inputs[2] if isinstance(start, Var) and start.val is not None: start = start.val end = None if nargs > 3: end = inputs[3] if isinstance(end, Var) and end.val is not None: end = end.val step = inputs[4].val if nargs > 4 else 1 return x, dim, start, end, step x, dim, start, end, step = _parse_positional_args(context, node) # torch.export may have kwargs if context.frontend == TorchFrontend.TORCHEXPORT: if dim == 0: dim = _get_kwinputs(context, node, "dim", default=[dim])[0] if start is None: start = _get_kwinputs(context, node, "start", default=[start])[0] if end is None: end = _get_kwinputs(context, node, "end", default=[end])[0] if step == 1: step = _get_kwinputs(context, node, "step", default=[step])[0] # torch start = None means Core ML start = 0 if start is None: start = 0 # dim must be constant if isinstance(dim, Var): dim = dim.val assert dim is not None if start == 0 and end is None and step == 1: # Handling x[:], just pass through the tensor. 
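# This branch covers e.g. (assumed) `x[:]`, i.e. slice(x, dim, start=None/0, end=None, step=1):
# nothing is cut away, so the tensor is forwarded unchanged via mb.identity.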
x_identity = mb.identity(x=x, name=node.name) context.add(x_identity) return begin_array = [0] * len(x.shape) begin_array[dim] = start end_array = [s if isinstance(s, int) else 0 for s in x.shape] end_mask = [True] * len(x.shape) if end is not None: end_array[dim] = end # if end >= x.shape[dim], then end can be ignored, i.e. end_mask[dim] = True end_mask[dim] = True if isinstance(end, int) and end >= x.shape[dim] else False if isinstance(start, Var): begin_array = mb.concat(values=begin_array, axis=0) if isinstance(end, Var): end_array = mb.concat(values=end_array, axis=0) kwargs = { "x": x, "begin": begin_array, "end": end_array, "end_mask": end_mask, "name": node.name, } if step != 1: stride_array = _np.array([1] * len(x.shape)) stride_array[dim] = step kwargs["stride"] = stride_array res = mb.slice_by_index(**kwargs) context.add(res) @register_torch_op(torch_alias=["split_with_sizes", "split_with_sizes_copy"]) def split(context, node): def _parse_positional_args(context, node) -> Tuple[Var]: inputs = _get_inputs(context, node, min_expected=2) nargs = len(inputs) x = inputs[0] split_sizes = inputs[1] dim = inputs[2] if nargs > 2 else 0 return x, split_sizes, dim def _parse_keyword_args(context, node, dim) -> Var: # Only torch.export may have kwargs if context.frontend != TorchFrontend.TORCHEXPORT: return dim dim = _get_kwinputs(context, node, "dim", default=[dim])[0] return dim def _translate_torch_args(dim) -> Var: if isinstance(dim, Var): dim = dim.val return dim x, split_sizes, dim = _parse_positional_args(context, node) dim = _parse_keyword_args(context, node, dim) dim = _translate_torch_args(dim) if not isinstance(split_sizes.val, _np.ndarray): shape = mb.shape(x=x) dim_size = _list_select(shape, dim) # MIL split op needs the size of each split to be given explicitly. num_whole_splits = mb.floor_div(x=dim_size, y=split_sizes) remainder = mb.mod(x=dim_size, y=split_sizes) # MIL doesn't have a way of turning a scalar into a tensor (list write # only supports tensors). As a workaround, we create a constant [1] # tensor and multiply it by the scalar value, thus creating a tensor # with the scalar value in it. tmp = mb.const(val=[1]) whole_sizes = mb.mul(x=tmp, y=split_sizes) reps = mb.mul(x=tmp, y=num_whole_splits) whole_sizes = mb.tile(x=whole_sizes, reps=reps) if remainder.val == 0: split_sizes = whole_sizes else: partial_size = mb.mul(x=tmp, y=remainder) split_sizes = mb.concat(values=[whole_sizes, partial_size], axis=0) res = mb.split(x=x, split_sizes=split_sizes, axis=dim, name=node.name) context.add(res, torch_name=node.name) @register_torch_op def unbind(context, node): def _parse_positional_args(context, node) -> Tuple[Var]: inputs = _get_inputs(context, node, expected=(1, 2)) nargs = len(inputs) x = inputs[0] dim = inputs[1] if nargs > 1 else 0 return x, dim x, dim = _parse_positional_args(context, node) # torch.export may have kwargs if context.frontend == TorchFrontend.TORCHEXPORT: if dim == 0: dim = _get_kwinputs(context, node, "dim", default=[dim])[0] if isinstance(dim, Var): dim = dim.val split_sizes = [1] * x.shape[dim] if len(split_sizes) == 1: res = [mb.squeeze(x=x, axes=[dim])] else: res = mb.split(x=x, split_sizes=split_sizes, axis=dim, name=node.name) res = [mb.squeeze(x=x, axes=[dim]) for x in res] context.add(res, torch_name=node.name) @register_torch_op(torch_alias = ["_to_copy"]) def to(context, node): inputs = _get_inputs(context, node) # There are a lot of variants of `to` op. 
# - When len(inputs) is 7 or 8, we only care about the first two params (input and dtype). # - When len(inputs) == 6, the parameter is (input, _, dtype, non_blocking, copy, memory_format) # - When len(inputs) == 5, the parameter is (input, dtype, non_blocking, copy, memory_format) # - When len(inputs) == 4, the parameter is (input, dtype, non_blocking, copy) # - When len(inputs) == 3, the parameter is (input, non_blocking, copy) # We only use `input` and `dtype`, and `non_blocking` and `copy` are unused. _input = inputs[0] target_dtype: Optional[Var] inputs_len = len(inputs) if inputs_len in (4, 5, 7, 8): target_dtype = inputs[1] elif inputs_len == 6: target_dtype = inputs[2] elif inputs_len <= 3: target_dtype = None else: raise ValueError( "Received invalid arguments for PyTorch conversion of op {}".format(node) ) if target_dtype is None: # When target_dtype is None, it means the input's dtype is already the target dtype. context.add(_input, torch_name=node.name) return elif types.is_scalar(target_dtype.sym_type) and target_dtype.val is not None: dtype = target_dtype.val else: # When the val of dtype is not available, bridge from the np dtype. np_type = nptype_from_builtin(target_dtype.dtype) dtype = NUMPY_DTYPE_TO_TORCH_NUM[np_type] torch_dtype = dtype_to_32bit(NUM_TO_TORCH_DTYPE[dtype]) if isinstance(_input, Var) and _input.can_be_folded_to_const(): # numpy -> torch -> torch cast -> numpy # This path is needed to use the mapping of passed in dtypes to torch dtypes. casted_input = torch.tensor(_input.val).type(torch_dtype).cpu().numpy() res = mb.const(val=casted_input, name=node.name) else: dtype_str = NUM_TO_DTYPE_STRING[dtype] valid_dtypes = ( {"int8", "uint8", "int16", "uint16", "int32", "fp16", "fp32", "bool"} if is_current_opset_version_compatible_with(target.iOS17) else {"int32", "fp16", "fp32", "bool"} ) if dtype_str in valid_dtypes: res = mb.cast(x=_input, dtype=dtype_str, name=node.name) else: # For dtype that is not supported by mb.cast, we do it in best-effort to cast it to int # or float based on the dtype. np_dtype = NUM_TO_NUMPY_DTYPE[dtype] if _np.issubdtype(np_dtype, _np.integer): res = mb.cast(x=_input, dtype="int32", name=node.name) elif _np.issubdtype(np_dtype, _np.floating): res = mb.cast(x=_input, dtype="fp32", name=node.name) else: raise ValueError(f"Unsupported op {node} with target dtype {np_dtype}") context.add(res) @register_torch_op def erf(context, node): inputs = _get_inputs(context, node, expected=1) _input = inputs[0] erf = mb.erf(x=_input, name=node.name) context.add(erf) @register_torch_op(torch_alias=["scalarimplicit"]) def implicittensortonum(context, node): inputs = _get_inputs(context, node, expected=1) _input = inputs[0] if _input.shape == (): # already a scalar context.add(_input, node.name) else: assert _input.shape == (1,) # shape: (1,) -> () squeeze = mb.squeeze(x=_input, name=node.name) context.add(squeeze) @register_torch_op def constantchunk(context, node): inputs = _get_inputs(context, node, expected=1) x = inputs[0] # ConstantChunk gets its parameters as attributes of the node. 
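# Worked example (assumed): total = 10, chunks = 4 -> size = ceil(10 / 4) = 3,
# split_sizes starts as [3, 3, 3] (floor(10 / 3) = 3 whole chunks) and the remainder 1 is appended,
# giving [3, 3, 3, 1], which matches torch.chunk on a dimension of size 10.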
chunks = node.attr["chunks"] dim = node.attr["dim"] total = x.shape[dim] size = int(_math.ceil(float(total) / float(chunks))) split_sizes = [size] * int(_math.floor(total / size)) remainder = total - sum(split_sizes) if remainder > 0: split_sizes.append(remainder) res = mb.split(x=x, split_sizes=split_sizes, axis=dim, name=node.name) for val, name in zip(res, node.outputs): context.add(val, name) def _broadcast(name, tensor, shape): if len(shape) > tensor.rank: new_dims = len(shape) - tensor.rank tensor = mb.expand_dims(x=tensor, axes=list(range(new_dims))) reps = [] for ts, ds in zip(tensor.shape, shape): if not is_symbolic(ts) and not is_symbolic(ds) and ds > 0 and ts == 1: reps.append(ds) else: reps.append(1) res = mb.tile(x=tensor, reps=reps, name=name) return res @register_torch_op(torch_alias=["expand_copy"]) def expand(context, node): def _broadcast_dynamic(name, tensor, shape): # Add any extra dimensions if len(shape) > tensor.rank: new_dims = len(shape) - tensor.rank tensor = mb.expand_dims(x=tensor, axes=list(range(new_dims))) tensor_shape = mb.shape(x=tensor) shape = mb.concat(values=shape, axis=0) reps = mb.real_div(x=shape, y=tensor_shape) reps = mb.cast(x=reps, dtype="int32") res = mb.tile(x=tensor, reps=reps, name=name) return res # PyTorch 1.6+ has 3 inputs while older version has 2 inputs = _get_inputs(context, node, expected=[2, 3]) x = inputs[0] shape = inputs[1] if isinstance(shape, list): res = _broadcast_dynamic(node.name, x, shape) else: res = _broadcast(node.name, x, shape.val) context.add(res) @register_torch_op def expand_as(context, node): # PyTorch 1.6+ has 3 inputs while older version has 2 inputs = _get_inputs(context, node, expected=[2, 3]) x = inputs[0] other = inputs[1] res = _broadcast(node.name, x, other.shape) context.add(res) @register_torch_op( torch_alias=[ "atleast_2d", "atleast_3d", "atleast_1d.sequence", "atleast_2d.sequence", "atleast_3d.sequence", ] ) def atleast_1d(context, node): def _maybe_expand_dims(x: Var, rank: int, name: Optional[str] = None) -> Var: if x.rank < rank: if rank == 3: if x.rank == 2: axes = [2] elif x.rank == 1: axes = [0, 2] else: axes = [0, 1, 2] else: axes = [*range(rank - x.rank)] kwargs = {"x": x, "axes": axes} if name is not None: kwargs["name"] = name x = mb.expand_dims(**kwargs) return x inputs = _get_inputs(context, node, expected=1)[0] rank = int(node.kind[8]) assert rank in (1, 2, 3) if isinstance(inputs, (tuple, list)): results = [] for x in inputs: results.append(_maybe_expand_dims(x, rank)) else: assert isinstance(inputs, Var) x = inputs results = _maybe_expand_dims(x, rank, node.name) context.add(results, torch_name=node.name) def _arange( context, node_name: str, start: Var, end: Var, step: Var, ): # torch may have mixed precision, including mixing float and int, # but Core ML needs these inputs to have uniform dtype start, end, step = promote_input_dtypes([start, end, step]) res = mb.range_1d(start=start, end=end, step=step, name=node_name) context.add(res) @register_torch_op(torch_alias=["arange.start"]) def arange(context, node): def _parse_positional_args(context, node) -> Tuple[Var]: inputs = _get_inputs(context, node, min_expected=1) nargs = len(inputs) if context.frontend == TorchFrontend.TORCHSCRIPT: # dtype = inputs[-4] # layout = inputs[-3] # device = inputs[-2] # pin_memory = inputs[-1] if nargs == 1 or nargs == 5: # inputs are [end] or [end, dtype, layout, device, pin_memory] start = 0 end = inputs[0] step = 1 elif nargs == 6: # inputs are [start, end, dtype, layout, device, pin_memory] start = 
inputs[0] end = inputs[1] step = 1 elif nargs == 7: # inputs are [start, end, step, dtype, layout, device, pin_memory] start = inputs[0] end = inputs[1] step = inputs[2] else: raise ValueError(f"arange must have exactly 5, 6, or 7 inputs, got {nargs}") else: if re.match(r"arange\.start.*", node.kind): start = inputs[0] assert nargs > 1, "arange.start has at least 2 positional args: start, end" end = inputs[1] if node.kind == "arange.start_step": step = inputs[2] if nargs > 2 else 1 else: step = 1 else: start = 0 end = inputs[0] step = 1 return start, end, step def _parse_keyword_args(context, node, step) -> Var: # Only torch.export may have kwargs if context.frontend != TorchFrontend.TORCHEXPORT: return step step = _get_kwinputs(context, node, "step", default=[step])[0] return step start, end, step = _parse_positional_args(context, node) step = _parse_keyword_args(context, node, step) _arange(context, node.name, start, end, step) @register_torch_op(torch_alias=["arange.start_step"]) def arange_start_step(context, node): inputs = _get_inputs(context, node) start = inputs[0] end = inputs[1] step = 1 if len(inputs) < 3 else inputs[2] _arange(context, node.name, start, end, step) @register_torch_op def masked_fill(context, node): inputs = _get_inputs(context, node, expected=3) x = inputs[0] mask = inputs[1] value = inputs[2] if not types.is_bool(mask.dtype): # cond must be bool type mask = mb.cast(x=mask, dtype="bool") if value.dtype != x.dtype: value = mb.cast(x=value, dtype=builtin_to_string(x.dtype)) value, x = promote_input_dtypes([value, x]) res = mb.select(cond=mask, a=value, b=x, name=node.name) context.add(res) @register_torch_op(torch_alias=["meshgrid.indexing"]) def meshgrid(context, node): """ For N input tensors, a meshgrid is constructed by viewing each tensor as an N-dimension tensor with values in the dimension corresponding it its order in the args. (a.) Then, it is expanded along dimensions corresponding to the dimensions of each 1d tensor in the order that they were passed in. (b.) Each output tensor is put into a tuple that is returned. These tuples form N, N-dimenional grids, where the ith grid is defined as expanding the ith input over dimensions defined by the other inputs. """ def _parse_positional_args(context, node) -> Tuple[Var]: inputs = _get_inputs(context, node, expected=[1, 2]) nargs = len(inputs) tensor_inputs = inputs[0] indexing = inputs[1].val if nargs > 1 else "ij" return tensor_inputs, indexing def _check_args(tensor_inputs, indexing) -> None: assert isinstance(tensor_inputs, (list, tuple)) if len(tensor_inputs) < 2: raise ValueError("Requires >= 2 tensor inputs.") if any([len(tensor_var.shape) > 1 for tensor_var in tensor_inputs]): raise ValueError("meshgrid received non-1d tensor.") if indexing not in ("ij", "xy"): raise ValueError(f"indexing mode {indexing} not supported") tensor_inputs, indexing = _parse_positional_args(context, node) # torch.export may have kwargs if context.frontend == TorchFrontend.TORCHEXPORT: if indexing == "ij": indexing = _get_kwinputs(context, node, "indexing", default=[indexing])[0] _check_args(tensor_inputs, indexing) dim_tuple = tuple(tensor_var.shape[0] for tensor_var in tensor_inputs) grids = [] size = len(tensor_inputs) for i in range(size): view_shape = [1] * size view_shape[i] = -1 view_shape = tuple(view_shape) # (a.) in docstring view = mb.reshape( x=tensor_inputs[i], shape=view_shape, name=node.name + "_view_" + str(i) ) # (b.) 
in docstring reps = [ ds if ds > 0 and ts == 1 else 1 for ts, ds in zip(view.shape, dim_tuple) ] res = mb.tile(x=view, reps=reps, name=node.name + "_expand_" + str(i)) # transpose the first two dimensions for "xy" indexing if indexing == "xy": perm = [1, 0] + list(range(2, size)) res = mb.transpose(x=res, perm=perm, name=node.name + "_transpose_" + str(i)) grids.append(res) context.add(tuple(grids), node.name) # Defines all the nodes that are noOps @register_torch_op( torch_alias=[ "_assert_async.msg", "_assert_scalar", "_local_scalar_dense", "alias_copy", "clone", "contiguous", "detach", "device", "dropout", "feature_dropout", "lift_fresh", "lift_fresh_copy", ] ) def noop(context, node): logger.info(f"Setting pytorch op: {node.kind} to no-op.") # These noops do not produce output if node.kind in ("_assert_scalar",): return # Other noops return input as output else: inputs = _get_inputs(context, node) _input = inputs[0] context.add(_input, torch_name=node.name) @register_torch_op def argmax(context, node): inputs = _get_inputs(context, node) x = inputs[0] axis = inputs[1] keep_dims = inputs[2] if types.is_int(x.dtype) and x.dtype._width == 64: # MIL reduce_argmax doesn't support int64. x = mb.cast(x=x, dtype="int32") res = mb.reduce_argmax(x=x, axis=axis, keep_dims=keep_dims, name=node.name) context.add(res) @register_torch_op(torch_alias=["empty_like"]) def zeros_like(context, node): inputs = _get_inputs( context, node, expected={TorchFrontend.TORCHSCRIPT: 6}, min_expected={TorchFrontend.TORCHEXPORT: 1, TorchFrontend.EXECUTORCH: 1}, ) x = inputs[0] shape = mb.shape(x=x) if len(inputs) > 1 and inputs[1] and inputs[1].val: dtype = inputs[1].val np_type = NUM_TO_NUMPY_DTYPE[dtype] else: np_type = nptype_from_builtin(x.dtype) if shape.can_be_folded_to_const(): shape = shape.val zeros = _np.zeros(shape).astype(np_type) zeros_like = mb.const(val=zeros, name=node.name) else: value = np_type(0) if is_current_opset_version_compatible_with(target.iOS16): zeros_like = mb.fill_like(ref_tensor=x, value=value, name=node.name) else: zeros_like = mb.fill(shape=shape, value=value, name=node.name) context.add(zeros_like) @register_torch_op(torch_alias=["empty", "empty.memory_format"]) def zeros(context, node): inputs = _get_inputs(context, node, min_expected=1) size = inputs[0] if len(inputs) > 1 and inputs[1] is not None: dtype = inputs[1].val else: dtype = torch.get_default_dtype() assert dtype in (torch.float32, torch.float64) dtype = 6 if isinstance(size, list) or not size.can_be_folded_to_const(): # the size is dynamic or this zeros op cannot be folded into const. size = mb.concat(values=size, axis=0) if isinstance(size, list) else size np_type = NUM_TO_NUMPY_DTYPE[dtype] zeros = mb.fill(shape=size, value=np_type(0), name=node.name) else: # the size is static and this zeros op can be folded into const. size = size.val # layout = inputs[2] unused # device = inputs[3] unused # pin_memory = inputs[4] unused torch_dtype = dtype_to_32bit(NUM_TO_TORCH_DTYPE[dtype]) zeros_array = torch.zeros(tuple(size)).type(torch_dtype).numpy() zeros = mb.const(val=zeros_array, name=node.name) context.add(zeros) @register_torch_op(torch_alias=["new_empty"]) def new_zeros(context, node): inputs = _get_inputs(context, node) shape = inputs[1] if isinstance(shape, list): # when the size is dynamic, it is a list of pymil scalar, # we need to concat them first to get a shape. 
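# e.g. (assumed) tensor.new_zeros((batch, 4)) with a symbolic batch dimension: the size arrives as
# a list of scalar Vars [batch, 4], so they are concatenated into a 1-D shape tensor for mb.fill.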
shape = mb.concat(values=shape, axis=0) context.add(mb.fill(shape=shape, value=0., name=node.name)) @register_torch_op def scalar_tensor(context, node): value = _get_inputs(context, node, expected=1)[0].val context.add(mb.const(val=value, name=node.name)) @register_torch_op def dim(context, node): inputs = _get_inputs(context, node) shape = mb.shape(x=inputs[0]) rank = mb.shape(x=shape) context.add(value_at(rank, 0, node.name)) @register_torch_op def min(context, node): inputs = _get_inputs(context, node, expected=[1, 2, 3]) # mimic functionality from https://pytorch.org/docs/stable/generated/torch.min.html if len(inputs) == 1: value = mb.reduce_min(x=inputs[0], axes=None, name=node.name) context.add(value) elif len(inputs) == 2: value = mb.minimum(x=inputs[0], y=inputs[1], name=node.name) context.add(value) elif len(inputs) == 3: _input = inputs[0] dim = inputs[1].val keepdim = inputs[2].val values = mb.reduce_min(x=_input, axes=[dim], keep_dims=keepdim) indices = mb.reduce_argmin(x=_input, axis=dim, keep_dims=keepdim) assert len(node.outputs) == 2 values_name = node.outputs[0] indices_name = node.outputs[1] context.add(values, torch_name=values_name) context.add(indices, torch_name=indices_name) @register_torch_op def max(context, node): inputs = _get_inputs(context, node, expected=[1, 2, 3]) # mimic functionality from https://pytorch.org/docs/stable/generated/torch.max.html if len(inputs) == 1: value = mb.reduce_max(x=inputs[0], axes=None, name=node.name) context.add(value) elif len(inputs) == 2: value = mb.maximum(x=inputs[0], y=inputs[1], name=node.name) context.add(value) elif len(inputs) == 3: _input = inputs[0] dim = inputs[1].val keepdim = inputs[2].val values = mb.reduce_max(x=_input, axes=[dim], keep_dims=keepdim) indices = mb.reduce_argmax(x=_input, axis=dim, keep_dims=keepdim) assert len(node.outputs) == 2 values_name = node.outputs[0] indices_name = node.outputs[1] context.add(values, torch_name=values_name) context.add(indices, torch_name=indices_name) def _add_amax_amin(context, node, reduce_op): # mimic functionality from https://pytorch.org/docs/stable/generated/torch.amax.html # mimic functionality from https://pytorch.org/docs/stable/generated/torch.amin.html assert len(node.outputs) == 1 all_inputs = _get_inputs(context, node, expected=[2, 3]) _input = all_inputs[0] dim = [all_inputs[1].val] if type(all_inputs[1].val) == int else [x for x in all_inputs[1].val] keepdim = all_inputs[2] if len(all_inputs) == 3 else False context.add(reduce_op(x=_input, axes=dim, keep_dims=keepdim), torch_name=node.outputs[0]) @register_torch_op def amax(context, node): _add_amax_amin(context, node, mb.reduce_max) @register_torch_op def amin(context, node): _add_amax_amin(context, node, mb.reduce_min) @register_torch_op def argsort(context, node): inputs = _get_inputs(context, node, expected=3) ascending = mb.logical_not(x=inputs[2]) argsort = mb.argsort(x=inputs[0], axis=inputs[1], ascending=ascending, name=node.name) context.add(argsort) @register_torch_op def sort(context, node): inputs = _get_inputs(context, node) _input = inputs[0] axis = inputs[1].val ascending = not inputs[2].val indices_name = node.outputs[1] values_name = node.outputs[0] indices = mb.argsort(x=_input, axis=axis, ascending=ascending, name=indices_name) values = mb.gather_along_axis(x=_input, indices=indices, axis=axis, name=values_name) context.add(values, torch_name=values_name) context.add(indices, torch_name=indices_name) @register_torch_op def append(context, node): # Note: by applying 
torchir_passes.transform_inplace_ops the meaning of # this op is changed from the original TorchIR. This op expects a python # list or MIL List as its first input. If an MIL List, the second input # must be a tensor of whatever shape the List expects. If not an MIL List, # the second input can by anything. The result will be the second input # joined to the first input, either by list_write if an MIL list, or # append if a python list. inputs = _get_inputs(context, node, expected=2) ls = inputs[0] value = inputs[1] if isinstance(ls, list): context.add(ls + [value], node.name) elif isinstance(ls, ListVar): index = mb.list_length(ls=ls, name=node.name + "_index") res = mb.list_write(ls=ls, index=index, value=value, name=node.name) context.add(res) else: raise ValueError( "can only append to Python list or MIL ListVar, got {}.".format( type(inputs[0]) ) ) @register_torch_op def gather(context, node): inputs = _get_inputs(context, node) res = mb.gather_along_axis(x=inputs[0], indices=inputs[2], axis=inputs[1], name=node.name) context.add(res) @register_torch_op def index_select(context, node): x = context[node.inputs[0]] axis = context[node.inputs[1]] indices = context[node.inputs[2]] context.add(mb.gather(x=x, indices=indices, axis=axis, name=node.name)) @register_torch_op(torch_alias=["abs"]) def _abs(context, node): x = _get_inputs(context, node, expected=1)[0] if types.is_complex(x.dtype): context.add(mb.complex_abs(x=x, name=node.name)) else: context.add(mb.abs(x=x, name=node.name)) @register_torch_op def repeat(context, node): x, reps = _get_inputs(context, node, expected=2) if isinstance(reps, list): reps = mb.concat(values=reps, axis=0) if reps.shape[0] > len(x.shape): x = mb.expand_dims(x=x, axes=list(range(reps.shape[0] - x.rank))) context.add(mb.tile(x=x, reps=reps, name=node.name)) @register_torch_op(torch_alias=["repeat_interleave.self_tensor", "repeat_interleave.self_int"]) def repeat_interleave(context, node): """ For now, we only support scalar repeats + None or 0 dim """ def _parse_positional_args(context, node) -> Tuple[Var]: inputs = _get_inputs( context, node, expected={TorchFrontend.TORCHSCRIPT: 4}, min_expected={TorchFrontend.TORCHEXPORT: 2, TorchFrontend.EXECUTORCH: 2}, ) nargs = len(inputs) x = inputs[0] repeats = inputs[1] dim = inputs[2] if nargs > 2 else None return x, repeats, dim def repeat_interleave_dim0(x: Var, repeats_val: int, name: str = None) -> Var: """ on a high level: x | tile in dim 0 v [x, x, ...] | reshape to split the repeats v [[x], [x], ...] | transpose(1, 0) V [x^T, x^T, ...] 
| flatten V result """ translation_kwargs = {} if name is not None: translation_kwargs["name"] = name x_shape = mb.shape(x=x) reps = [1] * x.rank reps[0] = repeats_val x_tiled = mb.tile(x=x, reps=reps) split_reps_shape = mb.concat(values=([repeats_val], x_shape), axis=0) x_reshaped = mb.reshape(x=x_tiled, shape=split_reps_shape) perm = [*range(x.rank + 1)] perm[0] = 1 perm[1] = 0 x_transposed = mb.transpose(x=x_reshaped, perm=perm) x_unaffected_sizes = mb.slice_by_index(x=x_shape, begin=[1], end=[x.rank]) result_shape = mb.concat(values=([-1], x_unaffected_sizes), axis=0) result = mb.reshape(x=x_transposed, shape=result_shape, **translation_kwargs) return result x, repeats, dim = _parse_positional_args(context, node) # torch.export may have kwargs if context.frontend == TorchFrontend.TORCHEXPORT: if dim is None: dim = _get_kwinputs(context, node, "dim", default=[dim])[0] repeats_val = repeats.val if isinstance(repeats_val, np.ndarray): repeats_val0 = np.expand_dims(repeats_val, 0).reshape(-1)[0] if np.any(repeats_val != repeats_val0): raise NotImplementedError( "Conversion for torch.repeat_interleave with Tensor repeats has not been implemented" ) repeats_val = repeats_val0 is_dim_0 = True # This would operate on the flattened input tensor if dim is None: x = mb.reshape(x=x, shape=(-1,)) else: dim_val = dim.val assert dim_val is not None, "torch.repeat_interleave uses static dim" if dim_val < 0: dim_val += x.rank # non-0 dim requires additional pre and post treatment if dim_val != 0: is_dim_0 = False # quick return: repeat 1 is noop if repeats_val == 1: context.add(x, torch_name=node.name) return if is_dim_0: result = repeat_interleave_dim0(x, repeats_val, node.name) else: # pre treatment: permute to have dim 0 perm2dim0 = [dim_val] for i in range(x.rank): if i != dim_val: perm2dim0.append(i) x = mb.transpose(x=x, perm=perm2dim0) result_of_dim0 = repeat_interleave_dim0(x, repeats_val) # post treatment: permute back to original dim perm_back = [0] * x.rank for i in range(x.rank): perm_back[perm2dim0[i]] = i result = mb.transpose(x=result_of_dim0, perm=perm_back, name=node.name) context.add(result) @register_torch_op def acos(context, node): inputs = _get_inputs(context, node, expected=1) context.add(mb.acos(x=inputs[0], name=node.name)) @register_torch_op def acosh(context, node): inputs = _get_inputs(context, node, expected=1) context.add(mb.acosh(x=inputs[0], name=node.name)) @register_torch_op def asin(context, node): inputs = _get_inputs(context, node, expected=1) context.add(mb.asin(x=inputs[0], name=node.name)) @register_torch_op def atan(context, node): inputs = _get_inputs(context, node, expected=1) context.add(mb.atan(x=inputs[0], name=node.name)) @register_torch_op def atan2(context, node): """ atan2(Tensor y, Tensor x) Element-wise arctangent of y / x with consideration of the quadrant Returns a new tensor with the signed angles in radians between vector (x, y) and vector (1, 0) On a high level: 1. atan(y / x) to get the angle in [-pi / 2, pi / 2] 2. 
analyze quadrant to determine the angle in [-pi, pi] Reference PyTorch code https://gist.github.com/nikola-j/b5bb6b141b8d9920318677e1bba70466 def my_atan2(y, x): pi = torch.from_numpy(np.array([np.pi])).to(y.device, y.dtype) ans = torch.atan(y / x) ans += ((y > 0) & (x < 0)) * pi ans -= ((y < 0) & (x < 0)) * pi ans *= (1 - ((y > 0) & (x == 0)) * 1.0) ans += ((y > 0) & (x == 0)) * (pi / 2) ans *= (1 - ((y < 0) & (x == 0)) * 1.0) ans += ((y < 0) & (x == 0)) * (-pi / 2) return ans """ inputs = _get_inputs(context, node, expected=2) y = inputs[0] x = inputs[1] if not types.is_float(y.dtype): y = mb.cast(x=y, dtype="fp32") if not types.is_float(x.dtype): x = mb.cast(x=x, dtype="fp32") # basic logical expressions y_less_0 = mb.less(x=y, y=0.0) y_greater_0 = mb.greater(x=y, y=0.0) x_less_0 = mb.less(x=x, y=0.0) x_equal_0 = mb.equal(x=x, y=0.0) # combined logical expressions ygreater0_and_xless0 = mb.logical_and(x=y_greater_0, y=x_less_0) yless0_and_xless0 = mb.logical_and(x=y_less_0, y=x_less_0) ygreater0_and_xequal0 = mb.logical_and(x=y_greater_0, y=x_equal_0) yless0_and_xequal0 = mb.logical_and(x=y_less_0, y=x_equal_0) # bool -> fp32 for numeric operation ygreater0_and_xless0_numeric = mb.cast(x=ygreater0_and_xless0, dtype="fp32") yless0_and_xless0_numeric = mb.cast(x=yless0_and_xless0, dtype="fp32") ygreater0_and_xequal0_numeric = mb.cast(x=ygreater0_and_xequal0, dtype="fp32") yless0_and_xequal0_numeric = mb.cast(x=yless0_and_xequal0, dtype="fp32") # quadrant modification coefficients coeff1 = mb.mul(x=ygreater0_and_xless0_numeric, y=_np.pi) coeff2 = mb.mul(x=yless0_and_xless0_numeric, y=_np.pi) coeff3 = mb.sub(x=1.0, y=ygreater0_and_xequal0_numeric) coeff4 = mb.mul(x=ygreater0_and_xequal0_numeric, y=_np.pi / 2.0) coeff5 = mb.sub(x=1.0, y=yless0_and_xequal0_numeric) coeff6 = mb.mul(x=yless0_and_xequal0_numeric, y=-_np.pi / 2.0) # if -1e-8 < x < 1e-8, x += 2e-8 to avoid y / 0 # this shift makes atan2(0, 0) = 0, which is consistent with PyTorch torch.atan2 x0left = mb.greater(x=x, y=-1e-8) x0right = mb.less(x=x, y=1e-8) x0 = mb.logical_and(x=x0left, y=x0right) x0numeric = mb.cast(x=x0, dtype="fp32") safe_shift = mb.mul(x=x0numeric, y=2e-8) x_safe = mb.add(x=x, y=safe_shift) # compute atan(y / x) ydx = mb.real_div(x=y, y=x_safe) atan2_1 = mb.atan(x=ydx) # analyze quadrant atan2_2 = mb.add(x=atan2_1, y=coeff1) atan2_3 = mb.sub(x=atan2_2, y=coeff2) atan2_4 = mb.mul(x=atan2_3, y=coeff3) atan2_5 = mb.add(x=atan2_4, y=coeff4) atan2_6 = mb.mul(x=atan2_5, y=coeff5) context.add(mb.add(x=atan2_6, y=coeff6, name=node.name)) @register_torch_op def atanh(context, node): inputs = _get_inputs(context, node, expected=1) context.add(mb.atanh(x=inputs[0], name=node.name)) @register_torch_op def ceil(context, node): inputs = _get_inputs(context, node, expected=1) context.add(mb.ceil(x=inputs[0], name=node.name)) @register_torch_op(torch_alias=["clip"]) def clamp(context, node): inputs = _get_inputs(context, node, expected=[1,2,3]) x = inputs[0] min_val = inputs[1] if (len(inputs) > 1 and inputs[1]) else mb.const(val=_np.finfo(_np.float32).min) max_val = inputs[2] if (len(inputs) > 2 and inputs[2]) else mb.const(val=_np.finfo(_np.float32).max) if isinstance(min_val, Var) and isinstance(max_val, Var) and min_val.val >= max_val.val: # When min >= max, PyTorch sets all values to max. context.add(mb.fill(shape=mb.shape(x=x), value=max_val.val, name=node.name)) return is_input_int = types.is_int(x.dtype) if not types.is_float(x.dtype): # The `mb.clip` op requires parameters from type domain ['fp16', 'fp32']. 
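# e.g. (hypothetical) torch.clamp(int32_tensor, 0, 6): the input is cast to fp32 here so that
# mb.clip accepts it, and the result is cast back to int32 further down to keep the integer dtype.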
x = mb.cast(x=x, dtype="fp32") x, min_val, max_val = promote_input_dtypes([x, min_val, max_val]) if is_input_int: clip_res = mb.clip(x=x, alpha=min_val, beta=max_val) context.add(mb.cast(x=clip_res, dtype="int32", name=node.name)) else: context.add(mb.clip(x=x, alpha=min_val, beta=max_val, name=node.name)) @register_torch_op def triu(context, node): assert context.frontend != TorchFrontend.EXECUTORCH, "triu is not a core aten op" inputs = _get_inputs( context, node, expected={ TorchFrontend.TORCHSCRIPT: 2, TorchFrontend.TORCHEXPORT: [1, 2], }, ) x = inputs[0] if len(inputs) > 1 and inputs[1] is not None and inputs[1].val is not None: diagonal = inputs[1].val else: diagonal = 0 if diagonal <= 0: res = mb.band_part(x=x, lower=-diagonal, upper=-1, name=node.name) else: y = mb.band_part(x=x, lower=-1, upper=diagonal - 1) res = mb.sub(x=x, y=y, name=node.name) context.add(res) @register_torch_op def tril(context, node): assert context.frontend != TorchFrontend.EXECUTORCH, "tril is not a core aten op" inputs = _get_inputs( context, node, expected={ TorchFrontend.TORCHSCRIPT: 2, TorchFrontend.TORCHEXPORT: [1, 2], }, ) x = inputs[0] if len(inputs) > 1 and inputs[1] is not None and inputs[1].val is not None: diagonal = inputs[1].val else: diagonal = 0 if diagonal >= 0: res = mb.band_part(x=x, lower=-1, upper=diagonal, name=node.name) else: y = mb.band_part(x=x, lower=-diagonal - 1, upper=-1) res = mb.sub(x=x, y=y, name=node.name) context.add(res) @register_torch_op def cos(context, node): inputs = _get_inputs(context, node, expected=1) context.add(mb.cos(x=inputs[0], name=node.name)) @register_torch_op def cosh(context, node): inputs = _get_inputs(context, node, expected=1) context.add(mb.cosh(x=inputs[0], name=node.name)) @register_torch_op def exp(context, node): inputs = _get_inputs(context, node, expected=1) context.add(mb.exp(x=inputs[0], name=node.name)) @register_torch_op def exp2(context, node): inputs = _get_inputs(context, node, expected=1) context.add(mb.exp2(x=inputs[0], name=node.name)) @register_torch_op def floor(context, node): inputs = _get_inputs(context, node, expected=1) context.add(mb.floor(x=inputs[0], name=node.name)) @register_torch_op def reciprocal(context, node): inputs = _get_inputs(context, node, expected=1) context.add(mb.inverse(x=inputs[0], name=node.name)) @register_torch_op def log(context, node): inputs = _get_inputs(context, node, expected=1) x = inputs[0] if types.is_int(x.dtype): x = mb.cast(x=x, dtype="fp32") context.add(mb.log(x=x, name=node.name)) @register_torch_op(torch_alias=["round"]) def _round(context, node): inputs = _get_inputs(context, node, expected=1) context.add(mb.round(x=inputs[0], name=node.name)) @register_torch_op def rsqrt(context, node): inputs = _get_inputs(context, node, expected=1) context.add(mb.rsqrt(x=inputs[0], name=node.name)) @register_torch_op def sin(context, node): inputs = _get_inputs(context, node, expected=1) context.add(mb.sin(x=inputs[0], name=node.name)) @register_torch_op def sinh(context, node): inputs = _get_inputs(context, node, expected=1) context.add(mb.sinh(x=inputs[0], name=node.name)) @register_torch_op def asinh(context, node): inputs = _get_inputs(context, node, expected=1) context.add(mb.asinh(x=inputs[0], name=node.name)) @register_torch_op def sqrt(context, node): inputs = _get_inputs(context, node, expected=1) context.add(mb.sqrt(x=inputs[0], name=node.name)) @register_torch_op def square(context, node): inputs = _get_inputs(context, node, expected=1) # mb.square is not supported in some backend 
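# i.e. x * x is emitted instead; for example (assumed) torch.square on a (2, 3) tensor becomes a
# single element-wise mb.mul with both operands equal to x.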
context.add(mb.mul(x=inputs[0], y=inputs[0], name=node.name)) @register_torch_op def tan(context, node): inputs = _get_inputs(context, node, expected=1) context.add(mb.tan(x=inputs[0], name=node.name)) @register_torch_op def tanh(context, node): inputs = _get_inputs(context, node, expected=1) context.add(mb.tanh(x=inputs[0], name=node.name)) @register_torch_op def threshold(context, node): inputs = _get_inputs(context, node, expected=3) x = inputs[0] alpha = inputs[1] threshold_val = inputs[2] # Simple case (threshold_val == alpha) if alpha.val == threshold_val.val: threshold_node = mb.threshold(x=x, alpha=alpha, name=node.name) context.add(threshold_node) return # Complex case (threshold_val != threshold) threshold_node = mb.threshold(x=x, alpha=alpha, name=node.name + '_threshold') context.add(threshold_node) gt_node = mb.greater_equal(x=alpha, y=x, name=node.name + '_ge') context.add(gt_node) gt_node_32 = mb.cast(x=gt_node, dtype="fp32", name=node.name + '_ge32') mul_node = mb.linear_activation( x=gt_node_32, alpha=float(threshold_val.val - alpha.val), beta=0., name=node.name + '_mul' ) context.add(mul_node) final_node = mb.add(x=mul_node, y=threshold_node, name=node.name) context.add(final_node) @register_torch_op def sign(context, node): inputs = _get_inputs(context, node, expected=1) context.add(mb.sign(x=inputs[0], name=node.name)) @register_torch_op def is_floating_point(context, node): inputs = _get_inputs(context, node, expected=1) is_float = types.is_float(inputs[0].dtype) context.add(mb.const(val=is_float, name=node.name)) @register_torch_op def logical_and(context, node): inputs = _get_inputs(context, node, expected=2) x, y = inputs x = mb.cast(x=x, dtype="bool") y = mb.cast(x=y, dtype="bool") context.add(mb.logical_and(x=x, y=y, name=node.name)) @register_torch_op def logical_or(context, node): inputs = _get_inputs(context, node, expected=2) x, y = inputs x = mb.cast(x=x, dtype="bool") y = mb.cast(x=y, dtype="bool") context.add(mb.logical_or(x=x, y=y, name=node.name)) @register_torch_op def logical_xor(context, node): inputs = _get_inputs(context, node, expected=2) x, y = inputs x = mb.cast(x=x, dtype="bool") y = mb.cast(x=y, dtype="bool") context.add(mb.logical_xor(x=x, y=y, name=node.name)) def _nonzero_as_tuple(context, node, x): ''' Calculates the non-zero elements of x then slices results by each inner index. 
''' non_zero = mb.non_zero(x=x) result = [] for i in range(x.rank): result.append( mb.slice_by_index( x=non_zero, begin=[0, i], end=[-1, -1], # Ignored, but required end_mask=[True, False], squeeze_mask=[False, True] ) ) context.add(result, node.name) @register_torch_op(torch_alias=["where.self"]) def where(context, node): inputs = _get_inputs(context, node) if len(inputs) == 1: _nonzero_as_tuple(context, node, inputs[0]) return assert len(inputs) == 3 cond, a, b = inputs a, b = promote_input_dtypes([a, b]) if not types.is_bool(cond.dtype): # cond must be bool type cond = mb.cast(x=cond, dtype="bool") if not any([any_symbolic(x.shape) for x in (cond, a, b)]): # broadcast all tensors to the same shape cond, a, b = _broadcast_tensors([cond, a, b]) result = mb.select(cond=cond, a=a, b=b, name=node.name) context.add(result) @register_torch_op def nonzero_numpy(context, node): inputs = _get_inputs(context, node, expected=1) _nonzero_as_tuple(context, node, inputs[0]) @register_torch_op def neg(context, node): inputs = _get_inputs(context, node, expected=1) x, y = promote_input_dtypes([inputs[0], -1]) context.add(mb.mul(x=x, y=y, name=node.name)) @register_torch_op def topk(context, node): inputs = _get_inputs(context, node) kwargs = {"name": node.name, "x": inputs[0], "k": inputs[1]} if len(inputs) > 6: raise Exception("Number of inputs to topk exceeds 6") # optional: @axis if len(inputs) > 2: if inputs[2] is not None: kwargs["axis"] = inputs[2].val # optional: @ascending if len(inputs) > 3: largest = inputs[3].val kwargs["ascending"] = not largest # last inputs to topk are optional - sorted and out. sort = True if len(inputs) > 4: if inputs[4].val is False and not is_current_opset_version_compatible_with(target.iOS16): raise Exception("For opset <= iOS16, only sorted=True supported for the topk") sort = inputs[4].val if len(inputs) > 5: if inputs[5] is not None: raise Exception( "Unsupported value for argument 'out' in topk. Supported values: None, but input " "is {}".format(inputs[5].val) ) if is_current_opset_version_compatible_with(target.iOS16): kwargs["sort"] = sort if kwargs["k"].val is None: res = _utils.dynamic_topk( x=kwargs["x"], k=kwargs["k"], axis=kwargs["axis"], ascending=kwargs["ascending"] ) else: res = mb.topk(**kwargs) values_name = node.outputs[0] indices_name = node.outputs[1] context.add(res[0], torch_name=values_name) context.add(res[1], torch_name=indices_name) def _std(x, axes, keep_dim, unbiased, eps): need_rescale = False if unbiased: # If "unbiased" is True, # then we need to divide by "N-1" (instead of "N") to compute the mean of (x-E[x])^2 # for an unbiased estimate of the variance / standard deviation. # In the sequence of MIL ops added below, we first compute the mean using "N", and only if its unbiased # we rescale later, the final result. 
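# Sketch of the rescaling used below:
#     unbiased_std = sqrt(sum((x - mean)^2) / (N - 1))
#                  = sqrt(mean((x - mean)^2)) * sqrt(N / (N - 1))
# so the biased estimate is simply multiplied by rescale_factor = sqrt(N / (N - 1)).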
# We ignore the "unbiased" flag, if any of the dimensions involved in this operation are dynamic # (we could have still handled that case by using "get_shape" etc ops, but we don't do that here, # trading performance for numerical accuracy) if axes is None: if not any_symbolic(x.shape) and _np.prod(x.shape) > 1: N = _np.prod(x.shape) need_rescale = True else: dims = [] # collect dimensions corresponding to "axes" for axis in axes: dims.append(x.shape[axis]) if all([not is_symbolic(s) for s in dims]): N = _np.prod(dims) if N > 1: need_rescale = True if need_rescale: rescale_factor = _np.sqrt(N / float(N - 1)) x_mean = mb.reduce_mean(x=x, axes=axes, keep_dims=True) x_demeaned = mb.sub(x=x, y=x_mean) x_demeaned_square = mb.square(x=x_demeaned) x_demeaned_square_mean = mb.reduce_mean(x=x_demeaned_square, axes=axes, keep_dims=keep_dim) if eps > 0: x_demeaned_square_mean = mb.add(x=x_demeaned_square_mean, y=eps) if need_rescale: y_before_scale = mb.sqrt(x=x_demeaned_square_mean) y = mb.mul(x=y_before_scale, y=rescale_factor) else: y = mb.sqrt(x=x_demeaned_square_mean) return y @register_torch_op def numel(context, node): inputs = _get_inputs(context, node, expected=1) x = inputs[0] x = mb.shape(x=x) x = mb.reduce_prod(x=x, axes=[0], name=node.name) context.add(x) @register_torch_op def std(context, node): inputs = _get_inputs(context, node) x = inputs[0] if not (len(inputs) == 2 or len(inputs) == 4): raise ValueError("Number of inputs to the 'std' op must be" "2 or 4") keep_dim = False axes = None if len(inputs) == 2: unbiased = inputs[1].val if len(inputs) == 4: axes = inputs[1].val if isinstance(axes, int): axes = [axes] unbiased = inputs[2].val keep_dim = inputs[3].val y = _std(x, axes, keep_dim, unbiased, 0) context.add(y, node.name) @register_torch_op def copy(context, node): inputs = _get_inputs(context, node, expected=[2, 3]) assert ( context.frontend != TorchFrontend.TORCHSCRIPT ), ( "In torch script frontend, by graph pass `generate_tensor_assignment_ops`, " "`torch.copy_` should have been replaced with `_internal_op_tensor_inplace_copy`" ) if context.frontend in TORCH_EXPORT_BASED_FRONTENDS: src = inputs[1] if inputs[0].shape != src.shape: _, src = _broadcast_tensors(inputs[: 2]) result = mb.identity(x=src, name=node.name) else: raise ValueError(f"Invalid PyTorch frontend {context.frontend}") context.add(result) @register_torch_op def dtype(context, node): inputs = _get_inputs(context, node, expected=1) dtype_str = inputs[0].dtype.__name__ context.add(mb.const(val=dtype_str, name=node.name)) @register_torch_op def tensor(context, node): def _make_tensor(list_of_tensor, name, rank): if rank == 6: raise NotImplementedError("Core ML only supports tensor rank <= 5.") if not isinstance(list_of_tensor, list): return list_of_tensor values = [ _make_tensor(x, name + "_r_" + str(i), rank + 1) for i, x in enumerate(list_of_tensor) ] if len(values) == 1: return mb.expand_dims(x=values[0], axes=[0], name=name) return mb.stack(values=values, axis=0, name=name) inputs = _get_inputs(context, node, expected=4) # Case 1: Using torch.tensor to create a const tensor # For example: # torch.tensor([[[0, 0], [0, 10], [5, 10], [5, 0]]], dtype=torch.float32) val = inputs[0] if isinstance(val, list): context.add(_make_tensor(val, node.name, 1)) return if inputs[2] is None: context.add(mb.identity(x=val, name=node.name)) return # Case 2: Create a tensor filled with a single value val = val.val # element val to fill msg_prefix = 'torch::tensor {} '.format(node.name) if val is None: raise ValueError(msg_prefix + 
'val is None') dtype_str = inputs[1].val if dtype_str != "fp32": raise NotImplementedError( msg_prefix + "Unsupported dtype: {}".format(dtype_str) ) # inputs[3] is a bool (not sure what it is) shape = mb.shape(x=inputs[2], name=node.name + "_shape") context.add(mb.fill(shape=shape, value=val, name=node.name)) """ Pack and unpack op in pytorch. The typical pattern is as following >>> seq = torch.tensor([[1,2,0], [3,0,0], [4,5,6]]) >>> lens = [2, 1, 3] >>> packed = pack_padded_sequence(seq, lens, batch_first=True, enforce_sorted=False) >>> packed PackedSequence(data=tensor([4, 1, 3, 5, 2, 6]), batch_sizes=tensor([3, 2, 1]), sorted_indices=tensor([2, 0, 1]), unsorted_indices=tensor([1, 2, 0])) >>> seq_unpacked, lens_unpacked = pad_packed_sequence(packed, batch_first=True) >>> seq_unpacked tensor([[1, 2, 0], [3, 0, 0], [4, 5, 6]]) >>> lens_unpacked tensor([2, 1, 3]) source from https://pytorch.org/docs/stable/generated/torch.nn.utils.rnn.pad_packed_sequence.html """ @register_torch_op def _pack_padded_sequence(context, node): # The implementation of this op is not efficient. Raise a warning. logger.warning( "Encountered a _pack_padded_sequence layer. The implementation of translating pack/unpack op\ in pytorch is not efficient due to the current limitation of Core ML. Removing the pack-unpack logic \ and use a fixed batch size model is recommended." ) inputs = _get_inputs(context, node, expected=3) tensor_name, batch_sizes_name = node.outputs tensor_input = inputs[0] batch_sizes = inputs[1] batch_first = inputs[2].val # by assuming that the output of this op is always feed in lstm layer, # we enforce the layout to be Batch * seq_length * Feature. if not batch_first: tensor_input = mb.transpose(x=tensor_input, perm=[1, 0, 2]) context.add(mb.identity(x=tensor_input, name=tensor_name)) # add the batch_sizes in the context, so that _pad_packed_sequence can # find it later. context.add(mb.identity(x=batch_sizes, name=batch_sizes_name)) @register_torch_op def _pad_packed_sequence(context, node): # The implementation of this op is not efficient. Raise a warning. logger.warning( "Encountered a _pad_packed_sequence layer. The implementation of translating pack/unpack op\ in pytorch is not efficient due to the current limitation of Core ML. Removing the pack-unpack logic \ and use a fixed batch size model is recommended." ) inputs = _get_inputs(context, node) # seq_lengths denotes the actual sequence length for each batch. # pad denotes the padding value for those data which has shorter length. input_tensor = inputs[0] seq_lengths = inputs[1] batch_first = inputs[2].val pad = inputs[3].val # we only support pack and unpack translation for static tensor shape, # i.e., the three dimensions are all known during compile time. if any([is_symbolic(x) for x in input_tensor.shape]): raise NotImplementedError("Only static shape of PackedSequence object is supported.") # the input always has batch first layout. # padded_seq_len denotes the maximum sequence length across batches. batch, padded_seq_len, input_dim = input_tensor.shape assert seq_lengths.rank == 1 assert batch == seq_lengths.shape[0] # we iterate through the batch, pad each data, and concate them into a single tensor in the end, # which is the total_tensor here. # Say the input_tensor has shape [batch , padded_seq_len, input_dim], # and the seq_lengths = [len_1, len_2, len_3]. # Note that in pytorch, the seq_lengths must be decreasing in order, len_1 >= len_2 >= len_3. 
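# Illustrative sketch (not part of the conversion): for a static-shape input of shape
# [batch, padded_seq_len, input_dim], the per-batch loop below is equivalent to the following
# NumPy post-processing. All values here are hypothetical examples.
import numpy as _np_sketch
_seq = _np_sketch.array([[[1., 1.], [2., 2.], [9., 9.]],
                         [[3., 3.], [9., 9.], [9., 9.]]])  # [batch=2, padded_seq_len=3, input_dim=2]
_lens, _pad_value = [2, 1], 0.0
_expected = _seq.copy()
for _i, _len in enumerate(_lens):
    _expected[_i, _len:, :] = _pad_value  # everything past the true sequence length is overwritten with `pad`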
total_tensor = [] for i in range(batch): # slice for each data # x has shape [padded_seq_len, input_dim] x = mb.slice_by_index( x=input_tensor, begin=[i, 0, 0], end=[0, 0, 0], stride=[1, 1, 1], begin_mask=[False, True, True], end_mask=[False, True, True], squeeze_mask=[True, False, False], ) # get the unpadded sequence, # if the unpadded sequence has length seq_length, # x would have shape [seq_length, input_dim]. # For example, the first data would result in a [len_1, input_dim] tensor. seq_length = mb.cast(x=value_at(seq_lengths, i), dtype="int32") concate_values = [seq_length, input_dim] end_index = mb.concat(values=concate_values, axis=0) x = mb.slice_by_index( x=x, begin=[0, 0], end=end_index, stride=[1, 1], begin_mask=[True, True], end_mask=[False, True], ) # get the padding part of the data # Note that we always add one dummy padding in the end with shape [padded_seq_len - seq_length + 1, input_dim]. # The reason is that for the case when seq_length = padded_seq_len, # coreml cannot handle the empty tensor. pad_length = mb.sub(x=padded_seq_len + 1, y=seq_length) concate_values = [pad_length, input_dim] shape = mb.concat(values=concate_values, axis=0) pad_values = mb.fill(shape=shape, value=pad) # concate the unpadded sequence and the padding data # the resulting tensor would have shape [padded_seq_len + 1, input_dim] x, pad_values = promote_input_dtypes([x, pad_values]) concate_values = [x, pad_values] add_values = mb.concat(values=concate_values, axis=0) # trim the dummy padding tensor # the output would have shape [padded_seq_len, input_dim] x = mb.slice_by_index( x=add_values, begin=[0, 0], end=[padded_seq_len, 0], stride=[1, 1], begin_mask=[True, True], end_mask=[False, True], ) # add it to total tensor total_tensor.append(x) # transpose the tensor if batch_first = False if not batch_first: x = mb.stack(values=total_tensor, axis=0) x = mb.transpose(x=x, perm=[1, 0, 2], name=node.name) else: x = mb.stack(values=total_tensor, axis=0, name=node.name) context.add(x) @register_torch_op def log10(context, node): inputs = _get_inputs(context, node) x = inputs[0] log_x = mb.log(x=x) context.add(mb.mul(x=log_x, y=1 / _np.log(10.0)), node.name) @register_torch_op def log2(context, node): inputs = _get_inputs(context, node) x = inputs[0] log_x = mb.log(x=x) context.add(mb.mul(x=log_x, y=1 / _np.log(2.0)), node.name) @register_torch_op def flip(context, node): inputs = _get_inputs(context, node, expected=2) x = mb.reverse(x=inputs[0], axes=inputs[1], name=node.name) context.add(x, node.name) @register_torch_op(torch_alias=["reflection_pad1d"]) def reflection_pad2d(context, node): inputs = _get_inputs(context, node) x = inputs[0] torch_pad = inputs[1].val pad_flipped = torch_pad.reshape((-1, 2))[::-1].ravel() pad = _np.pad(pad_flipped, (len(x.shape) * 2 - len(pad_flipped), 0)) context.add(mb.pad(x=x, pad=pad, mode='reflect'), node.name) @register_torch_op(torch_alias=["replication_pad1d"]) def replication_pad2d(context, node): inputs = _get_inputs(context, node) x = inputs[0] torch_pad = inputs[1].val pad_flipped = torch_pad.reshape((-1, 2))[::-1].ravel() pad = _np.pad(pad_flipped, (len(x.shape) * 2 - len(pad_flipped), 0)) context.add(mb.pad(x=x, pad=pad, mode='replicate'), node.name) def _solve_broadcast_shape(shapes: List[List[int]]) -> List[np.ndarray]: rank = _np.max([len(shape) for shape in shapes]) shapes = [[1] * (rank - len(shape)) + shape for shape in shapes] result_shape = [] for i in range(rank): dims = [shapes[j][i] for j in range(len(shapes))] if any_symbolic(dims): # 
rdar://85559497 (Handle dynamic shapes inputs broadcast for pytorch) symbols = set() integers = set() for dim in dims: if is_symbolic(dim): symbols.add(dim) else: integers.add(dim) # Integers can be safely ignored if integers == {1} or integers == set(): result_dim = list(symbols)[0] result_shape.append(result_dim) # In principle, there must be only 1 symbol # In practise, since our symbol propagation is imperfect, # we may see multiple symbols, even if they must equal to each other / 1 if len(symbols) != 1: logger.warning(f"Recklessly broadcast {symbols} to {result_dim}") # In principle, in such case the symbols must be 1 or equal to the integer # In practise, since our symbol propagation is imperfect, # we may still see symbols, even if they must equal to max integer / 1 else: result_dim = _np.max(list(integers)) result_shape.append(result_dim) logger.warning(f"Recklessly broadcast {symbols} and {integers} to {result_dim}") else: result_shape.append(_np.max(dims)) return result_shape def _broadcast_tensors(tensors): if len(tensors) == 1: return tensors # solve the broadcast shape input_shapes = [list(x.shape) for x in tensors] broadcast_shape = _solve_broadcast_shape(input_shapes) # do the broadcasting results = [] for tensor in tensors: name = tensor.name + "_after_broadcast" results.append(_broadcast(name, tensor, broadcast_shape)) return results @register_torch_op def broadcast_tensors(context, node): inputs = _get_inputs(context, node) context.add(_broadcast_tensors(inputs[0]), node.name) def _scatter(context, inputs, mode, name): data = inputs[0] axis = inputs[1].val indices = inputs[2] updates = inputs[3] if types.is_scalar(updates.sym_type): updates = mb.fill(shape=indices.shape, value=updates.val, name=name) result = mb.scatter_along_axis(data=data, indices=indices, updates=updates, axis=axis, mode=mode, name=name) context.add(result) @register_torch_op def scatter(context, node): inputs = _get_inputs(context, node) assert len(inputs) in (4, 5) # Determine reduce/mode parameter if len(inputs) == 5: mode = inputs[4].val if mode == 'multiply': mode = 'mul' else: assert mode == 'add' else: mode = 'update' _scatter(context, inputs, mode, node.name) @register_torch_op def scatter_add(context, node): inputs = _get_inputs(context, node) _scatter(context, inputs, 'add', node.name) @register_torch_op def baddbmm(context, node): """ baddbmm(Tensor input, Tensor batch1, Tensor batch2, Scalar beta=1, Scalar alpha=1) output = beta * input + alpha * batch1 * batch2 Notice that batch1 and batch2 must be 3-D tensors each containing the same number of matrices. If batch1 is a (b×n×m) tensor, batch2 is a (b×m×p) tensor, then input must be broadcastable with a (b×n×p) tensor and out will be a (b×n×p) tensor. """ assert len(node.outputs) == 1 inputs = _get_inputs(context, node, expected=5) bias, batch1, batch2, beta, alpha = inputs if alpha.val != 1.0: # Apply scaling factor alpha to the input. batch1 = mb.mul(x=alpha, y=batch1, name=batch1.name + "_scaled") context.add(batch1) bmm_node = mb.matmul(x=batch1, y=batch2, name=node.name + "_bmm") if beta.val != 0.0 or bias.shape != bmm_node.shape: context.add(bmm_node) if beta.val != 1.0: # Torch supports integers, so convert to float before if beta.dtype != bias.dtype: logger.warning( f"Casting the `beta`(value={beta.val}) argument of `baddbmm` op {node.name} " f"from {beta.dtype} to {bias.dtype} dtype") beta = mb.cast(x=beta, dtype=TYPE_TO_DTYPE_STRING[bias.dtype]) # Apply scaling factor beta to the bias. 
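# Reference-semantics sketch (hypothetical shapes, not part of the conversion):
# output = beta * input + alpha * (batch1 @ batch2), matching the docstring above.
import numpy as _np_sketch
_b1 = _np_sketch.ones((2, 3, 4), dtype=_np_sketch.float32)
_b2 = _np_sketch.ones((2, 4, 5), dtype=_np_sketch.float32)
_bias_ref = _np_sketch.zeros((2, 3, 5), dtype=_np_sketch.float32)
_beta_ref, _alpha_ref = 2.0, 0.5
_expected = _beta_ref * _bias_ref + _alpha_ref * _np_sketch.matmul(_b1, _b2)  # shape (2, 3, 5)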
bias = mb.mul(x=beta, y=bias, name=bias.name + "_scaled") context.add(bias) baddbmm_node = mb.add(x=bias, y=bmm_node, name=node.name) context.add(baddbmm_node) else: bmm_node.name = node.name context.add(bmm_node) @register_torch_op def glu(context, node): """ glu(Tensor input, Scalar dim=-1) Applies the gated linear unit function GLU(a,b)=a⊗σ(b) where a is the first half of the input matrices and b is the second half. """ assert len(node.outputs) == 1 inputs = _get_inputs(context, node, expected=2) input, axis = inputs first_half, second_half = mb.split(x=input, num_splits=2, axis=axis.val, name=node.name + "_split") context.add(first_half) context.add(second_half) sigmoid_second_half = mb.sigmoid(x=second_half, name=second_half.name + "_sigmoid") context.add(sigmoid_second_half) glu_node = mb.mul(x=first_half, y=sigmoid_second_half, name=node.name) context.add(glu_node) @register_torch_op def hstack(context, node): """ hstack(List[Tensor] tensors, Optional[Tensor] out) Stack tensors in sequence horizontally (column wise). This is equivalent to concatenation along the first axis for 1-D tensors, and along the second axis for all other tensors. """ inputs = _get_inputs(context, node) tensors = inputs[0] input_shapes = [list(x.shape) for x in tensors] # Concatenates along the first axis for 1-D tensors, and along the second axis for all other tensors. axis = 0 if len(input_shapes[0]) == 1 else 1 hstack_node = mb.concat(values=tensors, axis=axis, name=node.name) context.add(hstack_node) @register_torch_op def remainder(context, node): """ remainder(Tensor dividend, Tensor divisor, Optional[Tensor] out) Computes Python’s modulus operation entrywise. The result has the same sign as the divisor and its absolute value is less than that of divisor. It may also be defined in terms of torch.div() as: remainder(a, b) == a - a.div(b, rounding_mode="floor") * b """ # Don't specify `expected` because the parameter `out` is optional. 
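# Reference-semantics sketch (hypothetical values, not part of the conversion):
# remainder(a, b) == a - floor(a / b) * b, so the result takes the sign of the divisor.
import numpy as _np_sketch
_a_ref = _np_sketch.array([5., -5., 5., -5.])
_b_ref = _np_sketch.array([3., 3., -3., -3.])
_expected = _a_ref - _np_sketch.floor(_a_ref / _b_ref) * _b_ref  # -> [2., 1., -1., -2.]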
inputs = _get_inputs(context, node) dividend, divisor = promote_input_dtypes([inputs[0], inputs[1]]) div_node = mb.floor_div(x=dividend, y=divisor, name=node.name + "_div") context.add(div_node) scaled_div = mb.mul(x=div_node, y=divisor, name=div_node.name + "_scaled") context.add(scaled_div) remainder_node = mb.sub(x=dividend, y=scaled_div, name=node.name) context.add(remainder_node) @register_torch_op def hann_window(context, node): inputs = _get_inputs(context, node, expected=[5, 6]) if inputs[0].val is None: raise NotImplementedError("variable 'window_length' not supported.") periodic = True if len(inputs) == 6: if inputs[1].val is None: raise NotImplementedError("variable 'periodic' not supported.") if not inputs[1].val: periodic = False size = (inputs[0].val,) if inputs[0].val <= 1: one = mb.fill(shape=size, value=1.0, name=node.name) context.add(one) return ones = mb.fill(shape=size, value=1.0) cum = mb.cumsum(x=ones, axis=0) seq = mb.sub(x=cum, y=ones) pi = mb.fill(shape=size, value=_math.pi) window_length_float = mb.cast(x=inputs[0], dtype="fp32") if not periodic: window_length_float = mb.sub(x=window_length_float, y=ones) denominator = mb.fill(shape=size, value=window_length_float) numerator = mb.mul(x=seq, y=pi) frac = mb.real_div(x=numerator, y=denominator) sin = mb.sin(x=frac) sin_sq = mb.mul(x=sin, y=sin, name=node.name) context.add(sin_sq) @register_torch_op def mse_loss(context, node): inputs = _get_inputs(context, node, expected=3) x = inputs[0] y = inputs[1] reduction = inputs[2].val diff = mb.sub(x=x, y=y) if reduction == 0: # reduction is "none" res = mb.mul(x=diff, y=diff, name=node.name) context.add(res) return square = mb.mul(x=diff, y=diff) if reduction == 1: # reduction is "mean" res = mb.reduce_mean(x=square, axes=None, name=node.name) elif reduction == 2: # reduction is "sum" res = mb.reduce_sum(x=square, axes=None, name=node.name) else: raise ValueError("Reduction is not supported") context.add(res) @register_torch_op def trace(context, node): inputs = _get_inputs(context, node, expected=1) x = inputs[0] dims = mb.shape(x=x) dim0 = value_at(dims, 0) dim1 = value_at(dims, 1) min_dim = mb.minimum(x=dim0, y=dim1) indices = mb.range_1d(end=min_dim, start=0, step=1) indices = mb.stack(values=[indices, indices], axis=1) diagonal = mb.gather_nd(x=x, indices=indices) trace = mb.reduce_sum(x=diagonal, name=node.name) context.add(trace) @register_torch_op def roll(context, node): inputs = _get_inputs(context, node, expected=3) x = inputs[0] shift = inputs[1].val dims = inputs[2].val origin_shape = mb.shape(x=x) need_flatten = len(dims) == 0 if need_flatten: # The tensor is flattened before rolling x = mb.reshape(x=x, shape=[-1]) dims = [0] shape = mb.shape(x=x) for s, i in zip(shift, dims): dim = value_at(shape, i) s = mb.mod(x=s, y=dim) start_idx = mb.sub(x=dim, y=s) indices0 = mb.range_1d(end=dim, start=start_idx, step=1) indices1 = mb.range_1d(end=start_idx, start=0, step=1) indices = mb.concat(values=[indices0, indices1], axis=0) x = mb.gather(x=x, indices=indices, axis=i) if need_flatten: x = mb.reshape(x=x, shape=origin_shape) context.add(x, node.name) def _construct_unfold_indices(N, C, H, W, kernel_size, stride): """ A utility function to construct indices for torch.unfold (im2col), assuming the torch.unfold input `x` to be contiguous """ # Get starting block indices. start_idx = _np.arange(kernel_size[0])[None, :, None] * W + _np.arange( kernel_size[1] ) # Generate depth indices. 
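# Background sketch (assumes a contiguous NCHW buffer, hypothetical sizes): the flat index of
# element (n, c, h, w) is n*C*H*W + c*H*W + h*W + w, so each channel starts at an offset of
# c*H*W -- exactly the `channel_index` term added below.
import numpy as _np_sketch
_N_ref, _C_ref, _H_ref, _W_ref = 1, 2, 3, 3
_flat = _np_sketch.arange(_N_ref * _C_ref * _H_ref * _W_ref).reshape(_N_ref, _C_ref, _H_ref, _W_ref)
assert _flat[0, 1, 2, 0] == 0 * _C_ref * _H_ref * _W_ref + 1 * _H_ref * _W_ref + 2 * _W_ref + 0  # 15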
channel_index = H * W * _np.arange(C) start_idx = (channel_index[None, :, None] + _np.ravel(start_idx)).reshape( (-1, kernel_size[0], kernel_size[1]) ) # Get offsetted indices across the height and width of input array. row_extent = H - kernel_size[0] + 1 col_extent = W - kernel_size[1] + 1 offset_idx = _np.arange(0, row_extent, stride[0])[None, :, None] * W + _np.arange(0, col_extent, stride[1]) indices = _np.ravel(start_idx)[:, None] + _np.ravel(offset_idx) # Get batch block indices. batch_idx = _np.arange(N)[:, None, None] * C * H * W indices = batch_idx + indices return indices.reshape(-1) @register_torch_op def im2col(context, node): """ Extract sliding local blocks from a batched input tensor (rank=4). torch.nn.functional.unfold aims to be the general version: im2col is the rank=4 case of unfold. PyTorch currently only supports rank=4 input: torch.nn.functional.unfold redispatches to at::im2col, which is why coremltools needs im2col to convert torch.nn.functional.unfold. We currently only support rank=4 input (consistent with PyTorch) and dilation set to 1. More flexbible dilation support will be added in the future. Reference https://pytorch.org/docs/stable/generated/torch.nn.Unfold.html """ inputs = _get_inputs(context, node, expected=5) x = inputs[0] kernel_size = inputs[1].val dilation = inputs[2].val padding = inputs[3].val stride = inputs[4].val if x.rank != 4: raise ValueError("Only supports rank=4 input data for im2col (unfold).") if not (dilation[0] == 1 and dilation[1] == 1): raise ValueError("Only supports dilation=1 for im2col (unfold).") # for simplicity, we explicitly pad; TODO: implicit padding would be more efficient # torch.unfold padding has different semantics # * for torch.unfold # x.shape[i + x.rank - padding.rank] = padding[i] + x.shape[i + x.rank - padding.rank] + padding[i] # taking x.rank = 4 and padding.rank = 2 as an example: # x.shape[0 + 4 - 2] = padding[0] + x.shape[0 + 4 - 2] + padding[0] # x.shape[1 + 4 - 2] = padding[1] + x.shape[1 + 4 - 2] + padding[1] # * for mb.pad(x=x, pad=pad, mode="constant") # x.shape[i] = pad[2 * i] + x.shape[i] + pad[2 * i + 1] # * for torch.nn.functional.pad # x.shape[-1] = padding[0] +x.shape[-1] + padding[1] # x.shape[-2] = padding[2] +x.shape[-1] + padding[3] # ... # x.shape[-i] = padding[2 * i - 2] + x.shape[-i] + padding[2 * i - 1] # so we need to convert torch.unfold padding to mb.pad(mode="constant") pad missing_dims = x.rank - len(padding) pad = [0, 0] * missing_dims + _np.array(padding).repeat(2).tolist() x = mb.pad(x=x, pad=pad, mode="constant") N, C, H, W = x.shape # Get total number of blocks. It follows the formula at torch.nn.Unfold documentation. 
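# Worked example of that formula (hypothetical sizes; the padding term is omitted because the
# input was already padded explicitly above):
import numpy as _np_sketch
_spatial_ref, _kernel_ref, _dilation_ref, _stride_ref = (6, 6), (2, 2), (1, 1), (2, 2)
_blocks = 1
for _i in range(2):
    _blocks *= int(_np_sketch.floor(
        (_spatial_ref[_i] - _dilation_ref[_i] * (_kernel_ref[_i] - 1) - 1) / _stride_ref[_i] + 1
    ))
assert _blocks == 9  # a padded 6x6 input tiled by non-overlapping 2x2 patches yields 3*3 blocks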
spatial_size = (H, W) block_count = 1 for i in range(2): block_count *= _np.floor( # the original formula is # (spatial_size[i] + 2 * padding[i] - dilation[i] * (kernel_size[i] - 1) - 1) / stride[i] # since we have explicitly padded, we no longer add 2 * padding[i] to spatial_size[i] (spatial_size[i] - dilation[i] * (kernel_size[i] - 1) - 1) / stride[i] + 1 ).astype(_np.int32) """ The implementation below assumes x to be contiguous """ indices = _construct_unfold_indices(N, C, H, W, kernel_size, stride) x = mb.reshape(x=x, shape=[-1]) gathered_data = mb.gather_along_axis(x=x, indices=indices, axis=0) block_size = C * kernel_size[0] * kernel_size[1] output = mb.reshape( x=gathered_data, shape=(N, block_size, block_count), name=node.name ) context.add(output) @register_torch_op def col2im(context, node): """ Combines an array of sliding local blocks into a large containing tensor. torch.nn.functional.fold aims to be the general version: col2im is the "2 output spatial dimensions" case of fold. PyTorch currently only supports col2im: torch.nn.functional.fold redispatches to at::col2im, which is why coremltools needs col2im to convert torch.nn.functional.fold. We currently only support col2im (consistent with PyTorch) and: * dilation set to 1 * padding set to 0 * stride set to kernel_size * output_size is divisible by kernel_size More flexible support will be added in the future. Reference https://pytorch.org/docs/stable/generated/torch.nn.Fold.html """ inputs = _get_inputs(context, node, expected=6) x = inputs[0] output_size = inputs[1].val kernel_size = inputs[2].val dilation = inputs[3].val padding = inputs[4].val stride = inputs[5].val if len(output_size) != 2: raise ValueError("Only supports 2 output spatial dimensions for col2im (fold).") if not (dilation[0] == 1 and dilation[1] == 1): raise ValueError("Only supports dilation=1 for col2im (fold).") if not (padding[0] == 0 and padding[1] == 0): raise ValueError("Only supports padding=0 for col2im (fold).") # In PyTorch, if multiple entries unfold to the same location, then in folding they are accumulated # In Core ML, however, there is no such op to perform this accumulation, # so we cowardly refuse to convert if accumulation happens # TODO: we may be able to support accumulation if x has certain symmetry (e.g. output by im2col) # by multiplying the repeat times of each entry if any(stride != kernel_size): raise ValueError("Only supports stride = kernel_size for col2im (fold).") # We implement fold as an inverse to unfold # i.e. 
a gather with indices that are inverse to unfold gather indices # This works only if there is no edge leftover if any(output_size % kernel_size != 0): raise ValueError("Only supports output_size % kernel_size = 0 for col2im (fold).") N, block_size, block_count = x.shape C = int(block_size / _np.prod(kernel_size)) H, W = output_size """ The implementation below assumes x to be contiguous """ # inverse unfold indices indices_unfold = _construct_unfold_indices(N, C, H, W, kernel_size, stride) indices = _np.empty(indices_unfold.shape, dtype=np.int32) for i in range(indices.shape[0]): indices[indices_unfold[i]] = i # perform gather with fold indices x_flatten = mb.reshape(x=x, shape=(-1,)) y_flatten_with_extra = mb.gather_along_axis(x=x_flatten, indices=indices) y_flatten = mb.slice_by_index(x=y_flatten_with_extra, begin=(0,), end=(N * C * H * W,)) y = mb.reshape(x=y_flatten, shape=(N, C, H, W), name=node.name) context.add(y) @register_torch_op def complex(context, node): real_part, imag_part = _get_inputs(context, node, expected=2) result = mb.complex(real_data=real_part, imag_data=imag_part) context.add(result, node.name) @register_torch_op def real(context, node): input_data = _get_inputs(context, node, expected=1)[0] if types.is_complex(input_data.dtype): real_part = mb.complex_real(data=input_data) context.add(real_part, node.name) else: context.add(input_data, node.name) @register_torch_op def imag(context, node): input_data = _get_inputs(context, node, expected=1)[0] if not types.is_complex(input_data.dtype): # Keep consistent with PyTorch. raise ValueError("The `imag` op only supports complex input.") real_part = mb.complex_imag(data=input_data) context.add(real_part, node.name) @register_torch_op def view_as_real(context, node): input_data = _get_inputs(context, node, expected=1)[0] if not types.is_complex(input_data.dtype): raise ValueError(f"view_as_real only supports complex input, but got {types.builtin_to_string(input_data.dtype)}") real_part = mb.complex_real(data=input_data) imag_part = mb.complex_imag(data=input_data) result = mb.stack(values=[real_part, imag_part], axis=-1) context.add(result, node.name) @register_torch_op def fft_fft(context, node): """Lowers torch.fft.fft by the dialect op `complex_fft` from complex_dialect_ops.py.""" input_data, n, dim, norm = _get_inputs(context, node, expected=[4]) fft_res = mb.complex_fft(data=input_data, n=n, dim=dim, norm=norm) context.add(fft_res, node.name) @register_torch_op def fft_fftn(context, node): """Lowers torch.fft.fftn by the dialect op `complex_fftn` from complex_dialect_ops.py.""" input_data, shapes, dims, norm = _get_inputs(context, node, expected=[4]) fft_res = mb.complex_fftn(data=input_data, shapes=shapes, dims=dims, norm=norm) context.add(fft_res, node.name) @register_torch_op def fft_rfft(context, node): """Lowers torch.fft.rfft by the dialect op `complex_rfft` from complex_dialect_ops.py.""" input_data, n, dim, norm = _get_inputs(context, node, expected=[4]) rfft_res = mb.complex_rfft(data=input_data, n=n, dim=dim, norm=norm) context.add(rfft_res, node.name) @register_torch_op def fft_rfftn(context, node): """Lowers torch.fft.rfftn by the dialect op `complex_rfftn` from complex_dialect_ops.py.""" input_data, shapes, dims, norm = _get_inputs(context, node, expected=[4]) rfft_res = mb.complex_rfftn(data=input_data, shapes=shapes, dims=dims, norm=norm) context.add(rfft_res, node.name) @register_torch_op def fft_ifft(context, node): """Lowers torch.fft.ifft by the dialect op `complex_ifft` from 
complex_dialect_ops.py.""" input_data, n, dim, norm = _get_inputs(context, node, expected=[4]) ifft_res = mb.complex_ifft(data=input_data, n=n, dim=dim, norm=norm) context.add(ifft_res, node.name) @register_torch_op def fft_ifftn(context, node): """Lowers torch.fft.ifftn by the dialect op `complex_ifftn` from complex_dialect_ops.py.""" input_data, shapes, dims, norm = _get_inputs(context, node, expected=[4]) ifftn_res = mb.complex_ifftn(data=input_data, shapes=shapes, dims=dims, norm=norm) context.add(ifftn_res, node.name) @register_torch_op def fft_irfft(context, node): """Lowers torch.fft.irfft by the dialect op `complex_irfft` from complex_dialect_ops.py.""" input_data, n, dim, norm = _get_inputs(context, node, expected=[4]) irfft_res = mb.complex_irfft(data=input_data, n=n, dim=dim, norm=norm) context.add(irfft_res, node.name) @register_torch_op def fft_irfftn(context, node): """Lowers torch.fft.irfftn by the dialect op `complex_irfftn` from complex_dialect_ops.py.""" input_data, shapes, dims, norm = _get_inputs(context, node, expected=[4]) irfftn_res = mb.complex_irfftn(data=input_data, shapes=shapes, dims=dims, norm=norm) context.add(irfftn_res, node.name) @register_torch_op def stft(context, node): """ Lowers torch.stft with the dialect op `complex_stft` from complex_dialect_ops.py """ input_data, n_fft, hop_length, win_length, window, normalized, onesided, _ = _get_inputs(context, node, min_expected=2) if types.is_complex(input_data.dtype): onesided = False # pytorch defaults onesided to False for complex inputs stft_res = mb.complex_stft( input=input_data, n_fft=n_fft, hop_length=hop_length, win_length=win_length, window=window, normalized=normalized, onesided=onesided) context.add(stft_res, node.name) @register_torch_op(torch_alias=["torchvision::nms"]) def torchvision_nms(context, node): inputs = _get_inputs(context, node, expected=3) boxes, scores = promote_input_dtypes([inputs[0], inputs[1]]) iou_threshold = inputs[2].val # Use float min to avoid boxes being pruned by scores in MIL NMS op. score_threshold = ( _np.finfo(_np.float16).min if boxes.dtype._width == 16 else _np.finfo(_np.float32).min ) box_num = boxes.shape[0] if is_symbolic(box_num): # When the number of boxes is unknown at compile time, use a large number to avoid valid # boxes got pruned. We don't use _np.iinfo(_np.int32).max here because it triggers the MIL # NMS op segment fault. box_num = 10000 # The boxes' coordinates from PyTorch input is (x1, y1, x2, y2) format with 0 <= x1 < x2 and # 0 <= y1 < y2. However, the MIL NMS op expects CENTER_SIZE_WIDTH_FIRST format, which is # (x, y, width, height) where (x, y) is the center coordinate. x1, y1, x2, y2 = mb.split(x=boxes, num_splits=4, axis=-1) # For numerical stability, use x1+(x2-x1)/2 instead of (x1+x2)/2 to calculate center coordinate. width = mb.sub(x=x2, y=x1) height = mb.sub(x=y2, y=y1) center_x = mb.add(x=x1, y=mb.real_div(x=width, y=2.0)) center_y = mb.add(x=y1, y=mb.real_div(x=height, y=2.0)) boxes = mb.concat(values=[center_x, center_y, width, height], axis=-1) # Expand dims to construct the batch dim and score class dim expected by MIL NMS op. 
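# Sketch of the corner-to-center conversion performed above (hypothetical single box,
# not part of the conversion):
_x1_ref, _y1_ref, _x2_ref, _y2_ref = 10.0, 20.0, 30.0, 60.0
_w_ref, _h_ref = _x2_ref - _x1_ref, _y2_ref - _y1_ref              # 20.0, 40.0
_cx_ref, _cy_ref = _x1_ref + _w_ref / 2.0, _y1_ref + _h_ref / 2.0  # 20.0, 40.0 -> (x, y, w, h) center form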
boxes = mb.expand_dims(x=boxes, axes=[0]) scores = mb.expand_dims(x=scores, axes=[0, -1]) if not is_current_opset_version_compatible_with(target.iOS17): _, _, indices, valid_outputs = mb.non_maximum_suppression( boxes=boxes, scores=scores, max_boxes=box_num, iou_threshold=iou_threshold, score_threshold=score_threshold, ) indices = mb.squeeze(x=indices, axes=[0]) valid_outputs = mb.squeeze(x=valid_outputs, axes=[0]) range = mb.range_1d(end=valid_outputs, start=0, step=1) indices = mb.cast(x=indices, dtype="fp32") valid_indices = mb.gather(x=indices, indices=range, axis=0) valid_indices = mb.cast(x=valid_indices, dtype="int32", name=node.name) context.add(valid_indices) else: # In IOS17, the MIL NMS op's inputs are ordered with number of boxes in the last dimension. boxes = mb.transpose(x=boxes, perm=[0, 2, 1]) scores = mb.transpose(x=scores, perm=[0, 2, 1]) # In IOS17, the MIL NMS op's last output (number of valid boxes in each batch) gets removed. _, _, indices = mb.non_maximum_suppression( boxes=boxes, scores=scores, max_boxes=box_num, iou_threshold=iou_threshold, ) # Remove invalid indices (the padded -1 indices). valid_outputs = mb.reduce_sum( x=mb.cast(x=mb.greater(x=indices, y=-1), dtype="int32"), axes=[-1] ) valid_indices = mb.slice_by_size( x=mb.squeeze(x=indices, axes=[0]), begin=mb.fill_like(ref_tensor=valid_outputs, value=0), size=valid_outputs, name=node.name, ) context.add(valid_indices) @register_torch_op def tupleindex(context, node): tuple_input, index_input = _get_inputs(context, node, expected=2) context.add(tuple_input[index_input.val], node.name) def _get_causal_attn_mask(is_causal: bool, query_var: Var, key_var: Var) -> Var: assert is_causal # create mask of shape (target_seq, source_seq) # s.t the diagonal and lower triangular of the matrix is all 1s # and upper triangular is a large negative number (e.g. 
-30k) target_seq = query_var.shape[-2] source_seq = key_var.shape[-2] if is_symbolic(target_seq) or is_symbolic(source_seq): raise NotImplementedError( "scaled_dot_product_attention op: " "is_causal flag not handled when sequence length is symbolic" ) all_ones = mb.fill(value=1.0, shape=(target_seq, source_seq)) all_negative_inf = mb.fill(value=-3e4, shape=(target_seq, source_seq)) all_ones_lower = mb.band_part( x=all_ones, lower=-1, upper=0 ) # will 0 out upper triangle, excluding diag all_negative_inf_upper = mb.band_part( x=all_negative_inf, lower=0, upper=-1 ) # will 0 out lower triangle, excluding diag all_negative_inf_diag_only = mb.band_part(x=all_negative_inf_upper, lower=0, upper=0) all_negative_inf_upper_no_diag = mb.sub(x=all_negative_inf_upper, y=all_negative_inf_diag_only) return mb.add(x=all_ones_lower, y=all_negative_inf_upper_no_diag) def _cast_bool_attn_mask(attn_mask: Var, query_var: Var) -> Var: """ compute float mask as (1 - cast(bool_mask)) * -30k """ assert is_bool(attn_mask.dtype) mask = mb.cast(x=attn_mask, dtype=types.builtin_to_string(query_var.dtype)) compliment_of_mask = mb.sub( x=_np.array([1.0]).astype(types.nptype_from_builtin(mask.dtype)), y=mask ) return mb.mul(x=-3e4, y=compliment_of_mask) @register_torch_op( torch_alias=[ "_scaled_dot_product_flash_attention_for_cpu", "coreml.sdpa", "coreml::sdpa", ] ) def scaled_dot_product_attention(context, node): """ Input shapes/types: - query : (target_seq, d) or (B, target_seq, d) or (B, h, target_seq, d) or (B,.., target_seq, d) - key : (source_seq, d) or (B, source_seq, d) or (B, h, source_seq, d) or (B,.., source_seq, d) - value: (source_seq, d_v) or (B, source_seq, d_v) or (B, h, source_seq, d_v) or (B,.., source_seq, d_v) - attn_mask : (target_seq, source_seq) or (B, target_seq, source_seq) or (B, h, target_seq, source_seq) or (B, ..., target_seq, source_seq) - is_causal : bool - scale : optional float Output shape: (target_seq, d_v) or (B,...,target_seq, d_v) output = softmax(scale*Q*K^transpose + mask) * V Currently, Core ML does not support dropout, so it has to be either None or 0 See details at: https://pytorch.org/docs/stable/generated/torch.nn.functional.scaled_dot_product_attention.html """ def _get_batch_dims(x: Var) -> List[int]: return list(x.shape)[:-2] def _broadcast_tensor_to_same_batch_dims(x: Var, batch_dims: List[int]) -> Var: broadcast_shape = batch_dims + list(x.shape[-2:]) return _broadcast(x.name + "_broadcast_same_batch_dims", x, broadcast_shape) def _parse_positional_args(context, node) -> Tuple[Var]: inputs = _get_inputs(context, node, min_expected=3) nargs = len(inputs) q, k, v = inputs[:3] if node.kind == "scaled_dot_product_attention": attn_mask = inputs[3] if nargs > 3 else None dropout = inputs[4] if nargs > 4 else 0.0 is_causal = inputs[5].val if nargs > 5 else False scale = inputs[6] if nargs > 6 else None elif node.kind == "_scaled_dot_product_flash_attention_for_cpu": dropout = inputs[3] if nargs > 3 else 0.0 is_causal = inputs[4].val if nargs > 4 else False attn_mask = inputs[5] if nargs > 5 else None scale = inputs[6] if nargs > 6 else None else: assert node.kind in ("coreml.sdpa", "coreml::sdpa") attn_mask = inputs[3] if nargs > 3 else None dropout = 0.0 is_causal = False scale = None return q, k, v, attn_mask, dropout, is_causal, scale def _check_args(q, k, v, attn_mask, dropout, is_causal, scale) -> None: if attn_mask is not None and is_causal: raise ValueError( "scaled_dot_product_attention op: attn_mask cannot be provided when is_causal is set to True." 
) if dropout is not None: if isinstance(dropout, Var): if dropout.val is None: raise NotImplementedError( "A variable dropout probability is specified. Since Core ML " "does not support dropout yet, we cowardly refuse to convert it" ) else: dropout = dropout.val if dropout != 0.0: raise ValueError( "A non-zero dropout probability is specified. Since Core ML " "does not support dropout yet, we cannot convert it" ) # check that ranks of q, k, v and attn_mask match if k.rank != q.rank: raise ValueError( "Rank of query and key do not match in scaled_dot_product_attention torch op" ) if v.rank != q.rank: raise ValueError( "Rank of query and value do not match in scaled_dot_product_attention torch op" ) q, k, v, attn_mask, dropout, is_causal, scale = _parse_positional_args(context, node) # torch.export may have kwargs if context.frontend == TorchFrontend.TORCHEXPORT: if attn_mask is None: attn_mask = _get_kwinputs(context, node, "attn_mask", default=[attn_mask])[0] if scale is None: scale = _get_kwinputs(context, node, "scale", default=[scale])[0] _check_args(q, k, v, attn_mask, dropout, is_causal, scale) mask = None if is_causal: mask = _get_causal_attn_mask(is_causal, q, k) elif attn_mask is not None: # For ios18-, bool attention mask has to be cast to equivalent floating point attention mask if is_bool(attn_mask.dtype) and not is_current_opset_version_compatible_with(target.iOS18): mask = _cast_bool_attn_mask(attn_mask, q) else: mask = attn_mask # Since ios18, Core ML supports scaled_dot_product_attention op # It does not have scale, though if is_current_opset_version_compatible_with(target.iOS18) and scale is None: # ios18 scaled_dot_product_attention only supports rank >= 3 is_rank_2 = q.rank == 2 if is_rank_2: q = mb.expand_dims(x=q, axes=[0]) k = mb.expand_dims(x=k, axes=[0]) v = mb.expand_dims(x=v, axes=[0]) # broadcast the batch_dims to the same shape # note that, we only support the broadcast if the batch_dim is static q_batch = _get_batch_dims(q) k_batch = _get_batch_dims(k) v_batch = _get_batch_dims(v) if not any_symbolic(q_batch + k_batch + v_batch): b_dims = _solve_broadcast_shape([q_batch, k_batch, v_batch]) q = _broadcast_tensor_to_same_batch_dims(q, b_dims) k = _broadcast_tensor_to_same_batch_dims(k, b_dims) v = _broadcast_tensor_to_same_batch_dims(v, b_dims) # directly translated into iOS18 sdpa op res = mb.scaled_dot_product_attention( query=q, key=k, value=v, attn_mask=mask, name=node.name ) if is_rank_2: res = mb.squeeze(x=res, axes=[0], name=node.name) # For ios18-, scaled_dot_product_attention has to be decomposed else: res = _utils._decompose_scaled_dot_product_attention(q, k, v, mask, node.name, scale=scale) context.add(res) @register_torch_op def fliplr(context, node): """ Flip tensor in the left/right direction. Flip the entries in each row in the left/right direction. Columns are preserved, but appear in a different order than before. It's equivalent to TF's reverse op but with axes always be [1]. 
""" x = _get_inputs(context, node, expected=1)[0] res = mb.reverse(x=x, axes=[1], name=node.name) context.add(res) @register_torch_op def multinomial(context, node): x = context[node.inputs[0]] num_samples = context[node.inputs[1]].val replacement = context[node.inputs[2]].val if num_samples is None: raise ValueError("In torch.multinomial op, num_samples must be const") if num_samples > 1 and not replacement: raise ValueError("When num_samples is larger than 1, only replacement=True is supported.") # Based on PyTorch documentations, the input to `torch.multinomial` is probability, not logit. x = mb.random_categorical(x=x, size=num_samples, mode="probs", name=node.name) context.add(x) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/quantization_ops.py0000644000000000000000000006514114672066616026742 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as _np import torch as _torch from packaging.version import Version from coremltools import _logger as logger from coremltools._deps import _HAS_TORCHAO, MSG_TORCHAO_NOT_FOUND from coremltools.converters.mil.frontend import _utils from coremltools.converters.mil.frontend.torch.ops import NUM_TO_NUMPY_DTYPE from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Var, types from .ops import _create_linear_layer, _get_inputs, promote_input_dtypes from .torch_op_registry import register_torch_op from .utils import ( NUM_TO_TORCH_DTYPE, TORCH_DTYPE_TO_NUM, TORCH_EXPORT_BASED_FRONTENDS, TORCH_QTYPE_TO_NP_TYPE, TORCH_QTYPE_TO_STR, TYPE_TO_DTYPE_STRING, TorchFrontend, ) if _HAS_TORCHAO: from torchao.quantization import quant_primitives as torchao_quant def _quantize_general( context, node, input: Var, scale_var: Var, zero_point_var: Var, torch_dtype_var: Var, axis: int = None, ): if input.op is not None and input.op.op_type.startswith("constexpr_"): # Skip already quantized weight, which was done by using compression metadata. 
context.add(input, node.name) return scale = scale_var.val if scale is None: raise ValueError("quantization scale must be const at compile time") if len(scale.shape) > 0 and _np.prod(scale.shape) == 1: scale = scale.reshape(-1)[0] axis = None zero_point = zero_point_var.val if zero_point is None: raise ValueError("quantization zero point must be const at compile time") if len(zero_point.shape) > 0 and _np.prod(zero_point.shape) == 1: zero_point = zero_point.reshape(-1)[0] torch_dtype = NUM_TO_TORCH_DTYPE.get(torch_dtype_var.val) if torch_dtype is None: raise ValueError("quantization dtype must be const at compile time") dtype = TORCH_QTYPE_TO_STR.get(torch_dtype) # pytorch quantization dtype can be int32, which is not supported in MIL if dtype is None: raise ValueError("MIL quantization dtype must be int8 or uint8") # perf: all 0 zero point can be no zero point in MIL if zero_point is not None and _np.all(zero_point == 0): zero_point = None # make sure zero point dtype is consistent with quantization dtype, # since torch may provide int32 zero point if zero_point is not None: if dtype == "int8" and _np.all(-128 <= zero_point) and _np.all(zero_point < 128): zero_point = zero_point.astype(_np.int8) elif dtype == "uint8" and _np.all(0 <= zero_point) and _np.all(zero_point < 256): zero_point = zero_point.astype(_np.uint8) else: raise ValueError("cannot fit zero point into quantization dtype") result = mb.quantize( input=input, zero_point=zero_point, scale=scale, output_dtype=dtype, axis=axis, ) context.add(result, node.name) if context.frontend == TorchFrontend.TORCHSCRIPT: context.quant_context.add_quantization_info(node.name, torch_dtype, scale, zero_point, axis) @register_torch_op( torch_alias=[ "quantized_decomposed::quantize_per_tensor", "quantized_decomposed.quantize_per_tensor", ] ) def quantize_per_tensor(context, node): inputs = _get_inputs( context, node, expected={ TorchFrontend.TORCHSCRIPT: 4, TorchFrontend.TORCHEXPORT: 6, TorchFrontend.EXECUTORCH: 6, }, ) if context.frontend == TorchFrontend.TORCHSCRIPT: input, scale, zero_point, torch_dtype = inputs elif context.frontend in TORCH_EXPORT_BASED_FRONTENDS: input, scale, zero_point, qmin, qmax, torch_dtype = inputs if qmax.val - qmin.val <= 16: logger.warning( f"Core ML does not support 4-bit activation, so {torch_dtype.val} is used instead" ) else: raise ValueError(f"Invalid PyTorch frontend {context.frontend}") _quantize_general(context, node, input, scale, zero_point, torch_dtype) @register_torch_op def quantize_per_channel(context, node): input, scale, zero_point, axis, torch_dtype = _get_inputs(context, node, expected=[5]) if axis.val is None: raise ValueError("quantization axis must be const at compile time") _quantize_general(context, node, input, scale, zero_point, torch_dtype, axis.val) @register_torch_op( torch_alias=[ "quantized_decomposed::choose_qparams_per_token_asymmetric", "quantized_decomposed.choose_qparams_per_token_asymmetric", ] ) def choose_qparams_per_token_asymmetric(context, node): """PyTorch uses this op to calculate scale and zero_point on-the-fly for input data.""" raise NotImplementedError("Dynamic activation quantization is not supported in Core ML.") def _dequantize_general( context, node, input: Var, scale: Var, zero_point: Var, axis: Var, qmin: Var, qmax: Var, ) -> None: # torch may use different dtype for input and zero_point, # but Core ML requires input and zero_point to have a same dtype, # so cast zero_point dtype to input dtype if input.dtype != zero_point.dtype: zero_point = 
mb.cast(x=zero_point, dtype=TYPE_TO_DTYPE_STRING[input.dtype]) # Not sure why torch may quantize a scalar... does not make sense, # since the floating point scale is as big as the original floating point input data scalar if input.rank == 0: # For const input, translate to the const floating point scalar output if input.val is not None: output_value = scale.val * (input.val - zero_point.val) output = mb.const(val=output_value, name=node.name) # For variable input, we have no choice but to expand and squeeze, # since Core ML dequantize op requires tensor input else: expanded_input = mb.expand_dims(x=input, axes=(0,)) dequantize_output = mb.dequantize( input=expanded_input, zero_point=zero_point, scale=scale, axis=axis, ) output = mb.squeeze(x=dequantize_output, name=node.name) else: # activation quantization if input.val is None: if qmax.val - qmin.val <= 16: logger.warning( f"Core ML does not support 4-bit activation, so {input.dtype} is used instead" ) output = mb.dequantize( input=input, zero_point=zero_point, scale=scale, axis=axis, name=node.name, ) # weight compression else: if qmax.val - qmin.val <= 8: logger.warning( "Core ML does not support less than 4-bit compression, so 4 bit is used instead" ) input_val = input.val zero_point_val = zero_point.val if zero_point_val.dtype != input_val.dtype: zero_point_val = zero_point_val.astype(input_val.dtype) axis_val = None if axis is None else axis.val output = _utils._construct_constexpr_dequant_op( input_val, zero_point_val, scale.val, axis=axis_val, name=node.name ) context.add(output, node.name) @register_torch_op( torch_alias=[ "quantized_decomposed::dequantize_per_tensor", "quantized_decomposed.dequantize_per_tensor", "quantized_decomposed::dequantize_per_channel", "quantized_decomposed.dequantize_per_channel", ] ) def dequantize(context, node): if context.frontend == TorchFrontend.TORCHSCRIPT: context.quant_context.get_dequantized_var(node.inputs[0], node.name) elif context.frontend in TORCH_EXPORT_BASED_FRONTENDS: inputs = _get_inputs( context, node, min_expected={TorchFrontend.TORCHEXPORT: 6, TorchFrontend.EXECUTORCH: 6} ) num_inputs = len(inputs) if num_inputs == 6: input, scale, zero_point, qmin, qmax, _ = inputs axis = None elif num_inputs == 7: input, scale, zero_point, axis, qmin, qmax, _ = inputs else: raise ValueError(f"dequantize should have 6 or 7 inputs, but got {num_inputs}") _dequantize_general(context, node, input, scale, zero_point, axis, qmin, qmax) else: raise ValueError(f"Invalid PyTorch frontend {context.frontend}") def _dequantized_weight(qweight, name: str = None): """ Given the first output (qweight) of torch.ops.quantized.conv2d/linear_unpack, this returns a dequantized version of the tensor to be added to the context. """ if qweight.qscheme() == _torch.per_tensor_affine: quant_dtype_np = TORCH_QTYPE_TO_NP_TYPE[qweight.dtype] scale = _np.float32(qweight.q_scale()) zero_point = quant_dtype_np(qweight.q_zero_point()) quantized_weights = _torch.int_repr(qweight).numpy() dequant_weights = _utils._construct_constexpr_dequant_op( quantized_weights, zero_point, scale, axis=None, name=name ) # per_channel_affine_float_qparams is same as per_channel_affine except that it # expects both scale and zero point to be floating point values. elif qweight.qscheme() in {_torch.per_channel_affine, _torch.per_channel_affine_float_qparams}: quant_dtype_np = TORCH_QTYPE_TO_NP_TYPE[qweight.dtype] # TODO: How do we set the appropriate dtype here (fp16/fp32)? 
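# Affine dequantization sketch (hypothetical per-tensor values, not part of the conversion):
# the float weight is recovered as (quantized - zero_point) * scale.
import numpy as _np_sketch
_q_ref = _np_sketch.array([0, 128, 255], dtype=_np_sketch.uint8)
_scale_ref, _zero_point_ref = 0.1, 128
_w_fp = (_q_ref.astype(_np_sketch.float32) - _zero_point_ref) * _scale_ref  # -> [-12.8, 0.0, 12.7]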
scale = qweight.q_per_channel_scales().numpy() if qweight.qscheme() == _torch.per_channel_affine: zero_point = quant_dtype_np(qweight.q_per_channel_zero_points().numpy()) else: logger.warning( "Found per_channel_affine_float_qparams qscheme, which isn't directly " "supported by coremltools. Casting zero-points to quantized type loses some " "precision." ) dtype_info = _np.iinfo(quant_dtype_np) val = _np.clip( _np.around(qweight.q_per_channel_zero_points().numpy()), dtype_info.min, dtype_info.max, ) zero_point = quant_dtype_np(val) quantized_weights = _torch.int_repr(qweight).numpy() axis = _np.int32(qweight.q_per_channel_axis()) dequant_weights = _utils._construct_constexpr_dequant_op( quantized_weights, zero_point, scale, axis=axis, name=name ) else: raise ValueError(f'Unsupported quant scheme "{qweight.qscheme()}"') return dequant_weights def _process_conv(context, node, add_relu=False): # Node has 4 inputs: # 1. The input activations # 2. The packed weights/biases (need to get from context.torch_graph) # 3. output scale # 4. output zero-point # Unpack weights/bias & dequantize weights. packed_params = context.torch_graph.params[node.inputs[1]] qweight, bias = _torch.ops.quantized.conv2d_unpack(packed_params) dequant_weights = _dequantized_weight(qweight) context.add(dequant_weights) # Bias can be fed as-is. bias = bias.detach().numpy() # Convolution Parameters. x, x_dtype = context.quant_context.get_dequantized_var(node.inputs[0]) raw_params = tuple(list(packed_params.__getstate__())[:-1]) conv_attr_raw = raw_params[0][1][0].detach().numpy().astype(_np.int32) # Stride strides = conv_attr_raw[1:3] # Padding. torch.nn.quantized.Conv2d & its variants only support 'zeros' mode. pad = conv_attr_raw[3:5] assert conv_attr_raw[8] == 0 if len(dequant_weights.shape) in (3, 4): # 1D and 2D: Need to explicitly state L-R, T-B pad pad = _np.repeat(pad, 2) else: raise ValueError("Invalid weight dimension. Must be 4 for 2D convolution.") # Dilation. dilations = conv_attr_raw[5:7] # Group. group = conv_attr_raw[9] kwargs = { "x": x, "weight": dequant_weights, "bias": bias, "strides": strides, "pad_type": "custom", "pad": pad, "dilations": dilations, } if group > 0: kwargs["groups"] = group res = mb.conv(**kwargs) if add_relu: res = mb.relu(x=res) context.add(res) out_scale = context[node.inputs[2]] out_zero_point = context[node.inputs[3]].val context.quant_context.get_quantized_per_tensor( res.name, x_dtype, out_scale, out_zero_point, node.name ) def _process_linear(context, node, add_relu=False): # Node has 4 inputs: # 1. The input activations # 2. The packed weights/biases (need to get from context.torch_graph) # 3. output scale # 4. output zero-point # Unpack PyTorch's packed params. 
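# End-to-end sketch of what a quantized linear lowers to (hypothetical values, not part of the
# conversion): dequantized float weights feed a float linear, then the output is re-quantized
# with the node's output scale / zero point.
import numpy as _np_sketch
_x_ref = _np_sketch.ones((1, 4), dtype=_np_sketch.float32)
_w_ref = _np_sketch.full((3, 4), 0.5, dtype=_np_sketch.float32)  # dequantized weight, (out, in) layout
_b_ref = _np_sketch.zeros(3, dtype=_np_sketch.float32)
_y_fp = _x_ref @ _w_ref.T + _b_ref                               # float linear result
_out_scale_ref, _out_zp_ref = 0.1, 0
_y_q = _np_sketch.clip(_np_sketch.round(_y_fp / _out_scale_ref) + _out_zp_ref, 0, 255)  # uint8 re-quantization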
if node.inputs[1] not in context: packed_params = context.torch_graph.params[node.inputs[1]] qweight, bias = _torch.ops.quantized.linear_unpack(packed_params) dequant_weights = _dequantized_weight(qweight) context.add(dequant_weights) bias = bias.detach().numpy() else: dequant_weights, bias = context[node.inputs[1]] x, x_dtype = context.quant_context.get_dequantized_var(node.inputs[0]) x, dequant_weights = promote_input_dtypes([x, dequant_weights]) res = _create_linear_layer(x, dequant_weights, bias) if add_relu: res = mb.relu(x=res) context.add(res) out_scale = context[node.inputs[2]] out_zero_point = context[node.inputs[3]].val if out_scale.val != 0 or out_zero_point != 0: context.quant_context.get_quantized_per_tensor( res.name, x_dtype, out_scale, out_zero_point, node.name ) else: context.add(res, node.name) def _process_binary(context, node, binary_op, add_relu=False): # Node has 4 inputs: # 1. LHS # 2. RHS # 3. output scale # 4. output zero-point assert len(node.inputs) == 4 assert len(node.outputs) == 1 lhs, lhs_dtype = context.quant_context.get_dequantized_var(node.inputs[0]) rhs, rhs_dtype = context.quant_context.get_dequantized_var(node.inputs[1]) assert lhs_dtype == rhs_dtype res = binary_op(x=lhs, y=rhs) if add_relu: res = mb.relu(x=res) context.add(res) out_scale = context[node.inputs[2]] out_zero_point = context[node.inputs[3]].val context.quant_context.get_quantized_per_tensor( res.name, lhs_dtype, out_scale, out_zero_point, node.name ) @register_torch_op(torch_alias=["quantized::matmul"]) def quantized_matmul(context, node): inputs = _get_inputs(context, node, expected=4) assert types.is_float(inputs[0].dtype) assert types.is_float(inputs[1].dtype) x, y = promote_input_dtypes([inputs[0], inputs[1]]) assert ( inputs[2].val == 0 and inputs[3].val == 0 ), "non zero scale / zero-point not supported in quantized_matmul op." 
res = mb.matmul(x=x, y=y, name=node.name) context.add(res) # Defines all the quantization-related nodes that are noOps @register_torch_op( torch_alias=[ "quantized::linear_prepack", ] ) def quant_noop(context, node): logger.info("Setting pytorch op: {} to no-op.".format(node)) inputs = _get_inputs(context, node) context.add(inputs, torch_name=node.name) @register_torch_op(torch_alias=["quantized::linear"]) def quantized_linear(context, node): _process_linear(context, node) @register_torch_op(torch_alias=["quantized::linear_relu"]) def quantized_linear_relu(context, node): _process_linear(context, node, add_relu=True) @register_torch_op(torch_alias=["quantized::conv2d_relu"]) def quantized_conv2d_relu(context, node): _process_conv(context, node, add_relu=True) @register_torch_op(torch_alias=["quantized::conv2d"]) def quantized_conv2d(context, node): _process_conv(context, node) @register_torch_op(torch_alias=["quantized::add"]) def quantized_add(context, node): _process_binary(context, node, mb.add) @register_torch_op(torch_alias=["quantized::add_relu"]) def quantized_add_relu(context, node): _process_binary(context, node, mb.add, add_relu=True) @register_torch_op(torch_alias=["quantized::mul"]) def quantized_mul(context, node): _process_binary(context, node, mb.mul) @register_torch_op(torch_alias=["quantized::embedding_byte"]) def quantized_embedding(context, node): packed_params = context.torch_graph.params[node.inputs[0]] qweight = _torch.ops.quantized.embedding_bag_unpack(packed_params) dequant_weights = _dequantized_weight(qweight) indices = context[node.inputs[1]] if len(node.inputs) >= 3: logger.warning( "Core ML quantized embedding (gather) layer does not support any " "inputs besides the weights and indices. Those given " "will be ignored." ) if isinstance(indices, tuple): # Sometimes inputs will be a tuple, so handle that correctly. assert len(indices) == 1 indices = indices[0] indices = mb.cast(x=indices, dtype="int32") # Changing the axis from 0 is not an option in torch, so we don't expose it gather = mb.gather(x=dequant_weights, indices=indices, name=node.name) context.add(gather) @register_torch_op( torch_alias=[ "quantized_decomposed::embedding_4bit", "quantized_decomposed::embedding_4bit.dtype", "quantized_decomposed.embedding_4bit", "quantized_decomposed.embedding_4bit.dtype", ] ) def quantized_embedding_4bit(context, node): """Lower the 4-bit quantized embedding op used in executorch.""" inputs = _get_inputs(context, node, expected=[6, 7]) weight = inputs[0].val weight_scales = inputs[1].val weight_zero_points = None if inputs[2] is not None and inputs[2].val is not None: weight_zero_points = inputs[2].val weight_quant_min = inputs[3].val weight_quant_max = inputs[4].val indices = inputs[5] out_np_dtype = None if len(inputs) > 6: if isinstance(inputs[6].val, _torch.dtype): out_np_dtype = NUM_TO_NUMPY_DTYPE[TORCH_DTYPE_TO_NUM[inputs[6].val]] elif isinstance(inputs[6].val, (int, _np.generic)): out_np_dtype = NUM_TO_NUMPY_DTYPE[inputs[6].val] if out_np_dtype is not None: weight_scales = weight_scales.astype(out_np_dtype) if weight_quant_min == 0 and weight_quant_max == 0: # Executorch wrongly passes both weight_quant_min and weight_quant_max. We should set it to correct numbers. 
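# Unpacking sketch (hypothetical packed byte, not part of the conversion): each uint8 stores two
# 4-bit codes; the high nibble is byte // 16 and the low nibble is byte % 16, and each code is
# shifted by weight_quant_min to recover the signed 4-bit value.
_packed_byte = 0xB7                                    # 0b1011_0111
_high, _low = _packed_byte // 16, _packed_byte % 16    # 11, 7
_qmin_ref = -8
_vals = (_high + _qmin_ref, _low + _qmin_ref)          # (3, -1), both in the signed 4-bit range [-8, 7]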
signed = True weight_quant_min = -8 weight_quant_max = 7 else: signed = weight_quant_min < 0 quant_low = -8 if signed else 0 quant_high = 7 if signed else 15 quant_torch_dtype = _torch.int8 if signed else _torch.uint8 if weight_quant_min != quant_low: raise ValueError( f"The weight_quant_min should be {quant_low} for 4-bit embedding, but got {weight_quant_min}." ) if weight_quant_max != quant_high: raise ValueError( f"The weight_quant_max should be {quant_high} for 4-bit embedding, but got {weight_quant_max}." ) # Unpack the weight to the normal layout. with _torch.no_grad(): weight = _torch.from_numpy(weight) # The original weight was packed by using 8-bit to represent two numbers, so we need to separate them. help_move_bits = 2**4 weight_even = weight.div(help_move_bits, rounding_mode="trunc") weight_odd = weight.remainder(help_move_bits) weight_unpacked = _torch.stack((weight_even, weight_odd), dim=-1) weight = weight_unpacked.view(weight.shape[0], -1) weight = weight.view(quant_torch_dtype).add(weight_quant_min).numpy() if not _np.logical_and(weight >= quant_low, weight <= quant_high).all(): raise ValueError( f"All elements in weight should be within 4-bit range ({quant_low} to {quant_high})." ) quantized_np_dtype = types.nptype_from_builtin( types.string_to_builtin("int4" if signed else "uint4") ) dequant_weight = _utils._construct_constexpr_dequant_op( weight.astype(quantized_np_dtype), weight_zero_points, weight_scales, axis=-1, name=inputs[0].name, ) gather = mb.gather(x=dequant_weight, indices=indices, name=node.name) context.add(gather) @register_torch_op def _convert_weight_to_int4pack(context, node): """Pack weight to int4pack format which will be fed into `_weight_int4pack_mm` op.""" inputs = _get_inputs(context, node, expected=2) x = inputs[0].val inner_k_tiles = inputs[1].val if x is None or inner_k_tiles is None: raise NotImplementedError( "For `_convert_weight_to_int4pack` op, we only support static case, where x, " "and inner_k_tiles are all known during compilation time." ) with _torch.no_grad(): x_int4packed = _torch._convert_weight_to_int4pack( _torch.from_numpy(x), inner_k_tiles ).numpy() res = mb.const(val=x_int4packed, name=node.name) context.add(res) @register_torch_op def _weight_int4pack_mm(context, node): """ The first argument is the same as torch.mm, but the second argument (weight) is packed. The packed weight has rank=4, because the meta registration in dynamo requires operator has the same output shape for each device. So it creates a fake shape {N / 8, K / (16 * innerKTiles), 32, innerKTiles / 2} for CPU. More specifically: # Original torch.mm torch.mm(a, b) # The int4 packed version mm b_uint8, b_scales_and_zeros = _group_quantize_tensor( b, n_bit=4, q_group_size=q_group ) b_int4pack = torch._convert_weight_to_int4pack( b_uint8, inner_k_tiles ) weight_int4pack_mm(a, b_int4pack, b_scales_and_zeros) """ if Version(_torch.__version__) < Version("2.4.0"): raise AssertionError("To lower _weight_int4pack_mm, requires torch >= 2.4.0") logger.warning( "The current conversion of `_weight_int4pack_mm` op only works with model produced by torchao. " "If the op is produced by other libs, you may observe large numerical discrepancy." ) if not _HAS_TORCHAO: raise AssertionError( f"{MSG_TORCHAO_NOT_FOUND}\n torchao is needed to convert torch blockwise quantized model." 
) inputs = _get_inputs(context, node, expected=4) x = inputs[0] y_int4pack = inputs[1].val group_size = inputs[2].val y_scales_and_zeros = inputs[3].val if y_int4pack is None or group_size is None or y_scales_and_zeros is None: raise NotImplementedError( "For `_weight_int4pack_mm` op, we only support static case, where y_int4pack, " "group_size, y_scales_and_zeros are all known during compilation time." ) if not (len(y_scales_and_zeros.shape) == 3 and y_scales_and_zeros.shape[2] == 2): raise ValueError( "The scales_and_zeros from torch should have 3 dims and last dim has size 2." ) scales = _np.transpose(y_scales_and_zeros[:, :, 0]) zero_points = _np.transpose(y_scales_and_zeros[:, :, 1]) if _np.allclose(zero_points, zero_points.astype("int32")): zero_points = zero_points.astype("int32") else: zero_points = zero_points.astype(_np.float32) # Unpack the result of `torch._convert_weight_to_int4pack` back to plain layout. # TODO: Use `torchao.ops.unpack_tensor_core_tiled_layout` to unpack after it has CPU implementation. # The current way to unpack by using _weight_int4pack_mm with eye matrix is a workaround on CPU. if len(y_int4pack.shape) != 4: raise ValueError( f"The packed y from torch should have 4 dims, but got {len(y_int4pack.shape)}." ) inner_k_tiles = y_int4pack.shape[-1] * 2 y_unpacked_shape = (y_int4pack.shape[0] * 8, y_int4pack.shape[1] * (inner_k_tiles * 16)) eye_shape = y_unpacked_shape[1] quant_min = 0 quant_max = 2**4 - 1 with _torch.no_grad(): y_dequantized = ( _torch._weight_int4pack_mm( _torch.eye(eye_shape, device=_torch.device("cpu"), dtype=_torch.float32), _torch.from_numpy(y_int4pack), group_size, _torch.from_numpy(y_scales_and_zeros).float(), ) .t() .contiguous() ) zero_point_domain = ( torchao_quant.ZeroPointDomain.INT if _np.issubdtype(zero_points.dtype, _np.integer) else torchao_quant.ZeroPointDomain.FLOAT ) y_quantized = torchao_quant.quantize_affine( y_dequantized, (1, group_size), _torch.from_numpy(scales), _torch.from_numpy(zero_points), _torch.int32, quant_min=quant_min, quant_max=quant_max, zero_point_domain=zero_point_domain, ) y_quantized = y_quantized.numpy().astype(_np.uint8) if len(y_quantized.shape) != 2: raise ValueError( f"The unpacked quantized y should have 2 dims, but got {len(y_quantized.shape)}." ) if not _np.logical_and(y_quantized >= 0, y_quantized <= 15).all(): raise ValueError("All elements should be within 4-bit range (0 to 15).") # If zero_point_domain in `quantize_affine` is set to `ZeroPointDomain.INT`, it matches with CoreML implementation: # quant = torch.clamp(torch.round(input * (1.0 / scale)) + zero_point, quant_min, quant_max) # However, for `ZeroPointDomain.FLOAT`, torchao did following transformations to make it compatible with `tinygemm`: # mid_point = (quant_max + quant_min + 1) / 2 # min_val = zero_point - scale * mid_point # quant = torch.clamp(torch.round((input - min_val) / scale), quant_min, quant_max)) # As we want to make sure the quantize matches CoreML dequant op, we have to do following transformations: # dequant = (quant - mid_point) * scale + zp # so we can re-write the expression as # dequant = (quant - (mid_point - zp / scale)) * scale # which means the zero_point in CoreML is actually `mid_point - zp / scale`. if not _np.issubdtype(zero_points.dtype, _np.integer): mid_point = (quant_max + quant_min + 1) / 2 zero_points = mid_point - zero_points / scales # Use MIL constexpr op to represent the quantization. 
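# Numeric check of the transformation above (hypothetical values, not part of the conversion):
# with the float zero-point domain, dequant = (q - mid_point) * scale + zp must equal the Core ML
# form (q - adjusted_zp) * scale, where adjusted_zp = mid_point - zp / scale.
_q_ref, _scale_ref, _zp_ref = 5.0, 0.25, 1.0
_mid_point_ref = (15 + 0 + 1) / 2                                                    # 8.0 for uint4
_torchao_dequant = (_q_ref - _mid_point_ref) * _scale_ref + _zp_ref                  # -> 0.25
_coreml_dequant = (_q_ref - (_mid_point_ref - _zp_ref / _scale_ref)) * _scale_ref    # -> 0.25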
quantized_np_dtype = types.nptype_from_builtin(types.string_to_builtin("uint4")) dequant_weights = _utils._construct_constexpr_dequant_op( y_quantized.astype(quantized_np_dtype), zero_points, scales, axis=-1, name=inputs[1].name ) res = mb.linear(x=x, weight=dequant_weights, name=node.name) context.add(res) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2255466 coremltools-8.0/coremltools/converters/mil/frontend/torch/ssa_passes/0000755000000000000000000000000014672075535025116 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/ssa_passes/__init__.py0000644000000000000000000000044614672066616027233 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from . import torch_tensor_assign_to_core, torch_upsample_to_core_upsample ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/ssa_passes/torch_tensor_assign_to_core.py0000644000000000000000000000474214672066616033266 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass @register_pass(namespace="torch") class torch_tensor_assign_to_core(AbstractGraphPass): """ Map Torch dialect ops `torch_tensor_assign` into core opset. Currently, we transform the torch_tensor_assign op using mb.scatter. 
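Concretely, `_transform_tensor_assign` below builds a tensor of flat indices
(`range_1d` over the element count, reshaped to the input shape), slices it with
the op's begin/end/stride and masks to find the flat positions being assigned,
then scatters the flattened updates into the flattened input with
`scatter(mode="update")`, and finally reshapes the result back to the original shape.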
""" def apply(self, prog): for f in prog.functions.values(): _torch_tensor_assign_to_core_block(f) @block_context_manager def _torch_tensor_assign_to_core_block(block): for op in list(block.operations): for b in op.blocks: _torch_tensor_assign_to_core_block(b) if op.op_type in ["torch_tensor_assign"]: _transform_tensor_assign(op, block) def _transform_tensor_assign(op, block): shape = mb.shape(x=op.x, before_op=op) dim_prod = mb.reduce_prod(x=shape, before_op=op) ref_indices = mb.range_1d(end=dim_prod, start=0, step=1, before_op=op) ref_indices = mb.reshape(x=ref_indices, shape=shape, before_op=op) ref_sliced_indices = mb.slice_by_index( x=ref_indices, begin=op.begin, end=op.end, stride=op.stride, begin_mask=op.begin_mask, end_mask=op.end_mask, squeeze_mask=op.squeeze_mask, before_op=op, ) flatten_indices = mb.reshape(x=ref_sliced_indices, shape=[-1], before_op=op) flatten_updates = mb.reshape(x=op.updates, shape=[-1], before_op=op) flatten_data = mb.reshape(x=op.x, shape=[-1], before_op=op) new_data = mb.scatter( data=flatten_data, indices=flatten_indices, updates=flatten_updates, mode="update", before_op=op, ) new_data = mb.reshape(x=new_data, shape=shape, before_op=op) op.enclosing_block.replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=new_data ) # Remove all the ops at once block.remove_ops([op]) ././@PaxHeader0000000000000000000000000000020700000000000010214 xustar00113 path=coremltools-8.0/coremltools/converters/mil/frontend/torch/ssa_passes/torch_upsample_to_core_upsample.py 22 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/ssa_passes/torch_upsample_to_core_upsample0000644000000000000000000001066014672066616033511 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools import _logger as logger from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass target_ops = [ "torch_upsample_nearest_neighbor", "torch_upsample_bilinear", ] @register_pass(namespace="torch") class torch_upsample_to_core_upsample(AbstractGraphPass): """ Try to map Torch dialect ops 1. `torch_upsample_nearest_neighbor` 2. `torch_upsample_bilinear` to `upsample_nearest_neighbor` or `upsample_bilinear` in the core op set if compatible. Inputs: prog: Program """ def apply(self, prog): for f in prog.functions.values(): _torch_upsample_to_core_upsample_block(f) @block_context_manager def _torch_upsample_to_core_upsample_block(block): for op in list(block.operations): for b in op.blocks: _torch_upsample_to_core_upsample_block(b) if op.op_type in target_ops: if _try_replace_with_core_upsample(op): logger.info("Successfully map {} to core upsample".format(op.op_type)) else: raise ValueError("Unable to map {} to core upsample".format(op.op_type)) def _try_get_upsample_factor(output_size): op = output_size # If the output has value, than the upsample op itself is derived from the upsample_1d op, # so we can just return scale factor 1 for that case if op.outputs[0].val is not None: assert op.outputs[0].val == 1. return 1. 
# output_size = [ # (torch.floor((input.size(i + 2).float() * torch.tensor(scale_factors[i], dtype=torch.float32)).float())) # for i in range(dim) # ] # source from : https://pytorch.org/docs/stable/_modules/torch/nn/functional.html#interpolate # We validation if we can trace all the way back to the original scale_factor # The whole sequence is mul(input_size, scale_factor) -> cast(fp32) -> floor() -> cast(int32) # 1. check if the output_size is type 'cast' with dtype 'int32' if op.op_type != "cast" or op.dtype.val != "int32": return None # 2. check if the op is type 'floor' op = op.x.op if op.op_type != "floor": return None # 3. check if the op is type 'cast' with dtype 'fp32' op = op.x.op if op.op_type != 'cast' or op.dtype.val != "fp32": return None # 4. check if the op is type mul op = op.x.op if op.op_type != 'mul': return None # we successfully trace back the original scale factor return np.float32(op.y.val) def _try_replace_with_core_upsample(op): """ Inputs: op (Operation): op.op_type must be either 1. `torch_upsample_nearest_neighbor` 2. `torch_upsample_bilinear` Returns: True if op can be represented by mb.upsample_nearest_neighbor or mb.upsample_bilinear op in SSA. False otherwise """ assert op.op_type in target_ops # 2d upsampling if op.op_type in ["torch_upsample_nearest_neighbor", "torch_upsample_bilinear"]: scales_h = _try_get_upsample_factor(op.output_height.op) scales_w = _try_get_upsample_factor(op.output_width.op) if scales_h is None or scales_w is None: return False old_upsample = op.outputs[0] block = op.enclosing_block if op.op_type == "torch_upsample_nearest_neighbor": new_upsample = mb.upsample_nearest_neighbor( x=op.x, scale_factor_height=scales_h, scale_factor_width=scales_w, name=op.name, before_op=op, ) elif op.op_type == "torch_upsample_bilinear": new_upsample = mb.upsample_bilinear( x=op.x, scale_factor_height=scales_h, scale_factor_width=scales_w, align_corners=op.align_corners, name=op.name, before_op=op, ) block.replace_uses_of_var_after_op(anchor_op=op, old_var=old_upsample, new_var=new_upsample) block.remove_ops([op]) return True ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2255466 coremltools-8.0/coremltools/converters/mil/frontend/torch/test/0000755000000000000000000000000014672075535023731 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/test/__init__.py0000644000000000000000000000033214672066616026040 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/test/test_custom_ops.py0000644000000000000000000001301014672066616027530 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import pytest import torch import torch.nn as nn from coremltools.converters.mil.frontend.torch.ops import _get_inputs from coremltools.converters.mil.frontend.torch.ops import \ cosine_similarity as cosine_similarity_main from coremltools.converters.mil.frontend.torch.torch_op_registry import \ _TORCH_OPS_REGISTRY as _TORCH_OPS_REG from coremltools.converters.mil.frontend.torch.torch_op_registry import \ register_torch_op from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Operation, types from coremltools.converters.mil.mil.input_type import (DefaultInputs, InputSpec, TensorInputType) from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from .testing_utils import TorchBaseTest, convert_to_mlmodel # Custom layer imports # Log Converter supported Cosine Similarity conversion function default_cosine_similarity = _TORCH_OPS_REG.get_func("cosine_similarity") @register_torch_op(override=True) def cosine_similarity(context, node): cosine_similarity_main(context, node) # Log custom Cosine Similarity conversion function custom_cosine_similarity = _TORCH_OPS_REG.get_func("cosine_similarity") def _set_torch_reg_op(op_type, op_func): _TORCH_OPS_REG.set_func_by_name(op_func, op_type) class TestCompositeOp(TorchBaseTest): @pytest.mark.parametrize("input_shape", [(100, 180), (56, 123)]) def test_composite_op(self, input_shape): _set_torch_reg_op("cosine_similarity", custom_cosine_similarity) model = nn.CosineSimilarity(dim=1, eps=1e-6) self.run_compare_torch([input_shape, input_shape], model) _set_torch_reg_op("cosine_similarity", default_cosine_similarity) class TestCustomOp: # Define SSA Custom Op for Sparse MatMul # This will map to `custom_op` in SSA with binding information # to bind input spec to the custom implementation @register_op(is_custom_op=True) class custom_torch_sparse_matmul(Operation): # Defining input spec for current op input_spec = InputSpec( x=TensorInputType(type_domain="T"), y=TensorInputType(type_domain="T"), transpose_x=TensorInputType(const=True, optional=True, type_domain=types.bool), transpose_y=TensorInputType(const=True, optional=True, type_domain=types.bool), x_is_sparse=TensorInputType(const=True, optional=True, type_domain=types.bool), y_is_sparse=TensorInputType(const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): return DefaultInputs( transpose_x=False, transpose_y=False, x_is_sparse=False, y_is_sparse=False, ) # Specifying binding for custom op for specifying inputs, # parameters required for creating custom op to be synced with Swift API bindings = { "class_name": "SparseMatMul", "input_order": ["x", "y"], "parameters": ["transpose_x", "transpose_y", "x_is_sparse", "y_is_sparse"], "description": "Custom Sparse MatMul Layer", } def type_inference(self): x_type = self.x.dtype x_shape = self.x.shape y_shape = self.y.shape # For illustration purpose, assuming getting valid shape # Ideally, should consider transpose_?, ?_is_sparse parameters into consideration # for computing output shape return types.tensor(x_type, [x_shape[0], y_shape[1]]) @register_torch_op() def _sparse_mm(context, node): inputs = _get_inputs(context, node, expected=2) x = mb.custom_torch_sparse_matmul( x=inputs[0], y=inputs[1], x_is_sparse=True, y_is_sparse=True, name=node.name ) context.add(x) def 
test_custom_sparse_mm_op(self, input_shape=(4, 4)): class TestLayer(nn.Module): def __init__(self): super(TestLayer, self).__init__() def forward(self, x, y): x = torch.sparse.mm(x, y) return x model = TestLayer() input_data_x = torch.ones(input_shape) input_data_y = torch.ones(input_shape) input_data = [input_data_x, input_data_y] model.eval() torch_model = torch.jit.trace(model, (input_data_x, input_data_y)) mlmodel = convert_to_mlmodel(torch_model, input_data) layers = mlmodel.get_spec().neuralNetwork.layers assert layers[-1].custom is not None, "Expecting a custom layer" assert ( "SparseMatMul" == layers[-1].custom.className ), "Custom Layer class name mismatch" assert ( not layers[-1].custom.parameters["transpose_x"].boolValue ), "Incorrect parameter value k" assert ( not layers[-1].custom.parameters["transpose_y"].boolValue ), "Incorrect parameter value k" assert ( layers[-1].custom.parameters["x_is_sparse"].boolValue ), "Incorrect parameter value k" assert ( layers[-1].custom.parameters["y_is_sparse"].boolValue ), "Incorrect parameter value k" ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/test/test_examples.py0000644000000000000000000000377314672066616027172 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import pytest import coremltools from coremltools._deps import _HAS_TORCH, MSG_TORCH_NOT_FOUND from coremltools.converters.mil.testing_reqs import backends if _HAS_TORCH: import torch import torch.nn.functional as F from torch import nn @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) class TestModelScripting: @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test(backend): # Example code from https://coremltools.readme.io/docs/model-scripting class _LoopBody(nn.Module): def __init__(self, channels): super(_LoopBody, self).__init__() conv = nn.Conv2d( in_channels=channels, out_channels=channels, kernel_size=3, padding=1, ) self.conv = conv def forward(self, x): x = self.conv(x) x = F.relu(x) return x class ControlFlowNet(nn.Module): def __init__(self, num_channels: int): super(ControlFlowNet, self).__init__() self.loop_body = _LoopBody(num_channels) def forward(self, x): avg = torch.mean(x) if avg.item() < 0: loop_count = 2 else: loop_count = 1 for _ in range(loop_count): x = self.loop_body(x) return x model = ControlFlowNet(num_channels=3) scripted_model = torch.jit.script(model) mlmodel = coremltools.converters.convert( scripted_model, inputs=[coremltools.TensorType(shape=(1, 3, 64, 64))], convert_to=backend[0], ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/test/test_internal_graph.py0000644000000000000000000020242314672066616030342 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest torch = pytest.importorskip("torch") import torch.nn as nn import torch.nn.functional as F from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Function, get_new_symbol, types from coremltools.converters.mil.testing_utils import random_gen from .. import ops from .. import utils from ..converter import TranscriptionContext from ..internal_graph import InternalTorchIRNode class TestTorchOps: """Class containing tests for converting TorchIR -> CoreML ops. These tests interface with only the InternalTorchIRGraph and do not build a torch module. Thus, they are much faster then the numerical tests. However, for some ops it is necessary to use the torch module to verify numerical output so they are placed the numerical tests. NOTE: Confused where @context is coming from? Its from the pytest fixture defined below. """ @pytest.fixture def context(self): return TranscriptionContext() @pytest.fixture def set_random_seeds(self): torch.manual_seed(1) np.random.seed(1) @pytest.mark.parametrize("dtype", [torch.bool, torch.float, torch.int]) def test_constant(self, context, dtype): test_data = torch.ones(1, dtype=dtype) node = InternalTorchIRNode( attr={"value": test_data}, kind="constant", inputs=[], outputs=["1"] ) ssa = self._construct_test_graph(context, ops.constant, node, "1") assert np.allclose(test_data, ssa.val) assert test_data.shape == ssa.shape def test_constant_magic(self, context): test_val = ops.PYTORCH_DEFAULT_VALUE node = InternalTorchIRNode( attr={"value": test_val}, kind="constant", inputs=[], outputs=["1"] ) ssa = self._construct_test_graph(context, ops.constant, node, "1") # We expect the magic default to get converted to None assert ssa is None @staticmethod def _gen_constants(size, vals): """Helper function. Generates a list of internal constant nodes. Arguments: size: number of constants to generate vals: Either a list of values for each constant or one value used for all constants.""" is_list = isinstance(vals, list) if is_list: if len(vals) != size: raise ValueError("len(@vals): {} != size: {}".format(len(vals), size)) constants = [] for index in range(size): if is_list: val = vals[index] else: val = vals constants.append( InternalTorchIRNode( attr={"value": val}, kind="constant", inputs=[], outputs=[str(index)], ) ) input_list = [str(i) for i in range(size)] output_name = str(len(input_list)) return constants, input_list, output_name @staticmethod def _construct_test_graph( context, test_op, test_node, output_name=None, graph_inputs=None, constants=None ): """ Construct an Function for the given @graph_inputs, @constants, and @test_node. Returns the output of the graph, which is the ssa Var of the given @output_name. 
""" if graph_inputs is None: graph_inputs = {} if constants is None: constants = [] with Function(inputs=graph_inputs) as ssa_func: for name in ssa_func.inputs.keys(): context.add(ssa_func.inputs[name]) for node in constants: ops.constant(context, node) test_op(context, test_node) ssa = None if output_name: ssa = context[output_name] return ssa def _test_elementwise_binary( self, context, op_name, op, test_input, num_constants, expected_result ): """Helper function, runs op on test input and compares against expected result""" constants, input_list, output_name = self._gen_constants( num_constants, test_input ) eb_node = InternalTorchIRNode( kind=op_name, inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, op, eb_node, output_name, constants=constants ) np.testing.assert_allclose(expected_result, ssa.val, atol=1e-6) def _test_cast(self, context, test_val, op_kind, op_func, python_type): constants, input_list, output_name = self._gen_constants(1, [test_val]) node = InternalTorchIRNode( kind=op_kind, inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, op_func, node, output_name, constants=constants ) assert ssa.val == python_type(test_val) def test_add(self, context): test_input_1 = np.random.rand(2, 3) test_input_2 = np.random.rand(2, 3) scale_factor = 1 self._test_elementwise_binary( context, "Add", ops.add, [test_input_1, test_input_2, scale_factor], 3, test_input_1 + test_input_2, ) def test_add_no_scale_factor(self, context): test_input_1 = np.random.rand(2, 3) test_input_2 = np.random.rand(2, 3) self._test_elementwise_binary( context, "Add", ops.add, [test_input_1, test_input_2], 2, test_input_1 + test_input_2, ) @pytest.mark.parametrize( "test_input_1, test_input_2", [(np.random.rand(3, 2), np.random.rand(3, 2)), (np.random.rand(3, 2), 5), ], ) def test_sub(self, context, test_input_1, test_input_2): scale_factor = 1 self._test_elementwise_binary( context, "Sub", ops.sub, [test_input_1, test_input_2, scale_factor], 3, test_input_1 - test_input_2, ) @pytest.mark.parametrize( "test_input_1, test_input_2", [(np.random.rand(3, 2), np.random.rand(3, 2)), (np.random.rand(3, 2), 5), ], ) def test_rsub(self, context, test_input_1, test_input_2): scale_factor = 1 self._test_elementwise_binary( context, "rsub", ops.sub, [test_input_1, test_input_2, scale_factor], 3, # Note the reversal of arg ordering relative to 'sub' test_input_2 - test_input_1, ) def test_mul(self, context): test_input_1 = np.random.rand(3, 2) test_input_2 = np.random.rand(3, 2) self._test_elementwise_binary( context, "Mul", ops.mul, [test_input_1, test_input_2], 2, test_input_1 * test_input_2, ) def test_div(self, context): test_input_1 = np.random.rand(3, 2) test_input_2 = np.random.rand(3, 2) self._test_elementwise_binary( context, "Div", ops.div, [test_input_1, test_input_2], 2, np.divide(test_input_1, test_input_2), ) def test_floor_divide(self, context): test_input_1 = np.random.randint(low=1, high=100, size=(3, 2)) test_input_2 = np.random.randint(low=1, high=100, size=(3, 2)) self._test_elementwise_binary( context, "floor_divide", ops.floor_divide, [test_input_1, test_input_2], 2, np.floor_divide(test_input_1, test_input_2), ) def test_pow(self, context): test_input_1 = np.random.rand(3, 2) test_input_2 = np.random.rand(3, 2) self._test_elementwise_binary( context, "Pow", ops.pow, [test_input_1, test_input_2], 2, np.power(test_input_1, test_input_2), ) def test_eq(self, context): test_input_1 = torch.zeros([2, 3, 4, 5, 6]).float() test_input_2 = 
torch.ones([2, 3, 4, 5, 6]).float() test_input_2[0][0][0][0][0] = 0 expected_output = (test_input_1 == test_input_2).float() self._test_elementwise_binary( context, "Eq", ops.eq, [test_input_1, test_input_2], 2, expected_output ) def test_ne(self, context): test_input_1 = torch.zeros([2, 3, 4, 5, 6]).float() test_input_2 = torch.ones([2, 3, 4, 5, 6]).float() test_input_2[0][0][0][0][0] = 0 expected_output = (test_input_1 != test_input_2).float() self._test_elementwise_binary( context, "ne", ops.ne, [test_input_1, test_input_2], 2, expected_output ) def test_le(self, context): test_input_1 = torch.zeros([2, 3, 4, 5, 6]).float() test_input_2 = torch.ones([2, 3, 4, 5, 6]).float() test_input_2[0][0][0][0][0] = 0 expected_output = (test_input_1 <= test_input_2).float() self._test_elementwise_binary( context, "Le", ops.le, [test_input_1, test_input_2], 2, expected_output ) def test_lt(self, context): test_input_1 = torch.zeros([2, 3, 4, 5, 6]).float() test_input_2 = torch.ones([2, 3, 4, 5, 6]).float() test_input_2[0][0][0][0][0] = 0 expected_output = (test_input_1 < test_input_2).float() self._test_elementwise_binary( context, "Lt", ops.lt, [test_input_1, test_input_2], 2, expected_output ) def test_ge(self, context): test_input_1 = torch.zeros([2, 3, 4, 5, 6]).float() test_input_2 = torch.ones([2, 3, 4, 5, 6]).float() test_input_2[0][0][0][0][0] = 0 expected_output = (test_input_1 >= test_input_2).float() self._test_elementwise_binary( context, "Ge", ops.ge, [test_input_1, test_input_2], 2, expected_output ) def test_gt(self, context): test_input_1 = torch.zeros([2, 3, 4, 5, 6]).float() test_input_2 = torch.ones([2, 3, 4, 5, 6]).float() test_input_2[0][0][0][0][0] = 0 expected_output = (test_input_1 > test_input_2).float() self._test_elementwise_binary( context, "Gt", ops.gt, [test_input_1, test_input_2], 2, expected_output ) @pytest.mark.parametrize( "size, array_type", itertools.product( [1, 5, 7], [ ("ListConstruct", ops.listconstruct), ("TupleConstruct", ops.tupleconstruct), ], ), ) def test_arrayconstruct_scalars(self, context, size, array_type): constant_vals = list(range(size)) array_kind = array_type[0] array_op = array_type[1] constants, input_list, output_name = self._gen_constants(size, constant_vals) ac_node = InternalTorchIRNode( kind=array_kind, inputs=input_list, outputs=[output_name], ) ssa = self._construct_test_graph( context, array_op, ac_node, output_name, constants=constants ) expected_val = np.arange(size) np.testing.assert_equal(ssa.shape, (size,)) np.testing.assert_array_equal(ssa.val, expected_val) @pytest.mark.parametrize( "shape1, shape2, array_type", itertools.product( [(1, 2), (3, 4, 5), (2,)], [(2, 1), (1, 4, 5), (3,)], [ ("ListConstruct", ops.listconstruct), ("TupleConstruct", ops.tupleconstruct), ], ), ) def test_arrayconstruct_nonscalar(self, context, shape1, shape2, array_type): tensor1 = torch.rand(shape1) tensor2 = torch.rand(shape2) array_kind = array_type[0] array_op = array_type[1] constants, input_list, output_name = self._gen_constants(2, [tensor1, tensor2]) ac_node = InternalTorchIRNode( kind=array_kind, inputs=input_list, outputs=[output_name], ) ssa = self._construct_test_graph( context, array_op, ac_node, output_name, constants=constants ) expected_val = (tensor1.numpy(), tensor2.numpy()) np.testing.assert_equal(len(ssa), 2) for x, y in zip(ssa, expected_val): np.testing.assert_allclose(x.val, y) @pytest.mark.parametrize( "input_shape, dim0, dim1", [ x for x in itertools.product( [(1, 2, 3), (1, 2, 3, 4), (1, 2, 3, 4, 5)], [0, 1, -1], [0, 2, -2], ) ] 
+ [((1, 2), None, None)], ) def test_transpose(self, context, input_shape, dim0, dim1): test_input = torch.rand(input_shape) constant_list = [test_input] if len(input_shape) > 2: constant_list += [dim0, dim1] kind = "transpose" expected_result = torch.transpose(test_input, dim0, dim1) else: kind = "t" expected_result = test_input.t() constants, input_list, output_name = self._gen_constants( len(constant_list), constant_list ) transpose_node = InternalTorchIRNode( kind=kind, inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.transpose, transpose_node, output_name, constants=constants, ) np.testing.assert_array_equal(expected_result.shape, ssa.shape) np.testing.assert_allclose(expected_result, ssa.val) @pytest.mark.parametrize( "dim1, dim2, dim3", itertools.product([1, 2, 5], [2, 5, 10], [1, 2, 5]), ) def test_matmul(self, context, dim1, dim2, dim3): mat1 = torch.rand((dim1, dim2)) mat2 = torch.rand((dim2, dim3)) constant_vals = [ mat1, mat2, ] constants, input_list, output_name = self._gen_constants(2, constant_vals) matmul_node = InternalTorchIRNode( kind="matmul", inputs=input_list, outputs=[output_name], ) ssa = self._construct_test_graph( context, ops.matmul, matmul_node, output_name, constants=constants ) expected_result = torch.matmul(mat1, mat2).detach().numpy() assert np.allclose(expected_result, ssa.val) @pytest.mark.parametrize( "input_shape, axis, expected_shape", [ ((1, 2), None, (2,)), ((1, 2), 0, (2,)), ((1, 2, 1), None, (2,)), ((1, 2, 1, 1), None, (2,)), ((1, 2, 1, 1), 2, (1, 2, 1)), ((1, 2, 1, 1, 1), None, (2,)), ], ) def test_squeeze(self, context, input_shape, axis, expected_shape): test_data = torch.rand(input_shape) if axis is None: constants, input_list, output_name = self._gen_constants(1, test_data) else: constants, input_list, output_name = self._gen_constants( 2, [test_data, axis] ) squeeze_node = InternalTorchIRNode( kind="Squeeze", inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.squeeze, squeeze_node, output_name, constants=constants ) if axis is None: expected_result = torch.squeeze(test_data) else: expected_result = torch.squeeze(test_data, axis) assert np.allclose(expected_result, ssa.val) assert expected_result.size() == torch.Size(expected_shape) @pytest.mark.parametrize( "input_shape, axis, expected_shape", [ ((2,), 0, (1, 2)), ((2,), 1, (2, 1)), ((2,), -1, (2, 1)), ((2, 3), 1, (2, 1, 3)), ], ) def test_unsqueeze(self, context, input_shape, axis, expected_shape): test_data = torch.rand(input_shape) constants, input_list, output_name = self._gen_constants(2, [test_data, axis]) unsqueeze_node = InternalTorchIRNode( kind="Unsqueeze", inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.unsqueeze, unsqueeze_node, output_name, constants=constants ) expected_result = torch.unsqueeze(test_data, axis) assert np.allclose(expected_result, ssa.val) assert expected_result.size() == torch.Size(expected_shape) @pytest.mark.parametrize( "input_shape, start, end", [ ((2, 1, 1, 2), 1, 3), ((2, 2, 1, 1), 1, -2), ((1, 1, 1), 0, 2), ((1, 2), 0, 1), ((1, 2), 1, 1), ((1, 1), 1, -1), ((1,), 0, 0), ], ) def test_flatten(self, context, input_shape, start, end): test_data = torch.rand(input_shape) constants, input_list, output_name = self._gen_constants( 3, [test_data, start, end] ) flatten_node = InternalTorchIRNode( kind="Flatten", inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.flatten, flatten_node, output_name, 
constants=constants ) expected_result = torch.flatten(test_data, start, end) assert np.allclose(expected_result, ssa.val) @pytest.mark.parametrize( "start, end", [(0, -5), (100, 2), (2, 100), (-3, -4),], ) def test_flatten_exception(self, context, start, end): test_data = torch.rand(1, 1, 1, 1) constants, input_list, output_name = self._gen_constants( 3, [test_data, start, end] ) flatten_node = InternalTorchIRNode( kind="Flatten", inputs=input_list, outputs=[output_name] ) with pytest.raises(ValueError): self._construct_test_graph( context, ops.flatten, flatten_node, output_name, constants=constants, ) @pytest.mark.parametrize( "input_shape", [(2, 3), (2, 3, 4), (2, 3, 4, 5), (2, 3, 4, 5, 6),], ) def test_permute(self, context, input_shape): test_data = torch.rand(*input_shape) permutation = list(range(len(input_shape))) np.random.shuffle(permutation) constants, input_list, output_name = self._gen_constants( 2, [test_data, permutation] ) permute_node = InternalTorchIRNode( kind="Permute", inputs=input_list, outputs=[output_name], ) ssa = self._construct_test_graph( context, ops.permute_copy, permute_node, output_name, constants=constants ) expected_result = test_data.permute(*permutation) assert expected_result.shape == ssa.shape @pytest.mark.parametrize( "in_features, out_features, scaling", itertools.product([10, 25, 100], [3, 6], [1.0, 0.5]), ) def test_addmm(self, context, in_features, out_features, scaling): input_data = torch.rand((1, in_features)) weight_data = torch.rand((in_features, out_features)) bias_data = torch.rand((out_features)) constant_vals = [ scaling, input_data, weight_data, bias_data, ] constants, _, output_name = self._gen_constants(4, constant_vals) addmm_node = InternalTorchIRNode( kind="addmm", inputs=["3", "1", "2", "0", "0"], outputs=[output_name], ) ssa = self._construct_test_graph( context, ops.addmm, addmm_node, output_name, constants=constants ) torch_linear = nn.Linear(in_features=in_features, out_features=out_features,) expected_shape = tuple(torch_linear(input_data).shape) assert expected_shape == ssa.shape @pytest.mark.parametrize( "height, width, kernel_size, stride, padding, dilation", itertools.product([5, 6], [5, 7], [1, 3], [1, 3], [1, 3], [1, 3]), ) def test_convolution2d( self, context, height, width, kernel_size, stride, padding, dilation, groups=1, in_channels=1, out_channels=2, ): test_input = torch.rand(1, in_channels, height, width) constant_vals = [ 1, # None argument test_input, np.random.rand( out_channels, in_channels, kernel_size, kernel_size ), # weights np.random.rand(out_channels), # bias np.array([stride, stride]), np.array([padding, padding]), np.array([dilation, dilation]), False, # transposed np.array([0, 0]), # output_pad groups, ] constants, _, output_name = self._gen_constants( len(constant_vals), constant_vals ) # For reference, the values for `kind` and `inputs` indices are determined from the definition for Torch's # `at::_convolution` used for all convolutions. The link below is approximately correct at the time of writing. 
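# (Roughly, those inputs correspond to at::_convolution's arguments: input, weight, bias,
# stride, padding, dilation, transposed, output_padding, groups, benchmark, deterministic,
# cudnn_enabled -- the exact argument list may differ slightly between PyTorch versions.)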
# https://github.com/pytorch/pytorch/blob/bd604mb5b7ae4f6388aca461891d620b0d485fbb/aten/src/ATen/native/Convolution.cpp#L544 conv_node = InternalTorchIRNode( kind="_convolution", inputs=["1", "2", "3", "4", "5", "6", "7", "8", "9", "0", "0", "0"], outputs=[output_name], ) ssa = self._construct_test_graph( context, ops._convolution, conv_node, output_name, constants=constants ) torch_conv = nn.Conv2d( in_channels=in_channels, out_channels=out_channels, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, ) expected_shape = tuple(torch_conv(test_input).shape) assert ssa.val is None assert expected_shape == ssa.shape @pytest.mark.parametrize( "depth, height, width, kernel_size, stride, padding, dilation, groups", itertools.product( [5, 5], [5, 6], [5, 7], [1, 3], [(1, 1, 1), (3, 2, 1)], [(1, 1, 1), (1, 3, 2)], [(1, 1, 1), (1, 2, 3)], [ 1, -1, ], # -1 groups indicates it should be set to the number of input channels for depthwise convolution ), ) def test_convolution3d( self, context, depth, height, width, kernel_size, stride, padding, dilation, groups, in_channels=2, out_channels=4, ): if groups == -1: groups = in_channels test_input = torch.rand(1, in_channels, depth, height, width) constant_vals = [ 1, # None argument test_input, np.random.rand( out_channels, in_channels // groups, kernel_size, kernel_size, kernel_size, ), # weights np.random.rand(out_channels), # bias # PyTorch's Conv3d accepts either an int (for all dimensions) or a 3-tuple of ints (one per dimension) np.array([stride[0], stride[1], stride[2]]), np.array([padding[0], padding[1], padding[2]]), np.array([dilation[0], dilation[1], dilation[2]]), False, # transposed np.array([0, 0, 0]), # out_pad groups, ] constants, _, output_name = self._gen_constants( len(constant_vals), constant_vals ) # For reference, the values for `kind` and `inputs` indices are determined from the definition for Torch's # `at::_convolution` used for all convolutions. The link below is approximately correct at the time of writing. 
# https://github.com/pytorch/pytorch/blob/bd604mb5b7ae4f6388aca461891d620b0d485fbb/aten/src/ATen/native/Convolution.cpp#L544 conv_node = InternalTorchIRNode( kind="_convolution", inputs=["1", "2", "3", "4", "5", "6", "7", "8", "9", "0", "0", "0"], outputs=[output_name], ) ssa = self._construct_test_graph( context, ops._convolution, conv_node, output_name, constants=constants ) torch_conv = nn.Conv3d( in_channels=in_channels, out_channels=out_channels, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, groups=groups, ) expected_result = torch_conv(test_input) expected_shape = tuple(expected_result.shape) assert ssa.val is None assert expected_shape == ssa.shape @pytest.mark.parametrize( "height, width, kernel_size, stride, padding, dilation", itertools.product([5, 6], [5, 7], [1, 3], [2, 3], [0, 1], [1, 3]), ) def test_convolution_transpose2d( self, context, height, width, kernel_size, stride, padding, dilation, groups=1, in_channels=1, out_channels=2, ): test_input = torch.rand(1, in_channels, height, width) constant_vals = [ np.random.rand( in_channels, out_channels, kernel_size, kernel_size ), # weights np.random.rand(out_channels), # bias np.array([stride, stride]), np.array([padding, padding]), np.array([dilation, dilation]), True, # transposed, np.array([0, 0]), # output_pad groups, False, False, False, ] graph_inputs = {"input": mb.placeholder(test_input.shape, dtype=types.float)} constants, input_list, output_name = self._gen_constants( len(constant_vals), constant_vals ) conv_node = InternalTorchIRNode( kind="_convolution", inputs=["input"] + input_list, outputs=[output_name], ) ssa = self._construct_test_graph( context, ops._convolution, conv_node, output_name, constants=constants, graph_inputs=graph_inputs, ) torch_conv = nn.ConvTranspose2d( in_channels=in_channels, out_channels=out_channels, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, ) expected_shape = tuple(torch_conv(test_input).shape) assert ssa.val is None assert expected_shape == ssa.shape @pytest.mark.parametrize( "input_shape, dim, keepdim", itertools.product([(3, 20, 20), (1, 50, 50)], [0, 1, 2, [0, 2]], [True, False]), ) def test_mean(self, context, input_shape, dim, keepdim): test_input = torch.rand(*input_shape) constants, input_list, output_name = self._gen_constants( 4, [test_input, dim, keepdim, None] ) mean_node = InternalTorchIRNode( kind="mean", inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.mean, mean_node, output_name, constants=constants ) expected_result = torch.mean(test_input, dim, keepdim) assert np.allclose(expected_result, ssa.val) def test_mean_no_dims(self, context): test_input = torch.rand((3, 20, 20)) constants, input_list, output_name = self._gen_constants(2, [test_input, None]) mean_node = InternalTorchIRNode( kind="mean", inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.mean, mean_node, output_name, constants=constants ) expected_result = torch.mean(test_input) assert np.allclose(expected_result, ssa.val) def test_embedding(self, context): EMBEDDING_DIMENSION = 10 NUM_EMBEDDINGS = 20 input_shape = (NUM_EMBEDDINGS, EMBEDDING_DIMENSION) # size is arbitrary for indices indices = np.random.randint(NUM_EMBEDDINGS, size=100) test_input = torch.rand(input_shape) constants, input_list, output_name = self._gen_constants( 2, [test_input, indices] ) gather_node = InternalTorchIRNode( kind="embedding", inputs=input_list, outputs=[output_name] ) ssa = 
self._construct_test_graph( context, ops.embedding, gather_node, output_name, constants=constants ) torch_embedding = nn.Embedding.from_pretrained(test_input) expected_result = torch_embedding(torch.LongTensor(indices)) assert np.allclose(expected_result, ssa.val) @pytest.mark.parametrize( "dim", [0, 1, 2, 3, 4], ) def test_size(self, context, dim): test_input = torch.rand(1, 2, 3, 4, 5) graph_inputs = {"input": mb.placeholder(test_input.shape, dtype=types.float)} constants, input_list, output_name = self._gen_constants(1, [dim]) size_node = InternalTorchIRNode( kind="size", inputs=["input"] + input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.size, size_node, output_name, constants=constants, graph_inputs=graph_inputs, ) expected_result = test_input.shape[dim] assert expected_result == ssa.val @pytest.mark.parametrize( "dim", [0, 1], ) def test_size_symbolic(self, context, dim): test_shape = (3, get_new_symbol()) graph_inputs = {"input": mb.placeholder(shape=test_shape, dtype=types.float)} constants, input_list, output_name = self._gen_constants(1, [dim]) size_node = InternalTorchIRNode( kind="size", inputs=["input"] + input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.size, size_node, output_name, constants=constants, graph_inputs=graph_inputs, ) expected_result = test_shape[dim] assert expected_result == ssa.sym_val @pytest.mark.parametrize( "input_size, shape", itertools.product([(5, 12), (1, 4, 15), (3, 5, 4)], [(3, 20), (-1, 6), (60,)],), ) def test_view(self, context, input_size, shape): test_input = torch.rand(input_size) constants, input_list, output_name = self._gen_constants(2, [test_input, shape]) view_node = InternalTorchIRNode( kind="view", inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.view, view_node, output_name, constants=constants ) expected_result = test_input.view(shape) assert np.allclose(expected_result, ssa.val) @pytest.mark.parametrize( "input_shape, output_shape", itertools.product( [(1, 3, 15, 15), (1, 1, 2, 2), (1, 3, 10, 10)], [(1, 1), (2, 2), (2, 1)], ), ) def test_adaptive_avg_pool2d(self, context, input_shape, output_shape): test_input = torch.rand(input_shape) constants, input_list, output_name = self._gen_constants( 2, [test_input, output_shape] ) adaptive_avg_pool2d_node = InternalTorchIRNode( kind="adaptive_avg_pool2d", inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.adaptive_avg_pool2d, adaptive_avg_pool2d_node, output_name, constants=constants, ) expected_result = torch._adaptive_avg_pool2d(test_input, output_shape) expected_shape = tuple(expected_result.shape) assert expected_shape == ssa.shape # We only expect numerical output when reducing to global average. if output_shape == (1, 1): assert np.allclose(expected_result, ssa.val) def test_adaptive_avg_pool2d_exception(self, context): # For this test, the input tensor HW channels are dynamic. 
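# With symbolic spatial dims the converter cannot derive a fixed kernel/stride for the
# requested (2, 1) output (only a global (1, 1) pool would be size-independent), so the
# lowering below is expected to raise ValueError.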
input_shape = [1, 3, get_new_symbol(), get_new_symbol()] graph_inputs = {"input": mb.placeholder(input_shape, dtype=types.float)} constants, input_list, output_name = self._gen_constants(1, [(2, 1)]) adaptive_avg_pool2d_node = InternalTorchIRNode( kind="adaptive_avg_pool2d", inputs=["input"] + input_list, outputs=[output_name], ) with pytest.raises(ValueError): self._construct_test_graph( context, ops.adaptive_avg_pool2d, adaptive_avg_pool2d_node, output_name, constants=constants, graph_inputs=graph_inputs, ) @pytest.mark.parametrize("input_shape", [(1, 3, 15, 15), (1, 1, 1, 1)]) def test_batch_norm(self, context, input_shape): test_input = torch.rand(input_shape) channels = input_shape[1] constants, input_list, output_name = self._gen_constants( 9, [ torch.rand(input_shape), # input torch.rand(channels), # weight torch.rand(channels), # bias torch.rand(channels), # running mean torch.rand(channels), # running var 0, # training 0.1, # momentum 1e-6, # eps 1, # cudnn_enabled ], ) batch_norm_node = InternalTorchIRNode( kind="batch_norm", inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.batch_norm, batch_norm_node, output_name, constants=constants ) assert ssa.val is None assert ssa.shape == tuple(test_input.shape) @pytest.mark.parametrize("input_shape", [(1, 3, 15, 15), (1, 1, 1, 1)]) def test_instance_norm(self, context, input_shape): test_input = torch.rand(input_shape) channels = input_shape[1] constants, input_list, output_name = self._gen_constants( 9, [ torch.rand(input_shape), # input torch.rand(channels), # weight torch.rand(channels), # bias torch.rand(channels), # running mean torch.rand(channels), # running var 0, # training 0.1, # momentum 1e-6, # eps 1, # cudnn_enabled ], ) instant_norm_node = InternalTorchIRNode( kind="instance_norm", inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.instance_norm, instant_norm_node, output_name, constants=constants ) assert ssa.val is None assert ssa.shape == tuple(test_input.shape) @pytest.mark.parametrize("axis", [1, 2, 3]) def test_cat(self, context, axis): input_shape = (1, 3, 240, 320) test_input1 = torch.rand(input_shape) test_input2 = torch.rand(input_shape) const_input = torch.rand(input_shape) graph_inputs = { "input1": mb.placeholder(input_shape, dtype=types.float), "input2": mb.placeholder(input_shape, dtype=types.float), } dim_node = InternalTorchIRNode( attr={"value": axis}, kind="constant", inputs=[], outputs=["0"], ) const_tensor_node = InternalTorchIRNode( attr={"value": const_input.numpy()}, kind="constant", inputs=[], outputs=["1"], ) listconstruct_node = InternalTorchIRNode( kind="listconstruct", inputs=["1", "input1", "input2"], outputs=["2"] ) cat_node = InternalTorchIRNode( kind="cat", inputs=["2", "0"], outputs=["output"] ) with Function(inputs=graph_inputs) as ssa_func: context.add(ssa_func.inputs["input1"]) context.add(ssa_func.inputs["input2"]) ops.constant(context, dim_node) ops.constant(context, const_tensor_node) ops.listconstruct(context, listconstruct_node) ops.cat(context, cat_node) ssa = context["output"] expected_result = torch.cat( (const_input, test_input1, test_input2), dim=axis ).numpy() assert np.allclose(expected_result.shape, ssa.shape) @pytest.mark.parametrize("axis", [0, 1, 2, 3, 4]) def test_stack(self, context, axis): input_shape = (1, 3, 240, 320) test_input1 = torch.rand(input_shape) test_input2 = torch.rand(input_shape) const_input = torch.rand(input_shape) graph_inputs = { "input1": mb.placeholder(input_shape, 
dtype=types.float), "input2": mb.placeholder(input_shape, dtype=types.float), } dim_node = InternalTorchIRNode( attr={"value": axis}, kind="constant", inputs=[], outputs=["0"], ) const_tensor_node = InternalTorchIRNode( attr={"value": const_input.numpy()}, kind="constant", inputs=[], outputs=["1"], ) listconstruct_node = InternalTorchIRNode( kind="listconstruct", inputs=["1", "input1", "input2"], outputs=["2"] ) stack_node = InternalTorchIRNode( kind="stack", inputs=["2", "0"], outputs=["output"] ) with Function(inputs=graph_inputs) as ssa_func: context.add(ssa_func.inputs["input1"]) context.add(ssa_func.inputs["input2"]) ops.constant(context, dim_node) ops.constant(context, const_tensor_node) ops.listconstruct(context, listconstruct_node) ops.stack(context, stack_node) ssa = context["output"] expected_result = np.stack((const_input, test_input1, test_input2), axis=axis) assert np.allclose(expected_result.shape, ssa.shape) def test_item(self, context): const_val = 0 constants, input_list, output_name = self._gen_constants(1, [const_val]) item_node = InternalTorchIRNode( kind="item", inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.item, item_node, output_name, constants=constants ) assert ssa.val == const_val def test_item_exception(self, context): const_val = [0, 1] constants, input_list, output_name = self._gen_constants(1, [const_val]) item_node = InternalTorchIRNode( kind="item", inputs=input_list, outputs=[output_name] ) with pytest.raises(ValueError): self._construct_test_graph( context, ops.item, item_node, output_name, constants=constants, ) @pytest.mark.parametrize("test_val", [1, 1.5, False]) def test_bool(self, context, test_val): self._test_cast(context, test_val, "bool", ops._bool, bool) @pytest.mark.parametrize("test_val", [1, 1.5, -0.3]) def test_int(self, context, test_val): self._test_cast(context, test_val, "int", ops._int, int) @pytest.mark.parametrize("input_shape", [(1, 3, 15, 15), (1, 1, 1, 1)]) def test_layer_norm(self, context, input_shape): graph_inputs = {"input": mb.placeholder(input_shape, dtype=types.float)} constants, input_list, output_name = self._gen_constants( 5, [ input_shape, # normalized shape torch.rand(*input_shape), # weight torch.rand(*input_shape), # running bias 1e-6, 1, # cudnn enabled ], ) layer_norm_node = InternalTorchIRNode( kind="layer_norm", inputs=["input"] + input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.layer_norm, layer_norm_node, output_name, graph_inputs=graph_inputs, constants=constants, ) assert ssa.val is None assert ssa.shape == input_shape @pytest.mark.parametrize("shape", [(1, 2), (2, 3, 4, 5), (3, 4, 5),]) def test_ones(self, context, shape): constants, constant_input_list, output_name = self._gen_constants( 6, [shape, 1, 1, 1, 1, 1] ) ones_node = InternalTorchIRNode( kind="ones", inputs=constant_input_list, outputs=[output_name], ) ssa = self._construct_test_graph( context, ops.ones, ones_node, output_name, constants=constants, ) assert ssa.shape == shape @pytest.mark.parametrize("input_shape", [(1, 2), (2, 3, 4, 5), (3, 4, 5),]) def test_ones_like(self, context, input_shape): graph_inputs = {"input": mb.placeholder(input_shape, dtype=types.float)} constants, constant_input_list, output_name = self._gen_constants(5, 1) ones_node = InternalTorchIRNode( kind="ones_like", inputs=["input"] + constant_input_list, outputs=[output_name], ) ssa = self._construct_test_graph( context, ops.ones_like, ones_node, output_name, graph_inputs=graph_inputs, 
constants=constants, ) assert ssa.shape == input_shape @pytest.mark.parametrize( "input_size, dim, index", itertools.product( [(13, 43, 10), (39, 14, 11, 9)], [0, 1, 2], [0, 1, 3, 8, -1], ), ) def test_select(self, context, input_size, dim, index): graph_inputs = {"input1": mb.placeholder(input_size, dtype=types.float)} constants, constant_input_list, output_name = self._gen_constants( 2, [dim, index] ) select_node = InternalTorchIRNode( kind="select", inputs=["input1"] + constant_input_list, outputs=[output_name], ) ssa = self._construct_test_graph( context, ops.select, select_node, output_name, graph_inputs=graph_inputs, constants=constants, ) select_index = index if index < 0: select_index += input_size[dim] expected_shape = tuple( torch.rand(input_size) .index_select(dim, torch.tensor([select_index])) .squeeze(dim) .shape ) assert np.allclose(ssa.shape, expected_shape) @pytest.mark.parametrize( "dynamic, test_tuple", itertools.product([True, False], [True, False]) ) def test_tuple_and_list_unpack(self, context, dynamic, test_tuple): """ if @dynamic is True then packs up a dynamic input if @test_tuple is True tests tupleUnpack else tests listUnpack """ if test_tuple: construct_op = ops.tupleconstruct construct_name = "TupleConstruct" unpack_name = "TupleUnpack" else: construct_op = ops.listconstruct construct_name = "ListConstruct" unpack_name = "ListUnpack" input_shape = (1, 2, 3) constant_vals = [str(i) for i in range(1, 6)] constants_unpacked = [str(i) for i in range(6, 11)] constants, input_list, _ = self._gen_constants(5, constant_vals) output_list = constants_unpacked[:] graph_inputs = {} if dynamic: graph_input_name = "input1" graph_inputs = { graph_input_name: mb.placeholder(input_shape, dtype=types.float) } input_list += [graph_input_name] output_list += [graph_input_name + "_out"] construct_node = InternalTorchIRNode( kind=construct_name, inputs=input_list, outputs=["construct"], ) unpack_node = InternalTorchIRNode( kind=unpack_name, inputs=["construct"], outputs=output_list ) with Function(inputs=graph_inputs) as ssa_func: if dynamic: context.add(ssa_func.inputs["input1"]) for node in constants: ops.constant(context, node) construct_op(context, construct_node) ops.tupleunpack(context, unpack_node) ssa_constants = [] for name in constants_unpacked: ssa_constants.append(context[name].val) assert ssa_constants == constant_vals if dynamic: ssa_dyanmic = context[graph_input_name + "_out"] assert ssa_dyanmic.val is None assert ssa_dyanmic.shape == input_shape def _test_pool( self, context, test_input, param_list, op_kind, op_func, expected_result ): constants, input_list, output_name = self._gen_constants( len(param_list) + 1, [test_input] + param_list, ) pool_node = InternalTorchIRNode( kind=op_kind, inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, op_func, pool_node, output_name, constants=constants, ) expected_shape = tuple(expected_result.shape) assert expected_shape == ssa.shape @pytest.mark.parametrize( "input_shape, kernel_size, stride, pad, include_pad, ceil_mode", itertools.product( [(1, 3, 15), (1, 1, 7), (1, 3, 10)], [1, 3], [1, 2], [0, 1], [True, False], [False, True], ), ) def test_avg_pool1d( self, context, input_shape, kernel_size, stride, pad, include_pad, ceil_mode, ): if pad > kernel_size / 2: return if ceil_mode: if kernel_size == 1 and stride == 2 and pad == 0 and input_shape[-1] == 10: pytest.xfail("Torch ceil_mode does not match exactly with CoreML's ceil_mode. 
rdar://80050546") test_input = torch.rand(input_shape) expected_result = F.avg_pool1d( test_input, kernel_size=kernel_size, stride=stride, padding=pad, ceil_mode=ceil_mode, count_include_pad=include_pad, ) self._test_pool( context, test_input, [[kernel_size], [stride], [pad], ceil_mode, not include_pad], "avg_pool1d", ops.avg_pool1d, expected_result, ) @pytest.mark.parametrize( "input_shape, kernel_size, stride, pad, include_pad, ceil_mode", itertools.product( [(1, 3, 15, 15), (1, 1, 7, 7), (1, 3, 10, 10)], [1, 3], [1, 2], [0, 1], [True, False], [False, True], ), ) def test_avg_pool2d( self, context, input_shape, kernel_size, stride, pad, include_pad, ceil_mode, ): if pad > kernel_size / 2: return if ceil_mode: if kernel_size == 1 and stride == 2 and pad == 0 and input_shape[-1] == 10: pytest.xfail("Torch ceil_mode does not match exactly with CoreML's ceil_mode. rdar://80050546") test_input = torch.rand(input_shape) expected_result = F.avg_pool2d( test_input, kernel_size=kernel_size, stride=stride, padding=pad, ceil_mode=ceil_mode, count_include_pad=include_pad, ) self._test_pool( context, test_input, [ [kernel_size, kernel_size], [stride, stride], [pad, pad], ceil_mode, not include_pad, None, ], "avg_pool2d", ops.avg_pool2d, expected_result, ) @pytest.mark.parametrize( "input_shape, kernel_size, stride, pad, ceil_mode", itertools.product( [(1, 3, 15), (1, 1, 7), (1, 3, 10)], [1, 3], [1, 2], [0, 1], [False, True] ), ) def test_max_pool1d( self, context, input_shape, kernel_size, stride, pad, ceil_mode ): if pad > kernel_size / 2: return if ceil_mode: if kernel_size == 1 and stride == 2 and pad == 0 and input_shape[-1] == 10: pytest.xfail("Torch ceil_mode does not match exactly with CoreML's ceil_mode. rdar://80050546") test_input = torch.rand(input_shape) expected_result = F.max_pool1d( test_input, kernel_size=kernel_size, stride=stride, padding=pad, ceil_mode=ceil_mode, ) self._test_pool( context, test_input, [[kernel_size], [stride], [pad], [1], ceil_mode], "max_pool1d", ops.max_pool1d, expected_result, ) @pytest.mark.parametrize( "input_shape, kernel_size, stride, pad, ceil_mode", itertools.product( [(1, 3, 15, 15), (1, 1, 7, 7), (1, 3, 10, 10)], [1, 3], [1, 2], [0, 1], [False, True], ), ) def test_max_pool2d( self, context, input_shape, kernel_size, stride, pad, ceil_mode, ): if pad > kernel_size / 2: return if ceil_mode: if kernel_size == 1 and stride == 2 and pad == 0 and input_shape[-1] == 10: pytest.xfail("Torch ceil_mode does not match exactly with CoreML's ceil_mode. 
rdar://80050546") test_input = torch.rand(input_shape) expected_result = F.max_pool2d( test_input, kernel_size=kernel_size, stride=stride, padding=pad, ceil_mode=ceil_mode, ) self._test_pool( context, test_input, [ [kernel_size, kernel_size], [stride, stride], [pad, pad], [1, 1,], # dilation ceil_mode, ], "max_pool2d", # Using ops.max_pool1d because max_pool2d is its alias ops.max_pool1d, expected_result, ) @pytest.mark.parametrize( "dim, start, end, step", itertools.product([0, 1, 2], [0, 1, 2], [3, 4, 5, None], [1, 2]), ) def test_slice(self, context, dim, start, end, step): test_input = torch.rand(5, 5, 5) constants, input_list, output_name = self._gen_constants( 5, [test_input, dim, start, end, step] ) node = InternalTorchIRNode( kind="slice", inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.slice, node, output_name, constants=constants ) if end is None: end = test_input.shape[dim] expected_result = test_input.index_select( dim, torch.LongTensor(range(start, end, step)) ) np.testing.assert_allclose(expected_result, ssa.val) @pytest.mark.parametrize( "split_sizes, dim, make_explicit", itertools.product([2, 3], [0, 1, 2], [True, False]), ) def test_split(self, context, split_sizes, dim, make_explicit): test_input = torch.rand(3, 4, 5) if make_explicit: # Explicitly provide the size of each split. This will be two # splits, the given size and the remainder. split_sizes = [split_sizes, test_input.shape[dim] - split_sizes] constants, input_list, output_name = self._gen_constants( 3, [test_input, split_sizes, dim] ) node = InternalTorchIRNode( kind="split", inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.split, node, output_name, constants=constants ) expected_result = torch.split(test_input, split_sizes, dim) if not isinstance(ssa, list): ssa = [ssa] for ex_res, ssa_res in zip(expected_result, ssa): np.testing.assert_allclose(ex_res.numpy(), ssa_res.val, atol=1e-6) def test_floor(self, context): test_input = torch.rand(1, 2, 3) * 10 constants, input_list, output_name = self._gen_constants(1, test_input) floor_node = InternalTorchIRNode( kind="floor", inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.floor, floor_node, output_name, constants=constants, ) expected_result = test_input.floor() assert np.allclose(expected_result, ssa.val) def test_erf(self, context): test_input = torch.rand(1, 2, 3, 4) constants, input_list, output_name = self._gen_constants(1, test_input) node = InternalTorchIRNode(kind="erf", inputs=input_list, outputs=[output_name]) ssa = self._construct_test_graph( context, ops.erf, node, output_name, constants=constants ) expected_result = test_input.erf() assert np.allclose(expected_result, ssa.val, atol=1e-05) def test_implicittensortonum(self, context): input_shape = (1,) graph_input_name = "input1" graph_inputs = { graph_input_name: mb.placeholder(input_shape, dtype=types.float) } output_name = "1" node = InternalTorchIRNode( kind="implicittensortonum", inputs=["input1"], outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.implicittensortonum, node, output_name, graph_inputs=graph_inputs, ) assert ssa.shape == () @pytest.mark.parametrize( "chunks, dim", itertools.product([2, 3, 5], [0, 1, 2, 3]), ) def test_constantchunk(self, context, chunks, dim): test_input = torch.rand(5, 8, 9, 11) expected_result = test_input.chunk(chunks, dim=dim) constants, input_list, first_output = self._gen_constants(1, [test_input]) outputs = 
[str(int(first_output) + i) for i in range(len(expected_result))] node = InternalTorchIRNode( attr={"chunks": chunks, "dim": dim}, kind="constantchunk", inputs=input_list, outputs=outputs, ) self._construct_test_graph( context, ops.constantchunk, node, first_output, constants=constants ) actual_result = [context[name] for name in outputs] np.testing.assert_equal(len(expected_result), len(actual_result)) for ex_res, ssa_res in zip(expected_result, actual_result): np.testing.assert_allclose(ex_res.numpy(), ssa_res.val, atol=1e-6) @pytest.mark.parametrize( "input_shape, shape", [ ((3, 1), (3, 4)), ((3, 1), (-1, 4)), ((3, 1, 1), (3, 4, 1)), ((3, 1, 1), (3, -1, 5)), ((3, 1, 1), (3, 4, 5)), ((1, 3, 1, 1), (2, 3, -1, 1)), ((1, 3, 4, 1), (2, 3, -1, 5)), ], ) def test_expand(self, context, input_shape, shape): test_input = torch.rand(input_shape) constants, input_list, output_name = self._gen_constants(2, [test_input, shape]) node = InternalTorchIRNode( kind="expand", inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.expand, node, output_name, constants=constants ) expected_result = test_input.expand(shape) np.testing.assert_allclose(expected_result, ssa.val) @pytest.mark.parametrize( "input_shape, other_shape", [ ((3, 1), (3, 4)), ((3, 1, 1), (3, 4, 1)), ((3, 1, 1), (3, 4, 5)), ((1, 3, 1, 1), (2, 3, 4, 1)), ((1, 3, 4, 1), (2, 3, 4, 5)), ((1, 3, 4, 1), (1, 3, 4, 5)), ], ) def test_expand_as(self, context, input_shape, other_shape): test_input = torch.rand(input_shape) other = torch.rand(other_shape) constants, input_list, output_name = self._gen_constants(2, [test_input, other]) node = InternalTorchIRNode( kind="expand_as", inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.expand_as, node, output_name, constants=constants ) expected_result = test_input.expand_as(other) np.testing.assert_allclose(expected_result, ssa.val) @pytest.mark.parametrize( "start, end, step", [x for x in itertools.product((None, 0, 2), (5, 10), (None,),)] + [x for x in itertools.product((0, 2), (5, 10), (1, 2))], ) def test_arange(self, context, start, end, step): # Arange can get [end], [start, end], or [start, end, step] args = [x for x in [start, end, step] if x is not None] args += [0, 0, 0, False] # Extra args needed but ignored by arange constants, input_list, output_name = self._gen_constants(len(args), args) node = InternalTorchIRNode( kind="arange", inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.arange, node, output_name, constants=constants ) kwargs = {"end": end} if start is not None: kwargs["start"] = start if step is not None: kwargs["step"] = step expected_result = torch.arange(**kwargs) np.testing.assert_allclose(expected_result, ssa.val) @pytest.mark.parametrize( "input_shape, axis", [((2, 3), 0), ((2, 3, 4), 1), ((2, 3, 4, 5), 0), ((2, 3, 4, 5), 2),], ) def test_masked_fill(self, context, input_shape, axis): mask_shape = list(input_shape) mask_shape[axis] = 1 mask = torch.randint(0, 1, mask_shape, dtype=torch.bool) input_data = torch.rand(input_shape) value = -1.0 constants, input_list, output_name = self._gen_constants( 3, [input_data, mask, value] ) node = InternalTorchIRNode( kind="masked_fill", inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.masked_fill, node, output_name, constants=constants ) expected_result = input_data.masked_fill(mask, value) np.testing.assert_allclose(expected_result, ssa.val) @pytest.mark.parametrize( "noop_kind", 
["dropout", "dropout_", "feature_dropout", "contiguous", "device", "detach"], ) def test_noops(self, context, noop_kind): test_input = torch.rand(3, 4, 5) constants, input_list, output_name = self._gen_constants( 3, [test_input, "test", "test"] ) node = InternalTorchIRNode( kind=noop_kind, inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.noop, node, output_name, constants=constants ) assert np.allclose(test_input.numpy(), ssa.val) def test_tanh(self, context): test_input = torch.rand(3, 4, 5) constants, input_list, output_name = self._gen_constants(1, [test_input]) node = InternalTorchIRNode( kind="tanh", inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.tanh, node, output_name, constants=constants ) expected_result = torch.tanh(test_input) assert np.allclose(expected_result.numpy(), ssa.val) @pytest.mark.parametrize( "input_shape, dim, keepdim", itertools.product([(3, 20, 20), (1, 50, 50)], [0, 1, 2], [True, False]), ) def test_argmax(self, context, input_shape, dim, keepdim): test_input = torch.rand(*input_shape) constants, input_list, output_name = self._gen_constants( 4, [test_input, dim, keepdim, None] ) node = InternalTorchIRNode( kind="argmax", inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.argmax, node, output_name, constants=constants ) expected_result = torch.argmax(test_input, dim, keepdim) np.testing.assert_allclose(expected_result, ssa.val) @pytest.mark.parametrize( "size, dtype", itertools.product([(1, 2, 3, 4), (1,)], [11, 0, 1, 6]), ) def test_zeros(self, context, size, dtype): layout = 0 # unused device = 0 # unused pin_memory = 0 # unused constants, input_list, output_name = self._gen_constants( 5, [size, dtype, layout, device, pin_memory] ) node = InternalTorchIRNode( kind="zeros", inputs=input_list, outputs=[output_name] ) ssa = self._construct_test_graph( context, ops.zeros, node, output_name, constants=constants ) expected_result = torch.zeros(size, dtype=utils.NUM_TO_TORCH_DTYPE[dtype]) np.testing.assert_allclose(expected_result, ssa.val) @pytest.mark.parametrize("input_size", [(1, 2, 3, 4), (1,)]) def test_exp(self, context, input_size): test_input = torch.rand(input_size) constants, input_list, output_name = self._gen_constants(1, test_input) node = InternalTorchIRNode(kind="exp", inputs=input_list, outputs=[output_name]) ssa = self._construct_test_graph( context, ops.exp, node, output_name, constants=constants ) expected_result = torch.exp(test_input) np.testing.assert_allclose(expected_result, ssa.val, rtol=1e-06) @pytest.mark.parametrize( "input_size, dim, keepdim", itertools.product([(1, 2, 3, 4)], [0, 1, 2], [True, False]), ) def test_max(self, context, input_size, dim, keepdim): test_input = torch.rand(input_size) constants, input_list, _ = self._gen_constants(3, [test_input, dim, keepdim]) node = InternalTorchIRNode( kind="max", inputs=input_list, outputs=["out1", "out2"], ) self._construct_test_graph(context, ops.max, node, constants=constants) torch.max(test_input, dim=dim, keepdim=keepdim) @pytest.mark.parametrize( "input_size, dim, descending", itertools.product([(2, 3, 4), (1, 2, 3, 4)], [0, 1, 2], [True, False]), ) def test_sort(self, context, input_size, dim, descending): test_input = torch.rand(input_size) constants, input_list, output_name = self._gen_constants( 3, [test_input, dim, descending] ) node = InternalTorchIRNode( kind="sort", inputs=input_list, outputs=["out1", "out2"], ) self._construct_test_graph(context, ops.sort, 
            node, constants=constants)
        expected_sort, expected_index = torch.sort(
            test_input, dim=dim, descending=descending
        )
        sort_result = context["out1"].val
        index_result = context["out2"].val
        np.testing.assert_allclose(expected_sort, sort_result)
        np.testing.assert_allclose(expected_index, index_result)

    @pytest.mark.parametrize(
        "input_shape, dim, keepdim",
        itertools.product(
            [(3, 20, 20), (1, 50, 50)], [[0], [1], [2], [0, 2]], [True, False]
        ),
    )
    def test_sum(self, context, input_shape, dim, keepdim):
        test_input = torch.rand(*input_shape)
        constants, input_list, output_name = self._gen_constants(
            4, [test_input, dim, keepdim, None]
        )
        sum_node = InternalTorchIRNode(
            kind="sum", inputs=input_list, outputs=[output_name]
        )
        ssa = self._construct_test_graph(
            context, ops.mean, sum_node, output_name, constants=constants
        )
        expected_result = torch.sum(test_input, dim, keepdim)
        assert np.allclose(expected_result, ssa.val)

    def test_sum_no_dims(self, context):
        test_input = torch.rand((3, 20, 20))
        constants, input_list, output_name = self._gen_constants(2, [test_input, None])
        sum_node = InternalTorchIRNode(
            kind="sum", inputs=input_list, outputs=[output_name]
        )
        ssa = self._construct_test_graph(
            context, ops.mean, sum_node, output_name, constants=constants
        )
        expected_result = torch.sum(test_input)
        assert np.allclose(expected_result, ssa.val)

    def test_neg(self, context):
        test_input = torch.rand(3, 4, 5)
        constants, input_list, output_name = self._gen_constants(1, [test_input])
        node = InternalTorchIRNode(
            kind="neg", inputs=input_list, outputs=[output_name]
        )
        ssa = self._construct_test_graph(
            context, ops.neg, node, output_name, constants=constants
        )
        expected_result = torch.neg(test_input)
        assert np.allclose(expected_result.numpy(), ssa.val)

    @pytest.mark.parametrize(
        "input_shape, k, dim, largest",
        itertools.product([(5, 10, 10), (10, 5, 5)], [0, 3, 5], [0, 1, 2], [True, False]),
    )
    def test_topk(self, context, input_shape, k, dim, largest):
        test_input = torch.tensor(random_gen(input_shape, allow_duplicate=False))
        constants, input_list, output_name = self._gen_constants(
            6, [test_input, k, dim, largest, True, None]
        )
        topk_node = InternalTorchIRNode(
            kind="topk", inputs=input_list, outputs=["out1", "out2"]
        )
        self._construct_test_graph(
            context, ops.topk, topk_node, constants=constants
        )
        topk_result = context["out1"].val
        index_result = context["out2"].val
        expected_max, expected_indices = torch.topk(test_input, k, dim, largest)
        np.testing.assert_allclose(expected_max.numpy(), topk_result)
        np.testing.assert_allclose(expected_indices.numpy(), index_result)

coremltools-8.0/coremltools/converters/mil/frontend/torch/test/test_passes.py

# Copyright (c) 2021, Apple Inc. All rights reserved.
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from collections import OrderedDict import numpy as np import pytest import torch from ..internal_graph import ( InternalTorchIRBlock, InternalTorchIRGraph, InternalTorchIRNode, ) from ..torchir_passes import ( flatten_graph_input_values, flatten_graph_output_values, transform_inplace_ops, ) import coremltools as ct def _build_flattening_test_graph(): # This test graph is: # graph( # %1 : (Tensor[1, 1], (Tensor[1, 2], Tensor[1, 3])) # ): # %2, %3 = tupleunpack[](%1) # %4, %5 = tupleunpack[](%3) # %6 = tupleconstruct[](%2, %4) # %7 = tupleconstruct[](%6, %5) # return (%7) # # And if you were to run the graph it would turn # (a, (b, c)) # into # ((a, b), c) graph_params = {} graph_inputs = OrderedDict() graph_inputs["1"] = ( torch.rand(1, 1), ( torch.rand(1, 2), torch.rand(1, 3), ), ) graph_nodes = [ InternalTorchIRNode( inputs=["1"], outputs=["2", "3"], kind="tupleunpack", ), InternalTorchIRNode( inputs=["3"], outputs=["4", "5"], kind="tupleunpack", ), InternalTorchIRNode( inputs=["2", "4"], outputs=["6"], kind="tupleconstruct", ), InternalTorchIRNode( inputs=["6", "5"], outputs=["7"], kind="tupleconstruct", ), ] graph_outputs = ["7"] return InternalTorchIRGraph( nodes=graph_nodes, params=graph_params, inputs=graph_inputs, outputs=graph_outputs, ) class TestTorchPasses: """ Class containing tests for InternalTorchIR optimization passes. """ @pytest.fixture def set_random_seeds(self): torch.manual_seed(1) np.random.seed(1) @staticmethod def test_flatten_input_values(): graph = _build_flattening_test_graph() flatten_graph_input_values(graph) # The graph input tuple should have been flattened. np.testing.assert_equal(len(graph.inputs.keys()), 3) # Tuple flattening should introduce two new ops. np.testing.assert_equal(len(graph.nodes), 6) # The new ops at the beginning of the graph should be a tupleconstruct. np.testing.assert_equal(graph.nodes[0].kind, "tupleconstruct") np.testing.assert_equal(graph.nodes[1].kind, "tupleconstruct") # The inputs to the tupleconstructs should be the new flattened inputs. input_names = [k for k in graph.inputs.keys()] np.testing.assert_equal(input_names[1:], graph.nodes[0].inputs) np.testing.assert_equal(input_names[0], graph.nodes[1].inputs[0]) np.testing.assert_equal(graph.nodes[0].outputs[0], graph.nodes[1].inputs[1]) # The last inserted tuple construct should produce the input for the # next op. np.testing.assert_equal(graph.nodes[1].outputs[0], graph.nodes[2].inputs[0]) @staticmethod def test_flatten_output_values(): graph = _build_flattening_test_graph() flatten_graph_output_values(graph) # The graph output tuple should have been flattened. np.testing.assert_equal(len(graph.outputs), 3) # The outputs of the graph should come from intermediate ops. 
np.testing.assert_equal(graph.outputs[0], graph.nodes[0].outputs[0]) np.testing.assert_equal(graph.outputs[1], graph.nodes[1].outputs[0]) np.testing.assert_equal(graph.outputs[2], graph.nodes[1].outputs[1]) @staticmethod def test_transform_inplace_ops_graph(): # The test graph is: # graph( # %x : Tensor[1], # ): # %1 = constant[value=0]() # %2 = constant[value=10]() # %3 = listconstruct[](%1) # %4 = append[](%3, %2) # return (%3) graph_params = {} graph_inputs = OrderedDict() graph_inputs["x"] = torch.rand(1) graph_nodes = [ InternalTorchIRNode( inputs=[], attr={"value": 0}, outputs=["1"], kind="constant", ), InternalTorchIRNode( inputs=[], attr={"value": 10}, outputs=["2"], kind="constant", ), InternalTorchIRNode( inputs=["1"], outputs=["3"], kind="listconstruct", ), InternalTorchIRNode( inputs=["3", "2"], outputs=["4"], kind="append", ), ] graph_outputs = ["3"] graph = InternalTorchIRGraph( nodes=graph_nodes, params=graph_params, inputs=graph_inputs, outputs=graph_outputs, ) for node in graph.nodes: node.parent = graph transform_inplace_ops(graph) np.testing.assert_equal(len(graph.outputs), 1) np.testing.assert_equal(graph.outputs[0], graph.nodes[-1].outputs[0]) @staticmethod def test_transform_inplace_ops_loop(): # The test graph is: # graph( # %x : Tensor[1], # ): # %1 = constant[value=True]() # %2 = constant[value=-1]() # %3 = constant[value=10]() # %4 = listconstruct[](%2) # = loop[](%3, %1) # block(%i.1): # %6 = append[](%4, %i.1) # return (%1) # return (%4) graph_params = {} graph_inputs = OrderedDict() graph_inputs["x"] = torch.rand(1) loop_block = InternalTorchIRBlock( inputs=["i.1"], outputs=["1"], nodes=[ InternalTorchIRNode( inputs=["4", "i.1"], outputs=["6"], kind="append", ), ], ) loop_block.nodes[0].parent = loop_block loop_node = InternalTorchIRNode( inputs=["3", "1"], outputs=[], kind="loop", blocks=[loop_block], ) loop_block.parent = loop_node graph_nodes = [ InternalTorchIRNode( inputs=[], attr={"value": True}, outputs=["1"], kind="constant", ), InternalTorchIRNode( inputs=[], attr={"value": -1}, outputs=["2"], kind="constant", ), InternalTorchIRNode( inputs=[], attr={"value": 10}, outputs=["3"], kind="constant", ), InternalTorchIRNode( inputs=["2"], outputs=["4"], kind="listconstruct", ), loop_node, ] graph_outputs = ["4"] graph = InternalTorchIRGraph( nodes=graph_nodes, params=graph_params, inputs=graph_inputs, outputs=graph_outputs, ) for node in graph.nodes: node.parent = graph transform_inplace_ops(graph) # There should be an additional input to the loop. np.testing.assert_equal(len(loop_node.inputs), 3) # That input should be the output of the previous op. np.testing.assert_equal(loop_node.inputs[2], graph.nodes[3].outputs[0]) # The loop block should have an additional input. np.testing.assert_equal(len(loop_block.inputs), 2) # The loop block's new input should be the input to append. np.testing.assert_equal(loop_block.inputs[1], loop_block.nodes[0].inputs[0]) # The loop block should have an additional output. np.testing.assert_equal(len(loop_block.outputs), 2) # Append's output should be returned from the loop block. np.testing.assert_equal(loop_block.outputs[1], loop_block.nodes[0].outputs[0]) # The loop should now have an output. np.testing.assert_equal(len(loop_node.outputs), 1) # The loop's name should now be the name of its output. np.testing.assert_equal(loop_node.name, loop_node.outputs[0]) # That graph output should now be the output of the graph. 
        np.testing.assert_equal(loop_node.outputs[0], graph.outputs[0])

    @staticmethod
    @pytest.mark.xfail(reason="rdar://64235006")
    def test_transform_inplace_ops_if():
        # The test graph is:
        # graph(
        #     %x : Tensor[1],
        # ):
        #   %1 = constant[value=True]()
        #   %2 = constant[value=0]()
        #   %3 = constant[value=1]()
        #   %4 = listconstruct[](%2)
        #   = if[](%1)
        #     block0():
        #       %5 = append[](%4, %3)
        #       return ()
        #     block1():
        #       %6 = append[](%4, %2)
        #       return ()
        #   return (%4)
        graph_params = {}
        graph_inputs = OrderedDict()
        graph_inputs["x"] = torch.rand(1)
        if_true_block = InternalTorchIRBlock(
            inputs=[],
            outputs=[],
            nodes=[
                InternalTorchIRNode(
                    inputs=["4", "3"],
                    outputs=["5"],
                    kind="append",
                ),
            ],
        )
        if_true_block.nodes[0].parent = if_true_block
        if_false_block = InternalTorchIRBlock(
            inputs=[],
            outputs=[],
            nodes=[
                InternalTorchIRNode(
                    inputs=["4", "2"],
                    outputs=["6"],
                    kind="append",
                ),
            ],
        )
        if_false_block.nodes[0].parent = if_false_block
        if_node = InternalTorchIRNode(
            inputs=["1"],
            outputs=[],
            kind="if",
            blocks=[if_true_block, if_false_block],
        )
        if_true_block.parent = if_node
        if_false_block.parent = if_node
        graph_nodes = [
            InternalTorchIRNode(
                inputs=[],
                attr={"value": True},
                outputs=["1"],
                kind="constant",
            ),
            InternalTorchIRNode(
                inputs=[],
                attr={"value": 0},
                outputs=["2"],
                kind="constant",
            ),
            InternalTorchIRNode(
                inputs=[],
                attr={"value": 1},
                outputs=["3"],
                kind="constant",
            ),
            InternalTorchIRNode(
                inputs=["2"],
                outputs=["4"],
                kind="listconstruct",
            ),
            if_node,
        ]
        graph_outputs = ["4"]
        graph = InternalTorchIRGraph(
            nodes=graph_nodes,
            params=graph_params,
            inputs=graph_inputs,
            outputs=graph_outputs,
        )
        for node in graph.nodes:
            node.parent = graph
        transform_inplace_ops(graph)
        # The true block should now have an output.
        np.testing.assert_equal(len(if_true_block.outputs), 1)
        # The true block should output the result of the append op.
        np.testing.assert_equal(if_true_block.outputs[0], if_true_block.nodes[0].outputs[0])
        # The false block should now have an output.
        np.testing.assert_equal(len(if_false_block.outputs), 1)
        # The false block should output the result of the append op.
        np.testing.assert_equal(if_false_block.outputs[0], if_false_block.nodes[0].outputs[0])
        # The if op should have an additional output.
        np.testing.assert_equal(len(if_node.outputs), 1)
        # The if's name should now be the name of its output.
        np.testing.assert_equal(if_node.name, if_node.outputs[0])
        # The graph output should be the if op output.
        np.testing.assert_equal(if_node.outputs[0], graph.outputs[0])

    @staticmethod
    def test_inpace_op_from_cast():
        class Net(torch.nn.Module):
            def forward(self, x):
                y = torch.empty(x.shape).to(torch.int32)
                y.fill_(0.2)
                return y

        shape = (2, 3)
        x = torch.rand(*shape)
        traced_fn = torch.jit.trace(Net(), x).eval()
        ct_model = ct.convert(
            traced_fn,
            inputs=[ct.TensorType(shape=shape)],
            outputs=[ct.TensorType(name="y", dtype=np.int32)],
            source="pytorch",
        )
        y_cm = ct_model.predict({'x': x})['y']
        assert (y_cm == np.zeros(shape)).all()

coremltools-8.0/coremltools/converters/mil/frontend/torch/test/test_torch_conversion_api.py

# Copyright (c) 2022, Apple Inc. All rights reserved.
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import os import platform import shutil import tempfile from unittest.mock import patch import numpy as np import pytest from packaging.version import Version from PIL import Image import coremltools as ct from coremltools._deps import ( _HAS_EXECUTORCH, _HAS_HF, _HAS_TORCH, _HAS_TORCHAO, MSG_EXECUTORCH_NOT_FOUND, MSG_TORCH_NOT_FOUND, MSG_TORCHAO_NOT_FOUND, ) from coremltools.converters.mil.frontend.torch.test.testing_utils import _copy_input_data from coremltools.converters.mil.frontend.torch.torch_op_registry import ( _TORCH_OPS_REGISTRY, TorchOpsRegistry, register_torch_op, ) from coremltools.converters.mil.mil.types.symbolic import any_symbolic from coremltools.converters.mil.testing_reqs import backends from coremltools.converters.mil.testing_utils import ( assert_cast_ops_count, assert_input_dtype, assert_ops_in_mil_program, assert_output_dtype, assert_prog_input_type, assert_prog_output_type, assert_spec_input_image_type, assert_spec_output_image_type, get_op_types_in_program, verify_prediction, ) from coremltools.models import _METADATA_SOURCE_DIALECT from coremltools.proto import FeatureTypes_pb2 as ft from coremltools.test.api.test_api_examples import TestInputs as _TestInputs if _HAS_TORCH: import torch import torch.nn as nn import torchvision torch.manual_seed(1818) if _HAS_HF: from peft import LoraConfig, get_peft_model if _HAS_EXECUTORCH: import executorch.exir if _HAS_TORCHAO: from torchao.quantization import quant_api from torchao.utils import unwrap_tensor_subclass @pytest.fixture def torch_model(): class TestModule(torch.nn.Module): def __init__(self): super(TestModule, self).__init__() self.linear = torch.nn.Linear(10, 20) def forward(self, x): return self.linear(x) model = TestModule() model.eval() return model @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) class TestTorchScriptValidation: @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_no_inputs(torch_model, backend): traced_torch_model = torch.jit.trace(torch_model, torch.rand(1, 10)) with pytest.raises( ValueError, match=r'Expected argument "inputs" for TorchScript models not provided' ): ct.convert(traced_torch_model, convert_to=backend[0]) @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_pth_extension(torch_model, tmpdir, backend): # test for issue: https://github.com/apple/coremltools/issues/917 shape = (1, 10) traced_torch_model = torch.jit.trace(torch_model, torch.rand(*shape)) model_path = os.path.join(str(tmpdir), "torch_model.pth") traced_torch_model.save(model_path) ct.convert( model_path, source="pytorch", inputs=[ ct.TensorType( shape=shape, ) ], convert_to=backend[0], ) @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_source_dialect_metadata(torch_model, backend): shape = (1, 10) traced_torch_model = torch.jit.trace(torch_model, torch.rand(*shape)) mlmodel = ct.convert( traced_torch_model, source="pytorch", inputs=[ ct.TensorType( shape=shape, ) ], convert_to=backend[0], ) assert _METADATA_SOURCE_DIALECT in mlmodel.user_defined_metadata assert mlmodel.user_defined_metadata[_METADATA_SOURCE_DIALECT] == "TorchScript" @pytest.mark.skipif(not _HAS_EXECUTORCH, reason=MSG_EXECUTORCH_NOT_FOUND) class TestEXIRValidation: @staticmethod @pytest.mark.parametrize("backend", backends) def test_fp16_io(torch_model, backend): # TODO (rdar://115845792): Handle 
fp16 IO dtypes class TestModule(torch.nn.Module): def __init__(self): super(TestModule, self).__init__() self.linear = torch.nn.Linear(10, 20, dtype=torch.float16) def forward(self, x): return self.linear(x) model = TestModule() model.eval() shape = (1, 10) example_inputs = (torch.rand(*shape, dtype=torch.float16),) exir_program_aten = torch.export.export(model, example_inputs) exir_program_edge = executorch.exir.to_edge(exir_program_aten).exported_program() # Default deployment target is iOS14 for neuralnetwork and iOS15 for mlprogram, # both are too old to support fp16 io with pytest.raises( ValueError, match=r"To use fp16 input, please set minimum deployment target to iOS16\+" ): ct.convert(exir_program_edge, convert_to=backend[0]) # fp16 io should work fine for iOS16+ if backend[0] == "mlprogram": ct.convert( exir_program_edge, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS16, ) @staticmethod @pytest.mark.parametrize("backend", backends) def test_inputs( torch_model, backend ): # TODO: rdar://115845792 ([Executorch] Handle user provided inputs/outputs in the convert API) shape = (2, 10) exir_program_aten = torch.export.export(torch_model, (torch.rand(*shape),)) exir_program_edge = executorch.exir.to_edge(exir_program_aten).exported_program() with pytest.raises( AssertionError, match=r"'inputs' argument should be None for ExportedProgram" ): ct.convert( exir_program_edge, convert_to=backend[0], inputs=[ct.TensorType(shape=shape)], ) @staticmethod @pytest.mark.parametrize("backend", backends) def test_outputs( torch_model, backend ): # TODO: rdar://115845792 ([Executorch] Handle user provided inputs/outputs in the convert API) shape = (3, 10) exir_program_aten = torch.export.export(torch_model, (torch.rand(*shape),)) exir_program_edge = executorch.exir.to_edge(exir_program_aten).exported_program() with pytest.raises( AssertionError, match=r"'outputs' argument should be None for ExportedProgram" ): ct.convert( exir_program_edge, convert_to=backend[0], outputs=[ct.TensorType(name="result")], ) @staticmethod @pytest.mark.parametrize("backend", backends) def test_source_dialect_metadata(torch_model, backend): shape = (4, 10) exir_program_aten = torch.export.export(torch_model, (torch.rand(*shape),)) exir_program_edge = executorch.exir.to_edge(exir_program_aten).exported_program() mlmodel = ct.convert( exir_program_edge, source="pytorch", convert_to=backend[0], ) assert _METADATA_SOURCE_DIALECT in mlmodel.user_defined_metadata assert mlmodel.user_defined_metadata[_METADATA_SOURCE_DIALECT] == "TorchExport::EDGE" @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) class TestTorchOpsRegistry: @staticmethod def test_api_example(): # Example code in https://apple.github.io/coremltools/docs-guides/source/composite-operators.html#using-composite-ops-with-pytorch-conversion # Whenever this test fails, we should update API documentations # This test needs to be modified after rdar://117502178 ([Infra][Pytorch] We should deprecate the direct use of _TORCH_OPS_REGISTRY in 7.2) from coremltools.converters.mil import Builder as mb from coremltools.converters.mil.frontend.torch.ops import _get_inputs from coremltools.converters.mil.frontend.torch.torch_op_registry import ( _TORCH_OPS_REGISTRY, register_torch_op, ) default_func = _TORCH_OPS_REGISTRY.get_func("selu") # Test ``__contains__`` and ``__delitem__`` assert "selu" in _TORCH_OPS_REGISTRY if "selu" in _TORCH_OPS_REGISTRY: del _TORCH_OPS_REGISTRY["selu"] assert not "selu" in _TORCH_OPS_REGISTRY # Test 
``@register_torch_op`` decorator @register_torch_op def selu(context, node): x = _get_inputs(context, node, expected=1)[0] x = mb.elu(x=x, alpha=1.6732632423543772) x = mb.mul(x=x, y=1.0507009873554805, name=node.name) context.add(x) # Test ``__getitem__`` assert _TORCH_OPS_REGISTRY["selu"] is not None # Test ``__setitem__`` _TORCH_OPS_REGISTRY["selu"] = default_func @staticmethod def test_register_torch_op(): # Test ``register_torch_op`` works def test_func_dummy(context, inputs): return register_torch_op(test_func_dummy) assert _TORCH_OPS_REGISTRY.name_to_func_mapping["test_func_dummy"] is test_func_dummy # Test error out for duplicate registration with pytest.raises(ValueError, match="Torch op test_func_dummy already registered."): register_torch_op(test_func_dummy) # Test we can override the function def test_func_dummy(context, inputs): dummy = 1 return register_torch_op(test_func_dummy, override=True) assert _TORCH_OPS_REGISTRY.name_to_func_mapping["test_func_dummy"] is test_func_dummy # Cleanup the test del _TORCH_OPS_REGISTRY.name_to_func_mapping["test_func_dummy"] @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) class TestFxNodeSupport: """ The API ``ct.converters.mil.frontend.torch.is_torch_fx_node_supported`` is used by 3rd-party code ExecuTorch: https://github.com/pytorch/executorch/pull/1415, so we cannot break it """ @staticmethod def test_simple_case(): class Model(torch.nn.Module): def forward(self, a, x, b): y = torch.mm(a, x) z = y + b a.sub_(z) y = torch.mm(a, x) z = y + b return z model = Model() model.eval() symbolic_traced = torch.fx.symbolic_trace(model) for node in symbolic_traced.graph.nodes: # There are many types of torch fx node, # we only support "call_function" node for now if node.op == "call_function": # All PyTorch ops in the example model are supported, so they should all return true assert ct.converters.mil.frontend.torch.is_torch_fx_node_supported(node) # Other types of torch fx node are not supported else: assert not ct.converters.mil.frontend.torch.is_torch_fx_node_supported(node) @staticmethod def test_unsupported_op(): class Model(torch.nn.Module): def forward(self, x, y): z = x + y return torch.nn.functional.softmax(z) model = Model() model.eval() symbolic_traced = torch.fx.symbolic_trace(model) # Mock our torch ops registry, pretending that only "add" is supported with patch.object( TorchOpsRegistry, "__contains__", side_effect=(lambda op_name: op_name == "add"), ): for node in symbolic_traced.graph.nodes: # There are many types of torch fx node, # we only support "call_function" node for now if node.op == "call_function": # Only "add" is supported assert ( (node.target.__name__.lower() == "add") == ct.converters.mil.frontend.torch.is_torch_fx_node_supported(node) ) # Other types of torch fx node are not supported else: assert not ct.converters.mil.frontend.torch.is_torch_fx_node_supported(node) ################################################################################# # Note: Starting from here, all of the following tests are also used as examples # in https://coremltools.readme.io/docs as a reference. 
# Whenever any of the following test fails, we should update API documentations ################################################################################# @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) class TestPyTorchConverterExamples: @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_convert_torch_vision_mobilenet_v2(tmpdir, backend): """ In this example, we'll instantiate a PyTorch classification model and convert it to Core ML. """ """ Here we instantiate our model. In a real use case this would be your trained model. """ model = torchvision.models.mobilenet_v2() """ The next thing we need to do is generate TorchScript for the model. The easiest way to do this is by tracing it. """ """ It's important that a model be in evaluation mode (not training mode) when it's traced. This makes sure things like dropout are disabled. """ model.eval() """ Tracing takes an example input and traces its flow through the model. Here we are creating an example image input. The rank and shape of the tensor will depend on your model use case. If your model expects a fixed size input, use that size here. If it can accept a variety of input sizes, it's generally best to keep the example input small to shorten how long it takes to run a forward pass of your model. In all cases, the rank of the tensor must be fixed. """ example_input = torch.rand(1, 3, 256, 256) """ Now we actually trace the model. This will produce the TorchScript that the CoreML converter needs. """ traced_model = torch.jit.trace(model, example_input) """ Now with a TorchScript representation of the model, we can call the CoreML converter. The converter also needs a description of the input to the model, where we can give it a convenient name. """ mlmodel = ct.convert( traced_model, inputs=[ct.TensorType(name="input", shape=example_input.shape)], convert_to=backend[0], ) """ Now with a conversion complete, we can save the MLModel and run inference. """ suffix = ".mlmodel" if backend == "neuralnetwork" else ".mlpackage" save_path = os.path.join(str(tmpdir), "mobilenet_v2" + suffix) mlmodel.save(save_path) """ Running predict() is only supported on macOS. 
""" if ct.utils._is_macos(): results = mlmodel.predict({"input": example_input.numpy()}) assert isinstance(results, dict) @staticmethod def test_convert_torch_traced_model_to_milinternal(tmpdir): from torch import nn class Network(nn.Module): def __init__(self): super(Network, self).__init__() self.hidden = nn.Linear(100, 10) self.output = nn.Linear(10, 2) self.sigmoid = nn.Sigmoid() self.softmax = nn.Softmax(dim=1) def forward(self, x): x = self.hidden(x) x = self.sigmoid(x) x = self.output(x) x = self.softmax(x) return x torch_model = Network() torch_model.eval() example_input = torch.rand(1, 100) traced_model = torch.jit.trace(torch_model, example_input) model = ct.convert( traced_model, inputs=[ct.TensorType(name="input", shape=example_input.shape)], convert_to='milinternal' ) assert isinstance(model, ct.converters.mil.Program) @staticmethod def _get_classifier_model(): class Net(torch.nn.Module): def __init__(self): super(Net, self).__init__() self.linear1 = torch.nn.Linear(28 * 28, 100) self.linear2 = torch.nn.Linear(100, 50) self.final = torch.nn.Linear(50, 10) self.relu = torch.nn.ReLU() def forward(self, img): # convert + flatten x = img.view(-1, 28 * 28) x = self.relu(self.linear1(x)) x = self.relu(self.linear2(x)) x = self.final(x) return x model = Net() model.eval() example_input = torch.rand(1, 28 * 28, 1) traced_model = torch.jit.trace(model, example_input) traced_model.eval() return traced_model, example_input @staticmethod def _convert_classifier_model(traced_model, example_input, class_type, backend="mlprogram"): label = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] if class_type == "str": label = list(map(lambda x: str(x), label)) classifier_config = ct.ClassifierConfig(label) return ct.convert( traced_model, source="pytorch", convert_to=backend, inputs=[ ct.TensorType( name="input", shape=example_input.shape, dtype=example_input.numpy().dtype, ) ], classifier_config=classifier_config, ) @staticmethod def test_torch_classifier(): def _test_classifier(traced_model, example_input, class_type, backend): mlmodel = TestPyTorchConverterExamples._convert_classifier_model( traced_model, example_input, class_type, backend, ) if ct.utils._is_macos(): coreml_out = mlmodel.predict({"input": example_input.detach().numpy()}) assert "classLabel" in coreml_out key_type = str if class_type == "str" else int assert isinstance(coreml_out["classLabel"], key_type) for class_type in ("str", "int"): traced_model, example_input = TestPyTorchConverterExamples._get_classifier_model() _test_classifier(traced_model, example_input, class_type, "neuralnetwork") if ct.utils._macos_version() >= (12, 0): _test_classifier(traced_model, example_input, class_type, "mlprogram") @staticmethod @pytest.mark.parametrize("backend", backends) def test_convert_to_argument_with_torch_model(tmpdir, backend): class Network(torch.nn.Module): def __init__(self): super(Network, self).__init__() self.hidden = torch.nn.Linear(30, 5) self.relu = torch.nn.ReLU() def forward(self, x): x = self.hidden(x) return self.relu(x) torch_model = Network() torch_model.eval() example_input = torch.rand(1, 30) traced_model = torch.jit.trace(torch_model, example_input) model = ct.convert( traced_model, inputs=[ct.TensorType(name="input", shape=example_input.shape)], convert_to=backend[0], ) assert isinstance(model, ct.models.MLModel) spec = model.get_spec() if backend[0] == "mlprogram": assert spec.WhichOneof('Type') == 'mlProgram' else: assert spec.WhichOneof('Type') == 'neuralNetwork' @staticmethod def 
test_deployment_target_argument_with_torch_model(): class Network(torch.nn.Module): def __init__(self): super(Network, self).__init__() self.hidden = torch.nn.Linear(30, 5) self.relu = torch.nn.ReLU() def forward(self, x): x = self.hidden(x) return self.relu(x) torch_model = Network() torch_model.eval() example_input = torch.rand(1, 30) traced_model = torch.jit.trace(torch_model, example_input) # convert to 'neuralnetwork' by specifying an iOS13 target model = ct.convert( traced_model, inputs=[ct.TensorType(name="input", shape=example_input.shape)], minimum_deployment_target=ct.target.iOS13, ) assert isinstance(model, ct.models.MLModel) assert model.get_spec().WhichOneof('Type') == 'neuralNetwork' # convert to 'mlprogram' by specifying an iOS15 target model = ct.convert( traced_model, inputs=[ct.TensorType(name="input", shape=example_input.shape)], minimum_deployment_target=ct.target.iOS15, ) assert isinstance(model, ct.models.MLModel) assert model.get_spec().WhichOneof('Type') == 'mlProgram' # verify an error is raised when convert_to="neuralnetwork" and target is iOS15 with pytest.raises(ValueError) as e: model = ct.convert( traced_model, inputs=[ct.TensorType(name="input", shape=example_input.shape)], convert_to="neuralnetwork", minimum_deployment_target=ct.target.iOS15, ) expected_error = "If minimum deployment target is iOS15/macOS12/watchOS8/tvOS15 or higher, " \ "then 'convert_to' cannot be neuralnetwork. It must be 'mlprogram'" assert expected_error == str(e.value) # verify an error is raised when convert_to="mlprogram" and target is less than iOS15 with pytest.raises(ValueError) as e: model = ct.convert( traced_model, inputs=[ct.TensorType(name="input", shape=example_input.shape)], convert_to="mlprogram", minimum_deployment_target=ct.target.iOS14, ) expected_error = "When 'convert_to' is mlprogram, the minimum deployment target " \ "must be at least iOS15/macOS12/watchOS8/tvOS15" assert expected_error == str(e.value) @staticmethod def test_get_milprogram_method_with_torch_model(): class Network(torch.nn.Module): def __init__(self): super(Network, self).__init__() self.hidden = torch.nn.Linear(100, 10) self.relu = torch.nn.ReLU() def forward(self, x): x = self.hidden(x) x = self.relu(x) return x torch_model = Network() torch_model.eval() example_input = torch.rand(1, 100) traced_model = torch.jit.trace(torch_model, example_input) model = ct.convert( traced_model, inputs=[ct.TensorType(shape=example_input.shape)], convert_to='mlprogram' ) assert isinstance(model._get_mil_internal(), ct.converters.mil.Program) @staticmethod @pytest.mark.skipif(ct.utils._macos_version() < (12, 0), reason='Model produces specification 6.') @pytest.mark.parametrize( "backend, provide_prob_output_argument", itertools.product( backends, [False, True], ) ) def test_classifier_from_torch_model(backend, provide_prob_output_argument): torch_model = torch.nn.ReLU().eval() traced_model = torch.jit.trace(torch_model, torch.rand(3,)) variable_name = "var_2" class_label_name = "class_label" classifier_config = ct.ClassifierConfig( class_labels=['a', 'b', 'c'], predicted_feature_name=class_label_name, predicted_probabilities_output=variable_name if provide_prob_output_argument else None, ) model = ct.convert( traced_model, inputs=[ct.TensorType(shape=(3,))], classifier_config = classifier_config, convert_to=backend[0], ) spec = model.get_spec() input_name = spec.description.input[0].name out_dict = model.predict({input_name : np.array([1.0, 2.0, 3.0])}) assert class_label_name in out_dict assert 
out_dict[class_label_name] == 'c' if backend[0] == "neuralnetwork": assert variable_name in out_dict assert isinstance(out_dict[variable_name], dict) else: output_dict_feature_name = class_label_name + "_probs" assert output_dict_feature_name in out_dict assert isinstance(out_dict[output_dict_feature_name], dict) @staticmethod @pytest.mark.skipif( ct.utils._macos_version() < (15, 0), reason="Tests are for deployment target iOS18/macos15" ) @pytest.mark.xfail( reason="rdar://131396853 Lora Adapted Model Dies as ct.models.MLModel but Passes coremltest", run=False, ) def test_multifunction_example(): # util to add adapters def adapt_model_with_lora(model): lora_config = LoraConfig( target_modules=["linear1", "linear2"], r=32, lora_alpha=1 ) # rank 32 adapted_model = get_peft_model(model, lora_config) return adapted_model # define the base model class Base(nn.Module): def __init__(self): super().__init__() self.linear1 = nn.Linear(6000, 6000) self.relu = nn.ReLU() self.linear2 = nn.Linear(6000, 6000) def forward(self, x): x = self.linear1(x) x = self.relu(x) x = self.linear2(x) return x base_model = Base() # create tmp paths for models mlmodel_1_path = tempfile.mkdtemp(suffix=".mlpackage") mlmodel_2_path = tempfile.mkdtemp(suffix=".mlpackage") multifunction_model_path = tempfile.mkdtemp(suffix=".mlpackage") try: # first model with adapter adapted_model_1 = adapt_model_with_lora(base_model) mlmodel_1 = ct.convert( torch.jit.trace(adapted_model_1.eval(), torch.rand(1, 6000)), inputs=[ct.TensorType(name="input_adpated_model_1", shape=(1, 6000))], outputs=[ct.TensorType(name="out_adpated_model_1")], minimum_deployment_target=ct.target.iOS18, skip_model_load=True, ) mlmodel_1.save(mlmodel_1_path) # second model adapted_model_2 = adapt_model_with_lora(base_model) mlmodel_2 = ct.convert( torch.jit.trace(adapted_model_2.eval(), torch.rand(1, 6000)), inputs=[ct.TensorType(name="input_adpated_model_2", shape=(1, 6000))], outputs=[ct.TensorType(name="out_adpated_model_2")], minimum_deployment_target=ct.target.iOS18, skip_model_load=True, ) mlmodel_2.save(mlmodel_2_path) # combine two models into a multifunction model desc = ct.utils.MultiFunctionDescriptor() desc.add_function( mlmodel_1_path, src_function_name="main", target_function_name="adapter_1" ) desc.add_function( mlmodel_2_path, src_function_name="main", target_function_name="adapter_2" ) desc.default_function_name = "adapter_1" ct.utils.save_multifunction(desc, multifunction_model_path) if platform.machine() == "arm64": # The following model fails to run on Intel machines, # tracked by rdar://132919101 ([Bug] Intel machines fails on running several multifunction unittest) # run the prediction mlmodel_1 = ct.models.MLModel(multifunction_model_path) # Uses default function y_1 = mlmodel_1.predict({"input_adpated_model_1": np.random.rand(1, 6000)}) mlmodel_2 = ct.models.MLModel(multifunction_model_path, function_name="adapter_2") y_2 = mlmodel_2.predict({"input_adpated_model_2": np.random.rand(1, 6000)}) # run the model using CompiledMLModel compile_model = ct.models.CompiledMLModel(multifunction_model_path) y_1 = mlmodel_1.predict({"input_adpated_model_1": np.random.rand(1, 6000)}) except: raise ValueError("Test failing for test_multifunction_example.") finally: # cleanup shutil.rmtree(mlmodel_1_path) shutil.rmtree(mlmodel_2_path) shutil.rmtree(multifunction_model_path) @staticmethod @pytest.mark.skipif( ct.utils._macos_version() < (15, 0), reason="Tests are for deployment target iOS18/macos15" ) def test_stateful_accumulator(): # stateful model 
definition in torch class Model(torch.nn.Module): def __init__(self): super().__init__() self.register_buffer("accumulator", torch.tensor(np.array([0], dtype=np.float16))) def forward(self, x): self.accumulator += x return self.accumulator * self.accumulator # convert the trace model into stateful mlmodel traced_model = torch.jit.trace(Model().eval(), torch.tensor([1])) mlmodel = ct.convert( traced_model, inputs=[ct.TensorType(shape=(1,))], outputs=[ct.TensorType(name="y")], states=[ ct.StateType( wrapped_type=ct.TensorType( shape=(1,), ), name="accumulator", ), ], minimum_deployment_target=ct.target.iOS18, ) # check the numerical outputs state1 = mlmodel.make_state() assert mlmodel.predict({"x": np.array([2.0])}, state=state1)["y"] == 4 # (2)^2 assert mlmodel.predict({"x": np.array([5.0])}, state=state1)["y"] == 49 # (5+2)^2 assert mlmodel.predict({"x": np.array([-1.0])}, state=state1)["y"] == 36 # (-1+5+2)^2 state2 = mlmodel.make_state() assert mlmodel.predict({"x": np.array([9.0])}, state=state2)["y"] == 81 # (9)^2 assert mlmodel.predict({"x": np.array([2.0])}, state=state2)["y"] == 121 # (2+9)^2 assert mlmodel.predict({"x": np.array([3.0])}, state=state1)["y"] == 81 # (3-1+5+2)^2 assert mlmodel.predict({"x": np.array([7.0])}, state=state1)["y"] == 256 # (7+3-1+5+2)^2 @staticmethod @pytest.mark.skipif( ct.utils._macos_version() < (15, 0), reason="States are supported since iOS18/macos15." ) def test_attention_stateful_key_value_cache(): """ Use a toy attention model to showcase kv cache with states. This toy example is only for showing how to convert in-place update kv-cache. It omits some other details such as multi-head, multi-layer, positional encoding, final logits, etc. """ class SimpleAttention(nn.Module): def __init__(self, embed_size): super().__init__() self.query = nn.Linear(embed_size, embed_size) self.key = nn.Linear(embed_size, embed_size) self.value = nn.Linear(embed_size, embed_size) def forward(self, x): Q = self.query(x) # (batch_size, seq_len, embed_size) K = self.key(x) # (batch_size, seq_len, embed_size) V = self.value(x) # (batch_size, seq_len, embed_size) return torch.nn.functional.scaled_dot_product_attention(Q, K, V) class ToyModel(nn.Module): def __init__(self, vocab_size, embed_size): super().__init__() self.embedding = nn.Embedding(vocab_size, embed_size) self.attention = SimpleAttention(embed_size) self.fc = nn.Linear(embed_size, embed_size) def forward(self, x): embedded = self.embedding(x) attention_output = self.attention(embedded) return self.fc(attention_output) class SimpleAttentionWithKeyValueCache(SimpleAttention): """Add kv-cache into SimpleAttention.""" def forward(self, x, attention_mask, k_cache, v_cache): Q = self.query(x) newly_computed_k = self.key(x) newly_computed_v = self.value(x) # Update kv-cache in-place. q_len = Q.shape[-2] end_step = attention_mask.shape[-1] past_kv_len = end_step - q_len k_cache[:, past_kv_len:end_step, :] = newly_computed_k v_cache[:, past_kv_len:end_step, :] = newly_computed_v # The K and V we need is (batch_size, q_len + past_kv_len, embed_size). 
K = k_cache[:, :end_step, :] V = v_cache[:, :end_step, :] return torch.nn.functional.scaled_dot_product_attention( Q, K, V, attn_mask=attention_mask ) class ToyModelWithKeyValueCache(nn.Module): def __init__(self, vocab_size, embed_size, batch_size, max_seq_len): super().__init__() self.embedding = nn.Embedding(vocab_size, embed_size) self.attention = SimpleAttentionWithKeyValueCache(embed_size) self.fc = nn.Linear(embed_size, embed_size) self.kvcache_shape = (batch_size, max_seq_len, embed_size) self.register_buffer("k_cache", torch.zeros(self.kvcache_shape)) self.register_buffer("v_cache", torch.zeros(self.kvcache_shape)) def forward( self, input_ids, # [batch_size, seq_len] causal_mask, # [batch_size, seq_len, seq_len + past_kv_len] ): embedded = self.embedding(input_ids) attention_output = self.attention(embedded, causal_mask, self.k_cache, self.v_cache) return self.fc(attention_output) # If you want to compare prediction speed, the benefits of stateful kv-cache will only be # revealed with large models, such as `vocab_size=32000` and `embed_size = 1024`. vocab_size = 100 embed_size = 32 batch_size = 1 seq_len = 5 max_seq_len = 1024 num_iterations = 100 # Stateless model without kv-cache. torch_model = ToyModel(vocab_size, embed_size) torch_model.eval() input_ids = torch.randint(0, vocab_size, (batch_size, seq_len)) torch_output = torch_model(input_ids).detach().numpy() traced_model = torch.jit.trace(torch_model, [input_ids]) query_length = ct.RangeDim(lower_bound=1, upper_bound=max_seq_len, default=1) inputs = [ct.TensorType(shape=(batch_size, query_length), dtype=np.int32, name="input_ids")] outputs = [ct.TensorType(dtype=np.float16, name="output")] # The minimum_deployment_target and compute_units is not necessary, as non-stateful models # are supported before iOS18. Here we set it just for fair comparison with the stateful # kvcache model below. converted_model = ct.convert( traced_model, inputs=inputs, outputs=outputs, minimum_deployment_target=ct.target.iOS18, compute_units=ct.ComputeUnit.CPU_AND_GPU, ) # Makes sure prediction works well. for token_id in range(0, num_iterations): inputs = {"input_ids": np.array([list(range(token_id + 1))], dtype=np.int32)} converted_model.predict(inputs) # Stateful model with kv-cache. past_kv_len = 0 torch_model_kvcache = ToyModelWithKeyValueCache( vocab_size, embed_size, batch_size, max_seq_len ) torch_model_kvcache.load_state_dict(torch_model.state_dict(), strict=False) torch_model_kvcache.eval() causal_mask = torch.zeros((batch_size, seq_len, seq_len + past_kv_len), dtype=torch.float32) # Make sure the output matches the non-kv-cache version. torch_kvcache_output = torch_model_kvcache(input_ids, causal_mask).detach().numpy() np.testing.assert_allclose(torch_output, torch_kvcache_output) traced_model_kvcache = torch.jit.trace(torch_model_kvcache, [input_ids, causal_mask]) query_length = ct.RangeDim(lower_bound=1, upper_bound=max_seq_len, default=1) end_step_dim = ct.RangeDim(lower_bound=1, upper_bound=max_seq_len, default=1) inputs = [ ct.TensorType(shape=(batch_size, query_length), dtype=np.int32, name="input_ids"), ct.TensorType( shape=(batch_size, query_length, end_step_dim), dtype=np.float16, name="causal_mask" ), ] outputs = [ct.TensorType(dtype=np.float16, name="output")] # In addition to `inputs` and `outputs`, we need `states` which uses the same name as the # registered buffers in `ToyModelWithKeyValueCache`. 
states = [ ct.StateType( wrapped_type=ct.TensorType( shape=torch_model_kvcache.kvcache_shape, dtype=np.float16 ), name="k_cache", ), ct.StateType( wrapped_type=ct.TensorType( shape=torch_model_kvcache.kvcache_shape, dtype=np.float16 ), name="v_cache", ), ] converted_model_kvcache = ct.convert( traced_model_kvcache, inputs=inputs, outputs=outputs, states=states, minimum_deployment_target=ct.target.iOS18, compute_units=ct.ComputeUnit.CPU_AND_GPU, ) # Makes sure prediction works well. past_kv_len = 0 kv_cache_state = converted_model_kvcache.make_state() for token_id in range(0, num_iterations): inputs = { "input_ids": np.array([[token_id]], dtype=np.int32), "causal_mask": np.zeros((1, 1, past_kv_len + 1), dtype=np.float16), } converted_model_kvcache.predict(inputs, kv_cache_state) past_kv_len += 1 ############################################################################### # Note: Stress tests for PyTorch input / output types ############################################################################### @pytest.mark.skipif(ct.utils._macos_version() < (10, 15), reason='Model produces specification 4.') @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) class TestTorchInputs(_TestInputs): @staticmethod @pytest.mark.skipif(not ct.utils._is_macos(), reason="test needs predictions") @pytest.mark.parametrize( "backend", backends, ) def test_torch_predict_input(backend): TestTorchInputs._test_variant_input_type_prediction(torch.tensor, backend[0]) @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_int64_inputs(backend): num_tokens = 3 embedding_size = 5 class TestModule(torch.nn.Module): def __init__(self): super(TestModule, self).__init__() self.embedding = torch.nn.Embedding(num_tokens, embedding_size) def forward(self, x): return self.embedding(x) model = TestModule() model.eval() example_input = torch.randint(high=num_tokens, size=(2,), dtype=torch.int64) traced_model = torch.jit.trace(model, example_input) mlmodel = ct.convert( traced_model, inputs=[ ct.TensorType( name="input", shape=example_input.shape, dtype=example_input.numpy().dtype, ) ], convert_to=backend[0], ) # running predict() is supported on macOS if ct.utils._is_macos(): result = mlmodel.predict( {"input": example_input.detach().numpy().astype(np.float32)} ) # Verify outputs expected = model(example_input) name = list(result.keys())[0] rtol = 1e-03 if backend[0] == "mlprogram" else 1e-07 atol = 1e-04 if backend[0] == "mlprogram" else 0 np.testing.assert_allclose( result[name], expected.detach().numpy(), rtol=rtol, atol=atol ) # Duplicated inputs are invalid with pytest.raises(ValueError, match=r"Duplicated inputs"): mlmodel = ct.convert( traced_model, inputs=[ ct.TensorType( name="input", shape=example_input.shape, dtype=example_input.numpy().dtype, ), ct.TensorType( name="input", shape=example_input.shape, dtype=example_input.numpy().dtype, ), ], convert_to=backend[0], ) # Outputs must be of type ct.ImageType or ct.TensorType with pytest.raises(ValueError, match=r"must be a list of type ct.TensorType or ct.ImageType"): mlmodel = ct.convert( traced_model, inputs=[ ct.TensorType( name="input", shape=example_input.shape, dtype=example_input.numpy().dtype, ), ], outputs=["output"], convert_to=backend[0], ) @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_fully_dynamic_inputs(backend): """ All dims of the inputs are dynamic, and write to slice to one of the inputs. 
""" class Model(torch.nn.Module): def __init__(self, index): super(Model, self).__init__() self.index = index def forward(self, x, y): x[:, int(self.index.item())] = 0.0 y = y.unsqueeze(0) return y, x model = Model(torch.tensor(3)) scripted_model = torch.jit.script(model) a, b = (-1, -1) if backend[0] == "neuralnetwork" else (6, 6) mlmodel = ct.convert( scripted_model, inputs=[ ct.TensorType("x", shape=(ct.RangeDim(upper_bound=a), ct.RangeDim(upper_bound=b))), ct.TensorType("y", shape=(ct.RangeDim(upper_bound=a), ct.RangeDim(upper_bound=b))), ], convert_to=backend[0], ) # running predict() is supported on macOS if ct.utils._is_macos(): x, y = torch.rand(2, 4), torch.rand(1, 2) torch_input = _copy_input_data([x, y]) torch_res = model(*torch_input) results = mlmodel.predict({"x": x.cpu().detach().numpy(), "y": y.cpu().detach().numpy()}) rtol = 1e-03 if backend[0] == "mlprogram" else 1e-07 atol = 1e-04 if backend[0] == "mlprogram" else 0 for i, name in enumerate(mlmodel.output_description): np.testing.assert_allclose(torch_res[i], results[name], rtol=rtol, atol=atol) x, y = torch.rand(1, 6), torch.rand(2, 3) torch_input = _copy_input_data([x, y]) torch_res = model(*torch_input) results = mlmodel.predict({"x": x.cpu().detach().numpy(), "y": y.cpu().detach().numpy()}) for i, name in enumerate(mlmodel.output_description): np.testing.assert_allclose(torch_res[i], results[name], rtol=rtol, atol=atol) @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_rank0_inputs_torch(backend): """Similar to TestPyTorchConverterExamples::test_int64_inputs but using rank-0 int input. """ num_tokens = 3 embedding_size = 5 class TestModule(torch.nn.Module): def __init__(self): super(TestModule, self).__init__() self.embedding = torch.nn.Embedding(num_tokens, embedding_size) def forward(self, x): return self.embedding(x) model = TestModule() model.eval() example_input = torch.tensor(1) traced_model = torch.jit.trace(model, example_input) with pytest.raises(ValueError, match=r"Rank-0"): mlmodel = ct.convert( traced_model, inputs=[ ct.TensorType( name="input", shape=example_input.shape, dtype=example_input.numpy().dtype, ) ], convert_to=backend[0], ) @staticmethod @pytest.mark.parametrize( "variable_length, backend", itertools.product([True, False], backends), ) def test_torch_range_dim_lstm(variable_length, backend): """ This example shows how to run LSTM with previous hidden / cell states """ input_size = 3 hidden_size = 2 class TestNet(torch.nn.Module): def __init__(self): super(TestNet, self).__init__() self.lstm = torch.nn.LSTM(input_size, hidden_size, 1) def forward(self, x, hidden_state, cell_state): # LSTM takes in previous hidden and cell states. The first # invocation usually have zero vectors as initial states. output, (new_hidden_state, new_cell_state) = \ self.lstm(x, (hidden_state, cell_state)) # LSTM hidden / cell states are returned to be managed by the # caller (and is fed in as inputs in the next call). return output, new_hidden_state, new_cell_state model = TestNet() model.eval() seq_len = 2 # we'll make seq_len dynamic later batch = 1 input_shape = (seq_len, batch, input_size) rand_input = torch.rand(*input_shape) h_shape = (1, batch, hidden_size) rand_h0 = torch.rand(*h_shape) rand_c0 = torch.rand(*h_shape) traced_model = torch.jit.trace(model, (rand_input, rand_h0, rand_c0)) # ct.RangeDim() tells coremltools that this dimension can change for # each inference example (aka "runtime-determined"). 
If the sequence # length is always the same (e.g., 2 step LSTM would have seq_len == 2) # Note that fixed-length models usually run slightly faster than variable length models. upper_bound = -1 if backend[0] == "neuralnetwork" else 10 ct_seq_len = ct.RangeDim(upper_bound=upper_bound) if variable_length else seq_len seq_input = ct.TensorType(shape=(ct_seq_len, batch, input_size), name="seq_input") h_input = ct.TensorType(shape=h_shape, name="h_input") c_input = ct.TensorType(shape=h_shape, name="c_input") mlmodel = ct.convert( traced_model, inputs=[seq_input, h_input, c_input], convert_to=backend[0], ) if ct.utils._is_macos(): result = mlmodel.predict( {"seq_input": rand_input.detach().numpy().astype(np.float32), "h_input": rand_h0.detach().numpy().astype(np.float32), "c_input": rand_c0.detach().numpy().astype(np.float32), } ) # Verify outputs expected = model(rand_input, rand_h0, rand_c0) names = list(result.keys()) names.sort() atol = 1e-03 if backend[0] == "mlprogram" else 1e-04 rtol = 1e-03 if backend[0] == "mlprogram" else 1e-07 np.testing.assert_allclose( result[names[0]], expected[0].detach().numpy(), atol=atol, rtol=rtol ) np.testing.assert_allclose( result[names[1]], expected[1].detach().numpy(), atol=atol, rtol=rtol ) np.testing.assert_allclose( result[names[2]], expected[2].detach().numpy(), atol=atol, rtol=rtol ) # Try example of different length if variable_length: seq_len = 10 input_shape = (seq_len, batch, input_size) rand_input = torch.rand(*input_shape) result = mlmodel.predict( {"seq_input": rand_input.detach().numpy().astype(np.float32), "h_input": rand_h0.detach().numpy().astype(np.float32), "c_input": rand_c0.detach().numpy().astype(np.float32), } ) expected = model(rand_input, rand_h0, rand_c0) names = list(result.keys()) names.sort() np.testing.assert_allclose( result[names[0]], expected[0].detach().numpy(), atol=atol, rtol=rtol ) np.testing.assert_allclose( result[names[1]], expected[1].detach().numpy(), atol=atol, rtol=rtol ) np.testing.assert_allclose( result[names[2]], expected[2].detach().numpy(), atol=atol, rtol=rtol ) @staticmethod @pytest.mark.parametrize( "use_symbol, backend", itertools.product( [True, False], backends, ), ) def test_torch_outofbound_range_dim(use_symbol, backend): num_tokens = 3 embedding_size = 5 class TestModule(torch.nn.Module): def __init__(self): super(TestModule, self).__init__() self.embedding = torch.nn.Embedding(num_tokens, embedding_size) def forward(self, x): return self.embedding(x) model = TestModule() model.eval() example_input = torch.randint(high=num_tokens, size=(3,), dtype=torch.int64) traced_model = torch.jit.trace(model, example_input) if use_symbol: seq_len_dim = ct.RangeDim(symbol='len', lower_bound=3, upper_bound=5) else: # symbol is optional seq_len_dim = ct.RangeDim(lower_bound=3, upper_bound=5) seq_input = ct.TensorType(name="input", shape=(seq_len_dim,), dtype=np.int64) mlmodel = ct.convert( traced_model, inputs=[seq_input], convert_to=backend[0], ) if ct.utils._is_macos(): result = mlmodel.predict( {"input": example_input.detach().numpy().astype(np.float32)} ) # Verify outputs rtol = 1e-03 if backend[0] == "mlprogram" else 1e-07 atol = 1e-04 if backend[0] == "mlprogram" else 0 expected = model(example_input) name = list(result.keys())[0] np.testing.assert_allclose( result[name], expected.detach().numpy(), rtol=rtol, atol=atol ) # seq_len below/above lower_bound/upper_bound with pytest.raises(RuntimeError, match=r"Size \(99\) of dimension \(0\) is not in allowed range \(3\.\.5\)"): example_input2 = 
torch.randint(high=num_tokens, size=(99,), dtype=torch.int64) result = mlmodel.predict( {"input": example_input2.detach().numpy().astype(np.float32)} ) with pytest.raises(RuntimeError, match=r"Size \(2\) of dimension \(0\) is not in allowed range \(3\.\.5\)"): example_input2 = torch.randint(high=num_tokens, size=(2,), dtype=torch.int64) result = mlmodel.predict( {"input": example_input2.detach().numpy().astype(np.float32)} ) @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_torch_enumerated_shapes(backend): in_channels = 3 out_channels = 2 kernel_size = 3 class TestModule(torch.nn.Module): def __init__(self): super(TestModule, self).__init__() self.conv = torch.nn.Conv2d(in_channels, out_channels, kernel_size) def forward(self, x): return self.conv(x) model = TestModule() model.eval() example_input = torch.randn(1, 3, 28, 28) traced_model = torch.jit.trace(model, example_input) shapes = [(1, 3, 28, 28), (1, 3, 56, 56)] enumerated_shapes = ct.EnumeratedShapes(shapes=shapes) tensor_input = ct.TensorType(name="input", shape=enumerated_shapes) mlmodel = ct.convert( traced_model, inputs=[tensor_input], compute_units=ct.ComputeUnit.CPU_ONLY, convert_to=backend[0], ) if ct.utils._is_macos(): result = mlmodel.predict( {"input": example_input.detach().numpy().astype(np.float32)}, ) # Verify outputs rtol = 1 if backend[0] == "mlprogram" else 1e-03 atol = 1e-02 if backend[0] == "mlprogram" else 1e-04 expected = model(example_input) name = list(result.keys())[0] np.testing.assert_allclose( result[name], expected.detach().numpy(), rtol=rtol, atol=atol ) # Test (1, 3, 56, 56) shape (can't verify numerical parity with Torch # which doesn't support enumerated shape) test_input_x = np.random.rand(*shapes[1]).astype(np.float32) mlmodel.predict({"input": test_input_x}) # Test with a wrong shape with pytest.raises(RuntimeError, match=r"MultiArray Shape \(1 x 3 x 29 x 29\) was not in enumerated set of allowed shapes"): test_input_x = np.random.rand(1, 3, 29, 29).astype(np.float32) mlmodel.predict({"input": test_input_x}) @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_torch_image_enumerated_shapes(backend): import torchvision torch_model = torchvision.models.mobilenet_v2().features torch_model.eval() example_input = torch.rand(1, 3, 256, 256) traced_model = torch.jit.trace(torch_model, example_input) input_shapes = ct.EnumeratedShapes(shapes=[(1, 3, 256, 256), (1, 3, 224, 224)]) image_input = ct.ImageType(shape=input_shapes, bias=[-1, -1, -1], scale=1 / 127) model = ct.convert(traced_model, inputs=[image_input], convert_to=backend[0]) assert model is not None spec = model.get_spec() assert len(spec.description.input[0].type.imageType.enumeratedSizes.sizes) == 2 @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_torch_optional_input(backend): num_tokens = 3 embedding_size = 5 class TestModule(torch.nn.Module): def __init__(self): super(TestModule, self).__init__() self.embedding = torch.nn.Embedding(num_tokens, embedding_size) def forward(self, x, y): return self.embedding(x) + y model = TestModule() model.eval() example_input = [ torch.randint(high=num_tokens, size=(2,), dtype=torch.int64), torch.rand(1), ] traced_model = torch.jit.trace(model, example_input) upper_bound = -1 if backend[0] == "neuralnetwork" else 2 required_input = ct.TensorType( name="required_input", shape=(ct.RangeDim(upper_bound=upper_bound),), dtype=np.int64 ) default_value = np.array([3]).astype(np.float32) optional_input = ct.TensorType(name="optional_input", shape=(1,), 
default_value=default_value) for compute_units in ct.ComputeUnit: if compute_units == ct.ComputeUnit.CPU_AND_NE and ct.utils._macos_version() < (13, 0): continue mlmodel = ct.convert( traced_model, inputs=[required_input, optional_input], compute_units=compute_units, convert_to=backend[0], ) assert(mlmodel.compute_unit == compute_units) if ct.utils._is_macos(): result = mlmodel.predict( {"required_input": example_input[0].detach().numpy().astype(np.float32)} ) # Verify outputs rtol = 1e-03 if backend[0] == "mlprogram" else 1e-07 atol = 1e-03 if backend[0] == "mlprogram" else 0 torch_default_value = torch.tensor([3]) expected = model(example_input[0].detach(), torch_default_value) name = list(result.keys())[0] np.testing.assert_allclose( result[name], expected.detach().numpy(), rtol=rtol, atol=atol ) @pytest.fixture def int32_input_model(): class Model(torch.nn.Module): def forward(self, x): return x + 5 example_input = torch.randint(0, 100, (10, 20), dtype=torch.int32) return torch.jit.trace(Model().eval(), example_input) def int64_input_model(): class Model(torch.nn.Module): def forward(self, x): return x + 5 example_input = torch.randint(0, 100, (10, 20), dtype=torch.int64) return torch.jit.trace(Model().eval(), example_input) @pytest.fixture def float32_input_model_add_op(): class Model(torch.nn.Module): def forward(self, x): return x + 5.5 example_input = torch.randint(0, 100, (10, 20), dtype=torch.float32) return torch.jit.trace(Model().eval(), example_input) @pytest.fixture def float32_input_model_relu_ops(): class Model(torch.nn.Module): def forward(self, x): x = torch.nn.ReLU()(x) return torch.nn.ReLU()(x) example_input = torch.randint(0, 100, (10, 20), dtype=torch.float32) return torch.jit.trace(Model().eval(), example_input) @pytest.fixture def float32_two_input_model(): class Model(torch.nn.Module): def forward(self, x, y): return x + y example_input = torch.randint(0, 100, (10, 20), dtype=torch.float32) return torch.jit.trace(Model().eval(), [example_input, example_input]) @pytest.fixture def float32_two_output_model(): class Model(torch.nn.Module): def forward(self, x): y = torch.nn.ReLU()(x) out1 = torch.nn.ReLU()(y) out2 = torch.nn.ReLU6()(x) return out1, out2 example_input = torch.randint(0, 100, (10, 20), dtype=torch.float32) return torch.jit.trace(Model().eval(), example_input) @pytest.fixture def int32_float32_two_output_model(): class Model(torch.nn.Module): def forward(self, x, y): out1 = x + 1 out2 = y + 1 return out1, out2 input_1 = torch.randint(0, 100, (10, 20), dtype=torch.int32) input_2 = torch.randint(0, 100, (10, 20), dtype=torch.float32) return torch.jit.trace(Model().eval(), [input_1, input_2]) def float64_input_model(): class Model(torch.nn.Module): def forward(self, x): return x + 5.1 example_input = torch.randint(0, 100, (10, 20), dtype=torch.float64) return torch.jit.trace(Model().eval(), example_input) @pytest.fixture def rank3_input_model(): class Model(torch.nn.Module): def forward(self, x): return x + 5.5 example_input = torch.randint(0, 100, (1, 10, 20), dtype=torch.float32) return torch.jit.trace(Model().eval(), example_input) @pytest.fixture def rank4_input_model(): class Model(torch.nn.Module): def forward(self, x): return x + 5.0 example_input = torch.randint(0, 100, (1, 3, 10, 20), dtype=torch.float32) return torch.jit.trace(Model().eval(), example_input) @pytest.fixture def rank4_grayscale_input_model(): class Model(torch.nn.Module): def forward(self, x): return x + 10 example_input = torch.randint(0, 100, (1, 1, 10, 20), dtype=torch.float32) 
return torch.jit.trace(Model().eval(), example_input) @pytest.fixture def linear_model(): # this model will test the fuse_linear_bias pass class Model(torch.nn.Module): def __init__(self): super().__init__() self.linear = torch.nn.Linear(10, 15, bias=False) self.constant_tensor = torch.ones((15,), dtype=torch.float32) def forward(self, x): x = self.linear(x) x = x - self.constant_tensor x = torch.nn.ReLU()(x) return x example_input = torch.randint(0, 10, (1, 10), dtype=torch.float32) return torch.jit.trace(Model().eval(), example_input) @pytest.mark.skipif(ct.utils._macos_version() < (13, 0), reason='Tests are for deployment target ios16/macos13') class TestInputOutputConversionAPI: def test_input_dtype_default(self, int32_input_model): #if dtype is not provided it defaults to float32 mlmodel = ct.convert(int32_input_model, inputs=[ct.TensorType(shape=(10, 20))], minimum_deployment_target=ct.target.macOS12) assert_input_dtype(mlmodel, expected_type_str="fp32") verify_prediction(mlmodel) def test_input_shape_missing_error(self, float32_input_model_add_op): with pytest.raises(ValueError, match="'shape' must be provided in the 'inputs' argument for pytorch conversion"): mlmodel = ct.convert(float32_input_model_add_op, inputs=[ct.TensorType(dtype=np.int32)], minimum_deployment_target=ct.target.macOS12) @pytest.mark.parametrize( "default_input_dtype, model", itertools.product( [True, False], [int64_input_model, float64_input_model], ), ) def test_unsupported_input_dtype_torch_model(self, default_input_dtype, model): # test that no error is raised when the Torch model's input dtype is not supported. # If users don't provide the input type, it will be mapped to the default dtype which is float32. # If the input type is provided, it will be mapped to the most compatible dtype: # fp64 -> fp32, int64 -> int32 if default_input_dtype: dtype = None expected_type_str = "fp32" else: if model == int64_input_model: dtype = np.int64 expected_type_str = "int32" elif model == float64_input_model: dtype = np.float64 expected_type_str = "fp32" mlmodel = ct.convert( model(), inputs=[ct.TensorType(shape=(10, 20), dtype=dtype)], minimum_deployment_target=ct.target.macOS12, ) assert_input_dtype(mlmodel, expected_type_str=expected_type_str) verify_prediction(mlmodel) def test_input_dtype_user_provided(self, float32_input_model_add_op): # test that provided dtype in the api is applied mlmodel = ct.convert(float32_input_model_add_op, inputs=[ct.TensorType(shape=(10, 20), dtype=np.int32)], minimum_deployment_target=ct.target.macOS12) assert_input_dtype(mlmodel, expected_type_str="int32") assert_output_dtype(mlmodel, expected_type_str="fp32") verify_prediction(mlmodel) def test_invalid_input_dtype(self, int32_input_model): with pytest.raises(TypeError, match="is unsupported for inputs/outputs of the model" ): mlmodel = ct.convert(int32_input_model, inputs=[ct.TensorType(dtype=np.int16)], minimum_deployment_target=ct.target.macOS12) with pytest.raises(TypeError, match="float16 dtype for inputs is only supported for deployment target >= iOS16/macOS13" ): mlmodel = ct.convert(int32_input_model, inputs=[ct.TensorType(dtype=np.float16)], minimum_deployment_target=ct.target.macOS12) def test_fp16_input_dtype(self, float32_input_model_add_op, float32_input_model_relu_ops, int32_input_model): """ Test that providing fp16 input dtype works with macOS13. 
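With fp16 inputs and fp32 outputs requested, each converted program below is expected to keep a single cast op at the output boundary.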
""" mlmodel = ct.convert( float32_input_model_add_op, inputs=[ct.TensorType(shape=(10, 20), dtype=np.float16)], outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.macOS13, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["add", "cast"]) assert_input_dtype(mlmodel, expected_type_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp32") verify_prediction(mlmodel) mlmodel = ct.convert( float32_input_model_relu_ops, inputs=[ct.TensorType(shape=(10, 20), dtype=np.float16)], outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.macOS13, ) # Two consecutive relus are merged in the `merge_consecutive_relus` pass. assert_ops_in_mil_program(mlmodel, expected_op_list=["relu", "cast"]) assert_input_dtype(mlmodel, expected_type_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp32") verify_prediction(mlmodel) mlmodel = ct.convert( int32_input_model, inputs=[ct.TensorType(shape=(10, 20), dtype=np.float16)], outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.macOS13, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["add", "cast"]) assert_input_dtype(mlmodel, expected_type_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp32") verify_prediction(mlmodel) def test_fp16_input_dtype_fp32_precision(self, float32_input_model_add_op, float32_input_model_relu_ops, int32_input_model): """ Same test as test_fp16_input_dtype, but with Float32 precision """ mlmodel = ct.convert(float32_input_model_add_op, inputs=[ct.TensorType(shape=(10, 20), dtype=np.float16)], minimum_deployment_target=ct.target.macOS13, compute_precision=ct.precision.FLOAT32, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["cast", "add"]) assert_input_dtype(mlmodel, expected_type_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp32") verify_prediction(mlmodel) """ Although no FP16ComputePrecision is applied, the float16 input propagates through the network """ mlmodel = ct.convert(float32_input_model_relu_ops, inputs=[ct.TensorType(shape=(10, 20), dtype=np.float16)], minimum_deployment_target=ct.target.macOS13, compute_precision=ct.precision.FLOAT32, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["cast", "relu"]) assert_input_dtype(mlmodel, expected_type_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp32") def test_input_name_specified_by_user(self, float32_input_model_relu_ops, float32_two_input_model): mlmodel = ct.convert(float32_input_model_relu_ops, inputs=[ct.TensorType(shape=(10, 20), name="my_custom_input_name")], minimum_deployment_target=ct.target.macOS12) assert_input_dtype(mlmodel, expected_type_str="fp32", expected_name="my_custom_input_name") mlmodel = ct.convert(float32_two_input_model, inputs=[ct.TensorType(shape=(10, 20), name="user_provided_name_1"), ct.TensorType(shape=(10, 20), name="user_provided_name_2")], minimum_deployment_target=ct.target.macOS12) assert_input_dtype(mlmodel, expected_type_str="fp32", expected_name="user_provided_name_1", index=0) assert_input_dtype(mlmodel, expected_type_str="fp32", expected_name="user_provided_name_2", index=1) def test_two_input_model(self, float32_two_input_model): # test that error is raised if only 1 input is provided with pytest.raises( ValueError, match="Number of TorchScript inputs \(2\) must match the user provided inputs \(1\).", ): ct.convert( float32_two_input_model, inputs=[ct.TensorType(shape=(10, 20), dtype=np.int32)], minimum_deployment_target=ct.target.macOS12, ) # test forcing 1st input to type int32 mlmodel 
= ct.convert(float32_two_input_model, inputs=[ct.TensorType(shape=(10, 20), dtype=np.int32), ct.TensorType(shape=(10, 20))], minimum_deployment_target=ct.target.macOS12) assert_input_dtype(mlmodel, expected_type_str="int32", index=0) assert_input_dtype(mlmodel, expected_type_str="fp32", index=1) assert_output_dtype(mlmodel, expected_type_str="fp32") # test forcing both inputs to be int32 mlmodel = ct.convert(float32_two_input_model, inputs=[ct.TensorType(shape=(10, 20), dtype=np.int32), ct.TensorType(shape=(10, 20), dtype=np.int32), ], minimum_deployment_target=ct.target.macOS12) assert_input_dtype(mlmodel, expected_type_str="int32", index=0) assert_input_dtype(mlmodel, expected_type_str="int32", index=1) assert_output_dtype(mlmodel, expected_type_str="int32") # test forcing both inputs to be float16 mlmodel = ct.convert( float32_two_input_model, inputs=[ ct.TensorType(shape=(10, 20), dtype=np.float16), ct.TensorType(shape=(10, 20), dtype=np.float16), ], outputs=[ ct.TensorType(dtype=np.float32), ], minimum_deployment_target=ct.target.macOS13, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["add", "cast"]) assert_input_dtype(mlmodel, expected_type_str="fp16", index=0) assert_input_dtype(mlmodel, expected_type_str="fp16", index=1) assert_output_dtype(mlmodel, expected_type_str="fp32") verify_prediction(mlmodel) def test_output_name_specified_by_user(self, float32_input_model_relu_ops, float32_two_output_model): mlmodel = ct.convert(float32_input_model_relu_ops, inputs=[ct.TensorType(shape=(10, 20), name="custom_input_name")], outputs=[ct.TensorType(name="custom_output_name")], minimum_deployment_target=ct.target.macOS12) assert_input_dtype(mlmodel, expected_type_str="fp32", expected_name="custom_input_name") assert_output_dtype(mlmodel, expected_type_str="fp32", expected_name="custom_output_name") mlmodel = ct.convert(float32_two_output_model, inputs=[ct.TensorType(shape=(10, 20), name="custom_input_name")], outputs=[ct.TensorType(name="custom_output1_name"), ct.TensorType(name="custom_output2_name")], minimum_deployment_target=ct.target.macOS12) assert_input_dtype(mlmodel, expected_type_str="fp32", expected_name="custom_input_name") assert_output_dtype(mlmodel, expected_type_str="fp32", expected_name="custom_output1_name", index=0) assert_output_dtype(mlmodel, expected_type_str="fp32", expected_name="custom_output2_name", index=1) def test_single_output_model(self, int32_input_model, float32_input_model_relu_ops): # test output type: if not provided, it should be the default which is float32 mlmodel = ct.convert( int32_input_model, inputs=[ct.TensorType(shape=(10, 20), dtype=np.float32)], outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.macOS12, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["cast", "add", "cast"]) assert_input_dtype(mlmodel, expected_type_str="fp32") assert_output_dtype(mlmodel, expected_type_str="fp32") # test that the output dtype provided by the user is applied during conversion mlmodel = ct.convert( float32_input_model_relu_ops, inputs=[ct.TensorType(shape=(10, 20), dtype=np.float32)], outputs=[ct.TensorType(dtype=np.int32)], minimum_deployment_target=ct.target.macOS12, ) assert_input_dtype(mlmodel, expected_type_str="fp32") assert_output_dtype(mlmodel, expected_type_str="int32") assert_ops_in_mil_program(mlmodel, expected_op_list=["cast", "relu", "cast"]) # test that an error is raised when shape is provided for the output with pytest.raises(ValueError): mlmodel = ct.convert(int32_input_model, 
inputs=[ct.TensorType(shape=(10, 20))], outputs=[ct.TensorType(dtype=np.float32, shape=(10, 20))], minimum_deployment_target=ct.target.macOS12) # test that output dtype of float16 is rejected when deployment target is low with pytest.raises(TypeError, match="float16 dtype for outputs is only supported for deployment target >= iOS16/macOS13" ): ct.convert(float32_input_model_relu_ops, inputs=[ct.TensorType(shape=(10, 20))], outputs=[ct.TensorType(dtype=np.float16)], minimum_deployment_target=ct.target.macOS12, ) # test that output type float16 is applied correctly mlmodel = ct.convert( float32_input_model_relu_ops, inputs=[ct.TensorType(shape=(10, 20), dtype=np.float32)], outputs=[ct.TensorType(dtype=np.float16)], minimum_deployment_target=ct.target.macOS13, ) assert_output_dtype(mlmodel, expected_type_str="fp16") assert_ops_in_mil_program(mlmodel, expected_op_list=["cast", "relu"]) # test that input and output types float16 are applied correctly mlmodel = ct.convert(float32_input_model_relu_ops, inputs=[ct.TensorType(shape=(10, 20), dtype=np.float16)], outputs=[ct.TensorType(dtype=np.float16)], minimum_deployment_target=ct.target.macOS13, ) assert_input_dtype(mlmodel, expected_type_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp16") assert_ops_in_mil_program(mlmodel, expected_op_list=["relu"]) verify_prediction(mlmodel) def test_multi_output_model(self, float32_two_output_model): # check that error is raised when only 1 output provided with pytest.raises(ValueError, match="Number of outputs provided, 1, " "do not match the number of outputs detected in the model, 2"): ct.convert(float32_two_output_model, inputs=[ct.TensorType(shape=(10, 20))], outputs=[ct.TensorType()], minimum_deployment_target=ct.target.macOS12) # set 1 output to float16 and the other to float32 mlmodel = ct.convert(float32_two_output_model, inputs=[ct.TensorType(shape=(10, 20), dtype=np.float16)], outputs=[ct.TensorType(name="out1", dtype=np.float16), ct.TensorType(name="out2", dtype=np.float32)], minimum_deployment_target=ct.target.macOS13, ) assert_cast_ops_count(mlmodel, expected_count=1) assert_input_dtype(mlmodel, expected_type_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp16", expected_name="out1" ,index=0) assert_output_dtype(mlmodel, expected_type_str="fp32", expected_name="out2", index=1) verify_prediction(mlmodel) def test_color_input(self, rank4_input_model, rank3_input_model): mlmodel = ct.convert( rank4_input_model, inputs=[ct.ImageType(shape=(1, 3, 10, 20), color_layout=ct.colorlayout.RGB)], outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.macOS13, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["cast", "add", "cast"]) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.RGB) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp32") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp32") verify_prediction(mlmodel) with pytest.raises(ValueError, match="must have rank 4"): mlmodel = ct.convert(rank3_input_model, inputs=[ct.ImageType(shape=(1, 10, 20), color_layout=ct.colorlayout.RGB)], minimum_deployment_target=ct.target.macOS12, ) def test_grayscale_input(self, rank4_input_model, rank3_input_model, rank4_grayscale_input_model): with pytest.raises(ValueError, match="must have rank 4"): ct.convert(rank3_input_model, inputs=[ct.ImageType(shape=(1, 10, 20), color_layout=ct.colorlayout.GRAYSCALE)], minimum_deployment_target=ct.target.macOS13, ) # invalid shape with 
pytest.raises(ValueError): ct.convert(rank4_input_model, inputs=[ct.ImageType(shape=(1, 3, 10, 20), color_layout=ct.colorlayout.GRAYSCALE)], minimum_deployment_target=ct.target.macOS13, ) mlmodel = ct.convert( rank4_grayscale_input_model, inputs=[ct.ImageType(shape=(1, 1, 10, 20), color_layout=ct.colorlayout.GRAYSCALE)], outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.macOS13, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["cast", "add", "cast"]) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp32") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp32") verify_prediction(mlmodel) with pytest.raises(TypeError, match="float16 dtype for inputs is only supported for deployment target >= iOS16/macOS13"): ct.convert(rank4_grayscale_input_model, inputs=[ct.ImageType(shape=(1, 1, 10, 20), color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)], minimum_deployment_target=ct.target.macOS12, ) # test that grayscale_16 raises error when used with neural network with pytest.raises(TypeError, match="float16 dtype for inputs is only supported for deployment target >= iOS16/macOS13"): ct.convert(rank4_grayscale_input_model, inputs=[ct.ImageType(shape=(1, 1, 10, 20), color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)], ) mlmodel = ct.convert(rank4_grayscale_input_model, inputs=[ct.ImageType(shape=(1, 1, 10, 20), color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)], outputs=[ct.TensorType(dtype=np.float16)], minimum_deployment_target=ct.target.macOS13, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["add"]) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE_FLOAT16) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp16") verify_prediction(mlmodel) def test_color_output(self, rank4_input_model, float32_input_model_add_op): # check that an error is raised if the output shape is not of form (1, 3, H, W) with pytest.raises(ValueError, match="must have rank 4. 
Instead it has rank 2"): ct.convert(float32_input_model_add_op, inputs=[ct.TensorType(shape=(10, 20))], outputs=[ct.ImageType(color_layout=ct.colorlayout.RGB)], minimum_deployment_target=ct.target.macOS13) mlmodel = ct.convert(rank4_input_model, inputs=[ct.ImageType(shape=(1, 3, 10, 20), color_layout=ct.colorlayout.BGR)], outputs=[ct.ImageType(color_layout=ct.colorlayout.RGB)], minimum_deployment_target=ct.target.macOS13, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["cast", "add", "cast"]) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.BGR) assert_spec_output_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.RGB) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp32") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp32") verify_prediction(mlmodel) # check neural network conversion mlmodel = ct.convert( rank4_input_model, inputs=[ct.ImageType(shape=(1, 3, 10, 20), color_layout=ct.colorlayout.RGB)], outputs=[ct.ImageType(color_layout=ct.colorlayout.BGR)], convert_to="neuralnetwork", ) assert_ops_in_mil_program(mlmodel, expected_op_list=["add"]) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.RGB) assert_spec_output_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.BGR) verify_prediction(mlmodel) # check mlprogram can have dynamic shape image output shape = ct.Shape((1, 3, ct.RangeDim(5, 10), ct.RangeDim(5, 10))) mlmodel = ct.convert( rank4_input_model, inputs=[ct.TensorType(shape=shape, dtype=np.float32)], outputs=[ct.ImageType(name="output_image", color_layout=ct.colorlayout.RGB)], minimum_deployment_target=ct.target.macOS13, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["cast", "add", "cast"]) assert_spec_output_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.RGB) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp32") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp32") assert any_symbolic(mlmodel._mil_program.functions["main"].outputs[0].shape) verify_prediction(mlmodel) # Test output image numerical sample_input = np.random.randint(low=0, high=200, size=(1, 3, 10, 10)).astype(np.float32) model_output_pil_image = mlmodel.predict({"x": sample_input})["output_image"] assert isinstance(model_output_pil_image, Image.Image) assert model_output_pil_image.mode == "RGBA" model_output_as_numpy = np.array(model_output_pil_image)[:, :, :3] # last A channel is 255 model_output_as_numpy = np.transpose(model_output_as_numpy, axes=[2, 0, 1]) reference_output = rank4_input_model(torch.from_numpy(sample_input)).detach().numpy() reference_output = np.squeeze(reference_output) np.testing.assert_allclose(reference_output, model_output_as_numpy, rtol=1e-2, atol=1e-2) a_channel = np.array(model_output_pil_image)[:, :, 3].flatten() assert np.all(a_channel == 255) def test_grayscale_output(self, rank4_grayscale_input_model): with pytest.raises(TypeError, match="float16 dtype for outputs is only supported for deployment target >= iOS16/macOS13"): ct.convert(rank4_grayscale_input_model, inputs=[ct.TensorType(shape=(1, 1, 10, 20))], outputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)], minimum_deployment_target=ct.target.macOS12, ) mlmodel = ct.convert( rank4_grayscale_input_model, inputs=[ct.ImageType(shape=(1, 1, 10, 20), color_layout=ct.colorlayout.GRAYSCALE)], outputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE)], convert_to="neuralnetwork", ) 
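# With the neuralnetwork backend, grayscale image input and output are expected to convert with no cast ops; the checks below verify that the spec records GRAYSCALE for both the input and the output image type.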
assert_ops_in_mil_program(mlmodel, expected_op_list=["add"]) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE) assert_spec_output_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE) verify_prediction(mlmodel) mlmodel = ct.convert(rank4_grayscale_input_model, inputs=[ct.ImageType(shape=(1, 1, 10, 20), color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)], outputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)], minimum_deployment_target=ct.target.macOS13, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["add"]) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE_FLOAT16) assert_spec_output_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE_FLOAT16) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp16") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp16") verify_prediction(mlmodel) mlmodel = ct.convert(rank4_grayscale_input_model, inputs=[ct.ImageType(shape=(1, 1, 10, 20), color_layout=ct.colorlayout.GRAYSCALE)], outputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)], minimum_deployment_target=ct.target.macOS13, ) assert_ops_in_mil_program(mlmodel, expected_op_list=["cast", "add"]) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE) assert_spec_output_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE_FLOAT16) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp32") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp16") verify_prediction(mlmodel) def test_linear_model(self, linear_model): # this will test the fuse_linear_bias pass, when the inputs are of type float16 mlmodel = ct.convert(linear_model, inputs=[ct.TensorType(shape=(1, 10), dtype=np.float16)], outputs=[ct.TensorType(dtype=np.float16)], minimum_deployment_target=ct.target.macOS13, ) assert_input_dtype(mlmodel, expected_type_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp16") assert_ops_in_mil_program(mlmodel, ["linear", "relu"]) verify_prediction(mlmodel) def test_classifier(self): torch_model = torch.nn.ReLU().eval() traced_model = torch.jit.trace(torch_model, torch.rand(3,)) model = ct.convert( traced_model, inputs=[ct.TensorType(shape=(3,), dtype=np.float16)], outputs=[ct.TensorType(dtype=np.float16)], classifier_config = ct.ClassifierConfig(['a', 'b', 'c']), convert_to='mlprogram', minimum_deployment_target=ct.target.macOS13, ) assert_input_dtype(model, expected_type_str="fp16") assert_ops_in_mil_program(model, ["relu", "cast", "classify"]) spec = model.get_spec() input_name = spec.description.input[0].name out_dict = model.predict({input_name : np.array([1.0, 2.0, 3.0])}) assert 'classLabel' in out_dict assert out_dict['classLabel'] == 'c' assert len(spec.description.output) == 2 assert "classLabel_probs" in out_dict assert isinstance(out_dict["classLabel_probs"], dict) def test_prediction_with_fp16_io(self): torch_model = torch.nn.Linear(30, 5).eval() traced_model = torch.jit.trace(torch_model, torch.rand(1, 30)) mlmodel = ct.convert(traced_model, inputs=[ct.TensorType(name="input", shape=(1, 30), dtype=np.float32)], outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.macOS13, compute_units=ct.ComputeUnit.CPU_ONLY, ) # test prediction sample_input = np.random.rand(1, 30).astype(np.float32) * 10 model_output = mlmodel.predict({"input": 
sample_input})[mlmodel._spec.description.output[0].name] reference_output = traced_model(torch.from_numpy(sample_input)).detach().numpy() np.testing.assert_allclose(reference_output, model_output, rtol=1e-2, atol=1e-2) @pytest.mark.skipif(ct.utils._macos_version() < (13, 0), reason='Tests are for deployment target ios16/macos13') class TestGrayscaleImagePredictions: def test_grayscale_input_image(self, rank4_grayscale_input_model): mlmodel = ct.convert(rank4_grayscale_input_model, inputs=[ct.ImageType(name="input_image", shape=(1, 1, 10, 20), color_layout=ct.colorlayout.GRAYSCALE)], outputs=[ct.TensorType(name="output")], minimum_deployment_target=ct.target.macOS13, ) sample_input = np.random.randint(low=0, high=246, size=(1, 1, 10, 20)) img_input = Image.fromarray(sample_input[0, 0, :, :].astype(np.uint8), 'L') model_output = mlmodel.predict({"input_image": img_input})['output'] reference_output = rank4_grayscale_input_model(torch.from_numpy(sample_input.astype(np.float32))).detach().numpy() np.testing.assert_allclose(reference_output, model_output, rtol=1e-2, atol=1e-2) def test_grayscale_fp16_input_image(self, rank4_grayscale_input_model): mlmodel = ct.convert(rank4_grayscale_input_model, inputs=[ct.ImageType(name="input_image", shape=(1, 1, 10, 20), color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)], outputs=[ct.TensorType(name="output")], minimum_deployment_target=ct.target.macOS13, ) # incorrect way to do prediction with pytest.raises(TypeError, match="must be of type PIL.Image.Image with mode=='F'", ): sample_input = np.random.randint(low=0, high=246, size=(1, 1, 10, 20)) img_input = Image.fromarray(sample_input[0, 0, :, :].astype(np.uint8), 'L') mlmodel.predict({"input_image": img_input}) # correct way to do prediction sample_input = np.random.rand(1, 1, 10, 20) # in between [0, 1] img_input = Image.fromarray(sample_input[0, 0, :, :].astype(np.float32), 'F') model_output = mlmodel.predict({"input_image": img_input})['output'] reference_output = rank4_grayscale_input_model(torch.from_numpy(sample_input.astype(np.float32))).detach().numpy() np.testing.assert_allclose(reference_output, model_output, rtol=1e-2, atol=1e-2) @pytest.mark.parametrize( "dynamic_shape", [True, False], ) def test_grayscale_output_image(self, rank4_grayscale_input_model, dynamic_shape): if dynamic_shape: shape = ct.Shape((1, 1, ct.RangeDim(5, 10), ct.RangeDim(5, 20))) else: shape = (1, 1, 10, 20) mlmodel = ct.convert( rank4_grayscale_input_model, inputs=[ct.TensorType(name="input", shape=shape)], outputs=[ct.ImageType(name="output_image", color_layout=ct.colorlayout.GRAYSCALE)], minimum_deployment_target=ct.target.macOS13, compute_precision=ct.precision.FLOAT32, ) sample_input = np.random.randint(low=0, high=200, size=(1, 1, 10, 20)).astype(np.float32) model_output_pil_image = mlmodel.predict({"input": sample_input})['output_image'] assert isinstance(model_output_pil_image, Image.Image) assert model_output_pil_image.mode == "L" model_output_as_numpy = np.array(model_output_pil_image) reference_output = rank4_grayscale_input_model(torch.from_numpy(sample_input)).detach().numpy() reference_output = np.squeeze(reference_output) np.testing.assert_allclose(reference_output, model_output_as_numpy, rtol=1e-2, atol=1e-2) @pytest.mark.parametrize( "dynamic_shape", [True, False], ) def test_grayscale_fp16_output_image(self, rank4_grayscale_input_model, dynamic_shape): if dynamic_shape: shape = ct.Shape((1, 1, ct.RangeDim(5, 10), ct.RangeDim(5, 20))) else: shape = (1, 1, 10, 20) mlmodel = ct.convert( 
rank4_grayscale_input_model, inputs=[ct.TensorType(name="input", shape=shape)], outputs=[ ct.ImageType(name="output_image", color_layout=ct.colorlayout.GRAYSCALE_FLOAT16) ], minimum_deployment_target=ct.target.macOS13, compute_precision=ct.precision.FLOAT32, ) sample_input = np.random.randint(low=0, high=200, size=(1, 1, 10, 20)).astype(np.float32) model_output_pil_image = mlmodel.predict({"input": sample_input})['output_image'] assert isinstance(model_output_pil_image, Image.Image) assert model_output_pil_image.mode == "F" model_output_as_numpy = np.array(model_output_pil_image) reference_output = rank4_grayscale_input_model(torch.from_numpy(sample_input)).detach().numpy() reference_output = np.squeeze(reference_output) np.testing.assert_allclose(reference_output, model_output_as_numpy, rtol=1e-2, atol=1e-2) @pytest.mark.skipif( ct.utils._macos_version() < (14, 0), reason="Tests are for deployment target iOS16/macos14" ) class TestQuantizationConversionAPI: def test_dynamic_quantization(self): torch.backends.quantized.engine = "qnnpack" class Model(torch.nn.Module): def __init__(self): super().__init__() self.fc = torch.nn.Linear(3, 2) def forward(self, x): x = self.fc(x) return x SHAPE = (4, 3) x = torch.randn(SHAPE) model_fp32 = Model() model_int8 = torch.ao.quantization.quantize_dynamic( model_fp32, {torch.nn.Linear}, # a set of layers to dynamically quantize dtype=torch.qint8, ) model_int8.eval() traced_model = torch.jit.trace(model_int8, x) with pytest.raises( RuntimeError, match=( r"PyTorch convert function for op '.*_dynamic' not implemented\.\n" r"Dynamic quantized models are not supported by Core ML.\n" r"Please use static quantization or the APIs in coremltools.optimize to quantize/compress models." ), ): ct.convert(traced_model, inputs=[ct.TensorType(shape=SHAPE)]) def test_static_quantization_as_activation_quantization(self): torch.backends.quantized.engine = "qnnpack" class Model(torch.nn.Module): def __init__(self): super().__init__() self.quant = torch.ao.quantization.QuantStub() self.conv = torch.nn.Conv2d(3, 2, 5) self.relu = torch.nn.ReLU() self.dequant = torch.ao.quantization.DeQuantStub() def forward(self, x): x = self.quant(x) x = self.conv(x) x = self.relu(x) x = self.dequant(x) return x SHAPE = (4, 3, 8, 16) x = torch.randn(SHAPE) model_fp32 = Model() model_fp32.eval() model_fp32.qconfig = torch.ao.quantization.get_default_qconfig("qnnpack") model_fp32_fused = torch.ao.quantization.fuse_modules(model_fp32, [["conv", "relu"]]) model_fp32_prepared = torch.ao.quantization.prepare(model_fp32_fused) model_fp32_prepared(x) model_int8 = torch.ao.quantization.convert(model_fp32_prepared) traced_model = torch.jit.trace(model_int8, x) coreml_model = ct.convert( traced_model, inputs=[ct.TensorType(name="x", shape=SHAPE)], outputs=[ct.TensorType(name="y")], minimum_deployment_target=ct.target.iOS17, ) ops = get_op_types_in_program(coreml_model._mil_program) # constexpr_affine_dequantize and cast -> quantize can have arbitrary order assert set(ops[:2]) == set(["quantize", "constexpr_affine_dequantize"]) # these ops have well-defined order assert ops[2:] == [ # quantized ConvRelu op "dequantize", "conv", "relu", "quantize", # dequantize and output "dequantize", ] output = traced_model(x) coreml_output = coreml_model.predict({"x": x})["y"] np.testing.assert_allclose(output, coreml_output, rtol=1e-2, atol=2e-2) def test_static_quantization_as_weight_compression(self): torch.backends.quantized.engine = "qnnpack" weight = torch.rand(5, 3, 2, 4) class Model(torch.nn.Module): def 
__init__(self): super().__init__() self.quant = torch.ao.quantization.QuantStub() self.dequant = torch.ao.quantization.DeQuantStub() def forward(self, x): quantized_weight = self.quant(weight) dequantized_weight = self.dequant(quantized_weight) y = torch.nn.functional.conv2d(x, dequantized_weight) return y SHAPE = (4, 3, 16, 32) x = torch.randn(SHAPE) model_fp32 = Model() model_fp32.eval() model_fp32.qconfig = torch.ao.quantization.get_default_qconfig("qnnpack") model_fp32_prepared = torch.ao.quantization.prepare(model_fp32) model_fp32_prepared(x) model_int8 = torch.ao.quantization.convert(model_fp32_prepared) traced_model = torch.jit.trace(model_int8, x) coreml_model = ct.convert( traced_model, inputs=[ct.TensorType(name="x", shape=SHAPE)], outputs=[ct.TensorType(name="y")], minimum_deployment_target=ct.target.iOS17, ) ops = get_op_types_in_program(coreml_model._mil_program) # for weight compression, only the constexpr_affine_dequantize and conv ops are expected assert ops == [ "constexpr_affine_dequantize", "conv", ] output = traced_model(x) coreml_output = coreml_model.predict({"x": x})["y"] np.testing.assert_allclose(output, coreml_output, rtol=1e-2, atol=2e-2) class TestiOS16DefaultIODtype: """ This class tests the default i/o dtype behavior for iOS16 (and above) models. """ @staticmethod def _verify_model_io(mlmodel, input_dtype, output_dtype, expected_op_list): """ This utility function verifies the model's i/o dtypes and expected ops """ assert_input_dtype(mlmodel, expected_type_str=input_dtype) assert_output_dtype(mlmodel, expected_type_str=output_dtype) assert_ops_in_mil_program(mlmodel, expected_op_list=expected_op_list) verify_prediction(mlmodel) def test_iO16_default_fp16_input(self, float32_input_model_add_op): """ With minimum_deployment_target >= iOS16 and the compute precision set to fp16, an fp16 i/o model is produced by default. However, if the user specifies a dtype, the converter respects it.
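For example, converting with inputs=[ct.TensorType(shape=(10, 20))] and no outputs argument is expected to yield an fp16 i/o program with no boundary casts, whereas passing dtype=np.float32 keeps that boundary fp32 and inserts a cast (see the cases below).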
""" # Case 1: Inputs given / outputs None mlmodel = ct.convert( float32_input_model_add_op, inputs=[ct.TensorType(shape=(10, 20))], minimum_deployment_target=ct.target.iOS16, ) self._verify_model_io( mlmodel, input_dtype="fp16", output_dtype="fp16", expected_op_list=["add"], ) # Case 2: Inputs given / outputs given mlmodel = ct.convert( float32_input_model_add_op, inputs=[ct.TensorType(shape=(10, 20), dtype=None)], outputs=[ct.TensorType(dtype=None)], minimum_deployment_target=ct.target.iOS16, ) self._verify_model_io( mlmodel, input_dtype="fp16", output_dtype="fp16", expected_op_list=["add"], ) # Case 3: Inputs set fp32 / outputs None mlmodel = ct.convert( float32_input_model_add_op, inputs=[ct.TensorType(shape=(10, 20), dtype=np.float32)], minimum_deployment_target=ct.target.iOS16, ) self._verify_model_io( mlmodel, input_dtype="fp32", output_dtype="fp16", expected_op_list=["cast", "add"], ) # Case 4: Inputs set fp32 / outputs given mlmodel = ct.convert( float32_input_model_add_op, inputs=[ct.TensorType(shape=(10, 20), dtype=np.float32)], outputs=[ct.TensorType(dtype=None)], minimum_deployment_target=ct.target.iOS16, ) self._verify_model_io( mlmodel, input_dtype="fp32", output_dtype="fp16", expected_op_list=["cast", "add"], ) # Case 5: Inputs given / outputs set to fp32 mlmodel = ct.convert( float32_input_model_add_op, inputs=[ct.TensorType(shape=(10, 20))], outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.iOS16, ) self._verify_model_io( mlmodel, input_dtype="fp16", output_dtype="fp32", expected_op_list=["add", "cast"], ) # Case 6: Inputs / outputs both set to fp32 mlmodel = ct.convert( float32_input_model_add_op, inputs=[ct.TensorType(shape=(10, 20), dtype=np.float32)], outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.iOS16, ) self._verify_model_io( mlmodel, input_dtype="fp32", output_dtype="fp32", expected_op_list=["cast", "add", "cast"], ) def test_iO16_default_fp16_io_with_multiple_inputs(self, float32_two_input_model): """ For the multiple inputs model, the converter only set the default dtype for inputs with unspecified dtype. 
""" # Case 1: first input is set to fp32 mlmodel = ct.convert( float32_two_input_model, inputs=[ct.TensorType(shape=(10, 20), dtype=np.float32), ct.TensorType(shape=(10, 20))], minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="fp32", index=0) assert_input_dtype(mlmodel, expected_type_str="fp16", index=1) assert_output_dtype(mlmodel, expected_type_str="fp16") assert_ops_in_mil_program(mlmodel, expected_op_list=["cast", "add"]) # Case 2: second input is set to fp32 mlmodel = ct.convert( float32_two_input_model, inputs=[ct.TensorType(shape=(10, 20)), ct.TensorType(shape=(10, 20), dtype=np.float32)], minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="fp16", index=0) assert_input_dtype(mlmodel, expected_type_str="fp32", index=1) assert_output_dtype(mlmodel, expected_type_str="fp16") assert_ops_in_mil_program(mlmodel, expected_op_list=["cast", "add"]) # Case 3: both inputs are set to fp32 mlmodel = ct.convert( float32_two_input_model, inputs=[ ct.TensorType(shape=(10, 20), dtype=np.float32), ct.TensorType(shape=(10, 20), dtype=np.float32), ], minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="fp32", index=0) assert_input_dtype(mlmodel, expected_type_str="fp32", index=1) assert_output_dtype(mlmodel, expected_type_str="fp16") assert_ops_in_mil_program(mlmodel, expected_op_list=["cast", "cast", "add"]) # Case 4: both inputs are not set mlmodel = ct.convert( float32_two_input_model, inputs=[ct.TensorType(shape=(10, 20)), ct.TensorType(shape=(10, 20))], minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="fp16", index=0) assert_input_dtype(mlmodel, expected_type_str="fp16", index=1) assert_output_dtype(mlmodel, expected_type_str="fp16") assert_ops_in_mil_program(mlmodel, expected_op_list=["add"]) def test_iO16_default_fp16_io_with_multiple_outputs( self, float32_two_output_model, int32_float32_two_output_model ): """ For the multiple outputs model, the converter only set the default dtype to fp16 for outputs that satisfy 1. dtype is None 2. 
inferred dtype is fp32 """ # Case 1: first output is set to fp32 mlmodel = ct.convert( float32_two_output_model, inputs=[ct.TensorType(shape=(10, 20))], outputs=[ct.TensorType(dtype=np.float32), ct.TensorType(dtype=None)], minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp32", index=0) assert_output_dtype(mlmodel, expected_type_str="fp16", index=1) assert_ops_in_mil_program(mlmodel, expected_op_list=["relu", "clip", "cast"]) # Case 2: second output is set to fp32 mlmodel = ct.convert( float32_two_output_model, inputs=[ct.TensorType(shape=(10, 20))], outputs=[ct.TensorType(dtype=None), ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp16", index=0) assert_output_dtype(mlmodel, expected_type_str="fp32", index=1) assert_ops_in_mil_program(mlmodel, expected_op_list=["relu", "clip", "cast"]) # Case 3: both outputs are set to fp32 mlmodel = ct.convert( float32_two_output_model, inputs=[ct.TensorType(shape=(10, 20))], outputs=[ct.TensorType(dtype=np.float32), ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp32", index=0) assert_output_dtype(mlmodel, expected_type_str="fp32", index=1) assert_ops_in_mil_program(mlmodel, expected_op_list=["relu", "clip", "cast", "cast"]) # Case 4: both outputs are not set mlmodel = ct.convert( float32_two_output_model, inputs=[ct.TensorType(shape=(10, 20))], outputs=[ct.TensorType(dtype=None), ct.TensorType(dtype=None)], minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp16", index=0) assert_output_dtype(mlmodel, expected_type_str="fp16", index=1) assert_ops_in_mil_program(mlmodel, expected_op_list=["relu", "clip"]) # Case 5: outputs is not provided at all mlmodel = ct.convert( float32_two_output_model, inputs=[ct.TensorType(shape=(10, 20))], minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="fp16") assert_output_dtype(mlmodel, expected_type_str="fp16", index=0) assert_output_dtype(mlmodel, expected_type_str="fp16", index=1) assert_ops_in_mil_program(mlmodel, expected_op_list=["relu", "clip"]) # Case 6: int32 and fp32 output. The fp32 defaults to fp32 while the int32 one remains unchanged. mlmodel = ct.convert( int32_float32_two_output_model, inputs=[ ct.TensorType(shape=(10, 20), dtype=np.int32), ct.TensorType(shape=(10, 20), dtype=np.float32), ], minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="int32", index=0) assert_input_dtype(mlmodel, expected_type_str="fp32", index=1) assert_output_dtype(mlmodel, expected_type_str="int32", index=0) assert_output_dtype(mlmodel, expected_type_str="fp16", index=1) assert_ops_in_mil_program(mlmodel, expected_op_list=["add", "cast", "add"]) # Case 7: int32 and fp32 output. The fp32 defaults to fp32 while the int32 one remains unchanged. 
mlmodel = ct.convert( int32_float32_two_output_model, inputs=[ ct.TensorType(shape=(10, 20), dtype=np.int32), ct.TensorType(shape=(10, 20)), ], minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="int32", index=0) assert_input_dtype(mlmodel, expected_type_str="fp16", index=1) assert_output_dtype(mlmodel, expected_type_str="int32", index=0) assert_output_dtype(mlmodel, expected_type_str="fp16", index=1) assert_ops_in_mil_program(mlmodel, expected_op_list=["add", "add"]) # Case 8: int32 and fp32 output. The fp32 defaults to fp32 while the int32 one remains unchanged. mlmodel = ct.convert( int32_float32_two_output_model, inputs=[ ct.TensorType(shape=(10, 20), dtype=np.int32), ct.TensorType(shape=(10, 20)), ], outputs=[ ct.TensorType(name="out1"), ct.TensorType(name="out2"), ], minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="int32", index=0) assert_input_dtype(mlmodel, expected_type_str="fp16", index=1) assert_output_dtype(mlmodel, expected_type_str="int32", index=0) assert_output_dtype(mlmodel, expected_type_str="fp16", index=1) assert_ops_in_mil_program(mlmodel, expected_op_list=["add", "add"]) # Case 9: two int32 outputs. Nothing changed. mlmodel = ct.convert( int32_float32_two_output_model, inputs=[ ct.TensorType(shape=(10, 20), dtype=np.int32), ct.TensorType(shape=(10, 20), dtype=np.int32), ], minimum_deployment_target=ct.target.iOS16, ) assert_input_dtype(mlmodel, expected_type_str="int32", index=0) assert_input_dtype(mlmodel, expected_type_str="int32", index=1) assert_output_dtype(mlmodel, expected_type_str="int32", index=0) assert_output_dtype(mlmodel, expected_type_str="int32", index=1) assert_ops_in_mil_program(mlmodel, expected_op_list=["add", "add"]) def test_iO16_default_image_dtype_input( self, rank4_input_model, rank4_grayscale_input_model, ): """ We keep the input dtype for the image input model to fp32, unless it is GRAYSCALE_FLOAT16 """ # Example 1 mlmodel = ct.convert( rank4_input_model, inputs=[ct.ImageType(shape=(1, 3, 10, 20), color_layout=ct.colorlayout.RGB)], minimum_deployment_target=ct.target.iOS16, ) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.RGB) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp32") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp16") verify_prediction(mlmodel) # Example 2 mlmodel = ct.convert( rank4_input_model, inputs=[ct.ImageType(shape=(1, 3, 10, 20), color_layout=ct.colorlayout.BGR)], minimum_deployment_target=ct.target.iOS16, ) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.BGR) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp32") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp16") verify_prediction(mlmodel) # Example 3 mlmodel = ct.convert( rank4_grayscale_input_model, inputs=[ct.ImageType(shape=(1, 1, 10, 20), color_layout=ct.colorlayout.GRAYSCALE)], minimum_deployment_target=ct.target.iOS16, ) assert_spec_input_image_type( mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE ) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp32") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp16") verify_prediction(mlmodel) # Example 4 mlmodel = ct.convert( rank4_grayscale_input_model, inputs=[ ct.ImageType(shape=(1, 1, 10, 20), color_layout=ct.colorlayout.GRAYSCALE_FLOAT16) ], minimum_deployment_target=ct.target.iOS16, ) assert_spec_input_image_type( mlmodel._spec, 
expected_feature_type=ft.ImageFeatureType.GRAYSCALE_FLOAT16 ) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp16") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp16") verify_prediction(mlmodel) def test_iO16_default_image_dtype_output( self, rank4_input_model, rank4_grayscale_input_model, ): """ We keep the output dtype at fp32 for models with image outputs, unless the color layout is GRAYSCALE_FLOAT16 """ # Example 1 mlmodel = ct.convert( rank4_input_model, inputs=[ct.TensorType(shape=(1, 3, 10, 20))], outputs=[ct.ImageType(color_layout=ct.colorlayout.RGB)], minimum_deployment_target=ct.target.iOS16, ) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp16") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp32") assert_spec_output_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.RGB) verify_prediction(mlmodel) # Example 2 mlmodel = ct.convert( rank4_input_model, inputs=[ct.TensorType(shape=(1, 3, 10, 20))], outputs=[ct.ImageType(color_layout=ct.colorlayout.BGR)], minimum_deployment_target=ct.target.iOS16, ) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp16") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp32") assert_spec_output_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.BGR) verify_prediction(mlmodel) # Example 3 mlmodel = ct.convert( rank4_grayscale_input_model, inputs=[ct.TensorType(shape=(1, 1, 10, 20))], outputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE)], minimum_deployment_target=ct.target.iOS16, ) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp16") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp32") assert_spec_output_image_type( mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE ) verify_prediction(mlmodel) # Example 4 mlmodel = ct.convert( rank4_grayscale_input_model, inputs=[ct.TensorType(shape=(1, 1, 10, 20))], outputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)], minimum_deployment_target=ct.target.iOS16, ) assert_prog_input_type(mlmodel._mil_program, expected_dtype_str="fp16") assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp16") assert_spec_output_image_type( mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE_FLOAT16 ) verify_prediction(mlmodel) def test_iO16_default_fp32_io(self, float32_input_model_add_op): """ With minimum_deployment_target >= iOS16 and the compute precision set to fp32, an fp32 i/o model is produced by default.
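In that case no cast ops are expected at the model's i/o boundaries.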
""" # Case 1: Inputs given / outputs None mlmodel = ct.convert( float32_input_model_add_op, inputs=[ct.TensorType(shape=(10, 20))], compute_precision=ct.precision.FLOAT32, minimum_deployment_target=ct.target.iOS16, ) self._verify_model_io( mlmodel, input_dtype="fp32", output_dtype="fp32", expected_op_list=["add"], ) # Case 2: Inputs given / outputs given mlmodel = ct.convert( float32_input_model_add_op, inputs=[ct.TensorType(shape=(10, 20), dtype=None)], outputs=[ct.TensorType(dtype=None)], compute_precision=ct.precision.FLOAT32, minimum_deployment_target=ct.target.iOS16, ) self._verify_model_io( mlmodel, input_dtype="fp32", output_dtype="fp32", expected_op_list=["add"], ) @pytest.mark.skipif( Version(torch.__version__) < Version("2.4.0"), reason="Most torchao functionalities only work with PyTorch 2.4.0+", ) @pytest.mark.skipif( ct.utils._macos_version() < (15, 0), reason="Torchao block-wise quantization requires MacOS 15+.", ) @pytest.mark.skipif(not _HAS_TORCHAO, reason=MSG_TORCHAO_NOT_FOUND) class TestTorchao: """ This class tests the torchao quantized model conversion. """ @staticmethod def _construct_test_model(): # The old Quantizer method in torchao doesn't work with a single-layer model such as model=nn.Linear(...), # so we have to create a Module which contains linear layers. class TestModel(nn.Module): def __init__(self): super(TestModel, self).__init__() # Currently torchao only supports Linear module without bias. self.linear1 = nn.Linear(32, 64, bias=False) self.linear2 = nn.Linear(64, 32, bias=False) self.relu = nn.ReLU() def forward(self, x): x = self.relu(self.linear1(x)) return self.relu(self.linear2(x)) return TestModel().to(torch.device("cpu")).eval() @pytest.mark.parametrize("use_export", (False, True)) def test_weight_only_quantization(self, use_export): model = self._construct_test_model() quantizer = quant_api.Int4WeightOnlyQuantizer( precision=torch.float32, groupsize=32, inner_k_tiles=2, device=torch.device("cpu") ) model = quantizer.quantize(model) input_data = torch.randn((2, 32), dtype=torch.float16) if use_export: exported_model = torch.export.export(model, (input_data,)) inputs = None else: exported_model = torch.jit.trace(model, example_inputs=(input_data,)) inputs = [ct.TensorType(shape=input_data.shape, name="input")] converted_model = ct.convert( exported_model, inputs=inputs, minimum_deployment_target=ct.target.iOS18 ) main_func = converted_model._mil_program.functions["main"] quantize_ops = main_func.find_ops(op_type="constexpr_blockwise_shift_scale") assert len(quantize_ops) > 0 if ct.utils._is_macos(): result = converted_model.predict( { list(converted_model.input_description)[0]: input_data.detach() .numpy() .astype(np.float32) } ) expected = model(input_data) output_name = list(result.keys())[0] np.testing.assert_allclose(result[output_name], expected.detach().numpy(), atol=1e-3) def test_weight_only_quantization_bfloat16_not_support(self): """ Torchao quant_api.int4_weight_only only supports bfloat16. """ model = self._construct_test_model().bfloat16() quant_api.quantize_(model, quant_api.int4_weight_only(group_size=32, inner_k_tiles=2)) model = unwrap_tensor_subclass(model) input_data = torch.randn((2, 32), dtype=torch.float16) exported_model = torch.export.export(model, (input_data,)) # The conversion of bfloat16 hasn't been supported yet. 
with pytest.raises(KeyError, match="torch.bfloat16"): ct.convert(exported_model, minimum_deployment_target=ct.target.iOS17) @pytest.mark.parametrize("use_export", (True, False)) def test_dynamic_activation_quantization_not_support(self, use_export): """ Although Int8DynActInt4WeightQuantizer will be deprecated, we still want to test it because it's used in ExecuTorch to quantize llama models. """ model = self._construct_test_model() quantizer = quant_api.Int8DynActInt4WeightQuantizer( precision=torch.float16, groupsize=32, device=torch.device("cpu") ) model = quantizer.quantize(model) input_data = torch.randn((2, 32), dtype=torch.float16) if use_export: exported_model = torch.export.export(model, (input_data,)) inputs = None err_msg = "Unsupported fx node quantize_per_token" err_type = ValueError else: exported_model = torch.jit.trace(model, example_inputs=(input_data,)) inputs = [ct.TensorType(shape=input_data.shape)] err_msg = "Dynamic activation quantization is not supported in Core ML" err_type = NotImplementedError with pytest.raises(err_type, match=err_msg): ct.convert(exported_model, inputs=inputs, minimum_deployment_target=ct.target.iOS17) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508431.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/test/test_torch_export_conversion_api.py0000644000000000000000000006102714672066617033167 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import pytest from coremltools._deps import _HAS_EXECUTORCH, _HAS_TORCH_EXPORT_API if not _HAS_TORCH_EXPORT_API: pytest.skip(allow_module_level=True, reason="torch.export is required") from coremltools.converters.mil.frontend.torch.exir_utils import WRAPPED_SCALAR_INPUT_SUFFIX from coremltools.converters.mil.frontend.torch.utils import TorchFrontend frontends = [TorchFrontend.TORCHEXPORT] if _HAS_EXECUTORCH: import executorch.exir frontends.append(TorchFrontend.EXECUTORCH) import torch import coremltools as ct from coremltools.converters.mil import testing_reqs from coremltools.converters.mil.mil.scope import ScopeSource from .testing_utils import TorchBaseTest backends = testing_reqs.backends compute_units = testing_reqs.compute_units TORCH_EXPORT_DEFAULT_LOWER_BOUND = {TorchFrontend.TORCHEXPORT: 2, TorchFrontend.EXECUTORCH: 2} if torch.__version__ >= "2.4.0": TORCH_EXPORT_DEFAULT_LOWER_BOUND[TorchFrontend.TORCHEXPORT] = 0 class TestTorchExportConversionAPI(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product(compute_units, backends, frontends), ) def test_scalar_input(self, compute_unit, backend, frontend): class Model(torch.nn.Module): def forward(self, x): return x + 1 model = Model() model.eval() mlmodel = self.run_compare_torch( (), model, compute_unit=compute_unit, backend=backend, frontend=frontend, )[1] main_function = mlmodel._mil_program.functions["main"] assert len(main_function.inputs) == 1 input_name = list(main_function.inputs.keys())[0] input_var = main_function.inputs[input_name] assert input_name.endswith(WRAPPED_SCALAR_INPUT_SUFFIX) assert input_var.shape == (1,) squeeze_op = main_function.find_ops(op_type="squeeze")[0] if backend[1] == "fp32": assert squeeze_op.x is input_var elif backend[1] == "fp16": cast_op = main_function.find_ops(op_type="cast")[0] assert cast_op.x is input_var assert 
cast_op.dtype.val == "fp16" assert squeeze_op.x is cast_op.outputs[0] @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product(compute_units, backends, frontends), ) def test_dynamic_input(self, compute_unit, backend, frontend): if ct.utils._macos_version() <= (14, 2): pytest.xfail("rdar://135925921 ([CI] Upgrade External CI Machine OS)") class Model(torch.nn.Module): def __init__(self): super().__init__() self.linear = torch.nn.Linear(3, 5) def forward(self, x): return self.linear(x) model = Model() model.eval() batch_dim = torch.export.Dim("batch_dim") dynamic_shapes = {"x": {0: batch_dim}} coreml_model = self.run_compare_torch( (2, 3), model, compute_unit=compute_unit, backend=backend, frontend=frontend, torch_export_dynamic_shapes=dynamic_shapes, )[1] input_proto = coreml_model.input_description._fd_spec[0] size_ranges = input_proto.type.multiArrayType.shapeRange.sizeRanges assert size_ranges[0].lowerBound == TORCH_EXPORT_DEFAULT_LOWER_BOUND[frontend] assert size_ranges[0].upperBound == 2147483647 assert size_ranges[1].lowerBound == 3 assert size_ranges[1].upperBound == 3 @pytest.mark.parametrize("frontend, dynamic", itertools.product(frontends, (True, False))) def test_invalid_inputs(self, frontend, dynamic): class Model(torch.nn.Module): def __init__(self): super().__init__() self.linear = torch.nn.Linear(3, 5) def forward(self, x): return self.linear(x) model = Model() model.eval() example_inputs = (torch.rand(2, 3),) dynamic_shapes = None if dynamic: batch_dim = torch.export.Dim("batch_dim") dynamic_shapes = {"x": {0: batch_dim}} exported_program = torch.export.export( model, example_inputs, dynamic_shapes=dynamic_shapes, ) if frontend == TorchFrontend.EXECUTORCH: exported_program = executorch.exir.to_edge(exported_program).exported_program() with pytest.raises( AssertionError, match=r"'inputs' argument should be None for ExportedProgram" ): inputs = [ct.TensorType(shape=(2, 3))] if dynamic: batch_dim = ct.RangeDim(lower_bound=1, upper_bound=128) shape = (batch_dim, 3) inputs = [ct.TensorType(shape=shape)] ct.convert(exported_program, inputs=inputs) class TestExecuTorchExamples(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, dynamic", itertools.product(compute_units, backends, frontends, (True, False)), ) def test_mul(self, compute_unit, backend, frontend, dynamic): if ct.utils._macos_version() <= (14, 2) and dynamic: pytest.xfail("rdar://135925921 ([CI] Upgrade External CI Machine OS)") class MulModule(torch.nn.Module): def forward(self, input, other): return input * other dynamic_shapes = None if dynamic: dim0 = torch.export.Dim("dim0") dim1 = torch.export.Dim("dim1", min=1, max=3) dynamic_shapes = { "input": {0: dim0, 1: dim1}, "other": {0: dim0, 1: dim1}, } coreml_model = self.run_compare_torch( [(3, 2), (3, 2)], MulModule(), compute_unit=compute_unit, backend=backend, frontend=frontend, torch_export_dynamic_shapes=dynamic_shapes, )[1] if dynamic: for input_proto in coreml_model.input_description._fd_spec: size_ranges = input_proto.type.multiArrayType.shapeRange.sizeRanges assert size_ranges[0].lowerBound == TORCH_EXPORT_DEFAULT_LOWER_BOUND[frontend] assert size_ranges[0].upperBound == 2147483647 assert size_ranges[1].lowerBound == max( 1, TORCH_EXPORT_DEFAULT_LOWER_BOUND[frontend] ) assert size_ranges[1].upperBound == 3 mil_program = coreml_model._mil_program mul = mil_program.functions["main"].find_ops(op_type="mul")[0] stack_trace = mul.scopes[ScopeSource.EXIR_STACK_TRACE][0] assert stack_trace.split("\n")[-2].strip() == 
"return input * other" if frontend == TorchFrontend.EXECUTORCH: debug_handle = mul.scopes[ScopeSource.EXIR_DEBUG_HANDLE][0] assert isinstance(debug_handle, int) debug_handle_to_ops_mapping = mil_program.construct_debug_handle_to_ops_mapping() assert debug_handle_to_ops_mapping.keys() == {debug_handle} ops = debug_handle_to_ops_mapping[debug_handle] index_mul = 0 indices_const = () indices_cast = () if backend[1] == "fp32": assert len(ops) == 1 index_mul = 0 else: # fp16 introduces additional io casts # each cast introduces 1 const to store destination dtype assert len(ops) == 7 index_mul = 4 indices_const = (0, 1, 5) indices_cast = (2, 3, 6) assert ops[index_mul] == [ {"Type": "Program"}, {"Type": "Function", "Name": "main"}, {"Type": "Block"}, {"Type": "Operation", "Operator": "mul", "Output": mul.outputs[0].name}, ] for index_const_cast in indices_const + indices_cast: assert ops[index_const_cast][:-1] == [ {"Type": "Program"}, {"Type": "Function", "Name": "main"}, {"Type": "Block"}, ] for index_const in indices_const: assert ops[index_const][-1]["Operator"] == "const" for index_cast in indices_cast: assert ops[index_cast][-1]["Operator"] == "cast" @pytest.mark.parametrize( "compute_unit, backend, frontend, dynamic", itertools.product(compute_units, backends, frontends, (True, False)), ) def test_linear(self, compute_unit, backend, frontend, dynamic): if ct.utils._macos_version() <= (14, 2) and dynamic: pytest.xfail("rdar://135925921 ([CI] Upgrade External CI Machine OS)") class LinearModule(torch.nn.Module): def __init__(self): super().__init__() self.linear = torch.nn.Linear(3, 3) def forward(self, arg): return self.linear(arg) dynamic_shapes = None if dynamic: batch_dim = torch.export.Dim("batch_dim") dynamic_shapes = {"arg": {0: batch_dim}} coreml_model = self.run_compare_torch( [(3, 3)], LinearModule(), compute_unit=compute_unit, backend=backend, frontend=frontend, torch_export_dynamic_shapes=dynamic_shapes, )[1] if dynamic: input_proto = coreml_model.input_description._fd_spec[0] size_ranges = input_proto.type.multiArrayType.shapeRange.sizeRanges assert size_ranges[0].lowerBound == TORCH_EXPORT_DEFAULT_LOWER_BOUND[frontend] assert size_ranges[0].upperBound == 2147483647 assert size_ranges[1].lowerBound == 3 assert size_ranges[1].upperBound == 3 mil_program = coreml_model._mil_program linear = mil_program.functions["main"].find_ops(op_type="linear")[0] stack_trace = linear.scopes[ScopeSource.EXIR_STACK_TRACE][0] assert stack_trace.split("\n")[-2].strip() == "return self.linear(arg)" if frontend == TorchFrontend.EXECUTORCH: debug_handle = linear.scopes[ScopeSource.EXIR_DEBUG_HANDLE][0] assert isinstance(debug_handle, int) debug_handle_to_ops_mapping = mil_program.construct_debug_handle_to_ops_mapping() assert debug_handle_to_ops_mapping.keys() == {debug_handle} ops = debug_handle_to_ops_mapping[debug_handle] index_linear = 0 indices_const = () indices_cast = () if backend[1] == "fp32": assert len(ops) == 3 index_linear = 2 indices_const = (0, 1) else: # fp16 introduces additional io casts # each cast introduces 1 const to store destination dtype assert len(ops) == 7 index_linear = 4 indices_const = (0, 1, 2, 5) indices_cast = (3, 6) assert ops[index_linear] == [ {"Type": "Program"}, {"Type": "Function", "Name": "main"}, {"Type": "Block"}, {"Type": "Operation", "Operator": "linear", "Output": linear.outputs[0].name}, ] for index_const_cast in indices_const + indices_cast: assert ops[index_const_cast][:-1] == [ {"Type": "Program"}, {"Type": "Function", "Name": "main"}, {"Type": 
"Block"}, ] for index_const in indices_const: assert ops[index_const][-1]["Operator"] == "const" for index_cast in indices_cast: assert ops[index_cast][-1]["Operator"] == "cast" @pytest.mark.parametrize( "compute_unit, backend, frontend, dynamic", itertools.product(compute_units, backends, frontends, (True, False)), ) def test_add(self, compute_unit, backend, frontend, dynamic): if dynamic: pytest.skip( "https://github.com/apple/coremltools/issues/2307 " "torch.export has not settled the dynamism from 0/1 static shape yet" ) class AddModule(torch.nn.Module): def forward(self, x, y): z = x + y z = z + x z = z + x z = z + z return z dynamic_shapes = None if dynamic: dim0 = torch.export.Dim("dim0", min=1) dynamic_shapes = {"x": {0: dim0}, "y": {0: dim0}} coreml_model = self.run_compare_torch( [(1,), (1,)], AddModule(), compute_unit=compute_unit, backend=backend, frontend=frontend, torch_export_dynamic_shapes=dynamic_shapes, )[1] if dynamic: for input_proto in coreml_model.input_description._fd_spec: size_ranges = input_proto.type.multiArrayType.shapeRange.sizeRanges assert size_ranges[0].lowerBound == 1 assert size_ranges[0].upperBound == 2147483647 mil_program = coreml_model._mil_program adds = mil_program.functions["main"].find_ops(op_type="add") stack_traces = [add.scopes[ScopeSource.EXIR_STACK_TRACE][0] for add in adds] source_codes = [ "z = x + y", "z = z + x", "z = z + x", "z = z + z", ] for i, stack_trace in enumerate(stack_traces): assert stack_trace.split("\n")[-2].strip() == source_codes[i] if frontend == TorchFrontend.EXECUTORCH: debug_handles = [add.scopes[ScopeSource.EXIR_DEBUG_HANDLE][0] for add in adds] for debug_handle in debug_handles: assert isinstance(debug_handle, int) debug_handle_to_ops_mapping = mil_program.construct_debug_handle_to_ops_mapping() assert debug_handle_to_ops_mapping.keys() == set(debug_handles) for add_index, debug_handle in enumerate(debug_handles): add = adds[add_index] ops = debug_handle_to_ops_mapping[debug_handle] index_add = 0 indices_const = () indices_cast = () if backend[1] == "fp32": assert len(ops) == 1 index_add = 0 else: # fp16 introduces additional io casts # each cast introduces 1 const to store destination dtype ADD_INDEX_TO_NUM_OPS = {0: 5, 1: 1, 2: 1, 3: 3} ADD_INDEX_TO_OP_INDEX = {0: -1, 1: 0, 2: 0, 3: 0} assert len(ops) == ADD_INDEX_TO_NUM_OPS[add_index] index_add = ADD_INDEX_TO_OP_INDEX[add_index] if add_index == 0: indices_const = (0, 1) indices_cast = (2, 3) elif add_index == 3: indices_const = (1,) indices_cast = (2,) assert ops[index_add] == [ {"Type": "Program"}, {"Type": "Function", "Name": "main"}, {"Type": "Block"}, {"Type": "Operation", "Operator": "add", "Output": add.outputs[0].name}, ] for index_const_cast in indices_const + indices_cast: assert ops[index_const_cast][:-1] == [ {"Type": "Program"}, {"Type": "Function", "Name": "main"}, {"Type": "Block"}, ] for index_const in indices_const: assert ops[index_const][-1]["Operator"] == "const" for index_cast in indices_cast: assert ops[index_cast][-1]["Operator"] == "cast" @pytest.mark.parametrize( "compute_unit, backend, frontend, dynamic", itertools.product(compute_units, backends, frontends, (True, False)), ) def test_add_mul(self, compute_unit, backend, frontend, dynamic): if ct.utils._macos_version() <= (14, 2) and dynamic: pytest.xfail("rdar://135925921 ([CI] Upgrade External CI Machine OS)") class AddMulModule(torch.nn.Module): def forward(self, a, x, b): y = torch.mm(a, x) z = torch.add(y, b) return z dynamic_shapes = None if dynamic: embedding_dim = 
torch.export.Dim("embedding_dim") dynamic_shapes = { "a": {1: embedding_dim}, "x": {0: embedding_dim}, "b": {}, } coreml_model = self.run_compare_torch( [(2, 2), (2, 2), (2, 2)], AddMulModule(), compute_unit=compute_unit, backend=backend, frontend=frontend, torch_export_dynamic_shapes=dynamic_shapes, )[1] if dynamic: for i, input_proto in enumerate(coreml_model.input_description._fd_spec): multi_array_type = input_proto.type.multiArrayType shape = multi_array_type.shape size_ranges = multi_array_type.shapeRange.sizeRanges if i == 0: assert size_ranges[0].lowerBound == 2 assert size_ranges[0].upperBound == 2 assert size_ranges[1].lowerBound == TORCH_EXPORT_DEFAULT_LOWER_BOUND[frontend] assert size_ranges[1].upperBound == 2147483647 elif i == 1: assert size_ranges[0].lowerBound == TORCH_EXPORT_DEFAULT_LOWER_BOUND[frontend] assert size_ranges[0].upperBound == 2147483647 assert size_ranges[1].lowerBound == 2 assert size_ranges[1].upperBound == 2 else: assert i == 2 assert shape == [2, 2] assert len(size_ranges) == 0 mil_program = coreml_model._mil_program matmul_or_add = {} for op_type in ("matmul", "add"): matmul_or_add[op_type] = mil_program.functions["main"].find_ops(op_type=op_type)[0] stack_traces = { k: v.scopes[ScopeSource.EXIR_STACK_TRACE][0] for k, v in matmul_or_add.items() } source_codes = { "matmul": "y = torch.mm(a, x)", "add": "z = torch.add(y, b)", } for op_type in ("matmul", "add"): stack_trace = stack_traces[op_type] source_code = source_codes[op_type] assert stack_trace.split("\n")[-2].strip() == source_code if frontend == TorchFrontend.EXECUTORCH: debug_handle = { k: v.scopes[ScopeSource.EXIR_DEBUG_HANDLE][0] for k, v in matmul_or_add.items() } for v in debug_handle.values(): assert isinstance(v, int) debug_handle_to_ops_mapping = mil_program.construct_debug_handle_to_ops_mapping() assert debug_handle_to_ops_mapping.keys() == set(debug_handle.values()) ops = {} for op_type in ("matmul", "add"): ops[op_type] = debug_handle_to_ops_mapping[debug_handle[op_type]] index = {"matmul": 0, "add": 0} indices_const = {"matmul": (), "add": ()} indices_cast = {"matmul": (), "add": ()} if backend[1] == "fp32": assert len(ops["matmul"]) == 3 and len(ops["add"]) == 1 index = {"matmul": 2, "add": 0} indices_const["matmul"] = (0, 1) else: # fp16 introduces additional io casts # each cast introduces 1 const to store destination dtype assert len(ops["matmul"]) == 7 and len(ops["add"]) == 5 index = {"matmul": 6, "add": 2} indices_const = {"matmul": (0, 1, 2, 3), "add": (0, 3)} indices_cast = {"matmul": (4, 5), "add": (1, 4)} for op_type in ("matmul", "add"): assert ops[op_type][index[op_type]] == [ {"Type": "Program"}, {"Type": "Function", "Name": "main"}, {"Type": "Block"}, { "Type": "Operation", "Operator": op_type, "Output": matmul_or_add[op_type].outputs[0].name, }, ] for index_const_cast in indices_const[op_type] + indices_cast[op_type]: assert ops[op_type][index_const_cast][:-1] == [ {"Type": "Program"}, {"Type": "Function", "Name": "main"}, {"Type": "Block"}, ] for index_const in indices_const[op_type]: assert ops[op_type][index_const][-1]["Operator"] == "const" for index_cast in indices_cast[op_type]: assert ops[op_type][index_cast][-1]["Operator"] == "cast" @pytest.mark.parametrize( "compute_unit, backend, frontend, dynamic", itertools.product(compute_units, backends, frontends, (True, False)), ) def test_softmax(self, compute_unit, backend, frontend, dynamic): if ct.utils._macos_version() <= (14, 2) and dynamic: pytest.xfail("rdar://135925921 ([CI] Upgrade External CI Machine OS)") 
class SoftmaxModule(torch.nn.Module): def __init__(self): super().__init__() self.softmax = torch.nn.Softmax() def forward(self, x): return self.softmax(x) dynamic_shapes = None if dynamic: vocab_dim = torch.export.Dim("vocab_dim") dynamic_shapes = {"x": {0: vocab_dim}} coreml_model = self.run_compare_torch( [(2, 2)], SoftmaxModule(), compute_unit=compute_unit, backend=backend, frontend=frontend, torch_export_dynamic_shapes=dynamic_shapes, )[1] if dynamic: input_proto = coreml_model.input_description._fd_spec[0] size_ranges = input_proto.type.multiArrayType.shapeRange.sizeRanges assert size_ranges[0].lowerBound == TORCH_EXPORT_DEFAULT_LOWER_BOUND[frontend] assert size_ranges[0].upperBound == 2147483647 assert size_ranges[1].lowerBound == 2 assert size_ranges[1].upperBound == 2 mil_program = coreml_model._mil_program softmax = mil_program.functions["main"].find_ops(op_type="softmax")[0] stack_trace = softmax.scopes[ScopeSource.EXIR_STACK_TRACE][0] assert stack_trace.split("\n")[-2].strip() == "return self.softmax(x)" if frontend == TorchFrontend.EXECUTORCH: debug_handle = softmax.scopes[ScopeSource.EXIR_DEBUG_HANDLE][0] assert isinstance(debug_handle, int) debug_handle_to_ops_mapping = mil_program.construct_debug_handle_to_ops_mapping() assert debug_handle_to_ops_mapping.keys() == {debug_handle} ops = debug_handle_to_ops_mapping[debug_handle] index_softmax = 0 indices_const = () indices_cast = () if backend[1] == "fp32": assert len(ops) == 2 index_softmax = 1 indices_const = (0,) else: # fp16 introduces additional io casts # each cast introduces 1 const to store destination dtype assert len(ops) == 6 index_softmax = 3 indices_const = (0, 1, 4) indices_cast = (2, 5) assert ops[index_softmax] == [ {"Type": "Program"}, {"Type": "Function", "Name": "main"}, {"Type": "Block"}, {"Type": "Operation", "Operator": "softmax", "Output": softmax.outputs[0].name}, ] for index_const_cast in indices_const + indices_cast: assert ops[index_const_cast][:-1] == [ {"Type": "Program"}, {"Type": "Function", "Name": "main"}, {"Type": "Block"}, ] for index_const in indices_const: assert ops[index_const][-1]["Operator"] == "const" for index_cast in indices_cast: assert ops[index_cast][-1]["Operator"] == "cast" ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/test/test_torch_export_quantization.py0000644000000000000000000002061414672066616032673 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools from typing import Tuple import pytest from coremltools._deps import _HAS_EXECUTORCH, _HAS_TORCH_EXPORT_API if not _HAS_TORCH_EXPORT_API: pytest.skip(allow_module_level=True, reason="torch.export is required") from coremltools.converters.mil.frontend.torch.utils import TorchFrontend frontends = [TorchFrontend.TORCHEXPORT] if _HAS_EXECUTORCH: frontends.append(TorchFrontend.EXECUTORCH) import torch _TORCH_VERSION = torch.__version__ _EXPECTED_TORCH_VERSION = "2.2.0" if _TORCH_VERSION < _EXPECTED_TORCH_VERSION: pytest.skip( allow_module_level=True, reason=f"PyTorch {_EXPECTED_TORCH_VERSION} or higher is required" ) from torch._export import capture_pre_autograd_graph from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e, prepare_qat_pt2e from torch.ao.quantization.quantizer.xnnpack_quantizer import ( XNNPACKQuantizer, get_symmetric_quantization_config, ) import coremltools as ct import coremltools.converters.mil.mil.types as types from coremltools.converters.mil.testing_utils import get_op_types_in_program from coremltools.optimize.torch.quantization._coreml_quantizer import CoreMLQuantizer from coremltools.optimize.torch.quantization.quantization_config import ( LinearQuantizerConfig, QuantizationScheme, ) from .testing_utils import TorchBaseTest class TestTorchExportQuantization(TorchBaseTest): @staticmethod def make_torch_quantized_graph( model, example_inputs: Tuple[torch.Tensor], quantizer_name: str, quantization_type: str, is_per_channel: bool, nbit: int, ) -> torch.fx.GraphModule: assert quantizer_name in ("XNNPack", "CoreML") assert quantization_type in ("PTQ", "QAT") assert nbit in (4, 8) if quantizer_name == "CoreML" and nbit == 4: pytest.skip("4-bit Core ML quantizer is under development") if torch.__version__ <= _EXPECTED_TORCH_VERSION: if (quantizer_name, is_per_channel, nbit) != ("CoreML", False, 8): pytest.xfail("Need at least torch 2.3.0 to run this test.") pre_autograd_aten_dialect = capture_pre_autograd_graph(model, example_inputs) if quantizer_name == "XNNPack": # As of iOS 18, Core ML does not have 4-bit activation quantization, # so we only test 4-bit weight if nbit == 4: weight_qmin = -8 weight_qmax = 7 else: weight_qmin = -128 weight_qmax = 127 quantization_config = get_symmetric_quantization_config( is_per_channel=is_per_channel, is_qat=(quantization_type == "QAT"), is_dynamic=False, weight_qmin=weight_qmin, weight_qmax=weight_qmax, ) quantizer = XNNPACKQuantizer().set_global(quantization_config) elif quantizer_name == "CoreML": quantization_config = LinearQuantizerConfig.from_dict( { "global_config": { "quantization_scheme": QuantizationScheme.symmetric, "activation_dtype": torch.quint8, "weight_dtype": torch.qint8, "weight_per_channel": is_per_channel, } } ) quantizer = CoreMLQuantizer(quantization_config) if quantization_type == "PTQ": prepared_graph = prepare_pt2e(pre_autograd_aten_dialect, quantizer) elif quantization_type == "QAT": prepared_graph = prepare_qat_pt2e(pre_autograd_aten_dialect, quantizer) prepared_graph(*example_inputs) converted_graph = convert_pt2e(prepared_graph) return converted_graph @pytest.mark.parametrize( "quantizer_name, quantization_type, is_per_channel, nbit, frontend", itertools.product( ("XNNPack", "CoreML"), ("PTQ", "QAT"), (True, False), (4, 8), frontends, ), ) def test_conv_relu(self, quantizer_name, quantization_type, is_per_channel, 
nbit, frontend): SHAPE = (1, 3, 256, 256) class Model(torch.nn.Module): def __init__(self) -> None: super().__init__() self.conv = torch.nn.Conv2d( in_channels=3, out_channels=16, kernel_size=3, padding=1 ) self.relu = torch.nn.ReLU() def forward(self, x: torch.Tensor) -> torch.Tensor: a = self.conv(x) return self.relu(a) model = Model() example_inputs = (torch.randn(SHAPE),) converted_graph = self.make_torch_quantized_graph( model, example_inputs, quantizer_name, quantization_type, is_per_channel, nbit, ) minimum_deployment_target = ct.target.iOS17 if nbit == 4: minimum_deployment_target = max(minimum_deployment_target, ct.target.iOS18) _, mlmodel, _, _, _, _ = self.run_compare_torch( SHAPE, converted_graph, frontend=frontend, backend=("mlprogram", "fp16"), minimum_deployment_target=minimum_deployment_target, ) op_types_in_program = get_op_types_in_program(mlmodel._mil_program) if nbit == 4: assert "constexpr_blockwise_shift_scale" in op_types_in_program constexpr_blockwise_shift_scale_op = mlmodel._mil_program.find_ops( op_type="constexpr_blockwise_shift_scale", exactly_one=True )[0] assert constexpr_blockwise_shift_scale_op.data.dtype in (types.int4, types.uint4) else: assert "constexpr_affine_dequantize" in op_types_in_program constexpr_affine_dequantize_op = mlmodel._mil_program.find_ops( op_type="constexpr_affine_dequantize", exactly_one=True )[0] assert constexpr_affine_dequantize_op.quantized_data.dtype in (types.int8, types.uint8) @pytest.mark.parametrize( "quantizer_name, quantization_type, is_per_channel, nbit, frontend", itertools.product( ("XNNPack", "CoreML"), ("PTQ", "QAT"), (True, False), (4, 8), frontends, ), ) def test_linear(self, quantizer_name, quantization_type, is_per_channel, nbit, frontend): SHAPE = (1, 5) class Model(torch.nn.Module): def __init__(self) -> None: super().__init__() self.linear = torch.nn.Linear(5, 10) def forward(self, x: torch.Tensor) -> torch.Tensor: return self.linear(x) model = Model() example_inputs = (torch.randn(SHAPE),) converted_graph = self.make_torch_quantized_graph( model, example_inputs, quantizer_name, quantization_type, is_per_channel, nbit, ) minimum_deployment_target = ct.target.iOS17 if nbit == 4: minimum_deployment_target = max(minimum_deployment_target, ct.target.iOS18) _, mlmodel, _, _, _, _ = self.run_compare_torch( SHAPE, converted_graph, frontend=frontend, backend=("mlprogram", "fp16"), minimum_deployment_target=minimum_deployment_target, ) op_types_in_program = get_op_types_in_program(mlmodel._mil_program) if nbit == 4: assert "constexpr_blockwise_shift_scale" in op_types_in_program constexpr_blockwise_shift_scale_op = mlmodel._mil_program.find_ops( op_type="constexpr_blockwise_shift_scale", exactly_one=True )[0] assert constexpr_blockwise_shift_scale_op.data.dtype in (types.int4, types.uint4) else: assert "constexpr_affine_dequantize" in op_types_in_program constexpr_affine_dequantize_op = mlmodel._mil_program.find_ops( op_type="constexpr_affine_dequantize", exactly_one=True )[0] assert constexpr_affine_dequantize_op.quantized_data.dtype in (types.int8, types.uint8) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508431.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/test/test_torch_ops.py0000644000000000000000000145325614672066617027363 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import platform from typing import List, Optional, Tuple from unittest.mock import patch import numpy as np import pytest torch = pytest.importorskip("torch") import torch.nn as nn import coremltools as ct from coremltools import RangeDim, Shape, TensorType from coremltools._deps import _HAS_TORCH_AUDIO, _HAS_TORCH_VISION, version_lt from coremltools.converters.mil import testing_reqs from coremltools.converters.mil.frontend.torch.utils import ( NUM_TO_TORCH_DTYPE, NUMPY_DTYPE_TO_TORCH_NUM, TORCH_EXPORT_BASED_FRONTENDS, TorchFrontend, ) from coremltools.converters.mil.mil import Operation, Program, types from coremltools.converters.mil.mil.var import Var from coremltools.converters.mil.testing_utils import ( einsum_equations, gen_input_shapes_einsum, get_op_types_in_program, hardcoded_einsum_equations, ) from coremltools.models.utils import _macos_version, _python_version from .testing_utils import ( ModuleWrapper, TorchBaseTest, contains_op, export_torch_model_to_frontend, frontends, generate_input_data, ) if _HAS_TORCH_AUDIO: import torchaudio if _HAS_TORCH_VISION: import torchvision backends = testing_reqs.backends compute_units = testing_reqs.compute_units for frontend in frontends: if frontend in TORCH_EXPORT_BASED_FRONTENDS: # torch.export limits the number of compilation frames to prevent infinite loop # However, those frames are not immediately released after torch.export is done, # so when we have many torch.export calls, we can still hit the frame number limit torch._dynamo.config.accumulated_cache_size_limit = 1000000 break torch = pytest.importorskip("torch") torch.manual_seed(30) np.random.seed(30) # Set of common shapes for testing. 
Not all layers support 1D, so these two # set of shapes are kept separate COMMON_SHAPES = [(1, 10), (1, 5, 6), (1, 3, 5, 6), (1, 3, 4, 5, 6)] COMMON_SHAPES_ALL = [(1,)] + COMMON_SHAPES class TestScriptedModels(TorchBaseTest): @staticmethod def get_while_loop_model(): class TestLayer(nn.Module): def forward(self, x): x = 0.5 * x return x class TestNet(nn.Module): input_size = (1,) def __init__(self): super(TestNet, self).__init__() layer = TestLayer() self.layer = torch.jit.trace(layer, torch.rand(self.input_size)) def forward(self, x): while x > 0.01: x = self.layer(x) return x return TestNet().eval() @staticmethod def get_cond_model(): class TestNet(nn.Module): def forward(self, x): if torch.squeeze(x) < 10.0: return x * 10.0 else: return x * 2.0 return TestNet().eval() @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_while_loop(self, compute_unit, backend): model = TestScriptedModels.get_while_loop_model() self.run_compare_torch( model.input_size, model, backend=backend, compute_unit=compute_unit, use_scripting=True ) @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_cond(self, compute_unit, backend): torch_model = TestScriptedModels.get_cond_model() self.run_compare_torch( torch.tensor([1.0]), torch_model, input_as_shape=False, backend=backend, compute_unit=compute_unit, use_scripting=True, ) self.run_compare_torch( torch.tensor([11.0]), torch_model, input_as_shape=False, backend=backend, compute_unit=compute_unit, use_scripting=True, ) @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_for_loop(self, compute_unit, backend): class TestLayer(nn.Module): def forward(self, x): x = 2.0 * x return x class TestNet(nn.Module): input_size = (64,) def __init__(self): super(TestNet, self).__init__() layer = TestLayer() self.layer = torch.jit.trace(layer, torch.rand(self.input_size)) def forward(self, x): for _ in range(7): x = self.layer(x) return x model = TestNet().eval() self.run_compare_torch( model.input_size, model, backend=backend, compute_unit=compute_unit, use_scripting=True ) @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_if(self, compute_unit, backend): class TestLayer(nn.Module): def forward(self, x): x = torch.mean(x) return x class TestNet(nn.Module): input_size = (64,) def __init__(self): super(TestNet, self).__init__() layer = TestLayer() self.layer = torch.jit.trace(layer, torch.rand(self.input_size)) def forward(self, x): m = self.layer(x) if m < 0: scale = -2.0 else: scale = 2.0 x = scale * x return x model = TestNet().eval() self.run_compare_torch( model.input_size, model, backend=backend, compute_unit=compute_unit, use_scripting=True ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product(compute_units, backends, frontends) ) def test_linear(self, compute_unit, backend, frontend): class Model(torch.nn.Module): def __init__(self): super(Model, self).__init__() self.linear = torch.nn.Linear(2, 2) def forward(self, x): return self.linear(x) model = Model().eval() self.run_compare_torch( torch.tensor([[1.0, 2.0]]), model, input_as_shape=False, backend=backend, frontend=frontend, compute_unit=compute_unit, use_scripting=True, ) @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_conv(self, compute_unit, backend): pytest.xfail( "rdar://88194776 ([Converter] coremltools is not working with 
scripted torch convolution model)" ) model = torch.nn.Conv2d( in_channels=2, out_channels=3, kernel_size=1, padding="same", stride=1, dilation=1, groups=1, bias=False, ) self.run_compare_torch( (1, 2, 4, 5), model, backend=backend, compute_unit=compute_unit, use_scripting=True, ) class TestMean(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, keepdim", itertools.product(compute_units, backends, frontends, (True, False)), ) def test_mean(self, compute_unit, backend, frontend, keepdim): class Model(nn.Module): def forward(self, x): return torch.mean(x, dim=(2, 3), keepdim=keepdim) model = Model() shape = (1, 3, 256, 256) self.run_compare_torch( shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product(compute_units, backends, frontends), ) def test_with_flexible_shape(self, compute_unit, backend, frontend): if backend[0] == "mlprogram" and _macos_version() < (13, 0): pytest.xfail( "Issue fixed in iOS16/macOS13: https://github.com/apple/coremltools/issues/1420" ) class Model(nn.Module): def forward(self, x): return torch.mean(x, dim=(2, 3), keepdim=True) model = Model() shape = (1, 3, 256, 256) upper_bound = 512 if backend[0] == "mlprogram" else -1 converter_input_type = [ TensorType( shape=Shape( shape=[ 1, 3, RangeDim(upper_bound=upper_bound), RangeDim(upper_bound=upper_bound), ], default=shape, ) ) ] self.run_compare_torch( shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, converter_input_type=converter_input_type, ) @staticmethod @pytest.mark.skipif(ct.utils._macos_version() < (13, 0), reason="Bug fixed in macOS13/iOS16") def test_flexible_shape_with_default_value(): # test for bug reported in https://github.com/apple/coremltools/issues/1420 class Network(torch.nn.Module): def forward(self, x): return torch.mean(x, dim=(2, 3), keepdim=True) model = Network() x = torch.rand(1, 3, 256, 256) traced_model = torch.jit.trace(model, x) input_x = ct.TensorType( shape=( 1, 3, ct.RangeDim(upper_bound=512, default=256), ct.RangeDim(upper_bound=512, default=256), ), name="input", ) cml = ct.convert( traced_model, inputs=[input_x], outputs=[ct.TensorType(name="out")], convert_to="mlprogram", compute_units=ct.ComputeUnit.CPU_ONLY, ) input_dict = {"input": np.random.rand(1, 3, 112, 112)} if ct.utils._is_macos(): out = cml.predict(input_dict)["out"] assert out.shape == (1, 3, 1, 1) class TestAffineGrid(TorchBaseTest): @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "x_shape_and_target_size", "sampling_mode", "padding_mode", "align_corners", ] ), itertools.product( compute_units, backends, [ # shape format: (Batch, Channel, Height, Width) [(1, 1, 3, 3), (1, 1, 3, 3)], # no size change [(2, 3, 5, 5), (2, 3, 3, 2)], # down-sampling [(3, 1, 6, 6), (3, 1, 8, 8)], # up-sampling ], ["bilinear"], ["zeros"], [True], ), ) def test( self, compute_unit, backend, x_shape_and_target_size, sampling_mode, padding_mode, align_corners, ): if backend[0] == "neuralnetwork": pytest.skip("nn backend not supported") x_shape, target_size = x_shape_and_target_size theta = torch.rand((x_shape[0], 2, 3)) class TestModule(torch.nn.Module): def __init__(self): super(TestModule, self).__init__() self.affine_grid = torch.nn.functional.affine_grid self.grid_sample = torch.nn.functional.grid_sample def forward(self, x): grid = self.affine_grid( theta=theta, size=target_size, align_corners=align_corners, ) x = self.grid_sample( x, grid=grid, mode=sampling_mode, 
padding_mode=padding_mode, align_corners=align_corners, ) return x model = TestModule() self.run_compare_torch( x_shape, model, backend=backend, compute_unit=compute_unit, ) class TestGridSample(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, data_grid_shapes, mode, padding_mode, align_corners", itertools.product( compute_units, backends, [ # Input shape format: (Batch, C, Hin, Win) # Grid shape format: (Batch, Hout, Wout, 2) [(1, 1, 3, 3), (1, 3, 3, 2)], # no size change [(2, 3, 5, 5), (2, 3, 3, 2)], # down-sampling [(3, 1, 6, 6), (3, 8, 8, 2)], # up-sampling ], ["bilinear", "nearest"], ["zeros", "border", "reflection"], [True, False], ), ) def test( self, compute_unit, backend, data_grid_shapes, mode, padding_mode, align_corners, ): if backend[0] == "neuralnetwork": pytest.skip("nn backend not supported") params = { "mode": mode, "padding_mode": padding_mode, "align_corners": align_corners, } model = ModuleWrapper(function=torch.nn.functional.grid_sample, kwargs=params) self.run_compare_torch( data_grid_shapes, model, backend=backend, compute_unit=compute_unit, ) class TestFrac(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, shape", itertools.product( compute_units, backends, frontends, COMMON_SHAPES, ), ) def test_frac(self, compute_unit, backend, frontend, shape): model = ModuleWrapper(function=torch.frac) TorchBaseTest.run_compare_torch( shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, rand_range=(-10.0, 10.0), ) class TestNLLLoss(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, reduction", itertools.product( compute_units, backends, ["none", "sum", "mean"], ), ) def test_nllloss( self, compute_unit, backend, reduction, ): class NLLLossModel(nn.Module): def __init__(self): super(NLLLossModel, self).__init__() self.loss = nn.NLLLoss(reduction=reduction) def forward(self, x, target): loss = self.loss(x, target) return loss x = torch.randn(3, 5) target = torch.tensor([1, 0, 4]) inputs = (x, target) model = NLLLossModel() expected_results = model(*inputs) res = self.run_compare_torch( inputs, model, expected_results, input_as_shape=False, backend=backend, compute_unit=compute_unit, ) # verify that the translation function is using one_hot instead of gather prog = res[1]._mil_program ops = get_op_types_in_program(prog) assert "gather" not in ops and "gather_nd" not in ops assert "one_hot" in ops class TestArgSort(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, shape, axis, descending", itertools.product( compute_units, backends, COMMON_SHAPES, [-1, 0], [True, False], ), ) def test_argsort(self, compute_unit, backend, shape, axis, descending): model = ModuleWrapper( function=torch.argsort, kwargs={"dim": axis, "descending": descending} ) TorchBaseTest.run_compare_torch( shape, model, backend=backend, compute_unit=compute_unit, ) class TestSort(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, shape, axis, descending", itertools.product( compute_units, backends, COMMON_SHAPES, [-1, 0], [True, False], ), ) def test_sort(self, compute_unit, backend, shape, axis, descending): model = ModuleWrapper(function=torch.sort, kwargs={"dim": axis, "descending": descending}) TorchBaseTest.run_compare_torch( shape, model, backend=backend, compute_unit=compute_unit, ) class TestSelu(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, inplace", itertools.product( compute_units, backends, [True, False], ), ) def test_selu(self, compute_unit, backend, inplace): x = 
torch.tensor([-6.0, -4.0, -2.0, 0.0, 2.0, 4.0, 6.0]) model = torch.nn.SELU(inplace=inplace) TorchBaseTest.run_compare_torch( x, model, input_as_shape=False, backend=backend, compute_unit=compute_unit, ) class TestMv(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, matrix_shape", itertools.product(compute_units, backends, [(2, 3), (10, 12), (10, 1), (1, 5)]), ) def test_mv(self, compute_unit, backend, matrix_shape): model = ModuleWrapper(function=torch.mv) matrix = generate_input_data(matrix_shape) vector_length = matrix_shape[-1] vector = generate_input_data((vector_length,)) TorchBaseTest.run_compare_torch( (matrix, vector), model, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) @pytest.mark.skip( reason="rdar://100332029 ([PyTorch] cos_similarity unittest is failing stochastically)" ) class TestCosineSimilarity(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, dim, eps, shape", itertools.product( compute_units, backends, [0, -1], [0.1, 1e-5, 1e-8], COMMON_SHAPES, ), ) def test_cosine_similarity(self, compute_unit, backend, dim, eps, shape): class CosineSimilarity(nn.Module): def __init__(self, dim, eps): super(CosineSimilarity, self).__init__() self.cossim = torch.nn.CosineSimilarity(dim=dim, eps=eps) def forward(self, x, y): out = self.cossim(x, y) return out model = CosineSimilarity(dim, eps) input1 = generate_input_data(shape) input2 = generate_input_data(shape) TorchBaseTest.run_compare_torch( [input1, input2], model, input_as_shape=False, backend=backend, compute_unit=compute_unit, ) class TestDot(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, vector_length", itertools.product(compute_units, backends, [1, 5, 11]), ) def test_dot(self, compute_unit, backend, vector_length): model = ModuleWrapper(function=torch.dot) vector1 = generate_input_data((vector_length,)) vector2 = generate_input_data((vector_length,)) TorchBaseTest.run_compare_torch( (vector1, vector2), model, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) class TestOuter(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, x_vector_length, y_vector_length", itertools.product( compute_units, backends, frontends, [1, 5], [1, 3], ), ) def test_outer(self, compute_unit, backend, frontend, x_vector_length, y_vector_length): model = ModuleWrapper(function=torch.outer) vector1 = generate_input_data((x_vector_length,)) vector2 = generate_input_data((y_vector_length,)) TorchBaseTest.run_compare_torch( (vector1, vector2), model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) class TestCross(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, shape_dim", itertools.product(compute_units, backends, [((3,), 0), ((4, 3, 2), 1)]), ) def test_cross(self, compute_unit, backend, shape_dim): shape = shape_dim[0] dim = shape_dim[1] class CrossModel(nn.Module): def forward(self, x, y): return torch.cross(x, y, dim) x = generate_input_data(shape) y = generate_input_data(shape) model = CrossModel().eval() torch_out = model(x, y) self.run_compare_torch( (x, y), model, expected_results=torch_out, input_as_shape=False, backend=backend, compute_unit=compute_unit, ) class TestNormalize(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, shape", itertools.product( compute_units, backends, COMMON_SHAPES, ), ) def test_normalize(self, compute_unit, backend, shape): model = ModuleWrapper(function=nn.functional.normalize) TorchBaseTest.run_compare_torch( shape, model, 
backend=backend, compute_unit=compute_unit, ) class TestNorms(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, shape, keepdim", itertools.product(compute_units, backends, COMMON_SHAPES, [True, False]), ) def test_frobenius_norm(self, compute_unit, backend, shape, keepdim): num_dims = len(shape) for dim in range(-num_dims, num_dims): model = ModuleWrapper(function=torch.norm, kwargs={"keepdim": keepdim, "dim": dim}) TorchBaseTest.run_compare_torch( shape, model, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, shape, p, keepdim", itertools.product( compute_units, backends, COMMON_SHAPES, [-1, 0, 1, 2, 3, np.inf, -np.inf], [True, False], ), ) def test_number_norm(self, compute_unit, backend, shape, p, keepdim): for dim in (-1, 0, 1): model = ModuleWrapper( function=torch.norm, kwargs={"p": p, "keepdim": keepdim, "dim": dim} ) TorchBaseTest.run_compare_torch( shape, model, backend=backend, compute_unit=compute_unit, atol=1e-2, ) class TestNarrow(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, shape", itertools.product( compute_units, backends, frontends, COMMON_SHAPES, ), ) def test_narrow(self, compute_unit, backend, frontend, shape): class Model(torch.nn.Module): def __init__(self, dim, start, length): super().__init__() self.dim = dim self.start = start self.length = length def forward(self, x): return torch.narrow(x, self.dim, self.start, self.length) for cur_dim in range(len(shape)): for cur_start in range(shape[cur_dim] - 1): for cur_length in range(1, shape[cur_dim] - cur_start): m = Model(cur_dim, cur_start, cur_length) TorchBaseTest.run_compare_torch( shape, m, frontend=frontend, backend=backend, compute_unit=compute_unit, ) class TestWeightNorm(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, in_out_features", itertools.product( compute_units, backends, [(1, 1), (2, 10), (20, 10)], ), ) def test_linear(self, compute_unit, backend, in_out_features): in_features, out_features = in_out_features for dim in (None, -2, -1, 0, 1): model = nn.utils.weight_norm(nn.Linear(in_features, out_features), dim=dim) TorchBaseTest.run_compare_torch( (in_features,), model, backend=backend, compute_unit=compute_unit, atol=1e-3, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_conv2d(self, compute_unit, backend): x = torch.randn(20, 16, 50, 100) for dim in (None,) + tuple(range(-4, 4)): model = nn.utils.weight_norm(nn.Conv2d(16, 33, 3), dim=dim) TorchBaseTest.run_compare_torch( x, model, input_as_shape=False, atol=1e-3, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_conv3d(self, compute_unit, backend): x = torch.randn(20, 16, 5, 50, 100) for dim in (None,) + tuple(range(-5, 5)): model = nn.utils.weight_norm(nn.Conv3d(16, 33, 3), dim=dim) TorchBaseTest.run_compare_torch( x, model, input_as_shape=False, atol=1e-3, backend=backend, compute_unit=compute_unit, ) class TestLinAlgNorms(TorchBaseTest): def _is_valid_config(self, shape, order, dim): if isinstance(dim, tuple): if isinstance(order, int) and (order == 0 or order > 2): return False elif isinstance(dim, int): if order == "fro": return False elif dim is None: if order is not None: if len(shape) > 2: return False elif len(shape) == 2 and not isinstance(order, str) and (order == 0 or order > 2): return False elif len(shape) == 1 and isinstance(order, str): return False return True 
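    # (Editorial aside, not part of the original source.) Illustrative calls showing what
    # the guard above filters out, assuming the documented semantics of torch.linalg.norm:
    #
    #     self._is_valid_config((1, 10), order=3, dim=(0, 1))   # False: no ord=3 matrix norm
    #     self._is_valid_config((1, 10), order="fro", dim=0)    # False: "fro" needs a 2-d dim
    #     self._is_valid_config((1, 5, 6), order=2, dim=1)      # True: ord-2 vector norm
    #
    # Configurations rejected by this helper are pytest.skip()-ed in test_norm below rather
    # than treated as failures.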
@pytest.mark.parametrize( "compute_unit, backend, shape, order, keepdim, dim", itertools.product( compute_units, backends, COMMON_SHAPES, [-2, -1, 0, 1, 2, 3, np.inf, -np.inf, "fro", None], [True, False], [-1, 0, 1, (0, 1), (0, -1), None], ), ) def test_norm(self, compute_unit, backend, shape, order, keepdim, dim): if not self._is_valid_config(shape, order, dim): pytest.skip() if ( isinstance(order, int) and abs(order) == 2 and ((dim is None and len(shape) == 2) or isinstance(dim, tuple)) ): pytest.xfail("Matrix norm for order 2 and -2 is not implemented") model = ModuleWrapper( function=torch.linalg.norm, kwargs={"ord": order, "keepdim": keepdim, "dim": dim}, ) TorchBaseTest.run_compare_torch( shape, model, backend=backend, compute_unit=compute_unit, atol=1e-2, ) class TestLinAlgMatrixNorms(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, shape, order, keepdim, dim", itertools.product( compute_units, backends, COMMON_SHAPES, [-2, -1, 1, 2, np.inf, -np.inf, "fro", "nuc"], [True, False], [(0, 1), (0, -1), (1, 2), (0, 2), (2, 3)], ), ) def test_norm(self, compute_unit, backend, shape, order, keepdim, dim): if dim[-1] > len(shape) - 1: pytest.skip() if order == "nuc" or (type(order) != str and abs(order) == 2): pytest.xfail("Matrix norm for order 2, -2 and nuc is not implemented") model = ModuleWrapper( function=torch.linalg.matrix_norm, kwargs={"ord": order, "keepdim": keepdim, "dim": dim}, ) TorchBaseTest.run_compare_torch( shape, model, backend=backend, compute_unit=compute_unit, atol=1e-2 ) class TestLinAlgVectorNorms(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, shape, order, keepdim, dim", itertools.product( compute_units, backends, COMMON_SHAPES, [-2, -1, 0, 1, 2, np.inf, -np.inf], [True, False], [-1, 0, 1, (0, 1), (0, -1), None], ), ) def test_norm(self, compute_unit, backend, shape, order, keepdim, dim): model = ModuleWrapper( function=torch.linalg.vector_norm, kwargs={"ord": order, "keepdim": keepdim, "dim": dim}, ) TorchBaseTest.run_compare_torch( shape, model, backend=backend, compute_unit=compute_unit, atol=1e-2, ) class TestHardswish(TorchBaseTest): class HardswishModel(nn.Module): def __init__(self, inplace=False): super(TestHardswish.HardswishModel, self).__init__() self.activation = nn.Hardswish(inplace=inplace) def forward(self, x): return self.activation(x) def test_longer_range_input_element_values(self): x = torch.tensor([-6.0, -4.0, -2.0, 0.0, 2.0, 4.0, 6.0]) model = TestHardswish.HardswishModel() TorchBaseTest.run_compare_torch(x, model, input_as_shape=False) model = TestHardswish.HardswishModel(inplace=True) TorchBaseTest.run_compare_torch(x, model, input_as_shape=False) @pytest.mark.parametrize( "compute_unit, backend, frontend, shape", itertools.product( compute_units, backends, frontends, COMMON_SHAPES, ), ) def test_additional_shapes_and_backends(self, compute_unit, backend, frontend, shape): model = TestHardswish.HardswishModel() TorchBaseTest.run_compare_torch( shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) class TestBatchNorm(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, num_features, eps, affine", itertools.product(compute_units, backends, [5, 3, 1], [0.1, 1e-05], [True, False]), ) def test_batchnorm(self, compute_unit, backend, num_features, eps, affine): model = nn.BatchNorm2d(num_features, eps, affine=affine) self.run_compare_torch( (6, num_features, 5, 5), model, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, affine", 
itertools.product(compute_units, backends, [True, False]), ) def test_batchnorm_2d_with_conv(self, compute_unit, backend, affine): class CRNNBase(nn.Module): def __init__(self, ch_in, ch_out, kernel_size=3): super(CRNNBase, self).__init__() self.conv = nn.Conv2d(ch_in, ch_out, kernel_size=kernel_size) self.norm = nn.BatchNorm2d(ch_out, affine=affine) def forward(self, x): x = self.conv(x) x = self.norm(x) return x model = CRNNBase(ch_in=6, ch_out=16) self.run_compare_torch( (1, 6, 15, 30), model, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, num_features, eps, affine, dynamic_input", itertools.product( [ct.ComputeUnit.CPU_ONLY], backends, [5, 1], [0.1, 1e-05], [True, False], ["None", "Batch", "Height", "Width", "Depth", "All"], ), ) def test_batchnorm_3d(self, compute_unit, backend, num_features, eps, affine, dynamic_input): model = nn.BatchNorm3d(num_features, eps, affine=affine) input_shape = (6, num_features, 2, 3, 4) if dynamic_input == "None": self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, ) else: if dynamic_input == "Batch": converter_input_type = [ TensorType(shape=(RangeDim(1, 10), num_features, 2, 3, 4), dtype=np.float32) ] elif dynamic_input == "Height": converter_input_type = [ TensorType(shape=(6, num_features, RangeDim(1, 10), 3, 4), dtype=np.float32) ] elif dynamic_input == "Width": converter_input_type = [ TensorType(shape=(6, num_features, 2, RangeDim(1, 10), 4), dtype=np.float32) ] elif dynamic_input == "Depth": converter_input_type = [ TensorType(shape=(6, num_features, 2, 3, RangeDim(1, 10)), dtype=np.float32) ] elif dynamic_input == "All": converter_input_type = [ TensorType( shape=( RangeDim(1, 10), num_features, RangeDim(1, 10), RangeDim(1, 10), RangeDim(1, 10), ), dtype=np.float32, ) ] self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, converter_input_type=converter_input_type, ) @pytest.mark.parametrize( "compute_unit, backend, rank, num_features, eps, training", itertools.product( [ct.ComputeUnit.CPU_ONLY], backends, [3, 4, 5], [5, 1], [0.1, 1e-05], [True, False], ), ) def test_batchnorm_dynamic( self, compute_unit, backend, rank, num_features, eps, training ): model = ModuleWrapper( nn.functional.batch_norm, { "training": training, "eps": eps, }, ) input_shape = [6, num_features, 3, 4, 5] input_shape = input_shape[:rank] _input = torch.randn(*input_shape) _mean = torch.randn(num_features) _var = torch.randn(num_features) inputs = (_input, _mean, _var) expected_results = model(*inputs) self.run_compare_torch( inputs, model, expected_results, input_as_shape=False, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, has_weight, has_bias, has_running_mean, has_running_var", itertools.product( compute_units, backends, [True, False], [True, False], [True, False], [True, False], ), ) def test_batchnorm_dynamic_stress( self, compute_unit, backend, has_weight, has_bias, has_running_mean, has_running_var, ): num_features = 5 input_shape = (3, num_features, 2) weight = torch.randn(num_features) if has_weight else None bias = torch.randn(num_features) if has_bias else None running_mean = torch.randn(num_features) if has_running_mean else None running_var = torch.randn(num_features) if has_running_var else None class Model(torch.nn.Module): def forward(self, x): res = torch.nn.functional.batch_norm( input=x, running_mean=running_mean, running_var=running_var, weight=weight, bias=bias, training=True, 
momentum=0.0, eps=1e-05, ) return res self.run_compare_torch( input_shape, Model(), backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, affine", itertools.product(compute_units, backends, [True, False]), ) def test_batchnorm_1d_with_conv(self, compute_unit, backend, affine): class CRNNBase(nn.Module): def __init__(self, ch_in, ch_out, kernel_size=3): super(CRNNBase, self).__init__() self.conv = nn.Conv1d(ch_in, ch_out, kernel_size=kernel_size) self.norm = nn.BatchNorm1d(ch_out, affine=affine) def forward(self, x): x = self.conv(x) x = self.norm(x) return x model = CRNNBase(ch_in=6, ch_out=16) self.run_compare_torch( (1, 6, 15), model, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, shape, eps, affine", itertools.product( compute_units, backends, [(1, 10), (4, 6), (10, 1)], [0.1, 1e-05], [True, False], ), ) def test_batchnorm1d_rank2(self, compute_unit, backend, shape, eps, affine): N, C = shape batchnorm = nn.BatchNorm1d(C, eps=eps, affine=affine).eval() self.run_compare_torch( (N, C), batchnorm, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, shape, eps, affine", itertools.product( compute_units, backends, [(4, 8, 2), (1, 5, 3), (5, 10, 1), (6, 1, 4)], [0.1, 1e-05], [True, False], ), ) def test_batchnorm1d_rank3(self, compute_unit, backend, shape, eps, affine): N, C, L = shape batchnorm = nn.BatchNorm1d(C, eps=eps, affine=affine).eval() self.run_compare_torch( (N, C, L), batchnorm, backend=backend, compute_unit=compute_unit, ) class TestInstanceNorm(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, num_features, eps", itertools.product(compute_units, backends, [5, 2, 1], [0.1, 1e-05]), ) def test_instancenorm(self, compute_unit, backend, num_features, eps): model = nn.InstanceNorm2d(num_features, eps) self.run_compare_torch( (6, num_features, 5, 5), model, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, num_features", itertools.product(compute_units, backends, [5, 2, 1]), ) def test_instancenorm_1d(self, compute_unit, backend, num_features): model = nn.InstanceNorm1d(num_features) self.run_compare_torch( (6, num_features, 10), model, backend=backend, compute_unit=compute_unit, ) class TestGroupNorm(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, group_features, eps, affine", itertools.product( compute_units, backends, frontends, [(16, 32), (1, 1)], [0.1, 1e-05], [True, False] ), ) def test_groupnorm(self, compute_unit, backend, frontend, group_features, eps, affine): if frontend == TorchFrontend.EXECUTORCH: pytest.skip("ExecuTorch uses native_group_norm") model = nn.GroupNorm(group_features[0], group_features[1], eps=eps, affine=affine) self.run_compare_torch( (6, group_features[1], 5, 5), model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, group_features, eps, affine", itertools.product( compute_units, backends, frontends, [(16, 32), (1, 1)], [0.1, 1e-05], [True, False] ), ) def test_groupnorm_rank3_input( self, compute_unit, backend, frontend, group_features, eps, affine ): if frontend == TorchFrontend.EXECUTORCH: pytest.skip("ExecuTorch uses native_group_norm") model = nn.GroupNorm(group_features[0], group_features[1], eps=eps, affine=affine) self.run_compare_torch( (6, group_features[1], 5), model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) 
@pytest.mark.parametrize( "compute_unit, backend, frontend, group_features, eps, affine", itertools.product( compute_units, backends, frontends, [(16, 32), (1, 1)], [0.1, 1e-05], [True, False] ), ) def test_groupnorm_rank2_input( self, compute_unit, backend, frontend, group_features, eps, affine ): if frontend == TorchFrontend.EXECUTORCH: pytest.skip("ExecuTorch uses native_group_norm") model = nn.GroupNorm(group_features[0], group_features[1], eps=eps, affine=affine) self.run_compare_torch( (4, group_features[1]), model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, group_features, eps, affine", itertools.product( compute_units, backends, frontends, [(16, 32), (1, 1)], [0.1, 1e-05], [True, False] ), ) def test_groupnorm_dynamic(self, compute_unit, backend, frontend, group_features, eps, affine): if frontend == TorchFrontend.EXECUTORCH: pytest.skip("ExecuTorch uses native_group_norm") model = nn.GroupNorm(group_features[0], group_features[1], eps=eps, affine=affine) dim_upper_bound = 30 if backend[0] == "mlprogram" else -1 converter_input_type = [ TensorType( shape=( 6, group_features[1], RangeDim(default=10, lower_bound=5, upper_bound=dim_upper_bound), RangeDim(default=10, lower_bound=5, upper_bound=dim_upper_bound), ), dtype=np.float32, ) ] self.run_compare_torch( (6, group_features[1], 10, 10), model, frontend=frontend, backend=backend, compute_unit=compute_unit, converter_input_type=converter_input_type, ) class TestLinear(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product(compute_units, backends, frontends), ) def test_linear_fp16(self, compute_unit, backend, frontend): class Model(nn.Module): def __init__(self): super().__init__() self.fc = nn.Linear(4, 4, dtype=torch.float16) def forward(self, x): return self.fc(x) model = Model() self.run_compare_torch( torch.randn(4, 4, dtype=torch.float16), model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, minimum_deployment_target=ct.target.iOS16, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, in_features, out_features, bias", itertools.product( compute_units, backends, frontends, [5], [10], [True, False], ), ) def test_linear_rank1_input( self, compute_unit, backend, frontend, in_features, out_features, bias ): model = nn.Linear(in_features, out_features, bias=bias) self.run_compare_torch( (in_features,), model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, in_features, out_features, bias", itertools.product(compute_units, backends, frontends, [10, 25], [3, 6], [True, False]), ) def test_linear_rank2_input( self, compute_unit, backend, frontend, in_features, out_features, bias ): model = nn.Linear(in_features, out_features, bias=bias) self.run_compare_torch( (1, in_features), model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, in_features, out_features, bias", itertools.product(compute_units, backends, frontends, [10], [6], [True, False]), ) def test_linear_rank3_input( self, compute_unit, backend, frontend, in_features, out_features, bias ): model = nn.Linear(in_features, out_features, bias=bias) self.run_compare_torch( (1, 3, in_features), model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, in_features, out_features, bias", 
itertools.product(compute_units, backends, frontends, [10], [6], [True, False]), ) def test_linear_rank4_input( self, compute_unit, backend, frontend, in_features, out_features, bias ): model = nn.Linear(in_features, out_features, bias=bias) self.run_compare_torch( (1, 5, 3, in_features), model, compute_unit=compute_unit, backend=backend, frontend=frontend, ) class TestConv(TorchBaseTest): @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "frontend", "padding", "stride", "length", "in_channels", "out_channels", "kernel_size", "dilation", "bias", ] ), [ (compute_unit, backend, frontend, padding, stride, *param) for compute_unit, backend, frontend, padding, stride, param in itertools.product( [ct.ComputeUnit.CPU_ONLY], backends, frontends, ["same", "valid", 0, 1], [1, 2, 3], [ (5, 1, 1, 1, 1, True), (3, 1, 1, 1, 3, False), (4, 3, 3, 2, 1, True), (7, 3, 3, 1, 1, False), (5, 3, 3, 1, 1, True), (3, 3, 3, 1, 1, False), (3, 3, 3, 1, 3, True), (7, 3, 3, 2, 3, False), ], ) ], ) def test_convolution1d( self, compute_unit, backend, frontend, padding, stride, length, in_channels, out_channels, kernel_size, dilation, bias, ): if padding == "same" and stride != 1: # configuration not supported return model = nn.Conv1d( in_channels=in_channels, out_channels=out_channels, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, bias=bias, ) self.run_compare_torch( (1, in_channels, length), model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "frontend", "padding", "stride", "height", "width", "in_channels", "out_channels", "kernel_size", "dilation", "bias", ] ), [ (compute_unit, backend, frontend, padding, stride, *param) for compute_unit, backend, frontend, padding, stride, param in itertools.product( [ct.ComputeUnit.CPU_ONLY], backends, frontends, ["same", "valid", 1, 0], [1, 2, 3], [ (5, 3, 1, 1, 1, 1, True), (3, 3, 1, 1, 1, 3, False), (4, 3, 3, 3, 2, 1, True), (7, 3, 3, 3, 1, 1, False), (5, 5, 3, 3, 1, 1, True), (3, 5, 3, 3, 1, 1, False), (3, 5, 3, 3, 1, 3, True), (7, 5, 3, 3, 2, 3, False), ], ) ], ) def test_convolution2d( self, compute_unit, backend, frontend, padding, stride, height, width, in_channels, out_channels, kernel_size, dilation, bias, ): if padding == "same" and stride != 1: return model = nn.Conv2d( in_channels=in_channels, out_channels=out_channels, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, bias=bias, ) self.run_compare_torch( (1, in_channels, height, width), model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "frontend", "padding", "stride", "depth", "height", "width", "in_channels", "out_channels", "kernel_size", "dilation", "bias", ] ), [ (compute_unit, backend, frontend, padding, stride, *param) for compute_unit, backend, frontend, padding, stride, param in itertools.product( [ct.ComputeUnit.CPU_ONLY], backends, frontends, ["same", "valid", 1, 0], [1, 2, 3], [ (5, 3, 2, 1, 1, 1, 1, True), (3, 3, 1, 1, 1, 1, 3, False), (4, 3, 3, 3, 3, 2, 1, True), (7, 3, 4, 3, 3, 1, 1, False), (5, 5, 3, 3, 3, 1, 1, True), (3, 5, 1, 3, 3, 1, 1, False), (3, 5, 4, 3, 3, 1, 3, True), (7, 5, 6, 3, 3, 2, 3, False), ], ) ], ) def test_convolution3d( self, compute_unit, backend, frontend, padding, stride, depth, height, width, in_channels, out_channels, kernel_size, dilation, bias, ): if padding == "same" and stride != 1: return model = nn.Conv3d( in_channels=in_channels, 
out_channels=out_channels, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, bias=bias, ) self.run_compare_torch( (1, in_channels, depth, height, width), model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) class TestDynamicConv(TorchBaseTest): @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "frontend", "width", "in_channels", "out_channels", "kernel_size", "stride", "padding", ] ), [ (compute_unit, backend, frontend, *param) for compute_unit, backend, frontend, param in itertools.product( compute_units, backends, frontends, [ (5, 1, 1, 1, 2, 1), (3, 1, 1, 1, 2, 3), (4, 3, 3, 1, 2, 1), (7, 3, 3, 1, 3, 1), (5, 3, 3, 2, 2, 1), (3, 3, 3, 1, 3, 1), (3, 3, 3, 1, 3, 3), (7, 3, 3, 3, 1, 3), ], ) ], ) def test_convolution1d( self, compute_unit, backend, frontend, width, in_channels, out_channels, kernel_size, stride, padding, groups=1, ): class DynamicConv(nn.Module): def forward(self, input_data, weights): return nn.functional.conv1d(input_data, weights, stride=stride, padding=padding) model = DynamicConv() input_shape = [ (1, in_channels, width), (out_channels, int(in_channels / groups), kernel_size), ] self.run_compare_torch( input_shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "frontend", "height", "width", "in_channels", "out_channels", "kernel_size", "stride", "padding", ] ), [ (compute_unit, backend, frontend, *param) for compute_unit, backend, frontend, param in itertools.product( compute_units, backends, frontends, [ (5, 3, 1, 1, 1, 2, 0), (3, 3, 1, 1, 1, 2, 1), (4, 3, 3, 3, 1, 2, 0), (7, 3, 3, 3, 1, 3, 0), (5, 5, 3, 3, 2, 1, 0), (3, 5, 3, 3, 1, 3, 0), (3, 5, 3, 3, 1, 3, 1), (7, 5, 3, 3, 2, 3, 1), ], ) ], ) def test_convolution2d( self, compute_unit, backend, frontend, height, width, in_channels, out_channels, kernel_size, stride, padding, groups=1, ): class DynamicConv(nn.Module): def forward(self, input_data, weights): return nn.functional.conv2d(input_data, weights, stride=stride, padding=padding) model = DynamicConv() input_shape = [ (1, in_channels, height, width), (out_channels, int(in_channels / groups), kernel_size, kernel_size), ] self.run_compare_torch( input_shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend ) class TestConvTranspose(TorchBaseTest): @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "frontend", "width", "in_channels", "out_channels", "kernel_size", "stride", "padding", "dilation", ] ), [ (compute_unit, backend, frontend, *param) for compute_unit, backend, frontend, param in itertools.product( compute_units, backends, frontends, [ (3, 1, 1, 1, 2, 0, 1), (3, 1, 1, 1, 2, 1, 3), (3, 3, 3, 1, 2, 0, 1), (3, 3, 3, 1, 3, 0, 1), (5, 3, 3, 1, 3, 0, 1), (5, 3, 3, 1, 3, 0, 1), (5, 3, 3, 1, 3, 1, 3), (5, 3, 3, 1, 3, 1, 3), ], ) ], ) def test_convolution_transpose1d( self, compute_unit, backend, frontend, width, in_channels, out_channels, kernel_size, stride, padding, dilation, groups=1, ): model = nn.ConvTranspose1d( in_channels=in_channels, out_channels=out_channels, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, groups=groups, ) self.run_compare_torch( (1, in_channels, width), model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "frontend", "height", "width", "in_channels", "out_channels", "kernel_size", "stride", "padding", "dilation", ] ), [ (compute_unit, backend, 
frontend, *param) for compute_unit, backend, frontend, param in itertools.product( compute_units, backends, frontends, [ (5, 5, 1, 1, 1, 2, 0, 1), (5, 5, 1, 1, 1, 2, 1, 3), (5, 5, 3, 3, 1, 2, 0, 1), (5, 5, 3, 3, 1, 3, 0, 1), (6, 5, 3, 3, 1, 3, 0, 1), (6, 5, 3, 3, 1, 3, 0, 1), (6, 5, 3, 3, 1, 3, 1, 3), (6, 5, 3, 3, 1, 3, 1, 3), ], ) ], ) def test_convolution_transpose2d( self, compute_unit, backend, frontend, height, width, in_channels, out_channels, kernel_size, stride, padding, dilation, ): model = nn.ConvTranspose2d( in_channels=in_channels, out_channels=out_channels, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, ) self.run_compare_torch( (1, in_channels, height, width), model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, dynamic_input", itertools.product( compute_units, backends, frontends, [True, False], ), ) def test_convolution_transpose2d_dynamic_input( self, compute_unit, backend, frontend, dynamic_input, ): in_channels = 5 model = nn.ConvTranspose2d( in_channels=in_channels, out_channels=10, kernel_size=3, stride=2, padding=1, dilation=3, ) in_height = 256 in_width = 512 input_shape = (1, in_channels, in_height, in_width) converter_input_type = None if dynamic_input: upper_bound = 4096 if backend[0] == "mlprogram" else -1 converter_input_type = [ TensorType( shape=(1, in_channels, RangeDim(256, upper_bound), RangeDim(256, upper_bound)), dtype=np.float32, ) ] self.run_compare_torch( input_shape, model, converter_input_type=converter_input_type, compute_unit=compute_unit, backend=backend, frontend=frontend, ) @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "frontend", "height", "width", "in_channels", "out_channels", "kernel_size", "stride", "padding", "dilation", "output_padding", ] ), [ (compute_unit, backend, frontend, *param) for compute_unit, backend, frontend, param in itertools.product( compute_units, backends, frontends, [ (5, 5, 1, 1, 1, 2, 1, 1, 1), (5, 5, 1, 1, 1, 2, 2, 3, 2), (5, 5, 3, 3, 1, 2, 0, 1, 0), (5, 5, 3, 3, 1, 3, 1, 1, 1), (6, 5, 3, 3, 1, 3, 2, 1, 2), (6, 5, 3, 3, 1, 3, 1, 1, 1), (6, 5, 3, 3, 1, 3, 2, 3, 2), (6, 5, 3, 3, 1, 3, 3, 3, 3), ], ) ], ) def test_convolution_transpose2d_output_padding( self, compute_unit, backend, frontend, height, width, in_channels, out_channels, kernel_size, stride, padding, dilation, output_padding, ): # Output padding must be less than either stride or dilation # Skip testing invalid combinations if isinstance(output_padding, int): if output_padding >= stride and output_padding >= dilation: return elif isinstance(output_padding, tuple): for _output_padding in output_padding: if _output_padding >= stride and _output_padding >= dilation: return model = nn.ConvTranspose2d( in_channels=in_channels, out_channels=out_channels, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, output_padding=output_padding, ) self.run_compare_torch( (1, in_channels, height, width), model, compute_unit=compute_unit, backend=backend, frontend=frontend, ) @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "frontend", "depth", "height", "width", "in_channels", "out_channels", "kernel_size", "stride", "padding", "dilation", ] ), [ (compute_unit, backend, frontend, *param) for compute_unit, backend, frontend, param in itertools.product( compute_units, backends, frontends, [ (3, 5, 5, 1, 1, 1, 2, 0, 1), (3, 5, 5, 1, 1, 1, 2, 1, 3), (3, 5, 5, 3, 3, 1, 2, 0, 1), (3, 5, 5, 3, 3, 1, 1, 0, 2), (4, 6, 
5, 3, 3, 1, 3, 0, 1), (4, 6, 5, 3, 3, 1, 3, 1, 2), (4, 6, 5, 3, 3, 1, 3, 1, 3), ], ) ], ) def test_convolution_transpose3d( self, compute_unit, backend, frontend, depth, height, width, in_channels, out_channels, kernel_size, stride, padding, dilation, ): model = nn.ConvTranspose3d( in_channels=in_channels, out_channels=out_channels, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, ) self.run_compare_torch( (1, in_channels, depth, height, width), model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) def _is_float_value(x, threshold=0.001): return x - np.floor(x) > threshold class TestUpsample(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, output_size, align_corners", itertools.product( compute_units, backends, [1, 3, 10, 190], [True, False], ), ) def test_upsample_linear1d_with_output_size( self, compute_unit, backend, output_size, align_corners ): input_shape = (1, 3, 10) output_size = 3 model = ModuleWrapper( nn.functional.interpolate, { "size": output_size, "mode": "linear", "align_corners": align_corners, }, ) self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend, scale, align_corners, recompute_scale_factor", itertools.product( compute_units, backends, [2, 0.5, 5.3], [True, False], [True, False] ), ) def test_upsample_linear1d_with_scales( self, compute_unit, backend, scale, align_corners, recompute_scale_factor ): Height = 8 input_shape = (1, 3, Height) output_h = Height * scale is_h_float = _is_float_value(output_h) if is_h_float and not align_corners and not recompute_scale_factor: pytest.xfail("rdar://81124053 (Support recompute_scale_factor)") model = ModuleWrapper( nn.functional.interpolate, { "scale_factor": scale, "mode": "linear", "align_corners": align_corners, "recompute_scale_factor": recompute_scale_factor, }, ) self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend, scales, align_corners, recompute_scale_factor", itertools.product( compute_units, backends, [2, 0.7, 3.6], [True, False], [True, False] ), ) def test_upsample_linear1d_with_scales_dynamic( self, compute_unit, backend, scales, align_corners, recompute_scale_factor ): is_float = _is_float_value(scales) input_shape = (1, 3, 22) if is_float and not align_corners and not recompute_scale_factor: pytest.xfail("rdar://81124053 (Support recompute_scale_factor)") model = ModuleWrapper( nn.functional.interpolate, { "scale_factor": scales, "mode": "linear", "align_corners": align_corners, "recompute_scale_factor": recompute_scale_factor, }, ) converter_input_type = [ TensorType( shape=( 1, 3, RangeDim(default=22, upper_bound=22 if backend[0] == "mlprogram" else -1), ), dtype=np.float32, ) ] mlmodel = self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, converter_input_type=converter_input_type, )[1] # also check if the scale factor are integers if backend[0] == "neuralnetwork" and not is_float: for layer in mlmodel._spec.neuralNetwork.layers: if layer.WhichOneof("layer") == "upsample": assert len(layer.upsample.fractionalScalingFactor) == 0 @pytest.mark.parametrize( "compute_unit, backend, output_size", itertools.product( compute_units, backends, [10, 170], ), ) def test_upsample_nearest1d_with_output_size( self, compute_unit, backend, output_size ): input_shape = (1, 3, 10) model = ModuleWrapper( nn.functional.interpolate, {"size": output_size, "mode": 
"nearest"}, ) self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, scales", itertools.product(compute_units, backends, [2, 3, 4.5]), ) def test_upsample_nearest1d_with_scales(self, compute_unit, backend, scales): if backend[0] == "neuralnetwork": if isinstance(scales, float): return # Skip fractional scale factors tests for neuralnetwork input_shape = (1, 3, 10) model = ModuleWrapper( nn.functional.interpolate, {"scale_factor": scales, "mode": "nearest"}, ) self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, scales", itertools.product(compute_units, backends, [2, 3]), ) def test_upsample_nearest1d_with_scales_dynamic( self, compute_unit, backend, scales ): input_shape = (1, 3, 10) model = ModuleWrapper( nn.functional.interpolate, { "scale_factor": scales, "mode": "nearest", "recompute_scale_factor": True, }, ) converter_input_type = [ TensorType( shape=(1, 3, RangeDim(upper_bound=10 if backend[0] == "mlprogram" else -1)), dtype=np.float32, ) ] mlmodel = self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, converter_input_type=converter_input_type, )[1] # also check if the scale factor are integers if backend[0] == "neuralnetwork": for layer in mlmodel._spec.neuralNetwork.layers: if layer.WhichOneof("layer") == "upsample": assert len(layer.upsample.fractionalScalingFactor) == 0 @pytest.mark.parametrize( "compute_unit, backend, output_size, align_corners", itertools.product( compute_units, backends, [ (10, 10), # PyTorch has a bug for the following parameter: # (1, 1), # See: https://github.com/pytorch/pytorch/issues/71188 (2, 3), (190, 170), ], [True, False], ), ) def test_upsample_bilinear2d_with_output_size( self, compute_unit, backend, output_size, align_corners ): input_shape = (1, 3, 10, 10) model = ModuleWrapper( nn.functional.interpolate, { "size": output_size, "mode": "bilinear", "align_corners": align_corners, }, ) self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, scales_h, scales_w, align_corners, recompute_scale_factor", itertools.product( compute_units, backends, [2, 0.5, 4.1], [3, 0.5, 5.3], [True, False], [True, False], ), ) def test_upsample_bilinear2d_with_scales( self, compute_unit, backend, scales_h, scales_w, align_corners, recompute_scale_factor, ): Height = 8 Width = 22 input_shape = (1, 3, Height, Width) output_h = Height * scales_h output_w = Width * scales_w is_h_float = _is_float_value(output_h) is_w_float = _is_float_value(output_w) if ( (is_h_float or is_w_float) and not align_corners and not recompute_scale_factor ): pytest.xfail("rdar://81124053 (Support recompute_scale_factor)") model = ModuleWrapper( nn.functional.interpolate, { "scale_factor": (scales_h, scales_w), "mode": "bilinear", "align_corners": align_corners, "recompute_scale_factor": recompute_scale_factor, }, ) self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, output_size", itertools.product( compute_units, backends, [(10, 10), (190, 170)], ), ) def test_upsample_nearest2d_with_output_size( self, compute_unit, backend, output_size ): input_shape = (1, 3, 10, 10) model = ModuleWrapper( nn.functional.interpolate, {"size": output_size, "mode": "nearest"}, ) self.run_compare_torch( input_shape, model, backend=backend, 
compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, scales_h, scales_w", itertools.product(compute_units, backends, [2, 3, 4.5], [4, 5, 5.5]), ) def test_upsample_nearest2d_with_scales( self, compute_unit, backend, scales_h, scales_w ): if backend[0] == "neuralnetwork": if isinstance(scales_h, float) or isinstance(scales_w, float): return # Skip fractional scale factors tests for neuralnetwork input_shape = (1, 3, 10, 10) model = ModuleWrapper( nn.functional.interpolate, {"scale_factor": (scales_h, scales_w), "mode": "nearest"}, ) self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend, scales_h, scales_w", itertools.product(compute_units, backends, [2, 3], [4, 5]), ) def test_upsample_nearest2d_with_scales_dynamic( self, compute_unit, backend, scales_h, scales_w ): input_shape = (1, 3, 10, 10) model = ModuleWrapper( nn.functional.interpolate, { "scale_factor": (scales_h, scales_w), "mode": "nearest", "recompute_scale_factor": True, }, ) upper_bound = 10 if backend[0] == "mlprogram" else -1 converter_input_type = [ TensorType( shape=(1, 3, RangeDim(upper_bound=upper_bound), RangeDim(upper_bound=upper_bound)), dtype=np.float32, ) ] mlmodel = self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, converter_input_type=converter_input_type, )[1] # also check if the scale factor are integers if backend[0] == "neuralnetwork": for layer in mlmodel._spec.neuralNetwork.layers: if layer.WhichOneof("layer") == "upsample": assert len(layer.upsample.fractionalScalingFactor) == 0 @pytest.mark.parametrize( "compute_unit, backend, scales_h, scales_w, align_corners, recompute_scale_factor", itertools.product( compute_units, backends, [2, 3.6], [4, 0.7], [True, False], [True, False], ), ) def test_upsample_bilinear2d_with_scales_dynamic( self, compute_unit, backend, scales_h, scales_w, align_corners, recompute_scale_factor, ): is_h_float = _is_float_value(scales_h) is_w_float = _is_float_value(scales_w) input_shape = (1, 3, 9, 22) if ( (is_h_float or is_w_float) and not align_corners and not recompute_scale_factor ): pytest.xfail("rdar://81124053 (Support recompute_scale_factor)") model = ModuleWrapper( nn.functional.interpolate, { "scale_factor": (scales_h, scales_w), "mode": "bilinear", "align_corners": align_corners, "recompute_scale_factor": recompute_scale_factor, }, ) dim_upper_bound = 30 if backend[0] == "mlprogram" else -1 converter_input_type = [ TensorType( shape=( 1, 3, RangeDim(default=9, upper_bound=dim_upper_bound), RangeDim(default=22, upper_bound=dim_upper_bound), ), dtype=np.float32, ) ] mlmodel = self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, converter_input_type=converter_input_type, )[1] # also check if the scale factor are integers if backend[0] == "neuralnetwork" and not is_h_float and not is_w_float: for layer in mlmodel._spec.neuralNetwork.layers: if layer.WhichOneof("layer") == "upsample": assert len(layer.upsample.fractionalScalingFactor) == 0 class TestEmpty(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, shape", itertools.product( compute_units, backends, COMMON_SHAPES, ), ) def test_empty_like(self, compute_unit, backend, shape): class TestModel(nn.Module): def forward(self, x): y = torch.empty_like(x) # Value of y is Nondeterministic, so return length return torch.Tensor([len(y)]) self.run_compare_torch(shape, TestModel(), backend=backend, compute_unit=compute_unit) 
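# Like empty_like above, new_empty returns uninitialized memory, so the test only compares the length of the result.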
@pytest.mark.parametrize( "compute_unit, backend, shape", itertools.product( compute_units, backends, COMMON_SHAPES, ), ) def test_new_empty(self, compute_unit, backend, shape): class TestModel(nn.Module): def forward(self, _): tensor = torch.ones(()) y = tensor.new_empty(shape) # Value of y is Nondeterministic, so return length return torch.Tensor([len(y)]) self.run_compare_torch( shape, TestModel(), backend=backend, compute_unit=compute_unit, ) class TestAvgPool(TorchBaseTest): @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "input_shape", "kernel_size", "stride", "padding", "ceil_mode", "include_pad", ] ), [ (compute_unit, backend, *param) for compute_unit, backend, param in itertools.product( compute_units, backends, [ ((1, 3, 5), 1, 1, 0, True, True), ((1, 3, 5), 3, 1, 0, False, True), ((1, 3, 5), 1, 2, 1, False, False), ((1, 3, 5), 3, 2, 1, False, True), ((1, 3, 5), 1, 2, 0, False, True), ((1, 3, 10), 1, 1, 1, False, False), ((1, 3, 10), 3, 1, 0, False, False), ((1, 3, 10), 1, 2, 1, True, True), ((1, 3, 10), 3, 2, 0, True, False), ((1, 3, 10), 1, 1, 1, True, True), ], ) ], ) def test_avg_pool1d( self, compute_unit, backend, input_shape, kernel_size, stride, padding, ceil_mode, include_pad, ): if padding > kernel_size / 2: return model = nn.AvgPool1d( kernel_size, stride, padding, ceil_mode=ceil_mode, count_include_pad=include_pad, ) self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "input_shape", "kernel_size", "stride", "padding", "ceil_mode", "include_pad", ] ), [ (compute_unit, backend, *param) for compute_unit, backend, param in itertools.product( compute_units, backends, [ ((1, 3, 5, 5), 1, 1, 0, True, True), ((1, 3, 5, 5), 3, 1, 0, False, True), ((1, 3, 5, 5), 1, 2, 1, False, False), ((1, 3, 5, 5), 3, 2, 1, False, True), ((1, 3, 5, 5), 1, 2, 0, False, True), ((1, 3, 10, 10), 1, 1, 1, False, False), ((1, 3, 10, 10), 3, 1, 0, False, False), ((1, 3, 10, 10), 1, 2, 1, True, True), ((1, 3, 10, 10), 3, 2, 0, True, False), ((1, 3, 10, 10), 1, 1, 1, True, True), ], ) ], ) def test_avg_pool2d( self, compute_unit, backend, input_shape, kernel_size, stride, padding, ceil_mode, include_pad, ): if padding > kernel_size / 2: return model = nn.AvgPool2d( kernel_size, stride, padding, ceil_mode=ceil_mode, count_include_pad=include_pad, ) self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "input_shape", "kernel_size", "stride", "padding", "ceil_mode", "include_pad", ] ), [ (compute_unit, backend, *param) for compute_unit, backend, param in itertools.product( compute_units, backends, [ ((1, 3, 11, 5, 5), 1, 1, 0, True, True), ((1, 3, 11, 5, 5), 3, 1, 0, False, True), ((1, 3, 11, 5, 5), 1, 2, 1, False, False), ((1, 3, 11, 5, 5), 3, 2, 1, False, True), ((1, 3, 11, 5, 5), 1, 2, 0, False, True), ((1, 3, 6, 10, 10), 1, 1, 1, False, False), ((1, 3, 6, 10, 10), 3, 1, 0, False, False), ((1, 3, 6, 10, 10), 1, 2, 1, True, True), ((1, 3, 6, 10, 10), 3, 2, 0, True, False), ((1, 3, 6, 10, 10), 1, 1, 1, True, True), ], ) ], ) def test_avg_pool3d( self, compute_unit, backend, input_shape, kernel_size, stride, padding, ceil_mode, include_pad, ): if padding > kernel_size / 2: return if include_pad and ceil_mode and stride > 1: # skip: MIL/CoreML does not support this configuration pytest.xfail( "rdar://73723194 (Support 3D Avg pooling with ceil_mode=True and include_pad = True, in MIL)" ) 
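# The remaining (supported) configurations are converted and compared end-to-end below.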
model = nn.AvgPool3d( kernel_size, stride, padding, ceil_mode=ceil_mode, count_include_pad=include_pad, ) self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit ) class TestAdaptiveMaxPool(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, input_shape, output_size", itertools.product( compute_units, backends, [(1, 64, 8), (20, 10)], [3, 5] ) ) def test_adaptive_max_pool1d(self, compute_unit, backend, input_shape, output_size): model = nn.AdaptiveMaxPool1d(output_size) self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, output_size, magnification, delta, depth, n", itertools.product( compute_units, backends, [(1, 1), (3, 2)], [1, 2, 7], [0, 11], [1, 2, 3], [1, 2], ), ) def test_adaptive_max_pool2d( self, compute_unit, backend, output_size, magnification, delta, depth, n ): # input_size = output_size * magnification + delta input_size = ( delta + magnification * output_size[0], delta + magnification * output_size[1], ) in_shape = (n, depth) + input_size model = nn.AdaptiveMaxPool2d(output_size) self.run_compare_torch( in_shape, model, backend=backend, compute_unit=compute_unit, ) class TestAdaptiveAvgPool(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, input_shape, output_size", itertools.product( compute_units, backends, [(1, 64, 8), (20, 10)], [3, 5] ) ) def test_adaptive_avg_pool1d(self, compute_unit, backend, input_shape, output_size): model = nn.AdaptiveAvgPool1d(output_size) self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, output_size, magnification, delta, depth, n", itertools.product( compute_units, backends, [(1, 1), (3, 2)], [1, 2, 7], [0, 11], [1, 2, 3], [1, 2], ), ) def test_adaptive_avg_pool2d( self, compute_unit, backend, output_size, magnification, delta, depth, n ): # input_size = output_size * magnification + delta input_size = ( delta + magnification * output_size[0], delta + magnification * output_size[1], ) in_shape = (n, depth) + input_size model = nn.AdaptiveAvgPool2d(output_size) self.run_compare_torch( in_shape, model, backend=backend, compute_unit=compute_unit, ) class TestMaxPool(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, input_shape, kernel_size, stride, padding, ceil_mode", itertools.product( compute_units, backends, frontends, [(1, 3, 15), (1, 1, 7)], [1, 3], [1, 2], [0, 1], [True, False], ), ) def test_max_pool1d( self, compute_unit, backend, frontend, input_shape, kernel_size, stride, padding, ceil_mode, ): if padding > kernel_size / 2: return if ceil_mode > 0 and padding == 0 and kernel_size == 1 and stride == 2: if input_shape[-1] % 2 == 0: # TODO: is this a valid case?
# in this case, torch adds "-inf" values at the border, post max pool operation return model = nn.MaxPool1d( kernel_size, stride, padding, dilation=1, return_indices=False, ceil_mode=ceil_mode, ) self.run_compare_torch( input_shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend, frontend, input_shape, kernel_size, stride, padding, ceil_mode", itertools.product( compute_units, backends, frontends, [(1, 3, 15, 15), (1, 1, 7, 7)], [1, 3], [1, 2], [0, 1], [True, False], ), ) def test_max_pool2d( self, compute_unit, backend, frontend, input_shape, kernel_size, stride, padding, ceil_mode, ): if padding > kernel_size / 2: return if ceil_mode > 0 and padding == 0 and kernel_size == 1 and stride == 2: for r in range(2, 4): if input_shape[r] % 2 == 0: # TODO: is this a valid case? # in this case, torch adds "-inf" values at the border, post max pool operation return model = nn.MaxPool2d( kernel_size, stride, padding, dilation=1, return_indices=False, ceil_mode=ceil_mode, ) self.run_compare_torch( input_shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend, frontend, input_shape, kernel_size, stride, padding, ceil_mode", itertools.product( compute_units, backends, frontends, [(1, 3, 11, 3, 11), (1, 1, 7, 4, 7)], [1, 3], [1, 2], [0, 1], [True, False], ), ) def test_max_pool3d( self, compute_unit, backend, frontend, input_shape, kernel_size, stride, padding, ceil_mode, ): if frontend in TORCH_EXPORT_BASED_FRONTENDS: pytest.xfail("TODO (rdar://115846125): handle multi-output op max_pool3d_with_indices") if padding > kernel_size / 2: return if ceil_mode > 0 and padding == 0 and kernel_size == 1 and stride == 2: for r in range(2, 5): if input_shape[r] % 2 == 0: # TODO: is this a valid case? 
# in this case, torch adds "-inf" values at the border, post max pool operation return model = nn.MaxPool3d( kernel_size, stride, padding, dilation=1, return_indices=False, ceil_mode=ceil_mode, ) self.run_compare_torch( input_shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit ) class TestMaximumMinimum(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, input_shapes, mode", itertools.product( compute_units, backends, frontends, [ [(2, 5, 7, 3), (2, 5, 7, 3)], [(3, 2, 9), (3, 2, 9)], [(1, 2, 3), (1,)], [(1,), (2, 5, 6, 7)], [(1, 2, 1), (3, 4, 2, 5)], ], ["minimum", "maximum"], ), ) def test_minimum_maximum(self, compute_unit, backend, frontend, input_shapes, mode): class TestModel(torch.nn.Module): def forward(self, x, y): if mode == "minimum": return torch.minimum(x, y) elif mode == "maximum": return torch.maximum(x, y) else: raise ValueError("Unsupported mode: {mode}".format(mode=mode)) self.run_compare_torch( input_shapes, TestModel(), frontend=frontend, backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend, frontend, input_shapes, mode, xdtype, ydtype", itertools.product( compute_units, backends, frontends, [ [(2, 5, 7, 3), (2, 5, 7, 3)], [(3, 2, 9), (3, 2, 9)], [(1, 2, 3), (1,)], [(1,), (2, 5, 6, 7)], [(1, 2, 1), (3, 4, 2, 5)], ], ["minimum", "maximum"], (torch.float16, torch.float32), (torch.float16, torch.float32), ), ) def test_minimum_maximum_mixed_precision( self, compute_unit, backend, frontend, input_shapes, mode, xdtype, ydtype ): class TestModel(torch.nn.Module): def forward(self, x, y): a = x.to(xdtype) b = y.to(ydtype) if mode == "minimum": return torch.minimum(a, b) elif mode == "maximum": return torch.maximum(a, b) else: raise ValueError("Unsupported mode: {mode}".format(mode=mode)) self.run_compare_torch( input_shapes, TestModel(), frontend=frontend, compute_unit=compute_unit, backend=backend, rtol=1e-6 if xdtype == ydtype and xdtype == torch.float32 else 1e-3, atol=1e-6 if xdtype == ydtype and xdtype == torch.float32 else 1e-3, ) class TestAMaxAMin(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, input_shapes, mode, reduce_dim, keepdim", itertools.product( compute_units, backends, frontends, [ [(2, 5, 7, 3)], [(3, 2, 9)], [(1,)], ], ["minimum", "maximum"], [0, 1, 2, 3, [0, 1], [0, 1, 2], [0, 1, 2, 3]], [True, False], ), ) def test_minimum_maximum( self, compute_unit, backend, frontend, input_shapes, mode, reduce_dim, keepdim ): class TestModel(torch.nn.Module): def forward(self, input): if type(reduce_dim) == int: reduce_dim_clamped = min(input.dim() - 1, reduce_dim) else: reduce_dim_clamped = reduce_dim[: input.dim()] if mode == "minimum": return torch.amin(input, reduce_dim_clamped, keepdim) elif mode == "maximum": return torch.amax(input, reduce_dim_clamped, keepdim) else: raise ValueError("Unsupported mode: {mode}".format(mode=mode)) model = TestModel() self.run_compare_torch( input_shapes, model, compute_unit=compute_unit, backend=backend, frontend=frontend ) class TestPoolSymbolicInput(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product(compute_units, backends, frontends), ) def test_max_pool(self, compute_unit, backend, frontend): model = nn.MaxPool2d( kernel_size=1, stride=2, padding=0, dilation=1, ceil_mode=True, ) input_shape = (1, 1, 11, 11) upper_bound = 20 if backend[0] == "mlprogram" else -1 converter_input_type = [ TensorType( shape=(1, 1, RangeDim(upper_bound=upper_bound), 
RangeDim(upper_bound=upper_bound)), dtype=np.float32, ) ] self.run_compare_torch( input_shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, converter_input_type=converter_input_type, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product(compute_units, backends, frontends), ) def test_avg_pool(self, compute_unit, backend, frontend): model = nn.AvgPool2d( kernel_size=2, stride=2, padding=1, count_include_pad=True, ceil_mode=True, ) input_shape = (1, 2, 15, 15) upper_bound = 20 if backend[0] == "mlprogram" else -1 converter_input_type = [ TensorType( shape=(1, 2, RangeDim(upper_bound=upper_bound), RangeDim(upper_bound=upper_bound)), dtype=np.float32, ) ] self.run_compare_torch( input_shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, converter_input_type=converter_input_type, ) class TestLSTM(TorchBaseTest): @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "input_size", "hidden_size", "num_layers", "bias", "batch_first", "dropout", "bidirectional", ] ), [ (compute_unit, backend, *param) for compute_unit, backend, param in itertools.product( compute_units, backends, [ (1, 1, 1, True, True, 0.3, True), (1, 1, 1, False, True, 0.3, False), (1, 1, 1, False, True, 0.3, True), (3, 1, 5, True, False, 0.3, False), (3, 1, 5, True, True, 0.3, True), (3, 7, 5, True, False, 0.3, False), (3, 7, 5, False, True, 0.3, True), (3, 7, 5, False, True, 0.3, False), ], ) ], ) def test_lstm( self, compute_unit, backend, input_size, hidden_size, num_layers, bias, batch_first, dropout, bidirectional, ): model = nn.LSTM( input_size=input_size, hidden_size=hidden_size, num_layers=num_layers, bias=bias, batch_first=batch_first, dropout=dropout, bidirectional=bidirectional, ) SEQUENCE_LENGTH = 3 BATCH_SIZE = 2 model.eval() num_directions = int(bidirectional) + 1 if batch_first: _input = torch.randn(BATCH_SIZE, SEQUENCE_LENGTH, input_size) else: _input = torch.randn(SEQUENCE_LENGTH, BATCH_SIZE, input_size) h0 = torch.randn(num_layers * num_directions, BATCH_SIZE, hidden_size) c0 = torch.randn(num_layers * num_directions, BATCH_SIZE, hidden_size) inputs = (_input, (h0, c0)) expected_results = model(*inputs) self.run_compare_torch( inputs, model, expected_results, input_as_shape=False, backend=backend, ) class TestRNN(TorchBaseTest): @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "input_size", "hidden_size", "num_layers", "bias", "batch_first", "dropout", "activation", ] ), [ (compute_unit, backend, *param) for compute_unit, backend, param in itertools.product( compute_units, backends, [ (1, 1, 1, True, True, 0.3, "tanh"), (1, 1, 1, False, True, 0.3, "relu"), (1, 1, 1, False, True, 0.3, "tanh"), (3, 1, 5, True, False, 0.3, "relu"), (3, 1, 5, True, True, 0.3, "tanh"), (3, 7, 5, True, False, 0.3, "relu"), (3, 7, 5, False, True, 0.3, "relu"), (3, 7, 5, False, True, 0.3, "tanh"), ], ) ], ) def test_rnn( self, compute_unit, backend, input_size, hidden_size, num_layers, bias, batch_first, dropout, activation, ): SEQUENCE_LENGTH = 10 BATCH_SIZE = 3 model = nn.RNN( input_size=input_size, hidden_size=hidden_size, num_layers=num_layers, bias=bias, batch_first=batch_first, dropout=dropout, nonlinearity=activation, bidirectional=False, # bi-directional simple RNN not supported ) model.eval() num_directions = 1 if batch_first: _input = torch.randn(BATCH_SIZE, SEQUENCE_LENGTH, input_size) else: _input = torch.randn(SEQUENCE_LENGTH, BATCH_SIZE, input_size) h0 = torch.randn(num_layers * num_directions, BATCH_SIZE, hidden_size) 
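# nn.RNN takes only an initial hidden state h0; unlike LSTM there is no cell state.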
inputs = (_input, h0) expected_results = model(*inputs) self.run_compare_torch( inputs, model, expected_results, input_as_shape=False, backend=backend, compute_unit=compute_unit, ) class TestGRU(TorchBaseTest): @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "input_size", "hidden_size", "num_layers", "bias", "batch_first", "sequence_length", "bidirectional", ] ), [ (compute_unit, backend, *param) for compute_unit, backend, param in itertools.product( compute_units, backends, [ (1, 1, 1, True, True, 10, True), (1, 1, 1, False, True, 10, True), (1, 1, 1, False, True, 1, False), (3, 1, 5, True, False, 10, False), (3, 1, 5, True, True, 10, True), (3, 7, 5, True, True, 10, False), (3, 7, 5, False, True, 10, True), (3, 7, 5, False, True, 1, True), ], ) ], ) def test_gru( self, compute_unit, backend, input_size, hidden_size, num_layers, bias, batch_first, sequence_length, bidirectional, ): DROPOUT = 0.3 BATCH_SIZE = 3 model = nn.GRU( input_size=input_size, hidden_size=hidden_size, num_layers=num_layers, bias=bias, batch_first=batch_first, dropout=DROPOUT, bidirectional=bidirectional, ) model.eval() num_directions = int(bidirectional) + 1 if batch_first: _input = torch.randn(BATCH_SIZE, sequence_length, input_size) else: _input = torch.randn(sequence_length, BATCH_SIZE, input_size) h0 = torch.randn(num_layers * num_directions, BATCH_SIZE, hidden_size) inputs = (_input, h0) expected_results = model(*inputs) self.run_compare_torch( inputs, model, expected_results, input_as_shape=False, backend=backend, compute_unit=compute_unit, ) class TestLSTMWithPackedSequence(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, pack_batch_first, pad_batch_first, LSTM_batch_first, pad_value", itertools.product( compute_units, backends, [True, False], [True, False], [True, False], [-1, 0], ), ) def test_lstm( self, compute_unit, backend, pack_batch_first, pad_batch_first, LSTM_batch_first, pad_value, ): from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence input_size = 4 hidden_size = 6 num_layers = 1 class Encoder(torch.nn.Module): def __init__(self): super().__init__() self.lstm = torch.nn.LSTM( input_size=input_size, hidden_size=hidden_size, num_layers=num_layers, batch_first=LSTM_batch_first, bidirectional=False, dropout=0.0, ) def forward(self, batch_in, seq_lengths): packed_input = pack_padded_sequence( batch_in, seq_lengths, batch_first=pack_batch_first ) output_packed, (hidden, _) = self.lstm(packed_input) output, _ = pad_packed_sequence( output_packed, padding_value=pad_value, batch_first=pad_batch_first ) return output SEQUENCE_LENGTH = 10 BATCH_SIZE = 3 model = Encoder() model.eval() if pack_batch_first: _input = torch.randn(BATCH_SIZE, SEQUENCE_LENGTH, input_size) else: _input = torch.randn(SEQUENCE_LENGTH, BATCH_SIZE, input_size) seq_lengths = torch.tensor([10, 5, 1], dtype=torch.int32) inputs = (_input, seq_lengths) expected_results = model(*inputs) self.run_compare_torch( inputs, model, expected_results, input_as_shape=False, backend=backend, compute_unit=compute_unit, ) # Workaround for GitHub Issue #824 # i.e. the return h_n/c_n for a converted BLSTM are mangled. # Therefore, just look at output 'y' (for now) which is correct. 
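# StripCellAndHidden below keeps only the sequence output of nn.LSTM (optionally wrapped in a tuple) and drops (h_n, c_n), so two LSTMs can be chained inside nn.Sequential.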
class StripCellAndHidden(nn.Module): def __init__(self, flagReturnTuple_): super(StripCellAndHidden, self).__init__() self.flagReturnTuple = flagReturnTuple_ def forward(self, x): # Pass tuple, not tensor, to avoid issue in coremltools/converters/mil/frontend/torch/test/testing_utils.py on "if not expected_results:" # Pass tensor when we need input for LSTM #2 as part of nn.Sequential() return tuple(x[0]) if self.flagReturnTuple else x[0] # Check GitHub Issue #810, assume num_layers == 2 and bidirectional == True class TestStackedBLSTM(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, input_size, hidden_size, bias, batch_first, dropout", itertools.product( compute_units, backends, [7], [5], [True, False], [True, False], [0.3], ), ) def test_lstm( self, compute_unit, backend, input_size, hidden_size, bias, batch_first, dropout, ): model = nn.Sequential( nn.LSTM( input_size=input_size, hidden_size=hidden_size, num_layers=1, bias=bias, batch_first=batch_first, dropout=dropout, bidirectional=True, ), StripCellAndHidden(False), nn.LSTM( input_size=2 * hidden_size, hidden_size=hidden_size, num_layers=1, bias=bias, batch_first=batch_first, dropout=dropout, bidirectional=True, ), StripCellAndHidden(True), ) SEQUENCE_LENGTH = 3 BATCH_SIZE = 2 # (seq_len, batch, input_size) if batch_first: _input = torch.rand(BATCH_SIZE, SEQUENCE_LENGTH, input_size) else: _input = torch.randn(SEQUENCE_LENGTH, BATCH_SIZE, input_size) # Do not use h_0/c_0 input and do not check h_n/c_n output, GitHub Issue #824 expected_results = model(_input) self.run_compare_torch( _input, model, expected_results, input_as_shape=False, backend=backend, compute_unit=compute_unit, ) class TestConcat(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product(compute_units, backends, frontends) ) def test_cat_basic(self, compute_unit, backend, frontend): class TestNet(nn.Module): def forward(self, x): x = torch.cat((x, x), axis=1) return x model = TestNet() self.run_compare_torch( (1, 2, 3), model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product(compute_units, backends, frontends) ) def test_cat_with_empty(self, compute_unit, backend, frontend): class TestNet(nn.Module): def forward(self, x): return torch.cat((x, torch.tensor([])), axis=1) model = TestNet() self.run_compare_torch( (1, 2, 3), model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product(compute_units, backends, frontends) ) def test_cat_input_types_promotion(self, compute_unit, backend, frontend): if frontend == TorchFrontend.EXECUTORCH: pytest.skip("executorch does not allow mixed dtypes") class TestNet(nn.Module): def forward(self, x, y): return torch.cat((x, y), axis=1) input_data_x = torch.randint(low=0, high=10, size=(2, 3), dtype=torch.int32) input_data_y = torch.rand(2, 3) self.run_compare_torch( [input_data_x, input_data_y], TestNet(), frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) # This tests an edge case where the list of tensors to concatenate only # has one item. NN throws an error for this case, hence why we have to # run through the full conversion process to test it. 
@pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product(compute_units, backends, frontends) ) def test_cat_single_input(self, compute_unit, backend, frontend): class TestNet(nn.Module): def forward(self, x): x = torch.cat((x,), axis=1) return x model = TestNet() self.run_compare_torch( (1, 3, 16, 16), model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product(compute_units, backends, frontends) ) def test_cat_const_fold(self, compute_unit, backend, frontend): class TestNet(nn.Module): def forward(self, x): x = torch.tensor([[[1, 2], [2, 3], [3, 4]]]) return torch.cat((x, x), axis=1) model = TestNet() mlmodel = self.run_compare_torch( (1, 2, 3), model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) prog = mlmodel[1]._mil_program # The `listconstruct` is folded into a single const. assert len(prog.find_ops(op_type="const")) == 1 with patch.object(Var, "_is_nonreplaceable_var") as mocked_is_nonreplaceable_var: # Mock that the input with shape [1, 3, 2] const is non-replaceable. mocked_is_nonreplaceable_var.side_effect = ( lambda var: var.op and var.op.op_type == "const" and var.rank == 3 ) mlmodel = self.run_compare_torch( [(1, 2, 3)], model, frontend=frontend, backend=backend, compute_unit=compute_unit ) prog = mlmodel[1]._mil_program # The `listconstruct` is not folded so there are 3 const ops. assert len(prog.find_ops(op_type="const")) == 3 @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product(compute_units, backends, frontends) ) def test_concat_alias(self, compute_unit, backend, frontend): class Outer(torch.nn.Module): def __init__(self, net): super(Outer, self).__init__() self.net = net def forward(self, x): x = self.net(x) return x class TestNet(nn.Module): def forward(self, x): x = torch.concat((x, x), axis=1) return x # test passes without adding alias if `Outer` is not used model = Outer(TestNet()) self.run_compare_torch( (1, 3, 16, 16), model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) class TestTile(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, dims", itertools.product( compute_units, backends, frontends, [(1, 2, 4), (3, 2), (2,)], ), ) def test_tile(self, compute_unit, backend, frontend, dims): class TestModel(nn.Module): def forward(self, x): return torch.tile(x, dims) self.run_compare_torch( (2, 3, 5), TestModel(), frontend=frontend, backend=backend, compute_unit=compute_unit, ) class TestBitwiseNot(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, input_type", itertools.product( compute_units, backends, frontends, ["int", "bool"], ), ) def test_bitwise_not(self, compute_unit, backend, frontend, input_type): class TestNet(nn.Module): def forward(self, x): return torch.bitwise_not(x) model = TestNet() if input_type == "int": torch_in = torch.tensor([1, 2, 3, -5, 0], dtype=torch.int32) elif input_type == "bool": torch_in = torch.tensor([True, False, True, False]) self.run_compare_torch( torch_in, model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) class TestBoolOps(TorchBaseTest): def _get_inputs(self, input_types): x_type, y_type = input_types if x_type == "int": x = torch.tensor([1, 0, 1, 0], dtype=torch.int32) elif x_type == "bool": x = torch.tensor([1, 0, 1, 0], dtype=torch.bool) if y_type == "int": y = torch.tensor([0, 0, 1, 1], dtype=torch.int32) elif y_type == "bool": y = torch.tensor([0, 0, 1, 1], 
dtype=torch.bool) return (x, y) @pytest.mark.parametrize( "compute_unit, backend, frontend, input_types", itertools.product( compute_units, backends, frontends, [("int", "int"), ("int", "bool"), ("bool", "int"), ("bool", "bool")], ), ) def test_mul_int_or_bool(self, compute_unit, backend, frontend, input_types): class TestMulWithBool(nn.Module): def forward(self, x, y): return x * y x, y = self._get_inputs(input_types) model = TestMulWithBool() self.run_compare_torch( (x, y), model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, input_types", itertools.product( compute_units, backends, frontends, [("int", "int"), ("int", "bool"), ("bool", "int"), ("bool", "bool")], ), ) def test_add_int_or_bool(self, compute_unit, backend, frontend, input_types): class TestAddWithBool(nn.Module): def forward(self, x, y): return x + y x, y = self._get_inputs(input_types) model = TestAddWithBool() self.run_compare_torch( (x, y), model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, x_complex, y_complex", itertools.product( compute_units, backends, frontends, (True, False), (True, False), ), ) def test_add_complex(self, compute_unit, backend, frontend, x_complex, y_complex): if frontend == TorchFrontend.EXECUTORCH and (x_complex or y_complex): pytest.skip("Complex is not aten canonical") class TestAddComplexModel(nn.Module): def forward(self, x, y): if x_complex: x = torch.complex(x, x) if y_complex: y = torch.complex(y, y) return torch.add(x, y).abs() self.run_compare_torch( [(2, 3), (2, 3)], TestAddComplexModel(), compute_unit=compute_unit, backend=backend, frontend=frontend, ) class TestFull(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, backends, [1, 3], ), ) def test_full_dynamic(self, compute_unit, backend, rank): class FullDynamicModel(nn.Module): def forward(self, x): if rank == 1: h = x[0] x = torch.zeros(h) elif rank == 3: h, w, d = x[0], x[1], x[2] x = torch.zeros(h, w, d) return torch.full(x.shape, fill_value=3.14) input_shape = np.random.randint(low=2, high=6, size=rank) torch_in = torch.tensor(input_shape, dtype=torch.int32) model = FullDynamicModel().eval() torch_out = model(torch_in) self.run_compare_torch( torch_in, model, expected_results=torch_out, input_as_shape=False, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, shape_val", itertools.product( compute_units, backends, [ [(1,), 0.0], [(2, 3), 3.1415], [(1, 1, 2, 5, 1), -2.0], ], ), ) def test_full_static(self, compute_unit, backend, shape_val): shape, val = shape_val class FullStaticModel(nn.Module): def forward(self, x): return torch.full(x.shape, fill_value=val) self.run_compare_torch( shape, FullStaticModel().eval(), backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend, shape_val", itertools.product( compute_units, backends, [ [(1,), 0.0], [(2, 3), 3.1415], [(1, 1, 2, 5, 1), -2.0], ], ), ) def test_full_scalar(self, compute_unit, backend, shape_val): shape, val = shape_val class FullScalarModel(nn.Module): def forward(self, x): return x / torch.full([], fill_value=val) self.run_compare_torch( shape, FullScalarModel().eval(), backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend, shape_val", itertools.product( compute_units, [ ["neuralnetwork", "fp32", 
ct.target.iOS14], ["mlprogram", "fp16", ct.target.iOS15], ["mlprogram", "fp32", ct.target.iOS15], ["mlprogram", "fp16", ct.target.iOS16], ["mlprogram", "fp32", ct.target.iOS16], ], [ [(1,), 0.0], [(2, 3), 3.1415], [(1, 1, 2, 5, 1), -2.0], ], ), ) def test_full_like(self, compute_unit, backend, shape_val): if _macos_version() < (13, 0) and backend[2] == ct.target.iOS16: pytest.skip("iOS16 target not available on macOS 13") shape, val = shape_val class FullLikeModel(nn.Module): def forward(self, x): return torch.full_like(x, fill_value=val) self.run_compare_torch( shape, FullLikeModel().eval(), backend=backend[:2], compute_unit=compute_unit, minimum_deployment_target=backend[2], ) class TestDim(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, shape", itertools.product( compute_units, backends, frontends, [ (1,), (2, 3), (1, 1, 2, 5, 1), ], ), ) def test_dim(self, compute_unit, backend, frontend, shape): class DimModel(nn.Module): def forward(self, x): return torch.tensor([x.dim()]) self.run_compare_torch( shape, DimModel().eval(), compute_unit=compute_unit, backend=backend, frontend=frontend ) class TestNewZeros(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, backends, [1, 3], ), ) def test_new_zeros_dynamic(self, compute_unit, backend, rank): class ZerosDynamicModel(nn.Module): def forward(self, x): if rank == 1: h = x[0] x = torch.zeros(h) elif rank == 3: h, w, d = x[0], x[1], x[2] x = torch.zeros(h, w, d) return x.new_zeros(x.shape) input_shape = np.random.randint(low=2, high=6, size=rank) torch_in = torch.tensor(input_shape, dtype=torch.int32) model = ZerosDynamicModel().eval() torch_out = model(torch_in) self.run_compare_torch( torch_in, model, expected_results=torch_out, input_as_shape=False, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, shape", itertools.product( compute_units, backends, [ (1,), (2, 3), (1, 1, 2, 5, 1), ], ), ) def test_new_zeros_static(self, compute_unit, backend, shape): class ZerosStaticModel(nn.Module): def __init__(self): super(ZerosStaticModel, self).__init__() def forward(self, x): return x.new_zeros(x.shape) self.run_compare_torch( shape, ZerosStaticModel().eval(), backend=backend, compute_unit=compute_unit ) class TestNewFull(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, backends, [1, 3], ), ) def test_new_full_dynamic(self, compute_unit, backend, rank): class FullDynamicModel(nn.Module): def forward(self, x): if rank == 1: h = x[0] x = torch.zeros(h) elif rank == 3: h, w, d = x[0], x[1], x[2] x = torch.zeros(h, w, d) return x.new_full(x.shape, fill_value=3.14) input_shape = np.random.randint(low=2, high=6, size=rank) torch_in = torch.tensor(input_shape, dtype=torch.int32) model = FullDynamicModel().eval() torch_out = model(torch_in) self.run_compare_torch( torch_in, model, expected_results=torch_out, input_as_shape=False, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, shape_val", itertools.product( compute_units, backends, [ [(1,), 0.0], [(2, 3), 3.1415], [(1, 1, 2, 5, 1), -2.0], ], ), ) def test_new_full_static(self, compute_unit, backend, shape_val): shape, val = shape_val class FullStaticModel(nn.Module): def forward(self, x): return x.new_full(x.shape, fill_value=val) self.run_compare_torch( shape, FullStaticModel().eval(), backend=backend, compute_unit=compute_unit ) class TestEye(TorchBaseTest): 
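# "single" exercises torch.eye(n) (square identity); "double" exercises torch.eye(n, m) (rectangular).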
@pytest.mark.parametrize( "compute_unit, backend, eye_type", itertools.product( compute_units, backends, ["single", "double"], ), ) def test(self, compute_unit, backend, eye_type): class Model(nn.Module): def forward(self, x): if eye_type == "single": eye = torch.eye(3) return x + eye elif eye_type == "double": eye = torch.eye(2, 3) return x + eye input_shape = (3, 3) if eye_type == "single" else (2, 3) model = Model().eval() self.run_compare_torch(input_shape, model, backend=backend, compute_unit=compute_unit) class TestOnes(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, backends, [1, 3], ), ) def test_ones_dynamic(self, compute_unit, backend, rank): class OnesDynamicModel(nn.Module): def forward(self, x): if rank == 1: h = x[0] x = torch.zeros(h) elif rank == 3: h, w, d = x[0], x[1], x[2] x = torch.zeros(h, w, d) return torch.ones(x.shape) input_shape = np.random.randint(low=2, high=6, size=rank) torch_in = torch.tensor(input_shape, dtype=torch.int32) model = OnesDynamicModel().eval() torch_out = model(torch_in) self.run_compare_torch( torch_in, model, expected_results=torch_out, input_as_shape=False, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, shape", itertools.product( compute_units, backends, [(1,), (2, 3), (1, 1, 2, 5, 1)], ), ) def test_ones_static(self, compute_unit, backend, shape): class OnesStaticModel(nn.Module): def forward(self, x): return torch.ones(x.shape) self.run_compare_torch( shape, OnesStaticModel().eval(), backend=backend, compute_unit=compute_unit ) class TestRandint(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, shape, low, high", itertools.product( compute_units, backends, frontends, [(1,), (2, 3)], [-1, 2], [3, 5], ), ) def test_randint(self, compute_unit, backend, frontend, shape, low, high): class TestModel(nn.Module): def forward(self, x): y = torch.randint(low, high, x.shape) if frontend == TorchFrontend.TORCHSCRIPT: return torch.Tensor([len(y)]) else: return torch.tensor(y.shape) self.run_compare_torch( shape, TestModel(), compute_unit=compute_unit, backend=backend, frontend=frontend, ) @pytest.mark.parametrize("frontend", frontends) def test_tuple_input(self, frontend): if frontend == TorchFrontend.EXECUTORCH: pytest.skip("torch._ops.aten.randint.low is not Aten Canonical") class TestModel(nn.Module): def forward(self, x): return torch.randint(0, 3, (10,)) model = TestModel().eval() x = torch.randn((1, 3, 256, 256)) torch_model = export_torch_model_to_frontend(model, (x,), frontend) inputs = [ct.TensorType(shape=x.shape)] if frontend == TorchFrontend.TORCHSCRIPT else None ct.convert(torch_model, inputs=inputs) class TestRand(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, shape, dtype", itertools.product( compute_units, backends, [(1,), (2, 3)], [None, torch.float16, torch.float32, torch.float64], ), ) def test_rand(self, compute_unit, backend, shape, dtype): class TestModel(nn.Module): def forward(self, x): y = torch.rand(x.shape, dtype=dtype) # can't compare directly (this is random) return torch.stack( [ torch.ones_like(y, dtype=torch.float32), (y >= 0).to(torch.float32), (y < 1).to(torch.float32), ] ) self.run_compare_torch(shape, TestModel(), backend=backend, compute_unit=compute_unit) class TestRandn(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, shape", itertools.product( compute_units, backends, frontends, [(1,), (2, 3)], ), ) def test_randn(self, compute_unit, 
backend, frontend, shape): class TestModel(nn.Module): def forward(self, x): y = torch.randn(*x.shape) if frontend == TorchFrontend.TORCHSCRIPT: return torch.Tensor([len(y)]) else: return torch.tensor(y.shape) self.run_compare_torch( shape, TestModel(), compute_unit=compute_unit, backend=backend, frontend=frontend, ) @pytest.mark.parametrize( "dtype", [torch.complex64, torch.cfloat, torch.complex128, torch.cdouble] ) def test_invalid_complex_dtype(self, dtype): class TestModel(torch.nn.Module): def forward(self, x): return torch.randn((5, 4), dtype=dtype) with pytest.raises(AssertionError, match="complex number dtype"): self.run_compare_torch((5, 4), TestModel()) class TestRandnLike(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, shape", itertools.product( compute_units, backends, frontends, [(1,), (2, 3)], ), ) def test_randn_like(self, compute_unit, backend, frontend, shape): class TestModel(nn.Module): def forward(self, x): y = torch.randn_like(torch.randn(shape)) if frontend == TorchFrontend.TORCHSCRIPT: return torch.Tensor([len(y)]) else: return torch.tensor(y.shape) self.run_compare_torch( shape, TestModel(), compute_unit=compute_unit, backend=backend, frontend=frontend ) @pytest.mark.parametrize( "dtype", [torch.complex64, torch.cfloat, torch.complex128, torch.cdouble] ) def test_invalid_complex_dtype(self, dtype): class TestModel(torch.nn.Module): def forward(self, x): return torch.randn_like(x, dtype=dtype) with pytest.raises(AssertionError, match="complex number dtype"): self.run_compare_torch((5, 4), TestModel()) class TestTypeAs(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, type", itertools.product(compute_units, backends, ["int32", "float32", "bool"]), ) def test_type_as(self, compute_unit, backend, type): class TestNet(nn.Module): def forward(self, x, y): return x.type_as(y) model = TestNet() type_map = { "int32": torch.int32, "float16": torch.float16, "float32": torch.float32, "bool": torch.bool, } input = [ torch.Tensor([0, 1, 2, 3]).to(torch.float32), torch.Tensor([2, 3]).to(type_map[type]), ] self.run_compare_torch( input, model, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) class TestReduction(TorchBaseTest): class TestModel(nn.Module): def __init__(self, mode, dim=None, keepdim=None): super().__init__() args = {"dim": dim, "keepdim": keepdim} self.op_args = {k: v for k, v in args.items() if v is not None} if mode == "min": self.op = torch.min elif mode == "max": self.op = torch.max else: raise ValueError("Unsupported mode: {mode}".format(mode=mode)) def forward(self, x, y=None): if y is not None: return self.op(x, y) return self.op(x, **self.op_args) @pytest.mark.parametrize( "compute_unit, backend, input_shape, dim, keepdim, mode", itertools.product( compute_units, backends, [(2, 2), (1, 1)], [0, 1, None], [True, False, None], ["min", "max"], ), ) def test_min_max(self, compute_unit, backend, input_shape, dim, keepdim, mode): if dim is None and keepdim is not None: pytest.skip("invalid torch.min configuration") input_data = torch.rand(input_shape) model = self.TestModel(mode, dim=dim, keepdim=keepdim) self.run_compare_torch( input_data, model, input_as_shape=False, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, input_shape, mode", itertools.product(compute_units, backends, [(2, 2), (1, 1)], ["min", "max"]), ) def test_min_max_with_no_arguments(self, compute_unit, backend, input_shape, mode): self.run_compare_torch( input_shape, self.TestModel(mode), 
backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, input_shape, dim, mode", itertools.product(compute_units, backends, [(2, 2), (1, 1)], [0, 1], ["min", "max"]), ) def test_min_max_no_keepdim(self, compute_unit, backend, input_shape, dim, mode): input_data = torch.rand(input_shape) model = self.TestModel(mode, dim=dim) expected_results = model(input_data) self.run_compare_torch( input_data, model, expected_results=expected_results, input_as_shape=False, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, input_shape, mode", itertools.product(compute_units, backends, [(2, 2), (1, 1)], ["min", "max"]), ) def test_min_max_two_tensors(self, compute_unit, backend, input_shape, mode): model = self.TestModel(mode) self.run_compare_torch([input_shape] * 2, model, backend=backend, compute_unit=compute_unit) class TestLayerNorm(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, input_shape, eps", itertools.product( [ct.ComputeUnit.CPU_ONLY], backends, frontends, [(1, 3, 15, 15), (1, 1, 1, 1)], [1e-5, 1e-7], ), ) def test_layer_norm(self, compute_unit, backend, frontend, input_shape, eps): model = nn.LayerNorm(input_shape, eps=eps) self.run_compare_torch( input_shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend ) class TestPixelShuffle(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, batch_size, CHW, r", itertools.product( compute_units, backends, frontends, [1, 3], [(1, 4, 4), (3, 2, 3)], [2, 4] ), ) def test_pixel_shuffle(self, compute_unit, backend, frontend, batch_size, CHW, r): C, H, W = CHW input_shape = (batch_size, C * r * r, H, W) model = nn.PixelShuffle(upscale_factor=r) self.run_compare_torch( input_shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend ) @pytest.mark.skipif(_macos_version() < (13, 0), reason="New functionality in macOS13/iOS16") class TestPixelUnshuffle(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, batch_size, CHW, r", itertools.product( compute_units, backends, frontends, [1, 3], [(1, 4, 4), (3, 2, 3)], [2, 4] ), ) def test_pixel_shuffle(self, compute_unit, backend, frontend, batch_size, CHW, r): if backend[0] == "neuralnetwork": pytest.skip("pixel_unshuffle only supported in mlprogram backend.") C, H, W = CHW input_shape = (batch_size, C, H * r, W * r) model = nn.PixelUnshuffle(downscale_factor=r) self.run_compare_torch( input_shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, minimum_deployment_target=ct.target.iOS16, ) class TestExpand(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, shapes", itertools.product( compute_units, backends, frontends, [ [(2, 1), (2, 2)], [(3, 1), (-1, 4)], [(1, 3, 4, 4), (3, 3, 4, 4)], [(4,), (3, 4)], [(3, 2), (1, 2, -1, 2)], ], ), ) def test_expand(self, compute_unit, backend, frontend, shapes): input_shape, output_shape = shapes class TestModel(torch.nn.Module): def forward(self, x): return x.expand(*output_shape) model = TestModel() self.run_compare_torch( input_shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend ) @pytest.mark.parametrize( "compute_unit, backend, frontend, minimum_deployment_target", itertools.product( compute_units, backends, frontends, [None, ct.target.iOS17], ), ) def test_expand_dynamic_shape0( self, compute_unit, backend, frontend, minimum_deployment_target ): class TestModel(nn.Module): def forward(self, x): return 
x.expand(x.shape[1], x.shape[1]) self.run_compare_torch( torch.arange(20).reshape((1, 20)), TestModel(), input_as_shape=False, converter_input_type=[ TensorType( shape=[1, ct.RangeDim(upper_bound=20 if backend[0] == "mlprogram" else -1)] ) ], frontend=frontend, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_expand_dynamic_shape1(self, compute_unit, backend, frontend): class TestModel(nn.Module): def forward(self, x): return x.expand(x.shape[0], 1, x.shape[-1], x.shape[-1]) upper_bound = 20 if backend[0] == "mlprogram" else -1 self.run_compare_torch( torch.arange(20).reshape((1, 20)), TestModel(), input_as_shape=False, converter_input_type=[ TensorType( shape=[ ct.RangeDim(upper_bound=upper_bound), ct.RangeDim(upper_bound=upper_bound), ] ) ], frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_expand_dynamic_shape2(self, compute_unit, backend, frontend): class TestModel(nn.Module): def forward(self, x): return x.expand(x.shape[-1], 1, x.shape[-1], x.shape[-1]) upper_bound = 20 if backend[0] == "mlprogram" else -1 self.run_compare_torch( torch.arange(20).reshape((1, 20)), TestModel(), input_as_shape=False, converter_input_type=[TensorType(shape=[1, ct.RangeDim(upper_bound=upper_bound)])], frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_expand_dynamic_shape3(self, compute_unit, backend, frontend): class TestModel(nn.Module): def forward(self, x): return x.expand(x.shape[0], 10) upper_bound = 20 if backend[0] == "mlprogram" else -1 self.run_compare_torch( torch.arange(20).reshape((20, 1)), TestModel(), input_as_shape=False, converter_input_type=[ TensorType( shape=[ ct.RangeDim(upper_bound=upper_bound), ct.RangeDim(upper_bound=upper_bound), ] ) ], frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_expand_dynamic_shape_from_another_input(self, compute_unit, backend, frontend): if frontend in TORCH_EXPORT_BASED_FRONTENDS: pytest.skip( "torch._dynamo.exc.UserError: Tried to use data-dependent value in the subsequent " "computation. This can happen when we encounter unbounded dynamic value that is " "unknown during tracing time." 
) class TestModel(nn.Module): def forward(self, x, y): return x.expand(int(y[0]), int(y[1])) self.run_compare_torch( [torch.arange(20).reshape((20, 1)), torch.Tensor([20, 20])], TestModel(), input_as_shape=False, converter_input_type=[ TensorType( shape=[ct.RangeDim(upper_bound=20 if backend[0] == "mlprogram" else -1), 1] ), TensorType(shape=(2,)), ], frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, input_shapes", itertools.product( compute_units, backends, frontends, [ [(2, 1), (2, 2)], [(3, 1), (3, 4)], [(1, 3, 4, 4), (3, 3, 4, 4)], [(4,), (1, 3, 4)], ], ), ) def test_expand_as(self, compute_unit, backend, frontend, input_shapes): class TestModel(torch.nn.Module): def forward(self, x, y): return x.expand_as(y) model = TestModel() self.run_compare_torch( input_shapes, model, compute_unit=compute_unit, backend=backend, frontend=frontend ) class TestExpandDims(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, rank_and_axis", itertools.product( compute_units, backends, frontends, [(rank, axis) for rank in range(1, 5) for axis in range(-rank - 1, rank + 1)], ), ) def test_unsqueeze(self, compute_unit, backend, frontend, rank_and_axis): rank, axis = rank_and_axis input_shape = tuple(np.random.randint(low=2, high=10, size=rank)) model = ModuleWrapper(function=torch.unsqueeze, kwargs={"dim": axis}) self.run_compare_torch( input_shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend ) class TestAtLeastND(TorchBaseTest): @staticmethod def _generate_input_shape(input_rank): if input_rank == 0: # Core ML does not support scalar input, so we use rank-1 size-1 tensor then squeeze input_shape = (1,) else: input_shape = np.random.randint(2, 5, input_rank) return input_shape @pytest.mark.parametrize( "compute_unit, backend, frontend, rank, input_rank", itertools.product( compute_units, backends, frontends, (1, 2, 3), (0, 1, 2, 3, 4, 5), ), ) def test_atleast_nd(self, compute_unit, backend, frontend, rank, input_rank): if backend[0] == "neuralnetwork" and rank in (2, 3) and input_rank == 0: pytest.xfail("rdar://134723147 nn backend additionally expands a dim") class Model(torch.nn.Module): def forward(self, x): # Core ML does not support scalar input, so we use rank-1 size-1 tensor then squeeze if input_rank == 0: x = torch.squeeze(x) if rank == 1: result = torch.atleast_1d(x) elif rank == 2: result = torch.atleast_2d(x) else: assert rank == 3 result = torch.atleast_3d(x) return result input_shape = self._generate_input_shape(input_rank) model = Model() self.run_compare_torch( input_shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend ) @pytest.mark.parametrize( "compute_unit, backend, frontend, rank, input_rank", itertools.product( compute_units, backends, frontends, (1, 2, 3), (0, 1, 2, 3, 4, 5), ), ) def test_atleast_nd_sequence(self, compute_unit, backend, frontend, rank, input_rank): if backend[0] == "neuralnetwork" and rank in (2, 3) and input_rank == 0: pytest.xfail("rdar://134723147 nn backend additionally expands a dim") class Model(torch.nn.Module): def forward(self, x, y): # Core ML does not support scalar input, so we use rank-1 size-1 tensor then squeeze if input_rank == 0: x = torch.squeeze(x) y = torch.squeeze(y) # Lowering "tuple input as output" pymil program gives wrong output, # so insert add ops to avoid "input as output" # TODO (rdar://134722912) Fix the "tuple input as output" pymil program lowering x = x + 1.0 y = y + 2.0 if rank == 1: 
result = torch.atleast_1d((x, y)) elif rank == 2: result = torch.atleast_2d((x, y)) else: assert rank == 3 result = torch.atleast_3d((x, y)) return result input_shape = [ self._generate_input_shape(input_rank), self._generate_input_shape(input_rank), ] model = Model() self.run_compare_torch( input_shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend ) class TestLinspace(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, start_end, steps", itertools.product( compute_units, backends, [(-0.1, -0.7), (1, 10)], [1, 3], ), ) def test_linspace_static(self, compute_unit, backend, start_end, steps): input_shape = tuple([steps]) start, end = start_end class Model(nn.Module): def forward(self, x): return torch.linspace(start, end, steps) model = Model() self.run_compare_torch(input_shape, model, backend=backend, compute_unit=compute_unit) @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_linspace_static_large(self, compute_unit, backend): input_shape = tuple([1]) class Model(nn.Module): def forward(self, x): return torch.linspace(1, 2_000_000, 2_000_000) model = Model() self.run_compare_torch(input_shape, model, backend=backend, compute_unit=compute_unit) @pytest.mark.parametrize( "compute_unit, backend, start_end, steps", itertools.product( compute_units, backends, [(-0.1, -0.7), (1, 10)], [1, 2, 100], ), ) def test_linspace_dynamic(self, compute_unit, backend, start_end, steps): start, end = start_end class Model(nn.Module): def forward(self, x): return torch.linspace(x[0], x[1], steps) model = Model() inputs = [torch.Tensor([start, end])] self.run_compare_torch( inputs, model, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_linspace_static_not_fold(self, compute_unit, backend): class Model(nn.Module): def forward(self, x): return torch.linspace(0, 1, 100) model = Model() mlmodel = self.run_compare_torch( [(1, 2, 3)], model, backend=backend, compute_unit=compute_unit ) prog = mlmodel[1]._mil_program # The linspace op is folded to const, so there is no range_1d op. assert len(prog.find_ops(op_type="const")) == 1 assert len(prog.find_ops(op_type="range_1d")) == 0 with patch.object(Var, "_is_nonreplaceable_var") as mocked_is_nonreplaceable_var: # Mock that the first param to linspace is non-replaceable. mocked_is_nonreplaceable_var.side_effect = ( lambda var: var.op and var.op.op_type == "const" and var.rank == 0 and var.val == 0 ) mlmodel = self.run_compare_torch( [(1, 2, 3)], model, backend=backend, compute_unit=compute_unit ) prog = mlmodel[1]._mil_program # The linspace op is not folded to const, but translated to range_1d instead. 
assert len(prog.find_ops(op_type="range_1d")) == 1 class TestArange(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, start_end_step", itertools.product( compute_units, backends, frontends, [ (-0.1, -0.7, -0.07), (3, 10, 0.3), (1, 10, 100), (1, 300000, 1), (1, 10, 1e-6), ], ), ) def test_arange_static(self, compute_unit, backend, frontend, start_end_step): if start_end_step == (1, 10, 1e-6): pytest.xfail("rdar://88998831 (range_1d has numerical issue when the step is small)") input_shape = (1,) start, end, step = start_end_step class Model(nn.Module): def forward(self, x): return torch.arange(start, end, step) model = Model() self.run_compare_torch( input_shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend ) @pytest.mark.parametrize( "compute_unit, backend, frontend, start_end_step", itertools.product( compute_units, backends, frontends, [ (-0.1, -0.7, -0.07), (3, 10, 0.3), (1, 10, 100), (1, 300000, 1), ], ), ) def test_arange_dynamic(self, compute_unit, backend, frontend, start_end_step): if frontend in TORCH_EXPORT_BASED_FRONTENDS: pytest.skip( "torch._dynamo.exc.UserError: Tried to use data-dependent value in the subsequent " "computation. This can happen when we encounter unbounded dynamic value that is " "unknown during tracing time." ) start, end, step = start_end_step class Model(nn.Module): def forward(self, x): return torch.arange(x[0], x[1], x[2]) model = Model() inputs = [torch.tensor([start, end, step])] self.run_compare_torch( inputs, model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product(compute_units, backends, frontends), ) def test_arange_without_start(self, compute_unit, backend, frontend): class Model(nn.Module): def forward(self, x): return torch.arange(10) model = Model() self.run_compare_torch( (1,), model, compute_unit=compute_unit, backend=backend, frontend=frontend ) class TestEinsum(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, equation, reverse_input_order, dynamic", itertools.product( compute_units, backends, frontends, einsum_equations, [False, True], [False, True], ), ) def test_binary_einsum( self, compute_unit, backend, frontend, equation, reverse_input_order, dynamic ): if dynamic and backend[0] == "mlprogram" and ct.utils._macos_version() > (14, 2): pytest.xfail("rdar://120386990 (Einsum Model Failed)") if frontend == TorchFrontend.EXECUTORCH: pytest.skip("ExecuTorch einsum decomposition issue") class TestBinaryEinsum(nn.Module): def forward(self, x, y): return torch.einsum(equation, x, y) input_shapes, converter_input_type = gen_input_shapes_einsum(equation, dynamic, backend) if frontend != TorchFrontend.TORCHSCRIPT: converter_input_type = None if reverse_input_order: input_output_strings = equation.split("->") input_string = ",".join(reversed(input_output_strings[0].split(","))) equation = input_string + "->" + input_output_strings[1] input_shapes.reverse() if converter_input_type is not None: converter_input_type.reverse() model = TestBinaryEinsum() res = self.run_compare_torch( input_shapes, model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=True, converter_input_type=converter_input_type, ) # Verify the pattern of the hardcode einsum cases traced_model = res[0] mlprogram = ct.convert( traced_model, inputs=converter_input_type, convert_to="milinternal", pass_pipeline=ct.PassPipeline.EMPTY, ) ops_in_prog = 
get_op_types_in_program(mlprogram) if (equation in hardcoded_einsum_equations) and not ( equation in ["abcd,cde->abe", "abc,cde->abde"] and dynamic ): assert "reduce_prod" not in ops_in_prog assert "concat" not in ops_in_prog assert "shape" not in ops_in_prog @pytest.mark.parametrize( "compute_unit, backend, frontend, equation, dynamic", itertools.product( compute_units, backends, frontends, ["ab->ba", "aa->a", "ab->b", "iijk->ji"], [False, True], ), ) def test_unary_einsum(self, compute_unit, backend, frontend, equation, dynamic): if frontend == TorchFrontend.EXECUTORCH: pytest.skip("ExecuTorch einsum decomposition issue") if platform.machine() == "x86_64" and dynamic and equation == "iijk->ji": pytest.xfail("rdar://135843153 ([Bug] Models failed on x86_64 platform)") class TestUnaryEinsum(nn.Module): def forward(self, x): return torch.einsum(equation, x) input_shapes, converter_input_type = gen_input_shapes_einsum(equation, dynamic, backend) model = TestUnaryEinsum() self.run_compare_torch( input_shapes, model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=True, converter_input_type=converter_input_type, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, equation, dynamic", itertools.product( compute_units, backends, frontends, ["ab,bc,cd->ba", "abb,abc,a->ab"], [False, True], ), ) def test_ternary_einsum(self, compute_unit, backend, frontend, equation, dynamic): if frontend == TorchFrontend.EXECUTORCH: pytest.skip("ExecuTorch einsum decomposition issue") class TestTernaryEinsum(nn.Module): def forward(self, x, y, z): return torch.einsum(equation, x, y, z) input_shapes, converter_input_type = gen_input_shapes_einsum(equation, dynamic, backend) model = TestTernaryEinsum() self.run_compare_torch( input_shapes, model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=True, converter_input_type=converter_input_type, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_einsum_with_same_input(self, compute_unit, backend, frontend): class Einsum(nn.Module): def forward(self, m1, m2, m3): y1 = torch.einsum("bnhd,bdhm->bnhm", m1, m2) y2 = torch.einsum("bnhd,bdhm->bnhm", m1, m3) return y1, y2 m1 = torch.rand(1, 8, 8, 64) m3 = torch.rand(1, 8, 128, 64).transpose(1, 3).transpose(2, 3) m2 = m3.clone() model = Einsum() out = model(m1, m2, m3) self.run_compare_torch( [m1, m2, m3], Einsum(), frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, expected_results=out, ) class TestSqueeze(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank_and_axis", itertools.product( compute_units, backends, [ (2, 1), (2, 0), (3, 1), (3, None), (4, None), (4, 2), (5, None), (5, -1), ], ), ) def test_squeeze(self, compute_unit, backend, rank_and_axis): rank, axis = rank_and_axis input_shape = list(np.random.randint(low=2, high=10, size=rank)) if axis is not None: input_shape[axis] = 1 else: input_shape[0] = 1 input_shape = tuple(input_shape) model = ModuleWrapper(function=torch.squeeze, kwargs={"dim": axis} if axis else {}) self.run_compare_torch(input_shape, model, backend=backend, compute_unit=compute_unit) @pytest.mark.parametrize( "compute_unit, backend, dynamic, dim", itertools.product(compute_units, backends, [True, False], [None, 0, 2, (1,), (1, 2)]), ) def test_squeeze_non_single_element_dim(self, compute_unit, backend, dynamic, dim): if backend[0] == "neuralnetwork": pytest.skip("neuralnetwork backend doesn't support 
squeeze a not-1 dimension") if dynamic and compute_unit == ct.ComputeUnit.CPU_ONLY: pytest.skip("CPU behaves differently from PyTorch for dropping dynamic dim.") if compute_unit == ct.ComputeUnit.CPU_ONLY and dim in {0, (1,), (1, 2)}: pytest.xfail("CPU failed non-single-dim squeeze (rdar://124555262)") input_shape = (2, 3, 1) model = ModuleWrapper(function=torch.squeeze, kwargs=None if dim is None else {"dim": dim}) if dynamic: converter_input_type = [ ct.TensorType( shape=( ct.RangeDim(upper_bound=10, default=2), ct.RangeDim(upper_bound=10, default=3), ct.RangeDim(upper_bound=10, default=1), ) ), ] else: converter_input_type = None self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, converter_input_type=converter_input_type, ) class TestCumSum(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, axis", itertools.product( compute_units, backends, [-1, 0, 1, 2, 3], ), ) def test_cumsum(self, compute_unit, backend, axis): input_shape = list(np.random.randint(low=2, high=10, size=4)) input_shape = tuple(input_shape) model = ModuleWrapper(function=torch.cumsum, kwargs={"dim": axis}) self.run_compare_torch(input_shape, model, backend=backend, compute_unit=compute_unit) class TestReshape(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, output_shape, minimum_deployment_target", itertools.product( compute_units, backends, frontends, [ (3, 2), (2, -1), (2, 1, 1, 3), ], [None, ct.target.iOS17], ), ) def test_reshape( self, compute_unit, backend, frontend, output_shape, minimum_deployment_target ): input_shape = (2, 3) model = ModuleWrapper(function=torch.reshape, kwargs={"shape": output_shape}) self.run_compare_torch( input_shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, minimum_deployment_target", itertools.product( compute_units, backends, frontends, [None, ct.target.iOS17], ), ) def test_reshape_scalar(self, compute_unit, backend, frontend, minimum_deployment_target): model = ModuleWrapper(function=torch.reshape, kwargs={"shape": ()}) self.run_compare_torch( (1,), model, frontend=frontend, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) class TestReshapeAs(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, input_output_shape", itertools.product( compute_units, backends, frontends, [ ((6, 1, 1), (3, 2)), ((8,), (2, 1, 1, 2, 2)), ], ), ) def test_reshape(self, compute_unit, backend, frontend, input_output_shape): class Model(nn.Module): def forward(self, x, ref): return x.reshape_as(ref) model = Model() input_shape, output_shape = input_output_shape self.run_compare_torch( [input_shape, output_shape], model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) class TestFlatten(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, start_dim, end_dim, is_dynamic", itertools.product(compute_units, backends, frontends, [2, -2, 0], [3, -1], [False, True]), ) def test_flatten(self, compute_unit, backend, frontend, start_dim, end_dim, is_dynamic): input_shape = (2, 3, 4, 5) converter_input_type = None if is_dynamic: dim_upper_bound = 8 if backend[0] == "mlprogram" else -1 converter_input_type = [ TensorType( shape=( 2, 3, RangeDim(default=4, upper_bound=dim_upper_bound), RangeDim(default=5, upper_bound=dim_upper_bound), ), dtype=np.float32, ) ] model = ModuleWrapper( 
function=torch.flatten, kwargs={"start_dim": start_dim, "end_dim": end_dim} ) self.run_compare_torch( input_shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, converter_input_type=converter_input_type, ) class TestUnflatten(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, dim, auto_infer_idx, dynamic", itertools.product( compute_units, backends, frontends, (0, 1, -1, -2), (0, 1, None), (True, False), ), ) def test_unflatten(self, compute_unit, backend, frontend, dim, auto_infer_idx, dynamic): if dynamic and auto_infer_idx is not None: pytest.skip("Auto-inferring shape (-1) not supported for dynamic input.") class Head(nn.Module): def __init__(self, nhead, batch_size, input_size, output_size): super(Head, self).__init__() self.linear = nn.Linear(nhead * input_size, nhead * output_size) unflattened_size = [nhead, batch_size if dim == 0 or dim == -2 else output_size] if auto_infer_idx is not None: unflattened_size[auto_infer_idx] = -1 self.unflatten = nn.Unflatten(dim, unflattened_size) def forward(self, x): y = self.linear(x) y_heads = self.unflatten(y) return y_heads NHEAD = 2 BATCH_SIZE = 3 INPUT_SIZE = 5 OUTPUT_SIZE = 7 if dynamic: inputs = [ ct.TensorType( shape=( ct.RangeDim(lower_bound=1, upper_bound=NHEAD * BATCH_SIZE), ct.RangeDim(lower_bound=1, upper_bound=NHEAD * INPUT_SIZE), ) ), ] else: inputs = [ct.TensorType(shape=(NHEAD * BATCH_SIZE, NHEAD * INPUT_SIZE))] self.run_compare_torch( (NHEAD * BATCH_SIZE, NHEAD * INPUT_SIZE), Head(NHEAD, BATCH_SIZE, INPUT_SIZE, OUTPUT_SIZE), converter_input_type=inputs, frontend=frontend, backend=backend, compute_unit=compute_unit, ) class TestGather(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, rank_and_axis", itertools.product( compute_units, backends, frontends, [(i, j) for i in range(1, 6) for j in range(0, i)] ), ) def test_gather_along_axis(self, compute_unit, backend, frontend, rank_and_axis): rank, axis = rank_and_axis params_shape = np.random.randint(low=2, high=5, size=rank) indices_shape = np.copy(params_shape) indices_shape[axis] = np.random.randint(low=1, high=8) indices = np.random.randint(0, params_shape[axis], size=indices_shape) params_shape, indices_shape = tuple(params_shape), tuple(indices_shape) model = ModuleWrapper( function=torch.gather, kwargs={"dim": axis, "index": torch.from_numpy(indices)}, ) self.run_compare_torch( [params_shape], model, compute_unit=compute_unit, backend=backend, frontend=frontend ) @pytest.mark.parametrize( "compute_unit, backend, frontend, input_enumerated_shape", itertools.product(compute_units, backends, frontends, (True, False)), ) def test_gather_enumerated_shape(self, compute_unit, backend, frontend, input_enumerated_shape): axis = 0 params_shape = (2, 3, 4) indices_shape = (3, 3, 4) class Model(nn.Module): def forward(self, x, index): return torch.gather(x, axis, index) input_data = [torch.rand(params_shape), torch.randint(0, params_shape[axis], indices_shape)] # Each model is only allowed for one input feature with enumerated shape. 
if input_enumerated_shape: converter_input_type = [ ct.TensorType(shape=ct.EnumeratedShapes(shapes=[(2, 3, 4), (3, 4, 5)])), ct.TensorType(shape=(3, 3, 4), dtype=np.int32), ] else: converter_input_type = [ ct.TensorType(shape=(2, 3, 4)), ct.TensorType( shape=ct.EnumeratedShapes(shapes=[(3, 3, 4), (4, 3, 4)]), dtype=np.int32 ), ] self.run_compare_torch( input_data, Model(), input_as_shape=False, converter_input_type=converter_input_type, frontend=frontend, backend=backend, compute_unit=compute_unit, minimum_deployment_target=ct.target.iOS17, ) def test_gather_along_axis_invalid_indices(self): """This test is to verify that PyTorch gather op doesn't allow negative and out-of-range indices, so we don't need to add mb.select for IOS17 mb.gather op when lowering torch.gather.""" data = torch.tensor([[1, 2], [3, 4]]) with pytest.raises(RuntimeError, match="index -1 is out of bounds"): torch.gather(data, 1, torch.tensor([[-1, 0], [1, 0]])) with pytest.raises(RuntimeError, match="index 2 is out of bounds"): torch.gather(data, 1, torch.tensor([[0, 0], [2, 0]])) @pytest.mark.parametrize( "compute_unit, backend, frontend, dynamic", itertools.product(compute_units, backends, frontends, [True, False]), ) def test_gather_nd_int16_indices(self, compute_unit, backend, frontend, dynamic): """Test the indices access in torch model which gets lowered to gather_nd.""" B, C, H, W, T = 1, 24, 64, 64, 32 data = torch.rand(B, C, H, W) time = (torch.rand(1, T) * (C - 1)).to(torch.int) if frontend == TorchFrontend.TORCHSCRIPT: class DynamicModel(torch.nn.Module): def forward(self, data, time): return data[torch.arange(B).unsqueeze(1), time, :, :] class StaticModel(torch.nn.Module): def forward(self, data): return data[torch.arange(B).unsqueeze(1), time, :, :] torch_model = DynamicModel() if dynamic else StaticModel() else: class DynamicModel(torch.nn.Module): def __init__(self, B): super().__init__() self.slice0 = torch.arange(B).unsqueeze(1) def forward(self, data, time): return data[self.slice0, time, :, :] class StaticModel(torch.nn.Module): def __init__(self, B, time): super().__init__() self.slice0 = torch.arange(B).unsqueeze(1) self.time = time def forward(self, data): return data[self.slice0, self.time, :, :] torch_model = DynamicModel(B) if dynamic else StaticModel(B, time) input_data = (data, time) if dynamic else data converter_input_type = [ct.TensorType(shape=data.shape)] if dynamic: converter_input_type.append(ct.TensorType(shape=time.shape, dtype=np.int32)) mlmodel = self.run_compare_torch( input_data, torch_model, input_as_shape=False, converter_input_type=converter_input_type, frontend=frontend, backend=backend, compute_unit=compute_unit, minimum_deployment_target=ct.target.iOS17, )[1] gather_op = mlmodel._mil_program.find_ops(op_type="gather_nd")[0] assert gather_op.indices.dtype == types.int16 if dynamic else types.uint16 class TestActivation(TorchBaseTest): @staticmethod def run_compare_torch(input_data, model, target_op: Optional[str] = None, **kwargs): """Override compare method for Activation ops tests, as we want to verify the mixed precision support for alpha/beta in IOS17 Activation Ops.""" results = TorchBaseTest.run_compare_torch(input_data, model, **kwargs) if target_op and kwargs.get("backend", (None, None))[1] == "fp16": prog: Program = results[1]._mil_program activation_op: Operation = prog.find_ops(op_type=target_op, exactly_one=True)[0] assert activation_op.x.dtype == types.fp16 # Before IOS17, both alpha and input/output are converted to fp16. 
# After IOS17, alpha is kept as fp32 because it supports mixed precision. expected_alpha_beta_dtype = types.fp16 if kwargs.get("minimum_deployment_target", None) == ct.target.iOS17: expected_alpha_beta_dtype = types.fp32 if hasattr(activation_op, "alpha"): assert activation_op.alpha.dtype == expected_alpha_beta_dtype if hasattr(activation_op, "beta"): assert activation_op.beta.dtype == expected_alpha_beta_dtype return results @pytest.mark.parametrize( "compute_unit, backend, shape", itertools.product(compute_units, backends, COMMON_SHAPES_ALL), ) def test_relu(self, compute_unit, backend, shape): model = nn.ReLU().eval() self.run_compare_torch( shape, model, backend=backend, ) model = ModuleWrapper(nn.functional.relu_) self.run_compare_torch(shape, model, backend=backend, compute_unit=compute_unit) @pytest.mark.parametrize( "compute_unit, backend, shape", itertools.product(compute_units, backends, COMMON_SHAPES_ALL), ) def test_relu6(self, compute_unit, backend, shape): model = nn.ReLU6().eval() self.run_compare_torch(shape, model, backend=backend, compute_unit=compute_unit) @pytest.mark.parametrize( "compute_unit, backend, alpha, shape, single_alpha, minimum_deployment_target", itertools.product( compute_units, backends, [0.25, 2.0], [(3,), (2, 6), (2, 3, 4), (2, 5, 6, 7), (2, 3, 4, 5, 6)], [True, False], [None, ct.target.iOS17], ), ) def test_prelu( self, compute_unit, backend, alpha, shape, single_alpha, minimum_deployment_target ): if backend[0] == "mlprogram" and backend[1] == "fp16" or (len(shape) == 5): pytest.xfail( "rdar://92175249 ([MIL] TestActivation::test_prelu[backend=(mlprogram, fp16)] CI failure)" ) input_shape = shape num_parameters = input_shape[1] if len(input_shape) >= 2 else 1 if single_alpha: num_parameters = 1 model = nn.PReLU(num_parameters, alpha).eval() mlmodel = self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, target_op="leaky_relu", # prelu got fused to lrelu ) prog = mlmodel[1]._mil_program # Unfortunately since all these tests result in a prelu with a common leakage factor, the # prelu_to_lrelu pass optimizes them to contain leaky_relu instead. 
assert len(prog.find_ops(op_type="leaky_relu")) == 1 assert len(prog.find_ops(op_type="prelu")) == 0 @pytest.mark.parametrize( "compute_unit, backend, shape, alpha, minimum_deployment_target", itertools.product( compute_units, backends, COMMON_SHAPES_ALL, [0.1, 2.0], [None, ct.target.iOS17] ), ) def test_leaky_relu(self, compute_unit, backend, shape, alpha, minimum_deployment_target): model = nn.LeakyReLU(negative_slope=alpha).eval() self.run_compare_torch( shape, model, backend=backend, minimum_deployment_target=minimum_deployment_target, target_op="leaky_relu", ) model = ModuleWrapper(nn.functional.leaky_relu_, {"negative_slope": alpha}) self.run_compare_torch( shape, model, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, target_op="leaky_relu", ) @pytest.mark.parametrize( "compute_unit, backend, shape", itertools.product( compute_units, backends, COMMON_SHAPES_ALL, ), ) def test_randomized_leaky_relu(self, compute_unit, backend, shape): model = nn.RReLU(lower=0.01, upper=0.9).eval() self.run_compare_torch(shape, model, backend=backend, compute_unit=compute_unit) @pytest.mark.parametrize( "compute_unit, backend, shape", itertools.product(compute_units, backends, COMMON_SHAPES_ALL), ) def test_softmax(self, compute_unit, backend, shape): model = nn.Softmax().eval() self.run_compare_torch(shape, model, backend=backend, compute_unit=compute_unit) @pytest.mark.parametrize( "compute_unit, backend, range_val", itertools.product( compute_units, backends, [(-1.0, 1.0), (0.0, 0.1), (1.0, 3.0), (-1.0, 6.0)] ), ) def test_hardtanh(self, compute_unit, backend, range_val): input_shape = (1, 10, 4, 5) model = nn.Hardtanh(range_val[0], range_val[1]).eval() self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit ) model = ModuleWrapper( nn.functional.hardtanh_, {"min_val": range_val[0], "max_val": range_val[1]} ) self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend, shape, alpha, minimum_deployment_target", itertools.product( compute_units, backends, COMMON_SHAPES_ALL, [0.1, 2.0], [None, ct.target.iOS17] ), ) def test_elu(self, compute_unit, backend, shape, alpha, minimum_deployment_target): model = nn.ELU(alpha).eval() self.run_compare_torch( shape, model, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, target_op="elu", ) @pytest.mark.parametrize( "compute_unit, backend, shape, minimum_deployment_target", itertools.product(compute_units, backends, COMMON_SHAPES_ALL, [None, ct.target.iOS17]), ) def test_hardswish(self, compute_unit, backend, shape, minimum_deployment_target): model = nn.Hardswish().eval() self.run_compare_torch( shape, model, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, target_op="thresholded_relu", ) @pytest.mark.parametrize( "compute_unit, backend, shape, approximate", itertools.product(compute_units, backends, COMMON_SHAPES_ALL, ["none", "tanh", None]), ) def test_gelu(self, compute_unit, backend, shape, approximate): model = nn.GELU() if approximate is None else nn.GELU(approximate=approximate) model = model.eval() self.run_compare_torch(shape, model, backend=backend, compute_unit=compute_unit) @pytest.mark.parametrize( "compute_unit, backend, shape", itertools.product(compute_units, backends, COMMON_SHAPES_ALL), ) def test_erf(self, compute_unit, backend, shape): class ERFActivation(nn.Module): def forward(self, x): 
return torch.erf(x) model = ERFActivation().eval() self.run_compare_torch(shape, model, backend=backend, compute_unit=compute_unit) @pytest.mark.parametrize( "compute_unit, backend, shape", itertools.product( compute_units, backends, [(1, 10), (1, 3, 5), (1, 5, 6, 7), (1, 3, 4, 5, 6)] ), ) def test_sigmoid(self, compute_unit, backend, shape): model = nn.Sigmoid().eval() self.run_compare_torch(shape, model, backend=backend, compute_unit=compute_unit) @pytest.mark.parametrize( "compute_unit, backend, shape, minimum_deployment_target", itertools.product(compute_units, backends, COMMON_SHAPES_ALL, [None, ct.target.iOS17]), ) def test_sigmoid_hard(self, compute_unit, backend, shape, minimum_deployment_target): model = nn.Hardsigmoid().eval() self.run_compare_torch( shape, model, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, target_op="sigmoid_hard", ) @pytest.mark.parametrize( "compute_unit, backend, beta, threshold, minimum_deployment_target", itertools.product(compute_units, backends, [1, 2, 5], [5, 10, 20], [None, ct.target.iOS17]), ) @pytest.mark.skipif( _macos_version() <= (10, 15), reason="Parametric SoftPlus segfaults on macOS 10.15 and below.", ) def test_softplus(self, compute_unit, backend, beta, threshold, minimum_deployment_target): input_shape = (1, 10, 5, 15) model = nn.Softplus(beta, threshold).eval() self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, target_op="softplus_parametric", ) @pytest.mark.parametrize( "compute_unit, backend, shape", itertools.product(compute_units, backends, COMMON_SHAPES_ALL), ) def test_mish(self, compute_unit, backend, shape): model = nn.Mish().eval() self.run_compare_torch(shape, model, backend=backend, compute_unit=compute_unit) @pytest.mark.parametrize( "compute_unit, backend, shape", itertools.product(compute_units, backends, COMMON_SHAPES_ALL), ) def test_softsign(self, compute_unit, backend, shape): model = nn.Softsign().eval() self.run_compare_torch(shape, model, backend=backend, compute_unit=compute_unit) @pytest.mark.skipif( condition=version_lt(torch, "1.7.0"), reason="torch.nn.SiLU available only in PyTorch 1.7.0+", ) @pytest.mark.parametrize( "compute_unit, backend, shape", itertools.product(compute_units, backends, [(1, 10), (1, 3, 4), (1, 4, 5, 6)]), ) def test_silu(self, compute_unit, backend, shape): model = ModuleWrapper(function=torch.nn.functional.silu) self.run_compare_torch([shape], model, backend=backend) @pytest.mark.parametrize( "compute_unit, backend, rounding_mode, x2_type", itertools.product( compute_units, backends, [None, "floor", "trunc"], [np.float32, np.int32] ), ) def test_div(self, compute_unit, backend, rounding_mode, x2_type): model = ModuleWrapper(function=torch.div, kwargs={"rounding_mode": rounding_mode}) x1 = torch.from_numpy(np.array([2.3, 2.6, -3.6, -3.2], dtype=np.float32)) x2 = torch.from_numpy(np.array([1.0, 1.0, 1.0, 1.0], dtype=x2_type)) out = torch.div(x1, x2, rounding_mode=rounding_mode) self.run_compare_torch( [x1, x2], model, backend=backend, compute_unit=compute_unit, input_as_shape=False, expected_results=out, ) class TestElementWiseUnary(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, shape, op_string", itertools.product( compute_units, backends, frontends, [(1, 3, 5, 8)], [ "abs", "acos", "asin", "atan", "ceil", "cos", "cosh", "exp", "floor", "round", "sin", "sinh", "sqrt", "square", "tan", "tanh", "sign", ], ), ) def 
test_elementwise_no_params(self, compute_unit, backend, frontend, shape, op_string): if not contains_op(torch, op_string): return if op_string == "sqrt" and compute_unit != ct.ComputeUnit.CPU_ONLY: pytest.skip("sqrt on GPU producing nan.") op_func = getattr(torch, op_string) model = ModuleWrapper(function=op_func) self.run_compare_torch( shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend ) @pytest.mark.parametrize( "compute_unit, backend, frontend, shape, clamp_range, minimum_deployment_target", itertools.product( compute_units, backends, frontends, [(1, 3, 5, 8)], [ (0.0, 1.0), (-1.0, 0.5), (0.2, 0.7), (None, 4.0), (-3.0, None), (1, 2), (1, 3.5), (1, -1), ], [None, ct.target.iOS17], ), ) def test_clamp( self, compute_unit, backend, frontend, shape, clamp_range, minimum_deployment_target ): params_dict = {} if clamp_range[0] is not None: params_dict["min"] = clamp_range[0] if clamp_range[1] is not None: params_dict["max"] = clamp_range[1] model = ModuleWrapper(torch.clamp, params_dict) self.run_compare_torch( shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, rand_range=(-5, 5), minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_clamp_int_input(self, compute_unit, backend, frontend): params_dict = {"min": -2, "max": 2} input_data = torch.randint(low=-5, high=5, size=(2, 3, 4)) model = ModuleWrapper(torch.clamp, params_dict) self.run_compare_torch( input_data, model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, converter_input_type=[TensorType(shape=input_data.shape, dtype=np.int32)], ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_clamp_min_int(self, compute_unit, backend, frontend): params_dict = {"min": 0} input_data = torch.randint(low=-5, high=5, size=(2, 3, 4)) model = ModuleWrapper(torch.clamp_min, params_dict) self.run_compare_torch( input_data, model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, converter_input_type=[TensorType(shape=input_data.shape, dtype=np.int32)], ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_clamp_min_float(self, compute_unit, backend, frontend): params_dict = {"min": 0.0} input_data = torch.randn((2, 3, 4)) model = ModuleWrapper(torch.clamp_min, params_dict) self.run_compare_torch( input_data, model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, shape, threshold, minimum_deployment_target", itertools.product( compute_units, backends, frontends, [(1, 3, 5, 8)], [(0.0, 0.0), (0.5, 0.5), (0.5, 10), (0.9, 0.0)], [None, ct.target.iOS17], ), ) def test_threshold( self, compute_unit, backend, frontend, shape, threshold, minimum_deployment_target ): model = torch.nn.Threshold(threshold[0], threshold[1]).eval() input_value = torch.rand(np.prod(shape)) # make sure the values are not too close to the threshold for i in range(len(input_value)): if abs(input_value[i] - threshold[0]) < 0.005: input_value[i] += 0.05 input_value = torch.reshape(input_value, shape) self.run_compare_torch( input_value, model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, minimum_deployment_target=minimum_deployment_target, ) 
@pytest.mark.parametrize( "compute_unit, backend, frontend, shape, op_string", itertools.product( compute_units, backends, frontends, [(1, 3, 5, 8)], [ "log", "rsqrt", "reciprocal", ], ), ) def test_elementwise_numerically_stable( self, compute_unit, backend, frontend, shape, op_string ): op_func = getattr(torch, op_string) model = ModuleWrapper(function=op_func) self.run_compare_torch( shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, rand_range=(20, 100), ) @pytest.mark.parametrize( "compute_unit, backend, frontend, dtype", itertools.product( compute_units, backends, frontends, [np.int32, np.float32], ), ) def test_log_dtype(self, compute_unit, backend, frontend, dtype): SHAPE = (2, 3) input_data = np.random.randint(1, 100, SHAPE).astype(dtype) input_data = torch.from_numpy(input_data) model = ModuleWrapper(torch.log) converter_input_type = [TensorType(shape=SHAPE, dtype=dtype)] self.run_compare_torch( input_data, model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, converter_input_type=converter_input_type, ) class TestAtan2(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, rank", itertools.product(compute_units, backends, frontends, range(1, 6)), ) def test_atan2(self, compute_unit, backend, frontend, rank): model = ModuleWrapper(function=torch.atan2) input_shape = tuple(np.random.randint(low=1, high=10, size=rank)) self.run_compare_torch( [input_shape, input_shape], model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, rank", itertools.product(compute_units, backends, frontends, range(1, 6)), ) def test_atan2_x0(self, compute_unit, backend, frontend, rank): model = ModuleWrapper(function=torch.atan2) input_shape = tuple(np.random.randint(low=1, high=10, size=rank)) y = generate_input_data(input_shape, rand_range=(-1.0, 1.0)) x = torch.zeros(input_shape) self.run_compare_torch( (y, x), model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, rank", itertools.product(compute_units, backends, frontends, range(1, 6)), ) def test_atan2_y0x0(self, compute_unit, backend, frontend, rank): model = ModuleWrapper(function=torch.atan2) input_shape = tuple(np.random.randint(low=1, high=10, size=rank)) y = torch.zeros(input_shape) x = torch.zeros(input_shape) self.run_compare_torch( (y, x), model, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, rank", itertools.product(compute_units, backends, frontends, range(1, 6)), ) def test_atan2_broadcast(self, compute_unit, backend, frontend, rank): model = ModuleWrapper(function=torch.atan2) input_shape = tuple(np.random.randint(low=1, high=10, size=rank)) truncated_shape = list(input_shape) while len(truncated_shape) > 1: truncated_shape.pop(0) self.run_compare_torch( [input_shape, truncated_shape], model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) self.run_compare_torch( [truncated_shape, input_shape], model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) class TestTriu(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, shape, diagonal", itertools.product( compute_units, backends, frontends, [(5, 5), (3, 4), (5, 1)], [None, -1, 0, 2], ), ) def test_triu(self, compute_unit, backend, frontend, shape, diagonal): params_dict = {} if diagonal is not None: 
params_dict["diagonal"] = diagonal model = ModuleWrapper(torch.triu, params_dict) self.run_compare_torch( shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend ) class TestTril(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, shape, diagonal", itertools.product( compute_units, backends, frontends, [(5, 5), (3, 4), (5, 1)], [None, -1, 0, 2], ), ) def test_tril(self, compute_unit, backend, frontend, shape, diagonal): params_dict = {} if diagonal is not None: params_dict["diagonal"] = diagonal model = ModuleWrapper(torch.tril, params_dict) self.run_compare_torch( shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend, ) class TestMatMul(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_bmm(self, compute_unit, backend, frontend): shape_x, shape_y = (3, 4, 5), (3, 5, 6) model = ModuleWrapper(function=torch.bmm) self.run_compare_torch( [shape_x, shape_y], model, compute_unit=compute_unit, backend=backend, frontend=frontend ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_bmm_with_fp16_inputs(self, compute_unit, backend, frontend): if platform.machine() == "x86_64" and ct.utils._macos_version() <= (14, 2): pytest.xfail("rdar://135925921 ([CI] Upgrade External CI Machine OS)") class TestModel(torch.nn.Module): def forward(self, x, y): x = x.to(torch.float16) y = y + 1 return torch.bmm(x, y) inputs = [ TensorType(name="x", shape=(1, 2, 3), dtype=np.int32), TensorType(name="y", shape=(1, 3, 2), dtype=np.float16), ] self.run_compare_torch( inputs, TestModel(), frontend=frontend, backend=backend, compute_unit=compute_unit, minimum_deployment_target=ct.target.iOS16, torch_device=torch.device("mps"), ) class TestNumel(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, input_shape", itertools.product( compute_units, backends, frontends, [(1,), (2, 3)], ), ) def test_numel(self, compute_unit, backend, frontend, input_shape): class TestModel(torch.nn.Module): def forward(self, x): res = torch.numel(x) return x + res model = TestModel() self.run_compare_torch( input_shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend ) class TestSplit(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, split_size_or_sections, dim", itertools.product(compute_units, backends, frontends, [1, 2, [1, 4]], [0, -2]), ) def test_split(self, compute_unit, backend, frontend, split_size_or_sections, dim): input_shape = (5, 2) model = ModuleWrapper( function=torch.split, kwargs={"split_size_or_sections": split_size_or_sections, "dim": dim}, ) self.run_compare_torch( input_shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend, frontend, split_sizes, dim", itertools.product(compute_units, backends, frontends, [[1, 4], [3, 2]], [-1, -2]), ) def test_split_with_sizes(self, compute_unit, backend, frontend, split_sizes, dim): input_shape = (5, 5) model = ModuleWrapper( function=torch.split_with_sizes, kwargs={"split_sizes": split_sizes, "dim": dim}, ) self.run_compare_torch( input_shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend, frontend, dim", itertools.product(compute_units, backends, frontends, [-1]), ) def test_split_with_dynamic_sizes(self, compute_unit, backend, 
frontend, dim): if frontend in TORCH_EXPORT_BASED_FRONTENDS: pytest.skip("Torch.Export cannot export dynamic sizes") class TestModel(torch.nn.Module): def forward(self, x): size = x[0] return torch.split(x, size, dim=dim) input_shape = np.random.randint(low=2, high=6, size=20) torch_in = torch.tensor(input_shape) model = TestModel() torch_out = model(torch_in) self.run_compare_torch( torch_in, model, expected_results=torch_out, input_as_shape=False, frontend=frontend, backend=backend, compute_unit=compute_unit, ) if backend[0] == "mlprogram": with patch.object(Var, "_is_nonreplaceable_var") as mocked_is_nonreplaceable_var: # Mock that shape op is non-replaceable, so the gather op will be kept. mocked_is_nonreplaceable_var.side_effect = ( lambda var: var.op and "shape" in var.op.op_type ) with pytest.raises( RuntimeError, match="in operation of type split: Param 'split_sizes' must be const", ): self.run_compare_torch( torch_in, model, expected_results=torch_out, input_as_shape=False, frontend=frontend, backend=backend, compute_unit=compute_unit, ) class TestUnbind(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, dim", itertools.product(compute_units, backends, frontends, [0, 1, 2]), ) def test_unbind(self, compute_unit, backend, frontend, dim): input_shape = (3, 3, 4) model = ModuleWrapper(function=torch.unbind, kwargs={"dim": dim}) self.run_compare_torch( input_shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_unbind_one_dim_shape(self, compute_unit, backend, frontend): input_shape = (1,) dim = 0 model = ModuleWrapper(function=torch.unbind, kwargs={"dim": dim}) self.run_compare_torch( input_shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend ) class TestTranspose(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, shape, dims", itertools.product( compute_units, backends, frontends, COMMON_SHAPES, [(0, 1), (-2, -1), (1, 0), (-1, -2)] ), ) def test(self, compute_unit, backend, frontend, shape, dims): model = ModuleWrapper(function=torch.transpose, kwargs={"dim0": dims[0], "dim1": dims[1]}) self.run_compare_torch( shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend ) class TestTo(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_cast_bug(self, compute_unit, backend): if _macos_version() < (13, 0) and backend[0] == "mlprogram": pytest.xfail("Issue fixed in iOS16/macOS13") class TestModel(torch.nn.Module): def forward(self, spans, embedding): spans = spans.float().relu().int() max1, _ = torch.max(spans, dim=1, keepdim=False) max1, _ = torch.max(max1, dim=1, keepdim=False) max2, _ = torch.max(embedding, dim=1, keepdim=False) max2, _ = torch.max(max2, dim=1, keepdim=False) sigmoided_scores = max1 + max2 return sigmoided_scores if ( platform.machine() == "arm64" and compute_unit != ct.ComputeUnit.CPU_ONLY and backend[0] == "neuralnetwork" ): pytest.xfail( "rdar://98015195 ([M1 native tests] Some MIL unittests are failing on M1 native)" ) model = TestModel() self.run_compare_torch( [(1, 4, 2), (1, 6, 3)], model, backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_to_uint8(self, compute_unit, backend): class TestModel(torch.nn.Module): def forward(self, input_data):
input_data = input_data + input_data return input_data.to(torch.uint8) inputs = [TensorType(name="input_data", shape=(1, 2, 3), dtype=np.int32)] self.run_compare_torch( inputs, TestModel(), rand_range=(0, 127), backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_to_float16(self, compute_unit, backend): class TestModel(torch.nn.Module): def forward(self, input_data): input_data = input_data.to(torch.float16) return input_data + 8 inputs = [TensorType(name="input_data", shape=(1, 2, 3), dtype=np.float32)] self.run_compare_torch( inputs, TestModel(), backend=backend, compute_unit=compute_unit, atol=0.01, rtol=0.001, ) @pytest.mark.parametrize( "compute_unit, backend, input_type", itertools.product( compute_units, backends, [np.float32, np.float16, np.int32], ), ) def test_to_no_param(self, compute_unit, backend: Tuple[str], input_type): if input_type == np.float16 and backend[0] == "neuralnetwork": pytest.skip("Input float16 needs target >= iOS16, which doesn't support neuralnetwork.") if input_type == np.float16 and _macos_version() < (13, 0): pytest.skip( "Input float16 needs target >= iOS16, which is not available until macOS 13." ) class TestModel(torch.nn.Module): def forward(self, input_data): return input_data.to() inputs = [TensorType(name="input_data", shape=(1, 2, 3), dtype=input_type)] # The float16 dtype for inputs is only supported for deployment target >= iOS16/macOS13. minimum_deployment_target = ( ct.target.iOS16 if input_type == np.float16 else None ) self.run_compare_torch( inputs, TestModel(), backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ) ) def test_fold_const(self, compute_unit: ct.ComputeUnit.CPU_ONLY, backend: List[Tuple[str]]): class TestModel(torch.nn.Module): def forward(self, x): return torch.arange(0, 3).float() model = TestModel() mlmodel = self.run_compare_torch( [(1, 2, 3)], model, backend=backend, compute_unit=compute_unit ) prog = mlmodel[1]._mil_program # The range_1d op translated from `torch.arange` is folded to const. assert len(prog.find_ops(op_type="range_1d")) == 0 with patch.object(Var, '_is_nonreplaceable_var') as mocked_is_nonreplaceable_var: # Mock that only the range_1d op is not replaceable. mocked_is_nonreplaceable_var.side_effect = ( lambda var: var.op and "range_1d" in var.op.op_type ) mlmodel = self.run_compare_torch( [(1, 2, 3)], model, backend=backend, compute_unit=compute_unit ) prog = mlmodel[1]._mil_program # The range_1d op translated from `torch.arange` shouldn't be folded. 
assert len(prog.find_ops(op_type="range_1d")) == 1 class TestSlice(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, start, end, step", itertools.product( compute_units, backends, frontends, (0, -5, None), (7, -1, 100, None), (1, 2, None) ), ) def test_slice(self, compute_unit, backend, frontend, start, end, step): class SliceModel(torch.nn.Module): def forward(self, x): y = x[start:end:step] return y model = SliceModel() model.eval() self.run_compare_torch( (9,), model, frontend=frontend, backend=backend, compute_unit=compute_unit ) @pytest.mark.skipif(_python_version() < (3, 6), reason="requires python 3.6") @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_dynamic_slice(self, compute_unit, backend, frontend): if frontend in TORCH_EXPORT_BASED_FRONTENDS: pytest.xfail( "https://github.com/apple/coremltools/issues/2189: " "torch.export Cannot Use Dynamic Index to Slice" ) class DynamicSlicer(torch.nn.Module): def forward(self, x, context_length): return x[context_length:, :, :] class Model(torch.nn.Module): def __init__(self): super(Model, self).__init__() self.tokens_embedding = torch.nn.Embedding(10, 10, 0) self.context_embedding = torch.nn.Embedding(10, 10, 0) self.dynamic_slicer = DynamicSlicer() def forward(self, tokens, context, context_length): # CoreML requires rank1~5 input, so we use rank 1 for # context-length tokens_embeddings = self.tokens_embedding(tokens) context_embeddings = self.context_embedding(context) embeddings = torch.cat((context_embeddings, tokens_embeddings), dim=0) embeddings = self.dynamic_slicer(embeddings, torch.squeeze(context_length)) return embeddings model = Model() batch_size = 5 inputs = [ TensorType(name="tokens", shape=(10, batch_size), dtype=np.int64), TensorType(name="context", shape=(3, batch_size), dtype=np.int64), TensorType(name="context_length", shape=(1,), dtype=np.int32), ] self.run_compare_torch( inputs, model, rand_range=(0, 8), frontend=frontend, backend=backend, compute_unit=compute_unit, ) class TestRepeat(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, rank", itertools.product(compute_units, backends, frontends, range(1, 6)), ) def test_repeat(self, compute_unit, backend, frontend, rank): if frontend in TORCH_EXPORT_BASED_FRONTENDS: pytest.skip("ectedly found a in the inputs") input_shape = np.random.randint(low=2, high=6, size=rank) repeats = np.random.randint(low=2, high=4, size=rank) input_shape = tuple(input_shape) model = ModuleWrapper(function=lambda x: x.repeat(*repeats)) self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, frontend=frontend ) @pytest.mark.parametrize( "compute_unit, backend, frontend, rank", itertools.product(compute_units, backends, frontends, (1, 2)), ) def test_repeats_with_extra_dimensions(self, compute_unit, backend, frontend, rank): if frontend in TORCH_EXPORT_BASED_FRONTENDS: pytest.skip("unexpectedly found a in the inputs") input_shape = np.random.randint(low=2, high=6, size=rank) for num_extra_dims in (1, 2): repeats = np.random.randint(low=2, high=4, size=rank + num_extra_dims) model = ModuleWrapper(function=lambda x: x.repeat(*repeats)) self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, frontend=frontend ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_repeats_with_enumerated_shape_case1(self, compute_unit, backend, 
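# --- Illustrative sketch, not exercised by the tests in this file -------------
# The dynamic-slice test above feeds integer tensors (token ids and a length)
# into the converter by declaring them as integer TensorTypes. A pared-down
# embedding lookup with a single int32 input might be converted like this; the
# vocabulary size, embedding width, and names are arbitrary choices.
import numpy as np
import torch
import coremltools as ct

class _Lookup(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.embedding = torch.nn.Embedding(10, 8)

    def forward(self, tokens):
        return self.embedding(tokens)

_traced = torch.jit.trace(_Lookup().eval(), torch.zeros(5, dtype=torch.int64))
_mlmodel = ct.convert(
    _traced,
    inputs=[ct.TensorType(name="tokens", shape=(5,), dtype=np.int32)],
)
# ------------------------------------------------------------------------------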
frontend): class Model(nn.Module): def forward(self, x, y): reps = x.size(0) return y.repeat(reps) enumerated_shapes = ct.EnumeratedShapes(shapes=[(1, 1), (2, 1)]) module = Model() inputs = [torch.tensor([[1]]), torch.tensor([2])] self.run_compare_torch( inputs, module, input_as_shape=False, converter_input_type=[ ct.TensorType(shape=enumerated_shapes), ct.TensorType(shape=(1,)), ], backend=backend, compute_unit=compute_unit, frontend=frontend, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_repeats_with_enumerated_shape_case2(self, compute_unit, backend, frontend): class Model(nn.Module): def forward(self, x, y): return y.repeat(x.size(0), x.size(1)) enumerated_shapes = ct.EnumeratedShapes(shapes=[(1, 1), (2, 1)]) module = Model() inputs = [torch.tensor([[1], [2]]), torch.tensor([2])] self.run_compare_torch( inputs, module, input_as_shape=False, converter_input_type=[ ct.TensorType(shape=enumerated_shapes), ct.TensorType(shape=(1,)), ], backend=backend, compute_unit=compute_unit, frontend=frontend, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_repeats_with_symbolic_shape(self, compute_unit, backend, frontend): class Model(nn.Module): def forward(self, x, y): return y.repeat([x.shape[-1], 1, x.shape[0]]) module = Model() inputs = [torch.tensor([[1], [2]]), torch.tensor([2])] upper_bound = 10 if backend[0] == "mlprogram" else -1 self.run_compare_torch( inputs, module, input_as_shape=False, converter_input_type=[ ct.TensorType( shape=( ct.RangeDim(upper_bound=upper_bound), ct.RangeDim(upper_bound=upper_bound), ) ), ct.TensorType(shape=(1,)), ], backend=backend, compute_unit=compute_unit, frontend=frontend, ) class TestRepeatInterleave(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, rank, dim, repeat", itertools.product( compute_units, backends, frontends, (1, 3, 5), (None, 0, 1, 2, 3, 4), (1, torch.tensor(1), torch.tensor([1]), 2, torch.tensor(3), torch.tensor([4])), ), ) def test_scalar_repeat(self, compute_unit, backend, frontend, rank, dim, repeat): if dim is not None and dim >= rank: pytest.skip() if frontend == TorchFrontend.EXECUTORCH: pytest.skip("torch._ops.aten.repeat_interleave.Tensor is not Aten Canonical") input_shape = tuple(np.random.randint(low=1, high=6, size=rank)) model = ModuleWrapper(function=lambda x: x.repeat_interleave(repeat, dim=dim)) mlmodel = self.run_compare_torch( input_shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend, )[1] # when repeat = 1, repeat_interelave is a noop if repeat in (1, torch.tensor(1), torch.tensor([1])): assert get_op_types_in_program(mlmodel._mil_program) in ( ["identity"], ["identity", "identity"], ["cast", "cast"], ["reshape"], ["cast", "reshape", "cast"], ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_single_fill_tensor_repeat(self, compute_unit, backend, frontend): if frontend == TorchFrontend.EXECUTORCH: pytest.skip("torch._ops.aten.repeat_interleave.Tensor is not Aten Canonical") input_shape = (3, 2) model = ModuleWrapper(function=lambda x: x.repeat_interleave(torch.tensor([2, 2]), dim=1)) self.run_compare_torch( input_shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend, ) def test_unsupported_tensor_repeat(self): input_shape = (4, 1, 3) model = ModuleWrapper( function=lambda x: 
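# --- Illustrative sketch, not exercised by the tests in this file -------------
# test_repeats_with_enumerated_shape_* rely on ct.EnumeratedShapes, which
# restricts a flexible input to a fixed set of allowed shapes. Outside the test
# harness the same converter option can be passed directly; the module and the
# particular shape set below are stand-ins chosen for this example.
import torch
import coremltools as ct

class _Double(torch.nn.Module):
    def forward(self, x):
        return x * 2.0

_traced = torch.jit.trace(_Double().eval(), torch.rand(2, 1))
_mlmodel = ct.convert(
    _traced,
    inputs=[
        ct.TensorType(
            name="x",
            shape=ct.EnumeratedShapes(shapes=[(1, 1), (2, 1), (4, 1)]),
        )
    ],
)
# ------------------------------------------------------------------------------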
x.repeat_interleave(torch.tensor([1, 2, 3]), dim=2) ) with pytest.raises( NotImplementedError, match=r"Conversion for torch.repeat_interleave with Tensor repeats has not been implemented", ): self.run_compare_torch(input_shape, model) @pytest.mark.parametrize( "compute_unit, backend, frontend, dim", itertools.product( compute_units, backends, frontends, (None, -4, -3, -2, -1), ), ) def test_dynamic(self, compute_unit, backend, frontend, dim): if frontend == TorchFrontend.EXECUTORCH: pytest.skip("ExecuTorch size op does not work on FakeTensor") if platform.machine() == "x86_64": pytest.xfail("rdar://135843153 ([Bug] Models failed on x86_64 platform)") input_shape = (2, 3, 5, 7) class Model(torch.nn.Module): def forward(self, x): return x.repeat_interleave(2, dim=dim) model = Model() torch_export_dynamic_shapes = None if frontend in TORCH_EXPORT_BASED_FRONTENDS: batch_dim = torch.export.Dim(name="batch_dim", max=128) sequence_length = torch.export.Dim(name="sequence_length", max=256) torch_export_dynamic_shapes = {"x": {0: batch_dim, 2: sequence_length}} converter_input_type = None if frontend == TorchFrontend.TORCHSCRIPT: batch_dim = RangeDim(lower_bound=2, upper_bound=128) sequence_length = RangeDim(lower_bound=2, upper_bound=256) input_symbolic_shape = (batch_dim, 3, sequence_length, 7) converter_input_type = [TensorType(shape=input_symbolic_shape)] self.run_compare_torch( input_shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend, torch_export_dynamic_shapes=torch_export_dynamic_shapes, converter_input_type=converter_input_type, ) class TestStd(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, unbiased", itertools.product(compute_units, backends, [True, False]), ) def test_std_2_inputs(self, compute_unit, backend, unbiased): model = ModuleWrapper(function=torch.std, kwargs={"unbiased": unbiased}) x = torch.randn(1, 5, 10) * 3 out = torch.std(x, unbiased=unbiased).unsqueeze(0) self.run_compare_torch( x, model, expected_results=out, input_as_shape=False, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, unbiased, dim, keepdim", itertools.product( compute_units, backends, [True, False], [[0, 2], [1], [2]], [True, False] ), ) def test_std_4_inputs(self, compute_unit, backend, unbiased, dim, keepdim): model = ModuleWrapper( function=torch.std, kwargs={"unbiased": unbiased, "dim": dim, "keepdim": keepdim}, ) input_shape = (2, 5, 10) self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit ) class TestOnesLike(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, backends, [1, 3], ), ) def test_ones_like_static(self, compute_unit, backend, rank): class OnesLikeStaticModel(nn.Module): def forward(self, x): return torch.ones_like(x) input_shape = np.random.randint(low=2, high=6, size=rank) input_shape = tuple(input_shape) model = OnesLikeStaticModel() self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, [ ["neuralnetwork", "fp32", ct.target.iOS14], ["mlprogram", "fp16", ct.target.iOS15], ["mlprogram", "fp32", ct.target.iOS15], ["mlprogram", "fp16", ct.target.iOS16], ["mlprogram", "fp32", ct.target.iOS16], ], [1, 3], ), ) def test_ones_like_dynamic(self, compute_unit, backend, rank): if _macos_version() < (13, 0) and backend[2] == ct.target.iOS16: pytest.skip("iOS16 target not available on macOS 13") class 
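# --- Illustrative sketch, not exercised by the tests in this file -------------
# For the torch.export based frontends, test_dynamic expresses dynamic
# dimensions with torch.export.Dim and hands the mapping to the export call.
# The ExportedProgram conversion path is the newer one, so treat the flow below
# as a sketch rather than a guaranteed recipe; dimension names and bounds are
# illustrative.
import torch
import coremltools as ct

class _RepeatRows(torch.nn.Module):
    def forward(self, x):
        return x.repeat_interleave(2, dim=0)

_batch = torch.export.Dim("batch", min=2, max=128)
_exported = torch.export.export(
    _RepeatRows().eval(),
    (torch.rand(4, 7),),
    dynamic_shapes={"x": {0: _batch}},
)
_mlmodel = ct.convert(_exported)
# ------------------------------------------------------------------------------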
OnesLikeDynamicModel(nn.Module): def forward(self, x): if rank == 1: h = x[0] x = torch.zeros(h) elif rank == 3: h, w, d = x[0], x[1], x[2] x = torch.zeros(h, w, d) return torch.ones_like(x) input_shape = np.random.randint(low=2, high=6, size=rank) torch_in = torch.tensor(input_shape) model = OnesLikeDynamicModel() torch_out = model(torch_in) self.run_compare_torch( torch_in, model, expected_results=torch_out, input_as_shape=False, backend=backend[:2], compute_unit=compute_unit, minimum_deployment_target=backend[2], ) class TestFill(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, rank, dynamic, fill_scalar, src_dtype", itertools.product( compute_units, backends, frontends, [1, 3], [False, True], [0.2, torch.tensor(float("-inf")), torch.tensor(2)], [torch.int32, torch.float32], ), ) def test_fill_(self, compute_unit, backend, frontend, rank, dynamic, fill_scalar, src_dtype): if src_dtype == torch.int32 and fill_scalar == torch.tensor(float("-inf")): pytest.skip("float(-inf) cannot be casted to int.") if ( backend[0] == "neuralnetwork" and fill_scalar == 0.2 and src_dtype == torch.int32 and frontend in TORCH_EXPORT_BASED_FRONTENDS ): pytest.xfail("rdar://133816197 Cast mb.fill output dtype to EXIR specification") input_shape = np.random.randint(low=2, high=6, size=rank) input_shape = tuple(input_shape) if frontend == TorchFrontend.TORCHSCRIPT: class FillModel(nn.Module): def forward(self, x): y = torch.empty(x.shape, dtype=src_dtype) y.fill_(fill_scalar) return y model = FillModel() else: class FillModel(nn.Module): def __init__(self, fill_scalar): super().__init__() self.fill_scalar = fill_scalar def forward(self, x): y = torch.empty(x.shape, dtype=src_dtype) y.fill_(self.fill_scalar) return y model = FillModel(fill_scalar) if dynamic: upper_bound = 10 if backend[0] == "mlprogram" else -1 if rank == 1: converter_input_type = [ ct.TensorType( shape=( ct.RangeDim(upper_bound=upper_bound), ) ), ] else: converter_input_type = [ ct.TensorType( shape=( ct.RangeDim(upper_bound=upper_bound), ct.RangeDim(upper_bound=upper_bound), ct.RangeDim(upper_bound=upper_bound), ) ), ] else: converter_input_type = None self.run_compare_torch( input_shape, model, converter_input_type=converter_input_type, compute_unit=compute_unit, backend=backend, frontend=frontend, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, rank, dynamic, fill_scalar, src_dtype", itertools.product( compute_units, backends, frontends, [1, 3], [False, True], [0.2, torch.tensor(float("-inf")), torch.tensor(2)], [torch.int32, torch.float32], ), ) def test_fill__2(self, compute_unit, backend, frontend, rank, dynamic, fill_scalar, src_dtype): if src_dtype == torch.int32 and fill_scalar == torch.tensor(float("-inf")): pytest.skip("float(-inf) cannot be casted to int.") if ( backend[0] == "neuralnetwork" and fill_scalar == 0.2 and src_dtype == torch.int32 and frontend in TORCH_EXPORT_BASED_FRONTENDS ): pytest.xfail("rdar://133816197 Cast mb.fill output dtype to EXIR specification") input_shape = np.random.randint(low=2, high=6, size=rank) input_shape = tuple(input_shape) if frontend == TorchFrontend.TORCHSCRIPT: class FillModel(nn.Module): def forward(self, x): y = torch.empty(x.shape, dtype=src_dtype) y.fill_(fill_scalar) return y + 1 model = FillModel() else: class FillModel(nn.Module): def __init__(self, fill_scalar): super().__init__() self.fill_scalar = fill_scalar def forward(self, x): y = torch.empty(x.shape, dtype=src_dtype) y.fill_(self.fill_scalar) return y + 1 model = FillModel(fill_scalar) if 
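# --- Illustrative sketch, not exercised by the tests in this file -------------
# The dynamic fill_ variants above describe flexible TorchScript inputs with
# ct.RangeDim, bounded on the mlprogram backend and unbounded (-1) on
# neuralnetwork. A stand-alone conversion with bounded flexible dimensions
# could look like this; the module, bounds, and names are placeholders.
import torch
import coremltools as ct

class _FillModel(torch.nn.Module):
    def forward(self, x):
        y = torch.empty(x.shape)
        y.fill_(0.2)
        return y + 1

_traced = torch.jit.trace(_FillModel().eval(), torch.rand(3, 4))
_mlmodel = ct.convert(
    _traced,
    inputs=[
        ct.TensorType(
            name="x",
            shape=(ct.RangeDim(upper_bound=10), ct.RangeDim(upper_bound=10)),
        )
    ],
    convert_to="mlprogram",
)
# ------------------------------------------------------------------------------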
dynamic: upper_bound = 10 if backend[0] == "mlprogram" else -1 if rank == 1: converter_input_type = [ ct.TensorType( shape=( ct.RangeDim(upper_bound=upper_bound), ) ), ] else: converter_input_type = [ ct.TensorType( shape=( ct.RangeDim(upper_bound=upper_bound), ct.RangeDim(upper_bound=upper_bound), ct.RangeDim(upper_bound=upper_bound), ) ), ] else: converter_input_type = None self.run_compare_torch( input_shape, model, converter_input_type=converter_input_type, compute_unit=compute_unit, backend=backend, frontend=frontend, ) class TestCopy(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, rank", itertools.product( compute_units, backends, frontends, [1, 3], ), ) def test_copy_(self, compute_unit, backend, frontend, rank): input_shape = np.random.randint(low=2, high=6, size=rank) input_shape = tuple(input_shape) class CopyModel(nn.Module): def forward(self, x): y = torch.empty(x.shape) y.copy_(x) return y model = CopyModel() self.run_compare_torch( input_shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, rank", itertools.product( compute_units, backends, frontends, [1, 3], ), ) def test_copy__2(self, compute_unit, backend, frontend, rank): input_shape = np.random.randint(low=2, high=6, size=rank) input_shape = tuple(input_shape) class CopyModel(nn.Module): def forward(self, x): y = torch.empty(x.shape) y.copy_(x) return y + 1 model = CopyModel() self.run_compare_torch( input_shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) class TestZeros(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, backends, [1, 3], ), ) def test_zeros_like_static(self, compute_unit, backend, rank): class ZerosLikeStaticModel(nn.Module): def forward(self, x): return torch.zeros_like(x) input_shape = np.random.randint(low=2, high=6, size=rank) input_shape = tuple(input_shape) model = ZerosLikeStaticModel() self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, [ ["neuralnetwork", "fp32", ct.target.iOS14], ["mlprogram", "fp16", ct.target.iOS15], ["mlprogram", "fp32", ct.target.iOS15], ["mlprogram", "fp16", ct.target.iOS16], ["mlprogram", "fp32", ct.target.iOS16], ], [1, 3], ), ) def test_zeros_like_dynamic(self, compute_unit, backend, rank): if _macos_version() < (13, 0) and backend[2] == ct.target.iOS16: pytest.skip("iOS16 target not available on macOS 13") class ZerosLikeDynamicModel(nn.Module): def forward(self, x): if rank == 1: h = x[0] x = torch.zeros(h) elif rank == 3: h, w, d = x[0], x[1], x[2] x = torch.zeros(h, w, d) return torch.zeros_like(x) input_shape = np.random.randint(low=2, high=6, size=rank) torch_in = torch.tensor(input_shape, dtype=torch.int32) model = ZerosLikeDynamicModel() torch_out = model(torch_in) self.run_compare_torch( torch_in, model, expected_results=torch_out, input_as_shape=False, backend=backend[:2], compute_unit=compute_unit, minimum_deployment_target=backend[2], ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ) ) def test_zeros_like_static_fold_to_const(self, compute_unit, backend): class TestModel(nn.Module): def forward(self, x): x = torch.arange(0, 3) return torch.zeros_like(x) model = TestModel() mlmodel = self.run_compare_torch( [(1, 2, 3)], model, backend=backend, compute_unit=compute_unit ) prog = 
mlmodel[1]._mil_program # The empty_like op is folded to const, so there is no fill nor fill_like op. assert len(prog.find_ops(op_type="fill")) + len(prog.find_ops(op_type="fill_like")) == 0 @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, backends, [1, 3], ), ) def test_zeros_static(self, compute_unit, backend, rank): class ZerosStaticModel(nn.Module): def forward(self, x): if rank == 1: return torch.zeros(1) elif rank == 3: return torch.zeros(2, 3, 5) input_shape = np.random.randint(low=2, high=6, size=rank) input_shape = tuple(input_shape) model = ZerosStaticModel() self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product( compute_units, backends, [1, 3], ), ) def test_zeros_dynamic(self, compute_unit, backend, rank): class ZerosDynamicModel(nn.Module): def forward(self, x): if rank == 1: h = x[0] x = torch.zeros(h) elif rank == 3: h, w, d = x[0], x[1], x[2] x = torch.zeros(h, w, d) return x input_shape = np.random.randint(low=2, high=6, size=rank) torch_in = torch.tensor(input_shape, dtype=torch.int32) model = ZerosDynamicModel() torch_out = model(torch_in) self.run_compare_torch( torch_in, model, expected_results=torch_out, input_as_shape=False, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ) ) def test_zeros_static_fold_to_const(self, compute_unit, backend): class TestModel(nn.Module): def forward(self, x): return torch.zeros(2, 3, 5) model = TestModel() mlmodel = self.run_compare_torch( [(1, 2, 3)], model, backend=backend, compute_unit=compute_unit ) prog = mlmodel[1]._mil_program # The zeros op is folded to const. assert len(prog.find_ops(op_type="fill")) == 0 with patch.object(Var, '_is_nonreplaceable_var') as mocked_is_nonreplaceable_var: # Mock that the size parameter to torch.zeros is non-replaceable. mocked_is_nonreplaceable_var.side_effect = ( lambda var: var.op and var.rank == 1 and var.val.shape == (3, ) and np.all(var.val == [2, 3, 5]) ) mlmodel = self.run_compare_torch( [(1, 2, 3)], model, backend=backend, compute_unit=compute_unit ) prog = mlmodel[1]._mil_program # The zeros op is not folded to const. 
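# --- Illustrative sketch, not exercised by the tests in this file -------------
# Several cases here assert on which MIL ops survive conversion, e.g. that a
# constant torch.zeros is folded away. They reach into the converted model via
# mlmodel._mil_program and prog.find_ops, which are internal attributes rather
# than public API; the sketch below mirrors that pattern under that caveat.
import torch
import coremltools as ct

class _ConstZeros(torch.nn.Module):
    def forward(self, x):
        return torch.zeros(2, 3, 5)

_traced = torch.jit.trace(_ConstZeros().eval(), torch.rand(1, 2, 3))
_mlmodel = ct.convert(
    _traced,
    inputs=[ct.TensorType(name="x", shape=(1, 2, 3))],
    convert_to="mlprogram",
)
_prog = _mlmodel._mil_program  # internal attribute, used the same way the tests do
assert len(_prog.find_ops(op_type="fill")) == 0  # the constant zeros folded to const
# ------------------------------------------------------------------------------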
assert len(prog.find_ops(op_type="fill")) == 1 class TestTopk(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, largest, sort, dynamic, shape_dim_k", itertools.product( compute_units, backends, [True, False], [True, False], [True, False], [((4, 6, 7, 3), -1, 2), ((10, 3, 4), 2, 2), ((5,), 0, 2)], ), ) def test_topk(self, compute_unit, backend, largest, sort, dynamic, shape_dim_k): if not sort and backend[0] == "neuralnetwork": pytest.xfail("iOS16 version topk needed for sort = False") if not sort and _macos_version() < (13, 0): pytest.skip("New functionality in macOS13/iOS16") if ( backend[0] == "mlprogram" and largest and sort and not dynamic and shape_dim_k == ((4, 6, 7, 3), -1, 2) ): pytest.xfail( "rdar://132358055 Why It Randomly Numerically Fails on CI but Cannot Reproduce Locally " ) input_shape = shape_dim_k[0] dim = shape_dim_k[1] k = shape_dim_k[2] class TopkModel(nn.Module): def forward(self, x, y): if dynamic: nonlocal k k = torch.min(y) topk = torch.topk(x, k, dim=dim, largest=largest, sorted=sort) values, indices = topk.values, topk.indices if not sort: values, _ = torch.sort(values, dim=dim) indices, _ = torch.sort(indices, dim=dim) return values, indices, y + 1 input_data = torch.rand(input_shape) k_list = torch.tensor([k + 1, k, k + 2]) model = TopkModel() expected_results = model(input_data, k_list) self.run_compare_torch( [input_data, k_list], model, expected_results=expected_results, input_as_shape=False, backend=backend, compute_unit=compute_unit, minimum_deployment_target=ct.target.iOS16 if not sort else None, ) @pytest.mark.parametrize( "compute_unit, backend, x_dtype", itertools.product( compute_units, [("mlprogram", "fp16")], [np.float32, np.float16, np.int32, np.int16, np.uint16], ), ) def test_topk_ios17(self, compute_unit, backend, x_dtype): if x_dtype == np.float16: pytest.skip("PyTorch doesn't support fp16 topk.") if x_dtype == np.uint16: pytest.skip("PyTorch doesn't have uint16 data type.") x_torch_dtype = NUM_TO_TORCH_DTYPE[NUMPY_DTYPE_TO_TORCH_NUM[x_dtype]] class TopkModel(nn.Module): def forward(self, x, y): topk = torch.topk(x.to(x_torch_dtype), k=2, dim=-1, largest=True, sorted=True) return topk.values + y input_data_x = torch.randint(low=0, high=100, size=(2, 3, 4)) input_data_y = torch.randint(low=0, high=100, size=(1,)) model = TopkModel() expected_results = model(input_data_x, input_data_y) mlmodel = self.run_compare_torch( [input_data_x, input_data_y], model, expected_results=expected_results, input_as_shape=False, backend=backend, compute_unit=compute_unit, minimum_deployment_target=ct.target.iOS17, ) prog = mlmodel[1]._mil_program topk_op = prog.find_ops(op_type="topk", exactly_one=True)[0] expected_topk_x_dtype = types.type_mapping.numpy_type_to_builtin_type(x_dtype) if backend[1] == "fp16" and x_dtype == np.float32: # For fp16 precision the fp32 input/output will be cast to fp16. 
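# --- Illustrative sketch, not exercised by the tests in this file -------------
# TestTopk gates sorted=False behind iOS16, since the older topk op always
# sorts. A stand-alone conversion of a plain, sorted top-k needs no special
# deployment target and looks roughly like this; k, the shape, and names are
# arbitrary choices for the example.
import torch
import coremltools as ct

class _TopK(torch.nn.Module):
    def forward(self, x):
        values, indices = torch.topk(x, k=2, dim=-1, largest=True, sorted=True)
        return values, indices

_traced = torch.jit.trace(_TopK().eval(), torch.rand(3, 4))
_mlmodel = ct.convert(_traced, inputs=[ct.TensorType(name="x", shape=(3, 4))])
# ------------------------------------------------------------------------------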
expected_topk_x_dtype = types.fp16 assert topk_op.x.dtype == expected_topk_x_dtype class TestLog10(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, rank", itertools.product(compute_units, backends, frontends, range(1, 6)), ) def test_log10(self, compute_unit, backend, frontend, rank): class Log10Model(nn.Module): def forward(self, x): return torch.log10(x) input_shape = tuple(np.random.randint(low=1, high=10, size=rank)) model = Log10Model() self.run_compare_torch( input_shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend ) class TestLog2(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, rank", itertools.product(compute_units, backends, frontends, range(1, 6)), ) def test_log2(self, compute_unit, backend, frontend, rank): class Log2Model(nn.Module): def __init__(self): super(Log2Model, self).__init__() def forward(self, x): return torch.log2(x) input_shape = tuple(np.random.randint(low=1, high=10, size=rank)) model = Log2Model() self.run_compare_torch( input_shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend ) class TestUnique(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, x, return_inverse, return_counts", itertools.product( compute_units, backends, frontends, ( [1, 2, 3, 2, 2, 3, 99, -1, 1], [[1, 2, 3, 100], [3, 2, 99, 1]], ), (True, False), (True, False), ), ) def test(self, compute_unit, backend, frontend, x, return_inverse, return_counts): if frontend in TORCH_EXPORT_BASED_FRONTENDS: pytest.skip("torch._dynamo.exc.Unsupported: dynamic shape operator: aten._unique2") class Model(nn.Module): def forward(self, x): return torch.unique(x, return_inverse=return_inverse, return_counts=return_counts) if backend[0] == "neuralnetwork": pytest.xfail("This op is only supported on mlprogram backend.") self.run_compare_torch( torch.Tensor(x), Model(), input_as_shape=False, frontend=frontend, backend=backend, compute_unit=compute_unit, ) class TestFlip(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, rank_dim", itertools.product( compute_units, backends, frontends, [(1, [0]), (2, [0, 1]), (3, [1]), (4, [0, 1, 2, 3])], ), ) def test_flip(self, compute_unit, backend, frontend, rank_dim): rank, dim = rank_dim class FlipModel(nn.Module): def forward(self, x): return torch.flip(x, dim) input_shape = tuple(np.random.randint(low=1, high=10, size=rank)) model = FlipModel() self.run_compare_torch( input_shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend ) class TestBitWiseLogical(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, x_y, op_string", itertools.product( compute_units, backends, frontends, [ ([True, False, True, False], [True, True, False, False]), ([[True, False], [True, False]], [[True, True], [False, False]]), ([[True, False], [True, False]], [[1, 0], [2, 1]]), ([-1.5, 0.0, 1.0, 0.0], [0.1, 2.5, 0.0, 0.0]), ([2, 0, -1, 0, 5], [1, 1, 0, 0, -5]), ], [ "eq", "ne", ], ), ) def test_bitwise_logical(self, compute_unit, backend, frontend, x_y, op_string): if not contains_op(torch, op_string): return op_func = getattr(torch, op_string) model = ModuleWrapper(function=op_func) x = torch.tensor(x_y[0]) y = torch.tensor(x_y[1]) self.run_compare_torch( [x, y], model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) class TestLogicalAnd(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, x_y", itertools.product( compute_units, backends, 
frontends, [ ([True, False, True, False], [True, True, False, False]), ([[True, False], [True, False]], [[True, True], [False, False]]), ([-1.5, 0.0, 1.0, 0.0], [0.1, 2.5, 0.0, 0.0]), ([2, 0, -1, 0, 5], [1, 1, 0, 0, -5]), ], ), ) def test_logical_and(self, compute_unit, backend, frontend, x_y): class TestNet(nn.Module): def forward(self, x, y): return torch.logical_and(x, y) model = TestNet() x = torch.tensor(x_y[0]) y = torch.tensor(x_y[1]) self.run_compare_torch( [x, y], model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) class TestLogicalOr(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, x_y", itertools.product( compute_units, backends, frontends, [ ([True, False, True, False], [True, True, False, False]), ([[True, False], [True, False]], [[True, True], [False, False]]), ([-1.5, 0.0, 1.0, 0.0], [0.1, 2.5, 0.0, 0.0]), ([2, 0, -1, 0, 5], [1, 1, 0, 0, -5]), ], ), ) def test_logical_or(self, compute_unit, backend, frontend, x_y): class TestNet(nn.Module): def forward(self, x, y): return torch.logical_or(x, y) model = TestNet() x = torch.tensor(x_y[0]) y = torch.tensor(x_y[1]) self.run_compare_torch( [x, y], model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) class TestLogicalXor(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, x_y", itertools.product( compute_units, backends, frontends, [ ([True, False, True, False], [True, True, False, False]), ([[True, False], [True, False]], [[True, True], [False, False]]), ([-1.5, 0.0, 1.0, 0.0], [0.1, 2.5, 0.0, 0.0]), ([2, 0, -1, 0, 5], [1, 1, 0, 0, -5]), ], ), ) def test_logical_xor(self, compute_unit, backend, frontend, x_y): class TestNet(nn.Module): def forward(self, x, y): return torch.logical_xor(x, y) model = TestNet() x = torch.tensor(x_y[0]) y = torch.tensor(x_y[1]) self.run_compare_torch( [x, y], model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) class TestLogicalNot(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, input_dtype", itertools.product( compute_units, backends, frontends, [torch.int32, torch.float32, torch.bool], ), ) def test_logical_not(self, compute_unit, backend, frontend, input_dtype): class TestModel(torch.nn.Module): def forward(self, x): return torch.logical_not(x) input_data = torch.randint( low=0, high=2 if input_dtype == torch.bool else 4, size=(2, 3, 4), dtype=input_dtype ) self.run_compare_torch( input_data, TestModel(), frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, input_dtype, output_dtype", itertools.product( compute_units, backends, frontends, [torch.int32, torch.float32, torch.bool], [torch.int16, torch.float16, torch.bool], ), ) def test_logical_not_with_out(self, compute_unit, backend, frontend, input_dtype, output_dtype): class TestModel(torch.nn.Module): def forward(self, x): out_tensor = torch.empty((2, 3, 4), dtype=output_dtype) torch.logical_not(x, out=out_tensor) return out_tensor input_data = torch.randint( low=0, high=2 if input_dtype == torch.bool else 4, size=(2, 3, 4), dtype=input_dtype ) self.run_compare_torch( input_data, TestModel(), frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) class TestWhere(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, shape", itertools.product(compute_units, backends, frontends, [(2, 6), (3, 4, 5)]), ) 
def test_where_test1(self, compute_unit, backend, frontend, shape): class WhereModel(nn.Module): def forward(self, x, y): return torch.where(x > 0.5, x, y) input_shape = [shape, shape] model = WhereModel() self.run_compare_torch( input_shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend, frontend, shape", itertools.product(compute_units, backends, frontends, [(2, 6), (3, 4, 5)]), ) def test_where_test2(self, compute_unit, backend, frontend, shape): class WhereModel(nn.Module): def forward(self, cond, x, y): return torch.where(cond, x, y) cond = torch.rand(*shape) > 0.5 inputs = [cond, torch.rand(*shape), torch.rand(*shape)] model = WhereModel() expected_results = model(*inputs) self.run_compare_torch( inputs, model, frontend=frontend, backend=backend, compute_unit=compute_unit, expected_results=expected_results, input_as_shape=False, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, shapes", itertools.product( compute_units, backends, frontends, [ [(1, 2), (1, 2), (1, 1)], [(1, 2, 3), (1, 1, 1), (1, 1, 3)], ], ), ) def test_where_test3(self, compute_unit, backend, frontend, shapes): class WhereModel(nn.Module): def forward(self, cond, x, y): return torch.where(cond, x, y) cond_shape, x_shape, y_shape = shapes cond = torch.rand(*cond_shape) > 0.5 inputs = [cond, torch.rand(*x_shape), torch.rand(*y_shape)] model = WhereModel() expected_results = model(*inputs) self.run_compare_torch( inputs, model, frontend=frontend, backend=backend, compute_unit=compute_unit, expected_results=expected_results, input_as_shape=False, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, shapes, xdtype, ydtype", itertools.product( compute_units, backends, frontends, [ [(1, 2), (1, 2), (1, 1)], [(1, 2, 3), (1, 2, 1), (1, 1, 3)], ], (torch.float16, torch.float32), (torch.float16, torch.float32), ), ) def test_where_mixed_precision(self, compute_unit, backend, frontend, shapes, xdtype, ydtype): class WhereModel(nn.Module): def forward(self, cond, x, y): a = x.to(xdtype) b = y.to(ydtype) return torch.where(cond, a, b) cond_shape, x_shape, y_shape = shapes cond = torch.rand(*cond_shape) > 0.5 inputs = [cond, torch.rand(*x_shape), torch.rand(*y_shape)] self.run_compare_torch( inputs, WhereModel(), compute_unit=compute_unit, frontend=frontend, backend=backend, input_as_shape=False, rtol=1e-6 if xdtype == ydtype and xdtype == torch.float32 else 1e-3, atol=1e-6 if xdtype == ydtype and xdtype == torch.float32 else 1e-3, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, shape", itertools.product(compute_units, backends, frontends, COMMON_SHAPES + [(10,)]), ) def test_where_single_param(self, compute_unit, backend, frontend, shape): if frontend in TORCH_EXPORT_BASED_FRONTENDS: pytest.xfail( "https://github.com/apple/coremltools/issues/2183: " "Operator torch._ops.aten._assert_async.msg is not Aten Canonical" ) class WhereModelSingleParam(nn.Module): def forward(self, x): return torch.where(x) # Create a tensor of given shape with ~90% zero entries x = np.zeros(shape) all_indices = list(zip(*np.where(x == 0))) num_indices = len(all_indices) random_picks = np.random.choice( np.arange(num_indices), size=num_indices // 10, replace=False ) for i in random_picks: x[all_indices[i]] = np.random.choice([-1, 12, 100]) x = torch.Tensor(x) self.run_compare_torch( x, WhereModelSingleParam(), frontend=frontend, backend=backend, input_as_shape=False, compute_unit=compute_unit, ) class TestSelect(TorchBaseTest): 
@pytest.mark.parametrize( "compute_unit, backend, frontend, dim_index", itertools.product( compute_units, backends, frontends, [ [0, 0], [1, 1], [-1, -1], ], ), ) def test_select(self, compute_unit, backend, frontend, dim_index): dim, index = dim_index class SelectModel(nn.Module): def forward(self, x): return x.select(dim, index) input_shape = (1, 2, 3) self.run_compare_torch( input_shape, SelectModel(), frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product(compute_units, backends, frontends) ) def test_dynamic_index(self, compute_unit, backend, frontend): if frontend in TORCH_EXPORT_BASED_FRONTENDS: pytest.xfail( "https://github.com/apple/coremltools/issues/2189: " "torch.export Cannot Use Dynamic Index to Select" ) class M(torch.nn.Module): def forward(self, float_arr, int_arr): dynamic_index = int_arr[1] float_arr[dynamic_index] = 12.95 return float_arr a = torch.Tensor([1.0, 2.0, 4.0, 5]) i = torch.Tensor([0, 1, 2]).long() inputs_types = [ ct.TensorType(name="a", shape=a.shape), ct.TensorType(name="i", shape=i.shape, dtype=np.int32), ] self.run_compare_torch( [a, i], M(), input_as_shape=False, converter_input_type=inputs_types, frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product(compute_units, backends, frontends), ) def test_dynamic_index_with_explicit_slice_on_all_other_dims( self, compute_unit, backend, frontend ): class SelectModel(torch.nn.Module): def forward(self, x, position): y = x[:, :, position] return y self.run_compare_torch( [(2, 3, 4), (1,)], SelectModel(), input_dtype=np.int32, rand_range=(0, 2), frontend=frontend, backend=backend, compute_unit=compute_unit, ) class TestNonZero(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, rank, as_tuple", itertools.product( compute_units, backends, frontends, [1, 3], [False, True], ), ) def test_non_zero(self, compute_unit, backend, frontend, rank, as_tuple): if frontend in TORCH_EXPORT_BASED_FRONTENDS: pytest.skip("Cannot support _assert_async") if rank == 1: input_shape = 10 zeros_indices = np.array([1, 4, 7, 9]) elif rank == 3: input_shape = (2, 7, 3) zeros_indices = np.array([1, 12, 33, 40]) input = np.arange(np.prod(input_shape)).astype(np.float32) input[zeros_indices] = 0 input = np.reshape(input, input_shape) input = torch.tensor(input) model = ModuleWrapper( torch.nonzero, {"as_tuple": as_tuple}, ) self.run_compare_torch( input, model, input_as_shape=False, frontend=frontend, backend=backend, compute_unit=compute_unit, ) class TestTorchTensor(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, rank", itertools.product( compute_units, backends, frontends, [0, 1, 2, 3, 4, 5], ), ) def test_torch_tensor(self, compute_unit, backend, frontend, rank): if frontend == TorchFrontend.TORCHSCRIPT: class Model(nn.Module): def __init__(self, rank): super(Model, self).__init__() self.rank = rank def forward(self, x): with torch.no_grad(): if self.rank == 0: res = self.generate_tensor_rank_0(x) return torch.unsqueeze(res, 0) if self.rank == 1: return self.generate_tensor_rank_1(x) if self.rank == 2: return self.generate_tensor_rank_2(x) if self.rank == 3: return self.generate_tensor_rank_3(x) if self.rank == 4: return self.generate_tensor_rank_4(x) if self.rank == 5: return self.generate_tensor_rank_5(x) @torch.jit.script def generate_tensor_rank_0(x): _, _, _, w = x.shape return torch.tensor(w, 
dtype=torch.int32) @torch.jit.script def generate_tensor_rank_1(x): _, _, h, w = x.shape return torch.tensor([h, w, 0, 1], dtype=torch.int32) @torch.jit.script def generate_tensor_rank_2(x): _, _, h, w = x.shape return torch.tensor([[0, h], [h, w], [w, w]], dtype=torch.float32) @torch.jit.script def generate_tensor_rank_3(x): _, _, h, w = x.shape return torch.tensor([[[h, 1]], [[3, w]]], dtype=torch.int32) @torch.jit.script def generate_tensor_rank_4(x): _, _, h, w = x.shape return torch.tensor( [ [[[h, h], [h, w]], [[w, w], [w, 1]]], [[[0, 0], [1, 1]], [[0, h], [h, w]]], ], dtype=torch.float32, ) @torch.jit.script def generate_tensor_rank_5(x): _, _, h, w = x.shape return torch.tensor( [[[[[h, w], [w, w]], [[1, 1], [0, h]]]]], dtype=torch.float32 ) else: class Model(nn.Module): def __init__(self, rank): super(Model, self).__init__() self.rank = rank def forward(self, x): if self.rank == 0: return self.generate_tensor_rank_0(x) if self.rank == 1: return self.generate_tensor_rank_1(x) if self.rank == 2: return self.generate_tensor_rank_2(x) if self.rank == 3: return self.generate_tensor_rank_3(x) if self.rank == 4: return self.generate_tensor_rank_4(x) if self.rank == 5: return self.generate_tensor_rank_5(x) def generate_tensor_rank_0(self, x): _, _, _, w = x.shape return torch.tensor(w, dtype=torch.int32) def generate_tensor_rank_1(self, x): _, _, h, w = x.shape return torch.tensor([h, w, 0, 1], dtype=torch.int32) def generate_tensor_rank_2(self, x): _, _, h, w = x.shape return torch.tensor([[0, h], [h, w], [w, w]], dtype=torch.float32) def generate_tensor_rank_3(self, x): _, _, h, w = x.shape return torch.tensor([[[h, 1]], [[3, w]]], dtype=torch.int32) def generate_tensor_rank_4(self, x): _, _, h, w = x.shape return torch.tensor( [ [[[h, h], [h, w]], [[w, w], [w, 1]]], [[[0, 0], [1, 1]], [[0, h], [h, w]]], ], dtype=torch.float32, ) def generate_tensor_rank_5(self, x): _, _, h, w = x.shape return torch.tensor( [[[[[h, w], [w, w]], [[1, 1], [0, h]]]]], dtype=torch.float32 ) shape = (1, 1, 3, 4) model = Model(rank) self.run_compare_torch( shape, model, compute_unit=compute_unit, backend=backend, frontend=frontend ) @pytest.mark.parametrize( "compute_unit, backend, frontend, torch_op", itertools.product( compute_units, backends, frontends, [ torch.abs, torch.acos, torch.asin, torch.atan, torch.atanh, torch.ceil, torch.cos, torch.cosh, torch.exp, torch.exp2, torch.floor, torch.log, torch.log2, torch.round, torch.rsqrt, torch.sign, torch.sin, torch.sinh, torch.sqrt, torch.square, torch.tan, torch.tanh, ], ), ) def test_torch_rank0_tensor(self, compute_unit, backend, frontend, torch_op): if frontend == TorchFrontend.EXECUTORCH and torch_op == torch.exp2: pytest.skip("torch._ops.aten.exp2.default is not Aten Canonical") class Model(nn.Module): def forward(self, x: torch.Tensor) -> torch.Tensor: return torch_op(torch.tensor(0.1)) model = Model() self.run_compare_torch( torch.tensor([1.0, 2.0, 3.0]), model, input_as_shape=False, frontend=frontend, backend=backend, compute_unit=compute_unit, ) class TestTensorAssign(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, minimum_deployment_target", itertools.product( compute_units, backends, [None, ct.target.iOS18], ), ) def test_tensor_assign_scalar(self, compute_unit, backend, minimum_deployment_target): # single dimension assignment for a 1D tensor class TensorAssignModel(torch.nn.Module): def forward(self, x): x[0] = 0 x[1] = 1 y = x + 1 x[1] = 2 * y[1] return x, y shape = (5,) model = TensorAssignModel() self.run_compare_torch( shape, 
model, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, minimum_deployment_target", itertools.product(compute_units, backends, [None, ct.target.iOS18]), ) def test_tensor_assign_case_scalar_case_2( self, compute_unit, backend, minimum_deployment_target ): """ A little bit more complicated scalar tensor assignment test. """ # single dimension assignment for two 1D tensors class TensorAssignModel(torch.nn.Module): def forward(self, x, y): x[0] = 0 y[1] = 2 y = x + y x = 2 * y y[3] = x[1] + 5 y[0] = x[0] * 10 z = x + y return z, x, y shape = (5,) model = TensorAssignModel() self.run_compare_torch( [shape, shape], model, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, shape, minimum_deployment_target", itertools.product( compute_units, backends, [ (5, 4), (5, 4, 3), ], [None, ct.target.iOS18], ), ) def test_tensor_assign_case_broadcast( self, compute_unit, backend, shape, minimum_deployment_target ): # broadcast assignment for two n-D tensors if compute_unit != ct.ComputeUnit.CPU_ONLY: pytest.xfail( "rdar://128024502 ([Bug][iOS18] slice_update failing test on backends beside CPU_ONLY + Classic CPU)" ) class TensorAssignModel(torch.nn.Module): def __init__(self): super(TensorAssignModel, self).__init__() def forward(self, x, y): x[0] = 0 x[3] = 1 y[2] = 2 return x model = TensorAssignModel() res = self.run_compare_torch( [shape, shape], model, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) # check slice_update is used if minimum_deployment_target == ct.target.iOS18: prog = res[1]._mil_program assert "slice_update" in get_op_types_in_program(prog) @pytest.mark.parametrize( "compute_unit, backend, minimum_deployment_target", itertools.product( compute_units, backends, [None, ct.target.iOS18], ), ) def test_tensor_assign_nd_tensor(self, compute_unit, backend, minimum_deployment_target): # single dimension assignment for two n-D tensors class TensorAssignModel(torch.nn.Module): def forward(self, x, y): x[0] = torch.tensor([1.0, 2.0, 3.0, 4.0]) x[3] = 1 y[0] = x[0] return x, y shape = (5, 4) model = TensorAssignModel() res = self.run_compare_torch( [shape, shape], model, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) # check slice_update is used if minimum_deployment_target == ct.target.iOS18: prog = res[1]._mil_program assert "slice_update" in get_op_types_in_program(prog) @pytest.mark.parametrize( "compute_unit, backend, minimum_deployment_target", itertools.product( compute_units, backends, [None, ct.target.iOS18], ), ) def test_tensor_assign_slice(self, compute_unit, backend, minimum_deployment_target): # slice dimension assignment class TensorAssignModel(torch.nn.Module): def forward(self, x): x[:, 1] = torch.tensor([1.0, 2.0]) return x shape = (2, 10) model = TensorAssignModel() res = self.run_compare_torch( shape, model, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) # check slice_update is used if minimum_deployment_target == ct.target.iOS18: prog = res[1]._mil_program assert "slice_update" in get_op_types_in_program(prog) @pytest.mark.parametrize( "compute_unit, backend, minimum_deployment_target", itertools.product( compute_units, backends, [None, ct.target.iOS18], ), ) def test_tensor_assign_slice_case_2(self, compute_unit, backend, 
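# --- Illustrative sketch, not exercised by the tests in this file -------------
# The TestTensorAssign cases verify that, when targeting iOS18, in-place slice
# assignment lowers to the iOS18 slice_update op. Reproducing just the
# conversion step outside the harness might look like this; the module mirrors
# the slice-assignment pattern used above, and the input name is a placeholder.
import torch
import coremltools as ct

class _AssignColumn(torch.nn.Module):
    def forward(self, x):
        x[:, 1] = torch.tensor([1.0, 2.0])
        return x

_traced = torch.jit.trace(_AssignColumn().eval(), torch.rand(2, 10))
_mlmodel = ct.convert(
    _traced,
    inputs=[ct.TensorType(name="x", shape=(2, 10))],
    minimum_deployment_target=ct.target.iOS18,
)
# Whether slice_update actually appears is what the assertions above check,
# via the same internal program-inspection helpers the tests use.
# ------------------------------------------------------------------------------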
minimum_deployment_target): # a more complicated slice dimension assignment class TensorAssignModel(torch.nn.Module): def forward(self, x): x[:, 1, :] = torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).view(2, 3) return x shape = (2, 10, 3) model = TensorAssignModel() res = self.run_compare_torch( shape, model, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) # check slice_update is used if minimum_deployment_target == ct.target.iOS18: prog = res[1]._mil_program assert "slice_update" in get_op_types_in_program(prog) @pytest.mark.parametrize( "compute_unit, backend, dynamic, minimum_deployment_target", itertools.product( compute_units, backends, [True, False], [None, ct.target.iOS18], ), ) def test_tensor_assign_complex_slice( self, compute_unit, backend, dynamic, minimum_deployment_target ): # general case class TensorAssignModel(torch.nn.Module): def forward(self, x): x[:1, 1, :1] = torch.tensor([1.0]).view(1, 1) x[0, 1, 2] = 6. x[:2, 2:8:2, 1:2] = torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).view(2, 3, 1) x[:, 1:10:8, 1:3] = torch.tensor([1.0, 2.0, 3.0, 4.0]).view(2, 1, 2) return x shape = (2, 10, 3) model = TensorAssignModel() if dynamic: upper_bound = 10 if backend[0] == "mlprogram" else -1 converter_input_type = [ ct.TensorType( shape=( ct.RangeDim(upper_bound=upper_bound), ct.RangeDim(upper_bound=upper_bound), ct.RangeDim(upper_bound=upper_bound), ) ) ] else: converter_input_type = None res = self.run_compare_torch( shape, model, converter_input_type=converter_input_type, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) # check slice_update is used if minimum_deployment_target == ct.target.iOS18: prog = res[1]._mil_program assert "slice_update" in get_op_types_in_program(prog) @pytest.mark.parametrize( "compute_unit, backend, dynamic, mixed_rank, minimum_deployment_target", itertools.product( compute_units, backends, [True, False], [True, False], [None, ct.target.iOS18] ), ) def test_tensor_assign_dynamic_slice( self, compute_unit, backend, dynamic, mixed_rank, minimum_deployment_target ): if compute_unit != ct.ComputeUnit.CPU_ONLY: pytest.xfail( "rdar://128024502 ([Bug][iOS18] slice_update failing test on backends beside CPU_ONLY + Classic CPU)" ) if ( backend[0] == "mlprogram" and not dynamic and minimum_deployment_target == ct.target.iOS18 ): pytest.xfail( "rdar://133494070 [iOS18] [Slice_Update] " "Toy iOS18.slice_update Model Passes in BNNS but Dies in Core ML" ) # general case with dynamic begin and end class TensorAssignModel(torch.nn.Module): def forward(self, x, begin_0, begin_1, end_1): x[:1, begin_0:begin_0+5:2, 2] = torch.tensor([1.0, 2.0, 3.0]).view(1, 3) x[:, 4, begin_1:end_1] = torch.tensor([1.0]).view(1, 1) return x shape = (2, 10, 3) model = TensorAssignModel() if mixed_rank: inputs = [ torch.rand(*shape), torch.as_tensor([[[1]]], dtype=torch.int32), torch.as_tensor([1], dtype=torch.int32), torch.as_tensor([[2]], dtype=torch.int32), ] else: inputs = [ torch.rand(*shape), torch.as_tensor([1], dtype=torch.int32), torch.as_tensor([1], dtype=torch.int32), torch.as_tensor([2], dtype=torch.int32), ] if dynamic: upper_bound = 10 if backend[0] == "mlprogram" else -1 converter_input_type = [ ct.TensorType( shape=( ct.RangeDim(upper_bound=upper_bound), ct.RangeDim(upper_bound=upper_bound), ct.RangeDim(upper_bound=upper_bound), ) ), ct.TensorType(shape=inputs[1].shape, dtype=np.int32), ct.TensorType(shape=inputs[2].shape, dtype=np.int32), ct.TensorType(shape=inputs[3].shape, 
dtype=np.int32), ] else: converter_input_type = None torch_inputs = [torch.clone(x) for x in inputs] expected_results = model(*torch_inputs) res = self.run_compare_torch( inputs, model, expected_results=expected_results, input_as_shape=False, converter_input_type=converter_input_type, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) if not mixed_rank: # the fuse_squeeze_expand_dims graph pass is going to # fuse the pattern of ``squeeze -> expand_dims`` prog = res[1]._mil_program assert "squeeze" not in get_op_types_in_program(prog) assert "expand_dims" not in get_op_types_in_program(prog) # check slice_update is used if minimum_deployment_target == ct.target.iOS18: prog = res[1]._mil_program assert "slice_update" in get_op_types_in_program(prog) @pytest.mark.parametrize( "compute_unit, backend, minimum_deployment_target", itertools.product( compute_units, backends, [None, ct.target.iOS18], ), ) def test_tensor_assign_type_compatibility( self, compute_unit, backend, minimum_deployment_target ): class TensorAssignModel(torch.nn.Module): def forward(self, x): x[:, 1] = torch.tensor([1, 2], dtype=torch.int32) return x shape = (2, 3) model = TensorAssignModel() res = self.run_compare_torch( shape, model, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) # check slice_update is used if minimum_deployment_target == ct.target.iOS18: prog = res[1]._mil_program assert "slice_update" in get_op_types_in_program(prog) class TestSelectScatter(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, minimum_deployment_target, input_shape", itertools.product( compute_units, backends, frontends, [None, ct.target.iOS18], [(1,), (4,), (3, 4), (1, 2, 4)], ), ) def test_select_scatter( self, compute_unit, backend, frontend, minimum_deployment_target, input_shape ): rank = len(input_shape) def test_model(src_shape, dim, index): class SelectScatterModel(torch.nn.Module): def forward(self, x, y): return torch.select_scatter( input=x, src=y, dim=dim, index=index, ) class Rank0SelectScatterModel(torch.nn.Module): def forward(self, x, y): y = y[0] return torch.select_scatter( input=x, src=y, dim=dim, index=index, ) if len(src_shape) == 0: src_shape = [1] model = Rank0SelectScatterModel() else: model = SelectScatterModel() res = self.run_compare_torch( [input_shape, src_shape], model, frontend=frontend, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) # check slice_update is used if ( minimum_deployment_target == ct.target.iOS18 and frontend != TorchFrontend.EXECUTORCH ): prog = res[1]._mil_program assert "slice_update" in get_op_types_in_program(prog) for dim in range(-rank, rank): for index in range(-input_shape[dim], input_shape[dim]): dim_val = dim + rank if dim < 0 else dim src_shape = list(input_shape) src_shape = src_shape[:dim_val] + src_shape[dim_val + 1 :] test_model(src_shape, dim, index) class TestSliceScatter(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, minimum_deployment_target, input_shape", itertools.product( compute_units, backends, frontends, [None, ct.target.iOS18], [(1,), (4,), (3, 4), (1, 2, 4)], ), ) def test_slice_scatter( self, compute_unit, backend, frontend, minimum_deployment_target, input_shape ): rank = len(input_shape) def test_model(src_shape, dim, start, end, step): class SliceScatterModel(torch.nn.Module): def forward(self, x, y): return torch.slice_scatter( input=x, src=y, dim=dim, 
start=start, end=end, step=step, ) res = self.run_compare_torch( [input_shape, src_shape], SliceScatterModel(), frontend=frontend, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) # check slice_update is used if minimum_deployment_target == ct.target.iOS18: prog = res[1]._mil_program assert "slice_update" in get_op_types_in_program(prog) for dim in range(-rank, rank): for start in list(range(0, input_shape[dim])) + [None]: start_val = start if start is not None else 0 for end in list(range(start_val + 1, input_shape[dim] + 1)) + [None]: end_val = end if end is not None else input_shape[dim] for step in range(1, end_val - start_val + 1): src_shape = list(input_shape) src_shape[dim] = 1 + (end_val - start_val - 1) // step src_shape = tuple(src_shape) test_model(src_shape, dim, start, end, step) class TestIndexPut(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, minimum_deployment_target", itertools.product( compute_units, backends, frontends, [None, ct.target.iOS17], ), ) def test_index_put_bool_index_case_1(self, compute_unit, backend, frontend, minimum_deployment_target): if frontend in TORCH_EXPORT_BASED_FRONTENDS: pytest.xfail( "https://github.com/apple/coremltools/issues/2183: " "Operator torch._ops.aten._assert_async.msg is not Aten Canonical" ) class IndexPutModel(torch.nn.Module): def forward(self, x, y): y = x + 1 mask = torch.tensor([True, False, False, False, True, True]).view(3, 2) x[mask] = y[mask] return x shape = (3, 2) self.run_compare_torch( [shape, shape], IndexPutModel(), frontend=frontend, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, rank, minimum_deployment_target", itertools.product( compute_units, backends, frontends, [0, 1], [None, ct.target.iOS17], ), ) def test_index_put_bool_index_case_2( self, compute_unit, backend, frontend, rank, minimum_deployment_target ): if backend[0] == "neuralnetwork" and frontend in ( TorchFrontend.TORCHEXPORT, TorchFrontend.EXECUTORCH, ): pytest.xfail( "https://github.com/apple/coremltools/issues/2185: " "EXIR IndexPut Fails on NeuralNetwork Backend" ) class IndexPutModel(torch.nn.Module): def forward(self, x): mask = torch.tensor([True, False, False, False, True, True]).view(3, 2) if frontend in TORCH_EXPORT_BASED_FRONTENDS: x = x.clone() if rank == 0: x[mask] = 0.0 if rank == 1: x[mask] = torch.tensor([1.0]) return x self.run_compare_torch( (3, 2), IndexPutModel(), frontend=frontend, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, rank, minimum_deployment_target", itertools.product( compute_units, backends, frontends, [0, 1], [None, ct.target.iOS17], ), ) def test_index_put_bool_index_all_false( self, compute_unit, backend, frontend, rank, minimum_deployment_target ): class IndexPutModel(torch.nn.Module): def forward(self, x): mask = torch.tensor([False, False, False, False, False, False]).view(3, 2) if frontend in TORCH_EXPORT_BASED_FRONTENDS: x = x.clone() if rank == 0: x[mask] = 0.0 if rank == 1: x[mask] = torch.tensor([1.0]) return x self.run_compare_torch( (3, 2), IndexPutModel(), frontend=frontend, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, minimum_deployment_target", itertools.product( compute_units, backends, 
frontends, [None, ct.target.iOS17], ), ) def test_index_put_dynamic_bool_index( self, compute_unit, backend, frontend, minimum_deployment_target ): if backend[0] == "neuralnetwork" and frontend in ( TorchFrontend.TORCHEXPORT, TorchFrontend.EXECUTORCH, ): pytest.xfail( "https://github.com/apple/coremltools/issues/2185: " "EXIR IndexPut Fails on NeuralNetwork Backend" ) if _macos_version() < (13, 0): pytest.skip("Issue fixed in iOS16/macOS13") class IndexPutModel(torch.nn.Module): def forward(self, x, y): mask = y > 1 if frontend in TORCH_EXPORT_BASED_FRONTENDS: x = x.clone() x[y > 1] = 0.0 return x inputs = [ torch.Tensor([1.0, 2.0, 3.0, 4.0, 5.0, 6]), torch.Tensor([0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), ] self.run_compare_torch( inputs, IndexPutModel(), frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, rank, accumulate, minimum_deployment_target", itertools.product( compute_units, backends, frontends, [3], [True, False], [None, ct.target.iOS17], ), ) def test_index_put_int_index_case_1( self, compute_unit, backend, frontend, rank, accumulate, minimum_deployment_target ): if backend[0] == "neuralnetwork" and frontend in ( TorchFrontend.TORCHEXPORT, TorchFrontend.EXECUTORCH, ): pytest.xfail( "https://github.com/apple/coremltools/issues/2185: " "EXIR IndexPut Fails on NeuralNetwork Backend" ) class IndexPutModel(torch.nn.Module): def forward(self, x, indices, values): if frontend in TORCH_EXPORT_BASED_FRONTENDS: x = x.clone() x.index_put_(tuple(indices.t()), values, accumulate=accumulate) return x if rank == 1: inputs = [ torch.Tensor([1.0, 2.0, 3.0, 4.0, 5.0, 6]), torch.LongTensor([[0], [4]]), torch.Tensor([3.0, 7.0]), ] elif rank == 2: inputs = [ torch.ones([3, 4]), torch.LongTensor([[0, 1], [1, 2], [2, 2]]), torch.Tensor([1.0, 5.0, 8.0]), ] elif rank == 3: inputs = [ torch.ones([2, 3, 4]), torch.LongTensor([[0, 1], [1, 1], [0, 0]]), torch.tensor([[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], [9.0, 6.0, 2.0, 1.0]]), ] model = IndexPutModel() self.run_compare_torch( inputs, model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, minimum_deployment_target", itertools.product(compute_units, backends, frontends, [None, ct.target.iOS18]), ) def test_index_put_int_index_case_2( self, compute_unit, backend, frontend, minimum_deployment_target ): class IndexPutModel(torch.nn.Module): def forward(self, x): box_corner = x.new(x.shape) box_corner[:, :, 0] = x[:, :, 0] box_corner[:, :, 1] = x[:, :, 1] return box_corner[:, :, :2] res = self.run_compare_torch( (2, 3, 4), IndexPutModel(), frontend=frontend, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) # check slice_update is used if minimum_deployment_target == ct.target.iOS18: prog = res[1]._mil_program assert "slice_update" in get_op_types_in_program(prog) @pytest.mark.parametrize( "compute_unit, backend, frontend, minimum_deployment_target", itertools.product(compute_units, backends, frontends, [None, ct.target.iOS18]), ) def test_index_put_int_index_case_3( self, compute_unit, backend, frontend, minimum_deployment_target ): class IndexPutModel(torch.nn.Module): def forward(self, x): y = x.clone() y[:, 0] = 1.0 return y res = self.run_compare_torch( (2, 3), IndexPutModel(), frontend=frontend, backend=backend, 
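# --- Illustrative sketch, not exercised by the tests in this file -------------
# The TestIndexPut cases drive torch's index_put_ through conversion, with both
# boolean masks and integer index tuples. A minimal integer-index variant,
# converted on its own, might look like this; the indices, values, and shapes
# are arbitrary choices for the example.
import torch
import coremltools as ct

class _IndexPut(torch.nn.Module):
    def forward(self, x):
        x.index_put_(
            indices=(torch.LongTensor([0, 1]), torch.LongTensor([2, 1])),
            values=torch.tensor([1.0, 5.0]),
            accumulate=False,
        )
        return x

_traced = torch.jit.trace(_IndexPut().eval(), torch.rand(3, 4))
_mlmodel = ct.convert(_traced, inputs=[ct.TensorType(name="x", shape=(3, 4))])
# ------------------------------------------------------------------------------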
compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) # check slice_update is used if minimum_deployment_target == ct.target.iOS18: prog = res[1]._mil_program assert "slice_update" in get_op_types_in_program(prog) @pytest.mark.parametrize( "compute_unit, backend, frontend, val_shape, minimum_deployment_target", itertools.product( compute_units, backends, frontends, ((2, 1), (1,)), [None, ct.target.iOS18] ), ) def test_index_put_dynamic_int_index_case_1( self, compute_unit, backend, frontend, val_shape, minimum_deployment_target ): if frontend == TorchFrontend.TORCHSCRIPT: pytest.xfail( "https://github.com/apple/coremltools/issues/2188: " "torch.jit.trace Inplace Index Put Silent Error" ) class IndexPutModel(torch.nn.Module): def forward(self, x, position, val): y = x.clone() y[:, position] = val return y res = self.run_compare_torch( [(2, 3), (1,), val_shape], IndexPutModel(), input_dtype=np.int32, rand_range=(0, 2), frontend=frontend, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) # check slice_update is used if minimum_deployment_target == ct.target.iOS18: prog = res[1]._mil_program assert "slice_update" in get_op_types_in_program(prog) @pytest.mark.parametrize( "compute_unit, backend, frontend, minimum_deployment_target", itertools.product(compute_units, backends, frontends, [None, ct.target.iOS18]), ) def test_index_put_dynamic_int_index_case_2( self, compute_unit, backend, frontend, minimum_deployment_target ): if frontend == TorchFrontend.TORCHSCRIPT: pytest.xfail( "https://github.com/apple/coremltools/issues/2188: " "torch.jit.trace Inplace Index Put Silent Error" ) class IndexPutModel(torch.nn.Module): def forward(self, x, position, val): y = x.clone() y[position, 1:4] = val return y res = self.run_compare_torch( [(2, 4), (1,), (1,)], IndexPutModel(), input_dtype=np.int32, rand_range=(0, 2), frontend=frontend, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) # check slice_update is used if minimum_deployment_target == ct.target.iOS18: prog = res[1]._mil_program assert "slice_update" in get_op_types_in_program(prog) @pytest.mark.parametrize( "compute_unit, backend, frontend, accumulate, minimum_deployment_target", itertools.product( compute_units, backends, frontends, [True, False], [None, ct.target.iOS17], ), ) def test_index_put_negative_indices_case_1( self, compute_unit, backend, frontend, accumulate, minimum_deployment_target ): if frontend in TORCH_EXPORT_BASED_FRONTENDS: pytest.skip( "https://github.com/pytorch/pytorch/issues/134443 " "Torch exported program outputs fake tensor" ) class IndexPutModel(torch.nn.Module): def forward(self, x): if frontend in TORCH_EXPORT_BASED_FRONTENDS: x = x.clone() x.index_put_( indices=(torch.LongTensor([0, -1]), torch.LongTensor([-2, 1])), values=torch.Tensor([1.0, 5.0]), accumulate=accumulate, ) return x self.run_compare_torch( (3, 4), IndexPutModel(), frontend=frontend, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, rank, accumulate, minimum_deployment_target", itertools.product( compute_units, backends, frontends, [1, 2, 3], [True, False], [None, ct.target.iOS17], ), ) def test_index_put_negative_indices_case_2( self, compute_unit, backend, frontend, rank, accumulate, minimum_deployment_target ): if backend[0] == "neuralnetwork" and frontend in ( TorchFrontend.TORCHEXPORT, TorchFrontend.EXECUTORCH, ): 
pytest.xfail( "https://github.com/apple/coremltools/issues/2185: " "EXIR IndexPut Fails on NeuralNetwork Backend" ) if ( backend[0] == "mlprogram" and frontend == TorchFrontend.TORCHSCRIPT and minimum_deployment_target == ct.target.iOS17 ): if (rank == 2 and accumulate) or rank == 3: pytest.xfail("rdar://133476254 Toy iOS17.scatter_nd Model Failing") class IndexPutModel(torch.nn.Module): def forward(self, x, indices, values): if frontend in TORCH_EXPORT_BASED_FRONTENDS: x = x.clone() x.index_put_(tuple(indices.t()), values, accumulate=accumulate) return x if rank == 1: inputs = [ torch.Tensor([1.0, 2.0, 3.0, 4.0, 5.0, 6]), torch.LongTensor([[-1], [-4]]), torch.Tensor([3.0, 7.0]), ] elif rank == 2: inputs = [ torch.ones([3, 4]), torch.LongTensor([[-2, -1], [-2, 0], [-1, 1]]), torch.Tensor([1.0, 5.0, 8.0]), ] elif rank == 3: inputs = [ torch.ones([2, 3, 4]), torch.LongTensor([[-1, -1], [-2, 0], [0, 1]]), torch.tensor([[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], [9.0, 6.0, 2.0, 1.0]]), ] model = IndexPutModel() self.run_compare_torch( inputs, model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, minimum_deployment_target=minimum_deployment_target, ) class TestIndex(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target", itertools.product( compute_units, backends, frontends, (np.float32, np.int32, np.bool_), [ (10,), (3, 4, 5, 6), ], [None, ct.target.iOS17], ), ) def test_index_bool_indices( self, compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target ): if frontend in TORCH_EXPORT_BASED_FRONTENDS: pytest.xfail( "https://github.com/apple/coremltools/issues/2183: " "Operator torch._ops.aten._assert_async.msg is not Aten Canonical" ) class IndexModel(torch.nn.Module): def __init__(self, axis): super().__init__() self.axis = axis def forward(self, x, y): index = y > 0.5 if self.axis == 0: return x[index] elif self.axis == 1: return x[:, index] elif self.axis == 2: return x[:, :, index] else: assert self.axis == 3 return x[:, :, :, index] rank = len(shape) for index_rank in range(1, rank + 1): for axis in range(rank + 1 - index_rank): input_data = generate_input_data(shape, rand_range=(0, 2), dtype=input_dtype) ref_data_shape = shape[axis:axis+index_rank] ref_data = torch.rand(ref_data_shape) # We set the first element to 0.6, so that we can make sure at least one element is selected, # and ensure no empty tensors are produced. 
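# (Illustration of the comment above, not part of the test logic: with ref_data_shape of, say, (2,),
# torch.rand could return [0.31, 0.48], so the mask `y > 0.5` computed inside IndexModel would be
# all False and `x[index]` an empty tensor; pinning ref_data[0] = 0.6 guarantees the mask contains
# at least one True.)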
ref_data[0] = 0.6 model = IndexModel(axis=axis) self.run_compare_torch( [input_data, ref_data], model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target", itertools.product( compute_units, backends, frontends, (np.float32, np.int32, np.bool_), [ (1, 2), (3, 4, 5, 6), ], [None, ct.target.iOS17], ), ) def test_index_int_index_case_1( self, compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target ): if frontend in TORCH_EXPORT_BASED_FRONTENDS: pytest.xfail( "https://github.com/apple/coremltools/issues/2184: " "Cannot Convert Empty EXIR Model" ) # all elements are selected class IndexModel(torch.nn.Module): def forward(self, x): if len(shape) == 2: return x[:, :] elif len(shape) == 4: return x[:] model = IndexModel() self.run_compare_torch( shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, rand_range=(0, 2), input_dtype=input_dtype, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target", itertools.product( compute_units, backends, frontends, (np.float32, np.int32, np.bool_), [ (1, 2), (3, 4, 5, 6), ], [None, ct.target.iOS17], ), ) def test_index_int_index_case_2( self, compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target ): """Only one axis is sliced.""" class IndexModel(torch.nn.Module): def forward(self, x): if len(shape) == 2: index = torch.tensor([0]) return x[index, :] elif len(shape) == 4: index = torch.tensor([1, -2]) return x[:, :, index] model = IndexModel() self.run_compare_torch( shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, rand_range=(0, 2), input_dtype=input_dtype, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target", itertools.product( compute_units, backends, frontends, (np.float32, np.int32, np.bool_), [ (1, 2, 3), (2, 3, 4, 5), ], [None, ct.target.iOS17], ), ) def test_index_int_index_case_3( self, compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target ): """Only two axes are sliced, and connected.""" class IndexModel(torch.nn.Module): def forward(self, x): if len(shape) == 3: index_1 = torch.tensor([0]) index_2 = torch.tensor([1]) return x[index_1, index_2, :] elif len(shape) == 4: index_1 = torch.tensor([0, 1, 1]) index_2 = torch.tensor([2, 1, 0]) return x[:, index_1, index_2, :] model = IndexModel() self.run_compare_torch( shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, rand_range=(0, 2), input_dtype=input_dtype, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target", itertools.product( compute_units, backends, frontends, (np.float32, np.int32, np.bool_), [ (1, 2, 3), (2, 3, 4, 5), ], [None, ct.target.iOS17], ), ) def test_index_int_index_case_4( self, compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target ): """Only two axes are sliced, and not connected.""" class IndexModel(torch.nn.Module): def forward(self, x): if len(shape) == 3: index_1 = torch.tensor([0]) index_2 = torch.tensor([1]) return x[index_1, :, index_2] elif len(shape) == 4: index_1 = torch.tensor([0, 1, 1]) index_2 = 
torch.tensor([3, 3, 4]) return x[index_1, :, :, index_2] model = IndexModel() self.run_compare_torch( shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, rand_range=(0, 2), input_dtype=input_dtype, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target", itertools.product( compute_units, backends, frontends, (np.float32, np.int32, np.bool_), [ (1, 2, 3), (2, 3, 4, 5), ], [None, ct.target.iOS17], ), ) def test_index_int_index_case_5( self, compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target ): """All axes are sliced.""" class IndexModel(torch.nn.Module): def forward(self, x): if len(shape) == 3: index_1 = torch.tensor([0]) index_2 = torch.tensor([1]) index_3 = torch.tensor([-1]) # Test negative indices. return x[index_1, index_2, index_3] elif len(shape) == 4: index_1 = torch.tensor([0, 1, 1, 0, 0]) index_2 = torch.tensor([1, 2, 0, 0, 0]) index_3 = torch.tensor([0, 1, -2, 3, 3]) # Test negative indices. index_4 = torch.tensor([2, 1, 0, 4, 4]) return x[index_1, index_2, index_3, index_4] model = IndexModel() self.run_compare_torch( shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, rand_range=(0, 2), input_dtype=input_dtype, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target", itertools.product( compute_units, backends, frontends, (np.float32, np.int32, np.bool_), [ (1, 2), (3, 4, 5, 6), ], [None, ct.target.iOS17], ), ) def test_index_int_index_case_6( self, compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target ): """Only one axis is sliced + nd mode.""" class IndexModel(torch.nn.Module): def forward(self, x): if len(shape) == 2: index = torch.tensor([0, 0, 0, 0, 0, 0]) index = index.view(2, 3) return x[index, :] elif len(shape) == 4: index = torch.tensor([0, 1, 2, 3, 0, 1]) index = index.view(3, 2) return x[:, index] model = IndexModel() self.run_compare_torch( shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, rand_range=(0, 2), input_dtype=input_dtype, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target", itertools.product( compute_units, backends, frontends, (np.float32, np.int32, np.bool_), [ (1, 2, 3), (2, 3, 4, 5), ], [None, ct.target.iOS17], ), ) def test_index_int_index_case_7( self, compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target ): """Two axes are sliced, and connected + nd mode.""" class IndexModel(torch.nn.Module): def forward(self, x): if len(shape) == 3: index_1 = torch.tensor([0, 0, 0, 0, 0, 0, 0, 0]).view(4, 2) index_2 = torch.tensor([1, 0, 0, 0, 1, 1, 1, 1]).view(4, 2) return x[index_1, index_2, :] elif len(shape) == 4: index_1 = torch.tensor([0, 0, 2, 2, 1, 1, 2, 0]).view(2, 4) index_2 = torch.tensor([0, 1, 2, 3, 0, 1, 2, 3]).view(2, 4) return x[:, index_1, index_2, :] model = IndexModel() self.run_compare_torch( shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, rand_range=(0, 2), input_dtype=input_dtype, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target", itertools.product( compute_units, backends, frontends, (np.float32, np.int32, np.bool_), [ (1, 2, 3), (2, 
3, 4, 5), ], [None, ct.target.iOS17], ), ) def test_index_int_index_case_8( self, compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target ): """Two axes are sliced, and not connected + nd mode.""" class IndexModel(torch.nn.Module): def forward(self, x): if len(shape) == 3: index_1 = torch.tensor([0, 0, 0, 0, 0, 0, 0, 0]).view(2, 4) index_2 = torch.tensor([1, 0, 0, 2, 2, 1, 1, 1]).view(2, 4) return x[index_1, :, index_2] elif len(shape) == 4: index_1 = torch.tensor([0, 1, 1, 1, 1, 1, 0, 0]).view(4, 2) index_2 = torch.tensor([0, 1, 2, 3, 4, 0, 1, 2]).view(4, 2) return x[index_1, :, :, index_2] model = IndexModel() self.run_compare_torch( shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, rand_range=(0, 2), input_dtype=input_dtype, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target", itertools.product( compute_units, backends, frontends, (np.float32, np.int32, np.bool_), [ (1, 2, 3), (2, 3, 4, 5), ], [None, ct.target.iOS17], ), ) def test_index_int_index_case_9( self, compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target ): if frontend in TORCH_EXPORT_BASED_FRONTENDS: pytest.xfail( "https://github.com/apple/coremltools/issues/2183: " "Operator torch._ops.aten._assert_async.msg is not Aten Canonical" ) """One axis is sliced through bool mask.""" class IndexModel(torch.nn.Module): def forward(self, x): if len(shape) == 3: return x[:, [True, False], :] elif len(shape) == 4: return x[[True, False], :, :, :] model = IndexModel() self.run_compare_torch( shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, rand_range=(0, 2), input_dtype=input_dtype, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target", itertools.product( compute_units, backends, frontends, (np.float32, np.int32, np.bool_), [ (1, 2, 3), (2, 3, 4, 5), ], [None, ct.target.iOS17], ), ) def test_index_int_index_case_10( self, compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target ): if frontend in TORCH_EXPORT_BASED_FRONTENDS: pytest.xfail( "https://github.com/apple/coremltools/issues/2183: " "Operator torch._ops.aten._assert_async.msg is not Aten Canonical" ) """Multiple axes are sliced through bool masks with possible broadcasting.""" class IndexModel(torch.nn.Module): def forward(self, x): if len(shape) == 3: return x[[True], [True, False], [False, True, False]] else: assert len(shape) == 4 # This is an non-broadcasable case, where the number of `True` for each dimension is the same output_1 = x[ [True, True], :, [True, True, False, False], [True, False, False, True, False], ] # This is a broadcasable case output_2 = x[ [True, True], :, [False, False, True, False], [True, False, False, True, False], ] return output_1, output_2 model = IndexModel() self.run_compare_torch( shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, rand_range=(0, 2), input_dtype=input_dtype, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target", itertools.product( compute_units, backends, frontends, (np.float32, np.int32, np.bool_), [ (3, 4), (3, 4, 5, 6) ], [None, ct.target.iOS17], ), ) def test_index_int_index_case_11( self, compute_unit, backend, frontend, input_dtype, shape, 
minimum_deployment_target ): """Broadcastable indices.""" class IndexModel(torch.nn.Module): def forward(self, x): if len(shape) == 2: index_1 = torch.tensor([0, 1]) index_2 = torch.tensor([0]) return x[index_1, index_2] else: assert len(shape) == 4 index_1 = torch.tensor([0, 1, 1, 1, 1, 1, 0, 0]).view(4, 2) index_2 = torch.tensor([0, 1, 2, 3]).view(4, 1) index_3 = torch.tensor([2]).view(1,) return x[index_1, :, index_3, index_2] model = IndexModel() self.run_compare_torch( shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, rand_range=(0, 2), input_dtype=input_dtype, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target", itertools.product( compute_units, backends, frontends, (np.float32, np.int32, np.bool_), [ (1, 2, 3), (2, 3, 4, 5), ], [None, ct.target.iOS17], ), ) def test_index_int_index_case_12( self, compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target ): """Another broadcastable indices test case.""" class IndexModel(torch.nn.Module): def forward(self, x): index_1 = torch.tensor([0, 1]) index_2 = torch.tensor([0]) return ( x[:, index_1, index_2] if len(shape) == 3 else x[:, index_1, index_2, :] ) self.run_compare_torch( shape, IndexModel(), frontend=frontend, backend=backend, compute_unit=compute_unit, rand_range=(0, 2), input_dtype=input_dtype, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target", itertools.product( compute_units, backends, frontends, (np.float32, np.int32, np.bool_), [ (1, 2, 3), (2, 3, 4, 5), ], [None, ct.target.iOS17], ), ) def test_index_int_index_case_13( self, compute_unit, backend, frontend, input_dtype, shape, minimum_deployment_target ): """Another broadcastable indices (negative) test case.""" class IndexModel(torch.nn.Module): def forward(self, x): index_1 = torch.tensor([-1, 1]) index_2 = torch.tensor([-1]) return x[:, index_1, index_2] if len(shape) == 3 else x[:, index_1, index_2, :] self.run_compare_torch( shape, IndexModel(), frontend=frontend, backend=backend, compute_unit=compute_unit, rand_range=(0, 2), input_dtype=input_dtype, minimum_deployment_target=minimum_deployment_target, ) class TestIndexSelect(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, dim", itertools.product(compute_units, backends, [0, -1]), ) def test_index_select(self, compute_unit, backend, dim): class TestModel(torch.nn.Module): def forward(self, x): indices = torch.tensor([0, 2]) return torch.index_select(x, dim, indices) self.run_compare_torch((3, 4), TestModel(), backend=backend, compute_unit=compute_unit) def test_index_select_invalid_indices(self): """This test is to verify that PyTorch index_select op doesn't allow negative nor out-of-range indices, so we don't need to add mb.select for IOS17 mb.gather when lowering PyTorch index_select op.""" x = torch.randn(3, 4) with pytest.raises(IndexError, match="index out of range"): torch.index_select(x, 0, torch.tensor([0, -1])) with pytest.raises(IndexError, match="index out of range"): torch.index_select(x, 0, torch.tensor([0, 3])) class TestLoss(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, reduction", itertools.product(compute_units, backends, range(1, 4), ["none", "mean", "sum"]), ) def test_mse_loss(self, compute_unit, backend, rank: int, reduction: str): input_shape = tuple(np.random.randint(low=1, high=5, 
size=rank)) class Model(torch.nn.Module): def __init__(self): super().__init__() self.loss = nn.MSELoss(reduction=reduction) def forward(self, x, y): return self.loss(x, y) input_shapes = [input_shape, input_shape] self.run_compare_torch(input_shapes, Model(), backend=backend, compute_unit=compute_unit) class TestPad(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, rank, mode", itertools.product( compute_units, backends, frontends, range(3, 5), ["reflect", "replicate"] ), ) def test_pad_reflect_replicate(self, compute_unit, backend, frontend, rank: int, mode: str): if frontend in TORCH_EXPORT_BASED_FRONTENDS: pytest.skip( "torch._dynamo.exc.UserError: Tried to use data-dependent value " "in the subsequent computation" ) if rank == 3: pad_len = 2 input_shape = (5, 10, 10) elif rank == 4: pad_len = 4 input_shape = (10, 5, 5, 10) else: raise NotImplementedError( "Only 3D, 4D padding with non-constant padding are supported for now" ) max_pad = min(input_shape[-1], input_shape[-2]) pad = list(np.random.randint(low=0, high=max_pad, size=pad_len)) model = ModuleWrapper(function=torch.nn.functional.pad, kwargs={"pad": pad, "mode": mode}) self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, frontend=frontend ) @pytest.mark.parametrize( "compute_unit, backend, frontend, rank", itertools.product(compute_units, backends, frontends, range(1, 6)), ) def test_pad_constant(self, compute_unit, backend, frontend, rank: int): if frontend in TORCH_EXPORT_BASED_FRONTENDS: pytest.skip( "torch._dynamo.exc.UserError: Tried to use data-dependent value in the subsequent " "computation" ) if rank > 5: raise NotImplementedError("Only supports < 6D constant padding") val = float(np.random.random(1)) input_shape = tuple(np.random.randint(low=1, high=10, size=rank)) pad_dims = np.random.randint(low=1, high=rank + 1) pad = list(np.random.randint(low=0, high=10, size=pad_dims * 2)) model = ModuleWrapper( function=torch.nn.functional.pad, kwargs={"pad": pad, "mode": "constant", "value": val}, ) self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, frontend=frontend, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_constant_pad_1d(self, compute_unit, backend, frontend): input_shape = (3, 4, 5) model = torch.nn.ConstantPad1d((5, 6), 3.5).eval() self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, frontend=frontend ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_constant_pad_2d(self, compute_unit, backend, frontend): input_shape = (3, 4, 5, 6) model = torch.nn.ConstantPad2d((5, 6, 3, 8), 3.5).eval() self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, frontend=frontend ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_constant_pad_3d(self, compute_unit, backend, frontend): input_shape = (3, 4, 5, 6, 2) model = torch.nn.ConstantPad3d((5, 6, 3, 8, 2, 4), 3.5).eval() self.run_compare_torch( input_shape, model, backend=backend, compute_unit=compute_unit, frontend=frontend ) class TestMaskedFill(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, dtype, value", itertools.product( compute_units, backends, frontends, [np.int32, np.float32], [10.3, 7, 0], ), ) def test_masked_fill(self, compute_unit, 
backend, frontend, dtype, value): SHAPE = (2, 3) MASK = torch.bernoulli(torch.rand(SHAPE[-1])).to(torch.bool) input_data = np.random.randint(-100, 100, SHAPE).astype(dtype) input_data = torch.from_numpy(input_data) model = ModuleWrapper(torch.masked_fill, {"mask": MASK, "value": value}) converter_input_type = [TensorType(shape=SHAPE, dtype=dtype)] self.run_compare_torch( input_data, model, frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, converter_input_type=converter_input_type, ) class TestMeshgrid(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, x, y, z, dtype, inp_mode, indexing", itertools.product( compute_units, backends, frontends, [1, 2], [3, 4], [5, 6], [torch.int, torch.float], ["norm", "list"], [None, "ij", "xy"], ), ) def test_meshgrid( self, compute_unit, backend, frontend, x, y, z, dtype, inp_mode, indexing, ): class TestModel(nn.Module): def forward(self, x, y, z): if inp_mode == "norm": return torch.meshgrid(x, y, z, indexing=indexing) elif inp_mode == "list": return torch.meshgrid([x, y, z], indexing=indexing) else: raise ValueError("Unsupported mode: {mode}".format(mode=inp_mode)) inputs = ( torch.arange(start=0, end=x, step=1, dtype=dtype), torch.arange(start=0, end=y, step=1, dtype=dtype), torch.arange(start=0, end=z, step=1, dtype=dtype), ) model = TestModel().eval() expected_results = model(*inputs) self.run_compare_torch( inputs, model, expected_results, input_as_shape=False, frontend=frontend, backend=backend, compute_unit=compute_unit, ) class TestAddmm(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, shapes, beta, alpha", itertools.product( compute_units, backends, ((2, 2, 2), (4, 5, 9)), (1.0, 2.0), (1.0, 3.0), ), ) def test_addmm(self, compute_unit, backend, shapes, beta, alpha): class TestModel(nn.Module): def forward(self, x): return torch.addmm(x, m1, m2, beta=beta, alpha=alpha) m, n, p = shapes # m1 @ m2 must be legal m1 = torch.randn(m, n) m2 = torch.randn(n, p) # x must be the same shape as m1 @ m2 x_shape = (m, p) self.run_compare_torch( x_shape, TestModel(), backend=backend, compute_unit=compute_unit, ) class TestScatter(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, shapes_dims, minimum_deployment_target", itertools.product( compute_units, backends, [ [(10,), (0, -1)], [(2, 3), (1, -1)], [(2, 3, 4, 5), (0, -2)], ], [None, ct.target.iOS17], ), ) def test_scatter(self, compute_unit, backend, shapes_dims, minimum_deployment_target): class TestModel(nn.Module): def __init__(self, dim, shapes): super(TestModel, self).__init__() self.dim = dim self.source = torch.rand(*(shapes)) self.index = torch.randint(0, shapes[dim], size=shapes) def forward(self, x): return x.scatter_(self.dim, self.index, self.source) shapes, dims = shapes_dims for dim in dims: m = TestModel(dim, shapes) self.run_compare_torch( shapes, m, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, shapes_dims, minimum_deployment_target", itertools.product( compute_units, backends, [ [(10,), (0, -1)], [(2, 3), (1, -1)], [(2, 3, 4, 5), (0, -2)], ], [None, ct.target.iOS17], ), ) def test_scatter_with_scalar_source( self, compute_unit, backend, shapes_dims, minimum_deployment_target ): class TestModel(nn.Module): def __init__(self, dim, shapes): super(TestModel, self).__init__() self.dim = dim self.source = 1.0 self.index = torch.randint(0, shapes[dim], size=shapes) def forward(self, x): return 
x.scatter_(self.dim, self.index, self.source) shapes, dims = shapes_dims for dim in dims: m = TestModel(dim, shapes) self.run_compare_torch( shapes, m, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, shapes_dims, mode, minimum_deployment_target", itertools.product( compute_units, backends, [ [(10,), (0, -1)], [(2, 3), (1, -1)], [(2, 3, 4, 5), (0, -2)], ], ["add", "multiply"], [None, ct.target.iOS17], ), ) def test_scatter_with_reduce(self, compute_unit, backend, shapes_dims, mode, minimum_deployment_target): class TestModel(nn.Module): def __init__(self, dim, shapes, mode): super(TestModel, self).__init__() self.dim = dim self.mode = mode self.source = torch.rand(*(shapes)) self.index = torch.randint(0, shapes[dim], size=shapes) def forward(self, x): return x.scatter_(self.dim, self.index, self.source, reduce=self.mode) shapes, dims = shapes_dims for dim in dims: m = TestModel(dim, shapes, mode) self.run_compare_torch( shapes, m, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, shapes_dims, minimum_deployment_target", itertools.product( compute_units, backends, [ [(10,), (0, -1)], [(2, 3), (1, -1)], [(2, 3, 4, 5), (0, -2)], ], [None, ct.target.iOS17], ), ) def test_scatter_add(self, compute_unit, backend, shapes_dims, minimum_deployment_target): class TestModel(nn.Module): def __init__(self, dim, shapes): super(TestModel, self).__init__() self.dim = dim self.source = torch.rand(*(shapes)) self.index = torch.randint(0, shapes[dim], size=shapes) def forward(self, x): return x.scatter_add_(self.dim, self.index, self.source) shapes, dims = shapes_dims for dim in dims: m = TestModel(dim, shapes) self.run_compare_torch( shapes, m, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, [("mlprogram", "fp16")], ), ) def test_scatter_with_invalid_indices(self, compute_unit, backend): """ As PyTorch's `scatter_` and `scatter_add_` do verify indices and error out for negative and out-of-bound indices, it doesn't involve the PyMIL validation. 
""" class ScatterModel(nn.Module): def forward(self, x): index = torch.tensor([[-1, 1, 2, 0]]) return torch.zeros(1, 4, dtype=x.dtype).scatter_(1, index, x) class ScatterAddModel(nn.Module): def forward(self, x): index = torch.tensor([[0, 5, 2, 0]]) return torch.zeros(1, 4, dtype=x.dtype).scatter_add_(1, index, x) with pytest.raises(RuntimeError, match="index -1 is out of bounds for dimension 1"): self.run_compare_torch( (1, 4), ScatterModel(), backend=backend, compute_unit=compute_unit, minimum_deployment_target=ct.target.iOS17, ) with pytest.raises(RuntimeError, match="index 5 is out of bounds for dimension 1"): self.run_compare_torch( (1, 4), ScatterAddModel(), backend=backend, compute_unit=compute_unit, minimum_deployment_target=ct.target.iOS17, ) class TestBroadcastTensors(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, shapes", itertools.product( compute_units, backends, frontends, [(1,), (1, 2)], ), ) def test_one_tensor(self, compute_unit, backend, frontend, shapes): class TestModel(nn.Module): def forward(self, a): return torch.broadcast_tensors(a) self.run_compare_torch( shapes, TestModel(), compute_unit=compute_unit, backend=backend, frontend=frontend ) @pytest.mark.parametrize( "compute_unit, backend, frontend, shapes", itertools.product( compute_units, backends, frontends, [ [(2, 1), (1, 3)], [(5, 1, 4, 1), (3, 1, 1)], [(1,), (3, 1, 7)], [(2, 1), (4, 3, 2, 1)], ], ), ) def test_two_tensors(self, compute_unit, backend, frontend, shapes): class TestModel(nn.Module): def forward(self, a, b): return torch.broadcast_tensors(a, b) self.run_compare_torch( shapes, TestModel(), compute_unit=compute_unit, backend=backend, frontend=frontend ) @pytest.mark.parametrize( "compute_unit, backend, frontend, shapes", itertools.product( compute_units, backends, frontends, [ [(2, 1), (1, 3), (1,), (1, 1)], [(5, 1, 4, 1), (3, 1, 1), (1,), (4, 8)], [(1,), (2, 1), (3, 2, 1), (5, 4, 3, 2, 1)], ], ), ) def test_four_tensors(self, compute_unit, backend, frontend, shapes): class TestModel(nn.Module): def forward(self, a, b, c, d): return torch.broadcast_tensors(a, b, c, d) self.run_compare_torch( shapes, TestModel(), compute_unit=compute_unit, backend=backend, frontend=frontend ) class TestEmbedding(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, input_dtype", itertools.product( compute_units, backends, frontends, [np.int32, np.float32], ), ) def test_embedding(self, compute_unit, backend, frontend, input_dtype): num_embeddings = 4 embedding_size = 10 B = 2 dim = 5 converter_input_type = [TensorType(shape=(B, dim), dtype=input_dtype)] # input shape: (B, dim) # output shape : (B, dim, embedding_size) # shape of weights : (num_embeddings, embedding_size) class EmbeddingModel(nn.Module): def __init__(self): super(EmbeddingModel, self).__init__() self.embedding = torch.nn.Embedding(num_embeddings, embedding_size) def forward(self, x): return self.embedding(x) input_data = np.random.randint(low=0, high=num_embeddings, size=(B, dim)) input_data = torch.from_numpy(input_data) model = EmbeddingModel() expected_results = model(input_data) self.run_compare_torch( input_data, model, expected_results=expected_results, input_as_shape=False, frontend=frontend, backend=backend, compute_unit=compute_unit, converter_input_type=converter_input_type, ) def test_embedding_invalid_indices(self): """This test is to verify that PyTorch embedding op doesn't allow negative and out-of-range indices, so we don't need to add mb.select for IOS17 mb.gather op.""" 
embedding_matrix = torch.rand(10, 3) with pytest.raises(IndexError, match="index out of range"): torch.nn.functional.embedding(torch.tensor([[-1, 2], [4, 3]]), embedding_matrix) with pytest.raises(IndexError, match="index out of range"): torch.nn.functional.embedding(torch.tensor([[1, 2], [4, 10]]), embedding_matrix) class TestDuplicateOutputTensors(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) # Test case for rdar://100138064 (Duplicate output tensors trigger ops removal errors). def test_duplicate_output_not_raise_errors(self, compute_unit, backend, frontend): if backend[0] == "neuralnetwork": pytest.skip( "rdar://100243127 ([PyTorch] Duplicate Output Tensor Doesn't work for neuralnetwork)" ) class DuplicateTensorsModel(torch.nn.Module): def forward(self, x): return x, x input_data = torch.rand(2, 2, 1, 1) converter_input_type = [ct.TensorType(shape=input_data.shape)] model = DuplicateTensorsModel() expected_results = model(input_data) self.run_compare_torch( input_data, model, expected_results=expected_results, input_as_shape=False, frontend=frontend, backend=backend, compute_unit=compute_unit, converter_input_type=converter_input_type, ) class TestBaddbmm(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, shapes, beta", itertools.product( compute_units, backends, [(2, 4, 6, 8), (4, 12, 6, 16)], [0.0, 0.5, 1.0, 2], ), ) def test_baddbmm(self, compute_unit, backend, shapes, beta): B, N, M, P = shapes # input shape: any shape broadcastable to (B, N, P) # batch1 shape: (B, N, M) # batch2 shape: (B, M, P) # output shape : (B, N, P) class BaddbmmModel(nn.Module): def __init__(self): super(BaddbmmModel, self).__init__() self.batch1 = torch.randn(B, N, M) self.batch2 = torch.randn(B, M, P) def forward(self, x): return torch.baddbmm(x, self.batch1, self.batch2, beta=beta) model = BaddbmmModel() # Makes it broadcastable to (B, N, P). for input_shape in [(1, N, P), (B, 1, P), (1, P)]: self.run_compare_torch(input_shape, model, backend=backend, compute_unit=compute_unit) class TestGlu(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, shapes", itertools.product( compute_units, backends, [(2, 4, 6, 8), (6, 2, 10)], ), ) def test_glu(self, compute_unit, backend, shapes): # The dim specified for GLU shouldn't exceed the max dim in input. glu_dim_list = [-1] + [i for i in range(len(shapes))] for glu_dim in glu_dim_list: model = torch.nn.GLU(glu_dim) self.run_compare_torch(shapes, model, backend=backend, compute_unit=compute_unit) class TestHstack(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, shapes", itertools.product( compute_units, backends, frontends, [ [(2, 4, 6), (2, 4, 6)], [(1, 4, 5), (1, 2, 5)], [(1,), (3,)], ], # Test 1-D tensors. 
), ) def test_hstack(self, compute_unit, backend, frontend, shapes): class HstackModel(nn.Module): def forward(self, *tensors): return torch.hstack(tensors) self.run_compare_torch( shapes, HstackModel(), compute_unit=compute_unit, backend=backend, frontend=frontend ) @pytest.mark.parametrize( "compute_unit, backend, frontend, shapes", itertools.product( compute_units, backends, frontends, [[(2, 4, 6), (2, 4, 6)]], ), ) def test_hstack_with_parameter_out(self, compute_unit, backend, frontend, shapes): class HstackModel(nn.Module): def forward(self, *tensors): output_tensor = torch.tensor([]) torch.hstack(tensors, out=output_tensor) return output_tensor self.run_compare_torch( shapes, HstackModel(), compute_unit=compute_unit, backend=backend, frontend=frontend ) class TestRemainder(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, shapes", itertools.product( compute_units, backends, frontends, [ [(2, 4, 6), (2, 4, 6)], [(2, 4, 6), (4, 6)], # broadcastable tensors [(2, 4, 6), (2, 1, 6)], ], ), ) def test_remainder(self, compute_unit, backend, frontend, shapes): class RemainderModel(nn.Module): def forward(self, dividend, divisor): return torch.remainder(dividend, divisor) self.run_compare_torch( shapes, RemainderModel(), compute_unit=compute_unit, backend=backend, frontend=frontend ) @pytest.mark.parametrize( "compute_unit, backend, frontend, shapes", itertools.product( compute_units, backends, frontends, [[(2, 4, 6), (2, 4, 6)]], ), ) def test_remainder_with_parameter_out(self, compute_unit, backend, frontend, shapes): class RemainderModel(nn.Module): def forward(self, dividend, divisor): output_tensor = torch.tensor([]) torch.remainder(dividend, divisor, out=output_tensor) return output_tensor self.run_compare_torch( shapes, RemainderModel(), compute_unit=compute_unit, backend=backend, frontend=frontend ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_remainder_input_types_promotion(self, compute_unit, backend, frontend): class RemainderModel(nn.Module): def forward(self, dividend, divisor): return torch.remainder(dividend, divisor) input_dividend = torch.randint(low=0, high=10, size=(2, 3), dtype=torch.int32) input_divisor = torch.rand(2, 3) self.run_compare_torch( [input_dividend, input_divisor], RemainderModel(), frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) class TestSum(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, input_dtype", itertools.product(compute_units, backends, [torch.int32, torch.float32, torch.bool]), ) def test_sum(self, compute_unit, backend, input_dtype): model = ModuleWrapper(function=torch.sum) input_data = torch.zeros(2, 3).to(input_dtype) expected_results = model(input_data) TorchBaseTest.run_compare_torch( input_data, model, expected_results=expected_results, input_as_shape=False, backend=backend, compute_unit=compute_unit, ) class TestLogsumexp(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, shape, dim", itertools.product( compute_units, backends, COMMON_SHAPES, [0, -1], ), ) def test_logsumexp(self, compute_unit, backend, shape, dim): params = {"dim": dim} model = ModuleWrapper( function=torch.logsumexp, kwargs=params, ) TorchBaseTest.run_compare_torch( shape, model, backend=backend, compute_unit=compute_unit, ) class TestHannWindow(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, window_length, periodic", itertools.product( compute_units, backends, [1, 3, 6, 10, 
12], [True, False], ), ) def test_hann_window(self, compute_unit, backend, window_length, periodic): class HannWindowModel(nn.Module): def forward(self, x): return torch.hann_window(window_length, periodic) input_shape = np.random.randint(low=1, high=10, size=(window_length,)) torch_in = torch.tensor(input_shape, dtype=torch.int32) model = HannWindowModel().eval() torch_out = model(torch_in) self.run_compare_torch( torch_in, model, expected_results=torch_out, input_as_shape=False, backend=backend, compute_unit=compute_unit, ) class TestTrace(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, shape", itertools.product( compute_units, backends, [(1, 1), (2, 4), (4, 3), (5, 5)], ), ) def test_trace(self, compute_unit, backend, shape): model = ModuleWrapper(torch.trace) self.run_compare_torch(shape, model, backend=backend, compute_unit=compute_unit) class TestRoll(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, shape, shifts", itertools.product( compute_units, backends, [(5,), (2, 4), (4, 2, 3)], [0, 1, 3], ), ) def test_roll(self, compute_unit, backend, shape, shifts): model = ModuleWrapper(torch.roll, kwargs={"shifts": shifts}) self.run_compare_torch( shape, model, backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend, shape, shifts_dims", itertools.product( compute_units, backends, [(4, 2, 3)], [ [0, 0], [4, 0], [9, 0], [[0, 1], [0, 1]], # Shifts exceeds dimension [[89, 93, 102], [0, 1, 2]], # Negative shifts [[-9, -1], [1, 2]], # Duplicate dims [[8, 10, -8], [0, 1, 0]], ], ), ) def test_roll_with_dims(self, compute_unit, backend, shape, shifts_dims): shifts, dims = shifts_dims model = ModuleWrapper(torch.roll, kwargs={"shifts": shifts, "dims": dims}) self.run_compare_torch(shape, model, backend=backend, compute_unit=compute_unit) class TestArgmax(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, shape, axis, input_dtype", itertools.product( compute_units, backends, COMMON_SHAPES, [-1, 0], [np.float32, np.int32, np.int64], ), ) def test_argmax( self, compute_unit, backend: Tuple[str, str], shape: Tuple[int], axis: int, input_dtype: np.dtype, ): input_data = torch.rand(*shape) if input_dtype == np.float32 else torch.randint(10, shape) converter_input_type = [ct.TensorType(shape=input_data.shape, dtype=input_dtype)] model = ModuleWrapper(function=torch.argmax, kwargs={"dim": axis}) expected_results = model(input_data) TorchBaseTest.run_compare_torch( input_data, model, expected_results=expected_results, input_as_shape=False, backend=backend, converter_input_type=converter_input_type, compute_unit=compute_unit, ) class TestStack(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, rank, num", itertools.product( compute_units, backends, [1, 3], [1, 3], ), ) def test_stack(self, compute_unit, backend, rank, num): input_shape = np.random.randint(low=1, high=6, size=rank) for dim in [None] + list(range(rank + 1)): print("dim", dim) class StackModel(torch.nn.Module): def forward(self, *inputs): if dim is None: return torch.stack(inputs) else: return torch.stack(inputs, dim=dim) TorchBaseTest.run_compare_torch( [input_shape] * num, StackModel(), backend=backend, compute_unit=compute_unit, ) class TestComplex(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_complex(self, compute_unit: ct.ComputeUnit, backend, frontend): if frontend == TorchFrontend.EXECUTORCH: pytest.skip("torch._ops.aten.complex.default 
is not Aten Canonical") class ComplexModel(torch.nn.Module): def forward(self, x): real_part = x + 1 imag_part = -x complex_data = torch.complex(real_part, imag_part) return torch.stack([complex_data.real, complex_data.imag], dim=1) TorchBaseTest.run_compare_torch( (2, 3, 4), ComplexModel(), compute_unit=compute_unit, backend=backend, frontend=frontend, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_complex_real_imag_same_input(self, compute_unit: ct.ComputeUnit, backend, frontend): if frontend == TorchFrontend.EXECUTORCH: pytest.skip("torch._ops.aten.complex.default is not Aten Canonical") class ComplexModel(torch.nn.Module): def forward(self, x): return torch.complex(x, x).real TorchBaseTest.run_compare_torch( (2, 3, 4), ComplexModel(), compute_unit=compute_unit, backend=backend, frontend=frontend, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_complex_input_error(self, compute_unit: ct.ComputeUnit, backend, frontend): class ComplexModel(torch.nn.Module): def forward(self, x): return torch.complex(x.real, x.imag) input_data = torch.tensor([1 + 0j, 2 + 3j], dtype=torch.complex64) with pytest.raises( TypeError, match="dtype= is unsupported for inputs/outputs of the model", ): converter_input_type = [ct.TensorType(shape=input_data.shape, dtype=np.complex64)] TorchBaseTest.run_compare_torch( input_data, ComplexModel(), input_as_shape=False, converter_input_type=converter_input_type, compute_unit=compute_unit, backend=backend, frontend=frontend, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_complex_output_error(self, compute_unit: ct.ComputeUnit, backend, frontend): if frontend == TorchFrontend.EXECUTORCH: pytest.skip("torch._ops.aten.complex.default is not Aten Canonical") class ComplexModel(torch.nn.Module): def forward(self, x): return torch.complex(x, x) with pytest.raises(ValueError, match="MIL doesn't support complex data as model's output"): TorchBaseTest.run_compare_torch( (2, 3, 4), ComplexModel(), compute_unit=compute_unit, backend=backend, frontend=frontend, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_abs(self, compute_unit, backend, frontend): if frontend == TorchFrontend.EXECUTORCH: pytest.skip("torch._ops.aten.complex.default is not Aten Canonical") class AbsModel(torch.nn.Module): def forward(self, x): x = torch.complex(x, x) return torch.abs(x) TorchBaseTest.run_compare_torch( (1, 16), AbsModel(), frontend=frontend, backend=backend, compute_unit=compute_unit, ) class TestReal(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_real_real_input(self, compute_unit: ct.ComputeUnit, backend, frontend): if frontend == TorchFrontend.EXECUTORCH: pytest.skip("torch._ops.aten.complex.default is not Aten Canonical") class RealModel(torch.nn.Module): def forward(self, x): return torch.real(x) TorchBaseTest.run_compare_torch( (2, 3, 4), RealModel(), compute_unit=compute_unit, backend=backend, frontend=frontend ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_real_complex_input(self, compute_unit: ct.ComputeUnit, backend, frontend): if frontend == TorchFrontend.EXECUTORCH: 
pytest.skip("torch._ops.aten.complex.default is not Aten Canonical") class RealModel(torch.nn.Module): def forward(self, x): return torch.real(torch.complex(x, x)) TorchBaseTest.run_compare_torch( (2, 3, 4), RealModel(), compute_unit=compute_unit, backend=backend, frontend=frontend ) class TestImag(TorchBaseTest): # torch.imag only support complex input, so we don't need to test real number input. @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_imag_complex_input(self, compute_unit: ct.ComputeUnit, backend, frontend): if frontend == TorchFrontend.EXECUTORCH: pytest.skip("torch._ops.aten.complex.default is not Aten Canonical") class ImagModel(torch.nn.Module): def forward(self, x): return torch.imag(torch.complex(x, x)) TorchBaseTest.run_compare_torch( (2, 3, 4), ImagModel(), compute_unit=compute_unit, backend=backend, frontend=frontend ) class TestViewAsReal(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_view_as_real(self, compute_unit: ct.ComputeUnit, backend, frontend): if frontend == TorchFrontend.EXECUTORCH: pytest.skip("torch._ops.aten.complex.default is not Aten Canonical") class RealModel(torch.nn.Module): def forward(self, x): return torch.view_as_real(torch.complex(x, 2 * x)) TorchBaseTest.run_compare_torch( (2, 3, 4), RealModel(), compute_unit=compute_unit, backend=backend, frontend=frontend, ) class TestFft(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_directly_use_fft_complex_output_error(self, compute_unit: ct.ComputeUnit, backend): class FftModel(torch.nn.Module): def forward(self, x): return torch.fft.fft(x) with pytest.raises(ValueError, match="MIL doesn't support complex data as model's output"): TorchBaseTest.run_compare_torch( (2, 3, 4), FftModel(), backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend, input_shape, fft_variant", itertools.product( compute_units, backends, [(1,), (2, 3), (3, 1, 2)], ["fft", "rfft", "ifft", "irfft"], ), ) def test_fft_basic_no_param( self, compute_unit: ct.ComputeUnit, backend, input_shape, fft_variant ): if input_shape == (1,) and fft_variant == "irfft": pytest.skip("PyTorch doesn't support length-1 input (1,) for irfft.") class FftModel(torch.nn.Module): def forward(self, x): if fft_variant == "fft": return torch.fft.fft(x).real elif fft_variant == "rfft": return torch.fft.rfft(x).real elif fft_variant == "ifft": x = torch.complex(x, x) return torch.fft.ifft(x).real elif fft_variant == "irfft": x = torch.complex(x, x) return torch.fft.irfft(x) else: raise ValueError(f"Invalid fft_variant {fft_variant}.") TorchBaseTest.run_compare_torch( input_shape, FftModel(), backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend, fft_variant, n, dim, norm", itertools.product( compute_units, backends, ["fft", "rfft", "ifft", "irfft"], [None, 1, 5], [0, 1, -1], [None, "forward", "backward", "ortho"], ), ) def test_fft_basic(self, compute_unit: ct.ComputeUnit, backend, fft_variant, n, dim, norm): class FftModel(torch.nn.Module): def forward(self, x): if fft_variant == "fft": fft_res = torch.fft.fft(x, n=n, dim=dim, norm=norm) elif fft_variant == "rfft": fft_res = torch.fft.rfft(x, n=n, dim=dim, norm=norm) elif fft_variant == "ifft": x = torch.complex(x, x) fft_res = torch.fft.ifft(x, n=n, dim=dim, norm=norm) elif 
fft_variant == "irfft": x = torch.complex(x, x) return torch.fft.irfft(x, n=n, dim=dim, norm=norm) else: raise ValueError(f"Invalid fft_variant {fft_variant}.") return torch.stack([fft_res.real, fft_res.imag], dim=0) TorchBaseTest.run_compare_torch( (2, 3, 4), FftModel(), backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_fft_nested(self, compute_unit: ct.ComputeUnit, backend): class FftModel(torch.nn.Module): def forward(self, x): fft_1 = torch.fft.fft(x, dim=2, norm="forward") fft_2 = torch.fft.fft(fft_1, dim=0, norm="backward") fft_3 = torch.fft.fft(fft_2, dim=1, norm="ortho") return torch.real(fft_3) TorchBaseTest.run_compare_torch( (2, 3, 4), FftModel(), backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend, fftn_variant, shapes_and_dims, norm", itertools.product( compute_units, backends, ["fftn", "rfftn", "ifftn", "irfftn"], [ (None, None), (None, [1, 0]), ([2], None), ([5], [0]), ([1, 4], [1, 2]), ([1, 3, 5], [1, -1, 0]), ], [None, "forward", "backward", "ortho"], ), ) def test_fftn( self, compute_unit: ct.ComputeUnit, backend, fftn_variant, shapes_and_dims, norm ): shapes, dims = shapes_and_dims class FftnModel(torch.nn.Module): def forward(self, x): if fftn_variant == "fftn": fftn_res = torch.fft.fftn(x, s=shapes, dim=dims, norm=norm) elif fftn_variant == "rfftn": fftn_res = torch.fft.rfftn(x, s=shapes, dim=dims, norm=norm) elif fftn_variant == "ifftn": x = torch.complex(x, x) fftn_res = torch.fft.ifftn(x, s=shapes, dim=dims, norm=norm) elif fftn_variant == "irfftn": x = torch.complex(x, x) return torch.fft.irfftn(x, s=shapes, dim=dims, norm=norm) else: raise ValueError(f"Invalid fftn_variant {fftn_variant}.") return torch.stack([torch.real(fftn_res), torch.imag(fftn_res)], dim=0) TorchBaseTest.run_compare_torch( (2, 3, 4), FftnModel(), backend=backend, compute_unit=compute_unit ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_dims_specify_by_shapes(self, compute_unit: ct.ComputeUnit, backend): class FftnModel(torch.nn.Module): def forward(self, x): x = torch.complex(x, x) return torch.fft.irfftn(x, s=x.shape[-3:], dim=(-3, -2, -1)) TorchBaseTest.run_compare_torch( (2, 3, 4), FftnModel(), backend=backend, compute_unit=compute_unit ) class TestSTFT(TorchBaseTest): @pytest.mark.slow @pytest.mark.parametrize( "compute_unit, backend, input_shape, complex, n_fft, hop_length, win_length, window, center, pad_mode, normalized, onesided", itertools.product( compute_units, backends, [(1, 32), (32,), (3, 32)], # input shape [False, True], # complex [16], # n_fft [None, 4, 5], # hop_length [None, 16, 9], # win_length [None, torch.hann_window], # window [None, False, True], # center ["constant", "reflect", "replicate"], # pad mode [False, True], # normalized [None, False, True], # onesided ) ) def test_stft(self, compute_unit, backend, input_shape, complex, n_fft, hop_length, win_length, window, center, pad_mode, normalized, onesided): if complex and onesided: pytest.skip("Onesided stft not possible for complex inputs") class STFTModel(torch.nn.Module): def forward(self, x): applied_window = window(win_length) if window and win_length else None x = torch.complex(x, x) if complex else x x = torch.stft( x, n_fft=n_fft, hop_length=hop_length, win_length=win_length, window=applied_window, center=center, pad_mode=pad_mode, normalized=normalized, onesided=onesided, return_complex=True) x = 
torch.stack([torch.real(x), torch.imag(x)], dim=0) return x TorchBaseTest.run_compare_torch( input_shape, STFTModel(), backend=backend, compute_unit=compute_unit ) if _HAS_TORCH_AUDIO: class TestSpectrogram(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, input_shape, spec, power", itertools.product( compute_units, backends, [(1, 1000), (1000,), (3, 1000)], # input shape [torchaudio.transforms.Spectrogram, torchaudio.transforms.MelSpectrogram], [None, 1, 2], # magnitude or power ), ) def test_spectrogram(self, compute_unit, backend, input_shape, spec, power): if platform.machine() != "arm64": pytest.xfail( "rdar://108001659 ([PyTorch] Torchaudio Spectrogram Failed on Intel Machine)" ) if spec is torchaudio.transforms.MelSpectrogram and power is None: pytest.skip("power or magnitude required for melspec") class SpectrogramModel(torch.nn.Module): def __init__(self) -> None: super().__init__() # the other spectrogram options are passed through to stft # and are tested in TestSTFT self.spec = spec(power=power, n_fft=128) def forward(self, x): x = self.spec(x) if power is None: # complex: stack them x = torch.stack([torch.real(x), torch.imag(x)], dim=0) return x np.random.seed(1024) TorchBaseTest.run_compare_torch( input_shape, SpectrogramModel(), backend=backend, compute_unit=compute_unit, rtol=1e-4, atol=1e-4, ) class TestNms(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, box_num, iou_threshold, dynamic_input, minimum_deployment_target", itertools.product( compute_units, backends, [1, 5, 20, 1000], [0.0, 0.2, 0.8], [True, False], [None, ct.target.iOS17], ), ) def test_nms( self, compute_unit, backend: Tuple[str, str], box_num: int, iou_threshold: float, dynamic_input: bool, minimum_deployment_target: ct.target, ): if box_num >= 1000 and backend == ("mlprogram", "fp16"): pytest.xfail( "rdar://103891349 ([TensorFlow] [PyTorch] NMS discrepancy in Fp16 when " "number of boxes is large)" ) class NmsModel(torch.nn.Module): def forward(self, boxes, scores): return torchvision.ops.nms(boxes, scores, iou_threshold=iou_threshold) input_boxes = torch.randint( low=0, high=box_num, size=(box_num, 4), dtype=torch.float32 ) # When two boxes have IOU exactly equal to iou_threshold (>0.0), it will hit the corner case as shown in # `test_nms_corner_case`, which has a discrepancy between CoreML and PyTorch. To avoid this situation, we keep # regenerating the input boxes at most _MAX_REGEN times until there is no corner case in the generated boxes. _MAX_REGEN = 3 regen_count = 0 while regen_count < _MAX_REGEN and iou_threshold > 0.0 and iou_threshold in torchvision.ops.box_iou( input_boxes, input_boxes): input_boxes = torch.randint( low=0, high=box_num, size=(box_num, 4), dtype=torch.float32 ) regen_count += 1 # When the input score is too close, the returned index order is not guaranteed (same # behaviour as PyTorch). So instead of generating random scores by torch.rand, use shuffle. 
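# (For example, with box_num == 5 the shuffled scores are a permutation of [0., 1., 2., 3., 4.]:
# exactly representable in fp32 and well separated, so the score ordering inside NMS is
# deterministic, unlike the near-tied values torch.rand can produce.)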
input_scores = np.arange(box_num) np.random.shuffle(input_scores) input_scores = torch.tensor(input_scores, dtype=torch.float32) if dynamic_input: upper_bound = 4096 if backend[0] == "mlprogram" else -1 converter_input_type = [ ct.TensorType(shape=(RangeDim(1, upper_bound), 4)), ct.TensorType(shape=(RangeDim(1, upper_bound),)), ] else: converter_input_type = [ ct.TensorType(shape=input_boxes.shape), ct.TensorType(shape=input_scores.shape), ] nms_model = NmsModel() nms_model.eval() expected_results = nms_model(input_boxes, input_scores) TorchBaseTest.run_compare_torch( [input_boxes, input_scores], nms_model, expected_results=expected_results, input_as_shape=False, backend=backend, converter_input_type=converter_input_type, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) @pytest.mark.parametrize( "compute_unit, backend, minimum_deployment_target", itertools.product( compute_units, backends, [None, ct.target.iOS17], ), ) def test_nms_corner_case_iou_equal_threshold( self, compute_unit, backend: Tuple[str, str], minimum_deployment_target: ct.target, ): class NmsModel(torch.nn.Module): def forward(self, boxes, scores): return torchvision.ops.nms(boxes, scores, iou_threshold=0.2) input_boxes = torch.tensor( [ [3.0, 2.0, 3.0, 0.0], [0.0, 0.0, 2.0, 2.0], [1.0, 3.0, 2.0, 1.0], [0.0, 2.0, 1.0, 3.0], [1.0, 1.0, 2.0, 3.0], ], dtype=torch.float32, ) input_scores = torch.tensor([3.0, 2.0, 0.0, 1.0, 4.0], dtype=torch.float32) converter_input_type = [ ct.TensorType(shape=input_boxes.shape), ct.TensorType(shape=input_scores.shape), ] nms_model = NmsModel() nms_model.eval() expected_results = nms_model(input_boxes, input_scores) if backend[1] == "fp32" and minimum_deployment_target != ct.target.iOS17: with pytest.raises(AssertionError, match="Items are not equal"): # TODO: rdar://104966206 ([PyTorch] Re-enable NMS Corner Case Tests After PyTorch Fixes Bugs). # This is because the IOU between the last box ([1., 1., 2., 3.]) and the 2nd box ([0., 0., 2., 2.]) is # exactly 0.2 (IOU threshold), which leads to a corner case that PyTorch will remove the second box while # CoreML keeps it. According to PyTorch's doc, only boxes with `greater than iou_threshold` should be # removed, so it's a bug in PyTorch's side. # # The reason of the PyTorch bug is: # They always use fp64 for the IOU threshold in their c++ backend, # even if the boxes and the scores can be fp32, # so the IOU threshold (fp64 0.2) rounds to 0.20000000000000001 and # the IOU between the last and the 2nd boxes (fp32 0.2) rounds to 0.20000000298023224, # leading to fp32 0.2 > fp64 0.2 and the removal happens TorchBaseTest.run_compare_torch( [input_boxes, input_scores], nms_model, expected_results=expected_results, input_as_shape=False, backend=backend, converter_input_type=converter_input_type, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) else: # In fp16, the IOU threshold (fp16 0.2) rounds to 0.199951171875. # On CPU, espresso computes everything in fp32, so the IOU between # the last and the 2nd boxes (fp32 0.2) rounds to 0.20000000298023224, # leading to fp32 0.2 > fp16 0.2 and the removal happens # # In IOS17, the CoreML and PyTorch have same results for the corner case. 
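# A quick numeric illustration of the rounding argument above (a sketch; reproducible with numpy):
#     >>> import numpy as np
#     >>> float(np.float16(0.2))
#     0.199951171875
#     >>> float(np.float32(0.2))
#     0.20000000298023224
#     >>> 0.2  # Python float is fp64; the stored value is 0.2000000000000000111...
#     0.2
# Hence the fp32 IOU value exceeds both the fp16 and fp64 thresholds, which is the
# "fp32 0.2 > fp16 0.2" and "fp32 0.2 > fp64 0.2" removal behavior described above.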
TorchBaseTest.run_compare_torch( [input_boxes, input_scores], nms_model, expected_results=expected_results, input_as_shape=False, backend=backend, converter_input_type=converter_input_type, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) # Change the last input box to make IOU slightly larger than 0.2, the output of CoreML will match PyTorch. input_boxes[-1][-1] = 2.997 expected_results = nms_model(input_boxes, input_scores) TorchBaseTest.run_compare_torch( [input_boxes, input_scores], nms_model, expected_results=expected_results, input_as_shape=False, backend=backend, converter_input_type=converter_input_type, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) # Change the last input box to make IOU slightly smaller than 0.2, the output of CoreML will match PyTorch. input_boxes[-1][-1] = 3.003 expected_results = nms_model(input_boxes, input_scores) TorchBaseTest.run_compare_torch( [input_boxes, input_scores], nms_model, expected_results=expected_results, input_as_shape=False, backend=backend, converter_input_type=converter_input_type, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) class TestTensorSize(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_tensor_size( self, compute_unit: ct.ComputeUnit.CPU_ONLY, backend: List[Tuple[str]], frontend ): class TestModel(torch.nn.Module): def forward(self, x): # torch.export cannot deal with # * non-tensor output (because torch.export will try to call .detach) # * empty graph (i.e. no tenosr operation) # so we use an op to wrap the output into tensor if frontend in TORCH_EXPORT_BASED_FRONTENDS: return torch.tensor(x.size()) else: return x.size() self.run_compare_torch( [(1, 2, 3)], TestModel(), backend=backend, compute_unit=compute_unit, frontend=frontend, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, dim, minimum_deployment_target", itertools.product( compute_units, [("mlprogram", "fp16")], frontends, [2, -1], [None, ct.target.iOS17], ), ) def test_tensor_size_with_dim( self, compute_unit: ct.ComputeUnit.CPU_ONLY, backend: List[Tuple[str]], frontend, dim: int, minimum_deployment_target: ct.target, ): class TestModel(torch.nn.Module): def forward(self, x): # torch.export cannot deal with # * non-tensor output (because torch.export will try to call .detach) # * empty graph (i.e. 
no tenosr operation) # so we use an op to wrap the output into tensor if frontend in TORCH_EXPORT_BASED_FRONTENDS: return torch.tensor(x.size(dim=dim)) else: return x.size(dim=dim) self.run_compare_torch( [(1, 2, 3)], TestModel(), backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, frontend=frontend, ) class TestBitwiseAnd(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_bitwise_and( self, compute_unit: ct.ComputeUnit.CPU_ONLY, backend: List[Tuple[str]], frontend: TorchFrontend, ): class TestModel(torch.nn.Module): def forward(self, x, y): return torch.bitwise_and(x, y) input_shape = (2, 3) input_data_x = torch.rand(*input_shape) > 0.2 input_data_y = torch.rand(*input_shape) < 0.8 self.run_compare_torch( [input_data_x, input_data_y], TestModel(), frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_bitwise_and_unsupport_input( self, compute_unit: ct.ComputeUnit.CPU_ONLY, backend: List[Tuple[str]], frontend: TorchFrontend, ): class TestModel(torch.nn.Module): def forward(self, x, y): return torch.bitwise_and(x, y) input_shape = (2, 3) input_data_x = torch.randint(low=0, high=10, size=input_shape, dtype=torch.int32) input_data_y = torch.randint(low=0, high=10, size=input_shape, dtype=torch.int32) with pytest.raises( NotImplementedError, match="The `bitwise_and` op only supports boolean input", ): self.run_compare_torch( [input_data_x, input_data_y], TestModel(), frontend=frontend, backend=backend, compute_unit=compute_unit, input_as_shape=False, ) class TestUnfold(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, input_shape, kernel_size, padding, stride", itertools.product( compute_units, backends, frontends, [(1, 1, 10, 11), (5, 3, 12, 13)], [(2, 3)], [0, 1, 8, (1, 3), (2, 6), (0, 5)], [1, 2, 7, (2, 3), (5, 4)], ), ) def test_unfold( self, compute_unit, backend, frontend, input_shape, kernel_size, padding, stride ): if frontend == TorchFrontend.EXECUTORCH: pytest.skip("ExecuTorch produces rank > 5 tensor") self.run_compare_torch( input_shape, ModuleWrapper( function=torch.nn.functional.unfold, kwargs={ "kernel_size": kernel_size, "padding": padding, "stride": stride, }, ), frontend=frontend, backend=backend, compute_unit=compute_unit, ) class TestFold(TorchBaseTest): @staticmethod def construct_block_count( output_size: Tuple[int], kernel_size: Tuple[int], dilation=1, padding=0, stride=1, ): dim = len(kernel_size) if not isinstance(dilation, tuple): dilation = (dilation,) * dim if not isinstance(padding, tuple): padding = (padding,) * dim if not isinstance(stride, tuple): stride = (stride,) * dim block_count = 1 for i in range(dim): block_count *= np.floor( (output_size[i] + 2 * padding[i] - dilation[i] * (kernel_size[i] - 1) - 1) / stride[i] + 1 ).astype(np.int32) return block_count @pytest.mark.parametrize( "compute_unit, backend, frontend, N, C, output_size, kernel_size", itertools.product( compute_units, backends, frontends, [1, 2], [1, 3], [(12, 12), (12, 24)], [(2, 2), (2, 3)], ), ) def test_unfold(self, compute_unit, backend, frontend, N, C, output_size, kernel_size): if frontend == TorchFrontend.EXECUTORCH: pytest.skip("torch._ops.aten._unsafe_index_put.default is not Aten Canonical") block_count = self.construct_block_count( output_size, kernel_size, 
stride=kernel_size, ) self.run_compare_torch( (N, C * np.prod(kernel_size), block_count), ModuleWrapper( function=torch.nn.functional.fold, kwargs={ "output_size": output_size, "kernel_size": kernel_size, "stride": kernel_size, }, ), frontend=frontend, backend=backend, compute_unit=compute_unit, ) class TestTupleUnpack(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product( compute_units, backends, frontends, ), ) def test_tuple_unpack(self, compute_unit, backend, frontend): class ReturnTupleModel(nn.Module): def forward(self, x): return x * 3, x * 4, x * 5 class TestModel(nn.Module): def __init__(self): super().__init__() self.return_tuple_layer = ReturnTupleModel() def forward(self, x): out1, out2, out3 = self.return_tuple_layer(x) return out1.relu(), out2.sigmoid(), out3.softmax(1) self.run_compare_torch( (1, 2, 3), TestModel(), compute_unit=compute_unit, backend=backend, frontend=frontend ) class TestTupleIndex(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_tuple_index(self, compute_unit, backend): class InnerModel(nn.Module): def forward(self, x): return (torch.tensor([0]), torch.tensor([1])) class OuterModel(nn.Module): def __init__(self): super().__init__() self.innermodel = torch.jit.trace(InnerModel().eval(), x) def forward(self, x): inner = self.innermodel(x) return inner[0] x = torch.rand(1, 3, 640, 640) self.run_compare_torch( x, OuterModel(), input_as_shape=False, use_scripting=True, backend=backend, compute_unit=compute_unit, ) @pytest.mark.skipif( platform.machine() == "x86_64", reason="The x86_64 has outdated PyTorch, which doesn't have _scaled_dot_product_flash_attention in fx node.", ) class TestScaledDotProductAttention(TorchBaseTest): """ Tests for torch.nn.functional.scaled_dot_product_attention op (https://pytorch.org/docs/stable/generated/torch.nn.functional.scaled_dot_product_attention.html) """ @pytest.mark.parametrize( "compute_unit, backend, frontend, minimum_deployment_target", itertools.product( compute_units, backends, frontends, [None, ct.target.iOS18], ), ) def test_different_batch_dims(self, compute_unit, backend, frontend, minimum_deployment_target): """ The query/key/value inputs can have different batch_dims. 
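For example (as exercised below when targeting iOS18 with the mlprogram backend), query (1, 2, 10, 3),
key (2, 1, 10, 3) and value (2, 2, 10, 3) differ only in their leading batch dims; those dims broadcast
against each other, and the converter materializes the broadcast with `tile` ops before the fused
`scaled_dot_product_attention` op.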
""" q_shape = [1, 2, 10, 3] k_shape = [2, 1, 10, 3] v_shape = [2, 2, 10, 3] input_shape = [ q_shape, k_shape, v_shape, ] model = ModuleWrapper( function=nn.functional.scaled_dot_product_attention, kwargs={ "attn_mask": None, "dropout_p": 0.0, "is_causal": False, }, ) res = self.run_compare_torch( input_shape, model, frontend=frontend, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) # mb.sdpa is introduced in iOS 18, so before iOS 18 we would decompose sdpa # torch.sdpa is not a core aten op, so executorch would decompose sdpa if ( backend[0] == "mlprogram" and minimum_deployment_target == ct.target.iOS18 and frontend != TorchFrontend.EXECUTORCH ): if backend[1] == "fp16": expected_ops = [ "cast", "tile", "cast", "tile", "cast", "scaled_dot_product_attention", ] else: expected_ops = ["tile", "tile", "scaled_dot_product_attention"] assert get_op_types_in_program(res[1]._mil_program) == expected_ops @pytest.mark.parametrize( "compute_unit, backend, frontend, minimum_deployment_target, rank, dynamic", itertools.product( compute_units, backends, frontends, [None, ct.target.iOS18], [2, 3, 4, 5], [True, False], ), ) def test_different_input_ranks_no_mask( self, compute_unit, backend, frontend, minimum_deployment_target, rank, dynamic ): """ The query/key/value inputs can be any rank 2 or greater. """ batch_size, seq_len, n_heads_1, n_heads_2, d = 2, 10, 3, 4, 7 if rank == 2: input_shape = (seq_len, d) elif rank == 3: input_shape = (batch_size, seq_len, d) elif rank == 4: input_shape = (batch_size, n_heads_1, seq_len, d) elif rank == 5: input_shape = (batch_size, n_heads_1, n_heads_2, seq_len, d) else: raise ValueError("invalid rank") model = ModuleWrapper( function=nn.functional.scaled_dot_product_attention, kwargs={ "attn_mask": None, "dropout_p": 0.0, "is_causal": False, }, ) if dynamic: converter_input_type = [ ct.TensorType( shape=(ct.RangeDim(upper_bound=10, default=batch_size),) + input_shape[1:] ) for _ in range(3) ] else: converter_input_type = None _, coreml_model, _, _, _, _ = self.run_compare_torch( [input_shape] * 3, model, frontend=frontend, backend=backend, converter_input_type=converter_input_type, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) # mb.sdpa is introduced in iOS 18, so before iOS 18 we would decompose sdpa # torch.sdpa is not a core aten op, so executorch would decompose sdpa if ( backend[0] == "mlprogram" and minimum_deployment_target == ct.target.iOS18 and frontend != TorchFrontend.EXECUTORCH ): pymil_inputs = list(coreml_model._mil_program.functions["main"].inputs.values()) is_io_fp16 = pymil_inputs[0].dtype == types.fp16 is_io_precision_same_as_compute_precision = is_io_fp16 == (backend[1] == "fp16") if rank == 2: if is_io_precision_same_as_compute_precision: expected_ops = [ "expand_dims", "expand_dims", "expand_dims", "scaled_dot_product_attention", "squeeze", ] else: expected_ops = [ "cast", "expand_dims", "cast", "expand_dims", "cast", "expand_dims", "scaled_dot_product_attention", "squeeze", ] else: if is_io_precision_same_as_compute_precision: expected_ops = ["scaled_dot_product_attention"] else: expected_ops = ["cast", "cast", "cast", "scaled_dot_product_attention"] assert get_op_types_in_program(coreml_model._mil_program) == expected_ops @pytest.mark.parametrize( "compute_unit, backend, frontend, minimum_deployment_target, seq_lengths, include_heads, dynamic", itertools.product( compute_units, backends, frontends, [None, ct.target.iOS18], [(5, 5), (5, 7), (6, 4)], [False, 
True], [True, False], ), ) def test_is_causal_flag( self, compute_unit, backend, frontend, minimum_deployment_target, seq_lengths, include_heads, dynamic, ): if frontend == TorchFrontend.EXECUTORCH: pytest.xfail( "https://github.com/apple/coremltools/issues/2199: placeholder assertion error" ) source_seq_len, target_seq_len = seq_lengths query_shape = (2, 2, target_seq_len, 7) if include_heads else (2, target_seq_len, 7) key_shape = (2, 2, source_seq_len, 7) if include_heads else (2, source_seq_len, 7) value_shape = key_shape model = ModuleWrapper( function=nn.functional.scaled_dot_product_attention, kwargs={ "attn_mask": None, "is_causal": True, }, ) if dynamic: converter_input_type = [ ct.TensorType( shape=(ct.RangeDim(upper_bound=10, default=input_shape[0]),) + input_shape[1:] ) for input_shape in [query_shape, key_shape, value_shape] ] else: converter_input_type = None res = self.run_compare_torch( [query_shape, key_shape, value_shape], model, frontend=frontend, backend=backend, converter_input_type=converter_input_type, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) # check that "fill" and "band_part" ops, which are needed to compute mask, have been constant folded mil_prog = res[1]._get_mil_internal() # assert that "lstm" ops are present in the mil program assert len(mil_prog.find_ops(op_type="fill")) == 0 assert len(mil_prog.find_ops(op_type="band_part")) == 0 @pytest.mark.parametrize( "compute_unit, backend, frontend, minimum_deployment_target, seq_lengths, bool_mask, dynamic", itertools.product( compute_units, backends, frontends, [None, ct.target.iOS18], [(5, 5), (7, 5)], [False, True], [False, True], ), ) def test_attn_mask( self, compute_unit, backend, frontend, minimum_deployment_target, seq_lengths, bool_mask, dynamic, ): if frontend != TorchFrontend.EXECUTORCH and bool_mask: pytest.xfail( "rdar://110499660 ([CI][Bug] test_attn_mask is occasionally failing when bool_mask = True)" ) source_seq_len, target_seq_len = seq_lengths query_shape = (2, 3, target_seq_len, 7) key_shape = (2, 3, source_seq_len, 7) value_shape = key_shape mask_shape = (target_seq_len, source_seq_len) query = generate_input_data(query_shape) key = generate_input_data(key_shape) value = generate_input_data(value_shape) if bool_mask: mask = torch.rand(mask_shape) > 0.5 mask = mask.bool() else: mask = generate_input_data(mask_shape) model = ModuleWrapper(function=nn.functional.scaled_dot_product_attention) if dynamic: converter_input_type = [ ct.TensorType( shape=(ct.RangeDim(upper_bound=10, default=input_data.shape[0]),) + input_data.shape[1:] ) for input_data in [query, key, value, mask] ] else: converter_input_type = None self.run_compare_torch( (query, key, value, mask), model, frontend=frontend, backend=backend, converter_input_type=converter_input_type, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, input_as_shape=False, ) @pytest.mark.parametrize( "compute_unit, backend, frontend", itertools.product(compute_units, backends, frontends), ) def test_scale(self, compute_unit, backend, frontend): batch_size, seq_len, n_heads, d = 2, 10, 3, 7 input_shape = (batch_size, n_heads, seq_len, d) model = ModuleWrapper( function=nn.functional.scaled_dot_product_attention, kwargs={ "attn_mask": None, "dropout_p": 0.0, "is_causal": False, "scale": 1.5, }, ) self.run_compare_torch( [input_shape] * 3, model, frontend=frontend, backend=backend, compute_unit=compute_unit, ) @pytest.mark.parametrize( "compute_unit, backend, frontend, 
minimum_deployment_target, mask_as_input, dynamic", itertools.product( compute_units, backends, frontends, [None, ct.target.iOS18], [True, False], [True, False], ), ) def test_toy_xformer_with_sdpa( self, compute_unit, backend, frontend, minimum_deployment_target, mask_as_input, dynamic, ): if frontend == TorchFrontend.EXECUTORCH and not mask_as_input: pytest.xfail( "https://github.com/apple/coremltools/issues/2199: placeholder assertion error" ) embedding_size = 32 seq_length = 16 n_heads = 4 batch_size = 2 num_blocks = 3 class AttentionBlock(nn.Module): def __init__(self, embed_dim=embedding_size, n_head=n_heads): super().__init__() self.query_proj_op = nn.Linear(embed_dim, embed_dim) self.key_proj_op = nn.Linear(embed_dim, embed_dim) self.value_proj_op = nn.Linear(embed_dim, embed_dim) self.out_proj_op = nn.Linear(embed_dim, embed_dim) self.n_head = n_head def forward(self, x, mask=None): # in comments below for shapes, using following notation: # B: batch_size, S: seq_length, E: embedding_size, h: n_heads # x: (B,S,E) # mask: (S,S) batch_size, seq_len, dim = x.shape query_proj = self.query_proj_op(x) # (B,S,E) key_proj = self.key_proj_op(x) # (B,S,E) value_proj = self.value_proj_op(x) # (B,S,E) # reshape to (B, h, S, E/h) query_proj = query_proj.reshape( batch_size, seq_len, self.n_head, dim // self.n_head ).permute( 0, 2, 1, 3 ) # (B, h, S, E/h) key_proj = key_proj.reshape( batch_size, seq_len, self.n_head, dim // self.n_head ).permute( 0, 2, 1, 3 ) # (B, h, S, E/h) value_proj = value_proj.reshape( batch_size, seq_len, self.n_head, dim // self.n_head ).permute( 0, 2, 1, 3 ) # (B, h, S, E/h) # now do scaled dot produce attention if mask is None: out = nn.functional.scaled_dot_product_attention( query_proj, key_proj, value_proj, is_causal=True ) # (B, h, S, E/h) else: out = nn.functional.scaled_dot_product_attention( query_proj, key_proj, value_proj, mask ) # (B, h, S, E/h) # reshape back to (B, S, E) out = out.permute(0, 2, 1, 3).reshape(batch_size, seq_len, dim) # (B, S, E) return self.out_proj_op(out) class MLPBlock(nn.Module): def __init__(self, embed_dim=embedding_size): super().__init__() self.fc1 = nn.Linear(embed_dim, embed_dim) self.activation = nn.GELU() self.fc2 = nn.Linear(embed_dim, embed_dim) def forward(self, x): x = self.fc1(x) x = self.activation(x) return self.fc2(x) class ToyTransformer(nn.Module): def __init__(self, n_blocks=num_blocks, embed_dim=embedding_size): super().__init__() self.attn_block = AttentionBlock(embed_dim=embed_dim) self.mlp = MLPBlock(embed_dim=embed_dim) self.n_blocks = n_blocks self.lnorm = nn.LayerNorm(embed_dim) def forward(self, x, mask=None): for i in range(self.n_blocks): x = self.attn_block(x, mask) + x x = self.lnorm(x) x = self.mlp(x) + x x = self.lnorm(x) return x model = ToyTransformer() input_shapes = ( [(batch_size, seq_length, embedding_size), (seq_length, seq_length)] if mask_as_input else [(batch_size, seq_length, embedding_size)] ) if dynamic: converter_input_type = [ ct.TensorType( shape=(ct.RangeDim(upper_bound=16, default=input_shape[0]),) + input_shape[1:] ) for input_shape in input_shapes ] else: converter_input_type = None self.run_compare_torch( input_shapes, model, converter_input_type=converter_input_type, frontend=frontend, backend=backend, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, ) def test_dropout_early_error_out(self): B, S, L, E, EV = 3, 5, 7, 16, 32 query_shape = (B, L, E) key_shape = (B, S, E) value_shape = (B, S, EV) query = generate_input_data(query_shape) key = 
generate_input_data(key_shape) value = generate_input_data(value_shape) model = ModuleWrapper( function=nn.functional.scaled_dot_product_attention, kwargs={"dropout_p": 0.0} ) self.run_compare_torch( (query, key, value), model, input_as_shape=False, ) with pytest.raises( ValueError, match=( r"A non-zero dropout probability is specified. Since Core ML " r"does not support dropout yet, we cannot convert it" ), ): model = ModuleWrapper( function=nn.functional.scaled_dot_product_attention, kwargs={"dropout_p": 0.1} ) self.run_compare_torch( (query, key, value), model, input_as_shape=False, ) class TestTransformer(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_transformer_encoder(self, compute_unit, backend): class TransformerEncoder(nn.Module): def __init__(self, input_size, hidden_size, nhead=1, num_layers=1, dropout_rate=0.1): super(TransformerEncoder, self).__init__() encoder_layers = nn.TransformerEncoderLayer( d_model=input_size, nhead=nhead, dim_feedforward=hidden_size, dropout=dropout_rate, ) self.transformer_encoder = nn.TransformerEncoder( encoder_layers, num_layers=num_layers ) def forward(self, x): y = self.transformer_encoder(x) return y model = TransformerEncoder(32, 16, nhead=4, num_layers=2) model.eval() self.run_compare_torch((3, 32), model, backend=backend, compute_unit=compute_unit) @pytest.mark.parametrize( "compute_unit, backend, dynamic", itertools.product(compute_units, backends, (True, False)), ) def test_transformer(self, compute_unit, backend, dynamic): if dynamic: inputs = [ ct.TensorType( shape=( ct.RangeDim(lower_bound=1, upper_bound=16), ct.RangeDim(lower_bound=1, upper_bound=4), 3, ) ), ct.TensorType( shape=( ct.RangeDim(lower_bound=1, upper_bound=16), ct.RangeDim(lower_bound=1, upper_bound=4), 3, ) ), ] else: inputs = [ct.TensorType(shape=(1, 4, 3)), ct.TensorType(shape=(1, 4, 3))] self.run_compare_torch( [(1, 4, 3), (1, 4, 3)], nn.Transformer( d_model=3, nhead=1, batch_first=True, ), converter_input_type=inputs, backend=backend, compute_unit=compute_unit, ) class TestFliplr(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, frontend, input_shape", itertools.product(compute_units, backends, frontends, [(2, 3), (3, 4, 5), (8, 2, 6, 4)]), ) def test_fliplr(self, compute_unit, backend, frontend, input_shape): class TestModel(nn.Module): def forward(self, x): return torch.fliplr(x) self.run_compare_torch( input_shape, TestModel(), compute_unit=compute_unit, backend=backend, frontend=frontend ) class TestMultinomial(TorchBaseTest): @pytest.mark.parametrize( "compute_unit, backend, num_samples", itertools.product(compute_units, backends, [1, 3]), ) def test_multinomial(self, compute_unit, backend, num_samples): class TestModel(nn.Module): def forward(self, x): return torch.multinomial(x, num_samples, replacement=True) # As sampling is random, we make one element significantly larger than others to make # outputs consistent. input_data = torch.tensor([0, 5e4, 0, 0, 1, 1, 1], dtype=torch.float) self.run_compare_torch( input_data, TestModel(), backend=backend, compute_unit=compute_unit, input_as_shape=False, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_multinomial_probs_instead_of_logits(self, compute_unit, backend): """ Verify the input to multinomial is probs instead of logits. When the number of drawing is large, the drawing results could tell us if the input is probs or logits. 
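As a rough worked example (torch.multinomial treats its input as unnormalized probabilities): with
weights [0.01, 0.1] the second class should be drawn about 0.1 / 0.11 ≈ 91% of the time, whereas if the
values were first passed through a softmax (i.e. treated as logits) it would be drawn only about
exp(0.1) / (exp(0.01) + exp(0.1)) ≈ 52% of the time.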
In this test we use only 2 classes, so we can compare the number of `1` in results to verify if the input is taken a logarithm or not. """ class TestModel(nn.Module): def forward(self, x): return torch.multinomial(x, 1000, replacement=True) input_data = torch.tensor([0.01, 0.1], dtype=torch.float) torch_model = TestModel() torch_model.eval() traced_model = torch.jit.trace(torch_model, input_data) mlmodel = ct.convert( traced_model, inputs=[ct.TensorType(name="input", shape=input_data.shape, dtype=np.float16)], outputs=[ct.TensorType(name="output", dtype=np.float16)], convert_to="mlprogram", compute_units=ct.ComputeUnit.CPU_ONLY, minimum_deployment_target=ct.target.iOS16, ) if ct.utils._is_macos(): mlmodel_out = mlmodel.predict({"input": input_data.numpy()})["output"] torch_out = torch_model(input_data).numpy() # The counting of 1 in PyTorch and CoreML output should be similar. assert np.abs(np.sum(mlmodel_out) - np.sum(torch_out)) / mlmodel_out.size < 0.05 @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_multinomial_not_supported(self, compute_unit, backend): class TestModel(nn.Module): def forward(self, x): return torch.multinomial(x, 4) class TestModelDynamicNumSamples(nn.Module): def forward(self, x): return torch.multinomial(x, x.shape[0], replacement=True) input_data = torch.tensor([0, 10, 0, 0, 1, 1, 1], dtype=torch.float) with pytest.raises( ValueError, match="When num_samples is larger than 1, only replacement=True is supported.", ): self.run_compare_torch( input_data, TestModel(), backend=backend, compute_unit=compute_unit, input_as_shape=False, ) with pytest.raises(ValueError, match="In torch.multinomial op, num_samples must be const"): converter_input_type = [TensorType(shape=(RangeDim(1, 10),), dtype=np.float32)] self.run_compare_torch( input_data, TestModelDynamicNumSamples(), backend=backend, compute_unit=compute_unit, input_as_shape=False, converter_input_type=converter_input_type, )
coremltools-8.0/coremltools/converters/mil/frontend/torch/test/test_torch_quantization_ops.py # Copyright (c) 2023, Apple Inc. All rights reserved.
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools from contextlib import nullcontext from typing import Optional import numpy as np import numpy.testing import pytest import torch import torchvision from packaging.version import Version import coremltools as ct import coremltools.optimize as cto from coremltools import TensorType from coremltools._deps import ( _HAS_TORCH, _HAS_TORCH_VISION, _HAS_TORCHAO, MSG_TORCH_NOT_FOUND, MSG_TORCH_VISION_NOT_FOUND, MSG_TORCHAO_NOT_FOUND, ) from coremltools.converters.mil import testing_reqs from coremltools.converters.mil.frontend.torch.utils import TorchFrontend from coremltools.converters.mil.mil import types from coremltools.converters.mil.testing_utils import get_op_types_in_program from coremltools.optimize.coreml import _quantization_passes from coremltools.test.ml_program.test_compression import get_test_model_and_data from coremltools.test.optimize.coreml.test_post_training_quantization import ( create_quantize_friendly_weight, create_sparse_weight, create_unique_weight, ) from .testing_utils import TorchBaseTest, frontends if _HAS_TORCHAO: import torchao from torchao.quantization import quant_primitives as torchao_quant pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) torch.manual_seed(30) np.random.seed(30) torch.backends.quantized.engine = "qnnpack" compute_units = testing_reqs.compute_units def _force_quantize_model( model: torch.nn.Module, q_dtype: torch.dtype, low: Optional[int] = None, high: Optional[int] = None, scale: Optional[float] = None, zero_point: Optional[int] = None, channel_axis: Optional[int] = None, ): """ In torch, the quantized model can only be obtained from PTQ. This utility allows us to produce an int8 quantized model. If channel_axis is set, it will do per-channel quantization instead of per-tensor, for the param that channel_axis is valid for. """ if scale is None: scale = 1.0 if zero_point is None: zero_point = 0 # modify the parameter to force the quantization within a specific range. 
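# For reference, the per-tensor affine scheme used below stores
# q = clamp(round(x / scale) + zero_point, qmin, qmax) and dequantizes as (q - zero_point) * scale;
# pre-scaling the random data by (x - zero_point) * scale therefore makes the stored integer
# representation roughly recover the requested [low, high) range.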
with torch.no_grad(): for name, param in model.named_parameters(): shape = param.shape input_data = ( torch.rand(*shape) if low is None else torch.randint(low, high, shape).float() ) input_data = (input_data - zero_point) * scale if channel_axis is not None and -len(shape) <= channel_axis < len(shape): scale = torch.Tensor([scale] * shape[channel_axis]) zero_point = torch.Tensor([zero_point] * shape[channel_axis]) new_value = torch.quantize_per_channel( input_data, scales=scale, zero_points=zero_point, axis=channel_axis, dtype=q_dtype, ) else: new_value = torch.quantize_per_tensor( input_data, scale=scale, zero_point=zero_point, dtype=q_dtype ) param_cls = type(param) new_value = param_cls(new_value, requires_grad=False).to(torch.device("cpu")) model._parameters[name] = new_value return model class TorchQuantizationBaseTest(TorchBaseTest): @staticmethod def run_compare_torch( input_data, model, atol=1e-04, rtol=1e-05, input_as_shape=True, minimum_deployment_target=ct.target.iOS17, compute_unit=ct.ComputeUnit.CPU_ONLY, frontend=TorchFrontend.TORCHSCRIPT, converter=ct.convert, ): # TODO(rdar://108472419): properly design a random input if input_as_shape: input_data = [torch.ones(*shape) for shape in input_data] return TorchBaseTest.run_compare_torch( input_data, model, atol=atol, rtol=rtol, input_as_shape=False, backend=("mlprogram", "fp32"), use_scripting=False, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, frontend=frontend, converter=converter, ) # TODO(rdar://107430678): test stand-alone quantize and dequantize when cast is ready class TestPyTorchQuantizationOps(TorchQuantizationBaseTest): @pytest.mark.parametrize( "quant_dtype, input_rank, is_zp_present, zp_dtype, are_params_tensors", itertools.product( (torch.qint8, torch.quint8, torch.qint32), (1, 3, 5), (True, False), (np.int8, np.uint8, np.int32), (True, False), ), ) def test_quantize_dequantize_per_tensor( self, quant_dtype, input_rank, is_zp_present, zp_dtype, are_params_tensors, ): input_shape = [*np.random.randint(low=1, high=5, size=(input_rank,))] scale = np.random.rand() zero_point = 0 if is_zp_present: low = 0 if quant_dtype == torch.quint8 or zp_dtype == np.uint8 else -128 high = 128 if quant_dtype == torch.qint8 or zp_dtype == np.int8 else 256 zero_point = np.random.randint(low, high, dtype=zp_dtype) if are_params_tensors: scale = torch.tensor([scale]) zero_point = torch.tensor([zero_point]) class Model(torch.nn.Module): def forward(self, x): quantized = torch.quantize_per_tensor(x, scale, zero_point, quant_dtype) dequantized = torch.dequantize(quantized) return dequantized model = Model() if quant_dtype == torch.qint32: with pytest.raises( ValueError, match=r"MIL quantization dtype must be int8 or uint8", ): self.run_compare_torch([input_shape], model) else: self.run_compare_torch([input_shape], model, atol=5e-4, rtol=5e-4) @pytest.mark.parametrize( "quant_dtype, input_rank, is_zp_present, zp_dtype", itertools.product( (torch.qint8, torch.quint8, torch.qint32), (1, 4, 5), (True, False), (torch.int8, torch.uint8, torch.int32), ), ) def test_quantize_dequantize_per_channel( self, quant_dtype, input_rank, is_zp_present, zp_dtype ): input_shape = [*np.random.randint(low=1, high=5, size=(input_rank,))] axis = np.random.randint(low=0, high=input_rank) scale = torch.rand(input_shape[axis]) zero_point = torch.zeros(input_shape[axis], dtype=zp_dtype) if is_zp_present: low = 0 if quant_dtype == torch.quint8 or zp_dtype == torch.uint8 else -128 high = 128 if quant_dtype == torch.qint8 or zp_dtype == 
torch.int8 else 256 zero_point = torch.randint(low, high, (input_shape[axis],), dtype=zp_dtype) class Model(torch.nn.Module): def forward(self, x): quantized = torch.quantize_per_channel(x, scale, zero_point, axis, quant_dtype) dequantized = torch.dequantize(quantized) return dequantized model = Model() if quant_dtype == torch.qint32: with pytest.raises( ValueError, match=r"MIL quantization dtype must be int8 or uint8", ): self.run_compare_torch([input_shape], model) else: self.run_compare_torch([input_shape], model, atol=5e-4, rtol=5e-4) # TODO(rdar://108463675): refactor torch op tests later to parametrize quantized vs standard ops class TestPytorchQuantizedOps(TorchQuantizationBaseTest): # PyTorch quantized_linear kernel only supports rank >= 2 @pytest.mark.parametrize( "use_relu, input_rank, quant_dtype", itertools.product([True, False], [2, 3, 4], [torch.quint8, torch.qint8]), ) def test_quantized_linear(self, use_relu, input_rank, quant_dtype): class Model(torch.nn.Module): def __init__(self): super().__init__() if use_relu: linear = torch.nn.intrinsic.quantized.LinearReLU else: linear = torch.nn.quantized.Linear self.quant_linear = linear(5, 4) def forward(self, x): x = torch.quantize_per_tensor(x, scale=1.0, zero_point=0, dtype=quant_dtype) x = self.quant_linear(x) return torch.dequantize(x) model = Model() if input_rank == 2: input_shape = (3, 5) elif input_rank == 3: input_shape = (1, 3, 5) elif input_rank == 4: input_shape = (1, 2, 3, 5) self.run_compare_torch([input_shape], model) @pytest.mark.parametrize( ",".join( [ "use_relu", "quant_dtype", "padding", "stride", "height", "width", "in_channels", "out_channels", "kernel_size", "dilation", "bias", ] ), [ (use_relu, quant_dtype, padding, stride, *param) for use_relu, quant_dtype, padding, stride, param in itertools.product( [True, False], [torch.quint8, torch.qint8], [1, 0], [1, 2, 3], [ (5, 3, 1, 1, 1, 1, True), (3, 3, 1, 1, 1, 3, False), (4, 3, 3, 3, 2, 1, True), (7, 3, 3, 3, 1, 1, False), ], ) ], ) def test_quantized_conv2d( self, use_relu, quant_dtype, padding, stride, height, width, in_channels, out_channels, kernel_size, dilation, bias, ): if padding == "same" and stride != 1: return class Model(torch.nn.Module): def __init__(self): super().__init__() if use_relu: conv = torch.nn.intrinsic.quantized.ConvReLU2d else: conv = torch.nn.quantized.Conv2d self.quant_conv = conv( in_channels=in_channels, out_channels=out_channels, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, bias=bias, dtype=quant_dtype, ) def forward(self, x): x = torch.quantize_per_tensor(x, scale=1.0, zero_point=0, dtype=quant_dtype) x = self.quant_conv(x) return torch.dequantize(x) model = Model() self.run_compare_torch( [(1, in_channels, height, width)], model, ) @pytest.mark.parametrize( "input_dtype", (np.int32, np.float32), ) def test_quantized_embedding(self, input_dtype): pytest.xfail("rdar://106152706 gather: Required param 'validate_indices' is missing") num_embeddings = 4 embedding_size = 10 B = 2 dim = 5 converter_input_type = TensorType(shape=(B, dim), dtype=input_dtype) # input shape: (B, dim) # output shape : (B, dim, embedding_size) # shape of weights : (num_embeddings, embedding_size) class EmbeddingModel(torch.nn.Module): def __init__(self): super().__init__() self.embedding = torch.nn.quantized.Embedding(num_embeddings, embedding_size) def forward(self, x): return self.embedding(x) input_data = np.random.randint(low=0, high=num_embeddings, size=(B, dim)) input_data = torch.from_numpy(input_data) model = 
EmbeddingModel() self.run_compare_torch( [input_data], model, input_as_shape=False, converter_input_type=converter_input_type ) # Tests for add, add_relu, mul # See: https://pytorch.org/docs/stable/generated/torch.ao.nn.quantized.QFunctional.html @pytest.mark.parametrize( "quant_dtype, qfunc_name", itertools.product( [torch.quint8, torch.qint8], ["add", "add_relu", "mul"], ), ) def test_qfunc_binary_ops(self, quant_dtype, qfunc_name): class Model(torch.nn.Module): def __init__(self): super().__init__() self.qfunc = torch.nn.quantized.QFunctional() def forward(self, x): x = torch.quantize_per_tensor(x, scale=1.0, zero_point=0, dtype=quant_dtype) x = getattr(self.qfunc, qfunc_name)(x, x) return torch.dequantize(x) model = Model() self.run_compare_torch([(2, 3)], model) @pytest.mark.xfail( reason="torch.ops.quantized.matmul is not supporting mixed precision computation.", strict=True, ) @pytest.mark.parametrize( "quant_dtype", [torch.quint8, torch.qint8], ) def test_quantized_matmul(self, quant_dtype): class Model(torch.nn.Module): def __init__(self): super().__init__() self.weight = torch.nn.parameter.Parameter(torch.rand(5, 4)) def forward(self, x): return torch.ops.quantized.matmul(x, self.weight, 0, 0) model = Model() model = _force_quantize_model(model, q_dtype=quant_dtype) input_shape = [(3, 5)] self.run_compare_torch(input_shape, model) @pytest.mark.parametrize( "compute_unit, quant_dtype, channel_axis, minimum_deployment_target", itertools.product( compute_units, [torch.quint8, torch.qint8], [0, 1, None], [ct.target.iOS16, ct.target.iOS17, ct.target.iOS18], ), ) def test_quantized_params( self, compute_unit, quant_dtype, channel_axis, minimum_deployment_target ): class Model(torch.nn.Module): def __init__(self): super().__init__() self.weight = torch.nn.parameter.Parameter(torch.rand(5, 4)) def forward(self, x): dequanitized_weight = torch.dequantize(self.weight) return torch.matmul(x, dequanitized_weight) model = Model() model = _force_quantize_model(model, q_dtype=quant_dtype, channel_axis=channel_axis) input_shape = [(3, 5)] res = self.run_compare_torch( input_shape, model, minimum_deployment_target=minimum_deployment_target, compute_unit=compute_unit, ) prog = res[1]._mil_program if minimum_deployment_target < ct.target.iOS18: assert get_op_types_in_program(prog) == ["constexpr_affine_dequantize", "linear"] else: assert get_op_types_in_program(prog) == ["constexpr_blockwise_shift_scale", "matmul"] @pytest.mark.skipif(not _HAS_TORCHAO, reason=MSG_TORCHAO_NOT_FOUND) @pytest.mark.parametrize( "use_numpy, inner_k_tiles, group_size", itertools.product([True, False], [2, 4, 8], [32, 64]), ) def test_unpack_int4packed_by_mm_with_eye_matrix(self, use_numpy, inner_k_tiles, group_size): """ Check if the packed weight could be restored by _weight_int4pack_mm with eye matrix on CPU. As there is no kernel implemented for CPU to unpack the data packed by `torch._convert_weight_to_int4pack`, we use `torch._weight_int4pack_mm` to do matrix multiplication with an eye matrix to get unpacked data. 
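Put differently: since the packed weight is only consumable through that mm kernel, multiplying an
identity matrix of matching size by the packed weight via `torch._weight_int4pack_mm` yields (up to
the transpose applied below) the dequantized weight itself, which is then compared against the
original `y` and against torchao's reference dequantization.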
""" if use_numpy: y_np = numpy.random.rand(128, 128).astype(np.float32) y = torch.from_numpy(y_np).to(torch.device("cpu")) else: y = torch.rand(128, 128, dtype=torch.float32, device=torch.device("cpu")) ( y_quantized, y_scales_and_zeros, ) = torchao.quantization.utils.groupwise_affine_quantize_tensor( y, n_bit=4, groupsize=group_size, dtype=torch.float32 ) y_int4packed = torch._convert_weight_to_int4pack(y_quantized, inner_k_tiles) y_unpacked_shape = (y_int4packed.shape[0] * 8, y_int4packed.shape[1] * (inner_k_tiles * 16)) eye_shape = y_unpacked_shape[1] eye_matrix = torch.eye(eye_shape, device=torch.device("cpu"), dtype=torch.float32) if Version(torch.__version__) < Version("2.4.0"): # The `torch._weight_int4pack_mm` op requires bfloat16 before PyTorch 2.4.0. eye_matrix = eye_matrix.to(torch.bfloat16) y_scales_and_zeros = y_scales_and_zeros.to(torch.bfloat16) y_dequant = torch._weight_int4pack_mm( eye_matrix, y_int4packed, group_size, y_scales_and_zeros, ) y_dequant = y_dequant.t().contiguous().float() # Makes sure this `_weight_int4pack_mm` with eye matrix fully restores the original y. np.testing.assert_allclose(y_dequant.numpy(), y.numpy(), atol=0.035, rtol=0.05) # Also verifies that the quantized y could be accurately reproduced by torchao utils. scales = torch.transpose(y_scales_and_zeros[:, :, 0], 0, 1) zero_points = torch.transpose(y_scales_and_zeros[:, :, 1], 0, 1) block_size = (1, group_size) y_dequant_quantized = torchao_quant.quantize_affine( y_dequant, block_size, scales, zero_points, torch.int32, quant_min=0, quant_max=2**4 - 1, zero_point_domain=torchao_quant.ZeroPointDomain.FLOAT, ) assert torch.equal(y_quantized, y_dequant_quantized) # The torchao dequantization utils should be able to recover the original y. y_dequantized_by_torchao = torchao_quant.dequantize_affine( y_quantized, (1, group_size), scales, zero_points, torch.int32, quant_min=0, quant_max=2**4 - 1, zero_point_domain=torchao_quant.ZeroPointDomain.FLOAT, ) np.testing.assert_allclose(y_dequant.numpy(), y_dequantized_by_torchao.numpy(), rtol=4e-3) @pytest.mark.skipif( Version(torch.__version__) < Version("2.4.0"), reason="_weight_int4pack_mm requires bfloat16 before PyTorch 2.4.0", ) @pytest.mark.skipif(not _HAS_TORCHAO, reason=MSG_TORCHAO_NOT_FOUND) @pytest.mark.parametrize( "compute_unit, inner_k_tiles, group_size", itertools.product(compute_units, [2, 4, 8], [32, 64]), ) def test_weight_int4pack_mm(self, compute_unit, inner_k_tiles, group_size): y = torch.rand(128, 128, dtype=torch.float32, device=torch.device("cpu")) class Model(torch.nn.Module): def forward(self, x): ( y_quantized, y_scales_and_zeros, ) = torchao.quantization.utils.groupwise_affine_quantize_tensor( y, n_bit=4, groupsize=group_size, dtype=torch.float32 ) y_int4packed = torch._convert_weight_to_int4pack(y_quantized, inner_k_tiles) return torch._weight_int4pack_mm(x, y_int4packed, group_size, y_scales_and_zeros) model = Model().to(torch.device("cpu")) input_shape = [(2, 128)] res = self.run_compare_torch( input_shape, model, minimum_deployment_target=ct.target.iOS18, compute_unit=compute_unit, rtol=0.1, ) prog = res[1]._mil_program assert get_op_types_in_program(prog) == ["constexpr_blockwise_shift_scale", "linear"] @pytest.mark.skipif( not hasattr(torch.ops.quantized_decomposed, "embedding_4bit"), reason="The `embedding_4bit` op doesn't exist in quantized_decomposed custom opset.", ) @pytest.mark.parametrize( "compute_unit, group_size, dtype, signed", itertools.product( compute_units, [32, 64], [None, torch.float16, torch.float32], [False, 
True] ), ) def test_quantized_decomposed_embedding_4bit_dtype( self, compute_unit, group_size, dtype, signed ): if not signed: # To reproduce this executorch bug, use following settings # scales = torch.ones(size=scales_shape, dtype=torch.float32, device=torch.device("cpu")) # input_data = torch.zeros(size=(1, 1), dtype=torch.int32) # Then you will find coreml outputs is the expected (consistent with `unpacked_weight`). pytest.skip( "rdar://135216194 (Executorch embedding_4bit implementation bug for unsigned quantization)" ) quant_low = -8 if signed else 0 quant_high = 7 if signed else 15 quant_dtype = torch.int8 if signed else torch.uint8 weight_shape = (128, 128) unpacked_weight = torch.randint( low=quant_low, high=quant_high + 1, size=weight_shape, dtype=quant_dtype, ) # Pack the weight to embedding_4bit's usable format. weight_range_shifted = unpacked_weight.add(-quant_low).view(torch.uint8) weight_view = weight_range_shifted.view( unpacked_weight.shape[0], unpacked_weight.shape[1] // 2, 2 ) weight_even = weight_view[:, :, 0] * 16 # left shift 4 weight_odd = weight_view[:, :, 1] weight = weight_even + weight_odd scales_shape = list(weight_shape) scales_shape[-1] = weight_shape[-1] // group_size scales = torch.rand(*scales_shape, dtype=torch.float32) class Model(torch.nn.Module): def forward(self, indices: torch.Tensor): if dtype is not None: return torch.ops.quantized_decomposed.embedding_4bit( weight, scales, None, quant_low, quant_high, indices, dtype=dtype ) else: return torch.ops.quantized_decomposed.embedding_4bit( weight, scales, None, quant_low, quant_high, indices ) # The 4-bit packing-unpacking in torch could be messed up when transferring between devices, so it's safer # to specify device at the beginning. model = Model().to(torch.device("cpu")) input_data = torch.randint(low=0, high=weight_shape[-1], size=(2, 128), dtype=torch.int32) res = self.run_compare_torch( input_data, model, input_as_shape=False, minimum_deployment_target=ct.target.iOS18, compute_unit=compute_unit, rtol=1e-3, ) prog = res[1]._mil_program assert get_op_types_in_program(prog) == ["constexpr_blockwise_shift_scale", "gather"] @pytest.mark.skipif(not _HAS_TORCH_VISION, reason=MSG_TORCH_VISION_NOT_FOUND) class TestTorchvisionQuantizedModels(TorchQuantizationBaseTest): # TODO (rdar://107444188): add other torchvision quantized models # As of torchvision 0.13.1, there are 5 quantized models: # googlenet, inception, mobilenet, resnet, shufflenet # Unfortunately, only mobilenet is working. 
Others would have # RuntimeError: Quantized backend not supported # Presumably because they need `fbgemm`, which does not support macOS # We should add them to our end-to-end test once torchvision fix their macOS def test_quantized_mobilenetv2(self): model = torchvision.models.quantization.mobilenet_v2(pretrained=True, quantize=True) self.run_compare_torch([(1, 3, 224, 224)], model, atol=1.0) class TestPytorchCarryCompressionInfo(TorchQuantizationBaseTest): """Test compressed PyTorch models which use register_buffer to carry compression info.""" @pytest.mark.parametrize( "compute_unit, n_bits, signed, use_linear, minimum_deployment_target, frontend", itertools.product( compute_units, [4, 8], [True, False], [True, False], [ct.target.iOS16, ct.target.iOS18], frontends, ), ) def test_quantization( self, compute_unit, n_bits, signed, use_linear, minimum_deployment_target, frontend ): if n_bits == 4 and minimum_deployment_target < ct.target.iOS18: pytest.skip("Sub-byte quantization is only supported since iOS18.") model, inputs, _, _ = get_test_model_and_data( quantize_config=cto.coreml.OpLinearQuantizerConfig( mode="linear_symmetric", dtype=types.get_nbits_int_builtin_type(n_bits, signed), granularity="per_tensor", ), use_linear=use_linear, ) target_scale_shape = (1, 1) if use_linear else (1, 1, 1, 1) scale = np.array([2.0], dtype=np.float32).reshape(*target_scale_shape) zero_point = np.array( [0 if signed else 2 ** (n_bits - 1)], dtype=np.int8 if signed else np.uint8 ).reshape(*target_scale_shape) model.register_buffer("_COREML_/metadata_version", torch.tensor(2)) model.register_buffer("_COREML_/weight/compression_type", torch.tensor([3])) model.register_buffer("_COREML_/weight/quantization_n_bits", torch.tensor(n_bits)) model.register_buffer("_COREML_/weight/quantization_scale", torch.from_numpy(scale)) model.register_buffer("_COREML_/weight/zero_point", torch.from_numpy(zero_point)) input_shape = [input.shape.to_list() for input in inputs] res = self.run_compare_torch( input_shape, model, minimum_deployment_target=minimum_deployment_target, compute_unit=compute_unit, frontend=frontend, converter=ct.convert, rtol=1e-04, atol=1e-03, ) main_func = res[1]._mil_program.functions["main"] target_dtype_str = ("int" if signed else "uint") + str(n_bits) if minimum_deployment_target >= ct.target.iOS18: quantize_ops = main_func.find_ops(op_type="constexpr_blockwise_shift_scale") assert len(quantize_ops) > 0 for quantize_op in quantize_ops: assert types.builtin_to_string(quantize_op.data.dtype) == target_dtype_str if not signed: assert types.builtin_to_string(quantize_op.offset.dtype) == target_dtype_str else: quantize_ops = main_func.find_ops(op_type="constexpr_affine_dequantize") assert len(quantize_ops) > 0 for quantize_op in quantize_ops: assert types.builtin_to_string(quantize_op.quantized_data.dtype) == target_dtype_str assert types.builtin_to_string(quantize_op.zero_point.dtype) == target_dtype_str @pytest.mark.parametrize( "compute_unit, n_bits, minimum_deployment_target, frontend", itertools.product(compute_units, [4, 8], [ct.target.iOS16, ct.target.iOS18], frontends), ) def test_multiple_parameters_in_same_layer( self, compute_unit, n_bits, minimum_deployment_target, frontend ): """Test one layer has multiple parameters (such as weight and bias in a linear layer)""" if n_bits == 4 and minimum_deployment_target < ct.target.iOS18: pytest.skip("Sub-byte quantization is only supported since iOS18.") class Model(torch.nn.Module): def __init__(self): super(Model, self).__init__() self.linear_1 = 
torch.nn.Linear(16, 32) self.linear_2 = torch.nn.Linear(32, 64) def forward(self, x): return self.linear_2(self.linear_1(x)) model = Model().eval() with torch.no_grad(): fake_weight_scale = 2 if n_bits == 4 else 40 model.linear_2.weight = torch.nn.Parameter( torch.from_numpy( np.ones_like(model.linear_2.weight.detach().numpy()) * fake_weight_scale ).float() ) model.linear_2.bias = torch.nn.Parameter( torch.from_numpy( np.ones_like(model.linear_2.bias.detach().numpy()) * fake_weight_scale ).float() ) # Register buffers for both weight and bias for linear_2 layer. weight_scale = np.array([2.0], dtype=np.float32).reshape(1, 1) bias_scale = np.array([2.0], dtype=np.float32) model.linear_2.register_buffer("_COREML_/weight/compression_type", torch.tensor([3])) model.linear_2.register_buffer("_COREML_/weight/quantization_n_bits", torch.tensor(n_bits)) model.linear_2.register_buffer( "_COREML_/weight/quantization_scale", torch.from_numpy(weight_scale) ) model.linear_2.register_buffer("_COREML_/bias/compression_type", torch.tensor([3])) model.linear_2.register_buffer("_COREML_/bias/quantization_n_bits", torch.tensor(n_bits)) model.linear_2.register_buffer( "_COREML_/bias/quantization_scale", torch.from_numpy(bias_scale) ) model.register_buffer("_COREML_/metadata_version", torch.tensor(2)) res = self.run_compare_torch( [(8, 16)], model, minimum_deployment_target=minimum_deployment_target, compute_unit=compute_unit, frontend=frontend, converter=ct.convert, ) main_func = res[1]._mil_program.functions["main"] quantize_op_type = ( "constexpr_blockwise_shift_scale" if minimum_deployment_target >= ct.target.iOS18 else "constexpr_affine_dequantize" ) # Only the linear_2 layer got quantized based on registered buffers. linear_ops = main_func.find_ops(op_type="linear") assert linear_ops[0].weight.op.op_type == "const" assert linear_ops[0].bias.op.op_type == "const" if frontend == TorchFrontend.EXECUTORCH: # In EXECUTORCH, the second linear layer is represented by `matmul` and `add` op. matmul_op = main_func.find_ops(op_type="matmul")[0] add_op = main_func.find_ops(op_type="add")[0] assert matmul_op.y.op.op_type == quantize_op_type assert add_op.x.op.op_type == quantize_op_type else: assert linear_ops[1].weight.op.op_type == quantize_op_type assert linear_ops[1].bias.op.op_type == quantize_op_type quantize_ops = main_func.find_ops(op_type=quantize_op_type) assert len(quantize_ops) == 2 for quantize_op in quantize_ops: if minimum_deployment_target >= ct.target.iOS18: assert types.builtin_to_string(quantize_op.data.dtype) == f"uint{n_bits}" else: assert types.builtin_to_string(quantize_op.quantized_data.dtype) == f"uint{n_bits}" def test_invalid_compression_info(self): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data() # Invalid key combination (didn't specify compression schema) model.register_buffer("_COREML_/weight/quantization_n_bits", torch.tensor(4)) with pytest.raises( ValueError, match="There are coreml compression related buffers registered in the torch .* but " "the 'compression_type' is not set", ): self.run_compare_torch( [input.shape.to_list() for input in inputs], torch.jit.trace(model, torch_input_values), minimum_deployment_target=ct.target.iOS18, compute_unit=ct.ComputeUnit.CPU_ONLY, converter=ct.convert, ) # Invalid key names. 
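# (Registered buffers follow the "_COREML_/<param_name>/<field>" convention used throughout these
# tests, where <field> is one of the recognized metadata fields such as compression_type, lut,
# quantization_n_bits, quantization_scale, zero_point, or palettization_scale; an unrecognized
# <field> such as the one below is expected to raise.)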
model.register_buffer("_COREML_/weight/invalid_key", torch.tensor(4)) with pytest.raises(AttributeError, match="has no attribute 'invalid_key'"): self.run_compare_torch( [input.shape.to_list() for input in inputs], torch.jit.trace(model, torch_input_values), minimum_deployment_target=ct.target.iOS18, compute_unit=ct.ComputeUnit.CPU_ONLY, converter=ct.convert, ) # The lut must be specified for palettization. model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data() model.register_buffer("_COREML_/weight/compression_type", torch.tensor([2])) with pytest.raises( ValueError, match="Missing lut in compression info. Please register a buffer for lut." ): self.run_compare_torch( [input.shape.to_list() for input in inputs], torch.jit.trace(model, torch_input_values), minimum_deployment_target=ct.target.iOS18, compute_unit=ct.ComputeUnit.CPU_ONLY, converter=ct.convert, ) @pytest.mark.parametrize( "compute_unit, n_bits, group_size, channel_axis, cluster_dim, use_linear, minimum_deployment_target, frontend", itertools.product( compute_units, [4, 8], [0, 1, 2], [0, 1], [1, 2], [True, False], [ct.target.iOS16, ct.target.iOS18], frontends, ), ) def test_palettization( self, compute_unit, n_bits, group_size, channel_axis, cluster_dim, use_linear, minimum_deployment_target, frontend, ): if ( group_size in (0, 2) and cluster_dim == 2 and minimum_deployment_target == ct.target.iOS18 ): pytest.xfail("rdar://131964912 [Quantization] Test Should not Overflow FP16") if cluster_dim > 1: if minimum_deployment_target < ct.target.iOS18: pytest.skip("Vector palettization is only supported in iOS18+") if group_size != 0 and group_size < cluster_dim: pytest.skip("Cluster dim must <= group size.") model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data( multi_layer=True, use_linear=use_linear, ) if use_linear: # per-channel scales for the [32, 64] and [16, 32] weight. scale_1 = np.array([2.0] * 32, dtype=np.float32).reshape(32, 1) scale_2 = np.array([3.0] * 16, dtype=np.float32).reshape(16, 1) else: # per-channel scales for the [32, 64, 2, 2] and [64, 32, 2, 2] weight. scale_1 = np.array([2.0] * 32, dtype=np.float32).reshape(32, 1, 1, 1) scale_2 = np.array([3.0] * 64, dtype=np.float32).reshape(64, 1, 1, 1) layername_1 = "linear_1" if use_linear else "conv_1" layername_2 = "linear_2" if use_linear else "conv_2" unique_weight_1 = create_unique_weight( getattr(model, layername_1).weight, nbits=n_bits, vector_size=cluster_dim, vector_axis=channel_axis, ) unique_weight_2 = create_unique_weight( getattr(model, layername_2).weight, nbits=n_bits, vector_size=cluster_dim, vector_axis=channel_axis, ) # Use grouped-channel-wise lut for layer1 for iOS18+. block_sizes = [0] * len(unique_weight_1.shape) if minimum_deployment_target >= ct.target.iOS18: block_sizes[channel_axis] = group_size lut_1_params = _quantization_passes.palettize_weights.blockwise_compress( unique_weight_1, "UNIQUE", nbits=n_bits, block_sizes=block_sizes, cluster_dim=cluster_dim, channel_axis=channel_axis, ) # Use per-tensor lut for layer2. lut_2_params = _quantization_passes.palettize_weights.blockwise_compress( unique_weight_2, "UNIQUE", nbits=n_bits, block_sizes=[0] * len(unique_weight_2.shape), cluster_dim=cluster_dim, channel_axis=channel_axis, ) if minimum_deployment_target >= ct.target.iOS18: # Only do per-channel-scale for iOS18+. 
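# (Scaling the palettized weights here mimics a weight that was palettized and then rescaled per
# channel; the "_COREML_/weight/palettization_scale" buffers registered below let the converter
# factor that scale back out of the reconstructed weight.)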
unique_weight_1 *= scale_1 unique_weight_2 *= scale_2 with torch.no_grad(): getattr(model, layername_1).weight = torch.nn.Parameter(torch.Tensor(unique_weight_1)) getattr(model, layername_2).weight = torch.nn.Parameter(torch.Tensor(unique_weight_2)) model.register_buffer("_COREML_/metadata_version", torch.tensor(1)) if minimum_deployment_target >= ct.target.iOS18: getattr(model, layername_1).register_buffer( "_COREML_/weight/compression_type", torch.tensor([2]) ) getattr(model, layername_1).register_buffer( "_COREML_/weight/lut", torch.tensor(lut_1_params.lut) ) getattr(model, layername_1).register_buffer( "_COREML_/weight/palettization_scale", torch.from_numpy(scale_1) ) getattr(model, layername_2).register_buffer( "_COREML_/weight/compression_type", torch.tensor([2]) ) getattr(model, layername_2).register_buffer( "_COREML_/weight/lut", torch.tensor(lut_2_params.lut) ) if minimum_deployment_target >= ct.target.iOS18: getattr(model, layername_2).register_buffer( "_COREML_/weight/palettization_scale", torch.from_numpy(scale_2) ) input_shape = [input.shape.to_list() for input in inputs] res = self.run_compare_torch( input_shape, model, minimum_deployment_target=minimum_deployment_target, compute_unit=compute_unit, frontend=frontend, converter=ct.convert, rtol=0.2 if cluster_dim > 1 else 1e-5, # Vector palettization has larger info loss. ) main_func = res[1]._mil_program.functions["main"] if minimum_deployment_target >= ct.target.iOS18: expected_dtype = f"uint{n_bits}" expected_quantize_ops_num = 2 expected_palettize_ops_num = 2 # The lut with pcs op order is determined by canonicalize_quantized_lut_pattern graph pass. palettize_op_child_op_type = "linear" if use_linear else "conv" else: expected_dtype = "uint8" expected_quantize_ops_num = 0 expected_palettize_ops_num = 1 # The iOS16 doesn't have per-channel-scale, so lut output is directly fed into next op. 
palettize_op_child_op_type = "linear" if use_linear else "conv" quantize_ops = main_func.find_ops(op_type="constexpr_blockwise_shift_scale") assert len(quantize_ops) == expected_quantize_ops_num for quantize_op in quantize_ops: assert quantize_op.outputs[0].child_ops[0].op_type == "constexpr_lut_to_dense" palettize_ops = main_func.find_ops(op_type="constexpr_lut_to_dense") assert len(palettize_ops) == expected_palettize_ops_num for palettize_op in palettize_ops: assert types.builtin_to_string(palettize_op.indices.dtype) == expected_dtype assert palettize_op.outputs[0].child_ops[0].op_type == palettize_op_child_op_type if minimum_deployment_target >= ct.target.iOS18: assert palettize_op.lut.shape[-1] == cluster_dim @pytest.mark.parametrize( "compute_unit, minimum_deployment_target, frontend", itertools.product(compute_units, [ct.target.iOS16, ct.target.iOS18], frontends), ) def test_palettization_8bit_lut(self, compute_unit, minimum_deployment_target, frontend): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data( multi_layer=True ) unique_weight_1 = create_unique_weight(model.conv_1.weight, nbits=4) unique_weight_2 = create_unique_weight(model.conv_2.weight, nbits=6) lut_1_params = _quantization_passes.palettize_weights.grouped_channelwise_compress( unique_weight_1, "UNIQUE", nbits=4, channel_axis=0, channel_group_size=0, ) quant_1_params = _quantization_passes.linear_quantize_weights.blockwise_compress( lut_1_params.lut, nbits=8, mode="LINEAR", signed=True, block_sizes=[0] * len(lut_1_params.lut.shape), ) lut_2_params = _quantization_passes.palettize_weights.grouped_channelwise_compress( unique_weight_2, "UNIQUE", nbits=6, channel_axis=1, channel_group_size=0, ) quant_2_params = _quantization_passes.linear_quantize_weights.blockwise_compress( lut_2_params.lut, nbits=8, mode="LINEAR_SYMMETRIC", signed=False, block_sizes=[0] * len(lut_2_params.lut.shape), ) # Reconstruct the weight in torch model for numerical comparison. dequantized_lut_1 = _quantization_passes.linear_quantize_weights.decompress(quant_1_params) reconstruct_weight_1 = _quantization_passes.palettize_weights.decompress( lut_1_params._replace(lut=dequantized_lut_1) ) dequantized_lut_2 = _quantization_passes.linear_quantize_weights.decompress(quant_2_params) reconstruct_weight_2 = _quantization_passes.palettize_weights.decompress( lut_2_params._replace(lut=dequantized_lut_2) ) with torch.no_grad(): model.conv_1.weight = torch.nn.Parameter(torch.Tensor(reconstruct_weight_1)) model.conv_2.weight = torch.nn.Parameter(torch.Tensor(reconstruct_weight_2)) # Register buffers for compression metadata. 
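# (In these tests the compression_type codes are used as 1 = pruning, 2 = palettization,
# 3 = quantization, so [2, 3] marks a weight that is palettized and whose LUT is additionally
# 8-bit quantized, matching dequantized_lut_1 / dequantized_lut_2 constructed above.)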
model.register_buffer("_COREML_/metadata_version", torch.tensor(1)) model.conv_1.register_buffer("_COREML_/weight/compression_type", torch.tensor([2, 3])) model.conv_1.register_buffer("_COREML_/weight/lut", torch.tensor(dequantized_lut_1)) model.conv_1.register_buffer("_COREML_/weight/quantization_n_bits", torch.tensor(8)) model.conv_1.register_buffer( "_COREML_/weight/quantization_scale", torch.from_numpy(quant_1_params.scale) ) model.conv_1.register_buffer( "_COREML_/weight/zero_point", torch.from_numpy(quant_1_params.offset) ) model.conv_2.register_buffer("_COREML_/weight/compression_type", torch.tensor([2, 3])) model.conv_2.register_buffer("_COREML_/weight/lut", torch.tensor(dequantized_lut_2)) model.conv_2.register_buffer("_COREML_/weight/quantization_n_bits", torch.tensor(8)) model.conv_2.register_buffer( "_COREML_/weight/quantization_scale", torch.from_numpy(quant_2_params.scale) ) model.conv_2.register_buffer( "_COREML_/weight/zero_point", torch.from_numpy(quant_2_params.offset) ) traced_model = torch.jit.trace(model, torch_input_values) input_shape = [input.shape.to_list() for input in inputs] pytest_context_manager = nullcontext() if minimum_deployment_target < ct.target.iOS18: pytest_context_manager = pytest.raises( ValueError, match="Please set minimum_deployment_target to iOS18 or later" ) with pytest_context_manager: res = self.run_compare_torch( input_shape, traced_model, minimum_deployment_target=minimum_deployment_target, compute_unit=compute_unit, converter=ct.convert, ) if minimum_deployment_target < ct.target.iOS18: return main_func = res[1]._mil_program.functions["main"] quantize_ops = main_func.find_ops(op_type="constexpr_blockwise_shift_scale") assert len(quantize_ops) == 2 palettize_ops = main_func.find_ops(op_type="constexpr_lut_to_dense") assert len(palettize_ops) == 2 assert types.builtin_to_string(palettize_ops[0].indices.dtype) == "uint4" assert types.builtin_to_string(palettize_ops[1].indices.dtype) == "uint6" # The op order is adjusted by common::canonicalize_quantized_lut_pattern graph pass. 
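        # After that pass the expected chain for each weight is
        #   constexpr_blockwise_shift_scale -> constexpr_lut_to_dense -> conv
        # which is what the child-op assertions below verify.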
for quantize_op in quantize_ops: assert quantize_op.outputs[0].child_ops[0].op_type == "constexpr_lut_to_dense" for palettize_op in palettize_ops: assert palettize_op.outputs[0].child_ops[0].op_type == "conv" @pytest.mark.parametrize( "compute_unit, sparse_ratio, use_linear, minimum_deployment_target, frontend", itertools.product( compute_units, [0.01, 0.5, 0.99], [True, False], [ct.target.iOS16, ct.target.iOS18], frontends, ), ) def test_pruning( self, compute_unit, sparse_ratio, use_linear, minimum_deployment_target, frontend ): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data( multi_layer=True, use_linear=use_linear ) layername_1 = "linear_1" if use_linear else "conv_1" layername_2 = "linear_2" if use_linear else "conv_2" with torch.no_grad(): getattr(model, layername_1).weight = torch.nn.Parameter( torch.Tensor( create_sparse_weight( getattr(model, layername_1).weight, target_sparsity=sparse_ratio ) ) ) getattr(model, layername_2).weight = torch.nn.Parameter( torch.Tensor( create_sparse_weight( getattr(model, layername_2).weight, target_sparsity=sparse_ratio ) ) ) model.register_buffer("_COREML_/metadata_version", torch.tensor(1)) getattr(model, layername_1).register_buffer( "_COREML_/weight/compression_type", torch.tensor([1]) ) getattr(model, layername_2).register_buffer( "_COREML_/weight/compression_type", torch.tensor([1]) ) traced_model = torch.jit.trace(model, torch_input_values) input_shape = [input.shape.to_list() for input in inputs] res = self.run_compare_torch( input_shape, traced_model, minimum_deployment_target=minimum_deployment_target, compute_unit=compute_unit, converter=ct.convert, ) main_func = res[1]._mil_program.functions["main"] sparse_ops = main_func.find_ops(op_type="constexpr_sparse_to_dense") assert len(sparse_ops) == 2 for sparse_op in sparse_ops: assert sparse_op.outputs[0].child_ops[0].op_type == "linear" if use_linear else "conv" assert types.builtin_to_string(sparse_op.nonzero_data.dtype) == "fp32" if minimum_deployment_target >= ct.target.iOS18: assert types.builtin_to_string(sparse_op.mask.dtype) == "uint1" else: assert types.builtin_to_string(sparse_op.mask.dtype) == "uint8" assert types.builtin_to_string(sparse_op.shape.dtype) == "uint32" @pytest.mark.parametrize( "compute_unit, n_bits, signed, use_linear, frontend", itertools.product( compute_units, [4, 8], [True, False], [True, False], frontends, ), ) def test_joint_pruning_quantization(self, compute_unit, n_bits, signed, use_linear, frontend): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data( multi_layer=True, use_linear=use_linear, ) # Make the weight sparse and also quantization-friendly. 
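        # "Quantization-friendly" means the dense values are exactly representable with the
        # returned scale / zero_point at n_bits (see create_quantize_friendly_weight); sparsity is
        # then introduced by multiplying with a random 0/1 mask, zeroing roughly half the entries.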
layername_1 = "linear_1" if use_linear else "conv_1" layername_2 = "linear_2" if use_linear else "conv_2" weight_1, scale_1, zero_point_1 = create_quantize_friendly_weight( getattr(model, layername_1).weight.detach().numpy(), nbits=n_bits, signed=signed ) weight_1 *= np.random.randint(low=0, high=2, size=weight_1.shape) weight_2, scale_2, zero_point_2 = create_quantize_friendly_weight( getattr(model, layername_2).weight.detach().numpy(), nbits=n_bits, signed=signed ) weight_2 *= np.random.randint(low=0, high=2, size=weight_2.shape) with torch.no_grad(): getattr(model, layername_1).weight = torch.nn.Parameter(torch.Tensor(weight_1)) getattr(model, layername_2).weight = torch.nn.Parameter(torch.Tensor(weight_2)) model.register_buffer("_COREML_/metadata_version", torch.tensor(2)) getattr(model, layername_1).register_buffer( "_COREML_/weight/compression_type", torch.tensor([1, 3]) ) getattr(model, layername_1).register_buffer( "_COREML_/weight/quantization_n_bits", torch.tensor(n_bits) ) getattr(model, layername_1).register_buffer( "_COREML_/weight/quantization_scale", torch.from_numpy(scale_1) ) getattr(model, layername_1).register_buffer( "_COREML_/weight/zero_point", torch.from_numpy(zero_point_1) ) getattr(model, layername_2).register_buffer( "_COREML_/weight/compression_type", torch.tensor([1, 3]) ) getattr(model, layername_2).register_buffer( "_COREML_/weight/quantization_n_bits", torch.tensor(n_bits) ) getattr(model, layername_2).register_buffer( "_COREML_/weight/quantization_scale", torch.from_numpy(scale_2) ) getattr(model, layername_2).register_buffer( "_COREML_/weight/zero_point", torch.from_numpy(zero_point_2) ) input_shape = [input.shape.to_list() for input in inputs] res = self.run_compare_torch( input_shape, model, minimum_deployment_target=ct.target.iOS18, compute_unit=compute_unit, frontend=frontend, converter=ct.convert, atol=1e-2, ) main_func = res[1]._mil_program.functions["main"] sparse_quantize_ops = main_func.find_ops(op_type="constexpr_sparse_blockwise_shift_scale") assert len(sparse_quantize_ops) == 2 for sparse_quantize_op in sparse_quantize_ops: expected_dtype = f"int{n_bits}" if signed else f"uint{n_bits}" assert types.builtin_to_string(sparse_quantize_op.nonzero_data.dtype) == expected_dtype assert types.builtin_to_string(sparse_quantize_op.data_mask.dtype) == "uint1" assert types.builtin_to_string(sparse_quantize_op.scale.dtype) == "fp32" assert sparse_quantize_op.outputs[1].child_ops[0].op_type == "constexpr_sparse_to_dense" sparse_ops = main_func.find_ops(op_type="constexpr_sparse_to_dense") assert len(sparse_ops) == 2 for sparse_op in sparse_ops: assert types.builtin_to_string(sparse_op.mask.dtype) == "uint1" assert types.builtin_to_string(sparse_op.nonzero_data.dtype) == "fp32" assert sparse_op.outputs[0].child_ops[0].op_type == "linear" if use_linear else "conv" @pytest.mark.parametrize( "compute_unit, n_bits, group_size, use_linear, frontend", itertools.product( compute_units, [4, 8], [0, 1, 2], [True, False], frontends, ), ) def test_joint_pruning_palettization( self, compute_unit, n_bits, group_size, use_linear, frontend ): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data( multi_layer=True, use_linear=use_linear, ) # Make the weight sparse and also can be represented by lut. 
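        # i.e. the nonzero values are drawn from at most 2**n_bits unique entries (LUT-friendly,
        # via create_unique_weight), and a random 0/1 mask then prunes roughly half of them.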
layername_1 = "linear_1" if use_linear else "conv_1" layername_2 = "linear_2" if use_linear else "conv_2" weight_1 = create_unique_weight( getattr(model, layername_1).weight, nbits=n_bits ) * np.random.randint(low=0, high=2, size=getattr(model, layername_1).weight.shape) weight_2 = create_unique_weight( getattr(model, layername_2).weight, nbits=n_bits ) * np.random.randint(low=0, high=2, size=getattr(model, layername_2).weight.shape) with torch.no_grad(): getattr(model, layername_1).weight = torch.nn.Parameter(torch.Tensor(weight_1)) getattr(model, layername_2).weight = torch.nn.Parameter(torch.Tensor(weight_2)) lut_1_params = _quantization_passes.palettize_weights.blockwise_compress( weight_1, "UNIQUE", nbits=n_bits, block_sizes=[group_size] + [0] * (len(weight_1.shape) - 1), ) lut_2_params = _quantization_passes.palettize_weights.blockwise_compress( weight_2, "UNIQUE", nbits=n_bits, block_sizes=[group_size] + [0] * (len(weight_2.shape) - 1), ) model.register_buffer("_COREML_/metadata_version", torch.tensor(1)) getattr(model, layername_1).register_buffer( "_COREML_/weight/compression_type", torch.tensor([1, 2]) ) getattr(model, layername_1).register_buffer( "_COREML_/weight/lut", torch.tensor(lut_1_params.lut) ) getattr(model, layername_2).register_buffer( "_COREML_/weight/compression_type", torch.tensor([1, 2]) ) getattr(model, layername_2).register_buffer( "_COREML_/weight/lut", torch.tensor(lut_2_params.lut) ) traced_model = torch.jit.trace(model, torch_input_values) input_shape = [input.shape.to_list() for input in inputs] res = self.run_compare_torch( input_shape, traced_model, minimum_deployment_target=ct.target.iOS18, compute_unit=compute_unit, converter=ct.convert, ) main_func = res[1]._mil_program.functions["main"] sparse_palettize_ops = main_func.find_ops(op_type="constexpr_lut_to_sparse") assert len(sparse_palettize_ops) == 2 for sparse_palettize_op in sparse_palettize_ops: assert ( types.builtin_to_string(sparse_palettize_op.indices_nonzero_data.dtype) == f"uint{n_bits}" ) assert types.builtin_to_string(sparse_palettize_op.indices_mask.dtype) == "uint1" assert types.builtin_to_string(sparse_palettize_op.lut.dtype) == "fp32" assert ( sparse_palettize_op.outputs[1].child_ops[0].op_type == "constexpr_sparse_to_dense" ) # As both palettization and pruning is on the original weight, the shape of lut should # match the original weight's shape except on the output channel. weight_shape = sparse_palettize_op.outputs[1].child_ops[0].outputs[0].shape expected_lut_shape = [1] * len(weight_shape) + [2**n_bits] + [1] if group_size > 0: expected_lut_shape[0] = weight_shape[0] // group_size assert sparse_palettize_op.lut.shape == tuple(expected_lut_shape) sparse_ops = main_func.find_ops(op_type="constexpr_sparse_to_dense") assert len(sparse_ops) == 2 for sparse_op in sparse_ops: assert types.builtin_to_string(sparse_op.mask.dtype) == "uint1" assert types.builtin_to_string(sparse_op.nonzero_data.dtype) == "fp32" assert sparse_op.outputs[0].child_ops[0].op_type == "linear" if use_linear else "conv" ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/test/test_torch_stateful_model.py0000644000000000000000000012155114672066616031555 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest import coremltools as ct from coremltools.converters.mil.frontend.torch.utils import TorchFrontend from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.types.symbolic import any_symbolic from coremltools.converters.mil.testing_reqs import compute_units from coremltools.converters.mil.testing_utils import ( assert_output_dtype, assert_prog_output_type, assert_spec_input_image_type, assert_spec_output_image_type, get_op_types_in_program, verify_prediction, ) from coremltools.proto import FeatureTypes_pb2 as ft torch = pytest.importorskip("torch") from .testing_utils import export_torch_model_to_frontend, frontends @pytest.fixture def float16_buffer_model(): class Model(torch.nn.Module): def __init__(self): super().__init__() self.register_buffer("state", torch.tensor(np.array([7, 5, 6], dtype=np.float16))) def forward(self, x): x = x.type(torch.float16) self.state.mul_(x) self.state.add_(torch.tensor(np.array([1, 2, 3], dtype=np.float16))) return self.state * 9 example_input = torch.randint(0, 100, (3,), dtype=torch.int32) return torch.jit.trace(Model().eval(), example_input) @pytest.fixture def float32_buffer_model(): class Model(torch.nn.Module): def __init__(self): super().__init__() self.register_buffer("state", torch.tensor(np.array([7, 5, 6], dtype=np.float32))) def forward(self, x): self.state.add_(x) return self.state * 5 example_input = torch.randint(0, 100, (3,), dtype=torch.int32) return torch.jit.trace(Model().eval(), example_input) @pytest.fixture def float32_non_persistent_buffer_model(): class Model(torch.nn.Module): def __init__(self): super().__init__() self.register_buffer( "state", torch.tensor(np.array([7, 5, 6], dtype=np.float32)), persistent=False ) def forward(self, x): self.state.add_(x) return self.state * 5 example_input = torch.randint(0, 100, (3,), dtype=torch.int32) return torch.jit.trace(Model().eval(), example_input) @pytest.fixture def float32_buffer_not_returned_model(): class Model(torch.nn.Module): def __init__(self): super().__init__() self.register_buffer("state_1", torch.tensor(np.array([7, 5, 6], dtype=np.float32))) self.register_buffer("state_2", torch.tensor(np.array([7, 5, 6], dtype=np.float32))) def forward(self, x): self.state_1.add_(x) self.state_2.add_(x) return x example_input = torch.randint(0, 100, (3,), dtype=torch.int32) return torch.jit.trace(Model().eval(), example_input) @pytest.fixture def float32_buffer_not_returned_model_2(): class Model(torch.nn.Module): def __init__(self): super().__init__() self.register_buffer("state_1", torch.tensor(np.array([7, 5, 6], dtype=np.float32))) self.register_buffer("state_2", torch.tensor(np.array([7, 5, 6], dtype=np.float32))) def forward(self, x): self.state_1.add_(x) self.state_2.add_(x) self.state_1.add_(x) return x example_input = torch.randint(0, 100, (3,), dtype=torch.int32) return torch.jit.trace(Model().eval(), example_input) @pytest.fixture def float32_buffer_model_with_two_inputs(): class Model(torch.nn.Module): def __init__(self): super().__init__() self.register_buffer("state", torch.tensor(np.array([7, 5, 6], dtype=np.float32))) def forward(self, x, y): self.state.add_(x) self.state.add_(y) return self.state * 5 example_input = [ torch.randint(0, 100, (3,), dtype=torch.int32), torch.randint(0, 100, (3,), dtype=torch.int32), ] return 
torch.jit.trace(Model().eval(), example_input) @pytest.fixture def float32_buffer_model_two_inputs_two_states(): class Model(torch.nn.Module): def __init__(self): super().__init__() self.register_buffer("state_1", torch.tensor(np.array([1, 2, 3], dtype=np.float32))) self.register_buffer("state_2", torch.tensor(np.array([4, 5, 6], dtype=np.float32))) def forward(self, x, y): self.state_1.add_(x) self.state_2.add_(y) return self.state_1 * self.state_2 example_input = [ torch.randint(0, 100, (3,), dtype=torch.int32), torch.randint(0, 100, (3,), dtype=torch.int32), ] return torch.jit.trace(Model().eval(), example_input) def float32_buffer_sequantial_model(): class Model(torch.nn.Module): def __init__(self): super().__init__() self.register_buffer("state", torch.tensor(np.array([7, 5, 6], dtype=np.float32))) def forward(self, x): res = self.state + 8 self.state[0] = 9.0 x = self.state * x self.state.mul_(self.state) self.state.sub_(x) return torch.relu(self.state) example_input = torch.randint(0, 100, (3,), dtype=torch.int32) return torch.jit.trace(Model().eval(), example_input) @pytest.fixture def float32_two_buffers_model(): class Model(torch.nn.Module): def __init__(self): super().__init__() self.register_buffer("state_1", torch.tensor(np.array([1, 2, 3], dtype=np.float32))) self.register_buffer("state_2", torch.tensor(np.array([4, 5, 6], dtype=np.float32))) def forward(self, x): v1 = self.state_2 - x self.state_2.mul_(self.state_1) self.state_1.mul_(v1) self.state_1.add_(self.state_2) return self.state_1 + x example_input = torch.randint(0, 100, (3,), dtype=torch.int32) return torch.jit.trace(Model().eval(), example_input) @pytest.fixture def rank4_input_model_with_buffer(): class Model(torch.nn.Module): def __init__(self): super().__init__() self.register_buffer( "state_1", torch.tensor(np.zeros((1, 3, 10, 20), dtype=np.float32)) ) def forward(self, x): x = x + 5.5 self.state_1.add_(x) self.state_1[0, 0, 0, 0:1] = torch.tensor([1.0]) return x example_input = torch.randint(0, 100, (1, 3, 10, 20), dtype=torch.float32) return torch.jit.trace(Model().eval(), example_input) @pytest.fixture def rank4_grayscale_input_model_with_buffer(): class Model(torch.nn.Module): def __init__(self): super().__init__() self.register_buffer( "state_1", torch.tensor(np.zeros((1, 1, 10, 20), dtype=np.float32)) ) def forward(self, x): x = x + 5 self.state_1.add_(x) self.state_1[0, 0, 0, 0:1] = torch.tensor([1.0]) return x example_input = torch.randint(0, 100, (1, 1, 10, 20), dtype=torch.float32) return torch.jit.trace(Model().eval(), example_input) @pytest.mark.skipif( ct.utils._macos_version() < (15, 0), reason="Tests are for deployment target iOS18/macos15" ) class TestStateConversionAPI: @pytest.mark.parametrize( "compute_unit, frontend", itertools.product(compute_units, frontends), ) def test_state_model_api_example(self, compute_unit, frontend): """ Test the public API example. 
""" class UpdateBufferModel(torch.nn.Module): def __init__(self): super(UpdateBufferModel, self).__init__() self.register_buffer("state_1", torch.tensor(np.array([0, 0, 0], dtype=np.float32))) def forward(self, x): # In place update of the model state self.state_1.mul_(x) return self.state_1 + 1.0 source_model = UpdateBufferModel() source_model.eval() torch_model = export_torch_model_to_frontend( source_model, (torch.tensor([1, 2, 3], dtype=torch.float16),), frontend, ) inputs = [ct.TensorType(shape=(3,))] if frontend == TorchFrontend.TORCHSCRIPT else None states = ( [ct.StateType(wrapped_type=ct.TensorType(shape=(3,)), name="state_1")] if frontend == TorchFrontend.TORCHSCRIPT else None ) mlmodel = ct.convert( torch_model, inputs=inputs, states=states, minimum_deployment_target=ct.target.iOS18, convert_to="mlprogram", compute_units=compute_unit, ) assert get_op_types_in_program(mlmodel._mil_program) == [ "read_state", "mul", "coreml_update_state", "add", ] verify_prediction(mlmodel) @pytest.mark.parametrize( "compute_unit", compute_units, ) def test_single_state_single_input( self, float32_buffer_model, float32_non_persistent_buffer_model, compute_unit ): """ Tests for different combination of input dtypes. """ def test_valid_prog(prog, expected_ops=None): block = prog.functions["main"] assert types.is_tensor(block.inputs["x"].sym_type) assert types.is_state(block.inputs["state_workaround"].sym_type) assert len(block.outputs) == 1 assert types.is_tensor(block.outputs[0].sym_type) if expected_ops is None: expected_ops = [ "read_state", "add", "coreml_update_state", "mul", ] assert get_op_types_in_program(prog) == expected_ops """ fp32 state / input (default with compute_precision=fp32), with both persistent and non-persistent buffer. fp32 state is not supported through runtime. 
(%x: Tensor(fp32), %state: State(fp32)) -> { %read_state(fp32) = read_state(%state) %add(fp32) = add(%read_state, %x) %update(fp32) = coreml_update_state(%state, %add) %mul(fp32) = mul(%update, 5) } -> (%mul) """ for model in [float32_buffer_model, float32_non_persistent_buffer_model]: prog = ct.convert( model, inputs=[ ct.TensorType(shape=(3,)), ], states=[ ct.StateType( wrapped_type=ct.TensorType( shape=(3,), ), name="state", ), ], minimum_deployment_target=ct.target.iOS18, compute_precision=ct.precision.FLOAT32, convert_to="milinternal", ) test_valid_prog(prog) block = prog.functions["main"] assert block.inputs["x"].sym_type.get_primitive() == types.fp32 assert ( block.inputs["state_workaround"].sym_type.wrapped_type().get_primitive() == types.fp32 ) assert block.outputs[0].dtype == types.fp32 """ fp16 state / input (user specify) (%x: Tensor(fp16), %state: State(fp16)) -> { %read_state(fp16) = read_state(%state) %add(fp16) = add(%read_state, %x) %update(fp16) = coreml_update_state(%state, %add) %mul(fp16) = mul(%update, 5) } -> (%mul) """ mlmodel = ct.convert( float32_buffer_model, inputs=[ ct.TensorType(shape=(3,), dtype=np.float16), ], states=[ ct.StateType( wrapped_type=ct.TensorType( shape=(3,), dtype=np.float16, ), name="state", ), ], minimum_deployment_target=ct.target.iOS18, convert_to="mlprogram", compute_units=compute_unit, ) # check the pymil program prog = mlmodel._mil_program test_valid_prog(prog) block = prog.functions["main"] assert block.inputs["x"].sym_type.get_primitive() == types.fp16 assert ( block.inputs["state_workaround"].sym_type.wrapped_type().get_primitive() == types.fp16 ) assert block.outputs[0].dtype == types.fp16 # check the mil proto mil = mlmodel.get_spec().mlProgram for function in mil.functions.values(): for block in function.block_specializations.values(): ops = list(block.operations) expected_ops = [ "read_state", "add", "write_state", "read_state", "const", "mul", ] assert [val.type for val in ops] == expected_ops assert len(ops[2].outputs) == 0 verify_prediction(mlmodel) """ fp16 state / input (default with compute_precision=fp16) (%x: Tensor(fp16), %state: State(fp16)) -> { %read_state(fp16) = read_state(%state) %add(fp16) = add(%read_state, %x) %update(fp16) = coreml_update_state(%state, %add) %mul(fp16) = mul(%update, 5) } -> (%mul) """ mlmodel = ct.convert( float32_buffer_model, inputs=[ ct.TensorType(shape=(3,)), ], states=[ ct.StateType( wrapped_type=ct.TensorType( shape=(3,), ), name="state", ), ], minimum_deployment_target=ct.target.iOS18, compute_units=compute_unit, ) prog = mlmodel._mil_program test_valid_prog(prog) block = prog.functions["main"] assert block.inputs["x"].sym_type.get_primitive() == types.fp16 assert ( block.inputs["state_workaround"].sym_type.wrapped_type().get_primitive() == types.fp16 ) assert block.outputs[0].dtype == types.fp16 verify_prediction(mlmodel) """ fp16 state and fp32 input (%x: Tensor(fp32), %state: State(fp16)) -> { %read_state(fp16) = read_state(%state) %x_cast(fp16) = cast(%x) %add(fp16) = add(%read_state, %x_cast) %update(fp16) = coreml_update_state(%state, %add) %mul(fp16) = mul(%update, 5) } -> (%mul) """ mlmodel = ct.convert( float32_buffer_model, inputs=[ ct.TensorType(shape=(3,), dtype=np.float32), ], states=[ ct.StateType( wrapped_type=ct.TensorType(shape=(3,), dtype=np.float16), name="state" ), ], minimum_deployment_target=ct.target.iOS18, compute_units=compute_unit, ) prog = mlmodel._mil_program expected_ops = [ "read_state", "cast", "add", "coreml_update_state", "mul", ] test_valid_prog(prog, 
expected_ops) block = prog.functions["main"] assert block.inputs["x"].sym_type.get_primitive() == types.fp32 assert ( block.inputs["state_workaround"].sym_type.wrapped_type().get_primitive() == types.fp16 ) assert prog.find_ops("cast")[0].x.op is None assert block.outputs[0].dtype == types.fp16 verify_prediction(mlmodel) """ fp32 state and fp16 input. This is a rare corner case that shouldn't happend often. fp32 state is not supported through runtime. (%x: Tensor(fp16), %state: State(fp32)) -> { %read_state(fp32) = read_state(%state) %read_state_cast(fp16) = cast(read_state) %add(fp16) = add(%read_state_casr, %x) %add_cast(fp32) = cast(%add) %update(fp32) = coreml_update_state(%state, %add_cast) %update_cast(fp16) = cast(%update) %mul(fp16) = mul(%update_cast, 5) } -> (%mul) """ prog = ct.convert( float32_buffer_model, inputs=[ ct.TensorType(shape=(3,), dtype=np.float16), ], states=[ ct.StateType( wrapped_type=ct.TensorType(shape=(3,), dtype=np.float32), name="state" ), ], minimum_deployment_target=ct.target.iOS18, convert_to="milinternal", ) expected_ops = [ "read_state", "cast", "add", "cast", "coreml_update_state", "cast", "mul", ] test_valid_prog(prog, expected_ops) block = prog.functions["main"] assert block.inputs["x"].sym_type.get_primitive() == types.fp16 assert ( block.inputs["state_workaround"].sym_type.wrapped_type().get_primitive() == types.fp32 ) assert prog.find_ops("cast")[0].x.op.op_type == "read_state" assert prog.find_ops("cast")[1].x.op.op_type == "add" assert prog.find_ops("cast")[2].x.op.op_type == "coreml_update_state" assert block.outputs[0].dtype == types.fp16 @pytest.mark.parametrize( "compute_unit", compute_units, ) def test_single_state_single_input_model_fp16(self, float16_buffer_model, compute_unit): """ Tests conversion of a stateful torch model defined in fp16. This will be common in model with large size. """ # fp16 state / input mlmodel = ct.convert( float16_buffer_model, inputs=[ ct.TensorType(shape=(3,), dtype=np.float16), ], states=[ ct.StateType(wrapped_type=ct.TensorType(shape=(3,), dtype=np.float16), name="state") ], minimum_deployment_target=ct.target.iOS18, convert_to="mlprogram", compute_units=compute_unit, ) prog = mlmodel._mil_program assert get_op_types_in_program(prog) == [ "read_state", "mul", "coreml_update_state", "add", "coreml_update_state", "mul", ] verify_prediction(mlmodel) # force state / input to be fp32 (intented stress test) prog = ct.convert( float16_buffer_model, inputs=[ ct.TensorType(shape=(3,), dtype=np.float32), ], states=[ ct.StateType( wrapped_type=ct.TensorType(shape=(3,), dtype=np.float32), name="state" ), ], minimum_deployment_target=ct.target.iOS18, convert_to="milinternal", ) assert get_op_types_in_program(prog) == [ "read_state", "cast", "cast", "mul", "cast", "coreml_update_state", "cast", "add", "cast", "coreml_update_state", "cast", "mul", ] @pytest.mark.parametrize( "compute_unit", compute_units, ) def test_multiple_states_model(self, float32_two_buffers_model, compute_unit): """ Tests for a model with multiple buffers. 
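        Each registered buffer becomes its own StateType input, and every in-place update of a
        buffer lowers to a separate coreml_update_state op in the converted program.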
""" mlmodel = ct.convert( float32_two_buffers_model, inputs=[ ct.TensorType(shape=(3,)), ], states=[ ct.StateType( wrapped_type=ct.TensorType( shape=(3,), ), name="state_1", ), ct.StateType( wrapped_type=ct.TensorType( shape=(3,), ), name="state_2", ), ], minimum_deployment_target=ct.target.iOS18, convert_to="mlprogram", compute_units=compute_unit, ) prog = mlmodel._mil_program assert get_op_types_in_program(prog) == [ "read_state", "sub", "read_state", "mul", "coreml_update_state", "mul", "coreml_update_state", "add", "coreml_update_state", "add", ] verify_prediction(mlmodel) def test_convert_buffer_model_without_state_type(self, float32_buffer_model): """ If the users don't specify StateType for buffer states, they will be treated as const tensors. We should modify this unittest after we fix this radar: rdar://116489054 ([Infra] Have a more sophisticated handling for torch buffer state when not declared as StateType) """ prog = ct.convert( float32_buffer_model, inputs=[ ct.TensorType(shape=(3,)), ], minimum_deployment_target=ct.target.iOS17, convert_to="milinternal", ) assert get_op_types_in_program(prog) == [ "add", "mul", ] @pytest.mark.parametrize( "compute_unit", compute_units, ) def test_tensor_state_inputs_interleave( self, float32_buffer_model_two_inputs_two_states, compute_unit ): """ We allow the user to interleave tensor / state input types. """ mlmodel = ct.convert( float32_buffer_model_two_inputs_two_states, inputs=[ ct.TensorType(shape=(3,)), ct.TensorType(shape=(3,)), ], states=[ ct.StateType( wrapped_type=ct.TensorType( shape=(3,), ), name="state_1", ), ct.StateType( wrapped_type=ct.TensorType( shape=(3,), ), name="state_2", ), ], minimum_deployment_target=ct.target.iOS18, convert_to="mlprogram", compute_units=compute_unit, ) prog = mlmodel._mil_program assert get_op_types_in_program(prog) == [ "read_state", "add", "coreml_update_state", "read_state", "add", "coreml_update_state", "mul", ] verify_prediction(mlmodel) def test_invalid_deployment_target_error_out(self, float32_buffer_model): """ The conversion should error out if the user tries to convert it into deployment target < ioS18. """ with pytest.raises( ValueError, match="State model is supported only >= iOS18. Please update the minimum_deployment_target to at least coremltools.target.iOS18", ): prog = ct.convert( float32_buffer_model, inputs=[ ct.TensorType(shape=(3,)), ], states=[ ct.StateType( wrapped_type=ct.TensorType( shape=(3,), ), name="state", ), ], minimum_deployment_target=ct.target.iOS17, ) with pytest.raises( ValueError, match="State model is supported only >= iOS18. Please update the minimum_deployment_target to at least coremltools.target.iOS18", ): prog = ct.convert( float32_buffer_model, inputs=[ ct.TensorType(shape=(3,)), ], states=[ ct.StateType( wrapped_type=ct.TensorType( shape=(3,), ), name="state", ), ], convert_to="neuralnetwork", ) def test_invalid_state_name_error_out(self, float32_buffer_model): """ The conversion should error out if the user doesn't provide / or provides wrong name of the buffer """ with pytest.raises( ValueError, match="StateType named None not provided or not found in the source torch model. 
Please make sure the name in 'ct.StateType\(name=..., wrapped_type=ct.TensorType\(...\)\)' match the 'named_buffers\(\)' in the source torch model.", ): prog = ct.convert( float32_buffer_model, inputs=[ ct.TensorType(shape=(3,)), ], states=[ ct.StateType( wrapped_type=ct.TensorType( shape=(3,), ) ), ], minimum_deployment_target=ct.target.iOS18, compute_precision=ct.precision.FLOAT32, convert_to="milinternal", ) with pytest.raises( ValueError, match="StateType named invalid not provided or not found in the source torch model. Please make sure the name in 'ct.StateType\(name=..., wrapped_type=ct.TensorType\(...\)\)' match the 'named_buffers\(\)' in the source torch model: \['state'\]", ): prog = ct.convert( float32_buffer_model, inputs=[ ct.TensorType(shape=(3,)), ], states=[ ct.StateType(wrapped_type=ct.TensorType(shape=(3,)), name="invalid"), ], minimum_deployment_target=ct.target.iOS18, compute_precision=ct.precision.FLOAT32, convert_to="milinternal", ) def test_invalid_state_shape_out(self, float32_buffer_model): """ The conversion should error out if the provided StateType has a different shape than the registered buffer. """ with pytest.raises( ValueError, match="StateType shape \(2,\) must match the torch buffer shape \(3,\)", ): prog = ct.convert( float32_buffer_model, inputs=[ ct.TensorType(shape=(3,)), ], states=[ ct.StateType( wrapped_type=ct.TensorType( shape=(2,), ), name="state", ), ], minimum_deployment_target=ct.target.iOS18, compute_precision=ct.precision.FLOAT32, convert_to="milinternal", ) def test_invalid_input_numbers_error_out(self, float32_buffer_model_with_two_inputs): """ The checking for the tensor inputs should not be affected by the new added StateType inputs """ with pytest.raises( ValueError, match="Number of TorchScript inputs \(2\) must match the user provided inputs \(1\).", ): prog = ct.convert( float32_buffer_model_with_two_inputs, inputs=[ ct.TensorType(shape=(3,)), ], states=[ ct.StateType( wrapped_type=ct.TensorType( shape=(3,), ), name="state", ), ], minimum_deployment_target=ct.target.iOS18, compute_precision=ct.precision.FLOAT32, convert_to="milinternal", ) def test_invalid_inputs_contains_states_error_out(self, float32_buffer_model_with_two_inputs): """ The checking for the inputs should not contain StateType. 
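        States must be provided through the separate 'states' argument; passing a StateType inside
        'inputs' is expected to raise a ValueError.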
""" with pytest.raises( ValueError, match="'inputs' cannot contain an instance of StateType", ): prog = ct.convert( float32_buffer_model_with_two_inputs, inputs=[ ct.TensorType(shape=(3,)), ct.StateType( wrapped_type=ct.TensorType( shape=(3,), ), name="state", ), ], minimum_deployment_target=ct.target.iOS18, compute_precision=ct.precision.FLOAT32, convert_to="milinternal", ) @staticmethod def convert_state_model(model, backend, compute_unit=ct.ComputeUnit.CPU_ONLY): return ct.convert( model, inputs=[ ct.TensorType(shape=(3,)), ], states=[ ct.StateType( wrapped_type=ct.TensorType( shape=(3,), ), name="state_1", ), ct.StateType( wrapped_type=ct.TensorType( shape=(3,), ), name="state_2", ), ], minimum_deployment_target=ct.target.iOS18, convert_to=backend, compute_units=compute_unit, ) @staticmethod def check_state_model(mlmodel, expected_ops, run_prediction=True): mil = mlmodel.get_spec().mlProgram for function in mil.functions.values(): for block in function.block_specializations.values(): ops = list(block.operations) assert [val.type for val in ops] == expected_ops if run_prediction: verify_prediction(mlmodel) @pytest.mark.parametrize( "compute_unit", compute_units, ) def test_state_ops_cannot_removed( self, float32_buffer_not_returned_model, float32_buffer_not_returned_model_2, compute_unit, ): """ Check the coreml_update_state should not be removed by dead_code_elimination pass. """ # Test case 1 prog = self.convert_state_model(float32_buffer_not_returned_model, "milinternal") assert get_op_types_in_program(prog) == [ "identity", "read_state", "add", "coreml_update_state", "read_state", "add", "coreml_update_state", ] mlmodel = self.convert_state_model( float32_buffer_not_returned_model, "mlprogram", compute_unit ) expected_ops = [ "identity", "read_state", "add", "write_state", "read_state", "add", "write_state", ] # This model is failing on CPU_AND_NE, which is tracked by # rdar://130912134 ([Bug][Stateful model][CPU_AND_NE] Stateful model fails to run with compute_units=CPU_AND_NE) run_prediction = compute_unit != ct.ComputeUnit.CPU_AND_NE self.check_state_model(mlmodel, expected_ops, run_prediction) # Test case 2 prog = self.convert_state_model(float32_buffer_not_returned_model_2, "milinternal") assert get_op_types_in_program(prog) == [ "identity", "read_state", "add", "coreml_update_state", "read_state", "add", "coreml_update_state", "add", "coreml_update_state", ] mlmodel = self.convert_state_model( float32_buffer_not_returned_model_2, "mlprogram", compute_unit ) expected_ops = [ "identity", "read_state", "add", "write_state", "read_state", "read_state", "add", "write_state", "add", "write_state", ] # This model is failing on CPU_AND_NE, which is tracked by # rdar://130912134 ([Bug][Stateful model][CPU_AND_NE] Stateful model fails to run with compute_units=CPU_AND_NE) run_prediction = compute_unit != ct.ComputeUnit.CPU_AND_NE self.check_state_model(mlmodel, expected_ops, run_prediction) @pytest.mark.parametrize( "compute_unit, dtype", itertools.product( compute_units, [np.float16, np.float32], ), ) def test_single_state_single_input_sequential_model(self, compute_unit, dtype): """ Tests for a model with a sequence of inplace ops. 
""" def get_stateful_model(): # fp32 state is not supported through runtime convert_to = "milinternal" if dtype == np.float32 else "mlprogram" compute_precision_mapping = { np.float16: ct.precision.FLOAT16, np.float32: ct.precision.FLOAT32, } model = ct.convert( float32_buffer_sequantial_model(), inputs=[ ct.TensorType(shape=(3,), dtype=dtype), ], states=[ ct.StateType(wrapped_type=ct.TensorType(shape=(3,), dtype=dtype), name="state"), ], minimum_deployment_target=ct.target.iOS18, compute_precision=compute_precision_mapping[dtype], convert_to=convert_to, compute_units=compute_unit, ) if dtype == np.float32: return None, model assert dtype == np.float16 return model, model._mil_program mlmodel, prog = get_stateful_model() assert get_op_types_in_program(prog) == [ "read_state", "slice_update", "coreml_update_state", "mul", "mul", "coreml_update_state", "sub", "coreml_update_state", "relu", ] if mlmodel is not None: verify_prediction(mlmodel) @pytest.mark.parametrize( "compute_unit", compute_units, ) def test_color_input_with_buffer(self, rank4_input_model_with_buffer, compute_unit): mlmodel = ct.convert( rank4_input_model_with_buffer, inputs=[ct.ImageType(shape=(1, 3, 10, 20), color_layout=ct.colorlayout.RGB)], states=[ct.StateType(wrapped_type=ct.TensorType(shape=(1, 3, 10, 20)), name="state_1")], outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.iOS18, compute_units=compute_unit, ) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.RGB) assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp32") verify_prediction(mlmodel) @pytest.mark.parametrize( "compute_unit", compute_units, ) def test_color_output_with_buffer(self, rank4_input_model_with_buffer, compute_unit): # image input / image output mlmodel = ct.convert( rank4_input_model_with_buffer, inputs=[ct.ImageType(shape=(1, 3, 10, 20), color_layout=ct.colorlayout.BGR)], states=[ct.StateType(wrapped_type=ct.TensorType(shape=(1, 3, 10, 20)), name="state_1")], outputs=[ct.ImageType(color_layout=ct.colorlayout.RGB)], minimum_deployment_target=ct.target.iOS18, compute_units=compute_unit, ) assert_spec_input_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.BGR) assert_spec_output_image_type(mlmodel._spec, expected_feature_type=ft.ImageFeatureType.RGB) assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp32") verify_prediction(mlmodel) # tensor input / image output # check mlprogram can have image output, both static and dynamic case are tested for is_dynamic in [True, False]: shape = ( ct.Shape((1, 3, ct.RangeDim(5, 10, default=10), ct.RangeDim(5, 20, default=20))) if is_dynamic else ct.Shape((1, 3, 10, 20)) ) mlmodel = ct.convert( rank4_input_model_with_buffer, inputs=[ct.TensorType(shape=shape, dtype=np.float32)], states=[ ct.StateType(wrapped_type=ct.TensorType(shape=(1, 3, 10, 20)), name="state_1") ], outputs=[ct.ImageType(name="output_image", color_layout=ct.colorlayout.RGB)], minimum_deployment_target=ct.target.iOS18, compute_units=compute_unit, ) assert_spec_output_image_type( mlmodel._spec, expected_feature_type=ft.ImageFeatureType.RGB ) assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp32") if is_dynamic: assert any_symbolic(mlmodel._mil_program.functions["main"].outputs[0].shape) verify_prediction(mlmodel) @pytest.mark.parametrize( "compute_unit", compute_units, ) def test_grayscale_input_with_buffer( self, rank4_grayscale_input_model_with_buffer, compute_unit ): # test with GRAYSCALE mlmodel = ct.convert( 
rank4_grayscale_input_model_with_buffer, inputs=[ct.ImageType(shape=(1, 1, 10, 20), color_layout=ct.colorlayout.GRAYSCALE)], states=[ct.StateType(wrapped_type=ct.TensorType(shape=(1, 1, 10, 20)), name="state_1")], outputs=[ct.TensorType(dtype=np.float32)], minimum_deployment_target=ct.target.iOS18, compute_units=compute_unit, ) assert_spec_input_image_type( mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE ) assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp32") verify_prediction(mlmodel) # test with GRAYSCALE_FLOAT16 mlmodel = ct.convert( rank4_grayscale_input_model_with_buffer, inputs=[ ct.ImageType(shape=(1, 1, 10, 20), color_layout=ct.colorlayout.GRAYSCALE_FLOAT16) ], states=[ct.StateType(wrapped_type=ct.TensorType(shape=(1, 1, 10, 20)), name="state_1")], outputs=[ct.TensorType(dtype=np.float16)], minimum_deployment_target=ct.target.iOS18, compute_units=compute_unit, ) assert_spec_input_image_type( mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE_FLOAT16 ) assert_output_dtype(mlmodel, expected_type_str="fp16") verify_prediction(mlmodel) @pytest.mark.parametrize( "compute_unit", compute_units, ) def test_grayscale_output_with_buffer( self, rank4_grayscale_input_model_with_buffer, compute_unit ): # grayscale fp16 input and output mlmodel = ct.convert( rank4_grayscale_input_model_with_buffer, inputs=[ ct.ImageType(shape=(1, 1, 10, 20), color_layout=ct.colorlayout.GRAYSCALE_FLOAT16) ], states=[ct.StateType(wrapped_type=ct.TensorType(shape=(1, 1, 10, 20)), name="state_1")], outputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)], minimum_deployment_target=ct.target.iOS18, compute_units=compute_unit, ) assert_spec_input_image_type( mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE_FLOAT16 ) assert_spec_output_image_type( mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE_FLOAT16 ) assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp16") verify_prediction(mlmodel) # grayscale input and grayscale fp16 output mlmodel = ct.convert( rank4_grayscale_input_model_with_buffer, inputs=[ct.ImageType(shape=(1, 1, 10, 20), color_layout=ct.colorlayout.GRAYSCALE)], outputs=[ct.ImageType(color_layout=ct.colorlayout.GRAYSCALE_FLOAT16)], minimum_deployment_target=ct.target.iOS18, compute_units=compute_unit, ) assert_spec_input_image_type( mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE ) assert_spec_output_image_type( mlmodel._spec, expected_feature_type=ft.ImageFeatureType.GRAYSCALE_FLOAT16 ) assert_prog_output_type(mlmodel._mil_program, expected_dtype_str="fp16") verify_prediction(mlmodel) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508431.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/test/testing_utils.py0000644000000000000000000003244514672066617027211 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os import platform from pathlib import Path from typing import List, Union import numpy as np import pytest import torch import torch.nn as nn import coremltools as ct import coremltools.models.utils as coremltoolsutils from coremltools import RangeDim, TensorType from coremltools import _logger as logger from coremltools._deps import _HAS_EXECUTORCH, _HAS_TORCH_EXPORT_API, _IS_MACOS from coremltools.converters.mil.mil.types.type_mapping import nptype_from_builtin from coremltools.converters.mil.testing_utils import ( _create_current_pytest_serialization_path, ct_convert, debug_save_mlmodels, validate_minimum_deployment_target, ) from ..utils import TORCH_DTYPE_TO_MIL_DTYPE, TORCH_EXPORT_BASED_FRONTENDS, TorchFrontend if _HAS_TORCH_EXPORT_API: from torch.export import ExportedProgram if _HAS_EXECUTORCH: import executorch.exir if "TORCH_FRONTENDS" in os.environ: frontends = [] for frontend_str in os.environ["TORCH_FRONTENDS"].split(","): frontend = TorchFrontend[frontend_str] if platform.machine() == "x86_64" and frontend in TORCH_EXPORT_BASED_FRONTENDS: logger.warning( f"{frontend_str} is not supported well on x86_64, skipped this frontend test" ) continue if frontend == TorchFrontend.TORCHEXPORT and not _HAS_TORCH_EXPORT_API: logger.warning( "Must have torch.export API to test TORCHEXPORT frontend. Skipped this frontend test." ) continue if frontend == TorchFrontend.EXECUTORCH and not _HAS_EXECUTORCH: logger.warning( "Must have executorch to test EXECUTORCH frontend. Skipped this frontend test." ) continue frontends.append(frontend) else: frontends = [TorchFrontend.TORCHSCRIPT] if platform.machine() != "x86_64": if _HAS_TORCH_EXPORT_API: frontends.append(TorchFrontend.TORCHEXPORT) if _HAS_EXECUTORCH: frontends.append(TorchFrontend.EXECUTORCH) class ModuleWrapper(nn.Module): """ Helper class to transform torch function into torch nn module. This helps to keep the testing interface same for torch functional api. """ def __init__(self, function, kwargs=None): super(ModuleWrapper, self).__init__() self.function = function self.kwargs = kwargs if kwargs else {} def forward(self, *args): return self.function(*args, **self.kwargs) np.random.seed(1984) def _flatten(objects): flattened_list = [] for item in objects: if isinstance(item, (list, tuple)): flattened_list.extend(_flatten(item)) else: flattened_list.append(item) return flattened_list def _copy_input_data(input_data): if isinstance(input_data, (list, tuple)): return [_copy_input_data(x) for x in input_data] return input_data.clone().detach() def contains_op(torch, op_string): return hasattr(torch, op_string) def convert_to_coreml_inputs(input_description, inputs): """ Convenience function to combine a CoreML model's input description and set of raw inputs into the format expected by the model's predict function. 
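    Inputs are flattened, converted to float32 numpy arrays, and keyed by the stringified entries
    of the input description; 0-d arrays are expanded to rank 1. For example (illustrative), a
    single input named "x" with a scalar value v becomes {"x": np.array([v], dtype=np.float32)}.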
""" flattened_inputs = _flatten(inputs) coreml_inputs = { str(x): inp.cpu().numpy().astype(np.float32) for x, inp in zip(input_description, flattened_inputs) } for k, v in coreml_inputs.items(): if isinstance(v, np.ndarray) and v.ndim == 0: coreml_inputs[k] = np.expand_dims(v, axis=-1) return coreml_inputs def convert_to_mlmodel( model_spec, tensor_inputs, backend=("neuralnetwork", "fp32"), converter_input_type=None, compute_unit=ct.ComputeUnit.CPU_ONLY, minimum_deployment_target=None, converter=ct.convert, ): def _convert_to_inputtype(inputs): if isinstance(inputs, list): return [_convert_to_inputtype(x) for x in inputs] elif isinstance(inputs, tuple): return tuple([_convert_to_inputtype(x) for x in inputs]) elif isinstance(inputs, TensorType): return inputs elif isinstance(inputs, torch.Tensor): return TensorType(shape=inputs.shape, dtype=TORCH_DTYPE_TO_MIL_DTYPE[inputs.dtype]) else: raise ValueError( "Unable to parse type {} into InputType.".format(type(inputs)) ) if converter_input_type is None: inputs = list(_convert_to_inputtype(tensor_inputs)) else: inputs = converter_input_type if _HAS_TORCH_EXPORT_API and isinstance(model_spec, ExportedProgram): inputs = None outputs = None return ct_convert( model_spec, inputs=inputs, convert_to=backend, source="pytorch", compute_units=compute_unit, minimum_deployment_target=minimum_deployment_target, converter=converter, ) def generate_input_data( input_size, rand_range=(0, 1), dtype=np.float32, torch_device=torch.device("cpu") ) -> Union[torch.Tensor, List[torch.Tensor]]: r1, r2 = rand_range def random_data(spec, dtype=np.float32): if isinstance(spec, TensorType): spec_shape = spec.shape.shape dtype = nptype_from_builtin(spec.dtype) else: spec_shape = spec static_shape = tuple([np.random.randint(dim.lower_bound, dim.upper_bound if dim.upper_bound > 0 else 10) if isinstance(dim, RangeDim) else dim for dim in spec_shape]) if np.issubdtype(dtype, np.floating): data = np.random.rand(*static_shape) if static_shape != () else np.random.rand() data = (r1 - r2) * data + r2 else: data = np.random.randint(r1, r2, size=static_shape, dtype=dtype) return torch.from_numpy(np.array(data).astype(dtype)).to(torch_device) if isinstance(input_size, list): return [random_data(size, dtype) for size in input_size] else: return random_data(input_size, dtype) def export_torch_model_to_frontend( model, input_data, frontend, use_scripting=False, torch_export_dynamic_shapes=None, ): input_data_clone = _copy_input_data(input_data) if isinstance(input_data_clone, list): input_data_clone = tuple(input_data_clone) elif isinstance(input_data_clone, torch.Tensor): input_data_clone = (input_data_clone,) if frontend == TorchFrontend.TORCHSCRIPT: model.eval() if use_scripting: model_spec = torch.jit.script(model) else: model_spec = torch.jit.trace(model, input_data_clone) elif frontend in TORCH_EXPORT_BASED_FRONTENDS: try: model.eval() except NotImplementedError: # Some torch.export stuff, e.g. quantization, has not implemented eval() yet logger.warning("PyTorch EXIR converter received a model without .eval method") model_spec = torch.export.export( model, input_data_clone, dynamic_shapes=torch_export_dynamic_shapes ) if frontend == TorchFrontend.EXECUTORCH: model_spec = executorch.exir.to_edge(model_spec).exported_program() else: raise ValueError( "Unknown value of frontend. Needs to be either TorchFrontend.TORCHSCRIPT " f"or TorchFrontend.TORCHEXPORT or TorchFrontend.EXECUTORCH. 
Provided: {frontend}" ) return model_spec def flatten_and_detach_torch_results(torch_results): if isinstance(torch_results, (list, tuple)): if len(torch_results) == 1 and isinstance(torch_results[0], dict): return [value.detach().numpy() for value in torch_results[0].values()] else: return [x.detach().numpy() for x in _flatten(torch_results) if x is not None] elif isinstance(torch_results, dict): return [value.detach().numpy() for value in torch_results.values()] # Do not need to flatten return [torch_results.detach().cpu().numpy()] def convert_and_compare( input_data, model_spec, expected_results=None, atol=1e-4, rtol=1e-05, backend=("neuralnetwork", "fp32"), converter_input_type=None, compute_unit=ct.ComputeUnit.CPU_ONLY, minimum_deployment_target=None, converter=ct.convert, ): """ If expected results is not set, it will by default be set to the flattened output of the torch model. Inputs: - input_data: torch.tensor or list[torch.tensor] """ if isinstance(model_spec, str): torch_model = torch.jit.load(model_spec) else: torch_model = model_spec if _HAS_TORCH_EXPORT_API and isinstance(torch_model, ExportedProgram): torch_model = torch_model.module() if not isinstance(input_data, (list, tuple)): input_data = [input_data] if expected_results is None: torch_input = _copy_input_data(input_data) expected_results = torch_model(*torch_input) expected_results = flatten_and_detach_torch_results(expected_results) PYTEST_CURRENT_TEST = os.environ.get("PYTEST_CURRENT_TEST").split("(call)")[0].strip() if PYTEST_CURRENT_TEST in debug_save_mlmodels: serialization_path = _create_current_pytest_serialization_path() Path(serialization_path).mkdir(parents=True, exist_ok=True) flat_inputs = flatten_and_detach_torch_results(input_data) np.savez(serialization_path + "ref_inputs.npz", *flat_inputs) np.savez(serialization_path + "ref_outputs.npz", *expected_results) mlmodel = convert_to_mlmodel( model_spec, input_data, backend=backend, converter_input_type=converter_input_type, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, converter=converter, ) coreml_inputs = convert_to_coreml_inputs(mlmodel.input_description, input_data) if not _IS_MACOS or (mlmodel.is_package and coremltoolsutils._macos_version() < (12, 0)): return model_spec, mlmodel, coreml_inputs, None _, dtype = backend if mlmodel.compute_unit != ct.ComputeUnit.CPU_ONLY or (dtype == "fp16"): atol = max(atol * 100.0, 5e-1) rtol = max(rtol * 100.0, 5e-2) if not coremltoolsutils._has_custom_layer(mlmodel._spec): coreml_preds = mlmodel.predict(coreml_inputs) coreml_outputs = mlmodel._spec.description.output coreml_results = [coreml_preds[output.name] for output in coreml_outputs] for torch_result, coreml_result in zip(expected_results, coreml_results): if torch_result.shape == (): torch_result = np.array([torch_result]) np.testing.assert_equal(coreml_result.shape, torch_result.shape) np.testing.assert_allclose(coreml_result, torch_result, atol=atol, rtol=rtol) return model_spec, mlmodel, coreml_inputs, coreml_preds class TorchBaseTest: testclassname = '' testmodelname = '' @pytest.fixture(autouse=True) def store_testname_with_args(self, request): TorchBaseTest.testclassname = type(self).__name__ TorchBaseTest.testmodelname = request.node.name @staticmethod def run_compare_torch( input_data, model, expected_results=None, atol=1e-04, rtol=1e-05, input_as_shape=True, input_dtype=np.float32, backend=("neuralnetwork", "fp32"), rand_range=(-1.0, 1.0), use_scripting=False, converter_input_type=None, 
compute_unit=ct.ComputeUnit.CPU_ONLY, minimum_deployment_target=None, torch_device=torch.device("cpu"), frontend=TorchFrontend.TORCHSCRIPT, torch_export_dynamic_shapes=None, converter=ct.convert, ): """ Traces a model and runs a numerical test. Args: input_as_shape : If true generates random input data with shape. expected_results : Expected result from running pytorch model. converter_input_type: If not None, then pass it to the "inputs" argument to the ct.convert() call. frontend: TorchFrontend enum """ if minimum_deployment_target is not None: validate_minimum_deployment_target(minimum_deployment_target, backend) if input_as_shape: input_data = generate_input_data(input_data, rand_range, input_dtype, torch_device) model_spec = export_torch_model_to_frontend( model, input_data, frontend, use_scripting=use_scripting, torch_export_dynamic_shapes=torch_export_dynamic_shapes, ) model_spec, mlmodel, coreml_inputs, coreml_results = convert_and_compare( input_data, model_spec, expected_results=expected_results, atol=atol, rtol=rtol, backend=backend, converter_input_type=converter_input_type, compute_unit=compute_unit, minimum_deployment_target=minimum_deployment_target, converter=converter, ) return model_spec, mlmodel, coreml_inputs, coreml_results, \ TorchBaseTest.testclassname, TorchBaseTest.testmodelname ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/torch_op_registry.py0000644000000000000000000001411714672066616027075 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import Callable import torch from coremltools import _logger as logger from coremltools.models._deprecation import deprecated as _deprecated from .utils import sanitize_op_kind, unify_inplace_and_functional class TorchOpsRegistry: def __init__(self): self.name_to_func_mapping = {} def get_func(self, op_lookup: str) -> Callable: """ Given a op type key, return the according translation function. Note that we will not distinguish in-place and functional, since we mostly use functional CoreML ops to translate. For instance, ``sub_`` -> ``sub`` """ op_lookup = sanitize_op_kind(op_lookup) op_lookup = unify_inplace_and_functional(op_lookup) return self.name_to_func_mapping.get(op_lookup, None) def register_func(self, func=None, torch_alias=None, override=False): """ Given an op name and its alias, put the translation function (callable) into the registry. """ f_name = func.__name__ all_f_names = [f_name] if torch_alias is not None: all_f_names.extend(torch_alias) for name in all_f_names: if name.endswith("_"): raise Exception( f'Attempting to register "{name}" op. Do not register inplace ops. (inplace torch ops' f' end in a "_"). Instead register the normal op version: "{name[:-1]}". The inplace' f" version will be supported automatically." ) if not override and name in self.name_to_func_mapping: raise ValueError(f"Torch op {name} already registered.") self.set_func_by_name(func, name) def set_func_by_name(self, func, name): self.name_to_func_mapping[name] = func def is_inplace_op(self, op_lookup: str): """ A torch op is considered inplace if the op name endswith ``_``. 
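        Dunder names are excluded, so e.g. ``add_`` is considered inplace while ``add`` and
        ``__add__`` are not.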
""" return not (op_lookup.startswith("__") and op_lookup.endswith("__")) and op_lookup.endswith( "_" ) # The following functions will be deprecated after 7.2 # rdar://117502178 ([Infra][Pytorch] We should deprecate the direct use of _TORCH_OPS_REGISTRY in 7.2) @_deprecated( suffix="Please use coremltools.converters.mil.frontend.torch.register_torch_op", version="7.2", obj_prefix="_TORCH_OPS_REGISTRY.", ) def __contains__(self, key: str) -> bool: return key in self.name_to_func_mapping @_deprecated( suffix="Please use coremltools.converters.mil.frontend.torch.register_torch_op", version="7.2", obj_prefix="_TORCH_OPS_REGISTRY.", ) def __setitem__(self, key: str, value: Callable) -> None: self.name_to_func_mapping[key] = value @_deprecated( suffix="Please use coremltools.converters.mil.frontend.torch.register_torch_op", version="7.2", obj_prefix="_TORCH_OPS_REGISTRY.", ) def __delitem__(self, key: str) -> None: del self.name_to_func_mapping[key] @_deprecated( suffix="Please use coremltools.converters.mil.frontend.torch.register_torch_op", version="7.2", obj_prefix="_TORCH_OPS_REGISTRY.", ) def __getitem__(self, key: str) -> Callable: return self.name_to_func_mapping[key] _TORCH_OPS_REGISTRY = TorchOpsRegistry() def register_torch_op(_func=None, torch_alias=None, override=False): """ Registration routine for PyTorch operators _func: (PyTorch conversion function) [Default=None] PyTorch conversion function to register torch_alias: (List of string) [Default=None] All other PyTorch operators that should also be mapped to current conversion routine. e.g. Sort aliased with SortV1, SortV2 All provided alias operators must not be registered previously. "In place" alias are looked up automatically and do not need to be registered. PyTorch uses an underscore suffix to denote the in place version, e.g. "sum_" is the in place version of "sum". override: (Boolean) [Default=False] If True, overrides earlier registration i.e. specified operator and alias will start pointing to current conversion function. Otherwise, duplicate registration will error out. """ def func_wrapper(func): _TORCH_OPS_REGISTRY.register_func(func, torch_alias, override) return func if _func is None: # decorator called without argument return func_wrapper return func_wrapper(_func) def is_torch_fx_node_supported(torch_fx_node: "torch.fx.Node") -> bool: # There are many types of torch fx node: # 1. call_function # 2. call_module # 3. call_method # 4. get_attr # 5. placeholder # 6. output # ... # Only "call_*" nodes contain PyTorch ops, # among them we only support "call_function" node for now if torch_fx_node.op != "call_function": logger.warning( "For now, among all types of torch fx nodes, CoreML only supports call_function node" ) return False # Get the target in torch fx node, then sanitize its name torch_fx_node_target = torch_fx_node.target if isinstance(torch_fx_node_target, str): torch_fx_node_target_name = torch_fx_node_target else: torch_fx_node_target_name = torch_fx_node.target.__name__ torch_fx_node_target_name = sanitize_op_kind(torch_fx_node_target_name) # Since we are only dealing with "call_function" node, # the contained PyTorch op must be functional, i.e. not in-place assert ( not torch_fx_node_target_name.endswith("_") ), ( "For now, since CoreML only supports call_function torch fx node, " "all ops should be functional, i.e. 
there should not be any in-place op" ) return torch_fx_node_target_name in _TORCH_OPS_REGISTRY ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/torchir_passes.py0000644000000000000000000003725714672066616026372 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from collections import OrderedDict, defaultdict from typing import Dict, Optional from coremltools import _logger as logger from .internal_graph import InternalTorchIRGraph, InternalTorchIRNode def generate_tensor_assignment_ops(graph: InternalTorchIRGraph) -> None: """ This graph pass handles inplace tensor assignments, specifically it handles: `torch.Tensor.copy_` and `torch.Tensor.fill_`. There are many other inplace tensor assignments which are currently not handled. for instance: def forward(self, x): # x a tensor with shape [4,10] x[:2, 4] = [[1],[3]] return x In Pytorch, this is represented by a sequence of slice / select ops followed by a copy op: input -> %x %1 = slice(%x, dim=0, begin=0, end=2, stride=1) # the slice for dimension 0 %2 = select(%1, dim=1, index=4) # the select for dimension 1 %3 = copy_(%2, value=[[1], [3]]) output -> %x This graph pass fuses the sequences into a single InternalTorchIRNode of a new kind, which is defined as `_internal_op_tensor_inplace_copy_`. input -> %x %nodes_to_fuse = [slice(%x, begin=0, end=2, stride=1), select(%1, dim=1, index=4)] %x_internal_tensor_assign_1 = _internal_op_tensor_inplace_copy_(%x, value=[[1],[3]], nodes_to_fuse=nodes_to_fuse) output -> x_internal_tensor_assign_1 The _internal_tensor_value_assign op takes an additional internal data member nodes_to_fuse, which is a list of select / slice InternalTorchIRNodes that need to be fused. Here is a more complicated example: def forward(self, x): # x a tensor with shape [4,10] x[0, 0] = 1 x[1:2, 1:2] = [[0]] return x Input graph: input -> %x %1 = select(%x, dim=0, index=0) %2 = select(%1, dim=0, index=0) %3 = copy_(%2, value=1) %4 = slice(%x, dim=0, begin=1, end=2, stride=1) %5 = slice(%4, dim=1, begin=1, end=2, stride=1) %6 = copy_(%5, value=[[0]]) output -> %x Output graph: input -> %x %nodes_to_fuse_1 = [select(%x, dim=0, index=0), select(%1, dim=0, index=0)] %x_internal_tensor_assign_1 = _internal_op_tensor_inplace_copy_(%x, value=1, nodes_to_fuse=nodes_to_fuse_1) %nodes_to_fuse_2 = [slice(%x, dim=0, begin=1, end=2, stride=1), slice(%4, dim=1, begin=1, end=2, stride=1)] %x_internal_tensor_assign_2 = _internal_op_tensor_inplace_copy_(%x_internal_tensor_assign_1, value=[[0]], nodes_to_fuse=nodes_to_fuse_2) output -> x_internal_tensor_assign_2 torch.Tensor.fill_ works in a similar way, except the InternalTorchIRNodes is defined by `_internal_op_tensor_inplace_fill_`. 
A fill_ operator is generated from the following forward pass: def forward(self, x): # x a tensor with shape [5, 4] x[2] = 9 return x In case: def forward(self, x): # x a tensor with shape [4,10] y = torch.empty(*x.shape) y.copy_(0) return y Input graph: input -> %x %y = empty[](x.shape) %1 = copy_[](%y, %x) return (%1) output -> %1 In result of fuse input -> %x %y = [empty[](x.shape)] %x_internal_tensor_assign_1 = _internal_op_tensor_inplace_copy_(%y, %x) output -> %x_internal_tensor_assign_1 As a result of side effects of fusing, output of `_internal_op_tensor_inplace_copy_` will be renamed to `x_internal_tensor_assign_1`. If `%1` should be renamed to `x_internal_tensor_assign_1` too, the graph will be invalid. In this purpose out_alias was introduced. """ TENSOR_ASSIGMENT_PREFIX = "_internal_tensor_assign_" def _get_updated_name(name, updated_tensor_count, out_alias): if name in updated_tensor_count: return name + TENSOR_ASSIGMENT_PREFIX + str(updated_tensor_count[name]) if name in out_alias: return out_alias[name] return name def _construct_nodes_to_fuse_inputs(nodes_to_fuse): inputs = [] for node in nodes_to_fuse: if node.kind == "select": inputs += [node.inputs[2], None, None] if node.kind == "slice": inputs += [node.inputs[2], node.inputs[3], node.inputs[4]] return inputs tensor_to_node_sequence_mapping = {} updated_tensor_count = defaultdict(lambda: 0) out_alias = {} for i in range(len(graph.nodes)): node = graph.nodes[i] for idx in range(len(node.inputs)): input_name = node.inputs[idx] node.inputs[idx] = _get_updated_name(input_name, updated_tensor_count, out_alias) if node.kind in ("empty", "select", "slice"): node_input = node.inputs[0] node_output = node.outputs[0] node_sequence = tensor_to_node_sequence_mapping.get(node_input, []) if len(node_sequence) > 0: tensor_to_node_sequence_mapping.pop(node_input) node_sequence.append(node) tensor_to_node_sequence_mapping[node_output] = node_sequence if node.kind == "to": node_input = node.inputs[0] if node_input in tensor_to_node_sequence_mapping: # update the mapping node_output = node.outputs[0] val = tensor_to_node_sequence_mapping[node_input] del tensor_to_node_sequence_mapping[node_input] tensor_to_node_sequence_mapping[node_output] = val if node.kind in ("copy_", "fill_"): node_input = node.inputs[0] if node_input not in tensor_to_node_sequence_mapping: raise ValueError("No matching select or slice.") if node.kind == "copy_": kind = "_internal_op_tensor_inplace_copy_" else: kind = "_internal_op_tensor_inplace_fill_" nodes_to_fuse = tensor_to_node_sequence_mapping[node_input] if nodes_to_fuse[0].kind in ["select", "slice"]: source_tensor = nodes_to_fuse[0].inputs[0] else: source_tensor = nodes_to_fuse[0].outputs[0] origin_name = source_tensor.split(TENSOR_ASSIGMENT_PREFIX)[0] updated_tensor_count[origin_name] += 1 outputs = [_get_updated_name(origin_name, updated_tensor_count, out_alias)] out_alias[node.outputs[0]] = outputs[0] update_value = node.inputs[1] nodes_to_fuse_inputs = _construct_nodes_to_fuse_inputs(nodes_to_fuse) tensor_assign_node = InternalTorchIRNode( name=outputs[0], inputs=[source_tensor, update_value] + nodes_to_fuse_inputs, outputs=outputs, kind=kind, blocks=[], model_hierarchy=node.model_hierarchy, ) graph.nodes[i] = tensor_assign_node # modify the graph outputs if it is effected by this graph pass for idx in range(len(graph.outputs)): output = graph.outputs[idx] graph.outputs[idx] = _get_updated_name(output, updated_tensor_count, out_alias) def populate_native_const_model_hierarchy(graph: 
InternalTorchIRGraph) -> None: """ Torchscript doesn't capture the model hierarchy of those python native consts. For instance: class Submodule(torch.nn.Module): def forward(self, x): x = x + 0.9 x = x * 0.9 return torch.relu(x) class Model(torch.nn.Module): def __init__(self): super().__init__() self.submodule_1 = Submodule() def forward(self, x): return self.submodule_1(x) The two ``0.9`` constants don't have the scope of Submodule. In this graph pass, we make the model hierarchy of such constants inherited from their child ops. """ cached_model_hierarchy = {} child_ops = defaultdict(list) for node in graph.nodes: for b in node.blocks: populate_native_const_model_hierarchy(b) for node in graph.nodes: cached_model_hierarchy[node.name] = node.model_hierarchy for val in node.inputs: child_ops[val].append(node.name) for node in graph.nodes: if node.kind != "constant": continue if node.model_hierarchy == "" and len(child_ops[node.name]) == 1: node.model_hierarchy = cached_model_hierarchy[child_ops[node.name][0]] def remove_getattr_nodes(graph: InternalTorchIRGraph) -> None: """ Remove the getattr nodes in the graph """ getattr_nodes = [] new_nodes = [] for node in graph.nodes: for block in node.blocks: remove_getattr_nodes(block) if node.kind == "getattr": getattr_nodes.append(node) else: new_nodes.append(node) # check the getattr nodes not in the outputs for node in getattr_nodes: if node.name in graph.outputs: raise RuntimeError("{} should not be in the graph outputs.".format(node.name)) # remove the getattr nodes graph.nodes = new_nodes def transform_inplace_ops( graph: InternalTorchIRGraph, name_remap_dict: Optional[Dict[str, str]] = None ) -> None: # As we modify ops, we'll need to remap symbols. if name_remap_dict is None: name_remap_dict = {} for node in graph.nodes: for k, v in name_remap_dict.items(): node.replace_name(k, v) if node.kind == "append": if isinstance(node.parent, InternalTorchIRGraph): # If append appears in a graph (outer block), replace # subsequent uses of its input symbol with its output symbol. name_remap_dict[node.inputs[0]] = node.outputs[0] elif node.parent.parent.kind == "loop": # If append appears in a loop block, add its inputs to the block # inputs and loop inputs, and its outputs to the block outputs # and loop outputs. # This is the global input to append. We need to add it to the # loop's input list, and replace any uses after the node with # @global_output below. global_input = node.inputs[0] # This will be the name of the input to append within the # block. We need to add it to the block inputs. local_input = node.parent.parent.name + ".0" # This is the output of append. We need to add it to the list # of block outputs. local_output = node.outputs[0] # This is the name of the new output from the loop. It should # replace any uses of @global_input after the loop op. global_output = local_output + ".out" name_remap_dict[global_input] = global_output node.parent.parent.inputs.append(global_input) node.parent.inputs.append(local_input) node.replace_name(global_input, local_input) node.parent.outputs.append(local_output) node.parent.parent.outputs.append(global_output) node.parent.parent.name = node.parent.parent.outputs[0] elif node.parent.parent.kind == "if": # If append appears in an if/else block, add its outputs to the # block outputs and loop outputs. # Note that we can't assume the append appears in both blocks. 
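                # Illustrative sketch (not taken from the original sources) of a
                # forward pass that would land in this branch after scripting,
                # because the list append happens inside a data-dependent conditional:
                #
                #     def forward(self, x):
                #         outputs = []
                #         if bool(x.sum() > 0):
                #             outputs.append(x)
                #         return outputs
                #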
raise NotImplementedError( "inplace_ops pass doesn't yet support append op inside conditional" ) for block in node.blocks: transform_inplace_ops(block, name_remap_dict) # Replace names in graph outputs for k, v in name_remap_dict.items(): try: idx = graph.outputs.index(k) except ValueError: pass else: graph.outputs[idx] = v def flatten_graph_input_values(graph: InternalTorchIRGraph) -> None: """CoreML can't handle nested iterables of tensors, so we flatten the inputs of any graph that expects them. """ new_graph_inputs = graph.inputs all_new_nodes = [] changed = True notified = False while changed: old_graph_inputs = new_graph_inputs new_graph_inputs = OrderedDict() new_nodes = [] changed = False for _input_name, _input_val in old_graph_inputs.items(): if isinstance(_input_val, (tuple, list)): changed = True if not notified: notified = True logger.warning( "Tuple detected at graph input. This will be flattened in the converted model." ) # If this input to the graph is a tuple, we want to replace it # with a flattened version and add an op to construct the tuple. node_inputs = [] for idx, item in enumerate(_input_val): name = _input_name + "_{}".format(idx) new_graph_inputs[name] = item node_inputs.append(name) new_nodes.append( InternalTorchIRNode( inputs=node_inputs, outputs=[_input_name], kind="tupleconstruct", name=_input_name, ) ) else: # This input isn't a tuple, keep it as is. new_graph_inputs[_input_name] = _input_val all_new_nodes = new_nodes + all_new_nodes graph.inputs = new_graph_inputs graph.nodes = all_new_nodes + graph.nodes def flatten_graph_output_values(graph: InternalTorchIRGraph) -> None: """ CoreML can't handle nested iterables of tensors, so we flatten the outputs of any graph that produces them. """ node_names = [node.name for node in graph.nodes] new_graph_outputs = graph.outputs changed = True notified = False while changed: old_graph_outputs = new_graph_outputs new_graph_outputs = [] changed = False for outp in old_graph_outputs: # Find the node that generates this output var. # It is possible to not find the output var in the list of node # names since nodes are named after their first output. In that # case, it means the output var comes from a node that returns # multiple outputs, which means that node cannot be a construct op. try: node_idx = node_names.index(outp) except: # @outp doesn't come from a construct op new_graph_outputs.append(outp) continue if graph.nodes[node_idx].kind in [ "tupleconstruct", "listconstruct", ]: # Since this output came from a construct op, we can replace it # with the inputs to the op. new_graph_outputs.extend(graph.nodes[node_idx].inputs) changed = True if not notified: notified = True logger.warning( "Tuple detected at graph output. This will be flattened in the converted model." ) else: new_graph_outputs.append(outp) # Note: if we flattened outputs, there are likely to be construct ops # that are no longer needed. These will be removed in a later DCE pass. graph.outputs = new_graph_outputs ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/torchscript_utils.py0000644000000000000000000002032014672066616027105 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import torch from coremltools._deps import version_lt def _jit_pass_lower_graph(graph, torchscript): """ This graph pass does a similar thing as torch._C._jit_pass_lower_graph does. It does three things: 1. Rename getattr nodes which produce a torch tensor to match the keys in torch model's state_dict 2. Construct the params_dict, with the keys similar to state_dict 3. Get the named_buffer dict in torch model To be more specific, this graph pass traces down series of GetAttr ops, and rename the final node to match the torch model state_dict. It also replaces the node inputs by the first created tensor node with the same name. Example: Input graph: graph(%self.1 : __torch__.torch.nn.modules.Sequential, %input.1 : Tensor): %2 : prim::GetAttr[name="linear"](%self.1) %3 : prim::GetAttr[name="weight"](%2) %4 : prim::GetAttr[name="bias"](%2) %5 : prim::GetAttr[name="bias"](%2) # duplicated node %6 : conv(%input.1, %3, %4) %7 : add(%input.1, %5) return (%6, %7) Output graph: graph(%self.1 : __torch__.torch.nn.modules.Sequential, %input.1 : Tensor): %2 : prim::GetAttr[name="linear"](%self.1) %linear.weight : prim::GetAttr[name="weight"](%2) %linear.bias : prim::GetAttr[name="bias"](%2) %5 : prim::GetAttr[name="bias"](%2) # duplicated node, it is not used now %6 : conv(%input.1, %linear.weight, %linear.bias) %7 : add(%input.1, %linear.bias) # the second input is replaced return (%6, %7) And a dictionary {"linear.weight": ..., "linear.bias": ...} is returned, to record the parameters values. Note that, those GetAttr nodes are still in the torch ir graph, but they would be removed in a latter graph pass in the coremltools torch internal graph """ """ Each getattr node corresponds to a torch object in the torch IR, it could be either: 1. torch.nn.modules: submodule in a torch model. For instance, a linear layer in a MLP network. 2. torch.Tensor: torch model parameters. For instance, weight for a conv layer. 3. torch._C.ScriptObject: quantized torch model parameters. For example, in the graph above, %2 is pointing to the __torch__.torch.nn.modules.Sequential.linear torch submodule. node_to_module_map tracks these mapping. node_to_prefic_map track the name for each module, for example, %2 has the prefix name linear and %3 is linear.weight. These names are also keys in the state_dict """ node_to_module_map = {} node_to_prefix_map = {} first_node_with_prefix = {} replace_input = {} base_module_node = list(graph.inputs())[0] node_to_module_map[base_module_node] = torchscript node_to_prefix_map[base_module_node] = "" """ params_dict will be contructed in this graph pass. It contains all const tensors needed for the graph computation. And the value is validated against the state_dict if the key is presented in both dictionaries. In some rare cases, state_dict lacks parameters / buffers, so we still need to go through the while graph ourselves. 
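    The buffer_dict returned alongside params_dict simply mirrors
    torchscript.named_buffers(), keyed by the same dotted names
    (for example "bn.running_mean" for a BatchNorm submodule named "bn").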
""" params_dict = {} state_dict = torchscript.state_dict(keep_vars=True) buffer_dict = {k: v for k, v in torchscript.named_buffers()} def _check_is_tensor(node, module): if not isinstance(module, torch.Tensor): return False if str(node.output().type()) not in ("Tensor", "Optional[Tensor]"): raise TypeError(f'Type "{node.output().type()}" not supported') return True def _check_is_quantized_tensor(node, module): if not isinstance(module, torch._C.ScriptObject): return False # We only support ScriptObjects that correspond to quantized packed params. assert "PackedParams" in node.output().type().name() return True def _lower_graph_block(graph): for node in list(graph.nodes()): for block in node.blocks(): _lower_graph_block(block) for idx, _input in enumerate(list(node.inputs())): if _input in replace_input: node.replaceInput(idx, replace_input[_input]) kind = node.kind().split("::")[1].lower() if kind != "getattr": continue _input = node.input() _output = node.output() attr_name = getattr(node, node.kindOf("name"))("name") module = getattr(node_to_module_map[_input], attr_name) node_to_module_map[_output] = module input_prefix = node_to_prefix_map[_input] prefix = input_prefix + '.' + attr_name if input_prefix != "" else attr_name node_to_prefix_map[_output] = prefix is_tensor = _check_is_tensor(node, module) is_quantized_tensor = _check_is_quantized_tensor(node, module) if is_tensor or is_quantized_tensor: if is_tensor and prefix in state_dict: assert torch.equal( module.cpu(), state_dict[prefix].cpu() ), "tensor value not consistent between torch ir and state_dict" if prefix in params_dict: assert torch.equal(module.cpu(), params_dict[prefix].cpu()) replace_input[_output] = first_node_with_prefix[prefix] else: params_dict[prefix] = module first_node_with_prefix[prefix] = _output _output.setDebugName(prefix) _lower_graph_block(graph) return graph, params_dict, buffer_dict def _expand_and_optimize_ir(torchscript): """ Given a torch.jit.ScriptModule, convert it to a optimized torch._C.Graph and dict of model parameter's names to tensors. """ graph = torchscript.forward.graph # From PyTorch code: Inline function and method calls. torch._C._jit_pass_inline(graph) # From PyTorch code: This inlines the forked section in the fork() # callsite and replaces uses of the result of wait() calls with the # values produced from the (now-inlined) forked section. torch._C._jit_pass_inline_fork_wait(graph) # Starting from the return node, marks all nodes that feed into the # output, as well as nodes with side effects. Any nodes not marked are # eliminated. torch._C._jit_pass_dce(graph) # From PyTorch code: checks well-formedness and invariants of graph. torch._C._jit_pass_lint(graph) # Replaces a couple specific ops patterns (add, sub, mul, div, chunk). if version_lt(torch, "1.6.0"): torch._C._jit_pass_canonicalize_ops(graph) torch._C._jit_pass_lint(graph) # From PyTorch code: This pass catches all of the small, easy to catch # peephole optimizations you might be interested in doing. # Eliminate no-op 'expand' nodes # Simplify x.t().t() to x # pass disabled for v1.6.0 and onwards, wrongly captures the shape of dummy inputs during tracing. torch._C._jit_pass_peephole(graph, addmm_fusion_enabled=False) else: # v1.6.0 pass renamed torch._C._jit_pass_canonicalize_graph_fuser_ops(graph) torch._C._jit_pass_lint(graph) # From PyTorch docs: Renumber the graph so that all structurally # equivalent graphs have same numbers. 
graph = torch._C._jit_pass_canonicalize(graph) torch._C._jit_pass_lint(graph) if version_lt(torch, "1.6.0"): # v1.6.0 JIT changes disallows pulling list values out of # prim::Constant. We can only pull scalar values. constant # propagation removes `listConstruct` and results in list values. # We disallow constant prop pass to keep them as scalars, and rely # on our own constant prop to interpret `listConstruct`. torch._C._jit_pass_constant_propagation(graph) # NOTE: Don't need another DCE, it's included in constant propagation. torch._C._jit_pass_lint(graph) # Get the params_dict and rename the getattr nodes in the graph graph, params_dict, buffer_dict = _jit_pass_lower_graph(graph, torchscript) return graph, params_dict, buffer_dict ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/frontend/torch/utils.py0000644000000000000000000001236514672066616024473 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from enum import Enum import numpy as np import torch from coremltools.converters.mil.mil import types # NOTE [represent torch dtype by integer] # In TorchScript, some ops will receive a dtype input as an integer which maps to a torch dtype. # The below mapping was found by converting test models with different dtypes passed to ones. # There is one modification to original torch mapping, though, due to Core ML lacks 64-bit dtype # When mapping from torch dtype to integer number, we map # * int64 to int32's number # * float64 to float32's number # When mapping from integer number back to torch dtype, we map # * int64's number to int32 # * float64's number to float32 # TODO(https://github.com/apple/coremltools/issues/2153): This is confusing... 
we should refactor NUM_TO_TORCH_DTYPE = { 0: torch.uint8, 1: torch.int8, 2: torch.int16, 3: torch.int32, 4: torch.int64, 5: torch.float16, 6: torch.float32, 7: torch.float64, 11: torch.bool, 12: torch.qint8, 13: torch.quint8, 14: torch.qint32, } def dtype_to_32bit(dtype): if dtype == torch.int64: return torch.int32 elif dtype == torch.float64: return torch.float32 else: return dtype TORCH_DTYPE_TO_NUM = { dtype: val for val, dtype in NUM_TO_TORCH_DTYPE.items() } TORCH_DTYPE_TO_NUM[torch.int64] = TORCH_DTYPE_TO_NUM[torch.int32] TORCH_DTYPE_TO_NUM[torch.float64] = TORCH_DTYPE_TO_NUM[torch.float32] NUM_TO_NUMPY_DTYPE = { 0: np.uint8, 1: np.int8, 2: np.int16, 3: np.int32, 4: np.int32, 5: np.float16, 6: np.float32, 7: np.float32, 11: bool, } NUMPY_DTYPE_TO_TORCH_NUM = { dtype: val for val, dtype in NUM_TO_NUMPY_DTYPE.items() } NUMPY_DTYPE_TO_TORCH_NUM[np.int64] = NUMPY_DTYPE_TO_TORCH_NUM[np.int32] NUMPY_DTYPE_TO_TORCH_NUM[np.float64] = NUMPY_DTYPE_TO_TORCH_NUM[np.float32] NUM_TO_DTYPE_STRING = { 0: "uint8", 1: "int8", 2: "int16", 3: "int32", 4: "int32", 5: "fp16", 6: "fp32", 7: "fp32", 11: "bool", } TYPE_TO_DTYPE_STRING = { types.uint8: "uint8", types.int8: "int8", types.int32: "int32", types.fp16: "fp16", types.fp32: "fp32", types.bool: "bool", } TORCH_QTYPE_TO_NP_TYPE = { torch.int8: np.int8, torch.qint8: np.int8, torch.uint8: np.uint8, torch.quint8: np.uint8, } TORCH_QTYPE_TO_STR = { torch.int8: "int8", torch.qint8: "int8", torch.uint8: "uint8", torch.quint8: "uint8", } MIL_DTYPE_TO_TORCH_DTYPE = { types.bool: torch.bool, types.fp16: torch.float16, types.fp32: torch.float32, types.int16: torch.int16, types.int32: torch.int32, } TORCH_DTYPE_TO_MIL_DTYPE = {v: k for k, v in MIL_DTYPE_TO_TORCH_DTYPE.items()} TORCH_DTYPE_TO_MIL_DTYPE[torch.int64] = types.int32 TORCH_DTYPE_TO_MIL_DTYPE[torch.float64] = types.fp32 class TorchFrontend(Enum): TORCHSCRIPT = 1 TORCHEXPORT = 2 EXECUTORCH = 3 TORCH_EXPORT_BASED_FRONTENDS = (TorchFrontend.TORCHEXPORT, TorchFrontend.EXECUTORCH) def sanitize_op_kind(op_kind: str) -> str: """ In our torch converter, we register torch ops only by its "canonical" name: 1. Lower-case characters only, e.g. ``div.Tensor`` -> ``div.tensor`` 2. No double underscore prefix and suffix, e.g. ``__add__`` -> ``add`` 3. No namespace prefix if it is the common aten/prim, e.g. ``aten::softmax`` -> ``softmax`` ``aten.pow`` -> ``pow`` and no type trait suffix if it is not distinguished in Core ML, e.g. ``bmm.default`` -> ``bmm`` ``slice_copy.tensor`` -> ``slice_copy`` ``mul.scalar`` -> ``mul`` """ def skip_default_prefix_and_suffix_with_deliminator( op_kind: str, deliminator: str, ) -> str: split = op_kind.split(deliminator) start = 1 if split[0] in {"aten", "prim"} and len(split) > 1 else 0 stop = ( -1 if split[-1] in { "default", "int", "tensor", "tensor_mode", "scalar", "tensor_scalar", } and len(split) - start > 1 else len(split) ) op_kind = deliminator.join(split[start:stop]) return op_kind # 1. Lower case op_kind = op_kind.lower() # 2. Remove underscore prefix and suffix if op_kind.startswith("__") and op_kind.endswith("__"): op_kind = op_kind[2:-2] # 3. 
Skip the aten/prim namespace prefix, and default/tensor/scalar suffix op_kind = skip_default_prefix_and_suffix_with_deliminator(op_kind, "::") op_kind = skip_default_prefix_and_suffix_with_deliminator(op_kind, ".") return op_kind def unify_inplace_and_functional(op_kind: str) -> str: """ In many cases, Core ML uses only functional ops, so we do not have to distinguish in-place from functional, so we will want to remove the conventional in-place suffix ``_`` of PyTorch. For instance, ``sub_`` -> ``sub`` """ if op_kind.endswith("_"): op_kind = op_kind[:-1] return op_kind ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/input_types.py0000644000000000000000000005141414672066616022756 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from enum import Enum from typing import Optional import numpy as np from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.types.symbolic import is_symbolic class ColorLayout(Enum): RGB = "RGB" BGR = "BGR" GRAYSCALE = "G" GRAYSCALE_FLOAT16 = "G_FLOAT16" class ClassifierConfig: def __init__( self, class_labels, predicted_feature_name="classLabel", predicted_probabilities_output=None, ): """ Configuration for classifier models. Parameters ---------- class_labels: str / list of int / list of str * If a ``list`` is provided, the ``list`` maps the index of the output of a neural network to labels in a classifier. * If a ``str`` is provided, the ``str`` points to a file which maps the index to labels in a classifier. predicted_feature_name: str Name of the output feature for the class labels exposed in the Core ML neural network classifier. Default: ``'classLabel'``. predicted_probabilities_output: str * If provided, then this is the name of the neural network blob which generates the probabilities for each class label (typically the output of a softmax layer). * If not provided, then the last output layer is assumed. """ self.class_labels = class_labels self.predicted_feature_name = predicted_feature_name self.predicted_probabilities_output = predicted_probabilities_output class InputType: def __init__(self, name=None, shape=None, dtype=None): """ The input type for inputs fed into the model. Parameters ---------- name: (str) The name of the input. shape: list, tuple, Shape object, EnumeratedShapes object, or None The shape(s) that are valid for this input. If set to ``None``, the shape will be inferred from the model itself. """ self.name = name if shape is not None: self.shape = _get_shaping_class(shape) else: self.shape = None self.dtype = dtype class ImageType(InputType): def __init__( self, name=None, shape=None, scale=1.0, bias=None, color_layout=ColorLayout.RGB, channel_first=None, ): """ Configuration class used for image inputs in Core ML. Parameters ---------- scale: float or list of floats The scaling factor for all values in the image channels. bias: float or list of floats * If ``color_layout`` is ``ct.colorlayout.GRAYSCALE`` or ``ct.colorlayout.GRAYSCALE_FLOAT16``, bias would be a ``float``. * If ``color_layout`` is ``ct.colorlayout.RGB`` or ``ct.colorlayout.BGR``, bias would be a list of ``float``. color_layout: string or enumeration of type ``ct.colorlayout`` Color layout of the image. 
Valid values are as follows: Enumeration (recommended): * ``ct.colorlayout.RGB`` * ``ct.colorlayout.BGR`` * ``ct.colorlayout.GRAYSCALE`` * ``ct.colorlayout.GRAYSCALE_FLOAT16`` String values (older way to specify): * ``'G'``: Grayscale (maps to ``ct.colorlayout.GRAYSCALE``) * ``'RGB'``: [Red, Green, Blue] (maps to ``ct.colorlayout.BGR``) * ``'BGR'``: [Blue, Green, Red] (maps to ``ct.colorlayout.RGB``) channel_first: (bool) or None Set to ``True`` if input format is channel first. Default format: * For TensorFlow: channel last (``channel_first=False``). * For PyTorch: channel first (``channel_first=True``). """ super(ImageType, self).__init__(name, shape) self.scale = scale msg = "color_layout should be an enum of type ct.colorlayout, i.e. one of: " \ "{ct.colorlayout.RGB, ct.colorlayout.BGR, " \ "ct.colorlayout.GRAYSCALE, ct.colorlayout.GRAYSCALE_FLOAT16}" if not (isinstance(color_layout, str) or isinstance(color_layout, ColorLayout)): raise ValueError(msg) if isinstance(color_layout, str): if color_layout not in ("G", "RGB", "BGR"): raise ValueError(msg) color_layout = ColorLayout(color_layout) self.color_layout = color_layout if color_layout == ColorLayout.GRAYSCALE_FLOAT16: self.dtype = types.fp16 if bias is None: if color_layout in (ColorLayout.GRAYSCALE, ColorLayout.GRAYSCALE_FLOAT16): self.bias = 0.0 else: self.bias = [0.0, 0.0, 0.0] else: self.bias = bias self.channel_first = channel_first def __repr__(self): return self.__str__() def __str__(self): str_repr = 'ImageType(name={}, shape={}, scale={}, bias={}, ' +\ 'color_layout={}, channel_first={})' return str_repr.format(self.name, self.shape, self.scale, self.bias, self.color_layout, self.channel_first) class TensorType(InputType): def __init__(self, name=None, shape=None, dtype=None, default_value=None): """ Specify a (dense) tensor input. Parameters ---------- name: str Input name. Must match an input name in the model (usually the Placeholder name for TensorFlow or the input name for PyTorch). The ``name`` is required except for a TensorFlow model in which there is exactly one input Placeholder. shape: The shape of the input - List of positive int or :py:class:`RangeDim`, or - :py:class:`EnumeratedShapes` For TensorFlow: * The ``shape`` is optional. If omitted, the shape is inferred from TensorFlow graph's Placeholder shape. For PyTorch: * The ``shape`` is required. dtype: np.generic or mil.type type For example, ``np.int32`` or ``coremltools.converters.mil.mil.types.fp32`` default_value: np.ndarray If provided, the input is considered optional. At runtime, if the input is not provided, ``default_value`` is used. Limitations: * If ``default_value`` is ``np.ndarray``, all elements are required to have the same value. * The ``default_value`` may not be specified if ``shape`` is :py:class:`EnumeratedShapes`. 
Examples -------- * ``ct.TensorType(name="input", shape=(1, 2, 3))`` implies ``dtype == np.float32`` * ``ct.TensorType(name="input", shape=(1, 2, 3), dtype=np.int32)`` * ``ct.TensorType(name="input", shape=(1, 2, 3), dtype=ct.converters.mil.types.fp32)`` """ super(TensorType, self).__init__(name, shape) if dtype is not None: if types.is_builtin(dtype): self.dtype = dtype if dtype not in ( types.int8, types.uint8, types.fp16, types.fp32, types.fp64, types.int32, types.int64, types.bool, ): raise TypeError( "dtype={} is unsupported for inputs/outputs of the model".format(dtype) ) else: # Assume dtype is numpy type try: self.dtype = types.numpy_type_to_builtin_type(dtype) except TypeError: raise TypeError("dtype={} is unsupported".format(dtype)) if dtype not in (np.float16, np.float32, np.float64, float, np.int32, np.int64, int, bool, np.bool_): raise TypeError("dtype={} is unsupported for inputs/outputs of the model".format(dtype)) if default_value is not None: if isinstance(shape, EnumeratedShapes): msg = 'TensorType input {} has EnumeratedShapes and ' +\ 'may not be optional' raise ValueError(msg.format(name)) if not isinstance(default_value, np.ndarray): msg = 'TensorType {} default_value is not np.ndarray' raise ValueError(msg.format(name)) default_fill_val = default_value.flatten()[0] if not np.all(default_value == default_fill_val): msg = 'TensorType {} default_value can only have ' +\ 'same entries' raise ValueError(msg.format(name)) if not self.shape.has_symbolic and list(default_value.shape) != list( self.shape.symbolic_shape ): msg = "TensorType {} default_value shape {} != " + "TensorType.shape {}" raise ValueError(msg.format(name, default_value.shape, self.shape.to_list())) if ( self.dtype is not None and types.numpy_type_to_builtin_type(default_value.dtype) != self.dtype ): msg = "TensorType {} default_value dtype {} != " + "TensorType.dtype {}" raise ValueError(msg.format(name, default_value.dtype, self.dtype.__type_info__())) else: self.dtype = types.numpy_type_to_builtin_type(default_value.dtype) self.default_value = default_value def __repr__(self): return self.__str__() def __str__(self): return 'TensorType(name={}, shape={}, dtype={})'.format(self.name, self.shape, self.dtype) class StateType(InputType): SUPPORTED_WRAPPER_TYPE = ( TensorType, ) def __init__( self, wrapped_type: type, name: Optional[str] = None, ): """ Specify a model state as a wrapper of a ``TensorType``. For example, you can use the following code to create a state type input that wraps a fp16 tensor with shape ``(2, 3)``:: ct.StateType( wrapped_type=ct.TensorType( shape=(2, 3), dtype=np.float16 ), name="state", ) Parameters ---------- wrapped_type: coremltools.converters.mil.input_types.InputType - The type wrapped in the state. - Must be ``TensorType``. Note that the ``name`` and ``default_value`` of the wrapped ``TensorType`` must not be provided. name: str The name of the state. It must match the key of ``named_buffers()`` in the source TorchScript model. """ if not isinstance(wrapped_type, StateType.SUPPORTED_WRAPPER_TYPE): raise ValueError( f"StateType only supports {StateType.SUPPORTED_WRAPPER_TYPE}. Got {type(wrapped_type)}." 
) # name and default_value cannot be set if wrapped_type.name is not None: raise ValueError("name cannot be set in the state wrapped_type.") if wrapped_type.default_value is not None: raise ValueError("default_value cannot be set in the state wrapped_type.") super(StateType, self).__init__(name, wrapped_type.shape, wrapped_type.dtype) self.wrapped_type = wrapped_type def __repr__(self): return self.__str__() def __str__(self): return f"StateType[{self.wrapped_type}]" class RangeDim: def __init__( self, lower_bound: int = 1, upper_bound: int = -1, default: Optional[int] = None, symbol: Optional[str] = None, ): """ A class for providing a range of accepted shapes. Parameters ---------- lower_bound: The minimum valid value for the shape. upper_bound: The maximum valid value for the shape. Set to ``-1`` if there is no upper limit (only works if backend is set to "neuralnetwork"). When backend is set to "mlprogram" during conversion, -1 is not allowed. A finite positive upper bound must be provided. default: The default value that is used for initiating the model, and set in the input shape field of the model file. If set to ``None``, ``lower_bound`` would be used as default. symbol: Optional symbol name for the dim. Autogenerate a symbol name if not specified. """ if symbol is None: from coremltools.converters.mil.mil import get_new_symbol self.symbol = get_new_symbol() else: from coremltools.converters.mil.mil import Symbol self.symbol = Symbol(symbol) self.lower_bound = lower_bound self.upper_bound = upper_bound if default is None: self.default = lower_bound else: if default < lower_bound: raise ValueError( f"Default value {default} is less than minimum value ({lower_bound}) for range" ) if default > upper_bound > 0: raise ValueError( f"Default value {default} is greater than maximum value ({upper_bound}) for range" ) self.default = default def __repr__(self): return self.__str__() def __str__(self): return 'RangeDim(lower_bound={}, upper_bound={}, default={}, symbol="{}")'.format( self.lower_bound, self.upper_bound, self.default, self.symbol) class Shape: def __init__(self, shape, default=None): """ The basic shape class to be set in :py:class:`InputType`. Parameters ---------- shape: list of (int), symbolic values, RangeDim object The valid shape of the input. default: tuple of int or None The default shape that is used for initiating the model, and set in the metadata of the model file. If ``None``, then ``shape`` is used. """ from coremltools.converters.mil.mil import get_new_symbol if not isinstance(shape, (list, tuple)): msg = "Shape should be list or tuple, got type {} instead" raise ValueError(msg.format(type(shape))) self.symbolic_shape = [] shape = list(shape) for idx, s in enumerate(shape): if s is None or s == -1: msg = 'Dimension cannot be None or -1. Use ' +\ 'ct.RangeDim for runtime determined dimension. 
' +\ 'Dim {}: {} ' +\ 'See https://coremltools.readme.io/docs/flexible-inputs' raise ValueError(msg.format(idx, s)) if isinstance(s, RangeDim): sym = s.symbol self.symbolic_shape.append(sym) elif isinstance(s, (np.generic, int)) or is_symbolic(s): self.symbolic_shape.append(s) else: raise ValueError( "Unknown type {} to build symbolic shape.".format(type(s)) ) self.shape = tuple(shape) if default is not None: if not isinstance(default, (list, tuple)): raise ValueError( "Default shape should be list or tuple, got type {} instead".format( type(default) ) ) for idx, s in enumerate(default): if not isinstance( s, (np.generic, int) ) and not is_symbolic(s): raise ValueError( "Default shape invalid, got error at index {} which is {}".format( idx, s ) ) else: default = [] for idx, s in enumerate(self.shape): if isinstance(s, RangeDim): default.append(s.default) elif s is None or s == -1: default.append(self.symbolic_shape[idx]) else: default.append(s) self.default = tuple(default) def __str__(self): return str(self.shape) def __repr__(self): return self.__str__() @property def has_symbolic(self): return any(is_symbolic(s) for s in self.symbolic_shape) def to_list(self, allow_symbolic=False): if not allow_symbolic and self.has_symbolic: return None return self.symbolic_shape class EnumeratedShapes: def __init__(self, shapes, default=None): """ A shape class for setting multiple valid shapes in InputType. Parameters ---------- shapes: list of Shape objects, or Shape-compatible lists * The valid shapes of the inputs. * If input provided is not a :py:class:`Shape` object, but can be converted to a :py:class:`Shape`, the :py:class:`Shape` object would be stored in ``shapes`` instead. default: tuple of int or None * The default shape that is used for initiating the model, and set in the metadata of the model file. * If ``None``, then the first element in ``shapes`` is used. Examples -------- .. 
sourcecode:: python sample_shape = ct.EnumeratedShapes( shapes=[(2, 4, 64, 64), (2, 4, 48, 48), (2, 4, 32, 32)], default=(2, 4, 64, 64) ) my_core_ml_model = ct.convert( my_model, inputs=[ct.TensorType(name="sample", shape=sample_shape)], ) """ # lazy import to avoid circular import from coremltools.converters.mil.mil import get_new_symbol if not isinstance(shapes, (list, tuple)): raise ValueError( "EnumeratedShapes should be list or tuple of shape, got type {} instead".format( type(shapes) ) ) if len(shapes) < 2: raise ValueError( "EnumeratedShapes should be take a list or tuple with len >= 2, got {} instead".format( len(shapes) ) ) self.shapes = [] for idx, s in enumerate(shapes): if isinstance(s, Shape): self.shapes.append(s) else: self.shapes.append(Shape(s)) self.symbolic_shape = self.shapes[0].symbolic_shape for shape in self.shapes: for idx, s in enumerate(shape.symbolic_shape): if is_symbolic(self.symbolic_shape[idx]): continue elif is_symbolic(s): self.symbolic_shape[idx] = s elif s != self.symbolic_shape[idx]: self.symbolic_shape[idx] = get_new_symbol() if default is not None: if not isinstance(default, (list, tuple)): raise ValueError( "Default shape should be list or tuple, got type {} instead".format( type(default) ) ) for idx, s in enumerate(default): if not isinstance( s, (np.generic, int) ) and not is_symbolic(s): raise ValueError( "Default shape invalid, got error at index {} which is {}".format( idx, s ) ) else: default = self.shapes[0].default self.default = default def __repr__(self): return self.__str__() def __str__(self): return "EnumeratedShapes(" + str(self.shapes) + ", default=" + str(self.default) + ")" def _get_shaping_class(shape): """ Returns a Shape class or EnumeratedShapes class for `shape` where `shape` could be lists/tuple/Shape/EnumeratedShapes/etc. """ if isinstance(shape, (Shape, EnumeratedShapes)): return shape try: enum_shape = EnumeratedShapes(shape) return enum_shape except ValueError: pass try: shape = Shape(shape) return shape except ValueError: pass raise ValueError("Can't convert to CoreML shaping class from {}.".format(shape)) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2295468 coremltools-8.0/coremltools/converters/mil/mil/0000755000000000000000000000000014672075535020575 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/__init__.py0000644000000000000000000000154614672066616022714 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause SPACES = " " from .block import Block, Function, curr_block from .builder import Builder from .input_type import ( SUPPORT_FLOAT_TYPES, SUPPORT_INT_TYPES, DefaultInputs, InputSpec, InternalVar, ListInputType, PyFunctionInputType, TensorInputType, TupleInputType, ) from .operation import Operation, mil_list, precondition from .program import ( Placeholder, Program, Symbol, get_existing_symbol, get_new_symbol, get_new_variadic_symbol, ) from .var import ListVar, Var """ DO NOT REMOVE THIS COMMENT, since we need to keep the import order. 
""" from .ops.defs._op_reqs import register_op ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/block.py0000644000000000000000000012645214672066616022253 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy from collections import Counter, OrderedDict from typing import List, Optional, Set, Tuple, Union from coremltools import _OPSET from coremltools import _logger as logger from coremltools.converters.mil._deployment_compatibility import AvailableTarget as _target from coremltools.converters.mil.input_types import InputType from . import SPACES, types from .operation import Operation from .scope import SCOPE_STACK, VALID_OPS_TO_COPY_SCOPE_INFO, ScopeSource, add_graph_pass_scope from .types.symbolic import is_symbolic, k_used_symbols from .utils import CacheDoublyLinkedList from .var import ComplexVar, InternalVar, Var from .visitors.dot_visitor import DotVisitor # BLOCK_STACK[-1] is the current block BLOCK_STACK = [] DEBUG = False def curr_block(): if len(BLOCK_STACK) == 0: raise ValueError("Must call Builder inside an Function" + " or Block") return BLOCK_STACK[-1] def curr_opset_version(): block = curr_block() while not isinstance(block, Function): block = block.outer_op.enclosing_block return block.opset_version def is_current_opset_version_compatible_with(opset_version): if curr_opset_version() is None: return opset_version <= _target.iOS13 return curr_opset_version() >= opset_version class InvalidBlockStateError(Exception): pass class Block: __slots__ = [ "name", "_block_inputs", "_outputs", "operations", "_internal_vars", "outer_op", "cache_operations", "_essential_scope_sources", ] counter = 0 @classmethod def _get_new_name(cls): curr_val = cls.counter cls.counter += 1 return "block" + str(curr_val) def __init__(self, block_inputs=None, outer_op=None, name=None): """ Inputs: block_inputs: python tuple[Var]. block_inputs is None except when the block represents loop. By convention block_inputs should have name ending in '.x', and the Variable are not produced by any op (block_inputs[i]._op is None). Ex: # main(%a: (1, 2, fp32), # %b: (1, 2, fp32), # %c: (1, 2, fp32)) { # block0() { # %const1: (1, fp32) = const(...) # %loop:0: (1, 2, fp32), %loop:1: (1, 2, fp32) = \ # while_loop(loop_vars=(%a, %b)) # loop_cond(%a.x, %b.x) { # %blah: (bool) = some_op(x=%a.x, y=%b.x) # %cond_var: (bool) = some_op2(x=%a.x, y=%blah) # } -> (%cond_var) # loop_body(%a.x, %b.x) { # %add_0: (1, 2, fp32) = add(x=%a.x, y=%b.x) # } -> (%add_0, %b.x) # %linear: (1, fp32) = linear(...) # } -> (%loop:0, %loop:1) # } %a.x, %b.x are block_inputs. `some_op` in `loop_cond` block can access %a, %b, %a.x, %b.x. `some_op`, however, cannot take %linear as input. outer_op: Operation The enclosing op. None iff this Block is an Function. function_inputs: tuple[Var] function_inputs are always visible for this block and all blocks nested within. If function_inputs is None, get it from `outer_op.block` """ self.name = name if self.name is None: self.name = Block._get_new_name() # list[Operation]. Topologically sorted. self.operations = CacheDoublyLinkedList() # Must be set before self.validate() self.outer_op = outer_op self._block_inputs = block_inputs if self._block_inputs is None: self._block_inputs = tuple() # list[Var]. 
This is converted to str when generating MIL proto. self._outputs = [] # If we create const, whose inputs (mode, val) cannot be const # (infinite recursion). They must be considered as always visible. self._internal_vars = set() # List[ScopeSource]. During graph pass, those scope source cannot be missed self._essential_scope_sources = [] if self.outer_op is None and not isinstance(self, Function): msg = "Block {} is not Function and thus outer_op cannot be None" raise ValueError(msg.format(self.name)) self.validate() def _add_essential_scope_source( self, scope_source: Union[ScopeSource, List[ScopeSource]] ) -> None: """ Add essential scope sources to self._essential_scope_sources. When self.validate() is called, we make sure that all source info are not missing. """ if not isinstance(scope_source, list): scope_source = [scope_source] for source in scope_source: if source in self._essential_scope_sources: raise ValueError(f"{source} already exist in _essential_scope_sources.") self._essential_scope_sources.append(source) def _check_has_scope_info(self) -> None: """ Check no ops in the function are missing scope information. """ def _check_has_scope_info_block(block: Block): for op in block.operations: for b in op.blocks: _check_has_scope_info_block(b) for scope in self._essential_scope_sources: if scope not in op.scopes or len(op.scopes[scope]) == 0: raise ValueError( f"op {op.name} with scopes {op.scopes} is missing essential scopes {scope}." ) _check_has_scope_info_block(self) def _check_vars_visibility_in_block( self, visible_vars_from_outer_block: Optional[Set[Var]] = None ): """ This utils does a one pass program-wise checking of vars visibility. That is, each input of an op, should appear before the op in the sequantial order. For the debug purpose, if you want to pinpoint the operation which caused the invalid program state, please set DEBUG=True, and it will be captured by the ``is_var_visible_in_block`` utils. """ if visible_vars_from_outer_block is None: visible_vars_from_outer_block = set() block_inputs = list(self.inputs.values()) if isinstance(self, Function) else self.inputs visible_vars_in_block = set(block_inputs) for op in self.operations: for b in op.blocks: b._check_vars_visibility_in_block( visible_vars_from_outer_block=visible_vars_from_outer_block.union( visible_vars_in_block ) ) for val in op.get_flattened_inputs(): if ( val not in self._internal_vars and val not in visible_vars_in_block and val not in visible_vars_from_outer_block ): raise ValueError(f"Var {val} not visible in the block {self.name}.") for out_var in op.outputs: visible_vars_in_block.add(out_var) def validate( self, force_validate: Optional[bool] = False, check_essential_scope: Optional[bool] = False, ) -> None: """ Basic validation to protect against some invalid state. If force_validate is False, the validation is done only if the global variable DEBUG=True. 
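        force_validate: when True, run the checks even if the module-level DEBUG
        flag is off.

        check_essential_scope: when True, additionally verify that no op in this
        block (or any nested block) is missing the essential scope sources
        registered on the block.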
""" if not DEBUG and not force_validate: return # Check vars visibility if isinstance(self, Function): self._check_vars_visibility_in_block() # Other validations for op in self.operations: for b in op.blocks: b.validate(force_validate=force_validate) if op.outputs is None: raise InvalidBlockStateError() # Check the input output relationships # from outputs -> inputs for ov in op.outputs: child_op_count = Counter(ov.child_ops) for next_op, c in child_op_count.items(): c_actual = next_op.get_flattened_inputs().count(ov) if c_actual != c: msg = ( "Var {} should be consumed by op {} {}" + " times, but op {} uses it {} times.\n{}" ) raise InvalidBlockStateError( msg.format( ov.name, next_op.name, c, next_op.name, c_actual, next_op, ) ) # from inputs -> outputs input_var_count = Counter(op.get_flattened_inputs()) for iv, c in input_var_count.items(): c_actual = iv.child_ops.count(op) if c_actual != c: msg = ( "Var {} should be consumed by op {} {}" + " times, but op {} uses it {} times.\n{}" ) raise InvalidBlockStateError( msg.format(iv.name, op.name, c_actual, op.name, c, op) ) # 1 to 1 mapping between Block outputs and Var.consuming_blocks for op in self.operations: for ov in op.outputs: for b in ov.consuming_blocks: if ov not in b.outputs: msg = "Var {} should be output of block {}: {}" raise ValueError(msg.format(ov.name, b.name, b)) for v in self.outputs: if self not in v.consuming_blocks: msg = "Var {} should be output of block {}: {}" raise ValueError(msg.format(ov.name, b.name, b)) # checking internal vars are consistent with self._internal_vars internal_var_in_block = set() for op in self.operations: for v in op.internal_inputs.values(): internal_var_in_block.add(v) if not internal_var_in_block == self._internal_vars: raise ValueError( "internal vars in the block are not consistent with self._internal_vars." ) # check essential scope info are not missing if check_essential_scope: self._check_has_scope_info() def remove_inputs(self, curr_input_vars): """ curr_input_vars: list[Var], whose elements must be in self._block_inputs. """ self.validate() remove_idx = [self._block_inputs.index(v) for v in curr_input_vars] self._block_inputs = [ v for i, v in enumerate(self._block_inputs) if i not in remove_idx ] def find_ops(self, prefix=None, op_type=None): """ Return list of ops with name matching `prefix` if specified and op_type, if specified. At least one of {prefix, op_type} must be specified. prefix: str Return list[Operation]. Empty list if no op satisfies. """ if prefix is None and op_type is None: raise ValueError("Must specify one of {prefix, op_type}") found_ops = [] for op in self.operations: prefix_match = prefix is None or op.name[: len(prefix)] == prefix op_type_match = op_type is None or op.op_type == op_type if prefix_match and op_type_match: found_ops.append(op) for b in op.blocks: found_ops.extend(b.find_ops(prefix=prefix, op_type=op_type)) return found_ops def add_internal_var(self, internal_var): if not isinstance(internal_var, InternalVar): raise ValueError("Only InternalVar can be manually added to Block.") self._internal_vars.add(internal_var) @property def inputs(self): return self._block_inputs @property def outputs(self): return self._outputs def is_var_visible_in_block(self, var: Var, upto_op: Optional[Operation] = None): """ Checks if a var is visible to ops starting from id=`upto_op_with_id` inside the block. 
Var is visible if - It is the output of a const op, or - It is the output of "preceding" operations in that block, or - It is visible in the enclosing block, or - It is either a block or a function input If upto_op_with_id is None, outputs of all operations inside the block are visible to that block. For debugging: - By default (DEBUG=False), this utils is guarded by the flag in calling code and not running. - By setting DEBUG=True, this utils is triggered in multiple places in the code base, so the users can pinpoint the exact place where an invalid operation is made by the converter. Beware that, the converter could be slow in the debug mode, since the overal conversion time will explode to O(N^2) in the average cases by this util. """ if not DEBUG: # Only in debug mode, there is a chance that self.operations is type of list when executing this function. assert isinstance( self.operations, CacheDoublyLinkedList ), "operations must be type of CacheDoublyLinkedList." if var in self._internal_vars: return True inputs = list(self.inputs.values()) if isinstance(self, Function) else self.inputs if var in inputs: return True if upto_op is None: if var.op in self.operations: return True else: if isinstance(self.operations, list): # This could only happen in debug mode assert DEBUG is True, "block.operations can only be type of list in debug mode." idx = self.find_op_id_in_block(upto_op) for i in range(idx - 1, -1, -1): if var.op is self.operations[i]: return True else: cursor = self.operations._get_node_from_op(upto_op).prev while cursor is not None: if cursor.op is var.op: return True cursor = cursor.prev if self.outer_op is not None: enclosing_block = self.outer_op.enclosing_block if enclosing_block.is_var_visible_in_block(var, upto_op=self.outer_op): return True return False def find_op_id_in_block(self, target_op: Operation) -> int: if len(self.operations) > 0 and target_op == self.operations[-1]: return len(self.operations) - 1 op_list = self.operations if isinstance(self.operations, list) else list(self.operations) try: idx = op_list.index(target_op) except ValueError: raise ValueError("Op {} not found in {}: {}".format(target_op.name, self.name, self)) return idx def set_outputs(self, outputs): """ outputs: list[Var] """ if not isinstance(outputs, list): raise ValueError("Outputs must be list of Vars") self.validate() # check var visibility in debug mode if DEBUG: for ov in outputs: if not self.is_var_visible_in_block(ov): msg = ( "Var {} is not visible in block {} and thus cannot " + "be a block output.\n{}" ) raise ValueError(msg.format(ov.name, self.name, self)) # For duplicate vars in self._outputs, only remove block once. for ov in set(self._outputs): ov.consuming_blocks.remove(self) # Need to copy, or block's output would be completely tied to a var's # output and we cannot replace a block output with another var's # output. self._outputs = copy.copy(outputs) # For duplicate vars in outputs, only add consuming_blocks once. for ov in set(outputs): ov.consuming_blocks.append(self) def __enter__(self): global BLOCK_STACK BLOCK_STACK.append(self) return self def __exit__(self, type, value, traceback): self._propagate_nonreplaceable_vars() global BLOCK_STACK BLOCK_STACK = BLOCK_STACK[:-1] def _insert_op_before(self, new_op: Operation, before_op: Optional[Operation] = None): """ A private API used by builder. Please use `builder.YOUR_OP(...,before_op)`. new_op's outputs are not used (not input to any other op) after this call. 
All inputs to new_op must be visible at or before the before_op (i.e., new_op must be added in topologically sorted order). Note that this is more restrictive than MIL, whose Block supports lexical scoping and thus an op can reference Var in enclosing scopes. new_op.name must be unique in the block. before_op=None to append new_op at the end of self.operations. Given: %2 = op0(%1, %1) %4 = op2(%1) %6 = op3(%4, %4) Execute: insert_op_before(op1, before_op=op2), where %3 = op1(%1, %2) Result: %2 = op0(%1, %1) %3 = op1(%1, %2) %4 = op2(%1) %6 = op3(%4, %4) Comment: We assume op1 has been constructed outside the block with %1, %2 as inputs. Typically it's builder's job to create an op and insert into the current block. Comment: insert_op_before(op1, before_op=op0) would error as %2 (an input to op1) is not visible before op0. """ self.validate() if isinstance(self.operations, CacheDoublyLinkedList): self.operations.insert_op_before(new_op, before_op) return if before_op is None: self.operations.append(new_op) return # check inputs visibility in debug mode if DEBUG: for k, v in new_op.inputs.items(): if not isinstance(v, (Var, tuple)): continue vs = [v] if isinstance(v, Var) else v for v in vs: if not self.is_var_visible_in_block(v, upto_op=before_op): before_op_name = before_op.name if before_op is not None else "None" msg = "Op '{}' input {}={} is not in scope of {} before {}" raise ValueError( msg.format(new_op.name, k, v.name, self.name, before_op_name) ) idx = self.find_op_id_in_block(before_op) self.operations.insert(idx, new_op) def _replace_var( self, old_var: Var, new_var: Var, anchor_op: Optional[Operation] = None, end_op: Optional[Operation] = None, no_check_var_types: Optional[bool] = False, ): """ Helper function for replace_uses_of_var_after_op """ self._copy_metadata(old_var, new_var) self._copy_scope_info(old_var, new_var) num_ops_affected = 0 # If we start checking right after the old_var, we can reduce the time # complexity hugely, by only checking the child_ops, without iterating # through whole program. # This fix reduce the overall time from O(N) -> O(1). replace_vars_right_after_old_var = ( end_op is None and len(self.operations) > 0 and anchor_op is not None and anchor_op is old_var.op ) # We should only compute start_idx and end_idx once if needed. start_idx = end_idx = None if replace_vars_right_after_old_var: op_list = list(old_var.child_ops) else: if isinstance(self.operations, list): start_idx = self.find_op_id_in_block(anchor_op) + 1 if anchor_op is not None else 0 end_idx = ( self.find_op_id_in_block(end_op) if end_op is not None else len(self.operations) - 1 ) op_list = self.operations[start_idx : end_idx + 1] else: assert isinstance( self.operations, CacheDoublyLinkedList ), f"Expect operations be type of CacheDoublyLinkedList. Got {type(self.operations)}." 
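            # Collect the candidate ops by walking the cached doubly linked list:
            # start right after anchor_op (or at the list head when anchor_op is
            # None) and stop once end_op has been appended.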
if len(self.operations) == 0 and anchor_op is not None: raise ValueError(f"anchor op {anchor_op} not in the block.") start_node = ( self.operations.start if anchor_op is None else self.operations._get_node_from_op(anchor_op).next ) cursor = start_node op_list = [] while cursor is not None: op_list.append(cursor.op) if cursor.op is end_op: break cursor = cursor.next for op in op_list: new_inputs = {} affected = False for k, v in op.inputs.items(): if isinstance(v, (list, tuple)) and old_var in v: new_inputs[k] = tuple(new_var if vv == old_var else vv for vv in v) affected = True elif v == old_var: new_inputs[k] = new_var affected = True else: new_inputs[k] = v if affected: num_ops_affected += 1 op.set_inputs(no_check_var_types=no_check_var_types, **new_inputs) # Replace recursively. for b in op.blocks: num_ops_affected += b._replace_var(old_var, new_var) # Replace consuming_blocks's outputs. # It is important to use list copy here, # since replace_block_output_var is going to change the consuming_blocks # Note that, there are some expensive index query in the following implementation, # but overally it won't affect the time complexity too much, # since we can assume the number of the block outputs in a program as a constant. # As the result, the amortized time complexity will not blow up. for b in list(old_var.consuming_blocks): outer_op = b.outer_op if outer_op is not None: # Query the start and end index if needed if start_idx is None: start_idx = ( self.find_op_id_in_block(anchor_op) + 1 if anchor_op is not None else 0 ) if end_idx is None: end_idx = ( self.find_op_id_in_block(end_op) if end_op is not None else len(self.operations) - 1 ) op_to_idx = {} while outer_op is not None: block = outer_op.enclosing_block if block is self: if len(op_to_idx) == 0: for idx, op in enumerate(self.operations): op_to_idx[op] = idx op_idx = op_to_idx[outer_op] if op_idx >= start_idx and op_idx <= end_idx: b.replace_block_output_var(old_var, new_var) break outer_op = block.outer_op if end_op is not None and old_var.op not in op_list: return num_ops_affected if old_var in self._block_inputs: idx = self._block_inputs.index(old_var) self._block_inputs = list(self._block_inputs) self._block_inputs[idx] = new_var self._block_inputs = tuple(self._block_inputs) # If old_var is block's output, replace as well. self.replace_block_output_var(old_var, new_var) return num_ops_affected def replace_block_output_var( self, old_var, new_var, ): """ If old_var is in the list of block's outputs, replace old_var with the new_var. """ found_old_var_in_output = False # There could be multiple matched `old_var` in output when the program has duplicate vars # in the output. 
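# Illustrative example: if self._outputs were [%2, %2] and old_var is %2,
# both entries get rewritten to new_var in the loop below, while the
# consuming_blocks bookkeeping is updated only once.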
for idx, output_var in enumerate(self._outputs): if old_var == output_var: found_old_var_in_output = True self._outputs[idx] = new_var if found_old_var_in_output: new_var.consuming_blocks.append(self) # This block no longer uses `old_var` as its outputs old_var.consuming_blocks.remove(self) # Ensure output name is consistent if isinstance(self, Function): if new_var in self.inputs.values() and new_var.name != old_var.name: raise ValueError("It is not allowed to modify function inputs name.") new_var.name = old_var.name def try_replace_uses_of_var_after_op( self, anchor_op: Operation, old_var: Var, new_var: Var, end_op: Optional[Operation] = None, no_check_var_types: Optional[bool] = False, ): """ :param anchor_op: Operation :param old_var: Var :param new_var: Var :param end_op: Operation :param no_check_var_types: bool :return: True if the old_var can be replaced by new_var. False otherwsie. This helper function guards the replace_uses_of_var_after_op function, by first checking if the old_var could be replaced by the new_var. 1. If old_var can be replaced by new_var, the replace_uses_of_var_after_op is called, and returns True. 2. Return False if the replacement is not allow. """ if not old_var.can_be_replaced_by_var(new_var): return False self.replace_uses_of_var_after_op( anchor_op=anchor_op, end_op=end_op, old_var=old_var, new_var=new_var, no_check_var_types=no_check_var_types, ) return True @staticmethod def _copy_scope_info(src: Var, dst: Var) -> None: """ Populate meta data from old var (src) to new var (dst) """ curr_scopes = SCOPE_STACK.get_curr_scopes() if ScopeSource.COREMLTOOLS_GRAPH_PASS in curr_scopes: if src.op in VALID_OPS_TO_COPY_SCOPE_INFO[-1]: return elif dst.op in VALID_OPS_TO_COPY_SCOPE_INFO[-1]: op = dst.op assert op is not None, "new_var cannot be a placeholder output" VALID_OPS_TO_COPY_SCOPE_INFO[-1].remove(op) # If old_var is a placeholder output, we assign defaults values to essential scope source old_scopes = src.scopes if len(old_scopes) == 0: essential_scope_sources = op.enclosing_block._essential_scope_sources for val in essential_scope_sources: res = None if val == ScopeSource.TORCHSCRIPT_MODULE_TYPE: res = ["__COREML__::TORCHSCRIPT_PLACEHOLDER"] elif val == ScopeSource.TORCHSCRIPT_MODULE_NAME: res = [f"__COREML__::TORCHSCRIPT_PLACEHOLDER_{src.name}"] elif val == ScopeSource.EXIR_STACK_TRACE: res = [None] elif val == ScopeSource.EXIR_DEBUG_HANDLE: res = [None] else: raise ValueError(f"No default placeholder info for {val}.") old_scopes[val] = res dst.scopes = add_graph_pass_scope(old_scopes, dst.scopes) for input in op.inputs.values(): if not isinstance(input, (list, tuple)): input = [input] for i in input: Block._copy_scope_info(src, i) @staticmethod def _copy_metadata(old_var: Var, new_var: Var) -> None: """ Populate meta data from old var to new var """ return def replace_uses_of_var_after_op( self, anchor_op: Operation, old_var: Var, new_var: Var, end_op: Optional[Operation] = None, no_check_var_types: Optional[bool] = False, force_replace: Optional[bool] = False, ): """ Replace all uses of `old_var` with `new_var` after `anchor_op`, and before `end_op` (inclusive). That is all the ops that use `old_var` will now use `new_var`. The op that produces the `old_var` will continue to produce it, its output won't be replaced by `new_var`. If `anchor_op` is None, replace all input occurrences of `old_var` in the block. 
If `end_op` is None, all occurrences of `old_var` are replaced in the block starting from the op just after `anchor_op` no_check_var_types: An error will be raised if the type of new_var is not same as the old_var, unless `no_check_var_types` is set to True. Normally type inference is re-invoked for all the child ops of `old_var` after updating it to `new_var`. However, this is skipped if `no_check_var_types` is set to True. old_var, new_var must meet the following conditions: - old_var, new_var both existing within the block. This implies that the op generating new_var must be inserted prior to this replacement. - Affected ops (i.e., Operation after anchor_op that take old_var as input) must generate the same type inference results as before. - new_var must be visible at or before anchor_op in the order of self.operations. Given: %2 = op0(%1, %1) %3 = op1(%1, %2) %4 = op2(%1) %6 = op3(%4, %4) Execute: replace_uses_of_var_after_op(op2, %4, %3) Result: %2 = op0(%1, %1) %3 = op1(%1, %2) %4 = op2(%1) %6 = op3(%3, %3) # type inference check against %6 Comment: Execute: replace_uses_of_var_after_op(op1, %4, %3) would lead to identical results, as op2 does not take %4 as input. Comment: replace_uses_of_var_after_op(op0, %4, %3) would cause error as %3 is after op0 Comment: To avoid clutter, we drop the names of arguments and return Var in the illustration above. Another example, usage of "end_op": Given: %2 = op0(%1, %1) %3 = op1() %4 = op2(%1, %2) %5 = op3(%2) if execute replace_uses_of_var_after_op(anchor_op=op0, old_var=%2, new_var=%3) Result: %2 = op0(%1, %1) %3 = op1() %4 = op2(%1, %3) %5 = op3(%3) if execute replace_uses_of_var_after_op(anchor_op=op0, old_var=%2, new_var=%3, end_op=op2) Result: %2 = op0(%1, %1) %3 = op1() %4 = op2(%1, %3) # %2 is replaced with %3 till here %5 = op3(%2) # will continue using %2 """ if not force_replace and old_var.op is not None and new_var.op is not None: if not old_var.can_be_replaced_by_var(new_var): old_nonreplaceable_vars = old_var.nonreplaceable_vars_upstream new_nonreplaceable_vars = new_var.nonreplaceable_vars_upstream err_var = None for _var in old_nonreplaceable_vars: if _var not in new_nonreplaceable_vars: err_var = _var break msg = ( "var {} cannot be replaced by {}. Since the nonreplaceable var {} might " "potentially " "be removed during the replacement of those vars." ).format(old_var, new_var, err_var) raise ValueError(msg) # It is expensive to check the var visibility, and it should only be done while debugging. if DEBUG: self.validate() visibility_error_msg = ( "new_var '{}' is not visible in block '{}' at or before " + "anchor_op '{}'" ) anchor_op_name = "None" if anchor_op is None else anchor_op.name if isinstance(new_var, ComplexVar): # For ComplexVar, as it's just a temp wrapper to transit the real and imag data, we # check the visibility of its real and imaginary Var instead. 
if not self.is_var_visible_in_block(new_var.real, upto_op=anchor_op): raise ValueError( visibility_error_msg.format(new_var.real.name, self.name, anchor_op_name) ) if not self.is_var_visible_in_block(new_var.imag, upto_op=anchor_op): raise ValueError( visibility_error_msg.format(new_var.imag.name, self.name, anchor_op_name) ) else: if not self.is_var_visible_in_block(new_var, upto_op=anchor_op): raise ValueError( visibility_error_msg.format(new_var.name, self.name, anchor_op_name) ) start = self.find_op_id_in_block(anchor_op) + 1 if anchor_op is not None else 0 end_id = self.find_op_id_in_block(end_op) if end_op is not None else -1 if end_id != -1 and end_id < start: msg = "end_op '{}' comes before the anchor_op '{}'" raise ValueError(msg.format(end_op.name, anchor_op.name)) num_ops_affected = self._replace_var( old_var, new_var, anchor_op=anchor_op, end_op=end_op, no_check_var_types=no_check_var_types, ) logger.debug("Num ops affected in replacing var: {}".format(num_ops_affected)) def remove_ops(self, ops_to_remove: List[Operation]): """ Remove ops in `ops_to_remove`. Args: ops_to_remove: List[Operation]. All ops in this list must be pre-existing in the block. It allows duplicated ops, but duplicated ops will only be removed once. Raises: ValueError if any `op` in `ops_to_remove` meets any of following conditions: - `op` is not found in the block - any other op in the block uses output Vars of `op` - the output var is block's output """ self.validate() # Dedup ops because each op can only be deleted once. ops_to_remove_set = set(ops_to_remove) ops_to_remove = list(ops_to_remove_set) for op in ops_to_remove: for i, v in enumerate(op.outputs): # Check that the output Var isn't block's output if v in self._outputs: raise ValueError( f"cannot delete op {op.name} with output {i}: {v.name} that's block {self.name}'s output." ) for b in op.blocks: b.set_outputs([]) b.remove_ops(b.operations) self.operations.remove(op) op.enclosing_block = None for v in op.get_flattened_inputs(): v.remove_child_op(op) # Remove InternalVar from self._internal_vars for v in op.internal_inputs.values(): self._internal_vars.remove(v) # In the end, we check no ops depend on removed op's outputs for op in ops_to_remove: for i, v in enumerate(op.outputs): if len(v.child_ops) > 0: child_op_names = [s.name for s in v.child_ops] raise ValueError( f"Cannot delete op '{op.name}' with active output at id {i}: '{v.name}' used by ops {child_op_names}." ) def _propagate_nonreplaceable_vars(self): def propagate_nonreplaceable_vars_block(block): for op in block.operations: for b in op.blocks: propagate_nonreplaceable_vars_block(b) if op.outputs is None: continue for o in op.outputs: o._reset_nonreplaceable_vars_upstream() o._set_nonreplaceable_vars_upstream() propagate_nonreplaceable_vars_block(self) def indented_str(self, indent: Optional[str] = None, print_attr: Optional[bool] = False) -> str: if indent is None: indent = "" s = ( indent + self.name + "(" + ", ".join([str(var) for var in self._block_inputs]) ) s += ") {\n" for op in self.operations: s += op.indented_str(indent + SPACES * 1, print_attr=print_attr) s += indent + "} -> (" if self._outputs is not None: s += ", ".join(["%" + v.name for v in self._outputs]) s += ")\n" return s def __repr__(self): return self.__str__() def __str__(self): return self.indented_str() def get_dot_string( self, function_name="main", prefix_id=0, highlight_debug_op_types=None, highlight_debug_op_names=None, ): """ Return the dot string that can be used to show the block with dot. 
Const ops are not added to the dot string. * Input vars : yellow * output vars : goldenrod2 * op names that user wants to highlight, provided in "highlight_debug_op_names": cyan * op types that user wants to highlight, provided in "highlight_debug_op_types": green Examples -------- >>> import graphviz >>> graphviz.Source(block.get_dot_string()).view() >>> # OR >>> graphviz.Source(block.get_dot_string()).view(filename='graph.pdf') """ if highlight_debug_op_types is None: highlight_debug_op_types = [] if highlight_debug_op_names is None: highlight_debug_op_names = [] dotstring = "digraph g {\n" + "\tcompound=true;\n" input_var_names = list(self.inputs.keys()) output_var_names = [v.name for v in self.outputs] debug_op_types = [] if len(highlight_debug_op_types) > 0: for op in self.operations: if op.op_type in highlight_debug_op_types: debug_op_types.append(op.name) vis = DotVisitor() vis.highlight_nodes(input_var_names, "yellow").highlight_nodes( output_var_names, "goldenrod2" ).highlight_nodes(highlight_debug_op_names, "cyan").highlight_nodes( debug_op_types, "green" ) vis.visit_all(self, nodename_prefix=str(prefix_id)) res = vis.get_result("subgraph", "cluster_" + function_name.replace("/", "_")) dotstring += "\n".join("\t" + r for r in res.split("\n")) + "\n" dotstring += "}" return dotstring class Function(Block): def __init__(self, inputs, opset_version=None): """ inputs: str -> placeholder opset_version: AvailableTarget enum. Describes the opset version of the function """ self.placeholder_inputs = inputs self.opset_version = opset_version self.output_types = None self.input_types = [] # str -> Var self._input_dict = OrderedDict() for k, v in self.placeholder_inputs.items(): v.set_name(k) # set to user input name self._input_dict[k] = v.outputs[0] global k_used_symbols global k_num_internal_syms for inp in self._input_dict.values(): if types.is_tensor(inp.dtype): shapes = inp.dtype.get_shape() for s in shapes: if is_symbolic(s): k_used_symbols.add(s) super().__init__() # Override Block's input @property def inputs(self): return self._input_dict @property def opset_version(self): return self._opset_version @opset_version.setter def opset_version(self, version): if not ( isinstance(version, _target) or version is None ): raise ValueError("opset_version must be type of coremltools.AvailableTarget") self._opset_version = version def __repr__(self): return self.__str__() def __str__(self): return self.to_str("function") def to_str( self, func_name: Optional[str] = "function", print_attr: Optional[bool] = False ) -> str: func_name = func_name + "[{}]".format(_OPSET[self.opset_version]) if len(self._input_dict) == 0: s = func_name + "()" else: inputs = [(in_name, ph) for in_name, ph in self._input_dict.items()] s = func_name + "(" + str(inputs[0][1]) for in_name, ph in inputs[1:]: s += ",\n" + " " * (len(func_name) + 1) + str(ph) s += ")" s += " {\n" s += self.indented_str(SPACES, print_attr=print_attr) s += "}\n" return s def get_max_opset_version_and_op(self) -> Tuple[_target, Operation]: """ Find the max opset version among all operations in the function. Returns the opset version Enum and the corresponding op. 
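Example (an illustrative usage sketch; `func` stands for any Function instance):

>>> max_version, op = func.get_max_opset_version_and_op()
>>> # e.g. max_version == ct.target.iOS17 when the newest op in func requires iOS17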
""" max_opset_version = _target.iOS13 op_with_max_opset_version = None def update_max_opset_version_block(block): nonlocal max_opset_version nonlocal op_with_max_opset_version for op in block.operations: for b in op.blocks: update_max_opset_version_block(b) if not hasattr(op, "_op_variants") or not isinstance(op._op_variants, dict): continue if op.opset_version > max_opset_version: max_opset_version = op.opset_version op_with_max_opset_version = op update_max_opset_version_block(self) return max_opset_version, op_with_max_opset_version def set_output_types(self, outputs: Optional[List[InputType]] = None) -> None: """ Set the user defined output type for a function. Note: the common::update_output_dtypes graph pass takes this information, and changes the function output signature accordingly. """ if outputs is not None: if not ( isinstance(outputs, list) and all([isinstance(out, InputType) for out in outputs]) ): raise TypeError( "main outputs should be a list of type ct.TensorType or ct.ImageType" ) self.output_types = outputs def set_input_types(self, input_types: List[InputType]): if not isinstance(input_types, tuple): raise ValueError("main inputs should be tuple of TensorType or ImageType") elif not all([isinstance(inp, InputType) for inp in input_types]): raise ValueError("main inputs should be tuple of InputSpec") self.input_types = input_types ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/builder.py0000644000000000000000000003356614672066616022612 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numbers from collections import defaultdict from typing import Any, Callable, List, Optional, Tuple, Type import numpy as np from coremltools import _logger as logger from coremltools.converters.mil import mil from coremltools.converters.mil._deployment_compatibility import AvailableTarget from coremltools.converters.mil.mil.types.symbolic import any_symbolic from .block import Function, curr_block from .input_type import ( InternalInputType, ListOrTensorOrDictInputType, TensorInputType, TupleInputType, ) from .program import Placeholder, StateTensorPlaceholder from .scope import ( SCOPE_STACK, VALID_OPS_TO_COPY_SCOPE_INFO, ScopeContextManger, ScopeInfo, ScopeSource, ) from .var import InternalVar, Var def is_python_value(val): return ( isinstance(val, (np.generic, np.ndarray)) or isinstance(val, numbers.Number) or isinstance(val, str) or isinstance(val, bool) or (isinstance(val, (tuple, list)) and all(is_python_value(v) for v in val)) ) class Builder: """ This class is a singleton builder to construct a MIL program. For more information, see `Create a MIL program `_. Importing ``.ops`` triggers the installation of all MIL ops into the Builder. For details on each op, see `MIL ops `_. 
Examples -------- >>> from coremltools.converters.mil.mil import Builder as mb >>> from coremltools.converters.mil.mil import Program, Function >>> prog = Program() >>> func_inputs = {"x": mb.placeholder(shape=[2,3]), >>> "y": mb.placeholder(shape=[2,3])} >>> with Function(func_inputs) as ssa_fun: >>> x, y = ssa_fun.inputs['x'], ssa_fun.inputs['y'] >>> res_var = mb.add(x=x, y=y) # created within ssa_fun block >>> ssa_fun.set_outputs([res_var]) >>> prog.add_function("main", ssa_fun) >>> # Importing ops triggers installation of all ops into Builder. >>> from .ops import defs as _ops """ name_count = defaultdict(int) @classmethod def _get_free_name(cls, name): new_name = name + "_" + str(cls.name_count[name]) cls.name_count[name] += 1 return new_name @classmethod def _maybe_set_name(cls, kwargs, op_type): if "name" not in kwargs: kwargs["name"] = cls._get_free_name(op_type) return kwargs @classmethod def _add_const(cls, val, name, before_op): if not is_python_value(val): err_msg = f"Cannot add const {val}" if any_symbolic(val): err_msg += ( "\nPython native vals (list, tuple), np.array that are" + "operation inputs cannot have symbolic values. Consider feeding" + "symbolic shape in through placeholder and use mb.shape() " + f"operator. Input {name}: {val}" ) raise ValueError(err_msg) const_name = cls._get_free_name(name) logger.debug("Adding const op '{}'".format(const_name)) output_var = cls.const(val=val, name=const_name, before_op=before_op) return output_var @classmethod def _create_vars(cls, input_spec, op_name, before_op, candidate_kv): """ For each key K in `candidate_kv`, create a Var if the following are satisfied: - K exists in input_spec and is not an InternalInputType - candidate_kv[K] is not already a Var Inputs ------ - candidate_kv: Dict[str, Any] Key-values may be inputs to an op (whose inputs is defined by input_spec) Returns ------- - var_kv: Dict[str, Var] For the K satisfying the above, var_kv[K] is the newly created Var """ update_dict = {} for k, val in candidate_kv.items(): if isinstance(val, Var): continue # already a Var if k not in input_spec.input_types: continue # k is not an op input in_type = input_spec.input_types[k] if isinstance(in_type, InternalInputType): new_var_name = op_name + "_" + k var = InternalVar(val, name=new_var_name) curr_block().add_internal_var(var) update_dict[k] = var continue # Not a regular Var new_var_name = op_name + "_" + k if isinstance(in_type, TupleInputType): var = [] if not isinstance(val, (list, tuple)): raise ValueError(f"Invalid type {type(val)} for TupleInputType param.") for i, v in enumerate(val): if isinstance(v, Var): var.append(v) continue var.append( cls._add_const(v, new_var_name + str(i), before_op) ) update_dict[k] = var continue if isinstance(in_type, (TensorInputType, ListOrTensorOrDictInputType)): var = cls._add_const(val, new_var_name, before_op) update_dict[k] = var return update_dict @classmethod def _add_op(cls, op_cls, **kwargs): """ Add an op of type `op_cls` (e.g., convolution) to current block. 
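As a rough sketch of the call path (the exact registry wiring is an assumption here, not shown in this file): importing .ops installs each registered op class onto the Builder, so a call such as mb.add(x=x, y=y) ends up invoking cls._add_op(<add op class>, x=x, y=y), which creates the op, materializes const inputs, and inserts it into the current block.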
""" kwargs = cls._maybe_set_name(kwargs, op_cls.__name__) logger.debug( "Adding op '{}' of type {}".format(kwargs["name"], op_cls.__name__) ) before_op = kwargs.get("before_op", None) # Shallow copy list inputs to ensure op inputs are immutable kwargs = {k: v if not isinstance(v, (list, tuple)) else v[:] for k, v in kwargs.items() if v is not None} kwargs.update(cls._create_vars( input_spec=op_cls.input_spec, op_name=kwargs["name"], before_op=before_op, candidate_kv=kwargs)) kwargs["enclosing_block"] = curr_block() # Add scope information current_scopes = SCOPE_STACK.get_curr_scopes() kwargs["scopes"] = current_scopes new_op = op_cls(**kwargs) # We record if the op is created under graph pass if len(current_scopes) == 1 and ScopeSource.COREMLTOOLS_GRAPH_PASS in current_scopes: VALID_OPS_TO_COPY_SCOPE_INFO[-1].add(new_op) # Initialize optional input Vars if it wasn't in kwargs default_inputs = new_op.default_inputs() # Shallow copy list inputs to ensure op inputs are immutable missing_optional_vals = {k: v if not isinstance(v, (list, tuple)) else v[:] for k, v in default_inputs.items() if k not in kwargs and v is not None} missing_optional_vars = cls._create_vars( input_spec=op_cls.input_spec, op_name=kwargs["name"], before_op=before_op, candidate_kv=missing_optional_vals) new_op.set_inputs(type_inference=False, **missing_optional_vars) curr_block()._insert_op_before(new_op, before_op=before_op) new_op.build_nested_blocks() new_op.type_value_inference() if len(new_op.outputs) == 1: return new_op.outputs[0] return new_op.outputs @staticmethod def placeholder( shape: Tuple[Any], dtype: Optional[Type] = None, allow_rank0_input: Optional[bool] = False, name: Optional[str] = None, ) -> Placeholder: return Placeholder(shape, dtype, allow_rank0_input=allow_rank0_input, name=name) @staticmethod def TensorSpec(shape, dtype=None): return Placeholder(shape, dtype) @staticmethod def StateTensorSpec(shape, dtype=None): return StateTensorPlaceholder(shape, dtype) @staticmethod def state_tensor_placeholder(shape, dtype=None): return StateTensorPlaceholder(shape, dtype) @staticmethod def _create_function( main_block: Callable, input_specs: Optional[List[Placeholder]] = None, opset_version: Optional[AvailableTarget] = None, ): """ Utility to construct a pymil function. """ if input_specs is None: input_specs = [] # validate number of function inputs num_args = main_block.__code__.co_argcount arg_names = list(main_block.__code__.co_varnames)[:num_args] if len(input_specs) != num_args: raise ValueError( f"{main_block.__name__} expects {num_args} inputs: {arg_names}. Got {len(input_specs)} input_specs." ) # create the function input_spec_dict = {k: v for k, v in zip(arg_names, input_specs)} with Function(input_spec_dict, opset_version) as func: input_vars = [func.inputs[a] for a in arg_names] outputs = main_block(*input_vars) if isinstance(outputs, tuple): outputs = list(outputs) elif not isinstance(outputs, list): outputs = [outputs] func.set_outputs(outputs) # infer the opset version if not provided max_opset_version, _ = func.get_max_opset_version_and_op() if opset_version is None: func.opset_version = max_opset_version return func @staticmethod def function( input_specs: Optional[List[Placeholder]] = None, opset_version: Optional[AvailableTarget] = None, ): """ The ``mb.function`` decorator creates a MIL function. 
Parameters ---------- input_specs: List[TensorSpec] Describes the function inputs opset_version: AvailableTarget enum Describes the opset version of the function Examples -------- >>> import coremltools as ct >>> @mb.function(input_specs=[mb.TensorSpec(shape=(1,2))], opset_version=ct.target.iOS16) >>> def func(a): >>> return mb.add(x=a, y=2) """ def wrapper(main_block): return Builder._create_function(main_block, input_specs, opset_version) return wrapper @staticmethod def program( input_specs: Optional[List[Placeholder]] = None, opset_version: Optional[AvailableTarget] = None, function_name: Optional[str] = "main", ): """ The ``mb.program`` decorator creates a MIL program with a single function with name ``function_name``. Parameters ---------- input_specs: List[TensorSpec] Describes the function inputs opset_version: AvailableTarget enum Describes the opset version of the program function_name: str Name of the function Examples -------- >>> import coremltools as ct >>> from coremltools.converters.mil.mil import Builder as mb >>> >>> @mb.program(input_specs=[mb.TensorSpec(shape=(1,2))], opset_version=ct.target.iOS16) >>> def prog(a): >>> return mb.add(x=a, y=2) """ def wrapper(main_block): function = Builder._create_function(main_block, input_specs, opset_version) program = mil.Program() program.add_function(function_name, function) return program return wrapper @staticmethod def scope( *scopes: List[ScopeInfo], ) -> ScopeContextManger: """ The ``mb.scope`` creates a context manager, which makes the operations created within it have the corresponding scope information. Parameters ---------- scopes: Optional[List[ScopeInfo]] (Optional) * A list of ScopeInfo under the context manager. * The source in each ScopeInfo cannot be duplicated. * If not provided, this context manager does no affects. Examples -------- The following is an example of creating a scope for torchscript module heirarchy with type and name information. .. sourcecode:: python @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["Module1"]), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=["module_1"]), ): return mb.add(x=x, y=4.3, name="add_1") In the previous example, the "add_1" op will have two scope attributes, for torchscipt module type and name: * TORCHSCRIPT_MODULE_TYPE: ["Module1"] * TORCHSCRIPT_MODULE_NAME: ["module_1"] The following is an example of creating nested scopes: .. sourcecode:: python @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["Module1"]), ): x = mb.add(x=x, y=4.3, name="add_1") with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["Module2"]), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=["module_2"]), ): return mb.add(x=x, y=3.2, name="add_2") In the previous example, the "add_1" op would have a scope attribute: * TORCHSCRIPT_MODULE_TYPE: ["Module1"] while the "add_2" op would have scope attributes: * TORCHSCRIPT_MODULE_TYPE: ["Module1", "Module2"] * TORCHSCRIPT_MODULE_NAME: ["module_2"] """ return ScopeContextManger(*scopes) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/input_type.py0000644000000000000000000002672714672066616023365 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from collections import OrderedDict from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.var import InternalVar SUPPORT_FLOAT_TYPES = [ types.fp16, types.fp32, types.fp64, ] SUPPORT_INT_TYPES = [ types.uint8, types.uint16, types.uint32, types.uint64, types.int8, types.int16, types.int32, types.int64, ] + list(types._SUB_BYTE_TYPES) SUPPORT_COMPLEX_TYPES = [ types.complex64, types.complex128, ] _SUPPORT_TYPES = ( SUPPORT_FLOAT_TYPES + SUPPORT_INT_TYPES + SUPPORT_COMPLEX_TYPES + [types.bool, types.str] + list(types._SUB_BYTE_TYPES) ) class DefaultInputs: def __init__(self, **kwargs): # Since python 3.6, kwargs preserves the input order. See # https://docs.python.org/3/whatsnew/3.6.html#whatsnew36-pep468 self._default_inputs = [(k, v) for k, v in kwargs.items()] self._ordered_dict = OrderedDict() for k, v in self._default_inputs: self._ordered_dict[k] = v def items(self): return self._ordered_dict.items() def __add__(self, default_inputs): new_order_dict = {k: v for k, v in self._ordered_dict.items()} for k, v in default_inputs._default_inputs: new_order_dict[k] = v return DefaultInputs(**new_order_dict) class InputSpec: def __init__(self, **kwargs): # Since python 3.6, kwargs preserves the input order. See # https://docs.python.org/3/whatsnew/3.6.html#whatsnew36-pep468 self._input_types = [(k, v) for k, v in kwargs.items()] self._ordered_dict = OrderedDict() for k, v in self._input_types: self._ordered_dict[k] = v def __add__(self, input_spec): new_order_dict = {k: v for k, v in self._ordered_dict.items()} for k, v in input_spec._input_types: new_order_dict[k] = v return InputSpec(**new_order_dict) @property def input_types(self): """ Ordered dict[str, _InputType] (name, input_type) """ return self._ordered_dict def validate_inputs(self, op_name, op_type, candidate_kvs): """ For each key K in `candidate_kvs`, if K is found in self.input_types, perform the following: - check that candidate_kvs[K] is a Var and satisfies requirements in InputType (const, types) - Place K, candidate_kvs[K] in output (list of (name, var) pairs). Note that this does not ensure the presence of all required input_spec (optional == False). Parameters ---------- - op_name: str - op_type: str - candidate_kvs: Dict[str, Var] Values cannot be None Return ------ None Raise: ValueErrr if value type is incompatible """ msg_prefix = 'Op \"{}\" (op_type: {}) '.format(op_name, op_type) # check vars sharing the same type_domain_id have the same dtype type_domain_group = {} var_to_input_name = {} for name, var in candidate_kvs.items(): input_type = self.input_types[name] if isinstance(input_type, TensorInputType) and input_type.type_domain_id is not None: type_domain_id = input_type.type_domain_id if type_domain_id in type_domain_group: type_domain_group[type_domain_id].append(var) else: type_domain_group[type_domain_id] = [var] var_to_input_name[var] = name for type_domain_id, vars in type_domain_group.items(): expected_dtype = vars[0].dtype ref_name = var_to_input_name[vars[0]] for var in vars: name = var_to_input_name[var] if not var.dtype == expected_dtype: msg = ( "In op, of type {}, named {}, the named input `{}` must have the same data type " "as the named input `{}`. However, {} has dtype {} whereas {} has dtype {}." 
).format(op_type, op_name, name, ref_name, name, var.dtype.__type_info__(), ref_name, expected_dtype.__type_info__()) raise ValueError(msg) # Ensure candidate_kvs doesn't contain None for name, var in candidate_kvs.items(): if var is None: raise ValueError(msg_prefix + 'Input {} is None'.format(name)) if name not in self.input_types: raise ValueError(msg_prefix + \ 'Unrecognized input {}'.format(name)) input_type = self.input_types[name] # Check constness # Don't check InternalInputType (so _const_symbolic can work) if ( input_type.const and not isinstance(input_type, InternalInputType) and not var.is_descendant_of_const ): msg = msg_prefix + "Input {} must be const at compile time" raise ValueError(msg.format(name), name, var.name) if not isinstance(var, InternalVar) and \ not input_type.is_compatible(var): msg = msg_prefix + "Input {}=\"{}\" expects " +\ "{} but got {}" raise ValueError(msg.format(name, var.name, input_type.type_str, var.sym_type.__type_info__())) class _InputType: """ (Untyped) input containing fundamental properties of all inputs to an Operation: """ def __init__(self, const=False, optional=False): """ const (bool): True if the InputType has to be constant / materialized at compile time. Const InputType is semantically equivalent to attribute. By default False. Read-only. optional (bool): True to allow user not to specify this input and rely on default values (defined in default_inputs). Note: _InputType should not be directly instantiated. Only its subclasses may be instantiated. """ self.const = const self.optional = optional def is_compatible(self, v): """ Return True if (possibly symbolic) value `v` is compatible. False otherwise. Inputs: v (Var | ListVar | native python function): input Comment: Define is_compatible as instance method to call proper subclass methods. """ return self._is_compatible(v) def _is_compatible(self, v): return True def _get_predefined_datatype(self): """ Override this function if datatype can be known without `_default` or `_val`. """ return None def __str__(self): return type(self).__name__ @property def type_str(self): """Descriptive string describing expected mil types""" return self.__str__() class TensorInputType(_InputType): """ TensorInputType specifies the generic tensor inputs. The `type_domain` validates data type constraints, and it could be either (1) A object / tuple of builtin types: This puts constraint on the allowed inputs data type. For example: ``` input_spec = InputSpec( x=TensorInputType(type_domain=types.int32), ) ``` only allows input `x` have int32 dtype. ``` input_spec = InputSpec( x=TensorInputType(type_domain=(types.int32, types.fp16)), ) ``` allows input `x` be either type of int32 or float16 (2) string: Verify different input parameters binding with the same `type_domain` are the same data type. This additional check is done by defining a `type_domains` dictionary in the Operation class For example: ``` class conv(Operation): input_spec = InputSpec( x=TensorInputType(type_domain="T"), weight=TensorInputType(type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32), } ``` would verify: (i) `x` and `weight` are one of the float16 or float32 type. (ii) `x` and `weight` are the same type. 
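As an illustrative consequence of (ii): binding `x` as a float16 tensor and `weight` as a float32 tensor in the conv example above would make InputSpec.validate_inputs raise a ValueError, because both inputs share the type domain "T" and therefore must resolve to the same dtype.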
""" def __init__(self, type_domain, **kwargs): self._type_domain = () self._type_domain_id = None if isinstance(type_domain, str): self.type_domain_id = type_domain else: if isinstance(type_domain, type): type_domain = (type_domain,) self.type_domain = type_domain super().__init__(**kwargs) def _is_compatible(self, v): result = types.is_scalar(v.sym_type) or types.is_tensor(v.sym_type) result = result and (v.dtype in self.type_domain) return result @property def type_domain(self): return self._type_domain @type_domain.setter def type_domain(self, val): msg = f"type_domain {val} must be a tuple of builtin types" if not isinstance(val, tuple) or any(map(lambda t: t not in _SUPPORT_TYPES, val)): raise ValueError(msg) self._type_domain = val @property def type_domain_id(self): return self._type_domain_id @type_domain_id.setter def type_domain_id(self, val): if not isinstance(val, str): raise ValueError("type_domain_id must be type of str") self._type_domain_id = val @property def type_str(self): return 'tensor or scalar of dtype from type domain ' + str([types.builtin_to_string(v) for v in self.type_domain]) class ListInputType(_InputType): """ ListInputType allows inputs of type types.list """ def _is_compatible(self, v): return types.is_list(v.sym_type) @property def type_str(self): return 'list' class ListOrTensorOrDictInputType(_InputType): """ ListOrTensorOrDictInputType allows inputs of (1) MIL tensor (2) python list/tuple of MIL tensors (3) MIL dictionary """ def _is_compatible(self, v): return ( types.is_list(v.sym_type) or types.is_scalar(v.sym_type) or types.is_tensor(v.sym_type) or types.is_dict(v.sym_type) ) @property def type_str(self): return 'list, tensor, or scalar' class TupleInputType(_InputType): """ TupleInputType specifies input types of python list/tuple of MIL tensors. """ def _is_compatible(self, v): # We don't check the detail types within the tuple. return isinstance(v, (tuple, list)) @property def type_str(self): return 'tuple' class InternalInputType(_InputType): """ InternalInputType specifies input types outside of Program's type system. It allows ops to take, for example, python primitive types, instead of only the builtin types. """ def _is_compatible(self, v): return True # skip type check by default for InternalInputType. class StateInputType(_InputType): """ StateInputType allows inputs of type types.state """ def _is_compatible(self, v): return types.is_state(v.sym_type) class PyFunctionInputType(InternalInputType): """ Native python function. """ def _is_compatible(self, v): return callable(v.val) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/operation.py0000644000000000000000000005533414672066616023161 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import Any, Dict, Optional, Tuple import numpy as np from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.types.symbolic import any_symbolic, is_symbolic from . import SPACES from .input_type import DefaultInputs, TensorInputType, TupleInputType from .var import ComplexVar, InternalVar, ListVar, Var VALUE = 1 SYMBOL = 2 NONE = 4 ALL = 7 def _is_compatible_symbolic_array(a, b): """ A helper function which check if two numpy array with symbolic value. 
For instance, a = np.array([is0, is2]) b = np.array([is1, 1]) are considered compatible. a = np.array([is0, 1]) b = np.array([is1, -1]) are not. """ if not a.shape == b.shape: return False a = a.flatten() b = b.flatten() for t, v in zip(a, b): if not is_symbolic(t) and not is_symbolic(v): if t != v: return False return True def precondition(allow=ALL): """ A helper decorator for value_inference method. Decorate value_inference with parameter VALUE/SYMBOL/NONE or ALL. For VALUE/SYMBOL/NONE use logical or ( | ) for multiple allowance. Note that: 1. ALL == VALUE | SYMBOL | NONE 2. Chosen flag (some or all VALUE/SYMBOL/NONE) must be satisfied by EVERY INPUTS for the precondition to be satisfied. The meaning for each flag is: VALUE: value that can be materialized during compile time SYMBOL: value that cannot be materialized by exist as a symbol value NONE: a None value Usage: @precondition(allow=VALUE|SYMBOL) def value_inference(self): '''some value_inference implementation''' """ ALLOW_VALUE = allow & VALUE ALLOW_SYMBOL = allow & SYMBOL ALLOW_NONE = allow & NONE def process(v, has_value, has_symbol, has_none): """ v: Var Return updated has_value, has_symbol, has_none """ if any_symbolic(v.sym_val): return has_value, True, has_none elif v.val is None: return has_value, has_symbol, True return True, has_symbol, has_none def decorator(func): def wrapper(self): HAS_VALUE = False HAS_SYMBOL = False HAS_NONE = False for in_name, in_type in self._input_types.items(): if in_type.optional: # Optional inputs are not required to invoke value_inference() continue if isinstance(in_type, TupleInputType): for v in self._input_vars[in_name]: HAS_VALUE, HAS_SYMBOL, HAS_NONE = process( v, HAS_VALUE, HAS_SYMBOL, HAS_NONE ) else: HAS_VALUE, HAS_SYMBOL, HAS_NONE = process( self._input_vars[in_name], HAS_VALUE, HAS_SYMBOL, HAS_NONE ) if HAS_VALUE and not ALLOW_VALUE: msg = "Implementation of value_inference() for op {} doesn't support input with VALUE" raise NotImplementedError(msg.format(self.op_type)) elif HAS_SYMBOL and not ALLOW_SYMBOL: msg = "Implementation of value_inference() for op {} doesn't support input with SYMBOL" raise NotImplementedError(msg.format(self.op_type)) elif HAS_NONE and not ALLOW_NONE: msg = "Implementation of value_inference() for op {} doesn't support input with NONE" raise NotImplementedError(msg.format(self.op_type)) else: return func(self) return wrapper return decorator def is_internal_input(arg_name): return arg_name[0] == "_" class mil_list: """ A wrapper around python list """ def __init__(self, ls=None): self.ls = ls if ls is not None else [] if not isinstance(self.ls, list): raise TypeError("Type of 'ls' must be list in the 'mil_list' class") class Operation: """ Represents Operation in MIL. # Properties name (str): The name of the operation input_types (InputSpec, class attr): Read-only named input types from all subclasses. Input types are used to validate `inputs`. If an input arg name start with prefix `_`, that indicates the input has the following properties: 1. Most of the time, the input is type of ``InternalInputType`` and used only in pymil scope. It doesn't have the corresponding arg / attr in the MIL framework definition. 2. It won't be printed in pymil. inputs [_input_vars] (dict of str --> Var): An Operation (subclass of Operation) only has access to input Var, which is already validated against `input_spec`. outputs [_output_vars] (list of Var): List of output var based on type inference. Read-only """ # Map from type domain id to a tuple of accepted types. 
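# Illustrative example: an op declaring
#     type_domains = {"T": (types.fp16, types.fp32)}
# allows every TensorInputType declared with type_domain="T" to be either
# fp16 or fp32, and additionally requires all such inputs to share one dtype.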
type_domains: Dict[str, Tuple[Any]] = dict() def __init__(self, **kwargs): self._input_types = self.input_spec.input_types self._type_domains = self.type_domains self.name = kwargs.get("name", None) self._output_vars = None self._input_vars = {} self.blocks = [] self.enclosing_block = kwargs["enclosing_block"] self.scopes = kwargs["scopes"] # Initialize inputs as object attributes (all None) for k in self._input_types.keys(): setattr(self, k, None) self._input_vars[k] = None self._check_expected_inputs(kwargs) # Populate type_domains into input types for v in self._input_types.values(): if not isinstance(v, TensorInputType): continue if len(v.type_domain) == 0: if v.type_domain_id not in self._type_domains: raise ValueError("type_domain {} not defined.".format(v.type_domain_id)) v.type_domain = self._type_domains[v.type_domain_id] # Set inputs from kwargs input_kv = {k: v for k, v in kwargs.items() if k in self._input_types and v is not None} self._validate_and_set_inputs(input_kv) self._ensure_required_inputs() def _check_expected_inputs(self, kwargs): """ Check that all kwargs are one of the following: - system inputs (non-attributes) - op inputs (self._input_types.keys()) """ non_attributes = [ "name", "symbolic_datatype", "datatype", "symbolic_value", "value", "version", "before_op", "no_check_var_types", # no_check_var_types==True to force set inputs, even if type does not match with earlier ones "enclosing_block", "scopes", ] for k in kwargs.keys(): if k not in non_attributes and k not in self._input_types: raise ValueError( "Unknown input '{}' for op '{}'".format(k, self.op_type) ) def set_inputs(self, no_check_var_types=False, type_inference=False, **input_kvs): """ Parameters ---------- - input_kvs: Dict[str, Var] Value cannot be None - type_inference: bool True to perform type inference and recreate output Var. """ self._validate_and_set_inputs(input_kvs, no_check_var_types=no_check_var_types) if type_inference and not no_check_var_types: self.type_value_inference() self._ensure_required_inputs() def get_flattened_inputs(self): """ Returns: list[Var]. Flatten all tuple inputs """ flat_inputs = [] for v in self.inputs.values(): if isinstance(v, (list, tuple)): flat_inputs.extend(v) else: flat_inputs.append(v) return flat_inputs def type_value_inference(self, overwrite_output=False): """ Perform type inference and auto_val computation based on new input Vars in kwargs. If self._output_vars is None then we generate _output_vars; otherwise no new Var is created, but type inference result is verified against existing _output_vars, if overwrite_output is False. If overwrite_output is True, then the type inference result overwrites the existing _output_vars """ output_types = self.type_inference() if not isinstance(output_types, tuple): output_types = (output_types,) output_vals = self._auto_val(output_types) try: output_names = self.output_names() if not isinstance(output_names, tuple): output_names = (output_names,) except NotImplementedError: if len(output_types) > 1: output_names = tuple(str(i) for i, _ in enumerate(output_types)) else: output_names = ("",) # output name same as op name. # Combine (output_names, output_types, output_vals) to create output # Vars. 
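# Naming note for the code below: each output Var is named
# "<op_name>_<output_name>" when output_names() supplies a non-empty name,
# and falls back to the op name itself for the single unnamed output case.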
if self._output_vars is None: self._output_vars = [] for i, (n, sym_type, sym_val) in enumerate( zip(output_names, output_types, output_vals) ): name = self.name + "_" + n if n != "" else self.name if types.is_list(sym_type): new_var = ListVar( name, elem_type=sym_type.T[0], init_length=sym_type.T[1], dynamic_length=sym_type.T[2], sym_val=sym_val if (sym_val is not None and isinstance(sym_val.val, list)) else None, op=self, op_output_idx=i, ) elem_shape = new_var.elem_shape if elem_shape is not None and len(elem_shape) >= 5: msg = ( "Core ML only supports list of elements with rank <= 4. " 'Layer "{}", with type "{}", outputs a list of rank {} tensors.' ).format(self.name, self.op_type, len(elem_shape)) raise ValueError(msg) else: if types.is_tensor(sym_type) and types.is_complex(sym_type.T[0]): # Only `complex` op needs to maintain the real/imag data in the ComplexVar. # For other ops, this ComplexVar is just a placeholder here, which will be # replaced by a newly created ComplexVar during complex ops lowering pass. real_data = ( self.real_data if self.op_type == "complex" else None ) imag_data = ( self.imag_data if self.op_type == "complex" else None ) new_var = ComplexVar( name, sym_type, sym_val, op=self, op_output_idx=i, real=real_data, imag=imag_data, ) else: new_var = Var(name, sym_type, sym_val, op=self, op_output_idx=i) self._output_vars.append(new_var) else: # Check new inference result against existing self._output_vars. for i, (sym_type, sym_val) in enumerate(zip(output_types, output_vals)): out_var = self._output_vars[i] # Check type inference if overwrite_output: out_var._sym_type = sym_type elif not types.is_compatible_type(sym_type, out_var.sym_type): msg = "Output Var {} in op {} type changes with new input Vars" raise ValueError(msg.format(out_var.name, self.name)) # Check value inference if overwrite_output: out_var._sym_val = sym_val if sym_val is not None and out_var.sym_val is not None: if np.any(sym_val.val != out_var.sym_val): if overwrite_output: out_var._sym_val = sym_val else: msg = 'value_inference differs for var {} in op {}' if not _is_compatible_symbolic_array(sym_val.val, out_var.sym_val): raise ValueError(msg.format(out_var.name, self.name)) for o in self.outputs: o._set_nonreplaceable_vars_upstream() def _auto_val(self, output_types): """ # Evaluation has two stages: # # Stage 1: Check whether the method value_inference() is implemented # # Stage 2: Check if there's an value_inference() implementation # for given input types. # # Suppose input are all SYMBOL: # Case 1: No value_inference() implemented => fail at stage 1 # Case 2: If value_inference() implemented, but requires all VALUE not # SYMBOL => fail at stage 2 # Case 3: If value_inference() implemented, and has no restriction on # input types => Success # # If either stage fails, outputs[i].val is None. # Otherwise, output[i].sym_val is not None. output_types: tuple of builtin types Returns: output_vals: tuple of builtin type with value, or tuple of None """ do_auto_val = True if do_auto_val: # Is self.value_inference implemented for corresponding input? try: vals = self.value_inference() except NotImplementedError: do_auto_val = False if not do_auto_val: # No auto_val possible. return tuple(None for _ in output_types) if not isinstance(vals, (tuple, list)): vals = (vals,) for val in vals: if val is None: do_auto_val = False if not do_auto_val: # No auto_val possible. 
return tuple(None for _ in output_types) auto_val = [] for t, v in zip(output_types, vals): builtin_val = t() if isinstance(v, mil_list): builtin_val.val = v.ls else: builtin_val.val = v auto_val.append(builtin_val) return auto_val def value_inference(self): """ Optional Python implementation of the op based on (materialized) values in `self.input_var`. Return a builtin value (single output) or a tuple of builtin values (multi-outputs) of the same length as returned by ` type_inference` Please note that, for ``constexpr_`` (compression) ops, we implement ``materialized_val_inference`` instead, so that we don't compute the actual values for those ops, which might potentially results in memory issue. """ msg = "value_inference() is not implemented by op {}" raise NotImplementedError(msg.format(self.op_type)) def default_inputs(self): """ Optional. Returns default values for optional inputs. The function is guaranteed to have access to all required inputs and possibly some optional inputs should the user supply them. They may be used to construct default values, such as `strides=[1]*num_spatial_dims` in conv, where `num_spatial_dims` may be inferred from the rank of required inputs """ return DefaultInputs() def output_names(self): """ Optional. If implemented, we set the output var i name as self.name + "/" + output_names[i] Returns a string (single output) or tuple of strings """ msg = "output_names() is not implemented by op {}" raise NotImplementedError(msg.format(self.op_type)) def type_inference(self): """ Return (builtin_type, builtin_val) pair from type inference. builtin_val may be None if symbolic_value is not attainable at compile time. """ raise NotImplementedError("This function must be implemented by each op") def build_nested_blocks(self): """ Build nested blocks (for cond and while_loop and other composite blocks) """ pass def _ensure_required_inputs(self): """ Raises ValueError if required inputs are not present """ for name, input_type in self._input_types.items(): if not input_type.optional and self._input_vars[name] is None: msg_prefix = 'Op "{}" (op_type: {}) '.format(self.name, self.op_type) raise ValueError( msg_prefix + "Required input {} is missing".format(name) ) def _validate_and_set_inputs(self, input_kvs, no_check_var_types=False): """ For each k, v in `input_kvs`, perform the following: - Check k exists in `self.input_specs` - Check that v satisfies the correspodning `InputType` - Set input, possibly replacing existing input. Note that it does not ensure all required inputs are satisfied. Use _ensure_required_inputs() for that. Parameters ---------- - input_kvs: Dict[str, Var] Each key in input_kvs must exist in `self.input_specs`. Its values must be a Var. - no_check_var_types: bool True to check var types against input_specs only, but not enforcing new input vars to be a subtype of existing input vars """ for key in input_kvs.keys(): if key not in self._input_types: raise RuntimeError( "Unknown input '{}' for op '{}'".format(key, self.op_type) ) def check_and_detach(v_new, v_old, op, no_check_var_types): # Check new var's sym_type is compatible with the # existing's sym_type. if ( not types.is_compatible_type(v_new.sym_type, v_old.sym_type) and not no_check_var_types ): raise ValueError( f"New var {v_new} doesn't have compatible " f"subtype of existing var `{v_old}`." 
) v_old.remove_child_op(op, no_check_var_types) self.input_spec.validate_inputs(self.name, self.op_type, input_kvs) for name, var in input_kvs.items(): # Remove this operation itself from existing input # Var's child_ops existing_input_var = self._input_vars[name] if existing_input_var is not None: if isinstance(existing_input_var, (list, tuple)): for v_old, v_new in zip(existing_input_var, var): check_and_detach(v_new, v_old, self, no_check_var_types) else: check_and_detach( var, existing_input_var, self, no_check_var_types ) # Set var as input_var if isinstance(var, Var): # TODO: the child op of complex op's input might get lost, as the complex op will # be lowered. Maybe should add child op here and take care of it in lowering pass. var.add_child_op(self) elif isinstance(var, (tuple, list)): for v in var: v.add_child_op(self) # ignore function inputs self._input_vars[name] = var setattr(self, name, var) @property def inputs(self): """ Returns ------- - inputs: Dict[str, Union[Var, Tuple[Var]]] """ # Filter out InternalVar return { k: v for k, v in self._input_vars.items() if not isinstance(v, InternalVar) and v is not None } @property def internal_inputs(self) -> Dict[str, InternalVar]: """ Get internal var inputs of an op. """ return {k: v for k, v in self._input_vars.items() if isinstance(v, InternalVar)} @property def outputs(self): return self._output_vars @property def op_type(self): return type(self).__name__ @property def opset_version(self): op_variants = type(self)._op_variants opset_versions = sorted(list(op_variants.keys())) for i in opset_versions: if op_variants[i] == type(self): return i def remove_from_block(self): """ Remove / detach itself from the enclosing block. See Block.remove_ops for details. """ self.enclosing_block.remove_ops([self]) @staticmethod def var_to_str(v): if isinstance(v, (tuple, list)): return "(" + ", ".join(["%" + s.name for s in v]) + ")" elif v.op and v.op.op_type == "const": val = v.op.val.sym_val if isinstance(val, (np.generic, np.ndarray)): # for small tensors, serialize as string; skip large tensors. if val.size <= 10: return str(val.tolist()) else: # other types are small enough they can be serialized return ( '"' + val + '"' if isinstance(val, str) else str(val) ) return "%" + v.name def indented_str(self, indent: Optional[str] = "", print_attr: Optional[bool] = False) -> str: if self.op_type == "const": return "" s = indent if self.outputs is not None: s += ", ".join([str(o) for o in self.outputs]) if print_attr: attr = "[" for k, v in self.scopes.items(): attr += f"{k}: {v}, " attr = attr[:-2] + "]" else: attr = "" s += " = " + self.op_type + attr + "(" s += ", ".join([k + "=" + Operation.var_to_str(v) for k, v in self.inputs.items()]) s += ', name="{}")\n'.format(self.name) for b in self.blocks: s += b.indented_str(indent=indent + SPACES, print_attr=print_attr) return s def __repr__(self): return str(self) def __str__(self): return self.indented_str(SPACES) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2295468 coremltools-8.0/coremltools/converters/mil/mil/ops/0000755000000000000000000000000014672075535021376 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/__init__.py0000644000000000000000000000033214672066616023505 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2295468 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/0000755000000000000000000000000014672075535022317 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/__init__.py0000644000000000000000000000045114672066616024430 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from . import complex_dialect_ops, coreml_dialect, iOS15, iOS16, iOS17, iOS18 ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/_op_reqs.py0000644000000000000000000000054214672066616024501 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil.ops.registry import \ SSAOpRegistry as _SSAOpRegistry register_op = _SSAOpRegistry.register_op ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/_utils.py0000644000000000000000000006044214672066616024176 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import math import numbers from typing import List, Tuple import numpy as np from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Var, get_new_symbol, types from coremltools.converters.mil.mil.ops.defs.iOS15.elementwise_unary import ( cast as cast_op_class, ) from coremltools.converters.mil.mil.types import builtin_to_string, promote_dtypes from coremltools.converters.mil.mil.types.symbolic import is_symbolic MAX_SIZE_CONSTANT_FOLDING = 1024 * 1024 / 4 # When a fp32 const takes over 1MB, we won't create a const op for that class ConvPoolingTypeInferenceCache(dict): """ An utility class to cache the shape inference of ``conv`` and ``pool`` op. The cache mechanism makes sure ops with the same input shape (symbolic also), and params (``pad, stride, kernel``) would produce the same output shape. """ @staticmethod def get_cache_key( input_shape: Tuple[int], pad_type: str, pad: Tuple[int], strides: Tuple[int], kernel: Tuple[int], ceil_mode: bool, ) -> Tuple[Tuple]: return ( ("input_shape", input_shape), ("pad_type", pad_type), ("pad", pad), ("strides", strides), ("kernel", kernel), ("ceil_mode", ceil_mode), ) def __setitem__(self, key, value): if key in self: raise ValueError(f"cache key {key} already exisit.") return dict.__setitem__(self, key, value) CONV_POOLING_TYPE_INFERENCE_CACHE = ConvPoolingTypeInferenceCache() def broadcast_shapes(shape_x, shape_y): """ Check and broadcast given input shapes. :param shape_x: tuple of int or symbols Shape of the first tensor (possibly symbolic). 
:param shape_y: tuple of int or symbols Shape of the second tensor (possibly symbolic). :return: tuple of int or symbols Result from broadcast. """ def raise_incompatible_dim_exception(): raise ValueError( "Incompatible dim {} in shapes {} vs. {}".format( i, shape_x, shape_y ) ) shape_x = tuple(shape_x) shape_y = tuple(shape_y) if len(shape_x) < len(shape_y): shape_x = tuple([1] * (len(shape_y) - len(shape_x))) + shape_x if len(shape_y) < len(shape_x): shape_y = tuple([1] * (len(shape_x) - len(shape_y))) + shape_y ret_shapes = list() for i in range(len(shape_x)): if shape_x[i] == shape_y[i]: ret_shapes.append(shape_x[i]) else: is_x_unknown = is_symbolic(shape_x[i]) is_y_unknown = is_symbolic(shape_y[i]) if shape_x[i] == 1: ret_shapes.append(shape_y[i]) elif shape_y[i] == 1: ret_shapes.append(shape_x[i]) elif not is_y_unknown and shape_y[i] > 1: if not is_x_unknown and shape_x[i] != shape_y[i]: raise_incompatible_dim_exception() ret_shapes.append(shape_y[i]) elif not is_x_unknown and shape_x[i] > 1: if not is_y_unknown and shape_x[i] != shape_y[i]: raise_incompatible_dim_exception() ret_shapes.append(shape_x[i]) elif is_x_unknown or is_y_unknown: ret_shapes.append(get_new_symbol()) else: raise_incompatible_dim_exception() return tuple(ret_shapes) def infer_type_with_broadcast(typea, typeb, primitive_type): """ Given 2 primitive types `typea` and `typeb`, and their promotion `primitive_type`, return the type after broadcast """ # broadcast if not types.is_tensor(typea) and not types.is_tensor(typeb): # both typea and typeb are not tensors return primitive_type if types.is_tensor(typea) and not types.is_tensor(typeb): # a is tensor, b is not return types.tensor(primitive_type, typea.get_shape()) if not types.is_tensor(typea) and types.is_tensor(typeb): # a is not tensor, b is return types.tensor(primitive_type, typeb.get_shape()) # both a, b are tensors shapea = list(typea.get_shape()) shapeb = list(typeb.get_shape()) ret_shape = broadcast_shapes(shapea, shapeb) return types.tensor(primitive_type, ret_shape) def promoted_primitive_type(type1, type2): """ Given a pair of tensor or primitive types, find the smallest type that can store an instance of their primitive type. """ ptype1 = type1.get_primitive() if types.is_tensor(type1) else type1 ptype2 = type2.get_primitive() if types.is_tensor(type2) else type2 return types.promote_types(ptype1, ptype2) def effective_kernel(kernel_shape, dilations): """ Args: kernel_shape: tuple[int] representing the kernel shape in each given dimension. dilations: tuple[int] representing the dilation of the kernel in each given dimension. Must be the same length as kernel_shape, and is assumed to give the dimensions in the same order as kernel_shape Returns: tuple[int] representing the effective shape of the kernel in each given dimension, with each dimension in the order given, taking into account dilation. See http://deeplearning.net/software/theano/tutorial/conv_arithmetic.html#dilated-convolutions Note that a dilation of 1 is equivalent to having no dilation. """ if len(kernel_shape) != len(dilations): raise ValueError( f"kernel_shape ({len(kernel_shape)}) and dilations ({len(dilations)}) " f"must be the same length" ) return tuple([(k - 1) * d + 1 for k, d in zip(kernel_shape, dilations)]) def aggregated_pad( pad_type, kernel_shape, input_shape=None, strides=None, dilations=None, custom_pad=None, ): """ Args pad_type: string. 
Must be one of ('same', 'same_lower', 'valid', 'custom') kernel_shape: [kH, kW, ...]: spatial kernel dims (excluding channels) input_shape: [iH, iW, ...]: spatial input dims (excluding channels) Required iff pad_type in ['same', 'same_lower'] strides: [sH, sW, ...]: spatial strides (excluding channels) Required iff pad_type in ['same', 'same_lower'] dilations: [dH, dW, ...]: dilations (excluding channels) If not provided, defaults to [1, 1, ...], effectively no dilation. custom_pad: Required iff pad_type == 'custom'. custom_pad[2*i], custom_pad[2*i+1] are before/after custom padding for spatial dim i. Returns: A tuple of total (before + after) padding for each spatial dimension in kernel_shape. """ num_spatial_dims = len(kernel_shape) if dilations is None: dilations = [1] * num_spatial_dims elif len(dilations) != num_spatial_dims: raise ValueError( f"dilations must have same length as kernel_shape " f"({num_spatial_dims}, but got {len(dilations)})" ) if pad_type in ["same", "same_lower"]: if input_shape is None or len(input_shape) != num_spatial_dims: raise ValueError( "For SAME padding input_shape must not be None and must have " "same length as kernel_shape ({}, but got {})".format( num_spatial_dims, len(input_shape) if input_shape is not None else "None", ) ) if strides is None or len(strides) != num_spatial_dims: raise ValueError( "For SAME padding strides must not be None and must have " "same length as kernel_shape ({}, but got {})".format( num_spatial_dims, len(strides) if strides is not None else "None" ) ) effective_ks = effective_kernel(kernel_shape, dilations) return tuple( [ int(max(0, s * math.ceil(float(i) / float(s)) - i + k - s)) if not is_symbolic(i) else get_new_symbol() for i, k, s in zip(input_shape, effective_ks, strides) ] ) if pad_type == "valid": return tuple([0] * num_spatial_dims) if pad_type == "custom": if custom_pad is None or len(custom_pad) != 2 * num_spatial_dims: raise ValueError("Invalid custom_pad.") return tuple([custom_pad[2 * d] + custom_pad[2 * d + 1] for d in range(num_spatial_dims)]) raise ValueError('Invalid padding pad_type "{}"'.format(pad_type)) def spatial_dimensions_out_shape( pad_type, input_shape, kernel_shape, strides, dilations=None, custom_pad=None, ceil_mode=False, ): """ Args pad_type: string. Must be one of ('same', 'same_lower', 'valid', 'custom') input_shape: [iH, iW, ...]: spatial input dims (excluding channels) Required iff pad_type in ['same', 'same_lower'] kernel_shape: [kH, kW, ...]: spatial kernel dims (excluding channels) strides: [sH, sW, ...]: spatial strides (excluding channels) Required iff pad_type in ['same', 'same_lower'] dilations: [dH, dW, ...]: dilations (excluding channels) If not provided, defaults to [1, 1, ...], effectively no dilation. custom_pad: Required iff pad_type == 'custom'. custom_pad[2*i], custom_pad[2*i+1] are before/after custom padding for spatial dim i. ceil_mode: determines the padding and output shape. When ceil mode is True: out_dim = floor((in_dim + pad_l + pad_r - kernel_size + (stride-1)) / stride) + 1 if (out_dim-1) * stride >= in_dim + pad_l and (pad_l > 0 or pad_r > 0): out_dim = out_dim - 1 When ceil mode is False: out_dim = floor((in_dim + pad_l + pad_r - kernel_size) / stride) + 1 Returns: A list of spatial output sizes for each spatial dimension of kernel_shape. 
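Illustrative example (hypothetical numbers, for intuition only): with in_dim = 6, no padding, kernel_size = 3, and stride = 2, ceil_mode False gives floor((6 - 3) / 2) + 1 = 2, while ceil_mode True gives floor((6 - 3 + (2 - 1)) / 2) + 1 = 3; the extra correction step does not apply here since pad_l = pad_r = 0.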
""" num_spatial_dims = len(kernel_shape) if dilations is None: dilations = [1] * num_spatial_dims if custom_pad is None: custom_pad = np.array([0] * num_spatial_dims * 2) if not ( len(input_shape) == len(kernel_shape) == len(strides) == len(dilations) == len(custom_pad) / 2 ): raise ValueError( f"input_shape (length {len(input_shape)}), " f"kernel_shape (length {len(kernel_shape)}), " f"strides (length {len(strides)}), " f"dilations (length {len(dilations)}), " f"and custom_pad (length {len(custom_pad)}) divided by two " "must all be the same length" ) effective_ks = effective_kernel(kernel_shape, dilations) if isinstance(strides, np.ndarray): strides = tuple(strides.tolist()) if isinstance(custom_pad, np.ndarray): custom_pad = tuple(custom_pad.tolist()) cache_key = CONV_POOLING_TYPE_INFERENCE_CACHE.get_cache_key( input_shape, pad_type, custom_pad, strides, effective_ks, ceil_mode, ) if cache_key in CONV_POOLING_TYPE_INFERENCE_CACHE: return CONV_POOLING_TYPE_INFERENCE_CACHE[cache_key] pad = aggregated_pad( pad_type=pad_type, kernel_shape=kernel_shape, input_shape=input_shape, strides=strides, dilations=dilations, custom_pad=custom_pad, ) out_shape = [] for r in range(num_spatial_dims): # only check if `input_shape` (spatial part of the input image) is symbolic, because: # * `input_shape` can be symbolic # * `pad` (aggregated from `input_shape` + ...) is symbolic only if `input_shape` is symbolic # * `effective_ks` (effective kernel size, determined from kernel size + dilations) cannot be symbolic # * strides cannot be symbolic if is_symbolic(input_shape[r]): out_shape.append(get_new_symbol()) else: out_dim = 0 if not ceil_mode: out_dim = math.floor((input_shape[r] + pad[r] - effective_ks[r]) / strides[r] + 1) else: out_dim = math.floor((input_shape[r] + pad[r] - effective_ks[r] + strides[r] - 1) / strides[r] + 1) if (out_dim - 1) * strides[r] >= input_shape[r] + pad[r]/2 and pad[r] > 0: out_dim = out_dim - 1 if out_dim <= 0: raise ValueError(f"spatial dimension {r} has invalid output size {out_dim}") out_shape.append(out_dim) CONV_POOLING_TYPE_INFERENCE_CACHE[cache_key] = out_shape return out_shape def parse_einsum_equation(equation: str) -> Tuple[List[str]]: """ Args equation : str parse the equation in the following manner: (running example: "nchw,nwhr->nchr") step 1: split the equation with delimiter "->" e.g.: this will give "nchw,nwhr" and "nchr" step 2: split the LHS equation string with delimiter "," e.g.: this will give input1 : "nchw", input2: "nwhr" step 3: map each character to a unique integer, which is incremented. Iterate over input1, input2 and output, in that order. e.g.: input 1, i.e., "nchw" will give vector {0,1,2,3} input 2, i.e, "nwhr" will produce {0,3,2,4} output , i.e. 
"nchr" will produce {0,1,2,4} return vectors corresponding to the 2 inputs and the output """ input_output_str = equation.split('->') assert len(input_output_str) == 2, "unsupported einsum equation {}".format(equation) input_str = input_output_str[0] output_str = input_output_str[1] inputs = input_str.split(',') in_outs = inputs + [output_str] map_char_to_int = {} def _update_vec(str, map_char_to_int, index): vec = [] for i, s in enumerate(str): if s not in map_char_to_int: map_char_to_int[s] = index index += 1 vec.append(map_char_to_int[s]) return index, vec index = 0 in_outs_vec = [] for inout_str in in_outs: index, vec = _update_vec(inout_str, map_char_to_int, index) in_outs_vec.append(vec) return tuple(in_outs_vec) def compute_gather(params, indices, axis, batch_dims): """ This utility function computes the gather operation with batch_dims supported. """ def compute_gather_helper(params, indices, axis): scalar_indices = isinstance(indices, numbers.Integral) if scalar_indices: res = np.take(params, [indices], axis) res2 = np.squeeze(res, axis=axis) if isinstance(res2, np.ndarray) and len(res2.shape) == 0: # The `res2` is a numpy 0-d array (after doing np.squeeze on a 1-d array). # For 0-d array in numpy, we need to extract the scalar value by first converting # it back to 1-d array. # Notice that .item() doesn't work because it returns a built-in type instead of # np.generic type, which will fail the downstream var value setter. return np.atleast_1d(res2)[0] return res2 return np.take(params, indices, axis) if batch_dims == 0: return compute_gather_helper(params, indices, axis) params_shape = params.shape indices_shape = indices.shape batch_shape = params_shape[:batch_dims] params_new_shape = [np.prod(batch_shape)] + list(params_shape[batch_dims:]) indices_new_shape = [np.prod(batch_shape)] + list(indices_shape[batch_dims:]) params_reshape = np.reshape(params, params_new_shape) indices_reshape = np.reshape(indices, indices_new_shape) res = [] for p, i in zip(params_reshape, indices_reshape): res.append(compute_gather_helper(p, i, axis - batch_dims)) res = np.stack(res) res_new_shape = tuple(batch_shape) + tuple(res.shape[1:]) return np.reshape(res, res_new_shape) def promote_input_dtypes(input_vars): """ This utility function promotes all input variables to the same data type. It is used to homogenize inputs to an op such as matmul / elementwise_binary, and not the inputs to a function itself. """ def _is_same_dtype(dtype1, dtype2): return builtin_to_string(dtype1) == builtin_to_string(dtype2) def _promoted_var(var, promoted_dtype): if var.val is None: x = mb.cast( x=var, dtype=builtin_to_string(promoted_dtype), name=var.name + "_promoted") else: const_value_after_cast = cast_op_class.get_cast_value(var, builtin_to_string(promoted_dtype)) x = mb.const(val=const_value_after_cast, name=var.name + "_promoted") return x for i, var in enumerate(input_vars): if not isinstance(var, Var): input_vars[i] = mb.const(val=var) promoted_dtype = promote_dtypes([var.dtype for var in input_vars]) for i, var in enumerate(input_vars): if not _is_same_dtype(var.dtype, promoted_dtype): input_vars[i] = _promoted_var(var, promoted_dtype) return input_vars def get_squeeze_axes(squeeze_mask, rank): """ Utility function to get the squeeze_axes from squeeze_mask. i.e., returns a list of indices ``i`` where ``squeeze_mask[i] == True``. 
For instance, given ``squeeze_mask = [True, False, True]``, this utility returns ``[0, 2]`` """ if squeeze_mask is None: squeeze_mask = [False] * rank squeeze_axes = [] for idx, mask in enumerate(squeeze_mask): if mask: squeeze_axes.append(idx) return squeeze_axes def get_param_val(param): """ Given a param, if it is not None, returns param.val, else returns None. """ if param is None: return None return param.val def solve_slice_by_index_slice(x_shape, begin, end, stride, begin_mask, end_mask, squeeze_mask): """ Utility function to solve the slices of tensor slicing """ # set default values for parameters rank = len(x_shape) begin = [int(i) for i in list(begin[:])] end = [int(i) for i in list(end[:])] if stride is None: stride = [1] * rank if begin_mask is None: begin_mask = [False] * rank if end_mask is None: end_mask = [False] * rank if squeeze_mask is None: squeeze_mask = [False] * rank # compute slices slices = [] for idx, mask in enumerate(begin_mask): if mask: begin[idx] = None for idx, mask in enumerate(end_mask): if mask: end[idx] = None for idx, mask in enumerate(squeeze_mask): if mask: end[idx] = None stride[idx] = np.iinfo( np.int32 ).max # We slice out only 1 element by setting stride to INF for idx in range(rank): slices.append(slice(begin[idx], end[idx], stride[idx])) return tuple(slices) def solve_slice_by_index_shape(x_shape, begin, end, stride, begin_mask, end_mask, squeeze_mask): """ Helper function to solve the shape of tensor slicing. """ # set default values rank = len(x_shape) if begin is None or len(begin) == 0: begin = [None] * rank if end is None or len(end) == 0: end = [None] * rank if stride is None: stride = [1] * rank if begin_mask is None: begin_mask = [False] * rank if end_mask is None: end_mask = [False] * rank if squeeze_mask is None: squeeze_mask = [False] * rank # basic validation for tensor shape if len(begin) != len(x_shape): raise TypeError( "slice_by_index op: size of 'begin', {}, is not equal to the rank of input, which is {}".format( len(begin), len(x_shape) ) ) if len(end) != len(x_shape): raise TypeError( "slice_by_index op: size of 'end', {}, is not equal to the rank of input, which is {}".format( len(end), len(x_shape) ) ) # solve for shape inference ret_shape = [] for idx in range(len(x_shape)): # skip if we want to squeeze the dimension if squeeze_mask[idx]: continue # for those a[:] cases if begin_mask[idx] and end_mask[idx]: if is_symbolic(x_shape[idx]): if stride[idx] == -1 or stride[idx] == 1: ret_shape.append(x_shape[idx]) else: ret_shape.append(get_new_symbol()) else: num = np.ceil(float(x_shape[idx]) / abs(stride[idx])).astype( np.int32 ) ret_shape.append(num) continue """ We first deal with those cases, where the output size is a deterministic number, even if the input dimension is unknown (i.e. symbolic) - No begin_mask and no end_mask: - begin == end. output shape = 0. - begin == end - 1, stride > 0. output shape = 1 - begin == end + 1, stride < 0. output shape = 1 - begin_mask is false and end_mask is true: - begin == -1, stride > 0. output shape = 1 - begin == 0, stride < 0. 
output shape = 1 """ if ( not begin_mask[idx] and not end_mask[idx] and begin[idx] is not None and end[idx] is not None ): out_shape = None if begin[idx] >= 0 and end[idx] >= 0 and stride[idx] > 0: if end[idx] < begin[idx]: raise ValueError( "slice_by_index op: unsupported values in for dimension {}, " "(begin, end, stride) : ({}, {}, {})".format( idx, begin[idx], end[idx], stride[idx] ) ) out_shape = np.arange(end[idx] - begin[idx])[ slice(0, end[idx] - begin[idx], stride[idx]) ].size if begin[idx] < 0 and end[idx] < 0 and stride[idx] < 0: if begin[idx] < end[idx]: raise ValueError( "slice_by_index op: unsupported values in for dimension {}, " "(begin, end, stride) : ({}, {}, {})".format( idx, begin[idx], end[idx], stride[idx] ) ) out_shape = np.arange(begin[idx] - end[idx])[ slice(-1, end[idx] - begin[idx] - 1, stride[idx]) ].size if out_shape in (0, 1): ret_shape.append(out_shape) continue if not begin_mask[idx] and end_mask[idx] and begin[idx] is not None: if (begin[idx] == 0 and stride[idx] < 0) or (begin[idx] == -1 and stride[idx] > 0): ret_shape.append(1) continue # for symbolic case if is_symbolic(x_shape[idx]): ret_shape.append(get_new_symbol()) continue # for single-element extraction case if x_shape[idx] == 1: ret_shape.append(1) continue # when begin and end are not determined if begin[idx] is None and not begin_mask[idx]: ret_shape.append(get_new_symbol()) continue if end[idx] is None and not end_mask[idx]: ret_shape.append(get_new_symbol()) continue # parse negative dimension if begin[idx] is not None and begin[idx] < 0: begin[idx] = max(0, begin[idx] + x_shape[idx]) if end[idx] is not None and end[idx] < 0: end[idx] = max(0, end[idx] + x_shape[idx]) # compute shape low, high = [0, x_shape[idx]] if stride[idx] > 0 else [-1, x_shape[idx] - 1] begin_idx, end_idx = ( [begin[idx], end[idx]] if stride[idx] > 0 else [end[idx], begin[idx]] ) is_begin_mask, is_end_mask = ( [begin_mask[idx], end_mask[idx]] if stride[idx] > 0 else [end_mask[idx], begin_mask[idx]] ) if is_begin_mask: begin_idx = low end_idx = high if is_end_mask else min(end_idx, high) num = np.ceil(float(end_idx - begin_idx) / abs(stride[idx])).astype( np.int32 ) ret_shape.append(max(0, num)) return ret_shape ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/complex_dialect_ops.py0000644000000000000000000007365714672066616026730 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause """ This file contains the dialect ops for handling complex numbers. For example, torch.fft.fft accepts complex input and produces complex outputs, which is not supported by CoreML. However, we can break the calculation into the real part and imaginary part to work around the restriction. The dialect op provided by this file could be used by any frontend (PyTorch, Tensorflow, etc). For example, during torch frontend translation, the torch's fft_fft op could be translated to def fft_fft(context, nodes): input_data, n, dim, norm = _get_inputs(context, node, expected=[4]) fft_res = mb.complex_fft(data=input_data, n=n, dim=dim, norm=norm) context.add(fft_res, node.name) and then the fft dialect op will be lowered into core ops by calculating the real and imaginary part separately. 
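As an informal sketch of that idea (not the exact lowering pass): for a real-valued input x of length N, Euler's formula splits the DFT X[k] = sum_n x[n] * exp(-2*pi*i*k*n/N) into real(X[k]) = sum_n x[n] * cos(2*pi*k*n/N) and imag(X[k]) = -sum_n x[n] * sin(2*pi*k*n/N), so each part can be computed with ordinary real-valued core ops.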
There are mainly three types of complex dialect ops: - Ops where real and imag data has interactions (such as fft). - Ops where real and imag data go through the non-complex version op separately (such as add). - Ops where only one of the real/imag data go through the non-complex version (such as shape). All dialect ops in this file will be lowered into core ops by `lower_complex_dialect_ops` pass. For adding a new dialect op, see steps in the file docstring of `lower_complex_dialect_ops.py`. Notice that all dialect op has `complex_` as prefix, because it's required by setting the `namespace="complex"` in `register_op`. """ from typing import Optional, Tuple import numpy as np from coremltools.converters.mil.mil import operation, types from coremltools.converters.mil.mil.input_type import DefaultInputs, InputSpec, TensorInputType from coremltools.converters.mil.mil.ops.registry import SSAOpRegistry from coremltools.converters.mil.mil.types.symbolic import any_symbolic, is_symbolic from coremltools.converters.mil.mil.types.type_mapping import ( infer_complex_dtype, infer_fp_dtype_from_complex, ) from coremltools.converters.mil.mil.var import ComplexVar, Var register_op = SSAOpRegistry.register_op _FFT_VALID_NORMS = {"forward", "backward", "ortho"} def fft_canonicalize_length_dim( input_data: Var, length: Optional[Var], dim: Optional[Var], c2r: bool = False ) -> Tuple[int, int]: """ Canonicalize shape and dim for 1-D FFT (based on PyTorch's fft documentation): - length: Signal length. If given, the input will either be zero-padded or trimmed to this length before computing the FFT. - dim: The dimension along which to take the one dimensional FFT. - c2r: Use for "complex to real", such as irfft, which takes complex and output real data. """ shapes, dims = fft_canonicalize_shapes_dims(input_data, length, dim, c2r) return shapes[0], dims[0] def fft_canonicalize_shapes_dims( input_data: Var, shapes: Optional[Var], dims: Optional[Var], c2r: bool = False ) -> Tuple[Tuple[int], Tuple[int]]: """ Canonicalize shapes and dims for N-D FFT (based on PyTorch's fftn documentation): - shapes: Signal size in the transformed dimensions. If given, each dimension dims[i] will either be zero-padded or trimmed to the length s[i] before computing the FFT. If a length -1 is specified, no padding is done in that dimension. Default: s = [input.size(d) for d in dims] - dims: Dimensions to be transformed. Default: all dimensions, or the last len(s) dimensions if s is given. - c2r: Use for "complex to real", such as irfftn, which takes complex and output real data. """ if shapes is not None: shapes = shapes.val if isinstance(shapes, np.integer): shapes = (shapes,) if dims is not None: dims = dims.val if isinstance(dims, np.integer): dims = (dims,) # Input validation. input_rank = input_data.rank if dims is not None: for dim in dims: if dim < -input_rank or dim >= input_rank: raise ValueError(f"Invalid dim {dim} in `dims`.") if shapes is not None: for shape in shapes: if shape <= 0: raise ValueError(f"Invalid shape {shape} in `shapes`.") # Determine if the last dim specified in dims need to be expanded. For IRFFTN, the input is # interpreted as a one-sided Hermitian signal in the Fourier domain, as produced by rfftn(), so # we need to expand the dim back to the full matrix (with conjugate part not pruned). last_dim_expand: bool = shapes is None and c2r if shapes is not None: if dims is None: # Has shape, no dim. # Default is last len(s) dimensions. 
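# Illustrative example (hypothetical values): with input_rank = 4 and len(shapes) == 2, dims defaults to (2, 3).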
dims = tuple(range(input_rank - len(shapes), input_rank)) else: # Has shape, has dim. if len(shapes) != len(dims): raise ValueError( "shapes and dims must have the same number of elements." ) shapes = tuple( shape if shape != -1 else input_data.shape[dim] for (shape, dim) in zip(shapes, dims) ) elif dims is None: # No shape, no dim. dims = tuple(range(input_rank)) shapes = tuple(input_data.shape) else: # No shape, has dim. shapes = tuple(input_data.shape[dim] for dim in dims) # In RFFTN, the output is trimmed (because FFT of real-value input is Hermitian-symmetric, the # conjugate part is removed) to ``original_dim // 2 + 1``, so here we do the reverse # ``2 * (trimmed_dim - 1)`` to restore the original shape. if last_dim_expand: target_last_dim_shape = 2 * (input_data.shape[dims[-1]] - 1) shapes = shapes[:-1] + (target_last_dim_shape,) if len(shapes) != len(dims): raise ValueError( f"shape ({len(shapes)}) and dim ({len(dims)}) should have same number of elements." ) return shapes, dims @register_op(namespace="complex") class complex(operation.Operation): """ Dialect op for constructing a complex data from real and imaginary data. """ input_spec = InputSpec( real_data=TensorInputType(type_domain="T"), imag_data=TensorInputType(type_domain="T"), ) type_domains = { "T": (types.fp32,), } def type_inference(self): if self.real_data.shape != self.imag_data.shape: raise ValueError( f"The shape of real_data ({self.real_data.shape}) and imag_data " f"({self.imag_data.shape}) must match to construct complex data." ) return types.tensor( infer_complex_dtype(self.real_data.dtype, self.imag_data.dtype), self.real_data.shape, ) @register_op(namespace="complex") class complex_real(operation.Operation): """Dialect op for extracting real part of complex data.""" input_spec = InputSpec( data=TensorInputType(type_domain="T"), ) type_domains = { "T": (types.complex64,), } def type_inference(self): return types.tensor( infer_fp_dtype_from_complex(self.data.dtype), self.data.shape ) @register_op(namespace="complex") class complex_imag(operation.Operation): """Dialect op for extracting imaginary part of complex data.""" input_spec = InputSpec( data=TensorInputType(type_domain="T"), ) type_domains = { "T": (types.complex64,), } def type_inference(self): return types.tensor( infer_fp_dtype_from_complex(self.data.dtype), self.data.shape ) @register_op(namespace="complex") class complex_fft(operation.Operation): """ Dialect op for 1-D FFT. As PyTorch's FFT API has a much more fine-grained control than TensorFlow's, the parameters of this dialect op mainly follows `torch.fft.fft`. Parameters ---------- data: tensor<\*D, T> (Required) * The input tensor. n: const i32 (Optional. Default=None) * Signal length. If given, the input will either be zero-padded or trimmed to this length before computing the FFT. dim: const i32 (Optional. Default=``-1``) * The dimension along which to take the one dimensional FFT. norm: const str (Optional. Default=``backward``) * Normalization mode. For the forward transform (fft()), these correspond to: * "forward" - normalize by 1/n * "backward" - no normalization * "ortho" - normalize by 1/sqrt(n) (making the FFT orthonormal) * Calling the backward transform (ifft()) with the same normalization mode will apply an overall normalization of 1/n between the two transforms. This is required to make ifft() the exact inverse. * Default is "backward" (no normalization). Returns ------- tensor<\*V, complex64> * A complex tensor where real and imag parts have the same shape. 
* If ``n`` is None, real's and imag's shapes are same as the input. * If ``n`` is specified, shape is ``V[dim]=n``. Attributes ---------- T: fp32, complex64 References ---------- See `torch.fft.fft `_. """ input_spec = InputSpec( data=TensorInputType(type_domain="T"), n=TensorInputType(const=True, optional=True, type_domain=types.int32), dim=TensorInputType(const=True, optional=True, type_domain=types.int32), norm=TensorInputType(const=True, optional=True, type_domain=types.str), ) type_domains = { "T": (types.fp32, types.complex64), } def default_inputs(self): return DefaultInputs( n=None, dim=-1, norm="backward", ) def type_inference(self): if self.norm.val not in _FFT_VALID_NORMS: raise ValueError( f"Invalid norm param. Valid options are {_FFT_VALID_NORMS}" ) output_type = ( self.data.dtype if types.is_complex(self.data.dtype) else types.complex64 ) # The shape of FFT output is determined by `n` and `dim`. output_shape = list(self.data.shape) n, dim = fft_canonicalize_length_dim(self.data, self.n, self.dim) output_shape[dim] = n return types.tensor(output_type, tuple(output_shape)) @register_op(namespace="complex") class complex_fftn(operation.Operation): """ Dialect op for N-D FFT. As PyTorch's FFT API has a much more fine-grained control than TensorFlow's, the parameters of this dialect op mainly follows `torch.fft.fftn`. Parameters ---------- data: tensor<\*D, T> (Required) * The input tensor. shapes: const tensor (Optional. Default=None) * Signal size in the transformed dimensions. If given, each dimension ``dims[i]`` will either be zero-padded or trimmed to the length ``shapes[i]`` before computing the FFT. If a length ``-1`` is specified, no padding is done in that dimension. If not specified, it's equivalent to ``shapes = [data.size(dim) for dim in dims]``. dims: const tensor (Optional. Default=None) * Dimensions to be transformed. If not specified, it's equivalent to all dimensions, or the last ``len(shapes)`` dimensions if ``shapes`` is given. norm: const str (Optional. Default=``backward``) * Normalization mode. For the forward transform (fftn()), these correspond to: * "forward" - normalize by 1/n * "backward" - no normalization * "ortho" - normalize by 1/sqrt(n) (making the FFT orthonormal) where ``n = prod(shapes)`` is the logical FFT size. Calling the backward transform (ifftn()) with the same normalization mode will apply an overall normalization of 1/n between the two transforms. This is required to make ifftn() the exact inverse. * Default is "backward" (no normalization). Returns ------- tensor<\*V, complex64> * A complex tensor where real and imag parts have the same shape. * If ``shapes`` and ``dims`` are both None, real's and imag's shapes are same as the input. * If ``shapes`` or ``dims`` is specified, shape is ``V[dim]=shapes[dim] for dim in dims``. Attributes ---------- T: fp32, complex64 References ---------- See `torch.fft.fftn `_. """ input_spec = InputSpec( data=TensorInputType(type_domain="T"), shapes=TensorInputType(const=True, optional=True, type_domain=types.int32), dims=TensorInputType(const=True, optional=True, type_domain=types.int32), norm=TensorInputType(const=True, optional=True, type_domain=types.str), ) type_domains = { "T": (types.fp32, types.complex64), } def default_inputs(self): return DefaultInputs( shapes=None, dims=None, norm="backward", ) def type_inference(self): if self.norm.val not in _FFT_VALID_NORMS: raise ValueError( f"Invalid norm param. 
Valid options are {_FFT_VALID_NORMS}" ) output_type = ( self.data.dtype if types.is_complex(self.data.dtype) else types.complex64 ) # The shape of FFT output is determined by `shapes` and `dims`. shapes, dims = fft_canonicalize_shapes_dims(self.data, self.shapes, self.dims) output_shape = list(self.data.shape) for shape, dim in zip(shapes, dims): output_shape[dim] = shape return types.tensor(output_type, tuple(output_shape)) @register_op(namespace="complex") class complex_rfft(operation.Operation): """ Dialect op for 1-D RFFT. It's similar to 1-D FFT, but the input is real number. The FFT of a real signal is Hermitian-symmetric, ``X[i] = conj(X[-i])``, so the output contains only the positive frequencies below the Nyquist frequency. To compute the full output, use FFT. Parameters ---------- See the ``complex_fft`` op. Returns ------- tensor<\*V, complex64> * Based on the output of FFT, further remove the redundant conjugate part, which means ``V[dim] = V[dim] // 2 + 1``. Attributes ---------- T: fp32 References ---------- See `torch.fft.rfft `_. """ input_spec = InputSpec( data=TensorInputType(type_domain="T"), n=TensorInputType(const=True, optional=True, type_domain=types.int32), dim=TensorInputType(const=True, optional=True, type_domain=types.int32), norm=TensorInputType(const=True, optional=True, type_domain=types.str), ) type_domains = { "T": (types.fp32,), } def default_inputs(self): return DefaultInputs( n=None, dim=-1, norm="backward", ) def type_inference(self): if types.is_complex(self.data.dtype): raise ValueError( "RFFT requires real-value input. For complex input, please use FFT." ) output_type = infer_complex_dtype(self.data.dtype, self.data.dtype) output_shape = list(self.data.shape) n, dim = fft_canonicalize_length_dim(self.data, self.n, self.dim) output_shape[dim] = n # The shape of RFFT output is FFT after removing redundant conjugate part. output_shape[self.dim.val] = output_shape[self.dim.val] // 2 + 1 return types.tensor(output_type, tuple(output_shape)) @register_op(namespace="complex") class complex_rfftn(operation.Operation): """ Dialect op for N-D RFFT (rfftn). The FFT of a real signal is Hermitian-symmetric, X[i_1, ..., i_n] = conj(X[-i_1, ..., -i_n]) so the full ``complex_fftn`` output contains redundant information. ``complex_rfftn`` omits the negative frequencies in the last dimension. Parameters ---------- See the ``complex_fftn`` op. Returns ------- tensor<\*V, complex64> * Based on the output of N-D FFT, further remove the redundant conjugate part in last dim, which means ``V[dims[-1]] = V[dims[-1]] // 2 + 1``. Attributes ---------- T: fp32 References ---------- See `torch.fft.rfftn `_. """ input_spec = InputSpec( data=TensorInputType(type_domain="T"), shapes=TensorInputType(const=True, optional=True, type_domain=types.int32), dims=TensorInputType(const=True, optional=True, type_domain=types.int32), norm=TensorInputType(const=True, optional=True, type_domain=types.str), ) type_domains = { "T": (types.fp32,), } def default_inputs(self): return DefaultInputs( shapes=None, dims=None, norm="backward", ) def type_inference(self): output_type = infer_complex_dtype(self.data.dtype, self.data.dtype) output_shape = list(self.data.shape) shapes, dims = fft_canonicalize_shapes_dims(self.data, self.shapes, self.dims) for shape, dim in zip(shapes, dims): output_shape[dim] = shape # The last dim's shape is after removing the redundant conjugate part. 
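# Illustrative example (hypothetical shape): a real input of size 10 along the last transformed dim yields 10 // 2 + 1 = 6 one-sided frequency bins.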
output_shape[dims[-1]] = output_shape[dims[-1]] // 2 + 1 return types.tensor(output_type, tuple(output_shape)) @register_op(namespace="complex") class complex_ifft(operation.Operation): """ Dialect op for IFFT. Computes the one dimensional inverse discrete Fourier transform of input. Parameters ---------- All parameters except ``norm`` are same as the ``complex_fft`` op. norm: const str (Optional. Default=``backward``) * Normalization mode. For the backward transform (ifft()), these correspond to: * "forward" - no normalization * "backward" - normalize by 1/n * "ortho" - normalize by 1/sqrt(n) (making the IFFT orthonormal) * Calling the forward transform (fft()) with the same normalization mode will apply an overall normalization of 1/n between the two transforms. This is required to make ifft() the exact inverse. * Default is "backward" (normalize by 1/n). Returns ------- tensor<\*V, T> * A complex tensor where real and imag parts have the same shape. The shape is the same as the input except for the ``dim``: * If ``n`` is None, the shape is same as the input. * If ``n`` is specified, the shape at the `dim` is ``V[dim]=n``. Attributes ---------- T: complex64 References ---------- See `torch.fft.ifft `_. """ input_spec = InputSpec( data=TensorInputType(type_domain="T"), n=TensorInputType(const=True, optional=True, type_domain=types.int32), dim=TensorInputType(const=True, optional=True, type_domain=types.int32), norm=TensorInputType(const=True, optional=True, type_domain=types.str), ) type_domains = { "T": (types.complex64,), } def default_inputs(self): return DefaultInputs( n=None, dim=-1, norm="backward", ) def type_inference(self): output_type = self.data.dtype output_shape = list(self.data.shape) n, dim = fft_canonicalize_length_dim(self.data, self.n, self.dim) output_shape[dim] = n return types.tensor(output_type, tuple(output_shape)) @register_op(namespace="complex") class complex_ifftn(operation.Operation): """ Dialect op for N-D IFFT (ifftn). Parameters ---------- All parameters except ``norm`` are same as the ``complex_fftn`` op. norm: const str (Optional. Default=``backward``) * Normalization mode. For the backward transform (ifftn()), these correspond to: * "forward" - no normalization * "backward" - normalize by 1/n * "ortho" - normalize by 1/sqrt(n) (making the IFFT orthonormal) where n = prod(s) is the logical IFFT size. Calling the forward transform (fftn()) with the same normalization mode will apply an overall normalization of 1/n between the two transforms. This is required to make ifftn() the exact inverse. * Default is "backward" (normalize by 1/n). Returns ------- tensor<\*V, T> * A complex tensor where real and imag parts have the same shape. The shape is the same as the input except for the ``dim`` in ``dims``: * If ``shapes`` and ``dims`` are both None, the shape is same as the input. * If ``shapes`` or ``dims`` is specified, shape at ``dim`` is ``shapes[dim]``. Attributes ---------- T: complex64 References ---------- See `torch.fft.ifftn `_. 
""" input_spec = InputSpec( data=TensorInputType(type_domain="T"), shapes=TensorInputType(const=True, optional=True, type_domain=types.int32), dims=TensorInputType(const=True, optional=True, type_domain=types.int32), norm=TensorInputType(const=True, optional=True, type_domain=types.str), ) type_domains = { "T": (types.complex64,), } def default_inputs(self): return DefaultInputs( shapes=None, dims=None, norm="backward", ) def type_inference(self): output_type = self.data.dtype output_shape = list(self.data.shape) shapes, dims = fft_canonicalize_shapes_dims(self.data, self.shapes, self.dims) for shape, dim in zip(shapes, dims): output_shape[dim] = shape return types.tensor(output_type, tuple(output_shape)) @register_op(namespace="complex") class complex_irfft(operation.Operation): """ Dialect op for IRFFT. Computes the inverse of RFFT. The input is interpreted as a one-sided Hermitian signal in the Fourier domain, as produced by rfft(). By the Hermitian property, the output will be real-valued. Parameters ---------- See the ``complex_ifft`` op for details. Returns ------- tensor<\*V, fp32> * The shape is the same as the input except for the ``dim``: * If ``n`` is None, the shape at the `dim` is ``V[dim] = 2 * (D[dim] - 1)``. * If ``n`` is specified, the shape at the `dim` is ``V[dim]=n``. Attributes ---------- T: complex64 References ---------- See `torch.fft.irfft `_. """ input_spec = InputSpec( data=TensorInputType(type_domain="T"), n=TensorInputType(const=True, optional=True, type_domain=types.int32), dim=TensorInputType(const=True, optional=True, type_domain=types.int32), norm=TensorInputType(const=True, optional=True, type_domain=types.str), ) type_domains = { "T": (types.complex64,), } def default_inputs(self): return DefaultInputs( n=None, dim=-1, norm="backward", ) def type_inference(self): output_type = infer_fp_dtype_from_complex(self.data.dtype) output_shape = list(self.data.shape) n, dim = fft_canonicalize_length_dim(self.data, self.n, self.dim, c2r=True) output_shape[dim] = n return types.tensor(output_type, tuple(output_shape)) @register_op(namespace="complex") class complex_irfftn(operation.Operation): """ Dialect op for N-D IRFFT (irfftn). Parameters ---------- See the ``complex_ifftn`` op for details. Returns ------- tensor<\*V, fp32> * The shape is the same as the input except for: * If ``shapes`` and ``dims`` are both None, shape at the last dim ``V[-1]`` is ``2 * (D[-1] - 1)``. * If ``shapes`` or ``dims`` is specified, shape at ``dim`` is ``shapes[dim]``. Attributes ---------- T: complex64 References ---------- See `torch.fft.irfftn `_. """ input_spec = InputSpec( data=TensorInputType(type_domain="T"), shapes=TensorInputType(const=True, optional=True, type_domain=types.int32), dims=TensorInputType(const=True, optional=True, type_domain=types.int32), norm=TensorInputType(const=True, optional=True, type_domain=types.str), ) type_domains = { "T": (types.complex64,), } def default_inputs(self): return DefaultInputs( shapes=None, dims=None, norm="backward", ) def type_inference(self): output_type = infer_fp_dtype_from_complex(self.data.dtype) output_shape = list(self.data.shape) shapes, dims = fft_canonicalize_shapes_dims( self.data, self.shapes, self.dims, c2r=True ) for shape, dim in zip(shapes, dims): output_shape[dim] = shape return types.tensor(output_type, tuple(output_shape)) @register_op(namespace="complex") class complex_shape(operation.Operation): """ Returns a 1-dimensional tensor with the shape of the input complex tensor. 
Parameters ---------- x: tensor<[*?], T> (Required) * Input tensor. Returns ------- tensor * Shape of the input tensor. * ``K = x.real.rank``. Attributes ---------- T: complex64 """ input_spec = InputSpec(x=TensorInputType(type_domain="T")) type_domains = { "T": (types.complex64,), } # If type_inference or value_inference is invoked when the graph is being constructed, # x.real and x.imag may not be set since the complex lowering pass hasn't yet been invoked. # self.x should already have the shape set, so use that instead. def type_inference(self): if not isinstance(self.x, ComplexVar): raise ValueError("x must be a ComplexVar.") input_rank = self.x.rank return types.tensor(types.int32, tuple([input_rank])) def value_inference(self): if any_symbolic(self.x.shape): # convert elements in shape to int32 res = [x if is_symbolic(x) else np.int32(x) for x in self.x.shape] return np.array(res) else: return np.array(self.x.shape).astype(np.int32) @register_op(namespace="complex") class complex_abs(operation.Operation): """ Returns the absolute value of a complex tensor. Parameters ---------- x: tensor<[*d], T> (Required) Returns ------- tensor<[*d], fp32> * A float tensor with the same shape as ``x`` Attributes ---------- T: complex64 """ input_spec = InputSpec(x=TensorInputType(type_domain="T")) type_domains = { "T": (types.complex64,), } def type_inference(self): if not isinstance(self.x, ComplexVar): raise ValueError("x must be a ComplexVar.") return types.tensor(infer_fp_dtype_from_complex(self.x.dtype), self.x.shape) @register_op(namespace="complex") class complex_stft(operation.Operation): """ Dialect op for 1-D STFT. Parameters ---------- input: tensor<\*D, T> (Required) * The input tensor. n_fft: const i32 (Required) * Size of the fourier transform. hop_length: const i32 (Optional) * Stride between window frames of the input tensor. win_length: const i32 (optional) * The size of the window frame. window: tensor<1, win_length> (optional) * The window to apply to the input signal before performing the fourier transform. normalized: const bool (optional, Default=``false``) * Whether to normalize the results of the STFT onesided: const bool (optional, Default=``true``) * For real-valued inputs, whether to return the first half of the results. Returns ------- tensor<\*V, complex64> * A complex tensor where real and imag parts have the same shape. Attributes ---------- T: fp32, complex64 References ---------- See `torch.stft `_. 
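Shape note (informal, derived from the type inference below; hypothetical sizes): for a length-T signal, the number of frames is (T - n_fft) // hop_length + 1, where hop_length defaults to n_fft // 4, and with onesided=True the frequency axis has n_fft // 2 + 1 bins.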
""" input_spec = InputSpec( input=TensorInputType(type_domain="T"), n_fft=TensorInputType(const=True, type_domain=types.int32), hop_length=TensorInputType(const=True, optional=True, type_domain=types.int32), win_length=TensorInputType(const=True, optional=True, type_domain=types.int32), window=TensorInputType(const=True, optional=True, type_domain=types.fp32), normalized=TensorInputType(const=True, optional=True, type_domain=types.bool), onesided=TensorInputType(const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": (types.fp32, types.complex64), } def default_inputs(self): return DefaultInputs( hop_length = None, win_length = None, window = None, normalized = False, onesided = True, ) def type_inference(self): output_type = (types.complex64) # STFT shape is [B x N x T], where N is the number of frequency bins # and T is the number of windows # B is 1 for a time series or 2 for a batch of time series window_length = self.n_fft.val hop = self.hop_length.val if self.hop_length else self.n_fft.val // 4 # if onesided is true, the input is real valued # because of Hermitian symmetry, we only need to calculate the FFT # for the first half of the frequencies if self.onesided and self.onesided.val: window_length = window_length // 2 + 1 frames = (self.input.shape[-1] - self.n_fft.val) // hop + 1 output_shape = [window_length, frames] # add back rank if needed if self.input.rank == 2: output_shape = [self.input.shape[0]] + output_shape return types.tensor(output_type, tuple(output_shape)) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2295468 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/coreml_dialect/0000755000000000000000000000000014672075535025265 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/coreml_dialect/__init__.py0000644000000000000000000000040014672066616027370 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from .ops import coreml_update_state ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/coreml_dialect/ops.py0000644000000000000000000000367414672066616026452 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import InputSpec, StateInputType, TensorInputType from coremltools.converters.mil.mil.operation import Operation from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op @register_op(namespace="coreml") class coreml_update_state(Operation): """ Copy the content of a variable into a state and return the copy of the variable. The type of the variable must match the type that is wrapped inside the state. This is a coreml dialect op to simplify the program. When loading into MIL, the following transformation is done: .. 
code-block:: %x = coreml_update_state(state=%state, value=%value) --> write_state(state=%state, value=%value) %x = read_state(input=%state) Parameters ---------- state: state (Required) value: ST (Required) Returns ------- ST Attributes ---------- ST: tensor """ input_spec = InputSpec( state=StateInputType(), value=TensorInputType(type_domain="T"), ) type_domains = { "T": ( types.fp16, types.fp32, types.int8, types.int16, types.int32, types.uint8, types.uint16, types.bool, ), } def type_inference(self): state_wrapped_type = self.state._sym_type.wrapped_type() if not state_wrapped_type == self.value.sym_type: raise ValueError( f"State wrapped type {state_wrapped_type.__type_info__()} not matched with the value's sym_type {self.value.sym_type.__type_info__()}." ) return self.value.sym_type ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2335467 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS15/0000755000000000000000000000000014672075535023157 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS15/__init__.py0000644000000000000000000000615714672066616025301 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil._deployment_compatibility import \ AvailableTarget as target _IOS15_TARGET = target.iOS15 from .activation import (clamped_relu, elu, gelu, leaky_relu, linear_activation, prelu, relu, relu6, scaled_tanh, sigmoid, sigmoid_hard, silu, softmax, softplus, softplus_parametric, softsign, thresholded_relu) from .classify import classify from .control_flow import (cond, const, list_gather, list_length, list_read, list_scatter, list_write, make_list, select, while_loop) from .conv import conv, conv_quantized, conv_transpose from .elementwise_binary import (add, elementwise_binary, equal, floor_div, greater, greater_equal, less, less_equal, logical_and, logical_or, logical_xor, maximum, minimum, mod, mul, not_equal, pow, real_div, sub) from .elementwise_unary import (abs, acos, asin, atan, atanh, cast, ceil, clip, cos, cosh, erf, exp, exp2, floor, inverse, log, logical_not, round, rsqrt, sign, sin, sinh, sqrt, square, tan, tanh, threshold) from .image_resizing import (affine, crop, crop_resize, resample, resize_bilinear, resize_nearest_neighbor, upsample_bilinear, upsample_nearest_neighbor) from .linear import einsum, linear, matmul from .normalization import (batch_norm, instance_norm, l2_norm, layer_norm, local_response_norm) from .pool import avg_pool, l2_pool, max_pool from .random import (random_bernoulli, random_categorical, random_normal, random_uniform) from .recurrent import gru, lstm, rnn from .reduction import (reduce_argmax, reduce_argmin, reduce_l1_norm, reduce_l2_norm, reduce_log_sum, reduce_log_sum_exp, reduce_max, reduce_mean, reduce_min, reduce_prod, reduce_sum, reduce_sum_square) from .scatter_gather import (gather, gather_along_axis, gather_nd, scatter, scatter_along_axis, scatter_nd) from .tensor_operation import (argsort, band_part, concat, cumsum, fill, flatten2d, identity, non_maximum_suppression, non_zero, one_hot, pad, range_1d, shape, split, stack, tile, topk) from .tensor_transformation import (depth_to_space, expand_dims, pixel_shuffle, reshape, reverse, reverse_sequence, slice_by_index, 
slice_by_size, sliding_windows, space_to_batch, space_to_depth, squeeze, transpose) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS15/activation.py0000644000000000000000000003601114672066616025673 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import math import numpy as np from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import (DefaultInputs, InputSpec, TensorInputType) from coremltools.converters.mil.mil.operation import (VALUE, Operation, precondition) from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from .elementwise_unary import elementwise_unary class activation_with_alpha(Operation): """ Activation with Alpha Op Superclass """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), alpha=TensorInputType(const=True, type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32), } def type_inference(self): return self.x.sym_type class activation_with_alpha_and_beta(Operation): """ Activation with Alpha Beta Op Superclass """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), alpha=TensorInputType(const=True, type_domain="T"), beta=TensorInputType(const=True, type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32), } def type_inference(self): return self.x.sym_type @register_op class clamped_relu(activation_with_alpha_and_beta): """ If ``x >= 0`` return elementwise ``min(beta, x)``, otherwise return ``min(beta, alpha * x)``. Parameters ---------- x: tensor<\*?, T> (Required) alpha: const T (Required) beta: const T (Required) Returns ------- tensor<\*?, T> * A tensor of the same type and shape as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): x = np.minimum(np.maximum(self.x.val, 0), self.beta.val) y = np.minimum(np.minimum(self.x.val, 0) * self.alpha.val, self.beta.val) return x + y @register_op class elu(activation_with_alpha): """ If ``x > 0`` return elementwise ``x``, otherwise return ``alpha * (e^x - 1)``. Parameters ---------- x: tensor<\*?, T> (Required) alpha: const T (Required) Returns ------- tensor<\*?, T> * A tensor of the same shape and type as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): b = np.copy(self.x.val) b[b < 0] = self.alpha.val * (np.exp(b[b < 0]) - 1) return b @register_op class gelu(Operation): """ Return the elementwise Gaussian error linear unit activation function for ``x``. You can use ``EXACT``, ``TANH_APPROXIMATION``, or ``SIGMOID_APPROXIMATION`` values based on the following formulas: * ``EXACT``: .. math:: f(x) = 0.5x\\left ( 1+\\rm{erf}\\left ( \\frac{x}{\\sqrt{2}} \\right ) \\right ) * ``TANH_APPROXIMATION``: .. math:: f(x) = 0.5x\\left ( 1+\\rm{tanh}\\left ( \\sqrt{2/\\pi}\\left ( x + 0.044715x^3 \\right ) \\right ) \\right ) * ``SIGMOID_APPROXIMATION``: .. math:: f(x) = x*\\rm{sigmoid}(1.702x) Parameters ---------- x: tensor<\*?, T> (Required) mode: const str (Optional) * Use ``'EXACT'``, ``'TANH_APPROXIMATION'``, or ``'SIGMOID_APPROXIMATION'`` for ``str``. * Default is ``'EXACT'``. Returns ------- tensor<\*?, T> * A tensor of the same shape and type as ``x``. 
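As an informal numerical illustration (approximate values, for intuition only): for x = 1.0, the exact form gives about 0.841, the tanh approximation about 0.841, and the sigmoid approximation about 0.846.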
Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), mode=TensorInputType(const=True, optional=True, type_domain=types.str), ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): return DefaultInputs( mode="EXACT", ) @precondition(allow=VALUE) def value_inference(self): if self.mode.val == "TANH_APPROXIMATION": a = np.sqrt(2 / np.pi) * (self.x.val + 0.044715 * np.power(self.x.val, 3)) return 0.5 * self.x.val * (1 + np.tanh(a)) elif self.mode.val == "SIGMOID_APPROXIMATION": return self.x.val * (1 / (1 + np.exp(-(1.702 * self.x.val)))) else: sqaure_root_of_2 = np.sqrt(2) vfunc = np.vectorize(lambda x: 0.5 * x * (1 + math.erf(x / sqaure_root_of_2))) return vfunc(self.x.val) def type_inference(self): allowed_values = {"EXACT", "TANH_APPROXIMATION", "SIGMOID_APPROXIMATION"} if self.mode.val not in allowed_values: msg = '"gelu" op: unrecognized value of mode: "{}". Allowed values are {}' raise ValueError(msg.format(self.mode.val, allowed_values)) return self.x.sym_type @register_op class leaky_relu(activation_with_alpha): """ If ``x >= 0`` apply ``x`` elementwise, otherwise apply ``alpha * x`` elementwise. Parameters ---------- x: <*?, T> (Required) alpha: const T (Required) Returns ------- tensor<\*?, T> * A tensor of the same shape and type as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): b = np.copy(self.x.val) b[b < 0] *= self.alpha.val return b @register_op class linear_activation(activation_with_alpha_and_beta): """ Apply elementwise ``x * alpha + beta``. Parameters ---------- x: tensor<\*?, T> (Required) alpha: const T (Required) beta: const T (Required) Returns ------- tensor<\*?, T> * A tensor of the same shape and type as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): return self.alpha.val * self.x.val + self.beta.val @register_op class prelu(activation_with_alpha): """ Where ``i = 1 ... C``, if ``x_i > 0``, return ``x_i`` , otherwise return ``alpha_i * x_i``. Parameters ---------- x: tensor<[B, C, 1..3], T> (Required) * ``x`` must have rank 4, rank 3, or rank 5; that is, a shape of ``(B,C,H)``, ``(B,C,H,W)``, or ``(B,C,D,H,W)``. alpha: const tensor<[C], T>, (Required) * The length of ``alpha`` must match the second dimension of ``x`` (channel dimension). Returns ------- tensor<[B, C, 1..3], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp32, fp16 """ @precondition(allow=VALUE) def value_inference(self): # Expends alpha on all dims besides the channel (2nd) dim. alpha_br = self.alpha.val for i in range(len(self.x.shape)): if i != 1: alpha_br = np.expand_dims(alpha_br, i) x_pos = np.maximum(self.x.val, 0) b = np.minimum(self.x.val, 0) return x_pos + b * alpha_br def type_inference(self): if self.x.rank not in (3, 4, 5): raise ValueError( "prelu op: x must be rank 3 or 4 or 5, instead it is of rank {}".format( len(self.x.shape) ) ) if len(self.alpha.val.shape) != 1: raise ValueError("alpha should be rank 1") if self.x.shape[1] != self.alpha.val.shape[0]: raise ValueError( f"Size of dimension 0 of alpha ({self.alpha.val.shape[0]}) should be " f"the same as the size of dimension 1 of x ({self.x.shape[1]})." 
) if self.x.rank in (3, 5): # check whether all alpha values are the same or not are_values_same = ( np.where(np.abs(self.alpha.val - self.alpha.val[0]) > 1e-5)[0].size == 0 ) if not are_values_same: raise ValueError( "prelu op: rank 3 or rank 5 input is only supported when all the values of alpha are same," "which is not the case here" ) return self.x.sym_type @register_op class relu(elementwise_unary): """ Return elementwise-applied rectified linear activation: ``max(x, 0)``. Parameters ---------- x: tensor<\*?, T> (Required) Returns ------- tensor<\*?, T> * A tensor of the same shape and type as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): return np.maximum(self.x.val, 0) @register_op class relu6(elementwise_unary): """ Return elementwise-applied rectified linear activation: ``min(max(x, 0), 6)``. Parameters ---------- x: tensor<\*?, T> (Required) Returns ------- tensor<\*?, T> * A tensor of the same shape and type as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): return np.minimum(np.maximum(self.x.val, 0), 6) @register_op class scaled_tanh(activation_with_alpha_and_beta): """ Return ``alpha * tanh(beta * x)`` elementwise. Parameters ---------- x: tensor<\*?, T> (Required) * Input range is ``(-inf, inf)``. alpha: const T (Required) beta: const T (Required) Returns ------- tensor<\*?, T> * A tensor of the same shape and type as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): return self.alpha.val * np.tanh(self.x.val * self.beta.val) @register_op class sigmoid(elementwise_unary): """ Return ``sigmoid(x)`` elementwise. Parameters ---------- x: tensor<\*?, T> (Required) Returns ------- tensor<\*?, T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): return 1 / (1 + np.exp(-self.x.val)) @register_op class sigmoid_hard(activation_with_alpha_and_beta): """ Return ``min( max( alpha * x + beta, 0 ), 1 )`` elementwise. Parameters ---------- x: tensor<\*?, T> (Required) alpha: const T (Required) beta: const T (Required) Returns ------- tensor<\*?, T> * A tensor of the same shape and type as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): return np.minimum( np.maximum((self.alpha.val * self.x.val) + self.beta.val, 0), 1 ) @register_op class silu(elementwise_unary): """ Sigmoid Linear Unit, elementwise apply the SiLU or Swish operation ``x * sigmoid(x)``. Parameters ---------- x: tensor<\*, T> Returns ------- tensor<\*, T> Attributes ---------- T: fp16, fp32 """ pass @register_op class softplus(elementwise_unary): """ Return ``log( 1 + e^x )`` elementwise. Parameters ---------- x: tensor<\*?, T> (Required) Returns ------- tensor<\*?, T> * A tensor of the same shape and type as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): return np.log(1 + np.exp(-np.abs(self.x.val))) + np.maximum(self.x.val, 0) @register_op class softplus_parametric(activation_with_alpha_and_beta): """ Return ``alpha_i * log( 1 + e^( beta_i * x_i ) )``, where ``i = 1 ... C``. Parameters ---------- x: tensor<[b, C, n, m], T> (Required) alpha: const tensor<[C], T> (Required) beta: const tensor<[C], T> (Required) Returns ------- tensor<[b, C, n, m], T> * A tensor of the same shape as ``x``. 
Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): alpha_br = np.copy(self.alpha.val) beta_br = np.copy(self.beta.val) # Expends alpha and beta on all dims besides the channel (2nd) dim. for i in range(len(self.x.val.shape)): if i != 1: alpha_br = np.expand_dims(alpha_br, i) beta_br = np.expand_dims(beta_br, i) return alpha_br * np.log(1 + np.exp(self.x.val * beta_br)) def type_inference(self): if len(self.x.shape) < 3: raise ValueError("x should be at least rank 3") if len(self.alpha.val.shape) != 1: raise ValueError("alpha should be rank 1") if self.x.shape[1] != self.alpha.val.shape[0]: raise ValueError( "Size of dimension 0 of alpha should be the same as " + "the size of dimension 1 of x." ) if len(self.beta.val.shape) != 1: raise ValueError("beta should be rank 1") if self.x.shape[1] != self.beta.val.shape[0]: raise ValueError( "Size of dimension 0 of beta should be the same as " + "the size of dimension 1 of x." ) return self.x.sym_type @register_op class softmax(Operation): """ Return ``exp(x) / tf.reduce_sum(tf.exp(x), axis)``. Parameters ---------- x: tensor<\*?, T> (Required) axis: const i32 (Optional) * Default is ``-1``. Returns ------- tensor<\*?, T> * A tensor of the same shape and type as ``x``. Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), axis=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): return DefaultInputs( axis=-1, ) def type_inference(self): return self.x.sym_type @precondition(allow=VALUE) def value_inference(self): x = self.x.val axis = self.axis.val max_vals = np.max(x, axis=axis, keepdims=True) temp = np.exp(x - max_vals) return temp / np.sum(temp, axis=axis, keepdims=True) @register_op class softsign(elementwise_unary): """ Return ``x / ( 1 + |x| )`` applied elementwise. Parameters ---------- x: tensor<\*?, T> (Required) Returns ------- tensor<\*?, T> * A tensor of the same shape and type as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): return self.x.val / (1 + np.abs(self.x.val)) @register_op class thresholded_relu(activation_with_alpha): """ Return ``x`` if ``x >= alpha``, otherwise return ``0``. Parameters ---------- x: tensor<\*?, T> (Required) alpha: const T (Required) Returns ------- tensor<\*, T> * A tensor of the same shape and type as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): y = self.x.val y[y < self.alpha.val] = 0 return y ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS15/classify.py0000644000000000000000000000632314672066616025352 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.mil import Operation, types from coremltools.converters.mil.mil.input_type import (InputSpec, ListInputType, TensorInputType) from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.types.symbolic import any_symbolic @register_op class classify(Operation): """ The presence of this op indicates that the model is of type classifier. 
The op constructs the model output accordingly; that is, the predicted class label and the output probability dictionary. The parameters of this op are set based on the attributes set for the `coremltools.ClassifierConfig `_ class by the user. The outputs of this op cannot be used by another op. Parameters ---------- probabilities: tensor<[\* , ProbT]> (Required) A tensor in the graph, which is used to compute the classifier output(s). This is the tensor whose values are mapped to the class labels and used for constructing the predicted class label and the output dictionary of class names and values. classes: list<\*, ClassT> (Required) List of classes. Returns ------- Dict[classT, probT] Attributes ---------- ProbT: fp32 ClassT: i64, str """ input_spec = InputSpec( probabilities=TensorInputType(type_domain=types.fp32), classes=ListInputType(const=True), ) def type_inference(self): # check the type of "classes" if not types.is_list(self.classes.sym_type): msg = "'classes' in the op 'classify' must be of type list. Instead it is {}." raise ValueError(msg.format(self.classes.sym_type.__type_info__())) # check the type of "probabilities" if self.probabilities.dtype != types.fp32: msg = "classify op: input probabilities must be of type fp32. Instead it is of type {}" raise TypeError(msg.format(self.probabilities.sym_type.get_primitive().__type_info__())) classes_elem_type = self.classes.elem_type if classes_elem_type not in {types.str, types.int64}: msg = "Type of elements in 'classes' in the op 'classify' must be either str or int64. Instead it is {}." raise ValueError(msg.format(classes_elem_type.__type_info__())) # check that the size of "classes" is compatible with the size of "probabilities" if not any_symbolic(self.probabilities.shape): size = np.prod(self.probabilities.shape) if len(self.classes.val) != size: msg = "In op 'classify', number of classes must match the size of the tensor corresponding to 'probabilities'." raise ValueError(msg) return classes_elem_type, types.dict(classes_elem_type, types.double) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS15/control_flow.py0000644000000000000000000007175714672066616026261 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy import numpy as np from coremltools import _logger as logger from coremltools.converters.mil.mil import Block, get_existing_symbol, get_new_symbol, types from coremltools.converters.mil.mil.input_type import ( DefaultInputs, InputSpec, InternalInputType, ListInputType, PyFunctionInputType, TensorInputType, TupleInputType, ) from coremltools.converters.mil.mil.operation import ( NONE, SYMBOL, VALUE, Operation, mil_list, precondition, ) from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs._utils import ( infer_type_with_broadcast, promoted_primitive_type, ) from coremltools.converters.mil.mil.types import is_compatible_type from coremltools.converters.mil.mil.types.type_list import list as types_list from coremltools.converters.mil.mil.types.type_mapping import ( builtin_to_string, is_subtype, numpy_type_to_builtin_type, numpy_val_to_builtin_val, ) @register_op class cond(Operation): """ Perform a conditional execution. 
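    Only one of the two branches is executed at runtime, depending on the value of ``pred``.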
The return types must be identical between the true and false branches. Parameters ---------- pred: tensor<[], bool> (Required) * 0-D tensor (scalar) predicate to switch between true and false branches. _true_fn: function (Required) * A Python function that executes if ``pred`` evaluates to ``True``. * It must take zero input (i.e, no input), and return one or more values whose type becomes the operation's return type. _false_fn: function (Required) * A Python function that executes if ``pred`` evaluates to ``False``. * It must take zero input (i.e. no input), and have return types that match those of the ``if`` branch. _existing_blocks: list[Block] (Optional) * Python list of ``Block``. * For internal use only. When converting a milproto, we already got existing blocks, and the ``build_nested_blocks`` function can use them directly. * When ``_existing_blocks`` is set, ``_true_fn`` and ``_false_fn`` must be dummy functions which returns ``None``. Returns ------- tuple * Python tuple of ``Variables`` from one of the branches. """ input_spec = InputSpec( pred=TensorInputType(type_domain=types.bool), _true_fn=PyFunctionInputType(), _false_fn=PyFunctionInputType(), _existing_blocks=InternalInputType(optional=True), ) def build_nested_blocks(self): # If the front end is milproto, we already have the well constructed cond/body block. # For this case, we set self.blocks directly. # We also check that _cond and _body are both dummy functions (return None). if self._existing_blocks is not None and self._existing_blocks.val is not None: assert self._true_fn.val([]) is None assert self._false_fn.val([]) is None self.blocks = self._existing_blocks.val return # Cond block true_block_name = self.name + "_true" with Block(name=true_block_name, outer_op=self) as true_block: true_func = self._true_fn.val true_ret_vars = true_func() if isinstance(true_ret_vars, tuple): true_ret_vars = list(true_ret_vars) if not isinstance(true_ret_vars, list): true_ret_vars = [true_ret_vars] true_block.set_outputs(true_ret_vars) self.blocks.append(true_block) false_block_name = self.name + "_false" with Block(name=false_block_name, outer_op=self) as false_block: false_func = self._false_fn.val false_ret_vars = false_func() if isinstance(false_ret_vars, tuple): false_ret_vars = list(false_ret_vars) if not isinstance(false_ret_vars, list): false_ret_vars = [false_ret_vars] false_block.set_outputs(false_ret_vars) self.blocks.append(false_block) def type_inference(self): true_ret_vars = self.blocks[0].outputs false_ret_vars = self.blocks[1].outputs # Verify true_ret_vars has the same types as false_ret_vars for i, (vt, vf) in enumerate(zip(true_ret_vars, false_ret_vars)): if not is_compatible_type(vt.sym_type, vf.sym_type): msg = ( "true branch output {} type {} mismatch false branch" + " output type {}" ) raise ValueError(msg.format(vt.name, vt.sym_type.__type_info__(), vf.sym_type.__type_info__())) return tuple(v.sym_type for v in true_ret_vars) def value_inference(self): if self.pred.val is None: raise NotImplementedError() if self.pred.val: return [v.val for v in self.blocks[0].outputs] return [v.val for v in self.blocks[1].outputs] class Const(Operation): """ A base class that returns constant values. Parameters ---------- val: const<\*,T> (Required) mode: immediate_value, file_value (Optional) * Determines how the constant value is stored in the internal MIL format. * For large constants such as convolution weights, use ``file_value``. * For smaller-size constants such as values of a stride, use ``immediate_value``. 
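      * For example (illustrative): a large convolution weight tensor would typically be stored as ``file_value``, whereas a small constant such as a stride vector ``[1, 1]`` would use ``immediate_value``.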
Returns ------- const<\*,T> Attributes ---------- T: fp16, fp32, i32, str, bool """ input_spec = InputSpec( val=InternalInputType(const=True), ) def __init__(self, **kwargs): super(Const, self).__init__(**kwargs) self._weight_id = None def type_inference(self): builtin_type, _ = self._get_type_val(self.val.val) return builtin_type def value_inference(self): _, val = self._get_type_val(self.val.val) return val def _get_type_val(self, value): int32_max = np.int32(np.iinfo(np.int32).max) int32_min = np.int32(np.iinfo(np.int32).min) if isinstance(value, (float, np.float64)): value = np.float32(value) elif isinstance(value, bool): pass elif isinstance(value, (int, np.int64)): if value > int32_max: value = int32_max elif value < int32_min: value = int32_min else: value = np.int32(value) elif isinstance(value, (tuple, list, np.ndarray)): value = np.array(value) if isinstance(value, (tuple, list)) else value if value.dtype in [np.uint64, np.int64]: logger.debug( f"Downcast const op {self.name} data {builtin_to_string(numpy_type_to_builtin_type(value.dtype))} as int32" ) value_clip_max = np.where(value > int32_max, int32_max, np.int32(value)) value_clip_min = np.where(value_clip_max < int32_min, int32_min, np.int32(value_clip_max)) value = value_clip_min if value.dtype == np.float64: logger.debug(f"Downcast const op {self.name} data fp64 as fp32") value = value.astype(np.float32) elif isinstance(value, mil_list): # If val that was passed in is of type mil_list, which is just a wrapper on top of # python list, then construct the list type. list_value = value.ls if len(list_value) == 0: raise ValueError("'mil_list' points to an empty list") builtin_elem_type, _ = self._get_type_val(list_value[0]) # mil_list is a special case that we want to preserve the int64 element type if isinstance(list_value[0], np.int64): builtin_elem_type = types.int64 builtin_type = types_list(builtin_elem_type, init_length=len(list_value), dynamic_length=False) return builtin_type, value if not isinstance(value, (np.generic, np.ndarray, str, bool, mil_list)): raise ValueError(f"Unknown value for constant: {value}") _, builtin_type = numpy_val_to_builtin_val(value) return builtin_type, value @property def weight_id(self) -> int: """ Weight id for the const. It is used for weight sharing across multiple functions. Constants sharing the same weight_id will use the same blob file value when lowering to milproto. """ return self._weight_id @weight_id.setter def weight_id(self, val: int) -> None: """ Set weight id for the const. """ assert isinstance(val, int), f"weight_id must be type of int. Got {type(val)}." assert self._weight_id is None, f"cannot set {self.name} weight_id twice." self._weight_id = val @register_op class const(Const): def __init__(self, **kwargs): super().__init__(**kwargs) # Internal const can have symbolic value (for testing purpose) @register_op class _const_symbolic(const): def __init__(self, **kwargs): super().__init__(**kwargs) def type_inference(self): builtin_type, _ = self._get_type_val(self.val.sym_val) return builtin_type def value_inference(self): # We allow symbolic values in _const_symbolic _, val = self._get_type_val(self.val.sym_val) return val @register_op class select(Operation): """ Return the elements selected from either ``a`` or ``b`` depending on the ``cond``. You must provide ``a``, ``b`` and ``cond``. The shape of ``cond``, ``a``, and ``b`` must be broadcastable. Parameters ---------- cond: tensor<[\*D1], B> (Required) * Tensor. 
When ``True``, select element from ``x``, otherwise, ``y``. a: tensor<[\*D2], T> (Optional) * Values selected at indices where ``cond`` is ``True``. * Default is ``None``. b: tensor<[\*D3], T> (Optional) * Values selected at indices where ``cond`` is ``False``. * Default is ``None``. Returns ------- tensor<[\*D_out], T> or tensor<[n, len(D1)], int32> * If ``a, b`` are both provided, the return shape is based on broadcast rules from ``cond, a, b``. * If ``a, b`` are ``None``, the return shape is 2-D, where the first dimension ``n`` is the number of matching indices in ``cond``, and ``len(D1)`` is the rank of ``cond``. Attributes ---------- B: bool T: fp16, fp32, i32, bool """ input_spec = InputSpec( cond=TensorInputType(type_domain=types.bool), a=TensorInputType(type_domain="T"), b=TensorInputType(type_domain="T") ) type_domains = { "T": (types.fp16, types.fp32, types.bool, types.int32), } def type_inference(self): typea = self.a.sym_type typeb = self.b.sym_type primitive_type = promoted_primitive_type(typea, typeb) if primitive_type is None: raise ValueError("Incompatible primitive types in broadcast operation") return infer_type_with_broadcast(typea, typeb, primitive_type) @precondition(allow=VALUE) def value_inference(self): res = np.where(self.cond.val, self.a.val, self.b.val) sym_type = self.type_inference() if types.is_scalar(sym_type) and not np.isscalar(res): res = getattr(np, str(res.dtype))(res.item()) return res @register_op class while_loop(Operation): """ Perform the body repeatedly while the condition ``cond`` is true. Parameters ---------- _cond: function (Required) * A Python function that takes ``loop_vars`` as positional arguments. * The function must return a ``bool`` ``Var``. _body: function (Required) * A Python function that takes ``loop_vars`` as positional arguments. * The function must return the same number of output vars as ``loop_vars`` with the same types. loop_vars: tuple (Required) * Python tuple of ``Variables``. _existing_blocks: list[Block] (Optional) * Python list of ``Block``. * For internal use only. When converting a milproto, we already got existing blocks, and the ``build_nested_blocks`` function can use them directly. * When ``_existing_blocks`` is set, ``_cond`` and ``_body`` must be dummy functions which returns ``None``. Returns ------- tuple * Python tuple (same type as ``loop_vars``). """ input_spec = InputSpec( # arg name with underscore prefix won't be printed. 
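        # A minimal usage sketch (illustrative assumption, not part of this spec): `mb`
        # denotes the MIL op builder, and `i0` / `acc0` are previously built int32
        # variables; neither is defined in this file. The loop counts `i` up to 10 while
        # accumulating into `acc`:
        #
        #     def cond(i, acc):
        #         return mb.less(x=i, y=10)
        #
        #     def body(i, acc):
        #         return mb.add(x=i, y=1), mb.add(x=acc, y=i)
        #
        #     i_final, acc_final = mb.while_loop(_cond=cond, _body=body, loop_vars=(i0, acc0))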
_cond=PyFunctionInputType(), _body=PyFunctionInputType(), loop_vars=TupleInputType(), _existing_blocks=InternalInputType(optional=True), ) @staticmethod def _check_equal_value(val1, val2): if val1 is None and val2 is None: return True if val1 is None or val2 is None: return False if isinstance(val1, np.ndarray) and isinstance(val2, np.ndarray): return np.array_equal(val1, val2) return val1 == val2 @staticmethod def _clean_up_child_ops(block): for op in block.operations: for b in op.blocks: while_loop._clean_up_child_ops(b) inputs = op.get_flattened_inputs() for in_var in inputs: in_var.remove_child_op(op) def _build_block(self, block_inputs): # Cond block: block_name = self.name + '_cond_block' with Block(block_inputs=block_inputs, outer_op=self, name=block_name) as cond_block: cond_func = self._cond.val cond_var = cond_func(*cond_block.inputs) cond_vars = cond_var if isinstance(cond_var, list) else [cond_var] cond_block.set_outputs(cond_vars) # Body block block_name = self.name + '_body_block' with Block(block_inputs=block_inputs, outer_op=self, name=block_name) as body_block: body_func = self._body.val exit_vars = body_func(*body_block.inputs) exit_vars = list(exit_vars) if isinstance(exit_vars, (list, tuple)) \ else [exit_vars] body_block.set_outputs(exit_vars) return cond_block, body_block, exit_vars def build_nested_blocks(self): # self.loop_vars is python tuple of Vars. # block_inputs Var are not produced by any op. # We assume block_inputs have the same types as self.loop_var. If not # (e.g., when certain dimensions change shape during iterate), we'd # adjust later. # We assume that sym_val is unchanging across the block iterate. If it # changes, we rebuild the block and rerun type and value inference. # Design notes on two blocks (cond and body): # # - Observe that two blocks can always be represented as a single # block that contains both cond and body logic, which would return # [loop_cond] + loop_carries. `loop_cond` is a bool. # # - Observe that single block implies a do-while logic, # in which the first iterate is always executed. It's possible to add # a cond input to while_loop to modify do-while behavior: # # %first_cond = cond_logic(...) # while_loop(cond=%first_cond, loop_vars=(...)) # # and we enter the first iterate only if cond is True. But this would # require caller to execute cond logic outside of while_loop first # (which also needs to be duplicated within the loop), # resulting in duplicated code / ops. # # - Thus, single block is unnatural for the natural execution order, # in which we execute the cond block first to get the loop_cond. Only # if `loop_cond` is True do we execute the body block. This is the # semantics of tf.while_loop. # If the front end is milproto, we already have the well constructed cond/body block. # For this case, we set self.blocks directly. # We also check that _cond and _body are both dummy functions (return None). 
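        # Otherwise (the common path), the blocks are built from scratch below:
        # loop_vars are copied into detached block inputs, the cond/body blocks are
        # built once, and if the body changes a loop variable's type or value the
        # block input types are relaxed and both blocks are rebuilt.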
if self._existing_blocks is not None and self._existing_blocks.val is not None: assert self._cond.val([]) is None assert self._body.val([]) is None self.blocks = self._existing_blocks.val return block_inputs = tuple(copy.copy(v) for v in self.loop_vars) name_count = {v.name: 0 for v in block_inputs} for v in block_inputs: v._op = None v.op_output_idx = None v._child_ops = list() # Get unique name old_v_name = v.name v.name = v.name + "_x" + str(name_count[v.name]) name_count[old_v_name] += 1 v._sym_val = v._sym_val v.consuming_blocks = list() cond_block, body_block, exit_vars = self._build_block(block_inputs) # Verify exit_vars has the same types as loop_vars block_input_type_change = False for i, (v_in, v_out) in enumerate(zip(block_inputs, exit_vars)): if not is_subtype(v_out.sym_type, v_in.sym_type): compat_shape = while_loop.get_compat_shape(v_out.sym_type, v_in.sym_type) if compat_shape is None: msg = "loop_vars '{}' changes in the body of " \ "while_loop '{}':\n {} -> {}" raise ValueError(msg.format( v_in.name, self.name, v_in.sym_type, v_out.sym_type)) else: block_inputs[i]._sym_type = types.tensor( v_in.dtype, compat_shape) block_input_type_change = True if not while_loop._check_equal_value(v_out.sym_val, v_in.sym_val): block_inputs[i]._sym_val = None block_input_type_change = True if block_input_type_change: # Since we are going to build the block again, we first need to remove ops # in the block from vars's _child_ops. while_loop._clean_up_child_ops(cond_block) while_loop._clean_up_child_ops(body_block) # Rebuild our block to invoke type inference. cond_block, body_block, exit_vars = self._build_block(block_inputs) for i, (v_in, v_out) in enumerate(zip(block_inputs, exit_vars)): if not is_subtype(v_out.sym_type, v_in.sym_type): msg = 'Block output {}: {} is not a subtype of ' +\ 'block input {}: {} after factoring shape changes' raise ValueError(msg.format(v_out.name, v_out.sym_type.__name__, v_in.name, v_in.sym_type.__name__)) if not while_loop._check_equal_value(v_out.sym_val, v_in.sym_val): msg = 'Block output {}: {} is not equal to ' +\ 'block input {}: {} after value changes' raise ValueError(msg.format(v_out.name, v.sym_val, v_in.name, v_in.sym_val)) self.blocks.append(cond_block) self.blocks.append(body_block) @staticmethod def get_compat_shape(type1, type2): """ For tensor types `type1`, `type2` that are of the same rank, return compat_shape (python list) where compat_shape[i] is integer iff type1 and type2 have the same integer shape on dim i. compat_shape[i] is symbolic otherwise. Return None if `type1`, `type2` have different rank or non-tensor type. """ if not types.is_tensor(type1) or not types.is_tensor(type2): return None s1 = type1.get_shape() s2 = type2.get_shape() if len(s1) != len(s2): return None compat_shape = [] for d1, d2 in zip(s1, s2): if d1 != d2: compat_shape.append(get_new_symbol()) else: compat_shape.append(d1) return compat_shape def type_inference(self): # Skip the conditional var return tuple(v.sym_type for v in self.blocks[1].outputs) @register_op class make_list(Operation): """ Create a list of tensor elements. The elements should have the same shape. The list is similar to an auto-resizing array. Parameters ---------- init_length: (Optional, Default=1) * Initial length for the list. * If ``dynamic_length`` is ``False``, ``init_length`` is the fixed length of the list throughout runtime. dynamic_length: (Optional, Default is True) elem_shape: Tuple[const] (Required) * 1-D vector denoting the shape of elements. 
* If ``T = int32``, the element shape is known at compile time. * ``T = string`` denotes the symbolic shape, in which the shape is determined at runtime. * If not provided, the resulting ``List`` won’t have the elementary shape info, which may cause backend errors. Remedy this with SSA passes. dtype: const (Optional, Default is fp32) * Possible values: ``{"bool", "fp16", "fp32", "int32"}`` * Element tensor’s ``dtype``. Returns ------- List[*] Attributes ---------- T: i32, string """ input_spec = InputSpec( init_length=TensorInputType(optional=True, type_domain=types.int32), dynamic_length=TensorInputType(const=True, optional=True, type_domain=types.bool), elem_shape=TupleInputType(), dtype=TensorInputType(const=True, optional=True, type_domain=types.str), ) def default_inputs(self): return DefaultInputs( init_length=1, dynamic_length=True, dtype="fp32", ) def type_inference(self): builtin_dtype = types.string_to_builtin(self.dtype.val) if builtin_dtype is None: raise ValueError("Unsupported dtype {}".format(self.dtype.val)) # Replace string with symbol elem_shape_sym = [] for s_var in self.elem_shape: # s is str or int s = s_var.val if s is None: msg = 'make_list elem_shape must be tuple of const. ' +\ 'Tuple elem {} is not' raise ValueError(msg.format(s_var.name)) if isinstance(s, str): try: symbol = get_existing_symbol(s) except ValueError: # Must be a new symbol symbol = get_new_symbol(s) elem_shape_sym.append(symbol) else: elem_shape_sym.append(s) elem_type = types.tensor(builtin_dtype, elem_shape_sym) return types.list( elem_type, init_length=self.init_length.val, dynamic_length=self.dynamic_length.val, ) @register_op class list_length(Operation): """ Return the length of ``ls``. Parameters ---------- ls: List[*] (Required) Returns ------- * Length of ``ls``. Attributes ---------- T: fp16, fp32, i32, bool """ input_spec = InputSpec(ls=ListInputType(),) def type_inference(self): return types.int32 @precondition(allow=VALUE | SYMBOL | NONE) def value_inference(self): if not self.ls.dynamic_length: return self.ls.init_length raise NotImplementedError() @register_op class list_write(Operation): """ Write a value into index ``index`` of ``ls``. Parameters ---------- ls: List (Required) index: (Required) * Size of the list. value: <*,T> (Optional) * Element value to write, which must match the element shape of ``ls``. * Default is ``None``. Returns ------- List[*] Attributes ---------- T: fp16, fp32, i32, bool """ input_spec = InputSpec( ls=ListInputType(), index=TensorInputType(type_domain=types.int32), value=TensorInputType(type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32, types.bool, types.int32), } def type_inference(self): list_elem_type = self.ls.elem_type value_type = self.value.sym_type dynamic_length = self.ls.dynamic_length init_length = self.ls.init_length if list_elem_type is None: # fill in the elem type using value's type info. return types.list( value_type, init_length=init_length, dynamic_length=dynamic_length ) if list_elem_type == types.unknown: msg = "Input ls elem type unknown. Override with {}" logger.warning(msg.format(value_type)) return types.list( value_type, init_length=init_length, dynamic_length=dynamic_length ) if not types.is_subtype(value_type, list_elem_type): msg = "Elem type mismatch: ls elem type {} vs " + "value type {}" raise ValueError(msg.format(list_elem_type.__type_info__(), value_type.__type_info__())) return self.ls.sym_type @register_op class list_read(Operation): """ Read the value at location ``index`` of ``ls``. 
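    Typically used together with ``make_list`` and ``list_write``: values written into the list with ``list_write`` can later be retrieved by index with ``list_read``.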
Parameters ---------- ls: List[\*] (Required) index: (Required) * Size of the list. Returns ------- <\*,T> * The element's value. Attributes ---------- T: fp16, fp32, i32, bool """ input_spec = InputSpec( ls=ListInputType(), index=TensorInputType(type_domain=types.int32), ) def type_inference(self): list_elem_type = self.ls.elem_type if list_elem_type is None: msg = ( "Unknown element type. The List might not have been " + "written to ({})" ) raise ValueError(msg.format(self.name)) return list_elem_type @register_op class list_gather(Operation): """ Return selected values in ``ls`` as a packed ``Tensor``. Parameters ---------- ls: List[\*] (Required) indices: (Required) * Gather from indices, whose element must be in ``[0, ls.length)`` at runtime. Returns ------- <\*K,T> * Selected tensors packed into a ``len(ls.elem_shape)+1`` rank tensor. * ``K[0] == len(indices)``. Attributes ---------- T: fp16, fp32, i32, bool """ input_spec = InputSpec( ls=ListInputType(), indices=TensorInputType(type_domain=types.int32), ) def type_inference(self): list_elem_type = self.ls.elem_type if list_elem_type == types.unknown: msg = ( "Unknown element type. The List might not have been " + "written to ({})" ) raise ValueError(msg.format(self.name)) elem_shape = list_elem_type.get_shape() dtype = list_elem_type.get_primitive() ret_shape = [self.indices.shape[0]] + list(elem_shape) return types.tensor(dtype, tuple(ret_shape)) @register_op class list_scatter(Operation): """ Scatter ``values`` to ``ls`` at locations ``indices``. Parameters ---------- ls: List[*] (Required) indices: tensor (Required) * Indices of ``ls`` to scatter to. * Elements of ``indices`` must be in ``[0, ls.length)`` at runtime. * If indices are greater than or equal to the list length, the list is dynamically resized. value: <*,T> (Optional) * Element value to write, which must match the element shape of ``ls``. * Default is ``None``. Returns ------- List[*] * Updated list. Attributes ---------- T: fp16, fp32, i32, bool """ input_spec = InputSpec( ls=ListInputType(), indices=TensorInputType(type_domain=types.int32), value=TensorInputType(type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32, types.bool, types.int32), } def type_inference(self): num_indices = self.indices.shape[0] num_values = self.value.shape[0] if num_values != num_indices: raise ValueError( "Cannot scatter {} values to {} indices".format(num_values, num_indices) ) list_elem_type = self.ls.elem_type value_type = self.value.sym_type dynamic_length = self.ls.dynamic_length init_length = self.ls.init_length elem_type = types.tensor(value_type.get_primitive(), value_type.get_shape()[1:]) if list_elem_type == types.unknown: # fill in the elem type using value's type info. return types.list( elem_type, dynamic_length=dynamic_length, init_length=init_length ) if not types.is_subtype(elem_type, list_elem_type): msg = "Elem type mismatch: ls elem type {} vs " + "value type {}" raise ValueError(msg.format(list_elem_type, elem_type)) return self.ls.sym_type ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS15/conv.py0000644000000000000000000004070514672066616024504 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import Operation, types from coremltools.converters.mil.mil.block import curr_opset_version from coremltools.converters.mil.mil.input_type import (DefaultInputs, InputSpec, TensorInputType) from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs._utils import \ spatial_dimensions_out_shape from coremltools.converters.mil.mil.ops.defs.iOS15 import _IOS15_TARGET @register_op class conv(Operation): """ Perform convolution over input. Supports 1-D, 2-D, and 3-D convolution. Parameters ---------- x: tensor<[n, C_in, \*d_in], T> (Required) * ``d_in`` are (possibly runtime-determined) spatial dimensions. For example, ``d_in = [224, 224]`` for 2D convolution. * ``1 <= len(d_in) <= 3``. * ``C_in`` is the number of input channels or depth dimensions. * ``n`` is the batch dimension. weight: tensor<[C_out, C_in/groups, \*K], T> (Required) * Filter weights. * ``C_in`` is the number of input channels. * ``C_in`` must be divisible by ``groups``. * ``K`` are kernel sizes. For example, ``K = [KH, KW]`` for 2-D convolution. * When ``dilations`` is not all ``1``, ``weight`` has to be ``const`` at compile time strides: const tensor<[S], i32> (Optional) * Default to one vector of length equal to the number of spatial dimensions. * Strides along each of the spatial dimensions. * ``S == len(d_in)``. pad_type: const str (Required) Must be one of the following: * ``valid``: No padding. This is equivalent to custom pad with ``pad[2*i] == pad[2*i+1] == 0, for i=0,...,len(d_in)-1``. * ``custom``: Specify custom padding in the parameter ``pad``. * ``same``: Input is padded such that out spatial shapes are ``d_out[i] = ceil(d_in[i] / strides[i])``. * ``same_lower``: Similar to ``same`` but the padding will place extra rows/cols on the top/left if the padding amount is odd. Specifically, for ``i = 0,..,,len(d_in)-1``, the equivalent paddings are calculated as follows: * ``dilated_kernel = (K[i] - 1) * dilate[i] + 1`` * If ``dilated_kernel`` is odd, ``padding[2*i] = padding[2*i+1] = floor(dilated_kernel / 2)`` * Otherwise: ``padding[2*i] = ceil((dilated_kernel - 1) / 2)``, ``padding[2*i+1] = floor((dilated_kernel - 1) / 2)`` pad: const tensor<[P], i32> (Optional. Default to all zeros) * ``len(P) = 2 * len(d_in)`` * ``pad`` should be specified if and only if ``pad_type == custom``, otherwise errors occur. * ``pad`` represents the number of elements to pad before and after each dimension. Specifically, ``pad[0], pad[1]`` are the pad size before / after spatial dimension 0, ``pad[2], pad[3]`` are the pad size before / after spatial dimension 1, etc. dilations: const tensor<[S], i32> (Optional. Default to all 1s) * Dilation value along each spatial dimension in ``d_in``. See `visualization `_. * ``S == len(d_in)``. groups: const tensor<[], i32> (Optional, default to 1) * Input and output channels are split by ``groups``. * ``C_in`` must be divisible by ``groups``. * Maximum value for group is ``C_in``, in which case it is a depthwise convolution. For examples (assuming ``C_in = 16, C_out = 32``): * ``groups == 1``, ``weight`` has shape ``[32, 16, KH, KW]``: All input channels are convolved with the ``weight`` kernel to produce all output channels. 
* ``groups == 2``, ``weight`` has shape ``[32, 8, KH, KW]``: Input channels 0~7 are convolved with half of the ``weight`` kernel to produce output channels 0~15. Similarly, input channels 8~15 are convolved with the other half of ``weight`` to product output channels 16~31. * ``groups == C_in``, ``weight`` has shape ``[32, 1, KH, KW]``: Each input channel is convolved with its own set of filters and each produce ``C_out / C_in = 2`` channels. This is equivalent to depthwise convolution. bias: const tensor<[C_out],T> (Optional, default to all 0) * Bias along output channels. Returns ------- tensor<[n, C_out, \*d_out], T> * Output activation has the same rank and spatial dimension as the input. That is, ``len(d_out) == len(d_in)``. * For ``i=0,..,len(d_in)-1, d_out[i] = floor [(D_in[i] + pad[2*i] + pad[2*i+1] - (K[i]-1)*dilations[i] - 1) / strides[i] ] + 1``. Attributes ---------- T: fp16, fp32 See Also -------- conv_transpose """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), weight=TensorInputType(type_domain="T"), bias=TensorInputType(const=True, optional=True, type_domain="T"), strides=TensorInputType(const=True, optional=True, type_domain=types.int32), pad_type=TensorInputType(const=True, optional=True, type_domain=types.str), pad=TensorInputType(const=True, optional=True, type_domain=types.int32), dilations=TensorInputType(const=True, optional=True, type_domain=types.int32), groups=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): num_spatial_dims = self.x.rank - 2 return DefaultInputs( bias=None, strides=[1]*num_spatial_dims, pad_type="valid", pad=[0]*num_spatial_dims*2, dilations=[1]*num_spatial_dims, groups=1, ) def type_inference(self): inshape = self.x.shape f_shape = self.weight.shape kernel_shape = f_shape[2:] C_out = f_shape[0] C_in = self.x.shape[1] groups = self.groups.val if self.bias is not None and (len(self.bias.shape) > 1 or self.bias.shape[0] != C_out): msg = "# of bias values {} not equal to # output channels {}" raise ValueError(msg.format(self.bias.shape[0], C_out)) if C_in % groups != 0: msg = "# of input channels {} not divisible by groups {}" raise ValueError(msg.format(C_in, groups)) if C_in // groups != self.weight.shape[1]: msg = "C_in / groups = {}/{} != weight[1] ({})" raise ValueError(msg.format(C_in, groups, self.weight.shape[1])) strides = self.strides.val dilations = self.dilations.val # The same_lower padding is not supported in iOS15 if curr_opset_version() == _IOS15_TARGET and self.pad_type.val == "same_lower": msg = "iOS15 version of conv does not support pad_type = `same_lower`" raise ValueError(msg) # Ignore self.pad if pad_type != custom custom_pad = None if self.pad_type.val != 'custom' else self.pad.val is_weight_dynamic = not self.weight.is_descendant_of_const if is_weight_dynamic and any([True if d > 1 else False for d in dilations]): raise ValueError("Convolution with dynamic weights does not support dilations!") N = inshape[0] C_out = f_shape[0] # spatial dimensions d_out_shape = spatial_dimensions_out_shape( pad_type=self.pad_type.val, input_shape=inshape[2:], kernel_shape=kernel_shape, strides=strides, dilations=dilations, custom_pad=custom_pad, ) retshape = [N, C_out] + d_out_shape return types.tensor(self.x.dtype, tuple(retshape)) @register_op class conv_quantized(conv): """ Note: This is experimental and may change in the future. Supports weight quantization for parameters while performing convolution over input. 
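    The effective floating-point weights used by the convolution are reconstructed from the stored quantized weights by the affine relation below. For example (illustrative values), a quantized weight of ``200`` with ``scale = 0.02`` and ``bias = -2.0`` maps to ``200 * 0.02 - 2.0 = 2.0``.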
``W_float = W_quantized * scale + bias``. Parameters ---------- In addition to convolutional layer parameters, the following additional parameters are required. quantization_type: const str (Required) * One of ``linear``, or ``lut``. nbits: const tensor<[], i32> (Optional. Default to 8) * Denotes the bit-width of the quantization. ``1 <= nbits <= 8``. quant_scale: tensor<*?, T> (Required) * Denotes the scale of quantization. quant_bias: tensor<*?, T> (Required) * Denotes the bias that is used to quantize/dequantize. Returns ------- tensor<[n, C_out, *d_out], T> * Output activation has the same rank and spatial dimension as the input. That is, ``len(d_out) == len(d_in)``. Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), weight=TensorInputType(type_domain="U"), bias=TensorInputType(const=True, optional=True, type_domain="U"), quantization_type=TensorInputType(const=True, type_domain=types.str), nbits=TensorInputType(const=True, optional=True, type_domain=types.int32), quant_scale=TensorInputType(const=True, type_domain="T"), quant_bias=TensorInputType(const=True, type_domain="T"), strides=TensorInputType(const=True, optional=True, type_domain=types.int32), pad_type=TensorInputType(const=True, optional=True, type_domain=types.str), pad=TensorInputType(const=True, optional=True, type_domain=types.int32), dilations=TensorInputType(const=True, optional=True, type_domain=types.int32), groups=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "T": (types.fp32, types.fp16), "U": (types.uint8,), } def default_inputs(self): return super().default_inputs() + \ DefaultInputs( nbits=8, ) @register_op class conv_transpose(Operation): """ Perform transposed convolution (also known as deconvolution and fractionally stride convolution) over input. ``conv_transpose`` can also be used to compute the gradient of conv. Supports 1-D, 2-D, and 3-D convolution. Parameters ---------- x: tensor<[n,C_in,*D_in],T> (Required) * Input data. * ``D_in`` are spatial dimensions. * ``1 <= len(D_in) <= 3``. * ``C_in`` is the number of input channels. weight: const tensor<[C_in,C_out/groups,*D_in], T> (Required) * Filter weights. ``C_in, C_out`` are the number of input and output channels respectively. * ``D_in`` are spatial dimensions. ``1 <= len(D_in) <= 2``. bias: const tensor<[C_out],T> (Optional, default to all 0) * Bias added along output channels. pad: const tensor<[P],i32> (Optional, default to all 0s) * Number of elements to pad before and after each dimension. * ``P == 2 * len(D_in)``. * ``pad[2*i], pad[2*i+1]`` are pad sizes before and after dimension ``i``, where ``0 <= i < len(D_in)``. output_shape: const tensor<[P],i32> (Optional, default None) * Expected output shape. The first two dimensions must be ``[n, C_out]``. * The output shape of ``conv_transpose`` is underdetermined in general, because ``conv`` can map multiple input shapes to a single output shape. For example, for ``same`` padding mode, ``conv_out = ceil(conv_in/stride)``. Hence we need ``output_shape`` when this occurs. pad_type: const tensor<[P],i32> (Optional, default valid) * One of ``same``, ``valid``, or ``custom``. strides: const tensor<[S],i32> (Optional. Default to all 1s) * Stride along each of the spatial dimensions. * ``S == len(D_in)``. dilations: const tensor<[S],i32> (Optional. Default to all 1s) * Dilation value along each spatial dimension in ``d_in``. See ``conv``. * ``S == len(D_in)``. groups: const tensor<[], i32> (Optional. 
Default to 1) * Input and output channels are separated into ``groups``. * ``C_in`` and ``C_out`` must be divisible by the number of groups. See ``conv`` for examples. Returns ------- tensor<[n,C_out,*D_out],T> * If ``output_shape`` is not ``None``: ``Dout = output_shape`` * If ``pad_type == "custom"``: ``Dout[i] = (D_in[i]-1)*stride[i] + (K[i]-1) * dilation[i] + 1 - pad[2*i] - pad[2*i-1]`` * If ``pad_type == "valid"``: ``Dout[i] = (D_in[i]-1)*stride[i] + (K[i]-1) * dilation[i] + 1`` * If ``pad_type == "same"``: ``Dout[i] = D_in[i] * stride[i]`` Attributes ---------- T: fp16, fp32 See Also -------- conv """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), # [n, C_in, spatial_dims] weight=TensorInputType(const=True, type_domain="T"), # [C_out, C_in, spatial_dims] bias=TensorInputType(const=True, optional=True, type_domain="T"), pad=TensorInputType(const=True, optional=True, type_domain=types.int32), output_shape=TensorInputType(const=True, optional=True, type_domain=types.int32), pad_type=TensorInputType(const=True, optional=True, type_domain=types.str), strides=TensorInputType(const=True, optional=True, type_domain=types.int32), dilations=TensorInputType(const=True, optional=True, type_domain=types.int32), groups=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): num_spatial_dims = self.x.rank - 2 return DefaultInputs( bias=None, pad=[0]*2*num_spatial_dims, output_shape=None, pad_type="valid", strides=[1]*num_spatial_dims, dilations=[1]*num_spatial_dims, groups=1, ) def type_inference(self): # Input shape is [n, C_in, spatial_dims] in_shape = self.x.shape # Weight shape is [C_in, C_out/group, spatial_dims] f_shape = self.weight.shape kernel_shape = f_shape[2:] spatial_dim_rank = len(in_shape) - 2 N = in_shape[0] C_in = self.x.shape[0] groups = self.groups.val C_out = f_shape[1] * groups if self.bias is not None and self.bias.val.shape[0] != C_out: msg = "# of bias values {} not equal to # output channels {}" raise ValueError(msg.format(self.bias.val.shape[0], C_out)) if C_out % groups != 0: msg = "# of input channels {} not divisible by groups {}" raise ValueError(msg.format(C_in, groups)) # If output shape is given, return it if self.output_shape is not None: output_shape = self.output_shape.val assert output_shape[0] == N assert output_shape[1] == C_out return types.tensor( self.x.dtype, tuple(output_shape) ) strides = self.strides.val dilations = self.dilations.val kernel_shape = [ (kernel_shape[r] - 1) * dilations[r] + 1 for r in range(spatial_dim_rank) ] D_in = in_shape[2:] # spatial dimensions # Deconv's output shape is non-deterministic, we follow TF shape logic here. 
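        # Illustrative example (values assumed, not from the original source): with
        # D_in = [5], kernel = [3], stride = [2], dilation = [1]:
        #   "same"   -> d_out = 2 * 5 = 10
        #   "valid"  -> d_out = 2 * (5 - 1) + 3 = 11
        #   "custom" with pad = [1, 1] -> d_out = 2 * (5 - 1) + 3 - 1 - 1 = 9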
if self.pad_type.val == "same": d_out_shape = [strides[r] * D_in[r] for r in range(spatial_dim_rank)] elif self.pad_type.val == "valid": d_out_shape = [ strides[r] * (D_in[r]-1) + kernel_shape[r] for r in range(spatial_dim_rank) ] elif self.pad_type.val == "custom": if self.pad is None: raise ValueError("self.pad must exist if pad_type is custom") pad = self.pad.val d_out_shape = [ strides[r] * (D_in[r] - 1) + kernel_shape[r] - pad[2 * r] - pad[2 * r + 1] for r in range(spatial_dim_rank) ] retshape = [N, C_out] + d_out_shape return types.tensor(self.x.dtype, tuple(retshape)) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS15/elementwise_binary.py0000644000000000000000000003522614672066616027426 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import operator import numpy as np from coremltools.converters.mil.mil import ( InputSpec, Operation, TensorInputType, precondition, types, ) from coremltools.converters.mil.mil.operation import VALUE from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs._utils import ( infer_type_with_broadcast, promoted_primitive_type, ) class elementwise_binary(Operation): """ Elementwise Binary Op Superclass """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), y=TensorInputType(type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), } def type_inference(self): typea = self.x.sym_type typeb = self.y.sym_type primitive_type = promoted_primitive_type(typea, typeb) if primitive_type is None: raise ValueError("Incompatible primitive types in broadcast operation") primitive_type = self.get_dtype(primitive_type) return infer_type_with_broadcast(typea, typeb, primitive_type) @precondition(allow=VALUE) def value_inference(self): return self._cast_check_value_inferene(self.x.val, self.y.val) def get_operator(self): """ All subclasses have to implement this. """ raise NotImplementedError() def get_dtype(self, promoted_dtype): """ Override if output primitive type is different from input types (e.g., less, greater) """ return promoted_dtype def _cast_check_value_inferene(self, a, b): """ If one of the input is tensor, cast the result to tensor. """ to_cast = any([isinstance(x, np.ndarray) for x in [a, b]]) result = self.get_operator()(a, b) return result if not to_cast else np.array(result) class elementwise_binary_logical(elementwise_binary): """ Elementwise Binary Logical Op Superclass """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), y=TensorInputType(type_domain="T"), ) type_domains = { "T": (types.bool,), } """ Elementwise Binary Op Implementation(s) """ @register_op class add(elementwise_binary): """ Return ``x + y`` element-wise with `broadcasting `_. Parameters ---------- x: <\*,T> (Required) * Shape must be compatible with ``y`` in broadcast. y: <\*,T> (Required) * Shape must be compatible with ``x`` in broadcast. Returns ------- <\*,T> Attributes ---------- T: fp16, fp32, i32 """ def get_operator(self): return operator.add @register_op class equal(elementwise_binary): """ Return the truth value of ``x == y`` element-wise with `broadcasting `_ (``1`` for true, ``0`` for false in numeric domain). 
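    For example (illustrative): ``x = [1, 2, 3]`` and ``y = [1, 0, 3]`` yield ``[True, False, True]``.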
Parameters ---------- x: <\*,T> (Required) * Shape must be compatible with ``y`` in broadcast. y: <\*,T> (Required) * Shape must be compatible with ``x`` in broadcast. Returns ------- <\*, bool> * A boolean tensor with the same shape as the inputs. Attributes ---------- T: fp16, fp32, i32 """ def get_operator(self): return np.equal def get_dtype(self, promoted_dtype): return types.bool @register_op class floor_div(elementwise_binary): """ Return ``x / y`` element-wise with `broadcasting `_, rounded towards negative infinity. Parameters ---------- x: tensor<\*, T> (Required) * Shape must be compatible with ``y`` in broadcast. y: tensor<\*, T> (Required) * Shape must be compatible with ``x`` in broadcast. Returns ------- tensor<\*, T> * A tensor of the same type and shape as the inputs. Attributes ---------- T: fp16, fp32, i32 """ def get_operator(self): return operator.floordiv @register_op class greater(elementwise_binary): """ Return the truth value of ``x > y`` element-wise with `broadcasting `_ (``1`` for true, ``0`` for false in numeric domain). Parameters ---------- x: tensor<\*, T> (Required) * Shape must be compatible with ``y`` in broadcast. y: tensor<\*, T> (Required) * Shape must be compatible with ``x`` in broadcast. Returns ------- tensor<\*, bool> * A boolean tensor with the same shape as the inputs. Attributes ---------- T: fp16, fp32, i32 """ def get_operator(self): return operator.gt def get_dtype(self, promoted_dtype): return types.bool @register_op class greater_equal(elementwise_binary): """ Return the truth value of ``x >= y`` element-wise with `broadcasting `_ (``1`` for true, ``0`` for false in numeric domain). Parameters ---------- x: tensor<\*, T> (Required) * Shape must be compatible with ``y`` in broadcast. y: tensor<\*, T> (Required) * Shape must be compatible with ``x`` in broadcast. Returns ------- tensor<\*?, bool> * A boolean tensor with the same shape as the inputs. Attributes ---------- T: fp16, fp32, i32 """ def get_operator(self): return operator.ge def get_dtype(self, promoted_dtype): return types.bool @register_op class less(elementwise_binary): """ Return the truth value of ``x < y`` element-wise with `broadcasting `_ (``1`` for true, ``0`` for false in numeric domain). Parameters ---------- x: tensor<\*, T> (Required) * Shape must be compatible with ``y`` in broadcast. y: tensor<\*, T> (Required) * Shape must be compatible with ``x`` in broadcast. Returns ------- tensor<\*?, bool> * A boolean tensor with the same shape as the inputs. Attributes ---------- T: fp16, fp32, i32 """ def get_operator(self): return operator.lt def get_dtype(self, promoted_dtype): return types.bool @register_op class less_equal(elementwise_binary): """ Return the truth value of ``x <= y`` element-wise with `broadcasting `_ (``1`` for true, ``0`` for false in numeric domain). Parameters ---------- x: tensor<\*, T> (Required) * Shape must be compatible with ``y`` in broadcast. y: tensor<\*, T> (Required) * Shape must be compatible with ``x`` in broadcast. Returns ------- tensor<\*?, bool> * A boolean tensor with the same shape as the inputs. Attributes ---------- T: fp16, fp32, i32 """ def get_operator(self): return operator.le def get_dtype(self, promoted_dtype): return types.bool @register_op class logical_and(elementwise_binary_logical): """ Return the truth value of ``x AND y`` element-wise with `broadcasting `_ Parameters ---------- x: tensor<\*, T> (Required) * Shape must be compatible with ``y`` in broadcast. 
y: tensor<\*, T> (Required) * Shape must be compatible with ``x`` in broadcast. Returns ------- tensor<\*?, bool> * A boolean tensor with the same shape as the inputs. Attributes ---------- T: bool """ def get_operator(self): return np.logical_and def get_dtype(self, promoted_dtype): return types.bool @register_op class logical_or(elementwise_binary_logical): """ Return the truth value of ``x OR y`` element-wise with `broadcasting `_ Parameters ---------- x: tensor<\*, T> (Required) * Shape must be compatible with ``y`` in broadcast. y: tensor<\*, T> (Required) * Shape must be compatible with ``x`` in broadcast. Returns ------- tensor<\*?, bool> * A boolean tensor with the same shape as the inputs. Attributes ---------- T: bool """ def get_operator(self): return np.logical_or def get_dtype(self, promoted_dtype): return types.bool @register_op class logical_xor(elementwise_binary_logical): """ Return the truth value of ``x XOR y`` element-wise with `broadcasting `_ Parameters ---------- x: tensor<\*, T> (Required) * Shape must be compatible with ``y`` in broadcast. y: tensor<\*, T> (Required) * Shape must be compatible with ``x`` in broadcast. Returns ------- tensor<\*?, bool> * A boolean tensor with the same shape as the inputs. Attributes ---------- T: bool """ def get_operator(self): return np.logical_xor def get_dtype(self, promoted_dtype): return types.bool @register_op class maximum(elementwise_binary): """ Return ``x > y ? x : y`` element-wise with `broadcasting `_. Parameters ---------- x: tensor<\*, T> (Required) * Shape must be compatible with ``y`` in broadcast. y: tensor<\*, T> (Required) * Shape must be compatible with ``x`` in broadcast. Returns ------- tensor<\*?, T> * A tensor with the broadcasted shape from inputs, and type is derived from inputs. Attributes ---------- T: fp16, fp32, i32 """ def get_operator(self): return np.maximum @register_op class minimum(elementwise_binary): """ Return ``x > y ? y : x`` element-wise with `broadcasting `_. Parameters ---------- x: tensor<\*, T> (Required) * Shape must be compatible with ``y`` in broadcast. y: tensor<\*, T> (Required) * Shape must be compatible with ``x`` in broadcast. Returns ------- tensor<\*?, T> * A tensor with the broadcasted shape from inputs, and type is derived from inputs. Attributes ---------- T: fp16, fp32, i32 """ def get_operator(self): return np.minimum @register_op class mod(elementwise_binary): """ Return ``x % y`` element-wise with `broadcasting `_. Parameters ---------- x: tensor<\*, T> (Required) * Shape must be compatible with ``y`` in broadcast. y: tensor<\*, T> (Required) * Shape must be compatible with ``x`` in broadcast. Returns ------- tensor<\*?, T> * A tensor with the broadcasted shape from inputs, and type is derived from inputs. Attributes ---------- T: fp16, fp32, i32 """ def get_operator(self): return operator.mod @register_op class mul(elementwise_binary): """ Return ``x * y`` element-wise with `broadcasting `_. Parameters ---------- x: tensor<\*, T> (Required) * Shape must be compatible with ``y`` in broadcast. y: tensor<\*, T> (Required) * Shape must be compatible with ``x`` in broadcast. Returns ------- tensor<\*?, T> * A tensor with the broadcasted shape from inputs, and type is derived from inputs. Attributes ---------- T: fp16, fp32, i32 """ def get_operator(self): return operator.mul @register_op class not_equal(elementwise_binary): """ Return the truth value of ``x != y`` element-wise with `broadcasting `_ (``1`` for true, ``0`` for false in numeric domain). 
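    For example (illustrative): ``x = [1, 2]`` and ``y = [1, 3]`` yield ``[False, True]``.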
Parameters ---------- x: tensor<\*, T> (Required) * Shape must be compatible with ``y`` in broadcast. y: tensor<\*, T> (Required) * Shape must be compatible with ``x`` in broadcast. Returns ------- tensor<\*?, bool> * A boolean tensor with the broadcasted shape from inputs. Attributes ---------- T: fp16, fp32, i32 """ def get_operator(self): return operator.ne def get_dtype(self, promoted_dtype): return types.bool @register_op class real_div(elementwise_binary): """ Return ``x / y`` element-wise with `broadcasting `_. Parameters ---------- x: tensor<\*, T> (Required) * Shape must be compatible with ``y`` in broadcast. y: tensor<\*, T> (Required) * Shape must be compatible with ``x`` in broadcast. Returns ------- tensor<\*?, T> * A tensor with the broadcasted shape from inputs, and type is derived from inputs. Attributes ---------- T: fp16, fp32, i32 """ def get_operator(self): return operator.truediv @register_op class pow(elementwise_binary): """ Return ``x ^ y`` element-wise with `broadcasting `_. Parameters ---------- x: tensor<\*, T> (Required) * Shape must be compatible with ``y`` in broadcast. y: tensor<\*, T> (Required) * Shape must be compatible with ``x`` in broadcast. Returns ------- tensor<\*?, T> * A tensor with the broadcasted shape from inputs, and type is derived from inputs. Attributes ---------- T: fp16, fp32, i32 """ def get_operator(self): return operator.pow @register_op class sub(elementwise_binary): """ Return ``x - y`` element-wise with `broadcasting `_. Parameters ---------- x: tensor<\*, T> (Required) * Shape must be compatible with ``y`` in broadcast. y: tensor<\*, T> (Required) * Shape must be compatible with ``x`` in broadcast. Returns ------- tensor<\*?, T> * A tensor with the broadcasted shape from inputs, and type is derived from inputs. Attributes ---------- T: fp16, fp32, i32 """ def get_operator(self): return operator.sub ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS15/elementwise_unary.py0000644000000000000000000004712614672066616027302 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import math import numpy as np from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import DefaultInputs, InputSpec, TensorInputType from coremltools.converters.mil.mil.operation import SYMBOL, VALUE, Operation, precondition from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.types import nptype_from_builtin from coremltools.converters.mil.mil.types.symbolic import is_symbolic from coremltools.converters.mil.mil.types.type_mapping import ( builtin_to_string, string_to_builtin, string_to_nptype, ) def _maintain_shape(x, y): # numpy converts rank 0 tensors to scalars if x.ndim == 0: # convert back to rank 0 tensor return np.array(y) return y class elementwise_unary(Operation): """ Elementwise Unary Op Superclass """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32), } def type_inference(self): return self.x.sym_type class elementwise_unary_with_int(Operation): """ Elementwise Unary Op Superclass """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), } def type_inference(self): return self.x.sym_type """ Elementwise unary op implementation(s) """ @register_op class abs(elementwise_unary_with_int): """ Return the absolute values of the input ``x``, element-wise. Parameters ---------- x: tensor<[\*d], T> (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32, i32 """ @precondition(allow=VALUE) def value_inference(self): result = np.abs(self.x.val) return _maintain_shape(self.x.val, result) @register_op class acos(elementwise_unary): """ Return the inverse cosine values of the input ``x``, element-wise. Parameters ---------- x: tensor<[\*d], T> (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): result = np.arccos(self.x.val) return _maintain_shape(self.x.val, result) @register_op class asin(elementwise_unary): """ Return the inverse sine of the input ``x``, element-wise. Parameters ---------- x: tensor<[\*d], T> (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): result = np.arcsin(self.x.val) return _maintain_shape(self.x.val, result) @register_op class atan(elementwise_unary): """ Return the inverse tangent of the input ``x``, element-wise. Parameters ---------- x: tensor<[\*d], T> (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): result = np.arctan(self.x.val) return _maintain_shape(self.x.val, result) @register_op class atanh(elementwise_unary): """ Return the inverse hyperbolic tangent values of the input ``x``, element-wise. Parameters ---------- x: tensor<[\*d], T> (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. 
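    * For inputs outside the open interval ``(-1, 1)``, the mathematical result is not finite.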
Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): result = np.arctanh(self.x.val) return _maintain_shape(self.x.val, result) @register_op class ceil(elementwise_unary): """ Return the ceil values of the input ``x``, element-wise. Parameters ---------- x: tensor<[\*d], T> (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): result = np.ceil(self.x.val) return _maintain_shape(self.x.val, result) @register_op class clip(Operation): """ Clip the values in the input ``x`` to ``[alpha, beta]``, element-wise. Any values less than ``alpha`` are set to ``alpha``, and any values greater than ``beta`` are set to ``beta``. Parameters ---------- x: tensor<[\*d], T> (Required) alpha: const T (Required) beta: const T (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), alpha=TensorInputType(const=True, type_domain="T"), beta=TensorInputType(const=True, type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32), } def type_inference(self): return self.x.sym_type @precondition(allow=VALUE) def value_inference(self): return np.minimum(np.maximum(self.x.val, self.alpha.val), self.beta.val) @register_op class cos(elementwise_unary): """ Return cosine of ``x`` element-wise. Input domain is ``(-inf, inf)`` and output range is ``[-1,1]``. Parameters ---------- x: tensor<[\*d], T> (Required) Returns ------- tensor<[\*d], T> Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): result = np.cos(self.x.val) return _maintain_shape(self.x.val, result) @register_op class cosh(elementwise_unary): """ Return hyperbolic cosine of the input ``x``, element-wise. Parameters ---------- x: tensor<[\*d], T> (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): result = np.cosh(self.x.val) return _maintain_shape(self.x.val, result) @register_op class erf(elementwise_unary): """ Return the gauss error function of the input ``x``, element-wise. Parameters ---------- x: tensor<[\*d], T> (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): erf_vector_function = np.vectorize(math.erf) return erf_vector_function(self.x.val) @register_op class exp(elementwise_unary): """ Return e^x, element-wise. Parameters ---------- x: tensor<[\*d], T> (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): result = np.exp(self.x.val) return _maintain_shape(self.x.val, result) @register_op class exp2(elementwise_unary_with_int): """ Return 2^x, element-wise. Parameters ---------- x: tensor<[\*d], T> (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32, i32 """ @precondition(allow=VALUE) def value_inference(self): result = np.exp2(self.x.val) return _maintain_shape(self.x.val, result) @register_op class floor(elementwise_unary): """ Return the floor of the input ``x``, element-wise, the same as rounding towards negative infinity. 
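# Illustrative usage sketch (not part of the original file): assuming the public MIL
# Builder front end, the `clip` op defined above bounds values to [alpha, beta].
# The input shape and the interval below are made-up example values.
from coremltools.converters.mil import Builder as mb

@mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))])
def clip_prog(x):
    # Values below 0.0 are set to 0.0 and values above 1.0 are set to 1.0.
    return mb.clip(x=x, alpha=0.0, beta=1.0)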
Parameters ---------- x: tensor<[\*d], T> (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): result = np.floor(self.x.val) return _maintain_shape(self.x.val, result) @register_op class inverse(Operation): """ Return the reciprocal value of the input ``x``, element-wise. Parameters ---------- x: tensor<[\*d], T> (Required) epsilon: const T (Optional, default=1e-4) * This is a small constant that is added to the input, before taking its inverse, for stability. * ``y = 1 / (x + epsilon)``. Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), epsilon=TensorInputType(const=True, optional=True, type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): return DefaultInputs( epsilon=nptype_from_builtin(self.x.dtype)(1e-4), ) def type_inference(self): return self.x.sym_type @precondition(allow=VALUE) def value_inference(self): return np.array(np.reciprocal(self.x.val + self.epsilon.val), copy=False) @register_op class log(Operation): """ Return the natural logarithm value of the input ``x``, element-wise. Parameters ---------- x: tensor<[\*d], T> (Required) epsilon: const T (Optional, default=1e-45) * This is a small constant that is added to the input, before taking log. * ``y = log(x + epsilon)``. Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), epsilon=TensorInputType(const=True, optional=True, type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): return DefaultInputs( epsilon=nptype_from_builtin(self.x.dtype)(1e-45) ) def type_inference(self): return self.x.sym_type @precondition(allow=VALUE) def value_inference(self): result = np.log(self.x.val + self.epsilon.val) return _maintain_shape(self.x.val, result) @register_op class logical_not(Operation): """ Return the value of NOT the input ``x``, element-wise. (``1`` for true, ``0`` for false in numeric domain.) A numeric value ``t`` is evaluated to true ``iff t != 0``. Parameters ---------- x: tensor<[\*d], bool> (Required) Returns ------- tensor<[\*d], bool> * A tensor of the same shape as ``x``. Attributes ---------- T: bool """ input_spec = InputSpec( x=TensorInputType(type_domain=types.bool), ) @precondition(allow=VALUE) def value_inference(self): return np.logical_not(self.x.val) def type_inference(self): return self.x.sym_type @register_op class round(elementwise_unary): """ Return the round value of the input ``x`` to nearest integer, element-wise. ``0.5`` is rounded to ``0``. Parameters ---------- x: tensor<[\*d], T> (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): result = np.round(self.x.val) return _maintain_shape(self.x.val, result) @register_op class rsqrt(Operation): """ Return the reciprocal value of the square root of the input ``x``, element-wise. Parameters ---------- x: tensor<[\*d], T> (Required) epsilon: const T (Optional, default=1e-12) * This is a small constant that is added to the input, before applying the ``rsqrt`` function, for stability. * ``y = 1 / sqrt(x + epsilon)``. Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. 
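# Illustrative usage sketch (not part of the original file): assuming the MIL Builder
# front end, `inverse` and `log` both take an optional `epsilon` that is added to the
# input before the transform, as the docstrings above describe. Shape and epsilon are
# example assumptions.
from coremltools.converters.mil import Builder as mb

@mb.program(input_specs=[mb.TensorSpec(shape=(3,))])
def stable_log_prog(x):
    y = mb.inverse(x=x, epsilon=1e-3)  # y = 1 / (x + 1e-3)
    return mb.log(x=y)                 # epsilon defaults to 1e-45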
Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), epsilon=TensorInputType(const=True, optional=True, type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): return DefaultInputs( epsilon=nptype_from_builtin(self.x.dtype)(1e-12), ) def type_inference(self): return self.x.sym_type @precondition(allow=VALUE) def value_inference(self): result = 1.0 / np.sqrt(self.x.val + self.epsilon.val) return _maintain_shape(self.x.val, result) @register_op class sign(elementwise_unary_with_int): """ Return the sign value of the input ``x``, element-wise. All elements in the output will be either ``-1`` or ``1``, or zero if the input ``x`` is zero. Parameters ---------- x: tensor<[\*d], T> (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32, i32 """ @precondition(allow=VALUE) def value_inference(self): result = np.sign(self.x.val) return _maintain_shape(self.x.val, result) @register_op class sin(elementwise_unary): """ Return the sine value of the input ``x``, element-wise. Parameters ---------- x: tensor<[\*d], T> (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): result = np.sin(self.x.val) return _maintain_shape(self.x.val, result) @register_op class sinh(elementwise_unary): """ Return the hyperbolic sine value of the input ``x``, element-wise. Parameters ---------- x: tensor<[\*d], T> (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): result = np.sinh(self.x.val) return _maintain_shape(self.x.val, result) @register_op class sqrt(elementwise_unary): """ Returns the square root value of the input ``x``, element-wise. Parameters ---------- x: tensor<[\*d], T> (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): result = np.sqrt(self.x.val) return _maintain_shape(self.x.val, result) @register_op class square(elementwise_unary_with_int): """ Return ``x^2``, element-wise. Parameters ---------- x: tensor<[\*d], T> (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32, i32 """ @precondition(allow=VALUE) def value_inference(self): return np.square(self.x.val) @register_op class tan(elementwise_unary): """ Return the tangent value of the input ``x``, element-wise. Both input and output ranges are ``(-inf, inf)``. Parameters ---------- x: tensor<[\*d], T> (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): result = np.tan(self.x.val) return _maintain_shape(self.x.val, result) @register_op class tanh(elementwise_unary): """ Return the hyperbolic tangent value of the input ``x``, element-wise. Both input and output ranges are ``(-inf, inf)`` while output range is ``[-1, 1]``. Parameters ---------- x: tensor<[\*d], T> (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. 
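# Illustrative usage sketch (not part of the original file): assuming the MIL Builder
# front end, elementwise unary ops compose directly and preserve the input's shape and
# type; `cast` (defined further below in this file) changes only the dtype. The shape
# and dtype string are example assumptions.
from coremltools.converters.mil import Builder as mb

@mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))])
def unary_prog(x):
    y = mb.sqrt(x=mb.square(x=x))   # sqrt(x^2) == |x| element-wise
    return mb.cast(x=y, dtype="int32")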
Attributes ---------- T: fp16, fp32 """ @precondition(allow=VALUE) def value_inference(self): result = np.tanh(self.x.val) return _maintain_shape(self.x.val, result) @register_op class threshold(Operation): """ Set a lower bound ``alpha`` to the values in the input ``x``, element-wise. Any values less than ``alpha`` are set to ``alpha``. Parameters ---------- x: tensor<[\*d], T> (Required) alpha: const T (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32, i32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), alpha=TensorInputType(const=True, type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), } def type_inference(self): return self.x.sym_type @precondition(allow=VALUE) def value_inference(self): return np.maximum(self.x.val, self.alpha.val) @register_op class cast(Operation): """ Cast the input ``x`` to the new type ``dtype``. Parameters ---------- x: tensor<[\*d], T> (Required) dtype: const str (Required) * Can be one of the following types: ``int32``, ``fp16``, ``fp32``, ``bool``. Returns ------- tensor<[\*d], dtype> * A tensor of the same shape as ``x``, with type ``dtype``. Attributes ---------- T: i32, fp16, fp32, bool. """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), dtype=TensorInputType(const=True, type_domain=types.str) ) type_domains = { "T": (types.fp16, types.fp32, types.int32, types.bool), } @classmethod def supported_dtypes(cls): return [builtin_to_string(v) for v in cls.type_domains["T"]] def type_inference(self): if self.dtype.val not in self.supported_dtypes(): raise NotImplementedError( "Parameter dtype of the cast operation can be one of the {}. " "Provided {}".format(self.supported_dtypes(), self.dtype.val) ) if not types.is_tensor(self.x.sym_type): return string_to_builtin(self.dtype.val) ret_shape = self.x.shape return types.tensor(string_to_builtin(self.dtype.val), ret_shape) @precondition(allow=VALUE | SYMBOL) def value_inference(self): return self.get_cast_value(self.x, self.dtype.val) @classmethod def get_cast_value(cls, input_var, dtype_val): if dtype_val not in cls.supported_dtypes(): raise NotImplementedError( "Parameter dtype of the cast operation can be one of the {}. " "Provided {}".format(cls.supported_dtypes(), dtype_val) ) if input_var.val is None: if ( input_var.sym_val is not None and not is_symbolic(input_var.sym_val) and len(input_var.sym_val.shape) == 1 ): result = [ np.array(val).astype(dtype=string_to_nptype(dtype_val)).item() if not is_symbolic(val) else val for val in input_var.sym_val ] return np.array(result) return None if hasattr(input_var.val, "astype"): return input_var.val.astype(dtype=string_to_nptype(dtype_val)) else: return string_to_nptype(dtype_val)(input_var.val) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS15/image_resizing.py0000644000000000000000000010246614672066616026536 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.mil import (DefaultInputs, InputSpec, Operation, TensorInputType, get_new_symbol, types) from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs.iOS15 import _IOS15_TARGET from coremltools.converters.mil.mil.types.symbolic import is_symbolic @register_op class upsample_nearest_neighbor(Operation): """ Upsample the spatial dimensions (last two dimensions) of the input by integer scale factors using nearest-neighbor interpolation. Parameters ---------- x: tensor<[\*D, H1, W1],T> (Required) * Must be at least rank ``3``. scale_factor_height: const or const (Optional, default=1) * Scale factor for the height dimension (``axis=-2``). * Can be either an integer or fractional. scale_factor_width: const or const (Optional, default=1) * Scale factor for the width dimension (``axis=-1``). * Can be either an integer or fractional. Returns ------- tensor<[\*D, H2, W2],T> * Tensor with same type as the input. * ``H2`` = floor(``H1`` * ``scale_factor_height``). * ``W2`` = floor(``W1`` * ``scale_factor_width``). Attributes ---------- T: fp16, fp32 U: fp32, i32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), scale_factor_height=TensorInputType( const=True, optional=True, type_domain="U" ), scale_factor_width=TensorInputType( const=True, optional=True, type_domain="U" ), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.fp32, types.int32), } def default_inputs(self): return DefaultInputs( scale_factor_height=1, scale_factor_width=1, ) def type_inference(self): if self.x.rank < 3: raise ValueError( 'input to the "upsample_nearest_neighbor" op must have rank at least 3' ) ret_shape = list(self.x.shape) ret_shape[-1] = np.floor(self.scale_factor_width.val * ret_shape[-1]) if not is_symbolic(ret_shape[-1]) else get_new_symbol() ret_shape[-2] = np.floor(self.scale_factor_height.val * ret_shape[-2]) if not is_symbolic(ret_shape[-2]) else get_new_symbol() return types.tensor(self.x.dtype, ret_shape) @register_op class resize_nearest_neighbor(Operation): """ Resize the spatial (last two) dimensions to the specified target size using nearest neighbor interpolation. Although this op is similar to ``upsample_nearest_neighbor``, ``resize_nearest_neighbor`` works with a target size rather than with scale factors. Parameters ---------- x: tensor<[\*D, H1, W1], T> (Required) * Must be at least rank ``3``. target_size_height: const (Required) * Target spatial size for the height dimension (``axis=-2``). target_size_width: const (Required) * Target spatial size for the width dimension (``axis=-1``). Notes ----- See ``resize_bilinear`` for examples. See Also -------- resize_bilinear Returns ------- tensor<[\*D, H2, W2], T> * Tensor with same type as the input. * ``H2`` = ``target_size_height``. * ``W2`` = ``target_size_width``. 
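# Illustrative usage sketch (not part of the original file): assuming the MIL Builder
# front end, `upsample_nearest_neighbor` takes integer or fractional scale factors,
# whereas `resize_nearest_neighbor` (documented above) takes an explicit target size.
# Per the docstring, H2 = floor(H1 * scale_factor_height) and W2 = floor(W1 *
# scale_factor_width); the shapes below are example assumptions.
from coremltools.converters.mil import Builder as mb

@mb.program(input_specs=[mb.TensorSpec(shape=(1, 3, 8, 8))])
def upsample_nn_prog(x):
    # (1, 3, 8, 8) -> (1, 3, 16, 16)
    return mb.upsample_nearest_neighbor(x=x, scale_factor_height=2, scale_factor_width=2)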
Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), target_size_height=TensorInputType(const=True, type_domain=types.int32), target_size_width=TensorInputType(const=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32), } def type_inference(self): if self.x.rank < 3: raise ValueError( 'input to the "resize_nearest_neighbor" op must have rank at least 3' ) ret_shape = list(self.x.shape) ret_shape[-1] = int(self.target_size_width.val) ret_shape[-2] = int(self.target_size_height.val) return types.tensor(self.x.dtype, ret_shape) @register_op class upsample_bilinear(Operation): """ Upsample the spatial dimensions (last two dimensions) of the input by scale factors using bilinear interpolation. The upsample_bilinear operation in MIL corresponds to the ``recompute_scale_factor=True`` mode in the PyTorch bilinear interpolation op. That is, the scale factor is recomputed based on the output size. Note that when ``scale_factor_height`` and ``scale_factor_width`` are floating point, this could result in a different scale factor due to rounding. Parameters ---------- x: tensor<[\*D, H1, W1], T> (Required) * Must be at least rank ``3``. scale_factor_height: const (Optional, default=1) * Scale factor for the height dimension (``axis=-2``). scale_factor_width: const (Optional, default=1) * Scale factor for the width dimension (``axis=-1``). align_corners: const (Optional, default=True) * This parameter determines how samples are chosen for bilinear interpolation. For details, see the Notes section. Notes ----- To understand the ``align_corners`` parameter, consider the 1-D case. You need to sample a grid of pixels whose values are computed using linear interpolation. This parameter controls how the grid is sampled. If the input grid is ``[0, Xin-1]`` (corresponding to an input size of ``Xin``), and if the output size is ``Xout``, then the grid points are sampled in the following manner: .. sourcecode:: python # If align_corners == True: spacing = (Xin - 1) / (Xout - 1) grid_point[i] = min(Xin - 1, max(0, i*spacing)), for i=0,1,...,Xout-1 # If align_corners == False: spacing = Xin / Xout grid_point[i] = min(Xin - 1, max(0, i*spacing + 0.5*spacing - 0.5)), for i=0,1,...,Xout-1 For example: .. sourcecode:: python Xin = 2 input_interval = [0,1] Grid points: .. sourcecode:: python [0., 0.1, 0.5, 0.9, 1.] (Xout = 5, align_corners=False) [0., 0.25, 0.5, 0.75, 1.] (Xout = 5, align_corners=True) [0., 0., 0.33, 0.67, 1., 1.] (Xout = 6, align_corners=False) [0., 0.2, 0.4, 0.6, 0.8, 1.] (Xout = 6, align_corners=True) Note the following similarities: * ``align_corners=False`` is the same as ``tf.raw_ops.ResizeBilinear(align_corners=False, half_pixel_centers=True)``. * ``align_corners=True`` is the same as ``tf.raw_ops.ResizeBilinear(align_corners=True, half_pixel_centers=False)``. Returns ------- tensor<[\*D, H2, W2], T> * Tensor with same type as the input. * ``H2`` = floor(``H1`` * ``scale_factor_height``). * ``W2`` = floor(``W1`` * ``scale_factor_width``).
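# Illustrative usage sketch (not part of the original file): assuming the MIL Builder
# front end, `upsample_bilinear` with fractional scale factors. With scale 2.5 on an
# 8x8 input, H2 = W2 = floor(8 * 2.5) = 20; `align_corners` selects the sampling grid
# described in the Notes above. Shapes and values are example assumptions.
from coremltools.converters.mil import Builder as mb

@mb.program(input_specs=[mb.TensorSpec(shape=(1, 3, 8, 8))])
def upsample_bilinear_prog(x):
    # (1, 3, 8, 8) -> (1, 3, 20, 20)
    return mb.upsample_bilinear(
        x=x, scale_factor_height=2.5, scale_factor_width=2.5, align_corners=False
    )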
Attributes ---------- T: fp16, fp32 U : fp32, i32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), scale_factor_height=TensorInputType( const=True, optional=True, type_domain="U", ), scale_factor_width=TensorInputType( const=True, optional=True, type_domain="U", ), align_corners=TensorInputType( const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.int32, types.fp32), } def default_inputs(self): return DefaultInputs( scale_factor_height=1, scale_factor_width=1, align_corners=True, ) def type_inference(self): if self.x.rank < 3: raise ValueError( 'input to the "upsample_bilinear" op must have rank at least 3' ) ret_shape = list(self.x.shape) ret_shape[-1] = np.floor(self.scale_factor_width.val * ret_shape[-1]) if not is_symbolic(ret_shape[-1]) else get_new_symbol() ret_shape[-2] = np.floor(self.scale_factor_height.val * ret_shape[-2]) if not is_symbolic(ret_shape[-2]) else get_new_symbol() return types.tensor(self.x.dtype, ret_shape) @register_op class resize_bilinear(Operation): """ Resize the spatial (last two) dimensions to the specified target size using bilinear interpolation. Although this op is similar to ``upsample_bilinear``, ``resize_bilinear`` works with a target size rather than with scale factors. Parameters ---------- x: tensor<[\*D, H1, W1],T> (Required) * Must be at least rank ``3``. target_size_height: const (Optional, default=1) * Target spatial size for the height dimension (``axis=-2``). target_size_width: const (Optional, default=1) * Target spatial size for the width dimension (``axis=-1``). sampling_mode: const (Optional, default="DEFAULT") * This parameter can take ``"STRICT_ALIGN_CORNERS”``, ``"ALIGN_CORNERS"``, ``"DEFAULT"``, ``"OFFSET_CORNERS"`` or ``UNALIGN_CORNERS`` as values. For details, see the Notes section. Notes ----- To understand the ``sampling_mode`` parameter, consider the 1-D case. You need to sample a grid of pixels whose values are computed using linear interpolation. This parameter controls how the grid is sampled. If the input grid is ``[0, Xin-1]`` (corresponding to an input size of ``Xin``), and if the output size is ``Xout``, then the grid points are sampled in the following manner: .. sourcecode:: python # "STRICT_ALIGN_CORNERS": spacing = (Xin - 1) / (Xout - 1) grid_point[i] = min(Xin-1, max(0, i*spacing)), for i=0,1,...,Xout-1 # "ALIGN_CORNERS": Same as "STRICT_ALIGN_CORNERS" unless Xout=1, # in which case: grid_point[0] = (Xin-1) / 2, if Xout==1 # "DEFAULT": spacing = (Xin - Xin/Xout) / (Xout - 1) grid_point[i] = min(Xin-1, max(0, i*spacing)), for i=0,1,...,Xout-1 # "OFFSET_CORNERS": delta = max(1, Xin - 1) / Xout spacing = ((Xout - 1) * delta) / (Xout - 1) grid_point[i] = min(Xin-1, max(0, 0.5*delta + i*spacing)), for ... i=0,1,...,Xout-1 # "UNALIGN_CORNERS": spacing = Xin / Xout grid_point[i] = min(Xin - 1, max(0, i*spacing + 0.5*spacing - 0.5)), for i=0,1,...,Xout-1 For example: .. sourcecode:: python Xin = 2 input_interval = [0,1] Grid points: .. sourcecode:: python [0., 0.1, 0.5, 0.9, 1.] (Xout = 5, UNALIGN_CORNERS) [0., 0.25, 0.5, 0.75, 1.] (Xout = 5, "STRICT_ALIGN_CORNERS" / "ALIGN_CORNERS") [0., 0.4, 0.8, 1., 1.] (Xout = 5, "DEFAULT") [0.1, 0.3, 0.5, 0.7, 0.9] (Xout = 5, "OFFSET_CORNERS") [0., 0., 0.33, 0.67, 1., 1.] (Xout = 6, UNALIGN_CORNERS) [0., 0.2, 0.4, 0.6, 0.8, 1.] (Xout = 6, "STRICT_ALIGN_CORNERS" / "ALIGN_CORNERS") [0., 0.33, 0.67, 1., 1., 1.] 
(Xout = 6, "DEFAULT") [0.08, 0.25, 0.42, 0.58, 0.75, 0.92] (Xout = 6, "OFFSET_CORNERS") Note the following similarities: * ``"DEFAULT"`` is same as ``tf.raw_ops.ResizeBilinear(align_corners=False, half_pixel_centers=False)``. * ``"STRICT_ALIGN_CORNERS"`` is same as ``tf.raw_ops.ResizeBilinear(align_corners=True, half_pixel_centers=False)``. Returns ------- tensor<[\*D, H2, W2],T> * Tensor with same type as the input. * ``H2`` = ``target_size_height``. * ``W2`` = ``target_size_width``. Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), target_size_height=TensorInputType( const=True, optional=True, type_domain=types.int32 ), target_size_width=TensorInputType( const=True, optional=True, type_domain=types.int32 ), sampling_mode=TensorInputType( const=True, optional=True, type_domain=types.str ), ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): return DefaultInputs( target_size_height=1, target_size_width=1, sampling_mode="DEFAULT", ) def type_inference(self): if self.x.rank < 3: raise ValueError( 'input to the "resize_bilinear" op must have rank at least 3' ) if self.sampling_mode.val not in { "STRICT_ALIGN_CORNERS", "ALIGN_CORNERS", "UNALIGN_CORNERS", "DEFAULT", "OFFSET_CORNERS", }: raise ValueError( '"resize_bilinear" op: unrecognized sampling mode "{}"'.format( self.sampling_mode.val ) ) ret_shape = list(self.x.shape) ret_shape[-1] = self.target_size_width.val ret_shape[-2] = self.target_size_height.val return types.tensor(self.x.dtype, ret_shape) @register_op class crop_resize(Operation): """ Resize the spatial dimensions (last two dimensions) of the first input according to the bounding boxes specified in the second input, using bilinear interpolation. Parameters ---------- x: tensor<[B, C, H, W],T> (Required) * The input, from which patches (regions of interest) are extracted and resized using bilinear interpolation. * Rank ``4``. roi: tensor<[N,1,4,1,1], T> or tensor<[N,1,5,1,1], T> (Required) * Regions of interest, or coordinates of the boxes. The above input represents coordinates of ``N`` boxes. * The convention to express coordinates depends on the value of the input ``box_coordinate_mode``. * Rank ``5``. * If ``tensor<[N,1,4,1,1], T>``: Resized images are computed for all ``B`` input images. * If ``tensor<[N,1,5,1,1], T>``: The first element from ``axis=-3`` to be resized is an index. It must be within range ``[0, B)``. target_height: const (Optional, Default=1) * Target height for resizing each patch. target_width: const (Optional, Default=1) * Target width for resizing each patch. normalized_coordinates : const (Optional, default=False) * If ``True``, the bounding box coordinates must be in the interval ``[0, 1]``. Scaling is based on the input spatial dimensions: ``(H_in - 1)`` for height and ``(W_in - 1)`` for width. * If ``False``, the bounding box coordinates must be in the interval ``[0, H_in - 1]`` for height dimensions and ``[0, W_in - 1]`` for width dimensions. spatial_scale : const (Optional, default=1.0) * Additional spatial scale that multiplies the bounding box coordinates. You would use this to implement the RoI Align layer, which typically uses unnormalized RoI coordinates along with a spatial scale that is less than or equal to 1. box_coordinate_mode: const (Optional, default="CORNERS_HEIGHT_FIRST") * Specifies the convention for specifying the four bounding box coordinates for an image of size ``(Height, Width)``. The ``(0,0)`` coordinate corresponds to the top-left corner of the image. 
* This parameter can take one of four values: ``"CORNERS_HEIGHT_FIRST"``: ``[h_start, w_start, h_end, w_end]`` ``"CORNERS_WIDTH_FIRST"``: ``[w_start, h_start, w_end, h_end]`` ``"CENTER_SIZE_HEIGHT_FIRST"``: ``[h_center, w_center, box_height, box_width]`` ``"CENTER_SIZE_WIDTH_FIRST"``: ``[w_center, h_center, box_width, box_height]`` sampling_mode : const (Optional, default="DEFAULT") * This parameter can take ``"STRICT_ALIGN_CORNERS"``, ``"ALIGN_CORNERS"``, ``"DEFAULT"``, ``"OFFSET_CORNERS"`` or ``UNALIGN_CORNERS`` as values. * This same convention is used by the ``resize_bilinear`` op (see that op for details). See Also -------- resize_bilinear Returns ------- tensor<[N, B, C, target_height, target_width],T> or tensor<[N, 1, C, target_height, target_width],T> * Tensor with same type as the input. * If ``roi : tensor<[N,1,4,1,1], T>``, the output is ``tensor<[N, B, C, target_height, target_width],T>``. Total crops = ``N*B``; that is, ``N`` crops for each input in the batch. * If ``roi : tensor<[N,1,5,1,1], T>``, the output is ``tensor<[N, 1, C, target_height, target_width],T>``. Total crops = ``N``; that is, 1 crop for given input image index in the batch. Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), roi=TensorInputType(type_domain="T"), target_height=TensorInputType(const=True, optional=True, type_domain=types.int32), target_width=TensorInputType(const=True, optional=True, type_domain=types.int32), normalized_coordinates=TensorInputType(const=True, optional=True, type_domain=types.bool), spatial_scale=TensorInputType(const=True, optional=True, type_domain=types.fp32), box_coordinate_mode=TensorInputType(const=True, optional=True, type_domain=types.str), sampling_mode=TensorInputType(const=True, optional=True, type_domain=types.str), ) type_domains = { "T": (types.fp16, types.fp32), } _VALID_SAMPLING_MODES = { "STRICT_ALIGN_CORNERS", "ALIGN_CORNERS", "UNALIGN_CORNERS", "DEFAULT", "OFFSET_CORNERS", } _VALID_BOX_COORDINATE_MODES = { "CORNERS_HEIGHT_FIRST", "CORNERS_WIDTH_FIRST", "CENTER_SIZE_HEIGHT_FIRST", "CENTER_SIZE_WIDTH_FIRST", } def default_inputs(self): return DefaultInputs( target_height=1, target_width=1, normalized_coordinates=False, spatial_scale=1., box_coordinate_mode="CORNERS_HEIGHT_FIRST", sampling_mode="DEFAULT", ) def _validate_input(self): if self.x.rank != 4: raise ValueError( f'input to the "crop_resize" op must be of rank 4. Provided {self.x.rank}' ) if self.roi.rank != 5: raise ValueError( f'ROI input to the "crop_resize" op must be of rank 5, provided {self.roi.rank}' ) if self.box_coordinate_mode.val not in self._VALID_BOX_COORDINATE_MODES: raise ValueError( f'"crop_resize" op: unrecognized box_coordinate_mode "{self.box_coordinate_mode.val}"' ) if self.sampling_mode.val not in self._VALID_SAMPLING_MODES: raise ValueError( f'"crop_resize" op: unrecognized sampling mode "{self.sampling_mode.val}"' ) def type_inference(self): self._validate_input() # ret_shape: [N] + [B, C, h_out, w_out] N, B, C = self.roi.shape[0], self.x.shape[0], self.x.shape[1] ret_shape = [N, B, C, self.target_height.val, self.target_width.val] return types.tensor(self.x.dtype, ret_shape) @register_op class crop(Operation): """ Crop the spatial dimensions (last two dimensions) of the input by the specified amounts. Parameters ---------- x: tensor<[\*D, H1, W1],T> (Required) * Must be at least rank ``3``. crop_height: const<2, i32> (Required) * Amount to be cropped from the top and bottom of the height dimension (``axis=-2``). 
crop_width: const<2, i32> (Required) * Amount to be cropped from the left and right sides of the width dimension (``axis=-1``). Returns ------- tensor<[\*D, H2, W2],T> * Tensor with same type as the input. * ``H2`` = ``H1 - crop_height[0] - crop_height[1]``. * ``W2`` = ``W1 - crop_width[0] - crop_width[1]``. Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), crop_height=TensorInputType(const=True, type_domain=types.int32), crop_width=TensorInputType(const=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32), } def type_inference(self): if self.x.rank < 3: raise ValueError( 'input to the "crop" op must at least be of rank 3. Provided {}'.format( self.x.rank ) ) crop_height = self.crop_height.val crop_width = self.crop_width.val if len(crop_height.flatten()) != 2: raise ValueError( "crop_height must have 2 elements. Provided {}".format( len(crop_height.flatten()) ) ) if len(crop_width.flatten()) != 2: raise ValueError( "crop_width must have 2 elements. Provided {}".format( len(crop_width.flatten()) ) ) input_shape = list(self.x.shape) ret_shape = ( input_shape[:-2] + [input_shape[-2] - crop_height[0] - crop_height[1]] + [input_shape[-1] - crop_width[0] - crop_width[1]] ) return types.tensor(self.x.dtype, ret_shape) @register_op(opset_version=_IOS15_TARGET) class affine(Operation): """ Apply a linear affine transform to the input 2D image tensor. The value at the ``(x, y)`` (i.e., ``(w, h)``) coordinate of the output is computed by first computing the coordinates ``x’`` and ``y’`` with the following equation, and then computing the value at the coordinate ``(x’,y’)`` in the input image using either bilinear or nearest neighbor interpolation. If the ``(x’, y’)`` point falls outside the input image, then padding information is used to compute the value:: x’ = a0 * x + a1 * y + a2 y’ = b0 * x + b1 * y + b2 Parameters ---------- x: tensor<[B, C, H1, W1], T> * Must be rank ``4``. transform_matrix: tensor<[D, 6], T> * Must be rank ``2``. * ``D`` can be either ``B`` or 1. * If ``D == B``, there is a separate transform matrix for each batch. * If ``D == 1``, the same matrix is used for all input batches. * For each batch: ``[a0, a1, a2, b0, b1, b2]``. output_height: const * Target output height output_width: const * Target output width sampling_mode: const * Allowed values: ``"bilinear"`` padding_mode: const * Allowed values: ``"constant"``. * Note that the following example is 1D case for brevity. The op supports only 2D image input. * If ``padding_mode == "constant"``: * The input image is assumed to be padded with the padding_value. * For example, ``|1, 2, 3| -> |0, 0, 0, 1, 2, 3, 0, 0, 0|``. padding_value: const * Currently non-zero values are not supported. * To be used only when ``padding_mode == "constant"``, ignored in other cases. coordinates_mode: const * Allowed values: ``"normalized_minus_one_to_one"``. * If ``coordinates_mode == "normalized_minus_one_to_one"``, in-image values are ``[-1, 1]``. * For example, if ``coordinates_mode == "normalized_minus_one_to_one"``, the in-range values are ``[-1, 1]``. That is: * ``(-1, -1)``, i.e. ``(w=-1, h=-1)``, corresponds to the top-left pixel. * ``(1, -1)``, i.e. ``(w=1, h=-1)``, corresponds to the top-right pixel. * ``(-1, 1)``, i.e. ``(w=-1, h=1)``, corresponds to the bottom-left pixel. * ``(1, 1)``, i.e. ``(w=1, h=1)``, corresponds to the bottom-right pixel. align_corners: const * Currently ``align_corners=False`` is not supported. 
* To be used only when ``coordinates_mode != unnormalized``, ignored otherwise. * If ``align_corners == True``, the extrema coordinates correspond to the center of the first and last corner pixels. * If ``align_corners == False``, the extrema coordinates correspond to the edge of the first and last corner pixels. Returns ------- tensor<[B, C, output_height, output_width], T> Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), transform_matrix=TensorInputType(type_domain="T"), output_height=TensorInputType(const=True, type_domain=types.int32), output_width=TensorInputType(const=True, type_domain=types.int32), sampling_mode=TensorInputType(const=True, type_domain=types.str), padding_mode=TensorInputType(const=True, type_domain=types.str), padding_value=TensorInputType(const=True, type_domain="T"), coordinates_mode=TensorInputType(const=True, type_domain=types.str), align_corners=TensorInputType(const=True, type_domain=types.bool), ) type_domains = { "T": (types.fp16, types.fp32), } def type_inference(self): if self.x.rank != 4: raise ValueError( 'input "x" to the "affine" op must be a rank 4 tensor. ' "Got rank {} tensor of shape {}".format( self.x.rank, self.x.shape ) ) if self.transform_matrix.rank != 2: raise ValueError( 'input "transform_matrix" to the "affine" op must be a rank 2 tensor. ' "Got rank {} tensor of shape {}".format( self.transform_matrix.rank, self.transform_matrix.shape ) ) if self.sampling_mode.val.lower() != "bilinear": raise NotImplementedError( 'input "sampling_mode" to the "affine" not implemented. ' 'Got "{}"'.format(self.sampling_mode.val) ) if self.coordinates_mode.val.lower() != "normalized_minus_one_to_one": raise NotImplementedError( 'input "coordinates_mode" to the "affine" not implemented. ' 'Got "{}"'.format(self.coordinates_mode.val) ) if self.padding_mode.val.lower() != "constant" or self.padding_value.val != 0.0: raise NotImplementedError( 'input "padding_mode" to the "affine" not implemented. ' 'Got "{}" with "padding_value={}"'.format( self.padding_mode.val, self.padding_value.val ) ) input_shape = self.x.shape transform_matrix_shape = self.transform_matrix.shape if ( not is_symbolic(transform_matrix_shape[-1]) and transform_matrix_shape[-1] != 6 ): raise ValueError( 'input "transform_matrix" to the "affine" op last dimension must be 6 ' "[a0, a1, a2, b0, b1, b2], " "Got {} for last dimension".format(transform_matrix_shape[-1]) ) ret_shape = list(input_shape) ret_shape[2] = self.output_height.val ret_shape[3] = self.output_width.val return types.tensor(self.x.dtype, tuple(ret_shape)) @register_op(opset_version=_IOS15_TARGET) class resample(Operation): """ Resample the input image tensor ``x`` at the ``coordinates``. Resampling is required if the coordinates do not correspond to exact pixels in the input image. The ``sampling_mode`` determines the algorithm used for resampling and computing the values. Parameters ---------- x: tensor<[B, C, H1, W1], T> * Must be rank ``4``. coordinates: tensor<[B, H2, W2, 2], U> * Must be rank ``4``. * Coordinates are provided in the order ``(x, y)`` (i.e. ``(w, h)``). * The value of each output location ``output[b, c, h, w]`` is calculated by sampling from the input image ``x[b, c, :, :]``. * The pixel at the ``(x, y)`` location corresponds to the length-2 vector: ``coordinates[b, h, w, :]``. * Coordinate (normalized or unnormalized) should be specified according to ``coordinates_mode``. 
sampling_mode: const * Allowed values: ``"bilinear"`` , ``"nearest"`` padding_mode: const * Allowed values: ``"constant"``, ``"border"``, ``"reflection"``, ``"symmetric"`` * Note that the following example is 1D case for brevity. The op supports only 2D image input. * If ``padding_mode == "constant"``: * The input image is assumed to be padded with the ``padding_value``. * For example: ``|1, 2, 3| -> |0, 0, 0, 1, 2, 3, 0, 0, 0|`` * if ``padding_mode == "border"``: * The input image is assumed to be padded with the values replicated from the values at the edge. This is also referred to as the "clamped" or "replication" mode, since the padded values are clamped to the border values. * For example: ``|1, 2, 3| -> |1, 1, 1, 1, 2, 3, 3, 3, 3|`` * If ``padding_mode == "reflection"``: * The border values are reflected, *not* including the values at the edge/border. * For example: ``|1, 2, 3| -> |2, 3, 2, 1, 2, 3, 2, 1, 2|`` * If ``padding_mode == "symmetric"``: * Values are reflected, including the border/edge values. * For example: ``|1, 2, 3| -> |3, 2, 1 , 1, 2, 3, 3, 2, 1|`` padding_value: const * To be used only when ``padding_mode == "constant"``, ignored in other cases. coordinates_mode: const * Allowed values: ``"unnormalized"``, ``"normalized_minus_one_to_one"``, ``"normalized_zero_to_one"`` * If ``coordinates_mode == "unnormalized"``, the coordinates input values are interpreted to be in range ``[0, W - 1] / [0, H - 1]``, which corresponds to the in-image point. * If ``coordinates_mode == "normalized_minus_one_to_one"``, the in-image values are ``[-1, 1]``. * If ``coordinates_mode == "normalized_zero_to_one"``, in-image values are ``[0, 1]``. * For example, if ``coordinates_mode == "normalized_minus_one_to_one"``, the in range values are [-1, 1]. That is: * ``(-1, -1)``, i.e. ``(w=-1, h=-1)``, corresponds to the top-left pixel. * ``(1, -1)``, i.e. ``(w=1, h=-1)``, corresponds to the top-right pixel. * ``(-1, 1)``, i.e. ``(w=-1, h=1)``, corresponds to the bottom-left pixel. * ``(1, 1)``, i.e. ``(w=1, h=1)``, corresponds to the bottom-right pixel. align_corners: const * If ``align_corners == True``, the extrema coordinates correspond to the center of the first and last corner pixels. * If ``align_corners == False``, the extrema coordinates correspond to the edge of the first and last corner pixels. Returns ------- tensor<[B, C, H2, W2], T> Attributes ---------- T: fp16, fp32 U: fp32, int32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), coordinates=TensorInputType(type_domain="U"), sampling_mode=TensorInputType(const=True, type_domain=types.str), padding_mode=TensorInputType(const=True, type_domain=types.str), padding_value=TensorInputType(const=True, type_domain="T"), coordinates_mode=TensorInputType(const=True, type_domain=types.str), align_corners=TensorInputType(const=True, type_domain=types.bool), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.int32, types.fp32), } def type_inference(self): if self.x.rank != 4: raise ValueError( 'input "x" to the "resample" op must be a rank 4 tensor. ' "Got rank {} tensor of shape {}".format( self.x.rank, self.x.shape ) ) if self.coordinates.rank != 4: raise ValueError( 'input "coordinates" to the "resample" op must be a rank 4 tensor. 
' "Got rank {} tensor of shape {}".format( self.coordinates.rank, self.coordinates.shape ) ) input_shape = self.x.shape coord_shape = self.coordinates.shape if ( not is_symbolic(input_shape[0]) and not is_symbolic(coord_shape[0]) and input_shape[0] != coord_shape[0] ): raise ValueError( 'input "x" and "coordinates" to the "resample" must agree on ' "dimension of batch size: {} vs. {}".format( input_shape[0], coord_shape[0] ) ) if not is_symbolic(coord_shape[-1]) and coord_shape[-1] != 2: raise ValueError( 'input "coordinates" to the "resample" op last dimension must be 2. ' "Got {} for last dimension".format( coord_shape[-1] ) ) ret_shape = list(input_shape) ret_shape[2] = coord_shape[1] # Output height ret_shape[3] = coord_shape[2] # Output width return types.tensor(self.x.dtype, tuple(ret_shape)) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS15/linear.py0000644000000000000000000003067314672066616025014 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.mil import ( DefaultInputs, InputSpec, Operation, TensorInputType, TupleInputType, precondition, types, ) from coremltools.converters.mil.mil.operation import VALUE from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs._utils import broadcast_shapes, parse_einsum_equation from coremltools.converters.mil.mil.types import nptype_from_builtin from coremltools.converters.mil.mil.types.symbolic import is_symbolic @register_op class linear(Operation): """ Perform ``x * weight.T + bias`` where ``weight`` and ``bias`` are constant at compile time. Parameters ---------- x: tensor<[\*D,D_in], T> (Required) * ``1 <= rank <= 3``. * ``0 <= rank(*D) <= 2``. weight: const tensor<[D_out,D_in], T> (Required) bias: const tensor<[D_out],T> (Optional) * Default to ``0``. Returns ------- tensor<[\*D,D_out], T> * Same rank as the input ``x``. Attributes ---------- T: fp16, fp32, i32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), weight=TensorInputType(const=True, type_domain="T"), bias=TensorInputType(const=True, optional=True, type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), } def default_inputs(self): Dout = self.weight.shape[0] # If the bias is not provided, we initialize it a zero vector # with dtype of weight. 
return DefaultInputs( bias=np.array([0.0] * Dout, dtype=nptype_from_builtin(self.weight.dtype)), ) def type_inference(self): x_type = self.x.dtype x_shape = self.x.shape weight_shape = self.weight.shape assert len(weight_shape) == 2 if not ( x_shape[-1] == weight_shape[-1] or is_symbolic(x_shape[-1]) or is_symbolic(weight_shape[-1]) ): msg = "Op '{}' (linear op): Size of the last dimension of x, which is {}, " \ "does not match the last dimension of weights, which is {}" raise ValueError(msg.format(self.name, x_shape[-1], weight_shape[-1])) if self.bias is not None: assert len(self.bias.shape) == 1 if self.bias.shape[0] != weight_shape[-2]: msg = "Op '{}' (linear op): Size of the bias, which is {}, " \ "does not match the first dimension of weights, which is {}" raise ValueError(msg.format(self.name, self.bias.shape[0], weight_shape[-2])) shape = list(x_shape) shape[-1] = weight_shape[0] return types.tensor(x_type, tuple(shape)) @precondition(allow=VALUE) def value_inference(self): res = np.matmul(self.x.val, np.transpose(self.weight.val)) if self.bias is not None: res += self.bias.val return res @register_op class matmul(Operation): """ Perform N-D batch matrix multiplication with NumPy-style broadcasting based on the following rules: Rule 1. If both ``x, y`` are 1-D, return the scalar from the dot product. Rule 2. If both ``x, y`` are 2-D or higher, perform a broadcast on the batch dimensions (all dimensions except the last ``2``). For example: * ``x.shape == (10, 4, 3)`` * ``y.shape == (5, 10, 3, 2)`` * ``matmul(x, y).shape == (5, 10, 4, 2)`` Conventional matrix multiplication is a special case where both ``x, y`` are exactly 2-D. For example: * ``x.shape == (4, 3)`` * ``y.shape == (3, 2)`` * ``matmul(x, y).shape == (4, 2)`` If ``x`` is 1-D, and ``y`` is N-D where ``N >= 2``, ``x`` is first promoted to matrix ``xm`` by prepending a ``1`` to its dimension, and the resulting ``xm`` is broadcast to ``y`` following Rule 2 above. After this, remove the inserted dimension. For example: * ``x.shape == (4)`` * ``y.shape == (10, 4, 3)`` * ``xm.shape == (1, 4)`` * ``matmul(xm, y).shape == (10, 1, 3)`` * Removing the inserted dimension results in ``matmul(x, y).shape == (10, 3)``. * Note: ``xm`` and ``matmul(xm, y)`` are for illustration only. If ``x`` is N-D where ``N >= 2``, and ``y`` is 1-D, ``y`` is first promoted to matrix ``ym`` by appending a ``1`` to its dimension, and the resulting ``ym`` is broadcast to ``x`` following Rule 2 above. After this, remove the inserted dimension. For example: * ``x.shape == (10, 3, 4)`` * ``y.shape == (4,)`` * ``ym.shape == (4, 1)`` * ``matmul(x, ym).shape == (10, 3, 1)`` * Removing the inserted dimension results in ``matmul(x, y).shape == (10, 3)``. * Note: ``xm`` and ``matmul(xm, y)`` are for illustration only. Parameters ---------- x: tensor<[\*,K1], T> (Required) * ``x`` must be 1-D or higher. y: tensor<[\*,K2], T> (Required) * ``y`` must be 1-D or higher. transpose_x: const bool (Optional) * Default to ``False``. * Use ``True`` to transpose the last two dimensions of ``x`` before multiplication. It has no effect when ``x`` is 1-D. transpose_y: const bool (Optional) * Default to ``False``. * Use ``True`` to transpose the last two dimensions of ``y`` before multiplication. It has no effect when ``y`` is 1-D. Returns ------- tensor<\*, T> * Scalar or tensor output. 
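# Illustrative usage sketch (not part of the original file): assuming the MIL Builder
# front end, `linear` computes x * weight.T + bias with constant weight/bias, and
# `matmul` broadcasts batch dimensions as described above, e.g. (10, 4, 3) x
# (5, 10, 3, 2) -> (5, 10, 4, 2). All shapes and values are example assumptions.
import numpy as np
from coremltools.converters.mil import Builder as mb

@mb.program(input_specs=[mb.TensorSpec(shape=(2, 4)), mb.TensorSpec(shape=(10, 4, 3))])
def linear_matmul_prog(x, a):
    w = np.random.rand(3, 4).astype(np.float32)   # [D_out, D_in]
    b = np.zeros(3, dtype=np.float32)             # [D_out]
    y = mb.linear(x=x, weight=w, bias=b)          # -> (2, 3)
    z = mb.matmul(x=a, y=np.random.rand(5, 10, 3, 2).astype(np.float32))  # -> (5, 10, 4, 2)
    return y, z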
Attributes ---------- T: fp16, fp32, i32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), y=TensorInputType(type_domain="T"), transpose_x=TensorInputType(const=True, optional=True, type_domain=types.bool), transpose_y=TensorInputType(const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), } def default_inputs(self): return DefaultInputs( transpose_x=False, transpose_y=False, ) def type_inference(self): x_type = self.x.dtype x_shape = list(self.x.shape) y_shape = list(self.y.shape) x_rank = len(x_shape) if x_rank == 1 and self.transpose_x.val: msg = "Op {} (matmul): x is rank 1, but transpose_x is True, which is not allowed." raise ValueError(msg.format(self.name)) if self.transpose_x.val: x_shape = list(x_shape) x_shape[-1], x_shape[-2] = x_shape[-2], x_shape[-1] x_shape = tuple(x_shape) if self.transpose_y.val: y_shape = list(y_shape) y_shape[-1], y_shape[-2] = y_shape[-2], y_shape[-1] y_shape = tuple(y_shape) if not ( x_shape[-1] == y_shape[-2] or is_symbolic(x_shape[-1]) or is_symbolic(y_shape[-2]) ): msg = "Op {} (matmul): x {}, y {} are not broadcastable" raise ValueError(msg.format(self.name, self.x.shape, self.y.shape)) if x_rank == 1: # promote shape of x to rank 2 x_shape = list((1,) + tuple(x_shape)) ret_shape = list(broadcast_shapes(x_shape[:-2], y_shape[:-2])) ret_shape += [x_shape[-2], y_shape[-1]] if x_rank == 1: # remove the first dimension of the returned shape return types.tensor(x_type, tuple(ret_shape[1:])) else: return types.tensor(x_type, tuple(ret_shape)) @precondition(allow=VALUE) def value_inference(self): x = self.x.val if self.transpose_x.val: x = np.swapaxes(x, -1, -2) y = self.y.val if self.transpose_y.val: y = np.swapaxes(y, -1, -2) return np.matmul(x, y) @register_op class einsum(Operation): """ Perform tensor multiplication expressed according to the einsum notation. The mode/equation that is currently supported is multiplying matrices that are laid out on dimensions -1 and -3, treating all the other dimensions as batch. Broadcasting is supported along batch dimensions. In particular, the inputs must be of the following shapes: * Rank 4 input case: * Input 1: ``[B, C, H, W1]``. * Input 2: ``[B, W1, H, W2]``. * Output: ``[B, C, H, W2]``. * If, for one of the inputs, the dimensions ``"B"`` or ``"H"`` is 1, they are broadcast to match the other input. * Rank 3 input case: * Input 1: ``[C, H, W1]``. * Input 2: ``[W1, H, W2]``. * Output: ``[C, H, W2]``. * If, for one of the inputs, the dimension ``"H"`` is 1, it is broadcast to match the other input. Parameters ---------- values : Tuple(tensor_1, tensor_2) * Where: * ``tensor_1``: ``tensor<[*D, C, H, W1], T>``. * Must be of rank 3 or 4. * ``tensor_2``: ``tensor<[*D, W1, H, W2], T>``. * Must be of rank 3 or 4. equation: const * Supported equations are: * ``"nchw,nwhu->nchu"`` and its equivalent equation strings. * ``"chw,whr->chr"`` and its equivalent equation strings. Returns ------- tensor<[\*D, C, H, W2], T> * Same ranks as the inputs. 
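# Illustrative usage sketch (not part of the original file): assuming the MIL Builder
# front end, the only einsum modes supported by this op are "nchw,nwhu->nchu" (rank 4)
# and "chw,whr->chr" (rank 3). For the rank-4 case, (1, 2, 3, 4) x (1, 4, 3, 5) ->
# (1, 2, 3, 5). Shapes are example assumptions.
from coremltools.converters.mil import Builder as mb

@mb.program(
    input_specs=[mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 4, 3, 5))]
)
def einsum_prog(a, b):
    return mb.einsum(values=(a, b), equation="nchw,nwhu->nchu")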
Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( values=TupleInputType(), equation=TensorInputType(const=True, type_domain=types.str) ) def type_inference(self): if len(self.values) != 2: raise ValueError("einsum op must get \'values\' of length 2") x = self.values[0] y = self.values[1] # validate the input shapes x_type = x.dtype assert x_type == y.dtype, "input types do not match" x_shape = x.shape y_shape = y.shape assert len(x_shape) == len(y_shape), "inputs not of the same rank" if not (is_symbolic(x_shape[-1]) or is_symbolic(y_shape[-3])): assert x_shape[-1] == y_shape[-3], f"input shapes incompatible: {x_shape[-1]} and {y_shape[-3]}" if x_shape[-2] != 1 and y_shape[-2] != 1: assert x_shape[-2] == y_shape[-2], "input shapes incompatible" if len(x_shape) == 4: if x_shape[-4] != 1 and y_shape[-4] != 1: assert x_shape[-4] == y_shape[-4], "input shapes incompatible" # validate the equation input1_vec, input2_vec, output_vec = parse_einsum_equation(self.equation.val) assert \ (input1_vec == [0, 1, 2, 3] and input2_vec == [0, 3, 2, 4] and output_vec == [0, 1, 2, 4]) or \ (input1_vec == [0, 1, 2] and input2_vec == [2, 1, 3] and output_vec == [0, 1, 3]), \ "unsupported einsum equation {}".format(self.equation.val) # calculate the output shape def _get_dim_value(shape1, shape2, dim): if is_symbolic(shape1[dim]) and is_symbolic(shape2[dim]): return shape1[dim] elif is_symbolic(shape1[dim]): return shape1[dim] elif is_symbolic(shape2[dim]): return shape2[dim] else: return max(shape1[dim], shape2[dim]) out_shape = [1 for i in range(len(x_shape))] out_shape[-1] = y_shape[-1] out_shape[-3] = x_shape[-3] out_shape[-2] = _get_dim_value(x_shape, y_shape, -2) if len(x_shape) == 4: out_shape[-4] = _get_dim_value(x_shape, y_shape, -4) return types.tensor(x_type, tuple(out_shape)) @precondition(allow=VALUE) def value_inference(self): x = self.values[0] y = self.values[1] x_shape = x.val.shape y_shape = y.val.shape # broadcast dimensions -2 and -4, if required if len(x_shape) == 4: x_shape = (max(x_shape[0], y_shape[0]), x_shape[1], max(x_shape[2], y_shape[2]), x_shape[3]) y_shape = (max(x_shape[0], y_shape[0]), y_shape[1], max(x_shape[2], y_shape[2]), y_shape[3]) elif len(x_shape) == 3: x_shape = (x_shape[0], max(x_shape[1], y_shape[1]), x_shape[2]) y_shape = (y_shape[0], max(x_shape[1], y_shape[1]), y_shape[2]) else: raise ValueError("ranks of the input must be 3 or 4") res = np.einsum(self.equation.val, np.broadcast_to(x.val, x_shape), np.broadcast_to(y.val, y_shape)) return res ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS15/normalization.py0000644000000000000000000003046314672066616026425 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.mil import (DefaultInputs, InputSpec, Operation, TensorInputType, precondition, types) from coremltools.converters.mil.mil.operation import VALUE from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.types.symbolic import any_symbolic @register_op class batch_norm(Operation): """ Normalize input tensor ``x`` by ``mean`` and ``variance``, and optionally apply a scale ``gamma`` and an offset ``beta``: .. 
math:: y_i = \\gamma_i \\dfrac{ (x_i - mean_i)}{\\sqrt{variance_i + epsilon}} + beta_i \\;,\\;i=1,....,C The ``mean``, ``variance``, ``gamma``, and ``beta`` must be 1-D tensors whose lengths are equal to the second axis (the "depth" or "channel" dimension) of ``x``. Parameters ---------- x: tensor<[n,C,*D], T> (Required) * ``3 <= rank <= 5``. * ``*D`` refers to the spatial dimensions, ``1 <= rank(*D) <= 3``. * ``n`` is the batch dimension. mean: const tensor<[C], T> (Required) variance: const tensor<[C], T> (Required) gamma: const tensor<[C], T> (Optional) * Optional scale applied to normalized tensor. * Default is all ones. beta: const tensor<[C], T> (Optional) * Optional offset applied to normalized tensor. * Default is all zeros. epsilon: const T (Optional) * Default is ``1e-5``. Returns ------- tensor<[n,C,*D], T> * Output tensor has the same shape and type as the input ``x``. Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), mean=TensorInputType(const=True, type_domain="T"), variance=TensorInputType(const=True, type_domain="T"), gamma=TensorInputType(const=True, optional=True, type_domain="T"), beta=TensorInputType(const=True, optional=True, type_domain="T"), epsilon=TensorInputType(const=True, optional=True, type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): return DefaultInputs( gamma=None, beta=None, epsilon=1e-5, ) def type_inference(self): x_shape = self.x.shape return types.tensor(self.x.dtype, tuple(x_shape)) @register_op class instance_norm(Operation): """ Apply instance normalization to the n-dimensional input tensor. Parameters ---------- x: tensor<[n,C,*D], T> (Required) * ``3 <= rank(x) <= 4``. * ``*D`` refers to the spatial dimensions, ``1 <= rank(*D) <= 2``. * ``n`` is the batch dimension. gamma: const tensor<[C], T> (Optional) * Optional scale applied to normalized tensor. * Default to all ones. beta: const tensor<[C], T> (Optional) * Optional offset applied to normalized tensor. * Default to all zeros. epsilon: const f32 (Optional) * Default to ``1e-5``. Returns ------- tensor<[n,C,*D], T> * Output tensor has the same shape and type as the input ``x``. Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), gamma=TensorInputType(const=True, optional=True, type_domain="T"), beta=TensorInputType(const=True, optional=True, type_domain="T"), epsilon=TensorInputType(const=True, optional=True, type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): return DefaultInputs( gamma=None, beta=None, epsilon=1e-5, ) def type_inference(self): x_shape = self.x.shape return types.tensor(self.x.dtype, tuple(x_shape)) @register_op class l2_norm(Operation): """ Apply L2 normalization to the n-dimensional input tensor. That is, divide the input tensor by the square root of the sum of squares of all elements of the input. .. math:: x_i \\leftarrow \\dfrac{x_i}{\\sqrt{\\sum{x_i^2} + \\epsilon}} Parameters ---------- x: tensor<[\*B, \*D], T> (Required) * Input tensor, ``rank(x) >= 3``. * ``*B`` refers to the leading dimensions. * ``*D`` refers to the spatial dimensions to be normalized. Must be rank 3: ``rank(*D) == 3``. * When ``rank(x) == 3``, in which ``rank(*B) == 0 and rank(*D) == 3``, the input is divided by the square root of the sum of squares of all elements. 
* For ranks greater than 3, in which ``rank(*B) >= 1 and rank(*D) == 3``, the leading dimensions \*B, starting from ``0`` to ``-4`` (inclusive), are all treated as batch. The L2 normalization is done batch-wise. epsilon: const T (Optional) * Small constant to avoid division by ``0``. * Optional, defaults to ``1e-6``. Returns ------- tensor<[\*B, \*D], T> * Same type and shape as the input tensor ``x``. Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), epsilon=TensorInputType(const=True, optional=True, type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): return DefaultInputs( epsilon=1e-6, ) def type_inference(self): if self.x.rank < 3: msg = "Input rank of l2_norm must be at least 3. Got {}".format(self.x.rank) raise ValueError(msg) x_shape = self.x.shape return types.tensor(self.x.dtype, tuple(x_shape)) @precondition(allow=VALUE) def value_inference(self): val = self.x.val eps = self.epsilon.val shape = self.x.shape rank = self.x.rank batch_dims = rank - 3 if batch_dims == 0: square_sum = np.sum(val**2) output = val/np.power(square_sum + eps, 0.5) else: batch_dim_prod = np.prod(shape[:batch_dims]) reshape_val = np.reshape(val, (batch_dim_prod, -1)) square_sum = np.sum(reshape_val * reshape_val, axis=1, keepdims=True) + eps output = reshape_val/np.power(square_sum, 0.5) output = np.reshape(output, shape) return output @register_op class layer_norm(Operation): """ Apply layer normalization to the n-dimensional input tensor: .. math:: out = gamma * (input - E[x]) / sqrt(Var[x] + epsilon) + beta Parameters ---------- x: tensor<\*?, T> (Required) * Input tensor. axes: const<[K], i32> (Optional) * Dimensions to perform layer normalization. * Default is ``None`` (all dimensions). gamma: const tensor<\*?, T> (Optional) * If provided, the shape must be ``x.shape[axes]``. For instance, if input ``x`` with shape ``(3,4,5,6)`` and ``axes = [2,3]``, gamma must have shape ``(5,6)``. * Default is all ones. beta: const tensor<\*?, T> (Optional) * Same shape as gamma. * Default is all zeros. epsilon: const T (Optional) * Small constant to avoid division by ``0``. * Default is ``1e-5``. Returns ------- tensor<\*?, T>: * Tensor with same shape and type as the input tensor ``x``.
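# Illustrative usage sketch (not part of the original file): assuming the MIL Builder
# front end, `layer_norm` over the last two axes of a (3, 4, 5, 6) tensor, so gamma and
# beta have shape (5, 6) as the docstring's example describes. Values are assumptions.
import numpy as np
from coremltools.converters.mil import Builder as mb

@mb.program(input_specs=[mb.TensorSpec(shape=(3, 4, 5, 6))])
def layer_norm_prog(x):
    gamma = np.ones((5, 6), dtype=np.float32)
    beta = np.zeros((5, 6), dtype=np.float32)
    return mb.layer_norm(x=x, axes=[2, 3], gamma=gamma, beta=beta, epsilon=1e-5)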
Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), axes=TensorInputType(const=True, optional=True, type_domain=types.int32), gamma=TensorInputType(const=True, optional=True, type_domain="T"), beta=TensorInputType(const=True, optional=True, type_domain="T"), epsilon=TensorInputType(const=True, optional=True, type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): return DefaultInputs( axes=range(self.x.rank), gamma=None, beta=None, epsilon=1e-5, ) @staticmethod def _is_compatible_shape(shapea, shapeb): if not len(shapea) == len(shapeb): return False for a, b in zip(shapea, shapeb): if any_symbolic([a, b]): continue if a != b: return False return True def type_inference(self): rank = self.x.rank # check valid axes positive_axes = [axis + rank if axis < 0 else axis for axis in self.axes.val] if not all([axis >= 0 and axis < rank for axis in positive_axes]): raise ValueError("axes must in the range of [-x.rank, x.rank-1].") # check shape of gamma and beta normalized_shape = [self.x.shape[i] for i in range(rank) if i in positive_axes] if self.gamma is not None and not layer_norm._is_compatible_shape(list(self.gamma.shape), normalized_shape): raise ValueError("Expect shape {} for gamma, but get shape {} instead".format(normalized_shape, self.gamma.shape)) if self.beta is not None and not layer_norm._is_compatible_shape(list(self.gamma.shape), normalized_shape): raise ValueError("Expect shape {} for beta, but get shape {} instead".format(normalized_shape, self.beta.shape)) x_shape = self.x.shape return types.tensor(self.x.dtype, tuple(x_shape)) @precondition(allow=VALUE) def value_inference(self): def np_layer_norm(x, axes, gamma, beta, epsilon=1e-5): rank = len(x.shape) axes = [axis + rank if axis < 0 else axis for axis in axes] normalized_shape = [x.shape[i] if i in axes else 1 for i in range(rank)] gamma = np.ones(shape=normalized_shape) if gamma is None else np.reshape(gamma, normalized_shape) beta = np.zeros(shape=normalized_shape) if beta is None else np.reshape(beta, normalized_shape) num = x - np.mean(x, axis=tuple(axes), keepdims=True) dem = np.sqrt( np.sum(np.square(num), axis=tuple(axes), keepdims=True) / np.prod(normalized_shape) + epsilon ) return num / dem * gamma + beta _axes = self.x.shape if self.axes is None else self.axes.val _gamma = None if self.gamma is None else self.gamma.val _beta = None if self.beta is None else self.beta.val return np_layer_norm(self.x.val, _axes, _gamma, _beta, self.epsilon.val) @register_op class local_response_norm(Operation): """ Apply local response normalization to the n-dimensional input tensor: .. math:: x_i \\leftarrow \\dfrac{x_i}{\\left ( k + \\dfrac{\\alpha}{\\text{size}} \\sum_j x_j^2 \\right )^\\beta} Parameters ---------- x: tensor<[n,C,*D], T> (Required) * Input tensor, ``3 <= rank(x) <= 4``. * ``*D`` refers to the spatial dimensions, ``1 <= rank(*D) <= 2``. * ``n`` is the batch dimension. size: const i32 (Required) * Amount of neighboring channels to normalize. alpha: const T (Optional) * Scale factor. * Default is ``1e-4``. beta: const T (Optional) * An exponent. * Default is ``0.75``. k: const T (Optional) * Additive factor. * Default is ``1.0``. Returns ------- tensor<[n,C,*D], T> * Same type and shape as the input tensor ``x``. 
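# A hedged NumPy sketch of the layer_norm math above. The name `layer_norm_ref` and the
# sample shapes are illustrative assumptions; statistics are computed over `axes`, and
# gamma/beta take the shape of x restricted to those axes, as the docstring describes.
import numpy as np

def layer_norm_ref(x, axes, gamma=None, beta=None, epsilon=1e-5):
    axes = tuple(a % x.ndim for a in axes)                     # normalize negative axes
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    out = (x - mean) / np.sqrt(var + epsilon)
    bshape = [x.shape[i] if i in axes else 1 for i in range(x.ndim)]
    if gamma is not None:
        out = out * np.reshape(gamma, bshape)
    if beta is not None:
        out = out + np.reshape(beta, bshape)
    return out

x = np.random.rand(3, 4, 5, 6).astype(np.float32)
y = layer_norm_ref(x, axes=[2, 3], gamma=np.ones((5, 6)), beta=np.zeros((5, 6)))
assert y.shape == x.shape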
Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), size=TensorInputType(const=True, type_domain=types.int32), alpha=TensorInputType(const=True, optional=True, type_domain="T"), beta=TensorInputType(const=True, optional=True, type_domain="T"), k=TensorInputType(const=True, optional=True, type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): return DefaultInputs( alpha=1e-4, beta=0.75, k=1., ) def type_inference(self): x_shape = self.x.shape return types.tensor(self.x.dtype, tuple(x_shape)) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS15/pool.py0000644000000000000000000002212414672066616024503 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import Operation, types from coremltools.converters.mil.mil.block import curr_opset_version from coremltools.converters.mil.mil.input_type import (DefaultInputs, InputSpec, TensorInputType) from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs._utils import \ spatial_dimensions_out_shape from coremltools.converters.mil.mil.ops.defs.iOS15 import _IOS15_TARGET class Pooling(Operation): """ Pooling Op Superclass """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), kernel_sizes=TensorInputType(const=True, type_domain=types.int32), strides=TensorInputType(const=True, optional=True, type_domain=types.int32), pad_type=TensorInputType(const=True, type_domain=types.str), pad=TensorInputType(const=True, optional=True, type_domain=types.int32), ceil_mode=TensorInputType(const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): num_spatial_dims = self.x.rank - 2 return DefaultInputs( strides=[1] * num_spatial_dims, pad=[0] * 2 * num_spatial_dims, ceil_mode=False, ) def type_inference(self): ksize = self.kernel_sizes.val x_shape = self.x.shape D_in_rank = len(x_shape) - 2 strides = [1] * D_in_rank if self.strides is None else self.strides.val pad_type = "valid" if self.pad_type is None else self.pad_type.val.lower() if pad_type not in ["valid", "same", "custom", "same_lower"]: raise ValueError("Unrecognized value of pad_type : {}".format(pad_type)) pad = None if self.pad is None else self.pad.val D_in = x_shape[2:] # spatial dimensions if self.ceil_mode.val: if D_in_rank > 2: raise ValueError('pool: ceil_mode only supported for 1D or 2D pool') if pad_type == "same" and self.ceil_mode.val: raise ValueError("ceil_mode must be False when pad_type==same") if pad is not None: for i in range(D_in_rank): if pad[2 * i] != pad[2 * i + 1]: raise ValueError("Padding must be symmetric if ceil_mode is True") # The same_lower padding is not supported in iOS15 if curr_opset_version() == _IOS15_TARGET and self.pad_type.val == "same_lower": msg = "iOS15 version of pooling layers do not support pad_type = `same_lower`" raise ValueError(msg) D_out_shape = spatial_dimensions_out_shape( pad_type=pad_type, input_shape=D_in, kernel_shape=ksize, strides=strides, custom_pad=pad, ceil_mode=self.ceil_mode.val, ) ret_shape = list(x_shape[:2]) + D_out_shape return types.tensor(self.x.dtype, tuple(ret_shape)) @register_op class 
avg_pool(Pooling): """ Perform average pooling. Supports 1-D, 2-D, and 3-D pool (1, 2, or 3 spatial dimensions). Parameters ---------- x: tensor<[n,C_in,\*D_in], T> (Required) * ``3 <= rank <= 5``. * ``D_in`` are spatial dimensions, ``1 <= len(D_in) <= 3``. * ``C_in`` is the number of input channels or depth dimensions. * ``n`` is the batch dimension. kernel_sizes: const tensor<[K], T> (Required) * The size of the window for each spatial dimension ``D_in`` of the input tensor. * ``K == len(D_in)`` strides: const tensor<[S],i32> (Optional, default to all 1s) * Stride along each of the spatial dimensions. * ``S == len(D_in)``. pad_type: const str (Required) Must be one of ``valid``, ``same``, ``custom`` or ``same_lower``. * ``valid``: No padding. This is equivalent to custom pad with ``pad[i] = 0, for all i``. * ``same`` : This is equivalent to custom pad with ``pad[2*i] + pad[2*i+1] = kernel_size[i]``. * ``custom``: Specify custom padding in the parameter pad. note that ``same`` padding is equivalent to custom padding with ``pad[2*i] + pad[2*i+1] = kernel_size[i]``. * ``same_lower``: Similar to ``same`` but the padding will place extra rows/cols on the top/left if the padding amount is odd. pad: const<[P],i32> (Optional. Default to all 0s) * ``pad`` represents the number of elements to pad before and after each dimension: ``pad[2*i], pad[2*i+1]`` are the pad size before and after spatial dimension ``i``. * ``P = 2 * len(D_in)``. * ``pad`` should be specified if and only if ``pad_type == custom`` exclude_padding_from_average: const tensor<[], bool> (Optional, default to False) * If ``True``, padded values (0s) are excluded from the denominator count when computing the average over the kernel window. ceil_mode: const * Same as PyTorch's ``ceil`` mode. * ``ceil`` is used instead of floor in calculating the output size. * Optional, defaults to ``False``. * Only applicable when ``pad_type`` is ``valid`` or ``custom``. * When ``ceil_mode`` is True, padding must be symmetric; that is, if specified, ``pad[2*i] == pad[2*i+1]`` must hold. Returns ------- tensor<[n, C_out,\*D_out], T> * Same rank as ``x``. * ``C_out`` is the number of output channels or depth dimensions. * When ``ceil_mode = False``: * ``D_out[i] = floor[(D_in[i] + pad[2*i] + pad[2*i+1] - kernel_sizes[i]) / strides[i]] +1, for i = 0, .., len(D_in) - 1`` is mathematically the same as (when all parameters involved are integers): * ``D_out[i] = ceil [(D_in[i] + pad[2*i] + pad[2*i+1] - kernel_size[i] - 1) / stride[i]], for i = 0, .., len(D_in) - 1``. * ``*D_out`` is all ones if ``global_pooling`` is ``true``. * When ``ceil_mode = True``: * ``D_out[i] = ceil[(D_in[i] + pad[2*i] + pad[2*i+1] - kernel_sizes[i]) / strides[i]] +1, for i = 0, .., len(D_in) - 1`` * If ``(D_out[i] - 1) * strides[i] >= D_in[i] + pad[2*i] and (pad[2*i] + pad[2*i+1] > 0)`` then ``D_out[i] = D_out[i] - 1``. * The first equation is same as: * ``D_out[i] = floor[(D_in[i] + pad[2*i] + pad[2*i+1] - kernel_sizes[i] + strides[i] - 1) / strides[i]] +1, for i = 0, .., len(D_in) - 1`` Attributes ---------- T: fp16, fp32 See Also -------- l2_pool, max_pool """ input_spec = ( InputSpec( exclude_padding_from_average=TensorInputType( const=True, optional=True, type_domain=types.bool ) ) + Pooling.input_spec ) def default_inputs(self): return super().default_inputs() + DefaultInputs( exclude_padding_from_average=False, ) @register_op class l2_pool(Pooling): """ Perform L2 pooling. Supports 1-D and 2-D pool. 
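# An illustrative sketch (assumed helper, not the library's internal utility) of the
# output-size arithmetic documented for avg_pool above, for pad_type == "custom".
import math

def pool_out_size(d_in, kernel, stride, pad_before=0, pad_after=0, ceil_mode=False):
    total = d_in + pad_before + pad_after - kernel
    if ceil_mode:
        d_out = math.ceil(total / stride) + 1
        # Correction from the docstring: the last window must start inside the padded input.
        if (d_out - 1) * stride >= d_in + pad_before and (pad_before + pad_after) > 0:
            d_out -= 1
        return d_out
    return math.floor(total / stride) + 1

# Example: a 3-wide pool with stride 2 and symmetric padding of 1 on a length-7 axis.
assert pool_out_size(7, kernel=3, stride=2, pad_before=1, pad_after=1) == 4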
Parameters ---------- x: tensor<[n,C_in,*D_in], T> (Required) * Only support 1d and 2d pooling. * See :py:class:`avg_pool`. kernel_sizes: const tensor<[K], T> (Required) * See :py:class:`avg_pool`. strides: const tensor<[S],i32> (Optional, default to all 1s) * See :py:class:`avg_pool`. pad_type: const str (Required) * See :py:class:`avg_pool`. pad: const<[P],i32> (Optional, default to all 0s) * See :py:class:`avg_pool`. Returns ------- tensor<[n, C_out,*D_out], T> * See :py:class:`avg_pool`. Attributes ---------- T: fp16, fp32 See Also -------- avg_pool, max_pool """ def type_inference(self): if self.x.rank - 2 > 2: msg = "l2_pool only supports rank 1 or 2. Got rank: {}".format(self.x.rank - 2) raise ValueError(msg) return super().type_inference() @register_op class max_pool(Pooling): """ Perform max pooling. Supports 1-D, 2-D, and 3-D pool. Parameters ---------- x: tensor<[n,C_in,*D_in], T> (Required) * See :py:class:`avg_pool`. kernel_sizes: const tensor<[K], T> (Required) * See :py:class:`avg_pool`. strides: const tensor<[S],i32> (Optional, default to all 1s) * See :py:class:`avg_pool`. pad_type: const str (Required) * See :py:class:`avg_pool`. pad: const<[P],i32> (Optional, default to all 0s) * See :py:class:`avg_pool`. ceil_mode: const * see :py:class:`avg_pool`. Returns ------- tensor<[n, C_out,*D_out], T> * See :py:class:`avg_pool`. Attributes ---------- T: fp16, fp32 See Also -------- avg_pool, l2_pool """ pass ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS15/random.py0000644000000000000000000002247014672066616025016 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import (get_new_symbol, get_new_variadic_symbol, types) from coremltools.converters.mil.mil.input_type import (DefaultInputs, InputSpec, TensorInputType) from coremltools.converters.mil.mil.operation import Operation from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.types.symbolic import any_symbolic class RandomDistribution(Operation): """ Random Op Superclass """ input_spec = InputSpec( shape=TensorInputType(type_domain=types.int32), ) out_dtype = types.fp32 def type_inference(self): if any_symbolic(self.shape.shape): # We can't infer any shape if shape has variable length. return types.tensor(self.out_dtype, (get_new_variadic_symbol(),)) # shape has fixed length here. if self.shape.sym_val is None: shape = tuple([get_new_symbol() for _ in range(self.shape.shape[0])]) return types.tensor(self.out_dtype, shape) return types.tensor(self.out_dtype, tuple(self.shape.sym_val.tolist())) """ Random Op Implementation(s) """ @register_op class random_bernoulli(RandomDistribution): r""" Returns a tensor with the specified shape, with random values from a Bernoulli distribution. .. math:: f(k) = \begin{cases}1-p &\text{if } k = 0\\ p &\text{if } k = 1\end{cases} for :math:`k` in :math:`\{0, 1\}`. Parameters ---------- shape: (Required) * Target output tensor shape. * ``K`` is the rank of the output tensor. ``shape[k] > 0`` for ``k = 0,..., K-1``. prob: const (Optional) * The probability of sampling ``1``. Defaults to ``0.5``. seed: const (Optional) * Seed to create a reproducible sequence of values across multiple invokes. 
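# A NumPy analogue (illustration only) of the random_bernoulli semantics above: an
# output tensor of the requested shape whose elements are 1 with probability `prob`
# and 0 otherwise. The seed plays the same reproducibility role as the op's `seed`.
import numpy as np

rng = np.random.default_rng(seed=42)
sample = rng.binomial(n=1, p=0.5, size=(2, 3)).astype(np.float32)
assert sample.shape == (2, 3) and set(np.unique(sample)) <= {0.0, 1.0}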
Returns ------- <\*, T> * A tensor of the given target output shape filled with random values. Attributes ---------- T: fp16, fp32 See Also -------- random_categorical, random_normal, random_uniform """ input_spec = ( InputSpec( shape=TensorInputType(type_domain=types.int32), prob=TensorInputType(const=True, optional=True, type_domain="T"), seed=TensorInputType(const=True, optional=True, type_domain=types.int32), ) + RandomDistribution.input_spec ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): return super().default_inputs() + \ DefaultInputs( seed=-1, prob=0.5, ) def type_inference(self): self.out_dtype = self.prob.dtype return super().type_inference() @register_op class random_categorical(Operation): """ Returns random values from a categorical distribution. Parameters ---------- x: <\*D_in, T> * N-dimensional tensor which represents ``logits`` (event log-probabilities) or ``probs`` (event probabilities) depending on ``mode``. The first ``N - 1`` dimensions specifies distributions, and the last dimension represents a vector of probabilities. mode: const (Optional) One of ``['logits', 'probs']``. Defaults to ``logits``. When set to ``probs``, an element-wise log layer will be added to calculate logits. size: const (Optional) Number of samples to draw. Defaults to ``1``. When set as ``1``, it's categorical distribution. When set larger than ``1``, it's actually multinomial distribution by drawing with replacement. It means that when a sample index is drawn, it can be drawn again. The categorical distribution is a special case of the multinomial distribution, giving the probabilities of potential outcomes of a single drawing rather than multiple drawings. seed: const (Optional) Seed to create a reproducible sequence of values across multiple invokes. Returns ------- <\*D_in[:-1] + [size], T> * A tensor of the given target output shape filled with random values. Attributes ---------- T: fp16, fp32 See Also -------- random_bernoulli, random_normal, random_uniform """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), mode=TensorInputType(const=True, optional=True, type_domain=types.str), size=TensorInputType(const=True, optional=True, type_domain=types.int32), seed=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): return DefaultInputs( mode="logits", size=1, seed=-1, ) def type_inference(self): self.out_dtype = self.x.dtype output_shape = self.x.shape[:-1] + (self.size.val,) return types.tensor(self.out_dtype, output_shape) @register_op class random_normal(RandomDistribution): r""" Returns a tensor with the specified shape, with random values from a normal distribution. Parameters ---------- shape: (Required) * Target output tensor shape. * ``K`` is the rank of the output tensor. ``shape[k] > 0`` for ``k = 0,..., K-1``. mean: const (Optional) The mean (center) of the normal distribution. Defaults to 0.0. stddev: const (Optional) The standard deviation (width) of the normal distribution. Defaults to ``1.0``. seed: const (Optional) Seed to create a reproducible sequence of values across multiple invokes. Returns ------- <\*, T> * A tensor of the given target output shape filled with random values. 
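# A hedged NumPy sketch of the random_categorical behaviour described above: the last
# axis of `x` holds logits, and `size` samples are drawn with replacement per
# distribution, giving an output of shape x.shape[:-1] + (size,). Names are illustrative.
import numpy as np

def categorical_ref(logits, size=1, seed=0):
    rng = np.random.default_rng(seed)
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    flat = probs.reshape(-1, probs.shape[-1])
    draws = np.stack([rng.choice(p.shape[0], size=size, p=p) for p in flat])
    return draws.reshape(logits.shape[:-1] + (size,))

out = categorical_ref(np.zeros((2, 4, 5)), size=3)   # uniform logits over 5 classes
assert out.shape == (2, 4, 3)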
Attributes ---------- T: fp16, fp32 See Also -------- random_categorical, random_bernoulli, random_uniform """ input_spec = ( InputSpec( shape=TensorInputType(type_domain=types.int32), mean=TensorInputType(const=True, optional=True, type_domain="T"), stddev=TensorInputType(const=True, optional=True, type_domain="T"), seed=TensorInputType(const=True, optional=True, type_domain=types.int32), ) + RandomDistribution.input_spec ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): return super().default_inputs() + \ DefaultInputs( mean=0., stddev=1., seed=-1, ) def type_inference(self): if self.mean.dtype != self.stddev.dtype: raise ValueError("Incompatible primitive types in random_normal operation") self.out_dtype = self.mean.dtype return super().type_inference() @register_op class random_uniform(RandomDistribution): r""" Returns a tensor with the specified shape with random values from a uniform distribution. Samples are uniformly distributed over the half-open interval ``[low, high)`` (includes low, but excludes high). .. math:: p(x) = \frac{1}{high - low} For a real number :math:`x`. When ``high == low``, values of ``low`` will be returned. If ``high < low``, the results are officially undefined and may eventually raise an error. Parameters ---------- shape: (Required) * Target output tensor shape. * ``K`` is the rank of the output tensor. ``shape[k] > 0`` for ``k = 0,..., K-1``. low: const (Optional) * Lower boundary of the output interval (inclusive). Defaults to ``0.0``. high: const (Optional) * Upper boundary of the output interval (exclusive). Defaults to ``1.0``. seed: const (Optional) * Seed to create a reproducible sequence of values across multiple invokes. Returns ------- <\*, T> * A tensor of the given target output shape filled with random values. Attributes ---------- T: fp16, fp32 See Also -------- random_categorical, random_bernoulli, random_normal """ input_spec = ( InputSpec( shape=TensorInputType(type_domain=types.int32), low=TensorInputType(const=True, optional=True, type_domain="T"), high=TensorInputType(const=True, optional=True, type_domain="T"), seed=TensorInputType(const=True, optional=True, type_domain=types.int32), ) + RandomDistribution.input_spec ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): return super().default_inputs() + \ DefaultInputs( low=0., high=1., seed=-1, ) def type_inference(self): if self.low.dtype != self.high.dtype: raise ValueError("Incompatible primitive types in random_uniform operation") self.out_dtype = self.low.dtype return super().type_inference() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS15/recurrent.py0000644000000000000000000005145414672066616025553 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import Operation, Var, types from coremltools.converters.mil.mil.input_type import DefaultInputs, InputSpec, TensorInputType from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op @register_op class gru(Operation): r""" Gated Recurrent Unit (GRU) .. math:: r_t = \rm{recurrent\_activation}(W_{ir} x_t + b_{ir} + W_{hr} h_{t-1} + b_{hr}) .. math:: z_t = \rm{recurrent\_activation}(W_{iz} x_t + b_{iz} + W_{hz} h_{t-1} + b_{hz}) .. 
math:: o_t = \rm{activation}(W_{io} x_t + b_{io} + r_t * W_{ho} h_{t-1} + b_{ho}) .. math:: h_t = (1 − z_t) * o_t + z_t * h_{t−1} Where: * :math:`W_{i[r|o|z]}` are state input weights for reset, output and update gate, respectively. * :math:`b_{i[r|o|z]}` are input biases for reset, output and update gate, respectively. * :math:`W_{h[r|o|z]}` are recurrent/hidden weights on hidden state to reset, output, and update gates, respectively. * :math:`b_{h[r|o|z]}` are recurrent/hidden biases on hidden state to reset, output, and update gates, respectively. * :math:`h_t` is the hidden state at time ``t``. * :math:`x_t` is the input at time ``t``. * :math:`h_{t-1}` is the hidden state of the layer at time ``t-1`` or the initial hidden state at time ``0``. * :math:`r_t`, :math:`o_t`, and :math:`z_t` are the reset, new, and update gates, respectively. * :math:`*` is elementwise product. Parameters ---------- x: (Required) * ``s`` is the sequence length, ``b`` is the batch size, and ``I`` is the input dimension. initial_h: (Required) * ``H`` denotes hidden size. weight_ih: const<3*H, I, T> (Required) - Weight matrix * ``weigh_ih = [W_{ir} | W_{io} | W_{iz}]`` where ``[a|b]`` denotes column concatenation and ``[a, b]`` denotes row concatenation. ``W_{ir}``, ``W_{io}``, and ``W_{iz}`` have shape ``(H, I)``. weight_hh: const<3*H, H, T> (Required) - Weight matrix * ``weight_hh = [W_{hr} | W_{ho} | W_{hz}]``: ``W_{hr}``, ``W_{ho}``, and ``W_{hz}`` have shape ``(H, H)``. bias: const<3*H, T> (Optional) [Default all 0s] * ``bias[0]`` are input-hidden and hidden-hidden bias. * ``3*H`` are biases for ``[b_{ir} | b_{io} | b_{hz}]``. direction: const (Optional) [Default=forward] * Either ``forward`` or ``reverse``. output_sequence: const (Optional) [Default=False] * Outputs every step if ``True``. recurrent_activation: const (Optional) [Default=sigmoid] * Activation applied on update and reset gate. activation: const (Optional) [Default=tanh] * Activation applied on output gate. Returns ------- or <1, b, H, T> * If ``output_sequence == True`` (hidden states from every step): ````. * Else ``<1, b, H, T>`` (hidden states of the final step). * Hidden states of the final step. Attributes ---------- T: fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), initial_h=TensorInputType(type_domain="T"), weight_ih=TensorInputType(const=True, type_domain="T"), weight_hh=TensorInputType(const=True, type_domain="T"), bias=TensorInputType(const=True, optional=True, type_domain="T"), direction=TensorInputType(const=True, optional=True, type_domain=types.str), output_sequence=TensorInputType(const=True, optional=True, type_domain=types.bool), recurrent_activation=TensorInputType(const=True, optional=True, type_domain=types.str), activation=TensorInputType(const=True, optional=True, type_domain=types.str) ) type_domains = { "T": (types.fp32,), } def default_inputs(self): return DefaultInputs( bias=None, direction="forward", output_sequence=False, recurrent_activation="sigmoid", activation="tanh", ) def type_inference(self): if self.x.rank != 3: raise ValueError( "Invalid input shape. Expecting Rank 3 input, got {}".format( len(self.x.rank) ) ) sequence_length, batch_size, input_size = self.x.shape if self.weight_ih.rank != 2: raise ValueError( "Invalid weight shape. Expecting Rank 2 input, got {}".format( len(self.weight_ih.rank) ) ) if self.weight_hh.rank != 2: raise ValueError( "Invalid weight shape. 
Expecting Rank 2 input, got {}".format( len(self.weight_hh.rank) ) ) hidden_dim, hidden_size = self.weight_hh.shape direction = self.direction.val valid_directions = {"forward", "reverse"} if direction not in valid_directions: raise ValueError( "Direction {} not supported. Supported directions: {}".format( direction, valid_directions ) ) dim_factor = 3 if hidden_size != (hidden_dim // dim_factor): raise ValueError( "Incorrect weight matrix: hidden dim size mismatch. \ Provided weight_ih {}, weight_hh {}. Expecting ".format( self.weight_ih.shape, self.weight_hh.shape ) ) out_seq_len = sequence_length if self.output_sequence.val else 1 output_shape = [out_seq_len, batch_size, hidden_size] output_h_shape = [batch_size, hidden_size] return ( types.tensor(self.x.dtype, tuple(output_shape)), types.tensor(self.x.dtype, tuple(output_h_shape)), ) @register_op class lstm(Operation): r""" Long Short-Term Memory (LSTM) .. math:: i_t = \rm{recurrent\_activation}(W_{ii} x_t + B_{ii} + W_{hi} h_{t-1} + B_{hi}) .. math:: f_t = \rm{recurrent\_activation}(W_{if} x_t + B_{if} + W_{hf} h_{t-1} + B_{hf}) .. math:: z_t = \rm{cell\_activation}(W_{iz} x_t + B_{iz} + W_{hz} h_{t-1} + B_{hz}) .. math:: o_t = \rm{recurrent\_activation}(W_{io} x_t + B_{io} + W_{ho} h_{t-1} + B_{ho}) .. math:: c_t = f_t * c_{t-1} + i_t * z_t .. math:: h_t = o_t * \rm{activation(c_t)} Where: * :math:`i_t`, :math:`f_t`, :math:`o_t`, and :math:`z_t` are input, forget, output, and cell gates, respectively, at time ``t``. * :math:`c_t` is cell state at time ``t``. * :math:`h_t` is the hidden state at time ``t``. * :math:`W_{ii}`, :math:`W_{if}`, :math:`W_{io}`, and :math:`W_{iz}` are input weights for input, forget, output, and cell gate, respectively. * :math:`B_{ii}`, :math:`B_{if}`, :math:`B_{io}`, and :math:`B_{iz}` are input biases for input, forget, output, and cell gate, respectively. * :math:`W_{hi}`, :math:`W_{hf}`, :math:`W_{ho}`, and :math:`W_{hz}` are recurrent weights for input, forget, output, and cell gate, respectively. * :math:`B_{hi}`, :math:`B_{hf}`, :math:`B_{ho}`, and :math:`B_{hz}` are recurrent weights for input, forget, output, and cell gate, respectively. Parameters ---------- x: (Required) * ``s`` is the sequence length, ``b`` is the batch size, and ``I`` is the input dimension. initial_h: (Required) * Initial hidden state. ``DIRECTIONS = 1`` for uni-directional. ``DIRECTIONS = 2`` for bi-directional LSTM. * ``H`` denotes hidden size. * ``[b, :H]`` and ``[b, H:]`` represents forward and reverse direction values, respectively. initial_c: (Required) * Initial cell state. * Format is same as ``initial_h``. weight_ih: const<4*H, I, T> (Required) * Input-hidden weight matrix * Weight tensor should be in order of ``[input_gate, forget_gate, output_gate, cell_gate]``. * If direction=="bidirectional", this is applied in forward direction. * If direction=="forward" or "backward" these weights are used. weight_hh: const<4*H, H, T> (Required) * Hidden-hidden weight matrix. * Weight tensor should be in order of ``[input_gate, forget_gate, output_gate, cell_gate]``. * If direction=="bidirectional", this is applied in forward direction. * If direction=="forward" or "backward" these weights are used. bias: const<4*H, T> (Optional, default all 0s) * bias = input-hidden bias + hidden-hidden bias * If direction=="bidirectional", this is applied in forward direction. * If direction=="forward" or "backward" this bias are used. peephole: const<3*H, T> (Optional, default all 0s) * Weight tensor for peephole. 
* Order is ``[input_gate, forget_gate, output_gate]``. * Shape of each peephole vector is ``(H,)`` (``H`` is hidden size). * If direction=="bidirectional", this is applied in forward direction. * If direction=="forward" or "backward" these weights are used. weight_ih_back: const<4*H, I, T> (Optional) - * Input-hidden weight matrix for backward direction for `bidirectinal LSTM`. * Weight tensor should be in order of ``[input_gate, forget_gate, output_gate, cell_gate]``. * Must be provided for `bidirectional LSTM`. * This is only used when `direction` is "bidirectional". * For direction="reverse" use `weight_ih` instead. weight_hh_back: const<4*H, H, T> (Optional) - Hidden-hidden weight matrix * Hidden-hidden weight matrix for backward direction for `bidirectinal LSTM`. * Weight tensor should be in order of ``[input_gate, forget_gate, output_gate, cell_gate]``. * Must be provided for `bidirectional LSTM`. * This is only used when `direction` is "bidirectional". * For direction="reverse" use `weight_hh` instead. bias_back: const<4*H, T> (Optional, default all 0s) * bias = input-hidden bias + hidden-hidden bias. * Bias of backward direction for `bidirectional lstm` * This is only used when `direction` is "bidirectional". * For direction="reverse" use `bias` instead. peephole_back: const<3*H, T> (Optional, default all 0s) * Weight tensor for peephole in backward direction for `bidirectional LSTM`. * Order is ``[input_gate, forget_gate, output_gate]``. * Shape of each peephole vector is ``(H,)`` (``H`` is hidden size). * Peephole of backward direction for `bidirectional lstm` * Bias of backward direction for `bidirectional lstm` * This is only used when `direction` is "bidirectional". * For direction="reverse" use `peephole` instead. direction: const (Optional) [Default=forward] * One of the following: ``forward``, ``reverse``, or ``bidirectional``. * Must match ``DIRECTIONAL`` in initial states and weight parameters. output_sequence: const (Optional) [Default=False] * Outputs every step if ``True``. recurrent_activation: const (Optional) [Default=sigmoid] * Activation applied on input, forget, and output gates. * Supported values: ``hard_sigmoid``, ``linear``, ``relu``, ``scaled_tanh``, ``sigmoid``, ``tanh`` cell_activation: const (Optional) [Default=tanh] * Activation applied on cell gate. * Supported values: ``hard_sigmoid``, ``linear``, ``relu``, ``scaled_tanh``, ``sigmoid``, ``tanh`` activation: const (Optional) [Default=tanh] * Activation applied on output gate. * Supported values: ``hard_sigmoid``, ``linear``, ``relu``, ``scaled_tanh``, ``sigmoid``, ``tanh`` clip: const (optional) [Default=None] * Cell gate is clipped to ``[-clip, +clip]``. Returns ------- or <1, b, DIRECTIONS*H, T> * If ``output_sequence == True`` (hidden states from every step): ````. * Else ``<1, b, DIRECTIONS*H, T>`` (hidden states of the final step). * Hidden states of the final step. * Memory state of the final step. 
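# A minimal NumPy sketch of one forward LSTM step using the documented ifoz packing of
# weight_ih (4*H, I) and weight_hh (4*H, H), with sigmoid/tanh activations and no
# peephole. This is an illustration of the weight layout, not the runtime kernel.
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def lstm_step_ref(x_t, h_prev, c_prev, weight_ih, weight_hh, bias=None):
    H = h_prev.shape[-1]
    gates = x_t @ weight_ih.T + h_prev @ weight_hh.T           # shape (b, 4*H)
    if bias is not None:
        gates = gates + bias
    i, f, o, z = (gates[:, k * H:(k + 1) * H] for k in range(4))   # ifoz order
    c_t = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(z)
    h_t = sigmoid(o) * np.tanh(c_t)
    return h_t, c_t

b, I, H = 2, 3, 4
h, c = lstm_step_ref(np.random.rand(b, I), np.zeros((b, H)), np.zeros((b, H)),
                     weight_ih=np.random.rand(4 * H, I),
                     weight_hh=np.random.rand(4 * H, H), bias=np.zeros(4 * H))
assert h.shape == (b, H) and c.shape == (b, H)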
Attributes ---------- T: fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), initial_h=TensorInputType(type_domain="T"), initial_c=TensorInputType(type_domain="T"), weight_ih=TensorInputType(const=True, type_domain="T"), # ifoz layout, weight_hh=TensorInputType(const=True, type_domain="T"), # ifoz layout bias=TensorInputType(const=True, optional=True, type_domain="T"), # ifoz layout peephole=TensorInputType(const=True, optional=True, type_domain="T"), # ifo layout weight_ih_back=TensorInputType(const=True, optional=True, type_domain="T"), # ifoz layout, weight_hh_back=TensorInputType(const=True, optional=True, type_domain="T"), # ifoz layout bias_back=TensorInputType(const=True, optional=True, type_domain="T"), # ifoz layout peephole_back=TensorInputType(const=True, optional=True, type_domain="T"), # ifo layout direction=TensorInputType(const=True, optional=True, type_domain=types.str), output_sequence=TensorInputType(const=True, optional=True, type_domain=types.bool), recurrent_activation=TensorInputType(const=True, optional=True, type_domain=types.str), cell_activation=TensorInputType(const=True, optional=True, type_domain=types.str), activation=TensorInputType(const=True, optional=True, type_domain=types.str), clip=TensorInputType(const=True, optional=True, type_domain="T"), ) type_domains = { "T": (types.fp32,), } def default_inputs(self): return DefaultInputs( bias=None, direction="forward", output_sequence=False, recurrent_activation="sigmoid", cell_activation="tanh", activation="tanh", peephole=None, clip=None) def type_inference(self): self._validate_inputs() sequence_length, batch_size, input_size = self.x.shape hidden_dim, hidden_size = self.weight_hh.shape dim_factor = 8 if self.direction.val == "bidirectional" else 4 out_seq_len = sequence_length if self.output_sequence.val else 1 num_directions = dim_factor // 4 output_shape = [out_seq_len, batch_size, num_directions * hidden_size] output_h_shape = [batch_size, num_directions * hidden_size] output_c_shape = [batch_size, num_directions * hidden_size] return ( types.tensor(self.x.dtype, tuple(output_shape)), types.tensor(self.x.dtype, tuple(output_h_shape)), types.tensor(self.x.dtype, tuple(output_c_shape)), ) def _validate_inputs(self): _ALLOWED_DIRECTIONS = {"forward", "reverse", "bidirectional"} _ALLOWED_ACTIVATIONS = {"tanh", "scaled_tanh", "sigmoid", "hard_sigmoid", "relu", "linear"} def check_activation(activation: str): if activation.lower() not in _ALLOWED_ACTIVATIONS: raise ValueError( f"Activation `{activation}` not supported. Supported activations: {_ALLOWED_ACTIVATIONS}" ) if self.x.rank != 3: raise ValueError(f"Invalid input shape. Expecting Rank 3 input, got {len(self.x.rank)}") direction = self.direction.val if direction not in _ALLOWED_DIRECTIONS: raise ValueError( f"Direction {direction} not supported. Supported directions: {_ALLOWED_DIRECTIONS}" ) self._weight_shape_check(self.weight_ih, self.weight_hh) if direction == "bidirectional": if self.weight_ih_back is None or self.weight_hh_back is None: raise ValueError( "For bidirectional LSTM, the `weight_ih_back` and `weight_hh_back`" " must be provided." 
) self._weight_shape_check(self.weight_ih_back, self.weight_hh_back) check_activation(self.recurrent_activation.val) check_activation(self.cell_activation.val) check_activation(self.activation.val) @staticmethod def _weight_shape_check(wt_ih: Var, wt_hh: Var): if wt_ih.rank != 2 or wt_hh.rank != 2: raise ValueError( f"Expecting Rank 2 input, got weight_ih rank: {wt_ih.rank}, " f"weight_hh rank: {wt_hh.rank}" ) hidden_size = wt_hh.shape[1] if wt_hh.shape[0] // hidden_size != 4 or wt_ih.shape[0] // hidden_size != 4: raise ValueError( f"Incorrect weight matrix: hidden dim size mismatch. Provided " f"weight_ih {wt_ih.shape}, weight_hh {wt_hh.shape}. Expecting <4*H, H>" ) @register_op class rnn(Operation): r""" Recurrent Neural Network (RNN) .. math:: h_t = \rm{activation}(W_{ih} x_t + b_{ih} + W_{hh} h_{t−1} + b_{hh}) Where: * :math:`W_{ih}` is the input weight. * :math:`W_{hh}` is the hidden/recurrent weight. * :math:`h_t` is the hidden state at time ``t``. * :math:`x_t` is the input at time ``t``. * :math:`h_{t-1}` is the hidden state of the layer at time ``t-1`` or the initial hidden state at ``t = 0``. * :math:`b_{ih}` is the input bias. * :math:`b_{hh}` if the hidden/recurrent bias. Parameters ---------- x: (Required) * ``s`` is the sequence length, ``b`` is the batch size, and ``I`` is the input dimension. initial_h: (Required) * ``H`` denotes hidden size. weight_ih: const (Required) - Input-hidden weight matrix weight_hh: const (Required) - Hidden-hidden weight matrix bias: const (Optional) [Default all 0s] * bias for input-hidden and hidden-hidden direction: const (Optional) [Default=forward] * Either ``forward`` or ``reverse``. output_sequence: const (Optional) [Default=False] * Outputs every step if ``True``. activation: const (Optional) [Default=tanh] * Supported activation functions: ``relu``, ``tanh``, ``sigmoid``, ``sigmoid_hard``, ``scaled_tanh``, and ``linear``. Returns ------- or <1, b, H, T> * If ``output_sequence == True`` (hidden states from every step): ````. * Else ``<1, b, H, T>`` (hidden states of the final step). * Hidden states of the final step. Attributes ---------- T: fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), initial_h=TensorInputType(type_domain="T"), weight_ih=TensorInputType(const=True, type_domain="T"), weight_hh=TensorInputType(const=True, type_domain="T"), bias=TensorInputType(const=True, optional=True, type_domain="T"), direction=TensorInputType(const=True, optional=True, type_domain=types.str), output_sequence=TensorInputType(const=True, optional=True, type_domain=types.bool), activation=TensorInputType(const=True, optional=True, type_domain=types.str), ) type_domains = { "T": (types.fp32,), } def default_inputs(self): return DefaultInputs( bias=None, direction="forward", output_sequence=False, activation="tanh") def type_inference(self): if self.x.rank != 3: raise ValueError( f"Invalid input shape. Expecting Rank 3 input, got {len(self.x.rank)}" ) sequence_length, batch_size, input_size = self.x.shape if self.weight_ih.rank != 2 or self.weight_hh.rank != 2: raise ValueError( f"Invalid weight shape. Expecting Rank 2 input, got weight_ih " f"{self.weight_ih.rank}, weight_hh {self.weight_hh.rank}" ) hidden_size, _ = self.weight_ih.shape direction = self.direction.val valid_directions = {"forward", "reverse"} if direction not in valid_directions: raise ValueError( f"Direction {direction} not supported. 
Supported directions: {valid_directions}" ) out_seq_len = sequence_length if self.output_sequence.val else 1 output_shape = [out_seq_len, batch_size, hidden_size] output_h_shape = [batch_size, hidden_size] return ( types.tensor(self.x.dtype, tuple(output_shape)), types.tensor(self.x.dtype, tuple(output_h_shape)), ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS15/reduction.py0000644000000000000000000003506714672066616025540 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.mil import Operation, precondition, types from coremltools.converters.mil.mil.input_type import (DefaultInputs, InputSpec, TensorInputType) from coremltools.converters.mil.mil.operation import VALUE from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.types import nptype_from_builtin class ReductionAxes(Operation): """ Reduction Op Superclasses """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), axes=TensorInputType(const=True, optional=True, type_domain=types.int32), keep_dims=TensorInputType(const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), } def default_inputs(self): return DefaultInputs( axes=None, keep_dims=False, ) def type_inference(self): x_type = self.x.dtype x_shape = self.x.shape axes = self.axes.val if self.axes is not None else None if axes is None: axes = range(self.x.rank) keep_dims = self.keep_dims.val reduced_shape = list(x_shape) if keep_dims: for i in axes: reduced_shape[i] = 1 else: # sort reverse so we can delete shape elements back to front axes = [axis if axis >= 0 else axis + len(reduced_shape) for axis in axes] for i in sorted(axes)[::-1]: reduced_shape.pop(i) if len(reduced_shape) == 0: return x_type # scalar return types.tensor(x_type, tuple(reduced_shape)) @precondition(allow=VALUE) def value_inference(self): axes = tuple(self.axes.val) if self.axes is not None else None res = self.get_operator()(self.x.val, axis=axes, keepdims=self.keep_dims.val) return res.astype(nptype_from_builtin(self.x.dtype)) def get_operator(self): raise NotImplementedError() class ReductionAxis(Operation): input_spec = InputSpec( x=TensorInputType(type_domain="T"), axis=TensorInputType(const=True, optional=True, type_domain=types.int32), keep_dims=TensorInputType(const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), } def default_inputs(self): return DefaultInputs( axis=-1, keep_dims=False, ) def _find_reduced_shape(self): x_shape = self.x.shape axis = self.axis.val reduced_shape = list(x_shape) axis = axis if axis >= 0 else axis + len(reduced_shape) if self.keep_dims.val: reduced_shape[axis] = 1 else: reduced_shape.pop(axis) return reduced_shape def type_inference(self): x_type = self.x.dtype reduced_shape = self._find_reduced_shape_and_axis() return types.tensor(x_type, tuple(reduced_shape)) @precondition(allow=VALUE) def value_inference(self): tmp = self.get_operator()(self.x.val, axis=self.axis.val) reduced_shape = self._find_reduced_shape() if self.keep_dims.val: tmp = np.reshape(tmp, reduced_shape) return tmp def get_operator(self): raise NotImplementedError() class 
reduce_arg(ReductionAxis): def __init__(self, **kwargs): super().__init__(**kwargs) def type_inference(self): reduced_shape = self._find_reduced_shape() return types.tensor(types.int32, tuple(reduced_shape)) """ Reduction op implementations """ @register_op class reduce_argmax(reduce_arg): """ Computes the indices of the maximum value across dimensions of a tensor. In case of ties, the identity of the return value is not guaranteed. Parameters ---------- x: <\*,T> (Required) * Must be 1-dimensional or higher. axis: const (Optional) * The dimension to reduce. Default is ``-1``. keep_dims: const (Optional, default=False) * If ``False``, the rank is reduced by ``1`` by removing the dimension specified in ``axis``. If ``True``, retain reduced axis with length ``1``. Returns ------- <\*, int32> Attributes ---------- T: fp16, fp32, i32 References ---------- See `tf.math.argmax `_. """ def get_operator(self): return np.argmax @register_op class reduce_argmin(reduce_arg): """ Computes the indices of the minimum value across dimensions of a tensor. In case of ties, the identity of the return value is not guaranteed. Parameters ---------- x: <\*,T> (Required) * Must be 1-dimensional or higher. axis: const (Optional) * The dimension to reduce. Default is ``-1``. keep_dims: const (Optional, default=False) * If ``False``, the rank is reduced by ``1`` by removing the dimension specified in ``axis``, otherwise retain reduced axis with length ``1``. Returns ------- <\*, int32> Attributes ---------- T: fp16, fp32, i32 References ---------- See `tf.math.argmin `_. """ def get_operator(self): return np.argmin @register_op class reduce_l1_norm(ReductionAxes): """ Computes the L1 normalization of elements across given dimensions of the input tensor. Parameters ---------- x: <\*,T> (Required) * Must be 1-dimensional or higher. axes: const (Optional, default="None", reduce on all axes.) * The dimensions to reduce. keep_dims: const (Optional, default=False) * If ``False``, the rank is reduced by ``1`` for each entry in ``axes``, otherwise retain reduced axes with length ``1``. Returns ------- <\*,T> * Scalar or tensor: The reduced tensor. Attributes ---------- T: i32, fp16, fp32 References ---------- See `reduce_mean `_. """ def get_operator(self): def l1_norm(x, axis=None, keepdims=False): return np.sum(np.abs(x), axis=axis, keepdims=keepdims) return l1_norm @register_op class reduce_l2_norm(ReductionAxes): """ Computes the L2 normalization of elements across given dimensions of the input tensor. Parameters ---------- x: <\*,T> (Required) * Must be 1-dimensional or higher. axes: const (Optional, default="None", reduce on all axes.) * The dimensions to reduce. keep_dims: const (Optional, default=False) * If ``False``, the rank is reduced by ``1`` for each entry in ``axes``, otherwise retain reduced axes with length ``1``. Returns ------- <\*,T> * Scalar or tensor: The reduced tensor. Attributes ---------- T: i32, fp16, fp32 """ def get_operator(self): def l2_norm(x, axis=None, keepdims=False): return np.sqrt(np.sum(np.square(x), axis=axis, keepdims=keepdims)) return l2_norm @register_op class reduce_log_sum(ReductionAxes): """ Computes the natural logarithm of the sum of all the elements across given dimensions of the input tensor. Parameters ---------- x: <\*,T> (Required) * Must be 1-dimensional or higher. axes: const (Optional, default="None", reduce on all axes.) * The dimensions to reduce. 
keep_dims: const (Optional, default=False) * If ``False``, the rank is reduced by ``1`` for each entry in ``axes``, otherwise retain reduced axes with length ``1``. Returns ------- <\*,T> * Scalar or tensor: The reduced tensor. Attributes ---------- T: i32, fp16, fp32 """ def get_operator(self): def log_sum(x, axis=None, keepdims=False): return np.log(np.sum(x, axis=axis, keepdims=keepdims)) return log_sum @register_op class reduce_log_sum_exp(ReductionAxes): """ Computes the natural logarithm of the sum of the exponentials of the elements across given dimensions of the input tensor. It is a smooth approximation of the maximum function, more numerically stable than ``log(sum(exp(input)))``. It avoids overflows caused by taking the ``exp`` of large inputs and underflows caused by taking the ``log`` of small inputs. Parameters ---------- x: <\*,T> (Required) * Must be 1-dimensional or higher. axes: const (Optional, default="None", reduce on all axes.) * The dimensions to reduce. keep_dims: const (Optional, default=False) * If ``False``, the rank is reduced by ``1`` for each entry in ``axes``, otherwise retain reduced axes with length ``1``. Returns ------- <\*,T> * Scalar or tensor: The reduced tensor. Attributes ---------- T: i32, fp16, fp32 References ---------- See `tf.math.reduce_logsumexp `_. """ def get_operator(self): def operator(a, axis=None, keepdims=False): max_values = np.amax(a, axis=axis, keepdims=True) temp = np.exp(a - max_values) if not keepdims: max_values = np.squeeze(max_values, axis=axis) sum = np.sum(temp, axis=axis, keepdims=keepdims) result = np.log(sum) return result + max_values return operator @register_op class reduce_max(ReductionAxes): """ Computes the maximum of elements across given dimensions of the input tensor. Parameters ---------- x: <\*,T> (Required) * Must be 1-dimensional or higher. axes: const (Optional, default="None", reduce on all axes.) * The dimensions to reduce. keep_dims: const (Optional, default=False) * If ``False``, the rank is reduced by ``1`` for each entry in ``axes``, otherwise retain reduced axes with length ``1``. Returns ------- <\*,T> * Scalar or tensor: The reduced tensor. Attributes ---------- T: i32, fp16, fp32 """ def __init__(self, **kwargs): super().__init__(**kwargs) def get_operator(self): return np.max @register_op class reduce_mean(ReductionAxes): """ Computes the mean of elements across given dimensions of the input tensor. Parameters ---------- x: <\*,T> (Required) * Must be 1-dimensional or higher. axes: const (Optional, default="None", reduce on all axes.) * The dimensions to reduce. keep_dims: const (Optional, default=False) * If ``False``, the rank is reduced by ``1`` for each entry in ``axes``, otherwise retain reduced axes with length ``1``. Returns ------- <\*,T> * Scalar or tensor: The reduced tensor. Attributes ---------- T: i32, fp16, fp32 References ---------- For an example, see `tf.math.reduce_mean `_. """ def get_operator(self): return np.mean @register_op class reduce_min(ReductionAxes): """ Computes the minimum of elements across given dimensions of the input tensor. Parameters ---------- x: <\*,T> (Required) * Must be 1-dimensional or higher. axes: const (Optional, default="None", reduce on all axes.) * The dimensions to reduce. keep_dims: const (Optional, default=False) * If ``False``, the rank is reduced by ``1`` for each entry in ``axes``, otherwise retain reduced axes with length ``1``. Returns ------- <\*,T> * Scalar or tensor: The reduced tensor. 
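# A quick NumPy check (example only) of the numerically stable log-sum-exp trick used
# above: subtract the per-axis max before exponentiating, then add it back after the log.
import numpy as np

def logsumexp_ref(a, axis=None, keepdims=False):
    m = np.amax(a, axis=axis, keepdims=True)
    out = np.log(np.sum(np.exp(a - m), axis=axis, keepdims=keepdims))
    return out + (m if keepdims else np.squeeze(m, axis=axis))

x = np.array([[1000.0, 1000.0], [0.0, 1.0]])
print(logsumexp_ref(x, axis=1))   # finite values; the naive log(sum(exp)) overflows for row 0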
Attributes ---------- T: i32, fp16, fp32 """ def get_operator(self): return np.min @register_op class reduce_prod(ReductionAxes): """ Computes the product of elements across given dimensions of the input tensor. Parameters ---------- x: <\*,T> (Required) * Must be 1-dimensional or higher. axes: const (Optional, default="None", reduce on all axes.) * The dimensions to reduce. keep_dims: const (Optional, default=False) * If ``False``, the rank is reduced by ``1`` for each entry in ``axes``, otherwise retain reduced axes with length ``1``. Returns ------- <\*,T> * Scalar or tensor: The reduced tensor. Attributes ---------- T: i32, fp16, fp32 """ def get_operator(self): return np.prod @register_op class reduce_sum(ReductionAxes): """ Computes the sum of elements across given dimensions of the input tensor. Parameters ---------- x: <\*,T> (Required) * Must be 1-dimensional or higher. axes: const (Optional, default="None", reduce on all axes.) * The dimensions to reduce. keep_dims: const (Optional, default=False) * If ``False``, the rank is reduced by ``1`` for each entry in ``axes``, otherwise retain reduced axes with length ``1``. Returns ------- <\*,T> * Scalar or tensor: The reduced tensor. Attributes ---------- T: i32, fp16, fp32 """ def get_operator(self): return np.sum @register_op class reduce_sum_square(ReductionAxes): """ Computes the sum of squares of elements across given dimensions of the input tensor. Parameters ---------- x: <\*,T> (Required) * Must be 1-dimensional or higher. axes: const (Optional, default="None", reduce on all axes.) * The dimensions to reduce. keep_dims: const (Optional, default=False) * If ``False``, the rank is reduced by ``1`` for each entry in ``axes``, otherwise retain reduced axes with length ``1``. Returns ------- <\*,T> * Scalar or tensor: The reduced tensor. Attributes ---------- T: i32, fp16, fp32 """ def get_operator(self): def sum_squre(x, axis=None, keepdims=False): return np.sum(np.square(x), axis=axis, keepdims=keepdims) return sum_squre ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS15/scatter_gather.py0000644000000000000000000004106314672066616026534 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.mil import Operation, types from coremltools.converters.mil.mil.input_type import DefaultInputs, InputSpec, TensorInputType from coremltools.converters.mil.mil.operation import SYMBOL, VALUE, precondition from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs._utils import compute_gather from coremltools.converters.mil.mil.types.symbolic import is_compatible_symbolic_vector, is_symbolic @register_op class gather(Operation): """ Gather slices from input ``x`` along dimension ``axis`` according to ``indices``, similar to `tf.gather `_. * If ``indices`` is scalar (0-D): .. math:: output[p_0, ..., p_{axis-1}, ~~~~~~~~~~~~~~~~~~~~~~~~ p_{axis+1}, ..., p_{rank(x)-1}] = .. math:: x[p_0, ..., p_{axis-1}, ~~~~~~~~~ indices, ~~~~~~~~ p_{axis+1}, ..., p_{rank(x)-1}] Where ``rank(x)`` is the rank of ``x``. The ``output`` has rank ``rank(x) - 1``. * If ``indices`` is 1-D tensor: .. 
math:: output[p_0, ..., p_{axis-1}, ~~~~~~~~~~~~~ i, ~~~~~~~~~~~~~ p_{axis+1}, ..., p_{rank(*D)-1}] = .. math:: x[p_0, ..., p_{axis-1}, ~~~~~~~~ indices[i], ~~~~~~~~ p_{axis+1}, ..., p_{rank(*D)-1}] The output has rank ``rank(x)``. * In general: .. math:: output[p_0, ..., p_{axis-1}, ~~~~~~~~ i_0, ..., i_{M-1}, ~~~~~~~~ p_{axis+1}, ..., p_{rank(x)-1}] = .. math:: x[p_0, ..., p_{axis-1}, ~~~~~~~ indices[i_0, ..., i_{M-1}], ~~~~~~~ p_{axis+1}, ..., p_{rank(x)-1}] Where ``M = rank(indices)``. Parameters ---------- x: tensor<\*D, T> (Required) indices: tensor<\*N, i32> (Required) * Indices values may be negative. More precisely, ``-D[axis]<= v < D[axis]`` for ``v`` in ``indices``. axis: const i32 (Optional. Default=``0``) * Negative axis is supported. Returns ------- tensor<\*K, T> * Where ``K = D[:axis] + N + D[axis+1:]``. Attributes ---------- T: fp16, fp32, i32 References ---------- See `tf.gather `_. """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), indices=TensorInputType(type_domain=types.int32), axis=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), } def default_inputs(self): return DefaultInputs( axis=0, ) @precondition(allow=VALUE | SYMBOL) def value_inference(self): x = self.x.sym_val indices = self.indices.val if indices is None: # only allow x to be symbolic. indices cannot. return None return compute_gather( params=self.x.sym_val, indices=self.indices.val, axis=self.axis.val, batch_dims=0 ) def type_inference(self): out_type = self.x.dtype if self.axis.val < -self.x.rank or self.axis.val >= self.x.rank: raise IndexError( "Axis value {} is out of bounds for {} node {}".format( self.axis.val, self.op_type, self.name ) ) output_rank = self.x.rank - 1 + self.indices.rank if output_rank == 0: # output scalar return out_type axis = self.axis.val axis = axis if axis >= 0 else axis + self.x.rank out_shape = self.x.shape[:axis] + self.indices.shape + self.x.shape[axis + 1 :] return types.tensor(out_type, out_shape) @register_op class scatter(Operation): """ Scatter ``updates`` to ``data`` at locations ``indices`` at dimension ``axis`` by operation ``mode``. Example: ``mode == update``. * For ``i`` in ``[0, len(indices)]``: .. math:: output[p_0, ..., p_{axis-1}, indice[i], p_{axis+1}, ..., p_D] = .. math:: updates[p_0, ..., p_{axis-1}, i, p_{axis+1}, ..., p_D] * For ``j != i``: .. math:: output[p_0, ..., p_{axis-1}, j, p_{axis+1}, ..., p_D] = .. math:: data[p_0, ..., p_{axis-1}, j, p_{axis+1}, ..., p_D] Example: ``mode == add``. * For ``i`` in ``[0, len(indices)]``: .. math:: output[p_0, ..., p_{axis-1}, indice[i], p_{axis+1}, ..., p_D] = .. math:: updates[p_0, ..., p_{axis-1}, i, p_{axis+1}, ..., p_D] + .. math:: x[p_0, ..., p_{axis-1}, indice[i], p_{axis+1}, ..., p_D] * For ``j != i``: .. math:: output[p_0, ..., p_{axis-1}, j, p_{axis+1}, ..., p_D] = .. math:: data[p_0, ..., p_{axis-1}, j, p_{axis+1}, ..., p_D] Parameters ---------- data: tensor<\*D, T> (Required) indices: tensor<[C], i32> (Required) * 1-D tensor. updates: tensor<\*K, T> (Required) * ``K = data.shape[:axis] + [len(indices)] + data.shape[axis+1:]``. axis: const i32 (Optional) * Default to ``0``. mode: const string (Optional) * Can be the following modes: ``update``, ``add``, ``sub``, ``mul``, ``div``, ``max``, ``min``. * Default value is ``update``. Returns ------- tensor<\*D, T> * With the same type and shape as input ``x``. 
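# A NumPy analogue (illustration only) of the gather semantics specified above:
# np.take selects slices of x at `indices` along `axis`, and the output shape is
# D[:axis] + indices.shape + D[axis+1:].
import numpy as np

x = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
indices = np.array([[1, 0], [0, 2]])                  # rank-2 indices
out = np.take(x, indices, axis=1)                     # shape (2,) + (2, 2) = (2, 2, 2)
assert out.shape == (2, 2, 2)
assert (out[0] == np.array([[2.0, 1.0], [1.0, 3.0]])).all()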
Attributes ---------- T: fp16, fp32, i32 For example: data = [[1, 2, 3], [4, 5, 6]] indices = [1, 0] updates = [[5, 6, 7], [8, 9, 10]] axis = 0 mode = "update" produces: [[9, 11, 13], [9, 11, 13]] """ input_spec = InputSpec( data=TensorInputType(type_domain="T"), indices=TensorInputType(type_domain=types.int32), updates=TensorInputType(type_domain="T"), axis=TensorInputType(const=True, optional=True, type_domain=types.int32), mode=TensorInputType(const=True, optional=True, type_domain=types.str), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), } def default_inputs(self): return DefaultInputs( axis=0, mode="add", ) def type_inference(self): if self.axis.val < -self.data.rank or self.axis.val >= self.data.rank: raise IndexError( "Axis value {} is out of bounds for {} node {}".format( self.axis.val, self.op_type, self.name ) ) axis = self.axis.val axis = axis if axis >= 0 else axis + self.data.rank expected_updates_shape = ( self.data.shape[:axis] + self.indices.shape + self.data.shape[axis + 1 :] ) err = "Updates shape {} is incorrect. It should be {}.".format(self.updates.shape, expected_updates_shape) assert is_compatible_symbolic_vector( self.updates.shape, tuple(expected_updates_shape) ), err return self.data.sym_type @register_op class gather_along_axis(Operation): """ Take the values along ``axis`` at locations ``indices``. .. math:: idx = indices[p_0, ..., p_{axis-1}, i, p_{axis+1}, ..., p_D] .. math:: output[p_0, ..., p_{axis-1}, i, p_{axis+1}, ..., p_D] = = x[p_0, ..., p_{axis-1}, idx, p_{axis+1}, ..., p_D] Parameters ---------- x: tensor<\*D, T> (Required) indices: tensor<\*K, i32> (Required) * ``rank(indices) == rank(x)``. axis: const i32 (Optional): * Default to ``0``. Returns ------- tensor<\*D, T>: * Output tensor has the same shape as ``indices``. Attributes ---------- T: fp16, fp32, i32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), indices=TensorInputType(type_domain=types.int32), axis=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), } def default_inputs(self): return DefaultInputs( axis=0, ) @precondition(allow=VALUE) def value_inference(self): x = self.x.val indices = self.indices.val axis = self.axis.val return np.take_along_axis(x, indices, axis) def type_inference(self): if self.x.rank != self.indices.rank: raise ValueError( "Rank mismatch between input and indices. \ Input rank: {}, indices rank: {}".format( self.x.rank, self.indices.rank ) ) if self.axis.val < -self.x.rank or self.axis.val >= self.x.rank: raise IndexError( "Axis value {} is out of bounds for {} node {}".format( self.axis.val, self.op_type, self.name ) ) axis = self.axis.val axis = axis if axis >= 0 else axis + self.x.rank for i in range(self.x.rank): x_size = self.x.shape[i] indices_size = self.indices.shape[i] if i != axis and not is_symbolic(x_size) and not is_symbolic(indices_size): if x_size != indices_size: raise AssertionError( "The input data and indices should have the same size at " f"axis {i}, but got {x_size} vs {indices_size}" ) return types.tensor(self.x.dtype, self.indices.shape) @register_op class scatter_along_axis(Operation): """ Scatter ``updates`` to ``data`` at locations ``indices`` along ``axis`` dimension using ``mode`` operation. Example: ``mode == update``. * For ``i`` in ``[0, len(indices)]``: .. math:: idx = indices[p_0, ..., p_{axis-1}, i, p_{axis+1}, ..., p_D] .. math:: output[p_0, ..., p_{axis-1}, idx, p_{axis+1}, ..., p_D] = .. 
math:: updates[p_0, ..., p_{axis-1}, i, p_{axis+1}, ..., p_D] * For ``j! = i``: .. math:: output[p_0, ..., p_{axis-1}, j, p_{axis+1}, ..., p_D] = .. math:: data[p_0, ..., p_{axis-1}, j, p_{axis+1}, ..., p_D] Example: ``mode == add``. * For ``i`` in ``[0, len(indices)]``: .. math:: idx = indices[p_0, ..., p_{axis-1}, i, p_{axis+1}, ..., p_D] .. math:: output[p_0, ..., p_{axis-1}, idx, p_{axis+1}, ..., p_D] = .. math:: updates[p_0, ..., p_{axis-1}, i, p_{axis+1}, ..., p_D] + .. math:: x[p_0, ..., p_{axis-1}, indice[i], p_{axis+1}, ..., p_D] * For ``j! = i``: .. math:: output[p_0, ..., p_{axis-1}, j, p_{axis+1}, ..., p_D] = .. math:: data[p_0, ..., p_{axis-1}, j, p_{axis+1}, ..., p_D] Parameters ---------- data: tensor<\*D, T> (Required) indices: tensor<\*K, i32> (Required) * ``rank(indices) == rank(data)``. updates: tensor<\*K, T> (Required) * Must be the same shape as ``indices``. axis: const i32 (Optional) * Default to ``0``. mode: const string (Optional) * Default to ``add``. * Can be the following modes: ``update``, ``add``, ``sub``, ``mul``, ``div``, ``max``, ``min``. Returns ------- tensor<\*D, T> * With the same type and shape as input ``x``. Attributes ---------- T: fp16, fp32, i32 """ input_spec = InputSpec( data=TensorInputType(type_domain="T"), indices=TensorInputType(type_domain=types.int32), updates=TensorInputType(type_domain="T"), axis=TensorInputType(const=True, optional=True, type_domain=types.int32), mode=TensorInputType(const=True, optional=True, type_domain=types.str), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), } def default_inputs(self): return DefaultInputs( axis=0, mode="add", ) @precondition(allow=VALUE) def value_inference(self): data = np.copy(self.data.val) indices = self.indices.val updates = self.updates.val axis = self.axis.val np_output = data np.put_along_axis(np_output, indices, updates, axis=axis) return np_output def type_inference(self): if self.axis.val < -self.data.rank or self.axis.val >= self.data.rank: raise IndexError( "Axis value {} is out of bounds for {} node {}".format( self.axis.val, self.op_type, self.name ) ) axis = self.axis.val axis = axis if axis >= 0 else axis + self.data.rank assert is_compatible_symbolic_vector( self.indices.shape, self.updates.shape ) assert self.data.rank == self.indices.rank for i in range(self.data.rank): if i != axis: assert self.data.shape[i] == self.indices.shape[i] return self.data.sym_type @register_op class gather_nd(Operation): """ Gather slices from ``x`` according to ``indices``, similar to `tf.gather_nd `_. The ``indices`` is a K-dim tensor, where ``indices[i_0,...,i_{K-2}]`` defines a slice of ``x``: .. math:: output[i_0, ..., i_{K-2}]= x[indices[i_0, ..., i_{K-2}]] Where ``K = rank(indices)`` and ``x[indices[i_0, ..., i_{K-2}]]`` has rank ``rank(x) - indices.shape[-1]``. Parameters ---------- x: tensor<\*D, T> (Required) indices: tensor<\*K, i32> (Required) Returns ------- tensor<\*V, T> * ``V = K[:-1] + D[K[-1]:]``, where ``D = x.shape`` and ``K = indices.shape``. Attributes ---------- T: fp16, fp32, i32 References ---------- See `tf.gather_nd `_. 
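Examples
--------
A small NumPy sketch of the indexing rule (shapes and values chosen only for illustration; NumPy stands in for the op). Each row of ``indices`` is a full index into ``x``:

.. sourcecode:: python

    import numpy as np

    x = np.arange(12).reshape(3, 4)        # rank 2
    indices = np.array([[0, 1], [2, 3]])   # shape (2, 2), so K = 2
    output = np.array([x[tuple(idx)] for idx in indices])
    # output -> [1, 11]; shape (2,) == indices.shape[:-1]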
""" input_spec = InputSpec( x=TensorInputType(type_domain="T"), indices=TensorInputType(type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), } def type_inference(self): assert self.indices.shape[-1] <= self.x.rank out_type = self.x.dtype out_shape = self.indices.shape[:-1] + self.x.shape[self.indices.shape[-1] :] return types.tensor(out_type, out_shape) @register_op class scatter_nd(Operation): """ Scatter ``updates`` to ``data`` at locations ``indices``. The ``indices`` is a K-dim tensor, where ``indices[i_0,...,i_{K-2}]`` defines a slice of ``data``, ``K = rank(indices)``, and ``data[i_0, ..., i_{K-2}]`` has rank ``rank(data) - indices.shape[-1]``. Concretely, this means the index is stored in the last dim of ``indices``, e.g. take a ``K == 2`` example .. math:: indices = [[0, 1], [0, 2]] where ``indices[0]`` / ``[0, 1]`` and ``indices[1]`` / ``[0, 2]`` are two indices that get applied to ``data`` * Example: ``mode == update``: The ``output`` is set to ``data`` initially, and the op updates ``output`` as follows: .. math:: output[indices[i_0, ..., i_{K-2}]]= updates[i_0, ..., i_{K-2}] * Example: ``mode == add``. The update rule is: .. math:: output[indices[i_0, ..., i_{K-2}]] += updates[i_0, ..., i_{K-2}] Parameters ---------- data: tensor<\*D, T> (Required) indices: tensor<\*E, i32> (Required) * indices.shape[-1] <= data.rank updates: tensor<\*F, T> (Required) * Must be the shape as ``indices.shape[:-1] + data.shape[indices.shape[-1]:]``. mode: const string (Optional) * Default to ``add``. * Can be the following modes: ``update``, ``add``, ``sub``, ``mul``, ``div``, ``max``, ``min``. Returns ------- tensor<\*D, T> * A tensor with the same shape and type as ``data``. Attributes ---------- T: fp16, fp32, i32 """ input_spec = InputSpec( data=TensorInputType(type_domain="T"), indices=TensorInputType(type_domain=types.int32), updates=TensorInputType(type_domain="T"), mode=TensorInputType(const=True, optional=True, type_domain=types.str), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), } def default_inputs(self): return DefaultInputs( mode="add", ) def type_inference(self): assert self.indices.shape[-1] <= self.data.rank expected_updates_shape = ( self.indices.shape[:-1] + self.data.shape[self.indices.shape[-1] :] ) assert is_compatible_symbolic_vector( self.updates.shape, tuple(expected_updates_shape) ) return self.data.sym_type ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS15/tensor_operation.py0000644000000000000000000012404014672066616027124 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import math import numpy as np from coremltools.converters.mil.mil import get_new_symbol, get_new_variadic_symbol, types from coremltools.converters.mil.mil.input_type import ( DefaultInputs, InputSpec, ListOrTensorOrDictInputType, TensorInputType, TupleInputType, ) from coremltools.converters.mil.mil.operation import NONE, SYMBOL, VALUE, Operation, precondition from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs._utils import MAX_SIZE_CONSTANT_FOLDING from coremltools.converters.mil.mil.types.symbolic import ( any_symbolic, is_compatible_symbolic_vector, is_symbolic, ) @register_op class band_part(Operation): """ Returns a tensor setting everything outside a center band to zeros for the innermost matrix. That is, band(m, n) = (lower < 0 || (m-n) <= lower) && (upper < 0 || (n-m) <= upper) output[i, j, k, ..., m, n] = band(m, n) * input[i, j, k, ..., m, n] Special cases: - ``band_part(x, 0, -1)`` returns upper triangular part. - ``band_part(x, -1, 0)`` returns lower triangular part. - ``band_part(x, 0, 0)`` returns diagonal. Parameters ---------- x: tensor<\*?, T> (Required) * Input tensor. lower: const (Optional) * Number of lower / below sub-diagonals to keep. If negative, keep entire lower triangle. * Defaults to ``-1`` (keep the entire lower triangle). upper: const (Optional) * Number of upper / above sub-diagonals to keep. If negative, keep entire lower triangle. * Defaults to ``-1`` (keep the entire upper triangle). Returns ------- tensor<\*?, T> * Same type and shape as the input tensor. Attributes ---------- T: fp16, fp32, i32, bool """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), lower=TensorInputType(const=True, optional=True, type_domain=types.int32), upper=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32, types.int32, types.bool), } def default_inputs(self): return DefaultInputs( lower=-1, upper=-1) def type_inference(self): return self.x.sym_type @precondition(allow=VALUE) def value_inference(self): M, N = self.x.val.shape[-2:] band = np.zeros((M, N), dtype=types.nptype_from_builtin(self.x.dtype)) num_lower = self.lower.val num_upper = self.upper.val for m in range(M): for n in range(N): band[m, n] = (num_lower < 0 or (m - n) <= num_lower) and ( num_upper < 0 or (n - m) <= num_upper ) return np.multiply(band, self.x.val) @register_op class cumsum(Operation): """ Returns the cumulative sum of the input along the given axis. Parameters ---------- x: tensor<\*?, T> (Required) * Input tensor. axis: const (Optional) * Defaults to ``0``. * Axis for which the cumulative sum is computed. exclusive: const (Optional) * Defaults to ``False``. * When set to ``False``, inclusive cumsum is computed, that is the first element of the output is identical to the first element in the input. * When set to ``True``, exclusive cumsum is computed, which makes the first element of output to ``0``. reverse: const (Optional) * Defaults to ``False``. * When set to ``True``, perform cumsum in the reverse order. Returns ------- tensor<\*?, T> * Same type and shape as the input tensor. 
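Examples
--------
A NumPy sketch of the flag combinations (illustrative values; NumPy stands in for the op):

.. sourcecode:: python

    import numpy as np

    x = np.array([1, 2, 3, 4])
    np.cumsum(x)                              # inclusive (default): [1, 3, 6, 10]
    np.concatenate(([0], np.cumsum(x)[:-1]))  # exclusive=True:      [0, 1, 3, 6]
    np.cumsum(x[::-1])[::-1]                  # reverse=True:        [10, 9, 7, 4]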
Attributes ---------- T: fp16, fp32, i32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), axis=TensorInputType(const=True, optional=True, type_domain=types.int32), exclusive=TensorInputType(const=True, optional=True, type_domain=types.bool), reverse=TensorInputType(const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), } def default_inputs(self): return DefaultInputs( axis=0, exclusive=False, reverse=False) @precondition(allow=VALUE) def value_inference(self): data = np.copy(self.x.val) axis = self.axis.val reverse = self.reverse.val exclusive = self.exclusive.val if reverse: data = np.flip(data, axis=axis) data = np.cumsum(data, axis=axis) if exclusive: zero_shape = np.copy(data.shape) zero_shape[axis] = 1 data = np.concatenate((np.zeros(zero_shape, data)), axis=axis) if reverse: data = np.flip(data, axis=axis) return data def type_inference(self): # Check range of axis if self.axis.val < -1 or self.axis.val > self.x.rank - 1: raise ValueError( "axis should be in the range [-1, {}]".format(self.x.rank - 1) ) return self.x.sym_type @register_op class fill(Operation): """ Returns a tensor with a given shape filled with a constant value. Parameters ---------- shape: tensor<[K], i32> (Required) * Target output tensor shape. * ``K`` is the rank of the output tensor. ``shape[k] > 0`` for ``k = 0,..., K-1``. value: const (Optional) * Defaults to ``0.0``. * Constant value to fill in. Returns ------- tensor<\*?, T> * Tensor with shape determined by the input shape. Attributes ---------- T: fp16, fp32, i32, bool """ input_spec = InputSpec( shape=TensorInputType(type_domain=types.int32), value=TensorInputType(const=True, optional=True, type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32, types.int32, types.bool), } def default_inputs(self): return DefaultInputs( value=0.) def type_inference(self): if any_symbolic(self.shape.shape): # We can't infer any shape if shape has variable length. return types.tensor(self.value.dtype, (get_new_variadic_symbol(),)) # shape has fixed length here. if self.shape.sym_val is None: ret_shape = tuple([get_new_symbol() for _ in range(self.shape.shape[0])]) return types.tensor(self.value.dtype, ret_shape) return types.tensor(self.value.dtype, tuple(self.shape.sym_val.tolist())) @precondition(allow=VALUE) def value_inference(self): return np.full(shape=self.shape.val, fill_value=self.value.val) @register_op class non_maximum_suppression(Operation): """ Applies non-maximum suppression (NMS) on the input box coordinates according to their intersection-over-union (IoU). NMS selects a subset of bounding boxes in descending order of score, and removes boxes that have high intersection-over-union (IOU) overlap with previously-selected boxes. Parameters ---------- boxes: tensor<[n, B, 4], T> (Required) * Box coordinates on which to perform NMS. The coordinates are expected in CENTER_SIZE_WIDTH_FIRST format (x, y, width, height) where (x, y) is the center. scores: tensor<[n, B, K], T> (Required) * Scores for each one of the boxes. K is the number of classes. iou_threshold: const (Required) * The intersection over union (``IoU``) threshold over which boxes are suppressed. NMS remove all overlapping boxes with ``IoU > iou_threshold``. score_threshold: const (Required) * Before IoU suppression is performed, boxes with class scores below this threshold are rejected. max_boxes: const (Required) * Maximum number of boxes to select. 
If the number of surviving boxes are less, output is padded up to this number. per_class_suppression: const (Optional) * Defaults to ``False``. * If ``True``, suppression is performed independently within boxes of each class. Returns ------- tensor<[n, max_boxes, 4], T> * Coordinates of selected boxes. tensor<[n, max_boxes, K], T> * Scores of selected boxes. tensor<[n, max_boxes], i32> * Indices of selected boxes. tensor<[n], i32> * Number of boxes selected for each batch. Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( boxes=TensorInputType(type_domain="T"), scores=TensorInputType(type_domain="T"), iou_threshold=TensorInputType(const=True, type_domain="T"), score_threshold=TensorInputType(const=True, type_domain="T"), max_boxes=TensorInputType(const=True, type_domain=types.int32), per_class_suppression=TensorInputType(const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): return DefaultInputs( per_class_suppression=False) def type_inference(self): boxes_dtype = self.boxes.dtype scores_dtype = self.scores.dtype n_batch, _, n_score = self.scores.shape max_boxes = self.max_boxes.val return ( types.tensor(boxes_dtype, (n_batch, max_boxes, 4)), types.tensor(scores_dtype, (n_batch, max_boxes, n_score)), types.tensor(types.int32, (n_batch, max_boxes)), types.tensor(types.int32, (n_batch,)), ) @register_op class non_zero(Operation): """ Returns the indices of the elements in the given tensor that are non-zero. Parameters ---------- x: tensor<\*?, T> (Required) * Tensor, values selected at indices where its values is not equal to ``0``. Returns ------- tensor<[N, R], int32> * 2-dimensional tensor contains indices of elements that are non-zero. Each row is the index for a non-zero value. * ``N`` is the number of non-zero elements, ``R`` is the rank of the input. Attributes ---------- T: fp16, fp32, i32, bool """ input_spec = InputSpec( x=TensorInputType(type_domain="T") ) type_domains = { "T": (types.fp16, types.fp32, types.int32, types.bool), } def type_inference(self): if self.x.val is not None: value = self.value_inference() return types.tensor(types.int32, value.shape) shape = tuple([get_new_symbol(), self.x.rank]) return types.tensor(types.int32, shape) @precondition(allow=VALUE) def value_inference(self): return np.transpose(np.nonzero(self.x.val)) @register_op class one_hot(Operation): """ Returns one-hot vectors whose locations represented in ``indices`` take the ``on_value``, while other locations take the ``off_value``. Parameters ---------- indices: tensor<[D], i32> (Required) * Tensor, values indicate the locations for each one-hot vector to take the ``on_value``. one_hot_vector_size: i32 (Required) * Indicates the number of returning vectors. axis: const i32 (Optional) * Indicates which dimension to append the new axis. * If the input indices is rank ``D``, the output tensor will have rank ``D+1``. * Defaults to ``-1`` (the last dimension). on_value: const T (Optional) * Values for locations where defined in ``indices``. * Defaults to ``1``. off_value: const T (Optional) * Defaults to ``0``. Returns ------- tensor<\*?,T> * A tensor that contains one-hot vectors. 
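Examples
--------
A NumPy sketch with the default ``axis=-1``, ``on_value=1``, and ``off_value=0`` (illustrative values; NumPy stands in for the op):

.. sourcecode:: python

    import numpy as np

    indices = np.array([1, 0, 2])
    one_hot_vector_size = 3
    output = np.eye(one_hot_vector_size, dtype=np.int32)[indices]
    # output shape (3, 3):
    # [[0, 1, 0],
    #  [1, 0, 0],
    #  [0, 0, 1]]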
Attributes ---------- T: fp16, fp32, i32, bool """ input_spec = InputSpec( indices=TensorInputType(type_domain=types.int32), one_hot_vector_size=TensorInputType(type_domain=types.int32), axis=TensorInputType(const=True, optional=True, type_domain=types.int32), on_value=TensorInputType(const=True, optional=True, type_domain="T"), off_value=TensorInputType(const=True, optional=True, type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32, types.int32, types.bool), } def default_inputs(self): return DefaultInputs( axis=-1, on_value=1, off_value=0, ) def type_inference(self): on_type = self.on_value.dtype off_type = self.off_value.dtype if on_type != off_type: raise TypeError( "Parameters on_value and off_value must have same input types." ) if self.axis.val < -self.indices.rank - 1 or self.axis.val > self.indices.rank: raise IndexError( "Axis value {} is out of bounds for {} node {}".format( self.axis.val, self.op_type, self.name ) ) indices_shape = list(self.indices.shape) depth_value = self.one_hot_vector_size.val if depth_value is None: depth_value = get_new_symbol() elif depth_value < 0: raise ValueError("Parameter one_hot_vector_size must be non-negative") retshape = indices_shape if self.axis.val < 0: cut = len(retshape) + self.axis.val + 1 else: cut = self.axis.val retshape = retshape[0:cut] + [depth_value] + retshape[cut:] return types.tensor(on_type, retshape) @register_op class pad(Operation): """ Pads a tensor. Parameters ---------- x: tensor<[\*D_in], T> (Required) pad: tensor<[2\*N], i32> (Required) ``N <= D_in``. Last ``N`` dimensions of ``x`` are padded as follows: * For each dimension ``i`` of ``x`` if ``i >= D_in - N``: * pad ``pad[2*i]`` elements before ``x[..,i,..]``. * pad ``pad[2*i+1]`` elements after ``x[..,i,..]``. * If mode is "reflect" then ``pad[2*i]`` and ``pad[2*i+1]`` can be at most ``D[i]-1``. * If mode is "replicate" then ``pad[2*i]`` and ``pad[2*i+1]`` can be at most ``D[i]``. * If pad is not a constant, it must be a vector of length ``2 * rank(x)``, that is, ``N == D_in``. mode: const (Optional) * Defaults to ``constant``. * Must be one of the following values: ``constant``, ``reflect``, or ``replicate``. constant_val: const (Optional) * Defaults to ``0``. * Constant value to pad. Ignored if ``mode != constant``. Returns ------- tensor<[\*D_out],T> * Tensor with same type as the input. Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), pad=TensorInputType(type_domain=types.int32), mode=TensorInputType(const=True, optional=True, type_domain=types.str), constant_val=TensorInputType(const=True, optional=True, type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32), } def default_inputs(self): return DefaultInputs( mode="constant", constant_val=0., ) def type_inference(self): in_shape = self.x.shape ret_shape = list(in_shape) pad = self.pad if len(pad.shape) != 1: raise ValueError("Pad should be a 1D tensor!") if self.mode and self.mode.val not in {"constant", "reflect", "replicate"}: raise ValueError("Pad mode should be one of {'constant', 'reflect', 'replicate'}") if pad.val is None and pad.shape[0] != self.x.rank * 2: raise ValueError( f"Non-constant 'pad' must have shape ({2*self.x.rank},). 
Got {pad.shape}" ) if pad.sym_val is None: for i in range(self.pad.shape[0] // 2): ret_shape[-self.pad.shape[0] // 2 + i] = get_new_symbol() else: pad = pad.sym_val pad = pad.copy() if len(pad) % 2 != 0: raise ValueError("Number of elements in the argument Pad must be divisible by 2.") for i in range(len(pad)): if not is_symbolic(pad[i]) and pad[i] < 0: raise ValueError(f"pad must be non-negative integer, got {pad[i]} at index {i}") pad = pad.reshape(-1, 2) if pad.shape[0] > len(ret_shape): raise ValueError( "Number of dimensions specified through pad must less than or equal to rank " "of input x" ) for i in range(len(pad)): ret_shape[-len(pad) + i] = ret_shape[-len(pad) + i] + pad[i][0] + pad[i][1] return types.tensor(self.x.dtype, tuple(ret_shape)) @precondition(allow=VALUE) def value_inference(self): # NumPy `edge` mode is equivalent to `replicate` mode of PyTorch and CoreML mode = "edge" if self.mode.val == "replicate" else self.mode.val pad_val = self.pad.val if pad_val is None: return None if len(self.x.val.shape) > (pad_val.shape[0] // 2): updated_pad = np.zeros(len(self.x.val.shape) * 2) updated_pad[-pad_val.shape[0] :] = pad_val pad_val = updated_pad pad_val = pad_val.reshape(-1, 2).astype(np.int32) if mode == "constant": return np.pad( self.x.val, pad_val, mode, constant_values=self.constant_val.val ) # NumPy does not support non-constant mode and constant_values argument return np.pad(self.x.val, pad_val, mode) @register_op class range_1d(Operation): """ Returns a numpy-like 1-D range sequence. Parameters ---------- start: (Required) * The start point of the sequence. end: (Required) * The upper limit of the sequence, exclusive. step: (Required) * Number that increments ``start``. Returns ------- tensor * A 1-D tensor, where ``M`` is the length of the sequence. Attributes ---------- T: i32, fp16, fp32 """ input_spec = InputSpec( end=TensorInputType(type_domain="T"), start=TensorInputType(type_domain="T"), step=TensorInputType(type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), } @precondition(allow=VALUE) def value_inference(self): start = self.start.val end = self.end.val step = self.step.val shape = (end - start) / step # To prevent from creating constant greater then 1MB, # a upper bound of the size of the resulting array is set. if shape > MAX_SIZE_CONSTANT_FOLDING: return None return np.arange(start, end, step) def type_inference(self): start = self.start.sym_val end = self.end.sym_val step = self.step.sym_val if ( (self.start.dtype != self.end.dtype) or (self.start.dtype != self.step.dtype) or (self.end.dtype != self.step.dtype) ): raise TypeError( "All inputs to the range operation must have same input types." ) if all(sym_val is not None for sym_val in (start, end, step)): shape = (end - start) / step shape = shape if is_symbolic(shape) else int(math.ceil(shape)) shape = tuple([shape]) else: shape = tuple( [ get_new_symbol(), ] ) return types.tensor(self.start.dtype, shape) @register_op class tile(Operation): """ Returns a new tensor by replicating input ``x`` multiples times. Dimension ``i`` of ``x`` will be replicated ``reps[i]`` times. Parameters ---------- x: tensor<\*?, T> (Required) * Input tensor. reps: tensor<[rank(x)], i32> (Required) * A 1-D tensor with length ``rank(x)``, which indicates the number to replicate the input along each dimension. Returns ------- tensor<\*?, T>: * An n-D tensor with same type as the input. 
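Examples
--------
A NumPy sketch (illustrative values; NumPy stands in for the op):

.. sourcecode:: python

    import numpy as np

    x = np.array([[1, 2],
                  [3, 4]])   # shape (2, 2)
    reps = [2, 3]
    np.tile(x, reps)         # shape (4, 6): dim 0 replicated 2 times, dim 1 replicated 3 times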
Attributes ---------- T: fp16, fp32, i32, bool """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), reps=TensorInputType(type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32, types.int32, types.bool), } def type_inference(self): x_type = self.x.dtype x_shape = np.array(self.x.shape) reps = self.reps.sym_val if reps is None: out_shape = tuple([get_new_symbol() for _ in range(self.x.rank)]) return types.tensor(x_type, out_shape) if len(reps) == 0 or len(reps) != self.x.rank: msg = ( "Length of the reps ({}) must be at least 1, and " "equal to the rank of the input x ({})" ) raise ValueError(msg.format(len(reps), self.x.rank)) out_shape = [] for i, rep in enumerate(reps): if not is_symbolic(rep): if rep <= 0: raise ValueError("All entries of reps parameter must be greater than 0") if is_symbolic(rep): out_shape.append(get_new_symbol()) elif is_symbolic(x_shape[i]): if rep == 1: out_shape.append(x_shape[i]) else: out_shape.append(get_new_symbol()) else: out_shape.append(rep * x_shape[i]) out_shape = tuple(out_shape) return types.tensor(x_type, out_shape) @precondition(allow=VALUE) def value_inference(self): # Infer only if don't have symbolic values. if self.reps.val is None: return None return np.tile(self.x.val, reps=self.reps.val) @register_op class argsort(Operation): """ Returns a tensor containing the indices of the sorted values along a given axis of the input tensor. Parameters ---------- x: <\*?, T> (Required) * Input tensor. * axis: const (Optional) * Defaults to ``-1`` (the last dimension). * Axis to perform the operation. * ascending: const (Optional) * Defaults to ``False``, sort in descending order. * ``True`` to sort in ascending order. Returns ------- tensor<\*?, int32> * Tensor containing the indices of the sorted values Attributes ---------- T: fp16, fp32, i32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), axis=TensorInputType(const=True, optional=True, type_domain=types.int32), ascending=TensorInputType(const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), } def default_inputs(self): return DefaultInputs( axis=-1, ascending=False, ) def type_inference(self): return types.tensor(types.int32, self.x.shape) @precondition(allow=VALUE) def value_inference(self): # The default np argsort mode is ascending, which is opposite to MIL's argsort op. if self.ascending.val: return np.argsort(self.x.val, axis=self.axis.val) return np.argsort(-self.x.val, axis=self.axis.val) @register_op class topk(Operation): """ Returns a tensor containing top or bottom ``k`` values and the corresponding indices of the input tensor along a given axis. Parameters ---------- x: <\*?, T> (Required) * Input tensor. k: const (Optional) * Defaults to ``1``. * Number of values/indices to be computed along each axis. axis: const (Optional) * Defaults to ``-1`` (last dimension). * Axis to perform the operation. ascending: const (Optional) * Defaults to ``False``, sort in descending order. * ``True`` to sort in ascending order. Returns ------- tensor<\*?, T> * Values of top/bottom ``k`` elements. tensor<\*?, int32> * Indices of the top/bottom ``k`` elements along axis. 
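Examples
--------
A NumPy sketch of the default descending behavior (illustrative values; NumPy stands in for the op):

.. sourcecode:: python

    import numpy as np

    x = np.array([4, 1, 7, 3])
    k = 2
    indices = np.argsort(-x)[:k]   # [2, 0]
    values = x[indices]            # [7, 4]
    # ascending=True would return the smallest k values instead: [1, 3]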
Attributes ---------- T: fp16, fp32, int32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), k=TensorInputType(const=True, optional=True, type_domain=types.int32), axis=TensorInputType(const=True, optional=True, type_domain=types.int32), ascending=TensorInputType(const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), } def default_inputs(self): return DefaultInputs( k=1, axis=-1, ascending=False, ) def type_inference(self): x_type = self.x.dtype x_shape = self.x.shape k = self.k.val axis = self.axis.val if not is_symbolic(x_shape[axis]) and k > x_shape[axis]: msg = "K={} is greater than size of the given axis={}" raise ValueError(msg.format(k, axis)) ret_shape = list(x_shape) ret_shape[axis] = k return types.tensor(x_type, ret_shape), types.tensor(types.int32, ret_shape) @precondition(allow=VALUE) def value_inference(self): indices = np.argsort(self.x.val, axis=self.axis.val) if not self.ascending.val: indices = np.argsort(-self.x.val, axis=self.axis.val) slc = [slice(None)] * self.x.rank slc[self.axis.val] = slice(0, self.k.val) indices = indices[tuple(slc)] values = np.take_along_axis(self.x.val, indices, axis=self.axis.val) return values, indices @register_op class flatten2d(Operation): """ Flattens input tensor into 2d tensor by flattening dimensions before and after the provided axis. Parameters ---------- x: tensor<[*d], T> (Required) * Input tensor. axis: const (Optional) * Defaults to ``1``. * Negative axis is supported. Returns ------- tensor * ``d_prior`` is product of dimensions ``x[:axis]`` * ``d_post`` is product of dimensions ``x[axis:]`` Examples -------- 1. ``input_shape = (3, ), axis = -1, output_shape = (1, 3)`` 2. ``input_shape = (3, ), axis = 1, output_shape = (3, 1)`` 3. ``input_shape = (4, 3), axis = -1, output_shape = (4, 3)`` 4. ``input_shape = (2, 3, 2), axis = -1, output_shape = (6, 2)`` 5. ``input_shape = (5, 5, 2), axis = 1, output_shape = (5, 10)`` Attributes ---------- T: fp16, fp32, i32, bool """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), axis=TensorInputType(const=True, optional=True, type_domain=types.int32) ) type_domains = { "T": (types.fp16, types.fp32, types.int32, types.bool), } def default_inputs(self): return DefaultInputs( axis=1, ) def type_inference(self): shape = list(self.x.shape) axis = self.axis.val dim_pre_axis = np.prod(shape[:axis]) dim_post_axis = np.prod(shape[axis:]) new_shape = [dim_pre_axis, dim_post_axis] return types.tensor(self.x.dtype, tuple(new_shape)) @precondition(allow=VALUE | SYMBOL) def value_inference(self): shape = self.x.shape axis = self.axis.val dim_pre_axis = np.prod(shape[:axis]) dim_post_axis = np.prod(shape[axis:]) return self.x.val.reshape(dim_pre_axis, dim_post_axis) @register_op class shape(Operation): """ Returns a 1-dimensional tensor with the shape of the input tensor. Parameters ---------- x: tensor<[*?], T> (Required) * Input tensor. Returns ------- tensor * Shape of the input tensor. * ``K = x.rank``. 
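Examples
--------
A NumPy sketch (illustrative values; NumPy stands in for the op):

.. sourcecode:: python

    import numpy as np

    x = np.zeros((2, 3, 5))
    np.array(x.shape, dtype=np.int32)   # -> [2, 3, 5]; a 1-D tensor of length x.rank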
Attributes ---------- T: fp16, fp32, i32, bool """ input_spec = InputSpec(x=TensorInputType(type_domain="T")) type_domains = { "T": (types.fp16, types.fp32, types.int32, types.bool), } def type_inference(self): input_rank = self.x.rank return types.tensor(types.int32, tuple([input_rank])) def value_inference(self): if any_symbolic(self.x.shape): # convert elements in shape to int32 res = [x if is_symbolic(x) else np.int32(x) for x in self.x.shape] return np.array(res) else: return np.array(self.x.shape).astype(np.int32) @register_op class concat(Operation): """ Concatenates tensors along a dimension. Parameters ---------- values: Tuple[tensor<[d0, d1, ..., d_axis_i, ..., d_n],T>] (Required) * The number of dimensions of the input tensors must match, and all dimensions except ``axis`` must be equal. * The tensors may be variadic, but the number of tensors must be determined at compile time (i.e. a tuple). axis: const (Required) * The dimension along which to concatenate. Must be in the range ``[-rank(values[i]), rank(values[i]))`` for all ``i``. interleave: const (Optional, Default=False) * If True, concatenate the inputs by interleaving them. * If True, all the inputs to this op must have the exact same shape. Examples -------- .. sourcecode:: python in1 = [[1, 2], [3, 4], [5, 6]] # shape (3, 2) in2 = [[7, 8], [9, 10], [11, 12]] # shape (3, 2) axis = 0 # output shape is (6, 2) if interleave is False: # default # output[0:3, :] = in1 # output[3:6, :] = in2 output = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]] if interleave is True: # output[0::2, :] = in1 # output[1::2, :] = in2 output = [[1, 2], [7, 8], [3, 4], [9, 10], [5, 6], [11, 12]] Returns ------- tensor<[d0, d1,...d_axis_out, ..., d_n],T> * Where ``d_axis_out = sum(d_axis_i)``. Attributes ---------- T: fp16, fp32, i32, bool """ input_spec = InputSpec( values=TupleInputType(), axis=TensorInputType(const=True, type_domain=types.int32), interleave=TensorInputType(const=True, optional=True, type_domain=types.bool) ) def default_inputs(self): return DefaultInputs( interleave=False, ) def type_inference(self): concat_dim_len = 0 if len(self.values) == 0: raise ValueError("Concat {} got 0 values".format(self.name)) # Validate values have the same rank rank = self.values[0].rank for v in self.values: if v.rank != rank: msg = "Input {} has rank {} != other inputs rank {}" raise ValueError(msg.format(v.name, v.rank, rank)) # Check concat axis is within (-rank, rank) concat_axis = self.axis.val if concat_axis < 0: concat_axis += rank if rank > 0 and (concat_axis < 0 or concat_axis >= rank): msg = "In {} of op_type {}: axis out of bound for input " + "(rank {})" raise ValueError(msg.format(self.name, self.op_type, rank)) # Validate values share the same data type dtype = self.values[0].dtype for v in self.values[1:]: if v.dtype != dtype: msg = ( "Tensors in 'values' of the concat op ({}) should share the " "same data type. Got {}." ).format(self.name, [x.dtype for x in self.values]) raise ValueError(msg) # validate that non-axis dimensions match retshape = list(self.values[0].shape) for v in self.values[1:]: for i in range(rank): if is_symbolic(retshape[i]) or is_symbolic(v.shape[i]): continue if i != concat_axis and retshape[i] != v.shape[i]: msg = 'Dimension mismatch in {} ("{}"): shapes {} vs. {}' raise ValueError( msg.format(self.op_type, self.name, retshape, v.shape) ) if self.interleave.val and retshape[i] != v.shape[i]: msg = 'Dimension mismatch in {} ("{}"): shapes {} vs. {}. 
' \ 'All inputs must have same shape when \'interleave\' option is True.' raise ValueError( msg.format(self.op_type, self.name, retshape, v.shape) ) # Get length of concat dim concat_dim_len = 0 for v in self.values: if len(v.shape) == 0: taxis = 1 else: taxis = v.shape[concat_axis] if is_symbolic(taxis): concat_dim_len = get_new_symbol() break concat_dim_len += taxis if len(retshape) == 0: retshape = [concat_dim_len] else: retshape[concat_axis] = concat_dim_len return types.tensor(dtype, retshape) @precondition(allow=VALUE | SYMBOL | NONE) def value_inference(self): values = [] for v in self.values: if v.sym_val is not None: values.append(v.sym_val) continue if v.rank == 0: values.append(get_new_symbol()) continue if any_symbolic(v.shape): values.append(None) continue # we support value inference when number of elements for each tensor is less than 10 shape = v.shape num_element = np.prod(shape) if num_element > 10: values.append(None) continue symbolic_tensor = [get_new_symbol() for _ in range(num_element)] symbolic_tensor = np.reshape(np.array(symbolic_tensor), shape) values.append(symbolic_tensor) if any([val is None for val in values]): return None if not isinstance(values[0], np.ndarray) or values[0].shape == (): return np.stack(values, axis=self.axis.val) return np.concatenate(values, axis=self.axis.val) @register_op class split(Operation): """ Split tensors into a tuple Parameters ---------- x: <\*?,T> (Required) * The tensor to split. * The tensors may be variadic, but the number of tensors must be determined at compile time (i.e. a tuple). axis: const (Required) * The dimension along which to concatenate. Must be in the range ``[-rank(x), rank(x))``. num_splits: (Optional) If specified, divide ``x`` into ``num_splits`` tensors along ``axis``. Its behavior depends on ``split_sizes``: * If ``split_sizes`` is defined, ``num_splits == S``, and the output sizes may be uneven. * If ``split_sizes`` is not defined, ``value.shape[axis]`` must be divisible by ``num_splits``, and the output sizes must be even. At least one of ``num_splits`` or ``split_sizes`` must be provided. If ``split_sizes`` length ``S`` cannot be determined at compile time, ``num_splits`` must be supplied to determine the number of outputs. split_sizes: const (Optional) * Sizes to split to. The sum of ``split_sizes`` must equal to ``value.shape[axis]``. Returns ------- Tuple[tensor<\*?, T>] * Where the length of the tuple is the number of splits (determined from ``num_splits`` or ``split_sizes``). Attributes ---------- T: fp16, fp32, i32, bool """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), num_splits=TensorInputType(const=True, optional=True, type_domain=types.int32), split_sizes=TensorInputType(const=True, optional=True, type_domain=types.int32), axis=TensorInputType(const=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32, types.int32, types.bool), } def type_inference(self): num_splits, sizes = self._get_num_splits_and_sizes() x_shape = list(self.x.shape) ret_shapes = [x_shape[:] for _ in range(num_splits)] axis = self.axis.val for i, d in enumerate(sizes): ret_shapes[i][axis] = d self.sizes = sizes return tuple([types.tensor(self.x.dtype, s) for s in ret_shapes]) def _get_num_splits_and_sizes(self): """ Return: - num_splits: int - sizes: list of int/symbols. Of length num_splits Raise ValueError if num_splits cannot be determined. 
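Example (illustrative): with ``x.shape[axis] == 6``, ``num_splits == 3``, and ``split_sizes`` unset, the result is ``(3, [2, 2, 2])``; with ``split_sizes == [1, 2, 3]`` and ``num_splits`` unset, the result is ``(3, [1, 2, 3])``.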
""" if self.num_splits is None and self.split_sizes is None: msg = ( "At least one of num_splits and split_sizes " + "must be specified in split op {}" ) raise ValueError(msg.format(self.name)) axis = self.axis.val if self.num_splits is not None: num_splits = self.num_splits.val if self.split_sizes is None: # Even split if ( not is_symbolic(self.x.shape[axis]) and self.x.shape[axis] % num_splits != 0 ): msg = "num_split {} does not divide split " + "dim (length = {})" raise ValueError(msg.format(num_splits, self.x.shape[axis])) size = self.x.shape[axis] / num_splits return num_splits, [size] * num_splits # self.split_sizes is not None if self.split_sizes.sym_val is not None: return num_splits, self.split_sizes.sym_val # self.split_size.sym_val is None. sizes = [get_new_symbol() for _ in range(num_splits)] return num_splits, sizes # self.num_splits is None, self.split_sizes is not None if self.split_sizes.sym_val is not None: return len(self.split_sizes.sym_val), self.split_sizes.sym_val # self.num_splits is None, self.split_sizes is not None # self.split_sizes.sym_val is None if any_symbolic(self.split_sizes.shape): raise ValueError("Unable to determine number of splits") num_splits = len(self.split_sizes.shape) sizes = [get_new_symbol() for _ in range(num_splits)] return num_splits, sizes @precondition(allow=VALUE | SYMBOL | NONE) def value_inference(self): num_splits, sizes = self._get_num_splits_and_sizes() if self.x.sym_val is None or any_symbolic(sizes): raise NotImplementedError() if num_splits == 1: # No split_indices possible. return self.x.sym_val split_indices = np.cumsum(sizes).astype(np.int32) return tuple(np.split(self.x.sym_val, split_indices[:-1], axis=self.axis.val)) @register_op class stack(Operation): """ Concatenates tensors along a dimension. Parameters ---------- values: Tuple[tensor<[d0, d1,...d_axis_i, ..., d_n], T>] (Required) * All tensors must have identical shape. axis: const (Required) * The dimension along which to concatenate. Must be in the range ``[-rank(values[i]), rank(values[i]))`` for all ``i``. Returns ------- tenor<[d0, d1,...d_axis_out, ..., d_n], T> * Where ``d_axis_out = sum(d_axis_i)``. Attributes ---------- T: fp16, fp32, i32, bool """ input_spec = InputSpec( values=TupleInputType(), axis=TensorInputType(const=True, type_domain=types.int32) ) def type_inference(self): num_tensors = len(self.values) if num_tensors == 0: raise ValueError("Cannot stack 0 tensor") # get the first value without symbolic shape t_shape = None for value in self.values: if not any_symbolic(value.shape): t_shape = value.shape break t_shape = self.values[0].shape if t_shape is None else t_shape # compare all shape for t in self.values: if not is_compatible_symbolic_vector(t.shape, t_shape): msg = "Component tensor {} has shape {}, others have {}" raise ValueError(msg.format(t.name, t.shape, t_shape)) # Validate values share the same data type dtype = self.values[0].dtype for v in self.values[1:]: if v.dtype != dtype: msg = ( "Tensors in 'values' of the stack op ({}) should share the " "same data type. Got {}." ).format(self.name, [x.dtype for x in self.values]) raise ValueError(msg) axis = self.axis.val if axis < 0: axis += (self.values[0].rank + 1) rank = self.values[0].rank if axis > rank: raise ValueError(f"axis must in range [{-rank}, {rank}). 
Got {axis}") ret_shape = list(t_shape) ret_shape.insert(axis, num_tensors) return types.tensor(self.values[0].dtype, ret_shape) @precondition(allow=VALUE | SYMBOL | NONE) def value_inference(self): is_all_rank_zero = all([v.rank == 0 for v in self.values]) values = [ v.sym_val if v.sym_val is not None else get_new_symbol() for v in self.values ] if any([is_symbolic(v) for v in values]) and not is_all_rank_zero: return None return np.stack(values, self.axis.val) # identity is used for renaming and is rarely necessary. See # `loop_invariant_elimination` pass for a rare use case. @register_op class identity(Operation): """ Returns a tensor with the same shape and contents as input. Parameters ---------- x: tensor<\*?, T> (Required) * Input tensor. Returns ------- tensor<\*?, T> * Same type and shape as the input tensor. Attributes ---------- T: fp16, fp32, i32, bool """ input_spec = InputSpec(x=ListOrTensorOrDictInputType()) def type_inference(self): return self.x.sym_type @precondition(allow=VALUE | SYMBOL) def value_inference(self): return self.x.sym_val ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS15/tensor_transformation.py0000644000000000000000000010666214672066616030204 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import List import numpy as np import sympy as sm from coremltools import _logger as logger from coremltools.converters.mil.mil import ( Operation, get_new_symbol, get_new_variadic_symbol, precondition, types, ) from coremltools.converters.mil.mil.input_type import DefaultInputs, InputSpec, TensorInputType from coremltools.converters.mil.mil.operation import SYMBOL, VALUE from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs._utils import ( get_param_val, get_squeeze_axes, solve_slice_by_index_shape, solve_slice_by_index_slice, ) from coremltools.converters.mil.mil.types.symbolic import ( any_symbolic, any_variadic, is_symbolic, isscalar, ) @register_op class depth_to_space(Operation): """ Rearrange elements in a tensor from depth (channel) into spatial dimensions. Parameters ---------- x: tensor<[n, C, H, W], T> (Required) * Input tensor of rank ``4``. block_size: const i32 (Required) * The size of the spatial block. Must be greater than ``1`` and divisible by channel dimension ``C``. Returns ------- tensor<[n, C / block_size^2, H x block_size, W x block_size], T> * Where ``b`` is the block size. Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), block_size=TensorInputType(const=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32), } def type_inference(self): x_type = self.x.dtype n, c, h, w = self.x.shape bs = self.block_size.val ret_shape = (n, c // (bs * bs), h * bs, w * bs) return types.tensor(x_type, ret_shape) @register_op class expand_dims(Operation): """ Insert a single-dimension in a 1-D or higher tensor at each axis in axes. Parameters ---------- x: tensor<\*?, T> (Required) * Scalar or tensor. axes: const tensor<[K], i32> Required * ``K`` is the number of dimensions expanded. * Insert single dimension at dimension index at each axes. * Negative value to index from the end. 
``-d-1 <= axis <= d`` where ``d`` is the rank of ``x``. Returns ------- tensor<\*(rank(x)+K), T> * Same type as the input ``x`` with rank ``rank(x)+K``. Attributes ---------- T: fp16, fp32, i32, bool """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), axes=TensorInputType(const=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32, types.int32, types.bool), } def type_inference(self): x_rank = self.x.rank x_type = self.x.dtype x_shape = list(self.x.shape) axes = self.axes.val out_rank = x_rank + len(axes) for axis in axes: if axis <= -out_rank - 1 or axis >= out_rank: msg = 'Axis value {} is out of bounds for {} node "{}" of shape {}' raise IndexError( msg.format(axis, self.op_type, self.name, self.x.shape) ) ret_shape = x_shape axes = sorted([out_rank + axis if axis < 0 else axis for axis in axes]) for axis in axes: ret_shape.insert(axis, 1) return types.tensor(x_type, tuple(ret_shape)) @precondition(allow=VALUE) def value_inference(self): axes = self.axes.val out_rank = self.x.rank + len(axes) for axis in axes: if axis <= -out_rank - 1 or axis >= out_rank: msg = 'Axis value {} is out of bounds for {} node "{}" of shape {}' raise IndexError( msg.format(axis, self.op_type, self.name, self.x.shape) ) axes = sorted([out_rank + axis if axis < 0 else axis for axis in axes]) ret_shape = list(self.x.shape) for axis in axes: ret_shape.insert(axis, 1) return np.reshape(self.x.val, ret_shape) def reshape_with_symbol(v, shape): """ Perform basic reshape if v is symbolic (not array of symbols). """ if is_symbolic(v): return np.array(v).reshape(shape) shape = [int(s) for s in shape] return v.reshape(shape) @register_op class reshape(Operation): """ Return a tensor that has the same values as ``x`` with shape ``shape``. ``shape`` must have the same volume (number of elements) as ``x``. Parameters ---------- x: tensor<\*?, T> (Required) * An n-D tensor or a scalar. * If ``x`` is fixed rank (and possibly contains symbolic dimension), shape may contain elements that are not positive integers (see below). * If ``x`` is variadic rank, shape can only contain positive integers. shape: tensor<[K], i32> (Required) A 1-D tensor, with elements from the following: * Positive integers. * Symbols: All but one symbol in shape must be present in ``x.shape``. The new symbol that is not present in ``x.shape`` represent a dimension such that the total size remains constant. Symbol is illegal if ``x`` is variadic rank. * ``-1``: ``-1`` introduces a new symbol (see Symbols above). Therefore, ``-1`` is allowed if all symbols in the shape appear in ``x.shape``. ``-1`` is illegal if ``x`` is variadic rank. * ``0``: If ``K == rank(x)`` then ``0`` means inheriting from the corresponding dimension in ``x.shape``. ``0`` is illegal if ``x`` is variadic rank. Returns ------- tensor<\*?, T> * Tensor with shape determined by the input shape. Attributes ---------- T: fp16, fp32, i32, bool """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), shape=TensorInputType(type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32, types.int32, types.bool), } def type_inference(self): if any_symbolic(self.shape.shape): # We can't infer any shape if shape has variable length. return types.tensor(self.x.dtype, (get_new_variadic_symbol(),)) # shape has fixed length here. 
if self.shape.sym_val is None: shape = tuple([get_new_symbol() for _ in range(self.shape.shape[0])]) return types.tensor(self.x.dtype, shape) t, _ = self._get_type_val() return t @precondition(allow=VALUE | SYMBOL) def value_inference(self): _, val = self._get_type_val() return val def _get_type_val(self): count_neg_one = np.count_nonzero(self.shape.sym_val == -1) if count_neg_one > 1: raise ValueError( f"Reshape op supports only one dimension to be -1, " f"but got {count_neg_one} dimensions be -1." ) if not any_symbolic(self.x.shape) and self.shape.val is not None: ret_shape = self._infer_shape_static() else: ret_shape = self._infer_shape_dynamic() ret_val = None if self.x.sym_val is not None and all( isscalar(a) and not is_symbolic(a) for a in ret_shape ): ret_val = reshape_with_symbol(self.x.sym_val, ret_shape) return types.tensor(self.x.dtype, tuple(ret_shape)), ret_val @staticmethod def replace_zeros_in_shape(from_shape: List[int], to_shape: List[int]) -> List[int]: """Replaces 0s in `to_shape` by the corresponding dims in `from_shape`.""" if to_shape.count(0): if len(from_shape) != len(to_shape): raise ValueError( f"When there is 0 in shape, the rank of x ({len(from_shape)}) " f"must equal to the target shape len ({len(to_shape)})." ) to_shape = [s if s != 0 else from_shape[dim] for dim, s in enumerate(to_shape)] return to_shape @staticmethod def replace_neg_one_in_shape(from_shape: List[int], to_shape: List[int]) -> List[int]: """Replaces -1 in `to_shape` by the corresponding dims in `from_shape`.""" if to_shape.count(-1): neg_one_idx = to_shape.index(-1) total_element_num = np.prod(from_shape) remain_element_num = np.prod( [dim for idx, dim in enumerate(to_shape) if idx != neg_one_idx] ) infer_dim = total_element_num // remain_element_num to_shape[neg_one_idx] = infer_dim return to_shape def _infer_shape_static(self): from_shape = list(self.x.shape) to_shape = list(self.shape.val) to_shape = self.replace_zeros_in_shape(from_shape, to_shape) to_shape = self.replace_neg_one_in_shape(from_shape, to_shape) if np.prod(from_shape) != np.prod(to_shape): raise ValueError( f"Invalid target shape in `reshape` op ({from_shape} to {list(self.shape.val)})." ) return to_shape def _infer_shape_dynamic(self): x_vol = np.prod(self.x.shape) # shape is const, and thus sym_val is not None sym_shape = self.shape.sym_val sym_shape = [get_new_symbol() if d == -1 else d for d in sym_shape] try: ret_shape = reshape.enforce_volumetric_constraint(x_vol, sym_shape) except: ret_shape = sym_shape return ret_shape @staticmethod def enforce_volumetric_constraint(left_volume, inshape): left_symbols = set() if is_symbolic(left_volume): left_symbols = left_volume.free_symbols # Generally, we want to solve for right in terms of left. But this # is kinda annoying actually. 
shape = list(inshape) # Handling when reshape is given 0 instead of actual input # input tensor shape: [4, 3, 2], reshape:[0, -1], output tensor shape: [4, 6] infer_dim_index = shape.index(-1) if -1 in shape else None right_volume = 1 for i in shape: if i != -1: right_volume = right_volume * i if infer_dim_index: shape[infer_dim_index] = left_volume // right_volume if not is_symbolic(right_volume): return shape constraints = [left_volume - right_volume] solve_for = [s for s in shape if is_symbolic(s)] for rightsym in solve_for: sol = sm.solve(constraints, [rightsym], dict=True) if not isinstance(sol, list): sol = [sol] # look for an acceptable solution for s in sol: if 0 in s.values(): continue for i in range(len(shape)): if shape[i] in s: v = s[shape[i]] if len(v.free_symbols - left_symbols) > 0: continue try: shape[i] = int(v) except: shape[i] = v return shape @register_op class reverse(Operation): """ Reverse the order of the input tensor ``x`` along specified ``axes`` (dimensions). Parameters ---------- x: tensor<\*?, T> (Required) * Input tensor. axes: const (Optional) * Dimension(s) to reverse. Each axis must be in the range ``[-rank(x), rank(x))``. * Defaults to None (reverse on all dimensions). Returns ------- tensor<\*?, T> * Same type and shape as the input tensor. Attributes ---------- T: fp16, fp32, i32, bool References ---------- See `tf.reverse `_ and `TORCH `_. """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), axes=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32, types.int32, types.bool), } def default_inputs(self): return DefaultInputs( axes=None, ) def type_inference(self): return self.x.sym_type @precondition(allow=VALUE) def value_inference(self): res = self.x.val axes = self.axes.val if self.axes is not None else range(self.x.rank) for axis in axes: res = np.flip(res, axis=axis) return res @register_op class reverse_sequence(Operation): """ Reverse variable length slices for specified axes / dimensions of the input tensor. This op first slices input tensor along the ``batch_axis`` dimension, then partially reverses the elements along the ``seq_axis`` for the first ``lengths[i]`` elements. Parameters ---------- x: tensor<\*?, T> (Required) * Input tensor. lengths: tensor (Required) * 1-dimensional tensor of length ``x.shape[batch_axis]`` specifying the length of the sequence to reverse. * Values must be in range ``[0, x.shape[seq_axis]]``. seq_axis: const (Optional) * The dimension to reverse. * Defaults to ``0``. batch_axis: const (Optional) * Dimension for slicing. * Defaults to ``0``. Returns ------- tensor<\*?, T> * Same type and shape as the input tensor. Attributes ---------- T: fp16, fp32, i32, bool References ---------- `tf.reverse_sequence `_ """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), lengths=TensorInputType(type_domain=types.int32), seq_axis=TensorInputType(const=True, optional=True, type_domain=types.int32), batch_axis=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32, types.int32, types.bool), } def default_inputs(self): return DefaultInputs( seq_axis=0, batch_axis=0) def type_inference(self): return self.x.sym_type @precondition(allow=VALUE) def value_inference(self): raise NotImplementedError("TODO") @register_op class slice_by_index(Operation): """ Method for numpy style indexing and slicing. 
With a tensor ``x``, this method achieves the following: ``result = x[begin[0]: end[0]: stride[0], begin[1]: end[1]: stride[1], ...]`` Note: This method does not support pure indexing. You would need to do a squeeze if indexing is intended. Parameters ---------- x: tensor<*?, T> (Required) * Input tensor begin: tensor<[rank(x)], i32> (Required) * Starting index for the dimension of slicing. end: tensor<[rank(x)], i32> (Required) * Ending index for the dimension of slicing. stride: tensor<[rank(x)], i32> (Optional) * Default is all ``1``. * Stride for the dimension of slicing. begin_mask: tensor<[rank(x)], bool> (Optional) * Default to all ``False``. * If ``begin_mask[i]==True``, ignores ``begin[i]``, and set ``begin[i]`` to ``0``. end_mask: tensor<[rank(x)], bool> (Optional) * Default to all ``False``. * If ``end_mask[i]==True``, ignores ``end[i]``, and set ``end[i]`` to ``x.shape[i]``. squeeze_mask: tensor<[rank(x)], bool> (Optional) * Default to all ``False``. * If ``squeeze_mask[i]==true``, ignores ``end[i]``, and do the pure index at ``begin[i]``. Returns ------- tensor<\*?, T> - Scalar or tensor. Attributes ---------- T: fp16, fp32, i32, bool """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), begin=TensorInputType(type_domain=types.int32), end=TensorInputType(type_domain=types.int32), stride=TensorInputType(const=True, optional=True, type_domain=types.int32), begin_mask=TensorInputType(const=True, optional=True, type_domain=types.bool), end_mask=TensorInputType(const=True, optional=True, type_domain=types.bool), squeeze_mask=TensorInputType(const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": (types.fp16, types.fp32, types.int32, types.bool), } def default_inputs(self): return DefaultInputs( stride=None, begin_mask=None, end_mask=None, squeeze_mask=None, ) def type_inference(self): # solve shape ret_shape = solve_slice_by_index_shape( self.x.shape, self.begin.val, self.end.val, get_param_val(self.stride), get_param_val(self.begin_mask), get_param_val(self.end_mask), get_param_val(self.squeeze_mask), ) if len(ret_shape) == 0: # Scalar case. return self.x.dtype else: return types.tensor(self.x.dtype, tuple(ret_shape)) def value_inference(self): if self.x.sym_val is None or self.begin.val is None or self.end.val is None: return None # solve the data slices and slice tensor slices = solve_slice_by_index_slice( self.x.shape, self.begin.val, self.end.val, get_param_val(self.stride), get_param_val(self.begin_mask), get_param_val(self.end_mask), get_param_val(self.squeeze_mask), ) res = self.x.sym_val[slices] # remove squeeze_axes squeeze_axes = get_squeeze_axes(get_param_val(self.squeeze_mask), self.x.rank) if len(squeeze_axes) > 0: if len(squeeze_axes) == len(res.shape): if len(res) == 0: logger.warning("%s seems to be a 0 sized tensor", self.name) return np.array([]) res = np.squeeze(res).tolist() if is_symbolic(res): return res elif self.x.dtype == types.int32 or self.x.dtype == types.int64: res = np.int32(res) elif self.x.dtype == types.float or self.x.dtype == types.double: res = np.float32(res) else: raise ValueError( "Unable to convert type {}".format(self.x.sym_val.dtype) ) else: res = np.squeeze(res, axis=tuple(squeeze_axes)) return res @register_op class slice_by_size(Operation): """ Slice input tensor starting from the given ``begin`` index and by the amount specified by the ``size`` input, for each dimension. Parameters ---------- x: tensor<*?, T> (Required) * Input tensor. begin: tensor<[rank(x)], i32> (Required) * The begin index for slice. 
size: tensor<[rank(x)], i32> (Required) * The size that is to be sliced. If ``size`` is ``-1``, all the remaining elements starting with "begin" are sliced. Returns ------- tensor<\*?, T> * Scalar or tensor. Attributes ---------- T: fp16, fp32, i32, bool """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), begin=TensorInputType(type_domain=types.int32), size=TensorInputType(type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32, types.int32, types.bool), } def type_inference(self): if self.begin.rank != 1: raise ValueError( "begin should be 1-D tensor, got {}-D tensor instead".format( self.begin.rank ) ) if self.size.rank != 1: raise ValueError( "size should be 1-D tensor, got {}-D tensor instead".format( self.size.rank ) ) if self.x.rank != self.begin.shape[0]: raise ValueError( "Length of begin {} doesn't equal to input rank {}.".format( len(self.begin.shape[0]), len(self.x.rank) ) ) if self.x.rank != self.size.shape[0]: raise ValueError( "Length of size {} doesn't equal to input rank {}.".format( len(self.size.shape[0]), len(self.x.rank) ) ) x_shape = self.x.shape ret_shape = [] if self.size.sym_val is None: ret_shape = [get_new_symbol() for _ in range(self.x.rank)] return types.tensor(self.x.dtype, tuple(ret_shape)) for idx, s in enumerate(self.size.sym_val): if is_symbolic(s): ret_shape.append(s) elif s != -1: ret_shape.append(s) elif self.begin.sym_val is not None: ret_shape.append(x_shape[idx] - self.begin.sym_val[idx]) else: ret_shape.append(get_new_symbol()) return types.tensor(self.x.dtype, tuple(ret_shape)) @precondition(allow=VALUE | SYMBOL) def value_inference(self): if any_symbolic(self.begin.sym_val): return None if any_symbolic(self.size.sym_val): return None if self.x.val is None: return None slices = [] for i in range(self.x.rank): begin_val = self.begin.val[i] if begin_val < 0: if is_symbolic(self.x.shape[i]): return None begin_val += self.x.shape[i] if self.size.val[i] > 0: slices.append(slice(begin_val, begin_val + self.size.val[i])) else: slices.append(slice(begin_val, None, None)) return self.x.val[tuple(slices)] @register_op class space_to_depth(Operation): """ Rearrange elements in a tensor from spatial into depth (channel) dimension. Parameters ---------- x: tensor<[n, C, H, W], T> (Required) * Input tensor of rank ``4``. block_size: const (Required) * The size of the spatial block. Must be greater than ``1`` and divisible by spatial dimensions ``H, W``. Returns ------- tensor<[n, C x block_size^2, H / block_size, W / block_size], T> * Where ``b`` is the block size. Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), block_size=TensorInputType(const=True, type_domain=types.int32) ) type_domains = { "T": (types.fp16, types.fp32), } def type_inference(self): x_type = self.x.dtype n, c, h, w = self.x.shape bs = self.block_size.val ret_shape = (n, c * (bs * bs), h // bs, w // bs) return types.tensor(x_type, ret_shape) @register_op class space_to_batch(Operation): """ Rearrange elements in a tensor from spatial into batch dimensions. Parameters ---------- x: tensor<[n, C, H, W], T> (Required) * Input tensor must have rank ``4``. * The first and the second dimension are batch, channel; respectively. * The remaining dimensions ``(H, W)`` are treated as "spatial dimensions". block_shape: const tensor<[2], i32> (Required) * The length of the ``block_shape`` must be ``2``. * It defines the shapes of the block in which the spatial dimensions are divided. 
paddings: const tensor<[2, 2], i32> (Required) * It must have shape ``(2, 2)``. * It defines the padding for each spatial dimension. Returns ------- tensor<[new_n, C, new_H, new_W], T> * ``new_n = n * block_shape[0] * block_shape[1]`` * ``new_H = (H + paddings[0][0] + padding[0][1])/block_shape[0]`` * ``new_W = (W + paddings[1][0] + padding[1][1])/block_shape[1]`` * The output has the same rank as the input. Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), block_shape=TensorInputType(const=True, type_domain=types.int32), paddings=TensorInputType(const=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32), } def type_inference(self): x_shape = self.x.shape block_shape = self.block_shape.val paddings = self.paddings.val if self.x.rank != 4: msg = "Input to space_to_batch op must be rank 4. Instead got an input with rank {}".format(self.x.rank) raise ValueError(msg) if paddings.shape != (block_shape.shape[0], 2): msg = "block_shape and paddings must have shape [2], [2, 2] accordingly in the space_to_batch op. "\ "Got {}, {}.".format(block_shape.shape, paddings.shape) raise ValueError(msg) m = block_shape.shape[0] if m != 2: msg = "space_to_batch op only supports spatial dimensions = 2. Got {}".format(m) raise ValueError(msg) b = x_shape[0] c = x_shape[1] spatial_shape = x_shape[2:2+m] if self.x.rank != m + 2: raise ValueError("The input rank of space_to_batch op must exactly be " \ "len(block_shape){} + 2! Got {}".format(self.block_shape.val, self.x.rank)) padded_spatial_shape = [x + paddings[i][0] + paddings[i][1] for i, x in enumerate(spatial_shape)] new_b = b * np.prod(block_shape) new_spatial_shape = [padded_spatial_shape[i]/block_shape[i] for i in range(m)] ret_shape = [new_b, c] + new_spatial_shape x_type = self.x.dtype return types.tensor(x_type, ret_shape) @register_op class batch_to_space(Operation): """ Rearrange elements in a tensor from batch into spatial dimensions. Parameters ---------- x: tensor<[n, C, H, W], T> (Required) * Input tensor must have rank ``4``. * The first and the second dimension are batch, channel; respectively. * The remaining dimensions ``(H, W)`` are treated as "spatial dimensions". block_shape: const tensor<[2], i32> (Required) * The length of the ``block_shape`` must be ``2``. * It defines the shapes of the block in which the spatial dimensions are multiplied. crops: const tensor<[2, 2], i32> (Required) * It must have shape ``(2, 2)``. * It defines the amount to crop from each spatial dimension. Returns ------- tensor<[new_n, C, new_H, new_W], T> * ``new_n = n / (block_shape[0] * block_shape[1])`` * ``new_H = (H * block_shape[0]) - paddings[0][0] - padding[0][1]`` * ``new_W = (W * block_shape[1]) - paddings[1][0] - padding[1][1]`` * The output has the same rank as the input. Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), block_shape=TensorInputType(const=True, type_domain=types.int32), crops=TensorInputType(const=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32), } def type_inference(self): x_shape = self.x.shape block_shape = self.block_shape.val crops = self.crops.val if self.x.rank != 4: msg = "Input to batch_to_space op must be rank 4. Instead got an input with rank {}".format(self.x.rank) raise ValueError(msg) if crops.shape != (block_shape.shape[0], 2): msg = "block_shape and crops must have shape [2], [2, 2] accordingly in the batch_to_space op. 
"\ "Got {}, {}.".format(block_shape.shape, crops.shape) raise ValueError(msg) m = block_shape.shape[0] if m != 2: msg = "batch_to_space op only supports spatial dimensions = 2. Got {}".format(m) raise ValueError(msg) b = x_shape[0] c = x_shape[1] spatial_shape = x_shape[2:2+m] if self.x.rank != m + 2: raise ValueError("The input rank of batch_to_space op must exactly be " \ "len(block_shape){} + 2! Got {}".format(self.block_shape.val, self.x.rank)) if not is_symbolic(b) and b % np.prod(block_shape) != 0: msg = ("Batch size must be perfectly divided by the product of block_shape. Got batch size {}, and block_shape {}." ).format(b, block_shape) raise ValueError(msg) new_b = b / np.prod(block_shape) new_spatial_shape = [spatial_shape[i] * block_shape[i] for i in range(m)] cropped_spatial_shape = [x - crops[i][0] - crops[i][1] for i, x in enumerate(new_spatial_shape)] ret_shape = [new_b, c] + cropped_spatial_shape x_type = self.x.dtype return types.tensor(x_type, ret_shape) @register_op class squeeze(Operation): """ Remove single-dimension dimensions in a 1-D or higher tensor. Parameters ---------- x: tensor<\*?,T> (Required) * Must be at least 1-D. axes: const (Optional) * Axes to squeeze out. * The behaviour of squeezing non-single dimensions follow PyTorch instead of NumPy, where it ignores non-single dimensions instead of erroring out. More specifically, if x has shape (2, 3, 4) and axes is [0, 1], the output will be a tensor with shape (2, 3, 4). * Default to remove all single-dimensions. Returns ------- tensor<\*(rank(x)-K),T> * Tensor with same type as input ``x`` and rank ``rank(x)-K``. Attributes ---------- T: fp16, fp32, i32, bool """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), axes=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32, types.int32, types.bool), } def default_inputs(self): return DefaultInputs( axes=None, ) def type_inference(self): x_type = self.x.dtype x_shape = self.x.shape squeezed_shape = list(x_shape) if self.axes is None: # Squeeze all single-dim, assuming symbolic dims != 1 squeezed_shape = [s for s in squeezed_shape if s != 1] else: axes = self.axes.val axes = [axis if axis >= 0 else axis + self.x.rank for axis in axes] for i in sorted(axes)[::-1]: # descending order if len(squeezed_shape) <= i: raise ValueError( f"Invalid axis {i} in squeeze. The axis should be smaller than {len(squeezed_shape)}" ) if squeezed_shape[i] == 1: # Only remove the dim_size=1 dimension. squeezed_shape.pop(i) return types.tensor(x_type, tuple(squeezed_shape)) if len(squeezed_shape) != 0 else x_type @precondition(allow=VALUE) def value_inference(self): if self.x.val is None: return None if self.axes is None: val = np.squeeze(self.x.val) else: val = np.squeeze(self.x.val, axis=tuple(self.axes.val)) return val if val.shape != () else self.x.val[0] @register_op class transpose(Operation): """ Permute tensor ``x`` dimensions according to ``perm``. Parameters ---------- x: tensor<\*?, T> (Required) * Must be at least 1-D. ``x`` may have a symbolic shape. perm: const<[rank(x)], i32> (Required) * Permutation order. -rank(x) <= perm[I] < rank(x) for all perm entries. Returns ------- tensor<\*?,T> * Tensor with same rank and type as ``x``. 
Attributes ---------- T: fp16, fp32, i32, bool References ---------- `torch.Tensor.permute `_ """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), perm=TensorInputType(const=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32, types.int32, types.bool), } def type_inference(self): x_type = self.x.dtype perm = self.perm.val x_shape = np.array(self.x.shape) if len(perm) != self.x.rank: msg = "perm should have the same length as rank(x): {} != {}" raise ValueError(msg.format(len(perm), self.x.rank)) if self.x.rank == 0: return self.x.sym_type # scalar cannot be transposed if any_variadic(self.x.shape): ret_shape = get_new_variadic_symbol() else: ret_shape = x_shape[perm] return types.tensor(x_type, tuple(ret_shape)) @precondition(allow=VALUE) def value_inference(self): return np.transpose(self.x.val, axes=self.perm.val) @register_op class pixel_shuffle(Operation): """ Rearrange elements in a tensor from depth (channel) into spatial dimensions. Equivalent to PyTorch's ``PixelShuffle``. Parameters ---------- x: tensor<[n, C x f^2, H, W], T> (Required) * Input tensor of rank ``4``. upscale_factor: const * Factor to increase spatial resolution by. Returns ------- tensor<[n, C, H x f, W x f], T> * Where ``f`` is the upscale factor. Attributes ---------- T: fp16, fp32 References ---------- `torch.nn.PixelShuffle `_ """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), upscale_factor=TensorInputType(const=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32), } def type_inference(self): x_type = self.x.dtype n, c, h, w = self.x.shape f = self.upscale_factor.val ret_shape = (n, c // (f * f), h * f, w * f) return types.tensor(x_type, ret_shape) @register_op class sliding_windows(Operation): """ Return a tensor containing all windows of ``size``, separated by stride along the given ``axis``. Parameters ---------- x: tensor<[\*d0, d_axis, *dn], T> * Input tensor. axis: const * Axis to perform the operation. size: const * Number of elements in the sliding window. stride: const Optional * Default to ``1``. * The stride of the input elements in the sliding window. Returns ------- tensor<[\*d0, d_axis - size // stride + 1, size, \*dn], T> * The output will be a tensor of rank ``N+1`` where ``N`` is the input tensor rank. 
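For illustration, a rough NumPy sketch of the window extraction along a 1-D axis (shapes and values below are illustrative only):

.. sourcecode:: python

    import numpy as np

    x = np.arange(8)                                    # d_axis = 8
    size, stride = 3, 2
    num_windows = (x.shape[0] - size) // stride + 1     # (d_axis - size) // stride + 1 = 3
    windows = np.stack(
        [x[i * stride : i * stride + size] for i in range(num_windows)]
    )
    assert windows.shape == (3, 3)                      # (num_windows, size)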
Attributes ---------- T: fp16, fp32, int32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), axis=TensorInputType(const=True, type_domain=types.int32), size=TensorInputType(const=True, type_domain=types.int32), stride=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), } def default_inputs(self): return DefaultInputs(stride=1) def type_inference(self): x_shape = self.x.shape axis = self.axis.val size = self.size.val stride = self.stride.val ret_shape = list(x_shape) ret_shape[axis] = (x_shape[axis] - size) // stride + 1 pos_axis = axis if axis >= 0 else axis + self.x.rank ret_shape.insert(pos_axis + 1, size) return types.tensor(self.x.dtype, tuple(ret_shape)) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2335467 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS16/0000755000000000000000000000000014672075535023160 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS16/__init__.py0000644000000000000000000000132514672066616025272 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil._deployment_compatibility import \ AvailableTarget as target _IOS16_TARGET = target.iOS16 from .constexpr_ops import (constexpr_affine_dequantize, constexpr_cast, constexpr_lut_to_dense, constexpr_sparse_to_dense) from .image_resizing import crop_resize, resample, upsample_bilinear from .scatter_gather import gather, gather_nd from .tensor_operation import fill_like, topk from .tensor_transformation import pixel_unshuffle, reshape_like ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS16/constexpr_ops.py0000644000000000000000000003471414672066616026451 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import InputSpec, TensorInputType from coremltools.converters.mil.mil.operation import Operation from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs.iOS16 import _IOS16_TARGET @register_op(opset_version=_IOS16_TARGET) class constexpr_affine_dequantize(Operation): """ A compile-time operation that returns a constant output value upon dequantizing its constant inputs. This operation is used to represent constant 8-bit quantized data with affine/linear quantization. The quantized data is stored in the parameter ``quantized_data``. The other parameters -- ``scale``, ``zero_point``, and ``axis`` -- describe how unquantized values can be extracted from it, using the equation for affine/linear quantization: .. sourcecode:: python unquantized_data = scale * (quantized_data - zero_point) Although all of the parameters of this op are constants, this op is not constant folded to a single const op at the time of model serialization. 
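For illustration, the dequantization itself is just the affine expression above, with ``scale`` and ``zero_point`` broadcast along ``axis``; a minimal NumPy sketch (values and shapes below are illustrative only):

.. sourcecode:: python

    import numpy as np

    quantized_data = np.array([[0, 1, 2], [3, 4, 5]], dtype=np.int8)
    scale = np.array([0.5, 2.0], dtype=np.float32)      # per-axis vector, axis = 0
    zero_point = np.array([1, 2], dtype=np.int8)
    s = scale.reshape(2, 1)                             # broadcast to quantized_data's rank
    z = zero_point.reshape(2, 1).astype(np.float32)
    unquantized_data = s * (quantized_data.astype(np.float32) - z)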
The unquantized output will be decompressed later, based on the implementation detail (either at model load time or runtime). Parameters ---------- quantized_data: const tensor (Required) zero_point: const tensor (Required) * ``zero_point`` can be either a scalar or a vector. * ``zero_point`` follows similar broadcasting rules and size constraints as ``scale``. scale: const tensor (Required) * ``scale`` can be either a scalar or a vector. * If ``scale`` is a vector, for implementation it is broadcast to the following shape: * The rank of ``scale`` becomes the same as the rank of ``quantized_data``. * The constraint: ``size(scale-vector) == quantized_data.shape[axis]``. * For ``i == axis``, ``scale.shape[i] == quantized_data.shape[i]``. * For ``i != axis``, ``scale.shape == 1``. For example, assume ``quantized_data.shape = (2, 3, 4, 5)`` and ``axis = 1``. If ``scale`` is a vector, then ``scale.size`` needs to be equal to ``quantized_data.shape[axis] i.e = 3``, which would be broadcast to ``(1, 3, 1, 1)``. axis: const tensor (Required) Returns ------- const tensor Attributes ---------- SrcT: uint8, int8 ZeroPointT: uint8, int8, fp32 DstT: fp16, fp32 """ input_spec = InputSpec( quantized_data=TensorInputType(const=True, type_domain="SrcT"), zero_point=TensorInputType(const=True, type_domain="ZeroPointT"), scale=TensorInputType(const=True, type_domain="DstT"), axis=TensorInputType(const=True, type_domain=types.int32), ) type_domains = { "SrcT": (types.uint8, types.int8), "ZeroPointT": (types.uint8, types.int8, types.fp32), "DstT": (types.fp16, types.fp32), } def type_inference(self): def assert_is_scalar_or_vector(param, name): if param.rank not in (0, 1): raise ValueError( "Parameter {} needs to be either a scalar or vector".format(name) ) def assert_vector_size_same_as_axial_dimension(param, axis_dim_size, name): if param.rank == 1 and param.shape[0] != axis_dim_size: raise ValueError( "Parameter {}, if vector, needs to have same size as the dimension size along the parameter quantized_data".format( name ) ) rank = self.quantized_data.rank if self.axis.val < -rank or self.axis.val >= rank: raise ValueError( "Parameter axis needs to be in the range -quantized_data.rank <= axis < quantized_data.rank" ) assert_is_scalar_or_vector(self.scale, "scale") assert_is_scalar_or_vector(self.zero_point, "zero_point") assert_vector_size_same_as_axial_dimension( self.scale, self.quantized_data.shape[self.axis.val], "scale" ) assert_vector_size_same_as_axial_dimension( self.zero_point, self.quantized_data.shape[self.axis.val], "zero_point" ) dtype = self.scale.dtype shape = self.quantized_data.shape return types.tensor(dtype, shape) def materialized_val_inference(self): return self.decompress( self.quantized_data.val, self.zero_point.val, self.scale.val, self.axis.val ) def is_all_zeros(self) -> bool: zero_point = self.promote_rank_to_same_as_quantized_data( self.zero_point.val, self.quantized_data.val, self.axis.val ) return np.all(self.quantized_data.val == zero_point) @staticmethod def promote_rank_to_same_as_quantized_data( param: np.ndarray, quantized_data: np.ndarray, axis: int ) -> np.ndarray: """ Promote param (i.e. 
zero point or scale) rank to same as quantized data, so subtraction or multiplication can happen properly on the specified axis """ if len(param.shape) == 0: return np.reshape(param, np.ones(len(quantized_data.shape), np.int32)) else: axes = [i for i in range(len(quantized_data.shape)) if i != axis] return np.expand_dims(param, axis=tuple(axes)) @staticmethod def decompress( quantized_data: np.ndarray, zero_point: np.ndarray, scale: np.ndarray, axis: int ) -> np.ndarray: axis = axis if axis >= 0 else axis + len(quantized_data.shape) sc = constexpr_affine_dequantize.promote_rank_to_same_as_quantized_data( scale, quantized_data, axis ) zp = constexpr_affine_dequantize.promote_rank_to_same_as_quantized_data( zero_point, quantized_data, axis ) val = sc * (quantized_data.astype(np.float32) - zp.astype(np.float32)) return val.astype(scale.dtype) @register_op(opset_version=_IOS16_TARGET) class constexpr_cast(Operation): """ A compile-time operation that returns a constant output value upon casting its constant input. .. sourcecode:: python Expression: output = constexpr_cast(source_val, output_dtype="fp32") Parameters ---------- source_val: const tensor (Required) output_dtype: const tensor (Required) Returns ------- const tensor Attributes ---------- SrcT: fp16 DstT: fp32 """ input_spec = InputSpec( source_val=TensorInputType(const=True, type_domain=types.fp16), output_dtype=TensorInputType(const=True, type_domain=types.str), ) def type_inference(self): dtype = types.string_to_builtin(self.output_dtype.val) if dtype != types.fp32: raise NotImplementedError("Only output_dtype = fp32 is supported") shape = self.source_val.shape return types.tensor(dtype, shape) def materialized_val_inference(self): return np.float32(self.source_val.val) @register_op(opset_version=_IOS16_TARGET) class constexpr_lut_to_dense(Operation): """ A compile-time operation that returns a constant output value upon decompressing a look-up table (LUT) to a dense tensor. This operation is used to store constant weights in a LUT format (also known as `palettized` weights). A LUT is a mapping from index to values. Weights are quantized and stored as indices (or keys) into the LUT. Before computation, these keys are mapped to corresponding values in the LUT. Parameters ---------- indices: const tensor (Required) lut: const tensor (Required) shape: const tensor (Required) Notes ----- * Any data is packed and read in a row-major order. * ``NUM_PALETTES`` can be one of ``{2, 4, 16, 64 or 256}``. * ``n_bits = log2(NUM_PALETTES)`` can thus be one of ``{1, 2, 4, 6, 8}``. * Indices are packed in bytes of size ``M``, where ``M = ceil(n_bits * product(shape) / 8)``. The bit fields are packed one byte at a time, starting with the least significant bit (LSB) and moving upward to the most significant bit (MSB). It follows, naturally, that if an index is split across two bytes, the LSBs of that index is filled over the MSBs of current byte, and the remaining bits of the same index are filled in the LSBs of the next byte. For example: .. 
sourcecode:: python if n_bits = 2, shape = (5,) => M = 2 bytes MSB LSB | | indices = | 01 10 11 00 | xx xx xx 11 | <== packed elements | i3 | i2 | i1 | i0 | -- | -- | -- | i4 | <== tagged element ids | byte 0 | byte 1 | <== tagged bytes Returns ------- const tensor Attributes ---------- T: uint8, int8, fp16, fp32 """ input_spec = InputSpec( indices=TensorInputType(const=True, type_domain=types.uint8), lut=TensorInputType(const=True, type_domain="T"), shape=TensorInputType(const=True, type_domain=types.uint32), ) type_domains = { "T": (types.int8, types.uint8, types.fp16, types.fp32) } def type_inference(self): def assert_is_vector(param, name): if param.rank != 1: raise ValueError("Parameter {} needs to have rank == 1".format(name)) assert_is_vector(self.indices, "indices") assert_is_vector(self.lut, "lut") if self.lut.shape[0] not in (2, 4, 16, 64, 256): raise ValueError( "Parameter lut should be a vector of size from one of {2, 4, 16, 64, 256}" ) nbits = int(np.log2(self.lut.shape[0])) output_size = np.prod(self.shape.val) if self.indices.shape[0] != np.ceil(nbits * (output_size / 8.0)): raise AssertionError( "Constraint violated, M = ceil(n_bits * product(shape) / 8) where M = indices.size" ) dtype = self.lut.dtype shape = self.shape.val return types.tensor(dtype, shape) def materialized_val_inference(self): return self.decompress( self.lut.val, self.indices.val, self.shape.val, ) @staticmethod def decompress(lut, indices, shape): # Import here to avoid circular import. from coremltools.optimize.coreml import _utils as optimize_utils nbits = np.log2(lut.size).astype(np.int32) indices = optimize_utils.restore_elements_from_packed_bits(indices, nbits, np.prod(shape)) flatten_val = lut[indices] return flatten_val.reshape(shape) @register_op(opset_version=_IOS16_TARGET) class constexpr_sparse_to_dense(Operation): """ A compile-time operation that returns a constant output value upon de-sparsification of its constant inputs. This operation represents unstructured sparsity and uses bit mask binary representation. If a bit is set, then the corresponding element in the output tensor is non-zero and the value is read from the ``nonzero_data`` attribute. Likewise, if the bit is not set, then the corresponding element in the output tensor is zero. Parameters ---------- nonzero_data: const tensor (Required) mask: const tensor (Required) shape: const tensor (Required) Notes ----- * Any data is packed and read in a row-major order. * ``mask`` contains ``M`` bytes, where ``M = ceil( product(shape) / 8)``. That is, each bit field corresponds to one element in the output tensor. * ``D ==`` the total number of set bits in ``mask``. The bit fields are packed one byte at a time, starting with the least significant bit and moving up to the most significant bit. For example: .. 
sourcecode:: python shape = (5,) => M = 1 bytes MSB LSB | | mask = |x x x 0 1 1 0 0 | <== packed elements |--|--|--|i4|i3|i2|i1|i0| <== tagged element ids | byte 0 | <== tagged bytes Returns ------- const tensor Attributes ---------- T: uint8, int8, fp16, fp32 """ input_spec = InputSpec( nonzero_data=TensorInputType(const=True, type_domain="T"), mask=TensorInputType(const=True, type_domain=types.uint8), shape=TensorInputType(const=True, type_domain=types.uint32), ) type_domains = { "T": (types.int8, types.uint8, types.fp16, types.fp32) } def type_inference(self): def assert_is_vector(param, name): if param.rank != 1: raise ValueError("Parameter {} needs to have rank == 1".format(name)) assert_is_vector(self.nonzero_data, "nonzero_data") assert_is_vector(self.mask, "mask") if sum(bin(x).count("1") for x in self.mask.val) != self.nonzero_data.shape[0]: raise AssertionError( "Number of set bits in mask needs to be equal to number of elements in parameter nonzero_data" ) output_size = np.prod(self.shape.val) if self.mask.shape[0] != np.ceil(output_size / 8.0): raise AssertionError( "Constraint Violated: M = ceil( product(shape) / 8) where M = mask.size" ) bitarray = np.unpackbits(self.mask.val, bitorder="little") if any(bitarray[i] != 0 for i in range(output_size, len(bitarray))): raise AssertionError("Padded bits in mask should be unset or equals to zero") dtype = self.nonzero_data.dtype shape = self.shape.val return types.tensor(dtype, shape) def materialized_val_inference(self): return self.decompress(self.nonzero_data.val, self.mask.val, self.shape.val) @staticmethod def decompress(nonzero_data, mask, shape): flattend_val = np.zeros(shape, dtype=nonzero_data.dtype).flatten() flattend_val[ np.where(np.unpackbits(mask, bitorder="little") != 0) ] = nonzero_data return flattend_val.reshape(shape) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS16/image_resizing.py0000644000000000000000000000652314672066616026534 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import (DefaultInputs, InputSpec, TensorInputType) from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs.iOS15.image_resizing import \ crop_resize as _crop_resize_iOS15 from coremltools.converters.mil.mil.ops.defs.iOS15.image_resizing import \ resample as _resample_iOS15 from coremltools.converters.mil.mil.ops.defs.iOS15.image_resizing import \ upsample_bilinear as _upsample_bilinear_iOS15 from coremltools.converters.mil.mil.ops.defs.iOS16 import _IOS16_TARGET @register_op(opset_version=_IOS16_TARGET) class resample(_resample_iOS15): """ This version of ``resample`` supports float 16 coordinates. For complete documentation, see the iOS 15 :py:class:`~.iOS15.image_resizing.resample`. 
""" input_spec = InputSpec( x=TensorInputType(type_domain="T"), coordinates=TensorInputType(type_domain="U"), sampling_mode=TensorInputType(const=True, type_domain=types.str), padding_mode=TensorInputType(const=True, type_domain=types.str), padding_value=TensorInputType(const=True, type_domain="T"), coordinates_mode=TensorInputType(const=True, type_domain=types.str), align_corners=TensorInputType(const=True, type_domain=types.bool), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.int32, types.fp16, types.fp32), } def type_inference(self): return super().type_inference() @register_op(opset_version=_IOS16_TARGET) class upsample_bilinear(_upsample_bilinear_iOS15): """ This version of ``upsample_bilinear`` supports ``half_pixel_centers``. For complete documentation, see the iOS 15 :py:class:`~.iOS15.image_resizing.upsample_bilinear`. Parameters ---------- half_pixel_centers: const (Optional) * Defaults to ``!align_corners`` if not provided. """ input_spec = _upsample_bilinear_iOS15.input_spec + InputSpec( half_pixel_centers=TensorInputType(const=True, optional=True, type_domain=types.bool), ) @register_op(opset_version=_IOS16_TARGET) class crop_resize(_crop_resize_iOS15): """ This version differs from the iOS 15 :py:class:`~.iOS15.image_resizing.crop_resize` by supporting ``pad_value`` as an additional parameter. Parameters ---------- pad_value : const (Optional, default=0.0) * If the box indexes go beyond the input boundary, the input image is padded with ``pad_value``. * Defaults to ``0``. * It is the same as ``extrapolation_value`` in `tf.image.crop_and_resize `_. Attributes ---------- T: fp16, fp32 """ input_spec = _crop_resize_iOS15.input_spec + InputSpec( pad_value=TensorInputType(const=True, optional=True, type_domain="T"), ) def default_inputs(self): return super().default_inputs() + DefaultInputs(pad_value=0.0) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS16/scatter_gather.py0000644000000000000000000001605214672066616026535 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import Operation, types from coremltools.converters.mil.mil.input_type import DefaultInputs, InputSpec, TensorInputType from coremltools.converters.mil.mil.operation import SYMBOL, VALUE, precondition from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs._utils import compute_gather from coremltools.converters.mil.mil.ops.defs.iOS15.scatter_gather import ( gather_along_axis as _gather_along_axis_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS16 import _IOS16_TARGET @register_op(opset_version=_IOS16_TARGET) class gather(Operation): """ The iOS16 version. This section documents only the differences between this version and the iOS 15 :py:class:`~.iOS15.scatter_gather.gather`. This version supports ``batch_dims``, similar to `tf.gather `_. Input parameter ``indices`` now supports ``int16`` and ``uint16``. Parameters ---------- x: tensor<\*D, T> (Required) indices: tensor<\*N, I> (Required) * Indices values may be negative. More precisely, ``-D[axis]<= v < D[axis]`` for ``v`` in ``indices``. axis: const i32 (Optional. Default=``0``) * Negative axis is supported. batch_dims: const i32 (Optional. 
Default=``0``) * The number of batch dimensions. Returns ------- tensor<\*K, T> * Where ``K = D[:axis] + N[batch_dims:] + D[axis+1:]``. Attributes ---------- T: fp16, fp32, i32 I: uint16, int16, int32 References ---------- See `tf.gather `_. """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), indices=TensorInputType(type_domain="I"), axis=TensorInputType(const=True, optional=True, type_domain=types.int32), batch_dims=TensorInputType(const=True, optional=True, type_domain=types.int32) ) type_domains = { "T": (types.fp16, types.fp32, types.int32), "I": (types.int32, types.uint16, types.int16), } def default_inputs(self): return DefaultInputs( axis=0, batch_dims=0, ) @precondition(allow=VALUE | SYMBOL) def value_inference(self): x = self.x.sym_val indices = self.indices.val if indices is None: # only allow x to be symbolic. indices cannot. return None return compute_gather( params=self.x.sym_val, indices=self.indices.val, axis=self.axis.val, batch_dims=self.batch_dims.val, ) def type_inference(self): # validate parameters if self.axis.val < -self.x.rank or self.axis.val >= self.x.rank: raise IndexError( "Axis value {} is out of bounds for {} node {}".format( self.axis.val, self.op_type, self.name ) ) if self.batch_dims.val >= self.x.rank: raise ValueError( "batch_dims {} must be less than x.rank {} for node {}".format( self.batch_dims.val, self.x.rank, self.name ) ) if self.batch_dims.val > self.indices.rank: raise ValueError( "batch_dims {} must be less or equal to than indices.rank {} for node {}".format( self.batch_dims.val, self.indices.rank, self.name ) ) output_rank = self.x.rank - 1 + self.indices.rank - self.batch_dims.val if output_rank == 0: # output scalar return self.x.dtype # compute output shape axis = self.axis.val axis = axis if axis >= 0 else axis + self.x.rank batch_dims = self.batch_dims.val out_shape = self.x.shape[:axis] + self.indices.shape[batch_dims:] + self.x.shape[axis + 1 :] return types.tensor(self.x.dtype, out_shape) @register_op(opset_version=_IOS16_TARGET) class gather_along_axis(_gather_along_axis_iOS15): """ The iOS16 version. The only difference between this version and the iOS 15 :py:class:`~.iOS15.scatter_gather.gather_along_axis`. is that input parameter ``indices`` now supports ``int16`` and ``uint16``. Parameters ---------- x: tensor<\*D, T> (Required) indices: tensor<\*K, I> (Required) axis: const i32 (Optional): * Default to ``0``. Returns ------- tensor<\*D, T>: * Output tensor has the same shape as ``indices``. Attributes ---------- T: fp16, fp32, i32 I: uint16, int16, int32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), indices=TensorInputType(type_domain="I"), axis=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), "I": (types.int32, types.uint16, types.int16), } @register_op(opset_version=_IOS16_TARGET) class gather_nd(Operation): """ The iOS16 version. This section documents only the differences between this version and the iOS 15 :py:class:`~.iOS15.scatter_gather.gather_nd`. This version supports ``batch_dims``. Input parameter ``indices`` now supports ``int16`` and ``uint16``. Parameters ---------- x: tensor<\*D, T> (Required) indices: tensor<\*K, I> (Required) batch_dims: const i32 (Optional. Default=``0``) * The number of batch dimensions. Returns ------- tensor<\*V, T> * ``V = K[:-1] + D[batch_dims + K[-1]:]``, where ``D = x.shape`` and ``K = indices.shape``. 
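For illustration, a rough NumPy equivalent for ``batch_dims=1`` (shapes below are illustrative only):

.. sourcecode:: python

    import numpy as np

    x = np.arange(24).reshape(2, 3, 4)           # D = (2, 3, 4)
    indices = np.array([[[0, 1]], [[2, 3]]])     # K = (2, 1, 2), batch_dims = 1
    out = np.stack([x[b][tuple(idx.T)] for b, idx in enumerate(indices)])
    assert out.shape == (2, 1)                   # K[:-1] + D[batch_dims + K[-1]:]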
Attributes ---------- T: fp16, fp32, i32 I: uint16, int16, int32 References ---------- See `tf.gather_nd `_. """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), indices=TensorInputType(type_domain="I"), batch_dims=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), "I": (types.int32, types.uint16, types.int16), } def default_inputs(self): return DefaultInputs( batch_dims=0, ) def type_inference(self): batch_dims = self.batch_dims.val indices_depth = self.indices.shape[-1] if indices_depth > self.x.rank - batch_dims: msg = "For node {}, indices.shape[-1] ({}) + batch_dims ({}) must be smaller or equal to the input rank {}".format( self.name, indices_depth, batch_dims, self.x.rank ) raise ValueError(msg) out_type = self.x.dtype out_shape = self.indices.shape[:-1] + self.x.shape[batch_dims+indices_depth:] return types.tensor(out_type, out_shape) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS16/tensor_operation.py0000644000000000000000000000752714672066616027137 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clausefrom import numpy as np from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import (DefaultInputs, InputSpec, TensorInputType) from coremltools.converters.mil.mil.operation import (VALUE, Operation, precondition) from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs.iOS15.tensor_operation import \ topk as _topk_iOS15 from coremltools.converters.mil.mil.ops.defs.iOS16 import _IOS16_TARGET @register_op(opset_version=_IOS16_TARGET) class fill_like(Operation): """ Returns a tensor with the same shape as the input tensor filled with a constant value. Parameters ---------- ref_tensor: tensor<\*?, T> (Required) * Input tensor. value: const (Optional) * Default is ``0.0``. * Constant value to fill in. Returns ------- tensor<\*?, T> * Tensor with shape determined by the input tensor. Attributes ---------- T: fp16, fp32, int32, bool U: fp16, fp32, int32, bool """ input_spec = InputSpec( ref_tensor=TensorInputType(type_domain="T"), value=TensorInputType(const=True, optional=True, type_domain="U"), ) type_domains = { "T": (types.fp16, types.fp32, types.int32, types.bool), "U": (types.fp16, types.fp32, types.int32, types.bool), } def default_inputs(self): return DefaultInputs( value=0. ) def type_inference(self): return types.tensor(self.value.dtype, self.ref_tensor.shape) @precondition(allow=VALUE) def value_inference(self): return np.full(shape=self.ref_tensor.shape, fill_value=self.value.val) @register_op(opset_version=_IOS16_TARGET) class topk(_topk_iOS15): """ A version of ``topk`` for iOS 16+. This section documents the differences. The following are additional parameters for the iOS 16+ version. For the rest of the documentation, see `the iOS 15 version of topk <#coremltools.converters.mil.mil.ops.defs.iOS15.tensor_operation.topk>`_. Parameters ---------- sort: const (Optional) * Defaults to ``True``. * If ``True``, ``top-k`` elements are themselves sorted. Otherwise, no particular ordering is guaranteed. return_indices: const (Optional) * Defaults to ``True``. * If ``True``, returns both values and indices. 
Otherwise, returns only the ``top-k`` values. Returns ------- tensor<\*?, T> * Values of top/bottom ``k`` elements. tensor<\*?, int32> * Only returned when ``return_indices = True`` * Indices of the top/bottom ``k`` elements along axis. Attributes ---------- T: fp32, int32 """ input_spec = _topk_iOS15.input_spec + InputSpec( sort=TensorInputType(const=True, optional=True, type_domain=types.bool), return_indices=TensorInputType(const=True, optional=True, type_domain=types.bool), ) def default_inputs(self): return super().default_inputs() + DefaultInputs(sort=True, return_indices=True) def type_inference(self): value_type, indices_type = super().type_inference() if not self.return_indices.val: return value_type return value_type, indices_type @precondition(allow=VALUE) def value_inference(self): values, indices = super().value_inference() if not self.return_indices.val: return values return values, indices ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS16/tensor_transformation.py0000644000000000000000000001546214672066616030202 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clausefrom coremltools.converters.mil.mil import types import numpy as np from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import (InputSpec, TensorInputType, TupleInputType) from coremltools.converters.mil.mil.operation import Operation from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs.iOS16 import _IOS16_TARGET from coremltools.converters.mil.mil.types.symbolic import any_symbolic @register_op(opset_version=_IOS16_TARGET) class reshape_like(Operation): """ Reshape a tensor to an output shape specified by some or all dimensions of a tuple of reference tensors ``ref_tensors``. Parameters ---------- x: tensor<\*?, T> (Required) * The input tensor to be reshaped. ref_tensors: Tuple[tensor<\*?, R>] (Required) * A tuple of tensors that define the output shape. begins: Tuple[const] (Required) * A tuple of integers specifying the begin index into the shape vector of the corresponding ``ref_tensor``. ends: Tuple[const] (Required) * A tuple of integers specifying the end index into the shape vector of the corresponding ``ref_tensor``. end_masks: Tuple[const] (Required) * If ``True``, select all axes from the begin index until the end of the corresponding ``ref_tensor``, as in ``ref_tensors[i].shape[begins[i]:]``. Notes ----- The output shape is computed as follows: .. sourcecode:: python output_shape = [] num_of_refs = len(begins) for i in range(num_of_refs): if end_masks[i]: output_shape.append(ref_tensor_i.shape[begins[i]:]) else: output_shape.append(ref_tensor_i.shape[begins[i]:ends[i]]) output_shape = np.concat(output_shape, axis=0) The following is an example: .. sourcecode:: python ref_tensors=[tensor[2, 3, 4], tensor[1, 5, 6]] begins=[0, 1] ends=[2, 0] end_masks=[False, True] The output shape would be ``(2, 3, 5, 6)``. Returns ------- tensor<\*?, T> * Same type as input tensor ``x``. * Output shape is computed by ``ref_tensors``, ``begins``, ``ends``, and ``end_masks``. 
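For illustration, a runnable NumPy version of the shape computation from the example above (values are illustrative only):

.. sourcecode:: python

    import numpy as np

    ref_shapes = [(2, 3, 4), (1, 5, 6)]
    begins, ends, end_masks = [0, 1], [2, 0], [False, True]
    out_shape = []
    for shape, b, e, m in zip(ref_shapes, begins, ends, end_masks):
        out_shape += list(shape[b:] if m else shape[b:e])
    assert tuple(out_shape) == (2, 3, 5, 6)
    # x must carry the same number of elements as the computed output shape
    y = np.zeros(int(np.prod(out_shape))).reshape(out_shape)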
Attributes ---------- T: fp16, fp32, i32, bool R: fp16, fp32, i32, bool """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), ref_tensors=TupleInputType(), begins=TupleInputType(), ends=TupleInputType(), end_masks=TupleInputType(), ) type_domains = { "T": (types.fp16, types.fp32, types.int32, types.bool), } def _check_is_const_tuple_with_scalar(self, param, expected_type, param_name): """ This utility function checks the param is a Tuple of scalar with expected data type. """ for x in param: if x.dtype != expected_type or x.shape != (): msg = "In op reshape_like {}, {} must be a Tuple of scalar {}. Got a {} tensor with shape {}.".format( self.name, param_name, expected_type.__type_info__(), x.dtype.__type_info__(), x.shape, ) raise ValueError(msg) def type_inference(self): # Validation the inputs ref_number = len(self.ref_tensors) if len(self.begins) != ref_number or len(self.ends) != ref_number or len(self.end_masks) != ref_number: msg = ( "Op reshape_like {}'s ref_tensors, begins, ends and end_masks must have exactly the same length. " "Got {}, {}, {} and {}." ).format(self.name, ref_number, len(self.begins), len(self.ends), len(self.end_masks)) self._check_is_const_tuple_with_scalar(self.begins, types.int32, "begins") self._check_is_const_tuple_with_scalar(self.ends, types.int32, "ends") self._check_is_const_tuple_with_scalar(self.end_masks, types.bool, "end_masks") # Compute the output shape out_shape = () for ref_tensor, begin, end, end_mask in zip(self.ref_tensors, self.begins, self.ends, self.end_masks): shape = ref_tensor.shape begin, end, end_mask = begin.val, end.val, end_mask.val ref_shape = shape[begin:end] if not end_mask else shape[begin:] out_shape += tuple(ref_shape) # Output shape must be known at compile time if any_symbolic(out_shape): msg = "Output shape of a reshape_like op {} must not be symbolic. Got {}".format(self.name, out_shape) raise ValueError(msg) # Output shape must be consistent with the input shape if not any_symbolic(self.x.shape): if np.prod(self.x.shape) != np.prod(out_shape): msg = "At reshape_like op {}, input shape {} not consistent with the output shape {}.".format( self.name, self.x.shape, out_shape ) raise ValueError(msg) return types.tensor(self.x.dtype, out_shape) @register_op(opset_version=_IOS16_TARGET) class pixel_unshuffle(Operation): """ Rearrange elements in a tensor from spatial dimensions into depth (channel). It is basically the inverse operation of :py:class:`~.iOS15.tensor_transformation.pixel_shuffle`. Equivalent to `PyTorch PixelUnshuffle `_. Parameters ---------- x: tensor<[n, C, H / f , W / f], T> (Required) * Input tensor of rank ``4``. downscale_factor: const * Factor to decrease spatial resolution by. Returns ------- tensor<[n, C * f^2, H, W], T> * In which ``f`` is the downscale factor. 
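For illustration, a minimal NumPy sketch of the rearrangement (channel ordering follows the PyTorch convention; shapes are illustrative only):

.. sourcecode:: python

    import numpy as np

    n, c, h, w, f = 1, 1, 4, 4, 2
    x = np.arange(n * c * h * w).reshape(n, c, h, w)
    # split each spatial dim into (out, f) blocks, then fold the f x f block into channels
    y = (x.reshape(n, c, h // f, f, w // f, f)
           .transpose(0, 1, 3, 5, 2, 4)
           .reshape(n, c * f * f, h // f, w // f))
    assert y.shape == (1, 4, 2, 2)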
Attributes ---------- T: fp16, fp32 References ---------- `torch.nn.PixelUnshuffle `_ """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), downscale_factor=TensorInputType(const=True, type_domain=types.uint32), ) type_domains = { "T": (types.fp16, types.fp32), } def type_inference(self): x_type = self.x.dtype n, c, h, w = self.x.shape f = self.downscale_factor.val ret_shape = (n, c * f * f, h / f, w / f) return types.tensor(x_type, ret_shape) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2335467 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS17/0000755000000000000000000000000014672075535023161 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS17/__init__.py0000644000000000000000000000245514672066616025300 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil._deployment_compatibility import AvailableTarget as target _IOS17_TARGET = target.iOS17 from .activation import ( clamped_relu, elu, leaky_relu, linear_activation, prelu, scaled_tanh, sigmoid_hard, softplus_parametric, thresholded_relu, ) from .conv import conv, conv_transpose from .elementwise_unary import cast, clip, inverse, log, rsqrt from .image_resizing import crop_resize, resample, resize from .linear import linear, matmul from .normalization import batch_norm, instance_norm, l2_norm, layer_norm, local_response_norm from .quantization_ops import dequantize, quantize from .recurrent import gru, lstm, rnn from .reduction import reduce_argmax, reduce_argmin from .scatter_gather import ( gather, gather_along_axis, gather_nd, scatter, scatter_along_axis, scatter_nd, ) from .tensor_operation import non_maximum_suppression, topk from .tensor_transformation import ( expand_dims, reshape, reshape_like, reverse, reverse_sequence, sliding_windows, squeeze, transpose, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS17/activation.py0000644000000000000000000002322014672066616025673 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import InputSpec, TensorInputType from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs.iOS15.activation import ( clamped_relu as _clamped_relu_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS15.activation import elu as _elu_iOS15 from coremltools.converters.mil.mil.ops.defs.iOS15.activation import leaky_relu as _leaky_relu_iOS15 from coremltools.converters.mil.mil.ops.defs.iOS15.activation import ( linear_activation as _linear_activation_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS15.activation import prelu as _prelu_iOS15 from coremltools.converters.mil.mil.ops.defs.iOS15.activation import ( scaled_tanh as _scaled_tanh_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS15.activation import ( sigmoid_hard as _sigmoid_hard_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS15.activation import ( softplus_parametric as _softplus_parametric_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS15.activation import ( thresholded_relu as _thresholded_relu_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS17 import _IOS17_TARGET @register_op(opset_version=_IOS17_TARGET) class clamped_relu(_clamped_relu_iOS15): """ If ``x >= 0`` return elementwise ``min(beta, x)``, otherwise return ``min(beta, alpha * x)``. The major difference between this version and the iOS 15 :py:class:`~.iOS15.activation.clamped_relu` is that the ``alpha`` and ``beta`` may have a different dtype than the input/output. Parameters ---------- x: tensor<\*?, T> (Required) alpha: const U (Required) beta: const U (Required) Returns ------- tensor<\*?, T> * A tensor of the same type and shape as ``x``. Attributes ---------- T: fp16, fp32 U: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), alpha=TensorInputType(const=True, type_domain="U"), beta=TensorInputType(const=True, type_domain="U"), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.fp16, types.fp32), } @register_op(opset_version=_IOS17_TARGET) class elu(_elu_iOS15): """ If ``x > 0`` return elementwise ``x``, otherwise return ``alpha * (e^x - 1)``. The major difference between this version and the iOS 15 :py:class:`~.iOS15.activation.elu` is that the ``alpha`` may have a different dtype than the input/output. Parameters ---------- x: tensor<\*?, T> (Required) alpha: const U (Required) Returns ------- tensor<\*?, T> * A tensor of the same shape and type as ``x``. Attributes ---------- T: fp16, fp32 U: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), alpha=TensorInputType(const=True, type_domain="U"), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.fp16, types.fp32), } @register_op(opset_version=_IOS17_TARGET) class leaky_relu(_leaky_relu_iOS15): """ If ``x >= 0`` apply ``x`` elementwise, otherwise apply ``alpha * x`` elementwise. The major difference between this version and the iOS 15 :py:class:`~.iOS15.activation.leaky_relu` is that the ``alpha`` may have a different dtype than the input/output. Parameters ---------- x: <*?, T> (Required) alpha: const U (Required) Returns ------- tensor<\*?, T> * A tensor of the same shape and type as ``x``. 
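For illustration, a short NumPy sketch of the elementwise rule, with ``alpha`` stored in a different dtype than ``x`` (values are illustrative only):

.. sourcecode:: python

    import numpy as np

    x = np.array([-2.0, -0.5, 0.0, 3.0], dtype=np.float32)
    alpha = np.float16(0.01)                    # iOS17 allows alpha dtype != x dtype
    y = np.where(x >= 0, x, x * np.float32(alpha))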
Attributes ---------- T: fp16, fp32 U: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), alpha=TensorInputType(const=True, type_domain="U"), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.fp16, types.fp32), } @register_op(opset_version=_IOS17_TARGET) class linear_activation(_linear_activation_iOS15): """ Apply elementwise ``x * alpha + beta``. The major difference between this version and the iOS 15 :py:class:`~.iOS15.activation.linear_activation` is that the ``alpha`` and ``beta`` may have a different dtype than the input/output. Parameters ---------- x: tensor<\*?, T> (Required) alpha: const U (Required) beta: const U (Required) Returns ------- tensor<\*?, T> * A tensor of the same shape and type as ``x``. Attributes ---------- T: fp16, fp32 U: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), alpha=TensorInputType(const=True, type_domain="U"), beta=TensorInputType(const=True, type_domain="U"), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.fp16, types.fp32), } @register_op(opset_version=_IOS17_TARGET) class prelu(_prelu_iOS15): """ Where ``i = 1 ... C``, if ``x_i > 0``, return ``x_i`` , otherwise return ``alpha_i * x_i``. The major difference between this version and the iOS 15 :py:class:`~.iOS15.activation.prelu` is that the ``alpha`` may have a different dtype than the input/output. Parameters ---------- x: tensor<[B, C, 1..3], T> (Required) * ``x`` must have rank 4, rank 3, or rank 5; that is, a shape of ``(B,C,H)``, ``(B,C,H,W)``, or ``(B,C,D,H,W)``. alpha: const tensor<[C], U>, (Required) * The length of ``alpha`` must match the second dimension of ``x`` (channel dimension). Returns ------- tensor<[B, C, 1..3], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp32, fp16 U: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), alpha=TensorInputType(const=True, type_domain="U"), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.fp16, types.fp32), } @register_op(opset_version=_IOS17_TARGET) class scaled_tanh(_scaled_tanh_iOS15): """ Return ``alpha * tanh(beta * x)`` elementwise. The major difference between this version and the iOS 15 :py:class:`~.iOS15.activation.scaled_tanh` is that the ``alpha`` and ``beta`` may have a different dtype than the input/output. Parameters ---------- x: tensor<\*?, T> (Required) * Input range is ``(-inf, inf)``. alpha: const U (Required) beta: const U (Required) Returns ------- tensor<\*?, T> * A tensor of the same shape and type as ``x``. Attributes ---------- T: fp16, fp32 U: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), alpha=TensorInputType(const=True, type_domain="U"), beta=TensorInputType(const=True, type_domain="U"), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.fp16, types.fp32), } @register_op(opset_version=_IOS17_TARGET) class sigmoid_hard(_sigmoid_hard_iOS15): """ Return ``min( max( alpha * x + beta, 0 ), 1 )`` elementwise. The major difference between this version and the iOS 15 :py:class:`~.iOS15.activation.sigmoid_hard` is that the ``alpha`` and ``beta`` may have a different dtype than the input/output. Parameters ---------- x: tensor<\*?, T> (Required) alpha: const U (Required) beta: const U (Required) Returns ------- tensor<\*?, T> * A tensor of the same shape and type as ``x``. 
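For illustration, a minimal NumPy sketch of the formula above (values are illustrative only):

.. sourcecode:: python

    import numpy as np

    x = np.array([-3.0, 0.0, 3.0], dtype=np.float32)
    alpha, beta = np.float16(0.2), np.float16(0.5)   # may differ in dtype from x
    y = np.minimum(np.maximum(np.float32(alpha) * x + np.float32(beta), 0.0), 1.0)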
Attributes ---------- T: fp16, fp32 U: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), alpha=TensorInputType(const=True, type_domain="U"), beta=TensorInputType(const=True, type_domain="U"), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.fp16, types.fp32), } @register_op(opset_version=_IOS17_TARGET) class softplus_parametric(_softplus_parametric_iOS15): """ Return ``alpha_i * log( 1 + e^( beta_i * x_i ) )``, where ``i = 1 ... C``. Parameters ---------- x: tensor<[b, C, n, m], T> (Required) alpha: const tensor<[C], U> (Required) beta: const tensor<[C], U> (Required) Returns ------- tensor<[b, C, n, m], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 U: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), alpha=TensorInputType(const=True, type_domain="U"), beta=TensorInputType(const=True, type_domain="U"), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.fp16, types.fp32), } @register_op(opset_version=_IOS17_TARGET) class thresholded_relu(_thresholded_relu_iOS15): """ Return ``x`` if ``x >= alpha``, otherwise return ``0``. The major difference between this version and the iOS 15 :py:class:`~.iOS15.activation.thresholded_relu` is that the ``alpha`` may have a different dtype than the input/output. Parameters ---------- x: tensor<\*?, T> (Required) alpha: const U (Required) Returns ------- tensor<\*, T> * A tensor of the same shape and type as ``x``. Attributes ---------- T: fp16, fp32 U: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), alpha=TensorInputType(const=True, type_domain="U"), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.fp16, types.fp32), } ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS17/conv.py0000644000000000000000000000604614672066616024506 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import InputSpec, TensorInputType from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs.iOS15.conv import conv as _conv_iOS15 from coremltools.converters.mil.mil.ops.defs.iOS15.conv import ( conv_transpose as _conv_transpose_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS17 import _IOS17_TARGET @register_op(opset_version=_IOS17_TARGET) class conv(_conv_iOS15): """ Perform convolution over input. Supports 1-D, 2-D, and 3-D convolution. The difference between this version and the iOS 15 :py:class:`~.iOS15.conv.conv` is that the ``weight`` and ``bias`` may have a different dtype than the input/output. 
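For a rough sense of the mixed-precision case, a naive 1-D convolution sketch in NumPy (valid padding, stride 1; illustrative only, not the runtime implementation):

.. sourcecode:: python

    import numpy as np

    x = np.random.rand(1, 2, 8).astype(np.float16)    # (batch, C_in, L) in fp16
    w = np.random.rand(4, 2, 3).astype(np.float32)    # (C_out, C_in, K) in fp32
    out = np.zeros((1, 4, 8 - 3 + 1), dtype=np.float32)
    for co in range(4):
        for t in range(out.shape[-1]):
            out[0, co, t] = np.sum(x[0, :, t:t + 3].astype(np.float32) * w[co])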
""" input_spec = InputSpec( x=TensorInputType(type_domain="T"), weight=TensorInputType(type_domain="U"), bias=TensorInputType(optional=True, type_domain="U"), strides=TensorInputType(const=True, optional=True, type_domain=types.int32), pad_type=TensorInputType(const=True, optional=True, type_domain=types.str), pad=TensorInputType(const=True, optional=True, type_domain=types.int32), dilations=TensorInputType(const=True, optional=True, type_domain=types.int32), groups=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.fp16, types.fp32), } @register_op(opset_version=_IOS17_TARGET) class conv_transpose(_conv_transpose_iOS15): """ Perform transposed convolution (also known as deconvolution and fractionally stride convolution) over input. ``conv_transpose`` can also be used to compute the gradient of conv. Supports 1-D, 2-D, and 3-D convolution. The differences between this version and the iOS 15 :py:class:`~.iOS15.conv.conv_transpose` are: - ``weight`` and ``bias`` may have a different dtype than the input/output. - ``weight`` doesn't have to be const. """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), weight=TensorInputType(type_domain="U"), bias=TensorInputType(optional=True, type_domain="U"), pad=TensorInputType(const=True, optional=True, type_domain=types.int32), output_shape=TensorInputType(const=True, optional=True, type_domain=types.int32), pad_type=TensorInputType(const=True, optional=True, type_domain=types.str), strides=TensorInputType(const=True, optional=True, type_domain=types.int32), dilations=TensorInputType(const=True, optional=True, type_domain=types.int32), groups=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.fp16, types.fp32), } ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS17/elementwise_unary.py0000644000000000000000000001403014672066616027270 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import InputSpec, TensorInputType from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs.iOS15.elementwise_unary import cast as _cast_iOS15 from coremltools.converters.mil.mil.ops.defs.iOS15.elementwise_unary import clip as _clip_iOS15 from coremltools.converters.mil.mil.ops.defs.iOS15.elementwise_unary import ( inverse as _inverse_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS15.elementwise_unary import log as _log_iOS15 from coremltools.converters.mil.mil.ops.defs.iOS15.elementwise_unary import rsqrt as _rsqrt_iOS15 from coremltools.converters.mil.mil.ops.defs.iOS17 import _IOS17_TARGET @register_op(opset_version=_IOS17_TARGET) class cast(_cast_iOS15): """ Cast the input ``x`` to the new type ``dtype``. The only difference between this version and the iOS 15 :py:class:`~.iOS15.elementwise_unary.cast` is that it supports int8, uint8, int16, and uint16. Parameters ---------- x: tensor<[\*d], T> (Required) dtype: const str (Required) * Can be one of the following types: ``int8``, ``uint8``, ``int16``, ``uint16``, ``int32``, ``fp16``, ``fp32``, or ``bool``. 
Returns ------- tensor<[\*d], dtype> * A tensor of the same shape as ``x``, with type ``dtype``. Attributes ---------- T: i8, ui8, i16, ui16, i32, fp16, fp32, bool. """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), dtype=TensorInputType(const=True, type_domain=types.str) ) type_domains = { "T": ( types.fp16, types.fp32, types.int8, types.uint8, types.int16, types.uint16, types.int32, types.bool, ), } @register_op(opset_version=_IOS17_TARGET) class clip(_clip_iOS15): """ Clip the values in the input ``x`` to ``[alpha, beta]``, element-wise. Any values less than ``alpha`` are set to ``alpha``, and any values greater than ``beta`` are set to ``beta``. The major difference between this version and the iOS 15 :py:class:`~.iOS15.elementwise_unary.clip` is that it uses strict validation to ensure that ``alpha < beta``. Parameters ---------- x: tensor<[\*d], T> (Required) alpha: const T (Required) beta: const T (Required) Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 """ def type_inference(self): if self.alpha.val >= self.beta.val: raise ValueError( f"The `alpha` value ({self.alpha.val}) should be smaller than `beta` value " f"({self.beta.val}) in `clip` op." ) return self.x.sym_type @register_op(opset_version=_IOS17_TARGET) class inverse(_inverse_iOS15): """ Return the reciprocal value of the input ``x``, element-wise. The only difference between this version and the iOS 15 :py:class:`~.iOS15.elementwise_unary.inverse` is ``epsilon`` may have different dtypes than the inputs/outputs. Parameters ---------- x: tensor<[\*d], T> (Required) epsilon: const U (Optional, default=1e-4) * This is a small constant that is added to the input, before taking its inverse, for stability. * ``y = 1 / (x + epsilon)``. Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 U: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), epsilon=TensorInputType(const=True, optional=True, type_domain="U"), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.fp16, types.fp32), } @register_op(opset_version=_IOS17_TARGET) class log(_log_iOS15): """ Return the natural logarithm value of the input ``x``, element-wise. The only difference between this version and the iOS 15 :py:class:`~.iOS15.elementwise_unary.log` is ``epsilon`` may have different dtypes than the inputs/outputs. Parameters ---------- x: tensor<[\*d], T> (Required) epsilon: const U (Optional, default=1e-45) * This is a small constant that is added to the input, before taking log. * ``y = log(x + epsilon)``. Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 U: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), epsilon=TensorInputType(const=True, optional=True, type_domain="U"), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.fp16, types.fp32), } @register_op(opset_version=_IOS17_TARGET) class rsqrt(_rsqrt_iOS15): """ Return the reciprocal value of the square root of the input ``x``, element-wise. The only difference between this version and the iOS 15 :py:class:`~.iOS15.elementwise_unary.rsqrt` is ``epsilon`` may have different dtypes than the inputs/outputs. Parameters ---------- x: tensor<[\*d], T> (Required) epsilon: const U (Optional, default=1e-12) * This is a small constant that is added to the input, before applying the ``rsqrt`` function, for stability. * ``y = 1 / sqrt(x + epsilon)``. 
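        * For example, the NumPy expression below reproduces the formula (illustration
          only, not this op's implementation)::

              import numpy as np

              x = np.array([4.0, 0.25], dtype=np.float32)
              eps = 1e-12                         # the default epsilon
              y = 1.0 / np.sqrt(x + eps)          # approximately [0.5, 2.0]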
Returns ------- tensor<[\*d], T> * A tensor of the same shape as ``x``. Attributes ---------- T: fp16, fp32 U: fp16, fp32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), epsilon=TensorInputType(const=True, optional=True, type_domain="U"), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.fp16, types.fp32), } ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS17/image_resizing.py0000644000000000000000000003500114672066616026526 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.mil import Operation, get_new_symbol, types from coremltools.converters.mil.mil.input_type import DefaultInputs, InputSpec, TensorInputType from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs.iOS16.image_resizing import ( crop_resize as _crop_resize_iOS16, ) from coremltools.converters.mil.mil.ops.defs.iOS16.image_resizing import resample as _resample_iOS16 from coremltools.converters.mil.mil.ops.defs.iOS17 import _IOS17_TARGET @register_op(opset_version=_IOS17_TARGET) class crop_resize(_crop_resize_iOS16): """ The major differences between this version and the iOS 16 :py:class:`~.iOS16.image_resizing.crop_resize` are as follows: - The input ``ROI`` is replaced by ``boxes`` and ``box_indices``. - The dtype domain of input ``x``, ``boxes``, and ``box_indices`` are independent. - The output no longer has the ``B`` dim. The output is ``[N, C, target_height, target_width]`` rather than the ``[N, B, C, target_height, target_width]`` in iOS 16. Parameters ---------- x: tensor<[B, C, H, W], T> (Required) * The input, from which patches (regions of interest) are extracted and resized using bilinear interpolation. * Rank ``4``. boxes: tensor<[N, 4], BOX_T> (Required) * Coordinates of ``N`` boxes. * The convention to express coordinates depends on the value of ``box_coordinate_mode``. * If ``normalized_coordinates`` is True, only fp16 and fp32 dtypes are allowed. box_indices: tensor<[N], BOX_INDEX_T> (Optional) * Default is ``arange(N)``, or ``[0, 1, ..., N-1]``. * If ``box_indices[i]=j``, this means that ``boxes[i]`` will be applied to the ``j``-th image. Therefore, it is invalid for ``box_indices[i]`` to be greater than ``B``. target_height: const (Optional, Default=1) * Target height for resizing each patch. target_width: const (Optional, Default=1) * Target width for resizing each patch. normalized_coordinates : const (Optional, default=False) * If ``True``, the bounding box coordinates must be in the interval ``[0, 1]``. Scaling is based on the input spatial dimensions: ``(H_in - 1)`` for height and ``(W_in - 1)`` for width. * If ``False``, the bounding box coordinates must be in the interval ``[0, H_in - 1]`` for height dimensions and ``[0, W_in - 1]`` for width dimensions. spatial_scale : const (Optional, default=1.0) * Additional spatial scale that multiplies the bounding box coordinates. * You would use this to implement the RoI Align layer, which typically uses unnormalized RoI coordinates along with a spatial scale that is less than or equal to ``1``. 
box_coordinate_mode: const (Optional, default="CORNERS_HEIGHT_FIRST") * Specifies the convention for specifying the four bounding box coordinates for an image of size ``(Height, Width)``. The ``(0,0)`` coordinate corresponds to the top-left corner of the image. * This parameter can take one of four values: ``"CORNERS_HEIGHT_FIRST"``: ``[h_start, w_start, h_end, w_end]`` ``"CORNERS_WIDTH_FIRST"``: ``[w_start, h_start, w_end, h_end]`` ``"CENTER_SIZE_HEIGHT_FIRST"``: ``[h_center, w_center, box_height, box_width]`` ``"CENTER_SIZE_WIDTH_FIRST"``: ``[w_center, h_center, box_width, box_height]`` sampling_mode : const (Optional, default="DEFAULT") * This parameter can take ``"STRICT_ALIGN_CORNERS"``, ``"ALIGN_CORNERS"``, ``"DEFAULT"``, ``"OFFSET_CORNERS"`` or ``UNALIGN_CORNERS`` as values. * This is the same convention used by the :py:class:`~.iOS15.image_resizing.resize_bilinear` op. pad_value : const (Optional, default=0.0) * If the box indexes go beyond the input boundary, the input image is padded with ``pad_value``. * Defaults to ``0``. * It is the same as ``extrapolation_value`` in `tf.image.crop_and_resize `_. Returns ------- tensor<[N, C, target_height, target_width], T> Attributes ---------- T: fp16, fp32 BOX_T: fp16, fp32, uint16 BOX_INDEX_T: uint16, int32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), boxes=TensorInputType(type_domain="BOX_T"), box_indices=TensorInputType(optional=True, type_domain="BOX_INDEX_T"), target_height=TensorInputType(const=True, optional=True, type_domain=types.int32), target_width=TensorInputType(const=True, optional=True, type_domain=types.int32), normalized_coordinates=TensorInputType(const=True, optional=True, type_domain=types.bool), spatial_scale=TensorInputType(const=True, optional=True, type_domain=types.fp32), box_coordinate_mode=TensorInputType(const=True, optional=True, type_domain=types.str), sampling_mode=TensorInputType(const=True, optional=True, type_domain=types.str), pad_value=TensorInputType(const=True, optional=True, type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32), "BOX_T": (types.fp16, types.fp32, types.uint16), "BOX_INDEX_T": (types.uint16, types.int32), } def default_inputs(self): if self.box_indices is None and self.boxes.shape[0] > self.x.shape[0]: # The default box indices is [0, 1, ..., N-1], which is out-of-range for N>B. raise ValueError( f'"crop_resize" op: N dimension of "boxes" ({self.boxes.shape[0]}) ' f'should not be greater than the B dimension of "x" ({self.x.shape[0]}) ' f'when "box_indices" is not specified, otherwise "box_indices" would ' f'point outside of "x" bounds.' 
) return DefaultInputs( box_indices=list(range(self.boxes.shape[0])), target_height=1, target_width=1, normalized_coordinates=False, spatial_scale=1.0, box_coordinate_mode="CORNERS_HEIGHT_FIRST", sampling_mode="DEFAULT", pad_value=0.0, ) def _validate_input(self): if self.x.rank != 4: raise ValueError( f'input to the "crop_resize" op must be of rank 4, but got {self.x.rank}' ) if self.boxes.rank != 2 or self.boxes.shape[1] != 4: raise ValueError( f'"crop_resize" op: input "boxes" must has shape [N, 4], but got {self.boxes.shape}' ) if self.box_indices.rank != 1 or self.box_indices.shape[0] != self.boxes.shape[0]: raise ValueError( f'"crop_resize" op: input "box_indices" must has shape [{self.boxes.shape[0]}], ' f"but got {self.box_indices.shape}" ) if self.box_indices.val is not None and np.any(self.box_indices.val >= self.x.shape[0]): raise ValueError( f'"crop_resize" op: input "box_indices" should not have values >= B dimension of x ' f"({self.x.shape[0]}), but got {self.box_indices.val}" ) if self.box_coordinate_mode.val not in self._VALID_BOX_COORDINATE_MODES: raise ValueError( f'"crop_resize" op: unrecognized box_coordinate_mode "{self.box_coordinate_mode.val}"' ) if self.sampling_mode.val not in self._VALID_SAMPLING_MODES: raise ValueError( f'"crop_resize" op: unrecognized sampling mode "{self.sampling_mode.val}"' ) if self.normalized_coordinates.val: if self.boxes.dtype not in {types.fp16, types.fp32}: raise ValueError( f'"crop_resize" op: When normalized_coordinates is set, the ' f'"boxes" must have fp16 or fp32 dtype, but got ' f"{types.builtin_to_string(self.sampling_mode.val)}" ) def type_inference(self): self._validate_input() # Output shape is [N, C, h_out, w_out]. ret_shape = [ self.boxes.shape[0], self.x.shape[1], self.target_height.val, self.target_width.val, ] return types.tensor(self.x.dtype, ret_shape) @register_op(opset_version=_IOS17_TARGET) class resample(_resample_iOS16): """ Resample the input image tensor ``x`` at the ``coordinates``. The major difference between this version and the iOS 16 :py:class:`~.iOS16.image_resizing.resample` is that `coordinates` supports int8, uint8, int16, uint16. """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), coordinates=TensorInputType(type_domain="U"), sampling_mode=TensorInputType(const=True, type_domain=types.str), padding_mode=TensorInputType(const=True, type_domain=types.str), padding_value=TensorInputType(const=True, type_domain="T"), coordinates_mode=TensorInputType(const=True, type_domain=types.str), align_corners=TensorInputType(const=True, type_domain=types.bool), ) type_domains = { "T": (types.fp16, types.fp32), "U": ( types.int8, types.uint8, types.int16, types.uint16, types.int32, types.fp16, types.fp32, ), } @register_op(opset_version=_IOS17_TARGET) class resize(Operation): """ Resizes the input tensor ``x`` by choosing the right-most ``resized_dims`` dimensions from the input shape ``shape``, and by choosing the rest from ``x`` 's shape. This iOS17 ``resize`` is a superset of iOS 15 :py:class:`~.iOS15.image_resizing.resize_bilinear` and :py:class:`~.iOS15.image_resizing.resize_nearest_neighbor`. The main benefit is that this resize op allows a use-case in dynamic tensor shapes where a tensor needs to be resized to a dynamic shape as specified by another tensor. 
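    A hedged Builder sketch of that dynamic use-case follows (the import paths, shapes,
    and the exact way ``resized_dims`` is passed are assumptions for illustration only)::

        import numpy as np
        import coremltools as ct
        from coremltools.converters.mil import Builder as mb

        @mb.program(
            input_specs=[
                mb.TensorSpec(shape=(1, 2, 3, 4)),
                mb.TensorSpec(shape=(1, 2, 6, 8)),
            ],
            opset_version=ct.target.iOS17,
        )
        def prog(x, ref):
            # Resize the last two dims of x to whatever the last two dims of ref
            # happen to be at runtime.
            dynamic_shape = mb.shape(x=ref)   # int32 tensor holding ref's shape
            return mb.resize(x=x, shape=dynamic_shape, resized_dims=np.uint32(2))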
To illustrate how output shape is inferred, the following are two examples: - Example #1:: x.shape: [1, 2, 3, 4] shape: [1, 6, 8] resized_dims: 2 The output's shape will be [1, 2, 6, 8] - Example #2:: x.shape: [1, 2, 3, is0] shape: [1, 0, 0] resized_dims: 2 The output's shape will be [1, 2, 3, is0] Parameters ---------- x: tensor<[...], T> (Required) shape: tensor<[K], U> (Required) * Restriction: ``size(shape)`` <= ``rank(x)``. * If ``shape[i]==0``, the dimension in the output tensor will instead be inferred from the corresponding element of ``x.shape()``. Note this might not be ``x.shape()[i]``, as ``size(shape)``, ``resized_dims``, and ``size(x)`` may all be different sizes. resized_dims: const tensor<[], uint32> (Required) * Restriction: ``resized_dims`` <= ``size(shape)``. interpolation_mode: const (Optional, default="LINEAR") * Available mode: ``LINEAR``, ``NEAREST_NEIGHBOR``. sampling_mode: const (Optional, default="DEFAULT") * Available mode: ``DEFAULT``, ``STRICT_ALIGN_CORNERS``, ``ALIGN_CORNERS``, ``OFFSET_CORNERS``, ``UNALIGN_CORNERS``. * For details about different sampling modes, see iOS 15 :py:class:`~.iOS15.image_resizing.resize_bilinear`. Returns ------- tensor<[...], T> Attributes ---------- T: fp16, fp32, int32 U: int32, int16, uint16, uint32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), shape=TensorInputType(type_domain="U"), resized_dims=TensorInputType(const=True, type_domain=types.uint32), interpolation_mode=TensorInputType(const=True, optional=True, type_domain=types.str), sampling_mode=TensorInputType(const=True, optional=True, type_domain=types.str), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), "U": (types.int32, types.int16, types.uint16, types.uint32), } _VALID_INTERPOLATION_MODES = {"LINEAR", "NEAREST_NEIGHBOR"} _VALID_SAMPLING_MODE = { "DEFAULT", "STRICT_ALIGN_CORNERS", "ALIGN_CORNERS", "OFFSET_CORNERS", "UNALIGN_CORNERS", } def default_inputs(self): return DefaultInputs( interpolation_mode="LINEAR", sampling_mode="DEFAULT", ) def _validate_input(self): if self.shape.val is not None: shape_element_num = self.shape.val.size if self.resized_dims.val > shape_element_num: raise ValueError( f"The resized_dims ({self.resized_dims.val}) must <= shape's size ({shape_element_num})" ) if shape_element_num > self.x.rank: raise ValueError( f"The shape's size ({shape_element_num}) must <= x's rank ({self.x.rank})" ) if self.shape.rank != 1: raise ValueError(f"The shape's rank must be 1, but got {self.shape.rank}") if self.interpolation_mode.val not in self._VALID_INTERPOLATION_MODES: raise ValueError( f"Invalid interpolation_mode {self.interpolation_mode.val}. Supported modes are: {self._VALID_INTERPOLATION_MODES}" ) if self.sampling_mode.val not in self._VALID_SAMPLING_MODE: raise ValueError( f"Invalid sampling_mode {self.sampling_mode.val}. Supported modes are: {self._VALID_SAMPLING_MODE}" ) def type_inference(self): self._validate_input() # The output tensor will have the same rank as the input tensor. The rightmost resized_dims # dimensions of the output_shape will be taken from the input "shape". ret_shape = list(self.x.shape) start_idx = self.shape.shape[0] - self.resized_dims.val for i in range(self.resized_dims.val): target_shape = ( get_new_symbol() if self.shape.val is None else self.shape.val[start_idx + i] ) if target_shape == 0: # The 0 in `shape` means inheriting from x's shape. 
target_shape = self.x.shape[self.x.rank - self.resized_dims.val + i] ret_shape[self.x.rank - self.resized_dims.val + i] = target_shape return types.tensor(self.x.dtype, ret_shape) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS17/linear.py0000644000000000000000000001057714672066616025017 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import InputSpec, TensorInputType from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs.iOS15.linear import linear as _linear_iOS15 from coremltools.converters.mil.mil.ops.defs.iOS15.linear import matmul as _matmul_iOS15 from coremltools.converters.mil.mil.ops.defs.iOS17 import _IOS17_TARGET @register_op(opset_version=_IOS17_TARGET) class linear(_linear_iOS15): """ A version of ``linear`` for iOS 17+. The only difference between this version and the iOS 15 :py:class:`~.iOS15.linear.linear` is that the ``weight`` and ``bias`` may have a different dtype than the input/output. Parameters ---------- x: tensor<[\*D, D_in], T> (Required) * ``1 <= rank <= 3``. * ``0 <= rank(*D) <= 2``. weight: const tensor<[D_out, D_in], U> (Required) bias: const tensor<[D_out], U> (Optional) * Default to ``0``. Returns ------- tensor<[\*D, D_out], T> * Same rank as the input ``x``. Attributes ---------- T: fp16, fp32, i32 U: fp16, fp32, i32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), weight=TensorInputType(const=True, type_domain="U"), bias=TensorInputType(const=True, optional=True, type_domain="U"), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), "U": (types.fp16, types.fp32, types.int32), } @register_op(opset_version=_IOS17_TARGET) class matmul(_matmul_iOS15): """ A version of ``matmul`` for iOS 17+. The only difference between this version and the iOS 15 :py:class:`~.iOS15.linear.matmul` is that the ``x`` and ``y`` can have a different dtypes when one of them is const. Parameters ---------- x: tensor<[\*, K1], T> (Required) * ``x`` must be 1-D or higher. y: tensor<[\*, K2], U> (Required) * ``y`` must be 1-D or higher. transpose_x: const bool (Optional) * Default to ``False``. * Use ``True`` to transpose the last two dimensions of ``x`` before multiplication. It has no effect when ``x`` is 1-D. transpose_y: const bool (Optional) * Default to ``False``. * Use ``True`` to transpose the last two dimensions of ``y`` before multiplication. It has no effect when ``y`` is 1-D. Returns ------- tensor<\*, V> * Scalar or tensor output. * When ``x`` and ``y`` are both const or both non-const, it should follow ios15 behavior that ``x``, ``y``, and ``output`` all have the same dtype. When one of x and y is const, the output dtype should be the same as the non-const one. 
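    * For example (shapes and dtypes below are illustrative only)::

          x: tensor<[2, 4], fp16>   (non-const)
          y: tensor<[4, 3], fp32>   (const)
          matmul(x, y) -> tensor<[2, 3], fp16>   # output follows the non-const operand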
Attributes ---------- T: fp16, fp32, i32 U: fp16, fp32, i32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), y=TensorInputType(type_domain="U"), transpose_x=TensorInputType(const=True, optional=True, type_domain=types.bool), transpose_y=TensorInputType(const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), "U": (types.fp16, types.fp32, types.int32), } def type_inference(self): x_is_const = self.x.op is not None and self.x.op.op_type == "const" y_is_const = self.y.op is not None and self.y.op.op_type == "const" if x_is_const == y_is_const and self.x.dtype != self.y.dtype: is_const_str = "const" if x_is_const else "non-const" raise ValueError( f'In op "matmul", when x and y are both {is_const_str}, their dtype ' f"need to match, but got x as {types.builtin_to_string(self.x.dtype)} " f"and y as {types.builtin_to_string(self.y.dtype)}" ) inferred_type = super().type_inference() if x_is_const != y_is_const: # The output dtype should be the same as the non-const one. output_dtype = self.x.dtype if y_is_const else self.y.dtype inferred_type = types.tensor(output_dtype, inferred_type.get_shape()) return inferred_type ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS17/normalization.py0000644000000000000000000001264614672066616026432 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import InputSpec, TensorInputType from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs.iOS15.normalization import ( batch_norm as _batch_norm_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS15.normalization import ( instance_norm as _instance_norm_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS15.normalization import l2_norm as _l2_norm_iOS15 from coremltools.converters.mil.mil.ops.defs.iOS15.normalization import ( layer_norm as _layer_norm_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS15.normalization import ( local_response_norm as _local_response_norm_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS17 import _IOS17_TARGET @register_op(opset_version=_IOS17_TARGET) class batch_norm(_batch_norm_iOS15): """ Normalize input tensor ``x`` by ``mean`` and ``variance``, and optionally apply a scale ``gamma`` and an offset ``beta``: .. math:: y_i = \\gamma_i \\dfrac{ (x_i - mean_i)}{\\sqrt{variance_i + epsilon}} + beta_i \\;,\\;i=1,....,C The difference between this version and the iOS 15 :py:class:`~.iOS15.normalization.batch_norm` is that input/output can have different dtypes from other parameters. 
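    For example, the following dtype combination, which iOS 15 would reject, is allowed
    here (shapes are illustrative only)::

        x:        tensor<[1, 3, 5, 5], fp16>
        mean:     const tensor<[3], fp32>
        variance: const tensor<[3], fp32>
        batch_norm(x, mean, variance) -> tensor<[1, 3, 5, 5], fp16>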
""" input_spec = InputSpec( x=TensorInputType(type_domain="T"), mean=TensorInputType(const=True, type_domain="U"), variance=TensorInputType(const=True, type_domain="U"), gamma=TensorInputType(const=True, optional=True, type_domain="U"), beta=TensorInputType(const=True, optional=True, type_domain="U"), epsilon=TensorInputType(const=True, optional=True, type_domain="U"), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.fp16, types.fp32), } @register_op(opset_version=_IOS17_TARGET) class instance_norm(_instance_norm_iOS15): """ Apply instance normalization to the n-dimensional input tensor. The difference between this version and the iOS 15 :py:class:`~.iOS15.normalization.instance_norm` is that input/output can have different dtypes from other parameters. """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), gamma=TensorInputType(const=True, optional=True, type_domain="U"), beta=TensorInputType(const=True, optional=True, type_domain="U"), epsilon=TensorInputType(const=True, optional=True, type_domain="U"), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.fp16, types.fp32), } @register_op(opset_version=_IOS17_TARGET) class l2_norm(_l2_norm_iOS15): """ Apply L2 normalization to the n-dimensional input tensor. That is, divide the input tensor by the square root of the sum of squares of all elements of the input. .. math:: x_i \\leftarrow \\dfrac{x_i}{\\sqrt{\\sum{x_i^2} + \\epsilon}} The difference between this version and the iOS 15 :py:class:`~.iOS15.normalization.l2_norm` is that input/output and ``epsilon`` can have different dtypes. """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), epsilon=TensorInputType(const=True, optional=True, type_domain="U"), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.fp16, types.fp32), } @register_op(opset_version=_IOS17_TARGET) class layer_norm(_layer_norm_iOS15): """ Apply layer normalization to the n-dimensional input tensor: .. math:: out = gamma * (input - E[x]) / sqrt(Var[x] + epsilon) + beta The difference between this version and the iOS 15 :py:class:`~.iOS15.normalization.layer_norm` is that input/output can have different dtypes from other parameters. """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), axes=TensorInputType(const=True, optional=True, type_domain=types.int32), gamma=TensorInputType(const=True, optional=True, type_domain="U"), beta=TensorInputType(const=True, optional=True, type_domain="U"), epsilon=TensorInputType(const=True, optional=True, type_domain="U"), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.fp16, types.fp32), } @register_op(opset_version=_IOS17_TARGET) class local_response_norm(_local_response_norm_iOS15): """ Apply local response normalization to the n-dimensional input tensor: .. math:: x_i \\leftarrow \\dfrac{x_i}{\\left ( k + \\dfrac{\\alpha}{\\text{size}} \\sum_j x_j^2 \\right )^\\beta} The difference between this version and the iOS 15 :py:class:`~.iOS15.normalization.local_response_norm` is that input/output can have different dtypes from other parameters. 
""" input_spec = InputSpec( x=TensorInputType(type_domain="T"), size=TensorInputType(const=True, type_domain=types.int32), alpha=TensorInputType(const=True, optional=True, type_domain="U"), beta=TensorInputType(const=True, optional=True, type_domain="U"), k=TensorInputType(const=True, optional=True, type_domain="U"), ) type_domains = { "T": (types.fp16, types.fp32), "U": (types.fp16, types.fp32), } ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS17/quantization_ops.py0000644000000000000000000002370214672066616027146 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import InputSpec, TensorInputType from coremltools.converters.mil.mil.operation import VALUE, Operation, precondition from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs.iOS17 import _IOS17_TARGET def _rank_promoted_to_same_as_data(data, axis, param): """ Reshapes `param` to be the same shape as `data`. """ if axis is not None: axis = axis if axis >= 0 else axis + len(data.shape) if len(param.shape) == 0: return np.reshape(param, np.ones(len(data.shape), np.int32)) else: axes = [i for i in range(len(data.shape)) if i != axis] return np.expand_dims(param, axis=tuple(axes)) def _check_scale_zp_shapes(input_data, scale, zero_point, axis): def assert_vector_size_same_as_axial_dimension(param, axis_dim_size, name): if param.rank == 1 and param.shape[0] != axis_dim_size: raise ValueError( "Parameter {}, if vector, needs to have same size as the dimension size along the parameter input".format( name ) ) if scale.rank == 0: # ios17.dequantize doesn't want axis defined for scalar quant params. if axis is not None: raise ValueError("axis should not be provided to quantize if scale/zp are scalars") if zero_point is not None and zero_point.rank != 0: raise ValueError("zero_point should be a scalar if scale is a scalar") elif scale.rank == 1: if axis is None or axis.val is None: raise ValueError("axis should be provided to quantize if scale/zp are not scalars") if axis.val < -input_data.rank or axis.val >= input_data.rank: raise ValueError( "Parameter axis needs to be in the range -input.rank <= axis < input.rank" ) input_axis_dim_size = input_data.shape[axis.val] assert_vector_size_same_as_axial_dimension(scale, input_axis_dim_size, "scale") if zero_point is not None: if zero_point.rank != 1: raise ValueError("zero_point should be a vector if scale is a vector") assert_vector_size_same_as_axial_dimension( zero_point, input_axis_dim_size, "zero_point" ) else: raise ValueError("Params scale & zero_point should both be scalars or vectors") @register_op(opset_version=_IOS17_TARGET) class quantize(Operation): """ Performs affine/linear quantization on an input tensor. The original data comes from the first "input". The other parameters -- ``scale``, ``zero_point``, and ``axis`` -- describe how quantization should occur:: quantized_data = clip(round(input / scale) + zero_point) Parameters ---------- input: tensor (Required) zero_point: const tensor (Optional) * The ``zero_point`` can be either a scalar or a vector. If not provided, it is assumed to be ``0``. 
* The ``zero_point`` follows similar broadcasting rules and size constraints as ``scale``. scale: const tensor (Required) * The ``scale`` can be either a scalar or a vector. * If ``scale`` is a vector, for implementation, it is broadcasted to the following shape: - The rank of ``scale`` becomes the same as the rank of the input. - Constraint: ``size(scale-vector) == input.shape[axis]``. - For ``i == axis``, ``scale.shape[i] == input.shape[i]``. - For ``i != axis``, ``scale.shape == 1``. - For example: - Assume ``input.shape = (2, 3, 4, 5)`` and ``axis = 1``. - If ``scale`` is a vector, then ``scale.size`` needs to be equal to ``input.shape[axis]``; that is, equal to ``3``. - This is broadcasted to ``(1, 3, 1, 1)``. output_dtype: const tensor (Required) * This parameter can take ``"uint8"``, ``"int8"`` as values. * The ``output_dtype`` value must match the ``zero_point`` dtype. axis: const tensor (Optional) Returns ------- tensor Attributes ---------- SrcT: fp16, fp32 DstT: uint8, int8 """ input_spec = InputSpec( input=TensorInputType(type_domain="SrcT"), zero_point=TensorInputType(const=True, optional=True, type_domain="DstT"), scale=TensorInputType(const=True, type_domain="SrcT"), axis=TensorInputType(const=True, optional=True, type_domain=types.int32), output_dtype=TensorInputType(const=True, type_domain=types.str), ) type_domains = { "SrcT": (types.fp16, types.fp32), "DstT": (types.uint8, types.int8), } def type_inference(self): out_dtype = types.string_to_builtin(self.output_dtype.val) if out_dtype not in {types.int8, types.uint8}: raise ValueError( '"quantize" op: unrecognized output dtype "{}"'.format(self.output_dtype.val) ) if self.zero_point is not None: if out_dtype != self.zero_point.dtype: raise ValueError( "output_dtype & zero_point dtype mismatch: {}, {}".format( self.output_dtype.val, types.builtin_to_string(self.zero_point.dtype) ) ) _check_scale_zp_shapes(self.input, self.scale, self.zero_point, self.axis) return types.tensor(out_dtype, self.input.shape) @precondition(allow=VALUE) def value_inference(self): original_data = self.input.val if self.zero_point is not None: zero_point = self.zero_point.val else: zero_point = np.int8(0) if self.output_dtype.val == "int8" else np.uint8(0) scale = self.scale.val axis = None if self.axis is not None: axis = self.axis.val dtype_info = np.iinfo(zero_point.dtype) sc = _rank_promoted_to_same_as_data(original_data, axis, scale) zp = _rank_promoted_to_same_as_data(original_data, axis, zero_point) val = np.clip( np.around(original_data / sc) + zp.astype(np.float32), dtype_info.min, dtype_info.max ) return val.astype(zero_point.dtype) @register_op(opset_version=_IOS17_TARGET) class dequantize(Operation): """ Performs dequantization on an input tensor with affine/linear quantization. The quantized data comes from the first "input". The other parameters -- ``scale``, ``zero_point``, and ``axis`` -- describe how unquantized values can be extracted from it, using the following equation for affine/linear quantization:: unquantized_data = scale * (input - zero_point) Parameters ---------- input: tensor (Required) zero_point: const tensor (Optional) * The ``zero_point`` can be either a scalar or a vector. If not provided, it is assumed to be ``0``. * The ``zero_point`` follows similar broadcasting rules and size constraints as ``scale``. scale: const tensor (Required) * The ``scale`` can be either a scalar or a vector. 
* If ``scale`` is a vector, for implementation, it is broadcasted to the following shape: - The rank of ``scale`` becomes the same as the rank of the input. - Constraint: ``size(scale-vector) == input.shape[axis]``. - For ``i == axis``, ``scale.shape[i] == input.shape[i]``. - For ``i != axis``, ``scale.shape == 1``. - For example: - Assume ``input.shape = (2, 3, 4, 5)`` and ``axis = 1``. - If ``scale`` is a vector, then ``scale.size`` needs to be equal to ``input.shape[axis]``; that is, equal to ``3``. - This is broadcasted to ``(1, 3, 1, 1)``. axis: const tensor (Optional) Returns ------- tensor Attributes ---------- SrcT: uint8, int8 DstT: fp16, fp32 """ input_spec = InputSpec( input=TensorInputType(type_domain="SrcT"), zero_point=TensorInputType(const=True, optional=True, type_domain="SrcT"), scale=TensorInputType(const=True, type_domain="DstT"), axis=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "DstT": (types.fp16, types.fp32), "SrcT": (types.uint8, types.int8), } def type_inference(self): _check_scale_zp_shapes(self.input, self.scale, self.zero_point, self.axis) return types.tensor(self.scale.dtype, self.input.shape) def can_materialize_val(self) -> bool: if self.input.val is None: return False if self.scale.val is None: return False if self.zero_point is not None and self.zero_point.val is None: return False if self.axis is not None and self.axis.val is None: return False return True def materialized_val_inference(self) -> np.ndarray: if not self.can_materialize_val(): return None quantized_data = self.input.val if self.zero_point is not None: zero_point = self.zero_point.val else: zero_point = np.int8(0) if self.input.dtype == types.int8 else np.uint8(0) scale = self.scale.val axis = None if self.axis is not None: axis = self.axis.val sc = _rank_promoted_to_same_as_data(quantized_data, axis, scale) zp = _rank_promoted_to_same_as_data(quantized_data, axis, zero_point) val = sc * (quantized_data.astype(np.float32) - zp.astype(np.float32)) return val.astype(scale.dtype) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS17/recurrent.py0000644000000000000000000001051214672066616025543 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import InputSpec, TensorInputType from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs.iOS15.recurrent import gru as _gru_iOS15 from coremltools.converters.mil.mil.ops.defs.iOS15.recurrent import lstm as _lstm_iOS15 from coremltools.converters.mil.mil.ops.defs.iOS15.recurrent import rnn as _rnn_iOS15 from coremltools.converters.mil.mil.ops.defs.iOS17 import _IOS17_TARGET @register_op(opset_version=_IOS17_TARGET) class gru(_gru_iOS15): """ Gated Recurrent Unit (GRU) The only difference between this version and the iOS 15 :py:class:`~.iOS15.recurrent.gru` is adding the support for fp16. 
""" input_spec = InputSpec( x=TensorInputType(type_domain="T"), initial_h=TensorInputType(type_domain="T"), weight_ih=TensorInputType(const=True, type_domain="T"), weight_hh=TensorInputType(const=True, type_domain="T"), bias=TensorInputType(const=True, optional=True, type_domain="T"), direction=TensorInputType(const=True, optional=True, type_domain=types.str), output_sequence=TensorInputType(const=True, optional=True, type_domain=types.bool), recurrent_activation=TensorInputType(const=True, optional=True, type_domain=types.str), activation=TensorInputType(const=True, optional=True, type_domain=types.str), ) type_domains = { "T": (types.fp16, types.fp32), } @register_op(opset_version=_IOS17_TARGET) class lstm(_lstm_iOS15): """ Long Short-Term Memory (LSTM) The only difference between this version and the iOS 15 :py:class:`~.iOS15.recurrent.lstm` is adding the support for fp16. """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), initial_h=TensorInputType(type_domain="T"), initial_c=TensorInputType(type_domain="T"), weight_ih=TensorInputType(const=True, type_domain="T"), # ifoz layout, weight_hh=TensorInputType(const=True, type_domain="T"), # ifoz layout bias=TensorInputType(const=True, optional=True, type_domain="T"), # ifoz layout peephole=TensorInputType(const=True, optional=True, type_domain="T"), # ifo layout weight_ih_back=TensorInputType(const=True, optional=True, type_domain="T"), # ifoz layout, weight_hh_back=TensorInputType(const=True, optional=True, type_domain="T"), # ifoz layout bias_back=TensorInputType(const=True, optional=True, type_domain="T"), # ifoz layout peephole_back=TensorInputType(const=True, optional=True, type_domain="T"), # ifo layout direction=TensorInputType(const=True, optional=True, type_domain=types.str), output_sequence=TensorInputType(const=True, optional=True, type_domain=types.bool), recurrent_activation=TensorInputType(const=True, optional=True, type_domain=types.str), cell_activation=TensorInputType(const=True, optional=True, type_domain=types.str), activation=TensorInputType(const=True, optional=True, type_domain=types.str), clip=TensorInputType(const=True, optional=True, type_domain="T"), ) type_domains = { "T": (types.fp16, types.fp32), } @register_op(opset_version=_IOS17_TARGET) class rnn(_rnn_iOS15): """ Recurrent Neural Network (RNN) The only difference between this version and the iOS 15 :py:class:`~.iOS15.recurrent.rnn` is adding the support for fp16. """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), initial_h=TensorInputType(type_domain="T"), weight_ih=TensorInputType(const=True, type_domain="T"), weight_hh=TensorInputType(const=True, type_domain="T"), bias=TensorInputType(const=True, optional=True, type_domain="T"), direction=TensorInputType(const=True, optional=True, type_domain=types.str), output_sequence=TensorInputType(const=True, optional=True, type_domain=types.bool), activation=TensorInputType(const=True, optional=True, type_domain=types.str), ) type_domains = { "T": (types.fp16, types.fp32), } ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS17/reduction.py0000644000000000000000000001004314672066616025525 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import DefaultInputs, InputSpec, TensorInputType from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs.iOS15.reduction import reduce_arg as _reduce_arg_iOS15 from coremltools.converters.mil.mil.ops.defs.iOS17 import _IOS17_TARGET class reduce_arg(_reduce_arg_iOS15): _VALID_OUTPUT_DTYPES = ("int32", "uint16") input_spec = InputSpec( x=TensorInputType(type_domain="T"), axis=TensorInputType(const=True, optional=True, type_domain=types.int32), keep_dims=TensorInputType(const=True, optional=True, type_domain=types.bool), output_dtype=TensorInputType(const=True, optional=True, type_domain=types.str), ) type_domains = { "T": (types.fp16, types.fp32, types.int32), } def default_inputs(self): return DefaultInputs( axis=-1, keep_dims=False, output_dtype="int32", ) def type_inference(self): reduced_shape = self._find_reduced_shape() output_dtype = self.output_dtype.val.lower() if output_dtype not in self._VALID_OUTPUT_DTYPES: raise ValueError( f'Invalid "output_dtype" {output_dtype}. Only support {self._VALID_OUTPUT_DTYPES}' ) return types.tensor(types.string_to_builtin(output_dtype), tuple(reduced_shape)) @register_op(opset_version=_IOS17_TARGET) class reduce_argmax(reduce_arg): """ Computes the indices of the maximum value across dimensions of a tensor. In case of ties, the identity of the return value is not guaranteed. The differences between this version and the iOS 15 :py:class:`~.iOS15.reduction.reduce_argmax` are: * The output supports uint16 dtype. * New optional input ``output_dtype``. Parameters ---------- x: <\*,T> (Required) * Must be 1-dimensional or higher. axis: const (Optional) * The dimension to reduce. Default is ``-1``. keep_dims: const (Optional, default=False) * If ``False``, the rank is reduced by ``1`` by removing the dimension specified in ``axis``. * If ``True``, retain reduced axis with length ``1``. output_dtype: const (Optional) * Possible values: ``uint16``, ``int32``. * If set, then value type inference will output using that dtype. * Default is ``int32``. Returns ------- <\*, U> Attributes ---------- T: fp16, fp32, i32 U: int32, uint16 """ def get_operator(self): return np.argmax @register_op(opset_version=_IOS17_TARGET) class reduce_argmin(reduce_arg): """ Computes the indices of the minimum value across dimensions of a tensor. In case of ties, the identity of the return value is not guaranteed. The differences between this version and the iOS 15 :py:class:`~.iOS15.reduction.reduce_argmin` are: * The output supports uint16 dtype. * New optional input ``output_dtype``. Parameters ---------- x: <\*,T> (Required) * Must be 1-dimensional or higher. axis: const (Optional) * The dimension to reduce. Default is ``-1``. keep_dims: const (Optional, default=False) * If ``False``, the rank is reduced by ``1`` by removing the dimension specified in ``axis``, otherwise retain reduced axis with length ``1``. output_dtype: const (Optional) * Possible values: ``uint16``, ``int32``. * If set, then value type inference will output using that dtype. * Default is ``int32``. 
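        * For example, with ``output_dtype="uint16"`` the result is roughly analogous to
          the NumPy expression below (illustration only, not this op's implementation)::

              import numpy as np

              x = np.array([[3.0, 1.0, 2.0]], dtype=np.float32)
              out = np.argmin(x, axis=-1).astype(np.uint16)   # [1]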
Returns ------- <\*, U> Attributes ---------- T: fp16, fp32, i32 U: int32, uint16 """ def get_operator(self): return np.argmin ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS17/scatter_gather.py0000644000000000000000000004451014672066616026536 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import DefaultInputs, InputSpec, TensorInputType from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs.iOS15.scatter_gather import scatter as _scatter_iOS15 from coremltools.converters.mil.mil.ops.defs.iOS15.scatter_gather import ( scatter_along_axis as _scatter_along_axis_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS15.scatter_gather import ( scatter_nd as _scatter_nd_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS16.scatter_gather import gather as _gather_iOS16 from coremltools.converters.mil.mil.ops.defs.iOS16.scatter_gather import ( gather_along_axis as _gather_along_axis_iOS16, ) from coremltools.converters.mil.mil.ops.defs.iOS16.scatter_gather import ( gather_nd as _gather_nd_iOS16, ) from coremltools.converters.mil.mil.ops.defs.iOS17 import _IOS17_TARGET @register_op(opset_version=_IOS17_TARGET) class scatter(_scatter_iOS15): """ Scatter ``updates`` to ``data`` at locations ``indices`` at dimension ``axis`` by the operation ``mode``. This section documents only the differences between this version and the iOS 15 :py:class:`~.iOS15.scatter_gather.scatter`. The major differences are as follows: - Input parameter ``indices`` now supports only positive values -- negative values are considered out-of-bound. If support for negative indices is required, they must be explicitly converted to positive values using the following:: index = iOS17.select(index >= 0, index, index + max_index) - New input parameter called ``validate_indices`` has been added to all scatter ops. Its behavior is as follows: - If ``True``, it raises a runtime (possibly also a compile-time) exception for out-of-bound values of the ``indices`` parameter. - If ``False``, absolutely no checking is performed for out-of-bound values of ``indices`` either at compile or runtime. Behavior for out-of-bound indices is undefined but memory safe. Parameters ---------- data: tensor<\*D, T> (Required) indices: tensor<[C], i32> (Required) * 1-D tensor. updates: tensor<\*K, T> (Required) * ``K = data.shape[:axis] + [len(indices)] + data.shape[axis+1:]``. axis: const i32 (Optional) * Default to ``0``. mode: const string (Optional) * Can be the following modes: ``add``, ``div``, ``max``, ``min``, ``mul``, ``sub``, ``update``. * Default value is ``update``. validate_indices: const bool (Optional) * If ``True``, it raises a runtime (possibly also a compile-time) exception for out-of-bound values of the ``indices`` parameter. * If ``False``, absolutely no checking is performed for out-of-bound values of ``indices`` either at compile or runtime. Behavior for out-of-bound indices is undefined but memory safe. * Default value is ``False``. Returns ------- tensor<\*D, T> * With the same type and shape as input ``x``. 
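    For example, a plain NumPy analogue of ``mode="add"`` along ``axis=0`` (illustration
    only, not this op's implementation)::

        import numpy as np

        data = np.array([1.0, 2.0, 3.0], dtype=np.float32)
        indices = np.array([0, 2], dtype=np.int32)      # must be non-negative in iOS 17+
        updates = np.array([10.0, 20.0], dtype=np.float32)

        out = data.copy()
        np.add.at(out, indices, updates)                # out == [11., 2., 23.]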
Attributes ---------- T: fp16, fp32, i32 """ input_spec = InputSpec( data=TensorInputType(type_domain="T"), indices=TensorInputType(type_domain=types.int32), updates=TensorInputType(type_domain="T"), axis=TensorInputType(const=True, optional=True, type_domain=types.int32), mode=TensorInputType(const=True, optional=True, type_domain=types.str), validate_indices=TensorInputType(const=True, optional=True, type_domain=types.bool), ) def default_inputs(self): return DefaultInputs( axis=0, mode="add", validate_indices=False, ) def type_inference(self): result = super().type_inference() if self.validate_indices.val: indices = self.indices.val if indices is not None: if np.count_nonzero( np.logical_or(indices < 0, indices >= self.data.shape[self.axis.val]) ): raise IndexError( f"Indices is out of bounds for `{self.op_type}` node {self.name}. " f"Expected indices between [0, {self.data.shape[self.axis.val]}), but got {indices}." ) return result @register_op(opset_version=_IOS17_TARGET) class scatter_along_axis(_scatter_along_axis_iOS15): """ Scatter ``updates`` to ``data`` at locations ``indices`` along ``axis`` dimension using the ``mode`` operation. The major differences from the previous version are illustrated in :py:class:`scatter`. For more information, see the iOS 15 :py:class:`~.iOS15.scatter_gather.scatter_along_axis`. Parameters ---------- data: tensor<\*D, T> (Required) indices: tensor<\*K, i32> (Required) * ``rank(indices) == rank(data)``. updates: tensor<\*K, T> (Required) * Must be the same shape as ``indices``. axis: const i32 (Optional) * Default to ``0``. mode: const string (Optional) * Default to ``add``. * Can be the following modes: ``add``, ``div``, ``max``, ``min``, ``mul``, ``sub``, ``update``. validate_indices: const bool (Optional) * If ``True``, it raises a runtime (possibly also a compile-time) exception for out-of-bound values of the ``indices`` parameter. * If ``False``, absolutely no checking is performed for out-of-bound values of ``indices`` either at compile or runtime. Behavior for out-of-bound indices is undefined but memory safe. * Default value is ``False``. Returns ------- tensor<\*D, T> * With the same type and shape as input ``x``. Attributes ---------- T: fp16, fp32, i32 """ input_spec = InputSpec( data=TensorInputType(type_domain="T"), indices=TensorInputType(type_domain=types.int32), updates=TensorInputType(type_domain="T"), axis=TensorInputType(const=True, optional=True, type_domain=types.int32), mode=TensorInputType(const=True, optional=True, type_domain=types.str), validate_indices=TensorInputType(const=True, optional=True, type_domain=types.bool), ) def default_inputs(self): return DefaultInputs( axis=0, mode="add", validate_indices=False, ) def type_inference(self): result = super().type_inference() if self.validate_indices.val: indices = self.indices.val if indices is not None: if np.count_nonzero( np.logical_or(indices < 0, indices >= self.data.shape[self.axis.val]) ): raise IndexError( f"Indices is out of bounds for `{self.op_type}` node {self.name}. " f"Expected indices between [0, {self.data.shape[self.axis.val]}), but got {indices}." ) return result @register_op(opset_version=_IOS17_TARGET) class scatter_nd(_scatter_nd_iOS15): """ Scatter ``updates`` to ``data`` at locations ``indices``. The major differences from the previous version are illustrated in :py:class:`scatter`. For more information, see the iOS 15 :py:class:`~.iOS15.scatter_gather.scatter_nd`. 
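    For example, a plain NumPy analogue of ``mode="add"`` (illustration only, not this
    op's implementation)::

        import numpy as np

        data = np.zeros((2, 3), dtype=np.float32)
        indices = np.array([[0, 1], [1, 2]], dtype=np.int32)   # non-negative in iOS 17+
        updates = np.array([10.0, 20.0], dtype=np.float32)

        out = data.copy()
        for idx, upd in zip(indices, updates):
            out[tuple(idx)] += upd
        # out == [[0., 10., 0.],
        #         [0.,  0., 20.]]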
Parameters ---------- data: tensor<\*D, T> (Required) indices: tensor<\*K, i32> (Required) updates: tensor<\*K, T> (Required) * Must be the shape as ``K[:-1]+data.shape[K[-1]:]``. mode: const string (Optional) * Default to ``add``. * Can be the following modes: ``add``, ``div``, ``max``, ``min``, ``mul``, ``sub``, ``update``. validate_indices: const bool (Optional) * If ``True``, it raises a runtime (possibly also a compile-time) exception for out-of-bound values of the ``indices`` parameter. * If ``False``, absolutely no checking is performed for out-of-bound values of ``indices`` either at compile or runtime. Behavior for out-of-bound indices is undefined but memory safe. * Default value is ``False``. Returns ------- tensor<\*D, T> * A tensor with the same shape and type as ``data``. Attributes ---------- T: fp16, fp32, i32 """ input_spec = InputSpec( data=TensorInputType(type_domain="T"), indices=TensorInputType(type_domain=types.int32), updates=TensorInputType(type_domain="T"), mode=TensorInputType(const=True, optional=True, type_domain=types.str), validate_indices=TensorInputType(const=True, optional=True, type_domain=types.bool), ) def default_inputs(self): return DefaultInputs( mode="add", validate_indices=False, ) def type_inference(self): result = super().type_inference() if self.validate_indices.val: indices = self.indices.val upper_bound = self.data.shape if indices is not None: if np.count_nonzero(np.logical_or(indices < 0, indices >= upper_bound)): raise IndexError( f"Indices is out of bounds for `{self.op_type}` node {self.name}. " f"Expected indices between [0, {upper_bound}), but got {indices}." ) return result @register_op(opset_version=_IOS17_TARGET) class gather(_gather_iOS16): """ Gather slices from input ``x`` along dimension ``axis`` according to ``indices``, similar to `tf.gather_nd `_. This section documents only the differences between this version and the iOS 16 :py:class:`~.iOS16.scatter_gather.gather`. The major differences are as follows: - Input parameter ``x`` adds support for ``int16``, ``uint16``, ``int8``, and ``uint8``. - Input parameter ``indices`` adds support for ``int8`` and ``uint8``. - Input parameter ``indices`` now supports only positive values -- negative values are considered out-of-bound. If support for negative indices is required, they must be explicitly converted to positive values, using the following:: index = iOS17.select(index >= 0, index, index + max_index) - New input parameter called ``validate_indices`` has been added to all gather ops. Its behavior is as follows: - If ``True``, it raises a runtime (possibly also a compile-time) exception for out-of-bound values of the ``indices`` parameter. - If ``False``, absolutely no checking is performed for out-of-bound values of ``indices`` either at compile or runtime. Behavior for out-of-bound indices is undefined but memory safe. Parameters ---------- x: tensor<\*D, T> (Required) indices: tensor<\*N, I> (Required) * Indices values may be negative. More precisely, ``-D[axis]<= v < D[axis]`` for ``v`` in ``indices``. axis: const i32 (Optional. Default=``0``) * Negative axis is supported. batch_dims: const i32 (Optional. Default=``0``) * The number of batch dimensions. validate_indices: const bool (Optional) * If ``True``, it raises a runtime (possibly also a compile-time) exception for out-of-bound values of the ``indices`` parameter. * If ``False``, absolutely no checking is performed for out-of-bound values of ``indices`` either at compile or runtime. 
Behavior for out-of-bound indices is undefined but memory safe. * Default value is ``False``. Returns ------- tensor<\*K, T> * Where ``K = D[:axis] + N[batch_dims:] + D[axis+1:]``. Attributes ---------- T: fp16, fp32, int32, int16, uint16, int8, uint8 I: int32, int16, uint16, int8, uint8 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), indices=TensorInputType(type_domain="I"), axis=TensorInputType(const=True, optional=True, type_domain=types.int32), batch_dims=TensorInputType(const=True, optional=True, type_domain=types.int32), validate_indices=TensorInputType(const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": ( types.fp16, types.fp32, types.int32, types.int16, types.uint16, types.int8, types.uint8, ), "I": (types.int32, types.int16, types.uint16, types.int8, types.uint8), } def default_inputs(self): return DefaultInputs(axis=0, batch_dims=0, validate_indices=False) def type_inference(self): result = super().type_inference() if self.validate_indices.val: indices = self.indices.val if indices is not None: if np.count_nonzero( np.logical_or(indices < 0, indices >= self.x.shape[self.axis.val]) ): raise IndexError( f"Indices is out of bounds for `{self.op_type}` node {self.name}. " f"Expected indices between [0, {self.x.shape[self.axis.val]}), but got {indices}." ) return result @register_op(opset_version=_IOS17_TARGET) class gather_along_axis(_gather_along_axis_iOS16): """ Take the values along ``axis`` at locations ``indices``. The major differences from the previous version are illustrated in :py:class:`gather`. For more information, see the iOS 16 :py:class:`~.iOS16.scatter_gather.gather_along_axis`. Parameters ---------- x: tensor<\*D, T> (Required) indices: tensor<\*K, I> (Required) * ``rank(indices) == rank(x)``. axis: const i32 (Optional): * Default to ``0``. validate_indices: const bool (Optional) * If ``True``, it raises a runtime (possibly also a compile-time) exception for out-of-bound values of the ``indices`` parameter. * If ``False``, absolutely no checking is performed for out-of-bound values of ``indices`` either at compile or runtime. Behavior for out-of-bound indices is undefined but memory safe. * Default value is ``False``. Returns ------- tensor<\*D, T>: * Output tensor has the same shape as ``indices``. Attributes ---------- T: fp16, fp32, int32, int16, uint16, int8, uint8 I: int32, int16, uint16, int8, uint8 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), indices=TensorInputType(type_domain="I"), axis=TensorInputType(const=True, optional=True, type_domain=types.int32), validate_indices=TensorInputType(const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": ( types.fp16, types.fp32, types.int32, types.int16, types.uint16, types.int8, types.uint8, ), "I": (types.int32, types.int16, types.uint16, types.int8, types.uint8), } def default_inputs(self): return DefaultInputs( axis=0, validate_indices=False, ) def type_inference(self): result = super().type_inference() if self.validate_indices.val: indices = self.indices.val if indices is not None: upper_bound = self.x.shape[self.axis.val] if np.count_nonzero(np.logical_or(indices < 0, indices >= upper_bound)): raise IndexError( f"Indices is out of bounds for `{self.op_type}` node {self.name}. " f"Expected indices between [0, {upper_bound}), but got {indices}." ) return result @register_op(opset_version=_IOS17_TARGET) class gather_nd(_gather_nd_iOS16): """ Gather slices from ``x`` according to ``indices``, similar to `tf.gather_nd`. 
The major differences from the previous version are illustrated in :py:class:`gather`. For more information, see the iOS 16 :py:class:`~.iOS16.scatter_gather.gather_nd`. Parameters ---------- x: tensor<\*D, T> (Required) indices: tensor<\*K, I> (Required) batch_dims: const i32 (Optional. Default=``0``) * The number of batch dimensions. validate_indices: const bool (Optional) * If ``True``, it raises a runtime (possibly also a compile-time) exception for out-of-bound values of the ``indices`` parameter. * If ``False``, absolutely no checking is performed for out-of-bound values of ``indices`` either at compile or runtime. Behavior for out-of-bound indices is undefined but memory safe. * Default value is ``False``. Returns ------- tensor<\*V, T> * ``V = K[:-1] + D[batch_dims + K[-1]:]``, where ``D = x.shape`` and ``K = indices.shape``. Attributes ---------- T: fp16, fp32, int32, int16, uint16, int8, uint8 I: int32, int16, uint16, int8, uint8 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), indices=TensorInputType(type_domain="I"), batch_dims=TensorInputType(const=True, optional=True, type_domain=types.int32), validate_indices=TensorInputType(const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": ( types.fp16, types.fp32, types.int32, types.int16, types.uint16, types.int8, types.uint8, ), "I": (types.int32, types.int16, types.uint16, types.int8, types.uint8), } def default_inputs(self): return DefaultInputs( batch_dims=0, validate_indices=False, ) def type_inference(self): result = super().type_inference() if self.validate_indices.val: indices = self.indices.val upper_bound = self.x.shape if indices is not None: if np.count_nonzero(np.logical_or(indices < 0, indices >= upper_bound)): raise IndexError( f"Indices is out of bounds for `{self.op_type}` node {self.name}. " f"Expected indices between [0, {upper_bound}), but got {indices}." ) return result ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS17/tensor_operation.py0000644000000000000000000001734014672066616027132 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import DefaultInputs, InputSpec, TensorInputType from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs.iOS15.tensor_operation import ( non_maximum_suppression as _nms_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS16.tensor_operation import topk as _topk_iOS16 from coremltools.converters.mil.mil.ops.defs.iOS17 import _IOS17_TARGET @register_op(opset_version=_IOS17_TARGET) class non_maximum_suppression(_nms_iOS15): """ Performs non-maximum suppression (NMS) on the boxes according to their intersection-over-union (IoU). NMS iteratively removes lower-scoring boxes which have an IoU greater than ``iou_threshold`` with another (higher-scoring) box. The major differences between this version and the iOS 15 :py:class:`~.iOS15.tensor_operation.non_maximum_suppression` are as follows: - The input parameter ``score_threshold`` has been removed. - The inputs ``boxes`` and ``scores`` are ordered with number of boxes in the last dimension. 
- The fourth output containing number of boxes for each batch has been removed. Parameters ---------- boxes: tensor<[n, 4, B], T> (Required) * Box coordinates on which to perform NMS. The coordinates are expected in ``CENTER_SIZE_WIDTH_FIRST`` format ``(x, y, width, height)``, in which ``(x, y)`` is the center. scores: tensor<[n, K, B], T> (Required) * Scores for each one of the boxes. ``K`` is the number of classes. iou_threshold: const (Required) * The intersection over union (IoU) threshold over which boxes are suppressed. NMS remove all overlapping boxes with ``IoU > iou_threshold``. max_boxes: const (Required) * Maximum number of boxes to select. If the number of surviving boxes are less, the output is padded up to this number. per_class_suppression: const (Optional) * Defaults to ``False``. * If ``True``, suppression is performed independently within boxes of each class. Returns ------- tensor<[n, 4, max_boxes], T> * Coordinates of selected boxes. tensor<[n, K, max_boxes], T> * Scores of selected boxes. tensor<[n, max_boxes], i32> * Indices of selected boxes. Attributes ---------- T: fp16, fp32 """ input_spec = InputSpec( boxes=TensorInputType(type_domain="T"), scores=TensorInputType(type_domain="T"), iou_threshold=TensorInputType(const=True, type_domain="T"), max_boxes=TensorInputType(const=True, type_domain=types.int32), per_class_suppression=TensorInputType(const=True, optional=True, type_domain=types.bool), ) def type_inference(self): boxes_dtype = self.boxes.dtype scores_dtype = self.scores.dtype n_batch, n_score_class, _ = self.scores.shape max_boxes = self.max_boxes.val return ( types.tensor(boxes_dtype, (n_batch, 4, max_boxes)), types.tensor(scores_dtype, (n_batch, n_score_class, max_boxes)), types.tensor(types.int32, (n_batch, max_boxes)), ) @register_op(opset_version=_IOS17_TARGET) class topk(_topk_iOS16): """ A version of ``topk`` for iOS 17+. The differences between this version and the iOS 16 :py:class:`~.iOS16.tensor_operation.topk` are: - New data type support. The newly added data types are: - int8, uint8, int16, unint16 for ``x`` and output. - int8, int16 for ``k``. - Validation restrictions on the optional ``indices`` output must be either uint16 or int32. - A new input parameter ``output_indices_dtype`` has been added to set the dtype of output ``indices``. Parameters ---------- x: <\*?, T> (Required) * Input tensor. k: const (Optional) * Defaults to ``1``. * Number of values/indices to be computed along each axis. * Set to ``-1`` to select all elements. axis: const (Optional) * Defaults to ``-1`` (last dimension). * Axis to perform the operation. ascending: const (Optional) * Defaults to ``False``, sort in descending order. * ``True`` to sort in ascending order. sort: const (Optional) * Defaults to ``True``. * If ``True``, ``top-k`` elements are themselves sorted. Otherwise, no particular ordering is guaranteed. return_indices: const (Optional) * Defaults to ``True``. * If ``True``, returns both values and indices. Otherwise, returns only the ``top-k`` values. output_indices_dtype: const (Optional, default="int32") * It can only be set when ``return_indices`` is ``True``. * This parameter can take ``"int32"`` or ``"uint16"`` as values. Returns ------- tensor<\*?, T> * Values of top/bottom ``k`` elements. tensor<\*?, U> * Only returned when ``return_indices = True`` * Indices of the top/bottom ``k`` elements along axis. * U is int32 or uint16 determined by ``output_indices_dtype`` (int32 by default). 
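    A rough NumPy sketch of the behavior for a 1-D input with ``ascending=False``,
    ``return_indices=True``, and ``output_indices_dtype="uint16"`` (illustration only;
    the values are hypothetical)::

        import numpy as np

        x = np.array([4.0, 1.0, 7.0, 3.0], dtype=np.float32)
        k = 2
        order = np.argsort(-x)[:k]           # positions of the two largest values
        values = x[order]                    # [7.0, 4.0]
        indices = order.astype(np.uint16)    # [2, 0]; dtype follows output_indices_dtype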
Attributes ---------- T: fp16, fp32, int8, int16, int32, uint8, uint16 K: int8, int16, int32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), k=TensorInputType(const=True, optional=True, type_domain="K"), axis=TensorInputType(const=True, optional=True, type_domain=types.int32), ascending=TensorInputType(const=True, optional=True, type_domain=types.bool), sort=TensorInputType(const=True, optional=True, type_domain=types.bool), return_indices=TensorInputType(const=True, optional=True, type_domain=types.bool), output_indices_dtype=TensorInputType(const=True, optional=True, type_domain=types.str), ) type_domains = { "T": ( types.fp16, types.fp32, types.int8, types.int16, types.int32, types.uint8, types.uint16, ), "K": (types.int8, types.int16, types.int32), } _ALLOWED_OUTPUT_INDICES_DTYPES = {"int32", "uint16"} def default_inputs(self): parent_default_inputs = super().default_inputs() # If return_indices is not set, it is default to True. # output_indices_dtype can only be set when return_indices = True if self.return_indices is None or self.return_indices.val: return parent_default_inputs + DefaultInputs(output_indices_dtype="int32") return parent_default_inputs def type_inference(self): if not self.return_indices.val and self.output_indices_dtype is not None: raise ValueError( 'In iOS17 topk op, "output_indices_dtype" can only be set when "return_indices=True".' ) if self.return_indices.val: if self.output_indices_dtype.val not in self._ALLOWED_OUTPUT_INDICES_DTYPES: raise ValueError( f'"topk" op invalid output_indices_dtype: "{self.output_indices_dtype.val}". ' f"Valid options are: {self._ALLOWED_OUTPUT_INDICES_DTYPES}" ) value_type, indices_type = super().type_inference() indices_type = types.tensor( types.string_to_builtin(self.output_indices_dtype.val), indices_type.get_shape() ) return value_type, indices_type else: return super().type_inference() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS17/tensor_transformation.py0000644000000000000000000004777714672066616030221 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import List from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import InputSpec, TensorInputType, TupleInputType from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs.iOS15.tensor_transformation import ( expand_dims as _expand_dims_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS15.tensor_transformation import ( reshape as _reshape_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS15.tensor_transformation import ( reverse as _reverse_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS15.tensor_transformation import ( reverse_sequence as _reverse_sequence_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS15.tensor_transformation import ( slice_by_index as _slice_by_index_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS15.tensor_transformation import ( slice_by_size as _slice_by_size_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS15.tensor_transformation import ( sliding_windows as _sliding_windows_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS15.tensor_transformation import ( squeeze as _squeeze_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS15.tensor_transformation import ( transpose as _transpose_iOS15, ) from coremltools.converters.mil.mil.ops.defs.iOS16.tensor_transformation import ( reshape_like as _reshape_like_iOS16, ) from coremltools.converters.mil.mil.ops.defs.iOS17 import _IOS17_TARGET @register_op(opset_version=_IOS17_TARGET) class reshape(_reshape_iOS15): """ Return a tensor that has the same values as ``x`` with shape ``shape``. ``shape`` must have the same volume (number of elements) as ``x``. The major differences between this version and the iOS 15 :py:class:`~.iOS15.tensor_transformation.reshape` are as follows: - When the ``shape`` contains ``0``, the restriction about ``K == rank(x)`` is no longer enforced. Each ``0`` in ``shape`` will match the corresponding dimension in ``x.shape``, counting from the rightmost element. So ``shape[i]`` matches ``input[j]`` if ``length(shape)-i == rank(input)-j``. If a ``0`` is out of range, assign ``1`` (equivalent to ``expand_dims`` for ``x.shape``). More specifically, when ``x.shape`` is ``[2, 50]`` and ``shape`` is ``[1, 0, -1, 0]``, it will error out in iOS 15 or iOS 16 because ``x`` has rank ``2`` while the ``len`` of ``shape`` is ``4``. In iOS 17, the result will have ``shape`` ``[1, 1, 2, 50]``, because the rightmost ``0`` will be changed to the rightmost dim of ``x.shape``, which is ``50``. There is no other ``0`` that has a corresponding dim in ``x.shape``, so it is set as ``1``. Finally, the ``-1`` is calculated based on knowing dimensions that produce ``2``. - Support more data types, including int8, uint8, int16, uint16 for ``x`` and int8, int16 for ``shape``. Parameters ---------- x: tensor<\*?, T> (Required) * An ``n-D`` tensor or a scalar. * If ``x`` has a fixed rank (and possibly contains symbolic dimension), ``shape`` may contain elements that are not positive integers (see below). * If ``x`` has a variadic rank, ``shape`` can only contain positive integers. shape: tensor<[K], U> (Required) A 1-D tensor, with elements from the following: * Positive integers. * Symbols: All but one symbol in ``shape`` must be present in ``x.shape``. 
The new symbol that is not present in ``x.shape`` represents a dimension such that the total size remains constant. Symbol is illegal if ``x`` has a variadic rank. * ``-1``: ``-1`` introduces a new symbol (see Symbols). Therefore, ``-1`` is allowed if all symbols in the ``shape`` appear in ``x.shape``. ``-1`` is illegal if ``x`` has a variadic rank. * ``0``: It will match the corresponding dimension in ``x.shape``. See the previous description of different behaviors with iOS 17. Returns ------- tensor<\*?, T> * Tensor with shape determined by the input shape. Attributes ---------- T: fp16, fp32, int8, uint8, int16, uint16, int32, bool U: int8, int16, int32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), shape=TensorInputType(type_domain="U"), ) type_domains = { "T": ( types.fp16, types.fp32, types.int8, types.uint8, types.int16, types.uint16, types.int32, types.bool, ), "U": (types.int8, types.int16, types.int32), } @staticmethod def replace_zeros_in_shape(from_shape: List[int], to_shape: List[int]) -> List[int]: """ Replaces 0s in `to_shape` by the corresponding dims in `from_shape`. Overrides IOS15's method to demonstrate IOS17's different behaviours. """ if to_shape.count(0): # To do right alignment, we reverse the input and do left alignment instead. from_shape_reversed = from_shape[::-1] to_shape_reversed = to_shape[::-1] for idx, to_element in enumerate(to_shape_reversed): if to_element == 0: to_shape_reversed[idx] = ( from_shape_reversed[idx] if idx < len(from_shape_reversed) else 1 ) # Reverse the result back to make the right alignment. to_shape = to_shape_reversed[::-1] return to_shape @register_op(opset_version=_IOS17_TARGET) class reshape_like(_reshape_like_iOS16): """ Reshape a tensor to an output shape specified by some or all dimensions of a tuple of reference tensors ``ref_tensors``. The major difference between this version and the iOS 15 :py:class:`~.iOS16.tensor_transformation.reshape_like` is that input ``x`` and ``ref_tensors`` supports more data types: int8, uint8, int16, uint16. Parameters ---------- x: tensor<\*?, T> (Required) * The input tensor to be reshaped. ref_tensors: Tuple[tensor<\*?, R>] (Required) * A tuple of tensors that define the output shape. begins: Tuple[const] (Required) * A tuple of integers specifying the begin index into the shape vector of the corresponding ``ref_tensor``. ends: Tuple[const] (Required) * A tuple of integers specifying the end index into the shape vector of the corresponding ``ref_tensor``. end_masks: Tuple[const] (Required) * If ``True``, select all axes from the begin index until the end of the corresponding ``ref_tensor``, as in ``ref_tensors[i].shape[begins[i]:]``. Returns ------- tensor<\*?, T> * Same type as input tensor ``x``. * Output shape is computed by ``ref_tensors``, ``begins``, ``ends``, and ``end_masks``. Attributes ---------- T: fp16, fp32, int8, int16, int32, uint8, uint16, bool R: fp16, fp32, int8, int16, int32, uint8, uint16, bool """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), ref_tensors=TupleInputType(), begins=TupleInputType(), ends=TupleInputType(), end_masks=TupleInputType(), ) type_domains = { "T": ( types.fp16, types.fp32, types.int8, types.int16, types.int32, types.uint8, types.uint16, types.bool, ), } @register_op(opset_version=_IOS17_TARGET) class expand_dims(_expand_dims_iOS15): """ Insert a single-dimension in a 1-D or higher tensor at each axis in axes. 
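    A rough NumPy analogue (illustration only; the shape is hypothetical)::

        import numpy as np

        x = np.zeros((2, 3), dtype=np.float16)
        y = np.expand_dims(x, axis=(0, 2))   # axes = [0, 2] -> result shape (1, 2, 1, 3)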
The major difference between this version and the iOS 15 :py:class:`~.iOS15.tensor_transformation.expand_dims` is that input ``x`` supports more data types: int8, uint8, int16, uint16. Parameters ---------- x: tensor<\*?, T> (Required) * Scalar or tensor. axes: const tensor<[K], int32> Required * ``K`` is the number of dimensions expanded. * Insert single dimension at dimension index at each axes. * Negative value to index from the end. ``-d-1 <= axis <= d`` where ``d`` is the rank of ``x``. Returns ------- tensor<\*(rank(x)+K), T> * Same type as the input ``x`` with rank ``rank(x)+K``. Attributes ---------- T: fp16, fp32, int8, int16, int32, uint8, uint16, bool """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), axes=TensorInputType(const=True, type_domain=types.int32), ) type_domains = { "T": ( types.fp16, types.fp32, types.int8, types.int16, types.int32, types.uint8, types.uint16, types.bool, ), } @register_op(opset_version=_IOS17_TARGET) class squeeze(_squeeze_iOS15): """ Remove single-dimension dimensions in a 1-D or higher tensor. The major difference between this version and the iOS 15 :py:class:`~.iOS15.tensor_transformation.squeeze` is that input ``x`` supports more data types: int8, uint8, int16, uint16. Parameters ---------- x: tensor<\*?,T> (Required) * Must be at least 1-D. axes: const (Optional) * Axes to squeeze out. * Default to remove all single-dimensions. Returns ------- tensor<\*(rank(x)-K),T> * Tensor with same type as input ``x`` and rank ``rank(x)-K``. Attributes ---------- T: fp16, fp32, int8, int16, int32, uint8, uint16, bool """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), axes=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "T": ( types.fp16, types.fp32, types.int8, types.int16, types.int32, types.uint8, types.uint16, types.bool, ), } @register_op(opset_version=_IOS17_TARGET) class reverse(_reverse_iOS15): """ Reverse the order of the input tensor ``x`` along specified ``axes`` (dimensions). The major difference between this version and the iOS 15 :py:class:`~.iOS15.tensor_transformation.reverse` is that input ``x`` supports more data types: int8, uint8, int16, uint16. Parameters ---------- x: tensor<\*?, T> (Required) * Input tensor. axes: const (Optional) * Dimension(s) to reverse. Each axis must be in the range ``[-rank(x), rank(x))``. * Defaults to None (reverse on all dimensions). Returns ------- tensor<\*?, T> * Same type and shape as the input tensor. Attributes ---------- T: fp16, fp32, int8, int16, int32, uint8, uint16, bool """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), axes=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "T": ( types.fp16, types.fp32, types.int8, types.int16, types.int32, types.uint8, types.uint16, types.bool, ), } @register_op(opset_version=_IOS17_TARGET) class reverse_sequence(_reverse_sequence_iOS15): """ Reverse variable length slices for specified axes / dimensions of the input tensor. This op first slices input tensor along the ``batch_axis`` dimension, then partially reverses the elements along the ``seq_axis`` for the first ``lengths[i]`` elements. The major difference between this version and the iOS 15 :py:class:`~.iOS15.tensor_transformation.reverse_sequence` is that input supports more data types: - ``x`` additionally supports int8, uint8, int16, uint16 - ``lengths`` additionally supports int8, int16 Parameters ---------- x: tensor<\*?, T> (Required) * Input tensor. 
lengths: tensor (Required) * 1-dimensional tensor of length ``x.shape[batch_axis]`` specifying the length of the sequence to reverse. * Values must be in range ``[0, x.shape[seq_axis]]``. seq_axis: const (Optional) * The dimension to reverse. * Defaults to ``0``. batch_axis: const (Optional) * Dimension for slicing. * Defaults to ``0``. Returns ------- tensor<\*?, T> * Same type and shape as the input tensor. Attributes ---------- T: fp16, fp32, int8, int16, int32, uint8, uint16, bool U: int8, int16, int32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), lengths=TensorInputType(type_domain="U"), seq_axis=TensorInputType(const=True, optional=True, type_domain=types.int32), batch_axis=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "T": ( types.fp16, types.fp32, types.int8, types.int16, types.int32, types.uint8, types.uint16, types.bool, ), "U": (types.int8, types.int16, types.int32), } @register_op(opset_version=_IOS17_TARGET) class sliding_windows(_sliding_windows_iOS15): """ Return a tensor containing all windows of ``size``, separated by stride along the given ``axis``. The major difference between this version and the iOS 15 :py:class:`~.iOS15.tensor_transformation.sliding_windows` is that input ``x`` supports more data types: int8, uint8, int16, uint16. Parameters ---------- x: tensor<[\*d0, d_axis, *dn], T> * Input tensor. axis: const * Axis to perform the operation. size: const * Number of elements in the sliding window. stride: const Optional * Default to ``1``. * The stride of the input elements in the sliding window. Returns ------- tensor<[\*d0, d_axis - size // stride + 1, size, \*dn], T> * The output will be a tensor of rank ``N+1`` where ``N`` is the input tensor rank. Attributes ---------- T: fp16, fp32, int8, int16, int32, uint8, uint16, bool """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), axis=TensorInputType(const=True, type_domain=types.int32), size=TensorInputType(const=True, type_domain=types.int32), stride=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "T": ( types.fp16, types.fp32, types.int8, types.int16, types.int32, types.uint8, types.uint16, types.bool, ), } @register_op(opset_version=_IOS17_TARGET) class transpose(_transpose_iOS15): """ Permute tensor ``x`` dimensions according to ``perm``. The major difference between this version and the iOS 15 :py:class:`~.iOS15.tensor_transformation.transpose` is that input ``x`` supports more data types: int8, uint8, int16, uint16. Parameters ---------- x: tensor<\*?, T> (Required) * Must be at least 1-D. ``x`` may have a symbolic shape. perm: const<[rank(x)], i32> (Required) * Permutation order. -rank(x) <= perm[I] < rank(x) for all perm entries. Returns ------- tensor<\*?,T> * Tensor with same rank and type as ``x``. Attributes ---------- T: fp16, fp32, int8, int16, int32, uint8, uint16, bool """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), perm=TensorInputType(const=True, type_domain=types.int32), ) type_domains = { "T": ( types.fp16, types.fp32, types.int8, types.int16, types.int32, types.uint8, types.uint16, types.bool, ), } @register_op(opset_version=_IOS17_TARGET) class slice_by_index(_slice_by_index_iOS15): """ Method for numpy style indexing and slicing. 
With a tensor ``x``, this method achieves the following: ``result = x[begin[0]: end[0]: stride[0], begin[1]: end[1]: stride[1], ...]`` The differences between this version and the iOS 15 :py:class:`~.iOS15.tensor_transformation.slice_by_index` is that additional data types are supported for ``x``, ``begin``, ``end``, and ``stride``. See Parameters and Attributes sections for details. Parameters ---------- x: tensor<*?, T> (Required) * Input tensor begin: tensor<[rank(x)], U> (Required) * Starting index for the dimension of slicing. end: tensor<[rank(x)], U> (Required) * Ending index for the dimension of slicing. stride: tensor<[rank(x)], U> (Optional) * Default is all ``1``. * Stride for the dimension of slicing. begin_mask: tensor<[rank(x)], bool> (Optional) * Default to all ``False``. * If ``begin_mask[i]==True``, ignores ``begin[i]``, and set ``begin[i]`` to ``0``. end_mask: tensor<[rank(x)], bool> (Optional) * Default to all ``False``. * If ``end_mask[i]==True``, ignores ``end[i]``, and set ``end[i]`` to ``x.shape[i]``. squeeze_mask: tensor<[rank(x)], bool> (Optional) * Default to all ``False``. * If ``squeeze_mask[i]==True``, ignores ``end[i]``, and do the pure index at ``begin[i]``. Returns ------- tensor<\*?, T> - Scalar or tensor. Attributes ---------- T: bool, fp16, fp32, int8, int16, int32, uint8, uint16 U: int8, int16, int32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), begin=TensorInputType(type_domain="U"), end=TensorInputType(type_domain="U"), stride=TensorInputType(const=True, optional=True, type_domain="U"), begin_mask=TensorInputType(const=True, optional=True, type_domain=types.bool), end_mask=TensorInputType(const=True, optional=True, type_domain=types.bool), squeeze_mask=TensorInputType(const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": ( types.fp16, types.fp32, types.int8, types.int16, types.int32, types.uint8, types.uint16, types.bool, ), "U": (types.int8, types.int16, types.int32), } @register_op(opset_version=_IOS17_TARGET) class slice_by_size(_slice_by_size_iOS15): """ Slice input tensor starting from the given ``begin`` index and by the amount specified by the ``size`` input, for each dimension. The differences between this version and the iOS 15 :py:class:`~.iOS15.tensor_transformation.slice_by_size` is that additional data types are supported for ``x``, ``begin``, and ``size``. See Parameters and Attributes sections for details. Parameters ---------- x: tensor<*?, T> (Required) * Input tensor. begin: tensor<[rank(x)], U> Required * The begin index for slice. size: tensor<[rank(x)], U> Required * The size that is to be sliced. If ``size`` is ``-1``, all the remaining elements starting with "begin" are sliced. Returns ------- tensor<\*?, T> * Scalar or tensor. 
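    A rough NumPy rendering of the behavior (illustration only; a ``size`` of ``-1``
    keeps everything from ``begin`` to the end of that dimension)::

        import numpy as np

        x = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
        begin, size = [0, 1, 2], [-1, 2, 2]
        end = [b + s if s != -1 else d for b, s, d in zip(begin, size, x.shape)]
        y = x[tuple(slice(b, e) for b, e in zip(begin, end))]   # shape (2, 2, 2)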
Attributes ---------- T: bool, fp16, fp32, int8, int16, int32, uint8, uint16 U: int8, int16, int32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), begin=TensorInputType(type_domain="U"), size=TensorInputType(type_domain="U"), ) type_domains = { "T": ( types.fp16, types.fp32, types.int8, types.int16, types.int32, types.uint8, types.uint16, types.bool, ), "U": (types.int8, types.int16, types.int32), } ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1726511965.237547 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS18/0000755000000000000000000000000014672075535023162 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS18/__init__.py0000644000000000000000000000146514672066616025301 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil._deployment_compatibility import AvailableTarget as target from coremltools.converters.mil.mil.ops.registry import SSAOpRegistry # Ensure op registrations recognize the new opset. _IOS18_TARGET = target.iOS18 from .compression import ( constexpr_blockwise_shift_scale, constexpr_lut_to_dense, constexpr_lut_to_sparse, constexpr_sparse_blockwise_shift_scale, constexpr_sparse_to_dense, ) from .recurrent import gru from .states import read_state from .tensor_transformation import slice_update from .transformers import scaled_dot_product_attention ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS18/compression.py0000644000000000000000000010063514672066616026102 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import List, Optional import numpy as np from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import InputSpec, TensorInputType from coremltools.converters.mil.mil.operation import Operation from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs.iOS16.constexpr_ops import ( constexpr_cast as _constexpr_cast_iOS16, ) from coremltools.converters.mil.mil.ops.defs.iOS18 import _IOS18_TARGET from coremltools.converters.mil.mil.var import Var @register_op(opset_version=_IOS18_TARGET) class constexpr_blockwise_shift_scale(Operation): """ A compile-time operation that returns a constant output value upon dequantizing its constant inputs. It's similar to iOS 16 :py:class:`~.iOS16.constexpr_ops.constexpr_affine_dequantize`, but supports block-wise quantization for int4 and int8. Although all parameters of this op are constants, this op is not constant-folded to a single const op at the time of model serialization. The unquantized output will be decompressed later, based on the implementation detail (either at model load time or runtime). 
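    A concrete NumPy sketch of the block-wise dequantization described below
    (illustration only; the shapes and values are hypothetical and mirror the op's
    ``decompress`` helper)::

        import numpy as np

        data = np.array([[0, 1, 2, 3]], dtype=np.int8)     # shape (1, 4)
        scale = np.array([[0.5, 2.0]], dtype=np.float32)   # shape (1, 2) -> block_size = [1, 2]
        offset = np.array([[1, 0]], dtype=np.int8)
        # Repeat scale/offset per block so they match the data shape, then shift-scale.
        scale_full = np.repeat(scale, 2, axis=1)
        offset_full = np.repeat(offset, 2, axis=1)
        output = scale_full * (data.astype(np.float32) - offset_full)   # [[-0.5, 0.0, 4.0, 6.0]]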
Generic expression: output = scale * (data - offset) Algorithm: Assuming Rank 3 scenario: output_data[i, j, k] = scale[i0, j0, k0] * (data[i, j, k] - offset[i0, j0, k0]) where i0 = floor(i/block_size[0]), j0 = floor(j/block_size[1]), k0 = floor(k/block_size[2]) The block size is implied by block_size[m] = data.shape[m] / scale.shape[m] Constraints: - All tensors: scale, data, offset and output have same rank. - Inputs: scale and offset (if provided) have same shape. - Output shape is same as the shape of input argument: `data`. - Number of scales along each dimension should be a factor of corresponding dimension size of `data`. That is, block_size[i] should be an integer where block_size[i] = data.shape[i] / scale.shape[i] Parameters ---------- data: const tensor (Required) scale: const tensor (Required) offset: const tensor (Optional) * If provided, must have the same shape as the ``scale``. * If dtype is not fp16 or fp32, it must be the same as SrcT. Returns ------- const tensor Attributes ---------- SrcT: int4, uint4, int8, uint8, fp16, fp32 DstT: fp16, fp32 OffsetT: int4, uint4, int8, uint8, fp16, fp32 """ input_spec = InputSpec( data=TensorInputType(const=True, type_domain="SrcT"), scale=TensorInputType(const=True, type_domain="DstT"), offset=TensorInputType(const=True, optional=True, type_domain="OffsetT"), ) type_domains = { "SrcT": (types.int4, types.uint4, types.int8, types.uint8, types.fp16, types.fp32), "DstT": (types.fp16, types.fp32), "OffsetT": (types.int4, types.uint4, types.int8, types.uint8, types.fp16, types.fp32), } @staticmethod def _validate_shift_scale_inputs( data_shape: List[int], data_dtype: types, scale: Var, offset: Var ): data_rank = len(data_shape) if data_rank != scale.rank: raise ValueError( f"Parameter 'data' and 'scale' need to have the same rank, but got {data_rank} vs {scale.rank}." ) if data_rank < 1: raise ValueError("Parameter 'data' needs to have at least rank 1, but got scalar.") for rank_idx in range(data_rank): data_dim = data_shape[rank_idx] scale_dim = scale.shape[rank_idx] if data_dim % scale_dim != 0: raise ValueError( "Number of scales along each dimension should be a factor of " "corresponding dimension size of 'data'. However, at dim " f"{rank_idx}, the 'data' has {data_dim} while 'scale' has {scale_dim}." ) if offset is not None: if offset.shape != scale.shape: raise ValueError( "Invalid parameter 'offset'; the shape of 'offset' should match the shape of " f"'scale', but got ({offset.shape}) vs ({scale.shape})." ) if not types.is_float(offset.dtype) and offset.dtype != data_dtype: raise ValueError( "Invalid parameter 'offset'; the dtype of 'offset' should match the dtype of " f"'data', but got ({types.builtin_to_string(offset.dtype)}) vs " f"({types.builtin_to_string(data_dtype)})." 
) def _validate_inputs(self): self._validate_shift_scale_inputs(self.data.shape, self.data.dtype, self.scale, self.offset) def type_inference(self): self._validate_inputs() return types.tensor(self.scale.dtype, self.data.shape) def materialized_val_inference(self): data = self.data.val scale = self.scale.val if data is None and self.data.op.op_type.startswith("constexpr_"): data = self.data.op.materialized_val_inference() if scale is None and self.scale.op.op_type.startswith("constexpr_"): scale = self.scale.op.materialized_val_inference() return self.decompress( data, scale, None if self.offset is None else self.offset.val, ) @staticmethod def decompress( data: np.ndarray, scale: np.ndarray, offset: Optional[np.ndarray], ): # Adjust dtype to avoid overflow in the quantized dtype. data = data.astype(scale.dtype) # Interleaved repeat scale and offset to make it match the shape of data. block_sizes = [ data_shape // scale_shape for (data_shape, scale_shape) in zip(data.shape, scale.shape) ] for axis, block_size in enumerate(block_sizes): if block_size > 1: scale = np.repeat(scale, block_size, axis) if offset is not None: offset = np.repeat(offset, block_size, axis) if offset is not None: data = data - offset data = scale * data return data @register_op(opset_version=_IOS18_TARGET) class constexpr_lut_to_dense(Operation): """ A compile-time operation that returns a constant output value upon dequantizing its constant inputs. This operator is used to store constant weights in lookup tables format (aka palettized weights). It's similar to iOS 16 :py:class:`~.iOS16.constexpr_ops.constexpr_lut_to_dense`, but supports block-wise / vector palettization. LUT's rank is K + 2, where K is the rank of indices. Each dimension of LUT's first K dimensions should be divisible by each corresponding dimension of the decompressed tensor. e.g., when indices_shape = [2, 3, 4], lut_shape[:3] = [1, 1, 2], it means that there are two lookup tables over the last axis. And each of them have their own LUT values. See Case 1 below for details. VECTOR_SIZE is added to support vector palettization. - When VECTOR_SIZE is 1, it is scalar palettization. - When VECTOR_SIZE is larger than 1, it retrieves a vector instead of a single value from the lookup table, and fill the result continuously. The vector_axis is used to define which axis the vectored elements in the lookup table be filled across the output tensor. vector_axis is only optional if VECTOR_SIZE is 1. As a result: output_shape[i] = indices_shape[i] , i != vector_axis output_shape[i] = indices_shape[i] * VECTOR_SIZE, i == vector_axis See Case 2 below for details. Examples: Case 1: per-group scalar palettization: e.g.: - indices = tensor>([2, 3, 3, 0, 1, 0, 3, 0, 2, 1, 0, 3]) - lut = tensor([1.0, 5.0, 9.0, 13.0, 2.0, 10.0, 18.0, 26.0]) It is effectively a 2-group 2-bit scalar palettization. The output shape would be [6, 2], which is the same as the indices shape. The output tensor values are: [[lut0[2]->9.0, lut0[3]->13.0], [lut0[3]->13.0, lut0[0]->1.0], [lut0[1]->5.0, lut0[0]->1.0], [lut1[3]->26.0, lut1[0]->2.0], [lut1[2]->18.0, lut1[1]->10.0], [lut1[0]->2.0, lut1[3]->26.0]] where lut0 is the first lookup table (lut[0, :, :, :]) and lut1 is the second lookup table. Case 2: per-tensor vector palettization: e.g.: - indices = tensor>. The indices values are: [ [ [0, 0], [1, 0] ], [ [1, 1], [0, 0] ] ] - lut = tensor([a0, a1, a2, b0, b1, b2]) which means the two centroids are [a1, a2, a3] and [b1, b2, b3]. 
Case 2.1: vector_axis = 1 It is effectively a 1-bit vector palettization. The output shape would be [2, 2*3, 2], where each index in the indices would be effectively replaced with the 3 elements in the vector over the 1st dimension to construct the output tensor. The output values are: [ [ [a0, a0], [a1, a1], [a2, a2], [b0, a0], [b1, a1], [b2, a2], ], [ [b0, b0], [b1, b1], [b2, b2], [a0, a0], [a1, a1], [a2, a2], ] ] Case 2.2: vector_axis = 2 The output shape would be [2, 2, 2*3], where each index in the indices would be effectively replaced with the 3 elements in the vector over the last dimension to construct the output tensor. The output values are: [ [ [a0, a1, a2, a0, a1, a2], [b0, b1, b2, a0, a1, a2], ], [ [b0, b1, b2, b0, b1, b2], [a0, a1, a2, a0, a1, a2], ] ] Parameters ---------- indices: const tensor (Required) lut: const tensor (Required) * NUM_PALETTES needs to be 2^nbits where nbits is indicated by IndicesT. vector_axis: const tensor (Optional) * vector_axis can be optional if VECTOR_SIZE is 1. Returns ------- const tensor * output_shape = indices_shape * [1..1, VECTOR_SIZE, 1..1] (all 1 but VECTOR_SIZE at vector_axis dimension). Attributes ---------- IndicesT: uint1, uint2, uint3, uint4, uint6, uint8 T: uint8, int8, fp16, fp32 """ input_spec = InputSpec( indices=TensorInputType(const=True, type_domain="IndicesT"), lut=TensorInputType(const=True, type_domain="T"), vector_axis=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "IndicesT": (types.uint1, types.uint2, types.uint3, types.uint4, types.uint6, types.uint8), "T": (types.int8, types.uint8, types.fp16, types.fp32), } @staticmethod def _validate_lut_inputs( indices_shape: List[int], indices_dtype: types, lut_shape: List[int], vector_axis: Var ): indices_rank = len(indices_shape) lut_rank = len(lut_shape) if indices_rank < 1: raise ValueError("Parameter 'indices' needs to have at least rank 1, but got scalar.") if lut_rank != indices_rank + 2: raise ValueError( f"Parameter 'lut' need to have 2 more dim than 'indices', but got " f"{lut_rank}-rank 'lut' and {indices_rank}-rank 'indices'." ) for rank_idx in range(indices_rank): indices_dim = indices_shape[rank_idx] lut_dim = lut_shape[rank_idx] if indices_dim % lut_dim != 0: raise ValueError( f"Each dimension of 'indices' should be divisible by each corresponding " f"dimension of the 'lut'. However, at dim {rank_idx}, the 'indices' has " f"{indices_dim} while 'lut' has {lut_dim}." ) nbits = indices_dtype.get_bitwidth() if lut_shape[-2] != 2**nbits: raise ValueError( "Invalid parameter 'lut'; the second last dim should have size " f"2^nbits, where nbits is {nbits}, but got {lut_shape[-2]}." ) if vector_axis is not None: if vector_axis.rank > 0: raise ValueError( "Invalid parameter 'vector_axis'; It should be a scalar, but got " "a tensor." ) if not -indices_rank <= vector_axis.val < indices_rank: raise ValueError( f"Invalid parameter 'vector_axis'; The valid range is between " f"{-indices_rank} and {indices_rank}, but got {vector_axis.val}." ) else: if lut_shape[-1] > 1: raise ValueError( "When lut's last dim (VECTOR_SIZE) > 1, the parameter " "'vector_axis' need to be provided." 
) def _validate_inputs(self): self._validate_lut_inputs( self.indices.shape, self.indices.dtype, self.lut.shape, self.vector_axis ) def type_inference(self): self._validate_inputs() output_shape = self.indices.shape vector_size = self.lut.shape[-1] if vector_size > 1: output_shape = list(output_shape) output_shape[self.vector_axis.val] *= vector_size output_shape = tuple(output_shape) return types.tensor(self.lut.dtype, output_shape) def materialized_val_inference(self): return self.decompress( self.indices.val, self.lut.val, None if self.vector_axis is None else self.vector_axis.val, ) @staticmethod def decompress( indices: np.ndarray, lut: np.ndarray, vector_axis: Optional[np.generic], ): num_palettes = lut.shape[-2] vector_size = lut.shape[-1] original_lut_shape = lut.shape block_size = [indices.shape[idx] // lut.shape[idx] for idx in range(len(indices.shape))] if vector_axis is not None and vector_axis < 0: vector_axis += len(indices.shape) lut = lut.reshape(-1, num_palettes, vector_size) decompressed_res = indices.astype(lut.dtype) if vector_size > 1: # Tile the vector_axis to make room for the vector retrieved from lut. decompressed_res = np.repeat(decompressed_res, vector_size, axis=vector_axis) else: lut = np.squeeze(lut, axis=-1) # TODO (rdar://115061946): Vectorize the computation. for table_idx in range(lut.shape[0]): # Get the corresponding idx in indices for the current table. # For example, if table coord is (1, 3), the corresponding indices should be # [1*block_size[0] : 2*block_size[0], 3*block_size[1], 4*block_size[1]]. original_table_coord = np.unravel_index(table_idx, original_lut_shape[:-2]) slice_idxes = tuple( slice(coord * block_size[idx], (coord + 1) * block_size[idx]) for idx, coord in enumerate(original_table_coord) ) unquantized_values = lut[table_idx][indices[slice_idxes]] if vector_size > 1: if vector_axis is None: raise ValueError("vector_axis must be provided for vector lut.") # Merge the vector dim into the decompressed values (flatten the vector). unquantized_values = np.swapaxes(unquantized_values, vector_axis, -2) unquantized_values = unquantized_values.reshape( unquantized_values.shape[:-2] + (-1,) ) unquantized_values = np.swapaxes(unquantized_values, vector_axis, -1) # Resize the slice to make room for the merged vector dequantized values. slice_idxes = list(slice_idxes) resized_slice = slice( slice_idxes[vector_axis].start * vector_size, slice_idxes[vector_axis].stop * vector_size, slice_idxes[vector_axis].step, ) slice_idxes[vector_axis] = resized_slice decompressed_res[tuple(slice_idxes)] = unquantized_values return decompressed_res @register_op(opset_version=_IOS18_TARGET) class constexpr_sparse_to_dense(Operation): """ A compile-time operation that returns a constant output value upon de-sparsification of its constant inputs. The differences from iOS16 :py:class:`~.iOS16.constexpr_ops.constexpr_sparse_to_dense` are: - In iOS16, the mask parameter is 'const tensor', which is a flat tensor with length M, so it requires a parameter `shape` to determine the output shape. In iOS18, we use uint1 (0 or 1) to represent bitmask, which packs the bitmask data and costs the same memory as the uint8 mask in iOS16, but can explicitly tell the tensor shape. We use uint1 instead of bool because bool in MIL uses uint8 as the storage dtype, which costs 8x memory compared to uint1. - Support more dtypes (int4 and uint4) for the input/output data. 
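    A rough NumPy sketch of the de-sparsification (illustration only; a plain uint8 array
    stands in for the packed uint1 bitmask)::

        import numpy as np

        mask = np.array([[1, 0, 1], [0, 1, 0]], dtype=np.uint8)
        nonzero_data = np.array([1.5, 2.5, 3.5], dtype=np.float16)
        dense = np.zeros(mask.shape, dtype=nonzero_data.dtype)
        dense[mask != 0] = nonzero_data
        # [[1.5, 0.0, 2.5],
        #  [0.0, 3.5, 0.0]]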
Parameters ---------- nonzero_data: const tensor (Required) mask: const tensor (Required) Returns ------- const tensor Attributes ---------- T: int4, uint4, int8, uint8, fp16, fp32 """ input_spec = InputSpec( nonzero_data=TensorInputType(const=True, type_domain="T"), mask=TensorInputType(const=True, type_domain=types.uint1), ) type_domains = {"T": (types.int4, types.uint4, types.int8, types.uint8, types.fp16, types.fp32)} @staticmethod def decompress(nonzero_data: np.ndarray, mask: np.ndarray) -> np.ndarray: decompressed_val = np.zeros_like(mask, dtype=nonzero_data.dtype) decompressed_val[mask != 0] = nonzero_data return decompressed_val @staticmethod def _validate_sparse_inputs(nonzero_data: Var, mask: Var): if nonzero_data.rank != 1: raise ValueError( f"Parameter nonzero_data needs to have rank 1, but got {nonzero_data.rank}" ) if mask.val is not None and np.count_nonzero(mask.val) != nonzero_data.shape[0]: raise AssertionError( "Number of 1s in mask not match number of elements in parameter nonzero_data" ) def type_inference(self): self._validate_sparse_inputs(self.nonzero_data, self.mask) return types.tensor(self.nonzero_data.dtype, self.mask.shape) def materialized_val_inference(self): nonzero_data = self.nonzero_data.val mask = self.mask.val if nonzero_data is None and self.nonzero_data.op.op_type.startswith("constexpr_"): nonzero_data = self.nonzero_data.op.materialized_val_inference() if isinstance(nonzero_data, tuple) and len(nonzero_data) > 0: # For sparse constexpr ops they have two outputs, one for mask and one for val. nonzero_data = nonzero_data[1] if mask is None and self.mask.op.op_type.startswith("constexpr_"): mask = self.mask.op.materialized_val_inference() if isinstance(mask, tuple) and len(mask) > 0: mask = mask[0] return self.decompress(nonzero_data, mask) @register_op(opset_version=_IOS18_TARGET) class constexpr_lut_to_sparse(Operation): """ A compile-time operation that returns a constant output value upon de-palettizing its constant inputs. This op is a sparse-to-sparse op to support `constexpr_lut_to_dense` on sparse data, where the de-palettization is only applied on the nonzero data. Usually it would be followed by a `constexpr_sparse_to_dense` op to get the dense tensor. So, parameters of this op are similar to `constexpr_sparse_to_dense` and `constexpr_lut_to_dense`. For detailed descriptions about its parameters, please refer to iOS 18 :py:class:`~.iOS18.constexpr_ops.constexpr_sparse_to_dense` and :py:class:`~.iOS18.constexpr_ops.constexpr_lut_to_dense`. This op has two outputs: 1. the mask of the de-palettized nonzero_data. 2. the de-palettized nonzero_data. Parameters ---------- indices_mask: const tensor (Required) indices_nonzero_data: const tensor (Required) lut: const tensor (Required) * NUM_PALETTES needs to be 2^nbits where nbits is indicated by IndicesT. vector_axis: const tensor (Optional) * vector_axis can be optional if VECTOR_SIZE is 1. Returns ------- const tensor * the mask of the de-palettized nonzero_data. For scalar palettization, it's the same as the input indices_mask. For vector palettization, it's expanded of the indices_mask over axis=vector_axis. const tensor * the de-palettized nonzero_data. For scalar palettization, VD=D (same size as indices_nonzero_data). For vector palettization, VD=VECTOR_SIZE * D (each entry is expanded by a vector). 
Attributes ---------- IndicesT: uint1, uint2, uint3, uint4, uint6, uint8 T: uint8, int8, fp16, fp32 Examples ---------- Assume we have the following inputs: indices_mask = [[1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 1], [0, 1, 1, 0, 1, 0], [0, 0, 0, 1, 0, 0]] indices_nonzero_data = [0, 1, 1, 0, 1, 1, 0, 0, 1] Notice that: - The uint1 in `indices_mask` and `indices_nonzero_data` has different meanings. For `indices_mask` the dtype is always uint1 to represent bit mask. For `indices_nonzero_data` the uint1 means the LUT only has two entries, so only 1 bit is needed to represent indices. - The 0 in `indices_mask` and `indices_nonzero_data` has different meanings. For `indices_mask` the 0 means empty entry in sparse representation. For `indices_nonzero_data` the 0 means index 0 in LUT. With the given indices_mask and indices_nonzero_data, an example for "Scalar Palettization": lut = [2.0, 3.0] (indices-to-values mapping is {0: 2.0, 1: 3.0}) The sparse indices in the dense layout would look like: 0 1 . . . . 1 0 . . . 1 . 1 0 . 0 . . . . 1 . . (here "." means spare elements in sparse representation) When we apply per-tensor de-palettization with this sparse indices, the `indices_nonzero_data` is used to read the values from the LUT as in the dense layout. The output sparse tensor in the dense layout would be: 2.0 3.0 . . . . 3.0 2.0 . . . 3.0 . 3.0 2.0 . 2.0 . . . . 3.0 . . The first output would be the same as the indices_mask. The second output would be [2.0, 3.0, 3.0, 2.0, 3.0, 3.0, 2.0, 2.0, 3.0] With the given indices_mask and indices_nonzero_data, an example for "Vector Palettization": lut = [ [1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1], [0, 1, 1, 0, 1, 0], [0, 1, 1, 0, 1, 0], [0, 0, 0, 1, 0, 0], [0, 0, 0, 1, 0, 0], ] The second output in the dense layout would be: 2.0 3.0 . . . . 2.0 3.0 . . . . 3.0 2.0 . . . 3.0 3.0 2.0 . . . 3.0 . 3.0 2.0 . 2.0 . . 3.0 2.0 2.0 . . . . 3.0 . . . . . 3.0 . . It is created by fetching the vector entry from the lut for every bit 1 in the data_mask, and filling the vector over axis=0. Those two outputs of this op could be passed as inputs to a following `sparse_to_dense` op in order to recover the dense weights. 
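    A rough NumPy sketch of that last step for the scalar-palettization example above
    (illustration only)::

        import numpy as np

        output_mask = np.array([[1, 1, 0, 0, 0, 0],
                                [1, 1, 0, 0, 0, 1],
                                [0, 1, 1, 0, 1, 0],
                                [0, 0, 0, 1, 0, 0]], dtype=np.uint8)
        output_nonzero_data = np.array(
            [2.0, 3.0, 3.0, 2.0, 3.0, 3.0, 2.0, 2.0, 3.0], dtype=np.float16
        )
        dense = np.zeros(output_mask.shape, dtype=output_nonzero_data.dtype)
        dense[output_mask != 0] = output_nonzero_data   # what a following sparse_to_dense does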
""" input_spec = InputSpec( indices_mask=TensorInputType(const=True, type_domain=types.uint1), indices_nonzero_data=TensorInputType(const=True, type_domain="IndicesT"), lut=TensorInputType(const=True, type_domain="T"), vector_axis=TensorInputType(const=True, optional=True, type_domain=types.int32), ) type_domains = { "IndicesT": (types.uint1, types.uint2, types.uint3, types.uint4, types.uint6, types.uint8), "T": (types.int8, types.uint8, types.fp16, types.fp32), } def _validate_inputs(self): constexpr_sparse_to_dense._validate_sparse_inputs( self.indices_nonzero_data, self.indices_mask ) constexpr_lut_to_dense._validate_lut_inputs( self.indices_mask.shape, self.indices_nonzero_data.dtype, self.lut.shape, self.vector_axis, ) def type_inference(self): self._validate_inputs() output_mask_shape = self.indices_mask.shape output_nonzero_data_shape = self.indices_nonzero_data.shape vector_size = self.lut.shape[-1] if vector_size > 1: output_mask_shape = list(output_mask_shape) output_mask_shape[self.vector_axis.val] *= vector_size output_mask_shape = tuple(output_mask_shape) output_nonzero_data_shape = tuple( [dim * vector_size for dim in output_nonzero_data_shape] ) output_mask_type = types.tensor(self.indices_mask.dtype, output_mask_shape) output_nonzero_data_type = types.tensor(self.lut.dtype, output_nonzero_data_shape) return output_mask_type, output_nonzero_data_type @staticmethod def decompress( indices_mask: np.ndarray, indices_nonzero_data: np.ndarray, lut: np.ndarray, vector_axis: Optional[np.generic], ): indices = constexpr_sparse_to_dense.decompress(indices_nonzero_data, indices_mask) output_nonzero_data = constexpr_lut_to_dense.decompress(indices, lut, vector_axis) output_mask = indices_mask if vector_axis is not None: vector_size = lut.shape[-1] output_mask = np.repeat(output_mask, vector_size, axis=vector_axis) output_nonzero_data = output_nonzero_data[output_mask != 0].flatten() return output_mask, output_nonzero_data def materialized_val_inference(self): vector_axis = self.vector_axis.val if self.vector_axis is not None else None return self.decompress( self.indices_mask.val, self.indices_nonzero_data.val, self.lut.val, vector_axis ) @register_op(opset_version=_IOS18_TARGET) class constexpr_sparse_blockwise_shift_scale(Operation): """ A compile-time operation that returns a constant output value upon de-quantize (shift-scale) its constant inputs. This op is a sparse-to-sparse op to support `constexpr_blockwise_shift_scale` on sparse data, where the de-quantization is only applied on the nonzero data. Usually it would be followed by a `constexpr_sparse_to_dense` op to get the dense tensor. So, parameters of this op are similar to `constexpr_sparse_to_dense` and `constexpr_blockwise_shift_scale`. For detailed descriptions about its parameters, please refer to iOS 18 :py:class:`~.iOS18.constexpr_ops.constexpr_sparse_to_dense` and :py:class:`~.iOS18.constexpr_ops.constexpr_blockwise_shift_scale`. This op has two outputs: 1. the mask of the de-quantized nonzero_data. 2. the de-quantized nonzero_data. Parameters ------- data_mask: const tensor (Required) nonzero_data: const tensor (Required) scale: const tensor (Required) offset: const tensor (Optional) * If provided, must have the same shape as the ``scale``. Returns ------- const tensor * the mask of the shift-scaled nonzero_data. const tensor * the shift-scaled nonzero_data. 
Attributes ------- SrcT: int4, uint4, int8, uint8, fp16, fp32 DstT: fp16, fp32 OffsetT: int4, uint4, int8, uint8, fp16, fp32 Examples ------- For example: data_mask = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [1, 1, 0, 0]] nonzero_data = [10, 11, 3, 4, 5, 6, 7, 8, 9] The sparse tensor in the dense layout would look like: 10 11 . . 3 4 5 . . . 6 7 8 9 . . When we apply per-channel de-quantization on this sparse tensor, where: scale = [[0.1, 0.2, 0.3, 0.4]] offset = [[1, 2, 3, 4]] The input `nonzero_data` would be dequantized per-column as in the dense layout, and the output sparse tensor in the dense layout would be: (10-1)*0.1 (11-2)*0.2 . . (10-1)*0.1 (11-2)*0.2 . . (3-1)*0.1 (4-2)*0.2 (5-3)*0.3 . . . (6-3)*0.3 (7-4)*0.4 (8-1)*0.1 (9-2)*0.2 . . The first output would be the same as the `data_mask`, The second output would be [0.9, 1.8, 0.2, 0.4, 0.6, 0.9, 1.2, 0.7, 1.4]. The two outputs could be passed as inputs to the following `sparse_to_dense` op in order to get the dense weights. """ input_spec = InputSpec( data_mask=TensorInputType(const=True, type_domain=types.uint1), nonzero_data=TensorInputType(const=True, type_domain="SrcT"), scale=TensorInputType(const=True, type_domain="DstT"), offset=TensorInputType(const=True, optional=True, type_domain="OffsetT"), ) type_domains = { "SrcT": (types.int4, types.uint4, types.int8, types.uint8, types.fp16, types.fp32), "DstT": (types.fp16, types.fp32), "OffsetT": (types.int4, types.uint4, types.int8, types.uint8, types.fp16, types.fp32), } def _validate_inputs(self): constexpr_sparse_to_dense._validate_sparse_inputs(self.nonzero_data, self.data_mask) constexpr_blockwise_shift_scale._validate_shift_scale_inputs( self.data_mask.shape, self.nonzero_data.dtype, self.scale, self.offset ) def type_inference(self): self._validate_inputs() output_mask_shape = self.data_mask.shape output_nonzero_data_shape = self.nonzero_data.shape output_mask_type = types.tensor(self.data_mask.dtype, output_mask_shape) output_nonzero_data_type = types.tensor(self.scale.dtype, output_nonzero_data_shape) return output_mask_type, output_nonzero_data_type @staticmethod def decompress( data_mask: np.ndarray, nonzero_data: np.ndarray, scale: np.ndarray, offset: Optional[np.ndarray], ): data = constexpr_sparse_to_dense.decompress(nonzero_data, data_mask) dequantized_data = constexpr_blockwise_shift_scale.decompress(data, scale, offset) output_nonzero_data = dequantized_data[data_mask != 0].flatten() return data_mask, output_nonzero_data def materialized_val_inference(self): offset = self.offset.val if self.offset is not None else None return self.decompress(self.data_mask.val, self.nonzero_data.val, self.scale.val, offset) @register_op(opset_version=_IOS18_TARGET) class constexpr_cast(_constexpr_cast_iOS16): """ A compile-time operation that returns a constant output value upon casting its constant input. The only difference between this version and the iOS 16 :py:class:`~.iOS16.constexpr_ops.constexpr_cast` is the parameters are treated as inputs, instead of attributes in the MIL backend framework. """ input_spec = InputSpec( source_val=TensorInputType(const=True, type_domain=types.fp16), output_dtype=TensorInputType(const=True, type_domain=types.str), ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS18/recurrent.py0000644000000000000000000000343114672066616025546 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import InputSpec, TensorInputType from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs.iOS17.recurrent import gru as _gru_iOS17 from coremltools.converters.mil.mil.ops.defs.iOS18 import _IOS18_TARGET @register_op(opset_version=_IOS18_TARGET) class gru(_gru_iOS17): """ Gated Recurrent Unit (GRU) The only difference between this version and the iOS 17 :py:class:`~.iOS17.recurrent.gru` is the reset_after parameter. This parameter is optional and defaults to False. When True, the reset gate is applied before the elementwise matrix multiplication. """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), initial_h=TensorInputType(type_domain="T"), weight_ih=TensorInputType(const=True, type_domain="T"), weight_hh=TensorInputType(const=True, type_domain="T"), bias=TensorInputType(const=True, optional=True, type_domain="T"), direction=TensorInputType(const=True, optional=True, type_domain=types.str), output_sequence=TensorInputType(const=True, optional=True, type_domain=types.bool), recurrent_activation=TensorInputType(const=True, optional=True, type_domain=types.str), activation=TensorInputType(const=True, optional=True, type_domain=types.str), reset_after=TensorInputType(const=True, optional=True, type_domain=types.bool), input_bias=TensorInputType(const=True, optional=True, type_domain="T"), ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS18/states.py0000644000000000000000000000251514672066616025042 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import InputSpec, StateInputType from coremltools.converters.mil.mil.operation import Operation from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs.iOS18 import _IOS18_TARGET @register_op(opset_version=_IOS18_TARGET) class read_state(Operation): """ Read a state, copy its content into a new variable, and return the variable. The type of the output variable depends on the type that is wrapped inside the state, which could be ``types.tensor``. Parameters ---------- input: state (Required) Returns ------- ST Attributes ---------- ST: tensor """ input_spec = InputSpec( input=StateInputType(), ) def type_inference(self): sym_type = self.input.sym_type.wrapped_type() if not types.is_tensor(sym_type): raise ValueError( f"State only supports wrapped type of types.tensor. Got {sym_type.__type_info__()}." ) return sym_type ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS18/tensor_transformation.py0000644000000000000000000001256214672066616030202 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.mil import Operation, types from coremltools.converters.mil.mil.input_type import DefaultInputs, InputSpec, TensorInputType from coremltools.converters.mil.mil.operation import Operation from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs._utils import ( get_param_val, solve_slice_by_index_shape, solve_slice_by_index_slice, ) from coremltools.converters.mil.mil.ops.defs.iOS18 import _IOS18_TARGET from coremltools.converters.mil.mil.types.symbolic import is_compatible_symbolic_vector @register_op(opset_version=_IOS18_TARGET) class slice_update(Operation): """ Update a custom slice of a source tensor with another tensor of the same shape, as dictated by the slice. For example, if you have a tensor ``x``, this method produces the following:: x[begin[0]: end[0]: stride[0], begin[1]: end[1]: stride[1], ...] = value The arguments defining the slice (``begin``, ``end``, ``stride``, ``masks``, and so on) should be treated the same way as iOS15 :py:class:`~.iOS15.tensor_transformation.slice_by_index`. Parameters ---------- x: tensor<*?, T> (Required) * Input tensor. update: tensor<\*K, T> (Required) * Value tensor to be inserted. * The shape of the update tensor must match the slicing result of the input data. * rank-0 update is not supported. begin: tensor<[rank], U> (Required) * Starting index for the dimension of slicing. end: tensor<[rank(x)], U> (Required) * Ending index for the dimension of slicing. stride: tensor<[rank(x)], U> (Optional) * Default as all ``1``. * Stride for the dimension of slicing. begin_mask: tensor<[rank(x)], bool> (Optional) * Default to all ``False``. * If ``begin_mask[i]==True``, neglect ``begin[i]``, and set ``begin[i]`` to ``0``. end_mask: tensor<[rank(x)], bool> (Optional) * Default to all ``False``. * If ``end_mask[i]==True``, neglect ``end[i]``, and set ``end[i]`` to ``x.shape[i]``. squeeze_mask: tensor<[rank(x)], bool> (Optional) * Default to all ``False``. * If ``squeeze_mask[i]==True``, neglect ``end[i]``, and do the pure index at ``begin[i]``. Returns ------- tensor<\*?, T> - Scalar or tensor. 
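    A rough NumPy rendering of the update semantics (illustration only; the values and
    shapes are hypothetical)::

        import numpy as np

        x = np.zeros((4, 4), dtype=np.float32)
        update = np.ones((2, 2), dtype=np.float32)
        begin, end, stride = [0, 1], [4, 4], [2, 2]
        x[begin[0]:end[0]:stride[0], begin[1]:end[1]:stride[1]] = update
        # rows 0 and 2, columns 1 and 3 are now 1.0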
Attributes ---------- T: fp16, fp32, int8, int16, int32, uint8, uint16, bool U: int8, int16, int32 """ input_spec = InputSpec( x=TensorInputType(type_domain="T"), update=TensorInputType(type_domain="T"), begin=TensorInputType(type_domain="U"), end=TensorInputType(type_domain="U"), stride=TensorInputType(const=True, optional=True, type_domain="U"), begin_mask=TensorInputType(const=True, optional=True, type_domain=types.bool), end_mask=TensorInputType(const=True, optional=True, type_domain=types.bool), squeeze_mask=TensorInputType(const=True, optional=True, type_domain=types.bool), ) type_domains = { "T": ( types.fp16, types.fp32, types.int8, types.int16, types.int32, types.uint8, types.uint16, types.bool, ), "U": (types.int8, types.int16, types.int32), } def default_inputs(self): return DefaultInputs( stride=None, begin_mask=None, end_mask=None, squeeze_mask=None, ) def type_inference(self): # solve shape ret_shape = solve_slice_by_index_shape( self.x.shape, self.begin.val, self.end.val, get_param_val(self.stride), get_param_val(self.begin_mask), get_param_val(self.end_mask), get_param_val(self.squeeze_mask), ) if not is_compatible_symbolic_vector(ret_shape, self.update.shape): raise ValueError( "The update tensor should have shape {}. Got {}".format( ret_shape, self.update.shape ) ) if self.update.rank == 0: # rdar://128221986 ([Feature][Slice_update] The backends is not supporting scalar update for the slice_update op) raise ValueError(f"rank-0 'update' is not supported in 'slice_update' op {self.name}.") return self.x.sym_type def value_inference(self): if ( self.x.sym_val is None or self.update.sym_val is None or self.begin.val is None or self.end.val is None ): return None # solve the data slices slices = solve_slice_by_index_slice( self.x.shape, self.begin.val, self.end.val, get_param_val(self.stride), get_param_val(self.begin_mask), get_param_val(self.end_mask), get_param_val(self.squeeze_mask), ) # copy the data and do the inplace update copy_x_val = np.copy(self.x.sym_val) copy_x_val[slices] = np.reshape(self.update.sym_val, copy_x_val[slices].shape) return copy_x_val ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/defs/iOS18/transformers.py0000644000000000000000000001476714672066616026300 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.input_type import InputSpec, TensorInputType from coremltools.converters.mil.mil.operation import Operation from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op from coremltools.converters.mil.mil.ops.defs._utils import broadcast_shapes from coremltools.converters.mil.mil.ops.defs.iOS18 import _IOS18_TARGET from coremltools.converters.mil.mil.types.symbolic import any_symbolic, is_symbolic @register_op(opset_version=_IOS18_TARGET) class scaled_dot_product_attention(Operation): """ Source: `PyTorch scaled dot product attention `_. Computes the scaled dot product attention on query, key, and value tensors, using an optional attention mask if passed. 
In PyTorch, this is equivalent to:: attn_mask = attn_mask.masked_fill(not attn_mask, -float('inf')) if attn_mask.dtype==torch.bool else attn_mask attn_weight = torch.softmax((Q @ K.transpose(-2, -1) / math.sqrt(Q.size(-1))) + attn_mask, dim=-1) return attn_weight @ V Shape key: - ``B`` = Batch size - ``S`` = Source sequence length - ``L`` = Target sequence length - ``E`` = Query/Key embedding dimension - ``EV`` = Value embedding dimension Numerical values can differ due to floating point fusion/accumulation between backends. Note: We currently do not support the ``dropout_p`` and ``is_causal``. Mask can either be bool or float matching query, key, or value. For bool, it indicates whether the element should take part in the attention. Floats are added to the attention score. Mask shape must be broadcastable to ``[B, \*?, L, S]``. Parameters ---------- query: tensor<[B, \*?, L, E], T> (Required) key: tensor<[B, \*?, S, E], T> (Required) value: tensor<[B, \*?, S, EV], T> (Required) attn_mask: tensor<[\*?, S], M> (Optional) Returns ------- tensor<[B, \*?, L, EV], T> Attributes ---------- T: fp16, fp32 M: bool, fp16, fp32 """ input_spec = InputSpec( query=TensorInputType(type_domain="T"), key=TensorInputType(type_domain="T"), value=TensorInputType(type_domain="T"), attn_mask=TensorInputType(optional=True, type_domain="M"), ) type_domains = { "T": (types.fp16, types.fp32), "M": (types.bool, types.fp16, types.fp32), } def _validate_inputs(self): query_rank = self.query.rank key_rank = self.key.rank value_rank = self.value.rank if query_rank != key_rank or query_rank != value_rank: raise ValueError( f"query, key, value must have a same rank, got\n" f"* query rank = {query_rank}\n" f"* key rank = {key_rank}\n" f"* value rank = {value_rank}" ) if query_rank < 3: raise ValueError( f"query, key, value must have at lease rank 3 " f"for batch, sequence length, embedding, got rank {query_rank}" ) query_shape = self.query.shape key_shape = self.key.shape value_shape = self.value.shape B_query = query_shape[:-2] E_query = query_shape[-1] B_key = key_shape[:-2] S_key = key_shape[-2] E_key = key_shape[-1] B_value = value_shape[:-2] S_value = value_shape[-2] batch_dims = [B_query, B_key, B_value] batch_dims = [batch_dim for batch_dim in batch_dims if not any_symbolic(batch_dims)] if len(set(batch_dims)) > 1: raise ValueError( "query, key, value must have a same batch dimension, got\n" f"* query batch = {B_query}\n" f"* key batch = {B_key}\n" f"* value batch = {B_value}" ) if not is_symbolic(E_query) and not is_symbolic(E_key) and E_query != E_key: raise ValueError( "query and key must have a same embedding dimension, got\n" f"* query embedding = {E_query}\n" f"* key embedding = {E_key}" ) if not is_symbolic(S_key) and not is_symbolic(S_value) and S_key != S_value: raise ValueError( "key and value must have a same sequence length, got\n" f"* key sequence = {S_key}\n" f"* value sequence = {S_value}" ) if self.attn_mask is not None: mask_shape = self.attn_mask.shape S_mask = mask_shape[-1] if not is_symbolic(S_mask) and not is_symbolic(S_key) and S_mask != S_key: raise ValueError( "key and mask must have a same sequence length, got\n" f"* key sequence = {S_key}\n" f"* mask sequence = {S_mask}" ) # If shapes are inconsistent, then `broadcast_shapes` would raise exception broadcast_shapes(query_shape[:-1], mask_shape[:-1]) def type_inference(self): self._validate_inputs() shape = list(self.query.shape[:-1]) + [self.value.shape[-1]] return types.tensor(self.query.dtype, shape) def value_inference(self): query = 
self.query.val key = self.key.val value = self.value.val if query is None or key is None or value is None: return None float_mask = None if self.attn_mask is not None and self.attn_mask.val is not None: mask = self.attn_mask.val if mask.dtype == bool: float_mask = np.zeros(mask.shape) float_mask[np.where(np.logical_not(mask))] = -np.inf else: float_mask = mask similarity = np.matmul(query, key.swapaxes(-2, -1)) / np.sqrt(query.shape[-1]) if float_mask is not None: similarity += float_mask attention_weight = self.numpy_softmax_last_dim(similarity) attention = np.matmul(attention_weight, value) return attention @staticmethod def numpy_softmax_last_dim(x: np.ndarray) -> np.ndarray: exps = np.exp(x - np.max(x, axis=-1)[..., None]) softmax = exps / np.sum(exps, axis=-1)[..., None] return softmax ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/helper.py0000644000000000000000000000305514672066616023232 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import Dict, Type from coremltools.converters.mil._deployment_compatibility import AvailableTarget from coremltools.converters.mil.mil.operation import Operation def _get_version_of_op( op_variants: Dict[AvailableTarget, Type[Operation]], opset_version: AvailableTarget ) -> Type[Operation]: """ A utility function that retrieves an op cls given a dictionary of op variants and target version """ assert isinstance(op_variants, dict) opset_versions = list(op_variants.keys()) opset_versions.sort() if opset_version is None: op_cls = op_variants[opset_versions[0]] elif opset_version > opset_versions[-1] and opset_version > AvailableTarget.iOS17: # TODO(rdar://111114658): Remove when no longer required. # Inherit ops from the latest opset by default. op_cls = op_variants[opset_versions[-1]] else: if opset_version not in op_variants: op_type = list(op_variants.values())[0].__name__ msg = ( "No available version for {} in the coremltools.target.{} opset. Please update the " "minimum_deployment_target to at least coremltools.target.{}" ).format(op_type, opset_version.name, opset_versions[0].name) raise ValueError(msg) op_cls = op_variants[opset_version] return op_cls ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/registry.py0000644000000000000000000002136214672066616023624 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from collections import defaultdict from coremltools import _logger as logger from coremltools.converters.mil._deployment_compatibility import \ AvailableTarget as target from coremltools.converters.mil.mil.block import curr_opset_version from ..builder import Builder from .helper import _get_version_of_op class SSAOpRegistry: """ There are three kinds of operations that we could register: (1) core_ops: dict[str, dict[Operation]] - These are the core ops in PyMIL, which have a direct mapping to the backend in neural_network or mlprogram - The registered op is considered a core op if the namespace is not provided - coreml_ops[op_type] is a dict that tracks different opset versions for an op. For instance - ``core_ops[op_1] = { ct.target.iOS13: op_1_iOS13, ct.target.iOS14: op_1_iOS13, ct.target.iOS15: op_1_iOS13, ct.target.iOS16: op_1_iOS13, }`` . Only one version of op type ``op_1`` is registered, and it is defined in iOS13, which both neural_network and mlprogram backend support - ``core_ops[op_2] = { ct.target.iOS13: op_2_iOS13, ct.target.iOS14: op_2_iOS13, ct.target.iOS15: op_2_iOS13, ct.target.iOS16: op_2_iOS16, }`` . Two versions of op type ``op_2`` are registered, one each for iOS13, iOS16. . The builder picks up correct version of the op according to curr_opset_version(), which returns the opset version of the current function. -- If ``curr_opset_version()`` is ``None`` (the version of the function is not set), ``mb.op_2`` would call the oldest version of the op by default, which is ``op_2_ios13`` -- Otherwise, the builder would pick up core_ops[op_2][curr_opset_version()] - In the highest level, users can choose the desired version by specifying the ``minum_deployment_target`` argument in ``coremltools.convert`` - The default ``opset_version`` for the core ops would be set to iOS13, for which neural_network backend supports (2) dialect_ops: dict[str, Operation] - These are the ops that are created for specific frontend framework, for instance: ``tf_lstm_block, torch_upsample_nearest_neighbor`` - A graph pass must be customized by the developer to translate a dialect_ops into core ops (3) custom_ops: dict[str, Operation] - These are the custom ops, in which an additional ``bindings`` which should be specified in operator """ SUPPORTED_OPSET_VERSIONS = ( target.iOS13, target.iOS14, target.iOS15, target.iOS16, target.iOS17, target.iOS18, ) core_ops = defaultdict(dict) dialect_ops = {} custom_ops = {} @staticmethod def _get_core_op_cls(op_type=None): """ A utility function that retrieves an op cls using the curr_opset_version """ if op_type not in SSAOpRegistry.core_ops: raise ValueError("op {} not registered.".format(op_type)) candidate_ops = SSAOpRegistry.core_ops[op_type] return _get_version_of_op(candidate_ops, curr_opset_version()) @staticmethod def register_op(_cls=None, is_custom_op=False, namespace=None, opset_version=target.iOS13, allow_override=False): """ Registration routine for MIL Program operators Parameters ---------- is_custom_op: boolean - If ``True``, maps current operator to ``custom_op``. 
``custom_op`` requires additional ``bindings`` which should be specified in operator - Default ``False`` namespace: str - If provided, the op is registered as a dialect op - Otherwise is considered as a core op opset_version: int - Specify the minimum spec version that supports this op - Default to ``ct.target.iOS13``, which is for the neural_network backend allow_override: boolean - If True, it is allowed for an operation to override the previous operation with the same registered name - Default ``False`` """ def class_wrapper(op_cls): op_type = op_cls.__name__ # debug message op_msg = "op" is_dialect_op = (namespace is not None) if is_custom_op: op_msg = "Custom op" elif is_dialect_op: op_msg = "Dialect op" logger.debug("Registering {} {}".format(op_msg, op_type)) # pick the right dict for registration if is_custom_op: op_reg = SSAOpRegistry.custom_ops elif is_dialect_op: op_reg = SSAOpRegistry.dialect_ops # Check that op_type is prefixed with namespace if op_type[: len(namespace)] != namespace: msg = ( "Dialect op type {} registered under {} namespace must " + "prefix with {}" ) raise ValueError(msg.format(op_type, namespace, namespace)) op_cls._dialect_namespace = namespace else: op_reg = SSAOpRegistry.core_ops # verify that the op have not been registered before if allow_override = False msg = "SSA {} {} already registered.".format(op_msg, op_type) if is_custom_op or is_dialect_op: if op_type in op_reg and not allow_override: raise ValueError(msg) else: if opset_version in op_reg[op_type] and not allow_override: if opset_version - 1 not in op_reg[op_type] or (op_reg[op_type][opset_version - 1] != op_reg[op_type][opset_version]): raise ValueError(msg) # add the op to op_reg if is_custom_op or is_dialect_op: op_reg[op_type] = op_cls else: # The older version of the op must be registered first, or it will override the # newer version. For example, assuming an op has two versions: IOS13 and IOS15. If # the IOS15 is registered first, the op_reg[op_type] will have that op class for # IOS15/16/..., and when IOS13 is registered, it will override all op classes for # IOS13/14/15/16/... where IOS15 op class will get lost. So we error out early # instead of keep registering when this happens. if opset_version in op_reg[op_type]: old_op_cls = op_reg[op_type][opset_version] for i in range(opset_version, SSAOpRegistry.SUPPORTED_OPSET_VERSIONS[-1] + 1): if op_reg[op_type][i] != old_op_cls: raise ValueError( f"Older version of op {op_type} must be registered " f"before a newer version." 
) idx = SSAOpRegistry.SUPPORTED_OPSET_VERSIONS.index(opset_version) for i in range(idx, len(SSAOpRegistry.SUPPORTED_OPSET_VERSIONS)): op_reg[op_type][SSAOpRegistry.SUPPORTED_OPSET_VERSIONS[i]] = op_cls # add the version information to the op cls op_cls._op_variants = op_reg[op_type] @classmethod def add_op(cls, **kwargs): """ An utility function that help the builder to pickup the correct op class when calling ``mb.op`` There are two cases: (1) custom op / dialect op: If the op is a custom op or a dialect op, we could directly pick up the op class through ``SSAOpRegistry.custom_ops[op_type]`` or ``SSAOpRegistry.dialect_ops[op_type]`` (2) core op: For the core op, the builder would pick up the correct version according to ``curr_opset_version()`` """ op_cls_to_add = None is_core_op = (op_reg == SSAOpRegistry.core_ops) if is_core_op: op_cls_to_add = SSAOpRegistry._get_core_op_cls(op_type) else: op_cls_to_add = op_reg[op_type] return cls._add_op(op_cls_to_add, **kwargs) setattr(Builder, op_type, add_op) return op_cls if _cls is None: return class_wrapper return class_wrapper(_cls) ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1726511965.237547 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/0000755000000000000000000000000014672075535022540 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/__init__.py0000644000000000000000000000033214672066616024647 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1726511965.237547 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/coreml_dialect/0000755000000000000000000000000014672075535025506 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/coreml_dialect/__init__.py0000644000000000000000000000033014672066616027613 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/coreml_dialect/test_coreml_dialect.py0000644000000000000000000000513014672066616032064 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
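# --------------------------------------------------------------------------------------
# Illustrative sketch (an assumption, not part of the original sources): how the
# SSAOpRegistry.register_op decorator defined above is typically applied in an op
# definition file. The op name ``identity_copy`` and its behavior are hypothetical;
# the imports mirror those used by the iOS18 op definitions earlier in this package.
from coremltools.converters.mil._deployment_compatibility import AvailableTarget as target
from coremltools.converters.mil.mil import Operation, types
from coremltools.converters.mil.mil.input_type import InputSpec, TensorInputType
from coremltools.converters.mil.mil.ops.defs._op_reqs import register_op


@register_op(opset_version=target.iOS17)
class identity_copy(Operation):
    # One tensor input whose dtype is constrained by the "T" type domain.
    input_spec = InputSpec(x=TensorInputType(type_domain="T"))
    type_domains = {"T": (types.fp16, types.fp32)}

    def type_inference(self):
        # Output keeps the same symbolic type (shape and dtype) as the input.
        return self.x.sym_type

# After registration, SSAOpRegistry.core_ops["identity_copy"] maps iOS17 and every later
# supported opset version to this class, and the builder exposes it as
# mb.identity_copy(x=...).
# --------------------------------------------------------------------------------------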
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import pytest from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.builder import Builder as mb from coremltools.converters.mil.mil.ops.defs.iOS18 import _IOS18_TARGET from coremltools.converters.mil.testing_utils import get_op_types_in_program class TestCoreMLUpdateState: @staticmethod def test_update_tensor_state_builder(): @mb.program( input_specs=[mb.StateTensorSpec((2, 3)), mb.TensorSpec((2, 3))], opset_version=_IOS18_TARGET, ) def prog(x, value): return mb.coreml_update_state(state=x, value=value) update_state_op = prog.find_ops("coreml_update_state")[0] assert types.is_state(update_state_op.state._sym_type) assert types.is_tensor(update_state_op.outputs[0]._sym_type) @staticmethod def test_update_tensor_state_builder_invalid(): # Update state with value of different shape with pytest.raises( ValueError, match="State wrapped type tensor\[2,3,fp32\] not matched with the value's sym_type tensor\[3,2,fp32\]", ): @mb.program( input_specs=[mb.StateTensorSpec((2, 3)), mb.TensorSpec((3, 2))], opset_version=_IOS18_TARGET, ) def prog(x, value): return mb.coreml_update_state(state=x, value=value) # Update state with value of different dtype with pytest.raises( ValueError, match="State wrapped type tensor\[2,3,fp32\] not matched with the value's sym_type tensor\[2,3,fp16\]", ): @mb.program( input_specs=[mb.StateTensorSpec((2, 3)), mb.TensorSpec((2, 3), dtype=types.fp16)], opset_version=_IOS18_TARGET, ) def prog(x, value): return mb.coreml_update_state(state=x, value=value) @staticmethod def test_simple_stateful_model_builder(): @mb.program( input_specs=[mb.StateTensorSpec((2, 3)), mb.TensorSpec((2, 3))], opset_version=_IOS18_TARGET, ) def prog(x, value): read_val = mb.read_state(input=x) add = mb.add(x=read_val, y=value) return mb.coreml_update_state(state=x, value=add) assert get_op_types_in_program(prog) == ["read_state", "add", "coreml_update_state"] ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1726511965.237547 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS14/0000755000000000000000000000000014672075535023377 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS14/__init__.py0000644000000000000000000000065614672066616025517 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import coremltools as ct from coremltools.converters.mil.testing_reqs import backends_internal, clean_up_backends backends = clean_up_backends(backends_internal, ct.target.iOS14, force_include_iOS15_test=True) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS14/test_activation.py0000644000000000000000000011155614672066616027162 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest import scipy import coremltools as ct from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.ops.tests.iOS14 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import ( mark_api_breaking, run_compare_builder, ) from coremltools.converters.mil.testing_reqs import compute_units from coremltools.converters.mil.testing_utils import ssa_fn class TestClampedReLU: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) input_placeholders = { "x": mb.placeholder(shape=t.shape), } input_values = {"x": t} def build(x): return mb.clamped_relu(x=x, alpha=2.0, beta=1.0) expected_output_types = (2, 3, types.fp32) expected_outputs = np.array([[-2, 1, -6], [1, -10, 1]], dtype=np.float32) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) v = mb.clamped_relu(x=x_val, alpha=2.0, beta=1.0) x = np.minimum(np.maximum(x_val, 0), 1.0) y = np.minimum(np.minimum(x_val, 0) * 2.0, 1.0) np.testing.assert_allclose(x + y, v.val, atol=1e-04, rtol=1e-05) @pytest.mark.parametrize( "compute_unit, backend, dim, alpha, beta", itertools.product(compute_units, backends, [2, 4, 8], [2.0, 3.0], [4.0, 5.0]), ) def test_builder_to_backend_stress(self, compute_unit, backend, dim, alpha, beta): shape_x = np.array([dim, dim]) x_val = np.random.rand(*shape_x) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return [mb.clamped_relu(x=x, alpha=alpha, beta=beta)] x = np.minimum(np.maximum(x_val, 0), 1.0) y = np.minimum(np.minimum(x_val, 0) * 2.0, 1.0) expected_outputs = [x + y] expected_output_types = [o.shape[:] + (types.fp32,) for o in expected_outputs] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestELU: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) input_placeholders = { "x": mb.placeholder(shape=t.shape), } input_values = {"x": t} def build(x): return mb.elu(x=x, alpha=2.0) expected_output_types = (2, 3, types.fp32) expected_outputs = np.array( [[-1.2642411, 2.0, -1.9004259], [4.0, -1.9865241, 6.0]], dtype=np.float32 ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) v = mb.elu(x=x_val, alpha=2.0) b = np.copy(x_val) b[b < 0] = 2.0 * (np.exp(b[b < 0]) - 1) np.testing.assert_allclose(b, v.val, atol=1e-04, rtol=1e-05) class TestGeLU: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array([[-1, 2, -3], [4, -5, 6]], 
dtype=np.float32) input_placeholders = { "x": mb.placeholder(shape=t.shape), } input_values = {"x": t} def build(x): return mb.gelu(x=x) expected_output_types = (2, 3, types.fp32) expected_outputs = np.array( [ [-1.58691406e-01, 1.95410156e00, -4.04968858e-03], [3.99987316e00, -1.49011612e-06, 6.00000000e00], ], dtype=np.float32, ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, atol=1e-3, rtol=1e-3, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) mode = "TANH_APPROXIMATION" v = mb.gelu(x=x_val, mode=mode) a = np.sqrt(2 / np.pi) * (x_val + 0.044715 * np.power(x_val, 3)) out = 0.5 * x_val * (1 + np.tanh(a)) np.testing.assert_allclose(out, v.val, atol=1e-04, rtol=1e-05) mode = "SIGMOID_APPROXIMATION" v = mb.gelu(x=x_val, mode=mode) out = x_val * (1 / (1 + np.exp(-(1.702 * x_val)))) np.testing.assert_allclose(out, v.val, atol=1e-04, rtol=1e-05) v = mb.gelu(x=x_val) out = 0.5 * x_val * (1 + scipy.special.erf(x_val / np.sqrt(2))) np.testing.assert_allclose(out, v.val, atol=1e-04, rtol=1e-05) @pytest.mark.parametrize( "compute_unit, backend, dim, mode", itertools.product( compute_units, backends, [2, 6], ["EXACT", "TANH_APPROXIMATION", "SIGMOID_APPROXIMATION"], ), ) def test_builder_to_backend_stress(self, compute_unit, backend, dim, mode): shape = np.array([dim, dim]) x_val = np.random.rand(*shape) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return [mb.gelu(x=x, mode=mode)] if mode == "TANH_APPROXIMATION": a = np.sqrt(2 / np.pi) * (x_val + 0.044715 * np.power(x_val, 3)) out = 0.5 * x_val * (1 + np.tanh(a)) elif mode == "SIGMOID_APPROXIMATION": out = x_val * (1 / (1 + np.exp(-(1.702 * x_val)))) else: out = 0.5 * x_val * (1 + scipy.special.erf(x_val / np.sqrt(2))) expected_outputs = [out] expected_output_types = [o.shape[:] + (types.fp32,) for o in expected_outputs] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, atol=1e-3, rtol=1e-3, ) class TestLeakyReLU: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) input_placeholders = { "x": mb.placeholder(shape=t.shape), } input_values = {"x": t} def build(x): return mb.leaky_relu(x=x, alpha=2.0) expected_output_types = (2, 3, types.fp32) expected_outputs = np.array([[-2, 2, -6], [4, -10, 6]], dtype=np.float32) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) v = mb.leaky_relu(x=x_val, alpha=2.0) b = np.copy(x_val) b[b < 0] *= 2.0 np.testing.assert_allclose(b, v.val, atol=1e-04, rtol=1e-05) class TestLinearActivation: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return mb.linear_activation(x=x, alpha=2.0, beta=3.0) expected_output_types = (2, 3, types.fp32) expected_outputs = np.array([[1, 7, -3], [11, -7, 15]], 
dtype=np.float32) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) v = mb.linear_activation(x=x_val, alpha=2.0, beta=3.0) np.testing.assert_allclose(x_val * 2.0 + 3.0, v.val, atol=1e-04, rtol=1e-05) @pytest.mark.parametrize( "compute_unit, backend, dim", itertools.product(compute_units, backends, [2, 4, 8]), ) def test_builder_to_backend_stress(self, compute_unit, backend, dim): shape = np.array([dim, dim]) x_val = np.random.rand(*shape) alpha = np.random.uniform() beta = np.random.uniform() input_placeholders = { "x": mb.placeholder(shape=x_val.shape), } input_values = {"x": x_val} def build(x): return [mb.linear_activation(x=x, alpha=alpha, beta=beta)] expected_outputs = [x_val * alpha + beta] expected_output_types = [o.shape[:] + (types.fp32,) for o in expected_outputs] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestPReLU: @pytest.mark.parametrize( "compute_unit, backend, rank, alpha_values", itertools.product( compute_units, backends, [3, 4, 5], [[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]], ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, rank, alpha_values): if backend.backend == "mlprogram" and backend.precision == "fp16": pytest.xfail( "rdar://92175249 ([MIL] TestActivation::test_prelu[backend=(mlprogram, fp16)] CI failure)" ) alpha = np.array(alpha_values, dtype=np.float32) if rank == 3 or rank == 5: are_alpha_values_same = np.where(np.abs(alpha - alpha[0]) > 1e-5)[0].size == 0 if not are_alpha_values_same: pytest.xfail("rdar://91442339") t = np.array([[[[-1, 3]], [[-1, 2]], [[4, -5]]]], dtype=np.float32) expected_outputs = np.array( [[[[-1 * alpha[0], 3]], [[-1 * alpha[1], 2]], [[4, -5 * alpha[2]]]]], dtype=np.float32 ) shape = None if rank == 3: shape = (1, 3, 2) elif rank == 4: shape = (1, 3, 1, 2) elif rank == 5: shape = (1, 3, 1, 1, 2) else: raise ValueError("rank not supported") t = np.reshape(t, shape) expected_outputs = np.reshape(expected_outputs, shape) expected_output_types = tuple([s for s in shape]) + (types.fp32,) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return mb.prelu(x=x, alpha=alpha) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[[[-1, 3, 6]], [[-1, 2, -3]], [[4, -5, 6]]]], dtype=np.float32) alpha = np.array([1, 2, 3], dtype=np.float32) v = mb.prelu(x=x_val, alpha=alpha) alpha_br = alpha for i in range(len(x_val.shape)): if i != 1: alpha_br = np.expand_dims(alpha_br, i) expected_res = np.maximum(x_val, 0) + np.minimum(x_val, 0) * alpha_br np.testing.assert_allclose(expected_res, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_eval1(self): x_val = np.array([[[-1, 3, 6]], [[-1, 2, -3]], [[4, -5, 6]]], dtype=np.float32) with pytest.raises(ValueError, match=r".* dimension 1 .*"): mb.prelu(x=x_val, alpha=np.array([1, 2], dtype=np.float32)) @ssa_fn def test_builder_eval2(self): x_val = np.array([[[-1, 3, 6]], [[-1, 2, -3]], [[4, -5, 6]]], dtype=np.float32) with pytest.raises(ValueError, match=r"alpha .* rank 1"): mb.prelu(x=x_val, alpha=np.array([[1, 2, 3]], dtype=np.float32)) @ssa_fn def test_builder_eval3(self): with 
pytest.raises(ValueError, match=r"x .* rank 3"): mb.prelu( x=np.array([1], dtype=np.float32), alpha=np.array([[1, 2, 3]], dtype=np.float32), ) @pytest.mark.parametrize( "compute_unit, backend, dim, chan", itertools.product(compute_units, backends, [1, 2, 4, 8], [2, 3, 4]), ) def test_builder_to_backend_stress(self, compute_unit, backend, dim, chan): shape = np.array([1, chan, dim, dim]) x_val = np.random.rand(*shape) alpha_val = np.random.rand(chan).astype(np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return [mb.prelu(x=x, alpha=alpha_val)] alpha_br = np.copy(alpha_val) for i in range(1, len(x_val.shape) - 1): alpha_br = np.expand_dims(alpha_br, i) x_pos = np.maximum(x_val, 0) b = np.minimum(x_val, 0) expected_outputs = [x_pos + b * alpha_br] expected_output_types = [o.shape[:] + (types.fp32,) for o in expected_outputs] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestReLU: @pytest.mark.parametrize( "compute_unit, backend, data_type", itertools.product(compute_units, backends, [np.float32, np.float16]), ) def test_builder_to_backend_smoke(self, compute_unit, backend, data_type): t = np.array([[-1, 2, -3], [4, -5, 6]], dtype=data_type) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return mb.relu(x=x) expected_output_types = (2, 3, types.fp32) expected_outputs = np.array([[0, 2, 0], [4, 0, 6]], dtype=data_type) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) v = mb.relu(x=x_val) np.testing.assert_allclose(np.maximum(x_val, 0), v.val, atol=1e-04, rtol=1e-05) class TestReLU6: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array([[-1, 7, -3], [4, -5, 8]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return mb.relu6(x=x) expected_output_types = (2, 3, types.fp32) expected_outputs = np.array([[0, 6, 0], [4, 0, 6]], dtype=np.float32) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[-1, 7, -3], [4, -5, 8]], dtype=np.float32) v = mb.relu6(x=x_val) np.testing.assert_allclose( np.minimum(np.maximum(x_val, 0), 6), v.val, atol=1e-04, rtol=1e-05 ) class TestScaledTanh: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return mb.scaled_tanh(x=x, alpha=2.0, beta=1.0) expected_output_types = (2, 3, types.fp32) expected_outputs = np.array( [[-1.5231884, 1.9280552, -1.9901096], [1.9986587, -1.9998184, 1.9999754]], dtype=np.float32, ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) 
v = mb.scaled_tanh(x=x_val, alpha=2.0, beta=1.0) np.testing.assert_allclose(2.0 * np.tanh(x_val * 1.0), v.val, atol=1e-04, rtol=1e-05) @pytest.mark.parametrize( "compute_unit, backend, dim, alpha, beta", itertools.product(compute_units, backends, [2, 4, 8], [2.0, 3.0], [4.0, 5.0]), ) def test_builder_to_backend_stress(self, compute_unit, backend, dim, alpha, beta): shape_x = np.array([dim, dim]) x_val = np.random.rand(*shape_x) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return [mb.scaled_tanh(x=x, alpha=alpha, beta=beta)] expected_outputs = [alpha * np.tanh(x_val * beta)] expected_output_types = [o.shape[:] + (types.fp32,) for o in expected_outputs] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestSigmoid: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return mb.sigmoid(x=x) expected_output_types = (2, 3, types.fp32) expected_outputs = np.array( [ [0.2689414213699951, 0.8807970779778823, 0.04742587], [0.98201376, 0.00669285, 0.9975274], ], dtype=np.float32, ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) v = mb.sigmoid(x=x_val) np.testing.assert_allclose(1 / (1 + np.exp(-x_val)), v.val, atol=1e-04, rtol=1e-05) class TestSigmoidHard: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return mb.sigmoid_hard(x=x, alpha=1.0, beta=2.0) expected_output_types = (2, 3, types.fp32) expected_outputs = np.array([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]], dtype=np.float32) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) alpha = 1.0 beta = 2.0 v = mb.sigmoid_hard(x=x_val, alpha=alpha, beta=beta) np.testing.assert_allclose( np.minimum(np.maximum((alpha * x_val) + beta, 0), 1), v.val, atol=1e-04, rtol=1e-05, ) @pytest.mark.parametrize( "compute_unit, backend, dim, alpha, beta", itertools.product(compute_units, backends, [2, 4, 8], [2.0, 3.0], [4.0, 5.0]), ) def test_builder_to_backend_stress(self, compute_unit, backend, dim, alpha, beta): shape_x = np.array([dim, dim]) x_val = np.random.rand(*shape_x) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return [mb.sigmoid_hard(x=x, alpha=alpha, beta=beta)] expected_outputs = [np.minimum(np.maximum((alpha * x_val) + beta, 0), 1)] expected_output_types = [o.shape[:] + (types.fp32,) for o in expected_outputs] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestSiLU: 
@pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): x_val = np.array([-1.1, 2.2, -3.3, 4.4], dtype=np.float32).reshape((1, 2, 1, 2)) input_placeholder_dict = { "x": mb.placeholder(shape=x_val.shape), } input_value_dict = {"x": x_val} expected_output_type = x_val.shape + (types.fp32,) def build(x): return mb.silu(x=x) expected_output = np.array([-0.2747, 1.9805, -0.1174, 4.3466], dtype=np.float32).reshape( expected_output_type[:-1] ) run_compare_builder( build, input_placeholder_dict, input_value_dict, expected_output_type, expected_output, compute_unit=compute_unit, backend=backend, ) class TestSoftplus: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return mb.softplus(x=x) expected_output_types = (2, 3, types.fp32) expected_outputs = np.array( [[0.31326166, 2.126928, 0.04858733], [4.01815, 0.00671535, 6.0024757]], dtype=np.float32, ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) v = mb.softplus(x=x_val) np.testing.assert_allclose( np.log(1 + np.exp(-np.abs(x_val))) + np.maximum(x_val, 0), v.val, atol=1e-04, rtol=1e-05 ) # No torch test because there is no direct torch translation to this layer class TestSoftplusParametric: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array([[[[-1, 3, 6]], [[-1, 2, -3]], [[4, -5, 6]]]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return mb.softplus_parametric( x=x, alpha=np.array([1, 2, 3], dtype=np.float32), beta=np.array([4, 5, 6], dtype=np.float32), ) expected_output_types = (1, 3, 1, 3, types.fp32) expected_outputs = np.array( [ [ [[1.8142700e-02, 1.2000000e01, 2.4000000e01]], [[1.3427734e-02, 2.0000000e01, 7.1525574e-07]], [[7.2000000e01, 0.0000000e00, 1.0800000e02]], ] ], dtype=np.float32, ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[[[-1, 3, 6]], [[-1, 2, -3]], [[4, -5, 6]]]], dtype=np.float32) v = mb.softplus_parametric( x=x_val, alpha=np.array([1, 2, 3], dtype=np.float32), beta=np.array([4, 5, 6], dtype=np.float32), ) alpha_br = np.array([1, 2, 3], dtype=np.float32) beta_br = np.array([4, 5, 6], dtype=np.float32) for i in range(len(x_val.shape)): if i != 1: alpha_br = np.expand_dims(alpha_br, i) beta_br = np.expand_dims(beta_br, i) expected_res = alpha_br * np.log(np.exp(x_val * beta_br) + 1) np.testing.assert_allclose(expected_res, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_eval2(self): x_val = np.array([[[-1, 3, 6]], [[-1, 2, -3]], [[4, -5, 6]]], dtype=np.float32) with pytest.raises(ValueError, match=r".* dimension 1 .*"): mb.softplus_parametric( x=x_val, alpha=np.array([1, 2], dtype=np.float32), beta=np.array([4, 5, 6], dtype=np.float32), ) @ssa_fn def test_builder_eval3(self): x_val = np.array([[[-1, 3, 6]], [[-1, 2, -3]], 
[[4, -5, 6]]], dtype=np.float32) with pytest.raises(ValueError, match=r"alpha .* rank 1"): mb.softplus_parametric( x=x_val, alpha=np.array([[1, 2, 3]], dtype=np.float32), beta=np.array([4, 5, 6], dtype=np.float32), ) @ssa_fn def test_builder_eval4(self): with pytest.raises(ValueError, match=r"x .* rank 3"): mb.softplus_parametric( x=np.array([1], dtype=np.float32), alpha=np.array([[1, 2, 3]], dtype=np.float32), beta=np.array([4, 5, 6], dtype=np.float32), ) @ssa_fn def test_builder_eval5(self): x_val = np.array([[[-1, 3, 6]], [[-1, 2, -3]], [[4, -5, 6]]], dtype=np.float32) with pytest.raises(ValueError, match=r".* dimension 1 .*"): mb.softplus_parametric( x=x_val, alpha=np.array([1, 2, 3], dtype=np.float32), beta=np.array([5, 6], dtype=np.float32), ) @ssa_fn def test_builder_eval6(self): x_val = np.array([[[[-1, 3, 6]], [[-1, 2, -3]], [[4, -5, 6]]]], dtype=np.float32) with pytest.raises(ValueError, match=r"beta .* rank 1"): mb.softplus_parametric( x=x_val, alpha=np.array([1, 2, 3], dtype=np.float32), beta=np.array([[4, 5, 6]], dtype=np.float32), ) @pytest.mark.parametrize( "compute_unit, backend, dim, chan", itertools.product(compute_units, backends, [1, 2, 4, 8], [1, 2, 3]), ) def test_builder_to_backend_stress(self, compute_unit, backend, dim, chan): shape = np.array([1, chan, dim, dim]) x_val = np.random.rand(*shape) alpha_val = np.random.rand(chan).astype(np.float32) beta_val = np.random.rand(chan).astype(np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return [mb.softplus_parametric(x=x, alpha=alpha_val, beta=beta_val)] alpha_br = np.copy(alpha_val) beta_br = np.copy(beta_val) for i in range(1, len(x_val.shape) - 1): alpha_br = np.expand_dims(alpha_br, i) beta_br = np.expand_dims(beta_br, i) expected_outputs = [alpha_br * np.log(np.exp(x_val * beta_br) + 1)] expected_output_types = [o.shape[:] + (types.fp32,) for o in expected_outputs] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestSoftmax: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_buidler_to_backend_smoke(self, compute_unit, backend): t = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return mb.softmax(x=x, axis=0) expected_output_types = (2, 3, types.fp32) expected_outputs = np.array( [ [6.69285092e-03, 9.99088949e-01, 1.23394576e-04], [9.93307149e-01, 9.11051194e-04, 9.99876605e-01], ], dtype=np.float32, ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) v = mb.softmax(x=x_val, axis=0) np.testing.assert_allclose( scipy.special.softmax(x_val, axis=0), v.val, atol=1e-04, rtol=1e-05 ) @pytest.mark.parametrize("input_size", [(1), (2), (1, 2), (2, 2), (2, 3, 4), (2, 3, 4, 10)]) def test_value_inference(self, input_size): rs = np.random.RandomState(1234) x = rs.random(input_size) for axis in range(-x.ndim, x.ndim - 1): @mb.program(input_specs=[]) def prog(): return mb.softmax(x=x, axis=axis) ops = list(prog.functions.values())[0].operations op = list(ops)[2] assert op.op_type == "softmax" np.testing.assert_allclose( op.value_inference(), scipy.special.softmax(x, axis=axis), atol=1e-04, 
rtol=1e-05, ) class TestSoftsign: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return mb.softsign(x=x) expected_output_types = (2, 3, types.fp32) expected_outputs = np.array( [[-0.5, 0.66666667, -0.75], [0.8, -0.83333333, 0.85714286]], dtype=np.float32, ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) v = mb.softsign(x=x_val) np.testing.assert_allclose(x_val / (1 + np.abs(x_val)), v.val, atol=1e-04, rtol=1e-05) class TestThresholdedReLU: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return mb.thresholded_relu(x=x, alpha=2.0) expected_output_types = (2, 3, types.fp32) expected_outputs = np.array([[0, 2, 0], [4, 0, 6]], dtype=np.float32) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[0, 2, 0], [4, 0, 6]], dtype=np.float32) v = mb.thresholded_relu(x=x_val, alpha=2.0) y = x_val y[y < 2.0] = 0 np.testing.assert_allclose(y, v.val, atol=1e-04, rtol=1e-05) @pytest.mark.parametrize( "compute_unit, backend, dim, alpha", itertools.product(compute_units, backends, [2, 4, 8], [2.0, 3.0]), ) def test_builder_to_backend_stress(self, compute_unit, backend, dim, alpha): shape_x = np.array([dim, dim]) x_val = np.random.rand(*shape_x) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return [mb.thresholded_relu(x=x, alpha=alpha)] y = x_val y[y < alpha] = 0 expected_outputs = [y] expected_output_types = [o.shape[:] + (types.fp32,) for o in expected_outputs] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestInputWeightDifferentDtypesErrorOut: """ Starting from IOS17 the alpha/beta can have different dtypes from the input/output, so this test class is mainly to verify the behaviour before iOS17, that the type inference should early error out. """ @mark_api_breaking(breaking_opset_version=ct.target.iOS17) @pytest.mark.parametrize( "backend, different_dtype, op_name", itertools.product( backends, [True, False], ["elu", "leaky_relu", "prelu", "thresholded_relu"], ), ) def test_builder_eval_alpha(self, backend, different_dtype, op_name): x = np.array([[[-1, 2, -3], [4, -5, 6]]], dtype=np.float32) alpha = np.float16(2.0) if different_dtype else np.float32(2.0) if op_name == "prelu": alpha = np.array([2.0, 2.0], dtype=alpha.dtype) # prelu requires alpha to be rank 1. def prog(): return getattr(mb, op_name)(x=x, alpha=alpha) if different_dtype: # Before iOS17 it should raise error when alpha has different dtype than input/output. 
with pytest.raises(ValueError, match="must have the same data type"): mb.program(input_specs=[], opset_version=backend.opset_version)(prog) else: mb.program(input_specs=[], opset_version=backend.opset_version)(prog) @mark_api_breaking(breaking_opset_version=ct.target.iOS17) @pytest.mark.parametrize( "backend, different_dtype, op_name", itertools.product( backends, [True, False], [ "clamped_relu", "linear_activation", "scaled_tanh", "sigmoid_hard", "softplus_parametric", ], ), ) def test_builder_eval_alpha_beta(self, backend, different_dtype, op_name): x = np.array([[[-1, 2, -3], [4, -5, 6]]], dtype=np.float32) alpha = np.float16(2.0) if different_dtype else np.float32(2.0) beta = np.float16(1.0) if different_dtype else np.float32(1.0) if op_name == "softplus_parametric": alpha = np.array([2.0, 2.0], dtype=alpha.dtype) beta = np.array([1.0, 1.0], dtype=beta.dtype) def prog(): return getattr(mb, op_name)(x=x, alpha=alpha, beta=beta) if different_dtype: with pytest.raises(ValueError, match="must have the same data type"): mb.program(input_specs=[], opset_version=backend.opset_version)(prog) else: mb.program(input_specs=[], opset_version=backend.opset_version)(prog) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS14/test_control_flow.py0000644000000000000000000004660214672066616027527 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import get_new_symbol, types from coremltools.converters.mil.mil.ops.tests.iOS14 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import ( UNK_SYM, construct_inputs_from_placeholders, run_compare_builder, ) from coremltools.converters.mil.testing_reqs import compute_units from coremltools.converters.mil.testing_utils import random_gen, ssa_fn class TestSelect: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): cond_val = np.array([[3, 0, 0], [0, 4, 0], [5, 6, 0]], dtype=np.float32) a_val = np.array([[3, 1, 1], [1, 4, 1], [5, 6, 1]], dtype=np.float32) b_val = np.array([[3, 2, 2], [2, 4, 2], [5, 6, 2]], dtype=np.float32) input_placeholders = { "cond": mb.placeholder(shape=cond_val.shape), "a": mb.placeholder(shape=a_val.shape), "b": mb.placeholder(shape=b_val.shape), } input_values = {"cond": cond_val, "a": a_val, "b": b_val} def build(cond, a, b): if not types.is_bool(cond.dtype): cond = mb.cast(x=cond, dtype="bool") return [mb.select(cond=cond, a=a, b=b)] expected_output_types = [(3, 3, types.fp32)] expected_outputs = [ np.array([[3.0, 2.0, 2.0], [2.0, 4.0, 2.0], [5.0, 6.0, 2.0]], dtype=np.float32) ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_builder_to_backend_smoke_broadcast(self, compute_unit, backend): cond_val = np.array([[1], [0], [2]], dtype=np.float32) a_val = np.array([1, 7, 8], dtype=np.float32) b_val = np.array([[3, 2, 2], [2, 4, 2], [5, 6, 2]], dtype=np.float32) input_placeholders = { 
"cond": mb.placeholder(shape=cond_val.shape), "a": mb.placeholder(shape=a_val.shape), "b": mb.placeholder(shape=b_val.shape), } input_values = {"cond": cond_val, "a": a_val, "b": b_val} def build(cond, a, b): if not types.is_bool(cond.dtype): cond = mb.cast(x=cond, dtype="bool") return [mb.select(cond=cond, a=a, b=b)] expected_output_types = [(3, 3, types.fp32)] expected_outputs = [ np.array([[1.0, 7.0, 8.0], [2.0, 4.0, 2.0], [1.0, 7.0, 8.0]], dtype=np.float32) ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_builder_to_backend_smoke_scalar_and_tensor(self, compute_unit, backend): cond_val = np.array([[1], [0], [2]], dtype=np.float32) a_val = np.float32(1.0) b_val = np.array([[3, 2, 2], [2, 4, 2], [5, 6, 2]], dtype=np.float32) input_placeholders = { "cond": mb.placeholder(shape=cond_val.shape), "b": mb.placeholder(shape=b_val.shape), } input_values = {"cond": cond_val, "b": b_val} def build(cond, b): if not types.is_bool(cond.dtype): cond = mb.cast(x=cond, dtype="bool") return [mb.select(cond=cond, a=a_val, b=b)] expected_output_types = [(3, 3, types.fp32)] expected_outputs = [ np.array([[1.0, 1.0, 1.0], [2.0, 4.0, 2.0], [1.0, 1.0, 1.0]], dtype=np.float32) ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_builder_to_backend_smoke_symbolic(self, compute_unit, backend): SYMBOLIC_SHAPE = tuple([get_new_symbol() for _ in range(5)]) VALUE = 100.0 input_placeholders = {"a": mb.placeholder(shape=SYMBOLIC_SHAPE)} def build(a): return [mb.select(cond=False, a=a, b=np.float32(VALUE))] shape = tuple(np.random.randint(1, 5, size=len(SYMBOLIC_SHAPE))) a = np.random.rand(*shape) input_values = {"a": a} expected_outputs = [ VALUE * np.ones(shape), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types=[SYMBOLIC_SHAPE + (types.fp32,)], expected_outputs=expected_outputs, inputs=construct_inputs_from_placeholders(input_placeholders, upper_bound=10), compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): cond = np.random.randint(low=0, high=2, size=(6, 1, 7)).astype(bool) a = random_gen(shape=(6, 1, 7), rand_min=-1962.0, rand_max=0.0) b = random_gen(shape=(6, 1, 7), rand_min=0.0, rand_max=1964.0) res = mb.select(cond=cond, a=a, b=b) np.testing.assert_allclose(np.where(cond, a, b), res.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_eval_broadcast(self): cond = np.array([[True], [False], [True]]) a = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.float32) b = np.array([7, 8], dtype=np.float32) res = mb.select(cond=cond, a=a, b=b) np.testing.assert_allclose( np.array([[1, 2], [7, 8], [5, 6]], dtype=np.float32), res.val, atol=1e-04, rtol=1e-05 ) @ssa_fn def test_builder_eval_scalar(self): res = mb.select(cond=True, a=np.float32(1), b=np.float32(2)) assert isinstance(res.val, np.float32) np.testing.assert_allclose(np.float32(1), res.val) class TestCond: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): input_placeholders = { "a": mb.placeholder(shape=(1,), dtype=types.bool), "b": mb.placeholder(shape=(1,)), } def build(a, b): def 
true_fn(): return mb.add(x=b, y=1.0), mb.mul(x=b, y=2.0) def false_fn(): return mb.add(x=b, y=-1.0), mb.mul(x=b, y=-2.0) pred = mb.squeeze(x=a) return mb.cond(pred=pred, _true_fn=true_fn, _false_fn=false_fn) input_values = { "a": np.array([0], dtype=np.float32), "b": np.array([2], dtype=np.float32), } expected_output_types = [ (1, types.fp32), (1, types.fp32), ] expected_outputs = [ np.array([1], dtype=np.float32), np.array([-4], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestWhileLoop: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): def body(a, b): return mb.add(x=a, y=np.float32(1)), b def cond(a, b): return mb.less(x=a, y=b) input_placeholders = { "a": mb.placeholder(shape=(1,)), "b": mb.placeholder(shape=(1,)), } def build(a, b): return mb.while_loop(_cond=cond, _body=body, loop_vars=(a, b)) input_values = { "a": np.array([1], dtype=np.float32), "b": np.array([2], dtype=np.float32), } expected_output_types = [ (1, types.fp32), (1, types.fp32), ] expected_outputs = [ np.array([2], dtype=np.float32), np.array([2], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_power(self, compute_unit, backend): input_placeholders = { "a": mb.placeholder(shape=(1,)), "b": mb.placeholder(shape=(1,)), } def build(a, b): # Compute a^b def body(res, bx): return mb.mul(x=res, y=a), mb.add(x=bx, y=np.float32(1)) def cond(res, bx): return mb.less(x=bx, y=b) res, ignored = mb.while_loop(_cond=cond, _body=body, loop_vars=([1.0], [0.0])) return res input_values = { "a": np.array([2], dtype=np.float32), "b": np.array([4], dtype=np.float32), } expected_output_types = [ (1, types.fp32), ] expected_outputs = [ np.array([16], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_nested(self, compute_unit, backend): if backend.backend == "neuralnetwork": pytest.xfail( "rdar://96862073 (test_control_flow::TestWhileLoop::test_builder_to_backend_nested failing on nnv1)" ) input_placeholders = { "x": mb.placeholder(shape=(1,)), "y": mb.placeholder(shape=(1,)), } def build(x, y): # i, j = x, y # while i < j: # while 2*i < i+2: # i += 1 # i += 2 # return i, j # Create const outside of while loop for testing purpose two = mb.const(val=[2.0], name="const_two") one = mb.const(val=[1.0], name="const_one") def cond2(i): return mb.less(x=mb.mul(x=two, y=i), y=mb.add(x=i, y=two)) def body2(i): return mb.add(x=i, y=one) def cond1(i, j): return mb.less(x=i, y=j) def body1(i, j): new_i = mb.while_loop(_cond=cond2, _body=body2, loop_vars=(i,)) return mb.add(x=new_i, y=two), j return mb.while_loop(_cond=cond1, _body=body1, loop_vars=(x, y)) input_values = { "x": np.array([0], dtype=np.float32), "y": np.array([10], dtype=np.float32), } expected_output_types = [ (1, types.fp32), (1, types.fp32), ] expected_outputs = [ np.array([10], dtype=np.float32), np.array([10], dtype=np.float32), ] run_compare_builder( 
build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestList: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): elem_shape = (2,) input_placeholders = { "a": mb.placeholder(shape=elem_shape), "b": mb.placeholder(shape=elem_shape), } def build(a, b): ls = mb.make_list(init_length=2, elem_shape=elem_shape) # list is initially all 0 init_t = mb.list_read(ls=ls, index=0) ls = mb.list_write(ls=ls, index=0, value=a) # this write is out of bound ls = mb.list_write(ls=ls, index=4, value=b) ls = mb.list_scatter( ls=ls, indices=[2, 1], value=np.array([[-1, -2], [-4, -5]], dtype=np.float32), ) return ( init_t, mb.list_read(ls=ls, index=0), mb.list_gather(ls=ls, indices=[4, 2, 3]), ) input_values = { "a": np.array([1, 3], dtype=np.float32), "b": np.array([2, 4], dtype=np.float32), } expected_output_types = [ (2, types.fp32), (2, types.fp32), (3, 2, types.fp32), ] expected_outputs = [ np.array([0, 0], dtype=np.float32), np.array([1, 3], dtype=np.float32), np.array([[2, 4], [-1, -2], [0, 0]], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_while(self, compute_unit, backend): # The while_loop appends [1, 2]*i to `ls` for each iteration # i = 0, ... num_iters-1. def body(i, num_iters, ls, update): y = mb.cast(x=i, dtype="fp32") new_elem = mb.mul(x=update, y=y) return ( mb.add(x=i, y=1), num_iters, mb.list_write(ls=ls, index=i, value=new_elem), update, ) def cond(i, num_iters, ls, update): i = mb.cast(x=i, dtype="fp32") return mb.less(x=i, y=num_iters) elem_shape = (2,) input_placeholders = { "num_iters": mb.placeholder(shape=(1,)), "update": mb.placeholder(shape=elem_shape), } def build(num_iters, update): i = 0 ls = mb.make_list(init_length=1, elem_shape=elem_shape) _, _, final_tensor_list, _ = mb.while_loop( _cond=cond, _body=body, loop_vars=(i, num_iters, ls, update) ) list_len = mb.list_length(ls=final_tensor_list) indices = mb.range_1d(start=0, end=list_len, step=1) return mb.list_gather(ls=final_tensor_list, indices=indices) input_values = { "num_iters": np.array([3], dtype=np.float32), "update": np.array([1, 2], dtype=np.float32), } expected_output_types = [ # Type inference does not unroll loop (UNK_SYM, 2, types.fp32), ] expected_outputs = [ np.array([[0, 0], [1, 2], [2, 4]], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestConst: @pytest.mark.parametrize( "compute_unit, backend, dtype", itertools.product( compute_units, backends, [ np.int32, np.int64, np.float16, np.float32, np.float64, ], ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, dtype): t = np.random.randint(0, 5, (4, 2)).astype(np.float32) constant = np.random.randint(0, 5, (4, 2)).astype(dtype) input_placeholders = { "x": mb.placeholder(shape=t.shape), } input_values = {"x": t} def build(x): y = mb.const(val=constant) y = mb.cast(x=y, dtype="fp32") return mb.add(x=x, y=y) expected_output_types = (4, 2, types.fp32) expected_outputs = t + constant.astype(np.float32) run_compare_builder( build, input_placeholders, input_values, expected_output_types, 
expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, dtype", itertools.product( compute_units, backends, ( np.int8, np.uint8, np.int16, np.uint16, np.int32, np.int64, np.float16, np.float32, np.float64, ), ), ) def test_const_type(self, compute_unit, backend, dtype): """Makes sure the ndarray in const has the correct type.""" @mb.program(input_specs=[], opset_version=backend.opset_version) def prog(): return mb.const(val=np.random.randint(0, 5, (4, 2)).astype(dtype)) const_op = prog.functions["main"].find_ops(op_type="const")[0] if dtype == np.int64: target_dtype = np.int32 elif dtype == np.float64: target_dtype = np.float32 else: target_dtype = dtype assert const_op.outputs[0].dtype == types.numpy_type_to_builtin_type(target_dtype) @pytest.mark.parametrize( "compute_unit, backend, dtype_str", itertools.product( compute_units, backends, ("int4", "uint1", "uint2", "uint3", "uint4", "uint6") ), ) def test_const_sub_byte_dtype(self, compute_unit, backend, dtype_str): builtin_dtype = types.string_to_builtin(dtype_str) upper_bound = types.type_mapping.builtin_to_range(builtin_dtype).high original_data = np.random.randint(0, upper_bound + 1, (2, 3)) np_dtype = types.nptype_from_builtin(builtin_dtype) @mb.program(input_specs=[], opset_version=backend.opset_version) def prog(): return mb.const(val=original_data.astype(np_dtype)) const_op = prog.functions["main"].find_ops(op_type="const")[0] assert types.builtin_to_string(const_op.outputs[0].dtype) == dtype_str expected_underlying_dtype = np.int8 if dtype_str.startswith("i") else np.uint8 assert const_op.outputs[0].val.dtype == expected_underlying_dtype assert const_op.outputs[0].val.dtype.metadata["true_dtype"] == types.string_to_builtin( dtype_str ) np.testing.assert_equal(const_op.outputs[0].val, original_data) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS14/test_conv.py0000644000000000000000000010215514672066616025761 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest import coremltools as ct from coremltools._deps import _HAS_TORCH, MSG_TORCH_NOT_FOUND from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import get_new_symbol, types from coremltools.converters.mil.mil.ops.tests.iOS14 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import run_compare_builder from coremltools.converters.mil.testing_reqs import compute_units from coremltools.converters.mil.testing_utils import random_gen if _HAS_TORCH: import torch import torch.nn as nn class TestConvTranspose: @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "conv_dim", "config", "x_weight_dtype", ] ), itertools.product( compute_units, backends, ["conv1d", "conv2d", "conv3d"], [ { "padding": (1, 2, 3), "DHWKdKhKw": (10, 12, 14, 3, 2, 4), "stride": (2, 1, 1), "dilation": (1, 1, 1), "has_bias": False, "groups": 1, "test_symbolic": False, "test_output_shape": True, }, { "padding": (2, 2, 2), "DHWKdKhKw": (10, 12, 14, 3, 2, 4), "stride": (2, 2, 2), "dilation": (2, 1, 1), "has_bias": False, "groups": 2, "test_symbolic": True, "test_output_shape": False, }, { "padding": (1, 2, 3), "DHWKdKhKw": (7, 7, 7, 2, 2, 2), "stride": (2, 2, 2), "dilation": (2, 1, 1), "has_bias": True, "groups": 1, "test_symbolic": True, "test_output_shape": False, }, { "padding": (2, 2, 2), "DHWKdKhKw": (7, 7, 7, 2, 2, 2), "stride": (2, 1, 1), "dilation": (1, 1, 1), "has_bias": True, "groups": 2, "test_symbolic": False, "test_output_shape": False, }, ], [(np.float32, np.float32), (np.float16, np.float16)], ), ) def test_builder_to_backend_stress( self, compute_unit, backend, conv_dim, config, x_weight_dtype, ): padding = config["padding"] DHWKdKhKw = config["DHWKdKhKw"] stride = config["stride"] dilation = config["dilation"] has_bias = config["has_bias"] groups = config["groups"] test_symbolic = config["test_symbolic"] test_output_shape = config["test_output_shape"] D, H, W, Kd, Kh, Kw = DHWKdKhKw N, C_in, C_out = 1, 1 * groups, 2 * groups x_dtype, weight_bias_dtype = x_weight_dtype x_builtin_dtype = types.numpy_type_to_builtin_type(x_dtype) isDeconv1d = conv_dim == "conv1d" isDeconv2d = conv_dim == "conv2d" if isDeconv1d: strides = [stride[0]] dilations = [dilation[0]] kernels = [Kh] m = nn.ConvTranspose1d( C_in, C_out, kernels, stride=strides, dilation=dilations, bias=has_bias, groups=groups, padding=padding[0], ) input_shape = [N, C_in, H] paddings = [padding[0], padding[0]] elif isDeconv2d: strides = [stride[0], stride[1]] dilations = [dilation[0], dilation[1]] kernels = [Kh, Kw] m = nn.ConvTranspose2d( C_in, C_out, kernels, stride=strides, dilation=dilations, bias=has_bias, groups=groups, padding=(padding[0], padding[1]), ) input_shape = [N, C_in, H, W] paddings = [padding[0], padding[0], padding[1], padding[1]] else: strides = [stride[0], stride[1], stride[2]] dilations = [dilation[0], dilation[1], dilation[2]] kernels = [Kd, Kh, Kw] m = nn.ConvTranspose3d( C_in, C_out, kernels, stride=strides, dilation=dilations, bias=has_bias, groups=groups, padding=padding, ) input_shape = [N, C_in, D, H, W] paddings = [ padding[0], padding[0], padding[1], padding[1], padding[2], padding[2], ] wts = m.state_dict() weight = wts["weight"].detach().numpy().astype(weight_bias_dtype) bias = 
wts["bias"].detach().numpy().astype(weight_bias_dtype) if has_bias else None input = torch.randn(*input_shape) output = m(input) output = output.detach().numpy().astype(x_dtype) input = input.detach().numpy().astype(x_dtype) output_shape = list(output.shape) if test_symbolic: # For symbolic input test # Make Batch Size and input channel as symbolic symbolic_batch_size = get_new_symbol() input_shape[0] = symbolic_batch_size output_shape[0] = symbolic_batch_size expected_output_types = tuple(output_shape[:]) + (types.fp32,) expected_outputs = [output] input_placeholders = {"x": mb.placeholder(shape=input_shape, dtype=x_builtin_dtype)} input_values = {"x": input} def build(x): arguments = { "x": x, "weight": weight, "pad": paddings, "pad_type": "custom", "strides": strides, "dilations": dilations, "groups": groups, } if has_bias: arguments["bias"] = bias if test_output_shape: arguments["output_shape"] = output.shape return mb.conv_transpose(**arguments) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, atol=1e-3 if x_dtype == np.float16 and backend.backend == "neuralnetwork" else 1e-4, ) class TestConv: @pytest.mark.parametrize( "backend, pad_type", itertools.product( backends, ["valid", "same", "same_lower", "custom"], ), ) def test_type_inference_cache_no_pad(self, backend, pad_type): # Test the type inference has the caching mechanism to ensure # same symbolic input shapes results in the same output shape if pad_type == "same_lower" and backend.opset_version == ct.target.iOS15: return @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 3, get_new_symbol(), get_new_symbol()), dtype=types.fp32) ], opset_version=backend.opset_version, ) def prog(x): weight = np.random.rand(2, 3, 2, 2) # Basic conv conv_1 = mb.conv(x=x, weight=weight) conv_2 = mb.conv(x=x, weight=weight) assert conv_1.shape == conv_2.shape # With strides / dialations conv_1 = mb.conv(x=x, weight=weight, strides=[1, 2], dilations=[3, 4]) conv_2 = mb.conv(x=x, weight=weight, strides=[1, 2], dilations=[3, 4]) assert conv_1.shape == conv_2.shape # With padding conv_1 = mb.conv(x=x, weight=weight, pad_type=pad_type, pad=[2, 3, 4, 5]) conv_2 = mb.conv(x=x, weight=weight, pad_type=pad_type, pad=[2, 3, 4, 5]) assert conv_1.shape == conv_2.shape return conv_1 @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( "compute_unit, backend, padding_mode, conv_dim", itertools.product( compute_units, backends, ["same_lower", "same", "valid"], ["conv1d", "conv2d", "conv3d"], ), ) def test_padding_mode_stress(self, compute_unit, backend, padding_mode, conv_dim): def rotation_tensor(tensor): assert tensor.shape[0] == tensor.shape[1] == 1 tensor = tensor[0][0] rank = len(tensor.shape) new_tensor = np.copy(np.flip(tensor, axis=tuple(range(rank)))) return np.expand_dims(new_tensor, axis=(0, 1)) if conv_dim == "conv3d" and padding_mode == "same_lower": if backend.backend == "neuralnetwork": pytest.skip("same_lower mode not supported for conv3d in neuralnetwork backend") if padding_mode == "same_lower" and backend.opset_version == ct.target.iOS15: pytest.skip("same_lower pad_type not supported iOS15 opset") batch, in_channels, out_channels = 1, 1, 1 input_shape = (batch, in_channels, 4, 5, 6) # batch, channel, height, width kernel_size = (2, 4, 3) torch_padding_mode = padding_mode if padding_mode != "same_lower" else "same" # Get the right shape for each conv_dim if conv_dim == "conv1d": input_shape = input_shape[:3] 
kernel_size = kernel_size[:1] elif conv_dim == "conv2d": input_shape = input_shape[:4] kernel_size = kernel_size[:2] # Get the ground truth answer from torch if conv_dim == "conv1d": m = torch.nn.Conv1d( in_channels, out_channels, kernel_size, stride=1, padding=torch_padding_mode, bias=False, ) elif conv_dim == "conv2d": m = torch.nn.Conv2d( in_channels, out_channels, kernel_size, stride=1, padding=torch_padding_mode, bias=False, ) elif conv_dim == "conv3d": m = torch.nn.Conv3d( in_channels, out_channels, kernel_size, stride=1, padding=torch_padding_mode, bias=False, ) # Original weight / inputs for the torch model weight = torch.clone(m.state_dict()["weight"]) input = torch.randn(*input_shape, dtype=torch.float32) # Coreml weights / inputs values coreml_weight = weight.detach().numpy() coreml_input = input.detach().numpy() if padding_mode == "same_lower": # For the same_lower padding mode, we get the ground truth output by doing the following steps # (1) Rotate the input value # (2) Rotate the kernel value # (3) Rotate the torch out rotated_input = torch.tensor( rotation_tensor(input.detach().numpy()), dtype=torch.float32 ) rotated_weight = torch.tensor( rotation_tensor(weight.detach().numpy()), dtype=torch.float32 ) m.load_state_dict({"weight": rotated_weight}, strict=False) output = m(rotated_input).detach().numpy() output = rotation_tensor(output) else: output = m(input).detach().numpy() output_shape = list(output.shape) expected_output_types = tuple(output_shape[:]) + (types.fp32,) expected_outputs = [output] input_placeholders = {"x": mb.placeholder(shape=input_shape)} input_values = {"x": coreml_input} def build(x): arguments = { "x": x, "weight": coreml_weight, "pad_type": padding_mode, } return mb.conv(**arguments) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "conv_dim", "config", "x_weight_dtype", ] ), itertools.product( compute_units, backends, ["conv1d", "conv2d", "conv3d"], [ { "padding": (1, 1, 1), "DHWKdKhKw": (10, 12, 14, 3, 2, 4), "stride": (2, 1, 1), "dilation": (1, 1, 1), "has_bias": False, "groups": 1, "symbolic": False, }, { "padding": (2, 2, 2), "DHWKdKhKw": (10, 12, 14, 3, 2, 4), "stride": (2, 2, 2), "dilation": (2, 1, 1), "has_bias": False, "groups": 2, "symbolic": True, }, { "padding": (1, 1, 1), "DHWKdKhKw": (5, 5, 5, 2, 2, 2), "stride": (2, 2, 2), "dilation": (2, 1, 1), "has_bias": True, "groups": 1, "symbolic": True, }, { "padding": (2, 2, 2), "DHWKdKhKw": (5, 5, 5, 2, 2, 2), "stride": (2, 1, 1), "dilation": (1, 1, 1), "has_bias": True, "groups": 2, "symbolic": False, }, ], [(np.float32, np.float32), (np.float16, np.float16)], ), ) def test_builder_to_backend_stress( self, compute_unit, backend, conv_dim, config, x_weight_dtype, ): if ( backend.backend == 'neuralnetwork' and conv_dim == "conv2d" and config == { "padding": (1, 1, 1), "DHWKdKhKw": (5, 5, 5, 2, 2, 2), "stride": (2, 2, 2), "dilation": (2, 1, 1), "has_bias": True, "groups": 1, "symbolic": True, } ): pytest.xfail( "rdar://129121584: NN Conv Fail when Run Multiple Faulty Models at Same Time" ) padding = config["padding"] DHWKdKhKw = config["DHWKdKhKw"] stride = config["stride"] dilation = config["dilation"] has_bias = config["has_bias"] groups = config["groups"] symbolic = config["symbolic"] D, H, W, Kd, Kh, Kw = DHWKdKhKw N, C_in, C_out = 1, 1 * groups, 2 * groups x_dtype, 
weight_bias_dtype = x_weight_dtype x_builtin_dtype = types.numpy_type_to_builtin_type(x_dtype) isConv1d = conv_dim == "conv1d" isConv2d = conv_dim == "conv2d" if isConv1d: strides = [stride[0]] dilations = [dilation[0]] kernels = [Kh] m = nn.Conv1d( C_in, C_out, kernels, stride=strides, dilation=dilations, bias=has_bias, groups=groups, padding=padding[0], ) input_shape = [N, C_in, H] paddings = [padding[0], padding[0]] elif isConv2d: strides = [stride[0], stride[1]] dilations = [dilation[0], dilation[1]] kernels = [Kh, Kw] m = nn.Conv2d( C_in, C_out, kernels, stride=strides, dilation=dilations, bias=has_bias, groups=groups, padding=(padding[0], padding[1]), ) input_shape = [N, C_in, H, W] paddings = [padding[0], padding[0], padding[1], padding[1]] else: strides = [stride[0], stride[1], stride[2]] dilations = [dilation[0], dilation[1], dilation[2]] kernels = [Kd, Kh, Kw] m = nn.Conv3d( C_in, C_out, kernels, stride=strides, dilation=dilations, bias=has_bias, groups=groups, padding=padding, ) input_shape = [N, C_in, D, H, W] paddings = [ padding[0], padding[0], padding[1], padding[1], padding[2], padding[2], ] wts = m.state_dict() weight = wts["weight"].detach().numpy().astype(weight_bias_dtype) bias = wts["bias"].detach().numpy().astype(weight_bias_dtype) if has_bias else None # PyTorch and CoreML weight format is same # PyTorch weight format: C_out, C_in, H, W # MIL weight format: C_out, C_in, H, W input = random_gen(input_shape) input = torch.Tensor(input) output = m(input) output = output.detach().numpy().astype(x_dtype) input = input.detach().numpy().astype(x_dtype) output_shape = list(output.shape) if symbolic: # For symbolic input test # Make Batch Size and input channel as symbolic symbolic_batch_size = get_new_symbol() input_shape[0] = symbolic_batch_size output_shape[0] = symbolic_batch_size expected_output_types = tuple(output_shape[:]) + (x_builtin_dtype,) expected_outputs = [output] input_placeholders = {"x": mb.placeholder(shape=input_shape, dtype=x_builtin_dtype)} input_values = {"x": input} def build(x): arguments = { "x": x, "weight": weight, "pad": paddings, "pad_type": "custom", "strides": strides, "dilations": dilations, "groups": groups, } if has_bias: arguments["bias"] = bias return mb.conv(**arguments) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, atol=1e-3 if x_dtype == np.float16 and backend.backend == "neuralnetwork" else 1e-4, ) @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "conv_dim", "config", ] ), itertools.product( compute_units, backends, ["conv1d", "conv2d"], [ { "padding": (1, 1, 1), "DHWKdKhKw": (10, 12, 14, 3, 2, 4), "stride": (2, 1, 1), "dilation": (1, 1, 1), "has_bias": False, "groups": 1, "symbolic": False, }, { "padding": (2, 2, 2), "DHWKdKhKw": (10, 12, 14, 3, 2, 4), "stride": (2, 2, 2), "dilation": (2, 1, 1), "has_bias": False, "groups": 2, "symbolic": True, }, { "padding": (1, 1, 1), "DHWKdKhKw": (5, 5, 5, 2, 2, 2), "stride": (2, 2, 2), "dilation": (2, 1, 1), "has_bias": True, "groups": 1, "symbolic": True, }, { "padding": (2, 2, 2), "DHWKdKhKw": (5, 5, 5, 2, 2, 2), "stride": (2, 1, 1), "dilation": (1, 1, 1), "has_bias": True, "groups": 2, "symbolic": False, }, ], ), ) def test_builder_to_backend_stress_weights_input( self, compute_unit, backend, conv_dim, config, ): padding = config["padding"] DHWKdKhKw = config["DHWKdKhKw"] stride = config["stride"] has_bias = 
config["has_bias"] groups = config["groups"] symbolic = config["symbolic"] if backend.backend == "neuralnetwork" and groups > 1: pytest.skip( "dynamic conv with groups > 1 is not supported on the neuralnetwork backend" ) if backend.backend == "mlprogram" and compute_unit != ct.ComputeUnit.CPU_ONLY: pytest.xfail( "rdar://97398343 (test_builder_to_backend_stress_weights_input is failing on mlprogram + GPU)" ) D, H, W, Kd, Kh, Kw = DHWKdKhKw N, C_in, C_out = 1, 1 * groups, 2 * groups isConv1d = conv_dim == "conv1d" isConv2d = conv_dim == "conv2d" if isConv1d: strides = [stride[0]] kernels = [Kh] m = nn.Conv1d( C_in, C_out, kernels, stride=strides, bias=has_bias, groups=groups, padding=padding[0], ) input_shape = [N, C_in, H] paddings = [padding[0], padding[0]] elif isConv2d: strides = [stride[0], stride[1]] kernels = [Kh, Kw] m = nn.Conv2d( C_in, C_out, kernels, stride=strides, groups=groups, padding=(padding[0], padding[1]), bias=has_bias, ) input_shape = [N, C_in, H, W] paddings = [padding[0], padding[0], padding[1], padding[1]] wts = m.state_dict() weight = wts["weight"].detach().numpy() bias = wts["bias"].detach().numpy() if has_bias else None # PyTorch and CoreML weight format is same # PyTorch weight format: C_out, C_in, H, W # MIL weight format: C_out, C_in, H, W input = torch.randn(*input_shape) output = m(input) output = output.detach().numpy() input = input.detach().numpy() output_shape = list(output.shape) if symbolic: # For symbolic input test # Make Batch Size and input channel as symbolic symbolic_batch_size = get_new_symbol() input_shape[0] = symbolic_batch_size output_shape[0] = symbolic_batch_size expected_output_types = tuple(output_shape[:]) + (types.fp32,) expected_outputs = [output] input_placeholders = { "x": mb.placeholder(shape=input_shape), "input_weight": mb.placeholder(shape=weight.shape), } input_values = {"x": input, "input_weight": weight} def build(x, input_weight): arguments = { "x": x, "weight": input_weight, "pad": paddings, "pad_type": "custom", "strides": strides, "groups": groups, } if has_bias: arguments["bias"] = bias return mb.conv(**arguments) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_conv_bias_fusion(self, compute_unit, backend): """ Test conv bias fusion when const input. 
Input graph: Const | V input -----> convolution -----> add/sub ---> out Output graph: input -----> convolution -----> out """ weight = np.array([2.5], dtype=np.float32).reshape([1, 1, 1, 1]) def build(x): x = mb.conv(x=x, weight=weight) bias = mb.const(val=[10.0]) return mb.add(x=x, y=bias) input = np.array([1, 2, 3, 4], dtype=np.float32).reshape((1, 1, 2, 2)) output = np.array([12.5, 15.0, 17.5, 20.0], dtype=np.float32).reshape((1, 1, 2, 2)) expected_output_types = output.shape + (types.fp32,) expected_outputs = [output] input_placeholders = {"x": mb.placeholder(shape=input.shape)} input_values = {"x": input} run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestInvalidConvConfig: @pytest.mark.parametrize( "compute_unit, backend, conv_dim", itertools.product( compute_units, backends, (1, 2, 3), ), ) def test_invalid_weight(self, compute_unit, backend, conv_dim): N, C_in, C_out = tuple(np.random.randint(low=16, high=32, size=3)) D = tuple(np.random.randint(low=8, high=16, size=conv_dim)) K = tuple(np.random.randint(low=1, high=4, size=conv_dim)) input_shape = (N, C_in) + D x = np.random.rand(*input_shape) groups = np.random.randint(low=1, high=C_in + 1) while C_in % groups != 0: groups = np.random.randint(low=1, high=C_in + 1) weight = ( np.random.rand(C_out, C_in // groups + +np.random.randint(low=1, high=8), *K) * 2.0 - 1.0 ) def build(x): return mb.conv(x=x, weight=weight, groups=groups) with pytest.raises( ValueError, match=r"C_in / groups = [0-9]+/[0-9]+ != weight\[1\] \([0-9]+\)" ): run_compare_builder( build, {"x": mb.placeholder(shape=input_shape)}, {"x": x}, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, conv_dim", itertools.product( compute_units, backends, (1, 2, 3), ), ) def test_invalid_bias(self, compute_unit, backend, conv_dim): N, C_in, C_out = tuple(np.random.randint(low=1, high=10, size=3)) D = tuple(np.random.randint(low=8, high=16, size=conv_dim)) K = tuple(np.random.randint(low=1, high=4, size=conv_dim)) input_shape = (N, C_in) + D x = np.random.rand(*input_shape) weight = np.random.rand(C_out, C_in, *K) * 2.0 - 1.0 wrong_bias_size = C_out + np.random.randint(low=1, high=8) bias = np.random.rand(wrong_bias_size) * 2.0 - 1.0 def build(x): return mb.conv(x=x, weight=weight, bias=bias) with pytest.raises( ValueError, match=r"# of bias values [0-9]+ not equal to # output channels [0-9]+" ): run_compare_builder( build, {"x": mb.placeholder(shape=input_shape)}, {"x": x}, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, conv_dim", itertools.product( compute_units, backends, (1, 2, 3), ), ) def test_invalid_kernel(self, compute_unit, backend, conv_dim): N, C_in, C_out = tuple(np.random.randint(low=1, high=10, size=3)) D = tuple(np.random.randint(low=8, high=16, size=conv_dim)) K = tuple(np.random.randint(low=16, high=32, size=conv_dim)) input_shape = (N, C_in) + D x = np.random.rand(*input_shape) weight = np.random.rand(C_out, C_in, *K) * 2.0 - 1.0 def build(x): return mb.conv(x=x, weight=weight) with pytest.raises( ValueError, match=r"spatial dimension [0-9]+ has invalid output size -?[0-9]+" ): run_compare_builder( build, {"x": mb.placeholder(shape=input_shape)}, {"x": x}, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, conv_dim", itertools.product( compute_units, backends, (1, 2, 3), ), ) def test_invalid_dilation(self, 
compute_unit, backend, conv_dim): N, C_in, C_out = tuple(np.random.randint(low=1, high=10, size=3)) D = tuple(np.random.randint(low=8, high=16, size=conv_dim)) K = tuple(np.random.randint(low=2, high=4, size=conv_dim)) dilations = tuple(np.random.randint(low=16, high=32, size=conv_dim)) input_shape = (N, C_in) + D x = np.random.rand(*input_shape) weight = np.random.rand(C_out, C_in, *K) * 2.0 - 1.0 def build(x): return mb.conv(x=x, weight=weight, dilations=dilations) with pytest.raises( ValueError, match=r"spatial dimension [0-9]+ has invalid output size -?[0-9]+" ): run_compare_builder( build, {"x": mb.placeholder(shape=input_shape)}, {"x": x}, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, conv_dim", itertools.product( compute_units, backends, (1, 2, 3), ), ) def test_invalid_groups(self, compute_unit, backend, conv_dim): N, C_in, C_out = tuple(np.random.randint(low=16, high=32, size=3)) D = tuple(np.random.randint(low=8, high=16, size=conv_dim)) K = tuple(np.random.randint(low=1, high=4, size=conv_dim)) input_shape = (N, C_in) + D x = np.random.rand(*input_shape) groups = np.random.randint(low=1, high=C_in) while C_in % groups == 0: groups = np.random.randint(low=1, high=C_in) weight = np.random.rand(C_out, C_in // groups, *K) * 2.0 - 1.0 def build(x): return mb.conv(x=x, weight=weight, groups=groups) with pytest.raises( ValueError, match=r"# of input channels [0-9]+ not divisible by groups [0-9]+" ): run_compare_builder( build, {"x": mb.placeholder(shape=input_shape)}, {"x": x}, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, conv_dim", itertools.product( compute_units, backends, (1, 2, 3), ), ) def test_invalid_rank(self, compute_unit, backend, conv_dim): N, C_in, C_out = tuple(np.random.randint(low=16, high=32, size=3)) D = tuple(np.random.randint(low=8, high=16, size=conv_dim)) input_shape = (N, C_in) + D x = np.random.rand(*input_shape) wrong_K = tuple(np.random.randint(low=1, high=4, size=conv_dim - 1)) weight = np.random.rand(C_out, C_in, *wrong_K) * 2.0 - 1.0 strides = tuple(np.random.randint(low=1, high=4, size=conv_dim + 1)) dilations = tuple(np.random.randint(low=1, high=4, size=conv_dim + 2)) pad = tuple(np.random.randint(low=1, high=4, size=2 * conv_dim + 3)) def build(x): return mb.conv( x=x, weight=weight, strides=strides, dilations=dilations, pad_type="custom", pad=pad ) with pytest.raises( ValueError, match=r"input_shape \(length [0-9]+\), " r"kernel_shape \(length [0-9]+\), " r"strides \(length [0-9]+\), " r"dilations \(length [0-9]+\), " r"and custom_pad \(length [0-9]+\) divided by two " r"must all be the same length", ): run_compare_builder( build, {"x": mb.placeholder(shape=input_shape)}, {"x": x}, compute_unit=compute_unit, backend=backend, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS14/test_elementwise_binary.py0000644000000000000000000005123614672066616030704 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import get_new_symbol, types from coremltools.converters.mil.mil.ops.tests.iOS14 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import run_compare_builder from coremltools.converters.mil.testing_reqs import compute_units from coremltools.converters.mil.testing_utils import ssa_fn class TestElementwiseBinary: # All in this test share the same backends @pytest.mark.parametrize( "compute_unit, backend, mode", itertools.product( compute_units, backends, [ "add", "floor_div", "maximum", "minimum", "mod", "mul", "pow", "real_div", "sub", ], ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, mode): if mode == "add": x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array([[0, 4, 0], [8, 0, 12]], dtype=np.float32) build = lambda x, y: mb.add(x=x, y=y) elif mode == "floor_div": x = np.array([[10, 20, 30], [40, 50, 60]], dtype=np.float32) y = np.array([[11, 12, 13], [14, 15, 16]], dtype=np.float32) expected_outputs = np.array([[0, 1, 2], [2, 3, 3]], dtype=np.float32) build = lambda x, y: mb.floor_div(x=x, y=y) elif mode == "maximum": x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) build = lambda x, y: mb.maximum(x=x, y=y) elif mode == "minimum": x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) build = lambda x, y: mb.minimum(x=x, y=y) elif mode == "mod": x = np.array([[10, 20, 30], [40, 50, 60]], dtype=np.float32) y = np.array([[11, 12, 13], [14, 15, 16]], dtype=np.float32) expected_outputs = np.array([[10, 8, 4], [12, 5, 12]], dtype=np.float32) build = lambda x, y: mb.mod(x=x, y=y) elif mode == "mul": x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array([[-1, 4, -9], [16, -25, 36]], dtype=np.float32) build = lambda x, y: mb.mul(x=x, y=y) elif mode == "pow": x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array([[1, 4, 0.037], [256, 0.00032, 46656]], dtype=np.float32) build = lambda x, y: mb.pow(x=x, y=y) elif mode == "real_div": x = np.array([[10, 20, 30], [40, 50, 60]], dtype=np.float32) y = np.array([[11, 12, 13], [14, 15, 16]], dtype=np.float32) expected_outputs = np.array( [[0.90909091, 1.66666667, 2.30769231], [2.85714286, 3.33333333, 3.75]], dtype=np.float32, ) build = lambda x, y: mb.real_div(x=x, y=y) elif mode == "sub": x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array([[2, 0, 6], [0, 10, 0]], dtype=np.float32) build = lambda x, y: mb.sub(x=x, y=y) expected_output_types = (2, 3, types.fp32) input_placeholders = { "x": mb.placeholder(shape=x.shape), "y": mb.placeholder(shape=y.shape), } input_values = {"x": x, "y": y} run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, 
compute_unit=compute_unit, backend=backend, ) def test_output_dim_for_same_symbolic_dim_inputs(self): symbolic_input_shape = (get_new_symbol(), 4, 5) @mb.program( input_specs=[ mb.TensorSpec(shape=symbolic_input_shape), mb.TensorSpec(shape=symbolic_input_shape), ] ) def prog(x, y): return mb.add(x=x, y=y) add_op = prog.find_ops(op_type="add")[0] output_shape = add_op.outputs[0].shape if output_shape != symbolic_input_shape: raise AssertionError( "Invalid Output shape {}. Should instead be {}".format( output_shape, symbolic_input_shape ) ) @ssa_fn def test_builder_add(self): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array([[0, 4, 0], [8, 0, 12]], dtype=np.float32) v = mb.add(x=x, y=y) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_floor_div(self): x = np.array([[10, 20, 30], [40, 50, 60]], dtype=np.float32) y = np.array([[11, 12, 13], [14, 15, 16]], dtype=np.float32) expected_outputs = np.array([[0, 1, 2], [2, 3, 3]], dtype=np.float32) v = mb.floor_div(x=x, y=y) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_maximum(self): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) v = mb.maximum(x=x, y=y) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_minimum(self): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) v = mb.minimum(x=x, y=y) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_mod(self): x = np.array([[10, 20, 30], [40, 50, 60]], dtype=np.float32) y = np.array([[11, 12, 13], [14, 15, 16]], dtype=np.float32) expected_outputs = np.array([[10, 8, 4], [12, 5, 12]], dtype=np.float32) v = mb.mod(x=x, y=y) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_mul(self): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array([[-1, 4, -9], [16, -25, 36]], dtype=np.float32) v = mb.mul(x=x, y=y) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_pow(self): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array([[1, 4, 0.037], [256, 0.00032, 46656]], dtype=np.float32) v = mb.pow(x=x, y=y) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_real_div(self): x = np.array([[10, 20, 30], [40, 50, 60]], dtype=np.float32) y = np.array([[11, 12, 13], [14, 15, 16]], dtype=np.float32) expected_outputs = np.array( [[0.90909091, 1.66666667, 2.30769231], [2.85714286, 3.33333333, 3.75]], dtype=np.float32, ) v = mb.real_div(x=x, y=y) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_real_div_both_ints(self): x = np.array([5], dtype=np.int32) y = np.array([2], dtype=np.int32) expected_outputs = np.array([2], dtype=np.int32) v = mb.real_div(x=x, y=y) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) assert isinstance(v.val[0], (float, np.int32)) # make sure the 
dtype is float assert types.is_int(v.dtype) # make sure the symbolic type matches the value type assert v._sym_type.get_primitive() == v._sym_val.get_primitive() @ssa_fn def test_builder_sub(self): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array([[2, 0, 6], [0, 10, 0]], dtype=np.float32) v = mb.sub(x=x, y=y) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_real_div_int_builder_to_backend(self, compute_unit, backend): """ For the neuralnetwork backend, the real_div is producing float output even for int inputs, while the mlprogram backend produces int type output. """ x = np.array([[10, 20, 30], [40, 50, 60]], dtype=np.float32) y = np.array([[11, 12, 13], [14, 15, 16]], dtype=np.float32) if backend.backend == "neuralnetwork": dtype = np.float32 else: dtype = np.int32 expected_outputs = np.array(x / y, dtype=dtype) build = lambda x, y: mb.real_div(x=x, y=y) expected_output_types = (2, 3, types.int32) input_placeholders = { "x": mb.placeholder(shape=x.shape, dtype=types.int32), "y": mb.placeholder(shape=y.shape, dtype=types.int32), } input_values = {"x": x, "y": y} run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestEqual: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) input_placeholders = { "x": mb.placeholder(shape=x.shape), "y": mb.placeholder(shape=y.shape), } input_values = {"x": x, "y": y} def build(x, y): return mb.equal(x=x, y=y), mb.equal(x=-3.0, y=y) expected_output_types = [ (2, 3, types.bool), (2, 3, types.bool), ] expected_outputs = [ np.array([[0, 1, 0], [1, 0, 1]], dtype=bool), np.array([[0, 0, 1], [0, 0, 0]], dtype=bool), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y_val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array([[0, 1, 0], [1, 0, 1]], dtype=bool) v = mb.equal(x=x_val, y=y_val) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) class TestGreater: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) input_placeholders = { "x": mb.placeholder(shape=x.shape), "y": mb.placeholder(shape=y.shape), } input_values = {"x": x, "y": y} def build(x, y): return mb.greater(x=x, y=y), mb.greater(x=x, y=3.5) expected_output_types = [ (2, 3, types.bool), (2, 3, types.bool), ] expected_outputs = [ np.array([[1, 0, 1], [0, 1, 0]], dtype=bool), np.array([[0, 0, 0], [1, 1, 1]], dtype=bool), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y_val = np.array([[-1, 2, 
-3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array([[1, 0, 1], [0, 1, 0]], dtype=bool) v = mb.greater(x=x_val, y=y_val) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) class TestGreaterEqual: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) input_placeholders = { "x": mb.placeholder(shape=x.shape), "y": mb.placeholder(shape=y.shape), } input_values = {"x": x, "y": y} def build(x, y): return mb.greater_equal(x=x, y=y), mb.greater_equal(x=x, y=3.5) expected_output_types = [ (2, 3, types.bool), (2, 3, types.bool), ] expected_outputs = [ np.array([[1, 1, 1], [1, 1, 1]], dtype=bool), np.array([[0, 0, 0], [1, 1, 1]], dtype=bool), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y_val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array([[1, 1, 1], [1, 1, 1]], dtype=bool) v = mb.greater_equal(x=x_val, y=y_val) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) class TestLess: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) input_placeholders = { "x": mb.placeholder(shape=x.shape), "y": mb.placeholder(shape=y.shape), } input_values = {"x": x, "y": y} def build(x, y): return mb.less(x=x, y=y) expected_output_types = (2, 3, types.bool) expected_outputs = np.array([[0, 0, 0], [0, 0, 0]], dtype=bool) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke2(self, compute_unit, backend): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=x.shape)} input_values = {"x": x} def build(x): # y is const y = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) return mb.less(x=x, y=y) expected_output_types = (2, 3, types.bool) expected_outputs = np.array([[0, 0, 0], [0, 0, 0]], dtype=bool) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_broadcast(self, compute_unit, backend): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=x.shape)} input_values = {"x": x} def build(x): # y is const return mb.less(x=x, y=3.5) expected_output_types = (2, 3, types.bool) expected_outputs = np.array([[1, 1, 1], [0, 0, 0]], dtype=bool) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y_val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) 
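        # Explanatory note (not in the original source): with constant inputs the op is
        # value-inferred at build time, so v.val below holds the computed boolean mask
        # directly. It should match np.less(x_val, y_val), which is all False here because
        # every element of x_val is >= its counterpart in y_val.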
expected_outputs = np.array([[0, 0, 0], [0, 0, 0]], dtype=bool) v = mb.less(x=x_val, y=y_val) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) class TestLessEqual: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) input_placeholders = { "x": mb.placeholder(shape=x.shape), "y": mb.placeholder(shape=y.shape), } input_values = {"x": x, "y": y} def build(x, y): return mb.less_equal(x=x, y=y) expected_output_types = (2, 3, types.bool) expected_outputs = np.array([[0, 1, 0], [1, 0, 1]], dtype=bool) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y_val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array([[0, 1, 0], [1, 0, 1]], dtype=bool) v = mb.less_equal(x=x_val, y=y_val) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) class TestNotEqual: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) input_placeholders = { "x": mb.placeholder(shape=x.shape), "y": mb.placeholder(shape=y.shape), } input_values = {"x": x, "y": y} def build(x, y): return mb.not_equal(x=x, y=y) expected_output_types = (2, 3, types.bool) expected_outputs = np.array([[1, 0, 1], [0, 1, 0]], dtype=bool) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) y_val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array([[1, 0, 1], [0, 1, 0]], dtype=bool) v = mb.not_equal(x=x_val, y=y_val) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS14/test_elementwise_unary.py0000644000000000000000000005637214672066616030564 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest import scipy from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import get_new_symbol, types from coremltools.converters.mil.mil.ops.tests.iOS14 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import run_compare_builder from coremltools.converters.mil.testing_reqs import compute_units from coremltools.converters.mil.testing_utils import ssa_fn class TestElementwiseUnary: # All ops in this test share the same backends @pytest.mark.parametrize( "compute_unit, backend, mode", itertools.product( compute_units, backends, [ "abs", "acos", "asin", "atan", "atanh", "cast", "clip", "cos", "cosh", "erf", "exp", "exp2", "floor", "inverse", "log", "round", "rsqrt", "sign", "sin", "sinh", "sqrt", "square", "tan", "tanh", "threshold", ], ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, mode): if mode == "abs": val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) build = lambda x: mb.abs(x=x) elif mode == "acos": val = np.array([[-1, -0.5, 0], [0.4, 0.5, 0.8]], dtype=np.float32) expected_outputs = np.array( [ [3.14159265, 2.0943951, 1.57079633], [1.15927948, 1.04719755, 0.64350111], ], dtype=np.float32, ) build = lambda x: mb.acos(x=x) elif mode == "asin": val = np.array([[-1, -0.5, 0], [0.4, 0.5, 0.8]], dtype=np.float32) expected_outputs = np.array( [[-1.57079633, -0.52359878, 0.0], [0.41151685, 0.52359878, 0.92729522]], dtype=np.float32, ) build = lambda x: mb.asin(x=x) elif mode == "atan": val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array( [ [-0.78539816, 1.10714872, -1.24904577], [1.32581766, -1.37340077, 1.40564765], ], dtype=np.float32, ) build = lambda x: mb.atan(x=x) elif mode == "atanh": val = np.array([[-0.8, -0.5, 0], [0.4, 0.5, 0.8]], dtype=np.float32) expected_outputs = np.array( [[-1.09861229, -0.54930614, 0.0], [0.42364893, 0.54930614, 1.09861229]], dtype=np.float32, ) build = lambda x: mb.atanh(x=x) elif mode == "cast": val = np.array([[-1.2, 2, -3.6], [4.5, -5, 6.7]], dtype=np.float32) expected_outputs = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.int32) build = lambda x: mb.cast(x=x, dtype="int32") elif mode == "ceil": val = np.array([[-1.2, 2, -3.4], [4.5, -5, 6.7]], dtype=np.float32) expected_outputs = np.array([[-1, 2, -3], [5, -5, 7]], dtype=np.float32) build = lambda x: mb.ceil(x=x) elif mode == "clip": val = np.array([[-1.2, 2, -3.4], [4.5, -5, 6.7]], dtype=np.float32) expected_outputs = np.array([[0, 2, 0], [4.5, 0, 5]], dtype=np.float32) build = lambda x: mb.clip(x=x, alpha=0.0, beta=5.0) elif mode == "cos": val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array( [ [0.54030231, -0.41614684, -0.9899925], [-0.65364362, 0.28366219, 0.96017029], ], dtype=np.float32, ) build = lambda x: mb.cos(x=x) elif mode == "cosh": val = np.array([[-1, -2, -3], [1, 2, 3]], dtype=np.float32) expected_outputs = np.array( [ [1.54308063, 3.76219569, 10.067662], [1.54308063, 3.76219569, 10.067662], ], dtype=np.float32, ) build = lambda x: mb.cosh(x=x) elif mode == "erf": val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array( [ [-0.8427007929497148, 0.9953222650189527, -0.9999779095030014], [0.9999999845827421, 
-0.9999999999984626, 1.0], ], dtype=np.float32, ) build = lambda x: mb.erf(x=x) elif mode == "exp": val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array( [ [0.36787944, 7.3890561, 0.04978707], [54.5981500, 0.0067379, 403.428793], ], dtype=np.float32, ) build = lambda x: mb.exp(x=x) elif mode == "exp2": val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array([[0.5, 4.0, 0.125], [16, 0.03125, 64]], dtype=np.float32) build = lambda x: mb.exp2(x=x) elif mode == "floor": val = np.array([[-1.2, 2, -3.4], [4.5, -5, 6.7]], dtype=np.float32) expected_outputs = np.array([[-2, 2, -4], [4, -5, 6]], dtype=np.float32) build = lambda x: mb.floor(x=x) elif mode == "inverse": val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array( [[-1.0, 0.5, -0.33333334], [0.25, -0.2, 0.16666667]], dtype=np.float32 ) build = lambda x: mb.inverse(x=x) elif mode == "log": val = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) expected_outputs = np.array( [[0.0, 0.69314718, 1.09861229], [1.38629436, 1.60943791, 1.79175947]], dtype=np.float32, ) build = lambda x: mb.log(x=x) elif mode == "round": val = np.array([[-1.2, 2, -3.4], [4.6, -5, 6.7]], dtype=np.float32) expected_outputs = np.array([[-1, 2, -3], [5, -5, 7]], dtype=np.float32) build = lambda x: mb.round(x=x) elif mode == "rsqrt": val = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) expected_outputs = np.array( [[1.0, 0.70710678, 0.57735027], [0.5, 0.4472136, 0.40824829]], dtype=np.float32, ) build = lambda x: mb.rsqrt(x=x) elif mode == "sign": val = np.array([[-1, 2, 0], [0, -5, 6]], dtype=np.float32) expected_outputs = np.array([[-1, 1, 0], [0, -1, 1]], dtype=np.float32) build = lambda x: mb.sign(x=x) elif mode == "sin": val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array( [ [-0.84147098, 0.90929743, -0.14112001], [-0.7568025, 0.95892427, -0.2794155], ], dtype=np.float32, ) build = lambda x: mb.sin(x=x) elif mode == "sinh": val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array( [[-1.1752, 3.62686, -10.017874], [27.289917, -74.20321, 201.71315]], dtype=np.float32, ) build = lambda x: mb.sinh(x=x) elif mode == "sqrt": val = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) expected_outputs = np.array( [[1.0, 1.41421356, 1.73205081], [2.0, 2.23606798, 2.44948974]], dtype=np.float32, ) build = lambda x: mb.sqrt(x=x) elif mode == "square": val = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) expected_outputs = np.array( [[1.0, 4.0, 9.0], [16.0, 25.0, 36.0]], dtype=np.float32, ) build = lambda x: mb.square(x=x) elif mode == "tan": val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array( [[-1.5574, -2.185, 0.1425], [1.15782, 3.3805, -0.291]], dtype=np.float32 ) build = lambda x: mb.tan(x=x) elif mode == "tanh": val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) expected_outputs = np.array( [ [-0.7615942, 0.9640276, -0.9950548], [0.9993293, -0.9999092, 0.9999877], ], dtype=np.float32, ) build = lambda x: mb.tanh(x=x) elif mode == "threshold": val = np.array([[-1.2, 2, -3.4], [4.5, -5, 6.7]], dtype=np.float32) expected_outputs = np.array([[1.0, 2, 1.0], [4.5, 1.0, 6.7]], dtype=np.float32) build = lambda x: mb.threshold(x=x, alpha=1.0) input_placeholders = {"x": mb.placeholder(shape=val.shape)} input_values = {"x": val} expected_output_types = (2, 3, types.int32) if mode == "cast" else (2, 3, types.fp32) run_compare_builder( build, input_placeholders, 
input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_abs_eval(self): val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) v = mb.abs(x=val) expected_outputs = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_acos_eval(self): val = np.array([[-1, -0.5, 0], [0.4, 0.5, 0.8]], dtype=np.float32) v = mb.acos(x=val) expected_outputs = np.array( [[3.14159265, 2.0943951, 1.57079633], [1.15927948, 1.04719755, 0.64350111]], dtype=np.float32, ) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_asin_eval(self): val = np.array([[-1, -0.5, 0], [0.4, 0.5, 0.8]], dtype=np.float32) v = mb.asin(x=val) expected_outputs = np.array( [[-1.57079633, -0.52359878, 0.0], [0.41151685, 0.52359878, 0.92729522]], dtype=np.float32, ) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_atan_eval(self): val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) v = mb.atan(x=val) expected_outputs = np.array( [ [-0.78539816, 1.10714872, -1.24904577], [1.32581766, -1.37340077, 1.40564765], ], dtype=np.float32, ) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_atanh_eval(self): val = np.array([[-0.8, -0.5, 0], [0.4, 0.5, 0.8]], dtype=np.float32) v = mb.atanh(x=val) expected_outputs = np.array( [[-1.09861229, -0.54930614, 0.0], [0.42364893, 0.54930614, 1.09861229]], dtype=np.float32, ) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_cast_eval(self): val = np.array([[-1.2, 2, -3.4], [4.5, -5, 6.7]], dtype=np.float32) expected_outputs = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.int32) v = mb.cast(x=val, dtype="int32") np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_ceil_eval(self): val = np.array([[-1.2, 2, -3.4], [4.5, -5, 6.7]], dtype=np.float32) v = mb.ceil(x=val) expected_outputs = np.array([[-1, 2, -3], [5, -5, 7]], dtype=np.float32) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_clip_eval(self): val = np.array([[-1.2, 2, -3.4], [4.5, -5, 6.7]], dtype=np.float32) v = mb.clip(x=val, alpha=0.0, beta=5.0) expected_outputs = np.array([[0, 2, 0], [4.5, 0, 5]], dtype=np.float32) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_cos_eval(self): val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) v = mb.cos(x=val) expected_outputs = np.array( [ [0.54030231, -0.41614684, -0.9899925], [-0.65364362, 0.28366219, 0.96017029], ], dtype=np.float32, ) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_cosh_eval(self): val = np.array([[-1, -2, -3], [1, 2, 3]], dtype=np.float32) v = mb.cosh(x=val) expected_outputs = np.array( [[1.54308063, 3.76219569, 10.067662], [1.54308063, 3.76219569, 10.067662]], dtype=np.float32, ) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_erf_eval(self): x_val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) v = mb.erf(x=x_val) np.testing.assert_allclose(scipy.special.erf(x_val), v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_exp_eval(self): val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) v = mb.exp(x=val) expected_outputs = np.array( 
[[0.36787944, 7.3890561, 0.04978707], [54.5981500, 0.0067379, 403.428793]], dtype=np.float32, ) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_exp2_eval(self): val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) v = mb.exp2(x=val) expected_outputs = np.array([[0.5, 4.0, 0.125], [16, 0.03125, 64]], dtype=np.float32) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_floor_eval(self): val = np.array([[-1.2, 2, -3.4], [4.5, -5, 6.7]], dtype=np.float32) v = mb.floor(x=val) expected_outputs = np.array([[-2, 2, -4], [4, -5, 6]], dtype=np.float32) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_inverse_eval(self): val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) v = mb.inverse(x=val) expected_outputs = np.array( [[-1.0, 0.5, -0.33333334], [0.25, -0.2, 0.16666667]], dtype=np.float32 ) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_log_eval(self): val = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) v = mb.log(x=val) expected_outputs = np.array( [[0.0, 0.69314718, 1.09861229], [1.38629436, 1.60943791, 1.79175947]], dtype=np.float32, ) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_round_eval(self): val = np.array([[-1.2, 2, -3.4], [4.6, -5, 6.7]], dtype=np.float32) v = mb.round(x=val) expected_outputs = np.array([[-1, 2, -3], [5, -5, 7]], dtype=np.float32) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_rsqrt_eval(self): val = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) v = mb.rsqrt(x=val) expected_outputs = np.array( [[1.0, 0.70710678, 0.57735027], [0.5, 0.4472136, 0.40824829]], dtype=np.float32, ) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_sign_eval(self): val = np.array([[-1, 2, 0], [0, -5, 6]], dtype=np.float32) v = mb.sign(x=val) expected_outputs = np.array([[-1, 1, 0], [0, -1, 1]], dtype=np.float32) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_sin_eval(self): val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) v = mb.sin(x=val) expected_outputs = np.array( [ [-0.84147098, 0.90929743, -0.14112001], [-0.7568025, 0.95892427, -0.2794155], ], dtype=np.float32, ) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_sinh_eval(self): val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) v = mb.sinh(x=val) expected_outputs = np.array( [[-1.1752, 3.62686, -10.017874], [27.289917, -74.20321, 201.71315]], dtype=np.float32, ) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_sqrt_eval(self): val = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) v = mb.sqrt(x=val) expected_outputs = np.array( [[1.0, 1.41421356, 1.73205081], [2.0, 2.23606798, 2.44948974]], dtype=np.float32, ) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_tan_eval(self): val = np.array([[-1, 2, -3], [4, -5, 6]], dtype=np.float32) v = mb.tan(x=val) expected_outputs = np.array( [[-1.5574, -2.185, 0.1425], [1.15782, 3.3805, -0.291]], dtype=np.float32 ) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_tanh_eval(self): x_val = np.array([[-1, 2, -3], [4, -5, 6]], 
dtype=np.float32) v = mb.tanh(x=x_val) np.testing.assert_allclose(np.tanh(x_val), v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_threshold_eval(self): val = np.array([[-1.2, 2, -3.4], [4.5, -5, 6.7]], dtype=np.float32) v = mb.threshold(x=val, alpha=1.0) expected_outputs = np.array([[1.0, 2, 1.0], [4.5, 1.0, 6.7]], dtype=np.float32) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) @pytest.mark.parametrize( "backend, dtype", itertools.product( backends, ["bool", "int32", "fp16", "fp32"], ), ) def test_cast_with_symbolic_value(self, backend, dtype): s1 = get_new_symbol() @mb.program( input_specs=[mb.TensorSpec(shape=(s1, 1))], opset_version=backend.opset_version, ) def prog(x): shape = mb.shape(x=x) out = mb.cast(x=shape, dtype=dtype) assert out.val is None sym_val = out.sym_val if dtype == "bool": assert sym_val.tolist() == [s1, True] elif dtype == "int32": assert sym_val.tolist() == [s1, 1] elif dtype == "fp16": assert sym_val.tolist() == [s1, np.float16(1.0)] else: assert dtype == "fp32" assert sym_val.tolist() == [s1, np.float32(1.0)] return out @staticmethod def _test_builder_to_backend_stress_with_epsilon( compute_unit, backend, op_name, epsilon_val, x_eps_dtype, ): x_dtype, epsilon_dtype = x_eps_dtype x = np.array([[1, 2, 3], [4, 5, 6]], dtype=x_dtype) epsilon = epsilon_dtype(epsilon_val) def _calculate_by_np(): if op_name == "inverse": return 1 / (x + epsilon) elif op_name == "log": return np.log(x + epsilon) elif op_name == "rsqrt": return 1.0 / np.sqrt(x + epsilon) else: raise ValueError(f"Invalid op {op_name}") def build(x): return getattr(mb, op_name)(x=x, epsilon=epsilon) x_mb_dtype = types.numpy_type_to_builtin_type(x_dtype) run_compare_builder( build, input_placeholders={"x": mb.placeholder(shape=x.shape, dtype=x_mb_dtype)}, input_values={"x": x}, expected_output_types=x.shape + (x_mb_dtype,), expected_outputs=_calculate_by_np(), compute_unit=compute_unit, backend=backend, atol=1e-2 if x_dtype == np.float16 else 1e-4, rtol=1e-3 if x_dtype == np.float16 else 1e-5, ) @pytest.mark.parametrize( "compute_unit, backend, op_name, epsilon_val, x_eps_dtype", itertools.product( compute_units, backends, ["inverse", "log", "rsqrt"], [1e-3, 1e-1, 1.0], [(np.float32, np.float32), (np.float16, np.float16)], ), ) def test_builder_to_backend_stress_with_epsilon( self, compute_unit, backend, op_name, epsilon_val, x_eps_dtype, ): self._test_builder_to_backend_stress_with_epsilon( compute_unit, backend, op_name, epsilon_val, x_eps_dtype ) @pytest.mark.parametrize( "compute_unit, backend, src_dst", itertools.product( compute_units, backends, [("fp16", "fp32"), ("fp32", "fp16")], ), ) def test_builder_to_backend_stress_cast(self, compute_unit, backend, src_dst): src_dtype, dst_dtype = src_dst x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) numpy_pred = x.astype(dtype=np.float16) input_placeholder_dict = {"x": mb.placeholder(shape=x.shape)} input_value_dict = {"x": x} def build(x): x = mb.cast(x=x, dtype=src_dtype) x = mb.square(x=x) x = mb.cast(x=x, dtype=dst_dtype) x = mb.sqrt(x=x) x = mb.cast(x=x, dtype="fp32") return x expected_output_type = x.shape + (types.fp32,) run_compare_builder( build, input_placeholder_dict, input_value_dict, expected_output_type, numpy_pred, compute_unit=compute_unit, backend=backend, ) def test_erf_value_inference(self): INPUT_SIZE = (2, 3, 4) rs = np.random.RandomState(1234) x = rs.random(INPUT_SIZE) @mb.program(input_specs=[]) def prog(): return mb.erf(x=x) ops = list(prog.functions.values())[0].operations ops = list(ops) 
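# Descriptive note (added comment): the program built above contains exactly two ops —
# a const op materializing the numpy input x, followed by the erf op — and the erf op's
# value_inference result is checked against scipy.special.erf below.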
assert len(ops) == 2 assert ops[0].op_type == "const" erf_op = ops[1] assert erf_op.op_type == "erf" np.testing.assert_allclose( erf_op.value_inference(), scipy.special.erf(x), atol=1e-04, rtol=1e-05 ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS14/test_image_resizing.py0000644000000000000000000004410214672066616030005 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import functools import itertools import numpy as np import pytest import coremltools as ct from coremltools._deps import _HAS_TORCH, MSG_TORCH_NOT_FOUND from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import get_new_symbol, types from coremltools.converters.mil.mil.ops.tests.iOS14 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import ( mark_api_breaking, run_compare_builder, ) from coremltools.converters.mil.testing_reqs import compute_units from coremltools.converters.mil.testing_utils import random_gen if _HAS_TORCH: import torch class TestResizeNearestNeighbor: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): x_val = np.array([0.37, 6.17], dtype=np.float32).reshape([1, 1, 2, 1]) input_placeholder_dict = {"x": mb.placeholder(shape=x_val.shape)} input_value_dict = {"x": x_val} def build_model(x): return [ mb.resize_nearest_neighbor( x=x, target_size_height=2, target_size_width=1, ), mb.resize_nearest_neighbor( x=x, target_size_height=2, target_size_width=3, ), ] expected_output_types = [ (1, 1, 2, 1, types.fp32), (1, 1, 2, 3, types.fp32), ] expected_outputs = [ x_val, np.array([0.37, 0.37, 0.37, 6.17, 6.17, 6.17], dtype=np.float32).reshape([1, 1, 2, 3]), ] run_compare_builder( build_model, input_placeholder_dict, input_value_dict, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestResizeBilinear: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): if backend.backend == "mlprogram": pytest.xfail( "Seg fault: rdar://78343191 ((MIL GPU) Core ML Tools Unit Test failures [failure to load or Seg fault])" ) if backend.backend == "neuralnetwork" and compute_unit == ct.ComputeUnit.CPU_ONLY: pytest.xfail( "rdar://85318710 (Coremltools Smoke test on ResizeBilinear failing on NNv1 backend.)" ) x = np.array([0, 1], dtype=np.float32).reshape(1, 1, 2) input_placeholder_dict = {"x": mb.placeholder(shape=x.shape)} input_value_dict = {"x": x} def build_mode_0(x): return mb.resize_bilinear( x=x, target_size_height=1, target_size_width=5, sampling_mode="STRICT_ALIGN_CORNERS", ) expected_output_type = (1, 1, 5, types.fp32) expected_output = np.array([0, 0.25, 0.5, 0.75, 1], dtype=np.float32).reshape(1, 1, 5) run_compare_builder( build_mode_0, input_placeholder_dict, input_value_dict, expected_output_type, expected_output, compute_unit=compute_unit, backend=backend, ) def build_mode_2(x): return mb.resize_bilinear( x=x, target_size_height=1, target_size_width=5, sampling_mode="DEFAULT" ) expected_output = np.array([0, 0.4, 0.8, 1, 1], dtype=np.float32).reshape(1, 1, 5) run_compare_builder( build_mode_2, 
input_placeholder_dict, input_value_dict, expected_output_type, expected_output, compute_unit=compute_unit, backend=backend, ) def build_mode_3(x): return mb.resize_bilinear( x=x, target_size_height=1, target_size_width=5, sampling_mode="OFFSET_CORNERS", ) expected_output = np.array([0.1, 0.3, 0.5, 0.7, 0.9], dtype=np.float32).reshape(1, 1, 5) run_compare_builder( build_mode_3, input_placeholder_dict, input_value_dict, expected_output_type, expected_output, compute_unit=compute_unit, backend=backend, ) class TestUpsampleBilinear: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x = np.array([0, 1], dtype=np.float32).reshape(1, 1, 2) input_placeholder_dict = {"x": mb.placeholder(shape=x.shape)} input_value_dict = {"x": x} def build_upsample_integer(x): return mb.upsample_bilinear( x=x, scale_factor_height=1, scale_factor_width=3, align_corners=True ) expected_output_type = (1, 1, 6, types.fp32) expected_output = np.array([0, 0.2, 0.4, 0.6, 0.8, 1], dtype=np.float32).reshape(1, 1, 6) run_compare_builder( build_upsample_integer, input_placeholder_dict, input_value_dict, expected_output_type, expected_output, compute_unit=compute_unit, backend=backend, ) def build_upsample_fractional(x): return mb.upsample_bilinear( x=x, scale_factor_height=1.0, scale_factor_width=2.6, align_corners=False ) expected_output_type = (1, 1, 5, types.fp32) expected_output = np.array([0, 0.1, 0.5, 0.9, 1], dtype=np.float32).reshape(1, 1, 5) run_compare_builder( build_upsample_fractional, input_placeholder_dict, input_value_dict, expected_output_type, expected_output, compute_unit=compute_unit, backend=backend, ) @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( "compute_unit, backend, input_shape, scale_factor, align_corners, recompute_scale_factor", itertools.product( compute_units, backends, [(2, 5, 10, 22)], [(3, 4), (2.5, 2.0), (0.5, 0.75)], [True, False], [True, False], ), ) def test_builder_to_backend_stress( self, compute_unit, backend, input_shape, scale_factor, align_corners, recompute_scale_factor, ): scale_factor_height, scale_factor_width = scale_factor _, _, height, width = input_shape height = height * scale_factor_height width = width * scale_factor_width is_h_float = height - np.floor(height) > 0.001 is_w_float = width - np.floor(width) > 0.001 # Currently, MIL is not supporting recompute_scale_factor=False + align_corners=False # with fractional output size if not recompute_scale_factor and not align_corners and (is_h_float or is_w_float): pytest.xfail("rdar://81124053 (Support recompute_scale_factor)") def _get_torch_upsample_prediction( x, scale_factor=(2, 2), align_corners=False, recompute_scale_factor=True ): x = torch.from_numpy(x) out = torch.nn.functional.interpolate( x, scale_factor=scale_factor, mode="bilinear", align_corners=align_corners, recompute_scale_factor=recompute_scale_factor, ) return out.numpy() x = random_gen(input_shape, rand_min=-100, rand_max=100) torch_pred = _get_torch_upsample_prediction( x, scale_factor=scale_factor, align_corners=align_corners, recompute_scale_factor=recompute_scale_factor, ) input_placeholder_dict = {"x": mb.placeholder(shape=x.shape)} input_value_dict = {"x": x} def build_upsample(x): return mb.upsample_bilinear( x=x, scale_factor_height=scale_factor[0], scale_factor_width=scale_factor[1], align_corners=align_corners, ) expected_output_type = torch_pred.shape + (types.fp32,) run_compare_builder( 
build_upsample, input_placeholder_dict, input_value_dict, expected_output_type, torch_pred, compute_unit=compute_unit, backend=backend, rtol=0.5, ) class TestUpsampleNearestNeighbor: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x = np.array([1.5, 2.5, 3.5], dtype=np.float32).reshape([1, 1, 1, 3]) input_placeholder_dict = {"x": mb.placeholder(shape=x.shape)} input_value_dict = {"x": x} def build(x): return mb.upsample_nearest_neighbor(x=x, scale_factor_height=1, scale_factor_width=2) expected_output_type = (1, 1, 1, 6, types.fp32) expected_output = np.array([1.5, 1.5, 2.5, 2.5, 3.5, 3.5], dtype=np.float32).reshape( [1, 1, 1, 6] ) run_compare_builder( build, input_placeholder_dict, input_value_dict, expected_output_type, expected_output, compute_unit=compute_unit, backend=backend, ) class TestCrop: @pytest.mark.parametrize( "compute_unit, backend, is_symbolic", itertools.product(compute_units, backends, compute_units), ) def test_builder_to_backend_smoke(self, compute_unit, backend, is_symbolic): x = np.array( [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]], dtype=np.float32, ).reshape(1, 1, 4, 4) input_shape = list(x.shape) placeholder_input_shape = input_shape if is_symbolic: # set batch and channel dimension symbolic placeholder_input_shape[0] = get_new_symbol() placeholder_input_shape[1] = get_new_symbol() input_placeholder_dict = {"x": mb.placeholder(shape=placeholder_input_shape)} input_value_dict = {"x": x} def build(x): return mb.crop(x=x, crop_height=[0, 1], crop_width=[1, 1]) expected_output_type = ( placeholder_input_shape[0], placeholder_input_shape[1], 3, 2, types.fp32, ) expected_output = np.array([2, 3, 6, 7, 10, 11], dtype=np.float32).reshape(1, 1, 3, 2) run_compare_builder( build, input_placeholder_dict, input_value_dict, expected_output_type, expected_output, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, C, H, W", itertools.product( compute_units, backends, [x for x in range(2, 4)], [x for x in range(5, 8)], [x for x in range(8, 10)], ), ) def test_builder_to_backend_stress(self, compute_unit, backend, C, H, W): input_shape = (1, C, H, W) x = np.random.random(input_shape) crop_h = [np.random.randint(H)] crop_h.append(np.random.randint(H - crop_h[0])) crop_w = [np.random.randint(W)] crop_w.append(np.random.randint(W - crop_w[0])) input_placeholder_dict = {"x": mb.placeholder(shape=input_shape)} input_value_dict = {"x": x} def build(x): return mb.crop(x=x, crop_height=crop_h, crop_width=crop_w) expected_output_type = ( 1, C, H - crop_h[0] - crop_h[1], W - crop_w[0] - crop_w[1], types.fp32, ) expected_output = x[:, :, crop_h[0] : H - crop_h[1], crop_w[0] : W - crop_w[1]] run_compare_builder( build, input_placeholder_dict, input_value_dict, expected_output_type, expected_output, compute_unit=compute_unit, backend=backend, ) class TestCropResize: @mark_api_breaking(breaking_opset_version=ct.target.iOS17) @pytest.mark.parametrize( "compute_unit, backend, is_symbolic, mode", itertools.product(compute_units, backends, [True, False], list(range(5))), ) def test_builder_to_backend_smoke(self, compute_unit, backend, is_symbolic, mode): if backend.backend == "mlprogram" and compute_unit != ct.ComputeUnit.CPU_ONLY: pytest.xfail("rdar://97398582 (TestCropResize failing on mlprogram + GPU)") if backend.backend == "mlprogram" and is_symbolic: pytest.xfail( "rdar://128585772 Crop_Resize Symbolic Shape 
Propagation Error if not classic CPU" ) x = np.array( [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]], dtype=np.float32, ).reshape(1, 1, 4, 4) input_shape = list(x.shape) placeholder_input_shape = input_shape if is_symbolic: # set batch and channel dimension symbolic placeholder_input_shape[0] = get_new_symbol() placeholder_input_shape[1] = get_new_symbol() input_placeholder_dict = {"x": mb.placeholder(shape=placeholder_input_shape)} input_value_dict = {"x": x} N = 1 roi = np.array([[1, 1, 2, 2]], dtype=np.float32).reshape(1, 1, 4, 1, 1) roi_normalized = np.array([[0, 0.0, 0.0, 1.0 / 3, 1.0 / 3]], dtype=np.float32).reshape( 1, 1, 5, 1, 1 ) roi_invert = np.array([[2, 2, 1, 1]], dtype=np.float32).reshape(1, 1, 4, 1, 1) def build(x, mode=0): if mode == 0: return mb.crop_resize( x=x, roi=roi, target_width=2, target_height=2, normalized_coordinates=False, box_coordinate_mode="CORNERS_HEIGHT_FIRST", sampling_mode="ALIGN_CORNERS", ) elif mode == 1: return mb.crop_resize( x=x, roi=roi, target_width=4, target_height=4, normalized_coordinates=False, box_coordinate_mode="CORNERS_HEIGHT_FIRST", sampling_mode="ALIGN_CORNERS", ) elif mode == 2: return mb.crop_resize( x=x, roi=roi, target_width=1, target_height=1, normalized_coordinates=False, box_coordinate_mode="CORNERS_HEIGHT_FIRST", sampling_mode="ALIGN_CORNERS", ) elif mode == 3: return mb.crop_resize( x=x, roi=roi_normalized, target_width=2, target_height=2, normalized_coordinates=True, box_coordinate_mode="CORNERS_HEIGHT_FIRST", sampling_mode="ALIGN_CORNERS", ) elif mode == 4: return mb.crop_resize( x=x, roi=roi_invert, target_width=2, target_height=2, normalized_coordinates=False, box_coordinate_mode="CORNERS_HEIGHT_FIRST", sampling_mode="ALIGN_CORNERS", ) expected_output_type = [ ( N, placeholder_input_shape[0], placeholder_input_shape[1], 2, 2, types.fp32, ), ( N, placeholder_input_shape[0], placeholder_input_shape[1], 4, 4, types.fp32, ), ( N, placeholder_input_shape[0], placeholder_input_shape[1], 1, 1, types.fp32, ), ( N, placeholder_input_shape[0], placeholder_input_shape[1], 2, 2, types.fp32, ), ( N, placeholder_input_shape[0], placeholder_input_shape[1], 2, 2, types.fp32, ), ] expected_output = [ np.array([6, 7, 10, 11], dtype=np.float32).reshape(1, 1, 1, 2, 2), np.array( [ [6, 6.333333, 6.66666, 7], [7.333333, 7.666666, 8, 8.333333], [8.666666, 9, 9.3333333, 9.666666], [10, 10.333333, 10.666666, 11], ], dtype=np.float32, ).reshape(1, 1, 1, 4, 4), np.array([8.5], dtype=np.float32).reshape(1, 1, 1, 1, 1), np.array([1, 2, 5, 6], dtype=np.float32).reshape(1, 1, 1, 2, 2), np.array([11, 10, 7, 6], dtype=np.float32).reshape(1, 1, 1, 2, 2), ] run_compare_builder( functools.partial(build, mode=mode), input_placeholder_dict, input_value_dict, expected_output_type[mode], expected_output[mode], compute_unit=compute_unit, backend=backend, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS14/test_linear.py0000644000000000000000000003600714672066616026270 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import platform import numpy as np import pytest import coremltools as ct from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import get_new_symbol, types from coremltools.converters.mil.mil.ops.tests.iOS14 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import run_compare_builder from coremltools.converters.mil.mil.types import builtin_to_string, nptype_from_builtin from coremltools.converters.mil.testing_reqs import compute_units from coremltools.converters.mil.testing_utils import random_gen, ssa_fn class TestLinear: @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x_val = np.array([[-4.7182, 11.94], [-3.3939, 9.2166]], dtype=np.float32) weight_val = np.array([[1.2313, -0.095], [-1.4075, -0.8816]], dtype=np.float32) bias_val = np.array([1.0, 2.0], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return [mb.linear(x=x, weight=weight_val, bias=bias_val)] expected_output_types = [(2, 2, types.fp32)] expected_outputs = [ np.array([[-5.9438195, -1.8854373], [-4.054486, -1.3484411]], dtype=np.float32) ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = random_gen(shape=(2, 2), rand_min=-37, rand_max=64) weight_val = random_gen(shape=(2, 2), rand_min=-91, rand_max=84) bias_val = random_gen(shape=(2,), rand_min=0.0, rand_max=9.0) v = mb.linear(x=x_val, weight=weight_val, bias=bias_val) np.testing.assert_allclose( np.matmul(x_val, weight_val.T) + bias_val, v.val, atol=1e-04, rtol=1e-05 ) @pytest.mark.parametrize( "compute_unit, backend, rank", itertools.product(compute_units, backends, [2, 3, 5]), ) def test_builder_to_backend_stress(self, compute_unit, backend, rank): if backend.backend == "mlprogram" and compute_unit != ct.ComputeUnit.CPU_ONLY: pytest.xfail("rdar://97398733 (TestLinear failing on mlprogram + GPU)") if ( backend.backend == "neuralnetwork" and compute_unit != ct.ComputeUnit.CPU_ONLY and platform.machine() == "arm64" and rank == 5 ): pytest.xfail( "rdar://98015195 ([M1 native tests] Some MIL unittests are failing on M1 native)" ) x_shape = np.random.randint(low=1, high=3, size=(rank,)) x_val = np.random.rand(*x_shape) out_channels = 3 w_shape = np.array([out_channels, x_shape[-1]]) weight_val = np.random.rand(*w_shape).astype(np.float32) bias_val = np.random.rand(out_channels).astype(np.float32) input_placeholders = { "x": mb.placeholder(shape=x_val.shape), } input_values = {"x": x_val} def build(x): return [mb.linear(x=x, weight=weight_val, bias=bias_val)] expected_outputs = [np.matmul(x_val, np.transpose(weight_val)) + bias_val] expected_output_types = [o.shape[:] + (types.fp32,) for o in expected_outputs] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, input_type", itertools.product(compute_units, backends, [types.int32, types.fp16, types.fp32]), ) def test_default_bias_type(self, compute_unit, backend, input_type): # Test the default bias matches 
the dtype of x and weight. @mb.program( input_specs=[mb.TensorSpec(shape=(1, 2), dtype=types.fp32)], opset_version=backend.opset_version, ) def prog(x): x = mb.cast(x=x, dtype=builtin_to_string(input_type)) weight = np.random.rand(3, 2).astype(nptype_from_builtin(input_type)) res = mb.linear(x=x, weight=weight) assert res.op.bias.val.dtype == nptype_from_builtin(input_type) return res class TestMatMul: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): x_val = np.array([[-4.0, 13.0], [-3.0, 9.0]], dtype=np.float32) y_val = np.array([[1.0, -7.0], [-1.0, -8.0]], dtype=np.float32) input_placeholders = { "x": mb.placeholder(shape=x_val.shape), "y": mb.placeholder(shape=y_val.shape), } input_values = {"x": x_val, "y": y_val} def build(x, y): return [ mb.matmul(x=x_val, y=y), mb.matmul(x=x, y=y_val), mb.matmul(x=x, y=y), mb.matmul(x=x, y=y, transpose_x=True, transpose_y=True), mb.matmul(x=x_val, y=y, transpose_x=True, transpose_y=True), mb.matmul(x=x, y=y_val, transpose_x=True, transpose_y=True), mb.matmul(x=x, y=y_val, transpose_x=True, transpose_y=False), mb.matmul(x=x, y=y_val, transpose_x=False, transpose_y=True), ] expected_output_types = [ (2, 2, types.fp32), (2, 2, types.fp32), (2, 2, types.fp32), (2, 2, types.fp32), (2, 2, types.fp32), (2, 2, types.fp32), (2, 2, types.fp32), (2, 2, types.fp32), ] expected_outputs = [ np.array([[-17.0, -76.0], [-12.0, -51.0]], dtype=np.float32), np.array([[-17.0, -76.0], [-12.0, -51.0]], dtype=np.float32), np.array([[-17.0, -76.0], [-12.0, -51.0]], dtype=np.float32), np.array([[17.0, 28.0], [-50.0, -85.0]], dtype=np.float32), np.array([[17.0, 28.0], [-50.0, -85.0]], dtype=np.float32), np.array([[17.0, 28.0], [-50.0, -85.0]], dtype=np.float32), np.array([[-1.0, 52.0], [4.0, -163.0]], dtype=np.float32), np.array([[-95.0, -100.0], [-66.0, -69.0]], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = random_gen(shape=(2, 2, 4), rand_min=-37, rand_max=64) y_val = random_gen(shape=(2, 4, 2), rand_min=-91, rand_max=84) v = mb.matmul(x=x_val, y=y_val) np.testing.assert_allclose(np.matmul(x_val, y_val), v.val, atol=1e-04, rtol=1e-05) @pytest.mark.parametrize( "compute_unit, backend, shapes", itertools.product( compute_units, backends, [ ((3, 2, 3, 4), (3, 2, 4, 5)), ((1, 1, 1, 3, 4), (1, 3, 2, 4, 5)), ((1, 3, 1, 2, 3), (1, 4, 3, 2)), ((1, 3, 4), (3, 2, 4, 6)), ((7, 4), (3, 9, 5, 4, 3)), ], ), ) def test_builder_to_backend_stress(self, compute_unit, backend, shapes): shape_x, shape_y = shapes x_val = np.random.rand(*shape_x) y_val = np.random.rand(*shape_y) input_placeholders = { "x": mb.placeholder(shape=x_val.shape), "y": mb.placeholder(shape=y_val.shape), } input_values = {"x": x_val, "y": y_val} def build(x, y): return [mb.matmul(x=x, y=y, transpose_x=False, transpose_y=False)] expected_outputs = [np.matmul(x_val, y_val)] expected_output_types = [o.shape[:] + (types.fp32,) for o in expected_outputs] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, shape_x", itertools.product( compute_units, backends, [ (5,), (2, 5), (2, 2, 5), (4, 3, 2, 5), (5, 4, 2, 3, 5), ], ), ) def test_builder_y_rank_2_const(self, compute_unit, backend, shape_x): x_val = 
np.random.rand(*shape_x) y_val = np.random.rand(5, 10) input_placeholders = { "x": mb.placeholder(shape=x_val.shape), } input_values = {"x": x_val} def build(x): return [mb.matmul(x=x, y=y_val, transpose_x=False, transpose_y=False)] expected_outputs = [np.matmul(x_val, y_val)] expected_output_types = [o.shape[:] + (types.fp32,) for o in expected_outputs] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_builder_transpose_y(self, compute_unit, backend): x_val = np.random.rand(3, 2, 7, 16) y_val = np.random.rand(3, 2, 5, 16) def build(x): return mb.matmul(x=x, y=y_val, transpose_x=False, transpose_y=True) expected_output = np.matmul(x_val, np.transpose(y_val, (0, 1, 3, 2))) run_compare_builder( build, input_placeholders={"x": mb.placeholder(shape=x_val.shape)}, input_values={"x": x_val}, expected_output_types=expected_output.shape + (types.fp32,), expected_outputs=expected_output, compute_unit=compute_unit, backend=backend, ) class TestEinsum: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): equation = "abcd,adce->abce" x_val = np.arange(12).astype(np.float32).reshape((2, 1, 3, 2)) y_val = np.arange(48).astype(np.float32).reshape((2, 2, 3, 4)) input_placeholder_dict = { "x": mb.placeholder(shape=x_val.shape), "y": mb.placeholder(shape=y_val.shape), } input_value_dict = {"x": x_val, "y": y_val} out_shape = list(x_val.shape) out_shape[-1] = y_val.shape[-1] expected_output_type = tuple(out_shape) + (types.fp32,) def build(x, y): return mb.einsum(values=(x, y), equation=equation) expected_output = np.einsum(equation, x_val, y_val) run_compare_builder( build, input_placeholder_dict, input_value_dict, expected_output_type, expected_output, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, rank, broadcast, backend", itertools.product( compute_units, [3, 4], [True, False], backends, ), ) def test_builder_to_backend_stress(self, compute_unit, rank, broadcast, backend): equation = "abcd,adce->abce" if rank == 4 else "vnm,mno->vno" shape_x = np.random.randint(low=2, high=16, size=rank).astype(np.int32) shape_y = np.random.randint(low=2, high=12, size=rank).astype(np.int32) shape_y[-3] = shape_x[-1] shape_y[-2] = 1 if broadcast else shape_x[-2] if rank == 4: shape_x[-4] = 1 if broadcast else shape_y[-4] x_val = np.random.rand(*shape_x) y_val = np.random.rand(*shape_y) input_placeholder_dict = { "x": mb.placeholder(shape=x_val.shape), "y": mb.placeholder(shape=y_val.shape), } input_value_dict = {"x": x_val, "y": y_val} out_shape = ( [shape_y[-4], shape_x[-3], shape_x[-2], shape_y[-1]] if rank == 4 else [shape_x[-3], shape_x[-2], shape_y[-1]] ) expected_output_type = tuple(out_shape) + (types.fp32,) def build(x, y): return mb.einsum(values=(x, y), equation=equation) if rank == 3: expected_output = np.einsum( equation, np.broadcast_to(x_val, [shape_x[-3], shape_x[-2], shape_x[-1]]), np.broadcast_to(y_val, [shape_y[-3], shape_x[-2], shape_y[-1]]), ) else: expected_output = np.einsum( equation, np.broadcast_to(x_val, [shape_y[-4], shape_x[-3], shape_x[-2], shape_x[-1]]), np.broadcast_to(y_val, [shape_y[-4], shape_y[-3], shape_x[-2], shape_y[-1]]), ) run_compare_builder( build, input_placeholder_dict, input_value_dict, expected_output_type, expected_output, 
compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.arange(6).astype(np.float32).reshape((1, 3, 2)) y_val = np.arange(24).astype(np.float32).reshape((2, 3, 4)) equation = "bcd,dce->bce" v = mb.einsum(values=(x_val, y_val), equation=equation) np.testing.assert_allclose(np.einsum(equation, x_val, y_val), v.val, atol=1e-04, rtol=1e-05) @pytest.mark.parametrize( "backend", backends, ) def test_symbolic_input_conv_and_einsum(self, backend): """ Test a pattern of: %1 = conv_1(%x) %2 = conv_2(%x) %3 = transpose(%2, [0, 3, 2, 1]) %4 = einsum(%1, %3) If ``%x`` has symbolic shape and ``conv_1, conv_2`` have the same configuration, the above program should pass the type inference. """ @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 3, get_new_symbol(), get_new_symbol()), dtype=types.fp32) ], opset_version=backend.opset_version, ) def prog(x): weight = np.random.rand(2, 3, 2, 2) conv_1 = mb.conv(x=x, weight=weight) conv_2 = mb.conv(x=x, weight=weight) conv_2_transpose = mb.transpose(x=conv_2, perm=[0, 3, 2, 1]) return mb.einsum(values=(conv_1, conv_2_transpose), equation="abcd,adce->abce") assert prog is not None ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS14/test_normalization.py0000644000000000000000000007220014672066616027677 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import platform import numpy as np import pytest import coremltools as ct from coremltools._deps import _HAS_TF_2, _HAS_TORCH, MSG_TF2_NOT_FOUND, MSG_TORCH_NOT_FOUND from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Function, get_new_symbol, types from coremltools.converters.mil.mil.ops.tests.iOS14 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import ( UNK_SYM, construct_inputs_from_placeholders, run_compare_builder, ) from coremltools.converters.mil.testing_reqs import compute_units from coremltools.converters.mil.testing_utils import random_gen if _HAS_TORCH: import torch if _HAS_TF_2: import tensorflow as tf class TestNormalizationBatchNorm: @pytest.mark.parametrize( "compute_unit, backend, x_param_dtype", itertools.product( compute_units, backends, [(np.float16, np.float16), (np.float32, np.float32)], ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, x_param_dtype): x_dtype, param_dtype = x_param_dtype x_builtin_dtype = types.numpy_type_to_builtin_type(x_dtype) if x_dtype == np.float16 and backend.backend == "neuralnetwork": pytest.skip("No need to test fp16 for neuralnetwork backend.") x_val = np.array( [ [ [[-16.0, 13.0], [11.0, -16.0]], [[13.0, -15.0], [13.0, 9.0]], [[-9.0, -4.0], [-6.0, 3.0]], ] ], dtype=x_dtype, ) mean_val = np.array([9.0, 6.0, 3.0], dtype=param_dtype) variance_val = np.array([6.0, 1.0, 7.0], dtype=param_dtype) gamma_val = np.array([1.0, 1.0, 1.0], dtype=param_dtype) beta_val = np.array([1.0, 3.0, 0.0], dtype=param_dtype) input_placeholders = {"x": mb.placeholder(shape=x_val.shape, dtype=x_builtin_dtype)} input_values = {"x": x_val} def build(x): return [ mb.batch_norm(x=x, mean=mean_val, variance=variance_val), mb.batch_norm( x=x, mean=mean_val, variance=variance_val, gamma=gamma_val, beta=beta_val, epsilon=param_dtype(1e-4), ), ] expected_output_types = [ 
(1, 3, 2, 2, x_builtin_dtype), (1, 3, 2, 2, x_builtin_dtype), ] expected_outputs = [ np.array( [ [ [[-10.206199, 1.6329918], [0.8164959, -10.206199]], [[6.999965, -20.999895], [6.999965, 2.9999852]], [[-4.53557, -2.6457493], [-3.4016776, 0.0]], ] ], dtype=x_dtype, ), np.array( [ [ [[-9.206122, 2.6329796], [1.8164899, -9.206122]], [[9.99965, -17.998951], [9.99965, 5.9998503]], [[-4.535541, -2.6457324], [-3.4016557, 0.0]], ] ], dtype=x_dtype, ), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestNormalizationInstanceNorm: @pytest.mark.parametrize( "compute_unit, backend, x_param_dtype", itertools.product( compute_units, backends, [(np.float16, np.float16), (np.float32, np.float32)], ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, x_param_dtype): x_dtype, param_dtype = x_param_dtype x_builtin_dtype = types.numpy_type_to_builtin_type(x_dtype) if x_dtype == np.float16 and backend.backend == "neuralnetwork": pytest.skip("No need to test fp16 for neuralnetwork backend.") x_val = np.array( [ [ [[-16.0, 13.0], [11.0, 16.0]], [[13.0, 15.0], [13.0, 9.0]], [[-9.0, 4.0], [-6.0, 3.0]], ], [ [[-5.0, 1.0], [12.0, 3.0]], [[0.0, 9.0], [2.0, -8.0]], [[2.0, 5.0], [10.0, 0.0]], ], ], dtype=x_dtype, ) input_placeholders = {"x": mb.placeholder(shape=x_val.shape, dtype=x_builtin_dtype)} input_values = {"x": x_val} def build(x): return mb.instance_norm(x=x, epsilon=param_dtype(1e-2)) expected_output_types = [(2, 3, 2, 2, x_builtin_dtype)] expected_outputs = [ np.array( [ [ [[-1.71524656, 0.54576027], [0.38982874, 0.77965748]], [[0.22917463, 1.14587319], [0.22917463, -1.60422242]], [[-1.2470212, 1.06887531], [-0.71258354, 0.89072943]], ], [ [[-1.27070526, -0.28693344], [1.51664821, 0.04099049]], [[-0.12380638, 1.36187018], [0.20634397, -1.44440776]], [[-0.59714057, 0.19904686], [1.5260259, -1.12793219]], ], ], dtype=np.float32, ) ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, x_param_dtype", itertools.product( compute_units, backends, [(np.float16, np.float16), (np.float32, np.float32)], ), ) def test_builder_to_backend_smoke_with_gamma_and_beta( self, compute_unit, backend, x_param_dtype ): x_dtype, param_dtype = x_param_dtype x_builtin_dtype = types.numpy_type_to_builtin_type(x_dtype) if x_dtype == np.float16 and backend.backend == "neuralnetwork": pytest.skip("No need to test fp16 for neuralnetwork backend.") x_val = np.array( [ [ [[-16.0, 13.0], [11.0, 16.0]], [[13.0, 15.0], [13.0, 9.0]], [[-9.0, 4.0], [-6.0, 3.0]], ], [ [[-5.0, 1.0], [12.0, 3.0]], [[0.0, 9.0], [2.0, -8.0]], [[2.0, 5.0], [10.0, 0.0]], ], ], dtype=x_dtype, ) gamma_val = np.array([-9.0, 3.2, 1.3], dtype=param_dtype) beta_val = np.array([-0.8, 3.4, 1.2], dtype=param_dtype) input_placeholders = {"x": mb.placeholder(shape=x_val.shape, dtype=x_builtin_dtype)} input_values = {"x": x_val} def build(x): return mb.instance_norm(x=x, gamma=gamma_val, beta=beta_val, epsilon=param_dtype(1e-2)) expected_output_types = [(2, 3, 2, 2, x_builtin_dtype)] expected_outputs = [ np.array( [ [ [[14.63721807, -5.71184211], [-4.30845865, -7.8169173]], [[4.1333588, 7.06679399], [4.1333588, -1.73351158]], [[-0.42112757, 2.58953791], [0.27364139, 2.35794826]], ], [ [[10.6363473, 1.782401], [-14.44983388, -1.16891443]], [[3.00381959, 7.75798456], [4.06030069, -1.22210484]], [[0.42371726, 
1.45876091], [3.18383368, -0.26631185]], ], ], dtype=np.float32, ) ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( "rank, compute_unit, backend, epsilon, x_param_dtype", itertools.product( [3, 4], compute_units, backends, [1e-3, 1e-5, 1e-10], [(np.float16, np.float16), (np.float32, np.float32)], ), ) def test_builder_to_backend_stress(self, rank, compute_unit, backend, epsilon, x_param_dtype): x_dtype, param_dtype = x_param_dtype x_builtin_dtype = types.numpy_type_to_builtin_type(x_dtype) if x_dtype == np.float16 and backend.backend == "neuralnetwork": pytest.skip("No need to test fp16 for neuralnetwork backend.") shape = np.random.randint(low=2, high=6, size=rank) x_val = random_gen(shape=shape, rand_min=-100.0, rand_max=100.0).astype(x_dtype) input_placeholders = {"x": mb.placeholder(shape=x_val.shape, dtype=x_builtin_dtype)} input_values = {"x": x_val} def build(x): return mb.instance_norm(x=x, epsilon=param_dtype(epsilon)) layer = torch.nn.InstanceNorm2d if rank == 4 else torch.nn.InstanceNorm1d torch_op = layer(num_features=shape[1], eps=epsilon) # PyTorch's batch_norm op is not implemented for fp16, so need to cast to fp32 first. expected_outputs = [torch_op(torch.as_tensor(x_val.astype(np.float32))).numpy()] expected_output_types = [o.shape[:] + (x_builtin_dtype,) for o in expected_outputs] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, atol=1e-3, rtol=1e-4, also_compare_shapes=True, ) class TestNormalizationL2Norm: @staticmethod def _compute_l2_norm(val, eps): shape = val.shape rank = len(shape) batch_dims = rank - 3 if batch_dims == 0: square_sum = np.sum(val**2) output = val / np.power(square_sum + eps, 0.5) else: batch_dim_prod = np.prod(shape[:batch_dims]) reshape_val = np.reshape(val, (batch_dim_prod, -1)) square_sum = np.sum(reshape_val * reshape_val, axis=1, keepdims=True) + eps output = reshape_val / np.power(square_sum, 0.5) output = np.reshape(output, shape) return output @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x_val = np.array([[[1.0, -7.0], [5.0, -6.0], [-3.0, -5.0]]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return [mb.l2_norm(x=x, epsilon=1e-10)] expected_output_types = [(1, 3, 2, types.fp32)] expected_outputs = [ np.array( [ [ [0.08304548, -0.58131838], [0.41522741, -0.4982729], [-0.24913645, -0.41522741], ] ], dtype=np.float32, ) ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank, epsilon, x_param_dtype", itertools.product( compute_units, backends, [3, 4, 5], [1e-4, 5.7], [(np.float16, np.float16), (np.float32, np.float32)], ), ) def test_builder_to_backend_stress(self, compute_unit, backend, rank, epsilon, x_param_dtype): x_dtype, param_dtype = x_param_dtype x_builtin_dtype = types.numpy_type_to_builtin_type(x_dtype) if x_dtype == np.float16 and backend.backend == "neuralnetwork": pytest.skip("No need to test fp16 for neuralnetwork backend.") shape = np.random.randint(low=2, high=6, size=rank) x_val = random_gen(shape=shape, 
rand_min=-1.0, rand_max=1.0).astype(x_dtype) input_placeholders = {"x": mb.placeholder(shape=shape, dtype=x_builtin_dtype)} input_values = {"x": x_val} def build(x): return [mb.l2_norm(x=x, epsilon=param_dtype(epsilon))] output = TestNormalizationL2Norm._compute_l2_norm(x_val, epsilon) expected_output_types = [list(output.shape) + [x_builtin_dtype]] expected_outputs = [output] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "rank, epsilon", itertools.product( [3, 4, 5], [1e-4, 11.2], ), ) def test_builder_eval_stress(self, rank, epsilon): shape = np.random.randint(low=2, high=6, size=rank) x_val = random_gen(shape=shape, rand_min=-1, rand_max=1) with Function({}): res = mb.l2_norm(x=x_val, epsilon=epsilon) ref = TestNormalizationL2Norm._compute_l2_norm(x_val, epsilon) np.testing.assert_allclose(ref, res.val, atol=1e-6, rtol=1e-5) class TestNormalizationLayerNorm: @staticmethod def _keras_layer_norm(x, axes, epsilon): layer = tf.keras.layers.LayerNormalization(axis=axes, epsilon=epsilon) data = tf.constant(x, dtype=tf.float32) output = layer(data) return output.numpy() @staticmethod def _np_layer_norm(x, axes, gamma=None, beta=None, epsilon=1e-5): rank = len(x.shape) axes = [axis + rank if axis < 0 else axis for axis in axes] normalized_shape = [x.shape[i] if i in axes else 1 for i in range(rank)] gamma = ( np.ones(shape=normalized_shape) if gamma is None else np.reshape(gamma, normalized_shape) ) beta = ( np.zeros(shape=normalized_shape) if beta is None else np.reshape(beta, normalized_shape) ) num = x - np.mean(x, axis=tuple(axes), keepdims=True) dem = np.sqrt( np.sum(np.square(num), axis=tuple(axes), keepdims=True) / np.prod(normalized_shape) + epsilon ) return num / dem * gamma + beta @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x_val = np.array([[[1.0, -7.0], [5.0, -6.0], [-3.0, -5.0]]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} gamma_val = np.array([1.0, 1.0], dtype=np.float32) beta_val = np.array([1.0, 0.0], dtype=np.float32) def build(x): return [ # V2->V1 lowering (op_mappings.py): if branch mb.layer_norm(x=x, axes=[2], epsilon=1e-4), # V2->V1 lowering (op_mappings.py): else branch mb.layer_norm(x=x, axes=[-2, -1], epsilon=1e-4), # V2->V1 lowering (op_mappings.py): if branch with scale mb.layer_norm(x=x, axes=[2], epsilon=1e-4, gamma=gamma_val, beta=beta_val), ] expected_output_types = [ (1, 3, 2, types.fp32), (1, 3, 2, types.fp32), (1, 3, 2, types.fp32), ] expected_outputs = [ np.array( [ [ [0.9999969, -0.9999969], [0.99999833, -0.99999833], [0.99995005, -0.99995005], ] ], dtype=np.float32, ), np.array( [ [ [0.82687193, -1.06312108], [1.77186835, -0.82687193], [-0.11812456, -0.59062278], ] ], dtype=np.float32, ), np.array( [ [ [1.9999969, -0.9999969], [1.99999833, -0.99999833], [1.99995005, -0.99995005], ] ], dtype=np.float32, ), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke_rank_2(self, compute_unit, backend): x_val = np.array([[1.0, -7.0], [5.0, -6.0], [-3.0, -5.0]], dtype=np.float32) gamma_val = np.array([1.0, 1.0], dtype=np.float32) 
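# Descriptive note (added comment): gamma scales and beta (defined next) shifts the normalized
# output; with gamma = [1, 1] and beta = [1, 0], the second expected output below is the plain
# layer_norm result with 1 added to the first column only.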
beta_val = np.array([1.0, 0.0], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return [ # V2->V1 lowering (op_mappings.py): if branch mb.layer_norm(x=x, axes=[1], epsilon=1e-4), mb.layer_norm(x=x, axes=[1], epsilon=1e-4, gamma=gamma_val, beta=beta_val), ] expected_output_types = [(3, 2, types.fp32), (3, 2, types.fp32)] expected_outputs = [ np.array( [ [0.9999969, -0.9999969], [0.99999833, -0.99999833], [0.99995005, -0.99995005], ], dtype=np.float32, ), np.array( [ [1.9999969, -0.9999969], [1.99999833, -0.99999833], [1.99995005, -0.99995005], ], dtype=np.float32, ), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke_with_dynamic_shape(self, compute_unit, backend): x_val = np.array([[[1.0, -7.0], [5.0, -6.0], [-3.0, -5.0]]], dtype=np.float32) shape = (get_new_symbol(), get_new_symbol(), 2) input_placeholders = {"x": mb.placeholder(shape=shape)} input_values = {"x": x_val} def build(x): return [ mb.layer_norm(x=x, axes=[2], epsilon=1e-4), ] expected_output_types = [(UNK_SYM, UNK_SYM, 2, types.fp32)] expected_outputs = [ np.array( [ [ [0.9999969, -0.9999969], [0.99999833, -0.99999833], [0.99995005, -0.99995005], ] ], dtype=np.float32, ), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, inputs=construct_inputs_from_placeholders(input_placeholders, 10) if backend.backend == "mlprogram" else None, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank_and_axes, epsilon, provides_gamma_beta", itertools.product( compute_units, backends, [[3, [0, 2]], [3, [-2]], [4, [0, 1, 3]], [5, [0, 4]], [5, [-5, -4, -3, -2, -1]]], [0.0001, 0.01], [True, False], ), ) def test_builder_to_backend_stress_numpy( self, compute_unit, backend, rank_and_axes, epsilon, provides_gamma_beta ): if ( backend.backend == "mlprogram" and backend.precision == "fp16" and compute_unit != ct.ComputeUnit.CPU_ONLY ): pytest.xfail( "rdar://80662357 ([GPU failures] LayerNorm FP16 tests failing on GPU with numerical errors)" ) if ( backend.backend == "neuralnetwork" and compute_unit != ct.ComputeUnit.CPU_ONLY and platform.machine() == "arm64" ): pytest.xfail( "rdar://98015195 ([M1 native tests] Some MIL unittests are failing on M1 native)" ) rank, axes = rank_and_axes shape = np.random.randint(low=2, high=6, size=rank) x_val = random_gen(shape=shape, rand_min=-100.0, rand_max=100.0) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} gamma, beta = None, None if provides_gamma_beta: positive_axes = [axis + rank if axis < 0 else axis for axis in axes] normalized_shape = [shape[i] for i in range(rank) if i in positive_axes] gamma = random_gen(shape=normalized_shape, rand_min=-100, rand_max=100) beta = random_gen(shape=normalized_shape, rand_min=-100, rand_max=100) def build(x): return [mb.layer_norm(x=x, axes=axes, epsilon=epsilon, gamma=gamma, beta=beta)] output = TestNormalizationLayerNorm._np_layer_norm( x=x_val, axes=axes, epsilon=epsilon, gamma=gamma, beta=beta ) expected_output_types = [tuple(output.shape) + (types.fp32,)] expected_outputs = [output] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, 
atol=1e-3, rtol=1e-4, ) @pytest.mark.skipif(not _HAS_TF_2, reason=MSG_TF2_NOT_FOUND) @pytest.mark.parametrize( "compute_unit, backend, rank_and_axes, epsilon, x_param_dtype", itertools.product( compute_units, backends, [[3, [0, 2]], [3, [-2]], [4, [0, 1, 3]], [5, [0, 4]], [5, [-5, -4, -3, -2, -1]]], [0.0001, 0.01], [(np.float16, np.float16), (np.float32, np.float32)], ), ) def test_builder_to_backend_stress_keras( self, compute_unit, backend, rank_and_axes, epsilon, x_param_dtype ): x_dtype, param_dtype = x_param_dtype x_builtin_dtype = types.numpy_type_to_builtin_type(x_dtype) if x_dtype == np.float16 and backend.backend == "neuralnetwork": pytest.skip("No need to test fp16 for neuralnetwork backend.") rank, axes = rank_and_axes shape = np.random.randint(low=2, high=6, size=rank) x_val = random_gen(shape=shape, rand_min=-100.0, rand_max=100.0).astype(x_dtype) input_placeholders = {"x": mb.placeholder(shape=x_val.shape, dtype=x_builtin_dtype)} input_values = {"x": x_val} def build(x): return [mb.layer_norm(x=x, axes=axes, epsilon=param_dtype(epsilon))] output = TestNormalizationLayerNorm._keras_layer_norm(x=x_val, axes=axes, epsilon=epsilon) expected_output_types = [tuple(output.shape) + (x_builtin_dtype,)] expected_outputs = [output] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "rank_and_axes, epsilon", itertools.product( [ [3, [0, 2]], [3, [-2, -1]], [4, [0, 1, 2, 3]], [5, [0, 2, -1]], [5, [-5, -4, -3, -2, -1]], ], [0.0001, 0.01], ), ) def test_builder_eval_stress(self, rank_and_axes, epsilon): rank, axes = rank_and_axes shape = np.random.randint(low=2, high=6, size=rank) x_val = random_gen(shape=shape, rand_min=-100.0, rand_max=100.0) positive_axes = [axis + rank if axis < 0 else axis for axis in axes] normalized_shape = [shape[i] for i in range(rank) if i in positive_axes] gamma_val = random_gen(shape=normalized_shape, rand_min=-100, rand_max=100) beta_val = random_gen(shape=normalized_shape, rand_min=-100, rand_max=100) with Function({}): res = mb.layer_norm(x=x_val, axes=axes, epsilon=epsilon, gamma=gamma_val, beta=beta_val) ref = TestNormalizationLayerNorm._np_layer_norm( x=x_val, axes=axes, epsilon=epsilon, gamma=gamma_val, beta=beta_val ) np.testing.assert_allclose(ref, res.val, atol=1e-04, rtol=1e-05) class TestNormalizationLocalResponseNorm: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x_val = np.array([[[1.0, -7.0], [5.0, -6.0], [-3.0, -5.0]]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return [ mb.local_response_norm(x=x, size=2), mb.local_response_norm(x=x, size=3, alpha=0.0001, beta=0.75, k=1.0), ] expected_output_types = [(1, 3, 2, types.fp32), (1, 3, 2, types.fp32)] expected_outputs = [ np.array( [ [ [0.99996257, -6.98716545], [4.99531746, -5.99191284], [-2.99898791, -4.99531746], ] ], dtype=np.float32, ), np.array( [ [ [0.99997497, -6.99143696], [4.99687672, -5.99460602], [-2.99932504, -4.99687672], ] ], dtype=np.float32, ), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( "compute_unit, backend, rank, size, alpha, beta, k, x_param_dtype", itertools.product( 
compute_units, backends, [rank for rank in range(3, 6)], [2, 3, 5], [0.0001, 0.01], [0.75, 1.0], [1.0, 2.0], [(np.float16, np.float16), (np.float32, np.float32)], ), ) def test_builder_to_backend_stress( self, compute_unit, backend, rank, size, alpha, beta, k, x_param_dtype ): x_dtype, param_dtype = x_param_dtype x_builtin_dtype = types.numpy_type_to_builtin_type(x_dtype) if x_dtype == np.float16 and backend.backend == "neuralnetwork": pytest.skip("No need to test fp16 for neuralnetwork backend.") shape = np.random.randint(low=2, high=5, size=rank) x_val = random_gen(shape=shape).astype(x_dtype) input_placeholders = {"x": mb.placeholder(shape=x_val.shape, dtype=x_builtin_dtype)} input_values = {"x": x_val} def build(x): return mb.local_response_norm( x=x, size=size, alpha=param_dtype(alpha), beta=param_dtype(beta), k=param_dtype(k) ) torch_lrn = torch.nn.LocalResponseNorm(size=size, alpha=alpha, beta=beta, k=k) # PyTorch doesn't support LocalResponseNorm with fp16, so need to cast to float32 first. expected_outputs = [torch_lrn(torch.as_tensor(x_val.astype(np.float32))).numpy()] expected_output_types = [o.shape[:] + (x_builtin_dtype,) for o in expected_outputs] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, atol=1e-2, rtol=1e-3, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS14/test_pool.py0000644000000000000000000004352114672066616025766 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest import coremltools as ct from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import get_new_symbol, types from coremltools.converters.mil.mil.ops.tests.iOS14 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import run_compare_builder from coremltools.converters.mil.testing_reqs import compute_units class TestAvgPool: @pytest.mark.parametrize( "backend, pad_type", itertools.product( backends, ["valid", "same", "same_lower", "custom"], ), ) def test_type_inference_cache(self, backend, pad_type): # Test the type inference has the caching mechanism to ensure # same symbolic input shapes results in the same output shape if pad_type == "same_lower" and backend.opset_version == ct.target.iOS15: return @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 3, get_new_symbol(), get_new_symbol()), dtype=types.fp32) ], opset_version=backend.opset_version, ) def prog(x): # Basic pool pool_1 = mb.avg_pool(x=x, kernel_sizes=[1, 2], pad_type=pad_type) pool_2 = mb.avg_pool(x=x, kernel_sizes=[1, 2], pad_type=pad_type) assert pool_1.shape == pool_1.shape # With strides pool_1 = mb.avg_pool(x=x, kernel_sizes=[1, 2], strides=[1, 2], pad_type=pad_type) pool_2 = mb.avg_pool(x=x, kernel_sizes=[1, 2], strides=[1, 2], pad_type=pad_type) assert pool_1.shape == pool_1.shape # With padding pool_1 = mb.avg_pool(x=x, kernel_sizes=[1, 2], pad_type=pad_type, pad=[2, 3, 4, 5]) pool_2 = mb.avg_pool(x=x, kernel_sizes=[1, 2], pad_type=pad_type, pad=[2, 3, 4, 5]) assert pool_1.shape == pool_2.shape return pool_1 @pytest.mark.parametrize( "compute_unit, backend, inputshape_kernelshape", itertools.product( compute_units, backends, [ [(1, 1, 2), 
(2,)], [(1, 1, 2, 2), (2, 2)], [(1, 1, 2, 2, 2), (2, 2, 2)], ], ), ) def test_avgpool_builder_to_backend_smoke_samelower_padtype( self, compute_unit, backend, inputshape_kernelshape ): input_shape, kernel_shape = inputshape_kernelshape rank = len(input_shape) - 2 if backend.backend == "neuralnetwork" and rank == 3: pytest.skip( "pad_type `same_lower` not supported for 3d pooling in neuralnetwork backend" ) if backend.backend == "mlprogram" and rank == 1: pytest.xfail( "rdar://98852008 (MIL backend producing wrong result for 1d pooling with pad_type " "same_lower)" ) if backend.opset_version == ct.target.iOS15: pytest.skip("same_lower pad_type not supported in iOS15 opset.") x_val = np.arange(1, np.prod(input_shape) + 1).reshape(*input_shape).astype(np.float32) if rank == 1: expected_output_val = [0.5, 1.5] elif rank == 2: expected_output_val = [0.25, 0.75, 1, 2.5] else: expected_output_val = [0.125, 0.375, 0.5, 1.25, 0.75, 1.75, 2, 4.5] expected_output_types = [input_shape + (types.fp32,)] expected_outputs = [np.array(expected_output_val).reshape(*input_shape).astype(np.float32)] input_values = {"x": x_val} input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} def build(x): return mb.avg_pool( x=x, kernel_sizes=kernel_shape, pad_type="same_lower", ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, num_dims", itertools.product(compute_units, backends, [1, 2, 3]), ) def test_builder_to_backend_smoke(self, compute_unit, backend, num_dims): kernel_sizes = [1, 2, 3] strides = [2, 1, 3] if num_dims == 1: x_val = np.array([[[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]]], dtype=np.float32) expected_output_types = [(1, 1, 4, types.fp32), (1, 1, 3, types.fp32)] expected_outputs = [ np.array([[[1.0, 3.0, 5.0, 7.0]]], dtype=np.float32), np.array([[[1.5, 4.0, 6.5]]], dtype=np.float32), ] elif num_dims == 2: x_val = np.array( [ [ [[-10.80291205, -6.42076184], [-7.07910997, 9.1913279]], [[-3.18181497, 0.9132147], [11.9785544, 7.92449539]], ] ], dtype=np.float32, ) expected_output_types = [(1, 2, 1, 1, types.fp32), (1, 2, 2, 1, types.fp32)] expected_outputs = [ np.array([[[[-8.611837]], [[-1.1343001]]]], dtype=np.float32), np.array( [[[[-3.7778642], [1.056109]], [[4.4086123], [9.951525]]]], dtype=np.float32, ), ] else: # num_dims == 3 x_val = np.array( [ [ [ [[-1, -5, -1], [-3, -3, 8], [2, 6, 2]], [[-4, 7, -4], [4, 6, 7], [4, 4, 8]], [[5, -3, 5], [0, -5, 8], [1, 7, 2]], ] ], [ [ [[7, -3, -5], [5, 4, 7], [-2, -4, -3]], [[-4, 3, -1], [6, -4, 4], [3, 6, 2]], [[-1, 4, -4], [-2, -1, -2], [3, 2, 8]], ] ], ], dtype=np.float32, ) expected_output_types = [ (2, 1, 2, 2, 1, types.fp32), (2, 1, 2, 3, 1, types.fp32), ] expected_outputs = [ np.array( [ [[[[-0.8333334], [2.0]], [[1.6666667], [2.1666667]]]], [[[[2.5], [1.1666667]], [[-1.0], [1.3333334]]]], ], dtype=np.float32, ), np.array( [ [ [ [[-0.8333334], [2.0], [3.3333335]], [[1.6666667], [2.1666667], [3.3333335]], ] ], [ [ [[2.5], [1.1666667], [-3.0]], [[-1.0], [1.3333334], [4.3333335]], ] ], ], dtype=np.float32, ), ] input_values = {"x": x_val} input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} def build(x): return [ mb.avg_pool( x=x, kernel_sizes=kernel_sizes[:num_dims], strides=strides[:num_dims], pad_type="valid", ), mb.avg_pool( x=x, kernel_sizes=kernel_sizes[-num_dims:], strides=strides[-num_dims:], pad_type="same", exclude_padding_from_average=True, ), ] run_compare_builder( build, 
input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestMaxPool: @pytest.mark.parametrize( "compute_unit, backend, inputshape_kernelshape", itertools.product( compute_units, backends, [ [(1, 1, 2), (2,)], [(1, 1, 2, 2), (2, 2)], [(1, 1, 2, 2, 2), (2, 2, 2)], ], ), ) def test_maxpool_builder_to_backend_smoke_samelower_padtype( self, compute_unit, backend, inputshape_kernelshape ): input_shape, kernel_shape = inputshape_kernelshape rank = len(input_shape) - 2 if backend.backend == "neuralnetwork" and rank == 3: pytest.skip( "pad_type `same_lower` not supported for 3d pooling in neuralnetwork backend" ) if backend.backend == "mlprogram" and rank == 1: pytest.xfail( "rdar://98852008 (MIL backend producing wrong result for 1d pooling with pad_type " "same_lower)" ) if backend.opset_version == ct.target.iOS15: pytest.skip("same_lower pad_type not supported in iOS15 opset.") x_val = np.arange(1, np.prod(input_shape) + 1).reshape(*input_shape).astype(np.float32) if rank == 1: expected_output_val = [1, 2] elif rank == 2: expected_output_val = [1, 2, 3, 4] else: expected_output_val = [1, 2, 3, 4, 5, 6, 7, 8] expected_output_types = [input_shape + (types.fp32,)] expected_outputs = [np.array(expected_output_val).reshape(*input_shape).astype(np.float32)] input_values = {"x": x_val} input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} def build(x): return mb.max_pool( x=x, kernel_sizes=kernel_shape, pad_type="same_lower", ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, num_dims", itertools.product(compute_units, backends, [1, 2, 3]), ) def test_builder_to_backend_smoke(self, compute_unit, backend, num_dims): kernel_sizes = [1, 2, 3] strides = [2, 1, 3] if num_dims == 1: x_val = np.array([[[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]]], dtype=np.float32) expected_output_types = [(1, 1, 4, types.fp32), (1, 1, 3, types.fp32)] expected_outputs = [ np.array([[[1.0, 3.0, 5.0, 7.0]]], dtype=np.float32), np.array([[[2.0, 5.0, 7.0]]], dtype=np.float32), ] elif num_dims == 2: x_val = np.array( [ [ [[-10.80291205, -6.42076184], [-7.07910997, 9.1913279]], [[-3.18181497, 0.9132147], [11.9785544, 7.92449539]], ] ], dtype=np.float32, ) expected_output_types = [(1, 2, 1, 1, types.fp32), (1, 2, 2, 1, types.fp32)] expected_outputs = [ np.array([[[[-6.42076184]], [[0.9132147]]]], dtype=np.float32), np.array( [[[[9.191328], [9.191328]], [[11.978555], [11.978555]]]], dtype=np.float32, ), ] else: # num_dims == 3 x_val = np.array( [ [ [ [[-1, -5, -1], [-3, -3, 8], [2, 6, 2]], [[-4, 7, -4], [4, 6, 7], [4, 4, 8]], [[5, -3, 5], [0, -5, 8], [1, 7, 2]], ] ], [ [ [[7, -3, -5], [5, 4, 7], [-2, -4, -3]], [[-4, 3, -1], [6, -4, 4], [3, 6, 2]], [[-1, 4, -4], [-2, -1, -2], [3, 2, 8]], ] ], ], dtype=np.float32, ) expected_output_types = [ (2, 1, 2, 2, 1, types.fp32), (2, 1, 2, 3, 1, types.fp32), ] expected_outputs = [ np.array( [ [[[[8.0], [8.0]], [[8.0], [8.0]]]], [[[[7.0], [7.0]], [[4.0], [8.0]]]], ], dtype=np.float32, ), np.array( [ [[[[8.0], [8.0], [6.0]], [[8.0], [8.0], [7.0]]]], [[[[7.0], [7.0], [-2.0]], [[4.0], [8.0], [8.0]]]], ], dtype=np.float32, ), ] input_values = {"x": x_val} input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} def build(x): return [ mb.max_pool( x=x, kernel_sizes=kernel_sizes[:num_dims], strides=strides[:num_dims], pad_type="valid", ), mb.max_pool( x=x, 
kernel_sizes=kernel_sizes[-num_dims:], strides=strides[-num_dims:], pad_type="same", ), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestL2Pool: @pytest.mark.parametrize( "compute_unit, backend, inputshape_kernelshape", itertools.product( compute_units, backends, [ [(1, 1, 2), (2,)], [(1, 1, 2, 2), (2, 2)], ], ), ) def test_l2pool_builder_to_backend_smoke_samelower_padtype( self, compute_unit, backend, inputshape_kernelshape ): input_shape, kernel_shape = inputshape_kernelshape rank = len(input_shape) - 2 if backend.backend == "mlprogram" and rank == 1: pytest.xfail( "rdar://98852008 (MIL backend producing wrong result for 1d pooling with pad_type " "same_lower)" ) if backend.opset_version == ct.target.iOS15: pytest.skip("same_lower pad_type not supported in iOS15 opset.") x_val = np.arange(1, np.prod(input_shape) + 1).reshape(*input_shape).astype(np.float32) if rank == 1: expected_output_val = [1, 2.236068] else: expected_output_val = [1, 2.236068, 3.162278, 5.477226] expected_output_types = [input_shape + (types.fp32,)] expected_outputs = [np.array(expected_output_val).reshape(*input_shape).astype(np.float32)] input_values = {"x": x_val} input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} def build(x): return mb.l2_pool( x=x, kernel_sizes=kernel_shape, pad_type="same_lower", ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, num_dims", itertools.product(compute_units, backends, [1, 2]), ) def test_builder_to_backend_smoke(self, compute_unit, backend, num_dims): kernel_sizes = [1, 2, 3] strides = [2, 1, 3] if num_dims == 1: x_val = np.array([[[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]]], dtype=np.float32) expected_output_types = [(1, 1, 4, types.fp32), (1, 1, 3, types.fp32)] expected_outputs = [ np.array([[[1.0, 3.0, 5.0, 7.0]]], dtype=np.float32), np.array([[[2.236068, 7.071068, 9.219544]]], dtype=np.float32), ] elif num_dims == 2: x_val = np.array( [[[[-10.0, -6.0], [-7.0, 9.0]], [[-3.0, 0.0], [11.0, 7.0]]]], dtype=np.float32, ) expected_output_types = [(1, 2, 1, 1, types.fp32), (1, 2, 2, 1, types.fp32)] expected_outputs = [ np.array([[[[11.66190338]], [[3.0]]]], dtype=np.float32), np.array( [[[[16.309507], [11.401754]], [[13.379088], [13.038404]]]], dtype=np.float32, ), ] else: # num_dims == 3 pass # Enum PoolingType3D has no value defined for name L2 input_values = {"x": x_val} input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} def build(x): return [ mb.l2_pool( x=x, kernel_sizes=kernel_sizes[:num_dims], strides=strides[:num_dims], pad_type="valid", ), mb.l2_pool( x=x, kernel_sizes=kernel_sizes[-num_dims:], strides=strides[-num_dims:], pad_type="same", ), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS14/test_random.py0000644000000000000000000003451214672066616026275 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.ops.tests.iOS14 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import UNK_SYM, run_compare_builder from coremltools.converters.mil.testing_reqs import compute_units from coremltools.converters.mil.testing_utils import get_core_ml_prediction from coremltools.models.utils import _macos_version class TestRandomBernoulli: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x_val = np.array([0.0], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return [ mb.add(x=x, y=x), mb.random_bernoulli(shape=np.array([2, 1, 3], np.int32), prob=1.0), mb.random_bernoulli(shape=np.array([3, 1, 2], np.int32), prob=0.0), ] expected_outputs = [ np.array(np.zeros(shape=(1,)), np.float32), np.array(np.ones(shape=(2, 1, 3)), np.float32), np.array(np.zeros(shape=(3, 1, 2)), np.float32), ] expected_output_types = [o.shape[:] + (types.fp32,) for o in expected_outputs] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank, prob, dynamic", itertools.product( compute_units, backends, [rank for rank in range(1, 6)], [1.0, 0.0], [True, False], ), ) def test_builder_to_backend_stress(self, compute_unit, backend, rank, prob, dynamic): shape = np.random.randint(low=1, high=4, size=rank).astype(np.int32) x_val = np.array([0.0], dtype=np.float32) if dynamic: input_placeholders = { "x": mb.placeholder(shape=x_val.shape), "dyn_shape": mb.placeholder(shape=shape.shape, dtype=types.int32), } input_values = {"x": x_val, "dyn_shape": shape} else: input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return [mb.add(x=x, y=x), mb.random_bernoulli(shape=shape, prob=prob)] def build_dyn(x, dyn_shape): return [mb.add(x=x, y=x), mb.random_bernoulli(shape=dyn_shape, prob=prob)] expected_outputs = [ np.array(np.zeros(shape=(1,)), np.float32), np.random.binomial(1, prob, shape), ] if dynamic: expected_output_types = [ tuple([UNK_SYM for _ in o.shape]) + (types.fp32,) for o in expected_outputs ] else: expected_output_types = [o.shape[:] + (types.fp32,) for o in expected_outputs] builder = build_dyn if dynamic else build run_compare_builder( builder, input_placeholders, input_values, expected_output_types, expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestRandomCategorical: def softmax(self, data): e_data = np.exp(data - np.max(data)) return e_data / e_data.sum() @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x_val = np.array([1], dtype=np.int32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return [ mb.random_categorical(x=x, seed=1), mb.random_categorical(x=x, seed=1, size=4), ] expected_outputs = [ np.array(np.zeros(shape=(1,)), dtype=np.float32), 
np.array(np.zeros(shape=(4,)), dtype=np.float32), ] expected_output_types = [o.shape[:] + (types.fp32,) for o in expected_outputs] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.skipif( _macos_version() < (12, 0), reason="Can only get predictions for ml program on macOS 12+" ) @pytest.mark.parametrize( "compute_unit, backend, n_sample, n_class", itertools.product(compute_units, backends, [50000], [2, 10, 20]), ) def test_builder_to_backend_stress(self, compute_unit, backend, n_sample, n_class): output_name = "random_categorical" logits = np.random.rand(2, n_class) probs = [self.softmax(logits[0]), self.softmax(logits[1])] # Test logits input input_placeholders = {"x": mb.placeholder(shape=(2, n_class))} input_values = {"x": logits} def build(x): return [mb.random_categorical(x=x, size=n_sample, mode="logits", name=output_name)] prediction = get_core_ml_prediction( build, input_placeholders, input_values, backend=backend, compute_unit=compute_unit, ) ref0 = np.random.multinomial(n_sample, probs[0]) ref1 = np.random.multinomial(n_sample, probs[1]) pred0 = prediction[output_name].reshape(2, n_sample)[0] pred1 = prediction[output_name].reshape(2, n_sample)[1] # convert to bincount and validate probabilities pred0 = np.bincount(np.array(pred0).astype(np.int32), minlength=n_class) pred1 = np.bincount(np.array(pred1).astype(np.int32), minlength=n_class) assert np.allclose(np.true_divide(pred0, n_sample), probs[0], atol=1e-2) assert np.allclose( np.true_divide(pred0, n_sample), np.true_divide(ref0, n_sample), atol=1e-2, ) assert np.allclose(np.true_divide(pred1, n_sample), probs[1], atol=1e-2) assert np.allclose( np.true_divide(pred1, n_sample), np.true_divide(ref1, n_sample), atol=1e-2, ) # Test probs input input_placeholders = {"x": mb.placeholder(shape=(2, n_class))} input_values = {"x": np.array(probs)} def build(x): return [mb.random_categorical(x=x, size=n_sample, mode="probs", name=output_name)] prediction = get_core_ml_prediction( build, input_placeholders, input_values, backend=backend, compute_unit=compute_unit ) pred0 = prediction[output_name].reshape(2, n_sample)[0] pred1 = prediction[output_name].reshape(2, n_sample)[1] # convert to bincount and validate probabilities pred0 = np.bincount(np.array(pred0).astype(np.int32), minlength=n_class) pred1 = np.bincount(np.array(pred1).astype(np.int32), minlength=n_class) assert np.allclose(np.true_divide(pred0, n_sample), probs[0], atol=1e-2) assert np.allclose( np.true_divide(pred0, n_sample), np.true_divide(ref0, n_sample), atol=1e-2, ) assert np.allclose(np.true_divide(pred1, n_sample), probs[1], atol=1e-2) assert np.allclose( np.true_divide(pred1, n_sample), np.true_divide(ref1, n_sample), atol=1e-2, ) class TestRandomNormal: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x_val = np.array([0.0], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return [ mb.add(x=x, y=x), mb.random_normal(shape=np.array([2, 1, 3], np.int32), mean=1.0, stddev=0.0), mb.random_normal(shape=np.array([3, 1, 2], np.int32), mean=0.0, stddev=0.0), ] expected_outputs = [ np.array(np.zeros(shape=(1,)), np.float32), np.array(np.ones(shape=(2, 1, 3)), np.float32), np.array(np.zeros(shape=(3, 1, 2)), np.float32), ] expected_output_types = [o.shape[:] + 
(types.fp32,) for o in expected_outputs] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank, mean, dynamic", itertools.product( compute_units, backends, [rank for rank in range(1, 6)], [1.0, 0.0], [True, False], ), ) def test_builder_to_backend_stress(self, compute_unit, backend, rank, mean, dynamic): shape = np.random.randint(low=1, high=4, size=rank).astype(np.int32) x_val = np.array([0.0], dtype=np.float32) if dynamic: input_placeholders = { "x": mb.placeholder(shape=x_val.shape), "dyn_shape": mb.placeholder(shape=shape.shape, dtype=types.int32), } input_values = {"x": x_val, "dyn_shape": shape} else: input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return [ mb.add(x=x, y=x), mb.random_normal(shape=shape, mean=mean, stddev=0.0), ] def build_dyn(x, dyn_shape): return [ mb.add(x=x, y=x), mb.random_normal(shape=dyn_shape, mean=mean, stddev=0.0), ] expected_outputs = [ np.array(np.zeros(shape=(1,)), np.float32), np.random.normal(loc=mean, scale=0.0, size=shape), ] if dynamic: expected_output_types = [ tuple([UNK_SYM for _ in o.shape]) + (types.fp32,) for o in expected_outputs ] else: expected_output_types = [o.shape[:] + (types.fp32,) for o in expected_outputs] builder = build_dyn if dynamic else build run_compare_builder( builder, input_placeholders, input_values, expected_output_types, expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestRandomUniform: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x_val = np.array([0.0], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return [ mb.add(x=x, y=x), mb.random_uniform(shape=np.array([2, 1, 3], np.int32), low=0.0, high=0.0), mb.random_uniform(shape=np.array([3, 1, 2], np.int32), low=1.0, high=1.0), ] expected_outputs = [ np.array(np.zeros(shape=(1,)), np.float32), np.array(np.zeros(shape=(2, 1, 3)), np.float32), np.array(np.ones(shape=(3, 1, 2)), np.float32), ] expected_output_types = [o.shape[:] + (types.fp32,) for o in expected_outputs] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank, low, high, dynamic", itertools.product( compute_units, backends, [rank for rank in range(1, 6)], [0.0], [0.0], [True, False], ), ) def test_builder_to_backend_stress(self, compute_unit, backend, rank, low, high, dynamic): shape = np.random.randint(low=1, high=4, size=rank).astype(np.int32) x_val = np.array([0.0], dtype=np.float32) if dynamic: input_placeholders = { "x": mb.placeholder(shape=x_val.shape), "dyn_shape": mb.placeholder(shape=shape.shape, dtype=types.int32), } input_values = {"x": x_val, "dyn_shape": shape} else: input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return [ mb.add(x=x, y=x), mb.random_uniform(shape=shape, low=low, high=high), ] def build_dyn(x, dyn_shape): return [ mb.add(x=x, y=x), mb.random_uniform(shape=dyn_shape, low=low, high=high), ] expected_outputs = [ np.array(np.zeros(shape=(1,)), np.float32), np.random.uniform(low=low, high=high, size=shape), ] if dynamic: 
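# When the target shape is fed as a runtime input (the "dyn_shape" placeholder) the
# converter cannot infer concrete output dimensions, so every dim in the expected
# output type below is the symbolic placeholder UNK_SYM rather than a fixed size.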
expected_output_types = [ tuple([UNK_SYM for _ in o.shape]) + (types.fp32,) for o in expected_outputs ] else: expected_output_types = [o.shape[:] + (types.fp32,) for o in expected_outputs] builder = build_dyn if dynamic else build run_compare_builder( builder, input_placeholders, input_values, expected_output_types, expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS14/test_recurrent.py0000644000000000000000000007156414672066616027036 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest import coremltools as ct from coremltools._deps import _HAS_TORCH, MSG_TORCH_NOT_FOUND from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import get_new_symbol from coremltools.converters.mil.mil.ops.tests.iOS14 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import ( construct_inputs_from_placeholders, run_compare_builder, ) from coremltools.converters.mil.mil.types.type_mapping import numpy_type_to_builtin_type from coremltools.converters.mil.testing_reqs import compute_units from coremltools.converters.mil.testing_utils import ssa_fn if _HAS_TORCH: import torch new_backends = [] for v in backends: if v.opset_version <= ct.target.iOS15: new_backends.append(v) backends = new_backends class TestGRU: @pytest.mark.parametrize( argnames=[ "compute_unit", "backend", "seq_len", "batch_size", "input_size", "hidden_size", "has_bias", "output_sequence", "direction", "activation_functions", "symbolic", "dtype", ], argvalues=itertools.product( compute_units, backends, [1, 3], [1], # [MIL] GRU with batch size 1 produces incorrect # output(always 0) for second batch onwards [1, 2], [1, 2], [True, False], [True, False], ["forward", "reverse"], [ ["tanh", "sigmoid"], ["sigmoid", "tanh"], ], [True, False], [np.float32], ), ) def test_builder_to_backend_smoke( self, compute_unit, backend, seq_len, batch_size, input_size, hidden_size, has_bias, output_sequence, direction, activation_functions, symbolic, dtype, ): torch.manual_seed(5) R_z = 2 * np.random.rand(hidden_size, hidden_size) - 1 R_r = 2 * np.random.rand(hidden_size, hidden_size) - 1 R_o = 2 * np.random.rand(hidden_size, hidden_size) - 1 W_z = 2 * np.random.rand(hidden_size, input_size) - 1 W_r = 2 * np.random.rand(hidden_size, input_size) - 1 W_o = 2 * np.random.rand(hidden_size, input_size) - 1 b_z = 2 * np.random.rand(hidden_size) - 1 if has_bias else np.zeros((hidden_size)) b_r = 2 * np.random.rand(hidden_size) - 1 if has_bias else np.zeros((hidden_size)) b_o = 2 * np.random.rand(hidden_size) - 1 if has_bias else np.zeros((hidden_size)) def apply_act(x, option): if option == "tanh": return np.tanh(x) elif option == "sigmoid": return 1.0 / (1 + np.exp(-x)) else: raise ValueError("activation invalid") def get_numpy_prediction_gru( X, H, return_seq, direction, inner_activation_str="sigmoid", activation_str="tanh", ): """ shape of X : (B, Seq, input_size) shape of H : (B, hidden_size) shape of return = (B, 1, hidden_size) if return_seq=False else (B, Seq, hidden_size) """ assert X.shape == (batch_size, seq_len, input_size) assert H.shape == (batch_size, hidden_size) out = [] for i in 
range(batch_size): numpy_input = X[i] hidden_state = H[i] out.append( get_numpy_prediction_gru_single_batch( numpy_input, hidden_state, return_seq, direction, inner_activation_str=inner_activation_str, activation_str=activation_str, ) ) output = np.stack(out, axis=0) output = np.transpose(output, (1, 0, 2)) return output, output[-1, :, :] def get_numpy_prediction_gru_single_batch( X, h, return_seq, direction, inner_activation_str="sigmoid", activation_str="tanh" ): np_out = np.zeros((seq_len, hidden_size)) batch_x = X if direction == "forward" else X[::-1, :] for k in range(seq_len): x = batch_x[k, :] z = apply_act(np.dot(W_z, x) + np.dot(R_z, h) + b_z, inner_activation_str) r = apply_act(np.dot(W_r, x) + np.dot(R_r, h) + b_r, inner_activation_str) c = h * r o = apply_act(np.dot(W_o, x) + np.dot(R_o, c) + b_o, activation_str) h = (1 - z) * o + z * h np_out[k, :] = h if return_seq: np_out_final = np_out else: np_out_final = np_out[-1:, :] return np_out_final x = np.random.rand(batch_size, seq_len, input_size).astype(dtype) h = np.random.rand(batch_size, hidden_size).astype(dtype) activation, inner_activation = activation_functions output, state = get_numpy_prediction_gru( x, h, output_sequence, direction, inner_activation, activation ) expected_outputs = [output, state] if symbolic: batch_size = get_new_symbol() seq_len = get_new_symbol() hh_wt = np.concatenate([R_r, R_o, R_z], axis=0).astype(dtype) ih_wt = np.concatenate([W_r, W_o, W_z], axis=0).astype(dtype) b = np.concatenate([b_r, b_o, b_z], axis=0).astype(dtype) input_shape = [seq_len, batch_size, input_size] h_shape = [batch_size, hidden_size] builtin_dtype = numpy_type_to_builtin_type(dtype) input_placeholders = { "x": mb.placeholder(shape=input_shape, dtype=builtin_dtype), "initial_h": mb.placeholder(shape=h_shape, dtype=builtin_dtype), } coreml_x = np.transpose(x, (1, 0, 2)) input_values = {"x": coreml_x, "initial_h": h} expected_output_types = [ (seq_len if output_sequence else 1, batch_size, hidden_size, builtin_dtype), (batch_size, hidden_size, builtin_dtype), ] def build(x, initial_h): arguments = { "x": x, "initial_h": initial_h, "weight_ih": ih_wt, "weight_hh": hh_wt, "direction": direction, "output_sequence": output_sequence, "activation": activation, "recurrent_activation": inner_activation, } # If bias is provided, add in arguments if has_bias: arguments["bias"] = b return mb.gru(**arguments) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, inputs=construct_inputs_from_placeholders(input_placeholders, upper_bound=10) if symbolic and backend.backend == "mlprogram" else None, compute_unit=compute_unit, backend=backend, ) class TestLSTM: @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "input_dims", "output_dim", "activation", "inner_activation", "outer_activation", "return_seq", "has_bias", "forget_bias", "has_peephole", "coupled_input_forget", "clip", "dtype", ] ), itertools.product( compute_units, backends, [[8, 32, 32]], [1, 4], ["sigmoid"], ["tanh"], ["relu", "scaled_tanh", "hard_sigmoid", "linear"], [False, True], [False, True], [False, True], [True, False], [False], # We have not exposed this option yet! [50.0, 0.2, 0.01], [np.float32], # Only support fp32 before iOS17. 
), ) def test_numpy_numerical( self, compute_unit, backend, input_dims, output_dim, activation, inner_activation, outer_activation, return_seq, has_bias, forget_bias, has_peephole, coupled_input_forget, clip, dtype, ): def _apply_act(x, option): # All activation functions use their standard default values. # This makes `tanh` equivalent to `scaled_tanh`, and makes `linear` a pass through. if option == "tanh" or option == "scaled_tanh": return np.tanh(x) elif option == "relu": return np.maximum(0, x) elif option == "sigmoid": return 1.0 / (1 + np.exp(-x)) elif option == "hard_sigmoid": return np.minimum(np.maximum(0.2 * x + 0.5, 0), 1) elif option == "linear": return x else: raise ValueError("activation invalid") def _clip(x, threshold=500.0): return np.maximum(np.minimum(x, threshold), -threshold) def _get_numpy_prediction_lstm(Weights, X): # X : (batch, seq_len, channel) batch, _, _ = X.shape out = [] for i in range(batch): out.append( _get_numpy_prediction_lstm_single_batch( Weights, np.expand_dims(X[i, :, :], axis=0) ) ) return np.stack(out, axis=0) def _get_numpy_prediction_lstm_single_batch(Weights, X): batch_size, seq_len, input_size = X.shape X = X[0, :, :] hidden_size = output_dim b = Weights["b"] Wx_i, Wx_f, Wx_o, Wx_g = np.split(Weights["W_x"], 4) Wh_i, Wh_f, Wh_o, Wh_g = np.split(Weights["W_h"], 4) b_i, b_f, b_o, b_g = np.split(b, 4) p_i, p_f, p_o = np.split(Weights["p"], 3) act1 = activation act2 = inner_activation act3 = outer_activation h = np.zeros((hidden_size)) c = np.zeros((hidden_size)) np_out = np.zeros((seq_len, hidden_size)) for k in range(seq_len): x = X[k, :] i = _apply_act(np.dot(Wx_i, x) + np.dot(Wh_i, h) + b_i + c * p_i, act1) f = _apply_act(np.dot(Wx_f, x) + np.dot(Wh_f, h) + b_f + c * p_f, act1) g = _apply_act(np.dot(Wx_g, x) + np.dot(Wh_g, h) + b_g, act2) if coupled_input_forget: c = c * (1 - i) + i * g else: c = c * f + i * g c = _clip(c, clip) o = _apply_act(np.dot(Wx_o, x) + np.dot(Wh_o, h) + b_o + c * p_o, act1) h = o * _apply_act(c, act3) np_out[k, :] = h if return_seq: np_out_final = np_out else: np_out_final = np_out[-1:, :] return np_out_final batch, seq_len, input_size = input_dims hidden_size = output_dim # define random weights W_x = np.random.rand(4 * hidden_size, input_size).astype(dtype) W_h = np.random.rand(4 * hidden_size, hidden_size).astype(dtype) if has_bias: b = np.random.rand(4 * hidden_size) - 0.5 if forget_bias: b = b + 1 else: b = np.zeros((4 * hidden_size)) b = b.astype(dtype) if has_peephole: p = np.random.rand(3 * hidden_size) - 0.5 else: p = np.zeros((3 * hidden_size)) p = p.astype(dtype) weights = {"W_x": W_x, "W_h": W_h, "b": b, "p": p} input_data = np.random.rand(batch, seq_len, input_size).astype(dtype) numpy_preds = _get_numpy_prediction_lstm(weights, input_data) numpy_preds = np.transpose(numpy_preds, [1, 0, 2]) coreml_input_data = np.transpose(input_data, [1, 0, 2]) builtin_dtype = numpy_type_to_builtin_type(dtype) input_placeholders = { "x": mb.placeholder(shape=coreml_input_data.shape, dtype=builtin_dtype) } input_values = {"x": coreml_input_data} def build(x): h_all, ht, ct = mb.lstm( x=x, initial_h=np.zeros((batch, hidden_size)).astype(dtype), initial_c=np.zeros((batch, hidden_size)).astype(dtype), weight_ih=W_x, weight_hh=W_h, peephole=p, direction="forward", bias=b, output_sequence=return_seq, recurrent_activation=activation, cell_activation=inner_activation, activation=outer_activation, clip=dtype(clip), ) return h_all expected_output_types = ( seq_len if return_seq else 1, batch, hidden_size, builtin_dtype, ) 
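# numpy_preds above is the pure-numpy LSTM reference, already transposed to the
# (seq_len, batch, hidden_size) layout that the MIL lstm op emits; build() returns
# only h_all, so a single expected output / output-type pair is compared below.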
expected_outputs = numpy_preds run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, atol=1e-3, rtol=1e-3, ) @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( argnames=[ "compute_unit", "backend", "seq_len", "batch_size", "input_size", "hidden_size", "has_bias", "output_sequence", "direction", "symbolic", "dtype", ], argvalues=itertools.product( compute_units, backends, [1, 8], [1, 32], [1, 64], [1, 16], [True, False], [True, False], ["forward", "reverse"], [True, False], [np.float32], # Only support fp32 before iOS17. ), ) def test_builder_to_backend_smoke_unilstm( self, compute_unit, backend, seq_len, batch_size, input_size, hidden_size, has_bias, output_sequence, direction, symbolic, dtype, ): torch.manual_seed(50) rnn = torch.nn.LSTM(input_size, hidden_size, 1, bias=has_bias) state_dict = rnn.state_dict() ih_wt = state_dict["weight_ih_l0"].detach().numpy() hh_wt = state_dict["weight_hh_l0"].detach().numpy() # Make weight compatible to CoreML format def ifzo_to_ifoz(x): i, f, z, o = np.split(x, 4) return np.concatenate([i, f, o, z], axis=0).astype(dtype) w_x = ifzo_to_ifoz(ih_wt) w_h = ifzo_to_ifoz(hh_wt) b = None if has_bias: ih_b = state_dict["bias_ih_l0"].detach().numpy() hh_b = state_dict["bias_hh_l0"].detach().numpy() ih_b = ifzo_to_ifoz(ih_b) hh_b = ifzo_to_ifoz(hh_b) b = ih_b + hh_b t = torch.randn(seq_len, batch_size, input_size) h0 = torch.randn(1, batch_size, hidden_size) c0 = torch.randn(1, batch_size, hidden_size) n_t = t if direction == "reverse": n_t = torch.flip(n_t, [0]) output, (hn, cn) = rnn(n_t, (h0, c0)) if not output_sequence: output = output[-1].unsqueeze(0) output = output.detach().numpy() hn = hn.detach().numpy().squeeze(0) cn = cn.detach().numpy().squeeze(0) t = np.reshape(t.detach().numpy(), [seq_len, batch_size, input_size]) h = np.reshape(h0.detach().numpy().squeeze(0), [batch_size, hidden_size]) c = np.reshape(c0.detach().numpy().squeeze(0), [batch_size, hidden_size]) if symbolic: batch_size = get_new_symbol() seq_len = get_new_symbol() input_shape = [seq_len, batch_size, input_size] h_shape = [batch_size, hidden_size] c_shape = [batch_size, hidden_size] builtin_dtype = numpy_type_to_builtin_type(dtype) expected_output_types = [ (seq_len if output_sequence else 1, batch_size, hidden_size, builtin_dtype), (batch_size, hidden_size, builtin_dtype), (batch_size, hidden_size, builtin_dtype), ] expected_outputs = [output, hn, cn] input_placeholders = { "x": mb.placeholder(shape=input_shape, dtype=builtin_dtype), "initial_h": mb.placeholder(shape=h_shape, dtype=builtin_dtype), "initial_c": mb.placeholder(shape=c_shape, dtype=builtin_dtype), } input_values = { "x": t.astype(dtype), "initial_h": h.astype(dtype), "initial_c": c.astype(dtype), } def build(x, initial_h, initial_c): arguments = { "x": x, "initial_h": initial_h, "initial_c": initial_c, "weight_ih": w_x, "weight_hh": w_h, "direction": direction, "output_sequence": output_sequence, } # If bias is provided, add in arguments if b is not None: arguments["bias"] = b return mb.lstm(**arguments) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, inputs=construct_inputs_from_placeholders(input_placeholders, upper_bound=64) if symbolic and backend.backend == "mlprogram" else None, compute_unit=compute_unit, backend=backend, ) @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( argnames=[ 
"compute_unit", "backend", "seq_len", "batch_size", "input_size", "hidden_size", "has_bias", "output_sequence", "symbolic", "dtype", ], argvalues=itertools.product( compute_units, backends, [1, 8], [1, 32], [1, 64], [2, 16], [True, False], [True, False], [True, False], [np.float32], ), ) def test_builder_to_backend_smoke_bidirlstm( self, compute_unit, backend, seq_len, batch_size, input_size, hidden_size, has_bias, output_sequence, symbolic, dtype, ): def _pytorch_hidden_to_coreml(x): x = x.detach().numpy() # Split of Direction axis f, b = np.split(x, 2, axis=0) # Concat on Hidden Size axis x = np.concatenate([f, b], axis=2) x = np.squeeze(x, axis=0) return x direction = "bidirectional" torch.manual_seed(20) rnn = torch.nn.LSTM(input_size, hidden_size, 1, bidirectional=True, bias=has_bias) state_dict = rnn.state_dict() ih_wt = state_dict["weight_ih_l0"].detach().numpy() hh_wt = state_dict["weight_hh_l0"].detach().numpy() ih_wt_r = state_dict["weight_ih_l0_reverse"].detach().numpy() hh_wt_r = state_dict["weight_hh_l0_reverse"].detach().numpy() def ifzo_to_ifoz(x): i, f, z, o = np.split(x, 4) return np.concatenate([i, f, o, z], axis=0).astype(dtype) wx = ifzo_to_ifoz(ih_wt) wh = ifzo_to_ifoz(hh_wt) r_wx = ifzo_to_ifoz(ih_wt_r) r_wh = ifzo_to_ifoz(hh_wt_r) b, r_b = None, None if has_bias: ih_b = state_dict["bias_ih_l0"].detach().numpy() hh_b = state_dict["bias_hh_l0"].detach().numpy() r_ih_b = state_dict["bias_ih_l0_reverse"].detach().numpy() r_hh_b = state_dict["bias_hh_l0_reverse"].detach().numpy() # Convert forward bias into [4*H] b = ih_b + hh_b b = ifzo_to_ifoz(b) # Convert reverse bias into [*H] r_b = r_ih_b + r_hh_b r_b = ifzo_to_ifoz(r_b) t = torch.randn(seq_len, batch_size, input_size) h0 = torch.randn(2, batch_size, hidden_size) c0 = torch.randn(2, batch_size, hidden_size) output, (hn, cn) = rnn(t, (h0, c0)) if not output_sequence: output_f = output[-1].unsqueeze(0)[:, :, :hidden_size] output_r = output[0].unsqueeze(0)[:, :, hidden_size:] output = torch.cat([output_f, output_r], dim=2) output = output.detach().numpy().astype(dtype) hn = _pytorch_hidden_to_coreml(hn).astype(dtype) cn = _pytorch_hidden_to_coreml(cn).astype(dtype) if symbolic: batch_size = get_new_symbol() seq_len = get_new_symbol() input_shape = [seq_len, batch_size, input_size] h_shape = [batch_size, 2 * hidden_size] c_shape = [batch_size, 2 * hidden_size] builtin_dtype = numpy_type_to_builtin_type(dtype) expected_output_types = [ ( seq_len if output_sequence else 1, batch_size, 2 * hidden_size, builtin_dtype, ), (batch_size, 2 * hidden_size, builtin_dtype), (batch_size, 2 * hidden_size, builtin_dtype), ] expected_outputs = [output, hn, cn] t = t.detach().numpy() h = _pytorch_hidden_to_coreml(h0) c = _pytorch_hidden_to_coreml(c0) input_placeholders = { "x": mb.placeholder(shape=input_shape, dtype=builtin_dtype), "initial_h": mb.placeholder(shape=h_shape, dtype=builtin_dtype), "initial_c": mb.placeholder(shape=c_shape, dtype=builtin_dtype), } input_values = { "x": t.astype(dtype), "initial_h": h.astype(dtype), "initial_c": c.astype(dtype), } def build(x, initial_h, initial_c): arguments = { "x": x, "initial_h": initial_h, "initial_c": initial_c, "weight_ih": wx, "weight_hh": wh, "weight_ih_back": r_wx, "weight_hh_back": r_wh, "direction": direction, "output_sequence": output_sequence, } # If bias is provided, add in arguments if b is not None: arguments["bias"] = b arguments["bias_back"] = r_b return mb.lstm(**arguments) run_compare_builder( build, input_placeholders, input_values, expected_output_types, 
expected_outputs, inputs=construct_inputs_from_placeholders(input_placeholders, upper_bound=64) if symbolic and backend.backend == "mlprogram" else None, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_invalid_bidirectional_lstm(self): with pytest.raises( ValueError, match="For bidirectional LSTM, the `weight_ih_back` and " "`weight_hh_back` must be provided.", ): seq_len = 3 batch = 2 input_size = 4 hidden_size = 5 mb.lstm( x=np.random.rand(seq_len, batch, input_size), initial_h=np.zeros((batch, hidden_size)).astype(np.float32), initial_c=np.zeros((batch, hidden_size)).astype(np.float32), weight_ih=np.random.rand(4 * hidden_size, input_size), weight_hh=np.random.rand(4 * hidden_size, hidden_size), direction="bidirectional", ) @ssa_fn def test_invalid_activation_lstm(self): seq_len = 3 batch = 2 input_size = 4 hidden_size = 5 arguments = { "x": np.random.rand(seq_len, batch, input_size), "initial_h": np.zeros((batch, hidden_size)).astype(np.float32), "initial_c": np.zeros((batch, hidden_size)).astype(np.float32), "weight_ih": np.random.rand(4 * hidden_size, input_size), "weight_hh": np.random.rand(4 * hidden_size, hidden_size), "direction": "forward", } with pytest.raises(ValueError, match="Activation `dummy` not supported."): mb.lstm(recurrent_activation="dummy", **arguments) with pytest.raises(ValueError, match="Activation `dummy` not supported."): mb.lstm(cell_activation="dummy", **arguments) with pytest.raises(ValueError, match="Activation `dummy` not supported."): mb.lstm(activation="dummy", **arguments) class TestRNN: @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( argnames=[ "compute_unit", "backend", "seq_len", "batch_size", "input_size", "hidden_size", "has_bias", "output_sequence", "direction", "symbolic", "dtype", ], argvalues=itertools.product( compute_units, backends, [2, 8], [1, 32], [1, 64], [1, 16], [True, False], [True, False], ["forward", "reverse"], [True, False], [np.float32], ), ) def test_builder_to_backend_smoke( self, compute_unit, backend, seq_len, batch_size, input_size, hidden_size, has_bias, output_sequence, direction, symbolic, dtype, ): torch.manual_seed(50) rnn = torch.nn.RNN(input_size, hidden_size, 1, bias=has_bias) state_dict = rnn.state_dict() ih_wt = state_dict["weight_ih_l0"].detach().numpy().astype(dtype) hh_wt = state_dict["weight_hh_l0"].detach().numpy().astype(dtype) b = None if has_bias: ih_b = state_dict["bias_ih_l0"].detach().numpy().astype(dtype) hh_b = state_dict["bias_hh_l0"].detach().numpy().astype(dtype) b = ih_b + hh_b t = torch.randn(seq_len, batch_size, input_size) h0 = torch.randn(1, batch_size, hidden_size) n_t = t if direction == "reverse": n_t = torch.flip(n_t, [0]) output, hn = rnn(n_t, h0) if not output_sequence: output = output[-1].unsqueeze(0) output = output.detach().numpy() hn = hn.detach().numpy().squeeze(0) t = np.reshape(t.detach().numpy(), [seq_len, batch_size, input_size]) h = np.reshape(h0.detach().numpy().squeeze(0), [batch_size, hidden_size]) if symbolic: batch_size = get_new_symbol() seq_len = get_new_symbol() input_shape = [seq_len, batch_size, input_size] h_shape = [batch_size, hidden_size] builtin_dtype = numpy_type_to_builtin_type(dtype) expected_output_types = [ (seq_len if output_sequence else 1, batch_size, hidden_size, builtin_dtype), (batch_size, hidden_size, builtin_dtype), ] expected_outputs = [output, hn] input_placeholders = { "x": mb.placeholder(shape=input_shape, dtype=builtin_dtype), "initial_h": mb.placeholder(shape=h_shape, dtype=builtin_dtype), 
} input_values = {"x": t.astype(dtype), "initial_h": h.astype(dtype)} def build(x, initial_h): arguments = { "x": x, "initial_h": initial_h, "weight_ih": ih_wt, "weight_hh": hh_wt, "direction": direction, "output_sequence": output_sequence, } # If bias is provided, add in arguments if b is not None: arguments["bias"] = b return mb.rnn(**arguments) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, inputs=construct_inputs_from_placeholders(input_placeholders, upper_bound=64) if symbolic and backend.backend == "mlprogram" else None, compute_unit=compute_unit, backend=backend, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS14/test_reduction.py0000644000000000000000000003272314672066616027013 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest import scipy from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import get_new_symbol, types from coremltools.converters.mil.mil.ops.tests.iOS14 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import run_compare_builder from coremltools.converters.mil.testing_reqs import compute_units from coremltools.converters.mil.testing_utils import random_gen, ssa_fn class TestReduction: # All ops in this test share the same backends @pytest.mark.parametrize( "compute_unit, backend, mode", itertools.product( compute_units, backends, [ "argmax", "argmin", "l1_norm", "l2_norm", "log_sum", "log_sum_exp", "max", "mean", "min", "prod", "sum", "sum_square", ], ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, mode): val = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=val.shape)} input_values = {"x": val} if mode in {"argmax", "argmin"}: expected_output_types = (2, types.int32) else: expected_output_types = (2, types.fp32) if mode == "argmax": build = lambda x: mb.reduce_argmax(x=x, axis=1, keep_dims=False) expected_outputs = np.array([2, 2], dtype=np.int32) elif mode == "argmin": build = lambda x: mb.reduce_argmin(x=x, axis=1, keep_dims=False) expected_outputs = np.array([0, 0], dtype=np.int32) elif mode == "l1_norm": build = lambda x: mb.reduce_l1_norm(x=x, axes=[1], keep_dims=False) expected_outputs = np.array([6.0, 15.0], dtype=np.float32) elif mode == "l2_norm": build = lambda x: mb.reduce_l2_norm(x=x, axes=[1], keep_dims=False) expected_outputs = np.array([3.74165738, 8.77496438], dtype=np.float32) elif mode == "log_sum": build = lambda x: mb.reduce_log_sum(x=x, axes=[1], keep_dims=False) expected_outputs = np.array([1.7917595, 2.70805025], dtype=np.float32) elif mode == "log_sum_exp": build = lambda x: mb.reduce_log_sum_exp(x=x, axes=[1], keep_dims=False) expected_outputs = np.array([3.40760589, 6.40760612], dtype=np.float32) elif mode == "max": build = lambda x: mb.reduce_max(x=x, axes=[1], keep_dims=False) expected_outputs = np.array([3.0, 6.0], dtype=np.float32) elif mode == "mean": build = lambda x: mb.reduce_mean(x=x, axes=[1], keep_dims=False) expected_outputs = np.array([2.0, 5.0], dtype=np.float32) elif mode == "min": build = lambda x: mb.reduce_min(x=x, axes=[1], keep_dims=False) expected_outputs = np.array([1.0, 4.0], 
dtype=np.float32) elif mode == "prod": build = lambda x: mb.reduce_prod(x=x, axes=[1], keep_dims=False) expected_outputs = np.array([6.0, 120.0], dtype=np.float32) elif mode == "sum": build = lambda x: mb.reduce_sum(x=x, axes=[1], keep_dims=False) expected_outputs = np.array([6.0, 15.0], dtype=np.float32) elif mode == "sum_square": build = lambda x: mb.reduce_sum_square(x=x, axes=[1], keep_dims=False) expected_outputs = np.array([14.0, 77.0], dtype=np.float32) else: raise NotImplementedError() run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, mode", itertools.product(compute_units, backends, ["max", "mean"]), ) def test_builder_to_backend_global_pool_2d(self, compute_unit, backend, mode): # test lowering to spatial reduction to global_pool path val = np.array([[[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=val.shape)} input_values = {"x": val} expected_output_types = (1, 1, 1, 1, types.fp32) if mode == "max": build = lambda x: mb.reduce_max(x=x, axes=[2, -1], keep_dims=True) expected_outputs = np.array([[[[6.0]]]], dtype=np.float32) elif mode == "mean": build = lambda x: mb.reduce_mean(x=x, axes=[3, -2], keep_dims=True) expected_outputs = np.array([[[[3.5]]]], dtype=np.float32) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, mode", itertools.product(compute_units, backends, ["max", "mean"]), ) def test_builder_to_backend_global_pool_none(self, compute_unit, backend, mode): # test lowering to spatial reduction to global_pool path for axis = None val = np.array( [[[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]]], dtype=np.float32, ) input_placeholders = {"x": mb.placeholder(shape=val.shape)} input_values = {"x": val} expected_output_types = (1, 1, 1, 1, types.fp32) if mode == "max": build = lambda x: mb.reduce_max(x=x, axes=None, keep_dims=True) expected_outputs = np.array([[[[6.0]]]], dtype=np.float32) elif mode == "mean": build = lambda x: mb.reduce_mean(x=x, axes=None, keep_dims=True) expected_outputs = np.array([[[[3.5]]]], dtype=np.float32) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, mode", itertools.product(compute_units, backends, ["max", "mean"]), ) def test_builder_to_backend_global_pool_3d(self, compute_unit, backend, mode): # test lowering to spatial reduction to global_pool path val = np.array([[[[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]]]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=val.shape)} input_values = {"x": val} expected_output_types = (1, 1, 1, 1, 1, types.fp32) if mode == "max": build = lambda x: mb.reduce_max(x=x, axes=[2, -1, 3], keep_dims=True) expected_outputs = np.array([[[[[6.0]]]]], dtype=np.float32) elif mode == "mean": build = lambda x: mb.reduce_mean(x=x, axes=[-3, 3, 4], keep_dims=True) expected_outputs = np.array([[[[[3.5]]]]], dtype=np.float32) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize(["axis", "keep_dims"], itertools.product([1, -3], [True, False])) def test_builder_eval(self, axis, 
keep_dims): x_val = random_gen(shape=(1, 3, 4, 4), rand_min=-100.0, rand_max=100.0) @ssa_fn def test_reduce_argmax(): res = mb.reduce_argmax(x=x_val, axis=axis, keep_dims=keep_dims).val ref = np.argmax(x_val, axis=axis) if keep_dims: ref = np.expand_dims(ref, axis=axis) np.testing.assert_allclose(ref, res, atol=1e-04, rtol=1e-05) @ssa_fn def test_reduce_argmin(): res = mb.reduce_argmin(x=x_val, axis=axis, keep_dims=keep_dims).val ref = np.argmin(x_val, axis=axis) if keep_dims: ref = np.expand_dims(ref, axis=axis) np.testing.assert_allclose(ref, res, atol=1e-04, rtol=1e-05) @ssa_fn def test_reduce_l1_norm(): res = mb.reduce_l1_norm(x=x_val, axes=[axis], keep_dims=keep_dims).val ref = np.sum(np.abs(x_val), axis=axis, keepdims=keep_dims) np.testing.assert_allclose(ref, res, atol=1e-04, rtol=1e-05) @ssa_fn def test_reduce_l2_norm(): res = mb.reduce_l2_norm(x=x_val, axes=[axis], keep_dims=keep_dims).val ref = np.sqrt(np.sum(np.square(x_val), axis=axis, keepdims=keep_dims)) np.testing.assert_allclose(ref, res, atol=1e-04, rtol=1e-05) @ssa_fn def test_reduce_log_sum(): x_val = random_gen(shape=(1, 3, 4, 4), rand_min=0.0, rand_max=100.0) res = mb.reduce_log_sum(x=x_val, axes=[axis], keep_dims=keep_dims).val ref = np.log(np.sum(x_val, axis=axis, keepdims=keep_dims)) np.testing.assert_allclose(ref, res, atol=1e-04, rtol=1e-05) @ssa_fn def test_reduce_log_sum_exp(): res = mb.reduce_log_sum_exp(x=x_val, axes=[axis], keep_dims=keep_dims).val ref = scipy.special.logsumexp(x_val, axis=axis, keepdims=keep_dims) np.testing.assert_allclose(ref, res, atol=1e-04, rtol=1e-05) @ssa_fn def test_reduce_max(): res = mb.reduce_max(x=x_val, axes=[axis], keep_dims=keep_dims).val ref = np.max(x_val, axis=axis, keepdims=keep_dims) np.testing.assert_allclose(ref, res, atol=1e-04, rtol=1e-05) @ssa_fn def test_reduce_mean(): res = mb.reduce_mean(x=x_val, axes=[axis], keep_dims=keep_dims).val ref = np.mean(x_val, axis=axis, keepdims=keep_dims) np.testing.assert_allclose(ref, res, atol=1e-04, rtol=1e-05) @ssa_fn def test_reduce_min(): res = mb.reduce_min(x=x_val, axes=[axis], keep_dims=keep_dims).val ref = np.min(x_val, axis=axis, keepdims=keep_dims) np.testing.assert_allclose(ref, res, atol=1e-04, rtol=1e-05) @ssa_fn def test_reduce_prod(): # test value res = mb.reduce_prod(x=x_val, axes=[axis], keep_dims=keep_dims).val ref = np.prod(x_val, axis=axis, keepdims=keep_dims) np.testing.assert_allclose(ref, res, atol=1e-04, rtol=1e-05) # test dtype for int input res = mb.reduce_prod(x=x_val.astype(np.int32), axes=[axis], keep_dims=keep_dims).val assert res.dtype == np.int32 @ssa_fn def test_reduce_sum(): res = mb.reduce_sum(x=x_val, axes=[axis], keep_dims=keep_dims).val ref = np.sum(x_val, axis=axis, keepdims=keep_dims) np.testing.assert_allclose(ref, res, atol=1e-04, rtol=1e-05) @ssa_fn def test_reduce_sum_square(): res = mb.reduce_sum_square(x=x_val, axes=[axis], keep_dims=keep_dims).val ref = np.sum(np.square(x_val), axis=axis, keepdims=keep_dims) np.testing.assert_allclose(ref, res, atol=1e-04, rtol=1e-05) test_reduce_argmax() test_reduce_argmin() test_reduce_l1_norm() test_reduce_l2_norm() test_reduce_log_sum() test_reduce_log_sum_exp() test_reduce_max() test_reduce_mean() test_reduce_min() test_reduce_prod() test_reduce_sum() test_reduce_sum_square() @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_symbolic(self, compute_unit, backend): s0 = get_new_symbol() val = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], dtype=np.float32) input_placeholders 
= {"x": mb.placeholder(shape=(s0, 3))} input_values = {"x": val} def build(x): return [ mb.reduce_argmax(x=x, axis=1, keep_dims=True), mb.reduce_argmin(x=x, axis=0, keep_dims=True), ] expected_output_types = [(s0, 1, types.int32), (1, 3, types.int32)] expected_outputs = [ np.array([[2], [2]], dtype=np.int32), np.array([[0, 0, 0]], dtype=np.int32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize("input_size", [(1), (2), (1, 2), (2, 2), (2, 3, 4), (2, 3, 4, 10)]) def test_reduce_log_sum_exp_value_inference(self, input_size): rs = np.random.RandomState(1234) x = rs.random(input_size) for axis in range(-x.ndim, x.ndim - 1): @mb.program(input_specs=[]) def prog(): return mb.reduce_log_sum_exp(x=x, axes=(axis,)) ops = list(prog.functions.values())[0].operations op = list(ops)[3] assert op.op_type == "reduce_log_sum_exp" np.testing.assert_allclose( op.value_inference(), scipy.special.logsumexp(x, axis=axis), atol=1e-04, rtol=1e-05 ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS14/test_scatter_gather.py0000644000000000000000000006265114672066616030021 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest import coremltools as ct from coremltools._deps import _HAS_TF_2, MSG_TF2_NOT_FOUND from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import get_new_symbol, types from coremltools.converters.mil.mil.ops.tests.iOS14 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import ( mark_api_breaking, run_compare_builder, ) from coremltools.converters.mil.testing_reqs import compute_units if _HAS_TF_2: import tensorflow as tf class TestScatter: @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_builder_to_backend_smoke(self, compute_unit, backend): data = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) indices = np.array([1, 0], dtype=np.int32) updates = np.array([[5, 6, 7], [8, 9, 10]], dtype=np.float32) input_placeholders = { "data": mb.placeholder(shape=data.shape), "indices": mb.placeholder(shape=indices.shape, dtype=types.int32), "updates": mb.placeholder(shape=updates.shape), } input_values = {"data": data, "indices": indices, "updates": updates} def build(data, indices, updates): return (mb.scatter(data=data, indices=indices, updates=updates),) expected_output_types = (2, 3, types.fp32) expected_outputs = np.array([[9, 11, 13], [9, 11, 13]], dtype=np.float32) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.skipif(not _HAS_TF_2, reason=MSG_TF2_NOT_FOUND) @pytest.mark.parametrize( "compute_unit, backend, rankData_rankIndices, accumulate_mode", itertools.product( compute_units, backends, [(1, 2), (2, 1), (3, 2), (2, 3), (1, 1), (3, 3), (1, 3)], ["update", "add", "sub", "mul", "div", "max", "min"], ), ) def test_builder_to_backend_programmatic( self, compute_unit, backend, rankData_rankIndices, accumulate_mode, ): data_rank, indices_rank = rankData_rankIndices data_shape = np.random.randint(low=2, high=5, 
size=data_rank) indices_shape = np.random.randint(low=2, high=5, size=indices_rank) updates_shape = list(indices_shape) + list(data_shape[1:]) data = np.random.rand(*data_shape).astype(np.float32) updates = np.random.rand(*updates_shape).astype(np.float32) indices = np.random.randint(0, data_shape[0], size=indices_shape).astype(np.int32) def build(data, indices, updates): return mb.scatter(data=data, indices=indices, updates=updates, mode=accumulate_mode) tf_output = tf.Variable(data) if accumulate_mode == "update": tf.compat.v1.scatter_update(tf_output, indices, updates) if accumulate_mode == "add": tf.compat.v1.scatter_add(tf_output, indices, updates) if accumulate_mode == "sub": tf.compat.v1.scatter_sub(tf_output, indices, updates) if accumulate_mode == "mul": tf.compat.v1.scatter_mul(tf_output, indices, updates) if accumulate_mode == "div": tf.compat.v1.scatter_div(tf_output, indices, updates) if accumulate_mode == "max": tf.compat.v1.scatter_max(tf_output, indices, updates) if accumulate_mode == "min": tf.compat.v1.scatter_min(tf_output, indices, updates) expected_output = tf_output.numpy() input_placeholders = { "data": mb.placeholder(shape=data.shape), "indices": mb.placeholder(shape=indices.shape, dtype=types.int32), "updates": mb.placeholder(shape=updates.shape), } input_values = {"data": data, "indices": indices, "updates": updates} expected_output_types = tuple(data_shape[:]) + (types.fp32,) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_output, compute_unit=compute_unit, backend=backend, ) class TestScatterAlongAxis: @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_builder_to_backend_smoke(self, compute_unit, backend): data = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) indices = np.array([[1, 0, 1], [1, 1, 0]], dtype=np.int32) updates = np.array([[5, 6, 7], [8, 9, 10]], dtype=np.float32) input_placeholders = { "data": mb.placeholder(shape=data.shape), "indices": mb.placeholder(shape=indices.shape, dtype=types.int32), "updates": mb.placeholder(shape=updates.shape), } input_values = {"data": data, "indices": indices, "updates": updates} def build(data, indices, updates): return mb.scatter_along_axis( data=data, indices=indices, updates=updates, axis=0, mode="update" ) expected_output_types = (2, 3, types.fp32) expected_outputs = np.array([[1, 6, 10], [8, 9, 7]], dtype=np.float32) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "backend", backends, ) def test_builder_eval(self, backend): @mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)], opset_version=backend.opset_version, ) def prog(x): params = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) indices = np.array([[1, 0, 1], [1, 1, 0]], dtype=np.int32) updates = np.array([[5, 6, 7], [8, 9, 10]], dtype=np.float32) res = mb.scatter_along_axis( data=params, indices=indices, updates=updates, axis=0, mode="update" ) return res main_func = prog.functions["main"] gather_ops = main_func.find_ops(op_type="scatter_along_axis")[0] np.testing.assert_allclose( np.array([[1, 6, 10], [8, 9, 7]], dtype=np.float32), gather_ops.outputs[0].val, atol=1e-04, rtol=1e-05, ) @staticmethod def _test_builder_to_backend_programmatic( compute_unit, backend, rank_axis, force_non_negative_indices ): rank, axis = rank_axis data_shape = np.random.randint(low=2, high=8, size=rank) indices_shape = 
np.copy(data_shape) indices_shape[axis] = np.random.randint(low=1, high=8) updates_shape = indices_shape data = np.random.rand(*data_shape).astype(np.float32) updates = np.random.rand(*updates_shape).astype(np.float32) if force_non_negative_indices: # IOS17 scatter_along_axis requires indices to be non-negative. indices = np.random.randint(0, data_shape[axis], size=indices_shape).astype(np.int32) else: indices = np.random.randint( -data_shape[axis], data_shape[axis], size=indices_shape ).astype(np.int32) def build(data, indices, updates): return mb.scatter_along_axis( data=data, indices=indices, updates=updates, axis=axis, mode="update" ) input_placeholders = { "data": mb.placeholder(shape=data.shape), "indices": mb.placeholder(shape=indices.shape, dtype=types.int32), "updates": mb.placeholder(shape=updates.shape), } input_values = {"data": data, "indices": indices, "updates": updates} expected_output_types = tuple(data_shape[:]) + (types.fp32,) np_output = np.copy(data) np.put_along_axis(np_output, indices, updates, axis=axis) run_compare_builder( build, input_placeholders, input_values, expected_output_types, np_output, compute_unit=compute_unit, backend=backend, ) @mark_api_breaking(breaking_opset_version=ct.target.iOS17) @pytest.mark.parametrize( "compute_unit, backend, rank_axis", itertools.product( compute_units, backends, [(rank, axis) for rank in range(1, 5) for axis in range(-rank, rank)], ), ) def test_builder_to_backend_programmatic( self, compute_unit, backend, rank_axis, ): self._test_builder_to_backend_programmatic( compute_unit, backend, rank_axis, force_non_negative_indices=False ) class TestScatterNd: @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_builder_to_backend_smoke(self, compute_unit, backend): data = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) indices = np.array([[1, 0], [0, 2]], dtype=np.int32) updates = np.array([5, 10], dtype=np.float32) input_placeholders = { "data": mb.placeholder(shape=data.shape), "indices": mb.placeholder(shape=indices.shape, dtype=types.int32), "updates": mb.placeholder(shape=updates.shape), } input_values = {"data": data, "indices": indices, "updates": updates} def build(data, indices, updates): return (mb.scatter_nd(data=data, indices=indices, updates=updates),) run_compare_builder( build, input_placeholders, input_values, expected_output_types=(2, 3, types.fp32), expected_outputs=np.array([[1, 2, 13], [9, 5, 6]], dtype=np.float32), compute_unit=compute_unit, backend=backend, ) @pytest.mark.skipif(not _HAS_TF_2, reason=MSG_TF2_NOT_FOUND) @pytest.mark.parametrize( "compute_unit, backend, rankData_rankIndices, accumulate_mode", itertools.product( compute_units, backends, [(2, 2), (1, 4), (5, 2), (4, 3), (3, 4), (1, 5)], ["update", "add", "sub"], ), ) def test_builder_to_backend_programmatic( self, compute_unit, backend, rankData_rankIndices, accumulate_mode, ): data_rank, indices_rank = rankData_rankIndices data_shape = np.random.randint(low=2, high=5, size=data_rank) indices_shape = np.random.randint(low=2, high=5, size=indices_rank) indices_shape[-1] = np.random.randint(low=1, high=data_rank + 1) updates_shape = list(indices_shape[:-1]) + list(data_shape[indices_shape[-1] :]) data = np.random.rand(*data_shape).astype(np.float32) updates = np.random.rand(*updates_shape).astype(np.float32) indices_list = [] for i in range(indices_shape[-1]): indices_list.append(np.random.randint(0, data_shape[i], size=indices_shape[:-1])) indices = np.stack(indices_list, 
axis=-1).astype(np.int32) def build(data, indices, updates): return mb.scatter_nd(data=data, indices=indices, updates=updates, mode=accumulate_mode) tf_output = tf.Variable(data) if accumulate_mode == "update": tf.compat.v1.scatter_nd_update(tf_output, indices, updates) if accumulate_mode == "add": tf.compat.v1.scatter_nd_add(tf_output, indices, updates) if accumulate_mode == "sub": tf.compat.v1.scatter_nd_sub(tf_output, indices, updates) expected_output = tf_output.numpy() input_placeholders = { "data": mb.placeholder(shape=data.shape), "indices": mb.placeholder(shape=indices.shape, dtype=types.int32), "updates": mb.placeholder(shape=updates.shape), } input_values = {"data": data, "indices": indices, "updates": updates} expected_output_types = tuple(data_shape[:]) + (types.fp32,) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_output, compute_unit=compute_unit, backend=backend, ) class TestGather: @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) indices = np.array([1, 0], dtype=np.int32) input_placeholders = { "x": mb.placeholder(shape=x.shape), "indices": mb.placeholder(shape=indices.shape, dtype=types.int32), } input_values = {"x": x, "indices": indices} def build(x, indices): return [ mb.gather(x=x, indices=indices, axis=0), mb.gather(x=x, indices=indices, axis=1), mb.gather(x=x, indices=indices, axis=-2), mb.gather(x=x, indices=indices, axis=-1), mb.gather(x=x, indices=indices), # mb.gather(x=x, indices=1), #shape of scalar indices is incorrect. # mb.gather(x=x, indices=1, axis=1), #Scalar index passes on axis=0 but fails on axis=1, # Need to handle rank 0 correctly, rdar://73160449 ] expected_output_types = [ (2, 3, types.fp32), (2, 2, types.fp32), (2, 3, types.fp32), (2, 2, types.fp32), (2, 3, types.fp32), # (3, types.fp32), ] expected_outputs = [ np.array([[4, 5, 6], [1, 2, 3]], dtype=np.float32), np.array([[2, 1], [5, 4]], dtype=np.float32), np.array([[4, 5, 6], [1, 2, 3]], dtype=np.float32), np.array([[2, 1], [5, 4]], dtype=np.float32), np.array([[4, 5, 6], [1, 2, 3]], dtype=np.float32), # np.array([4, 5, 6], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_embedding_builder_to_backend_smoke(self, compute_unit, backend): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) indices = np.array([1, 0], dtype=np.int32) input_placeholders = { "indices": mb.placeholder(shape=indices.shape, dtype=types.int32), } input_values = {"indices": indices} def build(indices): return [ mb.gather(x=x, indices=indices, axis=0), mb.gather(x=x, indices=indices, axis=-2), ] expected_output_types = [ (2, 3, types.fp32), (2, 3, types.fp32), ] expected_outputs = [ np.array([[4, 5, 6], [1, 2, 3]], dtype=np.float32), np.array([[4, 5, 6], [1, 2, 3]], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "backend", backends, ) def test_builder_eval(self, backend): @mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)], opset_version=backend.opset_version, ) def prog(x): params = np.array([[1, 2, 3], [4, 5, 6]], 
dtype=np.float32) indices = np.array([1, 0], dtype=np.int32) res = mb.gather(x=params, indices=indices, axis=-1) return res main_func = prog.functions["main"] gather_ops = main_func.find_ops(op_type="gather")[0] np.testing.assert_allclose( np.array([[2, 1], [5, 4]], dtype=np.float32), gather_ops.outputs[0].val, atol=1e-04, rtol=1e-05, ) @pytest.mark.parametrize( "backend, indices_val, validate_indices", itertools.product(backends, [[-1, 0], [0, 3]], [True, False]), ) def test_builder_invalid_indices(self, backend, indices_val, validate_indices): def prog(x): params = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) indices = np.array(indices_val, dtype=np.int32) res = mb.gather(x=params, indices=indices, axis=-1) return res if any([idx > 2 for idx in indices_val]): with pytest.raises(IndexError, match="index 3 is out of bounds for axis 1 with size 3"): mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)], opset_version=backend.opset_version, )(prog) else: mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)], opset_version=backend.opset_version, )(prog) @staticmethod def test_gather_value_inference_on_symbolic_input(): s1, s2 = get_new_symbol(), get_new_symbol() @mb.program( input_specs=[mb.TensorSpec(shape=(2, 3, s1, s2, 5))], ) def prog(x): shape = mb.shape(x=x) gather_1 = mb.gather(x=shape, indices=0, axis=0) gather_2 = mb.gather(x=shape, indices=[0, 1], axis=0) gather_3 = mb.gather(x=shape, indices=[1, 2, 3], axis=0) # Test value inference assert gather_1.val == 2 assert gather_1.sym_val == 2 assert gather_2.val.tolist() == [2, 3] assert gather_2.sym_val.tolist() == [2, 3] assert gather_3.val is None assert gather_3.sym_val.tolist() == [3, s1, s2] return x class TestGatherAlongAxis: @pytest.mark.parametrize( "compute_unit, backend, x_dtype, indices_dtype", itertools.product(compute_units, backends, [np.float32, np.float16, np.int32], [np.int32]), ) def test_builder_to_backend_smoke(self, compute_unit, backend, x_dtype, indices_dtype): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=x_dtype) indices = np.array([[1, 0, 1], [1, 1, 0]], dtype=indices_dtype) builtin_x_dtype = types.numpy_type_to_builtin_type(x_dtype) input_placeholders = { "x": mb.placeholder(shape=x.shape, dtype=builtin_x_dtype), "indices": mb.placeholder( shape=indices.shape, dtype=types.numpy_type_to_builtin_type(indices_dtype) ), } input_values = {"x": x, "indices": indices} def build(x, indices): return [ mb.gather_along_axis(x=x, indices=indices, axis=0), mb.gather_along_axis(x=x, indices=indices, axis=1), mb.gather_along_axis(x=x, indices=indices, axis=-2), mb.gather_along_axis(x=x, indices=indices, axis=-1), mb.gather_along_axis(x=x, indices=indices), ] expected_output_types = [ (2, 3, builtin_x_dtype), (2, 3, builtin_x_dtype), (2, 3, builtin_x_dtype), (2, 3, builtin_x_dtype), (2, 3, builtin_x_dtype), ] expected_outputs = [ np.array([[4, 2, 6], [4, 5, 3]], dtype=x_dtype), np.array([[2, 1, 2], [5, 5, 4]], dtype=x_dtype), np.array([[4, 2, 6], [4, 5, 3]], dtype=x_dtype), np.array([[2, 1, 2], [5, 5, 4]], dtype=x_dtype), np.array([[4, 2, 6], [4, 5, 3]], dtype=x_dtype), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "backend", backends, ) def test_builder_eval(self, backend): @mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)], opset_version=backend.opset_version, ) def prog(x): params = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) indices 
= np.array([[1, 0, 1], [0, 0, 1]], dtype=np.int32) res = mb.gather_along_axis(x=params, indices=indices, axis=0) return res main_func = prog.functions["main"] gather_ops = main_func.find_ops(op_type="gather_along_axis")[0] np.testing.assert_allclose( np.array([[4, 2, 6], [1, 2, 6]], dtype=np.float32), gather_ops.outputs[0].val, atol=1e-04, rtol=1e-05, ) @staticmethod def _test_builder_to_backend_programmatic( compute_unit, backend, rank_axis, x_dtype, indices_dtype, force_non_negative_indices ): rank, axis = rank_axis x_shape = np.random.randint(low=2, high=8, size=rank) indices_shape = np.copy(x_shape) indices_shape[axis] = np.random.randint(low=1, high=8) x = np.random.rand(*x_shape).astype(x_dtype) lower_bound = -x_shape[axis] if force_non_negative_indices or np.issubdtype(indices_dtype, np.unsignedinteger): lower_bound = 0 indices = np.random.randint(lower_bound, x_shape[axis], size=indices_shape).astype( indices_dtype ) def build(x, indices): return mb.gather_along_axis(x=x, indices=indices, axis=axis) builtin_x_dtype = types.numpy_type_to_builtin_type(x_dtype) input_placeholders = { "x": mb.placeholder(shape=x.shape, dtype=builtin_x_dtype), "indices": mb.placeholder( shape=indices.shape, dtype=types.numpy_type_to_builtin_type(indices_dtype) ), } input_values = {"x": x, "indices": indices} expected_output_types = tuple(indices_shape[:]) + (builtin_x_dtype,) expected_output = np.take_along_axis(x, indices, axis=axis) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_output, compute_unit=compute_unit, backend=backend, ) @mark_api_breaking(breaking_opset_version=ct.target.iOS17) @pytest.mark.parametrize( "compute_unit, backend, rank_axis, x_dtype, indices_dtype", itertools.product( compute_units, backends, [(rank, axis) for rank in range(1, 5) for axis in range(-rank, rank)], [np.float32, np.float16, np.int32], [np.int32], ), ) def test_builder_to_backend_programmatic( self, compute_unit, backend, rank_axis, x_dtype, indices_dtype ): self._test_builder_to_backend_programmatic( compute_unit, backend, rank_axis, x_dtype, indices_dtype, False ) @pytest.mark.parametrize( "backend, indices_val, validate_indices", itertools.product( backends, [[[1, 0, -1], [0, 0, 1]], [[1, 0, 1], [0, 0, 2]]], [True, False], ), ) def test_builder_invalid_indices(self, backend, indices_val, validate_indices): def prog(x): params = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) indices = np.array(indices_val, dtype=np.int32) res = mb.gather_along_axis(x=params, indices=indices, axis=0) return res if any([idx > 1 for sub_indices in indices_val for idx in sub_indices]): with pytest.raises(IndexError, match="index 2 is out of bounds for axis 0 with size 2"): mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)], opset_version=backend.opset_version, )(prog) else: mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)], opset_version=backend.opset_version, )(prog) class TestGatherNd: @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) indices = np.array([[1, 0], [0, 2]], dtype=np.int32) input_placeholders = { "x": mb.placeholder(shape=x.shape), "indices": mb.placeholder(shape=indices.shape, dtype=types.int32), } input_values = {"x": x, "indices": indices} def build(x, indices): return (mb.gather_nd(x=x, indices=indices),) expected_output_types = (2, types.fp32) expected_outputs = 
np.array([4, 3], dtype=np.float32) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, frontend_only=False, backend=backend, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS14/test_tensor_operation.py0000644000000000000000000016016514672066616030413 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import platform import numpy as np import pytest import coremltools as ct from coremltools._deps import _HAS_TF_2, MSG_TF2_NOT_FOUND from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Function, get_new_symbol, types from coremltools.converters.mil.mil.ops.tests.iOS14 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import ( UNK_SYM, UNK_VARIADIC, construct_inputs_from_placeholders, mark_api_breaking, run_compare_builder, ) from coremltools.converters.mil.mil.types.symbolic import is_symbolic from coremltools.converters.mil.mil.types.type_mapping import nptype_from_builtin from coremltools.converters.mil.testing_reqs import compute_units from coremltools.converters.mil.testing_utils import get_op_types_in_program, random_gen, ssa_fn if _HAS_TF_2: import tensorflow as tf class TestBandPart: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x_val = np.array( [ [3.0, 3.0, 5.0, 1.0], [5.0, 6.0, 3.0, 8.0], [7.0, 2.0, 7.0, 2.0], [6.0, 7.0, 7.0, 1.0], ], dtype=np.float32, ) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return [ mb.band_part(x=x), mb.band_part(x=x, lower=0, upper=-1), mb.band_part(x=x, lower=-1, upper=0), mb.band_part(x=x, lower=0, upper=0), ] expected_output_types = [ (4, 4, types.fp32), (4, 4, types.fp32), (4, 4, types.fp32), (4, 4, types.fp32), ] expected_outputs = [ np.array( [ [3.0, 3.0, 5.0, 1.0], [5.0, 6.0, 3.0, 8.0], [7.0, 2.0, 7.0, 2.0], [6.0, 7.0, 7.0, 1.0], ], dtype=np.float32, ), np.array( [ [3.0, 3.0, 5.0, 1.0], [0.0, 6.0, 3.0, 8.0], [0.0, 0.0, 7.0, 2.0], [0.0, 0.0, 0.0, 1.0], ], dtype=np.float32, ), np.array( [ [3.0, 0.0, 0.0, 0.0], [5.0, 6.0, 0.0, 0.0], [7.0, 2.0, 7.0, 0.0], [6.0, 7.0, 7.0, 1.0], ], dtype=np.float32, ), np.array( [ [3.0, 0.0, 0.0, 0.0], [0.0, 6.0, 0.0, 0.0], [0.0, 0.0, 7.0, 0.0], [0.0, 0.0, 0.0, 1.0], ], dtype=np.float32, ), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) def get_output_from_mlmodel( self, x_val: np.ndarray, num_lower: int, num_upper: int, dtype: type, ) -> np.ndarray: @mb.program(input_specs=[mb.TensorSpec(shape=(3, 4), dtype=dtype)]) def prog(x): return mb.band_part(x=x, lower=num_lower, upper=num_upper, name="out") mlmodel = ct.convert( prog, convert_to="mlprogram", compute_precision=ct.precision.FLOAT32, compute_units=ct.ComputeUnit.CPU_ONLY, ) out = mlmodel.predict({"x": x_val})["out"] return out def get_value_inference_output( self, x_val: np.ndarray, num_lower: int, num_upper: int, dtype: type, ) -> np.ndarray: func_inputs = {"x": mb.placeholder(shape=[3, 4], dtype=dtype)} with Function(func_inputs) as ssa_fun: 
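# Passing the constant x_val (rather than the placeholder) to band_part lets value inference fold the op, so v.val is available without running a model.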
x = ssa_fun.inputs["x"] v = mb.band_part(x=x_val, lower=num_lower, upper=num_upper) return v.val @pytest.mark.skipif( ct.utils._macos_version() < (10, 15), reason="needs mlprogram, skip on macos < 10.15" ) @pytest.mark.parametrize( "lower_upper, dtype", itertools.product( [(0, -1), (-1, 0), (0, 0), (1, 1), (1, 2), (2, 1)], [types.int32, types.fp32], ), ) def test_value_inference(self, lower_upper, dtype): num_lower, num_upper = lower_upper np_type = nptype_from_builtin(dtype) test_input = np.random.rand(3, 4).astype(np_type) out_value_inference = self.get_value_inference_output( test_input, num_lower, num_upper, dtype ) out_from_model_prediction = self.get_output_from_mlmodel( test_input, num_lower, num_upper, dtype ) assert out_value_inference.dtype == test_input.dtype np.testing.assert_allclose( out_value_inference, out_from_model_prediction, atol=1e-3, rtol=1e-3 ) class TestCumSum: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return mb.cumsum(x=x, axis=0, reverse=True, exclusive=False) expected_output_types = (2, 3, types.fp32) expected_outputs = np.array([[5, 7, 9], [4, 5, 6]], dtype=np.float32) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = random_gen(shape=(1, 2, 3, 4, 5), rand_min=-100, rand_max=100) v = mb.cumsum(x=x_val) np.testing.assert_allclose(np.cumsum(x_val, axis=0), v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_invalid_arg(self): x_val = random_gen(shape=(1, 2, 3, 4, 5), rand_min=-100, rand_max=100) with pytest.raises(ValueError): mb.cumsum(x=x_val, axis=0, invalid_arg=3) @ssa_fn def test_invalid_axis1(self): x_val = random_gen(shape=(1, 2, 3, 4, 5), rand_min=-100, rand_max=100) with pytest.raises(ValueError): mb.cumsum(x=x_val, axis=-2) @ssa_fn def test_invalid_axis2(self): x_val = random_gen(shape=(1, 2, 3, 4, 5), rand_min=-100, rand_max=100) with pytest.raises(ValueError): mb.cumsum(x=x_val, axis=len(x_val.shape)) @ssa_fn def test_invalid_axis3(self): x_val = random_gen(shape=(1, 2, 3, 4, 5), rand_min=-100, rand_max=100) with pytest.raises(ValueError): mb.cumsum(x=x_val, axis="") @ssa_fn def test_invalid_reverse1(self): x_val = random_gen(shape=(1, 2, 3, 4, 5), rand_min=-100, rand_max=100) with pytest.raises(ValueError): mb.cumsum(x=x_val, reverse="") @ssa_fn def test_invalid_reverse2(self): x_val = random_gen(shape=(1, 2, 3, 4, 5), rand_min=-100, rand_max=100) with pytest.raises(ValueError): mb.cumsum(x=x_val, reverse=0) @ssa_fn def test_invalid_reverse3(self): x_val = random_gen(shape=(1, 2, 3, 4, 5), rand_min=-100, rand_max=100) with pytest.raises(ValueError): mb.cumsum(x=x_val, reverse=1) @ssa_fn def test_invalid_exclusive1(self): x_val = random_gen(shape=(1, 2, 3, 4, 5), rand_min=-100, rand_max=100) with pytest.raises(ValueError): mb.cumsum(x=x_val, exclusive="") @ssa_fn def test_invalid_exclusive2(self): x_val = random_gen(shape=(1, 2, 3, 4, 5), rand_min=-100, rand_max=100) with pytest.raises(ValueError): mb.cumsum(x=x_val, exclusive=0) @ssa_fn def test_invalid_exclusive3(self): x_val = random_gen(shape=(1, 2, 3, 4, 5), rand_min=-100, rand_max=100) with pytest.raises(ValueError): mb.cumsum(x=x_val, exclusive=1) @ssa_fn def test_invalid_input1(self): 
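# A bare Python scalar is not a valid tensor input for cumsum; the builder should reject it with a ValueError.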
x_val = 1 with pytest.raises(ValueError): mb.cumsum(x=x_val) @ssa_fn def test_invalid_input2(self): x_val = ["1"] with pytest.raises(ValueError): mb.cumsum(x=x_val) class TestFill: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): shape = (2, 1, 3) x_val = np.zeros(shape=shape, dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return mb.add(x=x, y=mb.fill(shape=shape, value=1.0)) expected_output_types = [(2, 1, 3, types.fp32)] expected_outputs = [np.full(shape=shape, fill_value=1.0)] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): shape = np.random.randint(low=1, high=3, size=5).astype(np.int32) res = mb.fill(shape=shape, value=1991.0).val np.testing.assert_allclose(np.full(shape, fill_value=1991.0), res, atol=1e-04, rtol=1e-05) @pytest.mark.parametrize( "compute_unit, backend, rank, value", itertools.product( compute_units, backends, [rank for rank in range(1, 6)], [-1917.0, 0.0, 2048.0], ), ) def test_builder_to_backend_stress(self, compute_unit, backend, rank, value): shape = np.random.randint(low=1, high=4, size=rank).astype(np.int32) x_val = np.zeros(shape=shape, dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return mb.add(x=x, y=mb.fill(shape=shape, value=value)) expected_outputs = [np.full(shape=shape, fill_value=value)] expected_output_types = [o.shape[:] + (types.fp32,) for o in expected_outputs] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_symbolic(self, compute_unit, backend): s_len = get_new_symbol() input_placeholders = { "shape": mb.placeholder(shape=(s_len,), dtype=types.int32), } def build(shape): return [mb.fill(shape=shape)] expected_output_types = [(UNK_VARIADIC, types.fp32)] expected_outputs = [np.zeros(shape=(2, 1, 3), dtype=np.float32)] input_values = {"shape": np.array([2, 1, 3], dtype=np.float32)} run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, inputs=construct_inputs_from_placeholders(input_placeholders, 3) if backend.backend == "mlprogram" else None, compute_unit=compute_unit, backend=backend, ) @pytest.mark.skipif(not _HAS_TF_2, reason=MSG_TF2_NOT_FOUND) class TestNonMaximumSuppression: @mark_api_breaking(breaking_opset_version=ct.target.iOS17) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): boxes_val = np.array( [ [ [0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0], [2.0, 2.0, 2.0, 2.0], [3.0, 3.0, 3.0, 3.0], ] ], dtype=np.float32, ) scores_val = np.array([[[-3.5], [9.4], [2.3], [0.7]]], dtype=np.float32) input_placeholders = { "boxes": mb.placeholder(shape=(1, 4, 4)), "scores": mb.placeholder(shape=(1, 4, 1)), } input_values = {"boxes": boxes_val, "scores": scores_val} expected_output_types = [ (1, 2, 4, types.fp32), (1, 2, 1, types.fp32), (1, 2, types.int32), (1, types.int32), ] expected_outputs = [ np.array([[[1.0, 1.0, 1.0, 1.0], [2.0, 2.0, 2.0, 2.0]]], dtype=np.float32), 
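# With score_threshold=0.4 and max_boxes=2, only the boxes scoring 9.4 and 2.3 survive; the remaining outputs are their scores, their indices, and the per-batch box count.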
np.array([[[9.4], [2.3]]], dtype=np.float32), np.array([[1, 2]], dtype=np.int32), np.array([2], dtype=np.int32), ] def build(boxes, scores): return mb.non_maximum_suppression( boxes=boxes, scores=scores, iou_threshold=0.2, score_threshold=0.4, max_boxes=2, per_class_suppression=True, ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @staticmethod def _compute_iou_matrix(boxes): # input is (N, 4), in order [center_w, center_h, width, height] boxes = boxes.astype(np.float32) center_w, center_h, width, height = np.split(boxes, 4, axis=1) top = center_h + 0.5 * height bottom = center_h - 0.5 * height left = center_w - 0.5 * width right = center_w + 0.5 * width area = width * height h_b = np.minimum(top, np.transpose(top)) w_b = np.minimum(right, np.transpose(right)) h_a = np.maximum(bottom, np.transpose(bottom)) w_a = np.maximum(left, np.transpose(left)) intersection_area = np.maximum(0, h_b - h_a) * np.maximum(0, w_b - w_a) union_area = area + np.transpose(area) - intersection_area return intersection_area / union_area @staticmethod def _ref_non_maximum_suppression( boxes, scores, iou_threshold, score_threshold, max_boxes, per_class_suppression ): """ Reference implementation of Core ML's NMS op using TensorFlow. boxes of shape (n_batch, n_box, 4), [center_w, center_h, width, height] scores of shape (n_batch, n_box, n_score) output shapes [ (n_batch, max_boxes, 4), (n_batch, max_boxes, n_score), (n_batch, max_boxes), (n_batch,) ] """ n_batch, n_box, n_score = scores.shape iou_threshold = iou_threshold.astype(np.float32) score_threshold = score_threshold.astype(np.float32) # convert box ids to TF style center_w, center_h, width, height = np.split(boxes, 4, axis=-1) # (n_batch,n_box,1) y1 = center_h - 0.5 * height y2 = center_h + 0.5 * height x1 = center_w - 0.5 * width x2 = center_w + 0.5 * width boxes_tf = np.concatenate((y1, x1, y2, x2), axis=-1) # (n_batch,n_box,4) out1 = np.zeros((n_batch, max_boxes, 4)) out2 = np.zeros((n_batch, max_boxes, n_score)) out3 = -1 * np.ones((n_batch, max_boxes)) out4 = np.zeros((n_batch,)) for b in range(n_batch): box_coord_matrix = boxes_tf[b, :, :] # (n_box,4) score_vector = np.max(scores[b, :, :], axis=-1) # (n_box,) if not per_class_suppression: # this is the simple case as TF directly supports it ids_g = tf.image.non_max_suppression( box_coord_matrix, score_vector, max_output_size=max_boxes, iou_threshold=iou_threshold, score_threshold=score_threshold, ) ids = ids_g.numpy() else: # this is slightly complicated as TF does not directly support it class_ids = np.argmax(scores[b, :, :], axis=-1) # (n_box,) sorted_score_ids = np.argsort(-score_vector) box_coord_matrix2 = np.take(box_coord_matrix, sorted_score_ids, axis=0) score_vector2 = np.take(score_vector, sorted_score_ids) class_ids = np.take(class_ids, sorted_score_ids) classes_seen = dict() ids_intermediate = np.array([], dtype=np.int32) for n in range(n_box): if class_ids[n] in classes_seen: continue c = class_ids[n] classes_seen[c] = True current_class_ids = np.where(class_ids == c)[0] if len(current_class_ids) > 0: feed_in1 = np.take(box_coord_matrix2, current_class_ids, axis=0) feed_in2 = np.take(score_vector2, current_class_ids) cur_ids_g = tf.image.non_max_suppression( feed_in1, feed_in2, max_output_size=max_boxes, iou_threshold=iou_threshold, score_threshold=score_threshold, ) cur_ids = cur_ids_g.numpy() from_sort_ids = np.take(current_class_ids, cur_ids) ids_intermediate = 
np.append(ids_intermediate, from_sort_ids) ids_intermediate.sort() ids = np.take(sorted_score_ids, ids_intermediate) xx = len(ids) if xx == 0: ids = np.array([np.argmax(score_vector)]) xx = 1 if xx > max_boxes: ids = ids[:max_boxes] xx = len(ids) out1[b, :xx, :] = np.take(boxes[b, :, :], ids, axis=0) out2[b, :xx, :] = np.take(scores[b, :, :], ids, axis=0) out3[b, :xx] = ids out4[b] = xx return out1, out2, out3, out4 @mark_api_breaking(breaking_opset_version=ct.target.iOS17) @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "iou_threshold_percentile", "score_threshold_percentile", "n_boxes", "n_batch", "n_score", "per_class_suppression", ] ), itertools.product( compute_units, backends, [0, 30, 80, 100], [0, 40, 100], [(10, 7), (30, 37), (100, 64)], [1], [1, 4, 7], [True, False], ), ) def test_builder_to_backend_stress( self, compute_unit, backend, iou_threshold_percentile, score_threshold_percentile, n_boxes, n_batch, n_score, per_class_suppression, ): if backend.backend == "mlprogram" and iou_threshold_percentile == 0: pytest.xfail("rdar://78080118") if ( backend.backend == "neuralnetwork" and n_boxes == (10, 7) and platform.machine() == "x86_64" ): pytest.xfail("rdar://78080118 (Investigate failing tests for NMS in coremltools)") if backend.backend == "mlprogram" and backend.precision == "fp16": pytest.xfail("CPU: rdar://80662705 and GPU: rdar://80661262") n_boxes_in, n_boxes_out = n_boxes boxes_val = random_gen((n_batch, n_boxes_in, 4), 0, 100) scores_val = random_gen((n_batch, n_boxes_in, n_score), -100, 100, allow_duplicate=False) iou_matrix = self._compute_iou_matrix(boxes_val[0, :, :]) iou_matrix = iou_matrix[~np.eye(iou_matrix.shape[0], dtype=bool)].reshape( iou_matrix.shape[0], -1 ) if score_threshold_percentile == 0: score_threshold = np.min(scores_val) - 1 elif score_threshold_percentile == 100: score_threshold = np.max(scores_val) + 1 else: score_threshold = np.percentile(scores_val, score_threshold_percentile) + 0.01 if iou_threshold_percentile == 0: iou_threshold = np.maximum(np.min(iou_matrix) - 0.01, 0.0) else: iou_threshold = np.percentile(iou_matrix, iou_threshold_percentile) + 0.01 iou_threshold = np.maximum(iou_threshold, 1e-8) (tf_boxes, tf_scores, tf_indices, tf_num_boxes,) = self._ref_non_maximum_suppression( boxes_val, scores_val, iou_threshold, score_threshold, n_boxes_out, per_class_suppression, ) expected_outputs = [tf_boxes, tf_scores, tf_indices, tf_num_boxes] expected_output_types = [ tf_boxes.shape[:] + (types.fp32,), tf_scores.shape[:] + (types.fp32,), tf_indices.shape[:] + (types.int32,), tf_num_boxes.shape[:] + (types.int32,), ] input_placeholders = { "boxes": mb.placeholder(shape=(n_batch, n_boxes_in, 4)), "scores": mb.placeholder(shape=(n_batch, n_boxes_in, n_score)), } input_values = {"boxes": boxes_val, "scores": scores_val} def build(boxes, scores): return mb.non_maximum_suppression( boxes=boxes, scores=scores, iou_threshold=iou_threshold, score_threshold=score_threshold, max_boxes=n_boxes_out, per_class_suppression=per_class_suppression, ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestNonZero: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): x_val = np.array([[3, 0, 0], [0, 4, 0], [5, 6, 0]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return 
[mb.non_zero(x=x)] expected_output_types = [(UNK_SYM, 2, types.int32)] expected_outputs = [np.array(np.transpose(np.nonzero(x_val)))] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.random.randint(low=-1, high=2, size=(6, 1, 7)) res = mb.non_zero(x=x_val) np.testing.assert_allclose(np.transpose(np.nonzero(x_val)), res.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_shape_inference_for_deterministic_input(self): # If the input is compile time known, the builder should be able to infer the shape from value x_val = np.array([[0, 2], [1, 1]]) res = mb.non_zero(x=x_val) assert res.shape == (3, 2) class TestOneHot: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x = np.array([1, 0], dtype=np.int32) depth = 4 input_placeholders = { "x": mb.placeholder(shape=x.shape, dtype=types.int32), "y": mb.placeholder(shape=(1,), dtype=types.int32), } input_values = {"x": x, "y": depth} def build(x, y): return [ mb.one_hot(indices=x, one_hot_vector_size=4), mb.one_hot(indices=x, one_hot_vector_size=4, axis=0), mb.one_hot(indices=x, one_hot_vector_size=4, on_value=1.0, off_value=0.1), mb.one_hot(indices=x, one_hot_vector_size=mb.squeeze(x=y), on_value=1, off_value=9), ] expected_output_types = [ (2, 4, types.int32), (4, 2, types.int32), (2, 4, types.fp32), (2, UNK_SYM, types.int32), ] expected_outputs = [ np.array([[0, 1, 0, 0], [1, 0, 0, 0]], dtype=np.float32), np.array([[0, 1], [1, 0], [0, 0], [0, 0]], dtype=np.float32), np.array([[0.1, 1, 0.1, 0.1], [1, 0.1, 0.1, 0.1]], dtype=np.float32), np.array([[9, 1, 9, 9], [1, 9, 9, 9]], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestPad: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): def test_constant_mode(): t = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) pad = np.array([1, 1, 2, 2], dtype=np.int32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return mb.pad(x=x, pad=pad, mode="constant", constant_val=0.0) expected_output_types = (4, 7, types.fp32) expected_outputs = np.array( [ [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 2.0, 3.0, 0.0, 0.0], [0.0, 0.0, 4.0, 5.0, 6.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], ], dtype=np.float32, ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) def test_constant_mode_constant_val(): t = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) pad = np.array([1, 1, 2, 2], dtype=np.int32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return mb.pad(x=x, pad=pad, mode="constant", constant_val=0.5) expected_output_types = (4, 7, types.fp32) expected_outputs = np.array( [ [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 1.0, 2.0, 3.0, 0.5, 0.5], [0.5, 0.5, 4.0, 5.0, 6.0, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5], ], dtype=np.float32, ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) def test_reflect_mode(): t = 
np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) pad = np.array([1, 1, 2, 2], dtype=np.int32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return mb.pad(x=x, pad=pad, mode="reflect") expected_output_types = (4, 7, types.fp32) expected_outputs = np.array( [ [6.0, 5.0, 4.0, 5.0, 6.0, 5.0, 4.0], [3.0, 2.0, 1.0, 2.0, 3.0, 2.0, 1.0], [6.0, 5.0, 4.0, 5.0, 6.0, 5.0, 4.0], [3.0, 2.0, 1.0, 2.0, 3.0, 2.0, 1.0], ], dtype=np.float32, ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) def test_replicate_mode(): t = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) pad = np.array([1, 1, 2, 2], dtype=np.int32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return mb.pad(x=x, pad=pad, mode="replicate") expected_output_types = (4, 7, types.fp32) expected_outputs = np.array( [ [1.0, 1.0, 1.0, 2.0, 3.0, 3.0, 3.0], [1.0, 1.0, 1.0, 2.0, 3.0, 3.0, 3.0], [4.0, 4.0, 4.0, 5.0, 6.0, 6.0, 6.0], [4.0, 4.0, 4.0, 5.0, 6.0, 6.0, 6.0], ], dtype=np.float32, ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) def test_constant_general(): t = np.arange(12, dtype=np.float32).reshape([2, 2, 3]) pad = np.array([[1, 1], [2, 2], [1, 1]], dtype=np.int32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return mb.pad(x=x, pad=pad.reshape(-1), mode="constant", constant_val=0.0) expected_output_types = (4, 6, 5, types.fp32) expected_outputs = np.pad(t, pad, mode="constant") run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) # Test different modes test_constant_mode() test_constant_mode_constant_val() test_reflect_mode() test_replicate_mode() test_constant_general() @ssa_fn def test_builder_eval(self): def test_constant_mode(): x_val = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) v = mb.pad( x=x_val, pad=np.array([1, 1, 2, 2], dtype=np.int32), mode="constant", constant_val=0.0, ) expected_outputs = np.array( [ [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 2.0, 3.0, 0.0, 0.0], [0.0, 0.0, 4.0, 5.0, 6.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], ], dtype=np.float32, ) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) def test_reflect_mode(): x_val = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) v = mb.pad(x=x_val, pad=np.array([1, 1, 2, 2], dtype=np.int32), mode="reflect") expected_outputs = np.array( [ [6.0, 5.0, 4.0, 5.0, 6.0, 5.0, 4.0], [3.0, 2.0, 1.0, 2.0, 3.0, 2.0, 1.0], [6.0, 5.0, 4.0, 5.0, 6.0, 5.0, 4.0], [3.0, 2.0, 1.0, 2.0, 3.0, 2.0, 1.0], ], dtype=np.float32, ) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) def test_replicate_mode(): x_val = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) v = mb.pad(x=x_val, pad=np.array([1, 1, 2, 2], dtype=np.int32), mode="replicate") expected_outputs = np.array( [ [1.0, 1.0, 1.0, 2.0, 3.0, 3.0, 3.0], [1.0, 1.0, 1.0, 2.0, 3.0, 3.0, 3.0], [4.0, 4.0, 4.0, 5.0, 6.0, 6.0, 6.0], [4.0, 4.0, 4.0, 5.0, 6.0, 6.0, 6.0], ], dtype=np.float32, ) np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) def test_constant_general(): x_val = np.arange(12, dtype=np.float32).reshape([2, 2, 3]) pad = np.array([[1, 1], [2, 2], [1, 1]], dtype=np.int32) v = mb.pad(x=x_val, 
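# mb.pad takes the per-axis [before, after] pairs flattened to a rank-1 vector, matching the 2-D pad argument np.pad uses below.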
pad=pad.reshape(-1), mode="constant", constant_val=0.0) expected_outputs = np.pad(x_val, pad, mode="constant") np.testing.assert_allclose(expected_outputs, v.val, atol=1e-04, rtol=1e-05) # Test different modes test_constant_mode() test_reflect_mode() test_replicate_mode() test_constant_general() @staticmethod def test_value_inference_with_symbolic_padding(): @mb.program( input_specs=[ mb.TensorSpec(shape=(get_new_symbol(), get_new_symbol(), 1, 3), dtype=types.fp32) ] ) def prog(x): paddings = mb.shape(x=x) res = mb.pad(x=np.random.rand(1, 1), pad=paddings) shape = res.shape assert is_symbolic(shape[0]) assert shape[1] == 5 return res @staticmethod def test_error_out_with_dynamic_paddings_with_invaid_shape(): with pytest.raises( ValueError, match="Non-constant 'pad' must have shape \(8,\). Got \(4,\)" ): @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 1, 3, 4)), mb.TensorSpec(shape=(2, 2), dtype=types.int32), ] ) def prog(x, y): pad = mb.reshape(x=y, shape=[-1]) res = mb.pad(x=x, pad=pad) @staticmethod def test_error_out_with_invalid_padding_value(): with pytest.raises( ValueError, match=r"pad must be non-negative integer, got -1022 at index 6", ): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 48, 1, 1024))]) def prog(x): y = mb.pad(x=x, pad=[0, 0, 0, 0, 0, 0, -1022, 0], mode="constant") return y class TestRange1d: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x = 15.0 y = 5.0 z = 2.0 # Model inputs must have rank at least 1 input_placeholders = { "x": mb.placeholder(shape=(1,)), "y": mb.placeholder(shape=(1,)), "z": mb.placeholder(shape=(1,)), } input_values = {"x": x, "y": y, "z": z} def build(x, y, z): return [ mb.range_1d(start=mb.squeeze(x=y), end=15.0, step=2.0), mb.range_1d(start=mb.squeeze(x=y), end=15.0, step=mb.squeeze(x=z)), mb.range_1d(start=mb.squeeze(x=y), end=mb.squeeze(x=x), step=2.0), mb.range_1d(start=mb.squeeze(x=y), end=mb.squeeze(x=x), step=mb.squeeze(x=z)), mb.range_1d(start=5.0, end=15.0, step=mb.squeeze(x=z)), mb.range_1d(start=5.0, end=mb.squeeze(x=x), step=2.0), mb.range_1d(start=5.0, end=mb.squeeze(x=x), step=mb.squeeze(x=z)), ] expected_output_types = [ (UNK_SYM, types.fp32), (UNK_SYM, types.fp32), (UNK_SYM, types.fp32), (UNK_SYM, types.fp32), (UNK_SYM, types.fp32), (UNK_SYM, types.fp32), (UNK_SYM, types.fp32), ] expected_outputs = [ np.array([5, 7, 9, 11, 13], dtype=np.float32), np.array([5, 7, 9, 11, 13], dtype=np.float32), np.array([5, 7, 9, 11, 13], dtype=np.float32), np.array([5, 7, 9, 11, 13], dtype=np.float32), np.array([5, 7, 9, 11, 13], dtype=np.float32), np.array([5, 7, 9, 11, 13], dtype=np.float32), np.array([5, 7, 9, 11, 13], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_large_array(self, compute_unit, backend): input_placeholders = { "x": mb.placeholder(shape=(1,)), # dummpy input } input_values = {"x": 0.5} def build(x): return [mb.range_1d(start=0.0, end=2000000.0, step=1.0)] expected_output_types = [(2000000, types.fp32)] expected_outputs = [ np.arange(0.0, 2000000.0, 1.0), ] mlmodel = run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) # verify that the range_1d op is not const folded prog = 
mlmodel._mil_program ops = get_op_types_in_program(prog) assert ops == ["range_1d", "identity"] @ssa_fn def test_builder_eval(self): v = mb.range_1d(start=5, end=15, step=2) np.testing.assert_allclose(np.arange(5, 15, 2), v.val, atol=1e-04, rtol=1e-05) class TestTile: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=x.shape)} input_values = {"x": x} def build(x): return [ mb.tile(x=x, reps=(1, 1)), mb.tile(x=x, reps=(2, 1)), ] expected_output_types = [ (2, 3, types.fp32), (4, 3, types.fp32), ] expected_outputs = [ x, np.array([[1, 2, 3], [4, 5, 6], [1, 2, 3], [4, 5, 6]], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) v = mb.tile(x=x, reps=(1, 2)) np.testing.assert_allclose(np.tile(x, reps=(1, 2)), v.val, atol=1e-04, rtol=1e-05) class TestDynamicTile: @staticmethod def test_dynamic_shape_tile_type_inference(): reps = [1, 2] input_shape = [get_new_symbol(), get_new_symbol()] @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)]) def prog(x): x = mb.tile(x=x, reps=[1, 2]) assert x.shape[0] == input_shape[0] assert is_symbolic(x.shape[1]) assert x.shape[1] != input_shape[1] return x @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) rep1 = np.array([1, 1]).astype(np.int32) rep2 = np.array([2, 1]).astype(np.int32) rep3 = np.array([2, 3]).astype(np.int32) input_placeholders = { "x": mb.placeholder(shape=x.shape), "reps1": mb.placeholder(shape=rep1.shape, dtype=types.int32), "reps2": mb.placeholder(shape=rep2.shape, dtype=types.int32), "reps3": mb.placeholder(shape=rep3.shape, dtype=types.int32), } input_values = {"x": x, "reps1": rep1, "reps2": rep2, "reps3": rep3} def build(x, reps1, reps2, reps3): return [ mb.tile(x=x, reps=reps1), mb.tile(x=x, reps=reps2), mb.tile(x=x, reps=reps3), ] expected_output_types = [ (UNK_SYM, UNK_SYM, types.fp32), (UNK_SYM, UNK_SYM, types.fp32), (UNK_SYM, UNK_SYM, types.fp32), ] expected_outputs = [ x, np.array([[1, 2, 3], [4, 5, 6], [1, 2, 3], [4, 5, 6]], dtype=np.float32), np.array( [ [1, 2, 3, 1, 2, 3, 1, 2, 3], [4, 5, 6, 4, 5, 6, 4, 5, 6], [1, 2, 3, 1, 2, 3, 1, 2, 3], [4, 5, 6, 4, 5, 6, 4, 5, 6], ], dtype=np.float32, ), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestTopK: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): val = np.array([[-1.0, 2.0, -3.0], [4.0, -5.0, 6.0]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=val.shape)} input_values = {"x": val} def build(x): return mb.topk(x=x, k=2, axis=1) expected_output_types = [ (2, 2, types.fp32), (2, 2, types.int32), ] expected_outputs = [ np.array([[2.0, -1.0], [6.0, 4.0]], dtype=np.float32), np.array([[1, 0], [2, 0]], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, 
backend=backend, ) @ssa_fn def test_builder_eval(self): def np_topk(x, k, axis, ascending=False): indices = np.argsort(x, axis=axis) if not ascending: indices = np.argsort(-x, axis=axis) slc = [slice(None)] * len(x.shape) slc[axis] = slice(0, k) indices = indices[tuple(slc)] values = np.take_along_axis(x, indices, axis=axis) return values, indices val = np.array([[-1.0, 7.0, -3.0], [4.0, -5.0, 8.0]], dtype=np.float32) res_values, res_indices = mb.topk(x=val, k=1, axis=0) ref_values, ref_indices = np_topk(x=val, k=1, axis=0) np.testing.assert_allclose(ref_values, res_values.val, atol=1e-04, rtol=1e-05) np.testing.assert_allclose(ref_indices, res_indices.val, atol=1e-04, rtol=1e-05) res_values, res_indices = mb.topk(x=val, k=2, axis=-1, ascending=True) ref_values, ref_indices = np_topk(x=val, k=2, axis=-1, ascending=True) np.testing.assert_allclose(ref_values, res_values.val, atol=1e-04, rtol=1e-05) np.testing.assert_allclose(ref_indices, res_indices.val, atol=1e-04, rtol=1e-05) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_symbolic(self, compute_unit, backend): s0 = get_new_symbol() val = np.array([[1.0, 2.0, -3.0], [4.0, -5.0, 6.0]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=(s0, 3))} input_values = {"x": val} def build(x): return mb.topk(x=x, k=2, axis=-1, ascending=True) expected_output_types = [ (s0, 2, types.fp32), (s0, 2, types.int32), ] expected_outputs = [ np.array([[-3.0, 1.0], [-5.0, 4.0]], dtype=np.float32), np.array([[2, 0], [1, 0]], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestFlatten2d: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array([[[1, 2, 3], [4, 5, 6]], [[-1, -2, -3], [-4, -5, -6]]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return [mb.flatten2d(x=x)] expected_output_types = [ (2, 6, types.fp32), ] expected_outputs = [ np.array([[1, 2, 3, 4, 5, 6], [-1, -2, -3, -4, -5, -6]], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, rank, axis, backend", itertools.product( compute_units, range(1, 6), range(-5, 6), backends, ), ) def test_builder_to_backend_stress(self, compute_unit, rank, axis, backend): if axis < -rank or axis >= rank + 1: return shape = np.random.randint(low=2, high=6, size=rank) t = np.random.random(shape) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return [mb.flatten2d(x=x, axis=axis)] np_axis = axis + rank if axis < 0 else axis pl, pr = 1, 1 for i in range(0, np_axis): pl *= shape[i] for i in range(np_axis, len(shape)): pr *= shape[i] new_shape = [pl, pr] ref = t.reshape(new_shape) expected_outputs = [ref] expected_output_types = [ tuple(list(ref.shape) + [types.fp32]), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): t = np.array([[[1, 2, 3], [4, 5, 6]]], dtype=np.float32) f = mb.flatten2d(x=t) expected_f = np.array([[1, 2, 3, 4, 5, 6]], dtype=np.float32) 
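# flatten2d with the default axis=1 keeps the leading dim and collapses the rest: (1, 2, 3) -> (1, 6).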
np.testing.assert_allclose(expected_f, f.val, atol=1e-04, rtol=1e-05) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_symbolic(self, compute_unit, backend): s0 = get_new_symbol() input_placeholders = { "x": mb.placeholder(shape=(s0, 4, 5, 6)), } def build(x): return [mb.flatten2d(x=x)] input = np.random.rand(10, 4, 5, 6) output = input.reshape(10, -1) expected_output_types = (s0, 120, types.fp32) expected_outputs = [output] input_values = {"x": input} run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, inputs=construct_inputs_from_placeholders(input_placeholders, 10) if backend.backend == "mlprogram" else None, compute_unit=compute_unit, backend=backend, ) class TestShape: @pytest.mark.parametrize( "compute_unit, backend, input_type", itertools.product(compute_units, backends, ["int32", "float32"]), ) def test_builder_to_backend_smoke(self, compute_unit, backend, input_type): np_type = np.int32 if input_type == "int32" else np.float32 mb_type = types.int32 if input_type == "int32" else types.fp32 t = np.array([[1, 2, 3], [4, 5, 6]], dtype=np_type) input_placeholders = {"x": mb.placeholder(shape=t.shape, dtype=mb_type)} input_values = {"x": t} def build(x): return mb.shape(x=x) expected_output_types = (2, types.int32) expected_outputs = [ np.array([2, 3], dtype=np.int32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): t = np.array([[[1, 2, 3], [4, 5, 6]]], dtype=np.float32) f = mb.shape(x=t) expected_f = np.array([1, 2, 3], dtype=np.float32) np.testing.assert_allclose(expected_f, f.val, atol=1e-04, rtol=1e-05) @pytest.mark.parametrize( "compute_unit, backend, input_type", itertools.product(compute_units, backends, ["int32", "float32"]), ) def test_builder_to_backend_symbolic(self, compute_unit, backend, input_type): np_type = np.int32 if input_type == "int32" else np.float32 mb_type = types.int32 if input_type == "int32" else types.fp32 s0 = get_new_symbol() input_placeholders = { "x": mb.placeholder(shape=(s0, 4, 5, 6), dtype=mb_type), } def build(x): return [mb.shape(x=x)] input = np.random.rand(10, 4, 5, 6) input = input.astype(np_type) output = np.array([10, 4, 5, 6], dtype=np.int32) expected_output_types = (4, types.int32) expected_outputs = [output] input_values = {"x": input} run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, inputs=construct_inputs_from_placeholders(input_placeholders, 10) if backend.backend == "mlprogram" else None, compute_unit=compute_unit, backend=backend, ) class TestIdentity: @pytest.mark.parametrize( "compute_unit, backend, input_type", itertools.product(compute_units, backends, ["int32", "float32"]), ) def test_builder_to_backend_smoke(self, compute_unit, backend, input_type): np_type = np.int32 if input_type == "int32" else np.float32 mb_type = types.int32 if input_type == "int32" else types.fp32 t = np.array([[1, 2, 3], [4, 5, 6]], dtype=np_type) input_placeholders = {"x": mb.placeholder(shape=t.shape, dtype=mb_type)} input_values = {"x": t} def build(x): return mb.identity(x=x) expected_output_types = [(2, 3, mb_type)] expected_outputs = [ np.array([[1, 2, 3], [4, 5, 6]], dtype=np_type), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) 
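# identity is a pass-through: both shape and dtype are preserved, so the int32 parametrization must come back as int32.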
@ssa_fn def test_builder_eval(self): t = np.array([[[1, 2, 3], [4, 5, 6]]], dtype=np.float32) f = mb.identity(x=t) expected_f = np.array([[[1, 2, 3], [4, 5, 6]]], dtype=np.float32) np.testing.assert_allclose(expected_f, f.val, atol=1e-04, rtol=1e-05) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_symbolic(self, compute_unit, backend): input_placeholders = { "x": mb.placeholder(shape=(10, 4, 5, 6)), } def build(x): return [mb.identity(x=x)] input = np.random.rand(10, 4, 5, 6) output = input expected_output_types = [(10, 4, 5, 6, types.fp32)] expected_outputs = [output] input_values = {"x": input} run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @staticmethod def test_identity_type_inference_for_const_input(): @mb.program(input_specs=[mb.TensorSpec(shape=(10,))]) def prog(x): x = mb.identity(x=np.float32(1.0)) assert x.dtype == types.fp32 return x class TestArgSort: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): val = np.array([[-1.0, 2.0, -3.0], [4.0, -5.0, 6.0]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=val.shape)} input_values = {"x": val} def build(x): return [mb.argsort(x=x), mb.argsort(x=x, axis=0, ascending=True)] expected_output_types = [ (2, 3, types.int32), (2, 3, types.int32), ] expected_outputs = [ np.array([[1, 0, 2], [2, 0, 1]], dtype=np.int32), np.array([[0, 1, 0], [1, 0, 1]], dtype=np.int32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = random_gen(shape=(1, 3, 2, 2), rand_min=-100, rand_max=100) res = mb.argsort(x=x_val, axis=-3) # The default np argsort mode is ascending, which is opposite to MIL's argsort op. 
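# Negating x before np.argsort therefore reproduces mb.argsort's default descending order, e.g. np.argsort(-np.array([3., 1., 2.])) -> [0, 2, 1].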
np.testing.assert_allclose(np.argsort(-x_val, axis=-3), res.val, atol=1e-04, rtol=1e-05) class TestConcat: @pytest.mark.parametrize( "compute_unit, backend, axis", itertools.product( compute_units, backends, [0, 1], ), ) def test_builder_to_backend_numerical(self, compute_unit, backend, axis): def build(x1, x2): return mb.concat(values=[x1, x2], axis=axis) val1 = np.array([[-1.0, 2.0, -3.0], [4.0, -5.0, 6.0]], dtype=np.float32) val2 = -val1 input_placeholders = { "x1": mb.placeholder(shape=val1.shape), "x2": mb.placeholder(shape=val2.shape), } input_values = {"x1": val1, "x2": val2} expected_res = np.concatenate([val1, val2], axis=axis) run_compare_builder( build, input_placeholders, input_values, expected_output_types=[expected_res.shape + (types.fp32,)], expected_outputs=expected_res, compute_unit=compute_unit, backend=backend, ) def test_builder_eval_different_dtypes_error_out(self): """If the input to the concat op has different dtypes, it will error out.""" with pytest.raises( ValueError, match="Tensors in 'values' of the concat op \(concat_0\) should share the same data type", ): @mb.program( input_specs=[ mb.TensorSpec(shape=(2, 3), dtype=types.fp32), mb.TensorSpec(shape=(2, 3), dtype=types.int32), ] ) def prog(x1, x2): return mb.concat(values=[x1, x2], axis=0) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS14/test_tensor_transformation.py0000644000000000000000000017005414672066616031457 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest import coremltools as ct from coremltools._deps import _HAS_TORCH, MSG_TORCH_NOT_FOUND from coremltools.converters.mil import testing_reqs from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import get_new_symbol, types from coremltools.converters.mil.mil.ops.tests.iOS14 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import ( UNK_SYM, UNK_VARIADIC, construct_inputs_from_placeholders, run_compare_builder, ) from coremltools.converters.mil.testing_reqs import compute_units from coremltools.converters.mil.testing_utils import ssa_fn if _HAS_TORCH: import torch class TestDepthToSpace: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): # original input type is (1, 4, 1, 1, fp32) val = np.array([[[[9.0]], [[5.0]], [[1.0]], [[3.0]]]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=val.shape)} input_values = {"x": val} def build(x): return [mb.depth_to_space(x=x, block_size=2)] expected_output_types = (1, 1, 2, 2, types.fp32) expected_outputs = np.array([[[[9.0, 5.0], [1.0, 3.0]]]], dtype=np.float32) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestSpaceToBatch: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): # original input type is (2, 1, 2, 4, fp32) val = np.array( [[[[1, 2, 3, 4], [5, 6, 7, 8]]], [[[9, 10, 11, 12], [13, 14, 15, 16]]]], dtype=np.float32, ) input_placeholders = {"x": 
mb.placeholder(shape=val.shape)} input_values = {"x": val} def build(x): return [mb.space_to_batch(x=x, block_shape=[2, 2], paddings=[[0, 0], [2, 0]])] expected_output_types = (8, 1, 1, 3, types.fp32) expected_outputs = np.array( [ [[[0, 1, 3]]], [[[0, 9, 11]]], [[[0, 2, 4]]], [[[0, 10, 12]]], [[[0, 5, 7]]], [[[0, 13, 15]]], [[[0, 6, 8]]], [[[0, 14, 16]]], ], dtype=np.float32, ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestBatchToSpace: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): # original input type is (8, 1, 1, 3, fp32) val = np.array( [ [[[0, 1, 3]]], [[[0, 9, 11]]], [[[0, 2, 4]]], [[[0, 10, 12]]], [[[0, 5, 7]]], [[[0, 13, 15]]], [[[0, 6, 8]]], [[[0, 14, 16]]], ], dtype=np.float32, ) input_placeholders = {"x": mb.placeholder(shape=val.shape)} input_values = {"x": val} def build(x): return [mb.batch_to_space(x=x, block_shape=[2, 2], crops=[[0, 0], [2, 0]])] expected_output_types = (2, 1, 2, 4, types.fp32) expected_outputs = np.array( [[[[1, 2, 3, 4], [5, 6, 7, 8]]], [[[9, 10, 11, 12], [13, 14, 15, 16]]]], dtype=np.float32, ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestExpandDims: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return [ mb.expand_dims(x=x, axes=[0]), mb.expand_dims(x=x, axes=[1]), mb.expand_dims(x=x, axes=[2]), mb.expand_dims(x=x, axes=[-1]), mb.expand_dims(x=x, axes=[0, 1]), mb.expand_dims(x=x, axes=[-2, -1]), ] expected_output_types = [ (1, 2, 3, types.fp32), (2, 1, 3, types.fp32), (2, 3, 1, types.fp32), (2, 3, 1, types.fp32), (1, 1, 2, 3, types.fp32), (2, 3, 1, 1, types.fp32), ] expected_outputs = [ np.array([[[1, 2, 3], [4, 5, 6]]], dtype=np.float32), np.array([[[1, 2, 3]], [[4, 5, 6]]], dtype=np.float32), np.array([[[1], [2], [3]], [[4], [5], [6]]], dtype=np.float32), np.array([[[1], [2], [3]], [[4], [5], [6]]], dtype=np.float32), np.array([[[[1, 2, 3], [4, 5, 6]]]], dtype=np.float32), np.array([[[[1]], [[2]], [[3]]], [[[4]], [[5]], [[6]]]], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_symbolic(self, compute_unit, backend): s0 = get_new_symbol() input_placeholders = { "x": mb.placeholder(shape=(2, s0)), } def build(x): return [ mb.expand_dims(x=x, axes=[-1]), mb.expand_dims(x=x, axes=[1]), ] expected_output_types = [ (2, s0, 1, types.fp32), (2, 1, s0, types.fp32), ] expected_outputs = [ np.array([[[1], [2], [3]], [[4], [5], [6]]], dtype=np.float32), np.array([[[1, 2, 3]], [[4, 5, 6]]], dtype=np.float32), ] input_values = { "x": np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32), } run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, inputs=construct_inputs_from_placeholders(input_placeholders, 10) if backend.backend == "mlprogram" else None, 
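# For the mlprogram backend the symbolic dim presumably needs a concrete upper bound, which construct_inputs_from_placeholders supplies (here 10).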
compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x_val = np.random.rand(1, 6) v1 = mb.expand_dims(x=x_val, axes=[2]) np.testing.assert_allclose(np.expand_dims(x_val, 2), v1.val, atol=1e-04, rtol=1e-05) v2 = mb.expand_dims(x=x_val, axes=[-1]) np.testing.assert_allclose(np.expand_dims(x_val, -1), v2.val, atol=1e-04, rtol=1e-05) v3 = mb.expand_dims(x=x_val, axes=[-1, -2]) ref = np.expand_dims(np.expand_dims(x_val, -1), -1) np.testing.assert_allclose(ref, v3.val, atol=1e-04, rtol=1e-05) v4 = mb.expand_dims(x=x_val, axes=[0, -1, -2]) np.testing.assert_allclose( np.reshape(x_val, (1, 1, 6, 1, 1)), v4.val, atol=1e-04, rtol=1e-05 ) @pytest.mark.parametrize( "compute_unit, backend, rank_and_axis", itertools.product( compute_units, backends, [(rank, axis) for rank in range(1, 5) for axis in range(-rank - 1, rank + 1)], ), ) def test_builder_to_backend_programmatic_one_axis(self, compute_unit, backend, rank_and_axis): rank, axis = rank_and_axis x_shape = np.random.randint(low=2, high=6, size=rank) input_placeholders = {"x": mb.placeholder(shape=x_shape)} input_values = {"x": np.random.sample(x_shape).astype(np.float32)} def build(x): return mb.expand_dims(x=x, axes=[axis]) adjusted_axis = axis if axis >= 0 else rank + axis + 1 x_shape = list(x_shape) out_shape = x_shape[:adjusted_axis] + [1] + x_shape[adjusted_axis:] expected_output_types = tuple(out_shape[:]) + (types.fp32,) run_compare_builder( build, input_placeholders, input_values, expected_output_types, np.expand_dims(input_values["x"], axis), compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank_and_axes", itertools.product( compute_units, backends, [ (3, [0, 1]), (3, [1, 0]), (3, [-2, -1]), (3, [-1, -2]), (2, [-3, -1]), (2, [-3, 1, -1]), (2, [-2, 0]), (1, [-1, -2, -3, -4]), (1, [0, -1]), (1, [0, 1, -2, -1]), ], ), ) def test_builder_to_backend_programmatic_multiple_axes( self, compute_unit, backend, rank_and_axes ): rank, axes = rank_and_axes x_shape = np.random.randint(low=1, high=6, size=rank) input_placeholders = {"x": mb.placeholder(shape=x_shape)} input_values = {"x": np.random.sample(x_shape).astype(np.float32)} def build(x): return mb.expand_dims(x=x, axes=axes) out_shape = list(x_shape) out_rank = rank + len(axes) pos_axes = sorted([out_rank + axis if axis < 0 else axis for axis in axes]) for axis in pos_axes: out_shape.insert(axis, 1) expected_outputs = np.reshape(input_values["x"], out_shape) expected_output_types = tuple(out_shape) + (types.fp32,) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @staticmethod def test_expand_dims_value_inference_is_inplace(): @mb.program() def prog(): const = mb.const(val=[[2, 3], [4, 5]]) x = mb.expand_dims(x=const, axes=(1, 2)) x.val[0, 0, 0, 0] = 112 assert const.val[0, 0] == 112 return x class TestReshape: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return [ mb.reshape(x=x, shape=[3, 2]), mb.reshape(x=x, shape=[2, -1]), mb.reshape(x=x, shape=[2, 1, 1, 3]), ] expected_output_types = [ (3, 2, types.fp32), (2, 3, types.fp32), (2, 1, 1, 3, types.fp32), ] expected_outputs = [ np.array([[1, 2], [3, 4], [5, 6]], dtype=np.float32), np.array([[1, 2, 3], [4, 5, 
6]], dtype=np.float32), np.array([[[[1.0, 2.0, 3.0]]], [[[4.0, 5.0, 6.0]]]], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): t = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) r = mb.reshape(x=t, shape=[3, 2]) expected_r = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.float32) np.testing.assert_allclose(expected_r, r.val, atol=1e-04, rtol=1e-05) r2 = mb.reshape(x=t, shape=[2, -1]) expected_r2 = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) np.testing.assert_allclose(expected_r2, r2.val, atol=1e-04, rtol=1e-05) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_symbolic(self, compute_unit, backend): if backend.backend == "mlprogram": pytest.xfail( "rdar://131637870 Why It Randomly Segfaults on CI but Cannot Reproduce Locally " ) s0 = get_new_symbol() s_len = get_new_symbol() input_placeholders = { "x": mb.placeholder(shape=(2, s0)), "shape": mb.placeholder(shape=(3,), dtype=types.int32), "shape2": mb.placeholder(shape=(s_len,), dtype=types.int32), } def build(x, shape, shape2): return [ mb.reshape(x=x, shape=[2, -1]), mb.reshape(x=x, shape=[1, -1]), mb.reshape(x=x, shape=[2, 1, 1, -1]), mb.reshape(x=x, shape=shape), mb.reshape(x=x, shape=shape2), ] expected_output_types = [ (2, s0, types.fp32), (1, 2 * s0, types.fp32), (2, 1, 1, s0, types.fp32), (UNK_SYM, UNK_SYM, UNK_SYM, types.fp32), (UNK_VARIADIC, types.fp32), ] expected_outputs = [ np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32), np.array([[1, 2, 3, 4, 5, 6]], dtype=np.float32), np.array([[[[1.0, 2.0, 3.0]]], [[[4.0, 5.0, 6.0]]]], dtype=np.float32), np.array([[[1, 2, 3]], [[4, 5, 6]]], dtype=np.float32), np.array([[[1, 2, 3]], [[4, 5, 6]]], dtype=np.float32), ] input_values = { "x": np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32), "shape": np.array([2, 1, 3], dtype=np.float32), "shape2": np.array([2, 1, 3], dtype=np.float32), } run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, inputs=construct_inputs_from_placeholders(input_placeholders, 10) if backend.backend == "mlprogram" else None, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_too_many_neg_ones(self): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) with pytest.raises(ValueError, match="Reshape op supports only one dimension to be -1"): mb.reshape(x=x, shape=[-1, -1]) @ssa_fn def test_invalid_target_shape(self): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) with pytest.raises(ValueError, match="Invalid target shape in `reshape` op"): mb.reshape(x=x, shape=[4, -1]) @ssa_fn def test_invalid_target_shape_with_zero(self): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) with pytest.raises(ValueError, match="Invalid target shape in `reshape` op"): mb.reshape(x=x, shape=[0, 7]) @staticmethod def test_value_inference_with_symbolic_values(): @mb.program( input_specs=[ mb.TensorSpec(shape=(get_new_symbol(), get_new_symbol()), dtype=types.fp32) ] ) def prog(x): shape = mb.shape(x=x) res = mb.reshape(x=shape, shape=(1, 2)) res_sym_val = res.sym_val assert res_sym_val is not None assert res_sym_val.shape == (1, 2) assert res_sym_val[0][0] == shape.sym_val[0] assert res_sym_val[0][1] == shape.sym_val[1] return res @staticmethod def test_reshape_value_inference_is_inplace(): @mb.program() def prog(): const = mb.const(val=[[2, 3], [4, 5]]) x = mb.reshape(x=const, 
shape=(4, 1)) x.val[0, 0] = 112 assert const.val[0, 0] == 112 return x class TestReverse: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): val = np.array([[-1.0, 2.0, -3.0], [4.0, -5.0, 6.0]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=val.shape)} input_values = {"x": val} def build(x): return [mb.reverse(x=x), mb.reverse(x=x, axes=[0])] expected_output_types = [(2, 3, types.fp32), (2, 3, types.fp32)] expected_outputs = [ np.array([[6.0, -5.0, 4.0], [-3.0, 2.0, -1.0]], dtype=np.float32), np.array([[4.0, -5.0, 6.0], [-1.0, 2.0, -3.0]], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): val = np.array([[-1.0, 7.0, -3.0], [4.0, -5.0, 8.0]], dtype=np.float32) res = mb.reverse(x=val, axes=[0]) np.testing.assert_allclose(np.flip(val, axis=0), res.val, atol=1e-04, rtol=1e-05) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_symbolic(self, compute_unit, backend): s0 = get_new_symbol() val = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=(s0, 3))} input_values = {"x": val} def build(x): return [ mb.reverse(x=x, axes=[1]), mb.reverse(x=x, axes=[0]), ] expected_output_types = [ (s0, 3, types.fp32), (s0, 3, types.fp32), ] expected_outputs = [ np.array([[3.0, 2.0, 1.0], [6.0, 5.0, 4.0]], dtype=np.float32), np.array([[4.0, 5.0, 6.0], [1.0, 2.0, 3.0]], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestReverseSequence: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x_val = np.array( [ [1, 2, 3, 4, 5, 0, 0, 0], [1, 2, 0, 0, 0, 0, 0, 0], [1, 2, 3, 4, 0, 0, 0, 0], [1, 2, 3, 4, 5, 6, 7, 8], ], dtype=np.float32, ) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} input_values = {"x": x_val} def build(x): return [ mb.reverse_sequence(x=x, lengths=[7, 2, 3, 5], seq_axis=1, batch_axis=0), ] expected_output_types = [ (4, 8, types.fp32), ] expected_outputs = [ np.array( [ [0, 0, 5, 4, 3, 2, 1, 0], [2, 1, 0, 0, 0, 0, 0, 0], [3, 2, 1, 4, 0, 0, 0, 0], [5, 4, 3, 2, 1, 6, 7, 8], ], dtype=np.float32, ) ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_symbolic(self, compute_unit, backend): s0 = get_new_symbol() x_val = np.array( [ [1, 2, 3, 4, 5, 0, 0, 0], [1, 2, 0, 0, 0, 0, 0, 0], [1, 2, 3, 4, 0, 0, 0, 0], [1, 2, 3, 4, 5, 6, 7, 8], ], dtype=np.float32, ) input_placeholders = {"x": mb.placeholder(shape=(4, s0))} input_values = {"x": x_val} def build(x): return [ mb.reverse_sequence(x=x, lengths=[7, 2, 3, 5], seq_axis=1, batch_axis=0), ] expected_output_types = [ (4, s0, types.fp32), ] expected_outputs = [ np.array( [ [0, 0, 5, 4, 3, 2, 1, 0], [2, 1, 0, 0, 0, 0, 0, 0], [3, 2, 1, 4, 0, 0, 0, 0], [5, 4, 3, 2, 1, 6, 7, 8], ], dtype=np.float32, ) ] run_compare_builder( build, input_placeholders, input_values, 
expected_output_types, expected_outputs, inputs=construct_inputs_from_placeholders(input_placeholders, 10) if backend.backend == "mlprogram" else None, compute_unit=compute_unit, backend=backend, ) class TestSliceByIndex: @pytest.mark.parametrize( "compute_unit, backend, x_dtype, idx_dtype", itertools.product( compute_units, backends, (np.float16, np.float32, np.int32), (np.int32,), ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, x_dtype, idx_dtype): x_builtin_dtype = types.numpy_type_to_builtin_type(x_dtype) idx_builtin_dtype = types.numpy_type_to_builtin_type(idx_dtype) x_val = np.array(list(range(24))).reshape((2, 3, 4)).astype(x_dtype) begin_val = np.array([1, 1, 1], dtype=idx_dtype) end_val = np.array([2, 3, 3], dtype=idx_dtype) input_placeholders = { "x": mb.placeholder(shape=x_val.shape, dtype=x_builtin_dtype), "begin": mb.placeholder(shape=begin_val.shape, dtype=idx_builtin_dtype), "end": mb.placeholder(shape=end_val.shape, dtype=idx_builtin_dtype), } input_values = {"x": x_val, "begin": begin_val, "end": end_val} def build(x, begin, end): begin_c = mb.const(val=begin_val) end_c = mb.const(val=end_val) return [ mb.slice_by_index(x=x, begin=begin, end=end), mb.slice_by_index(x=x, begin=begin_c, end=end_c), ] expected_output_types = [(UNK_SYM, UNK_SYM, UNK_SYM, x_builtin_dtype)] * 2 expected_outputs = [np.array([[[17, 18], [21, 22]]], dtype=x_dtype)] * 2 run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) def test_type_inference(self): s0 = get_new_symbol() s1 = get_new_symbol() s2 = get_new_symbol() input_placeholders = { "x": mb.placeholder(shape=(10, s0, s1, s2)), } def build(x): return [ mb.slice_by_index( x=x, begin=[2, 5, 6, 12], end=[6, 9, 20, -9], stride=[2, 1, 2, 1] ), mb.slice_by_index( x=x, begin=[-2, -5, -3, 9], end=[-6, -9, -6, -7], stride=[-2, -1, -2, 1], ), mb.slice_by_index( x=x, begin=[0, 0, 0, 0], end=[-6, -9, 3, -2], stride=[-2, -3, 1, 2], begin_mask=[True, True, True, True], end_mask=[False, False, False, False], ), mb.slice_by_index( x=x, begin=[-2, 5, -1, -7], end=[0, 0, 0, 0], stride=[-2, -3, 1, -2], begin_mask=[False, False, False, False], end_mask=[True, True, True, True], ), mb.slice_by_index( x=x, begin=[4, -1, 0, -5], end=[4, -1, 0, -5], stride=[1, -1, 2, -2] ), mb.slice_by_index( x=x, begin=[0, -1, 0, 2], end=[2, 0, 0, 2], begin_mask=[False, False, False, False], end_mask=[False, True, True, False], stride=[1, 2, -2, 1], ), mb.slice_by_index( x=x, begin=[0, 2, -3, 0], end=[1, 3, -4, 4], begin_mask=[False, False, False, False], end_mask=[False, False, False, False], stride=[1, 1, -1, 1], ), ] expected_output_types = [ (2, UNK_SYM, UNK_SYM, UNK_SYM, types.fp32), (2, UNK_SYM, UNK_SYM, UNK_SYM, types.fp32), (3, UNK_SYM, UNK_SYM, UNK_SYM, types.fp32), (5, UNK_SYM, 1, UNK_SYM, types.fp32), (0, 0, 0, 0, types.fp32), (2, 1, 1, 0, types.fp32), (1, 1, 1, UNK_SYM, types.fp32), ] run_compare_builder( build, input_placeholders, expected_output_types=expected_output_types, frontend_only=True, ) @pytest.mark.xfail(reason="rdar://99664032") @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_single_element_edge_case(self, compute_unit, backend): x_val = np.array(list(range(6))).reshape((1, 3, 2)).astype(np.float32) input_placeholders = { "x": mb.placeholder(shape=x_val.shape), } input_values = {"x": x_val} def build(x): return mb.slice_by_index( x=x, begin=[-1, 0, 0], end=[-2, 0, 0], stride=[-1, 1, 
1], begin_mask=[False, True, True], end_mask=[False, True, True], ) expected_output_types = [(1, 3, 2, types.fp32)] expected_outputs = [np.array([[[0, 1], [2, 3], [4, 5]]], dtype=np.float32)] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval_scalar_output_corner_cases(self): x1 = np.array([2.0]) x2 = np.array([[[[1.0], [3.0]]]]) v = [ mb.slice_by_index( x=x1, begin=[ 0, ], end=[0], squeeze_mask=[True], ), mb.slice_by_index( x=x2, begin=[0, 0, 0, 0], end=[0, 0, 0, 0], squeeze_mask=[True, True, True, True], ), ] assert v[0].val.shape == () assert v[0].val == 2 assert v[1].val.shape == () assert v[1].val == 1 @ssa_fn def test_builder_eval(self): x_val = np.array(list(range(24))).reshape((2, 3, 4)) v = [ mb.slice_by_index(x=x_val, begin=[1, 1, 1], end=[2, 2, 2]), # x_val[1:2, 1:2, 1:2] mb.slice_by_index( x=x_val, begin=[1, 1, 1], end=[2, 3, 4], stride=[1, 1, 2] ), # x_val[1:2, 1:3, 1:4:2] mb.slice_by_index( x=x_val, begin=[-3, -3, -3], end=[-1, -1, -1] ), # x_val[-3:-1, -3:-1, -3:-1] mb.slice_by_index( x=x_val, begin=[0, 0, -3], end=[-1, -2, -2] ), # x_val[0:-1, 0:-2, -3:-2] mb.slice_by_index( x=x_val, begin=[-1, -1, -1], end=[0, 1, -3], stride=[-2, -1, -3] ), # x_val[-1:0:-2, -1:1:-1, -1:-3:-3] mb.slice_by_index( x=x_val, begin=[1, 1, 1], end=[2, 3, 4], stride=[1, 1, 2], begin_mask=[True, False, True], ), # x_val[:2, 1:3, :4:2] mb.slice_by_index( x=x_val, begin=[1, 1, 1], end=[2, 3, 4], stride=[1, 1, 2], begin_mask=[True, False, True], end_mask=[True, True, False], ), # x_val[:, 1:, :4:2] mb.slice_by_index( x=x_val, begin=[1, 1, 1], end=[2, 3, 3], stride=[1, 1, 2], begin_mask=[False, False, True], end_mask=[True, False, False], squeeze_mask=[False, True, False], ), # x_val[1::1, 1, :3:2] mb.slice_by_index( x=x_val, begin=[0, 0, 0], end=[0, 0, 0], stride=[1, 1, 1], begin_mask=[True, True, True], end_mask=[True, True, True], ), # x_val[:, :, :] mb.slice_by_index( x=x_val, begin=[1, 1, 1], end=[2, 2, 0], stride=[1, 1, 1], squeeze_mask=[False, False, True], ), # x_val[1:2, 1:2, 1] mb.slice_by_index( x=x_val, begin=[1, 0, 0], end=[2, 0, 0], stride=[1, 1, 1], begin_mask=[False, True, True], end_mask=[False, True, True], ), # x_val[1:2, ...] mb.slice_by_index( x=x_val, begin=[0, 0, 0], end=[0, 0, 0], stride=[1, 1, 1], begin_mask=[True, True, True], end_mask=[True, True, True], ), # x_val[...] 
mb.slice_by_index( x=x_val, begin=[1, 0, 1], end=[2, 0, 2], stride=[1, 1, 1], begin_mask=[False, True, False], end_mask=[False, True, False], ), # x_val[1:2, ..., 1:2] mb.slice_by_index( x=x_val, begin=[0, 0, 1], end=[0, 0, 0], stride=[1, 1, 1], begin_mask=[True, True, False], end_mask=[True, True, False], squeeze_mask=[False, False, True], ), # x_val[..., 1] mb.slice_by_index( x=x_val, begin=[0, 0, 0], end=[0, 0, 0], stride=[1, 1, 1], begin_mask=[False, False, True], end_mask=[False, False, True], squeeze_mask=[True, True, False], ), # x_val[0, 0, :] mb.slice_by_index( x=x_val, begin=[1, 0, 0], end=[2, 0, 0], stride=[1, 1, 1], begin_mask=[False, True, True], end_mask=[False, True, True], ), # x_val[1:2] mb.slice_by_index( x=x_val, begin=[1, 1, 0], end=[2, 2, 0], stride=[1, 1, 1], begin_mask=[False, False, True], end_mask=[False, False, True], ), # x_val[1:2, 1:2] mb.slice_by_index( x=x_val, begin=[1, 0, 0], end=[0, 0, 0], stride=[1, 1, 1], begin_mask=[False, True, True], end_mask=[False, True, True], squeeze_mask=[True, False, False], ), # x_val[1] mb.slice_by_index( x=x_val, begin=[0, 0, 0], end=[0, 0, 0], begin_mask=[True, True, True], end_mask=[True, True, True], ), # x_val[:] mb.slice_by_index( x=x_val, begin=[0, 0, 0], end=[0, 0, 0], stride=[1, 1, -1], begin_mask=[True, True, True], end_mask=[True, True, True], ), # x_val[..., ::-1] ] ans = [ x_val[1:2, 1:2, 1:2], x_val[1:2, 1:3, 1:4:2], x_val[-3:-1, -3:-1, -3:-1], x_val[0:-1, 0:-2, -3:-2], x_val[-1:0:-2, -1:1:-1, -1:-3:-3], x_val[:2, 1:3, :4:2], x_val[:, 1:, :4:2], x_val[1::1, 1, :3:2], x_val[:, :, :], x_val[1:2, 1:2, 1], x_val[1:2, ...], x_val[...], x_val[1:2, ..., 1:2], x_val[..., 1], x_val[0, 0, :], x_val[1:2], x_val[1:2, 1:2], x_val[1], x_val[:], x_val[..., ::-1], ] for idx in range(len(v)): assert ans[idx].shape == v[idx].shape np.testing.assert_allclose(ans[idx], v[idx].val, atol=1e-04, rtol=1e-05) @staticmethod @pytest.mark.skipif(ct.utils._macos_version() < (14, 0), reason="Bug fixed in macOS 14") def test_slice_by_index(): INPUT_SHAPE = (1, 2, 8, 16) @mb.program(input_specs=[mb.TensorSpec(shape=INPUT_SHAPE)]) def prog(x): x = mb.slice_by_index( x=x, begin=[0, 0, 0, 0], end=[1, 2, 8, 12], stride=[1, 1, 2, 2], begin_mask=None, end_mask=None, squeeze_mask=None, ) return x x = np.float16(np.random.rand(*INPUT_SHAPE)) # slice by index is x[begin[0]: end[0]: stride[0], begin[1]: end[1]: stride[1], ...] y_numpy = x[0:1:1, 0:2:1, 0:8:2, 0:12:2] model = ct.convert(prog, source="milinternal", convert_to="neuralnetwork") y_neuralnetwork = list(model.predict({"x": x}).values())[0] np.testing.assert_allclose(y_numpy, y_neuralnetwork) model = ct.convert( prog, source="milinternal", convert_to="mlprogram", compute_units=ct.ComputeUnit.CPU_ONLY, ) y_mlprogram = list(model.predict({"x": x}).values())[0] assert y_numpy.shape == y_mlprogram.shape np.testing.assert_allclose(y_numpy, y_mlprogram) @staticmethod @pytest.mark.skipif(ct.utils._macos_version() < (14, 0), reason="Bug fixed in macOS 14") def test_slice_by_index_slice_squeeze_separate(): INPUT_SHAPE = (1, 2, 8, 16) @mb.program(input_specs=[mb.TensorSpec(shape=INPUT_SHAPE)]) def prog(x): x = mb.slice_by_index( x=x, begin=[0, 0, 0, 0], end=[1, 2, 8, 12], stride=[1, 1, 1, 2], begin_mask=None, end_mask=None, squeeze_mask=[True, False, False, False], ) return x x = np.random.rand(*INPUT_SHAPE) # slice by index is x[begin[0]: end[0]: stride[0], begin[1]: end[1]: stride[1], ...] 
# and squeeze dim 0 y_numpy = x[0:1:1, 0:2:1, 0:8:1, 0:12:2] y_numpy = np.squeeze(y_numpy, axis=0) model = ct.convert(prog, source="milinternal", convert_to="neuralnetwork") y_neuralnetwork = list(model.predict({"x": x}).values())[0] assert y_numpy.shape == y_neuralnetwork.shape np.testing.assert_allclose(y_numpy, y_neuralnetwork) model = ct.convert(prog, source="milinternal", convert_to="mlprogram") y_mlprogram = list(model.predict({"x": x}).values())[0] # TODO: rdar://103365766 MLProgram does not apply squeeze_mask. # np.testing.assert_allclose(y_numpy, y_mlprogram) class TestSliceBySize: @pytest.mark.parametrize( "compute_unit, backend, size_val, x_dtype, idx_dtype", itertools.product( compute_units, backends, ([1, 2, 3], [-1, 2, -1]), (np.float16, np.float32, np.int32), (np.int32,), ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, size_val, x_dtype, idx_dtype): def build(x, begin): return mb.slice_by_size(x=x, begin=begin, size=np.array(size_val, dtype=idx_dtype)) x_builtin_dtype = types.numpy_type_to_builtin_type(x_dtype) idx_builtin_dtype = types.numpy_type_to_builtin_type(idx_dtype) x_val = np.array(list(range(24))).reshape((2, 3, 4)).astype(x_dtype) begin_val = np.array([1, 1, 1], dtype=idx_dtype) input_placeholders = { "x": mb.placeholder(shape=x_val.shape, dtype=x_builtin_dtype), "begin": mb.placeholder(shape=begin_val.shape, dtype=idx_builtin_dtype), } input_values = {"x": x_val, "begin": begin_val} expected_outputs = np.array([[[17, 18, 19], [21, 22, 23]]], dtype=x_dtype) expected_output_types = tuple([dim if dim != -1 else UNK_SYM for dim in size_val]) + ( x_builtin_dtype, ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x = np.array(list(range(24))).reshape(2, 3, 4) v_1 = mb.slice_by_size(x=x, begin=(0, 1, 0), size=(-1, -1, -1)) v_2 = mb.slice_by_size(x=x, begin=(0, 1, 0), size=(-1, -1, 3)) v_3 = mb.slice_by_size(x=x, begin=(0, -2, 0), size=(-1, -1, 3)) np.testing.assert_allclose(x[:, 1:, :], v_1.val, atol=1e-04, rtol=1e-05) np.testing.assert_allclose(x[:, 1:, :3], v_2.val, atol=1e-04, rtol=1e-05) np.testing.assert_allclose(x[:, -2:, :3], v_3.val, atol=1e-04, rtol=1e-05) class TestSpaceToDepth: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): # original input type is (1, 1, 2, 2, fp32) val = np.array([[[[7.0, 9.0], [4.0, 6.0]]]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=val.shape)} input_values = {"x": val} def build(x): return [mb.space_to_depth(x=x, block_size=2)] expected_output_types = (1, 4, 1, 1, types.fp32) expected_outputs = np.array([[[[7.0]], [[9.0]], [[4.0]], [[6.0]]]], dtype=np.float32) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestSqueeze: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x = np.array([[[[1], [2], [3]]]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=x.shape)} input_values = {"x": x} def build(x): return [ mb.squeeze(x=x, axes=(-1,)), mb.squeeze(x=x, axes=(-3, 0)), mb.squeeze(x=x, axes=(0, 1, 3)), mb.squeeze(x=x), ] expected_output_types = [ (1, 1, 3, types.fp32), (3, 1, types.fp32), (3, types.fp32), (3, 
types.fp32), ] expected_outputs = [ np.array([[[1, 2, 3]]], dtype=np.float32), np.array([[1], [2], [3]], dtype=np.float32), np.array([1, 2, 3], dtype=np.float32), np.array([1, 2, 3], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x = np.array([[[[1], [2], [3]], [[4], [5], [6]]]], dtype=np.float32) v = mb.squeeze(x=x, axes=(-4, 3)) np.testing.assert_allclose(np.squeeze(x, axis=(-4, 3)), v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_eval_rank_0(self): x = np.array([1], dtype=np.float32) v = mb.squeeze(x=x) assert v.shape == () assert type(v.val) == np.float32 assert np.isclose(np.squeeze(x), v.val) @staticmethod def test_squeeze_value_inference_is_inplace(): @mb.program() def prog(): const = mb.const(val=[[[2, 3], [4, 5]]]) x = mb.squeeze(x=const, axes=(0,)) x.val[0, 0] = 112 assert const.val[0, 0, 0] == 112 return x @staticmethod def test_squeeze_invalid_axis(): with pytest.raises( ValueError, match="Invalid axis 3 in squeeze. The axis should be smaller than 3" ): @mb.program() def prog(): const = mb.const(val=[[[2, 3], [4, 5]]]) x = mb.squeeze(x=const, axes=(3,)) return x @pytest.mark.parametrize( "compute_unit, backend, is_symbolic", itertools.product( compute_units, backends, (True, False), ), ) def test_non_single_element_dim(self, compute_unit, backend, is_symbolic): if backend.backend == "neuralnetwork": pytest.skip("neuralnetwork backend doesn't support squeeze a not-1 dimension") if compute_unit == ct.ComputeUnit.CPU_ONLY: pytest.xfail("CPU failed non-single-dim squeeze (rdar://124555262)") x = np.arange(2 * 3 * 4, dtype=np.int32).reshape(2, 3, 4) input_shape = ( [get_new_symbol(), get_new_symbol(), get_new_symbol()] if is_symbolic else x.shape ) input_placeholders = {"x": mb.placeholder(shape=input_shape)} input_values = {"x": x} def build(x): return [ mb.squeeze(x=x, axes=(-1,)), mb.squeeze(x=x, axes=(-2, 0)), mb.squeeze(x=x, axes=(0, 1, 2)), mb.squeeze(x=x), ] # The symbolic dim won't be squeezed, so it doesn't affect the output. 
expected_output_types = [tuple(input_shape) + (types.int32,)] * 4 expected_outputs = [x] * 4 run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, inputs=construct_inputs_from_placeholders(input_placeholders, 10) if backend.backend == "mlprogram" else None, compute_unit=compute_unit, backend=backend, ) class TestTranspose: @pytest.mark.parametrize( "compute_unit, backend, is_symbolic", itertools.product( compute_units, backends, [True, False], ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, is_symbolic): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) input_shape = x.shape if is_symbolic: input_shape = [get_new_symbol(), get_new_symbol()] input_placeholders = {"x": mb.placeholder(shape=input_shape)} input_values = {"x": x} def build(x): return [ mb.transpose(x=x, perm=(0, 1)), mb.transpose(x=x, perm=(1, 0)), mb.transpose(x=x, perm=(-1, 0)), mb.transpose(x=x, perm=(-2, -1)), ] d0 = input_shape[0] d1 = input_shape[1] expected_output_types = [ (d0, d1, types.fp32), (d1, d0, types.fp32), (d1, d0, types.fp32), (d0, d1, types.fp32), ] expected_outputs = [x, x.T, x.T, x] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, inputs=construct_inputs_from_placeholders(input_placeholders, 10) if backend.backend == "mlprogram" else None, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) v = mb.transpose(x=x, perm=(1, 0)) np.testing.assert_allclose(x.T, v.val, atol=1e-04, rtol=1e-05) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_symbolic(self, compute_unit, backend): s0 = get_new_symbol() input_placeholders = { "x": mb.placeholder(shape=(2, s0)), } def build(x): return [ mb.transpose(x=x, perm=[1, 0]), ] expected_output_types = [ (s0, 2, types.fp32), ] expected_outputs = [ np.array([[1, 4], [2, 5], [3, 6]], dtype=np.float32), ] input_values = { "x": np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32), } run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, inputs=construct_inputs_from_placeholders(input_placeholders, 10) if backend.backend == "mlprogram" else None, compute_unit=compute_unit, backend=backend, ) @staticmethod def test_transpose_value_inference_is_inplace(): @mb.program() def prog(): const = mb.const(val=[[2, 3], [4, 5]]) x = mb.transpose(x=const, perm=(0, 1)) x.val[0, 0] = 112 assert const.val[0, 0] == 112 return x class TestPixelShuffle: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): # original input type is (1, 4, 1, 1, fp32) val = np.array([[[[9.0]], [[5.0]], [[1.0]], [[3.0]]]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=val.shape)} input_values = {"x": val} def build(x): return [mb.pixel_shuffle(x=x, upscale_factor=2)] expected_output_types = (1, 1, 2, 2, types.fp32) expected_outputs = np.array([[[[9.0, 5.0], [1.0, 3.0]]]], dtype=np.float32) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.skipif(not testing_reqs._HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( "compute_unit, backend, shape, upscale_factor", itertools.product( compute_units, backends, [(1, 16, 1, 1), (2, 16, 3, 3), 
(1, 32, 1, 1)], [2, 4], ), ) def test_builder_to_backend_stress(self, compute_unit, backend, shape, upscale_factor): val = np.random.rand(*shape) input_placeholders = {"x": mb.placeholder(shape=val.shape)} input_values = {"x": val} def build(x): return [mb.pixel_shuffle(x=x, upscale_factor=upscale_factor)] torch_pixel_shuffle = torch.nn.PixelShuffle(upscale_factor) expected_outputs = [torch_pixel_shuffle(torch.Tensor(val)).numpy()] expected_output_types = [o.shape[:] + (types.fp32,) for o in expected_outputs] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestSlidingWindows: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): # original input type is (1, 4, 1, 1, fp32) val = np.array([[[[9.0]], [[5.0]], [[1.0]], [[3.0]]]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=val.shape)} input_values = {"x": val} def build(x): return [mb.sliding_windows(x=x, axis=1, size=2)] expected_output_types = (1, 3, 2, 1, 1, types.fp32) expected_outputs = np.array( [[[[[9.0]], [[5.0]]], [[[5.0]], [[1.0]]], [[[1.0]], [[3.0]]]]], dtype=np.float32, ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank_and_axis, size, stride", itertools.product( compute_units, backends, [(rank, axis) for rank in range(1, 5) for axis in range(-rank, rank)], [1, 2], [1, 2], ), ) def test_builder_to_backend_stress(self, compute_unit, backend, rank_and_axis, size, stride): def np_sliding_windows(a, np_axis, np_size, np_stride): n = (a.shape[np_axis] - np_size) // np_stride + 1 x_shape = list(a.shape) x_shape[np_axis] = n if np_axis < 0: np_axis += len(x_shape) x_shape.insert(np_axis + 1, np_size) strides = list(a.strides) eff_stride = strides[np_axis] * np_stride strides.insert(np_axis, eff_stride) return np.lib.stride_tricks.as_strided(a, x_shape, strides) rank, axis = rank_and_axis shape = np.random.randint(low=2, high=5, size=rank) val = np.random.rand(*shape) input_placeholders = {"x": mb.placeholder(shape=val.shape)} input_values = {"x": val} def build(x): return [mb.sliding_windows(x=x, axis=axis, size=size, stride=stride)] expected_outputs = [np_sliding_windows(val, np_axis=axis, np_size=size, np_stride=stride)] expected_output_types = [o.shape[:] + (types.fp32,) for o in expected_outputs] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestConcat: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): t1 = np.array([[1, 2], [4, 5]], dtype=np.float32) t2 = np.array([[7, 8]], dtype=np.float32) input_placeholders = { "x": mb.placeholder(shape=t1.shape), "y": mb.placeholder(shape=t2.shape), } input_values = {"x": t1, "y": t2} def build(x, y): return (mb.concat(values=(x, y), axis=0),) expected_output_types = [ (3, 2, types.fp32), ] expected_outputs = [ np.array([[1, 2], [4, 5], [7, 8]], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, rank, n_inputs, 
negative_index", itertools.product( compute_units, backends, [1, 2, 3, 4, 5], [2, 3], [False, True], ), ) def test_builder_to_backend_stress_interleave( self, compute_unit, backend, rank, n_inputs, negative_index ): def np_concat_interleave(arrays, axis): step = len(arrays) in_shape = arrays[0].shape out_shape = list(in_shape) if axis < 0: axis += len(in_shape) out_shape[axis] = step * in_shape[axis] concat_tensor = np.empty(tuple(out_shape), dtype=np.float32) for i in range(step): if rank == 5: if axis == 4: concat_tensor[:, :, :, :, i::step] = arrays[i] if axis == 3: concat_tensor[:, :, :, i::step, :] = arrays[i] if axis == 2: concat_tensor[:, :, i::step, :, :] = arrays[i] if axis == 1: concat_tensor[:, i::step, :, :, :] = arrays[i] if axis == 0: concat_tensor[i::step, :, :, :, :] = arrays[i] if rank == 4: if axis == 3: concat_tensor[:, :, :, i::step] = arrays[i] if axis == 2: concat_tensor[:, :, i::step, :] = arrays[i] if axis == 1: concat_tensor[:, i::step, :, :] = arrays[i] if axis == 0: concat_tensor[i::step, :, :, :] = arrays[i] if rank == 3: if axis == 2: concat_tensor[:, :, i::step] = arrays[i] if axis == 1: concat_tensor[:, i::step, :] = arrays[i] if axis == 0: concat_tensor[i::step, :, :] = arrays[i] if rank == 2: if axis == 1: concat_tensor[:, i::step] = arrays[i] if axis == 0: concat_tensor[i::step, :] = arrays[i] if rank == 1: concat_tensor[i::step] = arrays[i] return concat_tensor input_shape = [4, 2, 3, 6, 5] for axis in range(rank): if negative_index: axis = axis - rank shape = tuple(input_shape[:rank]) t1 = np.random.normal(size=shape).astype(np.float32) t2 = np.random.normal(size=shape).astype(np.float32) all_input_arrs = [t1, t2] input_placeholders = { "x": mb.placeholder(shape=t1.shape), "y": mb.placeholder(shape=t2.shape), } input_values = {"x": t1, "y": t2} if n_inputs == 3: t3 = np.random.normal(size=shape).astype(np.float32) input_placeholders["z"] = mb.placeholder(shape=t3.shape) input_values["z"] = t3 all_input_arrs.append(t3) def build_2_inputs(x, y): return (mb.concat(values=(x, y), axis=axis, interleave=True),) def build_3_inputs(x, y, z): return (mb.concat(values=(x, y, z), axis=axis, interleave=True),) np_out = np_concat_interleave(all_input_arrs, axis) expected_output_types = [np_out.shape + (types.fp32,)] expected_outputs = [np_out] run_compare_builder( build_3_inputs if n_inputs == 3 else build_2_inputs, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): values = [ np.random.rand(1, 1, 6, 2), np.random.rand(1, 1, 3, 2), ] v = mb.concat(values=values, axis=2) np.testing.assert_allclose(np.concatenate(values, 2), v.val, atol=1e-04, rtol=1e-05) @ssa_fn def test_builder_eval_failure(self): values = [ np.random.rand(1, 1, 6, 2), np.random.rand(1, 1, 3, 1), ] with pytest.raises(ValueError): mb.concat(values=values, axis=2) class TestSplit: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.float32) input_placeholders = { "x": mb.placeholder(shape=t.shape), } input_values = {"x": t} def build(x): return mb.split(x=x, num_splits=2, axis=1) + mb.split(x=x, split_sizes=[1, 2], axis=0) expected_output_types = [ (3, 1, types.fp32), (3, 1, types.fp32), (1, 2, types.fp32), (2, 2, types.fp32), ] expected_outputs = [ np.array([[1], [3], [5]], dtype=np.float32), np.array([[2], [4], [6]], dtype=np.float32), 
np.array([[1, 2]], dtype=np.float32), np.array([[3, 4], [5, 6]], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): t = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.float32) vs = mb.split(x=t, num_splits=3, axis=0) es = np.split(t, [1, 2, 3], axis=0) for v, e in zip(vs, es): np.testing.assert_allclose(e, v.val, atol=1e-04, rtol=1e-05) class TestStack: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): t1 = np.array([1, 2, 3], dtype=np.float32) t2 = np.array([7, 8, 9], dtype=np.float32) input_placeholders = { "x": mb.placeholder(shape=t1.shape), "y": mb.placeholder(shape=t2.shape), } input_values = {"x": t1, "y": t2} def build(x, y): return [ mb.stack(values=(x, y), axis=0), mb.stack(values=(x, y), axis=1), mb.stack(values=(x, y), axis=-1), ] expected_output_types = [ (2, 3, types.fp32), (3, 2, types.fp32), (3, 2, types.fp32), ] expected_outputs = [ np.array([[1, 2, 3], [7, 8, 9]], dtype=np.float32), np.array([[1, 7], [2, 8], [3, 9]], dtype=np.float32), np.array([[1, 7], [2, 8], [3, 9]], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @ssa_fn def test_builder_eval(self): values = [ np.random.rand(1, 1, 3, 2).astype(np.float32), np.random.rand(1, 1, 3, 2).astype(np.float32), ] v = mb.stack(values=values, axis=2) np.testing.assert_allclose(np.stack(values, 2), v.val, atol=1e-04, rtol=1e-05) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2415469 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS15/0000755000000000000000000000000014672075535023400 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS15/__init__.py0000644000000000000000000000061714672066616025515 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import coremltools as ct from coremltools.converters.mil.testing_reqs import backends_internal, clean_up_backends backends = clean_up_backends(backends_internal, ct.target.iOS15) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS15/test_elementwise_unary.py0000644000000000000000000000224014672066616030546 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest from coremltools.converters.mil.mil.ops.defs.iOS15 import elementwise_unary # Mock class to simulate the input_var behavior class MockInputVar: def __init__(self, val, sym_type): self.val = val self.sym_type = sym_type class TestCast: NUMPY_DTYPE_TO_STRING = { np.int32: "int32", np.float16: "fp16", np.float32: "fp32", np.bool_: "bool", } @pytest.mark.parametrize( "value, dtype", itertools.product( [2.0, (0.0, 1.0)], [np.int32, np.float16, np.float32, np.bool_], ), ) def test_cast(self, value, dtype): input_var = MockInputVar(val=value, sym_type=None) output = elementwise_unary.cast.get_cast_value( input_var, self.NUMPY_DTYPE_TO_STRING[dtype] ) expected_output = dtype(value) np.testing.assert_array_equal(output, expected_output) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS15/test_image_resizing.py0000644000000000000000000003275114672066616030015 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest import coremltools as ct from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import get_new_symbol, types from coremltools.converters.mil.mil.ops.tests.iOS15 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import ( mark_api_breaking, run_compare_builder, ) from coremltools.converters.mil.testing_reqs import compute_units class TestAffine: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): x_val = np.array([11.0, 22.0, 33.0, 44.0], dtype=np.float32).reshape([1, 1, 2, 2]) transform_matrix_val = np.array( [-1.0, -2.0, -3.7, -1.0, 3.5, 1.2], dtype=np.float32 ).reshape([1, 6]) input_placeholder_dict = { "x": mb.placeholder(shape=x_val.shape), "transform_matrix": mb.placeholder(shape=transform_matrix_val.shape), } input_value_dict = {"x": x_val, "transform_matrix": transform_matrix_val} def build(x, transform_matrix): return [ mb.affine( x=x, transform_matrix=transform_matrix, output_height=3, output_width=3, sampling_mode="bilinear", padding_mode="constant", padding_value=0.0, coordinates_mode="normalized_minus_one_to_one", align_corners=True, ), mb.affine( x=x, transform_matrix=transform_matrix, output_height=2, output_width=5, sampling_mode="bilinear", padding_mode="constant", padding_value=0.0, coordinates_mode="normalized_minus_one_to_one", align_corners=True, ), ] expected_output_types = [ (1, 1, 3, 3, types.fp32), (1, 1, 2, 5, types.fp32), ] expected_outputs = [ np.array( [10.752501, 2.5025, 0.0, 1.9799997, 0.0, 0.0, 0.0, 0.0, 0.0], dtype=np.float32, ).reshape([1, 1, 3, 3]), np.array( [10.752501, 5.94, 2.5025, 0.44000006, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], dtype=np.float32, ).reshape([1, 1, 2, 5]), ] run_compare_builder( build, input_placeholder_dict, input_value_dict, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestUpsampleNearestNeighborFractionalScales: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, 
backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): if compute_unit != ct.ComputeUnit.CPU_ONLY: pytest.xfail( "rdar://97398448 (TestUpsampleNearestNeighborFractionalScales failing on GPU)" ) x_val = np.array([1.5, -2.5, 3.5], dtype=np.float32).reshape([1, 1, 1, 3]) input_placeholder_dict = {"x": mb.placeholder(shape=x_val.shape)} input_value_dict = {"x": x_val} def build(x): return [ mb.upsample_nearest_neighbor( x=x, scale_factor_height=1.0, scale_factor_width=1.0, ), mb.upsample_nearest_neighbor( x=x, scale_factor_height=3.17, scale_factor_width=0.67 ), mb.upsample_nearest_neighbor( x=x, scale_factor_height=2.0, scale_factor_width=1.12, ), ] expected_output_types = [ (1, 1, 1, 3, types.fp32), (1, 1, 3, 2, types.fp32), (1, 1, 2, 3, types.fp32), ] expected_outputs = [ x_val, np.array([1.5, -2.5, 1.5, -2.5, 1.5, -2.5], dtype=np.float32).reshape([1, 1, 3, 2]), np.array([1.5, -2.5, 3.5, 1.5, -2.5, 3.5], dtype=np.float32).reshape([1, 1, 2, 3]), ] run_compare_builder( build, input_placeholder_dict, input_value_dict, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestResample: @staticmethod def _test_builder_to_backend_smoke(compute_unit, backend, coordinates_dtype, expected_cast_ops): x_ = np.array([11.0, 22.0, 33.0, 44.0], dtype=np.float32).reshape([1, 1, 2, 2]) coordinates_ = ( np.array([-1.0, -2.0, -3.7, -1.0, 0.0, 0.0, 3.5, 1.2], dtype=np.float32) .reshape([1, 2, 2, 2]) .astype(coordinates_dtype) ) if np.issubdtype(coordinates_dtype, np.integer): coordinates_ = ( np.array([0, 0, 1, 1, 0, 0, 1, 1]).reshape([1, 2, 2, 2]).astype(coordinates_dtype) ) expected_output_type = (1, 1, 2, 2, types.fp32) def build_0(x, coordinates): return mb.resample( x=x, coordinates=coordinates, sampling_mode="bilinear", padding_mode="constant", padding_value=6.17, coordinates_mode="normalized_minus_one_to_one", align_corners=True, ) expected_output_0 = np.array([8.585, 6.17, 27.5, 6.17], dtype=np.float32) if np.issubdtype(coordinates_dtype, np.integer): expected_output_0 = np.array([27.5, 44.0, 27.5, 44.0], dtype=np.float32) expected_output_0 = expected_output_0.reshape(expected_output_type[:-1]) def build_1(x, coordinates): return mb.resample( x=x, coordinates=coordinates, sampling_mode="nearest", padding_mode="border", padding_value=-1.0, coordinates_mode="unnormalized", align_corners=False, ) expected_output_1 = np.array([11.0, 11.0, 11.0, 44.0], dtype=np.float32) if np.issubdtype(coordinates_dtype, np.integer): expected_output_1 = np.array([11.0, 44.0, 11.0, 44.0], dtype=np.float32) expected_output_1 = expected_output_1.reshape(expected_output_type[:-1]) def build_2(x, coordinates): return mb.resample( x=x, coordinates=coordinates, sampling_mode="bilinear", padding_mode="reflection", padding_value=-1.0, coordinates_mode="normalized_zero_to_one", align_corners=True, ) expected_output_2 = np.array([22.0, 36.3, 11.0, 34.1], dtype=np.float32) if np.issubdtype(coordinates_dtype, np.integer): expected_output_2 = np.array([11.0, 44.0, 11.0, 44.0], dtype=np.float32) expected_output_2 = expected_output_2.reshape(expected_output_type[:-1]) def build_3(x, coordinates): return mb.resample( x=x, coordinates=coordinates, sampling_mode="nearest", padding_mode="symmetric", padding_value=-1.0, coordinates_mode="normalized_zero_to_one", align_corners=False, ) expected_output_3 = np.array([22.0, 33.0, 11.0, 33.0], dtype=np.float32) if np.issubdtype(coordinates_dtype, np.integer): expected_output_3 = np.array([11.0, 44.0, 11.0, 44.0], dtype=np.float32) 
expected_output_3 = expected_output_3.reshape(expected_output_type[:-1]) for build, expected_output in zip( [build_0, build_1, build_2, build_3], [ expected_output_0, expected_output_1, expected_output_2, expected_output_3, ], ): # Need to create placeholders inside the for loop to avoid interfering with each other. input_placeholder_dict = { "x": mb.placeholder(shape=x_.shape), "coordinates": mb.placeholder( shape=coordinates_.shape, dtype=types.numpy_type_to_builtin_type(coordinates_dtype), ), } input_value_dict = {"x": x_, "coordinates": coordinates_} mlmodel = run_compare_builder( build, input_placeholder_dict, input_value_dict, expected_output_type, expected_output, compute_unit=compute_unit, backend=backend, ) prog = mlmodel._mil_program number_of_cast = len(prog["main"].find_ops(op_type="cast")) assert number_of_cast == expected_cast_ops @mark_api_breaking(breaking_opset_version=ct.target.iOS16) @pytest.mark.parametrize( "compute_unit, backend, coordinates_dtype", itertools.product( compute_units, backends, (np.int32, np.float32), ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, coordinates_dtype): self._test_builder_to_backend_smoke(compute_unit, backend, coordinates_dtype, 2) class TestResizeBilinear: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): x = np.array([0, 1], dtype=np.float32).reshape(1, 1, 2) input_placeholder_dict = {"x": mb.placeholder(shape=x.shape)} input_value_dict = {"x": x} def build_mode_4(x): return mb.resize_bilinear( x=x, target_size_height=1, target_size_width=5, sampling_mode="UNALIGN_CORNERS", ) expected_output_type = (1, 1, 5, types.fp32) expected_output = np.array([0.0, 0.1, 0.5, 0.9, 1.0], dtype=np.float32).reshape(1, 1, 5) run_compare_builder( build_mode_4, input_placeholder_dict, input_value_dict, expected_output_type, expected_output, compute_unit=compute_unit, backend=backend, ) class TestCropResize: @mark_api_breaking(breaking_opset_version=ct.target.iOS17) @pytest.mark.parametrize( "compute_unit, backend, is_symbolic", itertools.product(compute_units, backends, [True, False]), ) def test_builder_to_backend_smoke(self, compute_unit, backend, is_symbolic): if compute_unit != ct.ComputeUnit.CPU_ONLY: pytest.xfail("rdar://97398582 (TestCropResize failing on mlprogram + GPU)") x = np.array( [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]], dtype=np.float32, ).reshape(1, 1, 4, 4) input_shape = list(x.shape) placeholder_input_shape = input_shape if is_symbolic: # set batch and channel dimensions to symbolic placeholder_input_shape[0] = get_new_symbol() placeholder_input_shape[1] = get_new_symbol() input_placeholder_dict = {"x": mb.placeholder(shape=placeholder_input_shape)} input_value_dict = {"x": x} N = 1 roi = np.array([[1, 1, 2, 2]], dtype=np.float32).reshape(1, 1, 4, 1, 1) roi_normalized = np.array([[0, 0.0, 0.0, 1.0 / 3, 1.0 / 3]], dtype=np.float32).reshape( 1, 1, 5, 1, 1 ) roi_invert = np.array([[2, 2, 1, 1]], dtype=np.float32).reshape(1, 1, 4, 1, 1) def build(x): return mb.crop_resize( x=x, roi=roi_invert, target_width=2, target_height=2, normalized_coordinates=True, box_coordinate_mode="CORNERS_HEIGHT_FIRST", sampling_mode="UNALIGN_CORNERS", ) expected_output_type = ( N, placeholder_input_shape[0], placeholder_input_shape[1], 2, 2, types.fp32, ) expected_output = np.array([3.5, 5.5, 11.5, 13.5], dtype=np.float32).reshape(1, 1, 1, 2, 2) run_compare_builder( build,
input_placeholder_dict, input_value_dict, expected_output_type, expected_output, compute_unit=compute_unit, backend=backend, ) @mark_api_breaking(breaking_opset_version=ct.target.iOS17) @pytest.mark.parametrize( "backend", backends, ) def test_default_value(self, backend): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 3, 640, 640))], opset_version=backend.opset_version) def prog(a): res = mb.crop_resize(x=a, roi=np.array([[[[[0]], [[0]], [[320]], [[320]]]]], dtype=np.float32), target_height=160, target_width=160) assert res.op.box_coordinate_mode.val == "CORNERS_HEIGHT_FIRST" assert res.op.sampling_mode.val == "DEFAULT" return res ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS15/test_tensor_transformation.py0000644000000000000000000000603214672066616031452 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest import coremltools as ct from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.ops.tests.iOS15 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import ( mark_api_breaking, run_compare_builder, ) from coremltools.converters.mil.testing_reqs import compute_units class TestReshape: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_reshape_with_zero(self, compute_unit, backend): t = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return [ mb.reshape(x=x, shape=[0, -1]), mb.reshape(x=x, shape=[0, 3]), mb.reshape(x=x, shape=[-1, 0]), ] expected_output_types = [ (2, 3, types.fp32), (2, 3, types.fp32), (2, 3, types.fp32), ] expected_outputs = [ np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32), np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32), np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @mark_api_breaking(breaking_opset_version=ct.target.iOS17) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_reshape_with_zero_different_len(self, compute_unit, backend): t = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return [ mb.reshape(x=x, shape=[1, 0, -1, 0]), ] expected_output_types = [ (1, 1, 2, 3, types.fp32), ] expected_outputs = [ np.array([[[[1, 2, 3], [4, 5, 6]]]], dtype=np.float32), ] with pytest.raises( ValueError, match="When there is 0 in shape, the rank of x .* must " "equal to the target shape len", ): run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2415469 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS16/0000755000000000000000000000000014672075535023401 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 
coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS16/__init__.py0000644000000000000000000000061714672066616025516 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import coremltools as ct from coremltools.converters.mil.testing_reqs import backends_internal, clean_up_backends backends = clean_up_backends(backends_internal, ct.target.iOS16) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS16/test_constexpr_ops.py0000644000000000000000000006227414672066616027733 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest import coremltools as ct from coremltools.converters.mil import testing_reqs from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.ops.defs.iOS16 import constexpr_ops from coremltools.converters.mil.mil.ops.tests.iOS16 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import ( mark_api_breaking, run_compare_builder, ) from coremltools.converters.mil.testing_utils import get_op_types_in_program, ssa_fn compute_units = testing_reqs.compute_units class TestConstexprAffineDequantize: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array(range(4)).reshape(1, 1, 2, 2).astype(np.float32) decompressed_constant = np.array([1, 2, 3, 4]).reshape(1, 1, 2, 2).astype(np.float32) input_placeholders = { "x": mb.placeholder(shape=t.shape), } input_values = {"x": t} def build(x): quantized_data = np.array([3, 5, 5, 6]).reshape(1, 1, 2, 2).astype(np.uint8) scale = np.array([1, 2]).astype(np.float32) zero_point = np.array([2, 4]).astype(np.uint8) axis = 3 y = mb.constexpr_affine_dequantize( quantized_data=quantized_data, zero_point=zero_point, scale=scale, axis=axis, ) return mb.add(x=x, y=y) expected_output_types = (1, 1, 2, 2, types.fp32) expected_outputs = t + decompressed_constant.astype(np.float32) mlmodel = run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) # validate that the constexpr op is not removed by any graph pass prog = mlmodel._mil_program assert "constexpr_affine_dequantize" in get_op_types_in_program(prog) @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_linear(self, compute_unit, backend): input_data = np.ones((4, 64), dtype=np.float32) input_placeholders = { "x": mb.placeholder(shape=input_data.shape), } input_values = {"x": input_data} def build(x): weight = mb.constexpr_affine_dequantize( quantized_data=np.ones((32, 64), dtype=np.uint8), zero_point=np.uint8(0), scale=np.float32(2.0), axis=0, ) return mb.linear(x=x, weight=weight, bias=np.zeros((32,), dtype=np.float32)) expected_output_types = (4, 32, types.fp32) expected_outputs = np.ones((4, 32), dtype=np.float32) * 128 mlmodel = run_compare_builder( build, input_placeholders, input_values, expected_output_types, 
expected_outputs, compute_unit=compute_unit, backend=backend, ) assert "constexpr_affine_dequantize" in get_op_types_in_program(mlmodel._mil_program) def test_is_all_zeros(self): @mb.program(opset_version=ct.target.iOS16) def prog_0_scalar(): return mb.constexpr_affine_dequantize( quantized_data=np.array([[0, 0, 0], [0, 0, 0]]).astype(np.int8), zero_point=np.int8(0), scale=np.float32(1.2), axis=0, ) assert prog_0_scalar.find_ops(op_type="constexpr_affine_dequantize")[0].is_all_zeros() @mb.program(opset_version=ct.target.iOS16) def prog_0_vector(): return mb.constexpr_affine_dequantize( quantized_data=np.array([[1, 2, 3], [1, 2, 3]]).astype(np.uint8), zero_point=np.uint8([1, 2, 3]), scale=np.float32(2), axis=1, ) assert prog_0_vector.find_ops(op_type="constexpr_affine_dequantize")[0].is_all_zeros() @mb.program(opset_version=ct.target.iOS16) def prog_none0(): return mb.constexpr_affine_dequantize( quantized_data=np.array([[1, 2, 3], [1, 2, 3]]).astype(np.uint8), zero_point=np.uint8([1, 2]), scale=np.float32(2), axis=0, ) assert not prog_none0.find_ops(op_type="constexpr_affine_dequantize")[0].is_all_zeros() @ssa_fn def test_builder_eval(self): # scalar zero-point & scalar scale v = mb.constexpr_affine_dequantize( quantized_data=np.array([[1, 2, 3], [1, 2, 3]]).astype(np.uint8), zero_point=np.uint8(1), scale=np.float32(2), axis=0, ) assert v.val is None np.testing.assert_allclose( np.float32([[0, 2, 4], [0, 2, 4]]), v.op.materialized_val_inference() ) # vector zero-point & scalar scale v = mb.constexpr_affine_dequantize( quantized_data=np.array([[1, 2, 3], [1, 2, 3]]).astype(np.int8), zero_point=np.array([1, 2]).astype(np.int8), scale=np.float32(2), axis=0, ) np.testing.assert_allclose( np.float32([[0, 2, 4], [-2, 0, 2]]), v.op.materialized_val_inference() ) # scalar zero-point & vector scale v = mb.constexpr_affine_dequantize( quantized_data=np.array([[1, 2, 3], [1, 2, 3]]).astype(np.uint8), zero_point=np.uint8(1), scale=np.array([2, 4]).astype(np.float32), axis=0, ) np.testing.assert_allclose( np.float32([[0, 2, 4], [0, 4, 8]]), v.op.materialized_val_inference() ) # vector zero-point & vector scale v = mb.constexpr_affine_dequantize( quantized_data=np.array([[1, 2, 3], [1, 2, 3]]).astype(np.int8), zero_point=np.array([1, 2]).astype(np.int8), scale=np.array([2, 4]).astype(np.float32), axis=0, ) np.testing.assert_allclose( np.float32([[0, 2, 4], [-4, 0, 4]]), v.op.materialized_val_inference() ) @staticmethod def affine_dequant_config_generator(): np.random.seed(1984) for quant_dtype in [np.int8, np.uint8]: low = 0 if quant_dtype == np.uint8 else -128 high = 255 if quant_dtype == np.uint8 else 127 for zp_dtype in [np.int8, np.uint8, np.float32]: for rank in range(1, 6): shape = np.random.randint(low=2, high=5, size=rank) quantized_data = np.random.randint( low=low, high=high, size=shape, dtype=quant_dtype ) axis = np.random.choice(range(-rank, rank)) scalar_zp = np.random.choice([True, False]) scalar_sc = np.random.choice([True, False]) zero_point = ( np.random.randint( low=low, high=high, size=quantized_data.shape[axis], dtype=quant_dtype, ).astype(zp_dtype) if not scalar_zp else np.random.choice(range(low, high)).astype(zp_dtype) ) scale = ( np.random.rand(quantized_data.shape[axis]).astype(np.float32) if not scalar_sc else np.float32(np.random.rand()) ) # fp16 is already covered under backends parameterization params = { "quantized_data": quantized_data, "zp": zero_point, "sc": scale, "axis": axis, } yield params @pytest.mark.parametrize( "compute_unit, backend, config", 
itertools.product(compute_units, backends, affine_dequant_config_generator.__func__()), ) def test_builder_stress(self, compute_unit, backend, config): quantized_data, zero_point, scale, axis = ( config["quantized_data"], config["zp"], config["sc"], config["axis"], ) def build(x): y = mb.constexpr_affine_dequantize( quantized_data=quantized_data, zero_point=zero_point, scale=scale, axis=axis, ) return mb.add(x=x, y=y) expected_output_types = ( *quantized_data.shape, types.numpy_type_to_builtin_type(scale.dtype), ) t = np.random.rand(*quantized_data.shape).astype(scale.dtype) decompressed_constant = constexpr_ops.constexpr_affine_dequantize.decompress( quantized_data, zero_point, scale, axis ) expected_outputs = t + decompressed_constant input_placeholders = { "x": mb.placeholder(shape=quantized_data.shape), } input_values = {"x": t} mlmodel = run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) # validate that the constexpr op is not removed by any graph pass prog = mlmodel._mil_program if "constexpr_affine_dequantize" not in get_op_types_in_program(prog): raise AssertionError("Invalidated: Test Failed") class TestConstexprCast: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array(range(4)).reshape(4, 1).astype(np.float32) decompressed_constant = np.array([1, 2, 3, 4]).reshape(4, 1).astype(np.float32) input_placeholders = { "x": mb.placeholder(shape=t.shape), } input_values = {"x": t} def build(x): source_val = np.array([1, 2, 3, 4]).reshape(4, 1).astype(np.float16) y = mb.constexpr_cast(source_val=source_val, output_dtype="fp32") return mb.add(x=x, y=y) expected_output_types = (4, 1, types.fp32) expected_outputs = t + decompressed_constant.astype(np.float32) mlmodel = run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) # validate that the constexpr op is not removed by any graph pass prog = mlmodel._mil_program if "constexpr_cast" not in get_op_types_in_program(prog): raise AssertionError("Invalidated: Test Failed") @ssa_fn def test_builder_eval(self): v = mb.constexpr_cast(source_val=np.float16([1, 2]), output_dtype="fp32") assert v.val is None np.testing.assert_allclose(np.float32([1, 2]), v.op.materialized_val_inference()) @staticmethod def cast_config_generator(): np.random.seed(1984) for rank in range(1, 6): shape = np.random.randint(low=2, high=5, size=rank) source_val = np.random.rand(*shape).astype(np.float16) params = { "source_val": source_val, "output_dtype": "fp32", } yield params @pytest.mark.parametrize( "compute_unit, backend, config", itertools.product(compute_units, backends, cast_config_generator.__func__()), ) def test_builder_stress(self, compute_unit, backend, config): source_val, output_dtype = ( config["source_val"], config["output_dtype"], ) def build(x): y = mb.constexpr_cast( source_val=source_val, output_dtype=output_dtype, ) return mb.add(x=x, y=y) expected_output_types = ( *source_val.shape, types.string_to_builtin(output_dtype), ) output_np_type = types.nptype_from_builtin(types.string_to_builtin(output_dtype)) t = np.random.rand(*source_val.shape).astype(output_np_type) decompressed_constant = source_val.astype(output_np_type) expected_outputs = t + decompressed_constant input_placeholders = { "x": mb.placeholder(shape=source_val.shape), } input_values 
= {"x": t} mlmodel = run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) # validate that the constexpr op is not removed by any graph pass prog = mlmodel._mil_program assert "constexpr_cast" in get_op_types_in_program(prog) class TestConstexprLutToDense: @mark_api_breaking(breaking_opset_version=ct.target.iOS18) @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array(range(4)).reshape(4, 1).astype(np.float32) decompressed_constant = np.array([1, 2, 3, 4]).reshape(4, 1).astype(np.float32) input_placeholders = { "x": mb.placeholder(shape=t.shape), } input_values = {"x": t} def build(x): lut_data = np.array( [ -19.0, 4.0, 0.0, -1.0, 1.0, 3.0, 5.0, -8.0, 19, 13, 42, 4.5, 5.4, 2.0, -6, -7, ] ).astype(np.float32) indices = np.array([212, 21]).astype(np.uint8) shape = np.array([4, 1]).astype(np.uint32) y = mb.constexpr_lut_to_dense(lut=lut_data, indices=indices, shape=shape) return mb.add(x=x, y=y) expected_output_types = (4, 1, types.fp32) expected_outputs = t + decompressed_constant.astype(np.float32) mlmodel = run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) # validate that the constexpr op is not removed by any graph pass prog = mlmodel._mil_program assert "constexpr_lut_to_dense" in get_op_types_in_program(prog) @mark_api_breaking(breaking_opset_version=ct.target.iOS18) @pytest.mark.parametrize("backend", backends) def test_shape_of_constexpr_is_replaceable(self, backend): @mb.program(input_specs=[], opset_version=backend.opset_version) def prog(): lut_data = np.array( [ -19.0, 4.0, 0.0, -1.0, 1.0, 3.0, 5.0, -8.0, 19, 13, 42, 4.5, 5.4, 2.0, -6, -7, ] ).astype(np.float32) indices = np.array([212, 21]).astype(np.uint8) shape = np.array([4, 1]).astype(np.uint32) y = mb.constexpr_lut_to_dense(lut=lut_data, indices=indices, shape=shape) shape = mb.shape(x=y) assert len(shape.nonreplaceable_vars_upstream) == 0 gather = mb.gather( x=shape, indices=[ 0, ], axis=0, ) assert len(gather.nonreplaceable_vars_upstream) == 0 return gather @ssa_fn def test_builder_eval(self): v = mb.constexpr_lut_to_dense( lut=np.array([1.0, 2.0, 3.0, 4.0]), indices=np.array([10, 4]).astype(np.uint8), shape=np.array( [ 5, ] ).astype(np.uint32), ) assert v.val is None np.testing.assert_allclose( np.float32([3, 3, 1, 1, 1]).astype(np.float32), v.op.materialized_val_inference() ) @staticmethod def lut_config_generator(): np.random.seed(1999) for lut_dtype in [np.float32]: # [np.uint8, np.int8]: # float16 already covered under backends parameterization # Not possible to write 8-bit tests since no other op consumes uint8/int8 tensors for nbits in [1, 2, 4, 6, 8]: lut_size = 2**nbits if lut_dtype == np.uint8: lut = np.random.randint(low=255, size=lut_size, dtype=np.uint8) elif lut_dtype == np.int8: lut = np.random.randint(low=-128, high=127, size=lut_size, dtype=np.int8) else: lut = np.random.rand(lut_size).astype(lut_dtype) for output_rank in range(1, 6): output_shape = np.random.randint(low=2, high=5, size=output_rank) indices = np.random.randint( low=0, high=2**nbits, size=output_shape, dtype=np.uint8 ) indices_bitarray = np.unpackbits(indices, bitorder="little").reshape(-1, 8) packed_indices = np.packbits(indices_bitarray[:, :nbits], bitorder="little") assert packed_indices.size == np.ceil(nbits * np.prod(output_shape) 
/ 8).astype( np.int32 ) params = { "indices": packed_indices, "shape": output_shape, "lut": lut, } yield params @mark_api_breaking(breaking_opset_version=ct.target.iOS18) @pytest.mark.parametrize( "compute_unit, backend, config", itertools.product(compute_units, backends, lut_config_generator.__func__()), ) def test_builder_stress(self, compute_unit, backend, config): indices, lut, shape = ( config["indices"], config["lut"], config["shape"], ) def build(x): y = mb.constexpr_lut_to_dense( indices=indices, lut=lut, shape=shape.astype(np.uint32), ) return mb.add(x=x, y=y) expected_output_types = ( *shape, types.numpy_type_to_builtin_type(lut.dtype), ) t = np.random.rand(*shape).astype(lut.dtype) decompressed_constant = constexpr_ops.constexpr_lut_to_dense.decompress(lut, indices, shape) expected_outputs = t + decompressed_constant input_placeholders = { "x": mb.placeholder(shape=shape), } input_values = {"x": t} mlmodel = run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) # validate that the constexpr op is not removed by any graph pass prog = mlmodel._mil_program if "constexpr_lut_to_dense" not in get_op_types_in_program(prog): raise AssertionError("Invalidated: Test Failed") class TestConstexprSparseToDense: @mark_api_breaking(breaking_opset_version=ct.target.iOS18) @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): t = np.array(range(4)).reshape(4, 1).astype(np.float32) decompressed_constant = np.array([1, 2, 0, 4]).reshape(4, 1).astype(np.float32) input_placeholders = { "x": mb.placeholder(shape=t.shape), } input_values = {"x": t} def build(x): nonzero_data = np.array([1, 2, 4]).astype(np.float32) mask = np.array([11]).astype(np.uint8) shape = np.array([4, 1]).astype(np.uint32) y = mb.constexpr_sparse_to_dense(nonzero_data=nonzero_data, mask=mask, shape=shape) return mb.add(x=x, y=y) expected_output_types = (4, 1, types.fp32) expected_outputs = t + decompressed_constant.astype(np.float32) mlmodel = run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) # validate that the constexpr op is not removed by any graph pass prog = mlmodel._mil_program assert "constexpr_sparse_to_dense" in get_op_types_in_program(prog) @ssa_fn def test_builder_eval(self): v = mb.constexpr_sparse_to_dense( nonzero_data=np.array([1.0, 2.0, 4.0]), mask=np.array([11]).astype(np.uint8), shape=np.array( [ 4, ] ).astype(np.uint32), ) assert v.val is None np.testing.assert_allclose( np.float32([1.0, 2.0, 0.0, 4.0]), v.op.materialized_val_inference() ) @staticmethod def sparse_config_generator(): np.random.seed(1999) for nonzero_data_dtype in [np.float32]: # [np.uint8, np.int8]: # float16 already covered under backends parameterization # Not possible to write 8-bit tests since no other op consumes uint8/int8 tensors for output_rank in range(1, 6): output_shape = np.random.randint(low=2, high=5, size=output_rank) output_size = np.prod(output_shape) nBytes = np.ceil(output_size / 8).astype(np.int32) mask = np.random.randint(low=255, size=nBytes, dtype=np.uint8) bitarray = np.unpackbits(mask, bitorder="little") while any(bitarray[i] != 0 for i in range(output_size, len(bitarray))): mask = np.random.randint(low=255, size=nBytes, dtype=np.uint8) bitarray = np.unpackbits(mask, bitorder="little") nonzero_size = 
np.sum(np.where(np.unpackbits(mask, bitorder="little") != 0, 1, 0)) if nonzero_data_dtype == np.uint8: nonzero_data = np.random.randint(low=255, size=nonzero_size, dtype=np.uint8) elif nonzero_data_dtype == np.int8: nonzero_data = np.random.randint( low=-128, high=127, size=nonzero_size, dtype=np.int8 ) else: nonzero_data = np.random.rand(nonzero_size).astype(nonzero_data_dtype) params = { "nonzero_data": nonzero_data, "shape": output_shape, "mask": mask, } yield params @mark_api_breaking(breaking_opset_version=ct.target.iOS18) @pytest.mark.parametrize( "compute_unit, backend, config", itertools.product(compute_units, backends, sparse_config_generator.__func__()), ) def test_builder_stress(self, compute_unit, backend, config): nonzero_data, mask, shape = ( config["nonzero_data"], config["mask"], config["shape"], ) def build(x): y = mb.constexpr_sparse_to_dense( nonzero_data=nonzero_data, mask=mask, shape=shape.astype(np.uint32), ) return mb.add(x=x, y=y) expected_output_types = ( *shape, types.numpy_type_to_builtin_type(nonzero_data.dtype), ) t = np.random.rand(*shape).astype(nonzero_data.dtype) decompressed_constant = constexpr_ops.constexpr_sparse_to_dense.decompress( nonzero_data, mask, shape ) expected_outputs = t + decompressed_constant input_placeholders = { "x": mb.placeholder(shape=shape), } input_values = {"x": t} mlmodel = run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) # validate that the constexpr op is not removed by any graph pass prog = mlmodel._mil_program if "constexpr_sparse_to_dense" not in get_op_types_in_program(prog): raise AssertionError("Invalidated: Test Failed") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS16/test_conv.py0000644000000000000000000000544614672066616025770 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np import pytest from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.ops.tests.iOS16 import backends from coremltools.converters.mil.testing_utils import get_op_types_in_program class TestConvolution: @pytest.mark.parametrize("backend", backends) def test_type_inference_with_constexpr_ops(self, backend): # Test the type inference of the conv op doesn't error out for constexpr bias @mb.program( input_specs=[mb.TensorSpec(shape=(1, 3, 4, 4), dtype=types.fp32)], opset_version=backend.opset_version, ) def prog(x): weight = np.random.rand(2, 3, 2, 2) bias = mb.constexpr_affine_dequantize( quantized_data=np.array([1, 2]).astype(np.uint8), zero_point=np.uint8(1), scale=np.float32(2), axis=0, ) return mb.conv(x=x, weight=weight, bias=bias) assert get_op_types_in_program(prog) == ["constexpr_affine_dequantize", "conv"] # Test conv op can have dilations with constexpr weight @mb.program( input_specs=[mb.TensorSpec(shape=(1, 3, 4, 4), dtype=types.fp32)], opset_version=backend.opset_version, ) def prog(x): weight = mb.constexpr_affine_dequantize( quantized_data=np.array(range(24)).astype(np.uint8).reshape(2, 3, 2, 2), zero_point=np.uint8(1), scale=np.float32(2), axis=0, ) return mb.conv(x=x, weight=weight, dilations=[2, 2]) assert get_op_types_in_program(prog) == ["constexpr_affine_dequantize", "conv"] # Test conv op can have dilations with constexpr weight with casts @mb.program( input_specs=[mb.TensorSpec(shape=(1, 3, 4, 4), dtype=types.fp16)], opset_version=backend.opset_version, ) def prog(x): weight = mb.constexpr_affine_dequantize( quantized_data=np.array(range(24)).astype(np.uint8).reshape(2, 3, 2, 2), zero_point=np.uint8(1), scale=np.float16(2), axis=0, ) cast_weight = mb.cast(x=weight, dtype="fp32") cast_weight = mb.cast(x=weight, dtype="fp16") return mb.conv(x=x, weight=cast_weight, dilations=[2, 2]) assert get_op_types_in_program(prog) == [ "constexpr_affine_dequantize", "cast", "cast", "conv", ] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS16/test_image_resizing.py0000644000000000000000000001404014672066616030005 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest import coremltools as ct from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.ops.tests.iOS15.test_image_resizing import ( TestResample as _TestResample_iOS15, ) from coremltools.converters.mil.mil.ops.tests.iOS16 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import ( mark_api_breaking, run_compare_builder, ) from coremltools.converters.mil.testing_reqs import compute_units class TestUpsampleBilinear: @pytest.mark.parametrize( "compute_unit, backend, align_corners, half_pixel_centers", itertools.product( compute_units, backends, [True, False], [True, False, None], ), ) def test_builder_to_backend_smoke_iOS16( self, compute_unit, backend, align_corners, half_pixel_centers ): if align_corners and half_pixel_centers: pytest.skip("Invalid configuration of align_corners and half_pixel_centers") x = np.array([1, 2], dtype=np.float32).reshape(1, 1, 1, 2) input_placeholder_dict = {"x": mb.placeholder(shape=x.shape)} input_value_dict = {"x": x} def build_upsample_bilinear(x): return mb.upsample_bilinear( x=x, scale_factor_height=2, scale_factor_width=3, align_corners=align_corners, half_pixel_centers=half_pixel_centers, ) expected_output_type = (1, 1, 2, 6, types.fp32) if half_pixel_centers is None: half_pixel_centers = not align_corners if align_corners and not half_pixel_centers: expected_output = [1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0] elif not align_corners and half_pixel_centers: expected_output = [ 1.0, 1.0, 1.33334, 1.66667, 2.0, 2.0, 1.0, 1.0, 1.33334, 1.66667, 2.0, 2.0, ] elif not align_corners and not half_pixel_centers: expected_output = [ 1.0, 1.33334, 1.66667, 2.0, 2.0, 2.0, 1.0, 1.33334, 1.66667, 2.0, 2.0, 2.0, ] else: raise ValueError("align_corners and half_pixel_centers cannot be both True") expected_output = [np.array(expected_output, dtype=np.float32).reshape(1, 1, 2, 6)] run_compare_builder( build_upsample_bilinear, input_placeholder_dict, input_value_dict, expected_output_type, expected_output, compute_unit=compute_unit, backend=backend, ) class TestCropResize: @mark_api_breaking(breaking_opset_version=ct.target.iOS17) @pytest.mark.parametrize( "compute_unit, backend, pad_value", itertools.product(compute_units, backends, [0.0, 1.0, 10.0]), ) def test_builder_to_backend_ios16(self, compute_unit, backend, pad_value): """For iOS16+ the crop_resize op supports pad_value.""" x = np.array( [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]], dtype=np.float32, ).reshape(1, 1, 4, 4) roi = np.array( [ [0, 0.1, 0.3, 1.3, 1], [0, 0.5, 1.8, 1.0, 0.3], [0, 0.0, 0.4, 0.6, 0.7], ], dtype=np.float32, ).reshape(3, 1, 5, 1, 1) def build(x): return mb.crop_resize( x=x, roi=roi, target_width=2, target_height=2, normalized_coordinates=True, box_coordinate_mode="CORNERS_HEIGHT_FIRST", sampling_mode="ALIGN_CORNERS", pad_value=pad_value, ) expected_output_type = [ (3, 1, 1, 2, 2, types.fp32), ] expected_output = [ np.array( [ 3.1, 5.2, pad_value, pad_value, pad_value, 7.899, pad_value, 13.9, 2.2, 3.1, 9.4, 10.3, ], dtype=np.float32, ).reshape(3, 1, 1, 2, 2), ] input_placeholder_dict = {"x": mb.placeholder(shape=(1, 1, 4, 4))} input_value_dict = {"x": x} run_compare_builder( build, input_placeholder_dict, input_value_dict, expected_output_type, 
expected_output, compute_unit=compute_unit, backend=backend, ) class TestResample: @pytest.mark.parametrize( "compute_unit, backend, coordinates_dtype", itertools.product( compute_units, backends, (np.int32, np.float16, np.float32), ), ) def test_builder_to_backend_smoke_iOS16(self, compute_unit, backend, coordinates_dtype): # The fp16 precision will have two casts inserted for input/output expected_cast_ops = 2 if backend.precision == "fp16" else 0 if backend.precision == "fp16" and coordinates_dtype == np.float32: # The coordinates also cast to fp16. expected_cast_ops += 1 _TestResample_iOS15._test_builder_to_backend_smoke( compute_unit, backend, coordinates_dtype, expected_cast_ops ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS16/test_scatter_gather.py0000644000000000000000000002156314672066616030020 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest import coremltools as ct from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.ops.tests.iOS14.test_scatter_gather import ( TestGatherAlongAxis as _TestGatherAlongAxis_iOS14, ) from coremltools.converters.mil.mil.ops.tests.iOS16 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import ( mark_api_breaking, run_compare_builder, ) from coremltools.converters.mil.testing_reqs import compute_units class TestGather: @pytest.mark.parametrize( "compute_unit, backend, x_dtype, indices_dtype, indices_dynamic", itertools.product( compute_units, backends, [np.float32, np.float16, np.int32], [np.int32, np.int16, np.uint16], [True, False], ), ) def test_builder_to_backend_smoke( self, compute_unit, backend, x_dtype, indices_dtype, indices_dynamic ): x = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]], dtype=x_dtype) indices = np.array([[[1, 0], [0, 1]], [[1, 0], [0, 0]]], dtype=indices_dtype) builtin_x_dtype = types.numpy_type_to_builtin_type(x_dtype) input_placeholders = {"x": mb.placeholder(shape=x.shape, dtype=builtin_x_dtype)} input_values = {"x": x} if indices_dynamic: input_placeholders["indices"] = mb.placeholder( shape=indices.shape, dtype=types.numpy_type_to_builtin_type(indices_dtype) ) input_values["indices"] = indices def build_dynamic(x, indices): return [ mb.gather(x=x, indices=indices, axis=1, batch_dims=0), mb.gather(x=x, indices=indices, axis=1, batch_dims=1), mb.gather(x=x, indices=indices, axis=2, batch_dims=0), mb.gather(x=x, indices=indices, axis=2, batch_dims=1), mb.gather(x=x, indices=indices, axis=2, batch_dims=2), ] def build_static(x): return [ mb.gather(x=x, indices=indices, axis=1, batch_dims=0), mb.gather(x=x, indices=indices, axis=1, batch_dims=1), mb.gather(x=x, indices=indices, axis=2, batch_dims=0), mb.gather(x=x, indices=indices, axis=2, batch_dims=1), mb.gather(x=x, indices=indices, axis=2, batch_dims=2), ] build = build_dynamic if indices_dynamic else build_static expected_output_types = [ (2, 2, 2, 2, 3, builtin_x_dtype), (2, 2, 2, 3, builtin_x_dtype), (2, 2, 2, 2, 2, builtin_x_dtype), (2, 2, 2, 2, builtin_x_dtype), (2, 2, 2, builtin_x_dtype), ] expected_outputs = [ np.array( [ [ [[[4, 5, 6], [1, 2, 3]], [[1, 2, 3], [4, 5, 6]]], [[[4, 5, 6], [1, 2, 3]], [[1, 
2, 3], [1, 2, 3]]], ], [ [[[10, 11, 12], [7, 8, 9]], [[7, 8, 9], [10, 11, 12]]], [[[10, 11, 12], [7, 8, 9]], [[7, 8, 9], [7, 8, 9]]], ], ], dtype=x_dtype, ), np.array( [ [[[4, 5, 6], [1, 2, 3]], [[1, 2, 3], [4, 5, 6]]], [[[10, 11, 12], [7, 8, 9]], [[7, 8, 9], [7, 8, 9]]], ], dtype=x_dtype, ), np.array( [ [[[[2, 1], [1, 2]], [[2, 1], [1, 1]]], [[[5, 4], [4, 5]], [[5, 4], [4, 4]]]], [ [[[8, 7], [7, 8]], [[8, 7], [7, 7]]], [[[11, 10], [10, 11]], [[11, 10], [10, 10]]], ], ], dtype=x_dtype, ), np.array( [[[[2, 1], [1, 2]], [[5, 4], [4, 5]]], [[[8, 7], [7, 7]], [[11, 10], [10, 10]]]], dtype=x_dtype, ), np.array([[[2, 1], [4, 5]], [[8, 7], [10, 10]]], dtype=x_dtype), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "backend", backends, ) def test_builder_eval_batch_dims(self, backend): @mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)], opset_version=backend.opset_version, ) def prog(x): params = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]], dtype=np.float32) indices = np.array([[[1, 0], [0, 1]], [[1, 0], [0, 0]]], dtype=np.int32) res = mb.gather(x=params, indices=indices, axis=2, batch_dims=2) return res main_func = prog.functions["main"] gather_ops = main_func.find_ops(op_type="gather")[0] np.testing.assert_allclose( np.array([[[2, 1], [4, 5]], [[8, 7], [10, 10]]], dtype=np.float32), gather_ops.outputs[0].val, atol=1e-04, rtol=1e-05, ) class TestGatherAlongAxis(_TestGatherAlongAxis_iOS14): @pytest.mark.parametrize( "compute_unit, backend, x_dtype, indices_dtype", itertools.product( compute_units, backends, [np.float32, np.float16, np.int32], [np.int32, np.int16, np.uint16], ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, x_dtype, indices_dtype): super().test_builder_to_backend_smoke(compute_unit, backend, x_dtype, indices_dtype) @pytest.mark.parametrize( "compute_unit, backend, rank_axis, x_dtype, indices_dtype", itertools.product( compute_units, backends, [(rank, axis) for rank in range(1, 5) for axis in range(-rank, rank)], [np.float32, np.float16, np.int32], [np.int32, np.int16, np.uint16], ), ) def test_builder_to_backend_programmatic( self, compute_unit, backend, rank_axis, x_dtype, indices_dtype ): super()._test_builder_to_backend_programmatic( compute_unit, backend, rank_axis, x_dtype, indices_dtype, True ) class TestGatherNd: @pytest.mark.parametrize( "compute_unit, backend, x_dtype, indices_dtype", itertools.product( compute_units, backends, [np.float32, np.float16, np.int32], [np.int32, np.int16, np.uint16], ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, x_dtype, indices_dtype): x = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]], dtype=np.float32) indices = np.array([[[1, 0], [0, 1]], [[1, 0], [0, 0]]], dtype=np.int32) builtin_x_dtype = types.numpy_type_to_builtin_type(x_dtype) input_placeholders = { "x": mb.placeholder(shape=x.shape, dtype=builtin_x_dtype), "indices": mb.placeholder( shape=indices.shape, dtype=types.numpy_type_to_builtin_type(indices_dtype) ), } input_values = {"x": x, "indices": indices} def build(x, indices): return [ mb.gather_nd(x=x, indices=indices, batch_dims=0), mb.gather_nd(x=x, indices=indices, batch_dims=1), ] expected_output_types = [(2, 2, 3, builtin_x_dtype), (2, 2, builtin_x_dtype)] expected_outputs = [ np.array([[[7, 8, 9], [4, 5, 6]], [[7, 8, 9], [1, 2, 3]]], dtype=x_dtype), np.array([[4, 2], [10, 7]], dtype=x_dtype), ] 
run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @mark_api_breaking(breaking_opset_version=ct.target.iOS17) @pytest.mark.parametrize( "backend, indices_val", itertools.product(backends, [[[-1], [2]], [[1], [3]]]), ) def test_builder_invalid_indices(self, backend, indices_val): def prog(x): params = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) indices = np.array(indices_val, dtype=np.int32) res = mb.gather_nd(x=params, indices=indices, batch_dims=1) return res mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)], opset_version=backend.opset_version, )(prog) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS16/test_tensor_operation.py0000644000000000000000000000535314672066616030412 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest from coremltools.converters.mil import testing_reqs from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.ops.tests.iOS16 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import run_compare_builder compute_units = testing_reqs.compute_units class TestFillLike: @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): shape = (2, 1, 3) x_val = np.zeros(shape=shape, dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape, dtype=types.int32)} input_values = {"x": x_val} def build(x): return mb.fill_like(ref_tensor=x, value=1.0) expected_output_types = [(2, 1, 3, types.fp32)] expected_outputs = [np.full(shape=shape, fill_value=1.0)] mlmodel = run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestTopK: @pytest.mark.parametrize( "compute_unit, backend, return_indices, sort", itertools.product( compute_units, backends, [True, False], [True, False], ), ) def test_builder_to_backend_smoke_iOS16(self, compute_unit, backend, return_indices, sort): val = np.array([[-1.0, 2.0, -3.0], [4.0, -5.0, 6.0]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=val.shape)} input_values = {"x": val} def build(x): return mb.topk(x=x, k=2, axis=1, return_indices=return_indices, sort=sort) expected_output_types = [ (2, 2, types.fp32), (2, 2, types.int32), ] expected_outputs = [ np.array([[2.0, -1.0], [6.0, 4.0]], dtype=np.float32), np.array([[1, 0], [2, 0]], dtype=np.float32), ] if not return_indices: expected_output_types = expected_output_types[:1] expected_outputs = expected_outputs[:1] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS16/test_tensor_transformation.py0000644000000000000000000001313114672066616031451 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest from coremltools._deps import _HAS_TORCH, MSG_TORCH_NOT_FOUND from coremltools.converters.mil import testing_reqs from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.ops.tests.iOS16 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import run_compare_builder from coremltools.converters.mil.mil.types.type_mapping import numpy_type_to_builtin_type compute_units = testing_reqs.compute_units if _HAS_TORCH: import torch class TestPixelUnshuffle: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_builder_to_backend_smoke(self, compute_unit, backend): val = np.array([[[[9.0, 5.0], [1.0, 3.0]]]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=val.shape)} input_values = {"x": val} def build(x): return [mb.pixel_unshuffle(x=x, downscale_factor=np.uint32(2))] expected_output_types = (1, 4, 1, 1, types.fp32) expected_outputs = np.array([[[[9.0]], [[5.0]], [[1.0]], [[3.0]]]], dtype=np.float32) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.skipif(not testing_reqs._HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( "compute_unit, backend, shape, downscale_factor", itertools.product( compute_units, backends, [(1, 2, 4, 4), (2, 1, 8, 4)], [2, 4], ), ) def test_builder_to_backend_stress( self, compute_unit, backend, shape, downscale_factor, ): val = np.random.rand(*shape) input_placeholders = {"x": mb.placeholder(shape=val.shape)} input_values = {"x": val} def build(x): return [mb.pixel_unshuffle(x=x, downscale_factor=np.uint32(downscale_factor))] torch_pixel_unshuffle = torch.nn.PixelUnshuffle(downscale_factor) expected_outputs = [torch_pixel_unshuffle(torch.Tensor(val)).numpy()] expected_output_types = [o.shape[:] + (types.fp32,) for o in expected_outputs] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestReshapeLike: @pytest.mark.parametrize( "compute_unit, backend, InputShape_RefShapes_Begins_Ends_EndMasks, x_dtype, ref_dtype", itertools.product( compute_units, backends, [ [(4, 3), ((2, 2, 3), (1, 3)), (0, 1), (2, 2), (False, False)], [(32,), ((1, 2, 2, 2), (3, 2, 2)), (1, 1), (0, 0), (True, True)], [(72, 1), ((1, 2, 3, 4, 1), (3,)), (1, 0), (0, 1), (True, False)], ], [np.float16, np.float32, np.int32, bool], [np.float16, np.float32, np.int32, bool], ), ) def test_builder_to_backend_smoke( self, compute_unit, backend, InputShape_RefShapes_Begins_Ends_EndMasks, x_dtype, ref_dtype, ): input_shape, ref_shapes, begins, ends, end_masks = InputShape_RefShapes_Begins_Ends_EndMasks ref_shape_1, ref_shape_2 = ref_shapes x_builtin_dtype = numpy_type_to_builtin_type(x_dtype) ref_builtin_dtype = numpy_type_to_builtin_type(ref_dtype) x_val = np.random.randint(low=0, high=6, size=input_shape).astype(x_dtype) ref_tensor_1 = np.random.randint(low=0, high=6, size=ref_shape_1).astype(ref_dtype) ref_tensor_2 = np.random.randint(low=0, high=6, size=ref_shape_2).astype(ref_dtype) input_placeholders = { "x": mb.placeholder(shape=x_val.shape, dtype=x_builtin_dtype), "ref_tensor_1": 
mb.placeholder(shape=ref_shape_1, dtype=ref_builtin_dtype), "ref_tensor_2": mb.placeholder(shape=ref_shape_2, dtype=ref_builtin_dtype), } input_values = { "x": x_val, "ref_tensor_1": ref_tensor_1, "ref_tensor_2": ref_tensor_2, } def build(x, ref_tensor_1, ref_tensor_2): return mb.reshape_like( x=x, ref_tensors=(ref_tensor_1, ref_tensor_2), begins=begins, ends=ends, end_masks=end_masks, ) output_shape = () for ref_shape, begin, end, end_mask in zip( (ref_shape_1, ref_shape_2), begins, ends, end_masks ): if end_mask: output_shape += tuple(ref_shape[begin:]) else: output_shape += tuple(ref_shape[begin:end]) expected_output_types = [output_shape + (x_builtin_dtype,)] expected_outputs = [np.reshape(x_val, output_shape).astype(x_dtype)] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2415469 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS17/0000755000000000000000000000000014672075535023402 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS17/__init__.py0000644000000000000000000000061714672066616025517 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import coremltools as ct from coremltools.converters.mil.testing_reqs import backends_internal, clean_up_backends backends = clean_up_backends(backends_internal, ct.target.iOS17) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS17/test_activation.py0000644000000000000000000001524314672066616027161 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.ops.tests.iOS17 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import run_compare_builder from coremltools.converters.mil.testing_reqs import compute_units class TestInputWeightDifferentDtypes: """ Starting from IOS17 the alpha/beta can have different dtypes from the input/output, so this test class is mainly to verify the behaviour of those alpha/beta related activations. """ @pytest.mark.parametrize( "backend, different_dtype, op_name", itertools.product( backends, [True, False], ["elu", "leaky_relu", "prelu", "thresholded_relu"], ), ) def test_builder_eval_alpha(self, backend, different_dtype, op_name): x = np.array([[[-1, 2, -3], [4, -5, 6]]], dtype=np.float32) alpha = np.float16(2.0) if different_dtype else np.float32(2.0) if op_name == "prelu": alpha = np.array([2.0, 2.0], dtype=alpha.dtype) # prelu requires alpha to be rank 1. 
def prog(): return getattr(mb, op_name)(x=x, alpha=alpha) mb.program(input_specs=[], opset_version=backend.opset_version)(prog) @pytest.mark.parametrize( "backend, different_dtype, op_name", itertools.product( backends, [True, False], [ "clamped_relu", "linear_activation", "scaled_tanh", "sigmoid_hard", "softplus_parametric", ], ), ) def test_builder_eval_alpha_beta(self, backend, different_dtype, op_name): x = np.array([[[-1, 2, -3], [4, -5, 6]]], dtype=np.float32) alpha = np.float16(2.0) if different_dtype else np.float32(2.0) beta = np.float16(1.0) if different_dtype else np.float32(1.0) if op_name == "softplus_parametric": alpha = np.array([2.0, 2.0], dtype=alpha.dtype) beta = np.array([1.0, 1.0], dtype=beta.dtype) def prog(): return getattr(mb, op_name)(x=x, alpha=alpha, beta=beta) mb.program(input_specs=[], opset_version=backend.opset_version)(prog) @pytest.mark.parametrize( "compute_unit, backend, different_dtype, op_name", itertools.product( compute_units, backends, [True, False], ["elu", "leaky_relu", "prelu", "thresholded_relu"], ), ) def test_builder_to_backend_numerical_alpha( self, compute_unit, backend, different_dtype, op_name ): x = np.array([[[-1, 2, -3], [4, -5, 6]]], dtype=np.float32) alpha = np.float16(2.0) if different_dtype else np.float32(2.0) if op_name == "prelu": alpha = np.array([2.0, 2.0], dtype=alpha.dtype) def calculate_by_np(): if op_name == "elu": res = np.copy(x) res[res < 0] = alpha * (np.exp(res[res < 0]) - 1) return res elif op_name == "leaky_relu": res = np.copy(x) res[res < 0] *= 2.0 return res elif op_name == "prelu": alpha_br = np.copy(alpha) for i in range(len(x.shape)): if i != 1: alpha_br = np.expand_dims(alpha_br, i) res = np.maximum(x, 0) + np.minimum(x, 0) * alpha_br return res elif op_name == "thresholded_relu": res = np.copy(x) res[res < alpha] = 0.0 return res else: raise ValueError(f"Invalid op_name: {op_name}") def build(x): return getattr(mb, op_name)(x=x, alpha=alpha) run_compare_builder( build, input_placeholders={"x": mb.placeholder(shape=x.shape)}, input_values={"x": x}, expected_output_types=x.shape + (types.fp32,), expected_outputs=calculate_by_np(), compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, different_dtype, op_name", itertools.product( compute_units, backends, [True, False], [ "clamped_relu", "linear_activation", "scaled_tanh", "sigmoid_hard", "softplus_parametric", ], ), ) def test_builder_to_backend_numerical_alpha_beta( self, compute_unit, backend, different_dtype, op_name ): x = np.array([[[-1, 2, -3], [4, -5, 6]]], dtype=np.float32) alpha = np.float16(2.0) if different_dtype else np.float32(2.0) beta = np.float16(1.0) if different_dtype else np.float32(1.0) if op_name == "softplus_parametric": alpha = np.array([2.0, 2.0], dtype=alpha.dtype) beta = np.array([1.0, 1.0], dtype=beta.dtype) def calculate_by_np(): if op_name == "clamped_relu": return np.minimum(np.maximum(x, 0), beta) + np.minimum( np.minimum(x, 0) * alpha, beta ) elif op_name == "linear_activation": return x * alpha + beta elif op_name == "scaled_tanh": return alpha * np.tanh(x * beta) elif op_name == "sigmoid_hard": return np.minimum(np.maximum((alpha * x) + beta, 0), 1) elif op_name == "softplus_parametric": alpha_br = alpha beta_br = beta for i in range(len(x.shape)): if i != 1: alpha_br = np.expand_dims(alpha_br, i) beta_br = np.expand_dims(beta_br, i) res = alpha_br * np.log(np.exp(x * beta_br) + 1) return res else: raise ValueError(f"Invalid op_name: {op_name}") def build(x): return getattr(mb, 
op_name)(x=x, alpha=alpha, beta=beta) run_compare_builder( build, input_placeholders={"x": mb.placeholder(shape=x.shape)}, input_values={"x": x}, expected_output_types=x.shape + (types.fp32,), expected_outputs=calculate_by_np(), compute_unit=compute_unit, backend=backend, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS17/test_conv.py0000644000000000000000000001407014672066616025762 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest import coremltools as ct from coremltools._deps import _HAS_TORCH, MSG_TORCH_NOT_FOUND from coremltools.converters.mil.mil.ops.tests.iOS14.test_conv import TestConv as _TestConvIos14 from coremltools.converters.mil.mil.ops.tests.iOS14.test_conv import ( TestConvTranspose as _TestTestConvTransposeIos14, ) from coremltools.converters.mil.mil.ops.tests.iOS17 import backends from coremltools.converters.mil.testing_reqs import compute_units class TestConv(_TestConvIos14): @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "conv_dim", "config", "x_weight_dtype", ] ), itertools.product( compute_units, backends, ["conv1d", "conv2d", "conv3d"], [ { "padding": (1, 1, 1), "DHWKdKhKw": (10, 12, 14, 3, 2, 4), "stride": (2, 1, 1), "dilation": (1, 1, 1), "has_bias": False, "groups": 1, "symbolic": False, }, { "padding": (2, 2, 2), "DHWKdKhKw": (10, 12, 14, 3, 2, 4), "stride": (2, 2, 2), "dilation": (2, 1, 1), "has_bias": False, "groups": 2, "symbolic": True, }, { "padding": (1, 1, 1), "DHWKdKhKw": (5, 5, 5, 2, 2, 2), "stride": (2, 2, 2), "dilation": (2, 1, 1), "has_bias": True, "groups": 1, "symbolic": True, }, { "padding": (2, 2, 2), "DHWKdKhKw": (5, 5, 5, 2, 2, 2), "stride": (2, 1, 1), "dilation": (1, 1, 1), "has_bias": True, "groups": 2, "symbolic": False, }, ], [ (np.float32, np.float32), (np.float16, np.float16), (np.float16, np.float32), (np.float32, np.float16), ], ), ) def test_builder_to_backend_stress( self, compute_unit, backend, conv_dim, config, x_weight_dtype, ): if ( backend.backend == "mlprogram" and backend.precision == "fp16" and backend.opset_version == ct.target.iOS17 and conv_dim == "conv2d" and config == { "padding": (1, 1, 1), "DHWKdKhKw": (5, 5, 5, 2, 2, 2), "stride": (2, 2, 2), "dilation": (2, 1, 1), "has_bias": True, "groups": 1, "symbolic": True, } and x_weight_dtype == (np.float32, np.float16) ): pytest.xfail("rdar://124260627 ([CI] Two tests are random failing on CI)") super().test_builder_to_backend_stress( compute_unit, backend, conv_dim, config, x_weight_dtype ) class TestConvTranspose(_TestTestConvTransposeIos14): @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "conv_dim", "config", "x_weight_dtype", ] ), itertools.product( compute_units, backends, ["conv1d", "conv2d", "conv3d"], [ { "padding": (1, 2, 3), "DHWKdKhKw": (10, 12, 14, 3, 2, 4), "stride": (2, 1, 1), "dilation": (1, 1, 1), "has_bias": False, "groups": 1, "test_symbolic": False, "test_output_shape": True, }, { "padding": (2, 2, 2), "DHWKdKhKw": (10, 12, 14, 3, 2, 4), "stride": (2, 2, 2), "dilation": (2, 1, 1), "has_bias": False, "groups": 2, "test_symbolic": True, "test_output_shape": False, }, { 
"padding": (1, 2, 3), "DHWKdKhKw": (7, 7, 7, 2, 2, 2), "stride": (2, 2, 2), "dilation": (2, 1, 1), "has_bias": True, "groups": 1, "test_symbolic": True, "test_output_shape": False, }, { "padding": (2, 2, 2), "DHWKdKhKw": (7, 7, 7, 2, 2, 2), "stride": (2, 1, 1), "dilation": (1, 1, 1), "has_bias": True, "groups": 2, "test_symbolic": False, "test_output_shape": False, }, ], [ (np.float32, np.float32), (np.float16, np.float16), (np.float16, np.float32), (np.float32, np.float16), ], ), ) def test_builder_to_backend_stress( self, compute_unit, backend, conv_dim, config, x_weight_dtype, ): super().test_builder_to_backend_stress( compute_unit, backend, conv_dim, config, x_weight_dtype ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS17/test_elementwise_unary.py0000644000000000000000000001662214672066616030561 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools from unittest.mock import patch import numpy as np import pytest import coremltools as ct from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import get_new_symbol, types from coremltools.converters.mil.mil.ops.tests.iOS14.test_elementwise_unary import ( TestElementwiseUnary as _TestElementwiseUnary_iOS14, ) from coremltools.converters.mil.mil.ops.tests.iOS17 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import run_compare_builder from coremltools.converters.mil.mil.passes.pass_pipeline import PassPipeline from coremltools.converters.mil.mil.types.type_mapping import numpy_type_to_builtin_type from coremltools.converters.mil.mil.var import Var from coremltools.converters.mil.testing_reqs import compute_units from coremltools.converters.mil.testing_utils import get_op_types_in_program class TestElementwiseUnary: @pytest.mark.parametrize( "compute_unit, backend, src_dtype, dst_dtype", itertools.product( compute_units, backends, [ np.float16, np.float32, np.int32, np.int16, np.uint16, np.int8, np.uint8, ], [ np.float16, np.float32, np.int32, np.int16, np.uint16, np.int8, np.uint8, ], ), ) def test_builder_eval_cast_ios17(self, compute_unit, backend, src_dtype, dst_dtype): x = np.array([[1, 2, 3], [4, 5, 6]], dtype=src_dtype) dst_dtype_str = types.builtin_to_string(numpy_type_to_builtin_type(dst_dtype)) expected_res = x.astype(dtype=np.float16) @mb.program(input_specs=[], opset_version=backend.opset_version) def prog(): return mb.cast(x=x, dtype=dst_dtype_str) main_func = prog.functions["main"] cast_op = main_func.find_ops(op_type="cast")[0] np.testing.assert_allclose(expected_res, cast_op.outputs[0].val, atol=1e-04, rtol=1e-05) @pytest.mark.parametrize( "backend, dtype", itertools.product( backends, ["int8", "uint8", "int16", "uint16"], ), ) def test_cast_with_symbolic_value_iOS17(self, backend, dtype): s1 = get_new_symbol() @mb.program( input_specs=[mb.TensorSpec(shape=(s1, 1))], opset_version=backend.opset_version, ) def prog(x): shape = mb.shape(x=x) out = mb.cast(x=shape, dtype=dtype) assert out.val is None sym_val = out.sym_val assert sym_val.tolist() == [s1, 1] return out @pytest.mark.parametrize( "compute_unit, backend, src_dtype, dst_dtype", itertools.product( compute_units, backends, [np.float16, np.float32, np.int16, np.int32, np.uint16, np.int8, np.uint8], 
[np.float16, np.float32, np.int16, np.int32, np.uint16, np.int8, np.uint8], ), ) def test_builder_to_backend_cast_ios17(self, compute_unit, backend, src_dtype, dst_dtype): _SUPPORTED_IO_DTYPES = {types.fp16, types.fp32, types.int32} x = np.array([[1, 2, 3], [4, 5, 6]], dtype=src_dtype) src_builtin_dtype = numpy_type_to_builtin_type(src_dtype) dst_builtin_dtype = numpy_type_to_builtin_type(dst_dtype) expected_res = x.astype(dtype=np.float16) expected_cast_num = 1 if src_builtin_dtype not in _SUPPORTED_IO_DTYPES: # A cast will be inserted for unsupported dtypes inputs. expected_cast_num += 1 # As CoreML IO only allows fp16/32 and int32, the output will be further cast. expected_res_builtin_dtype = dst_builtin_dtype if dst_builtin_dtype not in _SUPPORTED_IO_DTYPES: expected_res_builtin_dtype = ( types.int32 if types.is_int(dst_builtin_dtype) else types.fp32 ) expected_cast_num += 1 def build(x): return mb.cast(x=x, dtype=types.builtin_to_string(dst_builtin_dtype)) with patch.object(Var, "_is_nonreplaceable_var") as mocked_is_nonreplaceable_var: # Mock that the cast is non-replaceable, to make sure it's kept in the graph. mocked_is_nonreplaceable_var.side_effect = ( lambda var: var.op and var.op.op_type == "cast" ) # Remove the cast optimization pass to make sure all cast are kept in the graph. pass_pipeline: PassPipeline = PassPipeline.DEFAULT pass_pipeline.remove_passes( ["common::cast_optimization", "common::topological_reorder"] ) mlmodel = run_compare_builder( build, {"x": mb.placeholder(shape=x.shape, dtype=src_builtin_dtype)}, input_values={"x": x}, expected_output_types=x.shape + (expected_res_builtin_dtype,), expected_outputs=expected_res, compute_unit=compute_unit, backend=backend, pass_pipeline=pass_pipeline, ) prog = mlmodel._mil_program cast_ops = prog["main"].find_ops(op_type="cast") assert len(cast_ops) == expected_cast_num @pytest.mark.parametrize( "compute_unit, backend, op_name, epsilon_val, x_eps_dtype", itertools.product( compute_units, backends, ["inverse", "log", "rsqrt"], [1e-3, 1e-1, 1.0], [(np.float32, np.float16), (np.float16, np.float32)], ), ) def test_builder_to_backend_stress_with_epsilon( self, compute_unit, backend, op_name, epsilon_val, x_eps_dtype, ): # From iOS17, epsilon and have different dtype than x _TestElementwiseUnary_iOS14._test_builder_to_backend_stress_with_epsilon( compute_unit, backend, op_name, epsilon_val, x_eps_dtype ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( [ct.ComputeUnit.CPU_ONLY, ct.ComputeUnit.CPU_AND_GPU, ct.ComputeUnit.ALL], backends, ), ) def test_cast_fp16_output_bug_smoke(self, compute_unit, backend): """ Since a fp16 output bug in Core ML can only be reproduced by non-CPU backends, for this test, we hardcode the compute_unit. 
""" def build(x): return mb.cast(x=x, dtype="fp16") x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32) expected_res = x.astype(dtype=np.float16) mlmodel = run_compare_builder( build, {"x": mb.placeholder(shape=x.shape, dtype=types.int32)}, input_values={"x": x}, expected_output_types=x.shape + (types.fp16,), expected_outputs=expected_res, compute_unit=compute_unit, backend=backend, ) prog = mlmodel._mil_program assert get_op_types_in_program(prog) == ["cast"] cast_op = prog.find_ops(op_type="cast", exactly_one=True)[0] assert cast_op.dtype.val == "fp16" ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS17/test_image_resizing.py0000644000000000000000000003677614672066616030032 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.ops.tests.iOS15.test_image_resizing import ( TestResample as _TestResampleIos15, ) from coremltools.converters.mil.mil.ops.tests.iOS17 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import UNK_SYM, run_compare_builder from coremltools.converters.mil.testing_reqs import compute_units class TestCropResize: @pytest.mark.parametrize( "compute_unit, backend, N", itertools.product(compute_units, backends, [1, 3]), ) def test_builder_to_backend_ios17(self, compute_unit, backend, N): """For iOS17+ the `roi` input is replaced by `boxes` and `box_indices`.""" x = np.arange(1, 17, dtype=np.float32).reshape(1, 1, 4, 4) boxes = np.array([1, 1, 2, 2], dtype=np.float32).reshape(1, 4) box_indices = None normalized_coordinates = False if N == 3: boxes = np.array( [ [0.1, 0.3, 1.3, 1.0], [0.5, 1.8, 1.0, 0.3], [0.0, 0.4, 0.6, 0.7], ], dtype=np.float32, ) box_indices = np.array([0] * 3, dtype=np.int32) normalized_coordinates = True def build(x): return mb.crop_resize( x=x, boxes=boxes, box_indices=box_indices, target_width=2, target_height=2, normalized_coordinates=normalized_coordinates, box_coordinate_mode="CORNERS_HEIGHT_FIRST", sampling_mode="ALIGN_CORNERS", pad_value=10.0, ) expected_outputs = [np.array([6, 7, 10, 11], dtype=np.float32).reshape(1, 1, 2, 2)] if N == 3: expected_outputs = [ np.array( [3.1, 5.2, 10.0, 10.0, 10.0, 7.899, 10.0, 13.9, 2.2, 3.1, 9.4, 10.3], dtype=np.float32, ).reshape(3, 1, 2, 2) ] run_compare_builder( build, input_placeholders={"x": mb.placeholder(shape=(1, 1, 4, 4))}, input_values={"x": x}, expected_output_types=[(N, 1, 2, 2, types.fp32)], expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "backend", backends, ) def test_default_value(self, backend): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 1, 4, 4))], opset_version=backend.opset_version) def prog(a): res = mb.crop_resize(x=a, boxes=np.array([1, 1, 2, 2], dtype=np.float32).reshape(1, 4), target_height=2, target_width=2) assert res.op.box_coordinate_mode.val == "CORNERS_HEIGHT_FIRST" assert res.op.sampling_mode.val == "DEFAULT" return res @pytest.mark.parametrize( "backend", backends, ) def test_builder_eval_ios17_invalid(self, backend): x = np.arange(1, 17, dtype=np.float32).reshape(1, 1, 4, 4) three_boxes = np.array( [ [0.1, 0.3, 1.3, 1.0], 
[0.5, 1.8, 1.0, 0.3], [0.0, 0.4, 0.6, 0.7], ], dtype=np.float32, ) with pytest.raises( ValueError, match='N dimension of "boxes" \(3\) should not be greater ' 'than the B dimension of "x" \(1\)', ): @mb.program(input_specs=[], opset_version=backend.opset_version) def prog(): return mb.crop_resize(x=x, boxes=three_boxes) one_box = np.array([1, 1, 2, 2], dtype=np.float32).reshape(1, 4) indices_out_of_bound = np.array([10], dtype=np.int32) with pytest.raises( ValueError, match='input "box_indices" should not have values >= B ' "dimension of x \(1\), but got \[10\]", ): @mb.program(input_specs=[], opset_version=backend.opset_version) def prog(): return mb.crop_resize(x=x, boxes=one_box, box_indices=indices_out_of_bound) indices_two_dim = np.array([[0]], dtype=np.int32) with pytest.raises( ValueError, match='input "box_indices" must has shape \[1\], but got \(1, 1\)' ): @mb.program(input_specs=[], opset_version=backend.opset_version) def prog(): return mb.crop_resize(x=x, boxes=one_box, box_indices=indices_two_dim) x_rank5 = np.arange(1, 17, dtype=np.float32).reshape(1, 1, 4, 4, 1) with pytest.raises( ValueError, match='input to the "crop_resize" op must be of rank 4, but got 5' ): @mb.program(input_specs=[], opset_version=backend.opset_version) def prog(): return mb.crop_resize(x=x_rank5, boxes=one_box) class TestResample(_TestResampleIos15): @pytest.mark.parametrize( "compute_unit, backend, coordinates_dtype", itertools.product( compute_units, backends, (np.int8, np.uint8, np.int16, np.uint16, np.int32, np.float16, np.float32), ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, coordinates_dtype): # The fp16 precision will have two casts inserted for input/output expected_cast_ops = 2 if backend.precision == "fp16" else 0 if backend.precision == "fp16" and coordinates_dtype == np.float32: # The coordinates also cast to fp16. expected_cast_ops += 1 if coordinates_dtype not in (np.int32, np.float16, np.float32): # For dtype not supported in CoreML I/O, a cast will be inserted. 
expected_cast_ops += 1 self._test_builder_to_backend_smoke( compute_unit, backend, coordinates_dtype, expected_cast_ops ) class TestResize: @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends) ) def test_resize_nearest_neighbor(self, compute_unit, backend): def build_model(x): return mb.resize( x=x, shape=[1, 1, 3, 2], resized_dims=np.uint32(2), interpolation_mode="NEAREST_NEIGHBOR", sampling_mode="DEFAULT", ) x_val = np.array([-6.174, 9.371], dtype=np.float32).reshape([1, 1, 1, 2, 1]) input_placeholder_dict = {"x": mb.placeholder(shape=x_val.shape)} input_value_dict = {"x": x_val} expected_output_types = [(1, 1, 1, 3, 2, types.fp32)] expected_outputs = [ np.array([[-6.174, -6.174, 9.371, 9.371, 9.371, 9.371]], dtype=np.float32).reshape( [1, 1, 1, 3, 2] ) ] run_compare_builder( build_model, input_placeholder_dict, input_value_dict, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_resize_nearest_neighbor_dynamic_shape(self, compute_unit, backend): def build_model(x, shape): return mb.resize( x=x, shape=shape, resized_dims=np.uint32(2), interpolation_mode="NEAREST_NEIGHBOR", sampling_mode="DEFAULT", ) x_val = np.array([-6.174, 9.371], dtype=np.float32).reshape([1, 1, 2, 1]) shape_val = np.array([1, 1, 3, 2], dtype=np.int32) input_placeholder_dict = { "x": mb.placeholder(shape=x_val.shape, dtype=types.fp32), "shape": mb.placeholder(shape=shape_val.shape, dtype=types.int32), } input_value_dict = {"x": x_val, "shape": shape_val} expected_output_types = [(1, 1, UNK_SYM, UNK_SYM, types.fp32)] expected_outputs = [ np.array([[-6.174, -6.174, 9.371, 9.371, 9.371, 9.371]], dtype=np.float32).reshape( [1, 1, 3, 2] ) ] run_compare_builder( build_model, input_placeholder_dict, input_value_dict, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_resize_linear(self, compute_unit, backend): def build_model(x): return mb.resize( x=x, shape=[1, 1, 5], resized_dims=np.uint32(2), interpolation_mode="LINEAR", sampling_mode="DEFAULT", ) x_val = np.array([0, 1], dtype=np.float32).reshape([1, 1, 2]) input_placeholder_dict = {"x": mb.placeholder(shape=x_val.shape)} input_value_dict = {"x": x_val} expected_output_types = [(1, 1, 5, types.fp32)] expected_outputs = [np.array([[0, 0.4, 0.8, 1, 1]], dtype=np.float32).reshape([1, 1, 5])] run_compare_builder( build_model, input_placeholder_dict, input_value_dict, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_resize_linear_dynamic_shape(self, compute_unit, backend): def build_model(x, shape): return mb.resize( x=x, shape=shape, resized_dims=np.uint32(2), interpolation_mode="LINEAR", sampling_mode="DEFAULT", ) x_val = np.array([0, 1], dtype=np.float32).reshape([1, 1, 1, 2]) shape_val = np.array([3, 1, 5], dtype=np.int32) input_placeholder_dict = { "x": mb.placeholder(shape=x_val.shape, dtype=types.fp32), "shape": mb.placeholder(shape=shape_val.shape, dtype=types.int32), } input_value_dict = {"x": x_val, "shape": shape_val} expected_output_types = [(1, 1, UNK_SYM, UNK_SYM, types.fp32)] expected_outputs = [np.array([[0, 0.4, 0.8, 1, 1]], dtype=np.float32).reshape([1, 1, 1, 5])] run_compare_builder( build_model, 
input_placeholder_dict, input_value_dict, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_resize_invalid_parameter(self, compute_unit, backend): def build_invalid_interpolation_mode(x): return mb.resize( x=x, shape=[1, 1, 5], resized_dims=np.uint32(2), interpolation_mode="DUMMY", sampling_mode="DEFAULT", ) def build_invalid_sampling_mode(x): return mb.resize( x=x, shape=[1, 1, 5], resized_dims=np.uint32(2), interpolation_mode="LINEAR", sampling_mode="DUMMY", ) def build_invalid_target_shape(x): return mb.resize( x=x, shape=[1, 1, 1, 5], resized_dims=np.uint32(2), interpolation_mode="LINEAR", sampling_mode="DEFAULT", ) x_val = np.array([0, 1], dtype=np.float32).reshape([1, 1, 2]) input_placeholder_dict = {"x": mb.placeholder(shape=x_val.shape)} input_value_dict = {"x": x_val} with pytest.raises(ValueError, match="Invalid interpolation_mode"): run_compare_builder( build_invalid_interpolation_mode, input_placeholder_dict, input_value_dict, compute_unit=compute_unit, backend=backend, ) with pytest.raises(ValueError, match="Invalid sampling_mode"): run_compare_builder( build_invalid_sampling_mode, input_placeholder_dict, input_value_dict, compute_unit=compute_unit, backend=backend, ) with pytest.raises(ValueError, match="The shape's size \(4\) must <= x's rank \(3\)"): run_compare_builder( build_invalid_target_shape, input_placeholder_dict, input_value_dict, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, interpolation_mode", itertools.product(compute_units, backends, ("LINEAR",)), ) def test_resize_inherit_shape(self, compute_unit, backend, interpolation_mode): def build_model(x): return mb.resize( x=x, shape=[1, 0, 0, 0], resized_dims=np.uint32(3), interpolation_mode=interpolation_mode, sampling_mode="DEFAULT", ) pytest.xfail("rdar://112418424 Backend failed when input shape has 0.") x_val = np.array([-6.174, 9.371], dtype=np.float32).reshape([1, 1, 1, 2, 1]) input_placeholder_dict = {"x": mb.placeholder(shape=x_val.shape)} input_value_dict = {"x": x_val} expected_output_types = [(1, 1, 1, 2, 1, types.fp32)] expected_outputs = [np.array([-6.174, 9.371], dtype=np.float32).reshape([1, 1, 1, 2, 1])] run_compare_builder( build_model, input_placeholder_dict, input_value_dict, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, interpolation_mode", itertools.product(compute_units, backends, ("LINEAR", "NEAREST_NEIGHBOR")), ) def test_resize_inherit_shape_dynamic(self, compute_unit, backend, interpolation_mode): def build_model(x, shape): return mb.resize( x=x, shape=shape, resized_dims=np.uint32(2), interpolation_mode=interpolation_mode, sampling_mode="DEFAULT", ) pytest.xfail("rdar://112418424 Backend failed when input shape has 0.") x_val = np.array([0, 1], dtype=np.float32).reshape([1, 1, 1, 2]) shape_val = np.array([1, 0, 0], dtype=np.int32) input_placeholder_dict = { "x": mb.placeholder(shape=x_val.shape, dtype=types.fp32), "shape": mb.placeholder(shape=shape_val.shape, dtype=types.int32), } input_value_dict = {"x": x_val, "shape": shape_val} expected_output_types = [(1, 1, UNK_SYM, UNK_SYM, types.fp32)] expected_outputs = [np.array([[0, 1]], dtype=np.float32).reshape([1, 1, 1, 2])] run_compare_builder( build_model, input_placeholder_dict, input_value_dict, expected_output_types, expected_outputs, compute_unit=compute_unit, 
backend=backend, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS17/test_linear.py0000644000000000000000000001641114672066616026270 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest import coremltools as ct from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.ops.tests.iOS17 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import run_compare_builder from coremltools.converters.mil.mil.types import ( builtin_to_string, nptype_from_builtin, numpy_type_to_builtin_type, ) from coremltools.converters.mil.testing_reqs import compute_units class TestLinear: @pytest.mark.parametrize( "compute_unit, backend, x_dtype, weight_bias_dtype", itertools.product( compute_units, backends, [np.float16, np.float32, np.int32], [np.float16, np.float32, np.int32], ), ) def test_linear_ios17_mixed_precision(self, compute_unit, backend, x_dtype, weight_bias_dtype): if x_dtype == np.int32: pytest.xfail("Linear op doesn't work with int32 input (rdar://111421695)") out_channels = 3 x_shape = np.random.randint(low=1, high=3, size=(3,)) w_shape = np.array([out_channels, x_shape[-1]]) b_shape = np.array([out_channels]) x_val = np.random.randint(low=0, high=10, size=x_shape).astype(x_dtype) weight_val = np.random.randint(low=0, high=10, size=w_shape).astype(weight_bias_dtype) bias_val = np.random.randint(low=0, high=10, size=b_shape).astype(weight_bias_dtype) x_builtin_dtype = numpy_type_to_builtin_type(x_dtype) input_placeholders = { "x": mb.placeholder(shape=x_val.shape, dtype=x_builtin_dtype), } def build(x): return mb.linear(x=x, weight=weight_val, bias=bias_val) expected_outputs = np.matmul(x_val, np.transpose(weight_val)) + bias_val run_compare_builder( build, input_placeholders, input_values={"x": x_val}, expected_output_types=expected_outputs.shape + (x_builtin_dtype,), expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, x_input_type, weight_input_type", itertools.product( compute_units, backends, [types.int32, types.fp16, types.fp32], [types.int32, types.fp16, types.fp32], ), ) def test_default_bias_type_ios17(self, compute_unit, backend, x_input_type, weight_input_type): # Start from iOS17, x and weight can have different dtype. # Test the default bias matches the dtype of weight. 
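# --- Editor's illustrative sketch (not part of the original test file) ---
# The mixed-precision linear test above compares the converted model against a plain
# NumPy reference: y = x @ W.T + b. The helper and example values below are
# hypothetical (editor-added), shown only so the reference math can be reproduced
# standalone.
import numpy as np

def linear_reference(x: np.ndarray, weight: np.ndarray, bias: np.ndarray) -> np.ndarray:
    # weight has shape (out_channels, in_channels), matching mb.linear's layout above.
    return np.matmul(x, weight.T) + bias

_x = np.array([[1.0, 2.0]], dtype=np.float32)          # (1, 2)
_w = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])    # (3, 2)
_b = np.array([0.5, 0.5, 0.5], dtype=np.float32)       # (3,)
assert np.allclose(linear_reference(_x, _w, _b), [[1.5, 2.5, 3.5]])
# --- End of editor's sketch ---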
@mb.program( input_specs=[mb.TensorSpec(shape=(1, 2), dtype=types.fp32)], opset_version=backend.opset_version, ) def prog(x): x = mb.cast(x=x, dtype=builtin_to_string(x_input_type)) weight = np.random.rand(3, 2).astype(nptype_from_builtin(weight_input_type)) res = mb.linear(x=x, weight=weight) assert res.op.bias.val.dtype == nptype_from_builtin(weight_input_type) return res class TestMatMul: @pytest.mark.parametrize( "compute_unit, backend, x_dtype, y_dtype", itertools.product( compute_units, backends, [np.float32, np.float16, np.int32], [np.float32, np.float16, np.int32], ), ) def test_ios17_mixed_precision(self, compute_unit, backend, x_dtype, y_dtype): x_val = np.random.randint(low=0, high=10, size=(2, 5)).astype(x_dtype) y_val = np.random.randint(low=0, high=10, size=(5, 10)).astype(y_dtype) x_mb_dtype = numpy_type_to_builtin_type(x_dtype) y_mb_dtype = numpy_type_to_builtin_type(y_dtype) expected_outputs = np.matmul(x_val, y_val) def build_x_const(y): return mb.matmul(x=x_val, y=y, transpose_x=False, transpose_y=False) def build_y_const(x): return mb.matmul(x=x, y=y_val, transpose_x=False, transpose_y=False) mlmodel = run_compare_builder( build_y_const, input_placeholders={"x": mb.placeholder(shape=x_val.shape, dtype=x_mb_dtype)}, input_values={"x": x_val}, expected_output_types=expected_outputs.shape + (x_mb_dtype,), expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, pass_pipeline=ct.PassPipeline.EMPTY, ) prog = mlmodel._mil_program matmul_op = prog["main"].find_ops(op_type="matmul")[0] # When x is non-const and y is const, the output should have the same dtype as x. assert types.builtin_to_string(matmul_op.outputs[0].dtype) == types.builtin_to_string( x_mb_dtype ) mlmodel = run_compare_builder( build_x_const, input_placeholders={"y": mb.placeholder(shape=y_val.shape, dtype=y_mb_dtype)}, input_values={"y": y_val}, expected_output_types=expected_outputs.shape + (y_mb_dtype,), expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, pass_pipeline=ct.PassPipeline.EMPTY, ) prog = mlmodel._mil_program matmul_op = prog["main"].find_ops(op_type="matmul")[0] # When x is const and y is non-const, the output should have the same dtype as y. 
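# --- Editor's illustrative sketch (not part of the original test file) ---
# The mixed-precision matmul tests around this point assert an iOS17 rule: when
# exactly one of x / y is const, the output dtype follows the non-const operand.
# Plain NumPy has no such rule (it widens instead), which is why the test inspects
# the MIL program rather than the numerics. The example data below is hypothetical.
import numpy as np

_x = np.arange(10, dtype=np.float16).reshape(2, 5)   # stands in for the non-const input
_y = np.ones((5, 10), dtype=np.int32)                # stands in for the const operand
_ref = np.matmul(_x, _y)                             # NumPy promotes fp16 x int32 to float64,
                                                     # whereas the test expects fp16 output.
assert _ref.shape == (2, 10)
# --- End of editor's sketch ---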
assert types.builtin_to_string(matmul_op.outputs[0].dtype) == types.builtin_to_string( y_mb_dtype ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_ios17_invalid_mixed_precision(self, compute_unit, backend): """When x and y are both const or both non-const, mixed precision is not allowed.""" x_val = np.random.rand(2, 5).astype(np.float32) y_val = np.random.randint(low=0, high=10, size=(5, 10)).astype(np.int32) def build_both_const(): return mb.matmul(x=x_val, y=y_val, transpose_x=False, transpose_y=False) def build_both_not_const(x, y): return mb.matmul(x=x, y=y, transpose_x=False, transpose_y=False) with pytest.raises( ValueError, match="when x and y are both const, their dtype need to match" ): run_compare_builder( build_both_const, input_placeholders={}, input_values={}, compute_unit=compute_unit, backend=backend, ) with pytest.raises( ValueError, match="when x and y are both non-const, their dtype need to match" ): run_compare_builder( build_both_not_const, input_placeholders={ "x": mb.placeholder(shape=x_val.shape, dtype=types.fp32), "y": mb.placeholder(shape=y_val.shape, dtype=types.int32), }, input_values={"x": x_val, "y": y_val}, compute_unit=compute_unit, backend=backend, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS17/test_normalization.py0000644000000000000000000001432614672066616027707 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest from coremltools._deps import _HAS_TF_2, _HAS_TORCH, MSG_TF2_NOT_FOUND, MSG_TORCH_NOT_FOUND from coremltools.converters.mil.mil.ops.tests.iOS14.test_normalization import ( TestNormalizationBatchNorm as _TestNormalizationBatchNormIos14, ) from coremltools.converters.mil.mil.ops.tests.iOS14.test_normalization import ( TestNormalizationInstanceNorm as _TestNormalizationInstanceNormIos14, ) from coremltools.converters.mil.mil.ops.tests.iOS14.test_normalization import ( TestNormalizationL2Norm as _TestNormalizationL2NormIos14, ) from coremltools.converters.mil.mil.ops.tests.iOS14.test_normalization import ( TestNormalizationLayerNorm as _TestNormalizationLayerNormIos14, ) from coremltools.converters.mil.mil.ops.tests.iOS14.test_normalization import ( TestNormalizationLocalResponseNorm as _TestNormalizationLocalResponseNormIos14, ) from coremltools.converters.mil.mil.ops.tests.iOS17 import backends from coremltools.converters.mil.testing_reqs import compute_units class TestNormalizationBatchNorm(_TestNormalizationBatchNormIos14): @pytest.mark.parametrize( "compute_unit, backend, x_param_dtype", itertools.product( compute_units, backends, [ (np.float16, np.float16), (np.float32, np.float32), (np.float16, np.float32), (np.float32, np.float16), ], ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, x_param_dtype): super().test_builder_to_backend_smoke(compute_unit, backend, x_param_dtype) class TestNormalizationInstanceNorm(_TestNormalizationInstanceNormIos14): @pytest.mark.parametrize( "compute_unit, backend, x_param_dtype", itertools.product( compute_units, backends, [ (np.float16, np.float16), (np.float32, np.float32), (np.float16, np.float32), (np.float32, np.float16), ], ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, 
x_param_dtype): super().test_builder_to_backend_smoke(compute_unit, backend, x_param_dtype) @pytest.mark.parametrize( "compute_unit, backend, x_param_dtype", itertools.product( compute_units, backends, [ (np.float16, np.float16), (np.float32, np.float32), (np.float16, np.float32), (np.float32, np.float16), ], ), ) def test_builder_to_backend_smoke_with_gamma_and_beta( self, compute_unit, backend, x_param_dtype ): super().test_builder_to_backend_smoke_with_gamma_and_beta( compute_unit, backend, x_param_dtype ) @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( "rank, compute_unit, backend, epsilon, x_param_dtype", itertools.product( [3, 4], compute_units, backends, [1e-3, 1e-5, 1e-10], [ (np.float16, np.float16), (np.float32, np.float32), (np.float16, np.float32), (np.float32, np.float16), ], ), ) def test_builder_to_backend_stress(self, rank, compute_unit, backend, epsilon, x_param_dtype): super().test_builder_to_backend_stress(rank, compute_unit, backend, epsilon, x_param_dtype) class TestNormalizationL2Norm(_TestNormalizationL2NormIos14): @pytest.mark.parametrize( "compute_unit, backend, rank, epsilon, x_param_dtype", itertools.product( compute_units, backends, [3, 4, 5], [1e-4, 5.7], [ (np.float16, np.float16), (np.float32, np.float32), (np.float16, np.float32), (np.float32, np.float16), ], ), ) def test_builder_to_backend_stress(self, compute_unit, backend, rank, epsilon, x_param_dtype): super().test_builder_to_backend_stress(compute_unit, backend, rank, epsilon, x_param_dtype) class TestNormalizationLayerNorm(_TestNormalizationLayerNormIos14): @pytest.mark.skipif(not _HAS_TF_2, reason=MSG_TF2_NOT_FOUND) @pytest.mark.parametrize( "compute_unit, backend, rank_and_axes, epsilon, x_param_dtype", itertools.product( compute_units, backends, [[3, [0, 2]], [3, [-2]], [4, [0, 1, 3]], [5, [0, 4]], [5, [-5, -4, -3, -2, -1]]], [0.0001, 0.01], [ (np.float16, np.float16), (np.float32, np.float32), (np.float16, np.float32), (np.float32, np.float16), ], ), ) def test_builder_to_backend_stress_keras( self, compute_unit, backend, rank_and_axes, epsilon, x_param_dtype ): super().test_builder_to_backend_stress_keras( compute_unit, backend, rank_and_axes, epsilon, x_param_dtype ) class TestNormalizationLocalResponseNorm(_TestNormalizationLocalResponseNormIos14): @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( "compute_unit, backend, rank, size, alpha, beta, k, x_param_dtype", itertools.product( compute_units, backends, [rank for rank in range(3, 6)], [2, 3, 5], [0.0001, 0.01], [0.75, 1.0], [1.0, 2.0], [ (np.float16, np.float16), (np.float32, np.float32), (np.float16, np.float32), (np.float32, np.float16), ], ), ) def test_builder_to_backend_stress( self, compute_unit, backend, rank, size, alpha, beta, k, x_param_dtype ): super().test_builder_to_backend_stress( compute_unit, backend, rank, size, alpha, beta, k, x_param_dtype ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS17/test_quantization.py0000644000000000000000000004302314672066616027543 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. 
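# --- Editor's illustrative sketch (not part of the original test files) ---
# The normalization stress tests above (batch/instance/L2/layer/local-response norm)
# reuse the iOS14 reference implementations with mixed fp16/fp32 parameter dtypes.
# As a reminder of the math being exercised, here is a minimal, generic layer-norm
# reference in NumPy, assuming the standard (x - mean) / sqrt(var + eps) formulation;
# the helper name, axes, gamma and beta are illustrative, not values from the tests.
import numpy as np

def layer_norm_reference(x, axes, eps=1e-5, gamma=None, beta=None):
    mean = np.mean(x, axis=tuple(axes), keepdims=True)
    var = np.var(x, axis=tuple(axes), keepdims=True)
    out = (x - mean) / np.sqrt(var + eps)
    if gamma is not None:
        out = out * gamma
    if beta is not None:
        out = out + beta
    return out

_x = np.arange(12, dtype=np.float32).reshape(2, 2, 3)
_out = layer_norm_reference(_x, axes=[-1])
assert np.allclose(np.mean(_out, axis=-1), 0.0, atol=1e-5)
# --- End of editor's sketch ---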
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools from typing import Tuple import numpy as np import pytest from coremltools._deps import _HAS_TORCH, MSG_TORCH_NOT_FOUND from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.ops.tests.iOS17 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import run_compare_builder from coremltools.converters.mil.mil.types import builtin_to_string, numpy_type_to_builtin_type from coremltools.converters.mil.testing_reqs import BackendConfig, compute_units from coremltools.converters.mil.testing_utils import ssa_fn if _HAS_TORCH: import torch torch.manual_seed(1042) np.random.seed(1042) class TestQuantizationBase: @staticmethod def get_random_quantization_params( float_dtype: np.dtype, quant_dtype: np.dtype, input_rank: int, is_zp_present: bool = True, axis: int = None, ) -> Tuple[np.ndarray, np.ndarray, np.ndarray]: """ return floating-point input, floating-point scale, integer zero point """ x_shape = np.random.randint(low=1, high=5, size=(input_rank,)) low, high = (-128, 128) if quant_dtype == np.int8 else (0, 256) # create quantized x x_q = np.random.randint(low=low, high=high, size=x_shape) # create scale and zero point, the dequantize x x_fp = None scale = None zp = None # quantize per tensor if axis is None: scale = np.array(np.random.rand()) if is_zp_present: zp = np.array(np.random.randint(low=low, high=high)) x_fp = (x_q - zp) * scale else: x_fp = x_q * scale # quantize per channel else: # prepare broadcast shape for latter dequantize broadcastable_shape = np.ones(input_rank, dtype=np.int32) broadcastable_shape[axis] = x_shape[axis] scale = np.random.rand(x_shape[axis]) broadcasted_scale = np.reshape(scale, broadcastable_shape) if is_zp_present: zp = np.random.randint(low=low, high=high, size=x_shape[axis]) broadcasted_zp = np.reshape(zp, broadcastable_shape) x_fp = (x_q - broadcasted_zp) * broadcasted_scale else: x_fp = x_q * broadcasted_scale x_fp = x_fp.astype(float_dtype) scale = scale.astype(float_dtype) zero_point = zp.astype(quant_dtype) if is_zp_present else None return x_fp, scale, zero_point @staticmethod def torch_quantize( x: np.ndarray, scale: np.ndarray, zero_point: np.ndarray, axis: int = None, quant_dtype: np.dtype = None, ) -> torch.Tensor: """ return quantized x by pytorch """ # quantization data type is either inferred from `zero_point`, # or explicitly provided if zero_point is not None: quant_dtype = zero_point.dtype assert quant_dtype is not None # if scale is scalar, then axis must be None # if scale is not scalar, then axis must have a value assert (len(scale.shape) == 0) != (axis is not None) x_torch = torch.from_numpy(x).to(torch.float32) s_torch = torch.from_numpy(scale).to(torch.float32) zp_torch = ( torch.zeros(scale.shape, dtype=torch.int) if zero_point is None else torch.from_numpy(zero_point) ) dtype_torch = torch.quint8 if quant_dtype == np.uint8 else torch.qint8 output: np.ndarray if axis is None: output = torch.quantize_per_tensor(x_torch, s_torch, zp_torch, dtype_torch) else: if axis < 0: axis += len(x.shape) output = torch.quantize_per_channel(x_torch, s_torch, zp_torch, axis, dtype_torch) return output class TestQuantize(TestQuantizationBase): @ssa_fn def test_builder_eval_scalar_params(self): v = mb.quantize( input=np.float32([[0, 2, 4], [0, 2, 4]]), zero_point=np.uint8(1), scale=np.float32(2), output_dtype="uint8", ) 
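# --- Editor's illustrative sketch (not part of the original test file) ---
# Worked arithmetic for the scalar-parameter quantize case above, assuming the usual
# affine rule q = round(x / scale) + zero_point clipped to the output dtype range,
# which is consistent with the expected value checked on the next line:
#   x = [0, 2, 4], scale = 2, zero_point = 1  ->  round([0, 1, 2]) + 1 = [1, 2, 3]
# The helper below is editor-added, not part of the coremltools API.
import numpy as np

def affine_quantize_reference(x, scale, zero_point, info=np.iinfo(np.uint8)):
    q = np.round(x / scale) + zero_point
    return np.clip(q, info.min, info.max).astype(info.dtype)

assert np.array_equal(
    affine_quantize_reference(np.float32([[0, 2, 4], [0, 2, 4]]), 2.0, 1),
    np.uint8([[1, 2, 3], [1, 2, 3]]),
)
# --- End of editor's sketch ---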
np.testing.assert_allclose(np.array([[1, 2, 3], [1, 2, 3]]).astype(np.uint8), v.val) @ssa_fn def test_builder_eval_vector_params(self): v = mb.quantize( input=np.array([1, 2, 3, 4]).reshape(1, 1, 2, 2).astype(np.float32), zero_point=np.array([2, 4]).astype(np.int8), scale=np.array([1, 2]).astype(np.float32), axis=3, output_dtype="int8", ) np.testing.assert_allclose( np.array([3, 5, 5, 6]).reshape(1, 1, 2, 2).astype(np.int8), v.val ) @ssa_fn def test_builder_eval_vector_params_neg_axis(self): v = mb.quantize( input=np.array([1, 2, 3, 4]).reshape(1, 1, 2, 2).astype(np.float32), zero_point=np.array([2, 4]).astype(np.int8), scale=np.array([1, 2]).astype(np.float32), axis=-1, output_dtype="int8", ) np.testing.assert_allclose( np.array([3, 5, 5, 6]).reshape(1, 1, 2, 2).astype(np.int8), v.val ) @ssa_fn def test_builder_eval_no_zero_point(self): v = mb.quantize( input=np.float32([[0, 2, 4], [0, 2, 4]]), scale=np.float32(2), output_dtype="int8", ) np.testing.assert_allclose(np.array([[0, 1, 2], [0, 1, 2]]).astype(np.int8), v.val) @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_smoke_builder_to_backend_quantize_per_tensor(self, compute_unit, backend): def build(x): x = mb.cast(x=x, dtype="fp16") quantized = mb.quantize( input=x, zero_point=np.int8(10), scale=np.float16(0.1), output_dtype="int8", ) # TODO(rdar://107430678): Replace scale=1 zero_point=0 quantize/dequantize with cast dequantized = mb.dequantize( input=quantized, scale=np.float16(1), ) return dequantized x = np.array([-1.0, 0.0, 1.0, 2.0], dtype=np.float16) expected_output = np.array([0, 10, 20, 30], dtype=np.float16) expected_output_type = expected_output.shape + ( numpy_type_to_builtin_type(expected_output.dtype), ) run_compare_builder( build, input_placeholders={"x": mb.placeholder(shape=x.shape)}, input_values={"x": x}, expected_output_types=[expected_output_type], expected_outputs=[expected_output], compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_smoke_builder_to_backend_quantize_per_channel(self, compute_unit, backend): def build(x): x = mb.cast(x=x, dtype="fp16") quantized = mb.quantize( input=x, zero_point=np.uint8([10, 0]), scale=np.float16([0.1, 0.01]), axis=0, output_dtype="uint8", ) # TODO(rdar://107430678): Replace scale=1 zero_point=0 quantize/dequantize with cast dequantized = mb.dequantize( input=quantized, scale=np.float16(1), ) return dequantized x = np.array([[-1.0, 0.0], [1.0, 2.0]], dtype=np.float16) expected_output = np.array([[0, 10], [100, 200]], dtype=np.float16) expected_output_type = expected_output.shape + ( numpy_type_to_builtin_type(expected_output.dtype), ) run_compare_builder( build, input_placeholders={"x": mb.placeholder(shape=x.shape)}, input_values={"x": x}, expected_output_types=[expected_output_type], expected_outputs=[expected_output], compute_unit=compute_unit, backend=backend, ) @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( "compute_unit, backend, float_dtype, quant_dtype, input_rank, axis, is_zp_present", itertools.product( compute_units, backends, (np.float32, np.float16), (np.int8, np.uint8), list(range(1, 6)), [None] + list(range(-5, 5)), (True, False), ), ) def test_stress_builder_to_backend_quantize_all_possibilities( self, compute_unit, backend, float_dtype, quant_dtype, input_rank, axis, is_zp_present, ): if axis is not None and (axis < -input_rank or axis >= input_rank): pytest.skip("axis 
should either be None or in [-input_rank, input_rank)") def build(x): x = mb.cast(x=x, dtype=builtin_to_string(numpy_type_to_builtin_type(float_dtype))) quantized = mb.quantize( input=x, zero_point=zero_point, scale=scale, axis=axis, output_dtype=builtin_to_string(numpy_type_to_builtin_type(quant_dtype)), ) # TODO(rdar://107430678): Replace scale=1 zero_point=0 quantize/dequantize with cast dequantized = mb.dequantize( input=quantized, scale=float_dtype(1), ) return dequantized x_fp, scale, zero_point = self.get_random_quantization_params( float_dtype, quant_dtype, input_rank, is_zp_present, axis ) input_placeholders = { "x": mb.placeholder( shape=x_fp.shape, dtype=numpy_type_to_builtin_type(float_dtype), ), } input_values = {"x": x_fp} output_torch = self.torch_quantize(x_fp, scale, zero_point, axis, quant_dtype) output_torch_val = output_torch.int_repr().numpy() output_type = output_torch_val.shape + (numpy_type_to_builtin_type(np.float32),) expected_outputs = [output_torch_val] expected_output_types = [output_type] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestDequantize(TestQuantizationBase): @ssa_fn def test_builder_eval_scalar_params(self): v = mb.dequantize( input=np.array([[1, 2, 3], [1, 2, 3]]).astype(np.uint8), zero_point=np.uint8(1), scale=np.float32(2), ) assert v.val is None np.testing.assert_allclose( np.float32([[0, 2, 4], [0, 2, 4]]), v.op.materialized_val_inference(), ) @ssa_fn def test_builder_eval_vector_params(self): v = mb.dequantize( input=np.array([3, 5, 5, 6]).reshape(1, 1, 2, 2).astype(np.uint8), zero_point=np.array([2, 4]).astype(np.uint8), scale=np.array([1, 2]).astype(np.float32), axis=3, ) assert v.val is None np.testing.assert_allclose( np.array([1, 2, 3, 4]).reshape(1, 1, 2, 2).astype(np.float32), v.op.materialized_val_inference(), ) @ssa_fn def test_builder_eval_no_zero_point(self): v = mb.dequantize( input=np.array([[0, 1, 2], [0, 1, 2]]).astype(np.int8), scale=np.float32(2), ) assert v.val is None np.testing.assert_allclose( np.float32([[0, 2, 4], [0, 2, 4]]), v.op.materialized_val_inference(), ) @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_smoke_builder_to_backend_dequantize_per_tensor(self, compute_unit, backend): def build(x): x = mb.cast(x=x, dtype="fp32") # TODO(rdar://107430678): Replace scale=1 zero_point=0 quantize/dequantize with cast quantized = mb.quantize( input=x, scale=np.float32(1), output_dtype="uint8", ) dequantized = mb.dequantize( input=quantized, zero_point=np.uint8(5), scale=np.float32(0.2), ) return dequantized x = np.array([5, 10, 15, 20], dtype=np.float32) expected_output = np.array([0, 1, 2, 3], dtype=np.float32) expected_output_type = expected_output.shape + ( numpy_type_to_builtin_type(expected_output.dtype), ) run_compare_builder( build, input_placeholders={"x": mb.placeholder(shape=x.shape)}, input_values={"x": x}, expected_output_types=[expected_output_type], expected_outputs=[expected_output], compute_unit=compute_unit, # Other test cases are mostly testing fp16 precision, # so this one we explicitly test fp32 precision backend=BackendConfig( backend=backend.backend, precision="fp32", opset_version=backend.opset_version, ), atol=1e-3, rtol=1e-3, ) @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_smoke_builder_to_backend_dequantize_per_channel(self, compute_unit, backend): def build(x): x 
= mb.cast(x=x, dtype="fp32") # TODO(rdar://107430678): Replace scale=1 zero_point=0 quantize/dequantize with cast quantized = mb.quantize( input=x, scale=np.float32(1), output_dtype="int8", ) dequantized = mb.dequantize( input=quantized, zero_point=np.int8([-5, 5]), scale=np.float32([0.2, 0.3]), axis=1, ) return dequantized x = np.array([[-10, -5], [0, 5]], dtype=np.float32) expected_output = np.array([[-1, -3], [1, 0]], dtype=np.float32) expected_output_type = expected_output.shape + ( numpy_type_to_builtin_type(expected_output.dtype), ) run_compare_builder( build, input_placeholders={"x": mb.placeholder(shape=x.shape)}, input_values={"x": x}, expected_output_types=[expected_output_type], expected_outputs=[expected_output], compute_unit=compute_unit, # Other test cases are mostly testing fp16 precision, # so this one we explicitly test fp32 precision backend=BackendConfig( backend=backend.backend, precision="fp32", opset_version=backend.opset_version, ), atol=1e-3, rtol=1e-3, ) @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( "compute_unit, backend, float_dtype, quant_dtype, input_rank, axis, is_zp_present", itertools.product( compute_units, backends, (np.float32, np.float16), (np.int8, np.uint8), list(range(1, 6)), [None] + list(range(-5, 5)), (True, False), ), ) def test_stress_builder_to_backend_dequantize_all_possibilities( self, compute_unit, backend, float_dtype, quant_dtype, input_rank, axis, is_zp_present, ): if axis is not None and (axis < -input_rank or axis >= input_rank): pytest.skip("axis should either be None or in [-input_rank, input_rank)") def build(x): x = mb.cast(x=x, dtype=builtin_to_string(numpy_type_to_builtin_type(float_dtype))) # TODO(rdar://107430678): Replace scale=1 zero_point=0 quantize/dequantize with cast quantized = mb.quantize( input=x, scale=float_dtype(1), output_dtype=builtin_to_string(numpy_type_to_builtin_type(quant_dtype)), ) dequantized = mb.dequantize( input=quantized, zero_point=zero_point, scale=scale, axis=axis, ) return dequantized x_fp, scale, zero_point = self.get_random_quantization_params( float_dtype, quant_dtype, input_rank, is_zp_present, axis ) x_q = self.torch_quantize(x_fp, scale, zero_point, axis, quant_dtype) output_torch_val = torch.dequantize(x_q).numpy() output_type = output_torch_val.shape + (numpy_type_to_builtin_type(np.float32),) input_placeholders = { "x": mb.placeholder( shape=x_fp.shape, dtype=numpy_type_to_builtin_type(float_dtype), ), } input_values = {"x": x_q.int_repr().numpy()} expected_outputs = [output_torch_val] expected_output_types = [output_type] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs=expected_outputs, compute_unit=compute_unit, backend=backend, rtol=1e-3, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS17/test_recurrent.py0000644000000000000000000001757114672066616027037 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. 
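# --- Editor's illustrative sketch (not part of the original test files) ---
# The per-channel dequantize tests above follow x_fp = (x_q - zero_point) * scale with
# scale / zero_point broadcast along `axis`, mirroring get_random_quantization_params
# in TestQuantizationBase. The standalone NumPy helper below is editor-added; the
# numbers reproduce the per-channel smoke test (axis=1, zero_point=[-5, 5],
# scale=[0.2, 0.3]).
import numpy as np

def dequantize_reference(x_q, scale, zero_point, axis):
    shape = [1] * x_q.ndim
    shape[axis] = -1                      # broadcast the per-channel params along `axis`
    s = np.reshape(scale, shape)
    zp = np.reshape(zero_point, shape)
    return (x_q.astype(np.float32) - zp) * s

_q = np.int8([[-10, -5], [0, 5]])
_out = dequantize_reference(_q, np.float32([0.2, 0.3]), np.int8([-5, 5]), axis=1)
assert np.allclose(_out, [[-1.0, -3.0], [1.0, 0.0]])
# --- End of editor's sketch ---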
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest from coremltools._deps import _HAS_TORCH, MSG_TORCH_NOT_FOUND from coremltools.converters.mil.mil.ops.tests.iOS14.test_recurrent import TestGRU as _TestGRU_iOS14 from coremltools.converters.mil.mil.ops.tests.iOS14.test_recurrent import ( TestLSTM as _TestLSTM_iOS14, ) from coremltools.converters.mil.mil.ops.tests.iOS14.test_recurrent import TestRNN as _TestRNN_iOS14 from coremltools.converters.mil.mil.ops.tests.iOS17 import backends from coremltools.converters.mil.testing_reqs import compute_units class TestGRU(_TestGRU_iOS14): @pytest.mark.parametrize( argnames=[ "compute_unit", "backend", "seq_len", "batch_size", "input_size", "hidden_size", "has_bias", "output_sequence", "direction", "activation_functions", "symbolic", "dtype", ], argvalues=itertools.product( compute_units, backends, [1, 3], [1], # [MIL] GRU with batch size 1 produces incorrect # output(always 0) for second batch onwards [1, 2], [1, 2], [True, False], [True, False], ["forward", "reverse"], [ ["tanh", "sigmoid"], ["sigmoid", "tanh"], ], [True, False], [np.float16, np.float32], ), ) def test_builder_to_backend_smoke( self, compute_unit, backend, seq_len, batch_size, input_size, hidden_size, has_bias, output_sequence, direction, activation_functions, symbolic, dtype, ): super().test_builder_to_backend_smoke( compute_unit, backend, seq_len, batch_size, input_size, hidden_size, has_bias, output_sequence, direction, activation_functions, symbolic, dtype, ) class TestLSTM(_TestLSTM_iOS14): @pytest.mark.parametrize( ",".join( [ "compute_unit", "backend", "input_dims", "output_dim", "activation", "inner_activation", "outer_activation", "return_seq", "has_bias", "forget_bias", "has_peephole", "coupled_input_forget", "clip", "dtype", ] ), itertools.product( compute_units, backends, [[8, 32, 32]], [4], ["sigmoid"], ["tanh"], ["relu"], [False, True], [False, True], [False, True], [True, False], [False], [50.0, 0.01], [np.float16, np.float32], ), ) def test_numpy_numerical( self, compute_unit, backend, input_dims, output_dim, activation, inner_activation, outer_activation, return_seq, has_bias, forget_bias, has_peephole, coupled_input_forget, clip, dtype, ): super().test_numpy_numerical( compute_unit, backend, input_dims, output_dim, activation, inner_activation, outer_activation, return_seq, has_bias, forget_bias, has_peephole, coupled_input_forget, clip, dtype, ) @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( argnames=[ "compute_unit", "backend", "seq_len", "batch_size", "input_size", "hidden_size", "has_bias", "output_sequence", "direction", "symbolic", "dtype", ], argvalues=itertools.product( compute_units, backends, [1, 8], [1, 32], [1, 64], [1, 16], [True, False], [True, False], ["forward", "reverse"], [True, False], [np.float16, np.float32], ), ) def test_builder_to_backend_smoke_unilstm( self, compute_unit, backend, seq_len, batch_size, input_size, hidden_size, has_bias, output_sequence, direction, symbolic, dtype, ): super().test_builder_to_backend_smoke_unilstm( compute_unit, backend, seq_len, batch_size, input_size, hidden_size, has_bias, output_sequence, direction, symbolic, dtype, ) @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( argnames=[ "compute_unit", "backend", "seq_len", "batch_size", "input_size", "hidden_size", 
"has_bias", "output_sequence", "symbolic", "dtype", ], argvalues=itertools.product( compute_units, backends, [1, 8], [1, 32], [1, 64], [2, 16], [True, False], [True, False], [True, False], [np.float16, np.float32], ), ) def test_builder_to_backend_smoke_bidirlstm( self, compute_unit, backend, seq_len, batch_size, input_size, hidden_size, has_bias, output_sequence, symbolic, dtype, ): super().test_builder_to_backend_smoke_bidirlstm( compute_unit, backend, seq_len, batch_size, input_size, hidden_size, has_bias, output_sequence, symbolic, dtype, ) class TestRNN(_TestRNN_iOS14): @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( argnames=[ "compute_unit", "backend", "seq_len", "batch_size", "input_size", "hidden_size", "has_bias", "output_sequence", "direction", "symbolic", "dtype", ], argvalues=itertools.product( compute_units, backends, [2, 8], [1, 32], [1, 64], [1, 16], [True, False], [True, False], ["forward", "reverse"], [True, False], [np.float16, np.float32], ), ) def test_builder_to_backend_smoke( self, compute_unit, backend, seq_len, batch_size, input_size, hidden_size, has_bias, output_sequence, direction, symbolic, dtype, ): super().test_builder_to_backend_smoke( compute_unit, backend, seq_len, batch_size, input_size, hidden_size, has_bias, output_sequence, direction, symbolic, dtype, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS17/test_reduction.py0000644000000000000000000000457514672066616027022 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.ops.tests.iOS17 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import run_compare_builder from coremltools.converters.mil.testing_reqs import compute_units class TestReduction: @pytest.mark.parametrize( "compute_unit, backend, op_name, output_dtype", itertools.product( compute_units, backends, ["reduce_argmax", "reduce_argmin"], ["int32", "uint16", None] ), ) def test_reduce_arg_ios17_output_dtype(self, compute_unit, backend, op_name, output_dtype): def build(x): return getattr(mb, op_name)(x=x, axis=1, keep_dims=False, output_dtype=output_dtype) val = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=val.shape)} input_values = {"x": val} output_np_type = np.uint16 if output_dtype == "uint16" else np.int32 output_type = types.uint16 if output_dtype == "uint16" else types.int32 expected_output_types = (2, output_type) expected_outputs = np.array( [2, 2] if op_name == "reduce_argmax" else [0, 0], dtype=output_np_type ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "backend, op_name", itertools.product( backends, ["reduce_argmax", "reduce_argmin"], ), ) def test_reduce_arg_ios17_output_dtype_invalid(self, backend, op_name): x = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], dtype=np.float32) def prog(): return getattr(mb, op_name)(x=x, axis=1, keep_dims=False, output_dtype="dummy") with pytest.raises(ValueError, 
match='Invalid "output_dtype" dummy'): mb.program(input_specs=[], opset_version=backend.opset_version)(prog) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS17/test_scatter_gather.py0000644000000000000000000004466014672066616030024 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest import coremltools as ct from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.ops.tests.iOS14.test_scatter_gather import ( TestGatherAlongAxis as _TestGatherAlongAxisIOS14, ) from coremltools.converters.mil.mil.ops.tests.iOS14.test_scatter_gather import ( TestScatterAlongAxis as _TestScatterAlongAxisIOS14, ) from coremltools.converters.mil.mil.ops.tests.iOS16.test_scatter_gather import ( TestGather as _TestGatherIOS16, ) from coremltools.converters.mil.mil.ops.tests.iOS16.test_scatter_gather import ( TestGatherNd as _TestGatherNdIOS16, ) from coremltools.converters.mil.mil.ops.tests.iOS17 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import run_compare_builder from coremltools.converters.mil.testing_reqs import compute_units class TestScatter: @pytest.mark.parametrize( "compute_unit, backend, indices_val, validate_indices, dynamic", itertools.product( compute_units, backends, [[-1, 0], [10, 0]], # One negative indices, another out-of-range indices. [True, False], [True, False], ), ) def test_ios17_invalid_indices( self, compute_unit, backend, indices_val, validate_indices, dynamic ): if ( indices_val == [10, 0] and backend.opset_version == ct.target.iOS18 and not validate_indices ): pytest.xfail( "rdar://128089254 ([Bug][Regression] iOS18 scatter ops has unexpected behavior than iOS17)" ) if ( indices_val == [-1, 0] and backend.opset_version == ct.target.iOS18 and validate_indices and dynamic ): pytest.xfail( "rdar://128089254 ([Bug][Regression] iOS18 scatter ops has unexpected behavior than iOS17)" ) if ( indices_val == [-1, 0] and backend.opset_version == ct.target.iOS18 and not validate_indices ): pytest.xfail( "rdar://128089254 ([Bug][Regression] iOS18 scatter ops has unexpected behavior than iOS17)" ) def build_static(data, updates): return ( mb.scatter( data=data, indices=np.array(indices_val, dtype=np.int32), updates=updates, validate_indices=validate_indices, ), ) def build_dynamic(data, indices, updates): return ( mb.scatter( data=data, indices=indices, updates=updates, validate_indices=validate_indices ), ) data = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) updates = np.array([[5, 6, 7], [8, 9, 10]], dtype=np.float32) input_placeholders = { "data": mb.placeholder(shape=data.shape), "updates": mb.placeholder(shape=updates.shape), } input_values = {"data": data, "updates": updates} if dynamic: indices = np.array(indices_val, dtype=np.int32) input_placeholders["indices"] = mb.placeholder(shape=indices.shape, dtype=types.int32) input_values["indices"] = indices if not validate_indices: # When not validate indices, negative or out-of-bound indices behavior is undefined. 
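# --- Editor's illustrative sketch (not part of the original test file) ---
# For readers following the invalid-indices scatter test above: with well-formed
# indices, the op accumulates updates into the rows selected along axis 0 (the
# default mode appears to be additive, consistent with the expected_outputs value
# [[9, 11, 13], [9, 11, 13]] passed to run_compare_builder below). A NumPy sketch of
# that reference behaviour with in-range indices [1, 0]; the helper is editor-added.
import numpy as np

def scatter_add_axis0_reference(data, indices, updates):
    out = data.copy()
    np.add.at(out, indices, updates)      # accumulate updates row-by-row, unbuffered
    return out

_data = np.float32([[1, 2, 3], [4, 5, 6]])
_updates = np.float32([[5, 6, 7], [8, 9, 10]])
assert np.array_equal(
    scatter_add_axis0_reference(_data, np.array([1, 0]), _updates),
    np.float32([[9, 11, 13], [9, 11, 13]]),
)
# --- End of editor's sketch ---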
expected_error = AssertionError expected_error_msg = "Not equal" elif dynamic: # In PyMIL's validation, the `validate_indices` will only validate indices whose values are # known during op insertion, so it will not error out at PyMIL layer, but instead, rely on # the backend to do the validation after compilation. expected_error = RuntimeError expected_error_msg = ( "Error computing NN outputs", "Unable to compute the prediction using a neural network model", "Unable to compute the prediction using ML Program", ) else: # The negative or out-of-bound indices will error out when validate_indices is set. expected_error = IndexError expected_error_msg = "Indices is out of bounds" with pytest.raises(expected_error) as excinfo: run_compare_builder( build_dynamic if dynamic else build_static, input_placeholders, input_values, expected_output_types=(2, 3, types.fp32), expected_outputs=np.array([[9, 11, 13], [9, 11, 13]], dtype=np.float32), compute_unit=compute_unit, backend=backend, ) if not isinstance(expected_error_msg, tuple): expected_error_msg = expected_error_msg assert any([err in str(excinfo.value) for err in expected_error_msg]) class TestScatterAlongAxis: @pytest.mark.parametrize( "compute_unit, backend, rank_axis", itertools.product( compute_units, backends, [(rank, axis) for rank in range(1, 5) for axis in range(-rank, rank)], ), ) def test_builder_to_backend_programmatic(self, compute_unit, backend, rank_axis): _TestScatterAlongAxisIOS14._test_builder_to_backend_programmatic( compute_unit, backend, rank_axis, force_non_negative_indices=True ) @pytest.mark.parametrize( "compute_unit, backend, indices_val, dynamic", itertools.product( compute_units, backends, [[[-1, 0, 1], [1, 1, 0]], [[1, 10, 1], [1, 1, 0]]], [True, False], ), ) def test_ios17_invalid_indices(self, compute_unit, backend, indices_val, dynamic): if ( indices_val == [[-1, 0, 1], [1, 1, 0]] and dynamic and backend.opset_version == ct.target.iOS18 ): pytest.xfail( "rdar://128089254 ([Bug][Regression] iOS18 scatter ops has unexpected behavior than iOS17)" ) def build_static(data, updates): return ( mb.scatter_along_axis( data=data, indices=np.array(indices_val, dtype=np.int32), updates=updates, validate_indices=True, ), ) def build_dynamic(data, indices, updates): return mb.scatter_along_axis( data=data, indices=indices, updates=updates, axis=0, mode="update", validate_indices=True, ) data = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) updates = np.array([[5, 6, 7], [8, 9, 10]], dtype=np.float32) input_placeholders = { "data": mb.placeholder(shape=data.shape), "updates": mb.placeholder(shape=updates.shape), } input_values = {"data": data, "updates": updates} if dynamic: indices = np.array(indices_val, dtype=np.int32) input_placeholders["indices"] = mb.placeholder(shape=indices.shape, dtype=types.int32) input_values["indices"] = indices if dynamic: expected_error = RuntimeError expected_error_msg = ( "Error computing NN outputs", "Unable to compute the prediction using a neural network model", "Unable to compute the prediction using ML Program", ) else: # The negative or out-of-bound indices will error out when validate_indices is set. expected_error = IndexError expected_error_msg = "Indices is out of bounds" # The negative or out-of-bound indices will error out when validate_indices is set. 
with pytest.raises(expected_error) as excinfo: run_compare_builder( build_dynamic if dynamic else build_static, input_placeholders, input_values, expected_output_types=(2, 3, types.fp32), expected_outputs=np.array([[1, 6, 10], [8, 9, 7]], dtype=np.float32), compute_unit=compute_unit, backend=backend, ) if not isinstance(expected_error_msg, tuple): expected_error_msg = expected_error_msg assert any([err in str(excinfo.value) for err in expected_error_msg]) class TestScatterNd: @pytest.mark.parametrize( "compute_unit, backend, indices_val, dynamic", itertools.product( compute_units, backends, [[[1, 0], [0, -1]], [[1, 0], [0, 3]]], [True, False] ), ) def test_ios17_invalid_indices(self, compute_unit, backend, indices_val, dynamic): if ( indices_val == [[1, 0], [0, -1]] and dynamic and backend.opset_version == ct.target.iOS18 ): pytest.xfail( "rdar://128089254 ([Bug][Regression] iOS18 scatter ops has unexpected behavior than iOS17)" ) def build_static(data, updates): return ( mb.scatter_nd( data=data, indices=np.array(indices_val, dtype=np.int32), updates=updates, validate_indices=True, ), ) def build_dynamic(data, indices, updates): return ( mb.scatter_nd(data=data, indices=indices, updates=updates, validate_indices=True), ) data = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) updates = np.array([5, 10], dtype=np.float32) input_placeholders = { "data": mb.placeholder(shape=data.shape), "updates": mb.placeholder(shape=updates.shape), } input_values = {"data": data, "updates": updates} if dynamic: indices = np.array(indices_val, dtype=np.int32) input_placeholders["indices"] = mb.placeholder(shape=indices.shape, dtype=types.int32) input_values["indices"] = indices if dynamic: expected_error = RuntimeError expected_error_msg = ( "Error computing NN outputs", "Unable to compute the prediction using a neural network model", "Unable to compute the prediction using ML Program", ) else: # The negative or out-of-bound indices will error out when validate_indices is set. 
expected_error = IndexError expected_error_msg = "Indices is out of bounds" with pytest.raises(expected_error) as excinfo: run_compare_builder( build_dynamic if dynamic else build_static, input_placeholders, input_values, expected_output_types=(2, 3, types.fp32), expected_outputs=np.array([[1, 2, 13], [9, 5, 6]], dtype=np.float32), compute_unit=compute_unit, backend=backend, ) if not isinstance(expected_error_msg, tuple): expected_error_msg = expected_error_msg assert any([err in str(excinfo.value) for err in expected_error_msg]) class TestGather(_TestGatherIOS16): @pytest.mark.parametrize( "compute_unit, backend, x_dtype, indices_dtype, indices_dynamic", itertools.product( compute_units, backends, [np.float32, np.float16, np.int32, np.int16, np.uint16, np.int8, np.uint8], [np.int32, np.int16, np.uint16, np.int8, np.uint8], [True, False], ), ) def test_builder_to_backend_smoke( self, compute_unit, backend, x_dtype, indices_dtype, indices_dynamic ): super().test_builder_to_backend_smoke( compute_unit, backend, x_dtype, indices_dtype, indices_dynamic ) @pytest.mark.parametrize( "backend, indices_val, validate_indices", itertools.product(backends, [[-1, 0], [0, 3]], [True, False]), ) def test_builder_invalid_indices_iOS17(self, backend, indices_val, validate_indices): def prog(x): params = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) indices = np.array(indices_val, dtype=np.int32) res = mb.gather(x=params, indices=indices, axis=-1, validate_indices=validate_indices) return res if validate_indices: with pytest.raises(IndexError, match="Indices is out of bounds for `gather` node"): mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)], opset_version=backend.opset_version, )(prog) elif any([idx > 2 for idx in indices_val]): # If the indices are not validated during type inference for IOS17, the `gather` op's # value inference will raise error for out-of-bound index. 
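# --- Editor's illustrative sketch (not part of the original test file) ---
# The value-inference error text matched just below ("index 3 is out of bounds for
# axis 1 with size 3") is NumPy's own IndexError wording: gather's value inference
# behaves like np.take along the requested axis, so an out-of-range index surfaces
# as a NumPy indexing error. Minimal reproduction with the same data:
import numpy as np

_params = np.float32([[1, 2, 3], [4, 5, 6]])
assert np.array_equal(np.take(_params, [0, 1], axis=-1), [[1, 2], [4, 5]])
try:
    np.take(_params, [0, 3], axis=-1)     # index 3 is out of range for the size-3 axis
except IndexError as e:
    assert "out of bounds" in str(e)
# --- End of editor's sketch ---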
with pytest.raises(IndexError, match="index 3 is out of bounds for axis 1 with size 3"): mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)], opset_version=backend.opset_version, )(prog) else: mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)], opset_version=backend.opset_version, )(prog) @pytest.mark.parametrize( "backend, indices_val", itertools.product(backends, [0, 1]), ) def test_builder_scalar_indices(self, backend, indices_val): @mb.program(input_specs=[], opset_version=backend.opset_version) def prog(): params = np.array([1, 2, 3, 4], dtype=np.int32) indices = np.array(indices_val, dtype=np.int32) res = mb.gather( x=params, indices=indices_val, axis=0, batch_dims=0, validate_indices=False ) return res main_func = prog.functions["main"] gather_op = main_func.find_ops(op_type="gather")[0] assert gather_op.outputs[0].val == 1 if indices_val == 0 else 2 assert gather_op.outputs[0].dtype == types.int32 class TestGatherAlongAxis: @pytest.mark.parametrize( "compute_unit, backend, rank_axis, x_dtype, indices_dtype", itertools.product( compute_units, backends, [(rank, axis) for rank in (3,) for axis in (-rank, 0, rank - 1)], [np.float32, np.float16, np.int32, np.int16, np.uint16, np.int8, np.uint8], [np.int32, np.int16, np.uint16, np.int8, np.uint8], ), ) def test_builder_to_backend_programmatic( self, compute_unit, backend, rank_axis, x_dtype, indices_dtype ): _TestGatherAlongAxisIOS14._test_builder_to_backend_programmatic( compute_unit, backend, rank_axis, x_dtype, indices_dtype, True ) @pytest.mark.parametrize( "backend, indices_val, validate_indices", itertools.product( backends, [[[1, 0, -1], [0, 0, 1]], [[1, 0, 1], [0, 0, 2]]], [True, False], ), ) def test_builder_invalid_indices(self, backend, indices_val, validate_indices): def prog(x): params = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) indices = np.array(indices_val, dtype=np.int32) res = mb.gather_along_axis( x=params, indices=indices, axis=0, validate_indices=validate_indices ) return res if validate_indices: with pytest.raises( IndexError, match="Indices is out of bounds for `gather_along_axis` node" ): mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)], opset_version=backend.opset_version, )(prog) elif any([idx > 1 for sub_indices in indices_val for idx in sub_indices]): # If the indices are not validated during type inference for IOS17, the `gather` op's # value inference will raise error for out-of-bound index. 
with pytest.raises(IndexError, match="index 2 is out of bounds for axis 0 with size 2"): mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)], opset_version=backend.opset_version, )(prog) else: mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)], opset_version=backend.opset_version, )(prog) class TestGatherNd(_TestGatherNdIOS16): @pytest.mark.parametrize( "compute_unit, backend, x_dtype, indices_dtype", itertools.product( compute_units, backends, [np.float32, np.float16, np.int32, np.int16, np.uint16, np.int8, np.uint8], [np.int32, np.int16, np.uint16, np.int8, np.uint8], ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, x_dtype, indices_dtype): super().test_builder_to_backend_smoke(compute_unit, backend, x_dtype, indices_dtype) @pytest.mark.parametrize( "backend, indices_val, validate_indices", itertools.product( backends, [[[-1], [2]], [[1], [3]]], [True, False], ), ) def test_builder_invalid_indices(self, backend, indices_val, validate_indices): def prog(x): params = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) indices = np.array(indices_val, dtype=np.int32) res = mb.gather_nd( x=params, indices=indices, batch_dims=1, validate_indices=validate_indices ) return res if validate_indices: with pytest.raises(IndexError, match="Indices is out of bounds for `gather_nd` node"): mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)], opset_version=backend.opset_version, )(prog) else: mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)], opset_version=backend.opset_version, )(prog) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS17/test_tensor_operation.py0000644000000000000000000001240014672066616030402 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.ops.tests.iOS17 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import run_compare_builder from coremltools.converters.mil.testing_reqs import compute_units class TestTopK: @pytest.mark.parametrize( "compute_unit, backend, x_dtype, k_dtype", itertools.product( compute_units, backends, [np.float16, np.float32, np.int8, np.int16, np.int32, np.uint8, np.uint16], [np.int8, np.int16, np.int32], ), ) def test_ios17_different_dtypes(self, compute_unit, backend, x_dtype, k_dtype): def build(x): return mb.topk(x=x, k=k_dtype(2), axis=1) val = np.array([[2, 3, 1], [5, 4, 6]], dtype=x_dtype) x_mb_dtype = types.numpy_type_to_builtin_type(x_dtype) input_placeholders = {"x": mb.placeholder(shape=val.shape, dtype=x_mb_dtype)} input_values = {"x": val} # As int16 is not in CoreML I/O supported dtypes, it will be cast to int32. 
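# --- Editor's illustrative sketch (not part of the original test file) ---
# NumPy reference for the top-k values/indices the dtype tests above compare against:
# the k largest entries along the given axis, in descending order, matching the
# expected outputs [[3, 2], [6, 5]] and [[1, 0], [2, 0]] for input [[2, 3, 1], [5, 4, 6]].
# The helper below is editor-added, not the op's actual implementation.
import numpy as np

def topk_reference(x, k, axis=-1):
    order = np.argsort(-x, axis=axis, kind="stable")       # descending sort order
    idx = np.take(order, np.arange(k), axis=axis)           # keep the first k positions
    return np.take_along_axis(x, idx, axis=axis), idx

_vals, _idx = topk_reference(np.int32([[2, 3, 1], [5, 4, 6]]), k=2, axis=1)
assert np.array_equal(_vals, [[3, 2], [6, 5]])
assert np.array_equal(_idx, [[1, 0], [2, 0]])
# --- End of editor's sketch ---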
expected_output_types = [(2, 2, x_mb_dtype), (2, 2, types.int32)] expected_outputs = [ np.array([[3, 2], [6, 5]], dtype=x_dtype), np.array([[1, 0], [2, 0]], dtype=np.int32), ] mlmodel = run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) prog = mlmodel._mil_program topk_op = prog["main"].find_ops(op_type="topk")[0] expected_x_dtype = x_mb_dtype if backend.precision == "fp16" and types.is_float(x_mb_dtype): expected_x_dtype = types.fp16 assert types.builtin_to_string(topk_op.x.dtype) == types.builtin_to_string(expected_x_dtype) @pytest.mark.parametrize( "compute_unit, backend, output_indices_dtype", itertools.product( compute_units, backends, ["int32", "uint16", None], ), ) def test_ios17_output_indices_dtype(self, compute_unit, backend, output_indices_dtype): def build(x): return mb.topk(x=x, k=2, axis=1, output_indices_dtype=output_indices_dtype) val = np.array([[2, 3, 1], [5, 4, 6]], dtype=np.int32) input_placeholders = {"x": mb.placeholder(shape=val.shape, dtype=types.int32)} input_values = {"x": val} expected_output_types = [(2, 2, types.int32), (2, 2, types.int32)] expected_outputs = [ np.array([[3, 2], [6, 5]], dtype=np.int32), np.array([[1, 0], [2, 0]], dtype=np.int32), ] mlmodel = run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) prog = mlmodel._mil_program topk_op = prog["main"].find_ops(op_type="topk")[0] # If output_indices_dtype is not set, the output should be in type int32 expected_output_indices_dtype = "int32" if output_indices_dtype is not None: expected_output_indices_dtype = output_indices_dtype assert types.builtin_to_string(topk_op.outputs[1].dtype) == expected_output_indices_dtype @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_ios17_invalid_output_indices_dtype(self, compute_unit, backend): def build(x): return mb.topk(x=x, k=2, axis=1, output_indices_dtype="dummy") val = np.array([[2, 3, 1], [5, 4, 6]], dtype=np.int32) with pytest.raises(ValueError, match="invalid output_indices_dtype"): run_compare_builder( build, input_placeholders={"x": mb.placeholder(shape=val.shape, dtype=types.int32)}, input_values={"x": val}, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_ios17_redundant_output_indices_dtype_early_error_out(self, compute_unit, backend): def build(x): return mb.topk(x=x, k=2, axis=1, return_indices=False, output_indices_dtype="int32") val = np.array([[2, 3, 1], [5, 4, 6]], dtype=np.int32) with pytest.raises( ValueError, match='"output_indices_dtype" can only be set when "return_indices=True"' ): run_compare_builder( build, input_placeholders={"x": mb.placeholder(shape=val.shape, dtype=types.int32)}, input_values={"x": val}, compute_unit=compute_unit, backend=backend, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS17/test_tensor_transformation.py0000644000000000000000000003245514672066616031464 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.ops.tests.iOS14.test_tensor_transformation import ( TestSliceByIndex as _TestSliceByIndexIos14, ) from coremltools.converters.mil.mil.ops.tests.iOS14.test_tensor_transformation import ( TestSliceBySize as _TestSliceBySizeIos14, ) from coremltools.converters.mil.mil.ops.tests.iOS16.test_tensor_transformation import ( TestReshapeLike as _TestReshapeLike_iOS16, ) from coremltools.converters.mil.mil.ops.tests.iOS17 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import run_compare_builder from coremltools.converters.mil.mil.types.type_mapping import numpy_type_to_builtin_type from coremltools.converters.mil.testing_reqs import compute_units class TestReshape: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_reshape_with_zero_different_len_iOS17(self, compute_unit, backend): t = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return [mb.reshape(x=x, shape=[1, 0, -1, 0])] # In IOS17 it accepts different length. expected_output_types = [(1, 1, 2, 3, types.fp32)] expected_outputs = [np.array([[[[1, 2, 3], [4, 5, 6]]]], dtype=np.float32)] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_reshape_invalid_with_zero(self, compute_unit, backend): t = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32) input_placeholders = {"x": mb.placeholder(shape=t.shape)} input_values = {"x": t} def build(x): return [mb.reshape(x=x, shape=[4, 0, -1, 0])] with pytest.raises(ValueError, match="Invalid target shape in `reshape` op"): run_compare_builder( build, input_placeholders, input_values, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, x_dtype, shape_dtype", itertools.product( compute_units, backends, [np.int8, np.uint8, np.int16, np.uint16, np.int32, np.float16, np.float32], [np.int8, np.int16, np.int32], ), ) def test_reshape_ios17_different_data_types(self, compute_unit, backend, x_dtype, shape_dtype): x_val = np.array([[1, 2, 3], [4, 5, 6]], dtype=x_dtype) target_shape = np.array([1, 6], dtype=shape_dtype) x_builtin_dtype = numpy_type_to_builtin_type(x_dtype) input_placeholders = {"x": mb.placeholder(shape=x_val.shape, dtype=x_builtin_dtype)} input_values = {"x": x_val} def build(x): return mb.reshape(x=x, shape=target_shape) expected_output_types = (1, 6, x_builtin_dtype) expected_outputs = np.array([[1, 2, 3, 4, 5, 6]], dtype=x_dtype) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestReshapeLike(_TestReshapeLike_iOS16): @pytest.mark.parametrize( "compute_unit, backend, InputShape_RefShapes_Begins_Ends_EndMasks, x_dtype, ref_dtype", itertools.product( compute_units, backends, [ [(4, 3), ((2, 2, 3), (1, 3)), (0, 1), (2, 2), (False, False)], [(32,), ((1, 2, 2, 2), (3, 2, 2)), (1, 1), (0, 0), (True, True)], [(72, 1), ((1, 2, 
3, 4, 1), (3,)), (1, 0), (0, 1), (True, False)], ], [np.int8, np.uint8, np.int16, np.uint16, np.int32, np.float16, np.float32, bool], [np.int8, np.uint8, np.int16, np.uint16, np.int32, np.float16, np.float32, bool], ), ) def test_builder_to_backend_smoke( self, compute_unit, backend, InputShape_RefShapes_Begins_Ends_EndMasks, x_dtype, ref_dtype, ): super().test_builder_to_backend_smoke( compute_unit, backend, InputShape_RefShapes_Begins_Ends_EndMasks, x_dtype, ref_dtype ) class TestExpandDims: @pytest.mark.parametrize( "compute_unit, backend, x_dtype", itertools.product( compute_units, backends, [np.int8, np.uint8, np.int16, np.uint16, np.int32, np.float16, np.float32], ), ) def test_expand_dims_different_data_types(self, compute_unit, backend, x_dtype): axis = 1 x_val = np.random.randint(low=2, high=6, size=(2, 3, 4)).astype(x_dtype) x_builtin_dtype = numpy_type_to_builtin_type(x_dtype) input_placeholders = {"x": mb.placeholder(shape=x_val.shape, dtype=x_builtin_dtype)} input_values = {"x": x_val} def build(x): return mb.expand_dims(x=x, axes=[axis]) x_shape = list(x_val.shape) out_shape = x_shape[:axis] + [1] + x_shape[axis:] expected_output_types = tuple(out_shape[:]) + (x_builtin_dtype,) expected_outputs = np.expand_dims(input_values["x"], axis) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestReverse: @pytest.mark.parametrize( "compute_unit, backend, x_dtype", itertools.product( compute_units, backends, [np.int8, np.uint8, np.int16, np.uint16, np.int32, np.float16, np.float32], ), ) def test_reverse_different_data_types(self, compute_unit, backend, x_dtype): def build(x): return [mb.reverse(x=x), mb.reverse(x=x, axes=[0])] x_val = np.array([[1, 2, 3], [4, 5, 6]], dtype=x_dtype) x_builtin_dtype = numpy_type_to_builtin_type(x_dtype) input_placeholders = {"x": mb.placeholder(shape=x_val.shape, dtype=x_builtin_dtype)} input_values = {"x": x_val} expected_output_types = [(2, 3, x_builtin_dtype), (2, 3, x_builtin_dtype)] expected_outputs = [ np.array([[6, 5, 4], [3, 2, 1]], dtype=x_dtype), np.array([[4, 5, 6], [1, 2, 3]], dtype=x_dtype), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestReverseSequence: @pytest.mark.parametrize( "compute_unit, backend, x_dtype, length_dtype", itertools.product( compute_units, backends, [np.int8, np.uint8, np.int16, np.uint16, np.int32, np.float16, np.float32], [np.int8, np.int16, np.int32], ), ) def test_reverse_sequence_different_data_types( self, compute_unit, backend, x_dtype, length_dtype ): def build(x, length): return mb.reverse_sequence(x=x, lengths=length, seq_axis=1, batch_axis=0) x_val = np.array( [ [1, 2, 3, 4, 5, 0, 0, 0], [1, 2, 0, 0, 0, 0, 0, 0], [1, 2, 3, 4, 0, 0, 0, 0], [1, 2, 3, 4, 5, 6, 7, 8], ], dtype=x_dtype, ) length_val = np.array([7, 2, 3, 5], dtype=length_dtype) x_builtin_dtype = numpy_type_to_builtin_type(x_dtype) length_builtin_dtype = numpy_type_to_builtin_type(length_dtype) input_placeholders = { "x": mb.placeholder(shape=x_val.shape, dtype=x_builtin_dtype), "length": mb.placeholder(shape=length_val.shape, dtype=length_builtin_dtype), } input_values = {"x": x_val, "length": length_val} expected_output_types = (4, 8, x_builtin_dtype) expected_outputs = np.array( [ [0, 0, 5, 4, 3, 2, 1, 0], [2, 1, 0, 0, 0, 0, 0, 0], [3, 2, 1, 4, 0, 0, 0, 0], [5, 4, 3, 2, 1, 6, 7, 8], ], dtype=x_dtype, ) run_compare_builder( build, 
input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestSqueeze: @pytest.mark.parametrize( "compute_unit, backend, x_dtype", itertools.product( compute_units, backends, [np.int8, np.uint8, np.int16, np.uint16, np.int32, np.float16, np.float32], ), ) def test_squeeze_different_data_types(self, compute_unit, backend, x_dtype): def build(x): return mb.squeeze(x=x, axes=(-1,)) x_val = np.array([[[[1], [2], [3]]]], dtype=x_dtype) x_builtin_dtype = numpy_type_to_builtin_type(x_dtype) input_placeholders = {"x": mb.placeholder(shape=x_val.shape, dtype=x_builtin_dtype)} input_values = {"x": x_val} expected_outputs = np.squeeze(x_val, -1) expected_output_types = tuple(expected_outputs.shape) + (x_builtin_dtype,) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestTranspose: @pytest.mark.parametrize( "compute_unit, backend, x_dtype", itertools.product( compute_units, backends, [np.int8, np.uint8, np.int16, np.uint16, np.int32, np.float16, np.float32], ), ) def test_transpose_different_data_types(self, compute_unit, backend, x_dtype): def build(x): return mb.transpose(x=x, perm=(-1, 0)) x_val = np.array([[1, 2, 3], [4, 5, 6]], dtype=x_dtype) x_builtin_dtype = numpy_type_to_builtin_type(x_dtype) run_compare_builder( build, input_placeholders={"x": mb.placeholder(shape=x_val.shape, dtype=x_builtin_dtype)}, input_values={"x": x_val}, expected_output_types=(3, 2, types.fp32), expected_outputs=x_val.T, compute_unit=compute_unit, backend=backend, ) class TestSlidingWindows: @pytest.mark.parametrize( "compute_unit, backend, x_dtype", itertools.product( compute_units, backends, [np.int8, np.uint8, np.int16, np.uint16, np.int32, np.float16, np.float32], ), ) def test_ios17_different_data_types(self, compute_unit, backend, x_dtype): def build(x): return mb.sliding_windows(x=x, axis=1, size=2) x_val = np.array([[[[9.0]], [[5.0]], [[1.0]], [[3.0]]]], dtype=x_dtype) x_builtin_dtype = numpy_type_to_builtin_type(x_dtype) input_placeholders = {"x": mb.placeholder(shape=x_val.shape, dtype=x_builtin_dtype)} input_values = {"x": x_val} expected_output_types = (1, 3, 2, 1, 1, x_builtin_dtype) expected_outputs = np.array( [[[[[9.0]], [[5.0]]], [[[5.0]], [[1.0]]], [[[1.0]], [[3.0]]]]], dtype=x_dtype, ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestSliceByIndex(_TestSliceByIndexIos14): @pytest.mark.parametrize( "compute_unit, backend, x_dtype, idx_dtype", itertools.product( compute_units, backends, (np.float16, np.float32, np.int8, np.int16, np.int32, np.uint8, np.uint16), (np.int8, np.int16, np.int32), ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, x_dtype, idx_dtype): super().test_builder_to_backend_smoke(compute_unit, backend, x_dtype, idx_dtype) class TestSliceBySize(_TestSliceBySizeIos14): @pytest.mark.parametrize( "compute_unit, backend, size_val, x_dtype, idx_dtype", itertools.product( compute_units, backends, ([1, 2, 3], [-1, 2, -1]), (np.float16, np.float32, np.int8, np.int16, np.int32, np.uint8, np.uint16), (np.int8, np.int16, np.int32), ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, size_val, x_dtype, idx_dtype): super().test_builder_to_backend_smoke(compute_unit, backend, size_val, x_dtype, idx_dtype) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 
mtime=1726511965.2415469 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS18/0000755000000000000000000000000014672075535023403 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS18/__init__.py0000644000000000000000000000061714672066616025520 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import coremltools as ct from coremltools.converters.mil.testing_reqs import backends_internal, clean_up_backends backends = clean_up_backends(backends_internal, ct.target.iOS18) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS18/test_compression.py0000644000000000000000000022747714672066616027400 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import math import re from typing import List, Tuple import numpy as np import pytest from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.ops.defs._utils import promote_input_dtypes from coremltools.converters.mil.mil.ops.defs.iOS18 import ( _IOS18_TARGET, constexpr_blockwise_shift_scale, constexpr_lut_to_dense, constexpr_lut_to_sparse, constexpr_sparse_blockwise_shift_scale, constexpr_sparse_to_dense, ) from coremltools.converters.mil.mil.ops.tests.iOS18 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import run_compare_builder from coremltools.converters.mil.testing_reqs import compute_units def _convert_to_sub_byte_dtype(data: np.ndarray, sub_byte_dtype: type) -> np.ndarray: """Convert data to a specific sub-byte dtype, including shift between signed and unsigned range.""" if not np.issubdtype(data.dtype, np.integer): raise ValueError("Input data must be integer.") if not types.is_sub_byte(sub_byte_dtype): raise ValueError("Target dtype must be a sub-byte dtype.") original_signed = np.issubdtype(data.dtype, np.signedinteger) target_signed = not sub_byte_dtype.is_unsigned() if original_signed != target_signed: shift = 2 ** (sub_byte_dtype.get_bitwidth() - 1) if original_signed: data += shift else: data -= shift dtype_range = types.type_mapping.builtin_to_range(sub_byte_dtype) if np.max(data) > dtype_range.high: raise ValueError( f"Data has element {np.max(data)}, which is larger than the lower-bound {dtype_range.high}" ) if np.min(data) < dtype_range.low: raise ValueError( f"Data has element {np.min(data)}, which is smaller than the lower-bound {dtype_range.low}" ) return data.astype(types.nptype_from_builtin(sub_byte_dtype)) def _infer_lut_shape( indices_shape: Tuple[int, ...], block_sizes: Tuple[int, ...], nbits: int, vector_size: int ): """Infer the shape of look-up-table (LUT).""" lut_shape = [] for axis, dim_size in enumerate(indices_shape): lut_dim_size = 1 if block_sizes[axis] == 0 else dim_size // block_sizes[axis] lut_shape.append(lut_dim_size) lut_shape.extend([2**nbits, vector_size]) return lut_shape class TestConstexprBlockwiseDequantize: def test_builder_eval_basic_8bit(self): @mb.program(input_specs=[], 
opset_version=_IOS18_TARGET) def prog(): return mb.constexpr_blockwise_shift_scale( data=np.array([4, 8, 10, 13, 24, 5, 6, 9]).reshape((1, 2, 4)).astype(np.int8), scale=np.array([4, 8]).reshape((1, 1, 2)).astype(np.float16), offset=np.array([4, 0]).reshape((1, 1, 2)).astype(np.int8), ) main_func = prog.functions["main"] constexpr_blockwise_shift_scale_op = main_func.find_ops( op_type="constexpr_blockwise_shift_scale" )[0] decompressed_res = ( np.array([0, 16, 80, 104, 80, 4, 48, 72]).reshape((1, 2, 4)).astype(np.float16) ) np.testing.assert_allclose( decompressed_res, constexpr_blockwise_shift_scale_op.outputs[0].op.materialized_val_inference(), ) @pytest.mark.parametrize( "scale_shape_output, quantized_dtype", itertools.product( [ ((1, 1, 2), [0, -16, -64, 0, -40, -16, -24, 0]), ((1, 2, 1), [0, -16, -48, -16, -48, 0, -24, 0]), ], ["int4", "uint4"], ), ) def test_builder_eval_basic_4bit( self, scale_shape_output: Tuple[Tuple[int], List[int]], quantized_dtype: str ): quantized_dtype = types.string_to_builtin(quantized_dtype) scale_shape, expected_output = scale_shape_output @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): quantized_data = _convert_to_sub_byte_dtype( np.array([4, 0, -8, 0, -6, 0, -3, 0]).reshape((1, 2, 4)), quantized_dtype ) offset = _convert_to_sub_byte_dtype( np.array([4, 0]).reshape(scale_shape), quantized_dtype ) quantized_data = mb.const(val=quantized_data, name="quantized_data") offset = mb.const(val=offset, name="offset") return mb.constexpr_blockwise_shift_scale( data=quantized_data, scale=np.array([4, 8]).reshape(scale_shape).astype(np.float32), offset=offset, ) constexpr_blockwise_shift_scale_op = prog.functions["main"].find_ops( op_type="constexpr_blockwise_shift_scale" )[0] np.testing.assert_allclose( np.array(expected_output).reshape((1, 2, 4)).astype(np.float32), constexpr_blockwise_shift_scale_op.outputs[0].op.materialized_val_inference(), ) def test_builder_eval_basic_no_offset(self): @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): quantized_data = mb.const( val=np.array([4, 0, -8, 0, -6, 0, -3, 0]) .reshape((1, 2, 4)) .astype(types.np_int4_dtype), name="quantized_data", ) return mb.constexpr_blockwise_shift_scale( data=quantized_data, scale=np.array([4, 8]).reshape((1, 1, 2)).astype(np.float32), ) constexpr_blockwise_shift_scale_op = prog.functions["main"].find_ops( op_type="constexpr_blockwise_shift_scale" )[0] np.testing.assert_allclose( np.array([16, 0, -64, 0, -24, 0, -24, 0]).reshape((1, 2, 4)).astype(np.float32), constexpr_blockwise_shift_scale_op.outputs[0].op.materialized_val_inference(), ) @pytest.mark.parametrize( "nbits, block_size, mode", itertools.product( (4, 8), (1, 2, 4), ("linear_symmetric", "linear"), ), ) def test_builder_eval_numerical_stress(self, nbits, block_size, mode): nbits_range_max = 2 ** (nbits - 1) - 1 nbits_range_min = -nbits_range_max if mode == "linear": nbits_range_min -= 1 nbits_range = nbits_range_max - nbits_range_min # As small-bit quantization has a lot of information loss, we use int input to make the # information loss less critical when comparing the dequantized data with original data. 
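# (Added sketch, not part of the original test file.) The loop below emulates the
# usual per-block affine quantization, assuming the standard linear formulas:
#     scale  = (max(block) - min(block)) / (q_max - q_min)
#     offset = (q_min * max(block) - q_max * min(block)) / (max(block) - min(block))
#     q      = clip(round(x / scale + offset), q_min, q_max)
# so dequantization recovers x_hat = scale * (q - offset). In "linear_symmetric"
# mode the offset is 0 and the block range is taken as max(abs(block)) * 2.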
original_data = ( np.random.randn(2, 3, 8) if block_size == 1 else np.random.randint(nbits_range_min, nbits_range_max, (2, 3, 8)) ) scaled_data = original_data.flatten() scales = [] zero_points = [] for i in range(0, scaled_data.size, block_size): block_data = scaled_data[i : i + block_size] offset = 0 if mode == "linear_symmetric": block_range = np.max(np.abs(block_data)) * 2 else: assert mode == "linear" # For the linear mode, we need to make sure the data range contains `0`. block_max = np.maximum(0.0, np.max(block_data)) block_min = np.minimum(0.0, np.min(block_data)) block_range = block_max - block_min offset = ( (nbits_range_min * block_max - nbits_range_max * block_min) / block_range if block_range != 0.0 else 0.0 ) zero_points.append(offset) block_scale = block_range / nbits_range scales.append(block_scale) scaled_data[i : i + block_size] = np.round(block_data / block_scale + offset) scaled_data = np.minimum(scaled_data, nbits_range_max) scaled_data = np.maximum(scaled_data, nbits_range_min) scaled_data = scaled_data.reshape(original_data.shape).astype(np.int8) scales_shape = original_data.shape[:-1] + (original_data.shape[-1] // block_size,) @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): quantized_data = scaled_data if nbits == 4: quantized_data = mb.const(val=quantized_data.astype(types.np_int4_dtype)) return mb.constexpr_blockwise_shift_scale( data=quantized_data, scale=np.array(scales).reshape(scales_shape).astype(np.float32), offset=None if mode == "linear_symmetric" else np.array(zero_points).reshape(scales_shape).astype(np.float32), ) constexpr_blockwise_shift_scale_op = prog.functions["main"].find_ops( op_type="constexpr_blockwise_shift_scale" )[0] if block_size == 1: # With block_size==1, the quantization will not have information loss. atol, rtol = 1e-06, 1e-06 elif nbits > 4 and block_size < 3: # When block size is small and nbits is large, the information loss is limited. atol, rtol = 1e-04, 1e-04 else: atol, rtol = 1e-02, 1e-02 dequantized_data = constexpr_blockwise_shift_scale_op.outputs[ 0 ].op.materialized_val_inference() if np.issubdtype(original_data.dtype, np.integer): dequantized_data = np.round(dequantized_data) np.testing.assert_allclose( original_data, dequantized_data, atol=atol, rtol=rtol, ) def test_builder_eval_invalid_parameter(self): with pytest.raises( ValueError, match=r"Parameter 'data' needs to have at least rank 1, but got scalar." 
): @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): return mb.constexpr_blockwise_shift_scale( data=np.int8(10), scale=np.float32(2.0), ) with pytest.raises( ValueError, match=r"Parameter 'data' and 'scale' need to have the same rank" ): @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): return mb.constexpr_blockwise_shift_scale( data=np.int8(10), scale=np.array([1, 2]).astype(np.float32), ) with pytest.raises( ValueError, match=r"Number of scales along each dimension should be a " r"factor of corresponding dimension size of 'data'.", ): @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): return mb.constexpr_blockwise_shift_scale( data=np.array([1, 2]).reshape((1, 2)).astype(np.int8), scale=np.array([1, 2]).reshape((2, 1)).astype(np.float16), ) with pytest.raises( ValueError, match=r"Invalid parameter 'offset'; the shape of 'offset' " r"should match the shape of 'scale'", ): @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): return mb.constexpr_blockwise_shift_scale( data=np.array([1, 2]).astype(np.int8), scale=np.array([1, 2]).astype(np.float16), offset=np.array([1, 2]).reshape((1, 2)).astype(np.int8), ) with pytest.raises( ValueError, match=r"Invalid parameter 'offset'; the dtype of 'offset' " r"should match the dtype of 'data'", ): @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): return mb.constexpr_blockwise_shift_scale( data=np.array([1, 2]).astype(types.nptype_from_builtin(types.int4)), scale=np.array([1, 2]).astype(np.float16), offset=np.array([1, 2]).astype(np.int8), ) # When the offset is float, it doesn't need to have the same dtype as data. @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): return mb.constexpr_blockwise_shift_scale( data=np.array([1, 2]).astype(types.nptype_from_builtin(types.int4)), scale=np.array([1, 2]).astype(np.float16), offset=np.array([1, 2]).astype(np.float32), ) @pytest.mark.parametrize( "compute_unit, backend, nbits, has_offset", itertools.product(compute_units, backends, [4, 8], [True, False]), ) def test_builder_to_backend_smoke(self, compute_unit, backend, nbits, has_offset): x_val = np.ones(1).astype(np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} np_dtype = types.nptype_from_builtin(types.string_to_builtin(f"int{nbits}")) if nbits == 8: data_val = [4, 8, 10, 13, 24, 5, 6, 9] elif nbits == 4: data_val = [2, 3, 5, 7, 6, 5, 3, 1] data = np.array(data_val).reshape((1, 2, 4)).astype(np_dtype) if has_offset is True: if nbits == 8: offset_val = [4, 0] elif nbits == 4: offset_val = [1, 0] else: offset_val = [0, 0] offset = np.array(offset_val).reshape((1, 1, 2)).astype(np_dtype) scale = np.array([1, 2]).reshape((1, 1, 2)).astype(np.float32) # Calculate expected output based on op definition. 
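# (Added note, not part of the original test file.) The nested loops below spell out
# the blockwise shift-scale dequantization rule:
#     out[i, j, k] = scale[i0, j0, k0] * (data[i, j, k] - offset[i0, j0, k0])
# where i0 = floor(i / (data.shape[0] / scale.shape[0])), and similarly for j0, k0,
# i.e. each scale/offset entry covers one block of `data`. The trailing "+ 1" comes
# from the subsequent add with the all-ones input x.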
expected_output = np.zeros(data.shape) for n in range(0, 1): for i in range(0, data.shape[0]): for j in range(0, data.shape[1]): for k in range(0, data.shape[2]): i0 = math.floor(i / (data.shape[0] / scale.shape[0])) j0 = math.floor(j / (data.shape[1] / scale.shape[1])) k0 = math.floor(k / (data.shape[2] / scale.shape[2])) expected_output[i][j][k] = ( scale[i0][j0][k0] * (data[i][j][k] - offset[i0][j0][k0]) + 1 ) def build(x): output = mb.constexpr_blockwise_shift_scale( data=data, scale=scale, offset=offset, ) return mb.add(x=x, y=output) run_compare_builder( build, input_placeholders, input_values={"x": x_val}, expected_output_types=expected_output.shape + (types.fp32,), expected_outputs=expected_output, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, dtype, block_sizes, has_offset", itertools.product( compute_units, backends, ["int4", "uint4", "int8", "uint8", "fp16"], [(0, 1, 1, 1), (0, 0, 0, 2), (0, 0, 0, 0), (1, 1, 1, 1), (0, 4, 2, 0), (4, 8, 16, 8)], [True, False], ), ) def test_builder_to_backend_stress(self, compute_unit, backend, dtype, block_sizes, has_offset): """ Use constexpr_blockwise_shift_scale op's value inference to check backends outputs. Following combinations will fail if enable BNNS (rdar://125854036). - dtype = 'uint4'/'int4', block_sizes = (1, 1, 1, 1) - dtype = 'uint4'/'int4', block_sizes = (0, 1, 1, 1) """ quantized_data_shape = (4, 8, 16, 8) builtin_dtype = types.string_to_builtin(dtype) np_dtype = types.nptype_from_builtin(builtin_dtype) if types.is_int(builtin_dtype): data_range = types.type_mapping.builtin_to_range(builtin_dtype) quantized_data = np.random.randint( low=data_range.low, high=data_range.high + 1, size=quantized_data_shape ).astype(np_dtype) else: quantized_data = np.random.rand(*quantized_data_shape).astype(np_dtype) scale_shape = [ 1 if block_sizes[axis] == 0 else dim_size // block_sizes[axis] for axis, dim_size in enumerate(quantized_data.shape) ] scale = np.random.rand(*scale_shape) offset = None if has_offset: if types.is_int(builtin_dtype): offset = np.random.randint( low=data_range.low, high=data_range.high + 1, size=scale.shape ).astype(np_dtype) else: offset = np.random.rand(*scale.shape).astype(np_dtype) def build(x): output = mb.constexpr_blockwise_shift_scale( data=quantized_data, scale=scale, offset=offset, ) return mb.add(x=x, y=output) x_val = np.ones_like(quantized_data).astype(np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} expected_output = ( constexpr_blockwise_shift_scale.decompress(quantized_data, scale, offset) + 1 ) run_compare_builder( build, input_placeholders, input_values={"x": x_val}, expected_output_types=expected_output.shape + (types.fp32,), expected_outputs=expected_output, compute_unit=compute_unit, backend=backend, ) class TestConstexprLut: @staticmethod def _pad_lut_for_nbits_requirements(lut: np.ndarray, nbits: int): """ Make the number of palettes in lut size (second last dim) meet the 2^nbits requirement. This util function is needed before we add all uint sub-byte dtypes. 
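For example (illustrative note, not in the original docstring): with nbits=3 and a lut whose palette axis (second-last dim) currently holds 4 entries, 4 all-zero entries are appended so that axis reaches 2**3 = 8.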
""" pad_shape = lut.shape[:-2] + (2**nbits - lut.shape[-2], lut.shape[-1]) return np.concatenate((lut, np.zeros(pad_shape)), axis=-2) @staticmethod def _generate_lut(shape: Tuple[int, ...]): """It follows the MIL test cases.""" total_num = np.prod(shape) lut = np.arange(min(total_num, 128)) if total_num > lut.size: lut = np.concatenate((lut, np.ones(total_num - lut.size) * 127)) return lut.reshape(shape) @pytest.mark.parametrize("nbits", [1, 2, 3, 4, 6, 8]) def test_builder_eval_channelwise_lut(self, nbits): """ Test channel-wise lut with first axis as channel axis (the first dim of lut has size > 1). indices = tensor>([2, 3, 3, 0, 1, 0, 3, 0, 2, 1, 0, 3]) lut = tensor([1, 5, 9, 13, 2, 10, 18, 26]) It is effectively a 2-group 2-bit scalar palettization. The output shape would be [6, 2], which is the same as the indices shape. The output tensor values are: [[lut0[2]->9, lut0[3]->13], [lut0[3]->13, lut0[0]->1], [lut0[1]->5, lut0[0]->1], [lut1[3]->26, lut1[0]->2], [lut1[2]->18, lut1[1]->10], [lut1[0]->2, lut1[3]->26]] """ @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): if nbits == 1: indices = np.array([0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1]).reshape((6, 2)) lut = np.array([1, 5, 9, 13]).reshape((2, 1, 2, 1)).astype(np.int8) else: indices = np.array([2, 3, 3, 0, 1, 0, 3, 0, 2, 1, 0, 3]).reshape((6, 2)) lut = self._pad_lut_for_nbits_requirements( np.array([1, 5, 9, 13, 2, 10, 18, 26]).reshape((2, 1, 4, 1)).astype(np.int8), nbits=nbits, ) indices_np_dtype = types.nptype_from_builtin(types.string_to_builtin(f"uint{nbits}")) indices = indices.astype(indices_np_dtype) return mb.constexpr_lut_to_dense(indices=indices, lut=lut) constexpr_lut_to_dense_op = prog.functions["main"].find_ops( op_type="constexpr_lut_to_dense" )[0] if nbits == 1: decompressed_res = np.array([1, 5, 5, 1, 5, 1, 13, 9, 13, 13, 9, 13]) else: decompressed_res = np.array([9, 13, 13, 1, 5, 1, 26, 2, 18, 10, 2, 26]) decompressed_res = decompressed_res.reshape((6, 2)).astype(np.int8) np.testing.assert_allclose( decompressed_res, constexpr_lut_to_dense_op.outputs[0].op.materialized_val_inference() ) @pytest.mark.parametrize("vector_axis", (0, 1, 2, -1)) def test_builder_eval_vector_lut(self, vector_axis): """ Test vector lut on different axis. 
indices = [ [ [4, 8], -> group 0 [10, 13], -> group 0 [24, 5], -> group 1 [6, 9] -> group 1 ], [ [13, 31], -> group 0 [17, 7], -> group 0 [2, 8], -> group 1 [3, 1] -> group 1 ] ] """ @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): return mb.constexpr_lut_to_dense( indices=np.array([4, 8, 10, 13, 24, 5, 6, 9, 13, 31, 17, 7, 2, 8, 3, 1]) .reshape((2, 4, 2)) .astype(np.uint8), lut=self._generate_lut(shape=(1, 2, 1, 256, 3)), vector_axis=vector_axis, ) constexpr_lut_to_dense_op = prog.functions["main"].find_ops( op_type="constexpr_lut_to_dense" )[0] if vector_axis == 0: decompressed_res = ( np.array( [ 12, 24, 30, 39, 127, 127, 127, 127, 13, 25, 31, 40, 127, 127, 127, 127, 14, 26, 32, 41, 127, 127, 127, 127, 39, 93, 51, 21, 127, 127, 127, 127, 40, 94, 52, 22, 127, 127, 127, 127, 41, 95, 53, 23, 127, 127, 127, 127, ] ) .reshape((2 * 3, 4, 2)) .astype(np.int8) ) elif vector_axis == 1: decompressed_res = ( np.array( [ 12, 24, 13, 25, 14, 26, 30, 39, 31, 40, 32, 41, 127, 127, 127, 127, 127, 127, 127, 127, 127, 127, 127, 127, 39, 93, 40, 94, 41, 95, 51, 21, 52, 22, 53, 23, 127, 127, 127, 127, 127, 127, 127, 127, 127, 127, 127, 127, ] ) .reshape((2, 4 * 3, 2)) .astype(np.int8) ) else: decompressed_res = ( np.array( [ 12, 13, 14, 24, 25, 26, 30, 31, 32, 39, 40, 41, 127, 127, 127, 127, 127, 127, 127, 127, 127, 127, 127, 127, 39, 40, 41, 93, 94, 95, 51, 52, 53, 21, 22, 23, 127, 127, 127, 127, 127, 127, 127, 127, 127, 127, 127, 127, ] ) .reshape((2, 4, 2 * 3)) .astype(np.int8) ) np.testing.assert_allclose( decompressed_res, constexpr_lut_to_dense_op.outputs[0].op.materialized_val_inference() ) @pytest.mark.parametrize( "compute_unit, backend, nbits", itertools.product(compute_units, backends, [2, 3, 4, 6, 8]) ) def test_builder_to_backend_smoke(self, compute_unit, backend, nbits): x_val = np.ones(12).astype(np.float32).reshape(6, 2) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} def build(x): indices = np.array([2, 3, 3, 0, 1, 0, 3, 0, 2, 1, 0, 3]).reshape((6, 2)) lut = self._pad_lut_for_nbits_requirements( np.array([1, 5, 9, 13, 2, 10, 18, 26]).reshape((2, 1, 4, 1)).astype(np.int8), nbits=nbits, ) indices_np_dtype = types.nptype_from_builtin(types.string_to_builtin(f"uint{nbits}")) indices = indices.astype(indices_np_dtype) output = mb.constexpr_lut_to_dense( indices=indices, lut=lut, ) return mb.add(x=x, y=output) expected_output = np.array([9, 13, 13, 1, 5, 1, 26, 2, 18, 10, 2, 26]).reshape(6, 2) + 1 run_compare_builder( build, input_placeholders, input_values={"x": x_val}, expected_output_types=expected_output.shape + (types.fp32,), expected_outputs=expected_output, compute_unit=compute_unit, backend=backend, ) @pytest.mark.xfail(reason="rdar://131511244 Investigate Why Palettization is Failing on BNNS") @pytest.mark.parametrize( "compute_unit, backend, nbits, block_sizes, vector_size, vector_axis, lut_dtype", itertools.product( compute_units, backends, [2, 3, 4, 6, 8], [(0, 2, 0, 0), (2, 0, 0, 0), (0, 0, 0, 0), (1, 1, 1, 1), (4, 2, 0, 0), (4, 8, 16, 8)], [1, 4], [0, 1, -1], ["fp16", "fp32"], # TODO (rdar://125859751): Add "int8" and "uint8". 
), ) def test_builder_to_backend_stress( self, compute_unit, backend, nbits, block_sizes, vector_size, vector_axis, lut_dtype ): """Use constexpr_lut_to_dense op's value inference to check backends outputs.""" indices_shape = (4, 8, 16, 8) builtin_dtype = types.string_to_builtin(f"uint{nbits}") np_dtype = types.nptype_from_builtin(builtin_dtype) indices = np.random.randint(low=0, high=2**nbits, size=indices_shape).astype(np_dtype) lut_np_dtype = types.nptype_from_builtin(types.string_to_builtin(lut_dtype)) lut_shape = _infer_lut_shape(indices_shape, block_sizes, nbits, vector_size) lut = np.random.rand(*lut_shape).astype(lut_np_dtype) def build(x): output = mb.constexpr_lut_to_dense( indices=indices, lut=lut, vector_axis=vector_axis, ) x, output = promote_input_dtypes([x, output]) return mb.add(x=x, y=output) output_shape = list(indices.shape) if vector_size > 1: output_shape[vector_axis] *= vector_size x_val = np.ones(output_shape).astype(np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} expected_output = ( constexpr_lut_to_dense.decompress(indices, lut, vector_axis=vector_axis) + 1 ) run_compare_builder( build, input_placeholders, input_values={"x": x_val}, expected_output_types=expected_output.shape + (types.fp32,), expected_outputs=expected_output, compute_unit=compute_unit, backend=backend, ) class TestConstexprSparseToDense: def test_builder_eval_basic(self): @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): return mb.constexpr_sparse_to_dense( nonzero_data=np.array([3.0, 5.0, 4.0]), mask=np.array([1, 0, 1, 0, 1, 0]).reshape((2, 3)).astype(types.np_uint1_dtype), ) constexpr_sparse_to_dense_op = prog.functions["main"].find_ops( op_type="constexpr_sparse_to_dense" )[0] decompressed_res = np.array([[3.0, 0.0, 5.0], [0.0, 4.0, 0.0]]) np.testing.assert_allclose( decompressed_res, constexpr_sparse_to_dense_op.outputs[0].op.materialized_val_inference(), ) @pytest.mark.parametrize( "shape, data_dtype", itertools.product( ((2, 3, 4), (3, 8), (24,)), (types.int4, types.uint4, types.int8, types.uint8, types.fp16, types.fp32), ), ) def test_builder_eval_numerical_stress(self, shape, data_dtype): np_dtype = types.nptype_from_builtin(data_dtype) @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): return mb.constexpr_sparse_to_dense( nonzero_data=np.array([3.0, 5.0, 4.0]).astype(np_dtype), mask=np.array([1, 0, 1, 0, 1, 0] + [0] * 18) .reshape(shape) .astype(types.np_uint1_dtype), ) constexpr_sparse_to_dense_op = prog.functions["main"].find_ops( op_type="constexpr_sparse_to_dense" )[0] decompressed_res = np.array([3, 0, 5, 0, 4, 0] + [0] * 18).reshape(shape).astype(np_dtype) np.testing.assert_allclose( decompressed_res, constexpr_sparse_to_dense_op.outputs[0].op.materialized_val_inference(), ) def test_builder_eval_invalid_parameter(self): with pytest.raises( ValueError, match="Parameter nonzero_data needs to have rank 1, but got 2" ): @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): return mb.constexpr_sparse_to_dense( nonzero_data=np.array([1.0, 5.0, 4.0]).reshape((3, 1)), mask=np.array([1, 1, 1, 0, 0, 0]).reshape((2, 3)).astype(types.np_uint1_dtype), ) with pytest.raises( AssertionError, match="Number of 1s in mask not match number of elements in parameter nonzero_data", ): @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): return mb.constexpr_sparse_to_dense( nonzero_data=np.array([1.0, 5.0, 4.0]), mask=np.array([1, 1, 1, 0, 1, 0]).reshape((2, 3)).astype(types.np_uint1_dtype), ) 
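# (Added sketch, not part of the original test file.) The value inference exercised
# above is equivalent to scattering `nonzero_data` into the positions where `mask`
# is 1, in row-major order, e.g.:
#     dense = np.zeros(mask.shape, dtype=nonzero_data.dtype)
#     dense[mask.astype(bool)] = nonzero_data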
@pytest.mark.parametrize( "compute_unit, backend, data_dtype", itertools.product( compute_units, backends, ("fp16", "fp32"), # TODO (rdar://125859751): Add "int8" and "uint8". ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, data_dtype): builtin_dtype = types.string_to_builtin(data_dtype) np_dtype = types.nptype_from_builtin(builtin_dtype) x_val = np.array([1, 1, 1, 1, 1, 1], dtype=np_dtype).reshape((2, 3)) input_placeholders = {"x": mb.placeholder(shape=x_val.shape, dtype=builtin_dtype)} def build(x): nonzero_data = np.array([3.0, 5.0, 4.0]).astype(np_dtype) mask = np.array([1, 0, 1, 0, 1, 0]).reshape((2, 3)).astype(types.np_uint1_dtype) output = mb.constexpr_sparse_to_dense( nonzero_data=nonzero_data, mask=mask, ) return mb.add(x=x, y=output) expected_output = np.array([[3.0, 0.0, 5.0], [0.0, 4.0, 0.0]]).astype(np_dtype) + 1 run_compare_builder( build, input_placeholders, input_values={"x": x_val}, expected_output_types=expected_output.shape + (builtin_dtype,), expected_outputs=expected_output, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, sparse_ratio, data_dtype", itertools.product( compute_units, backends, [0.01, 0.5, 0.99], ["fp16", "fp32"], # TODO (rdar://125859751): Add "int8" and "uint8". ), ) def test_builder_to_backend_stress(self, compute_unit, backend, sparse_ratio, data_dtype): """Use constexpr_sparse_to_dense op's value inference to check backends outputs.""" dense_data_shape = (4, 8, 16, 8) mask = np.random.choice( [0, 1], size=dense_data_shape, p=[sparse_ratio, 1.0 - sparse_ratio] ).astype(types.np_uint1_dtype) non_zero_element_num = np.sum(mask) data_np_dtype = types.nptype_from_builtin(types.string_to_builtin(data_dtype)) nonzero_data = np.random.rand(non_zero_element_num).astype(data_np_dtype) def build(x): output = mb.constexpr_sparse_to_dense( nonzero_data=nonzero_data, mask=mask, ) x, output = promote_input_dtypes([x, output]) return mb.add(x=x, y=output) x_val = np.ones_like(mask).astype(np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} expected_output = constexpr_sparse_to_dense.decompress(nonzero_data, mask) + 1 run_compare_builder( build, input_placeholders, input_values={"x": x_val}, expected_output_types=expected_output.shape + (types.fp32,), expected_outputs=expected_output, compute_unit=compute_unit, backend=backend, ) class TestConstexprLutToSparse: def test_builder_eval_scalar_lut(self): """ indices_mask = [[1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 1], [0, 1, 1, 0, 1, 0], [0, 0, 0, 1, 0, 0]] indices_nonzero_data = [0, 1, 1, 0, 1, 1, 0, 0, 1] lut = [2.0, 3.0] The output mask is the same as input indices_mask. The output sparse tensor in the dense layout is: 2.0 3.0 3.0 2.0 3.0 3.0 2.0 2.0 3.0 So the output nonzero_data is [2.0, 3.0, 3.0, 2.0, 3.0, 3.0, 2.0, 2.0, 3.0]. 
""" @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): return mb.constexpr_lut_to_sparse( indices_mask=np.array( [[1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 1], [0, 1, 1, 0, 1, 0], [0, 0, 0, 1, 0, 0]] ).astype(types.np_uint1_dtype), indices_nonzero_data=np.array([0, 1, 1, 0, 1, 1, 0, 0, 1]).astype( types.np_uint1_dtype ), lut=np.array([2.0, 3.0]).reshape((1, 1, 2, 1)), ) constexpr_lut_to_sparse_op = prog.functions["main"].find_ops( op_type="constexpr_lut_to_sparse" )[0] expected_output_mask = np.array( [[1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 1], [0, 1, 1, 0, 1, 0], [0, 0, 0, 1, 0, 0]] ) expected_output_nonzero_data = np.array([2.0, 3.0, 3.0, 2.0, 3.0, 3.0, 2.0, 2.0, 3.0]) output_mask, output_nonzero_data = constexpr_lut_to_sparse_op.outputs[ 0 ].op.materialized_val_inference() np.testing.assert_allclose(output_mask, expected_output_mask) np.testing.assert_allclose(output_nonzero_data, expected_output_nonzero_data) def test_builder_eval_vector_lut(self): """ indices_mask = [[1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 1], [0, 1, 1, 0, 1, 0], [0, 0, 0, 1, 0, 0]] indices_nonzero_data = [0, 1, 1, 0, 1, 1, 0, 0, 1] lut = [ [1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1], [0, 1, 1, 0, 1, 0], [0, 1, 1, 0, 1, 0], [0, 0, 0, 1, 0, 0], [0, 0, 0, 1, 0, 0], ] The second output in the dense layout would be: 2.0 3.0 2.0 3.0 3.0 2.0 3.0 3.0 2.0 3.0 3.0 2.0 2.0 3.0 2.0 2.0 3.0 3.0 It is created by fetching the vector entry from the lut for every bit 1 in the data_mask, and filling the vector over axis=0. """ @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): return mb.constexpr_lut_to_sparse( indices_mask=np.array( [[1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 1], [0, 1, 1, 0, 1, 0], [0, 0, 0, 1, 0, 0]] ).astype(types.np_uint1_dtype), indices_nonzero_data=np.array([0, 1, 1, 0, 1, 1, 0, 0, 1]).astype( types.np_uint1_dtype ), lut=np.array([[2.0, 2.0], [3.0, 3.0]]).reshape((1, 1, 2, 2)), vector_axis=0, ) constexpr_lut_to_sparse_op = prog.functions["main"].find_ops( op_type="constexpr_lut_to_sparse" )[0] expected_output_mask = np.array( [ [1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1], [0, 1, 1, 0, 1, 0], [0, 1, 1, 0, 1, 0], [0, 0, 0, 1, 0, 0], [0, 0, 0, 1, 0, 0], ] ) expected_output_nonzero_data = np.array( [ 2.0, 3.0, 2.0, 3.0, 3.0, 2.0, 3.0, 3.0, 2.0, 3.0, 3.0, 2.0, 2.0, 3.0, 2.0, 2.0, 3.0, 3.0, ] ) output_mask, output_nonzero_data = constexpr_lut_to_sparse_op.outputs[ 0 ].op.materialized_val_inference() np.testing.assert_allclose(output_mask, expected_output_mask) np.testing.assert_allclose(output_nonzero_data, expected_output_nonzero_data) def test_builder_eval_invalid_parameter(self): with pytest.raises( AssertionError, match="Number of 1s in mask not match number of elements in parameter nonzero_data", ): @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): return mb.constexpr_lut_to_sparse( indices_mask=np.array([1, 1, 1, 0, 1, 0]) .reshape((2, 3)) .astype(types.np_uint1_dtype), indices_nonzero_data=np.array([0, 1, 0]).astype(types.np_uint1_dtype), lut=np.array([2.0, 3.0]).reshape((1, 1, 2, 1)), ) with pytest.raises( ValueError, match=re.escape( "When lut's last dim (VECTOR_SIZE) > 1, the parameter " "'vector_axis' need to be provided." 
), ): @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): return mb.constexpr_lut_to_sparse( indices_mask=np.array([1, 1, 1, 0, 1, 0]) .reshape((2, 3)) .astype(types.np_uint1_dtype), indices_nonzero_data=np.array([0, 1, 0, 1]).astype(types.np_uint1_dtype), lut=np.array([2.0, 3.0, 2.0, 3.0]).reshape((1, 1, 2, 2)), ) @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_smoke(self, compute_unit, backend): x_val = np.ones(18).astype(np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} def build(x): indices_mask = np.array( [[1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 1], [0, 1, 1, 0, 1, 0], [0, 0, 0, 1, 0, 0]] ).astype(types.np_uint1_dtype) indices_nonzero_data = np.array([0, 1, 1, 0, 1, 1, 0, 0, 1]).astype( types.np_uint1_dtype ) lut = np.array([[2.0, 2.0], [3.0, 3.0]]).reshape((1, 1, 2, 2)) vector_axis = 0 output_mask, output_nonzero_data = mb.constexpr_lut_to_sparse( indices_mask=indices_mask, indices_nonzero_data=indices_nonzero_data, lut=lut, vector_axis=vector_axis, ) return mb.add(x=x, y=output_nonzero_data) expected_output = 1 + np.array( [ 2.0, 3.0, 2.0, 3.0, 3.0, 2.0, 3.0, 3.0, 2.0, 3.0, 3.0, 2.0, 2.0, 3.0, 2.0, 2.0, 3.0, 3.0, ] ) run_compare_builder( build, input_placeholders, input_values={"x": x_val}, expected_output_types=expected_output.shape + (types.fp32,), expected_outputs=expected_output, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, nbits, block_sizes, vector_size, sparse_ratio, lut_dtype", itertools.product( compute_units, backends, [2, 3, 4, 6, 8], [(0, 1, 1, 1), (0, 0, 0, 2), (0, 0, 0, 0), (1, 1, 1, 1), (0, 4, 2, 0), (4, 8, 16, 8)], [1, 4], [0.01, 0.5, 0.99], ["fp16", "fp32"], # TODO (rdar://125859751): Add "int8" and "uint8". 
), ) def test_builder_to_backend_stress( self, compute_unit, backend, nbits, block_sizes, vector_size, sparse_ratio, lut_dtype ): """Use constexpr_lut_to_sparse op's value inference to check backends outputs.""" indices_shape = (4, 8, 16, 8) indices_mask = np.random.choice( [0, 1], size=indices_shape, p=[sparse_ratio, 1.0 - sparse_ratio] ).astype(types.np_uint1_dtype) indices_nonzero_element_num = np.sum(indices_mask) indices_np_dtype = types.nptype_from_builtin(types.string_to_builtin(f"uint{nbits}")) indices_nonzero_data = np.random.randint( low=0, high=2**nbits, size=indices_nonzero_element_num ).astype(indices_np_dtype) lut_np_dtype = types.nptype_from_builtin(types.string_to_builtin(lut_dtype)) lut_shape = _infer_lut_shape(indices_shape, block_sizes, nbits, vector_size) lut = np.random.rand(*lut_shape).astype(lut_np_dtype) vector_axis = 0 if vector_size > 1 else None def build(x): output_mask, output_nonzero_data = mb.constexpr_lut_to_sparse( indices_mask=indices_mask, indices_nonzero_data=indices_nonzero_data, lut=lut, vector_axis=vector_axis, ) x, output_nonzero_data = promote_input_dtypes([x, output_nonzero_data]) return mb.add(x=x, y=output_nonzero_data) output_shape = int(indices_nonzero_element_num * vector_size) x_val = np.ones(output_shape).astype(np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} expected_output = ( constexpr_lut_to_sparse.decompress( indices_mask, indices_nonzero_data, lut, vector_axis )[1] + 1 ) run_compare_builder( build, input_placeholders, input_values={"x": x_val}, expected_output_types=expected_output.shape + (types.fp32,), expected_outputs=expected_output, compute_unit=compute_unit, backend=backend, ) class TestConstexprSparseBlockwiseShiftScale: def test_builder_eval_sparse_per_channel(self): """ Test per-channel de-quantization on sparse tensor. data_mask = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [1, 1, 0, 0]] nonzero_data = [10, 11, 3, 4, 5, 6, 7, 8, 9] scale = [[0.1, 0.2, 0.3, 0.4]] offset = [[1, 2, 3, 4]] The sparse tensor in the dense layout would look like: 10 11 3 4 5 6 7 8 9 The input `nonzero_data` would be dequantized per-column as in the dense layout, and the output sparse tensor in the dense layout would be: (10-1)*0.1 (11-2)*0.2 (3-1)*0.1 (4-2)*0.2 (5-3)*0.3 (6-3)*0.3 (7-4)*0.4 (8-1)*0.1 (9-2)*0.2 The first output would be the same as the `data_mask`, The second output would be [0.9, 1.8, 0.2, 0.4, 0.6, 0.9, 1.2, 0.7, 1.4] """ @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): return mb.constexpr_sparse_blockwise_shift_scale( data_mask=np.array([[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [1, 1, 0, 0]]).astype( types.np_uint1_dtype ), nonzero_data=np.array([10, 11, 3, 4, 5, 6, 7, 8, 9]).astype(np.int8), scale=np.array([[0.1, 0.2, 0.3, 0.4]]), offset=np.array([[1, 2, 3, 4]]).astype(np.int8), ) constexpr_sparse_blockwise_shift_scale_op = prog.functions["main"].find_ops( op_type="constexpr_sparse_blockwise_shift_scale" )[0] expected_output_mask = np.array([[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [1, 1, 0, 0]]) expected_output_nonzero_data = np.array([0.9, 1.8, 0.2, 0.4, 0.6, 0.9, 1.2, 0.7, 1.4]) output_mask, output_nonzero_data = constexpr_sparse_blockwise_shift_scale_op.outputs[ 0 ].op.materialized_val_inference() np.testing.assert_allclose(output_mask, expected_output_mask) np.testing.assert_allclose(output_nonzero_data, expected_output_nonzero_data) def test_builder_eval_sparse_per_block(self): """ Test per-block de-quantization on sparse tensor with block size 2. 
data_mask = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [1, 1, 0, 1]] # shape [4, 4] nonzero_data = [10, 11, 3, 4, 5, 6, 7, 8, 9, 2] scale = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]] # shape [4, 2] because block size is [1, 2] offset = [[1, 2], [3, 4], [5, 6], [7, 8]] The sparse tensor in the dense layout would look like: 10 11 3 4 5 6 7 8 9 2 The input `nonzero_data` would be dequantized per-column as in the dense layout, and the output sparse tensor in the dense layout would be: (10-1)*0.1 (11-1)*0.1 (3-3)*0.3 (4-3)*0.3 (5-4)*0.4 (6-6)*0.6 (7-6)*0.6 (8-7)*0.7 (9-7)*0.7 (2-8)*0.8 The first output would be the same as the `data_mask`, The second output would be [0.9, 1.0, 0.0, 0.3, 0.4, 0.0, 0.6, 0.7, 1.4, -4.8] """ @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): return mb.constexpr_sparse_blockwise_shift_scale( data_mask=np.array([[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [1, 1, 0, 1]]).astype( types.np_uint1_dtype ), nonzero_data=np.array([10, 11, 3, 4, 5, 6, 7, 8, 9, 2]).astype(np.int8), scale=np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]), offset=np.array([[1, 2], [3, 4], [5, 6], [7, 8]]).astype(np.int8), ) constexpr_sparse_blockwise_shift_scale_op = prog.functions["main"].find_ops( op_type="constexpr_sparse_blockwise_shift_scale" )[0] expected_output_mask = np.array([[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [1, 1, 0, 1]]) expected_output_nonzero_data = np.array([0.9, 1.0, 0.0, 0.3, 0.4, 0.0, 0.6, 0.7, 1.4, -4.8]) output_mask, output_nonzero_data = constexpr_sparse_blockwise_shift_scale_op.outputs[ 0 ].op.materialized_val_inference() np.testing.assert_allclose(output_mask, expected_output_mask) np.testing.assert_allclose(output_nonzero_data, expected_output_nonzero_data) def test_builder_eval_invalid_parameter(self): with pytest.raises( AssertionError, match="Number of 1s in mask not match number of elements in parameter nonzero_data", ): @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): return mb.constexpr_sparse_blockwise_shift_scale( data_mask=np.array([1, 1, 1, 0, 1, 0]) .reshape((2, 3)) .astype(types.np_uint1_dtype), nonzero_data=np.array([0, 1, 0]).astype(np.int8), scale=np.array([[0.1, 0.2, 0.3]]), ) with pytest.raises( ValueError, match=re.escape("the shape of 'offset' should match the shape of 'scale'"), ): @mb.program(input_specs=[], opset_version=_IOS18_TARGET) def prog(): return mb.constexpr_sparse_blockwise_shift_scale( data_mask=np.array([1, 1, 1, 0, 1, 0]) .reshape((2, 3)) .astype(types.np_uint1_dtype), nonzero_data=np.array([0, 1, 0, 1]).astype(np.int8), scale=np.array([[0.1, 0.2, 0.3]]), offset=np.array([[1, 2, 3, 4]]).astype(np.int8), ) @pytest.mark.parametrize( "compute_unit, backend, per_block, data_dtype", itertools.product( compute_units, backends, (True, False), (types.uint4, types.int8, types.uint8, types.fp32), ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, per_block, data_dtype): x_val = np.ones(10).astype(np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} np_dtype = types.nptype_from_builtin(data_dtype) def build(x): data_mask_val = np.array( [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [1, 1, 0, 1]] ).astype(types.np_uint1_dtype) nonzero_data_val = np.array([10, 11, 3, 4, 5, 6, 7, 8, 9, 2]).astype(np_dtype) if per_block: scale_val = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]) else: scale_val = np.array([[0.1, 0.2, 0.3, 0.4]]) if per_block: offset_val = np.array([[1, 2], [3, 4], [5, 6], [7, 8]]).astype(np_dtype) else: offset_val = np.array([[1, 
2, 3, 4]]).astype(np_dtype) output_mask, output_nonzero_data = mb.constexpr_sparse_blockwise_shift_scale( data_mask=data_mask_val, nonzero_data=nonzero_data_val, scale=scale_val, offset=offset_val, ) return mb.add(x=x, y=output_nonzero_data) if per_block: expected_output = np.array([0.9, 1.0, 0.0, 0.3, 0.4, 0.0, 0.6, 0.7, 1.4, -4.8]) + 1 else: expected_output = np.array([0.9, 1.8, 0.2, 0.4, 0.6, 0.9, 1.2, 0.7, 1.4, -0.8]) + 1 run_compare_builder( build, input_placeholders, input_values={"x": x_val}, expected_output_types=expected_output.shape + (types.fp32,), expected_outputs=expected_output, compute_unit=compute_unit, backend=backend, atol=1e-3, rtol=1e-3, ) @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_corner_case(self, compute_unit, backend): """ This test case uses the real data from a conv model. It's for testing the scale/offset is correctly repeated and the joint ops materialized_val_inference work as expected. """ def build_weight(): data_mask = np.array( [ [[[0, 1], [1, 1]], [[1, 1], [1, 1]], [[1, 1], [1, 1]], [[1, 0], [1, 1]]], [[[1, 1], [1, 1]], [[1, 1], [0, 0]], [[1, 1], [1, 1]], [[1, 0], [0, 0]]], ] ).astype(types.np_uint1_dtype) data_mask, nonzero_data = mb.constexpr_sparse_blockwise_shift_scale( data_mask=data_mask, nonzero_data=np.array( [ -8, -2, 7, -4, -7, -6, -5, 2, -6, 7, -5, 2, -8, -6, -7, -8, -5, -8, 6, 7, 6, -7, 7, 2, -8, ] ).astype(np.int8), scale=np.array([[[[0.01955]], [[0.02809]]], [[[0.02898]], [[0.02487]]]]), offset=np.array([[[[3]], [[-1]]], [[[-2]], [[-3]]]]).astype(np.int8), ) return mb.constexpr_sparse_to_dense(nonzero_data=nonzero_data, mask=data_mask) def build(x): return mb.add(x=x, y=build_weight()) # Get the const expected weight by decompressing val inference from the joint constexpr ops. weight_prog = mb.program(input_specs=[], opset_version=_IOS18_TARGET)(build_weight) result_op = weight_prog.functions["main"].find_ops(op_type="constexpr_sparse_to_dense")[0] expected_weight = result_op.outputs[0].op.materialized_val_inference() x_val = np.ones(2 * 4 * 2 * 2).reshape((2, 4, 2, 2)).astype(np.float32) expected_output = expected_weight + 1 input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} # With joint quant + sparse ops, the backend prediction should match the expected_weight. run_compare_builder( build, input_placeholders, input_values={"x": x_val}, expected_output_types=expected_output.shape + (types.fp32,), expected_outputs=expected_output, compute_unit=compute_unit, backend=backend, ) # Test conv using joint constexpr ops weight matches using the decompressed const weight. 
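# (Added note, not part of the original test file.) Both builders below should give
# numerically close conv outputs: the joint constexpr ops decompress to the same
# dense weight that `expected_weight` holds, so only small backend rounding
# differences remain, hence the rtol/atol in the final assert_allclose.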
def build_conv_with_joint_constexpr_weight(x): return mb.conv(x=x, weight=build_weight()) def build_conv_with_const_weight(x): return mb.conv(x=x, weight=expected_weight) x_val = np.random.rand(1, 4, 10, 10).astype(np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} mlmodel_conv_with_joint_constexpr_weight = run_compare_builder( build_conv_with_joint_constexpr_weight, input_placeholders, input_values={"x": x_val}, expected_output_types=(1, 2, 9, 9) + (types.fp32,), frontend_only=True, compute_unit=compute_unit, backend=backend, ) mlmodel_conv_with_const_weight = run_compare_builder( build_conv_with_const_weight, input_placeholders, input_values={"x": x_val}, expected_output_types=(1, 2, 9, 9) + (types.fp32,), frontend_only=True, compute_unit=compute_unit, backend=backend, ) result_1 = mlmodel_conv_with_joint_constexpr_weight.predict({"x": x_val}) result_2 = mlmodel_conv_with_const_weight.predict({"x": x_val}) np.testing.assert_allclose(result_1["conv_0"], result_2["conv_0"], rtol=3e-3, atol=3e-4) @pytest.mark.parametrize("compute_unit, backend", itertools.product(compute_units, backends)) def test_builder_to_backend_no_offset(self, compute_unit, backend): """ Test per-channel de-quantization on sparse tensor without offset. data_mask = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [1, 1, 0, 0]] nonzero_data = [10, 11, 3, 4, 5, 6, 7, 8, 9] scale = [[0.1, 0.2, 0.3, 0.4]] The sparse tensor in the dense layout would look like: 10 11 3 4 5 6 7 8 9 The input `nonzero_data` would be dequantized per-column as in the dense layout, and the output sparse tensor in the dense layout would be: (10)*0.1 (11)*0.2 (3)*0.1 (4)*0.2 (5)*0.3 (6)*0.3 (7)*0.4 (8)*0.1 (9)*0.2 The first output would be the same as the `data_mask`. The second output would be [1.0, 2.2, 0.3, 0.8, 1.5, 1.8, 2.8, 0.8, 1.8] """ data_dtype = types.int8 x_val = np.ones(9).astype(np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} def build(x): data_mask_val = np.array( [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [1, 1, 0, 0]] ).astype(types.np_uint1_dtype) nonzero_data_val = np.array([10, 11, 3, 4, 5, 6, 7, 8, 9]).astype(np_dtype) scale_val = np.array([[0.1, 0.2, 0.3, 0.4]]) output_mask, output_nonzero_data = mb.constexpr_sparse_blockwise_shift_scale( data_mask=data_mask_val, nonzero_data=nonzero_data_val, scale=scale_val, ) return mb.add(x=x, y=output_nonzero_data) expected_output = np.array([1.0, 2.2, 0.3, 0.8, 1.5, 1.8, 2.8, 0.8, 1.8]) + 1 run_compare_builder( build, input_placeholders, input_values={"x": x_val}, expected_output_types=expected_output.shape + (types.fp32,), expected_outputs=expected_output, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, dtype, block_sizes, has_offset, sparse_ratio", itertools.product( compute_units, backends, ["int4", "uint4", "int8", "uint8", "fp16"], [(0, 1, 1, 1), (0, 0, 0, 2), (0, 0, 0, 0), (1, 1, 1, 1), (0, 4, 2, 0), (4, 8, 16, 8)], [True, False], [0.01, 0.5, 0.99], ), ) def test_builder_to_backend_stress( self, compute_unit, backend, dtype, block_sizes, has_offset, sparse_ratio ): """ Use constexpr_sparse_blockwise_shift_scale op's value inference to check backends outputs.
""" quantized_data_shape = (4, 8, 16, 8) builtin_dtype = types.string_to_builtin(dtype) np_dtype = types.nptype_from_builtin(builtin_dtype) data_mask = np.random.choice( [0, 1], size=quantized_data_shape, p=[sparse_ratio, 1.0 - sparse_ratio] ).astype(types.np_uint1_dtype) data_nonzero_element_num = int(np.sum(data_mask)) if types.is_int(builtin_dtype): data_range = types.type_mapping.builtin_to_range(builtin_dtype) quantized_data = np.random.randint( low=data_range.low, high=data_range.high + 1, size=data_nonzero_element_num ).astype(np_dtype) else: quantized_data = np.random.rand(data_nonzero_element_num).astype(np_dtype) scale_shape = [ 1 if block_sizes[axis] == 0 else dim_size // block_sizes[axis] for axis, dim_size in enumerate(quantized_data_shape) ] scale = np.random.rand(*scale_shape) offset = None if has_offset: if types.is_int(builtin_dtype): offset = np.random.randint( low=data_range.low, high=data_range.high + 1, size=scale.shape ).astype(np_dtype) else: offset = np.random.rand(*scale.shape).astype(np_dtype) def build(x): output_mask, output_nonzero_data = mb.constexpr_sparse_blockwise_shift_scale( data_mask=data_mask, nonzero_data=quantized_data, scale=scale, offset=offset, ) return mb.add(x=x, y=output_nonzero_data) x_val = np.ones_like(quantized_data).astype(np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} expected_output = ( constexpr_sparse_blockwise_shift_scale.decompress( data_mask, quantized_data, scale, offset )[1] + 1 ) run_compare_builder( build, input_placeholders, input_values={"x": x_val}, expected_output_types=expected_output.shape + (types.fp32,), expected_outputs=expected_output, compute_unit=compute_unit, backend=backend, ) class TestJointCompressionOps: @pytest.mark.xfail(reason="rdar://131511244 Investigate Why Palettization is Failing on BNNS") @pytest.mark.parametrize( "compute_unit, backend, nbits, block_sizes, vector_size, lut_dtype, quant_dtype", itertools.product( compute_units, backends, [2, 3, 4, 8], [(0, 2, 0, 0), (2, 0, 0, 0), (4, 2, 0, 0)], [1, 4], ["fp16", "fp32"], ["int4", "uint4", "int8", "uint8"], ), ) def test_quant_lut( self, compute_unit, backend, nbits, block_sizes, vector_size, lut_dtype, quant_dtype ): """ Test lut with quantized (int8) entries, which is represented as lut(int8) -> constexpr_blockwise_shift_scale -> lut(fp) \ constexpr_lut_to_dense -> dense(fp) indices / """ indices_shape = (4, 8, 16, 8) builtin_dtype = types.string_to_builtin(f"uint{nbits}") np_dtype = types.nptype_from_builtin(builtin_dtype) indices = np.random.randint(low=0, high=2**nbits, size=indices_shape).astype(np_dtype) lut_np_dtype = types.nptype_from_builtin(types.string_to_builtin(lut_dtype)) lut_shape = _infer_lut_shape(indices_shape, block_sizes, nbits, vector_size) vector_axis = 0 if vector_size > 1 else None quant_builtin_dtype = types.string_to_builtin(quant_dtype) quant_np_dtype = types.nptype_from_builtin(quant_builtin_dtype) quant_data_range = types.type_mapping.builtin_to_range(quant_builtin_dtype) quantized_data = np.random.randint( low=quant_data_range.low, high=quant_data_range.high + 1, size=lut_shape ).astype(quant_np_dtype) scale_shape = tuple([1] * len(lut_shape)) scale = np.array([2.0]).reshape(scale_shape).astype(lut_np_dtype) offset = np.array([3]).reshape(scale_shape).astype(quant_np_dtype) def build(x): lut = mb.constexpr_blockwise_shift_scale( data=quantized_data, scale=scale, offset=offset, ) output = mb.constexpr_lut_to_dense( indices=indices, lut=lut, vector_axis=vector_axis, ) x, output = 
promote_input_dtypes([x, output]) return mb.add(x=x, y=output) output_shape = list(indices.shape) if vector_size > 1: output_shape[vector_axis] *= vector_size x_val = np.ones(output_shape).astype(np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} lut = constexpr_blockwise_shift_scale.decompress(quantized_data, scale, offset) expected_output = ( constexpr_lut_to_dense.decompress(indices, lut, vector_axis=vector_axis) + 1 ) run_compare_builder( build, input_placeholders, input_values={"x": x_val}, expected_output_types=expected_output.shape + (types.fp32,), expected_outputs=expected_output, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, nbits, block_sizes, vector_size, sparse_ratio, lut_dtype", itertools.product( compute_units, backends, [2, 3, 4, 8], [(0, 2, 0, 0), (2, 0, 0, 0), (1, 1, 1, 1), (4, 2, 0, 0)], [1, 4], [0.01, 0.5, 0.99], ["fp16", "fp32"], # TODO (rdar://125859751): Add "int8" and "uint8". ), ) def test_sparse_lut( self, compute_unit, backend, nbits, block_sizes, vector_size, sparse_ratio, lut_dtype ): """Joint constexpr_lut_to_sparse + constexpr_sparse_to_dense.""" indices_shape = (4, 8, 16, 8) indices_mask = np.random.choice( [0, 1], size=indices_shape, p=[sparse_ratio, 1.0 - sparse_ratio] ).astype(types.np_uint1_dtype) indices_nonzero_element_num = np.sum(indices_mask) indices_np_dtype = types.nptype_from_builtin(types.string_to_builtin(f"uint{nbits}")) indices_nonzero_data = np.random.randint( low=0, high=2**nbits, size=indices_nonzero_element_num ).astype(indices_np_dtype) lut_np_dtype = types.nptype_from_builtin(types.string_to_builtin(lut_dtype)) lut_shape = _infer_lut_shape(indices_shape, block_sizes, nbits, vector_size) lut = np.random.rand(*lut_shape).astype(lut_np_dtype) vector_axis = 0 if vector_size > 1 else None def build(x): output_mask, output_nonzero_data = mb.constexpr_lut_to_sparse( indices_mask=indices_mask, indices_nonzero_data=indices_nonzero_data, lut=lut, vector_axis=vector_axis, ) output = mb.constexpr_sparse_to_dense( nonzero_data=output_nonzero_data, mask=output_mask, ) x, output = promote_input_dtypes([x, output]) return mb.add(x=x, y=output) output_mask, output_nonzero_data = constexpr_lut_to_sparse.decompress( indices_mask, indices_nonzero_data, lut, vector_axis ) expected_output = constexpr_sparse_to_dense.decompress(output_nonzero_data, output_mask) + 1 x_val = np.ones(expected_output.shape).astype(np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} run_compare_builder( build, input_placeholders, input_values={"x": x_val}, expected_output_types=expected_output.shape + (types.fp32,), expected_outputs=expected_output, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, dtype, block_sizes, has_offset, sparse_ratio", itertools.product( compute_units, backends, ["int4", "uint4", "int8", "uint8", "fp16"], [(0, 2, 0, 0), (2, 0, 0, 0), (1, 1, 1, 1), (4, 2, 0, 0)], [True, False], [0.01, 0.5, 0.99], ), ) def test_sparse_quant( self, compute_unit, backend, dtype, block_sizes, has_offset, sparse_ratio ): """Joint constexpr_sparse_blockwise_shift_scale + constexpr_sparse_to_dense.""" quantized_data_shape = (4, 8, 16, 8) builtin_dtype = types.string_to_builtin(dtype) np_dtype = types.nptype_from_builtin(builtin_dtype) data_mask = np.random.choice( [0, 1], size=quantized_data_shape, p=[sparse_ratio, 1.0 - sparse_ratio] ).astype(types.np_uint1_dtype) data_nonzero_element_num = int(np.sum(data_mask)) if 
types.is_int(builtin_dtype): data_range = types.type_mapping.builtin_to_range(builtin_dtype) quantized_data = np.random.randint( low=data_range.low, high=data_range.high + 1, size=data_nonzero_element_num ).astype(np_dtype) else: quantized_data = np.random.rand(data_nonzero_element_num).astype(np_dtype) scale_shape = [ 1 if block_sizes[axis] == 0 else dim_size // block_sizes[axis] for axis, dim_size in enumerate(quantized_data_shape) ] scale = np.random.rand(*scale_shape) offset = None if has_offset: if types.is_int(builtin_dtype): offset = np.random.randint( low=data_range.low, high=data_range.high + 1, size=scale.shape ).astype(np_dtype) else: offset = np.random.rand(*scale.shape).astype(np_dtype) def build(x): output_mask, output_nonzero_data = mb.constexpr_sparse_blockwise_shift_scale( data_mask=data_mask, nonzero_data=quantized_data, scale=scale, offset=offset, ) output = mb.constexpr_sparse_to_dense( nonzero_data=output_nonzero_data, mask=output_mask, ) return mb.add(x=x, y=output) output_mask, output_nonzero_data = constexpr_sparse_blockwise_shift_scale.decompress( data_mask, quantized_data, scale, offset ) expected_output = constexpr_sparse_to_dense.decompress(output_nonzero_data, output_mask) + 1 x_val = np.ones(expected_output.shape).astype(np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} run_compare_builder( build, input_placeholders, input_values={"x": x_val}, expected_output_types=expected_output.shape + (types.fp32,), expected_outputs=expected_output, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, nbits, block_sizes, vector_size, sparse_ratio, lut_dtype, quant_dtype", itertools.product( compute_units, backends, [2, 3, 4, 8], [(0, 2, 0, 0), (2, 0, 0, 0), (4, 2, 0, 0)], [1, 4], [0.01, 0.5, 0.99], ["fp16", "fp32"], ["int4", "uint4", "int8", "uint8"], ), ) def test_quant_sparse_lut( self, compute_unit, backend, nbits, block_sizes, vector_size, sparse_ratio, lut_dtype, quant_dtype, ): """ Test sparse lut with quantized (int8) entries, which is represented as constexpr_blockwise_shift_scale + constexpr_lut_to_sparse + constexpr_sparse_to_dense """ indices_shape = (4, 8, 16, 8) indices_mask = np.random.choice( [0, 1], size=indices_shape, p=[sparse_ratio, 1.0 - sparse_ratio] ).astype(types.np_uint1_dtype) indices_nonzero_element_num = np.sum(indices_mask) indices_np_dtype = types.nptype_from_builtin(types.string_to_builtin(f"uint{nbits}")) indices_nonzero_data = np.random.randint( low=0, high=2**nbits, size=indices_nonzero_element_num ).astype(indices_np_dtype) lut_np_dtype = types.nptype_from_builtin(types.string_to_builtin(lut_dtype)) lut_shape = _infer_lut_shape(indices_shape, block_sizes, nbits, vector_size) vector_axis = 0 if vector_size > 1 else None quant_builtin_dtype = types.string_to_builtin(quant_dtype) quant_np_dtype = types.nptype_from_builtin(quant_builtin_dtype) quant_data_range = types.type_mapping.builtin_to_range(quant_builtin_dtype) quantized_data = np.random.randint( low=quant_data_range.low, high=quant_data_range.high + 1, size=lut_shape ).astype(quant_np_dtype) scale_shape = tuple([1] * len(lut_shape)) scale = np.array([2.0]).reshape(scale_shape).astype(lut_np_dtype) offset = np.array([3]).reshape(scale_shape).astype(quant_np_dtype) def build(x): lut = mb.constexpr_blockwise_shift_scale( data=quantized_data, scale=scale, offset=offset, ) output_mask, output_nonzero_data = mb.constexpr_lut_to_sparse( indices_mask=indices_mask, indices_nonzero_data=indices_nonzero_data, lut=lut, 
vector_axis=vector_axis, ) output = mb.constexpr_sparse_to_dense( nonzero_data=output_nonzero_data, mask=output_mask, ) x, output = promote_input_dtypes([x, output]) return mb.add(x=x, y=output) lut = constexpr_blockwise_shift_scale.decompress(quantized_data, scale, offset) output_mask, output_nonzero_data = constexpr_lut_to_sparse.decompress( indices_mask, indices_nonzero_data, lut, vector_axis ) expected_output = constexpr_sparse_to_dense.decompress(output_nonzero_data, output_mask) + 1 x_val = np.ones(expected_output.shape).astype(np.float32) input_placeholders = {"x": mb.placeholder(shape=x_val.shape)} run_compare_builder( build, input_placeholders, input_values={"x": x_val}, expected_output_types=expected_output.shape + (types.fp32,), expected_outputs=expected_output, compute_unit=compute_unit, backend=backend, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS18/test_recurrent.py0000644000000000000000000001320514672066616027026 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest import torch import coremltools as ct from coremltools.converters.mil.mil import Builder as mb, types from coremltools.converters.mil.mil.ops.tests.iOS17.test_recurrent import TestGRU as _TestGRU_iOS17 from coremltools.converters.mil.mil.ops.tests.iOS18 import backends from coremltools.converters.mil.testing_reqs import compute_units class TestGRU(_TestGRU_iOS17): # Test functionality from previous opset version @pytest.mark.parametrize( argnames=[ "compute_unit", "backend", "seq_len", "batch_size", "input_size", "hidden_size", "has_bias", "output_sequence", "direction", "activation_functions", "symbolic", "dtype", ], argvalues=itertools.product( compute_units, backends, [1, 3], [1], [1, 2], [1, 2], [True, False], [True, False], ["forward", "reverse"], [ ["tanh", "sigmoid"], ["sigmoid", "tanh"], ], [True, False], [np.float16, np.float32], ), ) def test_builder_to_backend_smoke( self, compute_unit, backend, seq_len, batch_size, input_size, hidden_size, has_bias, output_sequence, direction, activation_functions, symbolic, dtype, ): super().test_builder_to_backend_smoke( compute_unit, backend, seq_len, batch_size, input_size, hidden_size, has_bias, output_sequence, direction, activation_functions, symbolic, dtype, ) @pytest.mark.xfail(reason="rdar://128479517") @pytest.mark.parametrize( argnames=[ "compute_units", "backend", "sequence_length", "num_features", # also called "input_size" "hidden_size", "batch_size", ], argvalues=itertools.product( compute_units, backends, [1, 3], [1, 2], [1], [1, 2], ), ) def test_pytorch_parity(self, backend, compute_units, sequence_length, num_features, hidden_size, batch_size): def get_weight_i_tensor(): return np.random.rand(hidden_size, num_features).astype('float32') def get_weight_h_tensor(): return np.random.rand(hidden_size, hidden_size).astype('float32') def get_bias_tensor(): return np.random.rand(hidden_size).astype('float32') W_ir, W_iz, W_in = get_weight_i_tensor(), get_weight_i_tensor(), get_weight_i_tensor() W_hr, W_hz, W_hn = get_weight_h_tensor(), get_weight_h_tensor(), get_weight_h_tensor() b_ir, b_iz, b_in = get_bias_tensor(), get_bias_tensor(), get_bias_tensor() b_hr, b_hz, b_hn = get_bias_tensor(), get_bias_tensor(), 
get_bias_tensor() # MIL op only supports single direction and single layer x = np.random.rand(sequence_length, batch_size, num_features).astype('float16') initial_h = np.random.rand(1, batch_size, hidden_size).astype('float16') # Set up PyTorch model m_t = torch.nn.GRU(num_features, hidden_size) t_state = m_t.state_dict() t_state['weight_ih_l0'] = torch.Tensor(np.concatenate((W_ir, W_iz, W_in))) t_state['weight_hh_l0'] = torch.Tensor(np.concatenate((W_hr, W_hz, W_hn))) t_state['bias_ih_l0'] = torch.Tensor(np.concatenate((b_ir, b_iz, b_in))) t_state['bias_hh_l0'] = torch.Tensor(np.concatenate((b_hr, b_hz, b_hn))) m_t.load_state_dict(t_state) # Get PyTorch results (out_t, h_t) = m_t(torch.Tensor(x), torch.Tensor(initial_h)) out_t = out_t.detach().numpy() h_t = h_t.detach().numpy() # MIL op only support num_layers=1 and D=1, so hidden state only has rank 2 initial_h = initial_h.squeeze(0) # MIL program @mb.program( [ mb.TensorSpec(shape=x.shape, dtype=types.fp32), mb.TensorSpec(shape=initial_h.shape, dtype=types.fp32) ], opset_version=backend.opset_version ) def prog(x, initial_h): return mb.gru( x=x, initial_h=initial_h, weight_ih=np.concatenate((W_ir, W_in, W_iz)), weight_hh=np.concatenate((W_hr, W_hn, W_hz)), input_bias=np.concatenate((b_ir, b_in, b_iz)), bias=np.concatenate((b_hr, b_hn, b_hz)), reset_after=True, output_sequence=True, ) mlmodel = ct.convert( prog, source="milinternal", convert_to=backend.backend, minimum_deployment_target=backend.opset_version, compute_units=compute_units, pass_pipeline=ct.PassPipeline.EMPTY, ) # Core ML ouput y_cm = mlmodel.predict({'x': x, 'initial_h': initial_h}) out_cm, h_cm = y_cm['gru_0_0'], y_cm['gru_0_1'] # Check outputs np.testing.assert_allclose(out_cm, out_t, atol=0.01, rtol=0.1) np.testing.assert_allclose([h_cm], h_t, atol=0.01, rtol=0.1) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS18/test_states.py0000644000000000000000000002314314672066616026322 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.builder import Builder as mb from coremltools.converters.mil.mil.ops.defs.iOS18 import _IOS18_TARGET from coremltools.converters.mil.mil.ops.tests.iOS18 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import run_compare_builder from coremltools.converters.mil.testing_reqs import compute_units from coremltools.converters.mil.testing_utils import random_gen class TestCoreMLUpdateState: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_coreml_update_state_smoke(self, compute_unit, backend): def build(state, value): return mb.coreml_update_state( state=state, value=value, ) input_placeholders = { "state": mb.state_tensor_placeholder( shape=(2,), dtype=types.fp16, ), "value": mb.placeholder(shape=(2,), dtype=types.fp16), } value = random_gen((2,)) input_values = {"value": value} run_compare_builder( build, input_placeholders, input_values, expected_output_types=[(2, types.fp16)], expected_outputs=[value], compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, shape", itertools.product( compute_units, backends, [(1,), (2, 3), (4, 5, 6)], ), ) def test_coreml_update_stress(self, compute_unit, backend, shape): def build(x_in, y_in, z_in): def increase_val_by_one(state, input): v = mb.add(x=input, y=np.float16(1)) return mb.coreml_update_state(state=state, value=v) x = mb.read_state(input=x_in) y = mb.read_state(input=y_in) z = mb.read_state(input=z_in) for i in range(10): x = increase_val_by_one(x_in, x) y = increase_val_by_one(y_in, y) z = increase_val_by_one(z_in, z) return mb.read_state(input=x_in), mb.read_state(input=y_in), mb.read_state(input=z_in) input_placeholders = { "x_in": mb.state_tensor_placeholder( shape=shape, dtype=types.fp16, ), "y_in": mb.state_tensor_placeholder( shape=shape, dtype=types.fp16, ), "z_in": mb.state_tensor_placeholder( shape=shape, dtype=types.fp16, ), } input_values = {} run_compare_builder( build, input_placeholders, input_values, expected_output_types=[ ( *shape, types.fp16, ) ] * 3, expected_outputs=[ [ 10 * np.ones( shape, ) ] * 3, [ 20 * np.ones( shape, ) ] * 3, ], compute_unit=compute_unit, backend=backend, pred_iters=2, ) class TestReadState: @staticmethod def test_read_tensor_state_builder(): @mb.program(input_specs=[mb.StateTensorSpec((2, 3))], opset_version=_IOS18_TARGET) def prog(x): return mb.read_state(input=x) read_state_op = prog.find_ops("read_state")[0] assert types.is_state(read_state_op.input._sym_type) assert types.is_tensor(read_state_op.outputs[0]._sym_type) @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_read_state_smoke(self, compute_unit, backend): def build(state): return mb.read_state( input=state, ) input_placeholders = { "state": mb.state_tensor_placeholder( shape=(2,), dtype=types.fp16, ), } input_values = {} run_compare_builder( build, input_placeholders, input_values, expected_output_types=[(2, types.fp16)], expected_outputs=[ np.zeros( 2, ) ], compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, shape", itertools.product(compute_units, backends, [(1,), (2, 3), (4, 5, 6)]), ) def test_read_state_stress(self, 
compute_unit, backend, shape): def build(x, y, z): return ( mb.read_state( input=x, ), mb.read_state( input=y, ), mb.read_state( input=z, ), ) input_placeholders = { "x": mb.state_tensor_placeholder( shape=shape, dtype=types.fp16, ), "y": mb.state_tensor_placeholder( shape=shape, dtype=types.fp16, ), "z": mb.state_tensor_placeholder( shape=shape, dtype=types.fp16, ), } input_values = {} run_compare_builder( build, input_placeholders, input_values, expected_output_types=[ ( *shape, types.fp16, ) ] * 3, expected_outputs=[ np.zeros( shape, ) ] * 3, compute_unit=compute_unit, backend=backend, ) class TestStatefulModel: @pytest.mark.parametrize( "compute_unit, backend", itertools.product( compute_units, backends, ), ) def test_state_model_with_slice_update(self, compute_unit, backend): def build(x_in, y_in, z_in, update_1, update_2): def single_slice_update(state, input): v = mb.slice_update( x=input, update=update_1, begin=[0, 0], end=[1, 2], ) return mb.coreml_update_state(state=state, value=v) def double_slice_update(state, input): v = mb.slice_update( x=input, update=update_1, begin=[0, 0], end=[1, 2], ) v = mb.slice_update( x=input, update=update_2, begin=[1, 1], end=[3, 3], ) return mb.coreml_update_state(state=state, value=v) x = mb.read_state(input=x_in) y = mb.read_state(input=y_in) z = mb.read_state(input=z_in) for i in range(10): # single slice update x = single_slice_update(x_in, x) y = single_slice_update(y_in, y) z = single_slice_update(z_in, z) # double slice update x = double_slice_update(x_in, x) y = double_slice_update(y_in, y) z = double_slice_update(z_in, z) return mb.read_state(input=x_in), mb.read_state(input=y_in), mb.read_state(input=z_in) shape = (8, 9) input_placeholders = { "x_in": mb.state_tensor_placeholder( shape=shape, dtype=types.fp16, ), "y_in": mb.state_tensor_placeholder( shape=shape, dtype=types.fp16, ), "z_in": mb.state_tensor_placeholder( shape=shape, dtype=types.fp16, ), "update_1": mb.placeholder( shape=(1, 2), dtype=types.fp16, ), "update_2": mb.placeholder( shape=(2, 2), dtype=types.fp16, ), } update_1_val = np.array([[1, 2]], dtype=np.float16) update_2_val = np.array([[1, 2], [3, 4]], dtype=np.float16) input_values = { "update_1": update_1_val, "update_2": update_2_val, } output = np.zeros(shape, dtype=np.float16) output[:1, :2] = update_1_val output[1:3, 1:3] = update_2_val run_compare_builder( build, input_placeholders, input_values, expected_output_types=[ ( *shape, types.fp16, ) ] * 3, expected_outputs=[ [output] * 3, [output] * 3, ], compute_unit=compute_unit, backend=backend, pred_iters=2, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS18/test_tensor_transformation.py0000644000000000000000000004113714672066616031462 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest from coremltools.converters.mil._deployment_compatibility import AvailableTarget as target from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.ops.tests.iOS18 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import run_compare_builder from coremltools.converters.mil.testing_reqs import compute_units def _test_eval( x, update, begin, end, stride=None, begin_mask=None, end_mask=None, squeeze_mask=None, ans=None, compute_unit=None, backend=None, x_builtin_dtype=None, run_conversion_test=True, ): # Test the value inference in pymil @mb.program(input_specs=[], opset_version=target.iOS18) def prog(): res = mb.slice_update( x=x, update=update, begin=begin, end=end, stride=stride, begin_mask=begin_mask, end_mask=end_mask, squeeze_mask=squeeze_mask, ) assert res.shape == ans.shape np.testing.assert_allclose(ans, res.val, atol=1e-04, rtol=1e-05) return res if not run_conversion_test: return # pymil to backend test x_val = np.array(x, dtype=np.float32) update_val = np.array(update, dtype=np.float32) begin_val = np.array(begin, dtype=np.int32) end_val = np.array(end, dtype=np.int32) input_placeholders = { "x": mb.placeholder(shape=x_val.shape, dtype=x_builtin_dtype), "update": mb.placeholder(shape=update_val.shape, dtype=x_builtin_dtype), "begin": mb.placeholder(shape=begin_val.shape, dtype=types.int32), "end": mb.placeholder(shape=end_val.shape, dtype=types.int32), } input_values = {"x": x_val, "update": update_val, "begin": begin_val, "end": end_val} expected_output_shape = list(ans.shape) expected_output_types = [expected_output_shape + [types.fp32]] expected_outputs = [ans] def build(x, update, begin, end): return mb.slice_update( x=x, update=update, begin=begin, end=end, begin_mask=begin_mask, end_mask=end_mask, squeeze_mask=squeeze_mask, stride=stride, ) run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) class TestSliceUpdate: @pytest.mark.parametrize( "compute_unit, backend, x_dtype, idx_dtype", itertools.product( compute_units, backends, (np.float16, np.float32, np.int32), (np.int16, np.int32, np.int8), ), ) def test_builder_to_backend_smoke(self, compute_unit, backend, x_dtype, idx_dtype): x_builtin_dtype = types.numpy_type_to_builtin_type(x_dtype) idx_builtin_dtype = types.numpy_type_to_builtin_type(idx_dtype) x_val = np.array(list(range(24))).reshape((2, 3, 4)).astype(x_dtype) update_val = np.array([[[-1, -2], [-3, -4]]]).astype(x_dtype) begin_val = np.array([1, 1, 1], dtype=idx_dtype) end_val = np.array([2, 3, 3], dtype=idx_dtype) input_placeholders = { "x": mb.placeholder(shape=x_val.shape, dtype=x_builtin_dtype), "update": mb.placeholder(shape=update_val.shape, dtype=x_builtin_dtype), "begin": mb.placeholder(shape=begin_val.shape, dtype=idx_builtin_dtype), "end": mb.placeholder(shape=end_val.shape, dtype=idx_builtin_dtype), } input_values = {"x": x_val, "update": update_val, "begin": begin_val, "end": end_val} expected_output_types = [(2, 3, 4, x_builtin_dtype)] * 2 copy_x_val = np.array(x_val, dtype=x_dtype) copy_x_val[1:2, 1:3, 1:3] = update_val expected_outputs = [copy_x_val, copy_x_val] def build(x, update, begin, end): begin_c = mb.const(val=begin_val) end_c = 
mb.const(val=end_val) update_c = mb.const(val=update_val) return [ mb.slice_update(x=x, update=update, begin=begin, end=end), mb.slice_update(x=x, update=update_c, begin=begin_c, end=end_c), ] run_compare_builder( build, input_placeholders, input_values, expected_output_types, expected_outputs, compute_unit=compute_unit, backend=backend, ) @pytest.mark.parametrize( "compute_unit, backend, x_dtype, idx_dtype", itertools.product( compute_units, backends, (np.float16, np.float32, np.int32), (np.int16, np.int32, np.int8), ), ) def test_stress(self, compute_unit, backend, x_dtype, idx_dtype): x_val = np.array(list(range(24))).reshape((2, 3, 4)).astype(x_dtype) x_builtin_dtype = types.numpy_type_to_builtin_type(x_dtype) update = np.random.rand(1, 1, 1).astype(x_dtype) ans = np.copy(x_val) ans[1:2, 1:2, 1:2] = update _test_eval( x=x_val, begin=[1, 1, 1], end=[2, 2, 2], update=update, ans=ans, compute_unit=compute_unit, backend=backend, x_builtin_dtype=x_builtin_dtype, ) update = np.random.rand(1, 2, 2).astype(x_dtype) ans = np.copy(x_val) ans[1:2, 1:3, 1:4:2] = update _test_eval( x=x_val, begin=[1, 1, 1], end=[2, 3, 4], stride=[1, 1, 2], update=update, ans=ans, compute_unit=compute_unit, backend=backend, x_builtin_dtype=x_builtin_dtype, ) update = np.random.rand(1, 2, 2).astype(x_dtype) ans = np.copy(x_val) ans[-3:-1, -3:-1, -3:-1] = update _test_eval( x=x_val, begin=[-3, -3, -3], end=[-1, -1, -1], update=update, ans=ans, compute_unit=compute_unit, backend=backend, x_builtin_dtype=x_builtin_dtype, # rdar://128037672 ([Bug][iOS18][Classic CPU] slice_update fails on classic CPU on an unittest) run_conversion_test=False, ) update = np.random.rand(1, 1, 1).astype(x_dtype) ans = np.copy(x_val) ans[0:-1, 0:-2, -3:-2] = update _test_eval( x=x_val, begin=[0, 0, -3], end=[-1, -2, -2], update=update, ans=ans, compute_unit=compute_unit, backend=backend, x_builtin_dtype=x_builtin_dtype, ) update = np.random.rand(1, 1, 1).astype(x_dtype) ans = np.copy(x_val) ans[-1:0:-2, -1:1:-1, -1:-3:-3] = update _test_eval( x=x_val, begin=[-1, -1, -1], end=[0, 1, -3], stride=[-2, -1, -3], update=update, ans=ans, compute_unit=compute_unit, backend=backend, x_builtin_dtype=x_builtin_dtype, ) update = np.random.rand(2, 2, 2).astype(x_dtype) ans = np.copy(x_val) ans[:2, 1:3, :4:2] = update _test_eval( x=x_val, begin=[1, 1, 1], end=[2, 3, 4], stride=[1, 1, 2], begin_mask=[True, False, True], update=update, ans=ans, compute_unit=compute_unit, backend=backend, x_builtin_dtype=x_builtin_dtype, ) update = np.random.rand(2, 2, 2).astype(x_dtype) ans = np.copy(x_val) ans[:, 1:, :4:2] = update _test_eval( x=x_val, begin=[1, 1, 1], end=[2, 3, 4], stride=[1, 1, 2], begin_mask=[True, False, True], end_mask=[True, True, False], update=update, ans=ans, compute_unit=compute_unit, backend=backend, x_builtin_dtype=x_builtin_dtype, ) update = np.random.rand(1, 2).astype(x_dtype) ans = np.copy(x_val) ans[1::1, 1, :3:2] = update _test_eval( x=x_val, begin=[1, 1, 1], end=[2, 3, 3], stride=[1, 1, 2], begin_mask=[False, False, True], end_mask=[True, False, False], squeeze_mask=[False, True, False], update=update, ans=ans, compute_unit=compute_unit, backend=backend, x_builtin_dtype=x_builtin_dtype, ) update = np.random.rand(2, 3, 4).astype(x_dtype) ans = np.copy(x_val) ans[:, :, :] = update _test_eval( x=x_val, begin=[0, 0, 0], end=[0, 0, 0], stride=[1, 1, 1], begin_mask=[True, True, True], end_mask=[True, True, True], update=update, ans=ans, compute_unit=compute_unit, backend=backend, x_builtin_dtype=x_builtin_dtype, ) update = np.random.rand(1, 
1).astype(x_dtype) ans = np.copy(x_val) ans[1:2, 1:2, 1] = update _test_eval( x=x_val, begin=[1, 1, 1], end=[2, 2, 0], stride=[1, 1, 1], squeeze_mask=[False, False, True], update=update, ans=ans, compute_unit=compute_unit, backend=backend, x_builtin_dtype=x_builtin_dtype, ) update = np.random.rand(1, 3, 4).astype(x_dtype) ans = np.copy(x_val) ans[1:2, ...] = update _test_eval( x=x_val, begin=[1, 0, 0], end=[2, 0, 0], stride=[1, 1, 1], begin_mask=[False, True, True], end_mask=[False, True, True], update=update, ans=ans, compute_unit=compute_unit, backend=backend, x_builtin_dtype=x_builtin_dtype, ) update = np.random.rand(2, 3, 4).astype(x_dtype) ans = np.copy(x_val) ans[...] = update _test_eval( x=x_val, begin=[0, 0, 0], end=[0, 0, 0], stride=[1, 1, 1], begin_mask=[True, True, True], end_mask=[True, True, True], update=update, ans=ans, compute_unit=compute_unit, backend=backend, x_builtin_dtype=x_builtin_dtype, ) update = np.random.rand(1, 3, 1).astype(x_dtype) ans = np.copy(x_val) ans[1:2, ..., 1:2] = update _test_eval( x=x_val, begin=[1, 0, 1], end=[2, 0, 2], stride=[1, 1, 1], begin_mask=[False, True, False], end_mask=[False, True, False], update=update, ans=ans, compute_unit=compute_unit, backend=backend, x_builtin_dtype=x_builtin_dtype, ) update = np.random.rand(2, 3).astype(x_dtype) ans = np.copy(x_val) ans[..., 1] = update _test_eval( x=x_val, begin=[0, 0, 1], end=[0, 0, 0], stride=[1, 1, 1], begin_mask=[True, True, False], end_mask=[True, True, False], squeeze_mask=[False, False, True], update=update, ans=ans, compute_unit=compute_unit, backend=backend, x_builtin_dtype=x_builtin_dtype, ) update = np.random.rand( 4, ).astype(x_dtype) ans = np.copy(x_val) ans[0, 0, :] = update _test_eval( x=x_val, begin=[0, 0, 0], end=[0, 0, 0], stride=[1, 1, 1], begin_mask=[False, False, True], end_mask=[False, False, True], squeeze_mask=[True, True, False], update=update, ans=ans, compute_unit=compute_unit, backend=backend, x_builtin_dtype=x_builtin_dtype, ) update = np.random.rand(1, 3, 4).astype(x_dtype) ans = np.copy(x_val) ans[1:2] = update _test_eval( x=x_val, begin=[1, 0, 0], end=[2, 0, 0], stride=[1, 1, 1], begin_mask=[False, True, True], end_mask=[False, True, True], update=update, ans=ans, compute_unit=compute_unit, backend=backend, x_builtin_dtype=x_builtin_dtype, ) update = np.random.rand(1, 1, 4).astype(x_dtype) ans = np.copy(x_val) ans[1:2, 1:2] = update _test_eval( x=x_val, begin=[1, 1, 0], end=[2, 2, 0], stride=[1, 1, 1], begin_mask=[False, False, True], end_mask=[False, False, True], update=update, ans=ans, compute_unit=compute_unit, backend=backend, x_builtin_dtype=x_builtin_dtype, ) update = np.random.rand(3, 4).astype(x_dtype) ans = np.copy(x_val) ans[1] = update _test_eval( x=x_val, begin=[1, 0, 0], end=[0, 0, 0], stride=[1, 1, 1], begin_mask=[False, True, True], end_mask=[False, True, True], squeeze_mask=[True, False, False], update=update, ans=ans, compute_unit=compute_unit, backend=backend, x_builtin_dtype=x_builtin_dtype, ) update = np.random.rand(2, 3, 4).astype(x_dtype) ans = np.copy(x_val) ans[:] = update _test_eval( x=x_val, begin=[0, 0, 0], end=[0, 0, 0], begin_mask=[True, True, True], end_mask=[True, True, True], update=update, ans=ans, compute_unit=compute_unit, backend=backend, x_builtin_dtype=x_builtin_dtype, ) update = np.random.rand(2, 3, 4).astype(x_dtype) ans = np.copy(x_val) ans[..., ::-1] = update _test_eval( x=x_val, begin=[0, 0, 0], end=[0, 0, 0], stride=[1, 1, -1], begin_mask=[True, True, True], end_mask=[True, True, True], update=update, ans=ans, 
compute_unit=compute_unit, backend=backend, x_builtin_dtype=x_builtin_dtype, ) def test_builder_eval_scalar_corner_cases(self): pytest.xfail( "rdar://128221986 ([Feature][Slice_update] The backend is not supporting scalar update for the slice_update op)" ) # two corner cases x_val = np.array([2.0]) update = np.float32(3.14) ans = np.copy(x_val) ans[0] = update _test_eval( x=x_val, begin=[0], end=[0], squeeze_mask=[True], update=update, ans=ans, run_conversion_test=False, # rank 0 input is not supported ) x_val = np.array([[[[1.0], [3.0]]]]) update = np.float32(7.78) ans = np.copy(x_val) ans[0, 0, 0, 0] = update _test_eval( x=x_val, begin=[0, 0, 0, 0], end=[0, 0, 0, 0], squeeze_mask=[True, True, True, True], update=update, ans=ans, run_conversion_test=False, # rank 0 input is not supported ) @staticmethod def test_rank_0_update_early_error_out(): """ Backend does not support rank-0 update for the slice_update op. coremltools should early error out until this radar is fixed: rdar://128221986 ([Feature][Slice_update] The backends is not supporting scalar update for the slice_update op) """ with pytest.raises( ValueError, match="rank-0 'update' is not supported in 'slice_update' op" ): @mb.program(input_specs=[], opset_version=target.iOS18) def prog(): return mb.slice_update( x=[0.0, 0.0], update=0.0, begin=[0], end=[1], squeeze_mask=[True], ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/iOS18/test_transformers.py0000644000000000000000000003403114672066616027542 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest import torch import coremltools as ct from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import get_new_symbol, types from coremltools.converters.mil.mil.ops.tests.iOS18 import backends from coremltools.converters.mil.mil.ops.tests.testing_utils import run_compare_builder from coremltools.converters.mil.testing_reqs import compute_units class TestScaledDotProductAttention: @staticmethod def _mb_eval_scaled_dot_product_attention( query: np.ndarray, key: np.ndarray, value: np.ndarray, mask: np.ndarray = None ) -> np.ndarray: @mb.program(opset_version=ct.target.iOS18) def prog(): return mb.scaled_dot_product_attention( query=query, key=key, value=value, attn_mask=mask, ) return ( prog.functions["main"] .find_ops(op_type="scaled_dot_product_attention")[0] .outputs[0] .val ) @staticmethod def _torch_scaled_dot_product_attention( query: np.ndarray, key: np.ndarray, value: np.ndarray, mask: np.ndarray = None ) -> np.ndarray: """ Two things: 1. torch cannot consume np.ndarray, so need to convert to torch.Tensor 2. 
torch cpu kernel has no half-precision support, so need to cast to float """ query_torch = torch.tensor(query).to(torch.float32) key_torch = torch.tensor(key).to(torch.float32) value_torch = torch.tensor(value).to(torch.float32) mask_torch = None if mask is not None: mask_torch = torch.tensor(mask) if mask.dtype != bool: mask_torch = mask_torch.to(torch.float32) return ( torch.nn.functional.scaled_dot_product_attention( query_torch, key_torch, value_torch, mask_torch ) .numpy() .astype(query.dtype) ) @pytest.mark.parametrize( "batches, float_dtype, mask_dtype", itertools.product( ([3], [3, 2], [3, 2, 4]), (np.float16, np.float32), (None, bool, np.float16, np.float32), ), ) def test_builder_eval_stress(self, batches, float_dtype, mask_dtype): S = 5 L = 7 E = 16 EV = 32 query_shape = batches + [L, E] key_shape = batches + [S, E] value_shape = batches + [S, EV] query = np.random.rand(*query_shape).astype(float_dtype) key = np.random.rand(*key_shape).astype(float_dtype) value = np.random.rand(*value_shape).astype(float_dtype) mask = None if mask_dtype is not None: mask = np.zeros((1, 1, S), dtype=mask_dtype) mask[:, :, S // 2 :] = False if mask_dtype is bool else -np.inf attention_coreml = self._mb_eval_scaled_dot_product_attention(query, key, value, mask) attention_torch = self._torch_scaled_dot_product_attention(query, key, value, mask) np.testing.assert_allclose( attention_coreml, attention_torch, atol=1e-6 if float_dtype == np.float32 else 1e-3, rtol=1e-6 if float_dtype == np.float32 else 1e-3, ) @pytest.mark.parametrize( "compute_unit, backend, batches, float_dtype, mask_dtype", itertools.product( compute_units, backends, ([3], [3, 2], [3, 2, 4]), (np.float16, np.float32), (None, bool, np.float16, np.float32), ), ) def test_builder_to_backend_stress( self, compute_unit, backend, batches, float_dtype, mask_dtype ): def build(query, key, value): return mb.scaled_dot_product_attention( query=query, key=key, value=value, ) def build_with_mask(query, key, value, mask): return mb.scaled_dot_product_attention( query=query, key=key, value=value, attn_mask=mask, ) S = 5 L = 7 E = 16 EV = 32 query_shape = batches + [L, E] key_shape = batches + [S, E] value_shape = batches + [S, EV] query = np.random.rand(*query_shape).astype(float_dtype) key = np.random.rand(*key_shape).astype(float_dtype) value = np.random.rand(*value_shape).astype(float_dtype) input_placeholders = { "query": mb.placeholder( shape=query.shape, dtype=types.numpy_type_to_builtin_type(float_dtype) ), "key": mb.placeholder( shape=key.shape, dtype=types.numpy_type_to_builtin_type(float_dtype) ), "value": mb.placeholder( shape=value.shape, dtype=types.numpy_type_to_builtin_type(float_dtype) ), } input_values = { "query": query, "key": key, "value": value, } mask = None if mask_dtype is not None: mask = np.zeros((1, 1, S), dtype=mask_dtype) mask[:, :, S - 1 :] = False if mask_dtype is bool else -np.inf input_placeholders["mask"] = mb.placeholder( shape=mask.shape, dtype=types.numpy_type_to_builtin_type(mask_dtype) ) input_values["mask"] = mask attention_torch = self._torch_scaled_dot_product_attention(query, key, value, mask) run_compare_builder( build if mask_dtype is None else build_with_mask, input_placeholders, input_values, expected_output_types=[attention_torch.shape + (types.fp32,)], expected_outputs=[attention_torch], compute_unit=compute_unit, backend=backend, atol=1e-6 if backend.precision == "fp32" and float_dtype == np.float32 else 1e-3, rtol=1e-6 if backend.precision == "fp32" and float_dtype == np.float32 else 1e-3, ) 
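    # The expected values in these tests come from torch's scaled_dot_product_attention, which,
    # for an additive float mask, computes softmax(query @ key^T / sqrt(E) + mask) @ value over
    # the last two dimensions. A minimal NumPy sketch of that reference math (illustrative only;
    # `_reference_sdpa` is not a coremltools helper):
    #
    #     def _reference_sdpa(query, key, value, mask=None):
    #         scores = query @ np.swapaxes(key, -1, -2) / np.sqrt(query.shape[-1])
    #         if mask is not None:
    #             scores = scores + mask  # a boolean mask would instead place -inf where False
    #         weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    #         weights = weights / weights.sum(axis=-1, keepdims=True)
    #         return weights @ value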
@pytest.mark.parametrize( "compute_unit, backend, batches, float_dtype, mask_dtype", itertools.product( compute_units, backends, ([2], [2, 3], [2, 3, 4]), (np.float16, np.float32), (None, bool, np.float16, np.float32), ), ) def test_builder_to_backend_dynamic_stress( self, compute_unit, backend, batches, float_dtype, mask_dtype ): def build(query, key, value): return mb.scaled_dot_product_attention( query=query, key=key, value=value, ) def build_with_mask(query, key, value, mask): return mb.scaled_dot_product_attention( query=query, key=key, value=value, attn_mask=mask, ) S = 2 L = 2 E = 4 EV = 32 query_shape = batches + [L, E] key_shape = batches + [S, E] value_shape = batches + [S, EV] query = np.random.rand(*query_shape).astype(float_dtype) key = np.random.rand(*key_shape).astype(float_dtype) value = np.random.rand(*value_shape).astype(float_dtype) dynamic_query_shape = query_shape dynamic_query_shape[0] = get_new_symbol() dynamic_query_shape[-2] = get_new_symbol() dynamic_key_shape = key_shape dynamic_key_shape[-2] = get_new_symbol() dynamic_value_shape = value_shape dynamic_value_shape[-2] = get_new_symbol() input_placeholders = { "query": mb.placeholder( shape=tuple(dynamic_query_shape), dtype=types.numpy_type_to_builtin_type(float_dtype), ), "key": mb.placeholder( shape=tuple(dynamic_key_shape), dtype=types.numpy_type_to_builtin_type(float_dtype) ), "value": mb.placeholder( shape=tuple(dynamic_value_shape), dtype=types.numpy_type_to_builtin_type(float_dtype), ), } input_values = { "query": query, "key": key, "value": value, } mask = None if mask_dtype is not None: mask = np.zeros((1, S), dtype=mask_dtype) mask[:, S - 1 :] = False if mask_dtype is bool else -np.inf dynamic_mask_shape = [] for i in range(len(mask.shape)): dynamic_mask_shape.append(get_new_symbol()) input_placeholders["mask"] = mb.placeholder( shape=tuple(dynamic_mask_shape), dtype=types.numpy_type_to_builtin_type(mask_dtype) ) input_values["mask"] = mask attention_torch = self._torch_scaled_dot_product_attention(query, key, value, mask) output_shape = list(attention_torch.shape) output_shape[0] = query_shape[0] output_shape[-2] = query_shape[-2] run_compare_builder( build if mask_dtype is None else build_with_mask, input_placeholders, input_values, expected_output_types=[tuple(output_shape) + (types.fp32,)], expected_outputs=[attention_torch], compute_unit=compute_unit, backend=backend, atol=1e-6 if backend.precision == "fp32" and float_dtype == np.float32 else 1e-3, rtol=1e-6 if backend.precision == "fp32" and float_dtype == np.float32 else 1e-3, ) def test_builder_invalid_shape(self): B = 3 S = 5 L = 7 E = 16 EV = 32 with pytest.raises( ValueError, match=( r"query, key, value must have a same rank, got\n" r"\* query rank = [0-9]+\n" r"\* key rank = [0-9]+\n" r"\* value rank = [0-9]+" ), ): query_shape = [B, L, E] key_shape = [S, E] value_shape = [S, EV] query = np.random.rand(*query_shape) key = np.random.rand(*key_shape) value = np.random.rand(*value_shape) self._mb_eval_scaled_dot_product_attention(query, key, value) with pytest.raises( ValueError, match=( r"query, key, value must have at lease rank 3 " r"for batch, sequence length, embedding, got rank [0-9]+" ), ): query_shape = [L, E] key_shape = [S, E] value_shape = [S, EV] query = np.random.rand(*query_shape) key = np.random.rand(*key_shape) value = np.random.rand(*value_shape) self._mb_eval_scaled_dot_product_attention(query, key, value) with pytest.raises( ValueError, match=( r"query, key, value must have a same batch dimension, got\n" r"\* query batch = 
\((?:\s*\d+\s*,)+\s*\d*\)\n" r"\* key batch = \((?:\s*\d+\s*,)+\s*\d*\)\n" r"\* value batch = \((?:\s*\d+\s*,)+\s*\d*\)" ), ): query_shape = [B + 1, L, E] key_shape = [B, S, E] value_shape = [B, S, EV] query = np.random.rand(*query_shape) key = np.random.rand(*key_shape) value = np.random.rand(*value_shape) self._mb_eval_scaled_dot_product_attention(query, key, value) with pytest.raises( ValueError, match=( r"query and key must have a same embedding dimension, got\n" r"\* query embedding = [0-9]+\n" r"\* key embedding = [0-9]+" ), ): query_shape = [B, L, E + 1] key_shape = [B, S, E] value_shape = [B, S, EV] query = np.random.rand(*query_shape) key = np.random.rand(*key_shape) value = np.random.rand(*value_shape) self._mb_eval_scaled_dot_product_attention(query, key, value) with pytest.raises( ValueError, match=( r"key and value must have a same sequence length, got\n" r"\* key sequence = [0-9]+\n" r"\* value sequence = [0-9]+" ), ): query_shape = [B, L, E] key_shape = [B, S + 1, E] value_shape = [B, S, EV] query = np.random.rand(*query_shape) key = np.random.rand(*key_shape) value = np.random.rand(*value_shape) self._mb_eval_scaled_dot_product_attention(query, key, value) with pytest.raises( ValueError, match=( r"key and mask must have a same sequence length, got\n" r"\* key sequence = [0-9]+\n" r"\* mask sequence = [0-9]+" ), ): query_shape = [B, L, E] key_shape = [B, S, E] value_shape = [B, S, EV] query = np.random.rand(*query_shape) key = np.random.rand(*key_shape) value = np.random.rand(*value_shape) mask = np.zeros(S + 1, dtype=bool) mask[-1] = True self._mb_eval_scaled_dot_product_attention(query, key, value, mask) with pytest.raises( ValueError, match=( r"Incompatible dim [0-9]+ in shapes " r"\((?:\s*\d+\s*,)+\s*\d*\) vs\. \((?:\s*\d+\s*,)+\s*\d*\)" ), ): query_shape = [B, L, E] key_shape = [B, S, E] value_shape = [B, S, EV] query = np.random.rand(*query_shape) key = np.random.rand(*key_shape) value = np.random.rand(*value_shape) mask = np.zeros((B + 1, L - 1, S), dtype=bool) mask[:, :, -1] = True self._mb_eval_scaled_dot_product_attention(query, key, value, mask) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/test_utils.py0000644000000000000000000001760214672066616025317 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.mil.ops.defs._utils import ( aggregated_pad, effective_kernel, spatial_dimensions_out_shape, ) class TestDilation: def test_kernel_and_dilations_not_same_size(self): np.testing.assert_raises_regex( ValueError, "kernel_shape.*dilations.*length", effective_kernel, kernel_shape=(1, 2, 3), dilations=(1, 2), ) def test_effective_kernel_dilation_1(self): actual = effective_kernel(kernel_shape=(1, 2, 3), dilations=(1, 1, 1)) expected = [1, 2, 3] np.testing.assert_equal(actual, expected) def test_effective_kernel_dilation_2(self): actual = effective_kernel(kernel_shape=(1, 2, 3), dilations=(2, 2, 2)) expected = [1, 3, 5] np.testing.assert_equal(actual, expected) def test_effective_kernel_dilation_3(self): actual = effective_kernel(kernel_shape=(1, 2, 3), dilations=(3, 3, 3)) expected = [1, 4, 7] np.testing.assert_equal(actual, expected) class TestAggregatePadding: def test_invalid_pad_type(self): np.testing.assert_raises_regex( ValueError, "Invalid padding pad_type", aggregated_pad, pad_type="bananas", kernel_shape=(1, 2, 3), ) def test_dilations_rank_different_from_input_rank(self): np.testing.assert_raises_regex( ValueError, "dilations must have same length as kernel_shape", aggregated_pad, pad_type="valid", # doesn't matter kernel_shape=(1, 2, 3), dilations=(4, 5), ) def test_custom_pad(self): actual = aggregated_pad( pad_type="custom", kernel_shape=(1, 2, 3), custom_pad=(7, 8, 9, 10, 11, 12) ) expected = [7 + 8, 9 + 10, 11 + 12] np.testing.assert_equal(actual, expected) def test_custom_pad_none(self): np.testing.assert_raises_regex( ValueError, "Invalid custom_pad", aggregated_pad, pad_type="custom", kernel_shape=(1, 2, 3), # doesn't matter custom_pad=None, ) def test_custom_pad_invalid(self): np.testing.assert_raises_regex( ValueError, "Invalid custom_pad", aggregated_pad, pad_type="custom", kernel_shape=(1, 2, 3), # doesn't matter custom_pad=(7, 8, 9, 10), # too few elements ) def test_valid_pad(self): actual = aggregated_pad(pad_type="valid", kernel_shape=(1, 2, 3),) expected = [0, 0, 0] np.testing.assert_equal(actual, expected) def test_valid_pad_4d(self): actual = aggregated_pad(pad_type="valid", kernel_shape=(1, 2, 3, 4),) expected = [0, 0, 0, 0] np.testing.assert_equal(actual, expected) def test_valid_pad_2d(self): actual = aggregated_pad(pad_type="valid", kernel_shape=(1, 2),) expected = [0, 0] np.testing.assert_equal(actual, expected) def test_valid_pad_1d(self): actual = aggregated_pad(pad_type="valid", kernel_shape=[4]) expected = [0] np.testing.assert_equal(actual, expected) def test_same_padding_no_dilation(self): actual = aggregated_pad( pad_type="same", input_shape=(5, 6, 7), kernel_shape=(2, 2, 2), strides=(1, 2, 2), ) expected = [1, 0, 1] np.testing.assert_equal(actual, expected) def test_same_padding_dilation_with_dilation(self): actual = aggregated_pad( pad_type="same", input_shape=(19, 20, 21), kernel_shape=(2, 2, 2), strides=(1, 2, 2), dilations=(5, 6, 7), ) expected = [5, 5, 7] np.testing.assert_equal(actual, expected) def test_same_padding_stride_same_as_input(self): actual = aggregated_pad( pad_type="same", input_shape=(5, 5), kernel_shape=(3, 3), strides=(5, 5), ) expected = [0, 0] np.testing.assert_equal(actual, expected) def test_same_padding_stride_larger_than_kernel_but_less_than_input(self): actual = aggregated_pad( pad_type="same", input_shape=(5, 5), 
kernel_shape=(3, 3), strides=(4, 4), ) expected = [2, 2] np.testing.assert_equal(actual, expected) def test_same_padding_none_input_shape(self): np.testing.assert_raises_regex( ValueError, "input_shape.*None", aggregated_pad, pad_type="same", kernel_shape=(1, 2, 3), strides=(1, 2, 3), ) def test_same_padding_input_shape_wrong_size(self): np.testing.assert_raises_regex( ValueError, "input_shape.*same length", aggregated_pad, pad_type="same", kernel_shape=(1, 2, 3), input_shape=(1, 2), strides=(1, 2, 3), ) def test_same_padding_none_strides(self): np.testing.assert_raises_regex( ValueError, "strides.*None", aggregated_pad, pad_type="same", kernel_shape=(1, 2, 3), input_shape=(1, 2, 3), ) def test_same_padding_strides_wrong_size(self): np.testing.assert_raises_regex( ValueError, "strides.*same length", aggregated_pad, pad_type="same", kernel_shape=(1, 2, 3), input_shape=(1, 2, 3), strides=(1, 2), ) class TestOutputShape: def test_custom_padding_shape(self): actual = spatial_dimensions_out_shape( pad_type="custom", input_shape=(3, 3, 3), kernel_shape=(2, 2, 2), strides=(2, 2, 2), custom_pad=(2, 0, 1, 2, 2, 3), ) expected = [2, 3, 4] np.testing.assert_equal(actual, expected) def test_valid_padding_shape(self): actual = spatial_dimensions_out_shape( pad_type="valid", input_shape=(7, 7), kernel_shape=(3, 3), strides=(1, 1) ) expected = [5, 5] np.testing.assert_equal(actual, expected) def test_valid_padding_shape_dilation_2(self): actual = spatial_dimensions_out_shape( pad_type="valid", input_shape=(7, 7), kernel_shape=(3, 3), strides=(1, 1), dilations=(2, 2), ) expected = [3, 3] np.testing.assert_equal(actual, expected) def test_valid_padding_shape_with_stride_2(self): actual = spatial_dimensions_out_shape( pad_type="valid", input_shape=(7, 7), kernel_shape=(3, 3), strides=(2, 2) ) expected = [3, 3] np.testing.assert_equal(actual, expected) def test_same_padding_shape(self): actual = spatial_dimensions_out_shape( pad_type="same", input_shape=(6, 6), kernel_shape=(2, 2), strides=(2, 2) ) expected = [3, 3] np.testing.assert_equal(actual, expected) def test_same_padding_shape_stride_2_input_not_multiple_of_kernel(self): actual = spatial_dimensions_out_shape( pad_type="same", input_shape=(5, 5), kernel_shape=(2, 2), strides=(2, 2) ) expected = [3, 3] np.testing.assert_equal(actual, expected) def test_same_padding_shape_dilation_2(self): actual = spatial_dimensions_out_shape( pad_type="same", input_shape=(5, 5), kernel_shape=(2, 2), strides=(1, 1), dilations=(2, 2), ) expected = [5, 5] np.testing.assert_equal(actual, expected) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/ops/tests/testing_utils.py0000644000000000000000000002075614672066616026021 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
#
# Use of this source code is governed by a BSD-3-clause license that can be
# found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause

import functools
from typing import Dict, List, Optional

import pytest

import coremltools as ct
from coremltools import _logger as logger
from coremltools.converters.mil import mil
from coremltools.converters.mil.input_types import TensorType
from coremltools.converters.mil.mil import Function, Placeholder
from coremltools.converters.mil.mil.passes.pass_pipeline import PassPipeline
from coremltools.converters.mil.mil.types.symbolic import is_symbolic
from coremltools.converters.mil.testing_reqs import BackendConfig
from coremltools.converters.mil.testing_utils import (
    compare_backend,
    ct_convert,
    validate_minimum_deployment_target,
)

UNK_VARIADIC = "*s_unk"
UNK_SYM = "s_unk"


def mark_api_breaking(breaking_opset_version: ct.target):
    """
    This decorator marks an API-breaking change for MIL op unit tests. For instance, if
    `test_op_1` is supposed to pass from iOS14 -> iOS16 and breaks starting from iOS17,
    we can use the following syntax:

        @mark_api_breaking(breaking_opset_version=ct.target.iOS17)
        def test_op_1(self, backend, ...):
            pass

    Note that the test function must take `backend`, of type `BackendConfig`, as an input.
    """

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            backend = kwargs.get("backend", None)
            if backend is None:
                raise ValueError(
                    f'Function {func} decorated with mark_api_breaking must take "backend" as an input.'
                )
            if backend.opset_version >= breaking_opset_version:
                pytest.skip(f"The test is breaking at opset version {breaking_opset_version}.")
            return func(*args, **kwargs)

        return wrapper

    return decorator


def run_compare_builder(
    build,
    input_placeholders,
    input_values=None,
    expected_output_types=None,
    expected_outputs=None,
    compute_unit=ct.ComputeUnit.CPU_ONLY,
    frontend_only=False,
    backend: Optional[BackendConfig] = None,
    atol=1e-04,
    rtol=1e-05,
    inputs=None,
    also_compare_shapes=True,
    converter=ct.convert,
    pass_pipeline: Optional[PassPipeline] = None,
    pred_iters: Optional[int] = None,
):
    """
    Inputs:

    - build: python function taking input of Vars and returning Var or list[Var].
      Each input argument in build must match a key in input_values / input_placeholders.

    - input_placeholders: str -> placeholder. It may not be an empty dict, as MLModel
      doesn't support functions with no inputs.

    - input_values: str -> np.array or PIL.Image. Keys must match those in input_placeholders.

    - expected_output_types: list[(shape, builtin_type)] or (shape, builtin_type).
      None skips type inference validation.

    - compute_unit: Enum[ct.ComputeUnit]. Compute unit for the coreml model.

    - expected_outputs: list[np.array] or np.array. Required iff frontend_only == False.

    - frontend_only: True to test up to proto generation.

    - inputs: type of inputs (either None (defaults to tensor) or [ct.ImageType]).

    - converter: function. Reference to the convert function to be used. Default: ct.convert.

    - backend: A BackendConfig that specifies the compute backend, precision and
      minimum_deployment_target.

    - pred_iters: Number of predictions to run on the mlmodel. For a stateful model, each
      prediction can have different numerical results. Can only be provided when the
      mlmodel is stateful.

    Returns:
        The converted mlmodel (MLModel), or Tuple[MLModel, MLState].
""" if backend is None: backend = BackendConfig( backend="neuralnetwork", precision="fp32", opset_version=ct.target.iOS14, ) minimum_deployment_target = backend.opset_version backend = (backend.backend, backend.precision) validate_minimum_deployment_target(minimum_deployment_target, backend) if not isinstance(expected_output_types, list): expected_output_types = [expected_output_types] if expected_outputs is not None and not isinstance(expected_outputs, list): expected_outputs = [expected_outputs] prog = mil.Program() with Function(input_placeholders, opset_version=minimum_deployment_target) as ssa_func: output_vars = build(**ssa_func.inputs) if isinstance(output_vars, tuple): output_vars = list(output_vars) elif not isinstance(output_vars, list): output_vars = [output_vars] ssa_func.set_outputs(output_vars) prog.add_function("main", ssa_func) # get output names for output_vars output_names = [x.name for x in output_vars] # Validate type inference msg = ( "Provided expected outputs types {} should match number of output" + " variables {}" ) assert_msg = msg.format(len(expected_output_types), len(output_vars)) assert len(output_vars) == len(expected_output_types), assert_msg for out_var, s in zip(output_vars, expected_output_types): # The output type will be casted by the `adjust_io_to_supported_types` pass, so we don't # check the output var dtype matching here. if UNK_VARIADIC in s[:-1]: msg = "Skip type checking for UNK_VARIADIC. Output shape: {} vs expected shape: {}" logger.debug(msg.format(out_var.shape, s[:-1])) continue expected_shape = s[:-1] msg = "Output {} shape: expect {}, got {}. Program:\n{}".format( out_var.name, expected_shape, out_var.shape, prog ) # No more variadic here. if len(out_var.shape) != len(expected_shape): raise ValueError(msg) # replace UNK_SYM in out_var.shape. output_shape = [ 0 if es == UNK_SYM else os for os, es in zip(out_var.shape, expected_shape) ] expected_shape = [0 if es == UNK_SYM else es for es in expected_shape] # convert float etc to int. output_shape = [i if is_symbolic(i) else int(i) for i in output_shape] expected_shape = [i if is_symbolic(i) else int(i) for i in expected_shape] if output_shape != expected_shape: raise ValueError(msg) mlmodel = ct_convert( prog, converter=converter, source="milinternal", convert_to=backend, inputs=inputs, compute_units=compute_unit, minimum_deployment_target=minimum_deployment_target, pass_pipeline=pass_pipeline, ) if frontend_only: return mlmodel state = mlmodel.make_state() if mlmodel._is_stateful() else None if pred_iters is not None: assert state is not None, "pred_iters can only be provided with stateful model." 
else: pred_iters = 1 for i in range(pred_iters): # get the expected outputs from each prediction iteration outputs = None if expected_outputs is not None: outputs = expected_outputs if pred_iters == 1 else expected_outputs[i] assert len(output_vars) == len(outputs), ( f"Provided expected_outputs {len(outputs)}" " should match number of output" f" variables {len(output_vars)}" ) outputs = {name: val for name, val in zip(output_names, outputs)} # run the mlmodel and compare the output numerical compare_backend( mlmodel=mlmodel, input_key_values=input_values, expected_outputs=outputs, atol=atol, rtol=rtol, also_compare_shapes=also_compare_shapes, dtype=backend[1], state=state, ) return mlmodel def construct_inputs_from_placeholders( input_placeholders: Dict[str, Placeholder], upper_bound: int ) -> [List[TensorType]]: """Construct the `inputs` param from placeholders with upper_bound.""" inputs: [List[TensorType]] = [] for input_name, placeholder in input_placeholders.items(): input_shape = [ ct.RangeDim(upper_bound=upper_bound) if is_symbolic(shape) else shape for shape in placeholder.sym_shape ] input_tensor_type = TensorType(name=input_name, shape=input_shape) inputs.append(input_tensor_type) return inputs ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2455468 coremltools-8.0/coremltools/converters/mil/mil/passes/0000755000000000000000000000000014672075535022073 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/__init__.py0000644000000000000000000000300614672066616024203 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause # Import all frontend/backend passes to make sure they got registered. 
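# Importing is sufficient because each pass module registers itself at import time via the
# @register_pass decorator (the pattern used by passes such as const_deduplication). A minimal
# sketch of that registration pattern, shown only for illustration (`my_example_pass` is
# hypothetical, not part of coremltools):
#
#     from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass
#     from coremltools.converters.mil.mil.passes.pass_registry import register_pass
#
#     @register_pass(namespace="common")
#     class my_example_pass(AbstractGraphPass):
#         def apply(self, prog):
#             pass
#
# so the imports below populate the pass registry as a side effect.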
from coremltools.converters.mil.backend.mil.passes import ( adjust_io_to_supported_types, fuse_activation_silu, fuse_pow2_sqrt, insert_image_preprocessing_op, sanitize_name_strings, ) from coremltools.converters.mil.backend.nn.passes import ( alert_return_type_cast, commingle_loop_vars, conv1d_decomposition, handle_return_inputs_as_outputs, handle_return_unused_inputs, handle_unused_inputs, mlmodel_passes, ) from coremltools.converters.mil.frontend.tensorflow2.ssa_passes import remove_vacuous_cond from coremltools.converters.mil.frontend.tensorflow.ssa_passes import ( backfill_make_list_elem_type, expand_tf_lstm, tf_lstm_to_core_lstm, ) from coremltools.converters.mil.frontend.torch.ssa_passes import ( torch_tensor_assign_to_core, torch_upsample_to_core_upsample, ) from coremltools.converters.mil.mil.passes.defs import ( cleanup, lower_complex_dialect_ops, optimize_activation, optimize_activation_quantization, optimize_conv, optimize_elementwise_binary, optimize_linear, optimize_normalization, optimize_quantization, optimize_repeat_ops, optimize_state, optimize_tensor_operation, preprocess, symbol_transform, ) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2455468 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/0000755000000000000000000000000014672075535023014 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/__init__.py0000644000000000000000000000033214672066616025123 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1726511965.249547 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/cleanup/0000755000000000000000000000000014672075535024443 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/cleanup/__init__.py0000644000000000000000000000147014672066616026556 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from .const_deduplication import const_deduplication from .const_elimination import const_elimination from .dead_code_elimination import dead_code_elimination from .dedup_op_and_var_names import dedup_op_and_var_names from .expand_dynamic_linear import expand_dynamic_linear from .fuse_reduce_mean import fuse_reduce_mean from .loop_invariant_elimination import loop_invariant_elimination from .noop_elimination import noop_elimination from .remove_redundant_ops import remove_redundant_ops from .remove_symbolic_reshape import remove_symbolic_reshape from .topological_reorder import topological_reorder ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/cleanup/const_deduplication.py0000644000000000000000000002335614672066616031060 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import hashlib from typing import Dict, List, Tuple, Union import numpy as np from coremltools.converters.mil.mil import Block, ListVar, Program, Var, types from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass @register_pass(namespace="common") class const_deduplication(AbstractGraphPass): """ Remove duplicated large constants (tensor with 100+ elements) For example .. code-block:: Input graph (where weight and bias are large constants): weight_q = const(weight) weight_k = const(weight) bias_q = const(bias) bias_k = const(bias) q_embedding = linear(x=q, weight=weight_q, bias=bias_q) k_embedding = linear(x=k, weight=weight_k, bias=bias_k) Output graph: weight_q = const(weight) bias_q = const(bias) q_embedding = linear(x=q, weight=weight_q, bias=bias_q) k_embedding = linear(x=k, weight=weight_q, bias=bias_q) Concretely, this graph pass consists of two stages: (1) Deduplication of ``const`` op: We consider a ``const`` as duplicated if there exists such a previous ``const`` that has same dtype and value (2) Deduplication of ``constexpr_*`` op: We consider a ``constexpr_*`` as duplicated if there exists such a previous ``constexpr_*`` that has the same ``op_type`` and input attributes. Support options: - ``const_threshold``: Skip deduplicating ``const`` ops that have smaller number of elements than a threshold. Defaults to ``100``. i.e. the constants with ``size < 100`` will not be deduplicated. """ # const with size < _const_threshold will not be deduplicated _const_threshold = 100 # length of the number value hashkey LENGTH_OF_HASHKEY = 100 DTYPE2ATOL = { types.fp16: 6e-8, types.fp32: 1e-12, } @property def const_threshold(self) -> int: return const_deduplication._const_threshold @const_threshold.setter def const_threshold(self, val: int) -> None: if not isinstance(val, int): raise ValueError(f"Expect option 'const_threshold' to be type of int. Got {type(val)}.") const_deduplication._const_threshold = val def apply(self, prog) -> None: for f in prog.functions.values(): self._constant_deduplication_block(f) def _deduplicate_const_across_functions(self, prog: Program) -> None: """ When there are duplicated consts across functions, we cannot create a common const op to be shared. Instead, we set the weight_id to the consts, to allow them share the same blob file value when lowering into milproto. """ # We first make sure that consts are deduplicated within each function, # to make sure we can maximize the weight sharing. 
self.apply(prog) # check no weight_id is set yet in the program for block in prog.functions.values(): for op in block.operations: if op.op_type != "const": continue if op.weight_id is not None: raise ValueError(f"const op {op.name} already has weight_id {op.weight_id}") # deduplication across functions blocks = list(prog.functions.values()) unique2duplicates_const = self.find_constants(blocks) for i, (k, v) in enumerate(unique2duplicates_const.items()): if len(v) == 0: continue # There could be cases where two functions are pointing to the same block all_vars = [k] + list(v) all_vars = list(set(all_vars)) for duplicate in all_vars: duplicate.op.weight_id = i def remove_duplicate_ops( self, block: Block, unique2duplicates: Dict[Var, List[Var]], force_replace: bool ) -> None: for unique in unique2duplicates: for duplicate in unique2duplicates[unique]: if duplicate in block.outputs: continue block.replace_uses_of_var_after_op( anchor_op=duplicate.op, old_var=duplicate, new_var=unique, force_replace=force_replace, ) block.remove_ops([duplicate.op]) @block_context_manager def _constant_deduplication_block(self, block: Block) -> None: for op in list(block.operations): for b in op.blocks: self._constant_deduplication_block(b) # Deduplication of ``const`` op unique2duplicates_const = self.find_constants([block]) self.remove_duplicate_ops(block, unique2duplicates_const, force_replace=False) # Deduplication of ``constexpr_*`` op # Note that, the ``find_constexpr`` must go after ``find_constants`` + ``remove_duplicate_ops`` for ``const`` ops. # Since after the above two functions, ``const`` ops with identical values are # deduplicated into a single ``Var`` object, which allows ``find_constexpr`` to # directly compare the ``const`` input attr pointers instead of the actual values. unique2duplicates_constexpr = self.find_constexprs([block]) self.remove_duplicate_ops(block, unique2duplicates_constexpr, force_replace=True) @staticmethod def find_constexprs(blocks: List[Block]) -> Dict[Var, List[Var]]: """ Given a list of blocks, return all constexpr in the blocks in such a format: {unique_var_0: [duplicated_var_0_0, duplicated_var_0_1, ...], unique_var_1: [duplicated_var_1_0, duplicated_var_1_1, ...], ... } """ hashkey_2_duplicates: Dict[Tuple, List[Var]] = {} for block in blocks: for op in list(block.operations): if "constexpr" not in op.op_type: continue if hasattr(op, "weight_key"): hash_key = [op.op_type, op.weight_key] else: hash_key = [op.op_type] for v in op.inputs.values(): hash_key.append(v.dtype) if v.val is None or const_deduplication.should_be_deduplicated(v.val): hash_key.append(v) else: hash_key.append(str(v.val)) hash_key = tuple(hash_key) if hash_key not in hashkey_2_duplicates: hashkey_2_duplicates[hash_key] = [op.outputs[0]] else: hashkey_2_duplicates[hash_key].append(op.outputs[0]) return {v[0]: v[1:] for v in hashkey_2_duplicates.values()} @staticmethod def should_be_deduplicated(val: Union[str, bool, np.ndarray]) -> bool: assert val is not None, "val should only be type of (str, bool, np.ndarray)" if isinstance(val, (str, bool)): return False if np.prod(val.shape) < const_deduplication._const_threshold: return False return True @staticmethod def find_constants(blocks: List[Block]) -> Dict[Var, List[Var]]: """ Given a list of blocks, return all constants in the blocks in such a format: {unique_var_0: [duplicated_var_0_0, duplicated_var_0_1, ...], unique_var_1: [duplicated_var_1_0, duplicated_var_1_1, ...], ... 
} """ unique2duplicates: Dict[Var, List[Var]] = {} # instead of brute-force C_N^2 comparison, use a hash map to be O(N) constant_dict: Dict[Tuple[str, types.type, Tuple[int], str], List[Var]] = {} for block in blocks: for op in list(block.operations): if op.op_type != "const": continue constant_var = op.outputs[0] if isinstance(constant_var, ListVar): continue if not const_deduplication.should_be_deduplicated(constant_var.val): continue shape = constant_var.shape dtype = constant_var.dtype value = constant_var.val hash = hashlib.sha1( np.ascontiguousarray(value.reshape(-1)[: const_deduplication.LENGTH_OF_HASHKEY]) ).hexdigest() if hasattr(op, "weight_key"): key = (op.weight_key, dtype, shape, hash) else: key = (dtype, shape, hash) if key not in constant_dict: constant_dict[key] = [constant_var] unique2duplicates[constant_var] = [] else: hash_collisions = constant_dict[key] existing_constant_var = None for var in hash_collisions: if np.allclose( value, var.val, rtol=0.0, atol=const_deduplication.DTYPE2ATOL.get(dtype, 1e-12), ): existing_constant_var = var break if existing_constant_var is None: hash_collisions.append(constant_var) unique2duplicates[constant_var] = [] else: unique2duplicates[existing_constant_var].append(constant_var) return unique2duplicates ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/cleanup/const_elimination.py0000644000000000000000000001127514672066616030541 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools import _logger as logger from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Program from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass @register_pass(namespace="common") class const_elimination(AbstractGraphPass): """ Replace non-``const`` ops that have ``const`` Var. Outputs are replaced with the ``const`` op. Example: .. code-block:: Given: %2, %3 = non_const_op(...) # %2 is const, %3 isn't const %4 = other_op(%2, %3) Result: _, %3 = non_const_op(...) # _ is the ignored output %2_const = const() # %2_const name is for illustration only %4 = other_op(%2_const, %3) Support options: - ``skip_const_by_size``: Skip folding ``const`` ops that have larger number of elements than a threshold. """ _skip_const_by_size = None @property def skip_const_by_size(self): return self._skip_const_by_size @skip_const_by_size.setter def skip_const_by_size(self, threshold: str): try: # Convert to float instead of int to support more flexible input such as `1e6`. threshold = float(threshold) except Exception as e: raise ValueError( f"Expected to get float threshold, but got `{threshold}` which cannot " f"be converted to float. 
{e}" ) self._skip_const_by_size = float(threshold) def apply(self, prog: Program): for f in prog.functions.values(): self._const_elimination_block(f) @block_context_manager def _const_elimination_block(self, block): # shallow copy hides changes on f.operations during the loop for op in list(block.operations): if op.op_type == "const": continue for b in op.blocks: self._const_elimination_block(b) all_outputs_are_replaced = True for output in op.outputs: if output.can_be_folded_to_const(): if ( self._skip_const_by_size is not None and len(output.shape) > 0 and output.val.size > self._skip_const_by_size ): logger.warning( f"The output ({output}) of op {op} is skipped in const elimination pass " f"because its val size ({output.val.size}) is larger than threshold " f"({self._skip_const_by_size})." ) all_outputs_are_replaced = False break res = mb.const( val=output.val, before_op=op, # same var name, but different python # instance does not violate SSA property. name=output.name, ) if op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=op, old_var=output, new_var=res, ): # rename the const output output.set_name(output.name + "_ignored") else: all_outputs_are_replaced = False # force const folding of the shape op elif output.val is not None and op.op_type == "shape": res = mb.const( val=output.val, before_op=op, # same var name, but different python # instance does not violate SSA property. name=output.name, ) op.enclosing_block.replace_uses_of_var_after_op( anchor_op=op, old_var=output, new_var=res, force_replace=True, ) # rename the const output output.set_name(output.name + "_ignored") else: all_outputs_are_replaced = False if all_outputs_are_replaced: op.remove_from_block() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/cleanup/dead_code_elimination.py0000644000000000000000000000617614672066616031306 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools import _logger as logger from coremltools.converters.mil.mil import Program from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass @register_pass(namespace="common") class dead_code_elimination(AbstractGraphPass): """ Eliminate unused ops in program. Ops whose outputs do not contribute to final outputs will be deleted. .. code-block:: # Before dead_code_elimination pass. main(%x: (2, 4, fp32)) { block0() { %const_2: (4, 2, fp32)* = const(val=[...]) %const_3: (4, fp32)* = const(val=[...]) %tx_0: (bool)* = const(val=False) %ty_0: (bool)* = const(val=False) %matmul_0: (2, 2, fp32) = matmul(x=%x, y=%const_2, transpose_x=%tx_0, transpose_y=%ty_0) %linear_0: (2, 4, fp32) = linear(x=%x, weight=%const_2, bias=%const_3) } -> (%linear_0) } # After dead_code_elimination pass. main(%x: (2, 4, fp32)) { block0() { %const_2: (4, 2, fp32)* = const(val=[...]) %const_3: (4, fp32)* = const(val=[...]) %linear_0: (2, 4, fp32) = linear(x=%x, weight=%const_2, bias=%const_3) } -> (%linear_0) } In the example above, ``%matmul_0`` is an op that is not used in the computation. This op and its input ops (``%tx_0`` and ``%ty_0``) are eliminated in this pass. 
""" def apply(self, prog: Program): for f in prog.functions.values(): self._dead_code_elimination_block(f) @staticmethod @block_context_manager def _dead_code_elimination_block(block): used_vars = set() ops_to_remove = list() # mark block's outputs to used used_vars.update(block.outputs) # mark outputs from coreml_update_state to used for op in block.operations: if op.op_type == "coreml_update_state": used_vars.update(op.outputs) for op in reversed(block.operations): # if none of op's output is used, delete op if not set(op.outputs).intersection(used_vars): ops_to_remove.append(op) continue # mark all op's inputs to used for _, input_var in op.inputs.items(): if isinstance(input_var, (tuple, list)): used_vars.update(list(input_var)) else: used_vars.update([input_var]) for b in op.blocks: used_in_block = dead_code_elimination._dead_code_elimination_block(b) used_vars.update(used_in_block) for op in ops_to_remove: logger.info('Removing op "{}" (type: {})'.format(op.name, op.op_type)) op.remove_from_block() return used_vars ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/cleanup/dedup_op_and_var_names.py0000644000000000000000000000737114672066616031501 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import collections import itertools from coremltools.converters.mil.mil import Function from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.pass_registry import register_pass @register_pass(namespace="common") class dedup_op_and_var_names(AbstractGraphPass): """ For each function, this pass renames ops and variables with the same name as any preceding ops/variables across all scopes in the given function, where the precedence is implementation-specific. Note that an op name and variable names are tracked separately, so an op may have the same name as a variable. The pass preserves input and output name. Raises ValueError if we cannot dedup without changing the input/output var names. .. code-block:: def prog(x): x = mb.cast(x=x, dtype="fp16", name="castop") x = mb.cast(x=x, dtype="fp32", name="castop") x = mb.square(x=x, name="square_last") return x # Before dedup pass, the op names are ["castop", "castop", "square_last"]. # After dedup pass, the op names are ["castop", "castop_1", "square_last"]. 
""" def apply(self, prog): for func in prog.functions.values(): # Handle function input/outputs as they cannot be changed (to maintain user interface) inputs = list(func.inputs.values()) io_vars = set(inputs + func.outputs) self._ensure_unique_var_names(io_vars) seen_var_names = set([v.name for v in io_vars]) seen_op_names = set() self._deduplicate_block(func, set(func.outputs), seen_var_names, seen_op_names) @staticmethod def _gen_new_name(seen_names, curr_name): if curr_name not in seen_names: return curr_name # make sure the name is unique for i in itertools.count(start=1): # loop from 1 to infinity # rename duplicated name start from 1: 'xxx_1' new_name = curr_name + "_" + str(i) if new_name not in seen_names: return new_name def _deduplicate_block(self, block, func_outputs, seen_var_names, seen_op_names): """ seen_var_names: set[str] seen_op_names: set[str] """ # Add block input (function input is handled separately) if not isinstance(block, Function): for v in block.inputs: v.name = self._gen_new_name(seen_var_names, v.name) seen_var_names.add(v.name) for op in list(block.operations): for b in op.blocks: self._deduplicate_block(b, func_outputs, seen_var_names, seen_op_names) if op.name is not None: op.name = self._gen_new_name(seen_op_names, op.name) seen_op_names.add(op.name) for v in op.outputs: if v in func_outputs: # func output is never renamed continue v.name = self._gen_new_name(seen_var_names, v.name) seen_var_names.add(v.name) @staticmethod def _ensure_unique_var_names(v_set): """ v_set: set[Variable] All variables in v_set should have different names. Raise ValueError otherwise """ names = [v.name for v in v_set] dup_names = [name for name, count in collections.Counter(names).items() if count > 1] if len(dup_names) > 0: raise ValueError(f"Var names {dup_names} is used both as function's input and output") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/cleanup/expand_dynamic_linear.py0000644000000000000000000001203314672066616031331 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.mil import Block from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Operation, Program, Var from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass @register_pass(namespace="common") class expand_dynamic_linear(AbstractGraphPass): """ Translate to ``linear`` when the operand is a descendant of const, since such an operand may be folded into const or fused into constexpr later by graph passes. In op translation, we prefer ``linear`` whenever possible because it requires const or constexpr ``weight`` and ``bias``. If such const folding or constexpr fusion did not happen, this pass would clean up the too-ambitious ``linear`` ops by replacing them with ``matmul`` ops. 
""" def apply(self, prog: Program) -> None: for f in prog.functions.values(): self._expand_dynamic_linear_block(f) @block_context_manager def _expand_dynamic_linear_block(self, block: Block) -> None: # use shallow copy to hide changes on block.operations during the loop, # since we do not need to deal with the newly expanded matmul + add ops for op in list(block.operations): for b in op.blocks: self._expand_dynamic_linear_block(b) if op.op_type == "linear": self._try_expand_dynamic_linear(op, block) @staticmethod def _is_operand_static(var: Var) -> bool: if var is None: return True op = var.op if op is None: return False op_type = op.op_type return op_type == "const" or op_type.startswith("constexpr_") def _try_expand_dynamic_linear(self, op: Operation, block: Block) -> None: assert op.op_type == "linear", "Should only apply to linear op" is_weight_static = self._is_operand_static(op.weight) is_bias_static = self._is_operand_static(op.bias) if is_weight_static: if is_bias_static: # static weight and bias, linear is good return else: # static weight with dynamic bias, so linear for weight matmul + add for bias add matmul = mb.linear(x=op.x, weight=op.weight, before_op=op) add = mb.add(x=matmul, y=op.bias, before_op=op, name=op.name) block.replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=add, ) op.remove_from_block() else: # dynamic weight, have to expand to at least matmul result = mb.matmul(x=op.x, y=op.weight, transpose_y=True, before_op=op) # static bias, try skipping add if all zero if is_bias_static: force_replace = False # if no bias provided, default to 0, can skip # if bias provided, need to inspect its value if op.bias is not None: bias_op = op.bias.op bias_op_type = bias_op.op_type if bias_op_type == "const": is_nonzero_bias = np.any(op.bias.val != 0) else: if bias_op_type == "constexpr_affine_dequantize": is_nonzero_bias = not bias_op.is_all_zeros() # cowardly treat other types of compressed bias as if nonzero else: is_nonzero_bias = True # For such a compressed all-zero bias, if we skip add, then # the result (matmul output) would only descend from weight but not bias, # i.e. need to force replacing descendant of bias if not is_nonzero_bias: force_replace = True if is_nonzero_bias: result = mb.add(x=result, y=op.bias, before_op=op, name=op.name) block.replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=result, force_replace=force_replace, ) op.remove_from_block() # dynamic bias, have to further expand to matmul + add else: result = mb.add(x=result, y=op.bias, before_op=op, name=op.name) block.replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=result, ) op.remove_from_block() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/cleanup/fuse_reduce_mean.py0000644000000000000000000001045214672066616030310 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import ( _check_child_op_type, _check_var_scalar_value, block_context_manager, ) from coremltools.converters.mil.mil.passes.pass_registry import register_pass from coremltools.converters.mil.mil.types.symbolic import is_symbolic @register_pass(namespace="common") class fuse_reduce_mean(AbstractGraphPass): """ Detect the ``reduce_sum`` ---> ``mul/real_div`` pattern than can be mapped to ``reduce_mean``. That is, the operation ``reduce_sum/count == reduce_mean``. `Input graph` .. code-block:: const (scalar) | input ----> reduce_sum ----> mul/real_div -----------> output `Output graph` .. code-block:: input --------> reduce_mean ---------> output """ def apply(self, prog): for f in prog.functions.values(): block_changed = True while block_changed: block_changed = self._fuse_reduce_mean_block(f) @staticmethod def _try_to_transform(reduce_sum_op, block): ops_to_remove = [] # check that the dimensions in the shape of the input to the reduce_sum op, # over which the reduction operation is being performed, are known input_shape = reduce_sum_op.x.shape if input_shape is None: return False axes = None if reduce_sum_op.axes is not None: axes = reduce_sum_op.axes.val if axes is None: return False count = 1 for dim in axes: if is_symbolic(input_shape[dim]): return False count *= input_shape[dim] # check that output of reduce_sum is not a block output if reduce_sum_op.outputs[0] in block.outputs: return False ops_to_remove.append(reduce_sum_op) # check that reduce_sum op is followed by either: # - mul op with scalar value 1/count # or # - real_div op with scalar value count if _check_child_op_type(reduce_sum_op, "mul"): child_op = list(reduce_sum_op.outputs[0].child_ops)[0] other_input = child_op.x if child_op.y == reduce_sum_op.outputs[0] else child_op.y if not _check_var_scalar_value(other_input, 1.0 / count, 1e-6): return False elif _check_child_op_type(reduce_sum_op, "real_div"): child_op = list(reduce_sum_op.outputs[0].child_ops)[0] if child_op.x != reduce_sum_op.outputs[0]: return False other_input = child_op.y if not _check_var_scalar_value(other_input, count, 1e-2): return False else: return False ops_to_remove.append(child_op) # remove all the ops, and replace with a reduce_mean op out_name = child_op.outputs[0].name x = mb.reduce_mean( x=reduce_sum_op.x, axes=reduce_sum_op.axes.val, keep_dims=reduce_sum_op.keep_dims.val, name=out_name, before_op=child_op, ) child_op.enclosing_block.replace_uses_of_var_after_op( anchor_op=child_op, old_var=child_op.outputs[0], new_var=x ) block.remove_ops(ops_to_remove) return True @block_context_manager def _fuse_reduce_mean_block(self, block): fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: block_changed = True while block_changed: block_changed = self._fuse_reduce_mean_block(b) if len(op.blocks) > 0: continue # start pattern match if mul op is encountered if op.op_type == "reduce_sum": if self._try_to_transform(op, block): fusion_occurred = True return fusion_occurred ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 
coremltools-8.0/coremltools/converters/mil/mil/passes/defs/cleanup/loop_invariant_elimination.py0000644000000000000000000001521214672066616032432 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass @register_pass(namespace="common") class loop_invariant_elimination(AbstractGraphPass): """ When a block does not modify a block input var, eliminate that block input var and use the corresponding var in the outer scope. Example: .. code-block:: # Before loop_invariant_elimination pass. # Notice that ``%b.x`` is constant through while loop iterates. main(%a: (1, 2, fp32), %b: (1, 2, fp32)) { block0() { %loop:0: (1, 2, fp32), %loop:1: (1, 2, fp32) = \ while_loop(loop_vars=(%a, %b)) loop_cond(%a.x, %b.x) { %cond_var: (bool) = some_op(x=%a.x, y=%b.x) } -> (%cond_var) loop_body(%a.x, %b.x) { %add_0: (1, 2, fp32) = add(x=%a.x, y=%b.x) } -> (%add_0, %b.x) } -> (%loop:0, %loop:1) } # After loop_invariant_elimination pass. main(%a: (1, 2, fp32), %b: (1, 2, fp32)) { block0() { %loop:1: (1, 2, fp32) = identity(x=%b) %loop:0: (1, 2, fp32) = \ while_loop(loop_vars=(%a)) loop_cond(%a.x) { %cond_var: (bool) = some_op(x=%a.x, y=%b) } -> (%cond_var) loop_body(%a.x) { %add_0: (1, 2, fp32) = add(x=%a.x, y=%b) } -> (%add_0) } -> (%loop:0, %loop:1) } where we eliminate loop invariant ``%b.x`` from ``while_loop``, which returns 1 instead of 2 outputs. We also preserve the return var names with identity. """ def apply(self, prog): for f in prog.functions.values(): self._loop_invariant_elimination_block(f) @staticmethod def _detect_loop_invariants(while_op): block = while_op.blocks[1] # body block loop_invariant_ids = [] # list of index in op.loop_vars, block.inputs for i, vx_in in enumerate(block.inputs): vx_out = block.outputs[i] # first output is cond var. return_input_as_output = vx_in == vx_out # this block output is a var from outside of the block enclosing_block = while_op.enclosing_block output_from_outside_of_block = enclosing_block.is_var_visible_in_block( vx_out, upto_op=while_op ) if return_input_as_output or output_from_outside_of_block: loop_invariant_ids.append(i) # TODO: All outputs that depend on only invariants are invariant. We # need to move computation out of while loop. return loop_invariant_ids @block_context_manager def _loop_invariant_elimination_block(self, block): # Phase 1: Find vars needed to be renamed. # # while_loop outputs need to be renamed if the output will be eliminated # (due to loop invariant) and is returned as block output (which would # change the return var name and the program interface). 
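        # For instance, in the class docstring example above, ``%b.x`` is invariant and
        # ``%loop:1`` is a block output, so phase 2 inserts ``%loop:1 = identity(x=%b)``
        # before the while_loop to keep the program's output names unchanged.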
# # list[(v_src, v_tgt, before_op)]: will rename v_src to v_tgt before # before_op (a while_loop) output_rename = [] for op in list(block.operations): for b in op.blocks: self._loop_invariant_elimination_block(b) if op.op_type != "while_loop": continue loop_invariant_ids = self._detect_loop_invariants(op) for i in loop_invariant_ids: output_rename.append((op.loop_vars[i], op.outputs[i], op)) if len(loop_invariant_ids) > 0: # Avoid the following case: # %a, %b = while_loop(..., name="b") # becomes # %b = identity(..., name="b") # %a = while_loop(..., name="b") # (two ops with the same name -> name collision) op.name = op.name + "_renamed" # Phase 2: insert rename ops. This changes block.operations for v_src, v_tgt, op in output_rename: if v_tgt in block.outputs: # rename the loop output to existing block output names res = mb.identity(x=v_src, before_op=op, name=v_tgt.name) op.enclosing_block.replace_uses_of_var_after_op( anchor_op=op, old_var=v_tgt, new_var=res ) # Phase 3: Perform loop invariant elimination without fear! for op in list(block.operations): if op.op_type != "while_loop": continue loop_invariant_ids = self._detect_loop_invariants(op) # replace uses of loop_invariants with its source from outside of the # while_loop op. for i in loop_invariant_ids: for block in op.blocks: block.replace_uses_of_var_after_op( anchor_op=None, old_var=block.inputs[i], new_var=op.loop_vars[i] ) # replace block inputs for block in op.blocks: block.remove_inputs([block.inputs[i] for i in loop_invariant_ids]) # remove invariants from while_loop loop_vars for i in loop_invariant_ids: # replace usage of while_loop outputs that we'll eliminate. op.enclosing_block.replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[i], new_var=op.loop_vars[i] ) # Remove after replacing to ensure program is valid for i in loop_invariant_ids: op.loop_vars[i].remove_child_op(op) op.loop_vars = tuple( v for i, v in enumerate(op.loop_vars) if i not in loop_invariant_ids ) op._input_vars["loop_vars"] = op.loop_vars # remove invariants from while_loop body_block outputs body_block = op.blocks[1] body_block.set_outputs( [v for i, v in enumerate(body_block.outputs) if i not in loop_invariant_ids] ) # op._output_vars doesn't include cond var op._output_vars = [ v for i, v in enumerate(op._output_vars) if i not in loop_invariant_ids ] # check healthy state op.enclosing_block.validate() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/cleanup/noop_elimination.py0000644000000000000000000001726714672066616030375 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass @register_pass(namespace="common") class noop_elimination(AbstractGraphPass): """ Remove ops that have no effect. .. code-block:: Given: %1 (1, 96, 128, 64, fp32) = ... %2 (1, 96, 128, 64, fp32) = reshape(%1) ... %3 (1, 96, 128, 64, fp32) = add(%2, constant) ... Result: %1 (1, 96, 128, 64, fp32) = ... %3 (1, 96, 128, 64, fp32) = add(%1, constant) ... 
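    Other patterns removed by this pass include a ``transpose`` whose permutation is already
    sorted, a full-range ``slice_by_index`` with non-negative strides, an ``add`` with an
    all-zero operand, a ``sub`` whose subtrahend is all zeros, a ``mul`` with an all-one
    operand, and a ``floor_div``/``real_div``/``pow`` whose second operand is all ones, in
    each case only when the output shape matches the input shape. For example (illustrative
    shapes):

    .. code-block::

        Given:
            %1 (2, 3, fp32) = ...
            %2 (2, 3, fp32) = transpose(%1, perm=[0, 1])   # identity permutation
            %3 (2, 3, fp32) = relu(%2)
            ...

        Result:
            %1 (2, 3, fp32) = ...
            %3 (2, 3, fp32) = relu(%1)
            ...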
""" _SUPPORTED_OPS = { "identity", "add", "mul", "floor_div", "pow", "real_div", "sub", "reshape", "split", "slice_by_index", "slice_by_size", "pad", "tile", "transpose", "upsample_nearest_neighbor", "upsample_bilinear", "resize_bilinear", "crop", "linear_activation", } def apply(self, prog): for f in prog.functions.values(): self._noop_elimination_block_wrapper(f) @staticmethod def _match_pattern(op): def remove_identity(op): if op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=op.x, ): op.enclosing_block.remove_ops([op]) return True return False def _remove_elementwise_binary(op, x, y): # We remove the ops that has op.x == x or op.y == y def has_all_elements_equal_to(var, value): if value is None: return False if var.val is not None: return np.all(var.val == value) elif var.op is not None and var.op.op_type == "fill": fill_value = var.op.value.val return fill_value is not None and (fill_value == value) else: return False if has_all_elements_equal_to(op.x, x): input_var = op.y elif has_all_elements_equal_to(op.y, y): input_var = op.x else: return False input_shape = input_var.sym_type output_shape = op.outputs[0].sym_type # We might be using elementwise as broadcasting if input_shape != output_shape: return False if op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=input_var, ): op.enclosing_block.remove_ops([op]) return True return False def remove_elementwise(op): if op.op_type in {"add"}: return _remove_elementwise_binary(op, 0, 0) elif op.op_type in {"mul"}: return _remove_elementwise_binary(op, 1, 1) elif op.op_type in {"floor_div", "pow", "real_div"}: return _remove_elementwise_binary(op, None, 1) elif op.op_type in {"sub"}: return _remove_elementwise_binary(op, None, 0) else: return False def remove_slice_by_index(op): input_shape = op.x.sym_type output_shape = op.outputs[0].sym_type if input_shape != output_shape: return False if op.stride is not None and op.stride.val is not None: stride = op.stride.val.flatten().tolist() if any([x < 0 for x in stride]): return False if op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=op.x, ): op.enclosing_block.remove_ops([op]) return True return False def remove_same_shape(op): input_shape = op.x.sym_type output_shape = op.outputs[0].sym_type if input_shape != output_shape: return False if op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=op.x, ): op.enclosing_block.remove_ops([op]) return True return False def remove_linear(op): if op.alpha.val != 1 or op.beta.val != 0: return False if op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=op.x, ): op.enclosing_block.remove_ops([op]) return True return False def remove_transpose(op): perm = np.array([p if p >= 0 else p + len(op.perm.val) for p in op.perm.val]) sorted_perm = np.sort(perm) if (perm != sorted_perm).any(): return False if op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=op.x, ): op.enclosing_block.remove_ops([op]) return True return False op_to_removal_fn = { "identity": remove_identity, "add": remove_elementwise, "mul": remove_elementwise, "floor_div": remove_elementwise, "pow": remove_elementwise, "real_div": remove_elementwise, "sub": remove_elementwise, "reshape": remove_same_shape, "split": remove_same_shape, "slice_by_index": remove_slice_by_index, "slice_by_size": remove_same_shape, "pad": 
remove_same_shape, "tile": remove_same_shape, "transpose": remove_transpose, "upsample_nearest_neighbor": remove_same_shape, "upsample_bilinear": remove_same_shape, "resize_bilinear": remove_same_shape, "crop": remove_same_shape, "linear_activation": remove_linear, } # abort if op output is a block output if op.outputs[0] in op.enclosing_block.outputs: return None if op.op_type in noop_elimination._SUPPORTED_OPS: if len(op.outputs) != 1: return None return op_to_removal_fn[op.op_type] return None @block_context_manager def _noop_elimination_block_wrapper(self, block): def _noop_elimination_block(block): status = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: block_changed = True while block_changed: block_changed = _noop_elimination_block(b) if len(op.blocks) > 0: continue remove_fn = noop_elimination._match_pattern(op) if remove_fn is not None and remove_fn(op): status = True return status block_changed = True while block_changed: block_changed = _noop_elimination_block(block) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/cleanup/remove_redundant_ops.py0000644000000000000000000003046114672066616031243 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import collections from typing import Dict, List import numpy as np from coremltools.converters.mil.mil import Block, Operation, Var from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass @register_pass(namespace="common") class remove_redundant_ops(AbstractGraphPass): """ If there are multiple ops with "identical" inputs, then they are redundant and all but one of them can be removed. This pass checks and removes such ops. Since all inputs to ops in MIL are named, two ops with same ``op_types`` can be compared by comparing their correspondingly named inputs. Inputs are treated as identical if one of the following is true: - The input is a constant var, in which case its value should have the same dtype and numerical value. - The input is a non constant var, in which case it should be the same var object. This pass iterates over the ops, takes its first output var, and then builds a candidate op list from the child ops of this var. This candidate ops list contains ops of the same ``op_type``, arranged in topological order. From each of these candidate ops in the list, the second, third, and subsequent ops are pairwise compared with the first op, and if identical to it, they are removed. For example: .. code-block:: Input: %0 = op0(...) %1 = op1(...) %2 = const(val=4.5) %3 = const(val=4.5) %4 = op2(%1, %0, %2) %5 = op3(%1, %0, %3) Output: %0 = op0(...) %1 = op1(...) %2 = const(val=4.5) %3 = const(val=4.5) # this will get removed later by dead code elimination pass %4 = op2(%1, %0, %2) In the example above, ``op3`` is removed and all uses of ``%5`` is replaced by ``%4``. For more examples, see "TestRemoveRedundantOpsPass". 
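    Note that candidate ops are compared through their named inputs, so two ops are only
    treated as identical when every input name is bound to an identical var, or to constants
    with equal values (large constants are expected to already share a single var after the
    const_deduplication pass). An illustrative contrast:

    .. code-block::

        # identical -> the second op is removed
        %4 = op2(x=%1, y=%0, z=%2)
        %5 = op2(x=%1, y=%0, z=%3)   # %3 is a const equal in value to %2

        # not identical -> both ops are kept (same vars, different input bindings)
        %6 = op2(x=%0, y=%1, z=%2)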
""" _NON_REDUNDANT_OPS = tuple() def __init__(self): self._num_of_visited_ops: int = ( 0 # Testing purpose, making sure the algorithm performs in O(N) ) self._ops_order: Dict[Block, Dict[Operation, int]] = {} def apply(self, prog): self._num_of_visited_ops = 0 for f in prog.functions.values(): self._remove_redundant_ops_in_block_wrapper(f) @staticmethod def _is_op_eligible_to_be_removed(op): if ( len(op.blocks) != 0 or op.op_type.startswith("random") or op.op_type in remove_redundant_ops._NON_REDUNDANT_OPS ): return False else: return True def _get_candidate_ops_list(self, prospective_ops_list: List[Operation]) -> List[Operation]: od = collections.OrderedDict() enclosing_blocks = [op.enclosing_block for op in prospective_ops_list] if len(set(enclosing_blocks)) > 1: # all candidate ops must belong to the same block return [] for op in prospective_ops_list: if remove_redundant_ops._is_op_eligible_to_be_removed(op): od[op] = self._ops_order[enclosing_blocks[0]][op] # Sort the ops according to their index of appearing in block.operations, which is # topologically sorted return [x[0] for x in sorted(od.items(), key=lambda t: t[1])] def _get_candidate_ops_lists_from_var(self, var: Var) -> List[List[Operation]]: """ Return a list of lists. Each element is a list of a subset of the child ops of var, which satisfies the following conditions: - they are of the same op_type - ops are not repeated in it. The .child_ops property of a var may sometimes contain an op repeated more than once - the ops are ordered based on the order in which they appear in the block.operations list (which is topologically sorted), with ops appearing earlier in that list appearing first here. """ candidate_ops_lists = [] op_types_to_ops = collections.OrderedDict() for op in var.child_ops: if op.op_type in op_types_to_ops: op_types_to_ops[op.op_type].append(op) else: op_types_to_ops[op.op_type] = [op] for v in op_types_to_ops.values(): if len(v) > 1: candidate_ops_list = self._get_candidate_ops_list(v) if len(candidate_ops_list) > 1: candidate_ops_lists.append(candidate_ops_list) return candidate_ops_lists @staticmethod def _are_ops_identical(op1, op2): """ Return True, if all inputs of op1 and op2 are identical. non-constant inputs must refer to the same object. For constant inputs, we only compare arrays with small size. Large size const ops are already deduplicated in the const_deduplication pass so we can compare the pointers. 
""" def _are_values_identical(val1, val2): if not isinstance(val1, np.ndarray) or not isinstance(val2, np.ndarray): return np.array_equal(np.array(val1), np.array(val2)) if val1.size != val2.size: return False if val1.size < 100: return np.array_equal(val1, val2) return False def _are_vars_identical(var1, var2): if var1 is var2: return True if var1.val is None and var2.val is None: if var1 != var2: return False elif var1.val is not None and var2.val is not None: if var1.dtype != var2.dtype: return False if not _are_values_identical(var1.val, var2.val): return False else: return False return True if op1 == op2: return True if op1.op_type != op2.op_type: return False if len(op1.inputs) != len(op2.inputs): return False for key, value1 in op1.inputs.items(): if key not in op2.inputs: return False value2 = op2.inputs[key] if isinstance(value1, Var) and isinstance(value2, Var): if not _are_vars_identical(value1, value2): return False elif isinstance(value1, (list, tuple)) and isinstance(value2, (list, tuple)): if len(value1) != len(value2): return False else: for i, v in enumerate(value1): if not _are_vars_identical(v, value2[i]): return False else: return False assert len(op1.blocks) == 0, "this method does not handle ops that have blocks in it" assert len(op2.blocks) == 0, "this method does not handle ops that have blocks in it" return True @staticmethod def _try_to_remove_ops(candidate_ops_list): # candidate_ops_list contains ops in topological order. # All the ops in candidate_ops_list will be compared to the first op, and removed if identical to it. # Removing ops later in the topological order is much easier, as their output vars # can simply be replaced by the output var of the first_op, this doesn't require # changing any op order in the block. if len(candidate_ops_list) < 2: return False first_op = candidate_ops_list[0] block = first_op.enclosing_block if block is None: return False # currently, we only consider the cases when the op has 1 output. # The replace var logic below only handles the single output case. if len(first_op.outputs) > 1: return False ops_to_remove = [] for op in candidate_ops_list[1:]: if op.enclosing_block is None: continue if op.outputs[0] not in block.outputs: # to make sure we don't remove an output op if remove_redundant_ops._are_ops_identical(first_op, op): ops_to_remove.append(op) if len(ops_to_remove) == 0: return False # remove uses of output vars of the ops to be removed. # This can be safely done, since all the ops in ops_to_remove # appear after first_op, hence first_op.outputs[0] variable is in # scope before the op's output var ops_removed = [] for op in ops_to_remove: if op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=first_op.outputs[0]): ops_removed.append(op) if len(ops_removed) == 0: return False block.remove_ops(ops_removed) return True def _try_to_transform(self, parent_var: Var) -> bool: """ scan the children ops to parent_var, to find and remove identical ops, if any. Returns True, if successful in finding such redundant ops. """ candidate_ops_lists = self._get_candidate_ops_lists_from_var(parent_var) block_changed = False for ops_list in candidate_ops_lists: # Iterate through the child ops list, to make sure that we check all possible combinations. for idx in range(len(ops_list)): if remove_redundant_ops._try_to_remove_ops(ops_list[idx:]): # We shoud not break right alway, so that we can keep # the time complexity low. 
block_changed = True return block_changed @block_context_manager def _remove_redundant_ops_in_block_wrapper(self, block): def _cache_topological_order_of_ops_in_block(block: Block): if block in self._ops_order: return self._ops_order[block] = {} for i, op in enumerate(block.operations): for b in op.blocks: _cache_topological_order_of_ops_in_block(b) self._ops_order[block][op] = i def _remove_redundant_ops_in_block(block): # cache the topological order of the ops, # so that we would not to query the index every single time. # Note that, the transformation in this particular graph pass # is going to preserve the topological order. And that is the # reason why we can do the cache in the very beginning. _cache_topological_order_of_ops_in_block(block) # iterate over the block inputs if isinstance(block.inputs, dict): block_input_var_list = list(block.inputs.values()) elif isinstance(block.inputs, (list, tuple)): block_input_var_list = block.inputs else: raise ValueError("Unrecognized type of block.inputs, its neither a list nor dict.") for input_var in block_input_var_list: if len(input_var.child_ops) > 1: self._try_to_transform(input_var) # iterate over the ops in the block graph_updated = False for op in list(block.operations): if op.op_type == "const": continue self._num_of_visited_ops += 1 for b in op.blocks: block_changed = True while block_changed: block_changed = _remove_redundant_ops_in_block(b) if len(op.outputs) > 0 and len(op.outputs[0].child_ops) > 1: # currently, we only check the first output of the op # this can be extended, if required, to check for other outputs. if self._try_to_transform(op.outputs[0]): # we don't need to break right away, in order to # keep the time complexity fast. graph_updated = True return graph_updated block_changed = True while block_changed: self._ops_order = {} block_changed = _remove_redundant_ops_in_block(block) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/cleanup/remove_symbolic_reshape.py0000644000000000000000000000741014672066616031724 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools import _logger as logger from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Program from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass from coremltools.converters.mil.mil.types.symbolic import any_variadic, is_symbolic, num_symbolic @register_pass(namespace="common") class remove_symbolic_reshape(AbstractGraphPass): """ Convert symbolic shape in ``reshape`` to integers. Note: This does not perform any optimization, but simply replaces symbols with positive integers if solved from volumetric constraint, or -1. Therefore, this pass fails if more than one symbol needs to be resolved to -1. .. code-block:: # Before remove_symbolic_reshape pass. main(%x: (s0, 4, fp32)) { block0() { %reshape_0_shape_0: (3,i32)^ = const(val=(s0, s1, 2)) %reshape_0: (s0, 2, 2, fp32) = reshape(x=%x, shape=%reshape_0_shape_0) } -> (%reshape_0) } # After remove_symbolic_reshape pass. 
main(%x: (s0, 4, fp32)) { block0() { %reshape_0_shape_0x: (3,i32)* = const(val=[-1, 2, 2]) %reshape_0: (-1, 2, 2, fp32) = reshape(x=%x, shape=%reshape_0_shape_0x) } -> (%reshape_0) } TODO (rdar://59165842): Use expand_dims, squeeze etc to use 0 instead of dynamic reshape with -1. """ def apply(self, prog: Program): for f in prog.functions.values(): num_changes = self._remove_symbolic_reshape_block(f) msg = "remove_symbolic_reshape: changed {} reshapes." logger.info(msg.format(num_changes)) @block_context_manager def _remove_symbolic_reshape_block(self, block): num_changes = 0 for op in list(block.operations): for b in op.blocks: num_changes += self._remove_symbolic_reshape_block(b) if op.op_type != "reshape": continue if op.shape.val is not None: # shape does not contain symbol. continue if op.shape.sym_val is None: # shape is runtime determined. continue if len(op.shape.child_ops) > 1: continue # Use output shape as `shape` shape = op.outputs[0].shape if any_variadic(shape): msg = ( "Cannot reshape to variadic from a compile time " + "shape argument. Variadic shape can only be achieved " + "via runtime shape argument. op: {}" ) raise ValueError(msg.format(op)) num_symbols = num_symbolic(shape) if num_symbols > 1: continue # Convert the one symbol to -1 integer_shape = [-1 if is_symbolic(i) else i for i in shape] shape_const = mb.const( val=integer_shape, name=op.shape.name + "x", before_op=op, ) reshaped = mb.reshape(x=op.x, shape=shape_const, name=op.name, before_op=op) op.enclosing_block.replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=reshaped ) # Remove all the ops at once block.remove_ops([op, op.shape.op]) num_changes += 1 return num_changes ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/cleanup/topological_reorder.py0000644000000000000000000001742614672066616031065 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass from coremltools.converters.mil.mil.utils import CacheDoublyLinkedList @register_pass(namespace="common") class topological_reorder(AbstractGraphPass): """ Topologically re-orders the list of operations in a program by places each operation closer to its first use, or at the end if it's not consumed by any other operation. Currently, This pass re-orders only Transpose and Cast operations. .. 
code-block:: # Example: input program main(x: (2, 4, fp32)) { x = mb.cast(x=x, dtype="fp16") x1 = mb.square(x=x) x1_t = mb.transpose(x=x1, perm=[1, 0]) x2 = mb.cast(x=x1_t, dtype="fp32") x3 = mb.log(x=x) x3_t = mb.transpose(x=x3, perm=[1, 0]) x4 = mb.cast(x=x3_t, dtype="fp32") x5 = mb.relu(x=x) x6 = mb.cast(x=x5, dtype="fp32") x7 = mb.relu(x=x6) x8 = mb.relu(x=x) } -> x2, x4, x7, x8 # After moving `cast` ops becomes main(x: (2, 4, fp32)) { x = mb.cast(x=x, dtype="fp16") x1 = mb.square(x=x) x1_t = mb.transpose(x=x1, perm=[1, 0]) x3 = mb.log(x=x) x3_t = mb.transpose(x=x3, perm=[1, 0]) x5 = mb.relu(x=x) x6 = mb.cast(x=x5, dtype="fp32") x7 = mb.relu(x=x6) x8 = mb.relu(x=x) x4 = mb.cast(x=x3_t, dtype="fp32") x2 = mb.cast(x=x1_t, dtype="fp32") } -> x2, x4, x7, x8 # After moving `transpose` ops becomes main(x: (2, 4, fp32)) { x = mb.cast(x=x, dtype="fp16") x1 = mb.square(x=x) x3 = mb.log(x=x) x5 = mb.relu(x=x) x6 = mb.cast(x=x5, dtype="fp32") x7 = mb.relu(x=x6) x8 = mb.relu(x=x) x3_t = mb.transpose(x=x3, perm=[1, 0]) x4 = mb.cast(x=x3_t, dtype="fp32") x1_t = mb.transpose(x=x1, perm=[1, 0]) x2 = mb.cast(x=x1_t, dtype="fp32") } -> x2, x4, x7, x8 """ def apply(self, prog): for f in prog.functions.values(): self._move_operations_to_the_end_block(f, ["cast", "transpose"]) @staticmethod @block_context_manager def _move_operations_to_the_end_block(block, op_type_to_move): # Moves ops with `op_type_to_move` in `block.operations` (list) to the end of the program. # Note: ops with `op_type_to_move` and is dead code are moved toward end, which can be eliminated # later with dead-code-elimination pass. # # Inputs: # - block (mil.Block): block to be modified in-place # - op_type_to_move (List[str]) # Returns: # - set[Var]: Set of vars consumed in block (or returned as block output) # first_use maps var to (index, op) representing the first op in block.operation that consumes this var. block.operations = list(block.operations) first_use = {} # var -> op ops_to_remove = [] # list of ops to be deleted at the end of pass for op in reversed(block.operations): current_op = op if op.op_type in op_type_to_move: # Mark op for deletion ops_to_remove.append(op) # Create list of operations consuming each output of current operation first_consumers = [first_use[v] for v in op.outputs if v in first_use] before_op = None # None means adding at the end of block if len(first_consumers) > 0: # Current op should be moved right before this first consumer of one of it's output. # 1. Find indices for all the consumer ops of outputs # 2. Move current op right before first consumer i.e. smallest index in block.operations first_use_indices = [ block.operations.index(first_use_op) for first_use_op in first_consumers ] before_op = block.operations[min(first_use_indices)] # Create new copy of current operation new_var = getattr(mb, op.op_type)(**op.inputs, before_op=before_op) if not isinstance(new_var, (list, tuple)): new_var = [new_var] # the new var should have the same name as the old var for i, old_var in enumerate(op.outputs): new_var[i].name = old_var.name # Override current_op to be newly created op to ensure `first_use` # points to newly created op instead of old one. 
current_op = new_var[0].op for old_output_var, new_output_var in zip(op.outputs, new_var): block.replace_uses_of_var_after_op( anchor_op=op, old_var=old_output_var, new_var=new_output_var ) # Collect input vars from sub-block if present relevant_inputs = set() for b in current_op.blocks: relevant_inputs |= topological_reorder._move_operations_to_the_end_block( b, op_type_to_move ) # Collect vars from operation input for v in current_op.inputs.values(): if isinstance(v, (tuple, list)): relevant_inputs |= set(v) continue relevant_inputs.add(v) # Mark current op as first use for all the input vars # a) of it's sub-block # b) of current op for v in relevant_inputs: # input is seen for the first time or # current_op is first_use i.e. appears before earlier recorded first_use. # Note: since ops are moved to the end, it's possible that an op is moved right after # earlier recorded first_use and in such cases, first_use should not be modified. # # == Example == # main( %x: (10, 20, fp32)(Tensor)) { # block0() { # %cast_0: (10, 20, fp16)(Tensor) = cast(x= %x, dtype = "fp16", name = "cast_0") # %cast_1: (10, 20, fp32)(Tensor) = cast(x= %cast_0, dtype = "fp32", name = "cast_1") # %transpose_0: (20, 10, fp16)(Tensor) = transpose(x= %cast_0, perm = [1, 0], name = "transpose_0") # %transpose_1: (10, 20, fp16)(Tensor) = transpose(x= %transpose_0, perm = [1, 0], name = "transpose_1") # } -> (% cast_1, % transpose_1) # } # In above example, `%cast_1` will be moved to the end of the block and first_use info for `%cast_0` # should point to `%transpose_0` and not to `%cast_1` if v not in first_use or block.operations.index( first_use[v] ) > block.operations.index(current_op): first_use[v] = current_op # Remove ops that are reordered block.operations = CacheDoublyLinkedList(block.operations) block.remove_ops(ops_to_remove) # Returns set of vars consumed in current block vars_consumed_in_block = set([v for v in first_use]) vars_consumed_in_block.update(block.outputs) return vars_consumed_in_block ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/lower_complex_dialect_ops.py0000644000000000000000000006354614672066616030631 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause """ This file contains a pass for lowering complex dialect ops into core ops. Steps for adding a new complex dialect op: 1. Add a dialect op in complex_dialect_ops.py 2. Add a corresponding lowering function In Step 2, notice that when implementing lower functions, we need to specify before_op during lowering to core ops. It's for both correctness as well as SSA graph's readability, because the generated core ops should be placed before the ops which were placed after that dialect op. More specifically, here is the SSA graph before lowering: block0() { %1 = complex_dialect_op(data=%input) %2 = core_op1(x=%1) %3 = core_op2(x=%2) } -> (%3) During lowering `complex_dialect_op`, we want all newly generated core ops are placed before the `core_op1`. 
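Registration of a lowering function is done through the ``LowerComplex.register_lower_func``
decorator defined below. A new lowering function roughly follows this shape (a sketch using a
hypothetical dialect op type ``complex_new_op``):

.. code-block::

    @LowerComplex.register_lower_func(op_type="complex_new_op")
    def _lower_complex_new_op(op):
        # Build the equivalent core ops with before_op=op so they are placed ahead of
        # the ops that consume the dialect op's outputs, then return the var(s) that
        # will replace those outputs.
        ...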
""" import functools from typing import Callable, Dict, Optional, Tuple import numpy as np from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.operation import Operation from coremltools.converters.mil.mil.ops.defs.complex_dialect_ops import ( fft_canonicalize_length_dim, fft_canonicalize_shapes_dims, ) from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass from coremltools.converters.mil.mil.var import ComplexVar, Var class LowerComplex: # The map recording each complex dialect op's lowering function. _lower_map: Dict[str, Callable] = dict() @staticmethod def register_lower_func(op_type: str) -> Callable: """Register lowering function for complex dialect ops.""" def lower_func_wrapper(func): @functools.wraps(func) def wrapper_inner(*args, **kwargs): return func(*args, **kwargs) if op_type in LowerComplex._lower_map: raise ValueError(f"The op {op_type} already got lowering function registered.") LowerComplex._lower_map[op_type] = func return wrapper_inner return lower_func_wrapper @staticmethod def has_lower_func(op_type: str) -> bool: """Check if the complex dialect op has corresponding lowering function.""" return op_type in LowerComplex._lower_map @staticmethod def get_lower_func(op_type: str) -> Callable: """Get the complex dialect op's lowering function.""" if not LowerComplex.has_lower_func(op_type): raise ValueError(f"The op {op_type} doesn't have any lowering function registered.") return LowerComplex._lower_map[op_type] def _resize_data(input_data: Var, dims: Tuple[int], sizes: Tuple[int], before_op: Operation) -> Var: """ For each dim in `dims`, resize the input data size to corresponding size in `sizes`. If the `size` is smaller than the data's size at `dim`, trim the data to `size`. If the `size` is larger, pad zeros to make the data reaches `size`. """ for (dim, size) in zip(dims, sizes): if size < input_data.shape[dim]: indices = mb.range_1d(start=0, end=size, step=1, before_op=before_op) input_data = mb.gather(x=input_data, indices=indices, axis=dim, before_op=before_op) elif size > input_data.shape[dim]: zero_shape = list(input_data.shape) zero_shape[dim] = size - input_data.shape[dim] zero_data = mb.fill(shape=zero_shape, value=0.0, before_op=before_op) input_data = mb.concat(values=[input_data, zero_data], axis=dim, before_op=before_op) return input_data def _restore_conj( input_data: ComplexVar, n: Var, dim: Var, before_op: Operation ) -> Tuple[Var, Var]: """ The input is interpreted as a one-sided Hermitian signal in the Fourier domain, as produced by rfft(). So we need to restore it to the full matrix by following X[i] = conj(X[-i]). Real part's conj is itself, and imaginary part's conj is negative of the original value. For odd number n, the last element is also included in mirroring input. 
""" real_data: Var = input_data.real imag_data: Var = input_data.imag size = 2 * (input_data.real.shape[dim.val] - 1) if n is not None and n.val is not None: size = n.val real_data = _resize_data( real_data, dims=(dim.val,), sizes=(size // 2 + 1,), before_op=before_op ) imag_data = _resize_data( imag_data, dims=(dim.val,), sizes=(size // 2 + 1,), before_op=before_op ) range_end = real_data.shape[dim.val] - 2 if size % 2 == 0 else real_data.shape[dim.val] - 1 if range_end > 0: mirror_indices = mb.range_1d(start=range_end, end=0, step=-1, before_op=before_op) real_part_mirror_values = mb.gather( x=real_data, indices=mirror_indices, axis=dim.val, before_op=before_op ) imag_part_mirror_values = mb.gather( x=imag_data, indices=mirror_indices, axis=dim.val, before_op=before_op ) imag_part_mirror_values = mb.mul(x=imag_part_mirror_values, y=-1.0, before_op=before_op) real_data = mb.concat( values=[real_data, real_part_mirror_values], axis=dim.val, before_op=before_op, ) imag_data = mb.concat( values=[imag_data, imag_part_mirror_values], axis=dim.val, before_op=before_op, ) return real_data, imag_data def _calculate_dft_matrix( n_fft: Var, onesided: bool = False, before_op: Operation = None, ) -> Tuple[Var, Var]: """ The core issue is how to derive the DFT matrix. As the DFT matrix is consist of different powers of `w`, where w=e^(2pi/N i), we need to separate the real and imaginary part of w. To achieve that, we need to find a way to construct the following matrix (from the power of `w` in DFT): 0 0 0 ... 0 0 1 2 ... N-1 0 2 4 ... 2(N-1) ... .... ... 0 N-1 2(N-1) ... (N-1)(N-1) This matrix could be derived by outer product of two range tensors. After getting that base matrix, we can take sin and cos to get the corresponding `sin_base` and `cos_base` matrix. If the onesided flag is passed, we can take advantage of Hermitian symmetry and return a weight matrix consisting of only the first (n_fft // 2 + 1) values. """ n_fft = mb.cast(x=n_fft, dtype="fp32", before_op=before_op) if onesided: half = mb.floor_div(x=n_fft, y=2.0, before_op=before_op) half = mb.add(x=half, y=1.0, before_op=before_op) tmp_x = mb.range_1d(start=0.0, end=(half if onesided else n_fft), step=1.0, before_op=before_op) tmp_y = mb.range_1d(start=0.0, end=n_fft, step=1.0, before_op=before_op) # Use MIL ops to calculate base = torch.outer(tmp, tmp) * (2 * torch.pi / N). tmp_x = mb.reshape(x=tmp_x, shape=[-1, 1], before_op=before_op) tmp_y = mb.reshape(x=tmp_y, shape=[1, -1], before_op=before_op) base = mb.matmul(x=tmp_x, y=tmp_y, before_op=before_op) base = mb.mul(x=base, y=2 * np.pi, before_op=before_op) base = mb.real_div(x=base, y=n_fft, before_op=before_op) # Get real part and imaginary part separately. cos_base = mb.cos(x=base, before_op=before_op) sin_base = mb.sin(x=base, before_op=before_op) return cos_base, sin_base def _fft_1d( input_real: Var, input_imag: Var, n: Optional[Var], dim: Optional[Var], norm: Optional[Var], before_op: Operation, inverse: bool = False, # For inverse FFT. ) -> Tuple[Var, Var]: """ 1-D FFT by DFT Matrix Multiplication. Now based on some math formulas including: * The addition of complex numbers is: (a+bi)+(c+di)=(a+c)+(b+d)i. * The multiplication of complex numbers is: (a+bi)(c+di)=ac+adi+bci−bd=(ac−bd)+(ad+bc)i. * Euler’s formula: e^xi=cosx+isinx. * Cosine is an even function: cos(−x)=cosx. * Sine is an odd function: sin(−x)=−(sinx). 
We can get * The real part output is: cos_base * input_real + sin_base * input_imag * The imaginary part output is: - (sin_base * input_real - cos_base * input_imag) That's how we calculate the real and imaginary part separately for the FFT. """ n, dim = fft_canonicalize_length_dim(input_real, n, dim) # Swaps target dim axis to the first axis. axes = list(range(len(input_real.shape))) axes[0] = dim axes[dim] = 0 transposed_input_real = mb.transpose(x=input_real, perm=axes, before_op=before_op) transposed_input_imag = mb.transpose(x=input_imag, perm=axes, before_op=before_op) # Trim or pad input according to n. transposed_input_real = _resize_data( input_data=transposed_input_real, dims=(0,), sizes=(n,), before_op=before_op, ) transposed_input_imag = _resize_data( input_data=transposed_input_imag, dims=(0,), sizes=(n,), before_op=before_op, ) # Calculate DFT matrix. original_shape = transposed_input_real.shape N = transposed_input_real.shape[0] reshaped_input_real = mb.reshape(x=transposed_input_real, shape=[N, -1], before_op=before_op) reshaped_input_imag = mb.reshape(x=transposed_input_imag, shape=[N, -1], before_op=before_op) N = mb.cast(x=N, dtype="fp32", before_op=before_op) cos_base, sin_base = _calculate_dft_matrix(N, onesided=False, before_op=before_op) if not inverse: real_part = mb.add( x=mb.matmul(x=cos_base, y=reshaped_input_real, before_op=before_op), y=mb.matmul(x=sin_base, y=reshaped_input_imag, before_op=before_op), before_op=before_op, ) imag_part = mb.sub( x=mb.matmul(x=sin_base, y=reshaped_input_real, before_op=before_op), y=mb.matmul(x=cos_base, y=reshaped_input_imag, before_op=before_op), before_op=before_op, ) imag_part = mb.mul(x=imag_part, y=-1.0, before_op=before_op) else: real_part = mb.sub( x=mb.matmul(x=cos_base, y=reshaped_input_real, before_op=before_op), y=mb.matmul(x=sin_base, y=reshaped_input_imag, before_op=before_op), before_op=before_op, ) imag_part = mb.add( x=mb.matmul(x=sin_base, y=reshaped_input_real, before_op=before_op), y=mb.matmul(x=cos_base, y=reshaped_input_imag, before_op=before_op), before_op=before_op, ) real_part = mb.reshape(x=real_part, shape=original_shape, before_op=before_op) imag_part = mb.reshape(x=imag_part, shape=original_shape, before_op=before_op) # Swaps dim back. real_part = mb.transpose(x=real_part, perm=axes, before_op=before_op) imag_part = mb.transpose(x=imag_part, perm=axes, before_op=before_op) # Normalization if needed. apply_scale = False scale = 1 if norm.val is not None: # For FFT, "forward" means normalize 1/N, while in IFFT, "backward" means normalize 1/N. if (not inverse) and (norm.val in ["forward", "ortho"]): apply_scale = True scale = N if norm.val == "forward" else mb.sqrt(x=N, before_op=before_op) if inverse and (norm.val in ["backward", "ortho"]): apply_scale = True scale = N if norm.val == "backward" else mb.sqrt(x=N, before_op=before_op) if apply_scale: real_part = mb.real_div(x=real_part, y=scale, before_op=before_op) imag_part = mb.real_div(x=imag_part, y=scale, before_op=before_op) return real_part, imag_part def _rfft_1d( input_real: Var, n: Optional[Var], dim: Optional[Var], norm: Optional[Var], before_op: Operation, ) -> Tuple[Var, Var]: """ It's similar to fft, but as the input is real data, the redundant info (the conjugate part) is removed in the result. 
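    Concretely, the real input is paired with an all-zero imaginary part, a full FFT is
    computed, and only the first n // 2 + 1 frequency bins along `dim` are kept (e.g.
    n = 8 keeps bins 0 through 4); the remaining bins are the conjugate mirror of the
    kept ones and carry no extra information.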
""" input_imag = mb.fill( shape=mb.shape(x=input_real, before_op=before_op), value=0.0, before_op=before_op, ) real_data, imag_data = _fft_1d(input_real, input_imag, n, dim, norm, before_op=before_op) remain_len = real_data.shape[dim.val] // 2 + 1 remain_indices = mb.range_1d(start=0, end=remain_len, step=1, before_op=before_op) real_data = mb.gather(x=real_data, indices=remain_indices, axis=dim.val, before_op=before_op) imag_data = mb.gather(x=imag_data, indices=remain_indices, axis=dim.val, before_op=before_op) return real_data, imag_data def _stft( input_real: Var, input_imaginary: Optional[Var], n_fft: Var, hop_length: Optional[Var], win_length: Optional[Var], window: Optional[Var], normalized: Optional[Var], onesided: Optional[Var], before_op: Operation, ) -> Tuple[Var, Var]: """ We can write STFT in terms of convolutions with a DFT kernel. At the end: * The real part output is: cos_base * input_real + sin_base * input_imag * The imaginary part output is: - (sin_base * input_real - cos_base * input_imag) Adapted from: https://github.com/adobe-research/convmelspec/blob/main/convmelspec/mil.py """ hop_length = hop_length or mb.floor_div(x=n_fft, y=4, before_op=before_op) # input should always be 2D should_increase_rank = input_real.rank == 1 if should_increase_rank: input_real = mb.expand_dims(x=input_real, axes=(0,), before_op=before_op) if input_imaginary: input_imaginary = mb.expand_dims(x=input_imaginary, axes=(0,), before_op=before_op) is_onesided = onesided and onesided.val cos_base, sin_base = _calculate_dft_matrix( n_fft, onesided=is_onesided, before_op=before_op) # create a window of centered 1s of the requested size if win_length: n_left = (n_fft.val - win_length.val) // 2 n_right = n_fft.val - win_length.val - n_left left = mb.fill(shape=(n_left,), value=0., before_op=before_op) if not window: window = mb.fill(shape=(win_length.val,), value=1., before_op=before_op) right = mb.fill(shape=(n_right,), value=0., before_op=before_op) # concatenate window = mb.concat(values=(left, window, right), axis=0, before_op=before_op) # apply time window if window: cos_base = mb.mul(x=window, y=cos_base, before_op=before_op) sin_base = mb.mul(x=window, y=sin_base, before_op=before_op) # conv with DFT kernel across the input signal sin_base = mb.sub(x=0., y=sin_base, before_op=before_op) cos_base = mb.expand_dims(x=cos_base, axes=(1,), before_op=before_op) sin_base = mb.expand_dims(x=sin_base, axes=(1,), before_op=before_op) hop_size = mb.expand_dims(x=hop_length, axes=(0,), before_op=before_op) signal_real = mb.expand_dims(x=input_real, axes=(1,), before_op=before_op) cos_windows_real = mb.conv(x=signal_real, weight=cos_base, strides=hop_size, pad_type='valid', before_op=before_op) sin_windows_real = mb.conv(x=signal_real, weight=sin_base, strides=hop_size, pad_type='valid', before_op=before_op) if input_imaginary: signal_imaginary = mb.expand_dims(x=input_imaginary, axes=(1,), before_op=before_op) cos_windows_imag = mb.conv(x=signal_imaginary, weight=cos_base, strides=hop_size, pad_type='valid', before_op=before_op) sin_windows_imag = mb.conv(x=signal_imaginary, weight=sin_base, strides=hop_size, pad_type='valid', before_op=before_op) # add everything together if input_imaginary: # sin base is already negative so subtract real_result = mb.sub(x=cos_windows_real, y=sin_windows_imag, before_op=before_op) imag_result = mb.add(x=sin_windows_real, y=cos_windows_imag, before_op=before_op) else: real_result = cos_windows_real imag_result = sin_windows_real # reduce the rank of the output if 
should_increase_rank: real_result = mb.squeeze(x=real_result, axes=(0,), before_op=before_op) imag_result = mb.squeeze(x=imag_result, axes=(0,), before_op=before_op) if normalized and normalized.val: divisor = mb.sqrt(x=mb.cast(x=n_fft, dtype="fp32", before_op=before_op), before_op=before_op) real_result = mb.real_div(x=real_result, y=divisor, before_op=before_op) imag_result = mb.real_div(x=imag_result, y=divisor, before_op=before_op) return real_result, imag_result def _wrap_complex_output(original_output: Var, real_data: Var, imag_data: Var) -> ComplexVar: return ComplexVar( name=original_output.name + "_lowered", sym_type=original_output.sym_type, real=real_data, imag=imag_data, ) @LowerComplex.register_lower_func(op_type="complex") def _lower_complex(op: Operation): return _wrap_complex_output(op.outputs[0], op.real_data, op.imag_data) @LowerComplex.register_lower_func(op_type="complex_real") def _lower_complex_real(op: Operation): complex_input: ComplexVar = op.data # Use an identity op to avoid the block's input name inconsistency issue. If we directly use # complex_input.real, the var's name could be inconsistent with the block's input name. result = mb.identity(x=complex_input.real, before_op=op) return result @LowerComplex.register_lower_func(op_type="complex_imag") def _lower_complex_imag(op: Operation): complex_input: ComplexVar = op.data # Use an identity op to avoid the block's input name inconsistency issue. If we directly use # complex_input.imag, the var's name could be inconsistent with the block's input name. result = mb.identity(x=complex_input.imag, before_op=op) return result @LowerComplex.register_lower_func(op_type="complex_fft") def _lower_complex_fft(op: Operation): if types.is_complex(op.data.dtype): real_data = op.data.real imag_data = op.data.imag else: real_data = op.data imag_data = mb.fill( shape=mb.shape(x=real_data, before_op=op), value=mb.cast( x=mb.const(val=0.0, before_op=op), dtype=real_data.dtype.__name__, before_op=op, ), before_op=op, ) real_data, imag_data = _fft_1d( real_data, imag_data, op.n, op.dim, op.norm, before_op=op, ) return _wrap_complex_output(op.outputs[0], real_data, imag_data) @LowerComplex.register_lower_func(op_type="complex_fftn") def _lower_complex_fftn(op: Operation): if types.is_complex(op.data.dtype): real_data = op.data.real imag_data = op.data.imag else: real_data = op.data imag_data = mb.fill( shape=mb.shape(x=real_data, before_op=op), value=mb.cast( x=mb.const(val=0.0, before_op=op), dtype=real_data.dtype.__name__, before_op=op, ), before_op=op, ) shapes, dims = fft_canonicalize_shapes_dims(real_data, op.shapes, op.dims) for shape, dim in zip(shapes, dims): real_data, imag_data = _fft_1d( real_data, imag_data, n=mb.const(val=shape, before_op=op), dim=mb.const(val=dim, before_op=op), norm=op.norm, before_op=op, ) return _wrap_complex_output(op.outputs[0], real_data, imag_data) @LowerComplex.register_lower_func(op_type="complex_rfft") def _lower_complex_rfft(op: Operation): real_data, imag_data = _rfft_1d(op.data, op.n, op.dim, op.norm, before_op=op) return _wrap_complex_output(op.outputs[0], real_data, imag_data) @LowerComplex.register_lower_func(op_type="complex_rfftn") def _lower_complex_rfftn(op: Operation): shapes, dims = fft_canonicalize_shapes_dims(op.data, op.shapes, op.dims) real_data, imag_data = _rfft_1d( op.data, mb.const(val=shapes[-1], before_op=op), mb.const(val=dims[-1], before_op=op), op.norm, before_op=op, ) for shape, dim in zip(shapes[:-1], dims[:-1]): real_data, imag_data = _fft_1d( real_data, imag_data, 
n=mb.const(val=shape, before_op=op), dim=mb.const(val=dim, before_op=op), norm=op.norm, before_op=op, ) return _wrap_complex_output(op.outputs[0], real_data, imag_data) @LowerComplex.register_lower_func(op_type="complex_ifft") def _lower_complex_ifft(op: Operation): real_data, imag_data = _fft_1d( op.data.real, op.data.imag, op.n, op.dim, op.norm, before_op=op, inverse=True ) return _wrap_complex_output(op.outputs[0], real_data, imag_data) @LowerComplex.register_lower_func(op_type="complex_ifftn") def _lower_complex_ifftn(op: Operation): real_data = op.data.real imag_data = op.data.imag shapes, dims = fft_canonicalize_shapes_dims(real_data, op.shapes, op.dims) for shape, dim in zip(shapes, dims): real_data, imag_data = _fft_1d( real_data, imag_data, n=mb.const(val=shape, before_op=op), dim=mb.const(val=dim, before_op=op), norm=op.norm, before_op=op, inverse=True, ) return _wrap_complex_output(op.outputs[0], real_data, imag_data) @LowerComplex.register_lower_func(op_type="complex_irfft") def _lower_complex_irfft(op: Operation): real_data, imag_data = _restore_conj(op.data, op.n, op.dim, before_op=op) n, dim = fft_canonicalize_length_dim(op.data, op.n, op.dim, c2r=True) real_data, imag_data = _fft_1d( real_data, imag_data, mb.const(val=n, before_op=op), mb.const(val=dim, before_op=op), op.norm, before_op=op, inverse=True, ) return real_data @LowerComplex.register_lower_func(op_type="complex_irfftn") def _lower_complex_irfftn(op: Operation): real_data = op.data.real imag_data = op.data.imag shapes, dims = fft_canonicalize_shapes_dims(real_data, op.shapes, op.dims, c2r=True) # For all but last dim/shape, do N-D IFFT. for shape, dim in zip(shapes[:-1], dims[:-1]): real_data, imag_data = _fft_1d( real_data, imag_data, n=mb.const(val=shape, before_op=op), dim=mb.const(val=dim, before_op=op), norm=op.norm, before_op=op, inverse=True, ) # For the last dim/shape, do 1-D IRFFT. 
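    # The last dim is the one rfftn stored one-sided (Hermitian), so it needs special
    # treatment: restore the full conjugate-symmetric spectrum, run a 1-D inverse FFT
    # on it, and finally trim/pad the real result back to the requested length n.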
n: Var = mb.const(val=shapes[-1], before_op=op) dim: Var = mb.const(val=dims[-1], before_op=op) real_data, imag_data = _restore_conj( input_data=_wrap_complex_output(op.outputs[0], real_data, imag_data), n=n, dim=dim, before_op=op, ) real_data, imag_data = _fft_1d( real_data, imag_data, n, dim, op.norm, before_op=op, inverse=True ) real_data = _resize_data(real_data, dims=(dim.val,), sizes=(n.val,), before_op=op) return real_data @LowerComplex.register_lower_func(op_type="complex_stft") def _lower_complex_stft(op: Operation): is_complex = types.is_complex(op.input.dtype) # check parameters for validity if op.win_length and op.win_length.val > op.n_fft.val: raise ValueError("Window length must be less than or equal to n_fft") if is_complex and op.onesided and op.onesided.val: raise ValueError("Onesided is only valid for real inputs") real, imag = _stft( op.input.real if is_complex else op.input, op.input.imag if is_complex else None, op.n_fft, op.hop_length, op.win_length, op.window, op.normalized, op.onesided, before_op=op) return _wrap_complex_output(op.outputs[0], real, imag) @LowerComplex.register_lower_func(op_type="complex_shape") def _lower_complex_shape(op: Operation): return mb.shape(x=op.data.real, before_op=op) @LowerComplex.register_lower_func(op_type="complex_abs") def _lower_complex_abs(op: Operation): mag_r, mag_i = (mb.square(x=x, before_op=op) for x in (op.x.real, op.x.imag)) mag = mb.add(x=mag_r, y=mag_i, before_op=op) return mb.sqrt(x=mag, before_op=op) def _match_and_replace_dialect_op(block, op): if not LowerComplex.has_lower_func(op.op_type): return False lower_res = LowerComplex.get_lower_func(op.op_type)(op) if not op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=lower_res, ): raise ValueError(f"Unable to lower complex dialect op {op}") block.remove_ops([op]) return True @block_context_manager def _lower_complex_dialect_ops_in_block(block): for op in list(block.operations): _match_and_replace_dialect_op(block, op) @register_pass(namespace="common") class lower_complex_dialect_ops(AbstractGraphPass): """ Identify complex data related ops and replace it by using real and imaginary parts separately. The goal of this pass it to lower complex dialect ops into core ops. This pass also checks if the output is complex. As Core ML doesn't support complex data yet, it errors out early when detecting complex output. Input graph (`complex` and `complex_real` are complex dialect ops): %complex_data = complex(real_data=%real_data, imag_data=%imag_data) %real_data = complex_real(data=%complex_data) return %real_data Output graph (only core ops, no complex dialect ops): %complex_data_real = identity(x=%real_data) %complex_data_imag = identity(x=%imag_data) %real_data = identity(data=%complex_data_real) return %real_data """ def apply(self, prog): for block in prog.functions.values(): # Early error out for complex data output. for out_var in block.outputs: if types.is_complex(out_var.dtype): raise ValueError( "MIL doesn't support complex data as model's output, please " "extract real and imaginary parts explicitly." ) _lower_complex_dialect_ops_in_block(block) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/optimize_activation.py0000644000000000000000000006303214672066616027453 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.experimental.passes.generic_pass_infrastructure import ( fuse_all_blocks, ) from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import get_new_symbol from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import ( _check_child_op_type, _check_var_scalar_value, _check_var_scalar_value_in_interval, block_context_manager, ) from coremltools.converters.mil.mil.passes.pass_registry import register_pass @register_pass(namespace="common") class fuse_gelu_exact(AbstractGraphPass): """ Identify the pattern that corresponds to the exact version of ``gelu``, and replace it with a single ``gelu`` layer with ``mode=EXACT``. The pattern is ``y = 0.5 * x * (1 + erf (x / srqt (2))``, which can be represented by one of the following: .. code-block:: (1) [...] ----> div (1.414) ---> erf ---> add (1) -----> mul (0.5) ---> mul ---> [...] | ^ | | |------------------------------------------------------------------- (2) [...] ----> div (1.414) ---> erf ---> add (1) -----> mul ---> mul (0.5) ---> [...] | ^ | | |---------------------------------------------------- (3) [...] ----> div (1.414) ---> erf ---> add (1) -----> mul ------> [...] | ^ | | |---------------> mul(0.5) -------------------------- All of them are converted to: [...] ----> gelu (mode=EXACT) ---> [...] """ def apply(self, prog): for f in prog.functions.values(): block_changed = True while block_changed: block_changed = self._fuse_gelu_exact_block(f) @staticmethod def _try_to_transform(op, block): ops_to_remove = [] if op.x.val is None and op.y.val is None: return False # check either the op is mul(1/sqrt(2)) or real_div(sqrt(2)) root_var = op.x if op.y.val is not None else op.y if op.op_type == "real_div": if not _check_var_scalar_value(op.y, 2**0.5): return False elif op.op_type == "mul": if not ( _check_var_scalar_value(op.x, 2**-0.5) or _check_var_scalar_value(op.y, 2**-0.5) ): return False ops_to_remove.append(op) # check if the child op is erf if not _check_child_op_type(op, "erf"): return False erf_op = list(op.outputs[0].child_ops)[0] ops_to_remove.append(erf_op) # check if the child op is add if not _check_child_op_type(erf_op, "add"): return False add_op = list(erf_op.outputs[0].child_ops)[0] if not (_check_var_scalar_value(add_op.x, 1) or _check_var_scalar_value(add_op.y, 1)): return False ops_to_remove.append(add_op) # check if the child op is mul if not _check_child_op_type(add_op, "mul"): return False mul_op = list(add_op.outputs[0].child_ops)[0] # now we have two case: # (1) first mul by 0.5 and by the root var if _check_var_scalar_value(mul_op.x, 0.5) or _check_var_scalar_value(mul_op.y, 0.5): ops_to_remove.append(mul_op) if not _check_child_op_type(mul_op, "mul"): return False mul_op_2 = list(mul_op.outputs[0].child_ops)[0] if not (mul_op_2.x == root_var or mul_op_2.y == root_var): return False ops_to_remove.append(mul_op_2) # (2) first mul by the root var and then mul by 0.5 elif mul_op.x == root_var or mul_op.y == root_var: ops_to_remove.append(mul_op) if not _check_child_op_type(mul_op, "mul"): return False mul_op_2 = list(mul_op.outputs[0].child_ops)[0] if not ( _check_var_scalar_value(mul_op_2.x, 0.5) or _check_var_scalar_value(mul_op_2.y, 0.5) ): return False ops_to_remove.append(mul_op_2) else: 
other_parent_op = mul_op.x.op if mul_op.y == add_op.outputs[0] else mul_op.y.op if other_parent_op.op_type != "mul": return False if not ( _check_var_scalar_value(other_parent_op.x, 0.5) or _check_var_scalar_value(other_parent_op.y, 0.5) ): return False if not (other_parent_op.x == root_var or other_parent_op.y == root_var): return False ops_to_remove.append(other_parent_op) ops_to_remove.append(mul_op) mul_op_2 = mul_op # check that none of the op in this pattern is connected to the output # (except the last mul op) for op in ops_to_remove[:-1]: for out in op.outputs: if out in block.outputs: return False # remove all the ops, and replace with a gelu op out_name = mul_op_2.outputs[0].name x = mb.gelu(x=root_var, mode="EXACT", name=out_name, before_op=op) mul_op_2.enclosing_block.replace_uses_of_var_after_op( anchor_op=mul_op_2, old_var=mul_op_2.outputs[0], new_var=x ) # Remove all the ops at once block.remove_ops(ops_to_remove) return True @block_context_manager def _fuse_gelu_exact_block(self, block): fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: block_changed = True while block_changed: block_changed = self._fuse_gelu_exact_block(b) if len(op.blocks) > 0: # This op can't be real_div or mul continue if op.op_type in ["mul", "real_div"]: if self._try_to_transform(op, block): fusion_occurred = True return fusion_occurred @register_pass(namespace="common") class fuse_gelu_tanh_approximation(AbstractGraphPass): """ Identify the pattern that corresponds to the ``tanh`` approximate version of ``gelu``, and replace it with a single ``gelu`` layer with ``mode=TANH_APPROXIMATION``. The implementation of this pass uses the generic graph pattern matching and transform algorithm implemented in ``coremltools.converters.mil.experimental.passes.generic_pass_infrastructure`` and documented in ``coremltools/converters/mil/experimental/passes/readme.md``. `Graph for` ``get_gelu_pattern1()`` ``y = x * (0.5 * (tanh(((.0447)x^3 + x ) * sqrt(2/pi)) + 1))`` .. code-block:: [...] -----> pow (3) ----> mul (.044715) ---> add -----> mul (sqrt(2/pi)) ---> tanh ----> add (1) ----> mul (0.5) -----> mul ---> [...] | ^ ^ | | | |------------------------------------------------------------------------------------------------------------------------ `Graph for` ``get_gelu_pattern2()`` ``y = (0.5 * x) * (tanh(((.0447)x^3 + x ) * sqrt(2/pi)) + 1)`` .. code-block:: -------------------------------------------------------------------------------------------------------- ^ | | V [...] -----> mul(0.5) pow (3) ----> mul (.044715) ---> add -----> mul (sqrt(2/pi)) ---> tanh ----> add (1) -----> mul ---> [...] 
| ^ ^ | | | |--------------------------------------------------------- """ def apply(self, prog): fuse_all_blocks( ops_arrangement=self.get_gelu_pattern1(), var_constraints=self.is_var_constraint_satisifed, transform_pattern=self.transform_pattern, prog=prog, ) fuse_all_blocks( ops_arrangement=self.get_gelu_pattern2(), var_constraints=self.is_var_constraint_satisifed, transform_pattern=self.transform_pattern, prog=prog, ) @staticmethod def is_var_constraint_satisifed(pattern): passed = _check_var_scalar_value(pattern.mul.y, 0.5) or _check_var_scalar_value( pattern.mul.x, 0.5 ) passed = passed and _check_var_scalar_value(pattern.pow.y, 3.0) passed = passed and ( _check_var_scalar_value(pattern.mul_1.y, 0.044715) or _check_var_scalar_value(pattern.mul_1.x, 0.044715) ) passed = passed and ( _check_var_scalar_value(pattern.mul_2.y, 0.79788) or _check_var_scalar_value(pattern.mul_2.x, 0.79788) ) passed = passed and ( _check_var_scalar_value(pattern.add_1.y, 1) or _check_var_scalar_value(pattern.add_1.x, 1) ) return passed @staticmethod def transform_pattern(pattern): # remove all the ops, and replace with a gelu op out_name = pattern.mul_3.outputs[0].name x = mb.gelu( x=pattern.root_var, mode="TANH_APPROXIMATION", name=out_name, before_op=pattern.mul ) pattern.mul_3.enclosing_block.replace_uses_of_var_after_op( anchor_op=pattern.mul_3, old_var=pattern.mul_3.outputs[0], new_var=x ) # Remove all the ops at once pattern.block.remove_ops(pattern.op_list()) @staticmethod def get_gelu_pattern1(): """ ``y = x * (0.5 * (tanh(((.0447)x^3 + x ) * sqrt(2/pi)) + 1))`` .. code-block:: [...] -----> pow (3) ----> mul (.044715) ---> add -----> mul (sqrt(2/pi)) ---> tanh ----> add (1) ----> mul (0.5) -----> mul ---> [...] | ^ ^ | | | |------------------------------------------------------------------------------------------------------------------------ """ @mb.program( input_specs=[ mb.TensorSpec(shape=([get_new_symbol(), get_new_symbol(), get_new_symbol()])), ] ) def gelu_to_detect_1(x): # MIL operation takes named inputs (instead of positional inputs). # Here `name` argument is MANDATORY. pow = mb.pow(x=x, y=3.0, name="pow") mul_1 = mb.mul(x=0.044714998453855515, y=pow, name="mul_1") add = mb.add(x=x, y=mul_1, name="add") mul_2 = mb.mul(x=0.7978845834732056, y=add, name="mul_2") tanh = mb.tanh(x=mul_2, name="tanh") add_1 = mb.add(x=1.0, y=tanh, name="add_1") mul = mb.mul(x=0.5, y=add_1, name="mul") mul_3 = mb.mul(x=mul, y=x, name="mul_3") return mul_3 return gelu_to_detect_1 @staticmethod def get_gelu_pattern2(): """ ``y = (0.5 * x) * (tanh(((.0447)x^3 + x ) * sqrt(2/pi)) + 1)`` .. code-block:: -------------------------------------------------------------------------------------------------------- ^ | | V [...] -----> mul(0.5) pow (3) ----> mul (.044715) ---> add -----> mul (sqrt(2/pi)) ---> tanh ----> add (1) -----> mul ---> [...] 
| ^ ^ | | | |--------------------------------------------------------- """ @mb.program( input_specs=[ mb.TensorSpec(shape=([get_new_symbol(), get_new_symbol(), get_new_symbol()])), ] ) def gelu_to_detect_2(x): pow = mb.pow(x=x, y=3.0, name="pow") mul_1 = mb.mul(x=0.044714998453855515, y=pow, name="mul_1") add = mb.add(x=x, y=mul_1, name="add") mul_2 = mb.mul(x=0.7978845834732056, y=add, name="mul_2") tanh = mb.tanh(x=mul_2, name="tanh") add_1 = mb.add(x=1.0, y=tanh, name="add_1") mul = mb.mul(x=0.5, y=x, name="mul") mul_3 = mb.mul(x=mul, y=add_1, name="mul_3") return mul_3 return gelu_to_detect_2 @register_pass(namespace="common") class fuse_leaky_relu(AbstractGraphPass): """ Detect the ``mul`` ---> ``max`` pattern than can be mapped to ``leaky_relu``. `In code form - Input` .. code-block:: %2 = const(value = alpha) # where 0 <= alpha <= 1 %3 = mul(%1, %2) # alpha * x %4 = max(%3, %1) # max(alpha * x, x) `In code form - Output` .. code-block:: %4 = leaky_relu(x=%1, alpha=%2) `In graphical form - Input graph` .. code-block:: const (val = alpha) | input ----> mul ---------------> maximum -----------> output | | |---------------------------------- `In graphical form - Output graph` .. code-block:: input --------> leaky_relu ---------> output """ def apply(self, prog): for f in prog.functions.values(): block_changed = True while block_changed: block_changed = self._fuse_leaky_relu_block(f) @staticmethod def _try_to_transform(mul_op, block): ops_to_remove = [] # check that one of the inputs of the mul op is a constant that is between 0 and 1 if _check_var_scalar_value_in_interval(mul_op.x, 0, 1): alpha_input_var = mul_op.x parent_var = mul_op.y elif _check_var_scalar_value_in_interval(mul_op.y, 0, 1): alpha_input_var = mul_op.y parent_var = mul_op.x else: return False # check that output of mul is not a block output if mul_op.outputs[0] in block.outputs: return False ops_to_remove.append(mul_op) # check if the child op of the mul op is maximum if not _check_child_op_type(mul_op, "maximum"): return False # check that the other input of the max op is same as the parent of the mul op max_op = list(mul_op.outputs[0].child_ops)[0] if not ( (max_op.x == mul_op.outputs[0] and max_op.y == parent_var) or (max_op.y == mul_op.outputs[0] and max_op.x == parent_var) ): return False ops_to_remove.append(max_op) # remove all the ops, and replace with a leaky relu op out_name = max_op.outputs[0].name x = mb.leaky_relu(x=parent_var, alpha=alpha_input_var.val, name=out_name, before_op=max_op) max_op.enclosing_block.replace_uses_of_var_after_op( anchor_op=max_op, old_var=max_op.outputs[0], new_var=x ) block.remove_ops(ops_to_remove) return True @block_context_manager def _fuse_leaky_relu_block(self, block): fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: block_changed = True while block_changed: block_changed = self._fuse_leaky_relu_block(b) if len(op.blocks) > 0: continue # start pattern match if mul op is encountered if op.op_type == "mul": if self._try_to_transform(op, block): fusion_occurred = True return fusion_occurred class FusePreluPattern1: @staticmethod def is_var_constraint_satisifed(pattern): # input must be rank 4 if pattern.root_var.rank != 4: return False # output must be rank 4 if pattern.out_op.outputs[0].rank != 4: return False if not ( _check_var_scalar_value(pattern.neg.y, -1) or _check_var_scalar_value(pattern.neg.x, -1) ): return False if pattern.alpha_mul.x.val is not None: alpha = pattern.alpha_mul.x.val elif 
pattern.alpha_mul.y.val is not None: alpha = pattern.alpha_mul.y.val else: return False # alpha must be of shape (1, C, 1, 1) or (C, 1, 1) if len(alpha.shape) not in (3, 4): return False if alpha.size != alpha.shape[-3]: return False return True @staticmethod def transform_pattern(pattern): # remove all the ops, and replace with a prelu op out_var = pattern.out_op.outputs[0] if pattern.alpha_mul.x.val is not None: alpha = pattern.alpha_mul.x.val else: alpha = pattern.alpha_mul.y.val alpha_vector = -1 * alpha.flatten() x = mb.prelu( x=pattern.root_var, alpha=alpha_vector, name=out_var.name, before_op=pattern.out_op ) pattern.out_op.enclosing_block.replace_uses_of_var_after_op( anchor_op=pattern.out_op, old_var=out_var, new_var=x ) # Remove all the ops at once pattern.block.remove_ops(pattern.op_list()) @staticmethod def get_prelu_pattern(): """ ``y = a * relu(-1 * x) + relu(x)`` When ``x`` is rank 4, and ``a`` is of shape ``(1, C, 1, 1)`` or ``(C, 1, 1)``, this is equivalent to ``prelu`` with ``alpha = -a.flatten()``. """ @mb.program( input_specs=[ mb.TensorSpec( shape=([get_new_symbol(), get_new_symbol(), get_new_symbol(), get_new_symbol()]) ), ] ) def prelu_pattern(x): return fuse_prelu._prelu_pattern(x) return prelu_pattern class FusePreluPattern2: @staticmethod def is_var_constraint_satisifed(pattern): perm = pattern.transpose.perm.val if not np.array_equal(perm, np.array([0, 2, 3, 1])): return False # output must be rank 4 if pattern.out_op.outputs[0].rank != 4: return False if not ( _check_var_scalar_value(pattern.neg.y, -1) or _check_var_scalar_value(pattern.neg.x, -1) ): return False if pattern.alpha_mul.x.val is not None: alpha = pattern.alpha_mul.x.val elif pattern.alpha_mul.y.val is not None: alpha = pattern.alpha_mul.y.val else: return False # alpha must be of shape (C,) or (1,C) or (1,1,C) or (1,1,1,C) if alpha.size != alpha.shape[-1]: return False return True @staticmethod def transform_pattern(pattern): # remove all the ops, and replace with a prelu op + transpose op perm = pattern.transpose.perm.val out_var = pattern.out_op.outputs[0] if pattern.alpha_mul.x.val is not None: alpha = pattern.alpha_mul.x.val else: alpha = pattern.alpha_mul.y.val alpha_vector = -1 * alpha.flatten() x = mb.prelu(x=pattern.root_var, alpha=alpha_vector, before_op=pattern.out_op) x = mb.transpose(x=x, perm=perm, name=out_var.name, before_op=pattern.out_op) pattern.out_op.enclosing_block.replace_uses_of_var_after_op( anchor_op=pattern.out_op, old_var=out_var, new_var=x ) # Remove all the ops at once pattern.block.remove_ops(pattern.op_list()) @staticmethod def get_prelu_pattern(): """ ``x1 = transpose(perm=(0,2,3,1))(x)`` ``y = a * relu(-1 * x1) + relu(x1)`` When ``x`` is rank 4, and ``a`` is of shape (``C,)``, ``(1, C)``, ``(1,1,C)``, or ``(1,1,1,C)``, this is equivalent to ``prelu`` with ``alpha = -a.flatten()``, followed by a ``transpose`` with ``perm (0,2,3,1)``. """ @mb.program( input_specs=[ mb.TensorSpec( shape=([get_new_symbol(), get_new_symbol(), get_new_symbol(), get_new_symbol()]) ), ] ) def prelu_pattern(x): # perm value can be anything, it will be checked in "is_var_constraint_satisifed" method x = mb.transpose(x=x, perm=[0, 1, 2, 3], name="transpose") return fuse_prelu._prelu_pattern(x) return prelu_pattern @register_pass(namespace="common") class fuse_prelu(AbstractGraphPass): """ Detect the following patterns that can be mapped to a ``prelu`` op. Essentially, the ``prelu`` op can be broken down into the following ops: ``y = a * relu(-1 * x) + relu(x)`` `Pattern 1` .. 
code-block:: | ------------> relu --------------------| | V x (BCHW) ------| add -----> y (BCHW) | ^ --------> mul -------> relu -----> mul---| ^ ^ | | Const(val=-1) Const(name=a, shape=(C,1,1) or (1,C,1,1)) This will be mapped to: .. code-block:: x (BCHW) ------> prelu(alpha=a, shape=(C,)) ---------> y (BCHW) `Pattern 2` .. code-block:: | ------------> relu --------------------| | V x (BCHW) -->transpose(BHWC)---->| add -----> y (BHWC) | ^ --------> mul -------> relu -----> mul---| ^ ^ | | Const(val=-1) Const(shape=(C,) or (1,C) or (1,1,C) or (1,1,1,C)) This will be mapped to: .. code-block:: x (BCHW) ------> prelu ---------> transpose ------> y (BHWC) """ def apply(self, prog): for pattern in (FusePreluPattern1, FusePreluPattern2): fuse_all_blocks( ops_arrangement=pattern.get_prelu_pattern(), var_constraints=pattern.is_var_constraint_satisifed, transform_pattern=pattern.transform_pattern, prog=prog, ) @staticmethod def _prelu_pattern(x): # MIL operation takes named inputs (instead of positional inputs). # Here `name` argument is MANDATORY. neg = mb.mul(x=x, y=-1.0, name="neg") relu1 = mb.relu(x=neg, name="relu1") # Use any constant here to match, rank and shape will be verified in # `is_var_constraint_satisifed`. mul = mb.mul(x=relu1, y=np.random.rand(2, 2, 2, 2), name="alpha_mul") relu2 = mb.relu(x=x, name="relu2") out = mb.add(x=relu2, y=mul, name="out_op") return out @register_pass(namespace="common") class prelu_to_lrelu(AbstractGraphPass): """ If ``prelu`` has the same leakage factor across all channels, it will be converted to ``leaky_relu``. """ def apply(self, prog): for f in prog.functions.values(): self._prelu_to_lrelu_block(f) @block_context_manager def _prelu_to_lrelu_block(self, block): for op in list(block.operations): for b in op.blocks: self._prelu_to_lrelu_block(b) if len(op.blocks) > 0: # This op can't be prelu. continue if op.op_type == "prelu": alpha_val = op.alpha.val common_leakage_factor = True for c in range(1, op.alpha.val.shape[0]): if alpha_val[c] != alpha_val[0]: common_leakage_factor = False break if common_leakage_factor: lrelu_out = mb.leaky_relu( x=op.x, alpha=alpha_val[0], name=op.outputs[0].name, before_op=op ) op.enclosing_block.replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=lrelu_out ) block.remove_ops([op]) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/optimize_activation_quantization.py0000644000000000000000000003454414672066616032267 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.mil import Block from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Operation, types from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import _check_child_op_type, block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass @register_pass(namespace="compression") class insert_suffix_quantize_dequantize_pair(AbstractGraphPass): """ Insert trailing quantize and dequantize operation pairs after valid patterns. .. code-block:: Pattern 1: dequantize -> conv Given: %2 = dequantize(%1) %3 = conv(%2) ... 
Result: %2 = dequantize(%1) %3 = conv(%2) %4 = quantize(%3) %5 = dequantize(%4) ... Pattern 2: dequantize ->| |-> add dequantize ->| Given: %2 = dequantize(%1) %4 = dequantize(%3) %5 = add(%2,%4) ... Result: %2 = dequantize(%1) %4 = dequantize(%3) %5 = add(%2,%4) %6 = quantize(%5) %7 = dequantize(%6) ... """ _allowed_activations = { "leaky_relu", "tanh", "scaled_tanh", "sigmoid", "hard_sigmoid", "relu", "relu6", } # Graph pass option for setting compression config. _config = None @property def config(self): return self._config @config.setter def config(self, value): self._config = value if value._op_selector is not None: self.op_selector = value._op_selector def apply(self, prog): visited_ops = set() for f in prog.functions.values(): self._insert_quantize_dequantize(f, self._config, visited_ops) @block_context_manager def _insert_quantize_dequantize(self, block: Block, config, visited_ops: set): def help_insert_quantize_dequantize(block: Block) -> bool: fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue if op in visited_ops: continue visited_ops.add(op) for b in op.blocks: self._insert_quantize_dequantize(b) # Must start with "dequantize" op. if op.op_type != "dequantize": continue # Try matching valid patterns. if self._try_match_and_transform_pattern(op, block, config, visited_ops): fusion_occurred = True return fusion_occurred block_changed = True while block_changed: block_changed = help_insert_quantize_dequantize(block) def _try_match_and_transform_pattern( self, dequantize_op: Operation, block: Block, config, visited_ops: set ) -> bool: """ This function performs the pattern match for all target patterns. It priorizes longer patterns to shorter ones for more fusions on hardware. Reject if the trailing `quantize` and `dequantize` pair already existed. A list of valid patterns. - conv - conv, activation - add - add, activation - pool (max_pool, avg_pool) E.g. Identify valid patterns: - (`quantize` ->) dequantize` -> `conv` - (`quantize` ->) dequantize` -> `conv` -> `relu` - (`quantize` ->) dequantize` -> `avg_pool` - (`quantize` ->) dequantize` -> `max_pool` E.g. Reject if trailing `quantize` -> `dequantize` exist: - (`quantize` ->) dequantize` -> `conv` -> `quantize` -> `dequantize` - (`quantize` ->) dequantize` -> `conv` -> `relu` -> `quantize` -> `dequantize` """ # Reject if 1st operation is not `conv`/`add`/`pool`. SUPPORTED_OP_TYPES = ["conv", "add", "avg_pool", "max_pool"] if any([_check_child_op_type(dequantize_op, val) for val in SUPPORTED_OP_TYPES]): pass else: return False core_op = dequantize_op.outputs[0].child_ops[0] last_op = core_op # For operations with two inputs, both need to be `dequantize`. if core_op.op_type == "add": # Check both inputs in_var_x = core_op.inputs["x"] in_var_y = core_op.inputs["y"] in_x_prev_op = in_var_x.op in_y_prev_op = in_var_y.op if not (in_x_prev_op.op_type == "dequantize" and in_y_prev_op.op_type == "dequantize"): return False # Checking op-level config. Skip if we disable compression on certain operations. op_config = config._get_op_config(core_op) if op_config is None: return False # Reject if trailing `quantize` -> `dequantize` pair exist. if _check_child_op_type(core_op, "quantize"): return False _child_op = None if len(core_op.outputs[0].child_ops) > 0: _child_op = core_op.outputs[0].child_ops[0] # Check if 2nd operation is part of a valid pattern. # E.g. `dequantize` -> `conv` -> activation -> `quantize`. 
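        # If so, extend the match so the quantize/dequantize pair is inserted after the
        # activation (e.g. `conv` -> `relu`) instead of directly after the core op, and
        # reject the pattern when that activation is already followed by a `quantize`,
        # since the pair would then be redundant.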
if _child_op is not None: if _child_op.op_type in self._allowed_activations: if len(_child_op.outputs[0].child_ops) > 0: if _check_child_op_type(_child_op, "quantize"): return False _child_child_op = _child_op.outputs[0].child_ops[0] last_op = _child_op _child_op = _child_child_op return self._try_apply_transform(last_op, _child_op, block, visited_ops) @staticmethod def _try_apply_transform( last_op: Operation, _child_op: Operation, block: Block, visited_ops: set, ) -> bool: """ last_op: last op of a valid pattern. E.g. in `conv` -> `relu`, last_op is `relu`; in `conv`, last_op is `conv`. _child_op: the child op of the last_op. block: current block. visited_ops: a dict Pattern: Given: |-> child_op_1 last_op -> |-> child_op_2 |-> ... Result: |-> child_op_1 last_op -> quantize -> dequantize -> |-> child_op_2 |-> ... """ if _child_op is None: return False scale_dtype = np.float16 if last_op.outputs[0].dtype == types.fp16 else np.float32 new_last_op = getattr(mb, last_op.op_type) kargs = {} for k, v in last_op.inputs.items(): kargs[k] = v kargs["name"] = last_op.name kargs["before_op"] = last_op new_last_op = new_last_op(**kargs) new_quantize_op = mb.quantize( input=new_last_op, scale=np.array(1).astype(scale_dtype), zero_point=np.int8(0), output_dtype="int8", before_op=last_op, ) new_dequantize_op = mb.dequantize( input=new_quantize_op, scale=np.array(1).astype(scale_dtype), zero_point=np.int8(0), before_op=last_op, ) ops_to_remove = [last_op] last_op_var_name = last_op.outputs[0].name # Replace output var of last_op with output of new_dequantize_op. if last_op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=last_op, end_op=last_op, old_var=last_op.outputs[0], new_var=new_dequantize_op, ): block.remove_ops(ops_to_remove) # The name of new quantize/dequantize may change. # Add the new ones to the visited list to avoid revisiting. visited_ops.add(new_dequantize_op.op) visited_ops.add(new_quantize_op.op) new_dequantize_var_name = new_dequantize_op.name new_dequantize_op.set_name(f"{new_dequantize_var_name}__post__dequant") new_last_op.set_name(f"{last_op_var_name}") return True return False @register_pass(namespace="compression") class update_quantize_dequantize(AbstractGraphPass): """ Update scale and zero point values in `quantize` and `dequantize` operations with calibration statistics. .. code-block:: Pattern: Given: %2 = quantize(%1) with random scale and zp %3 = dequantize(%2) with random scale and zp ... Result: %2 = quantize(%1) with calculated scale and zp %3 = dequantize(%2) with calculated scale and zp ... """ _activation_stats = None @property def activation_stats(self): return self._activation_stats @activation_stats.setter def activation_stats(self, value): self._activation_stats = value def apply(self, prog): visited_ops = set() for f in prog.functions.values(): self._update_quantize_dequantize(f, self._activation_stats, visited_ops) @block_context_manager def _update_quantize_dequantize(self, block: Block, activation_stats: dict, visited_ops: set): def help_update_quantize_dequantize(block: Block, activation_stats: dict) -> bool: fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue if op in visited_ops: continue visited_ops.add(op) for b in op.blocks: self._update_quantize_dequantize(b, activation_stats) # Must start with "quantize" op if op.op_type != "quantize": continue # Try pattern match: `quantize` -> `dequantize`. 
if self._try_match_and_transform_pattern(op, block, activation_stats, visited_ops): fusion_occurred = True return fusion_occurred block_changed = True while block_changed: block_changed = help_update_quantize_dequantize(block, activation_stats) def _try_match_and_transform_pattern( self, quantize_op: Operation, block: Block, activation_stats: dict, visited_ops: set ) -> bool: """ This function performs validation checks for the target pattern: `quantize` -> `dequantize` """ if not _check_child_op_type(quantize_op, "dequantize"): return False dequantize_op = quantize_op.outputs[0].child_ops[0] last_op = dequantize_op _child_op = None if len(dequantize_op.outputs[0].child_ops) > 0: _child_op = dequantize_op.outputs[0].child_ops[0] return self._try_apply_transform( quantize_op, last_op, _child_op, block, activation_stats, visited_ops ) @staticmethod def _try_apply_transform( quantize_op: Operation, last_op: Operation, _child_op: Operation, block: Block, activation_stats: dict, visited_ops: set, ) -> bool: """ last_op: last op of a valid pattern. it's 'dequantize' in this case. _child_op: the child op of the last_op. block: current block. """ ops_to_remove = [quantize_op, last_op] if _child_op is None: return False # Name of input var to `quantize`. in_var_name = quantize_op.inputs["input"].name val = np.array([0, 0], dtype=np.float16) # It's possible there are two ``quantize -> dequantize`` pair in a sequence. # Two pairs should share the same scale and zero_point values. # The name of input var to the 2nd `quantize` is newly created and does not exist in the original uncompressed model. # We make an adjustment by tracing the name of input var of 1st `quantize` to update the 2nd pair. if in_var_name not in activation_stats: # Make an adjustment by checking leading `quantize` `dequantize` pair. prev_dequantize = quantize_op.input.op prev_quantize = prev_dequantize.input.op if prev_quantize.inputs["input"].name in activation_stats: in_var_name = prev_quantize.inputs["input"].name val[0], val[1] = ( activation_stats[in_var_name]["rmin"], activation_stats[in_var_name]["rmax"], ) # Numerically the scale and zp won't change if the input array only have two elements: # the min and max of input array. Plus we don't care about quantized values. # That's the trick to re-use quantize_weight util. from coremltools.optimize.coreml._utils import quantize_weight _, _scale, _zero_point = quantize_weight( val, axes=0, nbits=8, signed=True, quantization_mode="LINEAR_SYMMETRIC", dtype=types.int8, ) # New ``quantize -> dequantize``. new_quantize_op = mb.quantize( input=quantize_op.input, scale=_scale, zero_point=_zero_point, output_dtype="int8", name=quantize_op.name, before_op=quantize_op, ) new_dequantize_op = mb.dequantize( input=new_quantize_op, scale=_scale, zero_point=_zero_point, name=last_op.name, before_op=quantize_op, ) # Replace old ``quantize -> dequantize`` with new ``quantize -> dequantize`` to update scale/zero_point. if last_op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=last_op, end_op=last_op, old_var=last_op.outputs[0], new_var=new_dequantize_op, ): block.remove_ops(ops_to_remove) # Add the new ones to the visited list to avoid revisiting. 
visited_ops.add(new_quantize_op.op) visited_ops.add(new_dequantize_op.op) return False ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/optimize_conv.py0000644000000000000000000013604114672066616026260 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy from typing import List, Optional, Tuple import numpy as np from coremltools import _logger as logger from coremltools.converters.mil.mil import Block from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Operation, Program, types from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import ( _check_child_op_type, _check_no_output_connection, block_context_manager, ) from coremltools.converters.mil.mil.passes.pass_registry import register_pass from coremltools.converters.mil.mil.types.symbolic import any_symbolic @register_pass(namespace="common") class add_conv_transpose_output_shape(AbstractGraphPass): """ The ``conv_transpose`` input ``output_shape`` is an optional input. Since we can infer the output shape from ``type_inference``, we add ``output_shape`` input whenever it is known to be constant at compile time. For example: .. code-block:: Given: %1: (1, 5, 39, fp32) = conv_transpose(...) # no output_shape input. Result: %2: (3, i32) = const(val=[1,5,39]) %3: (1, 5, 39, fp32) = conv_transpose(..., output_shape=%2) """ def apply(self, prog): for f in prog.functions.values(): self._handle_block(f) @staticmethod def _match_pattern(op): return ( op.op_type == "conv_transpose" and op.output_shape is None and not any_symbolic(op.outputs[0].shape) ) @block_context_manager def _handle_block(self, block): for op in list(block.operations): for b in op.blocks: self._handle_block(b) if not self._match_pattern(op): continue # matched pattern x = mb.conv_transpose( **op.inputs, output_shape=op.outputs[0].shape, name=op.name + "_has_output_shape", before_op=op, ) op.enclosing_block.replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=x ) block.remove_ops([op]) @register_pass(namespace="common") class compose_conv1d(AbstractGraphPass): """ In `TensorFlow `_, ``tf.keras.layers.Conv1D`` is a composite op: .. code-block:: expand a dummy dim -> Conv2D -> squeeze the dummy dim In `PyTorch `_, this is also true for some backends (``mkldnn`` and ``xpu``). This decomposition wrecks the coremltools ``conv1d`` graph passes, so we should recompose the fragments back to MIL ``conv``, which natively supports ``conv1d``: .. code-block:: Pattern 1: Given: %2 = expand_dims(%1, axes=-2) or expand_dims(%1, axes=2), %1.rank = 3 %3 = conv(%2) %4 = squeeze(%3, axes=-2) or squeeze(%3, axes=2) ... Result: %4 = conv(%1) ... Pattern 2 (TensorFlow channel_last): Given: %2 = expand_dims(%1, axes=-3) or expand_dims(%1, axes=1), %1.rank = 3 %3 = transpose(%2, perm=(0, 3, 1, 2)) %4 = conv(%3) %5 = transpose(%4, perm=(0, 2, 3, 1)) %6 = squeeze(%5, axes=-3) or squeeze(%5, axes=1) ... Result: %3 = transpose(%1, perm=(0, 2, 1)) %4 = conv(%3) %6 = transpose(%4, perm=(0, 2, 1)) ... 
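    In both patterns the replacement ``conv`` reuses the original parameters with the
    dummy height axis dropped. Roughly (an illustrative sketch of how the 1-D parameters
    are derived from the matched 2-D ``conv``; the names are not part of the pass):

    .. code-block:: python

        weight_1d    = mb.squeeze(x=conv2d_weight, axes=(-2,))  # (Cout, Cin/groups, 1, K) -> (Cout, Cin/groups, K)
        strides_1d   = (conv2d_strides[-1],)
        dilations_1d = (conv2d_dilations[-1],)
        pad_1d       = (conv2d_pad[-2], conv2d_pad[-1])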
""" def apply(self, prog): for f in prog.functions.values(): self._compose_conv1d_block(f) @block_context_manager def _compose_conv1d_block(self, block: Block): def help_compose_conv1d_block(block: Block) -> bool: fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: self._compose_conv1d_block(b) # must start with expanding a 3-D tensor, # who has batch, channel, length dimensions if op.op_type != "expand_dims" or op.x.rank != 3: continue # try pattern `expand_dim` -> `conv2d` -> `squeeze` if self._try_match_and_transform_pattern(op, block): # has to break as the downstream iterator is affected return True # try pattern `expand_dim` -> `transpose` -> `conv2d` -> `transpose` -> `squeeze` if self._try_match_and_transform_pattern_channel_last(op, block): fusion_occurred = True return fusion_occurred block_changed = True while block_changed: block_changed = help_compose_conv1d_block(block) def _try_match_and_transform_pattern(self, expand_op: Operation, block: Block) -> bool: """ identify the pattern: `expand_dim` -> `conv2d` -> `squeeze` """ # abort composition if dummy dimension is not added as height if expand_op.axes.rank != 1 or expand_op.axes.val[0] not in (-2, 2): return False # `expand_dims` -> `conv` if not _check_child_op_type(expand_op, "conv"): return False conv_op = expand_op.outputs[0].child_ops[0] # `conv` -> `squeeze` if not _check_child_op_type(conv_op, "squeeze"): return False squeeze_op = conv_op.outputs[0].child_ops[0] # Abort composition if not squeezing the dummy height (the extended dim_size=1 dimension) if squeeze_op.axes.rank != 1 or squeeze_op.axes.val[0] not in (-2, 2): return False elif squeeze_op.x.shape[squeeze_op.axes.val[0]] != 1: return False # everything looks good return self._try_apply_transform(expand_op, conv_op, squeeze_op, block) def _try_match_and_transform_pattern_channel_last( self, expand_op: Operation, block: Block ) -> bool: """ identify the pattern: `expand_dim` -> `transpose` -> `conv2d` -> `transpose` -> `squeeze` """ # abort composition if dummy dimension is not added as height if expand_op.axes.rank != 1 or expand_op.axes.val[0] not in (-3, 1): return False # `expand_dims` -> `transpose` if not _check_child_op_type(expand_op, "transpose"): return False transpose1_op = expand_op.outputs[0].child_ops[0] # abort composition if permutation is not (0, 3, 1, 2) perm1 = transpose1_op.perm.val.copy() perm1[np.where(perm1 < 0)] += 4 if np.any(perm1 != (0, 3, 1, 2)): return False # `transpose` -> `conv` if not _check_child_op_type(transpose1_op, "conv"): return False conv_op = transpose1_op.outputs[0].child_ops[0] # `conv` -> `transpose` if not _check_child_op_type(conv_op, "transpose"): return False transpose2_op = conv_op.outputs[0].child_ops[0] # abort composition if permutation is not (0, 2, 3, 1) perm2 = transpose2_op.perm.val.copy() perm2[np.where(perm2 < 0)] += 4 if np.any(perm2 != (0, 2, 3, 1)): return False # `transpose` -> `squeeze` if not _check_child_op_type(transpose2_op, "squeeze"): return False squeeze_op = transpose2_op.outputs[0].child_ops[0] # abort composition if not squeezing the dummy height if squeeze_op.axes.rank != 1 or squeeze_op.axes.val[0] not in (-3, 1): return False # everything looks good return self._try_apply_transform_channel_last( expand_op, transpose1_op, conv_op, transpose2_op, squeeze_op, block ) @staticmethod def _try_apply_transform( expand_op: Operation, conv_op: Operation, squeeze_op: Operation, block: Block ) -> bool: ops_to_remove = [expand_op, 
conv_op, squeeze_op] if not _check_no_output_connection(block, ops_to_remove): return False # prepare `conv1d` conv_kwargs = {"name": squeeze_op.outputs[0].name, "before_op": conv_op} # inherit `x` from `expand_dim` conv_kwargs["x"] = expand_op.x # inherit `pad_type`, `groups`, `bias` from `conv2d` conv_kwargs["pad_type"] = conv_op.inputs["pad_type"].val conv_kwargs["groups"] = conv_op.inputs["groups"].val bias = conv_op.inputs.get("bias", None) if bias is not None: conv_kwargs["bias"] = bias # squeeze `weight`, `strides`, `pad`, `dilations` from `conv2d` conv_kwargs["weight"] = mb.squeeze( x=conv_op.inputs["weight"], axes=(-2,), before_op=conv_op ) conv_kwargs["strides"] = (conv_op.inputs["strides"].val[-1],) conv_kwargs["pad"] = (conv_op.inputs["pad"].val[-2], conv_op.inputs["pad"].val[-1]) conv_kwargs["dilations"] = (conv_op.inputs["dilations"].val[-1],) # compose `conv1d` out = mb.conv(**conv_kwargs) # try replacing `expand_dim` -> `conv2d` -> `squeeze` output # with the new `conv1d` output if squeeze_op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=squeeze_op, old_var=squeeze_op.outputs[0], new_var=out ): # remove `expand_dim` -> `conv2d` -> `squeeze` block.remove_ops(ops_to_remove) return True return False @staticmethod def _try_apply_transform_channel_last( expand_op: Operation, transpose1_op: Operation, conv_op: Operation, transpose2_op: Operation, squeeze_op: Operation, block: Block, ) -> bool: ops_to_remove = [expand_op, transpose1_op, conv_op, transpose2_op, squeeze_op] if not _check_no_output_connection(block, ops_to_remove): return False # create `transpose1` transpose1_out = mb.transpose( x=expand_op.x, perm=(0, 2, 1), name=transpose1_op.outputs[0].name, before_op=expand_op ) # prepare `conv1d` conv_kwargs = {"name": conv_op.outputs[0].name, "x": transpose1_out, "before_op": conv_op} # inherit `pad_type`, `groups`, `bias` from `conv2d` conv_kwargs["pad_type"] = conv_op.inputs["pad_type"].val conv_kwargs["groups"] = conv_op.inputs["groups"].val bias = conv_op.inputs.get("bias", None) if bias is not None: conv_kwargs["bias"] = bias # squeeze `weight`, `strides`, `pad`, `dilations` from `conv2d` conv_kwargs["weight"] = mb.squeeze( x=conv_op.inputs["weight"], axes=(-2,), before_op=conv_op ) conv_kwargs["strides"] = (conv_op.inputs["strides"].val[-1],) conv_kwargs["pad"] = (conv_op.inputs["pad"].val[-2], conv_op.inputs["pad"].val[-1]) conv_kwargs["dilations"] = (conv_op.inputs["dilations"].val[-1],) # compose `conv1d` conv_out = mb.conv(**conv_kwargs) # create `transpose2` transpose2_out = mb.transpose( x=conv_out, perm=(0, 2, 1), name=squeeze_op.outputs[0].name, before_op=transpose2_op ) # try replacing `expand_dim` -> `transpose` -> `conv2d` -> `transpose` -> `squeeze` output # with the new `transpose` -> `conv1d` -> `transpose` output if squeeze_op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=squeeze_op, old_var=squeeze_op.outputs[0], new_var=transpose2_out ): # remove `expand_dim` -> `transpose` -> `conv2d` -> `transpose` -> `squeeze` block.remove_ops(ops_to_remove) return True return False @register_pass(namespace="common") class fuse_conv_batchnorm(AbstractGraphPass): """ Fuse the following ``batch_norm`` layer into ``conv`` and ``conv_transpose``. That is, convert ``conv + batch_norm`` to ``conv``, by modifying the weight and bias in the ``conv`` layer. .. code-block:: Given: %2 = conv(%1) ... %3 = batch_norm(%2) ... Result: %3 = conv(%1) ... 
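    The fusion is purely a weight/bias rewrite, applied per output channel. With
    ``scale = gamma / sqrt(variance + epsilon)``, the updated parameters are
    (a sketch matching the per-channel loop in the implementation below):

    .. code-block:: python

        new_weight[c] = scale[c] * weight[c]
        new_bias[c]   = scale[c] * (bias[c] - mean[c]) + beta[c]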
""" def apply(self, prog): for f in prog.functions.values(): block_changed = True while block_changed: block_changed = self._fuse_conv_batchnorm_block(f) @staticmethod def _try_to_transform(conv_op, bn_op): # get parameters from batch_norm layer gamma = bn_op.gamma.val beta = bn_op.beta.val mean = bn_op.mean.val variance = bn_op.variance.val epsilon = bn_op.epsilon.val # get weight, bias and groups from conv layer if conv_op.weight.val is None: return False conv_weight = conv_op.weight.val conv_bias = conv_op.bias groups = conv_op.groups.val # get type of the conv layer is_deconv = conv_op.op_type == "conv_transpose" # The deconv weight transpose axes is determined by the dimension of convolution. # Conv1d should be [1, 0, 2], Conv2d should be [1, 0, 2, 3], Conv3d should be [1, 0, 2, 3, 4] if not 3 <= len(conv_weight.shape) <= 5: raise AssertionError( f"Only supports Conv1/2/3d, which means weight's dimension should" f"between 3 and 5, but got weight with {len(conv_weight.shape)} " f"dimensions. " ) deconv_weight_transpose_axes = [1, 0] + [axis for axis in range(2, len(conv_weight.shape))] # D_in denotes the spatial dimensions for conv kernel weight # for conv_transpose, conv_weight has shape [Cin, Cout / groups, *D_in] # for conv, conv_weight has shape [Cout, Cin / groups, *D_in] if is_deconv: Cout = conv_weight.shape[1] * groups Cin = conv_weight.shape[0] else: Cout = conv_weight.shape[0] Cin = conv_weight.shape[1] * groups # get the type of the conv weight conv_weight_type = conv_weight.dtype # create bias for conv if not exist if conv_bias is None: conv_bias = np.zeros(Cout) else: conv_bias = conv_bias.val if conv_bias is None: return False conv_bias = conv_bias.astype(conv_weight_type) # get the original shape of weight and bias origin_weight_shape = conv_weight.shape origin_bias_shape = conv_bias.shape # update the weight for conv layer new_conv_weight = [] new_conv_bias = [] if is_deconv: conv_weight = np.transpose(conv_weight, deconv_weight_transpose_axes) conv_weight = np.reshape( conv_weight, [Cout, Cin // groups] + list(conv_weight.shape[2:]) ) for i in range(Cout): # get batch norm parameters for each channel _gamma = gamma[i] _beta = beta[i] _mean = mean[i] _variance = variance[i] _scale = _gamma / np.sqrt(_variance + epsilon) # get conv weight and bias for each channel _conv_weight = conv_weight[i] _conv_bias = conv_bias[i] # update the conv weight and bias _conv_weight = _conv_weight * _scale _conv_bias = _scale * (_conv_bias - _mean) + _beta new_conv_weight.append(_conv_weight) new_conv_bias.append(_conv_bias) new_conv_weight = np.array(new_conv_weight).astype(conv_weight_type) new_conv_bias = np.array(new_conv_bias).astype(conv_weight_type) if is_deconv: new_conv_weight = np.reshape( new_conv_weight, [Cout // groups, Cin] + list(new_conv_weight.shape[2:]) ) new_conv_weight = np.transpose(new_conv_weight, deconv_weight_transpose_axes) # make sure the updated weight and bias have the same shape as the original ones if new_conv_weight.shape != origin_weight_shape: raise AssertionError( "conv weight should have the same shape before and after the fuse_" "conv_batchnorm pass. " ) if new_conv_bias.shape != origin_bias_shape: raise AssertionError( "conv bias should have the same shape before and after the fuse_" "conv_batchnorm pass. " ) # the new weight / bias should inherit the meta data from the old conv layer # TODO: this is currently a temporary solution, we should consider a more general approach. 
# the follow-up is tracked by: rdar://131637107 new_conv_weight = mb.const(val=new_conv_weight, before_op=conv_op) new_conv_bias = mb.const(val=new_conv_bias, before_op=conv_op) if conv_op.weight.op.op_type == "const": block = conv_op.enclosing_block block._copy_metadata(conv_op.weight, new_conv_weight) block._copy_metadata(conv_op.weight, new_conv_bias) # create a new conv op with the new bias value, copying rest of the attributes out_name = bn_op.outputs[0].name conv_kargs = { "weight": new_conv_weight, "bias": new_conv_bias, "name": out_name, "before_op": conv_op, } for k, v in conv_op.inputs.items(): if k in ["weight", "bias"]: continue conv_kargs[k] = v if is_deconv: x = mb.conv_transpose(**conv_kargs) else: x = mb.conv(**conv_kargs) if bn_op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=bn_op, old_var=bn_op.outputs[0], new_var=x, ): bn_op.enclosing_block.remove_ops([conv_op, bn_op]) return True return False @block_context_manager def _fuse_conv_batchnorm_block(self, block): def _match_pattern(op): if op.op_type == "conv" or op.op_type == "conv_transpose": # abort fusion if op output is also a block output if op.outputs[0] in op.enclosing_block.outputs: return None # find batch_norm op child_ops = op.outputs[0].child_ops if len(child_ops) == 1: bn_op_candidate = list(child_ops)[0] if bn_op_candidate.op_type == "batch_norm": return bn_op_candidate return None fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: block_changed = True while block_changed: block_changed = self._fuse_conv_batchnorm_block(b) if len(op.blocks) > 0: # This op can't be conv or conv_transpose continue bn_op = _match_pattern(op) if bn_op is not None: if self._try_to_transform(op, bn_op): fusion_occurred = True return fusion_occurred @register_pass(namespace="common") class fuse_conv_bias(AbstractGraphPass): """ Fold ``add``/``sub`` into ``bias`` of ``conv`` and ``conv_transpose``. That is, convert ``conv + add/sub`` to ``conv``, when ``add``/``sub`` is adding a constant. Two patterns are supported: .. code-block:: Pattern 1: Given: %2 = conv(%1) ... %3 = add(%2, constant) # where constant has shape (1,C,1)/(C,1) for 1d conv, (1,C,1,1)/(C,1,1) for 2d conv etc ... Result: %3 = conv(%1) ... Pattern 2: Given: %2 = conv(%1) %3 = transpose(%2) ... %4 = add(%3, constant) # where constant has a broacasable shape ... Result: %2 = conv(%1) %4 = transpose(%2) ... 
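    The folding itself only touches the bias vector. A minimal numpy sketch
    (illustrative, hypothetical values; for ``sub`` the constant, or the weight,
    is negated first):

    .. code-block::

        import numpy as np

        Cout = 8
        conv_bias = np.zeros(Cout, dtype=np.float32)
        add_const = np.random.rand(1, Cout, 1, 1)            # shape (1,C,1,1) as in Pattern 1

        fused_bias = conv_bias + np.squeeze(add_const)       # bias of the fused conv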
""" child_op_types = ["add", "sub"] def apply(self, prog): for f in prog.functions.values(): block_changed = True while block_changed: block_changed = self._fuse_conv_bias_block(f) def _match_pattern(self, op): if op.op_type == "conv" or op.op_type == "conv_transpose": # abort fusion if op output is also a block output if op.outputs[0] in op.enclosing_block.outputs: return None # find add child_ops = op.outputs[0].child_ops if len(child_ops) == 1: add_op_candidate = list(child_ops)[0] if add_op_candidate.op_type in self.child_op_types: return add_op_candidate return None @staticmethod def _try_to_transform_transpose_pattern(conv_op, block): ops_to_remove = [] # conv layer if conv_op.op_type != "conv" and conv_op.op_type != "conv_transpose": return False is_deconv = conv_op.op_type == "conv_transpose" ops_to_remove.append(conv_op) # transpose layer if not _check_child_op_type(conv_op, "transpose"): return False transpose_op = list(conv_op.outputs[0].child_ops)[0] ops_to_remove.append(transpose_op) # add/sub layer if not _check_child_op_type(transpose_op, "add") and not _check_child_op_type( transpose_op, "sub" ): return False add_or_sub_op = list(transpose_op.outputs[0].child_ops)[0] ops_to_remove.append(add_or_sub_op) # get the bias if add_or_sub_op.x.val is None and add_or_sub_op.y.val is None: return False bias = add_or_sub_op.x.val if add_or_sub_op.x.val is not None else add_or_sub_op.y.val is_first_input = add_or_sub_op.y.val is not None is_sub = add_or_sub_op.op_type == "sub" # get the conv bias/weight conv_shape = conv_op.outputs[0].shape Cout = conv_shape[1] conv_weight = conv_op.weight.val conv_weight_type = conv_weight.dtype conv_bias = ( np.zeros(Cout).astype(conv_weight_type) if conv_op.bias is None else conv_op.bias.val ) # check if the bias is compatible for fusion is_bias_scalar = True if isinstance(bias, np.ndarray): if bias.shape == (): bias = bias.tolist() elif np.prod(bias.shape) == 1: bias = np.squeeze(bias).tolist() else: is_bias_scalar = False if not is_bias_scalar: if np.prod(bias.shape) != Cout: return False rank = transpose_op.outputs[0].rank cout_dim = transpose_op.perm.val.tolist().index(1) - rank if bias.shape[cout_dim] != Cout: return False bias = np.reshape(bias, (Cout)) # compute the new bias if is_sub: if is_first_input: bias = -bias else: conv_bias = -conv_bias new_bias = conv_bias + bias # compute the new weight if is_sub and not is_first_input: new_weight = -conv_weight else: new_weight = conv_weight if not _check_no_output_connection(block, ops_to_remove): return False # create a new conv op with the new weight, bias value, copying rest of the attributes conv_kargs = {"weight": new_weight, "bias": new_bias, "before_op": conv_op} for k, v in conv_op.inputs.items(): if k in ["weight", "bias"]: continue conv_kargs[k] = v if is_deconv: x = mb.conv_transpose(**conv_kargs) else: x = mb.conv(**conv_kargs) # create a new transpose op out_name = add_or_sub_op.outputs[0].name tranpose_kargs = {"x": x, "name": out_name, "before_op": transpose_op} for k, v in transpose_op.inputs.items(): if k == "x": continue tranpose_kargs[k] = v x = mb.transpose(**tranpose_kargs) if add_or_sub_op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=add_or_sub_op, old_var=add_or_sub_op.outputs[0], new_var=x, ): add_or_sub_op.enclosing_block.remove_ops(ops_to_remove) return True return False @staticmethod def _try_to_transform(conv_op, add_op): if add_op.op_type == "sub": bias_var = add_op.y else: bias_var = add_op.x if add_op.x.val is not None else add_op.y bias_value = 
bias_var.val is_conv_op = conv_op.op_type == "conv" # check that the bias value is a constant array or a scalar constant if not isinstance(bias_value, (np.ndarray, np.generic)): return False is_bias_scalar = False if not isinstance(bias_value, np.ndarray): is_bias_scalar = True # find rank of the conv input rank = conv_op.x.rank if rank is None: return False if not (rank == 3 or rank == 4 or rank == 5): return False # check compatibility of bias value with the rank of the conv op # either bias value should be a scalar or: # rank=3 ==> (B,C,D), which means bias must be (1,C,1) or (C,1) # rank=4 ==> (B,C,D1,D2), which means bias must be (1,C,1,1) or (C,1,1) # rank=5 ==> (B,C,D1,D2,D3), which means bias must be (1,C,1,1,1) or (C,1,1,1) if is_bias_scalar: bias_value = np.array([bias_value]) else: # check that there is at most one dimension in the shape that is not 1 if len(np.squeeze(bias_value).shape) > 1: return False # check that addition is not happening on the batch dimension if len(bias_value.shape) == rank: if bias_value.shape[0] != 1: return False # check that last rank-2 entries in the shape vector are all 1s if np.prod(bias_value.shape[-(rank - 2) :]) != 1: return False bias_value = np.squeeze(bias_value) if add_op.op_type == "sub": bias_value *= -1 # everything looks good, now find the new updated bias old_bias = conv_op.inputs.get("bias", None) old_bias_value = None if old_bias is not None and old_bias.val is not None: old_bias_value = old_bias.val if old_bias is None: # need to create a fresh numpy array for bias if np.prod(bias_value.shape) == 1: # its a scalar bias # need to find the value of Cout to form a new bias if conv_op.weight.val is None: return False # conv_transpose has weight format [K, C_out, spatial dims] # conv has weight format [C_out, K, spatial dims] Cout = conv_op.weight.val.shape[0 if is_conv_op else 1] new_bias_value = np.broadcast_to(bias_value, (Cout,)) else: new_bias_value = bias_value else: # just need to update the existing bias array try: new_bias_value = old_bias_value + bias_value except: return False # create a new conv op with the new bias value, copying rest of the attributes out_name = add_op.outputs[0].name if new_bias_value.dtype != np.float32 and new_bias_value.dtype != np.float16: # cast the bias to match the weight type weight_np_type = types.nptype_from_builtin( conv_op.inputs["weight"].sym_type.get_primitive() ) logger.warning( "conv_bias_fusion pass: casting bias " "from {} to {} to match the dtype of the weight of the conv layer".format( new_bias_value.dtype, weight_np_type ) ) new_bias_value = new_bias_value.astype(weight_np_type) new_bias_var = mb.const(val=new_bias_value, before_op=conv_op) conv_kargs = {"bias": new_bias_var, "name": out_name, "before_op": conv_op} for k, v in conv_op.inputs.items(): if k == "bias": continue conv_kargs[k] = v if is_conv_op: x = mb.conv(**conv_kargs) else: x = mb.conv_transpose(**conv_kargs) if add_op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=add_op, old_var=add_op.outputs[0], new_var=x, ): add_op.enclosing_block.remove_ops([conv_op, add_op]) return True return False @block_context_manager def _fuse_conv_bias_block(self, block): fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: block_changed = True while block_changed: block_changed = self._fuse_conv_bias_block(b) if len(op.blocks) > 0: # This op can't be conv or conv_transpose continue # pattern 1 : conv + add/sub add_op = self._match_pattern(op) if add_op is not 
None: if self._try_to_transform(op, add_op): fusion_occurred = True # pattern 2 : conv + transpose + add/sub elif self._try_to_transform_transpose_pattern(op, block): fusion_occurred = True return fusion_occurred @register_pass(namespace="common") class fuse_conv_scale(AbstractGraphPass): """ Fold ``mul``/``div`` into ``conv``/``conv_transpose`` by updating the weight/bias of the convolution layers. The scale ``const`` can be a single number (scalar) or a vector with a broadcastable shape. For example, if the output of the ``conv``/``deconv`` layer is ``(B, Cout, H, W)``, ``const`` of shape ``(Cout, 1, 1)`` and ``(1, Cout, 1, 1)`` are allowed. .. code-block:: Given: %2 = conv(%1) ... %3 = mul(%2, constant) # where constant is the scale constant ... Result: %3 = conv(%1) ... """ def apply(self, prog): for f in prog.functions.values(): block_changed = True while block_changed: block_changed = self._fuse_conv_scale_block(f) @staticmethod def _try_to_transform(conv_op, scale_op): # get the scale if scale_op.x.val is None and scale_op.y.val is None: return False scale_var = scale_op.x if scale_op.x.val is not None else scale_op.y scale = scale_var.val # for the scalar case, the scalar can be either # 1. a python int/float # 2. a 0d numpy array # 3. a 1d numpy array with shape (1,) is_scalar = True if isinstance(scale, np.ndarray): if scale.shape == (): scale = scale.tolist() elif scale.shape == (1) or scale.shape == (1,): scale = scale[0] else: is_scalar = False # get weight and bias and groups from conv layer if conv_op.weight.val is None: return False conv_weight = conv_op.weight.val conv_bias = conv_op.bias groups = conv_op.groups.val # get type of the conv layer is_deconv = conv_op.op_type == "conv_transpose" is_conv_1d = len(conv_weight.shape) == 3 # D_in denotes the spatial dimensions for conv kernel weight # for conv_transpose, conv_weight has shape [Cin, Cout / groups, *D_in] # for conv, conv_weight has shape [Cout, Cin / groups, *D_in] if is_deconv: Cout = conv_weight.shape[1] * groups Cin = conv_weight.shape[0] else: Cout = conv_weight.shape[0] Cin = conv_weight.shape[1] * groups # for the vector scale case, check if the shape is broacastable if not is_scalar: if not np.prod(scale.shape) == Cout: return False if len(scale.shape) == len(conv_weight.shape): if not scale.shape[1] == Cout: return False elif len(scale.shape) == len(conv_weight.shape) - 1: if not scale.shape[0] == Cout: return False else: return False # transform the scale to 1./scale for the real_div case if scale_op.op_type == "real_div": scale = 1.0 / scale # get the type of the conv weight conv_weight_type = conv_weight.dtype # create bias for conv if not exist if conv_bias is None: conv_bias = np.zeros(Cout) else: conv_bias = conv_bias.val conv_bias = conv_bias.astype(conv_weight_type) # get the original shape of weight and bias origin_weight_shape = conv_weight.shape origin_bias_shape = conv_bias.shape # update the weight/bias for conv layer if is_scalar: new_conv_bias = np.array(conv_bias * scale).astype(conv_weight_type) new_conv_weight = np.array(conv_weight * scale).astype(conv_weight_type) else: scale = np.reshape(scale, (Cout)) new_conv_bias = np.array(conv_bias * scale).astype(conv_weight_type) new_conv_weight = [] if is_deconv: conv_weight = np.transpose(conv_weight, [1, 0, 2] if is_conv_1d else [1, 0, 2, 3]) conv_weight = np.reshape( conv_weight, [Cout, Cin // groups] + list(conv_weight.shape[2:]) ) for i in range(Cout): _conv_weight = conv_weight[i] * scale[i] new_conv_weight.append(_conv_weight) 
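# Illustrative sketch of the per-channel folding performed in this loop (not executed here;
# hypothetical numpy values, regular conv weight layout [Cout, Cin/groups, *D]):
#     import numpy as np
#     w = np.random.rand(8, 4, 3); b = np.zeros(8); s = np.random.rand(8)
#     w_new = w * s[:, None, None]   # == conv_weight[i] * scale[i] for each output channel i
#     b_new = b * s                  # the bias is scaled the same way (computed above)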
new_conv_weight = np.array(new_conv_weight).astype(conv_weight_type) if is_deconv: new_conv_weight = np.reshape( new_conv_weight, [Cout // groups, Cin] + list(new_conv_weight.shape[2:]) ) new_conv_weight = np.transpose( new_conv_weight, [1, 0, 2] if is_conv_1d else [1, 0, 2, 3] ) # make sure the updated weight and bias have the same shape as the original ones assert ( new_conv_weight.shape == origin_weight_shape ), "conv weight should have the same shape before and after the fuse_conv_scale pass." assert ( new_conv_bias.shape == origin_bias_shape ), "conv bias should have the same shape before and after the fuse_conv_scale pass." # create a new conv op with the new weight, bias value, copying rest of the attributes out_name = scale_op.outputs[0].name conv_kargs = { "weight": new_conv_weight, "bias": new_conv_bias, "name": out_name, "before_op": conv_op, } for k, v in conv_op.inputs.items(): if k in ["weight", "bias"]: continue conv_kargs[k] = v if is_deconv: x = mb.conv_transpose(**conv_kargs) else: x = mb.conv(**conv_kargs) if scale_op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=scale_op, old_var=scale_op.outputs[0], new_var=x, ): scale_op.enclosing_block.remove_ops([conv_op, scale_op]) return True return False @block_context_manager def _fuse_conv_scale_block(self, block): def _match_pattern(op): if op.op_type == "conv" or op.op_type == "conv_transpose": # abort fusion if op output is also a block output if op.outputs[0] in op.enclosing_block.outputs: return None # find batch_norm op child_ops = op.outputs[0].child_ops if len(child_ops) == 1: scale_op_candidate = list(child_ops)[0] if scale_op_candidate.op_type in ["mul", "real_div"]: return scale_op_candidate return None fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: block_changed = True while block_changed: block_changed = self._fuse_conv_scale_block(b) if len(op.blocks) > 0: # This op can't be conv or conv_transpose continue scale_op = _match_pattern(op) if scale_op is not None: if self._try_to_transform(op, scale_op): fusion_occurred = True return fusion_occurred @register_pass(namespace="common") class fuse_pad_conv(AbstractGraphPass): """ When we observe ``pad -> transpose -> conv``, we move the ``pad`` to be next to ``conv``. This allows us to meld ``pad + conv`` if possible. .. code-block:: Given: %1 = pad(%0, ...) %2 = transpose(%1, ...) %3 = conv(%2, ...) ... Result: %1.a = transpose(%0, ...) $2.a = pad(%1.a, ...) %3 = conv(%2.a) ... 
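    Swapping the two ops only requires re-ordering the per-axis pad amounts to follow
    the transpose permutation. A minimal numpy sketch (illustrative, hypothetical values):

    .. code-block::

        import numpy as np

        pad = np.array([[0, 0], [1, 1], [2, 2], [0, 0]])   # (before, after) per NHWC axis
        perm = [0, 3, 1, 2]                                # NHWC -> NCHW transpose

        # axis i of the relocated pad corresponds to axis perm[i] of the original pad
        new_pad = np.array([pad[axis] for axis in perm]).flatten()
        # -> [0 0 0 0 1 1 2 2], the same H/W padding expressed in NCHW order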
""" def apply(self, prog): for f in prog.functions.values(): block_changed = True while block_changed: block_changed = self._pad_conv_connect_block(f) @staticmethod def _match_pattern(op): ret = set([]) child_ops = op.outputs[0].child_ops for child_op in child_ops: if child_op.op_type != "transpose": continue skip_ops = child_op.outputs[0].child_ops for skip_op in skip_ops: if "conv" not in skip_op.op_type: continue ret.update([child_op]) return ret if len(ret) != 0 else None @staticmethod def _try_to_transform(pad_op, transpose_ops, block): def _compute_new_pad_values(transpose_op): if pad_op.inputs["pad"].val is None: return None pad_amounts = np.reshape(pad_op.inputs["pad"].val, [-1, 2]) transpose_axes = transpose_op.inputs["perm"].val rank_diff = len(transpose_axes) - pad_amounts.shape[0] pad_amounts_new = copy.deepcopy(pad_amounts) # append "rank_diff" rows of zeros to the top pad_amounts_new = np.concatenate( (np.zeros((2 * rank_diff)).reshape(-1, 2), pad_amounts_new) ) pad_amounts_new = pad_amounts_new.astype(pad_amounts.dtype) pad_amounts = np.concatenate((np.zeros((2 * rank_diff)).reshape(-1, 2), pad_amounts)) for i, axis in enumerate(transpose_axes): pad_amounts_new[i][0] = pad_amounts[axis][0] pad_amounts_new[i][1] = pad_amounts[axis][1] # get the top "rank_diff" rows top_rows = pad_amounts_new[:rank_diff, :] if not np.all(top_rows == 0): return False # cut "rank_diff" from the top pad_amounts_new = pad_amounts_new[rank_diff:, :] pad_amounts_new = pad_amounts_new.flatten() return pad_amounts_new if pad_op.outputs[0] in pad_op.enclosing_block.outputs: return False if len(set(pad_op.outputs[0].child_ops)) != len(transpose_ops): return False for transpose_op in transpose_ops: pad_amounts_new = _compute_new_pad_values(transpose_op) if pad_amounts_new is None: continue with pad_op.enclosing_block: new_transpose_var = mb.transpose( x=pad_op.inputs["x"], perm=transpose_op.inputs["perm"].val, before_op=transpose_op, ) new_pad_inputs = {"x": new_transpose_var, "pad": pad_amounts_new} for k, v in pad_op.inputs.items(): if k not in new_pad_inputs: new_pad_inputs[k] = v new_pad_var = mb.pad(before_op=transpose_op, **new_pad_inputs) pad_op.enclosing_block.replace_uses_of_var_after_op( anchor_op=transpose_op, old_var=transpose_op.outputs[0], new_var=new_pad_var ) pad_op.enclosing_block.remove_ops(list(transpose_ops) + [pad_op]) return True @block_context_manager def _pad_conv_connect_block(self, block): fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: block_changed = True while block_changed: block_changed = self._pad_conv_connect_block(b) if op.op_type != "pad": continue transpose_ops = self._match_pattern(op) if transpose_ops is not None: if self._try_to_transform(op, transpose_ops, block): fusion_occurred = True return fusion_occurred @register_pass(namespace="common") class fuse_dilated_conv(AbstractGraphPass): """ When we observe ``space_to_batch -> conv (2D) -> batch_to_space``, we attempt to fuse these three ops into a single ``conv`` with dilations. .. code-block:: Given: %1 = space_to_batch(%0, ...) %2 = conv(%1, ...) %3 = batch_to_space(%2, ...) ... Result: %3 = conv(%0, dilations=...) ... 
""" @staticmethod def _uses_same_padding( input_h: int, input_w: int, W_h: int, W_w: int, dilation_factor: List[int], padding: List[int], crop: List[int], ) -> bool: base_paddings = [0] * 4 dilated_W_h = dilation_factor[0] * (W_h - 1) + 1 dilated_W_w = dilation_factor[1] * (W_w - 1) + 1 base_paddings[0] = (dilated_W_h - 1) // 2 base_paddings[1] = dilated_W_h - 1 - (dilated_W_h - 1) // 2 base_paddings[2] = (dilated_W_w - 1) // 2 base_paddings[3] = dilated_W_w - 1 - (dilated_W_w - 1) // 2 pad_start_h = base_paddings[0] pad_start_w = base_paddings[2] orig_pad_end_h = base_paddings[1] orig_pad_end_w = base_paddings[3] full_input_h = input_h + pad_start_h + orig_pad_end_h full_input_w = input_w + pad_start_w + orig_pad_end_w pad_end_extra_h = ( dilation_factor[0] - full_input_h % dilation_factor[0] ) % dilation_factor[0] pad_end_extra_w = ( dilation_factor[1] - full_input_w % dilation_factor[1] ) % dilation_factor[1] pad_end_h = orig_pad_end_h + pad_end_extra_h pad_end_w = orig_pad_end_w + pad_end_extra_w return ( padding[0] == pad_start_h and padding[1] == pad_end_h and padding[2] == pad_start_w and padding[3] == pad_end_w and crop[0] == 0 and crop[1] == pad_end_extra_h and crop[2] == 0 and crop[3] == pad_end_extra_w ) def apply(self: AbstractGraphPass, prog: Program) -> None: for f in prog.functions.values(): block_changed = True while block_changed: block_changed = self._fuse_dilated_conv_block(f) @staticmethod def _match_pattern(op: Operation) -> Optional[List[Operation]]: if op.op_type != "space_to_batch": return None if not _check_child_op_type(op, 'conv'): return None conv_op = op.outputs[0].child_ops[0] if len(conv_op.inputs['x'].shape[2:]) != 2: # restricted to Conv2d for now because in _try_to_transform function, # the logic for calculating whether padding is same or not, works only for 2d conv config. return None if not _check_child_op_type(conv_op, 'batch_to_space'): return None batch_to_space_op = conv_op.outputs[0].child_ops[0] return (op, conv_op, batch_to_space_op) @staticmethod def _try_to_transform(matched_ops: Tuple[Operation], block: Block) -> bool: if not _check_no_output_connection(block, matched_ops): return False space_to_batch_op, conv_op, batch_to_space_op = matched_ops stb_dilation_factor = space_to_batch_op.inputs['block_shape'].val bts_dilation_factor = batch_to_space_op.inputs['block_shape'].val if stb_dilation_factor is None or bts_dilation_factor is None: return False if list(stb_dilation_factor) != list(bts_dilation_factor): # If block_shape for space_to_batch and batch_to_space doesn't match, # we do not fuse. 
return False padding_val = space_to_batch_op.inputs['paddings'].val if padding_val is None: return False padding_val = padding_val.flatten() crop_val = batch_to_space_op.inputs['crops'].val if crop_val is None: return False crop_val = crop_val.flatten() has_same_padding = False if np.any(padding_val != 0): input_shape = space_to_batch_op.inputs['x'].shape W_shape = conv_op.inputs['weight'].shape W_h, W_w = W_shape[2], W_shape[3] HW = input_shape[2:] has_same_padding = fuse_dilated_conv._uses_same_padding( HW[0], HW[1], W_h, W_w, stb_dilation_factor, padding_val, crop_val ) if not has_same_padding: return False conv_args = conv_op.inputs conv_args['x'] = space_to_batch_op.inputs['x'] conv_args['dilations'] = list(stb_dilation_factor) if has_same_padding: conv_args['pad_type'] = 'same' new_var = mb.conv(**conv_args, before_op=conv_op) if conv_op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=conv_op, old_var=batch_to_space_op.outputs[0], new_var=new_var ): block.remove_ops(matched_ops) return True return False @block_context_manager def _fuse_dilated_conv_block(self: AbstractGraphPass, block: Block) -> bool: fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: block_changed = True while block_changed: block_changed = self._fuse_dilated_conv_block(b) matched_ops = self._match_pattern(op) if matched_ops is not None: if self._try_to_transform(matched_ops, block): fusion_occurred = True return fusion_occurred ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/optimize_elementwise_binary.py0000644000000000000000000004177614672066616031212 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.mil import Block from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Operation, Var from coremltools.converters.mil.mil import types as _types from coremltools.converters.mil.mil.ops.defs._utils import broadcast_shapes from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass @register_pass(namespace="common") class divide_to_multiply(AbstractGraphPass): """ Convert divide into multiply if the divisor is ``const``. """ def apply(self, prog): for f in prog.functions.values(): self._divide_to_multiply_block(f) @block_context_manager def _divide_to_multiply_block(self, block): for op in list(block.operations): for b in op.blocks: self._divide_to_multiply_block(b) if len(op.blocks) > 0: # This op can't be divided. continue # If real_div has integer input, the result is an integer (following TensorFlow spec). # Hence, this pass needs disabled if the input is not float, since it translates y # to a floating point number. If x or y was originally an integer, and y becomes # a floating point number, then the original type # signature (with integer output) would not be preserved. 
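# A minimal sketch of the rewrite below (hypothetical values): a constant divisor is
# replaced by multiplication with its precomputed reciprocal,
#     import numpy as np
#     y = np.float32(2.0)
#     inv_y = np.array(1.0, dtype=y.dtype) / y   # 0.5, same computation as new_y_val below
# so x / 2.0 becomes x * 0.5; the pass bails out when the reciprocal is not finite.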
if op.op_type == "real_div" and op.y.val is not None and _types.is_float(op.x.dtype): new_y_val = np.array(1.0, dtype=op.y.val.dtype) / op.y.val if not np.isfinite(new_y_val).all(): continue x = mb.mul(x=op.x, y=new_y_val, name="_inversed_" + op.name, before_op=op) op.enclosing_block.replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=x ) block.remove_ops([op]) @register_pass(namespace="common") class select_optimization(AbstractGraphPass): """ For ``select(cond, a, b)``, there are 2 cases where we can replace it with a single simpler op 1. If ``cond`` is a const scalar (or a const tensor but all elements are the same, which is equivalent to a scalar), then we replace ``select(cond, a, b)`` with simply ``a`` or ``b`` .. code-block:: Input graph: const(scalar cond) -| | a ------------------|-> select -> output | b ------------------| Output graph: if cond: a -> output else: b -> output 2. If ``cond`` is a more complicated const, and ``a`` is an inf const, then we replace ``a`` with ``select(cond, a, 0)``, then return ``a + b`` .. code-block:: Input graph: const(cond) -| | const(±inf) -|-> select -> output | b -----------| Output graph: select(cond, ±inf, 0) -| |-> add -> output b ---------------------| Note that ``select(cond, ±inf, 0))`` will further get eliminated by ``const_elimination``, so in the end the op in graph is simply ``add`` This replacement is based on floating-point arithmetic .. code-block:: inf + b = inf -inf + b = -inf 0 + b = b PS: if ``a`` is not inf const but ``b`` is, then we would swap ``a`` and ``b`` """ def apply(self, prog): @block_context_manager def apply_block(block: Block): for op in list(block.operations): for b in op.blocks: apply_block(b) if op.op_type == "select": self.try_to_transform_select(op) for f in prog.functions.values(): apply_block(f) def try_to_transform_select(self, select_op: Operation) -> bool: assert select_op.op_type == "select" cond_val = select_op.cond.val # this pass only handles const cond if cond_val is None: return False a_val = select_op.a.val b_val = select_op.b.val # if everything is const, then let const_elimination do its job if a_val is not None and b_val is not None: return False # try case 1: const scalar cond # (or const tensor cond but all elements are the same, which is equivalent to a scalar) result_candidate = self.try_to_transform_const_scalar_cond(select_op, cond_val) if result_candidate is not None and self.try_to_modify_block(select_op, result_candidate): return True # try case 2: complicated const cond + inf const a or b result_candidate = self.try_to_transform_inf_const_selection( select_op, cond_val, a_val, b_val ) if result_candidate is not None and self.try_to_modify_block(select_op, result_candidate): return True return False @staticmethod def try_to_transform_const_scalar_cond(select_op: Operation, cond_val: np.ndarray) -> Var: assert select_op.op_type == "select" assert cond_val is not None a = select_op.a b = select_op.b x: Var = None if np.all(cond_val): x = mb.identity(x=a, before_op=select_op) elif np.all(np.logical_not(cond_val)): x = mb.identity(x=b, before_op=select_op) else: return None result_shape = broadcast_shapes(a.shape, b.shape) # cannot simply replace with a or b if broadcasting if x.shape != result_shape: x.op.enclosing_block.remove_ops([x.op]) return None return x @staticmethod def try_to_transform_inf_const_selection( select_op: Operation, cond_val: np.ndarray, a_val: np.ndarray, b_val: np.ndarray ) -> Var: assert select_op.op_type == "select" assert cond_val is not 
None # check if a or b is all infinity constants # if a is not but b is, then swap a and b a: np.ndarray = None b: Var = None if a_val is not None and np.all(np.logical_not(np.isfinite(a_val))): a = a_val b = select_op.b elif b_val is not None and np.all(np.logical_not(np.isfinite(b_val))): a = b_val b = select_op.a cond_val = np.logical_not(cond_val) else: return None # build add cond_val, a = np.broadcast_arrays(cond_val, a) a = a.copy() a[np.where(np.logical_not(cond_val))] = 0.0 return mb.add(x=a, y=b, before_op=select_op, name=select_op.outputs[0].name) @staticmethod def try_to_modify_block(select_op: Operation, new_var: Var) -> bool: block: Block = select_op.enclosing_block if not block.try_replace_uses_of_var_after_op( anchor_op=select_op, old_var=select_op.outputs[0], new_var=new_var, ): return False block.remove_ops([select_op]) return True @register_pass(namespace="common") class fuse_elementwise_to_batchnorm(AbstractGraphPass): """ Fold ``mul`` + ``add`` into a ``batchnorm`` if the ``const`` feeding into the ``mul``/``add`` is of shape ``(1,C,1,1)`` or ``(C,1,1)`` and input to ``mul`` is of rank 4. .. code-block:: Given: [Const] [Const] | | V V [...] --> [Mul] --> [Add] --> [...] That is, %2 = op1(%1) %3 = mul(%2, constant) %4 = add(%3, constant) %5 = op2(%4) ... Result: [...] --> [BatchNorm] --> [...] That is, %2 = op1(%1) %4 = batchnorm(%2) %5 = op2(%4) ... """ def apply(self, prog): for f in prog.functions.values(): block_changed = True while block_changed: block_changed = self._fuse_elementwise_to_batchnorm_block(f) @staticmethod def _match_pattern(op): if op.outputs[0] in op.enclosing_block.outputs: return None if op.op_type == "mul": # find add child_ops = op.outputs[0].child_ops if len(child_ops) == 1: add_op_candidate = list(child_ops)[0] if add_op_candidate.op_type == "add": return add_op_candidate return None @staticmethod def _try_to_transform(mul_op, add_op, block): def _find_const_input_val(op): if op.x.val is not None: return op.x.val if op.y.val is not None: return op.y.val return None def _check_shape(arr): """ return True if shape is of form (1,C,1,1) or (C,1,1) """ rank = len(arr.shape) if not (rank == 3 or rank == 4): return False C = arr.shape[-3] if not (arr.shape == (1, C, 1, 1) or arr.shape == (C, 1, 1)): return False return True non_const_input_mul = mul_op.x if mul_op.x.val is None else mul_op.y if non_const_input_mul.rank != 4: return False gamma = _find_const_input_val(mul_op) beta = _find_const_input_val(add_op) if gamma is None or beta is None: return False if not (isinstance(gamma, np.ndarray) and isinstance(beta, np.ndarray)): return False # check that gamma and beta have shape (1,C,1,1) or (C,1,1) # that is they are doing vector addition on the axis=-3, which is what the # batchnorm layer does (batchnorm layer only works on rank 4 input tensors) if not (_check_shape(gamma) and _check_shape(beta)): return False C = gamma.shape[-3] if C == 1: return False out_name = add_op.outputs[0].name x = mb.batch_norm( x=non_const_input_mul, mean=np.zeros((C,), np.float32), variance=np.ones((C,), np.float32), gamma=np.squeeze(gamma), beta=np.squeeze(beta), name=out_name, before_op=mul_op, ) add_op.enclosing_block.replace_uses_of_var_after_op( anchor_op=add_op, old_var=add_op.outputs[0], new_var=x ) # Remove all the ops at once block.remove_ops([mul_op, add_op]) return True @block_context_manager def _fuse_elementwise_to_batchnorm_block(self, block): fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in 
op.blocks: block_changed = True while block_changed: block_changed = self._fuse_elementwise_to_batchnorm_block(b) if len(op.blocks) > 0: # This op can't be mul continue add_op = self._match_pattern(op) if add_op is not None: if self._try_to_transform(op, add_op, block): fusion_occurred = True return fusion_occurred @register_pass(namespace="common") class rank0_expand_dims_swap(AbstractGraphPass): """ Identify the pattern of a ``rank-0`` binary elementwise operation followed by an ``expand_dims`` op. In the MIL backend, the output of the ``elementwise`` op becomes rank 1. Hence, an ``expand_dims`` op should be added after both of the ``rank-0`` tensors, and the final ``expand_dims`` should be removed. If the output var of the binary elementwise op is consumed by more than one op, a ``squeeze`` op is inserted. `Input` .. code-block:: [...](rank-0) --> sub --> expand_dims (axes=[0]) --> [...] ^ | | |--> op2 | | | |--> op3 | [scalar const] `Output` .. code-block:: [...](rank-0) --> expand_dims (axes=[0]) --> sub --> [...] ^ | | |--> squeeze ---> op2 | | | |--> op3 | expand_dims (axes=[0]) ^ | | [scalar const] """ def apply(self, prog): for f in prog.functions.values(): block_changed = True while block_changed: block_changed = self._rank0_expand_dims_swap(f) @staticmethod def _try_to_transform(op, block): op_type = op.op_type ops_to_remove = [] if op.x.rank != 0 or op.y.rank != 0: return False # One and only one input is a scalar const if (op.x.val is None) == (op.y.val is None): return False var_1, var_2 = op.x, op.y ops_to_remove.append(op) # check if the output is consumed by exact one expand_dims op and other ops expand_dims_ops = [] other_ops = [] child_ops = list(op.outputs[0].child_ops) for child_op in child_ops: if child_op.op_type == "expand_dims": expand_dims_ops.append(child_op) else: other_ops.append(child_op) if len(expand_dims_ops) != 1: return False # check the expand_dim op has axes = [0] expand_dims_op = expand_dims_ops[0] expand_dims_op_axes_val = expand_dims_op.axes.val if isinstance(expand_dims_op_axes_val, np.ndarray): expand_dims_op_axes_val = expand_dims_op_axes_val.tolist() if expand_dims_op_axes_val != [0]: return False ops_to_remove.append(expand_dims_op) ops_to_remove += other_ops for out in op.outputs: if out in block.outputs: return False # add a expand_dims op after each rank-0 tensor var_1_expand = mb.expand_dims(x=var_1, axes=[0], before_op=op) var_2_expand = mb.expand_dims(x=var_2, axes=[0], before_op=op) # add a new elementwise binary op elem_op = getattr(mb, op_type) # replace var for the expand_dims op x = elem_op( x=var_1_expand, y=var_2_expand, name=expand_dims_op.outputs[0].name, before_op=op ) expand_dims_op.enclosing_block.replace_uses_of_var_after_op( anchor_op=expand_dims_op, old_var=expand_dims_op.outputs[0], new_var=x ) # replace var for other ops if len(other_ops) >= 1: elem_op_output = op.outputs[0] squeeze = mb.squeeze(x=x, before_op=op) for other_op in other_ops: new_op = getattr(mb, other_op.op_type) kargs = {} for k, v in other_op.inputs.items(): if v == elem_op_output: kargs[k] = squeeze else: kargs[k] = v kargs["name"] = other_op.name kargs["before_op"] = other_op new_var = new_op(**kargs) other_op.enclosing_block.replace_uses_of_var_after_op( anchor_op=other_op, old_var=other_op.outputs[0], new_var=new_var ) # Remove all the ops at once block.remove_ops(ops_to_remove) return True @block_context_manager def _rank0_expand_dims_swap(self, block): fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue 
for b in op.blocks: block_changed = True while block_changed: block_changed = self._rank0_expand_dims_swap(b) if len(op.blocks) > 0: # This op can't be elementwise binary ops continue if op.op_type in ["add", "sub", "mul", "real_div", "floor_div"]: if self._try_to_transform(op, block): fusion_occurred = True return fusion_occurred ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/optimize_linear.py0000644000000000000000000003576014672066616026573 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.mil import Block from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Operation, Program, Var from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass @register_pass(namespace="common") class fuse_linear_bias(AbstractGraphPass): """ Convert ``linear + add/sub`` to a single ``linear`` by updating the weight and bias of the ``linear`` layer. .. code-block:: Example 1: Original: %4 = linear(x=%1, weight=%2, bias=%3) # %2 is a rank-2 const tensor (weight) # %3 is a rank-1 const tensor (bias) ... %6 = add(x=%4, y=%5) # %5 is a const tensor with same shape as %3 Result: %8 = linear(x=%1, weight=%2, bias=%7) # where %7 is a new const tensor with value # %7 = %3 + %6 Example 2: Original: %4 = linear(x=%1, weight=%2, bias=%3) # %2 is a rank-2 const tensor (weight) # %3 is a rank-1 const tensor (bias) ... %6 = sub(x=%5, y=%4) # %5 is a const tensor with a broacasable shape with %3. i.e. if %3 has shape (Dout), %5 could be (1, Dout). 
Result: %9 = linear(x=%1, weight=%7, bias=%8) # where %7 is a new const tensor with value %7 = -%2 # %8 = %5 - %3 """ def apply(self, prog: Program): for f in prog.functions.values(): block_changed = True while block_changed: block_changed = self._fuse_linear_bias_block(f) @staticmethod def _try_to_transform(linear_op, add_or_sub_op, block): if add_or_sub_op.x.val is None and add_or_sub_op.y.val is None: return False is_sub = add_or_sub_op.op_type == "sub" is_first_input = add_or_sub_op.x == linear_op.outputs[0] # Return if weight or bias are missing values if linear_op.weight.val is None or linear_op.bias.val is None: return False # compute the new bias linear_bias = linear_op.bias.val bias = add_or_sub_op.y.val if is_first_input else add_or_sub_op.x.val # check if the shape is broadcasable if np.prod(linear_bias.shape) != np.prod(bias.shape): return False Dout = linear_bias.shape[0] if bias.shape[-1] != Dout: return False bias = np.reshape(bias, (Dout,)) if is_sub: if is_first_input: bias = -bias else: linear_bias = -linear_bias new_bias = linear_bias + bias # compute the new weight if is_sub and not is_first_input: new_weight = -linear_op.weight.val else: new_weight = linear_op.weight.val # create a new linear op with the new weight, bias value, copying rest of the attributes out_name = add_or_sub_op.outputs[0].name linear_kargs = { "weight": new_weight, "bias": new_bias, "name": out_name, "before_op": linear_op, } for k, v in linear_op.inputs.items(): if k in ["weight", "bias"]: continue linear_kargs[k] = v x = mb.linear(**linear_kargs) if add_or_sub_op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=add_or_sub_op, old_var=add_or_sub_op.outputs[0], new_var=x, ): add_or_sub_op.enclosing_block.remove_ops([linear_op, add_or_sub_op]) return True return False @block_context_manager def _fuse_linear_bias_block(self, block): def _find_candicate_op(op): if op.op_type != "linear": return None # abort fusion if op output is also a block output if op.outputs[0] in op.enclosing_block.outputs: return None # find add/sub op child_ops = op.outputs[0].child_ops if len(child_ops) == 1: op_candidate = list(child_ops)[0] if op_candidate.op_type in ["add", "sub"]: return op_candidate fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: block_changed = True while block_changed: block_changed = self._fuse_linear_bias_block(b) if len(op.blocks) > 0: # This op can't be conv or conv_transpose continue add_or_sub_op = _find_candicate_op(op) if add_or_sub_op is not None: if self._try_to_transform(op, add_or_sub_op, block): fusion_occurred = True return fusion_occurred @register_pass(namespace="common") class fuse_matmul_weight_bias(AbstractGraphPass): """ Convert ``matmul + add/sub`` to ``linear`` whenever possible. .. code-block:: Given: %3 = matmul(x=%1, y=%2) # %1 or %2 is const and rank 2 (weight) ... %5 = add(x=%3, y=%4) # %4 is const. add(x=%4, y=%3) is equivalent # sub is similar. 
Result: # assuming %2 above is const and rank 2 %5 = linear(x=%1, weight=%2, bias=%4) """ def apply(self, prog: Program): for f in prog.functions.values(): block_changed = True while block_changed: block_changed = self._fuse_matmul_weight_bias_block(f) @staticmethod def _find_candidate_op(op): _CHILD_OP_TYPES = ["add", "sub"] if op.op_type != "matmul": return None # find add child_ops = op.outputs[0].child_ops if len(child_ops) == 1: add_op_candidate = list(child_ops)[0] if add_op_candidate.op_type in _CHILD_OP_TYPES: return add_op_candidate @staticmethod def _transpose(v, before_op, name=None): """ Transpose the last 2 dims. - ``v``: (Var, must be a tensor). - ``before_op``: (Operation) The op right before the newly added ``transpose`` op. - ``name``: Name for the ``transpose`` op if provided. """ perm = list(range(v.rank)) perm[-2], perm[-1] = perm[-1], perm[-2] if name is None: return mb.transpose(x=v, perm=perm, before_op=before_op) else: return mb.transpose(x=v, perm=perm, before_op=before_op, name=name) def _try_to_transform(self, matmul_op, add_op, block): if matmul_op.x.val is None and matmul_op.y.val is None: # This is a dynamic matmul. return False if add_op.x.val is None and add_op.y.val is None: # This is a dynamic add. return False x_is_weight = matmul_op.x.val is not None if x_is_weight: weight, linear_x = matmul_op.x, matmul_op.y transpose_weight = matmul_op.transpose_x.val transpose_x = matmul_op.transpose_y.val else: weight, linear_x = matmul_op.y, matmul_op.x transpose_weight = matmul_op.transpose_y.val transpose_x = matmul_op.transpose_x.val # We potentially are going to transpose the weight, so if the weight itself is not removable, we skip this path if len(weight.nonreplaceable_vars_upstream) > 0: return False if linear_x.rank < 2 or weight.rank != 2: # We don't support these cases yet. return False # For those weights which are the input for more than one op, # we don't do the fusion. # The reason is that it might cause memory explosion by adding # those weight as a numpy array in the inner product or # the batch_mat_mul kernel. 
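        # Illustrative numpy sketch of the equivalence this method relies on (hypothetical
        # shapes; MIL's `linear` computes x @ weight.T + bias, hence the weight transpose):
        #     import numpy as np
        #     x = np.random.rand(5, 4); W = np.random.rand(4, 3); b = np.random.rand(3)
        #     linear_weight, linear_bias = W.T, b            # weight is transposed for `linear`
        #     assert np.allclose(x @ W + b, x @ linear_weight.T + linear_bias)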
if len(weight.child_ops) > 1: return False d_out = weight.shape[1] if not transpose_weight else weight.shape[0] bias = add_op.x.val if add_op.x.val is not None else add_op.y.val if len(bias.shape) > 1: if any([d != 1 for d in bias.shape[:-1]]): return # cannot transform # squeeze leading dims of size 1 bias = np.squeeze(bias) if len(bias.shape) != 1 or bias.shape[0] != d_out: return # cannot transform if add_op.op_type == "sub": bias = -bias out_name = add_op.outputs[0].name if x_is_weight: # If transpose_x == transpose_weight == False: # w*x = (x^T w^T)^T = linear(x^T, w)^T x_transposed = ( self._transpose(linear_x, before_op=matmul_op) if not transpose_x else linear_x ) w_no_transpose = ( weight if not transpose_weight else self._transpose(weight, before_op=matmul_op) ) x = mb.linear(x=x_transposed, weight=w_no_transpose, bias=bias, before_op=matmul_op) x = self._transpose(x, before_op=matmul_op, name=out_name) else: # If transpose_x == transpose_weight == False # x*w = x*(w^T)^T = linear(x, w^T) x_no_transpose = ( self._transpose(linear_x, before_op=matmul_op) if transpose_x else linear_x ) w_transposed = ( weight if transpose_weight else self._transpose(weight, before_op=matmul_op) ) x = mb.linear( x=x_no_transpose, weight=w_transposed, bias=bias, before_op=matmul_op, name=out_name, ) if add_op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=add_op, old_var=add_op.outputs[0], new_var=x, ): add_op.enclosing_block.remove_ops([matmul_op, add_op]) return True return False @block_context_manager def _fuse_matmul_weight_bias_block(self, block): fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: block_changed = True while block_changed: block_changed = self._fuse_matmul_weight_bias_block(b) if len(op.blocks) > 0: # This op can't be matmul continue add_op = self._find_candidate_op(op) if add_op is not None: if self._try_to_transform(op, add_op, block): fusion_occurred = True return fusion_occurred @register_pass(namespace="common") class fuse_transpose_matmul(AbstractGraphPass): """ Fuse ``transpose + matmul`` to ``matmul`` if possible, since ``matmul`` has args ``transpose_x`` and ``transpose_y`` to transpose last 2 dims .. code-block:: Positive example: Input graph: transpose(x=x, perm=(1, 0)) -| |-> matmul(x=transposed_x, y=transposed_y) transpose(x=y, perm=(1, 0)) -| Output graph: matmul(x=x, y=y, transpose_x=True, transpose_y=True) Negative example: Input graph: transpose(x=x, perm=(1, 0, 2)) -| |-> matmul(x=transposed_x, y=transposed_y) transpose(x=y, perm=(1, 0, 2)) -| Output graph: Same to input graph, nothing changes """ def apply(self, prog: Program) -> None: for f in prog.functions.values(): self._fuse_transpose_matmul_block(f) @block_context_manager def _fuse_transpose_matmul_block(self, block: Block) -> None: # use shallow copy to hide changes on block.operations during the loop, # since we try fusion when loop to matmul, which will not affect downstream for op in list(block.operations): for b in op.blocks: self._fuse_transpose_matmul_block(b) if op.op_type == "matmul": self._try_fuse_transpose_matmul(op, block) @staticmethod def is_transposed_and_fusable_to_matmul(x: Var) -> bool: """ 1. check if x is transposed 2. 
check if x is transposed in the last 2 dimensions, since the transpose arg in matmul only transposes the last 2 dimensions """ # x is not transposed, False if x.op is None or x.op.op_type != "transpose": return False rank = x.rank # if transposing a rank < 2 tensor, it is a noop and will be elimianted by noop_elimination if rank < 2: return False # canonicalize the input permutation to compare with last-2-dim permutation below perm = x.op.perm.val perm[np.where(perm < 0)] += rank perm[-2:] -= rank # permuting only last 2 dims should look like (0, 1, ..., -1, -2) perm_only_last_2_dims = np.arange(rank) perm_only_last_2_dims[-2] = -1 perm_only_last_2_dims[-1] = -2 return np.all(perm == perm_only_last_2_dims) def _try_fuse_transpose_matmul(self, op: Operation, block: Block) -> None: assert op.op_type == "matmul" x = op.x y = op.y transpose_x = False if op.transpose_x is None else op.transpose_x.val transpose_y = False if op.transpose_y is None else op.transpose_y.val is_x_transposed_and_fusable_to_matmul = self.is_transposed_and_fusable_to_matmul(x) is_y_transposed_and_fusable_to_matmul = self.is_transposed_and_fusable_to_matmul(y) # if neither x nor y is transposed and fuseable with matmul, nothing we need to do if not is_x_transposed_and_fusable_to_matmul and not is_y_transposed_and_fusable_to_matmul: return if is_x_transposed_and_fusable_to_matmul: x = x.op.x transpose_x = not transpose_x if is_y_transposed_and_fusable_to_matmul: y = y.op.x transpose_y = not transpose_y fused_transpose_matmul = mb.matmul( x=x, y=y, transpose_x=transpose_x, transpose_y=transpose_y, before_op=op, name=op.name, ) block.replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=fused_transpose_matmul, ) op.remove_from_block() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/optimize_normalization.py0000644000000000000000000012527014672066616030203 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import List, Optional import numpy as np from coremltools import _logger as logger from coremltools.converters.mil.mil import Block from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Operation, Program, Var from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import ( _check_no_output_connection, block_context_manager, ) from coremltools.converters.mil.mil.passes.pass_registry import register_pass @register_pass(namespace="common") class fuse_layernorm_or_instancenorm(AbstractGraphPass): """ A graph optimization pass on PyMIL to detect and fuse several variants of ``layer_norm`` or ``instance_norm``. Pattern 1 corresponds to either ``layer_norm`` or ``instance_norm``. Patterns 2-4 are ``instance_norm``. Pattern 5 is ``layer_norm``. You can find these patterns in the methods for this class in the source code. To quickly view the source code, click the **[source]** button at the end of the class definition. 
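    All of these patterns compute the same normalization. A minimal numpy sketch of
    pattern 1, normalizing over the last axis (illustrative, hypothetical values):

    .. code-block::

        import numpy as np

        x = np.random.rand(2, 8).astype(np.float32)
        gamma, beta, epsilon = np.random.rand(8), np.random.rand(8), 1e-5

        mean = x.mean(axis=-1, keepdims=True)
        variance = ((x - mean) ** 2).mean(axis=-1, keepdims=True)
        y = gamma * (x - mean) / np.sqrt(variance + epsilon) + beta   # == layer_norm over axes=[-1]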
""" _DEBUG = False # set to true to plot the block before and after the transformation def apply(self, prog: Program): for f in prog.functions.values(): block_changed = True while block_changed: if self._DEBUG: import graphviz graphviz.Source( f.get_dot_string( highlight_debug_op_types=["instance_norm"], ) ).view(filename="/tmp/block_before_fuse_layernorm_or_instancenorm") logger.debug("Block before fuse_layernorm_or_instancenorm transform:\n{}".format(f)) block_changed = self._fuse_layernorm_or_instancenorm_block(f) if self._DEBUG: graphviz.Source( f.get_dot_string( highlight_debug_op_types=["instance_norm"], ) ).view(filename="/tmp/block_after_fuse_layernorm_or_instancenorm") logger.debug("Block after fuse_layernorm_or_instancenorm transform:\n{}".format(f)) @staticmethod def _check_reduce_op(reduce_op: Operation, mode: str = "reduce_mean") -> bool: """ Check whether or not the ``reduction`` op satisfies following conditions: - Mode is expected. - Does not change rank (``keep_dims`` is ``True``). - The ``axes`` is known at compile time. Parameters ---------- param reduce_op : ``reduce_op`` to check on. param mode : ``reduce`` mode """ if reduce_op is None: return False if reduce_op.op_type != mode: return False if reduce_op.keep_dims is None or reduce_op.keep_dims.val is None: return False if reduce_op.keep_dims.val is False: return False if reduce_op.axes is None or reduce_op.axes.val is None: return False return True @staticmethod def _check_child_op_types( op: Operation, child_op_types: List[str], check_order: bool = True ) -> bool: """ Returns ``True`` for child op types matching ``child_op_types``, otherwise returns ``False``. Parameters ---------- param op : Current op. param child_op_type : Expected child op type. param check_order : Ensure child in given order, defaults to ``True``. """ if op is None or len(op.outputs) != 1: return False child_ops = list(op.outputs[0].child_ops) if len(child_ops) != len(child_op_types): return False ops_types = [c.op_type for c in child_ops] if check_order is False: ops_types = sorted(ops_types) child_op_types = sorted(child_op_types) return ops_types == child_op_types @staticmethod def _try_get_child_op_type( op: Operation, child_op_type: str, index: int = 0 ) -> Optional[Operation]: """ Returns child op if type matches, otherwise returns ``None``. Parameters ---------- param op : Current op. param child_op_type : Expected child op type. param index : Child op index. """ if op is None: return None if len(op.outputs) != 1: return None child_ops = list(op.outputs[0].child_ops) if index >= len(child_ops): return None if child_ops[index].op_type != child_op_type: return None return child_ops[index] @staticmethod def _try_apply_transform( reduce_op: Operation, block: Block, gamma_var: Var, beta_var: Var, epsilon_var: Var, end_op: Operation, ops_to_remove: List[Operation], ) -> bool: """ Insert instance_norm / layer_norm and delete all ops. :param reduce_op: Start operation of the pattern. :param block: Block :param gamma_var: Gamma variable. :param beta_var: Beta variable. :param epsilon_var: Epsilon variable. :param end_op: End operation of the pattern. :param ops_to_remove: Operations to remove. 
""" if not _check_no_output_connection(block, ops_to_remove): return False axes = reduce_op.axes.val rank = len(reduce_op.x.shape) # check whether the pattern is instance_norm or layer_norm is_layernorm = False is_instancenorm = False is_require_rank4_transpose = False negative_axes = [a - rank if a >= 0 else a for a in axes] negative_axes.sort() gamma_rank = gamma_var.rank if gamma_var is not None else -1 beta_rank = beta_var.rank if beta_var is not None else -1 if gamma_rank == len(axes) and beta_rank == len(axes): # axes for layer_norm must be [-1] or [-1, -2] or [-1, -2, -3] and so on if negative_axes == list(range(-len(negative_axes), 0)): is_layernorm = True if rank == 4 and negative_axes == [-3]: is_layernorm = (gamma_var is None and beta_var is None) or (gamma_rank == 1 and beta_rank == 1) if gamma_var: ops_to_remove.append(gamma_var.op) gamma_var = gamma_var.val else: gamma_var = None if beta_var: ops_to_remove.append(beta_var.op) beta_var = beta_var.val else: beta_var = None if rank == 4 and (negative_axes == [-2, -1] or negative_axes == [-3, -2]): if ( len(np.squeeze(gamma_var.val).shape) == 1 and len(np.squeeze(beta_var.val).shape) == 1 ): is_instancenorm = True if negative_axes == [-3, -2]: is_require_rank4_transpose = True if not (is_instancenorm or is_layernorm): return False # remove all the ops, and replace with a layer_norm or instance_norm op out_name = end_op.outputs[0].name if is_require_rank4_transpose: x = mb.transpose( x=reduce_op.x, perm=[0, 3, 1, 2], name=out_name + "_transpose_nhwc_nchw", before_op=end_op, ) if is_instancenorm: x = mb.instance_norm( x=x if is_require_rank4_transpose else reduce_op.x, gamma=np.squeeze(gamma_var.val), beta=np.squeeze(beta_var.val), epsilon=epsilon_var, name=out_name + "_instancenorm" if is_require_rank4_transpose else out_name, before_op=end_op, ) ops_to_remove.extend([gamma_var.op, beta_var.op]) else: # is_layernorm x = mb.layer_norm( x=x if is_require_rank4_transpose else reduce_op.x, axes=axes, gamma=gamma_var, beta=beta_var, epsilon=epsilon_var, name=out_name + "_layernorm" if is_require_rank4_transpose else out_name, before_op=end_op, ) if is_require_rank4_transpose: x = mb.transpose( x=x, perm=[0, 2, 3, 1], name=out_name + "_transpose_nchw_nhwc", before_op=end_op, ) end_op.enclosing_block.replace_uses_of_var_after_op( anchor_op=end_op, old_var=end_op.outputs[0], new_var=x ) # Remove all the ops at once block.remove_ops(ops_to_remove) return True def _try_match_and_transform_pattern_1(self, reduce_op, block) -> bool: """ Identify the pattern: ``y = gamma * (x - mean) / sqrt(variance + epsilon) + beta`` ``y = x * [gamma * rsqrt(variance + eps)] + (beta - mean * [gamma * rsqrt(variance + eps)])`` .. code-block:: x --> reduce_mean --> sub --> square --> reduce_mean --> add(epsilon) --> rsqrt | | ^ | | | | V |----------------------- mul (gamma) | | | | | --------|--------- | | | | | | | V | |----------------------------------------------------------------> mul | | | | V | |--------------------------------------------------------------> mul | | V | sub (beta) --> add --> [...] | ^ |------------------------------- This pattern corresponds to either ``layer_norm`` or ``instance_norm``. It is ``instance_norm`` if all of the following are true: - ``input`` is rank 4. - ``axes`` of ``reduce_mean`` is ``[-2, -1]`` or ``[-3, -2]`` (when ``[-3, -2]``, a channel first to channel last transpose would be inserted). - ``gamma`` and ``beta`` are rank 1, after ``squeeze``. 
It is ``layer_norm`` if all of the following are true: - ``axes`` is either ``[-1]``, ``[-1, -2]``, or ``[-1, -2, -3]``, and so on. - ``rank`` of ``gamma`` and ``beta`` is equal to the length of the ``axes``. """ ops_to_remove = [] root_var = reduce_op.x if root_var.shape is None: return False # check that root_var feeds into exactly 3 ops if len(list(root_var.child_ops)) != 3: return False if root_var.op is not None and not self._check_child_op_types( root_var.op, child_op_types=["reduce_mean", "sub", "mul"] ): return False # check 1st reduce_mean op if not self._check_reduce_op(reduce_op): return False ops_to_remove.append(reduce_op) # check 1st sub op if not self._check_child_op_types(reduce_op, ["sub", "mul"], check_order=False): return False child_ops_reduce_mean = list(reduce_op.outputs[0].child_ops) op_a = child_ops_reduce_mean[0] op_b = child_ops_reduce_mean[1] sub_op1 = op_a if op_a.op_type == "sub" else op_b if not (sub_op1.x == root_var and sub_op1.y == reduce_op.outputs[0]): return False ops_to_remove.append(sub_op1) # check square op square_op = self._try_get_child_op_type(sub_op1, "square") if square_op is None: return False ops_to_remove.append(square_op) # check second reduce mean reduce_op2 = self._try_get_child_op_type(square_op, "reduce_mean") if not self._check_reduce_op(reduce_op2): return False ops_to_remove.append(reduce_op2) # check add op (with epsilon) add_op1 = self._try_get_child_op_type(reduce_op2, "add") if add_op1 is None: return False epsilon_var = add_op1.y if add_op1.x == reduce_op2.outputs[0] else add_op1.x if epsilon_var.val is None or len(epsilon_var.val.shape) != 0: return False # must be scalar ops_to_remove.append(add_op1) # check rsqrt rsqrt_op = self._try_get_child_op_type(add_op1, "rsqrt") if rsqrt_op is None: return False ops_to_remove.append(rsqrt_op) # check mul (gamma) mul_op1 = self._try_get_child_op_type(rsqrt_op, "mul") if mul_op1 is None: return False gamma_var = mul_op1.y if mul_op1.x == rsqrt_op.outputs[0] else mul_op1.x if gamma_var.val is None: return False ops_to_remove.append(mul_op1) # check 2 muls after the gamma mul if not self._check_child_op_types(mul_op1, ["mul", "mul"]): return False child_ops = list(mul_op1.outputs[0].child_ops) mul_op2 = child_ops[0] mul_op3 = child_ops[1] mul_op2_other_var = mul_op2.x if mul_op2.y == mul_op1.outputs[0] else mul_op2.y mul_op3_other_var = mul_op3.x if mul_op3.y == mul_op1.outputs[0] else mul_op3.y if not ( (mul_op2_other_var == root_var and mul_op3_other_var == reduce_op.outputs[0]) or (mul_op2_other_var == reduce_op.outputs[0] and mul_op3_other_var == root_var) ): return False if mul_op2_other_var == root_var: mul_root_op = mul_op2 mul_mean_op = mul_op3 else: mul_root_op = mul_op3 mul_mean_op = mul_op2 ops_to_remove.append(mul_mean_op) ops_to_remove.append(mul_root_op) # check sub with beta sub_op2 = self._try_get_child_op_type(mul_mean_op, "sub") if sub_op2 is None: return False if sub_op2.y != mul_mean_op.outputs[0]: return False beta_var = sub_op2.x if beta_var.val is None: return False ops_to_remove.append(sub_op2) # check last add op add_op2 = self._try_get_child_op_type(sub_op2, "add") if add_op2 is None: return False if not (add_op2.x == mul_root_op.outputs[0] or add_op2.y == mul_root_op.outputs[0]): return False ops_to_remove.append(add_op2) return self._try_apply_transform( reduce_op, block, gamma_var, beta_var, epsilon_var, add_op2, ops_to_remove ) def _try_match_and_transform_pattern_2(self, reduce_op, block) -> bool: """ Identify the pattern: ``y = (x - mean) / pow(variance + 
epsilon) * gamma + beta`` This pattern corresponds to, and should be fused as, ``instance_norm``. All of the following conditions must be satisfied: 1. ``input`` is rank 4 tensor. 2. ``reduce`` operates on spatial dimensions ``axes=[-2, -1]``, or ``axes=[-3, -2]`` (a channel first to channel last transpose would be inserted in such cases). 3. ``gamma`` and ``beta`` are both shape ``(C,)`` after ``squeeze``, where ``C`` is number of channels. .. code-block:: |----> sub -----| const (0.5) | ^ | | | | V V x ---> mean square --> mean1 --> add_eps ---> pow const_gamma const_beta | | | | | | V V V V |----> sub1 --------------------------------> real_div --> mul_gamma --> add_beta --> ... """ ops_to_remove = [] root_var = reduce_op.x if root_var.shape is None: return False # check that root_var feeds into exactly 3 ops if len(root_var.child_ops) != 3: return False if root_var.op is not None and not self._check_child_op_types( root_var.op, child_op_types=["reduce_mean", "sub", "sub"] ): return False # check 1st reduce_mean op if not self._check_reduce_op(reduce_op): return False ops_to_remove.append(reduce_op) # check 1st sub op if not self._check_child_op_types(reduce_op, ["sub", "sub"]): return False child_ops_reduce_mean = list(reduce_op.outputs[0].child_ops) reduce_mean_child_op_a = child_ops_reduce_mean[0] reduce_mean_child_op_b = child_ops_reduce_mean[1] # One of sub op directly goes square, the other one goes real_div if list(reduce_mean_child_op_a.outputs[0].child_ops)[0].op_type == "square": sub_op0 = reduce_mean_child_op_a sub_op1 = reduce_mean_child_op_b else: sub_op0 = reduce_mean_child_op_b sub_op1 = reduce_mean_child_op_a if not (sub_op0.x == root_var and sub_op0.y == reduce_op.outputs[0]): return False if not (sub_op1.x == root_var and sub_op1.y == reduce_op.outputs[0]): return False ops_to_remove.append(sub_op0) ops_to_remove.append(sub_op1) # check square op square_op = self._try_get_child_op_type(sub_op0, "square") if square_op is None: return False ops_to_remove.append(square_op) # check second reduce mean reduce_op2 = self._try_get_child_op_type(square_op, "reduce_mean") if not self._check_reduce_op(reduce_op2): return False ops_to_remove.append(reduce_op2) # check add op (with epsilon) add_eps_op = self._try_get_child_op_type(reduce_op2, "add") if add_eps_op is None: return False epsilon_var = add_eps_op.y if add_eps_op.x == reduce_op2.outputs[0] else add_eps_op.x if epsilon_var.val is None or len(epsilon_var.val.shape) != 0: return False # must be scalar ops_to_remove.append(add_eps_op) # check pow pow_op = self._try_get_child_op_type(add_eps_op, "pow") if pow_op is None: return False if pow_op.y.val is None or not np.isclose(pow_op.y.val, 0.5): return False ops_to_remove.append(pow_op) # check real_div real_div_op = self._try_get_child_op_type(pow_op, "real_div") if real_div_op is None: return False if not (real_div_op.x == sub_op1.outputs[0] and real_div_op.y == pow_op.outputs[0]): return False ops_to_remove.append(real_div_op) # check mul with gamma mul_gamma_op = self._try_get_child_op_type(real_div_op, "mul") if mul_gamma_op is None: return False gamma_var = mul_gamma_op.y if mul_gamma_op.x == real_div_op.outputs[0] else mul_gamma_op.x if gamma_var.val is None: return False ops_to_remove.append(mul_gamma_op) # check add with beta add_beta_op = self._try_get_child_op_type(mul_gamma_op, "add") if add_beta_op is None: return False beta_var = add_beta_op.y if add_beta_op.x == mul_gamma_op.outputs[0] else add_beta_op.x if beta_var.val is None: return False 
ops_to_remove.append(add_beta_op) return self._try_apply_transform( reduce_op, block, gamma_var, beta_var, epsilon_var, add_beta_op, ops_to_remove ) def _try_match_and_transform_pattern_3(self, reduce_op, block) -> bool: """ Detect ``InstanceNorm`` pattern in TensorFlow-Addons. This pattern corresponds to, and should be fused as, ``instance_norm``. All of the following conditions must be satisfied: 1. ``input`` is rank 4 tensor. 2. ``reduce`` operates on spatial dimensions ``axes=[-2, -1]``, or ``axes=[-3, -2]`` (a channel first to channel last transpose would be inserted in such cases). 3. ``gamma`` and ``beta`` are absent. Default values for ``gamma`` and ``beta`` would be used. .. code-block:: |-------------------------------------------------| | | | V x --> mean square --> mean1 --> add_eps --> rsqrt --> mul2 --> mul_sub | | ^ | | | V | | | | --> sub -----| | | | V V |--------------------------------------------> mul1 -------------> add --> ... """ ops_to_remove = [] root_var = reduce_op.x if root_var.shape is None: return False # check that root_var feeds into exactly 3 ops if len(root_var.child_ops) != 3: return False if root_var.op is not None and not self._check_child_op_types( root_var.op, ["sub", "mul", "reduce_mean"] ): return False # check 1st reduce_mean op if not self._check_reduce_op(reduce_op): return False ops_to_remove.append(reduce_op) # check 1st sub op if not self._check_child_op_types(reduce_op, ["sub", "mul"], check_order=False): return False child_ops_reduce_mean = list(reduce_op.outputs[0].child_ops) reduce_mean_child_op_a = child_ops_reduce_mean[0] reduce_mean_child_op_b = child_ops_reduce_mean[1] sub_op1 = ( reduce_mean_child_op_a if reduce_mean_child_op_a.op_type == "sub" else reduce_mean_child_op_b ) if not (sub_op1.x == root_var and sub_op1.y == reduce_op.outputs[0]): return False ops_to_remove.append(sub_op1) # check square op square_op = self._try_get_child_op_type(sub_op1, "square") if square_op is None: return False ops_to_remove.append(square_op) # check second reduce mean reduce_op2 = self._try_get_child_op_type(square_op, "reduce_mean") if reduce_op2 is None or not self._check_reduce_op(reduce_op2): return False ops_to_remove.append(reduce_op2) # check add op (with epsilon) add_eps_op = self._try_get_child_op_type(reduce_op2, "add") if add_eps_op is None: return False epsilon_var = add_eps_op.y if add_eps_op.x == reduce_op2.outputs[0] else add_eps_op.x if epsilon_var.val is None or len(epsilon_var.val.shape) != 0: return False # must be scalar ops_to_remove.append(add_eps_op) # check rsqrt rsqrt_op = self._try_get_child_op_type(add_eps_op, "rsqrt") if rsqrt_op is None: return False ops_to_remove.append(rsqrt_op) # check mul 1 mul_op1 = self._try_get_child_op_type(rsqrt_op, "mul") if mul_op1 is None: return False if not ( (mul_op1.x == root_var and mul_op1.y == rsqrt_op.outputs[0]) or (mul_op1.x == rsqrt_op.outputs[0] and mul_op1.y == root_var) ): return False ops_to_remove.append(mul_op1) # check mul 2 mul_op2 = self._try_get_child_op_type(rsqrt_op, "mul", index=1) if mul_op2 is None: return False if not ( (mul_op2.x == reduce_op.outputs[0] and mul_op2.y == rsqrt_op.outputs[0]) or (mul_op2.x == rsqrt_op.outputs[0] and mul_op2.y == reduce_op.outputs[0]) ): return False ops_to_remove.append(mul_op2) # check mul (sub) mul_sub_op = self._try_get_child_op_type(mul_op2, "mul") if mul_sub_op is None: return False if mul_sub_op.y.val is None or mul_sub_op.y.val != -1: return False ops_to_remove.append(mul_sub_op) # check last add op add_op = 
self._try_get_child_op_type(mul_sub_op, "add") if add_op is None: return False if not ( (add_op.x == mul_op1.outputs[0] and add_op.y == mul_sub_op.outputs[0]) or (add_op.x == mul_sub_op.outputs[0] and add_op.y == mul_op1.outputs[0]) ): return False ops_to_remove.append(add_op) gamma_var = mb.const( val=np.ones(shape=(1, root_var.shape[1], 1, 1)), name="_fuse_layernorm_or_instancenorm_gamma", ) beta_var = mb.const( val=np.zeros(shape=(1, root_var.shape[1], 1, 1)), name="_fuse_layernorm_or_instancenorm_beta", ) return self._try_apply_transform( reduce_op, block, gamma_var, beta_var, epsilon_var, add_op, ops_to_remove ) def _try_match_and_transform_pattern_4(self, reduce_op: Operation, block: Block) -> bool: """ Identify the pattern: ``y = x * [gamma * rsqrt(variance + eps)] + (beta - mean * [gamma * rsqrt(variance + eps)])`` This pattern corresponds to, and should be fused as, ``instance_norm``. All of the following conditions must be satisfied: 1. ``input`` is rank 4 tensor. 2. ``reduce`` operates on spatial dimensions ``axes=[-2, -1]`` or ``axes=[-3, -2]`` (a channel first to channel last transpose would be inserted in such cases). 3. ``gamma`` and ``beta`` are both shape ``(C,)`` after ``squeeze``, where ``C`` is number of channels. .. code-block:: |-----------| | V |------> mul_square1 -----> sum1 -----> mul_mean1 | | | V x --> sum --> mul_mean ==> mul_square --> sub_variance --> add_eps --> rsqrt | | | | | V | | mul_gamma | | | | | |----------------| | | | V | |--------------------------------------------+-------------> mul2 | V | |----------------------------------------------------------> mul1 | | V | sub_beta --> add --> [...] | ^ |---------------------------| """ ops_to_remove = [] root_var = reduce_op.x if root_var.shape is None: return False # check that root_var feeds into exactly 4 ops if len(root_var.child_ops) != 4: return False if ( root_var.op is not None and not self._check_child_op_types( root_var.op, child_op_types=["mul", "mul", "reduce_sum", "mul"] ) and not self._check_child_op_types( # The _check_child_op_types checks for the exact order of the child_ops. 
root_var.op, child_op_types=["mul", "mul", "mul", "reduce_sum"], ) ): return False # check 1st reduce_sum op if not self._check_reduce_op(reduce_op, mode="reduce_sum"): return False ops_to_remove.append(reduce_op) # check mul (mean) op mul_mean_op = self._try_get_child_op_type(reduce_op, "mul") if mul_mean_op is None: return False if mul_mean_op.y.shape != (): return False ops_to_remove.append(mul_mean_op) # check 1st mul (square) op if not self._check_child_op_types(mul_mean_op, child_op_types=["mul", "mul", "mul"]): return False # both 0 and 1 should be mul square op mul_square_op = self._try_get_child_op_type(mul_mean_op, "mul") if mul_square_op is None: return False if self._try_get_child_op_type(mul_mean_op, "mul", index=1) is None: return False ops_to_remove.append(mul_square_op) # Check another branch # check 2nd mul (square) op # both 0 and 1 should be mul square op 1 mul_square_op2 = list(root_var.child_ops)[0] ops_to_remove.append(mul_square_op2) # check 2nd reduce sum reduce_op2 = self._try_get_child_op_type(mul_square_op2, child_op_type="reduce_sum") if not self._check_reduce_op(reduce_op2, "reduce_sum"): return False ops_to_remove.append(reduce_op2) # check mul after 2nd reduce op mul_mean_op2 = self._try_get_child_op_type(reduce_op2, "mul") if mul_mean_op2 is None: return False if mul_mean_op2.y.shape != (): return False ops_to_remove.append(mul_mean_op2) # check sub (variance) sub_variance_op = self._try_get_child_op_type(mul_mean_op2, "sub") if sub_variance_op is None: return False if sub_variance_op.y != mul_square_op.outputs[0]: return False ops_to_remove.append(sub_variance_op) # check add op (epsilon) add_eps_op = self._try_get_child_op_type(sub_variance_op, "add") if add_eps_op is None: return False epsilon_var = add_eps_op.y if add_eps_op.x == sub_variance_op.outputs[0] else add_eps_op.x if epsilon_var.val is None or len(epsilon_var.val.shape) != 0: return False # must be scalar ops_to_remove.append(add_eps_op) # check rsqrt rsqrt_op = self._try_get_child_op_type(add_eps_op, "rsqrt") if rsqrt_op is None: return False ops_to_remove.append(rsqrt_op) # check mul (gamma) mul_gamma_op = self._try_get_child_op_type(rsqrt_op, "mul") if mul_gamma_op is None: return False gamma_var = mul_gamma_op.y if mul_gamma_op.x == rsqrt_op.outputs[0] else mul_gamma_op.x if gamma_var.val is None: return False ops_to_remove.append(mul_gamma_op) # check 2 muls after the gamma mul if not self._check_child_op_types(mul_gamma_op, ["mul", "mul"]): return False mul_gamma_child_ops = list(mul_gamma_op.outputs[0].child_ops) mul_op1 = mul_gamma_child_ops[0] mul_op2 = mul_gamma_child_ops[1] mul_op1_other_var = mul_op1.x if mul_op1.y == mul_gamma_op.outputs[0] else mul_op1.y mul_op2_other_var = mul_op2.x if mul_op2.y == mul_gamma_op.outputs[0] else mul_op2.y if not ( (mul_op1_other_var == root_var and mul_op2_other_var == mul_square_op.x) or (mul_op1_other_var == mul_square_op.x and mul_op2_other_var == root_var) ): return False if mul_op1_other_var == root_var: mul_op1, mul_op2 = mul_op1, mul_op2 else: mul_op2, mul_op1 = mul_op1, mul_op2 ops_to_remove.append(mul_op1) ops_to_remove.append(mul_op2) # check sub with beta sub_beta_op = self._try_get_child_op_type(mul_op2, "sub") if sub_beta_op is None: return False if sub_beta_op.y != mul_op2.outputs[0]: return False beta_var = sub_beta_op.x if beta_var.val is None: return False ops_to_remove.append(sub_beta_op) # check last add op add_op = self._try_get_child_op_type(sub_beta_op, "add") if add_op is None: return False if not ( (add_op.x == 
mul_op1.outputs[0] and add_op.y == sub_beta_op.outputs[0]) or (add_op.y == mul_op1.outputs[0] and add_op.x == sub_beta_op.outputs[0]) ): return False ops_to_remove.append(add_op) return self._try_apply_transform( reduce_op, block, gamma_var, beta_var, epsilon_var, add_op, ops_to_remove ) def _try_match_and_transform_pattern_5(self, reduce_op, block) -> bool: """ Detect BC1S ``LayerNorm`` pattern as in ml-ane-transformers. Identify two patterns, the first: ``y = (x - mean(x)) * rsqrt(variance(X) + eps)`` ``y = (x - mean(x)) * rsqrt(mean((x - mean(x))^2) + eps)`` .. code-block:: x --> reduce_mean --| | | |---| | V | V |----------------> sub --> mul --> reduce_mean --> add(epsilon) --> rsqrt | | | V |-------------------------------------------------> mul --> [...] If the optional elementwise weight and bias are set, the second pattern is: ``y = [(x - mean(x)) * rsqrt(mean((x - mean(x))^2) + eps) + beta] * gamma`` Note that this is different from the torch and MIL definitions of beta and gamma so beta is be scaled by gamma and applied before it. .. code-block:: x --> reduce_mean --| | | |---| | V | V |----------------> sub --> mul --> reduce_mean --> add(epsilon) --> rsqrt | | | V |-------------------------------------------------> mul | V add(beta) | V mul(gamma) --> [...] These pattern corresponds to a specific ``layer_norm``: - ``rank`` is 4. - ``axes`` is ``[1]`` - ``gamma`` and ``beta`` are applied as in ml-ane-transformers, in the opposite order of torch. """ ops_to_remove = [] root_var = reduce_op.x # check that root_var feeds into at least 2 ops if len(list(root_var.child_ops)) < 2: return False # Do not enforce that the only child ops are reduce_mean and sub as in other # patterns. There are models where the root op is used after the layer norm. 
# check 1st reduce_mean op if not self._check_reduce_op(reduce_op): return False if len(reduce_op.axes.val) != 1 or reduce_op.axes.val != [1] or not reduce_op.keep_dims.val: return False ops_to_remove.append(reduce_op) # check 1st sub op if not self._check_child_op_types(reduce_op, ["sub"], check_order=False): return False child_ops_reduce_mean = list(reduce_op.outputs[0].child_ops) sub_op1 = child_ops_reduce_mean[0] if sub_op1 is None or not self._check_child_op_types( sub_op1, child_op_types=["mul", "mul", "mul"] ): return False if not (sub_op1.x == root_var and sub_op1.y == reduce_op.outputs[0]): return False ops_to_remove.append(sub_op1) # check mul op (equivalent to a square op) square_op = self._try_get_child_op_type(sub_op1, "mul") if square_op is None or not self._check_child_op_types( square_op, child_op_types=["reduce_mean"] ): return False if square_op.x != square_op.y: return False ops_to_remove.append(square_op) # check second reduce mean reduce_op2 = self._try_get_child_op_type(square_op, "reduce_mean") if not self._check_reduce_op(reduce_op2) or not self._check_child_op_types( reduce_op2, child_op_types=["add"] ): return False if len(reduce_op2.axes.val) != 1 or reduce_op2.axes.val != [1] or not reduce_op2.keep_dims.val: return False ops_to_remove.append(reduce_op2) # check add op (with epsilon) add_op1 = self._try_get_child_op_type(reduce_op2, "add") if add_op1 is None or not self._check_child_op_types( add_op1, child_op_types=["rsqrt"] ): return False epsilon_var = add_op1.y if add_op1.x == reduce_op2.outputs[0] else add_op1.x if epsilon_var.val is None or len(epsilon_var.val.shape) != 0: return False # must be scalar ops_to_remove.append(add_op1) # check rsqrt rsqrt_op = self._try_get_child_op_type(add_op1, "rsqrt") if rsqrt_op is None or not self._check_child_op_types( rsqrt_op, child_op_types=["mul"] ): return False ops_to_remove.append(rsqrt_op) # Last op in pattern if there is no elementwise affine. mul_op = self._try_get_child_op_type(rsqrt_op, "mul") if mul_op is None: return False if mul_op.y != sub_op1.outputs[0] and mul_op.x != sub_op1.outputs[0]: return False ops_to_remove.append(mul_op) # Default values if no gamma or beta ops. end_op = mul_op gamma_var = None beta_var = None add_beta_op = self._try_get_child_op_type(mul_op, "add") mul_gamma_op = self._try_get_child_op_type(add_beta_op, "mul") has_beta_and_gamma = add_beta_op is not None and mul_gamma_op is not None # mul_op cannot be used except as an input to add_beta_op. if has_beta_and_gamma and not self._check_child_op_types( mul_op, child_op_types=["add"] ): # It would be possible to fuse this pattern as: # layer_norm(x, gamma=None, beta=None) -> add(beta) -> mul(gamma) -> ... # |-> other mul_op child ops # For simplicity don't handle this edge case. return False # add_beta_op cannot be used except as an input to mul_gamma_op. if has_beta_and_gamma and not self._check_child_op_types( add_beta_op, child_op_types=["mul"] ): # It would be possible to fuse this pattern as: # layer_norm(x, gamma=None, beta=None) -> add(beta) -> mul(gamma) -> ... # |-> other add_beta_op child ops # For simplicity don't handle this edge case. return False if add_beta_op is None and mul_gamma_op is None: # Gamma and beta are optional in layer_norm. pass elif add_beta_op is None or mul_gamma_op is None: # If only one of gamma or beta is present, they could # be folded into the layer_norm op. For simplicity # don't handle this edge case. 
return False if has_beta_and_gamma: beta_var = add_beta_op.y if add_beta_op.x == mul_op.outputs[0] else add_beta_op.x gamma_var = ( mul_gamma_op.y if mul_gamma_op.x == add_beta_op.outputs[0] else mul_gamma_op.x ) if beta_var.val is None or gamma_var.val is None: return False gamma_var = mb.const( val=np.squeeze(gamma_var.val), name="_fuse_layernorm_gamma", ) # Scale beta by gamma. Note: this un-scaling introduces a small amount # of precision loss. # https://github.com/apple/ml-ane-transformers/blob/da64000fa56cc85b0859bc17cb16a3d753b8304a/ane_transformers/huggingface/distilbert.py#L31 beta_var = mb.const( val=np.squeeze(beta_var.val) * gamma_var.val, name="_fuse_layernorm_beta" ) ops_to_remove.extend([add_beta_op, mul_gamma_op]) end_op = mul_gamma_op return self._try_apply_transform( reduce_op, block, gamma_var, beta_var, epsilon_var, end_op, ops_to_remove ) @block_context_manager def _fuse_layernorm_or_instancenorm_block(self, block: Block): fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: block_changed = True while block_changed: block_changed = self._fuse_layernorm_or_instancenorm_block(b) if len(op.blocks) > 0: continue # start pattern match if reduce_mean op is encountered if op.op_type == "reduce_mean": if self._try_match_and_transform_pattern_1(op, block): fusion_occurred = True elif self._try_match_and_transform_pattern_2(op, block): fusion_occurred = True elif self._try_match_and_transform_pattern_3(op, block): fusion_occurred = True elif self._try_match_and_transform_pattern_5(op, block): fusion_occurred = True elif op.op_type == "reduce_sum": if self._try_match_and_transform_pattern_4(op, block): fusion_occurred = True return fusion_occurred ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/optimize_quantization.py0000644000000000000000000013444114672066616030043 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import List, Set, Tuple import numpy as np from coremltools.converters.mil._deployment_compatibility import AvailableTarget from coremltools.converters.mil.frontend import _utils from coremltools.converters.mil.mil import Block from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Operation, Var, types from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import ( _check_child_op_type, _check_no_output_connection, block_context_manager, ) from coremltools.converters.mil.mil.passes.pass_registry import register_pass @register_pass(namespace="common") class merge_affine_dequantize_with_consecutive_ops(AbstractGraphPass): """ This graph pass does const folding to a chain of supported ops starts with a ``constexpr_affine_dequantize`` op. More types of op are supported when quantization is tensor-wise, and only a subset is supported for channel-wise. For example .. code-block:: Input graph: data -> constexpr_affine_dequantize -> transpose -> expand_dims -> out Output graph: new_data -> constexpr_affine_dequantize -> out where ``new_data`` is computed by ``data -> transpose -> expand_dims``. Note that, the graph pass only supports const folding of a single linked list pattern. 
For example, the following pattern will not be changed .. code-block:: |-> constexpr_affine_dequantize -> transpose -> out data -| |-> constexpr_affine_dequantize -> reshape -> out_2 since the quantized data is used by multiple ``constexpr`` """ SUPPORTED_OP_TYPES_PER_TENSOR = { "transpose", "reshape", "expand_dims", "squeeze", } SUPPORTED_OP_TYPES_PER_CHANNEL = {"transpose"} assert SUPPORTED_OP_TYPES_PER_CHANNEL.issubset( SUPPORTED_OP_TYPES_PER_TENSOR ), "If an op can merge with channel-wise quantization, then it must also be able to merge with tensor-wise quantization" def apply(self, prog): for f in prog.functions.values(): block_changed = True while block_changed: block_changed = self.merge_affine_dequantize_with_consecutive_ops_block(f) @block_context_manager def merge_affine_dequantize_with_consecutive_ops_block(self, block: Block): fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: block_changed = True while block_changed: block_changed = self.merge_affine_dequantize_with_consecutive_ops_block(b) if op.op_type != "constexpr_affine_dequantize": continue if self._try_to_transform(op, block): fusion_occurred = True return fusion_occurred @staticmethod def _apply_equivalent_transform(val: np.ndarray, op: Operation) -> np.ndarray: if ( op.op_type not in merge_affine_dequantize_with_consecutive_ops.SUPPORTED_OP_TYPES_PER_TENSOR ): raise ValueError(f"unsupported op_type {op.op_type}") if op.op_type == "transpose": return np.transpose(val, axes=op.perm.val) if op.op_type == "reshape": return np.reshape(val, op.outputs[0].shape) if op.op_type == "expand_dims": return np.expand_dims(val, axis=op.axes.val.tolist()) if op.op_type == "squeeze": axes = op.axes if axes is None or axes.val is None: return np.squeeze(val) return np.squeeze(val, axis=tuple(op.axes.val.tolist())) @staticmethod def search_for_ops_to_fold( op: Operation, block: Block, supported_op_types: Set[str] ) -> List[Operation]: # traverse the graph to get a chain of applicable ops to fold ops_to_fold = [] cursor = op while True: prev_cursor = cursor if cursor.outputs[0] in block.outputs: break for supported_op_type in supported_op_types: if _check_child_op_type(cursor, supported_op_type): ops_to_fold.append(cursor.outputs[0].child_ops[0]) cursor = ops_to_fold[-1] break if prev_cursor == cursor: break return ops_to_fold @staticmethod def _try_to_transform_per_tensor(op: Operation, block: Block) -> bool: assert ( op.scale.rank == 0 and op.zero_point.rank == 0 ), "The _try_to_transform_per_tensor method should only be used for per-tensor dequantization case" ops_to_fold = merge_affine_dequantize_with_consecutive_ops.search_for_ops_to_fold( op, block, merge_affine_dequantize_with_consecutive_ops.SUPPORTED_OP_TYPES_PER_TENSOR ) if len(ops_to_fold) == 0: return False # do the same transformation on the source quantized data cursor = op.quantized_data.val for op_to_fold in ops_to_fold: cursor = merge_affine_dequantize_with_consecutive_ops._apply_equivalent_transform( cursor, op_to_fold ) # after transformation, we create a new constexpr_affine_dequantize op and do the replacement new_var = _utils._construct_constexpr_dequant_op( cursor, op.zero_point, op.scale, op.axis, name=ops_to_fold[-1].outputs[0].name, before_op=ops_to_fold[-1], ) block.replace_uses_of_var_after_op( anchor_op=ops_to_fold[-1], old_var=ops_to_fold[-1].outputs[0], new_var=new_var, force_replace=True, ) block.remove_ops([op] + ops_to_fold) return True @staticmethod def _try_to_transform_per_channel(op: 
Operation, block: Block) -> bool: scale = op.scale zero_point = op.zero_point # positively canonicalize axis for easier manipulation later on axis = op.axis.val if op.axis.val >= 0 else op.axis.val + op.quantized_data.rank ops_to_fold = merge_affine_dequantize_with_consecutive_ops.search_for_ops_to_fold( op, block, merge_affine_dequantize_with_consecutive_ops.SUPPORTED_OP_TYPES_PER_CHANNEL, ) if len(ops_to_fold) == 0: return False # do the same transformation on the source quantized data cursor = op.quantized_data.val for op_to_fold in ops_to_fold: cursor = merge_affine_dequantize_with_consecutive_ops._apply_equivalent_transform( cursor, op_to_fold ) if op_to_fold.op_type == "transpose": axis = np.where(op_to_fold.perm.val == axis)[0][0] # after transformation, we create a new constexpr_affine_dequantize op and do the replacement new_var = mb.constexpr_affine_dequantize( quantized_data=cursor, zero_point=zero_point, scale=scale, axis=axis, name=ops_to_fold[-1].outputs[0].name, before_op=ops_to_fold[-1], ) block.replace_uses_of_var_after_op( anchor_op=ops_to_fold[-1], old_var=ops_to_fold[-1].outputs[0], new_var=new_var, force_replace=True, ) block.remove_ops([op] + ops_to_fold) return True def _try_to_transform(self, op: Operation, block: Block) -> bool: # make sure quantized_data only feeds into a single op if len(op.quantized_data.child_ops) != 1: return False if op.scale.rank == 0 and op.zero_point.rank == 0: return self._try_to_transform_per_tensor(op, block) else: return self._try_to_transform_per_channel(op, block) @register_pass(namespace="common") class int_op_canonicalization(AbstractGraphPass): """ For general quantized operators, in Core ML, we represent them as ``dequantize -> the floating-point version of this operator -> quantize``, because mathematically it is the floating-point tensor rather than its quantized integer representation that gets operated upon. For some quantized operators that do not involve floating-point arithmetic, however, it is unnecessary to prepend ``dequantize`` and append ``quantize``. Examples are: * reshape """ INT_OP_TYPES_AND_OPSET_VERSIONS = {"reshape": {AvailableTarget.iOS17}} def apply(self, prog): for f in prog.functions.values(): self._canonicalize_int_ops_block(f) @block_context_manager def _canonicalize_int_ops_block(self, block: Block): def apply_block(block: Block) -> bool: for op in list(block.operations): for b in op.blocks: self._canonicalize_int_ops_block(b) matched_ops = self.match_pattern(op) if matched_ops is not None: dequantize, quantize = matched_ops # has to break as the downstream iterator is affected if self.try_to_transform(dequantize, op, quantize): return True return False need_transformation = True while need_transformation: need_transformation = apply_block(block) def match_pattern(self, op: Operation) -> Tuple[Operation, Operation]: if ( op.op_type not in self.INT_OP_TYPES_AND_OPSET_VERSIONS or op.opset_version not in self.INT_OP_TYPES_AND_OPSET_VERSIONS[op.op_type] ): return None # make sure the input is quantized dequantize = op.x.op if dequantize is None or dequantize.op_type != "dequantize": return None # make sure the output is quantized if not _check_child_op_type(op, "quantize"): return None quantize = op.outputs[0].child_ops[0] # we do not have to check block output, because: # * for dequantize, it is ok to connect to block output, since our # transformation method `try_to_transform` is able to deal with that # * for op, checking child op has made sure it has only 1 child # and connects to quantize, i.e. 
it cannot connect to block output return dequantize, quantize def try_to_transform(self, dequantize: Operation, op: Operation, quantize: Operation) -> bool: block: Block = op.enclosing_block if not block.try_replace_uses_of_var_after_op( anchor_op=quantize, old_var=quantize.outputs[0], new_var=self.build_int_op(dequantize, op, quantize), ): return False # remove op and quantize here, but not dequantize, since: # * all uses of op and quantize has been replaced with the canonicalized one # * dequantize may feed to multiple ops, which are not replaced # (if not, then pass dead_code_elimination will eliminate it) block.remove_ops([op, quantize]) return True @staticmethod def build_int_op(dequantize: Operation, op: Operation, quantize: Operation) -> Var: if op.op_type == "reshape": return mb.reshape( x=dequantize.input, shape=op.shape, name=quantize.outputs[0].name, before_op=op, ) raise NotImplementedError(f"no build method implemented for int op {op.op_type}") # TODO (rdar://107718371): remove this pass after implementing QuantizedVar @register_pass(namespace="common") class nullify_redundant_quantization_zero_point(AbstractGraphPass): """ In Core ML quantization, the performance is better when ``zero point = 0``, so we try to make ``zero point = 0`` if possible: * ``zero point = -128`` * this must be an int8 quantization * equivalent to uint8 quantization with 0 zero point * ``zero point = 128`` * this must be an uint8 quantization * equivalent to int8 quantization with 0 zero point Since ``zero point = 0`` is equivalent to ``zero point = None`` in Core ML semantics, we further canonicalize to ``zero point = None`` to: * make further graph passes easier * avoid serializing trivial 0 The ``zero point = 0`` case can be canonicalized trivially .. code-block:: Input op: quantize/dequantize(zero_point=0) Output op: quantize/dequantize(zero_point=None) To guarantee the conservation of output regardless the zero-point shift in ``zero point = ±128`` cases, we would only transform: * const dequantize, where we fuse the zero-point shift into the const .. code-block:: Input op: dequantize(input=const, zero_point=±128) Output op: dequantize(input=const∓128, zero_point=None) * ``quantize -> dequantize``, where we nullify both simultaneously .. 
code-block:: Input graph: input -> quantize(zero_point=±128) -> dequantize(zero_point=±128) -> output Output graph: input -> quantize(zero_point=None) -> dequantize(zero_point=None) -> output """ def apply(self, prog): for f in prog.functions.values(): self._nullify_redundant_quantization_zero_point_block(f) @block_context_manager def _nullify_redundant_quantization_zero_point_block(self, block: Block): def apply_block(block: Block) -> bool: fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: self._nullify_redundant_quantization_zero_point_block(b) # no need to break, since only the current op gets changed self.try_transform_zp0(op) self.try_transform_zp128_const_dequantize(op) # has to break as the downstream iterator is affected if self.try_transform_zp128_quantize_dequantize(op): fusion_occurred = True return fusion_occurred need_transformation = True while need_transformation: need_transformation = apply_block(block) @staticmethod def try_transform_zp0(op: Operation) -> bool: if op.op_type not in ("quantize", "dequantize"): return False zero_point = op.zero_point # if already no zero point, no need for further nullification if zero_point is None: return False zero_point = zero_point.val if not np.all(zero_point == 0): return False new_var: Var if op.op_type == "quantize": new_var = mb.quantize( input=op.input, scale=op.scale, axis=op.axis, output_dtype=op.output_dtype, before_op=op, ) else: new_var = mb.dequantize( input=op.input, scale=op.scale, axis=op.axis, before_op=op, ) block: Block = op.enclosing_block if not block.try_replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=new_var ): return False block.remove_ops([op]) return True @staticmethod def try_transform_zp128_const_dequantize(op: Operation) -> bool: if op.op_type != "dequantize": return False zero_point = op.zero_point # if already no zero point, no need for further nullification if zero_point is None: return False zero_point = zero_point.val is_negative_128 = np.all(zero_point == -128) is_positive_128 = np.all(zero_point == 128) if not (is_negative_128 or is_positive_128): return False input = op.input.val if input is None: return False if is_negative_128: input = np.uint8(np.int16(input) + 128) else: input = np.int8(np.int16(input) - 128) new_var = mb.dequantize( input=input, scale=op.scale, axis=op.axis, before_op=op, ) block: Block = op.enclosing_block if not block.try_replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=new_var ): return False block.remove_ops([op]) return True @staticmethod def try_transform_zp128_quantize_dequantize(op: Operation) -> bool: if op.op_type != "quantize": return False zero_point = op.zero_point # if already no zero point, no need for further nullification if zero_point is None: return False zero_point = zero_point.val is_negative_128 = np.all(zero_point == -128) is_positive_128 = np.all(zero_point == 128) if not (is_negative_128 or is_positive_128): return False if not _check_child_op_type(op, "dequantize"): return False dequantize_op = op.outputs[0].child_ops[0] dequantize_zero_point = dequantize_op.zero_point if dequantize_zero_point is None: return False dequantize_zero_point = dequantize_zero_point.val if not np.all(dequantize_zero_point == (-128 if is_negative_128 else 128)): return False new_quantize = mb.quantize( input=op.input, scale=op.scale, axis=op.axis, output_dtype="uint8" if is_negative_128 else "int8", before_op=dequantize_op, ) new_dequantize = mb.dequantize( 
input=new_quantize, scale=dequantize_op.scale, axis=dequantize_op.axis, before_op=dequantize_op, ) block: Block = op.enclosing_block if not block.try_replace_uses_of_var_after_op( anchor_op=dequantize_op, old_var=dequantize_op.outputs[0], new_var=new_dequantize, ): return False block.remove_ops([op, dequantize_op]) return True @register_pass(namespace="common") class dequantize_quantize_pair_elimination(AbstractGraphPass): """ When a ``dequantize`` is followed by an identical ``quantize`` (same scale, zero point, axis), they cancel out and can be eliminated .. code-block:: Input graph: input -> dequantize -> quantize -> output Output graph: input -> output When the pattern has branches (dequantize has multiple children), we cannot eliminate the whole pair, but can still shorten the path. More specifically: .. code-block:: Input graph: op1 -> dequantize -> quantize -> op2 | |-> some_other_op Output graph: op1 -> dequantize -> some_other_op | |-> op2 PS: On the other hand, the reversed pattern, i.e., ``quantize -> dequantize``, is not redundant, since that is the pattern which naturally occurs when a quantized op is converted. In current activation quantization conversion, a quantized op becomes .. code-block:: dequantize -> regular op -> quantize so if we have a sequence of quantized ops, we will get .. code-block:: dequantize -> regular op1 -> quantize -> dequantize -> regular op2 -> quantize The ``quantize -> dequantize`` pair in the middle is not redundant, even if they have identical scales and zero points and axes, since removing them will lead to loss of information about the quantization parameters of the output var of op1 """ def apply(self, prog): for f in prog.functions.values(): self._dequantize_quantize_pair_elimination_block(f) @block_context_manager def _dequantize_quantize_pair_elimination_block(self, block): def apply_block(block: Block) -> bool: fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: self._dequantize_quantize_pair_elimination_block(b) # has to break as the downstream iterator is affected if self.try_dequantize_quantize_pair_elimination(op): fusion_occurred = True return fusion_occurred need_transformation = True while need_transformation: need_transformation = apply_block(block) @staticmethod def try_dequantize_quantize_pair_elimination(op: Operation) -> bool: def _check_quantize_removable(quantize_op: Operation) -> bool: if np.any(op.scale.val != quantize_op.scale.val): return False is_dequantize_zp_present = op.zero_point is not None is_quantize_zp_present = quantize_op.zero_point is not None if is_dequantize_zp_present != is_quantize_zp_present: return False if is_dequantize_zp_present and is_quantize_zp_present: if np.any(op.zero_point.val != quantize_op.zero_point.val): return False is_dequantize_axis_present = op.axis is not None is_quantize_axis_present = quantize_op.axis is not None if is_dequantize_axis_present != is_quantize_axis_present: return False if is_dequantize_axis_present and is_quantize_axis_present: if op.axis.val != quantize_op.axis.val: return False return True if op.op_type != "dequantize": return False if op.outputs[0] in op.enclosing_block.outputs: return False any_quantize_removed: bool = False for child_op in op.outputs[0].child_ops: if child_op.op_type == "quantize" and _check_quantize_removable(child_op): block: Block = op.enclosing_block if block.try_replace_uses_of_var_after_op( anchor_op=child_op, old_var=child_op.outputs[0], new_var=op.input, ): 
block.remove_ops([child_op]) any_quantize_removed = True if any_quantize_removed and len(op.outputs[0].child_ops) == 0: # Remove the dequant op if all its children quantize ops got removed. block.remove_ops([op]) return any_quantize_removed @register_pass(namespace="common") class distributive_quantized_binary_op_scale_normalization(AbstractGraphPass): """ In the backend, for better performance, quantized op can have 1 input scale fused within the quantized op kernel. For binary ops, there are 2 inputs, but only 1 can get fused. For example, for quantized ``add`` .. code-block:: MIL graph (consists of MIL ops): dequantize(x, s_x, zp_x) -| x_fp = (x - zp_x) * s_x | |-> add(x_fp, y_fp) -> quantize(z_fp, s_z, zp_z) dequantize(y, s_y, zp_y) -| z_fp = x_fp + y_fp z = z_fp / s_z + zp_z y_fp = (y - zp_y) * s_y Backend graph (consists of backend instructions, usually including + - * / and fused *+): x_shift = x - zp_x -------------------------| |-> z_fp = s_x * x_shift + y_fp -> z = z_fp / s_z + zp_z y_shift = y - zp_y -> y_fp = s_y * y_shift -| Where ``x`` and ``y`` are the inputs, ``z`` is the output, ``s`` and ``zp`` are the corresponding scale and zero point. The reason why fusing one scale leads to better performance is, instead of 2 instructions ``x_fp = s_x * x_shift`` and ``z_fp = x_fp + y_fp``, a single ``z_fp = x_shift * s_x + y_fp`` instruction achieves the same result. In this pass, we normalize ``s_y`` to 1, so the ``y_fp = s_y * y_shift`` instruction can get skipped as well, leading to even better performance. This pass only applies to distributive binary ops such as ``add`` and ``sub`` Appendix: Mathematical and Computer-Scientific Details Mathematically, for a binary operator ``.op.`` .. code-block:: z_fp = (x - zp_x) * s_x .op. (y - zp_y) * s_y = s_y * [(x - zp_x) * s_x/s_y .op. (y - zp_y) * 1] The corresponding pseudo code is .. code-block:: # before z_fp = (x - zp_x) * s_x .op. (y - zp_y) * s_y z = z_fp / s - zp_z # after z_fp_modified = (x - zp_x) * s_x/s_y .op. (y - zp_y) * 1.0 z = z_fp_modified / (s_z/s_y) - zp_z Concretely, as a MIL graph pass .. code-block:: Input graph: dequantize(scale=s_x) -| |-> op -> quantize(scale=s_z) dequantize(scale=s_y) -| Output graph: dequantize(scale=s_x/s_y) -| |-> op -> quantize(scale=s_z/s_y) dequantize(scale=1.0) -| PS: we only support scalar ``s_y`` for now. If ``s_y`` is not scalar but ``s_x`` is, we would swap ``x`` and ``y``. Support for both-vector case is to be explored, due to the broadcasting complication. """ DISTRIBUTIVE_BINARY_OPS = {"add", "sub"} def apply(self, prog): @block_context_manager def apply_block(block: Block): for op in list(block.operations): for b in op.blocks: apply_block(b) matched_ops = self.match_pattern(op) if matched_ops is not None: dequantize_x, dequantize_y, quantize_z = matched_ops self.try_to_transform(op, dequantize_x, dequantize_y, quantize_z) for f in prog.functions.values(): apply_block(f) def match_pattern(self, op: Operation) -> Tuple[Operation, Operation, Operation]: """ try to match distributive quantized binary op: ... ^ | dequantize(x) -| |-> op(x, y) (-> relu) -> quantize(z) dequantize(y) -| | v ... 
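(Illustrative sketch, not from the original source; scalar scales and an ``add``
op are assumed.) The identity later exploited by ``try_to_transform`` can be
checked numerically:

.. code-block::

    import numpy as np

    x = np.array([7, -3], dtype=np.int8)
    y = np.array([5, 2], dtype=np.int8)
    zp_x, zp_y, zp_z = np.int8(1), np.int8(-2), np.int8(0)
    s_x, s_y, s_z = np.float32(0.4), np.float32(0.25), np.float32(0.8)

    # Original: each dequantize carries its own scale.
    z = ((x - zp_x) * s_x + (y - zp_y) * s_y) / s_z + zp_z

    # Normalized: s_y folded into s_x and s_z, y's scale becomes 1.
    z_mod = ((x - zp_x) * (s_x / s_y) + (y - zp_y) * 1.0) / (s_z / s_y) + zp_z

    assert np.allclose(z, z_mod)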
return dequantize_x, dequantize_y, quantize_z for further transformation return None if no match """ # make sure the op is distributive if op.op_type not in self.DISTRIBUTIVE_BINARY_OPS: return None # quantized op may be fused with relu # relu would not affect distributivity tail_op = op if _check_child_op_type(op, "relu"): tail_op = op.outputs[0].child_ops[0] # make sure the inputs are quantized dequantize_x = op.x.op dequantize_y = op.y.op if ( dequantize_x is None or dequantize_y is None or dequantize_x.op_type != "dequantize" or dequantize_y.op_type != "dequantize" ): return None # make sure the output is quantized if not _check_child_op_type(tail_op, "quantize"): return None quantize_z = tail_op.outputs[0].child_ops[0] # make sure the intermediate results are not block outputs # since we only guarantee conservation of z if not _check_no_output_connection( op.enclosing_block, [dequantize_x, dequantize_y, op, tail_op, quantize_z] ): return None return dequantize_x, dequantize_y, quantize_z def try_to_transform( self, op: Operation, dequantize_x: Operation, dequantize_y: Operation, quantize_z: Operation ) -> bool: """ given dequantize_x, dequantize_y, quantize_z, transform by z_fp = (x - zp_x) * s_x/s_y .op. (y - zp_y) * 1.0 z = z_fp / (s_z/s_y) - zp_z See the class doc for details """ block = quantize_z.enclosing_block new_s_x, new_s_z = self.try_to_divide(dequantize_x, dequantize_y, quantize_z) # if s_y cannot be used to divide, then swap x and y and try again if new_s_x is None and new_s_z is None: dequantize_x, dequantize_y = dequantize_y, dequantize_x new_s_x, new_s_z = self.try_to_divide(dequantize_x, dequantize_y, quantize_z) # after swap, if still cannot divide, then give up if new_s_x is None and new_s_z is None: return False def convert_mil_float_dtype_to_np(mil_dtype): if mil_dtype == types.fp16 or mil_dtype == "float16": np_dtype = np.float16 else: np_dtype = np.float32 return np_dtype new_s_x_dtype = convert_mil_float_dtype_to_np(dequantize_x.scale.val.dtype) new_s_y_dtype = convert_mil_float_dtype_to_np(dequantize_y.scale.val.dtype) new_s_z_dtype = convert_mil_float_dtype_to_np(quantize_z.scale.val.dtype) # insert normalized new_dequantize_x and new_dequantize_y before op new_dequantize_x = mb.dequantize( input=dequantize_x.input, scale=new_s_x_dtype(new_s_x), zero_point=dequantize_x.zero_point, axis=dequantize_x.axis, before_op=op, ) new_dequantize_y = mb.dequantize( input=dequantize_y.input, scale=new_s_y_dtype(1) if dequantize_y.axis is None else np.full(dequantize_y.scale.val.shape, 1.0), zero_point=dequantize_y.zero_point, axis=dequantize_y.axis, before_op=op, ) # insert normalized new_quantize_z before quantize_z new_quantize_z = mb.quantize( input=quantize_z.input, scale=new_s_z_dtype(new_s_z), zero_point=quantize_z.zero_point, axis=quantize_z.axis, output_dtype=quantize_z.output_dtype, before_op=quantize_z, ) if not ( # replace dequantize_x and dequantize_y with the normalized ones # in the range of (new_dequantize_x, op] and (new_dequantize_y, op] # in case dequantize_x and dequantize_y also feed to other ops # which should not get altered by this transformation block.try_replace_uses_of_var_after_op( anchor_op=new_dequantize_x.op, end_op=op, old_var=dequantize_x.outputs[0], new_var=new_dequantize_x, ) and block.try_replace_uses_of_var_after_op( anchor_op=new_dequantize_y.op, end_op=op, old_var=dequantize_y.outputs[0], new_var=new_dequantize_y, ) # replace quantize_z with the normalized one and block.try_replace_uses_of_var_after_op( anchor_op=quantize_z, 
old_var=quantize_z.outputs[0], new_var=new_quantize_z ) ): return False # remove quantize_z here, but not dequantize_x and dequantize_y, since: # * all uses of quantize_z has been replaced with the normalized one # * dequantize_x and dequantize_y may feed to multiple ops, which are not replaced # (if not, then pass dead_code_elimination will eliminate them) block.remove_ops([quantize_z]) return True def try_to_divide( self, dequantize_x: Operation, dequantize_y: Operation, quantize_z: Operation, ) -> Tuple[np.ndarray, np.ndarray]: """ compute s_x/s_y and s_z/s_y, return the results if succeeds, else None The broadcast rule is very complicated: 1. Broadcast s_x to x, s_y to y, s_z to z, according to axes 2. Broadcast s_x and s_y 3. Perform s_x/s_y and s_z/s_y 4. De-broadcast s_x/s_y and s_z/s_y down to vectors according to axes, raise exception if impossible to de-broadcast As a result, for now we only handle the scalar s_y case """ # TODO (rdar://109170887): explore vector s_y if dequantize_y.axis is not None: return None, None s_x_fp32 = np.float32(dequantize_x.scale.val) s_y_fp32 = np.float32(dequantize_y.scale.val) s_z_fp32 = np.float32(quantize_z.scale.val) s_x_d_s_y = s_x_fp32 / s_y_fp32 s_z_d_s_y = s_z_fp32 / s_y_fp32 if ( self.overflow_fp16(s_x_d_s_y) or self.underflow_fp16(s_x_d_s_y) or self.overflow_fp16(s_z_d_s_y) or self.underflow_fp16(s_z_d_s_y) ): return None, None return s_x_d_s_y, s_z_d_s_y @staticmethod def overflow_fp16(x: np.ndarray) -> bool: return np.max(np.abs(x)) > 65504 @staticmethod def underflow_fp16(x: np.ndarray) -> bool: return np.min(np.abs(x)) < np.nextafter(0.0, 1.0, dtype=np.float16) @register_pass(namespace="common") class dequantize_to_constexpr(AbstractGraphPass): """ ``dequantize`` op with constant input is equivalent to ``constexpr_affine_dequantize``. This is one of the canonicalization pass that transforms all such ``dequantize`` ops to respective ``constexpr_affine_dequantize`` ops. .. code-block:: Input graph: dequantize(input=const) -> downstream op Output graph: constexpr_affine_dequantize -> downstream op This pass is being performed because constant tensors being propagated through ``dequantize`` op would be serialized in bloated/decompressed fashion, whereas with ``constexpr_affine_dequantize``, constant weights/tensors remain compressed at serialization. 
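Both forms express the same affine dequantization; only how the constant is
serialized differs. A minimal NumPy sketch of that math (per-tensor scale and
zero point are assumed for illustration):

.. code-block::

    import numpy as np

    quantized_data = np.array([[-128, 0, 127]], dtype=np.int8)
    scale = np.float32(0.05)
    zero_point = np.int8(3)

    # The value either op materializes when the constant is needed.
    dequantized = scale * (quantized_data.astype(np.float32) - np.float32(zero_point))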
""" def apply(self, prog): @block_context_manager def apply_block(block): for op in list(block.operations): for b in op.blocks: apply_block(b) if self.is_valid_op(op): self.transform_op(op) for f in prog.functions.values(): apply_block(f) def is_valid_op(self, op): return op.op_type == "dequantize" and op.can_materialize_val() def transform_op(self, op): quantized_data = op.input.val scale = op.scale.val zero_point = None if op.zero_point is not None: zero_point = op.zero_point.val else: zero_point = np.int8(0) if op.input.dtype == types.int8 else np.uint8(0) axis = None if op.axis is None else op.axis.val new_var = _utils._construct_constexpr_dequant_op( quantized_data, zero_point, scale, axis, name=op.name + "_affine_dequantized", before_op=op, ) block = op.enclosing_block block.replace_uses_of_var_after_op(anchor_op=op, old_var=op.outputs[0], new_var=new_var) block.remove_ops([op]) @register_pass(namespace="common") class reorder_lut_per_channel_scale(AbstractGraphPass): """ The lut with per-channel-scale was represented as the following op combinations: weight = constexpr_lut_to_dense() weight = constexpr_blockwise_shift_scale(weight) output = linear/matmul/conv(x, weight) However, for ANE, it requires the scale to be after the linear/matmul/conv, which is: weight = constexpr_lut_to_dense() unscaled_output = linear/matmul(x, weight) output = mul(unscaled_output, scale) This graph pass finds the lut with per-channel-scale and move the scale to be ANE-friendly. """ _OPS_SUPPORT_MOVE_SCALE = {"linear", "matmul", "conv"} def apply(self, prog): @block_context_manager def apply_block(block: Block): for op in list(block.operations): for b in op.blocks: apply_block(b) if op.op_type == "constexpr_lut_to_dense" and len(op.outputs[0].child_ops) == 1: child_op = op.outputs[0].child_ops[0] if child_op.op_type == "constexpr_blockwise_shift_scale": # Can move the scale when the constexpr op is only used to scale the weight. has_offset = child_op.offset is not None and child_op.offset.val.any() if types.is_float(child_op.data.dtype) and not has_offset: self._reorder_lut_per_channel_scale(block, op) for f in prog.functions.values(): apply_block(f) def _reorder_lut_per_channel_scale(self, block: Block, lut_op: Operation): # Lazy import to avoid circular import error. from coremltools.optimize.coreml import _utils as optimize_utils # The original order is lut_op -> scale_op -> output_op. scale_op = lut_op.outputs[0].child_ops[0] # Only move the scale when all ops that consume this scale op support moving. for output_op in scale_op.outputs[0].child_ops: if output_op.op_type not in self._OPS_SUPPORT_MOVE_SCALE: return # Only the scale on output axis could be moved to get mathematically equivalent results. 
scale_val: np.ndarray = scale_op.scale.val output_axis = optimize_utils.select_input_output_channel_axis(scale_op)[1] if output_axis is None: return if output_axis < 0: output_axis += len(scale_val.shape) for axis, dim_size in enumerate(scale_val.shape): if axis != output_axis and dim_size != 1: return for output_op in list(scale_op.outputs[0].child_ops): self._help_move_scale(block, lut_op, scale_op, output_op) block.remove_ops([output_op]) block.remove_ops([scale_op]) @staticmethod def _help_move_scale( block: Block, lut_op: Operation, scale_op: Operation, output_op: Operation ): """Move the scale from `lut_op -> scale_op -> output_op` to `lut_op -> output_op -> mul`.""" scale_val: np.ndarray = scale_op.scale.val inputs = output_op.inputs if output_op.op_type == "linear": scale_val = scale_val.T inputs["weight"] = lut_op.outputs[0] if getattr(output_op, "bias", None) and output_op.bias.val is not None: original_bias = output_op.bias.val new_bias = (original_bias / np.squeeze(scale_val)).astype(original_bias.dtype) inputs["bias"] = new_bias elif output_op.op_type == "matmul": # Determine if the scaled weight is used by `x` or `y` in matmul. if output_op.y == scale_op.outputs[0]: if output_op.transpose_y.val is True: scale_val = scale_val.T inputs["y"] = lut_op.outputs[0] else: if output_op.transpose_x.val is True: scale_val = scale_val.T inputs["x"] = lut_op.outputs[0] else: if output_op.op_type != "conv": raise AssertionError( "The scale could only be moved for linear/matmul/conv, " f"but got {output_op.op_type}" ) # The weight of conv has C_out at axis=0, but in output the C_out is at axis=1 scale_val = np.squeeze(scale_val) if len(scale_val.shape) > 1: # The per-channel-scale should only have one axis with larger than 1 dim size. return channel_size = 1 if len(scale_val.shape) == 0 else scale_val.shape[0] scale_val = scale_val.reshape((1, channel_size, 1, 1)) inputs["weight"] = lut_op.outputs[0] if getattr(output_op, "bias", None) and output_op.bias.val is not None: original_bias = output_op.bias.val new_bias = (original_bias / np.squeeze(scale_val)).astype(original_bias.dtype) inputs["bias"] = new_bias # Reconstruct the unscaled output which uses lut output as weight (skip the original scale). unscaled_output = getattr(mb, output_op.op_type)(**inputs, before_op=output_op) scaled_output = mb.mul(x=unscaled_output, y=scale_val, before_op=output_op) # Now the order is lut_op -> unscaled_output -> scaled_output. block.replace_uses_of_var_after_op( anchor_op=output_op, old_var=output_op.outputs[0], new_var=scaled_output, force_replace=True, # Need to force replace because it involves replacing constexpr op. ) @register_pass(namespace="common") class canonicalize_quantized_lut_pattern(AbstractGraphPass): """ The quantized lut (e.g. each entry in the LUT is int8) could be represented by two patterns: Pattern 1: lut(int8) -> constexpr_blockwise_shift_scale -> lut(fp16) -> constexpr_lut_to_dense -> dense(fp16) Pattern 2: lut(int8) -> constexpr_lut_to_dense -> dense(int8) -> constexpr_blockwise_shift_scale -> dense(fp16) Those two patterns are mathematically equivalent when the quantization is per-tensor or per-channel. This graph pass makes sure we always use one specific pattern by re-ordering the ops. """ _DEQUANT_FIRST = True # First dequantize and then depalettize (use pattern 1). 
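# (Illustrative note, not from the original source.) For per-tensor scale/offset the two
# patterns agree because indexing into the LUT commutes with an elementwise affine map:
#     (lut * scale + offset)[indices] == lut[indices] * scale + offset
# For per-channel parameters the LUT may first have to be repeated along the scaled axis,
# which is what `_reorder_quant_lut` below takes care of.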
def apply(self, prog): wrong_order_op1 = ( "constexpr_lut_to_dense" if self._DEQUANT_FIRST else "constexpr_blockwise_shift_scale" ) wrong_order_op2 = ( "constexpr_blockwise_shift_scale" if self._DEQUANT_FIRST else "constexpr_lut_to_dense" ) @block_context_manager def apply_block(block: Block): for op in list(block.operations): for b in op.blocks: apply_block(b) if op.op_type == wrong_order_op1 and len(op.outputs[0].child_ops) == 1: if op.outputs[0].child_ops[0].op_type == wrong_order_op2: self._reorder_quant_lut(block, op) for f in prog.functions.values(): apply_block(f) def _reorder_quant_lut(self, block: Block, old_op1: Operation): """ Original order is op1 -> op2 -> output_op, and after reorder it becomes op2 -> op1 -> output_op. Here op1 and op2 corresponds to either lut op or quant op, depending on `_DEQUANT_FIRST`. """ old_op2 = old_op1.outputs[0].child_ops[0] # If the old op has some meaningful info in the name (such as "conv1.weight"), we need to keep it. new_op1_name = None if old_op1.op_type in old_op1.name else old_op1.name new_op2_name = None if old_op2.op_type in old_op2.name else old_op2.name if old_op1.op_type == "constexpr_blockwise_shift_scale": # The old_op1 is dequant op and old_op2 is a lut op. # The scale and offset from old_op1 is for lut, so the rank need to be adjusted. if old_op1.scale.shape[-2:] != (1, 1): raise AssertionError( "The quantization on lut must be per-tensor, so last two dims in `scale` should " f"both be 1, but got scale with shape {old_op1.scale.shape}." ) new_scale_shape = old_op1.scale.shape[-2:] scale = old_op1.scale.val.reshape(new_scale_shape) offset = old_op1.offset if offset is not None and offset.val is not None: offset = old_op1.offset.val.reshape(new_scale_shape) new_op1_args = {"indices": old_op2.indices, "lut": old_op1.data, "before_op": old_op2} if new_op1_name is not None: new_op1_args["name"] = new_op1_name new_op1 = mb.constexpr_lut_to_dense(**new_op1_args) new_op2_args = {"data": new_op1, "scale": scale, "offset": offset, "before_op": old_op2} if new_op2_name is not None: new_op2_args["name"] = new_op2_name new_op2 = mb.constexpr_blockwise_shift_scale(**new_op2_args) else: # The old_op1 is lut op and old_op2 is a dequant op. # The scale and offset from old_op2 is for depalettized weight, so the rank need to be adjusted to match # the lut's rank. new_scale_shape = old_op2.scale.shape + (1, 1) scale = old_op2.scale.val.reshape(new_scale_shape) offset = old_op2.offset if offset is not None and offset.val is not None: offset = old_op2.offset.val.reshape(new_scale_shape) lut = old_op1.lut if any(shape != 1 for shape in new_scale_shape): # The lut need to be repeated when necessary. For example, in per-channel-scale, the lut has shape # [16, 1, 16, 1], indices has shape [32, 1], and scale has shape [32, 1]. It means every two rows in # the weight share a lut, and it's impossible to apply 32 scales to 16 lut tables. So we need to repeat # the lut to become [32, 1, 16, 1], and then apply those 32 scales to each row. lut = old_op1.lut.val if lut is None: return # Cannot handle the reording when the lut is not const. for axis, (scale_shape, lut_shape) in enumerate(zip(new_scale_shape, lut.shape)): if scale_shape > lut_shape: if scale_shape % lut_shape != 0: return # Skip when lut's shape cannot be repeated to match scale's shape. 
lut = np.repeat(lut, scale_shape // lut_shape, axis=axis) new_op1_args = {"data": lut, "scale": scale, "offset": offset, "before_op": old_op1} if new_op1_name is not None: new_op1_args["name"] = new_op1_name new_op1 = mb.constexpr_blockwise_shift_scale(**new_op1_args) new_op2_args = {"indices": old_op1.indices, "lut": new_op1, "before_op": old_op1} if new_op2_name is not None: new_op2_args["name"] = new_op2_name new_op2 = mb.constexpr_lut_to_dense(**new_op2_args) block.replace_uses_of_var_after_op( anchor_op=old_op2, old_var=old_op2.outputs[0], new_var=new_op2, force_replace=True, # Need to force replace because it involves replacing constexpr op. ) block.remove_ops([old_op1, old_op2]) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/optimize_repeat_ops.py0000644000000000000000000022577614672066616027472 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy from collections import defaultdict from typing import List, Text, Tuple import numpy as np from coremltools import _logger as logger from coremltools.converters.mil.mil import Block from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Function, Operation from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import _check_child_op_type, block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass from coremltools.converters.mil.mil.types.symbolic import any_symbolic from coremltools.converters.mil.mil.types.type_mapping import ( RangeTuple, builtin_to_range, builtin_to_resolution, string_to_builtin, ) from coremltools.converters.mil.mil.var import Var @register_pass(namespace="common") class merge_consecutive_paddings(AbstractGraphPass): """ Identify two consecutive ``pad`` layers which could be merged into a single ``pad`` layer. This is possible only if one of the following conditions is satisfied: - The paddings are "constant" and have the same ``constant_val``. - The paddings act along different axes. .. 
code-block:: Input graph: input(1, 2, 6, 8) ------> pad([1, 1], mode='reflect) -----> pad([1, 1, 0, 0], mode='reflect') ---> out(1, 2, 8, 10) Output graph: input(1, 2, 6, 8) ------> pad([1, 1, 1, 1], mode='reflect) ---> out(1, 2, 8, 10) """ def apply(self, prog): for f in prog.functions.values(): block_changed = True while block_changed: block_changed = self._merge_padding_block(f) def _match_pattern(self, block, padding_op): if padding_op.op_type != "pad": return False if not _check_child_op_type(padding_op, "pad"): return False child_padding_op = list(padding_op.outputs[0].child_ops)[0] if padding_op.inputs["mode"].val != child_padding_op.inputs["mode"].val: return False # Ensure the paddings have the same length by prepending zeros to the shorter one first_pad = padding_op.inputs["pad"].val child_pad = child_padding_op.inputs["pad"].val if len(first_pad) > len(child_pad): child_pad = np.insert(child_pad, 0, [0] * (len(first_pad) - len(child_pad))) elif len(child_pad) > len(first_pad): first_pad = np.insert(first_pad, 0, [0] * (len(child_pad) - len(first_pad))) final_pad = child_pad + first_pad if padding_op.inputs["mode"].val == "constant": # if the padding is constant, then the values need to be equal if padding_op.inputs["constant_val"].val != child_padding_op.inputs["constant_val"].val: return False else: # if the padding is not constant, then we can't merge if both pads affected the same # side of the image if any(i != 0 and j != 0 for (i, j) in zip(first_pad, child_pad)): return False return self._replace_ops(block, padding_op, child_padding_op, final_pad) @staticmethod def _replace_ops(block, padding_op, child_padding_op, final_pad): mode = padding_op.inputs["mode"].val x = mb.pad( x=padding_op.inputs["x"], pad=final_pad, mode=mode, constant_val=padding_op.inputs["constant_val"].val, before_op=padding_op, ) padding_op.enclosing_block.replace_uses_of_var_after_op( anchor_op=padding_op, old_var=child_padding_op.outputs[0], new_var=x ) block.remove_ops([padding_op, child_padding_op]) return True @block_context_manager def _merge_padding_block(self, block): fusion_happens = False for op in list(block.operations): if op.enclosing_block is None: continue if self._match_pattern(block, op): fusion_happens = True return fusion_happens @register_pass(namespace="common") class merge_consecutive_transposes(AbstractGraphPass): """ Identify consecutive 'transpose' layers which could be merged into a single 'transpose' layer. .. 
code-block:: Input graph: input ------> transpose -----> 1 or more transpose layers ---> out Output graph: input ------> transpose ---> out """ def apply(self, prog): for f in prog.functions.values(): self._merge_transposes_in_block(f) def _match_and_replace_pattern(self, block, transpose_op): if not (transpose_op.op_type == "transpose" and _check_child_op_type(transpose_op, "transpose")): return False if transpose_op.outputs[0] in block.outputs: return False child_transpose_op = list(transpose_op.outputs[0].child_ops)[0] return self._replace_ops(block, transpose_op, child_transpose_op) @staticmethod def _replace_ops(block, transpose_op, child_transpose_op): perm = transpose_op.perm.val new_perm = [perm[i] for i in child_transpose_op.perm.val] x = mb.transpose(x=transpose_op.x, perm=new_perm, before_op=transpose_op) if transpose_op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=child_transpose_op, old_var=child_transpose_op.outputs[0], new_var=x, ): block.remove_ops([transpose_op, child_transpose_op]) return True return False @block_context_manager def _merge_transposes_in_block(self, block): def help_merge_transpose_ops(block): fusion_happens = False for op in list(block.operations): if op.enclosing_block is None: continue if self._match_and_replace_pattern(block, op): fusion_happens = True return fusion_happens block_changed = True while block_changed: block_changed = help_merge_transpose_ops(block) @register_pass(namespace="common") class merge_consecutive_relus(AbstractGraphPass): """ Identify consecutive ``relu`` layers which could be merged into a single ``relu`` layer. .. code-block:: Input graph: input ------> relu -----> 1 or more relu layers ---> out Output graph: input ------> relu ---> out """ def apply(self, prog): for f in prog.functions.values(): self._merge_relus_in_block(f) def _match_and_replace_pattern(self, block, relu_op): if not (relu_op.op_type == "relu" and _check_child_op_type(relu_op, "relu")): return False child_relu_op = list(relu_op.outputs[0].child_ops)[0] return self._replace_ops(block, relu_op, child_relu_op) @staticmethod def _replace_ops(block, relu_op, child_relu_op): if relu_op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=child_relu_op, old_var=child_relu_op.outputs[0], new_var=relu_op.outputs[0] ): block.remove_ops([child_relu_op]) return True return False @block_context_manager def _merge_relus_in_block(self, block): def help_merge_relu_ops(block): fusion_happens = False for op in list(block.operations): if op.enclosing_block is None: continue if self._match_and_replace_pattern(block, op): fusion_happens = True return fusion_happens block_changed = True while block_changed: block_changed = help_merge_relu_ops(block) @register_pass(namespace="common") class merge_consecutive_reshapes(AbstractGraphPass): """ Identify consecutive ``reshape`` ops which could be merged into a single ``reshape``. .. code-block:: Input graph: input -> reshape -> 1 or more reshapes -> output Output graph: input -> reshape -> output """ # TODO (rdar://105227587): merge a tree of consecutive reshapes def apply(self, prog): for f in prog.functions.values(): self._merge_consecutive_reshapes_block(f) @staticmethod def _match_pattern(reshape_op): """ Given a ``reshape`` op, consider it as the head of a sequence of ``reshape`` ops, and then end the sequence at a non-removable ``reshape`` op. Return this sequence as a list. 
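        For example (an illustrative case, not an exhaustive one): given a chain
        ``reshape_1 -> reshape_2 -> reshape_3`` in which only the output of ``reshape_3``
        is a block output, the returned list is ``[reshape_1, reshape_2, reshape_3]``, and
        the caller then collapses the whole chain into a single ``reshape``. The identity
        relied upon can be checked with numpy:

        .. sourcecode:: python

            import numpy as np

            x = np.arange(12)
            # only the last reshape determines the result
            assert np.array_equal(x.reshape(2, 6).reshape(3, 4), x.reshape(3, 4))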
""" res = [] op = reshape_op while op.op_type == "reshape": res.append(op) # current reshape has 0 or 2+ child ops: # * no child: this is the end of graph # * 2+ children: only pattern of sequential reshape ops (1 child) # is supported for now. For more general cases, please see TODO below if len(op.outputs[0].child_ops) != 1: break # current reshape output is a block output, so it is non-removable if op.outputs[0] in op.enclosing_block.outputs: break op = op.outputs[0].child_ops[0] return res @block_context_manager def _merge_consecutive_reshapes_block(self, block): @block_context_manager def help_merge_consecutive_reshapes_block(block): fusion_happens = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: block_changed = True while block_changed: block_changed = help_merge_consecutive_reshapes_block(b) # move on to the next op if this op is not reshape if op.op_type != "reshape": continue reshape_ops = self._match_pattern(op) # merge the list of consecutive reshape ops if len(reshape_ops) > 1: # create a new reshape op reshape_out = mb.reshape( x=reshape_ops[0].x, shape=reshape_ops[-1].shape, name=reshape_ops[-1].outputs[0].name, before_op=reshape_ops[-1], ) # replace the consecutive reshape ops with the new reshape op reshape_ops[-1].enclosing_block.replace_uses_of_var_after_op( anchor_op=reshape_ops[-1], old_var=reshape_ops[-1].outputs[0], new_var=reshape_out, ) reshape_ops[-1].enclosing_block.remove_ops(reshape_ops) fusion_happens = True return fusion_happens block_changed = True while block_changed: block_changed = help_merge_consecutive_reshapes_block(block) class CastOptimizationNode: def __init__(self, op_type, match_criterion=None): """ Parameters ---------- param op_type : Type of an operation. param match_criterion : A callable function that matches a MIL op and returns a boolean. Examples -------- .. sourcecode:: python CastOptimizationNode("mul"), CastOptimizationNode("round"), CastOptimizationNode("add", lambda op: op.y.val == 0), CastOptimizationNode("clip", lambda op: op.alpha.val == -128 and op.beta.val == 127), CastOptimizationNode("cast", lambda op: op.dtype.val == "int8"), CastOptimizationNode("cast", lambda op: op.dtype.val == "fp32"), """ self.op_type = op_type if not match_criterion: match_criterion = lambda op: True self.match_criterion = match_criterion @register_pass(namespace="common") class cast_optimization(AbstractGraphPass): """ This optimization pass performs the following: - Removes redundant ``cast`` op; that is, ``cast`` where source and destination tensors have same dtypes. - Fuses two consecutive `cast` ops if applicable, repeatedly. This is a non-algebraic translation which assumes that the upcasting doesn't change the user's intent. (1) Example for redundant ``cast`` op removal: .. code-block:: Input graph: input(fp16) -> cast(dtype="fp16") -> relu -> out Output graph: input -> relu -> out The input and output tensors for the ``cast`` op are both with type of ``fp16``. Hence, it can be removed. (2) Example for two ``cast`` ops fusion: .. code-block:: Input graph: input(int8) -> cast(dtype="fp16") -> cast(dtype="fp32") -> out Output graph: input(int8) -> cast(dtype="fp32") -> out The data range and resolution of the above graph are limited by the int8 input, so the fusion is allowed. (3) Negative example for two ``cast`` ops fusion: .. code-block:: Input graph: input(fp32) -> cast(dtype="bool") -> cast(dtype="fp16") -> out Output graph: Same as input graph. 
The above two ``cast`` ops cannot be merged, since after the first cast, the resolution of the numerical output is downcasted to binary (``0, 1``). If we fuse them, the output would be in the range and resolution of ``fp16`` instead. (4) Another Negative example for two ``cast`` ops fusion: .. code-block:: Input graph: input(int32) -> cast(dtype="int8") -> cast(dtype="uint8") -> out Output graph: Same as input graph. The above two ``cast`` ops cannot be merged, since in the original graph, by going through two casts, the output numerical range is capped to ``[0, 127]``. However, if two ``cast`` ops are reduced to 1 ``cast(dtype="uint8")``, the output numerical would in the range of ``[0, 255]``. The fusion would cause numerical issue for the numbers between ``[128, 255]``, which is prohibited. In general, two ``cast`` ops can be merged if the output data range and resolution is not affected. For more examples, please see the unittests that start with prefix ``TestCastOptimization`` in ``test_passes.py``. """ _num_of_visited_ops = 0 # Testing purpose, making sure the algorithm performs in O(N) def apply(self, prog): self._num_of_visited_ops = 0 for f in prog.functions.values(): self._fuse_or_cancel_consecutive_casts_block_wrapper(f) def _propagate_range_resolution(self, in_dtype: type, dtype_chain: Tuple[type]): """ Given an input type ``in_dtype``, and a chain of casting, return the resulting output data range and resolution. For example, ``in_dtype = fp32`` and ``dtype_chain = [int8, int32]``. This means an input data with type ``fp32``, is propagated through ``cast(dtype="int8")`` and ``cast(dtype="int32")`` in order. 1. The input fp32 data range is ``[-3.4e+38, 3.4e+38]`` with resolution ``1e-06``. 2. After the first ``cast(dtype="int8")`` downcast, the range becomes ``[-128, 127]`` with resolution ``1``. 3. Even the ``int32`` has a larger range, the resulting range is still capped to ``[-128, 127]``. For the above example, this function returns range of ``[-128, 127]`` and resolution ``1``. """ assert isinstance(dtype_chain, tuple) cur_range, cur_resolution = builtin_to_range(in_dtype), builtin_to_resolution(in_dtype) for v in dtype_chain: tmp_range, tmp_resolution = builtin_to_range(v), builtin_to_resolution(v) cur_range = RangeTuple( max(cur_range.low, tmp_range.low), min(cur_range.high, tmp_range.high) ) cur_resolution = max(cur_resolution, tmp_resolution) return cur_range, cur_resolution def _is_cast_ops_fusable(self, cast_1: Operation, cast_2: Operation): """ Check if two cast ops can be fused by verifying the consistency between the range and resolution before and after fusion. Take the same example shown in ``_propagate_range_resolution``: input(fp32) -> cast(dtype="int8") -> cast(dtype="int32") The original pattern has output range and resolution ``[-128, 127]``, ``1``. However, if the two ``cast`` ops are fused: input(fp32) -> cast(dtype="int32") The output range becomes the range of int32, which is not ``[-128, 127]``. As the result, the fusion is prohibited. """ x_dtype, cast_1_dtype, cast_2_dtype = ( cast_1.x.dtype, string_to_builtin(cast_1.dtype.val), string_to_builtin(cast_2.dtype.val), ) ref_range, ref_resolution = self._propagate_range_resolution( x_dtype, (cast_1_dtype, cast_2_dtype) ) out_range, out_resolution = self._propagate_range_resolution(x_dtype, (cast_2_dtype,)) return out_range == ref_range and out_resolution == ref_resolution def _dup_if_affect_io(self, new_var: Var, old_var: Var, before_op: Operation): """ We cannot replace old_var with new_var, if: 1. 
old_var is a function output 2. new_var is a function input Since the name of the function is going to be changed and become invalid. For this special corner case, we use an identity op to duplicate the new_var. """ block_1 = before_op.enclosing_block is_new_var_function_input = ( isinstance(block_1, Function) and new_var in block_1.inputs.values() ) block_2 = old_var.op.enclosing_block is_old_var_function_output = isinstance(block_2, Function) and old_var in block_2.outputs if is_new_var_function_input and is_old_var_function_output: return mb.identity(x=new_var, before_op=before_op) return new_var def _fuse_cast_ops(self, cast_ops: List[Operation], reuse_input_var: bool = False): """ Fuse the pattern of: input -> cast_1(dtype=dtype_1) -> cast_2(dtype=dtype_2) -> out If ``reuse_input_var = True``, the pattern is reduced to: input -> out otherwise, a new ``cast`` op with the same ``dtype`` as ``cast_2`` is created: input -> cast_3(dtype=dtype_2) -> out """ if not isinstance(cast_ops[0], tuple): cast_ops = tuple((cast_ops,)) ops_to_remove = [] for cast_1, cast_2 in cast_ops: if reuse_input_var: new_output_var = self._dup_if_affect_io(cast_1.x, cast_2.outputs[0], cast_1) else: fused_output_var_name = cast_1.x.name + "_to_{}".format(cast_2.dtype.val) new_output_var = mb.cast( x=cast_1.x, dtype=cast_2.dtype, name=fused_output_var_name, before_op=cast_2, ) # It's important to use `cast_2.enclosing_block` since `cast_2` might be present in a block nested under `cast_1.enclosing_block` cast_2.enclosing_block.replace_uses_of_var_after_op( anchor_op=cast_2, old_var=cast_2.outputs[0], new_var=new_output_var, ) # Remove just the last cast op and let dce eliminate the rest of the ops if needed, # The reason is that first cast op could be feeding into other non-cast ops. ops_to_remove.append(cast_2) ops_to_remove[0].enclosing_block.remove_ops(ops_to_remove) def _try_to_transform(self, root_op, cast_ops_across_blocks): block = root_op.enclosing_block if block is None: return False # Scenario: Redundant cast when source and destination dtype are same. 
if root_op.op_type == "cast" and root_op.x.is_tensor_or_scalar_of(dtype=root_op.dtype.val): new_var = root_op.x old_var = root_op.outputs[0] new_var = self._dup_if_affect_io(root_op.x, old_var, root_op) block.replace_uses_of_var_after_op( anchor_op=root_op, old_var=old_var, new_var=new_var, ) block.remove_ops([root_op]) return True # Scenario: Consecutive casts candidate_child_ops = [] for op in root_op.outputs[0].child_ops: if op.op_type == "cast": candidate_child_ops.append(op) fusion_happens = False for child_op in candidate_child_ops: if not self._is_cast_ops_fusable(root_op, child_op): continue if root_op.x.is_tensor_or_scalar_of(dtype=child_op.dtype.val): # when consecutive casts cancel each other # Please check out: test_linear_consecutive_cast_ops_cancellation in TestCastOptimization self._fuse_cast_ops((root_op, child_op), reuse_input_var=True) fusion_happens = True else: if child_op.enclosing_block != block: # If cast_2 is in an inner block, we handle it at once in a separate function `_fuse_casts_ops_across_blocks` cast_ops_across_blocks[child_op.enclosing_block].add((root_op, child_op)) continue self._fuse_cast_ops((root_op, child_op)) fusion_happens = True return fusion_happens @block_context_manager def _fuse_casts_ops_across_blocks(self, block: Block, ops_to_fused: Tuple[Operation]): self._fuse_cast_ops(ops_to_fused) @block_context_manager def _fuse_or_cancel_consecutive_casts_block_wrapper(self, block): def _fuse_or_cancel_consecutive_casts_block(block, cast_ops_across_blocks): # We first make sure all the inner blocks are optimized # It is important to do it separately at the very beginning, to ensure the last step of optimizing cast ops across the block boundary is correct. for op in block.operations: for b in op.blocks: self._fuse_or_cancel_consecutive_casts_block_wrapper(b) fusion_happens = False for op in list(block.operations): self._num_of_visited_ops += 1 # start pattern match if cast op is encountered if op.op_type == "cast": if self._try_to_transform(op, cast_ops_across_blocks): # It is important not to exit the loop right away when a fusion happens, # in order to make the time complexity low. # For instance, given a program of the pattern: # relu -> relu -> cast -> cast -> cast, # the three cast ops can be fused into a single cast op in one shot. # On the other hand, if we break the loop right away, the # two relu ops will be visited 3 times, which makes the overall # time complexity O(N^2). fusion_happens = True return fusion_happens block_changed = True cast_ops_across_blocks = defaultdict(set) while block_changed: block_changed = _fuse_or_cancel_consecutive_casts_block(block, cast_ops_across_blocks) # fuse the cast ops across the inner / outer block boundary for k, v in cast_ops_across_blocks.items(): self._fuse_casts_ops_across_blocks(k, tuple(v)) class TransformAxisUpdateOps: """ Parent class for every axis update op's class. An axis update op is an op that can be updated, such that it can allow a transpose layer to "pass" through it. That is, op(transpose(x)) == transpose(op_updated(x)) where "op" : original op, "op_updated": op after being updated. Example: if x is a tensor of rank 2, and transpose has perm=[1,0], then reduce_mean[axis=1](transpose(x)) == transpose(reduce_mean[axis=0](x)) here the reduce_mean op with axis=1 can be updated to a reduce_mean op with axis=0, to allow the transpose to "pass" through it, i.e. get applied after it.
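    A quick numpy check of the example above (illustrative only; ``keep_dims`` is kept
    ``True`` here, as required by ``_TransformReduceMean.can_transpose_pass``, so that
    the rank does not change):

    .. sourcecode:: python

        import numpy as np

        x = np.random.rand(3, 5)
        lhs = np.mean(np.transpose(x, (1, 0)), axis=1, keepdims=True)   # reduce_mean[axis=1](transpose(x))
        rhs = np.transpose(np.mean(x, axis=0, keepdims=True), (1, 0))   # transpose(reduce_mean[axis=0](x))
        assert np.allclose(lhs, rhs)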
""" def __init__(self, op, transpose_axes, var_to_hypothetical_value_dict=None): self.op = op self.transpose_axes = transpose_axes self.var_to_hypothetical_value_dict = var_to_hypothetical_value_dict def can_transpose_pass(self): """ Each "axis" op must determine whether it can act like a unary op and allow the transpose to pass through. Return True if it can allow the transpose to pass through, otherwise return False. :return: bool """ raise NotImplementedError("This function must be implemented by each op") def update(self): """ A method that updates some attribute of the axis op, based on the transpose axes value. This method only gets called if "can_transpose_pass" returns True. Update the op such that the output %i2 should be equal to %o2 Before: %i_1 = transpose_op(%i_0, perm=transpose_axes) %i2 = op(%i1) After: %o1 = op_updated(%i0) %o2 = transpose_op(%o1, perm=transpose_axes) :return: None """ raise NotImplementedError("This function must be implemented by each op") @staticmethod def _find_transpose_compliment(perm): """ return the permutation value that when applied will reverse the effect of the given permutation. e.g.: if perm == (1, 2, 3, 0), then return (3, 0, 1, 2), which will undo the first permutation's effect """ rank = len(perm) all_positive_perm = [p + rank if p < 0 else p for p in perm] perm_inverse = [0] * rank for i in range(rank): perm_inverse[i] = all_positive_perm.index(i) return perm_inverse class _HypotheticalValue: """ A hypothetical value that simply wraps a Var. Actual Var it wraps doesn't really matter, as its mainly for debugging. This class really exists to differentiate a "_LazyTransposeHypotheticalValue" type with a non-"_LazyTransposeHypotheticalValue" type. """ def __init__(self, var=None): self.value = var # type : Var class _LazyTransposeHypotheticalValue: """ A hypothetical value that represents a transpose op on top of a hypothetical value, or a collection of transpose_ops, which have the same "perm" parameter. """ def __init__(self, hypothetical_value, transpose_ops, perm): # Input hypothetical value to the transpose op. # When there are multiple transpose ops, this is the incoming hypothetical value to any one of those self.wrapped_hypothetical_value = hypothetical_value # type : _HypotheticalValue if not isinstance(hypothetical_value, _HypotheticalValue): raise ValueError( "transpose optimization pass: incorrect type passed for hypothetical_value" ) for op in transpose_ops: if op.op_type != "transpose": raise ValueError( "transpose optimization pass: _LazyTransposeHypotheticalValue can only be made with transpose ops" ) perm_op = list(op.inputs["perm"].val) if perm_op != perm: raise ValueError( "transpose optimization pass: _LazyTransposeHypotheticalValue can only be made with transpose ops with the same 'perm' values" ) self.perm = perm # type : list[int], perm parameter of all the transpose ops self.transpose_ops = transpose_ops # type : Set(op) class _TransposeOptimization: _DEBUG = False # Set to true to plot the block before and after the transformation. # Dictionary from axis update op to its class # This is filled in by child classes of the class "TransformAxisUpdateOps". _AXIS_UPDATE_OPS = dict() # TODO: instead of a hard-coded set, use op-traits # These are the ops that satisfy the following property: # - single non constant input # - single output # - non rank changing # - doesn't need to be updated of a transpose passes through it. i.e. 
# Transpose(op(x)) == op(Transpose(x)) _UNARY_LIKE_OP_TYPES = { "relu", "log", "relu6", "abs", "acos", "asin", "atan", "atanh", "ceil", "clip", "cos", "cosh", "erf", "exp", "exp2", "floor", "identity", "logical_not", "round", "rsqrt", "sign", "sin", "sinh", "sqrt", "square", "pow", "tan", "tanh", "threshold", "clamped_relu", "elu", "gelu", "leaky_relu", "linear_activation", "scaled_tanh", "sigmoid", "sigmoid_hard", "softplus", "softplus_parametric", "softsign", "thresholded_relu", } def __init__(self, block): self.block = block # for each var in the block, this dictionary stores the hypothetical value that is assigned to it during # graph traversal self.var_to_hypothetical_value = ( {} ) # type : var : _HypotheticalValue or _LazyTransposeHypotheticalValue # start out by filling this dictionary with all the inputs of the block for _, input_var in block.inputs.items(): self.var_to_hypothetical_value[input_var] = _HypotheticalValue(input_var) # Dictionaries below are used to store transpose cancellation/fusion information. # These are filled during the traversal of the graph, # after which they are used by the `_apply_transform` method # transpose op to the list of transpose ops that are its compliments and can be cancelled away with it self.transpose_op_to_cancel_ops = defaultdict(lambda: []) # type : op : List[op] # transpose op to the list of ops before which it has to materialize, i.e. the root transpose op # can be moved downstream in the graph, as far as these materialize ops self.transpose_op_to_materialize_ops = defaultdict( lambda: [] ) # type : op : List[Tuple(op, Var)] # list of the ops that need to be updated (either their axis parameter or one of their constant inputs) # if the transpose op is fused away or moved downstream in the graph self.transpose_op_to_axis_update_ops = defaultdict(lambda: []) # type : op : List[op] # for book keeping self.ops_updated = set() self.materialized_ops_handled = set() self.transpose_ops_removed = set() # save the output sinks' information self.old_output_vars = [] self.output_sink_ops = [] # We modify the graph temporarily for outputs self._add_output_sinks() def _add_output_sinks(self): # We add an identity sink for all outputs. 
self.old_output_vars = {var: var.name for var in self.block.outputs} new_outputs = [] output_sinks_var = {} for out_var in self.block.outputs: if out_var not in output_sinks_var: out_sink = mb.identity(x=out_var) output_sinks_var[out_var] = out_sink else: out_sink = output_sinks_var[out_var] new_outputs.append(out_sink) self.output_sink_ops.append(out_sink.op) self.block.set_outputs(new_outputs) def _visit_unary_like_op(self, op, input_var=None): # pass the input var's hypothetical_value to the output var's, since shape invariant ops do # not modify the incoming hypothetical_value if input_var is None: input_var = op.inputs["x"] if len(op.outputs) > 1: msg = ( "transpose optimization pass: op '{}', of type = '{}', has multiple outputs, hence it" "cannot be handled like a unary op" ) raise ValueError(msg.format(op.name, op.op_type)) self.var_to_hypothetical_value[op.outputs[0]] = self.var_to_hypothetical_value[input_var] def _visit_materialize_op(self, op): # this is the catch all category of ops # these are the "not-lazy-transpose-pass-through" kind of ops # output hypothetical_value is same as the vars for out_var in op.outputs: self.var_to_hypothetical_value[out_var] = _HypotheticalValue(out_var) # check for the inputs # if there is a lazy transpose hypothetical value as an input, # all the transpose ops it hold, # need to be materialized here now, i.e., we should update "transpose_op_to_materialize_ops" for input_var in self._get_input_vars(op): input_hypothetical_value = self.var_to_hypothetical_value[input_var] if isinstance(input_hypothetical_value, _LazyTransposeHypotheticalValue): all_lazy_transpose_ops = input_hypothetical_value.transpose_ops for transpose_op in all_lazy_transpose_ops: self.transpose_op_to_materialize_ops[transpose_op].append((op, input_var)) def _visit_axis_update_op(self, op): """ Check: - at least one of the non-constant inputs to this op is of type _LazyTransposeHypotheticalValue - for all non-constant inputs, that are of type _LazyTransposeHypotheticalValue, they have the same perm value. These checks are common for all "axis update" ops. 
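        For example (illustrative): when every non-constant input of a ``concat`` comes
        from a transpose with the same ``perm``, the transpose can be applied after the
        op by remapping the axis (the original ``axis`` maps to ``perm[axis]``, as done
        in ``_TransformConcat.update``):

        .. sourcecode:: python

            import numpy as np

            perm = (0, 2, 1)
            x, y = np.random.rand(2, 3, 4), np.random.rand(2, 3, 4)
            lhs = np.concatenate([x.transpose(perm), y.transpose(perm)], axis=1)
            rhs = np.concatenate([x, y], axis=perm[1]).transpose(perm)
            assert np.allclose(lhs, rhs)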
""" input_vars = self._get_input_vars(op, only_nonconst_vars=True) perm = None num_lazy_input_vars = 0 for var in input_vars: hypothetical_value = self.var_to_hypothetical_value[var] if isinstance(hypothetical_value, _LazyTransposeHypotheticalValue): num_lazy_input_vars += 1 if perm is None: perm = hypothetical_value.perm elif perm != hypothetical_value.perm: self._visit_materialize_op(op) return if num_lazy_input_vars == 0: self._visit_materialize_op(op) return # checks specific to the op type op_cls = self._AXIS_UPDATE_OPS.get(op.op_type, None) if op_cls is None: raise ValueError("Transform class for op of type '{}' not found".format(op.op_type)) if not op_cls( **{ "op": op, "transpose_axes": perm, "var_to_hypothetical_value_dict": self.var_to_hypothetical_value, } ).can_transpose_pass(): self._visit_materialize_op(op) return # add this op to the dictionary "transpose_op_to_axis_update_ops" # and update self.var_to_hypothetical_value[op.outputs[0]] all_lazy_transpose_ops = set() wrapped_hypothetical_value = None for var in input_vars: input_hypothetical_value = self.var_to_hypothetical_value[var] if isinstance(input_hypothetical_value, _LazyTransposeHypotheticalValue): all_lazy_transpose_ops.update(input_hypothetical_value.transpose_ops) wrapped_hypothetical_value = input_hypothetical_value.wrapped_hypothetical_value for transpose_op in all_lazy_transpose_ops: self.transpose_op_to_axis_update_ops[transpose_op].append(op) for output in op.outputs: self.var_to_hypothetical_value[output] = _LazyTransposeHypotheticalValue( wrapped_hypothetical_value, all_lazy_transpose_ops, perm, ) @staticmethod def _do_transposes_cancel(perm1, perm2): if len(perm1) != len(perm2): return False x = list(range(len(perm1))) x1 = [x[i] for i in perm1] x2 = [x1[i] for i in perm2] if x == x2: return True return False def _visit_transpose_op(self, op): input_var = op.inputs["x"] if op.inputs["perm"].val is None: self._visit_materialize_op(op) return perm = list(op.inputs["perm"].val) input_hypothetical_value = self.var_to_hypothetical_value[input_var] """ There are 3 cases to handle: 1. input type == _HypotheticalValue 2. input type == _LazyTransposeHypotheticalValue and this op is the transpose compliment of it 3. input type == _LazyTransposeHypotheticalValue and this op is NOT the transpose compliment of it """ if isinstance(input_hypothetical_value, _HypotheticalValue): # case 1 # the input is not a lazy transpose. # Since the current node is a transpose, there are two sub-cases. # a) It's a output node. We materialize it directly. # b) It might get cancelled downstream, so make the output var's # hypothetical_value a lazy transpose if op.outputs[0] in self.old_output_vars: self._visit_materialize_op(op) else: self.var_to_hypothetical_value[op.outputs[0]] = _LazyTransposeHypotheticalValue( input_hypothetical_value, set([op]), perm ) return # input is a Lazy transpose hypothetical value. 
Lets first check whether the current # transpose cancels it or not do_cancel = self._do_transposes_cancel(input_hypothetical_value.perm, perm) if do_cancel: # case 2 # transposes cancel, so now the hypothetical_value of the output will # be same as the hypothetical value wrapped inside the upstream lazy transpose self.var_to_hypothetical_value[ op.outputs[0] ] = input_hypothetical_value.wrapped_hypothetical_value # also update the dictionary "transpose_op_to_cancel_ops" all_lazy_transpose_ops = input_hypothetical_value.transpose_ops for transpose_op in all_lazy_transpose_ops: self.transpose_op_to_cancel_ops[transpose_op].append(op) else: # case 3 # transposes don't cancel # this is same as a materialize op then self._visit_materialize_op(op) def _visit_op(self, op): input_vars = self._get_input_vars(op) for var in input_vars: assert ( var in self.var_to_hypothetical_value ), "transpose optimization pass: hypothetical value for var '{}', not found".format( var.name ) if op in self.output_sink_ops: self._visit_materialize_op(op) elif op.op_type in self._UNARY_LIKE_OP_TYPES: self._visit_unary_like_op(op) elif op.op_type in self._AXIS_UPDATE_OPS: self._visit_axis_update_op(op) elif op.op_type == "transpose": self._visit_transpose_op(op) elif op.op_type == "const": self.var_to_hypothetical_value[op.outputs[0]] = _HypotheticalValue(op.outputs[0]) else: self._visit_materialize_op(op) def block_traversal(self): # Since the ops are already organized in a topological manner, # simply iterate through all the ops for op in self.block.operations: self._visit_op(op) def _verify_cancellable_transposes(self): # invert "transpose_op_to_cancel_ops" transpose_cancel_ops_to_starting_transpose_set = defaultdict(lambda: set()) for op, cancel_ops_list in self.transpose_op_to_cancel_ops.items(): for cancel_op in cancel_ops_list: transpose_cancel_ops_to_starting_transpose_set[cancel_op].update(set([op])) for op in transpose_cancel_ops_to_starting_transpose_set: assert ( op not in self.transpose_op_to_cancel_ops ), "transpose reduction optimization: transpose op '{}' cannot be both a starting and cancel op".format( op.name ) # invert "transpose_op_to_materialize_ops" materizalize_ops_to_starting_transpose_set = defaultdict(lambda: set()) for op, materialize_ops in self.transpose_op_to_materialize_ops.items(): for materialize_op, edge in materialize_ops: materizalize_ops_to_starting_transpose_set[materialize_op].update(set([op])) # the starting transpose op may not be in "transpose_op_to_cancel_ops" # but it needs to be removed if it materializes later, hence we need to add it # to the "transpose_op_to_cancel_ops", with an empty value, i.e. no other ops to cancel because of it if op not in self.transpose_op_to_cancel_ops: self.transpose_op_to_cancel_ops[op] = [] # (starting transpose ops) and (transpose cancel ops + materialize ops) form a bipartite graph. 
# Find the connected components of this graph, by doing a BFS traversal connected_components = [] # List[(Set(op), Set(op)), Set(op)] visited = {} for op in list(self.transpose_op_to_cancel_ops.keys()): if op in visited: continue visited[op] = 1 set_a = set([op]) # set of starting transpose ops set_b1 = set() # set of transpose cancel ops connected to set_a set_b2 = set() # set of materialize ops connected to set_a queue = [] queue.extend(self.transpose_op_to_cancel_ops[op]) if op in self.transpose_op_to_materialize_ops: materialize_ops_list = list(list(zip(*self.transpose_op_to_materialize_ops[op]))[0]) queue.extend(materialize_ops_list) while len(queue) > 0: o = queue.pop(0) visited[o] = 1 # enqueue nodes connected to o if o in self.transpose_op_to_cancel_ops: set_a.update(set([o])) for neighbor_op in self.transpose_op_to_cancel_ops[o]: if neighbor_op not in visited: queue.append(neighbor_op) if o in self.transpose_op_to_materialize_ops: materialize_ops_list = list( list(zip(*self.transpose_op_to_materialize_ops[o]))[0] ) for neighbor_op in materialize_ops_list: if neighbor_op not in visited: queue.append(neighbor_op) elif o in transpose_cancel_ops_to_starting_transpose_set: set_b1.update(set([o])) for neighbor_op in transpose_cancel_ops_to_starting_transpose_set[o]: if neighbor_op not in visited: queue.append(neighbor_op) else: set_b2.update(set([o])) for neighbor_op in materizalize_ops_to_starting_transpose_set[o]: if neighbor_op not in visited: queue.append(neighbor_op) connected_components.append((set_a, set_b1, set_b2)) starting_ops_to_remove = set() # starting ops to remove from the optimization list # now for each connected component, make a decision whether to cancel it or not # (either all transpose ops in a set get cancelled or they don't) for op_set, op_cancel_set, materialize_op_set in connected_components: block_output = False # check that output is not directly connected to a starting transpose op for op in op_set: if op.outputs[0] in self.block.outputs: starting_ops_to_remove.update(op_set) block_output = True break if block_output: continue materizalize_set = set(list(materialize_op_set)) if len(materizalize_set) >= len(op_set) + len(op_cancel_set): starting_ops_to_remove.update(op_set) # remove ops for op in starting_ops_to_remove: self.transpose_op_to_cancel_ops.pop(op, None) def _remove_transpose_ops(self, starting_transpose_op): perm = list(starting_transpose_op.inputs["perm"].val) starting_transpose_op_out_var = starting_transpose_op.outputs[0] starting_transpose_op_input_var = starting_transpose_op.inputs["x"] # update all the "axis_update" ops for op in self.transpose_op_to_axis_update_ops.get(starting_transpose_op, []): if op not in self.ops_updated: op_cls = self._AXIS_UPDATE_OPS.get(op.op_type, None) op_cls( **{ "op": op, "transpose_axes": perm, "var_to_hypothetical_value_dict": self.var_to_hypothetical_value, } ).update() self.ops_updated.add(op) # short circuit starting_transpose_op and its cancel ops to_be_removed_ops = [] name_changed_vars = set() for op in [starting_transpose_op] + self.transpose_op_to_cancel_ops[starting_transpose_op]: if op in self.transpose_ops_removed: continue to_be_removed_ops.append(op) self.transpose_ops_removed.add(op) input_var = op.inputs["x"] # input to the transpose op output_var = op.outputs[0] # output of the transpose op parent_op = input_var.op # parent op of the transpose op if output_var in self.old_output_vars: # output is a block output, so this must be one of the "edge" transpose compliment ops # We need to set 
`input_var` as the block output var # Change the name of the input_var to match the block output if input_var is not changed. # If the same input_var is in output twice, we can't rename it twice, therefore we initiate an # Identity op to match the name if input_var in self.block.inputs.values(): input_var = mb.identity(x=input_var, before_op=op, name=output_var.name) parent_op = None # set anchor op as None. elif input_var not in name_changed_vars: input_var.name = output_var.name input_var.op.name = output_var.op.name name_changed_vars.update([input_var]) else: input_var = mb.identity(x=input_var, before_op=op, name=output_var.name) parent_op = input_var.op # connect all the child ops of the output_var to the parent of the transpose op. self.block.replace_uses_of_var_after_op( anchor_op=parent_op, old_var=output_var, new_var=input_var, no_check_var_types=True, ) """ Insert a transpose op JUST before each one of the materialize ops i.e. Given: %i1 = op(...) ... ... = materialize_op(..., %i1 ,...) ... Result: %i1 = op(...) ... %i2 = transpose_op(%i1, %perm) ... = materialize_op(..., %i2 ,...) ... """ for op, input_var in self.transpose_op_to_materialize_ops.get(starting_transpose_op, []): if (op, input_var) in self.materialized_ops_handled: continue self.materialized_ops_handled.add((op, input_var)) if input_var == starting_transpose_op_out_var: # materialize op is connected to the starting transpose op # in this case, connect to its parent if op in self.output_sink_ops: continue i1 = starting_transpose_op_input_var else: i1 = input_var if op in self.output_sink_ops: # The input_var of output sink is itself a output. We can safely # modify the name of the input_var since it should only be consumed # by block output here. if i1 not in name_changed_vars: x = mb.transpose(x=i1, perm=perm, before_op=op, name=i1.name) i1.name = "_before_transpose_op_" + x.op.name i1.op.name = "_before_transpose_op_" + x.op.name else: x = mb.transpose(x=i1, perm=perm, before_op=op, name=self.old_output_vars[i1]) else: x = mb.transpose(x=i1, perm=perm, before_op=op) self.block.replace_uses_of_var_after_op( anchor_op=x.op, end_op=op, old_var=i1, new_var=x, no_check_var_types=True, ) self.block.remove_ops(to_be_removed_ops) def apply_transform(self): """ Take in the data collected during graph traversal and transform the graph by cancelling out transpose ops that can be removed. """ logger.debug("Block before optimize transpose transform:\n{}".format(self.block)) if self._DEBUG: import graphviz graphviz.Source( self.block.get_dot_string( highlight_debug_op_names=[], highlight_debug_op_types=["transpose"] ) ).view(filename="/tmp/block_before_reduce_transpose") """ First check which transposes can be cancelled. After this function call we get an updated dictionary "transpose_op_to_cancel_ops" with only the transpose ops that can really be cancelled in the graph Reasons to not cancel: - materialize_ops are greater than cancel_ops, so removing transpose will instead end up increasing the count of transposes - removing a transpose op can only be successful, if all of its cancel ops are removed, removing all the cancel ops is only successful if all of their starting transpose ops are removed and so on. 
This check is also done in "_verify_cancellable_transposes()" """ self._verify_cancellable_transposes() # apply transform for transpose_op in self.transpose_op_to_cancel_ops: self._remove_transpose_ops(transpose_op) self.block.set_outputs([sink_op.x for sink_op in self.output_sink_ops]) self.block.remove_ops(list(self.output_sink_ops)) if self._DEBUG: graphviz.Source( self.block.get_dot_string( highlight_debug_op_names=[], highlight_debug_op_types=["transpose"] ) ).view(filename="/tmp/block_after_reduce_transpose") logger.debug("Block after optimize transpose transform:\n{}".format(self.block)) for op in self.block.operations: op.type_value_inference(overwrite_output=True) @staticmethod def register_axis_update_op(ops: List[Text]): """ :param ops: Ops that will be registered. For example: the class "_TransformReduceMean" can be used to register ops including "reduce_prod", "reduce_sum" etc. """ def class_wrapper(op_update_cls): for op_type in ops: if op_type in _TransposeOptimization._AXIS_UPDATE_OPS: raise ValueError( "Update class for op of type '{}' already defined".format(op_type) ) _TransposeOptimization._AXIS_UPDATE_OPS[op_type] = op_update_cls return op_update_cls return class_wrapper @staticmethod def _get_input_vars(op, only_nonconst_vars=False) -> List[Var]: input_vars = [] for name, val in op.inputs.items(): if isinstance(val, Var): if only_nonconst_vars: if val.op and val.op.op_type == "const": continue input_vars.append(val) elif isinstance(val, (list, tuple)): for var in val: if not isinstance(var, Var): raise ValueError( f"transpose optimization pass: unrecognized input type of " f"op='{op.name}', input='{name}'" ) if only_nonconst_vars: if var.op and var.op.op_type == "const": continue input_vars.append(var) else: raise ValueError( f"transpose optimization pass: unrecognized input type of " f"op='{op.name}', input='{name}'" ) return input_vars @_TransposeOptimization.register_axis_update_op(ops=["concat"]) class _TransformConcat(TransformAxisUpdateOps): def __init__(self, **kwargs): super(_TransformConcat, self).__init__(**kwargs) self.axis_var = self.op.inputs["axis"] def can_transpose_pass(self): # Check that all non const inputs are of type _LazyTransposeHypotheticalValue. # That they have the same perm value has already been checked before. 
input_vars = _TransposeOptimization._get_input_vars(self.op, only_nonconst_vars=True) for var in input_vars: hypothetical_value = self.var_to_hypothetical_value_dict[var] if not isinstance(hypothetical_value, _LazyTransposeHypotheticalValue): return False if self.axis_var.val is not None: return True return False def update(self): new_axis_val = self.transpose_axes[self.axis_var.val] # to be used, if there is a constant inputs to the concat op self._update_const_inputs() # insert a new constant for the new axis, JUST before the op with self.op.enclosing_block: new_axis_var = mb.const(val=new_axis_val, before_op=self.op) self.op.enclosing_block.replace_uses_of_var_after_op( anchor_op=new_axis_var.op, end_op=self.op, old_var=self.axis_var, new_var=new_axis_var, no_check_var_types=True, ) def _update_const_inputs(self): transpose_perm_for_const = [0] * len(self.transpose_axes) for i, axis in enumerate(self.transpose_axes): transpose_perm_for_const[axis] = i # if there is a constant input, transpose it inputs = list(self.op.inputs["values"]) for input_var in inputs: if input_var.op.op_type == "const": const_val = input_var.val new_const_val = np.transpose(const_val, transpose_perm_for_const) # insert a new constant JUST before the op with self.op.enclosing_block: new_const_input_var = mb.const(val=new_const_val, before_op=self.op) self.op.enclosing_block.replace_uses_of_var_after_op( anchor_op=new_const_input_var.op, end_op=self.op, old_var=input_var, new_var=new_const_input_var, no_check_var_types=True, ) @_TransposeOptimization.register_axis_update_op(ops=["split"]) class _TransformSplit(_TransformConcat): def __init__(self, **kwargs): super(_TransformSplit, self).__init__(**kwargs) # The split op is handled the same as the concat op, except it does not need # to transform const inputs def _update_const_inputs(self): pass @_TransposeOptimization.register_axis_update_op(ops=["pad"]) class _TransformPad(TransformAxisUpdateOps): def __init__(self, **kwargs): super(_TransformPad, self).__init__(**kwargs) self.pad_var = self.op.inputs["pad"] self.pad_op = self.pad_var.op self.mode = self.op.mode.val self.pad_amounts_new = None def _compute_new_pad_values(self): pad_amounts = np.reshape(self.pad_var.val, [-1, 2]) rank_diff = len(self.transpose_axes) - pad_amounts.shape[0] self.pad_amounts_new = copy.deepcopy(pad_amounts) # append "rank_diff" rows of zeros to the top self.pad_amounts_new = np.concatenate( (np.zeros((2 * rank_diff)).reshape(-1, 2), self.pad_amounts_new) ) self.pad_amounts_new = self.pad_amounts_new.astype(pad_amounts.dtype) pad_amounts = np.concatenate((np.zeros((2 * rank_diff)).reshape(-1, 2), pad_amounts)) for i, axis in enumerate(self.transpose_axes): self.pad_amounts_new[axis][0] = pad_amounts[i][0] self.pad_amounts_new[axis][1] = pad_amounts[i][1] # get the top "rank_diff" rows top_rows = self.pad_amounts_new[:rank_diff, :] if not np.all(top_rows == 0): return False # cut "rank_diff" from the top self.pad_amounts_new = self.pad_amounts_new[rank_diff:, :] self.pad_amounts_new = self.pad_amounts_new.flatten() return True def can_transpose_pass(self): if ( len(_TransposeOptimization._get_input_vars(self.op, only_nonconst_vars=True)) != 1 or self.pad_op.op_type != "const" ): return False if len(self.transpose_axes) < 2: return False if not self._compute_new_pad_values(): return False # check that if mode is not constant, the updated padding # would stay limited to last 2 axes if self.mode != "constant" and not np.all(self.pad_amounts_new[:-4] == 0): return False return True def 
update(self): self._compute_new_pad_values() # insert a new constant for pad val, JUST before the op with self.op.enclosing_block: new_pad_var = mb.const(val=self.pad_amounts_new, before_op=self.op) self.op.enclosing_block.replace_uses_of_var_after_op( anchor_op=new_pad_var.op, end_op=self.op, old_var=self.pad_var, new_var=new_pad_var, no_check_var_types=True, ) @_TransposeOptimization.register_axis_update_op( ops=[ "reduce_l1_norm", "reduce_l2_norm", "reduce_max", "reduce_log_sum", "reduce_log_sum_exp", "reduce_mean", "reduce_min", "reduce_prod", "reduce_sum", "reduce_sum_square", ] ) class _TransformReduceMean(TransformAxisUpdateOps): def __init__(self, **kwargs): super(_TransformReduceMean, self).__init__(**kwargs) self.axes_var = self.op.inputs["axes"] self.axes_op = self.axes_var.op def can_transpose_pass(self): # allow transpose to push through it only if keep_dims are True since that doesn't change the rank if self.op.inputs["keep_dims"].val: if self.axes_op.op_type == "const": return True return False def update(self): # update axis of the op old_axes_val = self.axes_var.val new_axes_val = [0] * len(old_axes_val) for i, axis in enumerate(old_axes_val): new_axes_val[i] = self.transpose_axes[axis] # insert a new constant for the axis, JUST before the op with self.op.enclosing_block: new_axis_var = mb.const(val=new_axes_val, before_op=self.op) self.op.enclosing_block.replace_uses_of_var_after_op( anchor_op=new_axis_var.op, end_op=self.op, old_var=self.axes_var, new_var=new_axis_var, no_check_var_types=True, ) @_TransposeOptimization.register_axis_update_op( ops=["add", "mul", "sub", "real_div", "maximum", "minimum"] ) class _TransformAdd(TransformAxisUpdateOps): def __init__(self, **kwargs): super(_TransformAdd, self).__init__(**kwargs) # self.tranpose_input: this is the input coming from an upstream transpose op. If both inputs are # connected to an upstream transpose, this will be set to one of those # self.other_input: the other input, that is not coming from a transpose is_x_input_lazy_transpose = isinstance( self.var_to_hypothetical_value_dict[self.op.x], _LazyTransposeHypotheticalValue ) is_y_input_lazy_transpose = isinstance( self.var_to_hypothetical_value_dict[self.op.y], _LazyTransposeHypotheticalValue ) if is_x_input_lazy_transpose and is_y_input_lazy_transpose: self.other_input = None self.tranpose_input = self.op.x elif is_y_input_lazy_transpose and not is_x_input_lazy_transpose: self.other_input = self.op.x self.tranpose_input = self.op.y elif is_x_input_lazy_transpose and not is_y_input_lazy_transpose: self.other_input = self.op.y self.tranpose_input = self.op.x else: # we should not be here since this class is only invoked, # when there is at least one input var of type _LazyTransposeHypotheticalValue self.tranpose_input = None self.other_input = None def can_transpose_pass(self): """ Return True if the one of the following is true: - (scenario 1) both inputs are of type _LazyTransposeHypotheticalValue, with the same perm value - one input is of type _LazyTransposeHypotheticalValue and the other satisfies one of the following: - (scenario 2) it is constant. In this case, the constant can be updated accordingly to allow the transpose to pass through - (scenario 3) if its non constant, then all of the following must be true - its shape is fully defined - the transpose compliment operation on the other input can be expressed via a reshape. 
This can be done if there is only 1 non unit dimension in its shape, or if there are more than 1 non unit dims, the transpose compliment operation only permutes the unit dimensions. In scenario 3, the transpose will be removed, by adding an extra static reshape. This is based on the assumption that a static reshape op will be less expensive than transpose. An example of scenario 3 is displayed below: Input pattern: (shape=(10, 20, 30)) | | V Transpose op (shape = (20, 30, 10)) | | V this op <--------- (shape = (10,)) (other non const input) | V After transpose passes through: (shape=(10, 20, 30)) | | V this op <--------- (shape = (10, 1, 1)) Reshape op <---------- (shape = (10,)) (other non const input) | V Transpose op (shape = (20, 30, 10)) | V """ # --------------------- # check for scenario 1 # -------------------- # are both inputs _LazyTransposeHypotheticalValue? if self.other_input is None: return True # --------------------- # check for scenario 2 # -------------------- # is the second input a constant? rank = len(self.tranpose_input.shape) if len(self.transpose_axes) != rank: return False other_input_shape = self.other_input.shape if any_symbolic(other_input_shape): return False if len(other_input_shape) > rank: return False if isinstance(self.other_input.val, (np.ndarray, np.generic)): return True # --------------------- # check for scenario 3 # -------------------- # can other input be "reshaped" to allow the transpose to pass through? if any_symbolic(self.other_input.shape): return False transpose_compliment_perm = self._find_transpose_compliment(self.transpose_axes) # make the rank of the other input, same as that of the transpose input, # by broadcasting if len(other_input_shape) < rank: other_input_shape = [1] * (rank - len(other_input_shape)) + list(other_input_shape) # how many non unit dimensions in the other input's shape? 
if other_input_shape.count(1) in [rank, rank - 1]: # 0 or 1 non unit dimension return True else: # more than 1 non unit dimensions in other input # check if transpose is moving only dimensions that have values 1 # if true, then the transpose compliment can be expressed via a reshape for i, axis in enumerate(transpose_compliment_perm): if i != axis and other_input_shape[axis] != 1: return False return True def update(self): # ---------------------- # update for scenario 1 # ---------------------- if self.other_input is None: # nothing to update return # -------------------------- # update for scenario 2 & 3 # -------------------------- if len(self.other_input.shape) == 0: # other input is a scalar, no need to modify it return # broadcast the shape of other input to match the rank rank = len(self.tranpose_input.shape) other_input_shape = self.other_input.shape if len(other_input_shape) < rank: other_input_shape = [1] * (rank - len(other_input_shape)) + list(other_input_shape) # find new shape after transpose compliment transpose_compliment_perm = self._find_transpose_compliment(self.transpose_axes) new_shape = [0] * rank for i, axis in enumerate(transpose_compliment_perm): new_shape[i] = other_input_shape[axis] if self.other_input.val is not None: # update the const (scenario 2) const_value = self.other_input.val new_const_val = np.transpose( const_value.reshape(other_input_shape), transpose_compliment_perm ) # insert a new constant JUST before the op with self.op.enclosing_block: new_const_var = mb.const(val=new_const_val, before_op=self.op) self.op.enclosing_block.replace_uses_of_var_after_op( anchor_op=new_const_var.op, end_op=self.op, old_var=self.other_input, new_var=new_const_var, no_check_var_types=True, ) else: # insert a reshape (scenario 3) with self.op.enclosing_block: new_other_var = mb.reshape(x=self.other_input, shape=new_shape, before_op=self.op) self.op.enclosing_block.replace_uses_of_var_after_op( anchor_op=new_other_var.op, end_op=self.op, old_var=self.other_input, new_var=new_other_var, no_check_var_types=True, ) @register_pass(namespace="common") class reduce_transposes(AbstractGraphPass): """ Reduce transposes when it is applicable. For example: .. code-block:: # Example 1 Input graph: input -----> transpose(axis=[1,0]) -----> transpose(axis=[1,0]) ---> out Output graph: input -----> identity -----> out # Example 2 Input graph: input---->transpose(axis=[0,3,1,2])---->relu---->transpose(axis=[0,2,3,1])--->out Output graph: input----->relu----->out # Example 3 Input graph: input(shape=10,2,3,5)--->transpose(axis=[0,2,3,1])----->relu---->pool----->out1 | | --->relu----->log---->transpose(axis=[0,3,1,2])---->out2 Output graph: input(shape=10,2,3,5)----->relu---->transpose(axis=[0,2,3,1])---->pool----->out1 | | --->relu----->log---->out2 Please see ``TransposeOptimizationPass`` for more details. Notes ----- This pass is divided into 3 phases: `1st phase:` Information gathering. - Plug in Identity ops for all output nodes. This allows us to treat all ops uniformly during traversal. - Block is traversed in the topological order, starting from the ops connected to the inputs. - During the traversal, a value is associated with every var in the block. This value can be either of type ``_HypotheticalValue`` or ``_LazyTransposeHypotheticalValue``. The main purpose of type ``_HypotheticalValue`` is to indicate that it is `not` of type ``_LazyTransposeHypotheticalValue``. - ``_LazyTransposeHypotheticalValue`` represents either one or multiple transpose ops with the same perm value. 
This information is stored in this class. It also wraps a ``_HypotheticalValue`` that was the last hypothetical value which was generated prior to the origin of ``_LazyTransposeHypotheticalValue``. - Each op decides which type of hypothetical value to associate with its output vars, based on its op type, attributes, and the types of the hypothetical values of its input vars. - Ops are classified into 4 categories: `unary like`, `axis update`, `transpose`, and `materialize` (for all the rest). - Transpose ops are the ops from which a ``_LazyTransposeHypotheticalValue`` originate. - If the input to it is a ``_HypotheticalValue``, its output will be a ``_LazyTransposeHypotheticalValue``, indicating that this ``transpose`` op is available to get cancelled downstream. - If the input to it is a ``_LazyTransposeHypotheticalValue``, then it is checked whether this op cancels it or not. - If the op cancels it, a ``_HypotheticalValue`` value is generated at the output and the information about this ``transpose`` cancellation is recorded in the dictionary ``transpose_op_to_cancel_ops``. - If the op does not cancel, the current ``transpose`` op is categrorized as a `materialize` op. Therefore, the information in dictionary ``transpose_op_to_materialize_ops`` is updated accordingly. The output of the op is now mapped to a ``_HypotheticalValue``. - Unary like ops: These simply transfer their input hypothetical value type to the output. - Axis update ops: If a ``transpose`` can pass through them, they are treated like a unary op and the dictionary ``transpose_op_to_axis_update_ops`` is updated. If the op cannot be updated in any manner to allow a ``transpose`` to pass through, this op is then categorized as a `materialize` op and handled accordingly. - Materialize ops: All ``_LazyTransposeHypotheticalValue`` input vars, if present, materialize here. Output of this op is always of type ``_HypotheticalValue``. If the input is a ``_LazyTransposeHypotheticalValue``, update the dictionary ``transpose_op_to_materialize_ops``. - To treat an op like a unary op, add its type to ``_UNARY_LIKE_OP_TYPES``. In future changes we want to make this process automatic by detecting an op as a `unary like` by its "traits". - To treat an op like `axis update` op, add a class specific to the op implementing the class ``TransformAxisUpdateOps``. For examples, see classes ``_TransformConcat``, ``_TransformPad``, and so on. The dictionary ``AXIS_UPDATE_OPS`` is automatically filled in by the decorator ``_TransposeOptimization.register_axis_update_op``. `2nd phase:` Determining which ``transpose`` ops to remove from the graph. All ``transpose`` ops that have a corresponding compliment op in dict ``transpose_op_to_cancel_ops`` is a candidate. However, you need to ensure the following: - If a ``transpose`` op is removed, then all of its ``cancel`` ops in ``transpose_op_to_cancel_ops`` must also be removed, to ensure correctness of the graph. The same is true in the reverse direction as well; that is, for every ``cancel`` op that is removed, all its parent ``transpose`` ops upstream must also be removed. - ``transpose`` ops should be removed only if the number of ``cancel`` ops is greater than the number of ``transpose`` ops that would get freshly introduced to the block as a result of materialization ops. Currently in the algorithm, each materialization op/output var (dicts ``transpose_op_to_materialize_ops``/``old_output_vars``) results in one more ``transpose`` op, although this can be further optimized in the future. 
To resolve this, we recognize that nodes consisting of sets ``(a)`` and ``(b)`` form a bipartitle graph, where, ``(a) ==`` starting ``transpose`` ops (originators of ``_LazyTransposeHypotheticalValue``) and ``(b) ==`` set of ``transpose`` ``cancel`` ops and ``materialize`` ops. - In this bipartite graph, we find all the connected components for each connected component. Either the entire set of ``transpose`` ops in it are removed/materialized, or none of them are touched. - Thus for each set, a determination is made based on counting the number of ``cancel`` ops and ``materialize`` ops. - Based on this determination, the final set of ``transpose`` ops to be removed is updated. `3rd phase:` Transforming the graph. - ``transpose`` starting ops and the ``cancel`` ops are removed. - Axis update ops, affected by these ``transpose`` ops, are updated. - Transposes are materialized; that is, added just before the ``materialize`` ops, which are linked to the starting ``transpose`` ops. The starting ``transpose`` op can be materialized (inserted) multiple times, before each of the ``materialize`` ops downstream. - Block outputs are handled in a similar fashion as the `materialize` ops. - Type inference on all ops is invoked after all the transformations. - All Identity ops that are plugged into the graph to treat outputs as materialized are removed. `Debugging` If the ``debug`` flag is set to ``True``, the block before and after the transformation is plotted, with transpose nodes highlighted. """ def apply(self, prog): for f in prog.functions.values(): self._reduce_transposes_block(f) @staticmethod def _reduce_transposes_block(block): """ Only apply the optimization if the block is flat, i.e, it does not contain any op which contains a sub-block. TODO: Removing transposes and transpose compliments requires re-running type inference for the set of ops in between the fused transpose ops, which is simpler to do when all the ops in the block are free of sub blocks. The case of transpose fusion with sub-block containing ops needs to be handled with more care and test cases. """ for op in block.operations: if len(op.blocks) > 0: return with block: opt_transposes = _TransposeOptimization(block) opt_transposes.block_traversal() opt_transposes.apply_transform() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/optimize_state.py0000644000000000000000000001661014672066616026432 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import List from coremltools.converters.mil.mil import Block from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Operation, Program from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass from coremltools.converters.mil.mil.scope import ScopeInfo @register_pass(namespace="common") class canonicalize_inplace_pattern(AbstractGraphPass): """ As a functional-graph framework, Core ML represents in-place operation as .. 
code-block:: read_state -> functional operation -> write_state Due to the non-uniqueness of topological order, in the list representation of ops, ``write_state`` can be anywhere after the functional op. We prefer the canonical order, i.e., have ``write_state`` immediately follow the functional op. In practice: 1. In PyMIL, we do not use the ``write_state`` op. Instead, we use ``coreml_update_state``, which is the composition of ``write_state -> read_state``. 2. The ``read_state`` op does not matter in the pattern match and transform. So we will match .. code-block:: functional operation -> coreml_update_state then reorder the ``coreml_update_state``. For example .. code-block:: Given: mul = mul(state, x) add = add(mul, y) update = coreml_update_state(state, mul) Return: mul = mul(state, x) update = coreml_update_state(state, mul) add = add(mul, y) """ def apply(self, prog: Program) -> None: for f in prog.functions.values(): self._apply_block(f) @block_context_manager def _apply_block(self, block: Block) -> None: block_operation_list = list(block.operations) for op in block_operation_list: # general boilerplate: special case when op manipulates block if op.enclosing_block is None: continue for b in op.blocks: self._apply_block(b) # Although the downstream iterator (op list) gets changed, the change is only in # the ``coreml_update_state`` op, which cannot be the pattern start and will return quickly, # so there is no need to break and iterate self._try_match_and_transform_pattern(op, block, block_operation_list) def _try_match_and_transform_pattern( self, op: Operation, block: Block, block_operation_list: List[Operation] ) -> None: # the state op itself is irrelevant if op.op_type in ("read_state", "coreml_update_state"): return coreml_update_state_ops = self._try_find_child_coreml_update_state_ops(op) for coreml_update_state_op in coreml_update_state_ops: before_op = block_operation_list[block_operation_list.index(op) + 1] scopes = self._construct_scope_info_list_from_op_scopes(op) with mb.scope(*scopes): immediate_coreml_update_state = mb.coreml_update_state( state=coreml_update_state_op.state, value=coreml_update_state_op.value, before_op=before_op, ) # We need to eliminate dead code here, # because our dead code elimination graph pass does not work for coreml_update_state if block.try_replace_uses_of_var_after_op( anchor_op=coreml_update_state_op, old_var=coreml_update_state_op.outputs[0], new_var=immediate_coreml_update_state, ): block.remove_ops([coreml_update_state_op]) @staticmethod def _try_find_child_coreml_update_state_ops(op: Operation) -> List[Operation]: coreml_update_state_ops = [] for output in op.outputs: for child_op in output.child_ops: if child_op.op_type == "coreml_update_state": coreml_update_state_ops.append(child_op) return coreml_update_state_ops @staticmethod def _construct_scope_info_list_from_op_scopes(op: Operation) -> List[ScopeInfo]: scope_info_list = [] for source, data in op.scopes.items(): scope_info_list.append(ScopeInfo(source=source, data=data)) return scope_info_list @register_pass(namespace="common") class prefer_state_in_downstream(AbstractGraphPass): """ As a functional-graph framework, Core ML represents in-place operation as .. code-block:: read_state -> functional operation -> write_state When the output of the in-place operation is used downstream, there are 2 possible patterns: one reuses state memory .. code-block:: read_state -> functional operation -> write_state -> read_state -> ... the other wastes memory for keeping functional output ..
code-block:: |-> write_state read_state -> functional operation -| |-> ... We prefer the reuse-state one In practice 1. In PyMIL, we do not use ``write_state`` op. Instead, we use ``coreml_update_state``, which is the composition of ``write_state -> read_state`` 2. With canonical inplace pattern (guaranteed by graph pass ``canonicalize_inplace_pattern``), simply replace the usage of functional output with ``coreml_update_state`` output is enough For example .. code-block:: Given: mul = mul(state, x) update = coreml_update_state(state, mul) add = add(mul, y) Return: mul = mul(state, x) update = coreml_update_state(state, mul) add = add(update, y) """ def apply(self, prog: Program) -> None: for f in prog.functions.values(): self._apply_block(f) @block_context_manager def _apply_block(self, block: Block) -> None: for op in list(block.operations): # general boilterplate: special case when op manipulates block if op.enclosing_block is None: continue for b in op.blocks: self._apply_block(b) self._try_match_and_transform_pattern(op, block) def _try_match_and_transform_pattern(self, op: Operation, block: Block) -> None: if op.op_type == "coreml_update_state": # if the var is both blck input and output, we should not replace it if op.value in block.outputs and op.value in block.inputs.values(): return other_child_ops = [val for val in op.value.child_ops if val != op] # if the var doesn't feed into any other op, this pass should do nothing if len(other_child_ops) == 0: return # if the var only feeds into coreml_update_state ops, this pass should do nothing if all([val.op_type == "coreml_update_state" for val in other_child_ops]): return block.try_replace_uses_of_var_after_op( anchor_op=op, old_var=op.value, new_var=op.outputs[0], ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/optimize_tensor_operation.py0000644000000000000000000010005714672066616030703 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil._deployment_compatibility import AvailableTarget from coremltools.converters.mil.frontend._utils import value_at from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.block import is_current_opset_version_compatible_with from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import ( _check_child_op_type, _check_var_scalar_value, block_context_manager, ) from coremltools.converters.mil.mil.passes.pass_registry import register_pass from coremltools.converters.mil.mil.types.symbolic import any_symbolic @register_pass(namespace="common") class fuse_squeeze_expand_dims(AbstractGraphPass): """ Detect the pattern ``input-->squeeze-->expand_dims``, and fuse them into an ``identity`` op if ``squeeze`` and ``expand_dims`` cancel out each other. Note that, the ``identity`` can be further removed by ``noop_elimination``. .. 
code-block:: Given: %x[3, 1, 4, 1] %1[3, 4] = squeeze(%x, axes=[1, 3]) %2[3, 1, 4, 1] = expand_dims(%1, axes=[1, 3]) %3 = op(%2) Result: %x[3, 1, 4, 1] %2[3, 1, 4, 1] = identity(%x) %3 = op(%2) """ def apply(self, prog): for f in prog.functions.values(): block_changed = True while block_changed: block_changed = self.fuse_squeeze_expand_dims_block(f) @block_context_manager def fuse_squeeze_expand_dims_block(self, block): fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: block_changed = True while block_changed: block_changed = self.fuse_squeeze_expand_dims_block(b) if len(op.blocks) > 0: continue squeeze_op = self._match_pattern(op) if squeeze_op is not None: if self._try_to_transform(squeeze_op, block): fusion_occurred = True return fusion_occurred @staticmethod def _match_pattern(op): if op.op_type != "squeeze": return None if not _check_child_op_type(op, "expand_dims"): return None return op @staticmethod def _try_to_transform(op, block): expand_dims_op = op.outputs[0].child_ops[0] x = op.x out_var = expand_dims_op.outputs[0] if x.shape != out_var.shape: return False if op.outputs[0] in block.outputs: return False new_var = mb.identity(x=x, before_op=op) if op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=expand_dims_op, old_var=out_var, new_var=new_var, ): # Remove all the ops at once block.remove_ops([op, expand_dims_op]) return True return False @register_pass(namespace="common") class expand_high_rank_reshape_and_transpose(AbstractGraphPass): """ Detect the pattern ``reshape_1-->transpose-->reshape_2``, where ``reshape_1`` has an output tensor with ``rank >= 6``, and ``reshape_2`` produces a tensor with ``rank <= 5``. In general, we can expand this pattern into a sequence of rank 4 ``reshape`` and ``transpose`` ops, which is supported by the Core ML runtime. .. code-block:: Given: %1 = reshape(%x, shape=(d1, d2, d3, d4, ..., dn)) %2 = transpose(%1, perm=(p1, p2, ..., pn)) %3 = reshape(%2, shape=(o1, o2, o3, o4, o5)) Result: %t1 = reshape(%x, shape=(y11, y12, y13, y14)) %h1 = transpose(%t1, perm=[0, 2, 1, 3]) %t2 = reshape(%h1, shape=(y21, y22, y23, 214)) %h2 = transpose(%t2, perm=[0, 2, 1, 3]) .... 
%hn = transpose(%tn, perm=[0, 2, 1, 3]) %3 = reshape(%hn, shape=(o1, o2, o3, o4, o5)) """ def apply(self, prog): for f in prog.functions.values(): block_changed = True while block_changed: block_changed = self.expand_high_rank_reshape_and_transpose_block(f) @staticmethod def _match_pattern(op): # We are detecting the # reshape(>= rank 6) -> transpose -> reshape(<= rank 5) pattern ops = [op] if op.op_type != "reshape": return None if op.outputs[0].rank <= 5: return None if any_symbolic(op.outputs[0].shape): return None if not _check_child_op_type(op, "transpose"): return None transpose_op = op.outputs[0].child_ops[0] ops.append(transpose_op) if not _check_child_op_type(transpose_op, "reshape"): return None reshape_op = transpose_op.outputs[0].child_ops[0] ops.append(reshape_op) if reshape_op.outputs[0].rank >= 6: return None for candidate_op in ops[:-1]: if candidate_op.outputs[0] in op.enclosing_block.outputs: return None return ops @staticmethod def _try_to_transform(ops, block): def _get_prod(start, end, arr, skip_indices): res = 1 for i in range(start, end): if i in skip_indices: continue res *= arr[i] return res reshape_op, transpose_op, last_reshape_op = ops[0], ops[1], ops[2] original_shape = reshape_op.outputs[0].shape original_perm = transpose_op.perm.val.tolist() # Group the consecutive axes in the perm, sometimes this could directly lower the # rank under 6. # # For instance: # # reshape = mb.reshape(x=x, shape=[1, 2, 3, 4, 5, 6]) # transpose = mb.transpose(x=reshape, perm=[4, 5, 3, 2, 0, 1]) # output = mb.reshape(x=transpose, shape=[6, 20, 6]) # # Have 4 groups of axes: [4, 5], [3], [2], [0, 1] # We can transform the ops to # # new_reshape = mb.reshape(x=x, shape=[1*2, 3, 4, 5*6]) # new_transpose = mb.transpose(x=reshape, perm=[3, 2, 1, 0]) # output = mb.reshape(x=new_transpose, shape=[6, 20, 6]) # # Note that, the output of new_transpose have different rank than transpose, # however, they have the same data layout, so the final output is still unchanged. group_axes = [] i = 0 res = [] for i in range(len(original_perm)): if i > 0 and original_perm[i] == original_perm[i-1] + 1: res.append(original_perm[i]) else: if len(res) > 0: group_axes.append(res) res = [original_perm[i]] if i == len(original_perm) - 1: group_axes.append(res) group_shape = [] for axes in group_axes: start, end = axes[0], axes[-1] + 1 group_shape.append(_get_prod(start, end, original_shape, set())) start_group_axis = [axes[0] for axes in group_axes] group_axis_order = np.argsort(start_group_axis) shape = np.array(group_shape)[group_axis_order].tolist() sorted_start_group_axis = np.sort(start_group_axis).tolist() perm = [sorted_start_group_axis.index(i) for i in start_group_axis] rank = len(perm) x = reshape_op.x if rank < 6: # If the intermediate tensors have rank < 6, # we can directly use them to replace the original pattern x = mb.reshape(x=x, shape=shape, before_op=reshape_op) x = mb.transpose(x=x, perm=perm, before_op=reshape_op) else: # Otherwise, we need to expand the rank-N tensor into N reshape, and N transpose ops. # Note that all intrermediate tensors have rank 4. 
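# --- Illustrative sketch (not part of this pass; plain numpy, shapes taken from the comment above) ---
# Grouping consecutive axes of the perm preserves the flattened data layout, which is
# why the rank-6 reshape/transpose above can be replaced by a lower-rank one whenever
# the grouped rank drops below 6.
import numpy as np

data = np.arange(720)
# original pattern: rank-6 reshape -> transpose -> low-rank reshape
ref = np.transpose(data.reshape(1, 2, 3, 4, 5, 6), (4, 5, 3, 2, 0, 1)).reshape(6, 20, 6)
# grouped axes of perm [4,5,3,2,0,1] are [4,5], [3], [2], [0,1], giving shape
# [2, 3, 4, 30] and perm [3, 2, 1, 0] -- the same layout at rank 4
low = np.transpose(data.reshape(2, 3, 4, 30), (3, 2, 1, 0)).reshape(6, 20, 6)
assert np.array_equal(ref, low)
# When grouping alone cannot bring the rank under 6, the rank-4 expansion described
# next is used instead.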
# # The algorithm is as followed: # # reshape shape: [d_1, d_2, ..., d_n] # transpose perm: [p_1, p_2, ..., p_n] # # reshape to [1, d_1*d_2*...*d_(p_1-1), d_(p_1), d_(p_1+1)*...*d_n] # transpose to [1, d_(p_1), d_1*d_2*...*d_(p_1-1), d_(p_1+1)*...*d_n] # # reshape to [d_(p_1), d_1*d_2*...*d_(p_2-1), d_(p_2), d_(p_2+1)*...*d_n] # transpose to [d_(p_1), d_(p_2), d_1*d_2*...*d_(p_2-1), d_(p_2+1)*...*d_n] # # reshape to [d_(p_1)*d_(p_2), d_1*d_2*...*d_(p_3-1), d_(p_3), d_(p_3+1)*...*d_n] # .... # so on and so forth leading_dim = 1 memo = set() for i in range(rank): axis = perm[i] dim = shape[axis] memo.add(axis) reshape_shape = [ leading_dim, _get_prod(0, axis, shape, memo), dim, _get_prod(axis + 1, rank, shape, memo) ] x = mb.reshape(x=x, shape=reshape_shape, before_op=reshape_op) x = mb.transpose(x=x, perm=[0, 2, 1, 3], before_op=reshape_op) leading_dim *= dim x = mb.reshape(x=x, shape=last_reshape_op.shape.val, before_op=reshape_op) if reshape_op.enclosing_block.try_replace_uses_of_var_after_op( anchor_op=reshape_op, old_var=last_reshape_op.outputs[0], new_var=x, ): # Remove all the ops at once block.remove_ops(ops) return True return False @block_context_manager def expand_high_rank_reshape_and_transpose_block(self, block): fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: block_changed = True while block_changed: block_changed = self.expand_high_rank_reshape_and_transpose_block(b) if len(op.blocks) > 0: continue ops = self._match_pattern(op) if ops is not None: if self._try_to_transform(ops, block): fusion_occurred = True return fusion_occurred @register_pass(namespace="common") class concat_to_pixel_shuffle(AbstractGraphPass): """ Identify nested, interleaved ``concat`` ops which can be replaced by a single ``concat`` and a `pixel shuffle` layer. This pattern occurs with the faster up-convolution from the FCRN model (Laina et al., 2016). .. code-block:: # Before the concat_to_pixel_shuffle pass. input(N, C, H, W) ------------------- | v input(N, C, H, W) -----> concat(axis=2, interleave=True) -----> concat(axis=3, interleave=True) ----> output ^ | input(N, C, H, W) -----> concat(axis=2, interleave=True) -------------------- | ^ | | input(N, C, H, W) ------------------- # After the concat_to_pixel_shuffle pass. 
input(N, C, H, W) --------------- | v input(N, C, H, W) -----> concat(axis=1, interleave=True) -----> pixel_shuffle(upscale_factor=2) ----> output ^ | input(N, C, H, W) --------------| | | input(N, C, H, W) --------------- """ def apply(self, prog): for f in prog.functions.values(): self._concat_to_pixel_shuffle_block(f) @staticmethod def _match_pattern(op): # Identify if this is an op we can transform if op.op_type != "concat": return None w_concat = op if w_concat.inputs["values"][0].rank != 4: return None if w_concat.inputs["axis"].val != 3: return None if not w_concat.inputs["interleave"].val: return None inputs = list(w_concat.inputs["values"]) if len(inputs) != 2: return None if not inputs[0].op or not inputs[1].op: return None if inputs[0].op.op_type != "concat" or inputs[1].op.op_type != "concat": return None h_concat_0 = inputs[0].op if not h_concat_0.inputs["interleave"].val: return None h_concat_0_inputs = list(h_concat_0.inputs["values"]) if len(h_concat_0_inputs) != 2: return None h_concat_1 = inputs[1].op if not h_concat_1.inputs["interleave"].val: return None h_concat_1_inputs = list(h_concat_1.inputs["values"]) if len(h_concat_1_inputs) != 2: return None if h_concat_0.inputs["axis"].val != 2 or h_concat_1.inputs["axis"].val != 2: return None return w_concat, h_concat_0, h_concat_1 @staticmethod def _replace_ops(block, w_concat, h_concat_0, h_concat_1): h_concat_0_inputs = list(h_concat_0.inputs["values"]) h_concat_1_inputs = list(h_concat_1.inputs["values"]) all_inputs = [ h_concat_0_inputs[0], h_concat_1_inputs[0], h_concat_0_inputs[1], h_concat_1_inputs[1], ] # Concatenate all 4 inputs on the channel axis x = mb.concat(values=all_inputs, axis=1, before_op=h_concat_0, interleave=True) # Shuffle into place x = mb.pixel_shuffle(x=x, upscale_factor=2, before_op=h_concat_0) w_concat.enclosing_block.replace_uses_of_var_after_op( anchor_op=h_concat_0, old_var=w_concat.outputs[0], new_var=x ) block.remove_ops([w_concat, h_concat_0, h_concat_1]) @block_context_manager def _concat_to_pixel_shuffle_block(self, block): for op in list(block.operations): layers = self._match_pattern(op) if layers: self._replace_ops(block, layers[0], layers[1], layers[2]) @register_pass(namespace="common") class detect_concat_interleave(AbstractGraphPass): """ Detect the pattern ``concat-->reshape--->transpose--->reshape``, where ``concat`` is along the channel axis ``(axis=-3)``, and map this pattern to the ``concat`` with ``interleave`` op. This pattern occurs, for example, in the ``shufflenet`` model in ``torchvision``. .. 
code-block:: Given: %3 = concat(%1.a, %1.b, ..., axis=-3, interleave=False) #shape = (B, n*C, H, W) %4 = reshape(%3) #shape = (B, n, C, H, W) %5 = transpose(%4, perm=[0, 2, 1, 3, 4]) # shape = (B, C, n, H, W) %6 = reshape(%5) # shape = (B, C*n, H, W) Result: %6 = concat(%1.a, %1.b, ..., axis=-3, interleave=True) """ def apply(self, prog): for f in prog.functions.values(): block_changed = True while block_changed: block_changed = self._fuse_concat_interleave(f) @staticmethod def _match_pattern(op): if op.outputs[0] in op.enclosing_block.outputs: return None if op.op_type == "concat": if op.interleave.val: return None # check that axis is -3 and rank is 4 rank = op.values[0].rank if rank != 4: return None axis = op.axis.val if axis > 0: axis = axis - rank if axis != -3: return None # check that all inputs to concat have fully defined shapes for in_ in op.values: if any_symbolic(in_.shape): return None # check that all inputs to concat have the same shape inshape = list(op.values[0].shape) for v in op.values[1:]: for i in range(rank): if inshape[i] != v.shape[i]: return None # check that this concat is connected to exactly 1 reshape op child_ops = list(op.outputs[0].child_ops) if len(child_ops) == 1: if list(child_ops)[0].op_type == "reshape": return op return None @staticmethod def _try_to_transform(concat_op, add_op, block): all_ops = [concat_op] B, C, H, W = list(concat_op.values[0].shape) n = len(concat_op.values) # check that reshape shapes the input to (B, n, C, H, W) reshape_op1 = concat_op.outputs[0].child_ops[0] reshape_shape1 = reshape_op1.shape.val if reshape_shape1 is None: return False if not isinstance(reshape_shape1, np.ndarray): return False reshape_shape1 = list(reshape_shape1) if reshape_shape1 != [B, n, C, H, W]: return False all_ops.append(reshape_op1) # check that after reshape is a transpose op with perm=[0, 2, 1, 3, 4] if len(list(reshape_op1.outputs[0].child_ops)) != 1: return False transpose_op = list(reshape_op1.outputs[0].child_ops)[0] if transpose_op.op_type != "transpose": return False perm = transpose_op.perm.val if perm is None: return if list(perm) != [0, 2, 1, 3, 4]: return False all_ops.append(transpose_op) # check that after transpose is another reshape with [B, . 
, H, W] if len(list(transpose_op.outputs[0].child_ops)) != 1: return False reshape_op2 = list(transpose_op.outputs[0].child_ops)[0] if reshape_op2.op_type != "reshape": return False reshape_shape2 = reshape_op2.shape.val if reshape_shape2 is None: return False if not isinstance(reshape_shape2, np.ndarray): return False reshape_shape2 = list(reshape_shape2) if len(reshape_shape2) != 4: return False if [reshape_shape2[0], reshape_shape2[-2], reshape_shape2[-1]] != [B, H, W]: return False all_ops.append(reshape_op2) # check that none of the op in this pattern is connected to the output # (except the last mul op) for i, op in enumerate(all_ops): if i == len(all_ops) - 1: continue for out in op.outputs: if out in block.outputs: return False # add a new concat op out_name = reshape_op2.outputs[0].name x = mb.concat( values=concat_op.values, axis=concat_op.axis.val, interleave=True, name=out_name, before_op=concat_op, ) reshape_op2.enclosing_block.replace_uses_of_var_after_op( anchor_op=reshape_op2, old_var=reshape_op2.outputs[0], new_var=x ) # Remove all the ops at once block.remove_ops(all_ops) return True @block_context_manager def _fuse_concat_interleave(self, block): fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: block_changed = True while block_changed: block_changed = self._fuse_concat_interleave(b) if len(op.blocks) > 0: continue concat_op = self._match_pattern(op) if concat_op is not None: if self._try_to_transform(op, concat_op, block): fusion_occurred = True return fusion_occurred @register_pass(namespace="common") class fuse_onehot_matmul_to_gather(AbstractGraphPass): """ Detect if ``onehot (axis=-1, on_value=1, off_value=0)`` is followed by a ``matmul`` op (no bias). If so, they can be replaced by a ``gather`` op. .. code-block:: Input: %2 = one_hot(%1, on_value=1, off_value=0, axis=-1) %3 = const() # rank 2 %4 = matmul(%2, %3) Output: %4 = gather(%3, %2, axis=0) """ def apply(self, prog): for f in prog.functions.values(): block_changed = True while block_changed: block_changed = self._fuse_onehot_matmul_to_gather_block(f) @staticmethod def _try_to_transform(onehot_op, block): root_var = onehot_op.indices # check that the output of the onehot op is not a block output if onehot_op.outputs[0] in block.outputs: return False # check that onehot op has axis=-1, on_value=1 and off_value=0 # and constant one_hot_vector_size axis = onehot_op.axis.val if axis is None: return False if onehot_op.indices.shape is None: return False rank = len(onehot_op.indices.shape) if axis >= 0: axis -= rank if axis != -1: return False if not _check_var_scalar_value(onehot_op.on_value, 1): return False if not _check_var_scalar_value(onehot_op.off_value, 0): return False if onehot_op.one_hot_vector_size.val is None: return False # checks for the following matmul op if not _check_child_op_type(onehot_op, "matmul"): return False matmul_op = list(onehot_op.outputs[0].child_ops)[0] if matmul_op.x != onehot_op.outputs[0]: return False if matmul_op.transpose_x.val or matmul_op.transpose_y.val: return False W_var = matmul_op.y if W_var.val is None: return False if len(W_var.val.shape) != 2: return False # remove onehot and matmul and replace with gather op if is_current_opset_version_compatible_with(AvailableTarget.iOS17): # IOS17 `gather` requires non-negative indices. 
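# --- Illustrative sketch (not part of this pass; plain numpy, toy sizes) -----------
# Why the rewrite is valid: multiplying a one-hot matrix (on_value=1, off_value=0,
# axis=-1) by a weight table is just a row gather, and a negative index can be made
# non-negative by adding the table size.
import numpy as np

vocab, d = 6, 4
W = np.random.rand(vocab, d)
idx = np.array([0, 2, -1, 5])                    # -1 means "last row"

one_hot = np.eye(vocab)[idx]                     # one_hot(idx, axis=-1, on=1, off=0)
assert np.allclose(one_hot @ W, W[idx])          # matmul == gather(W, idx, axis=0)

nonneg = np.where(idx >= 0, idx, idx + vocab)    # wrap negatives into [0, vocab)
assert np.array_equal(W[nonneg], W[idx])
# The select/add construction below performs exactly this wrap in MIL.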
root_var = mb.select( cond=mb.greater_equal(x=root_var, y=0, before_op=matmul_op), a=root_var, b=mb.add( x=root_var, y=value_at(mb.shape(x=W_var, before_op=matmul_op), 0, before_op=matmul_op), before_op=matmul_op, ), before_op=matmul_op, ) x = mb.gather( x=W_var, indices=root_var, axis=0, name=matmul_op.outputs[0].name, before_op=matmul_op ) matmul_op.enclosing_block.replace_uses_of_var_after_op( anchor_op=matmul_op, old_var=matmul_op.outputs[0], new_var=x ) # Remove all the ops at once block.remove_ops([onehot_op, matmul_op]) return True @block_context_manager def _fuse_onehot_matmul_to_gather_block(self, block): fusion_occurred = False for op in list(block.operations): if op.enclosing_block is None: continue for b in op.blocks: block_changed = True while block_changed: block_changed = self._fuse_onehot_matmul_to_gather_block(b) if len(op.blocks) > 0: # This op can't be pow continue # start pattern match if one_hot op is encountered if op.op_type == "one_hot": if self._try_to_transform(op, block): fusion_occurred = True return fusion_occurred @register_pass(namespace="common") class replace_stack_reshape(AbstractGraphPass): """ A stack followed by a reshape layer can be replaced by a ``concat`` if the reshape simply removes the new axis and doubles the size of one of the axes next to it. If the new axis is reshaped to the "right" (that is, the axis just after it is doubled), then we can use a ``concat``. If it is reshaped to the "left" (the axis just before it is doubled), then the ``concat`` needs to set the ``interleaved`` flag. Examples: .. code-block:: Given: %1 = tensor(1, 5, 3, 4) %2 = tensor(1, 5, 3, 4) %3 = stack((%1,%2), axis=2) # shape = (1, 5, 2, 3, 4) %4 = reshape(%3, shape=[1, 10, 3, 4]) Result: %1 = tensor(1, 5, 3, 4) %2 = tensor(1, 5, 3, 4) %4 = concat((%1,%2), axis=1, interleave=True) # shape = (1, 10, 3, 4) Given: %1 = tensor(1, 5, 3, 4) %2 = tensor(1, 5, 3, 4) %3 = stack((%1, %2), axis=1) # shape = (1, 2, 5, 3, 4) %4 = reshape(%3, shape=[1, 10, 3, 4]) Result: %1 = tensor(1, 5, 3, 4) %2 = tensor(1, 5, 3, 4) %4 = concat((%1, %2), axis = 1) # shape = (1, 10, 3, 4) """ def apply(self, prog): for f in prog.functions.values(): self._replace_stack_reshape_block(f) @staticmethod def _match_operation(stack_op): # Identify if this is an op we can transform if stack_op.op_type != "stack": return None, None child_ops = stack_op.outputs[0].child_ops if len(child_ops) != 1: return None, None if child_ops[0].op_type != "reshape": return None, None stack_axis = stack_op.inputs["axis"] if not stack_axis: return None, None stack_axis_val = stack_axis.val reshape_op = child_ops[0] # Now, op is a stack op followed by a reshape op # So we need to check that the stack really gets eliminated stack_output_rank = len(stack_op.outputs[0].shape) reshape_output_rank = len(reshape_op.outputs[0].shape) if stack_output_rank != (reshape_output_rank + 1): return None, None # Compare the input to stack to the output from reshape # These shapes should differ in either the stack_axis_val place (by a factor of 2), # or in the stack_axis_val-1 place by the same factor input_shape = list(stack_op.inputs["values"][0].shape) concat_axis = [ idx for idx, (x, y) in enumerate(zip(input_shape, reshape_op.outputs[0].shape)) if x != y ] if len(concat_axis) != 1: return None, None concat_axis = concat_axis[0] if input_shape[concat_axis] * 2 != reshape_op.outputs[0].shape[concat_axis]: return None, None if concat_axis != stack_axis_val and concat_axis != stack_axis_val - 1: return None, None return stack_op, reshape_op 
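# --- Illustrative sketch (not part of this pass; plain numpy, toy tensors) ---------
# The two equivalences behind replace_stack_reshape: folding the stacked axis into the
# axis to its left gives an interleaved concat, folding it into the axis to its right
# gives a plain concat.
import numpy as np

a = np.random.rand(1, 5, 3, 4)
b = np.random.rand(1, 5, 3, 4)

# stack on axis=2, then fold the new axis into axis 1 -> interleaved concat on axis 1
interleaved = np.stack((a, b), axis=2).reshape(1, 10, 3, 4)
assert np.array_equal(interleaved[:, 0::2], a)
assert np.array_equal(interleaved[:, 1::2], b)

# stack on axis=1, then fold -> plain concat on axis 1
plain = np.stack((a, b), axis=1).reshape(1, 10, 3, 4)
assert np.array_equal(plain, np.concatenate((a, b), axis=1))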
@staticmethod def _replace_stack_reshape_ops(block, stack_op, reshape_op): stack_axis = stack_op.inputs["axis"] if not stack_axis: return None, None stack_axis_val = stack_axis.val input_shape = list(stack_op.outputs[0].shape) input_shape.pop(stack_axis_val) concat_axis = [ idx for idx, (x, y) in enumerate(zip(input_shape, reshape_op.outputs[0].shape)) if x != y ] if len(concat_axis) != 1: return concat_axis = concat_axis[0] interleave = concat_axis == stack_axis_val - 1 x = mb.concat( values=stack_op.values, axis=concat_axis, before_op=stack_op, interleave=interleave ) reshape_op.enclosing_block.replace_uses_of_var_after_op( anchor_op=stack_op, old_var=reshape_op.outputs[0], new_var=x ) block.remove_ops([stack_op, reshape_op]) @block_context_manager def _replace_stack_reshape_block(self, block): for op in list(block.operations): stack_op, reshape_op = self._match_operation(op) if stack_op: self._replace_stack_reshape_ops(block, stack_op, reshape_op) @register_pass(namespace="common") class use_reflection_padding(AbstractGraphPass): """ Identify a reflection padding layer composed out of `slices` and `concats`. .. code-block:: Input graph: ------------------------------------------------------------------------------------- | | v input(1, 2, 6, 8) ------> slice_by_index(begin=[0, 0, 0, 1], end=[0, 0, 0, 2]) -----> concat(axis=3) ---> out(1, 2, 6, 10) | ^ ----------------> slice_by_index(begin=[0, 0, 0, -2], end=[0, 0, 0, -1]) -------------| Output graph: input(1, 2, 6, 8) -----0> pad(mode=reflect, size=[0, 0, 1, 1]) -----> out(1, 2, 6, 10) """ def apply(self, prog): for f in prog.functions.values(): self._reflection_padding_block(f) @staticmethod def _match_pattern(concat_op, block): if concat_op.op_type != "concat": return False concat_inputs = list(concat_op.inputs["values"]) # There need to be an odd number of inputs, and at least one model has a concat input of # length 1 if len(concat_inputs) % 2 != 1 or len(concat_inputs) == 1: return False # The original input will need to be in the middle of the concatenated inputs original_input = concat_inputs[len(concat_inputs) // 2] axis = None slice_ops_out = [] end_mask = None begin_index = len(concat_inputs) // 2 for slice_op in concat_inputs: # one of the concat inputs is the original input (to the slices) if slice_op == original_input: # We'll now start checking indices from the end begin_index = begin_index - 2 continue slice_op = slice_op.op if not slice_op: return False if slice_op.op_type != "slice_by_index": return False # check that the input to slice op is the original input if slice_op.inputs["x"] != original_input: return False # If the slice is an output if slice_op.outputs[0] in block.outputs: return False if end_mask is None: end_mask = slice_op.inputs["end_mask"].val axis = list(end_mask).index(False, 0, len(end_mask)) if end_mask is None: return False if axis != list(end_mask).index(False, 0, len(end_mask)): return False # Check that we're only taking a slice of size 1 end = slice_op.inputs["end"].val begin = slice_op.inputs["begin"].val if end[axis] - begin[axis] != 1: return False input_shape = original_input.shape # Check that the slices are in order if begin[axis] != begin_index and begin[axis] != begin_index + input_shape[axis]: return False begin_index = begin_index - 1 slice_ops_out.append(slice_op) if axis is None: return False return use_reflection_padding._replace_ops( block, concat_op, slice_ops_out, axis - len(end_mask) ) @staticmethod def _replace_ops(block, concat_op, slice_ops, axis): pad_size = len(slice_ops) 
// 2 if axis == -1: pad = [pad_size, pad_size] elif axis == -2: pad = [pad_size, pad_size, 0, 0] else: return False x = mb.pad(x=slice_ops[0].inputs["x"], pad=pad, mode="reflect", before_op=concat_op) concat_op.enclosing_block.replace_uses_of_var_after_op( anchor_op=concat_op, old_var=concat_op.outputs[0], new_var=x ) block.remove_ops([concat_op] + slice_ops) return True @block_context_manager def _reflection_padding_block(self, block): for op in list(block.operations): self._match_pattern(op, block) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/preprocess.py0000644000000000000000000004111514672066616025555 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clausefrom import re import warnings from collections import OrderedDict from typing import Optional from coremltools import _logger as logger from coremltools.converters.mil.input_types import EnumeratedShapes, ImageType, Shape from coremltools.converters.mil.mil import Block from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Function, Program, types from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass @register_pass(namespace="common") class image_input_preprocess(AbstractGraphPass): """ Plug in to ``transpose`` image input in NHWC format to NCHW format. Follow these steps: 1. Check whether there are any inputs that the users specify as ImageType. 2. Check the channel's dimension for all inputs that are ImageType. a) ``channel_first == True`` We do not modify this input, since ``channel_first`` is the intended behaviour for feeding images for optimal performance. b) ``channel_first == False`` We convert the input into a "channel_first" input, and plug in a ``transpose`` for the input to maintain the remaining graph's dimensionality. """ def apply(self, prog): for f_name, f in prog.functions.items(): if f_name == "main": # We need to make sure main exist and start here. 
self._image_input_preprocess(prog) @staticmethod def _image_input_preprocess(prog): def _transform_to_channel_first(shape): if isinstance(shape, tuple): shape = list(shape) return tuple(shape[:-3] + [shape[-1]] + shape[-3:-1]) else: return shape[:-3] + [shape[-1]] + shape[-3:-1] main_input_types = list(prog.functions["main"].input_types) for idx, input_type in enumerate(main_input_types): if isinstance(input_type, ImageType) and not input_type.channel_first: name = input_type.name # Build new ImageType to change data layout if isinstance(input_type.shape, Shape): new_shape = _transform_to_channel_first(input_type.shape.shape) new_default = _transform_to_channel_first(input_type.shape.default) shape_type = Shape(shape=new_shape, default=new_default) elif isinstance(input_type.shape, EnumeratedShapes): shape_list = [] for shape in input_type.shape.shapes: if isinstance(shape, Shape): shape_list.append(_transform_to_channel_first(shape.shape)) else: shape_list.append(_transform_to_channel_first(shape)) shape_type = EnumeratedShapes( shapes=shape_list, default=_transform_to_channel_first(input_type.shape.default), ) new_image_type = ImageType( name=name, shape=shape_type, bias=input_type.bias, scale=input_type.scale, color_layout=input_type.color_layout, channel_first=True, ) main_input_types[idx] = new_image_type # Reconstruct Placeholder of Function inputs. placeholder_op = prog.functions["main"].placeholder_inputs[name] old_var = placeholder_op.outputs[0] nchw_shape = _transform_to_channel_first(placeholder_op.sym_shape) placeholder_op.__init__( nchw_shape, dtype=placeholder_op.dtype, name=placeholder_op.name ) # Update Function input var prog.functions["main"]._input_dict[name] = placeholder_op.outputs[0] # Add transpose into graph (Transpose from NCHW back to NHWC) curr_block = prog.functions["main"] curr_var = prog.functions["main"].inputs[name] perm = list(range(curr_var.rank)) perm = perm[:-3] + [perm[-2], perm[-1], perm[-3]] with curr_block: new_input = mb.transpose( x=curr_var, perm=perm, before_op=prog.functions["main"].operations[0], name=curr_var.name + "__transpose_from_nchw__", ) curr_block.replace_uses_of_var_after_op( anchor_op=None, old_var=old_var, new_var=new_input ) prog.functions["main"].input_types = tuple(main_input_types) class NameSanitizer: def __init__(self, prefix=None): # to hold all names encountered, # to make sure that all new names are unique self.all_names = set() self.prefix = "_" if prefix is None else prefix @staticmethod def _replace_invalid_char_with_underscore(name): return re.sub("[^a-zA-Z0-9_]", "_", name) def sanitize_name(self, name: str, allow_prefix_underscore: Optional[bool] = True) -> str: """ Sanitize the input string and return it back. Input string should be of the format: [a-zA-Z_][a-zA-Z0-9_]* If it is not, then it is sanitized in the following manner: - first, any character that is not [a-zA-Z0-9_] is replaced with "_" - if the starting character is not [a-zA-Z_], it is prefixed with self.prefix - the resulting string must be unique. If it has been encountered before, it is appended by "_0" or "_1" and so on, until it becomes unique. :name: str current name :return: str updated name. Returns the same string, if sanitization not required. 
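# --- Illustrative sketch (standalone; a simplified re-implementation, not the class above) ---
# The three sanitization steps in miniature: replace invalid characters, prefix names
# that start with an invalid character, and de-duplicate with a numeric suffix.
import re

def sanitize(name, seen, prefix="var_"):
    new = re.sub(r"[^a-zA-Z0-9_]", "_", name)      # keep only [a-zA-Z0-9_]
    if re.match(r"[^a-zA-Z_]", new):               # may not start with a digit, etc.
        new = prefix + new
    base, idx = new, 0
    while new in seen:                             # enforce uniqueness
        new = f"{base}_{idx}"
        idx += 1
    seen.add(new)
    return new

seen = set()
print(sanitize("input:0", seen))      # -> input_0
print(sanitize("1st/output", seen))   # -> var_1st_output
print(sanitize("input_0", seen))      # -> input_0_0  (first candidate already taken)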
""" # replace any character that is not [a-zA-Z0-9_] with an underscore new_name = self._replace_invalid_char_with_underscore(name) # now check if the name starts with anything but [A-Za-z_] # if so, then add the prefix if re.match("[^a-zA-Z_]", new_name): new_name = self.prefix + new_name reserved_names = [ "any", "bool", "program", "func", "tensor", "list", "dict", "tuple", "true", "false", "string", "bf16", "fp16", "fp32", "fp64", "int8", "int16", "int32", "int64", "uint8", "uint16", "uint32", "uint64", "state", ] if new_name in reserved_names: new_name += "_workaround" # if the name start with _, we append "var" in front of it if not allow_prefix_underscore: if new_name.startswith("_"): new_name = "var" + new_name if new_name == name: # return if nothing has changed self.all_names.add(name) return name else: # name has changed # make sure it is unique, then return if new_name in self.all_names: idx = 0 new_name += "_" + str(idx) while new_name in self.all_names: idx += 1 new_name += "_" + str(idx) # now we have a unique name self.all_names.add(new_name) return new_name @staticmethod def sanitize_block( block, sanitizer_vars, sanitizer_ops, main_input_types=None, sanitize_model_inputs_outputs_only=False, ): """ Sanitize the vars and op names inside the block to adhere to the format [a-zA-Z_][a-zA-Z0-9_]* """ if sanitize_model_inputs_outputs_only: NameSanitizer._sanitize_block_input_vars( block, sanitizer_vars, main_input_types, sanitize_main_input_only=True ) NameSanitizer._sanitize_main_outputs_only(block, sanitizer_vars) else: NameSanitizer._sanitize_block_input_vars(block, sanitizer_vars, main_input_types) NameSanitizer._sanitize_output_vars_and_nested_blocks( block, sanitizer_vars, sanitizer_ops ) NameSanitizer._sanitize_op_names(block, sanitizer_ops) @staticmethod def _sanitize_block_input_vars( block, sanitizer_vars, main_input_types, sanitize_main_input_only=False ): # iterate over all the block input vars and sanitize the names if isinstance(block, Function): # this is the "main" block # block.inputs is a dict from input names to input vars # iterate over the input vars of the main program and sanitize their names new_input_dict = OrderedDict() input_name_updated = False for input_name, var in block.inputs.items(): msg = "Main block's input name, '{}', is different from its corresponding var's name, '{}'." assert input_name == var.name, msg.format(input_name, var.name) new_name = sanitizer_vars.sanitize_name(var.name) new_input_dict[new_name] = var if new_name != var.name: msg = "Input, '{}', of the source model, has been renamed to '{}' in the Core ML model." 
warnings.warn(msg.format(var.name, new_name)) if var.name in block.placeholder_inputs: block.placeholder_inputs[new_name] = block.placeholder_inputs.pop(var.name) block.placeholder_inputs[new_name].set_name(new_name) var.set_name(new_name) input_name_updated = True if main_input_types is not None: # update prog's main_input_types, since we are updating the name of a model input here for i in range(len(main_input_types)): if main_input_types[i].name == input_name: main_input_types[i].name = new_name break if input_name_updated: block._input_dict = new_input_dict elif not sanitize_main_input_only: # in this case block is not the "main" function # in this case block.inputs is a list of input vars of the block for var in block.inputs: new_name = sanitizer_vars.sanitize_name(var.name) if new_name != var.name: var.set_name(new_name) @staticmethod def _sanitize_var_names(var, sanitizer_vars, emit_warning=False): new_name = sanitizer_vars.sanitize_name(var.name) if new_name != var.name: if emit_warning: msg = "Output, '{}', of the source model, has been renamed to '{}' in the Core ML model." warnings.warn(msg.format(var.name, new_name)) var.set_name(new_name) @staticmethod def _sanitize_op_names(block, sanitizer_ops): # iterate over all the ops and sanitize the op names for op in list(block.operations): if op.name is not None: op.name = sanitizer_ops.sanitize_name(op.name) @staticmethod def _sanitize_output_vars_and_nested_blocks(block, sanitizer_vars, sanitizer_ops): for op in list(block.operations): for b in op.blocks: NameSanitizer.sanitize_block(b, sanitizer_vars, sanitizer_ops) for var in op.outputs: if isinstance(block, Function) and var in block.outputs: NameSanitizer._sanitize_var_names(var, sanitizer_vars, emit_warning=True) else: NameSanitizer._sanitize_var_names(var, sanitizer_vars) @staticmethod def _sanitize_main_outputs_only(block, sanitizer_vars): for op in list(block.operations): for var in op.outputs: if isinstance(block, Function) and var in block.outputs: NameSanitizer._sanitize_var_names(var, sanitizer_vars, emit_warning=True) @register_pass(namespace="common") class sanitize_input_output_names(AbstractGraphPass): """ Sanitize the names of model input and output vars to make sure that they are of the format as described in the NameSanitizer class; that is, of the format ``[a-zA-Z_][a-zA-Z0-9_]*``. """ def apply(self, prog): sanitizer_vars = NameSanitizer(prefix="var_") sanitizer_ops = NameSanitizer(prefix="op_") # sanitize the input/output of the main block # TODO: rdar://126498947 ([Infra] Investigate the name sanitizer on multifunction model) if "main" in prog.functions: NameSanitizer.sanitize_block( prog.functions["main"], sanitizer_vars, sanitizer_ops, prog.functions["main"].input_types, sanitize_model_inputs_outputs_only=True, ) # TODO: rdar://122845072 ([Infra] Refactor the transform_function_signatures, adjust_io_to_supported_types and update_output_dtypes using a shared graph pass) @register_pass(namespace="common") class update_output_dtypes(AbstractGraphPass): """ Update the dtypes of output vars of each function block to match the dtypes provided in ``function.output_types``. The output types for the main function is populated by the ``outputs`` argument provided by the user in the ``coremltools.convert()`` API. This graph pass assumes that the list of outputs in ``function.output_types`` (if not ``None``), are in the same order as the output vars. 
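# --- Illustrative sketch (not part of this pass) ------------------------------------
# How ``function.output_types`` is typically populated: via the ``outputs`` argument of
# ``coremltools.convert()``. The toy torch model and the names "x"/"y" are made up for
# illustration; fp16 I/O requires an iOS16+ deployment target.
import numpy as np
import torch
import coremltools as ct

traced = torch.jit.trace(torch.nn.Linear(3, 2).eval(), torch.rand(1, 3))
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="x", shape=(1, 3))],
    outputs=[ct.TensorType(name="y", dtype=np.float16)],  # drives this pass to insert a final cast
    minimum_deployment_target=ct.target.iOS16,
)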
""" @block_context_manager def adjust_function_output_types(self, func: Function) -> None: """ Adjust output dtypes for a pymil function. """ user_provided_output_types = func.output_types output_vars = func.outputs input_vars = list(func.inputs.values()) if user_provided_output_types is None or len(user_provided_output_types) == 0: return if len(output_vars) != len(user_provided_output_types): msg = ( "Number of outputs provided by the user, which is {}, " "does not match the number of outputs generated by the model, which is {}" ) raise ValueError(msg.format(len(user_provided_output_types), len(output_vars))) new_outputs = [] for i, output_type in enumerate(user_provided_output_types): required_output_dtype = output_type.dtype output_var = output_vars[i] if ( required_output_dtype is None or not ( types.is_tensor(output_var.sym_type) or types.is_scalar(output_var.sym_type) ) or required_output_dtype == output_var.dtype ): # no need to update the output var's dtype in this case new_outputs.append(output_var) elif output_var in input_vars: # Here is this rare special case, that the program input is also an output # For this case, we don't do anything, and throw a warning message new_outputs.append(output_var) logger.warning( f"Output var '{output_var.name}' is also an input var, hence the " f"dtype cannot be changed: output var '{output_var.name}' remains " f"dtype {types.builtin_to_string(output_var.dtype)}" ) else: output_var_name = output_var.name output_var.set_name( output_var_name + "_type_" + types.builtin_to_string(output_var.dtype) ) new_output_var = mb.cast( x=output_var, dtype=types.builtin_to_string(required_output_dtype) ) new_output_var.set_name(output_var_name) Block._copy_scope_info(output_var, new_output_var) new_outputs.append(new_output_var) func.set_outputs(new_outputs) def apply(self, prog: Program): for func in prog.functions.values(): self.adjust_function_output_types(func) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/quantization.py0000644000000000000000000005555314672066616026131 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from abc import abstractmethod from enum import Enum as _Enum from typing import Dict, Set, Text, Tuple import numpy as np from coremltools import _logger as logger from coremltools.converters.mil._deployment_compatibility import AvailableTarget from coremltools.converters.mil.input_types import TensorType from coremltools.converters.mil.mil import Block from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Function, Operation, Var, types from coremltools.converters.mil.mil.block import is_current_opset_version_compatible_with from coremltools.converters.mil.mil.ops.registry import SSAOpRegistry from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass from coremltools.converters.mil.mil.program import Program from coremltools.converters.mil.mil.types.symbolic import is_symbolic from coremltools.converters.mil.mil.types.type_mapping import string_to_builtin class ComputePrecision(_Enum): FLOAT16 = "float16" FLOAT32 = "float32" class AbstractQuantizationPass(AbstractGraphPass): """ Base class for Post-Training Quantization transforms. Derived class needs to implement following two methods: - is_valid_op(op) - transform_op(op) """ type_eps = {} type_min = {} type_negmin = {} def __init__(self, op_selector=None): super().__init__() if op_selector is not None and not callable(op_selector): raise TypeError( "Argument `op_selector` needs to be a callable function which accepts " "a MIL operation object and returns a boolean value." ) self.op_selector = op_selector # Var that feeds into multiple ops will be cast once and cached into this dict # For reference: Checkout test_single_input_to_multiple_operations in `TestFP16CastTransform`. # Note that, we make it a stack of dict to keep tracking the blocks self._cache_vars = [] def current_cache_vars(self) -> Set[Var]: return self._cache_vars[-1] def apply(self, prog): """ Walks over each operation in the graph and performs following two steps, 1. Checks whether an operation is valid for that quantized transform using `is_valid_op` method. 2. If yes, calls `transform_op` method of the derived quantized transform class. :param prog: MIL program :return: Transformed MIL program """ if not isinstance(prog, Program): raise TypeError('Transform "{}" can only be applied on PyMIL programs.'.format(self)) if getattr(self, "skip_ops_by_type", set()) and self.op_selector is not None: raise ValueError( "The graph pass option `skip_ops_by_type` cannot be set along with " "the `op_selector` in FP16ComputePrecision. Please only use one " "method to control which ops to operate on." ) @block_context_manager def apply_block(block): self._cache_vars.append({}) for op in list(block.operations): for b in op.blocks: apply_block(b) if self.is_valid_op(op): need_transform: bool if self.op_selector is not None: need_transform = self.op_selector(op) else: need_transform = op.op_type not in getattr(self, "skip_ops_by_type", set()) if need_transform: self.transform_op(op) self._cache_vars.pop() for f in prog.functions.values(): apply_block(f) def transform_op(self, op): """ Replaces an op with a transformed op. 
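# --- Illustrative sketch (hypothetical subclass, not part of the library) -----------
# The two methods a derived transform implements, shown with a deliberately trivial
# "rebuild relu" rewrite; apply() above already enters the block context before
# transform_op is called.
class _ExampleRewrite(AbstractQuantizationPass):
    def is_valid_op(self, op):
        # 1. decide whether this op should be rewritten
        return op.op_type == "relu"

    def transform_op(self, op):
        # 2. rebuild the op, rewire its users, then delete the original
        new_out = mb.relu(x=op.x, before_op=op)
        op.enclosing_block.replace_uses_of_var_after_op(
            anchor_op=op, old_var=op.outputs[0], new_var=new_out
        )
        op.enclosing_block.remove_ops([op])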
:param op: MIL operation :return: None """ raise NotImplementedError( 'Op transformation for quantization mode "{}" not implemented.'.format(self) ) def is_valid_op(self, op): """ Checks whether an operation is valid for given quantized transform. :param op: MIL operation :return: true | false """ raise NotImplementedError( 'Operation Preconditions for quantization mode "{}" not implemented.'.format(self) ) @classmethod def _close_to_zero(cls, val, np_type): if np_type not in cls.type_eps: cls.type_eps[np_type] = np.finfo(np_type).eps cls.type_min[np_type] = np.nextafter(0.0, 1.0, dtype=np_type) cls.type_negmin[np_type] = np.nextafter(0.0, -1.0, dtype=np_type) return np.isclose(val, 0, atol=cls.type_min[np_type], rtol=cls.type_eps[np_type]) def __repr__(self): return str(self) def __str__(self): return type(self).__name__ class CastTypeQuantization(AbstractQuantizationPass): """ Base class for all type casting related quantization, such as fp32->fp16, int32->int16, etc. For each valid op, if the "op_selector" return True: - For each input with dtype `origin_dtype`, inject a "cast" op to change it to `target_dtype`. - For each output with dtype `target_dtype`, inject a "cast" op to change it back to `origin_dtype`. All child classes need to specify `origin_dtype` and `target_dtype`. """ def __init__(self, op_selector=None): super().__init__(op_selector=op_selector) @property @abstractmethod def origin_dtype(self) -> str: """Original dtype that need to be cast, such as fp32.""" raise NotImplementedError("origin_dtype must be specified in subclass.") @property @abstractmethod def target_dtype(self) -> str: """Target dtype, such as fp16.""" raise NotImplementedError("target_dtype must be specified in subclass.") # TODO: rdar://122845072 ([Infra] Refactor the transform_function_signatures, adjust_io_to_supported_types and update_output_dtypes using a shared graph pass) @block_context_manager def transform_function_signatures(self, func: Function) -> None: """ This utility transform a function input / output signatures from the original_dtype to the target_dtype. For instance, in the add_fp16_cast class, this member function transforms the following function: function(%input(fp32)) { block0() { % var_1 = op_1(x=%input) ... % output(fp32) = ... } -> (%output) } into: function(%input(fp16)) { block0() { # input_cast = cast(x=input, dtype="fp32") % var_1 = op_1(x=%input_cast) ... % output(fp32) = ... } -> (%output) } and function.output_types is set to [TensorType(dtype=types.fp16)], in which will be used in common::update_output_dtypes to upgrade the function output dtype accordingly. 
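# --- Illustrative sketch (not part of this class) -----------------------------------
# How these fp32 -> fp16 cast passes are usually driven from the public conversion API.
# The tiny torch model and the op_selector lambda are illustrative assumptions only.
import torch
import coremltools as ct

traced = torch.jit.trace(
    torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.ReLU()).eval(), torch.rand(1, 4)
)

# Default behaviour (ct.precision.FLOAT16): add_fp16_cast runs on every eligible op.
fp16_model = ct.convert(
    traced, inputs=[ct.TensorType(name="x", shape=(1, 4))], convert_to="mlprogram"
)

# Keep selected ops in fp32 by passing an op_selector to FP16ComputePrecision.
mixed_model = ct.convert(
    traced,
    inputs=[ct.TensorType(name="x", shape=(1, 4))],
    convert_to="mlprogram",
    compute_precision=ct.transform.FP16ComputePrecision(
        op_selector=lambda op: op.op_type != "linear"
    ),
)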
""" # reset input signatures old_func_inputs = func.inputs new_func_inputs = {} cache_vars = {} # cast the new input into the original dtype for k, v in old_func_inputs.items(): if v.is_tensor_or_scalar_of(self.origin_dtype): new_input = mb.placeholder( shape=v.shape, dtype=string_to_builtin(self.target_dtype), name=v.name, ).outputs[0] if v in func.outputs: new_outputs = [] for val in func.outputs: new_outputs.append(new_input if val == v else val) func.set_outputs(new_outputs) new_func_inputs[k] = new_input cast_input = mb.cast( x=new_input, dtype=self.origin_dtype, before_op=func.operations[0] if len(func.operations) > 0 else None, ) cache_vars[k] = cast_input else: new_func_inputs[k] = v cache_vars[k] = v # replace the use of the old input vars with the new cast var for k, v in old_func_inputs.items(): func.replace_uses_of_var_after_op( anchor_op=None, old_var=v, new_var=cache_vars[k], ) func._input_dict = new_func_inputs # reset output signatures if func.output_types is None: output_types = [TensorType(dtype=v.dtype) for v in func.outputs] else: output_types = func.output_types for idx, v in enumerate(output_types): if v.dtype == string_to_builtin(self.origin_dtype): output_types[idx] = TensorType(dtype=string_to_builtin(self.target_dtype)) func.output_types = output_types def should_cast_parameter(self, op: Operation, param_name: str) -> bool: """ Determines if a param of an op should be cast to target_dtype. There are two cases that an op shouldn't be cast: 1. The op's parameter doesn't support target_dtype. 2. The cast op itself doesn't support target_dtype """ type_domain = getattr(op.input_spec.input_types[param_name], "type_domain", None) if type_domain and types.string_to_builtin(self.target_dtype) not in type_domain: return False if self.target_dtype not in SSAOpRegistry._get_core_op_cls("cast").supported_dtypes(): return False return True def _get_casted_outputs(self, op: Operation, casted_inputs: Dict[str, Var]) -> Tuple[Var]: """ Given an op and casted_inputs, this utility returns the new resulting outputs. """ return getattr(mb, op.op_type)(**casted_inputs) def transform_op(self, op) -> None: """Transform the input(s)/output(s) dtypes of the op.""" block = op.enclosing_block casted_inputs = {} inputs_modified = False for param, inputs in op.inputs.items(): if not self.should_cast_parameter(op, param): continue is_list_input = isinstance(inputs, (list, tuple)) if not is_list_input: inputs = [inputs] casted_inputs[param] = list(inputs[:]) for i, var in enumerate(inputs): if not var.is_tensor_or_scalar_of(dtype=self.origin_dtype): continue inputs_modified = True casted_var_name = f"{var.name}_to_{self.target_dtype}" if ( len(var._child_ops) > 1 and casted_var_name in self.current_cache_vars() ): if self.current_cache_vars()[casted_var_name].op.x != var: logger.warning( "The cached cast Var doesn't match the original Var. It's due to duplicated Var " f"names in the graph for {casted_var_name}." 
) casted_inputs[param][i] = self.current_cache_vars()[casted_var_name] else: x = mb.cast( x=var, dtype=self.target_dtype, name=casted_var_name, before_op=op, ) if self.target_dtype == "fp16": self._check_underflow_to_zero(x, var) Block._copy_metadata(var, x) casted_inputs[param][i] = x if len(var._child_ops) > 1: self.current_cache_vars()[casted_var_name] = casted_inputs[param][i] if not is_list_input: casted_inputs[param] = casted_inputs[param][0] if inputs_modified: casted_inputs.update({k: v for k, v in op.inputs.items() if k not in casted_inputs}) casted_inputs["name"] = f"{op.name}_cast_{self.target_dtype}" casted_inputs["before_op"] = op quant_output = self._get_casted_outputs(op, casted_inputs) if not isinstance(quant_output, (list, tuple)): quant_output = [quant_output] for old_output_var, new_output_var in zip(op.outputs, quant_output): if old_output_var.is_tensor_or_scalar_of(dtype=self.origin_dtype) and ( not new_output_var.is_tensor_or_scalar_of(dtype=self.origin_dtype) ): x = mb.cast( x=new_output_var, dtype=self.origin_dtype, name=f"{new_output_var.name}_to_{self.origin_dtype}", before_op=op, ) op.enclosing_block.replace_uses_of_var_after_op( anchor_op=op, old_var=old_output_var, new_var=x, force_replace=True, ) else: op.enclosing_block.replace_uses_of_var_after_op( anchor_op=op, old_var=old_output_var, new_var=new_output_var, force_replace=True, ) block.remove_ops([op]) class FP16ComputePrecision(CastTypeQuantization): """ This transform does the following, for each valid op and if the "op_selector" return True: - For each input of dtype float32, inject a "cast" op to change it to float16 dtype - For each output of dtype float16, inject a "cast" op to change it back to float32 """ # Activation related ops with alpha/beta parameters. _ACTIVATION_ALPHA_OPS: Set[str] = {"elu", "leaky_relu", "prelu", "thresholded_relu"} _ACTIVATION_ALPHA_BETA_OPS: Set[str] = { "clamped_relu", "linear_activation", "scaled_tanh", "sigmoid_hard", "softplus_parametric", } _ELEMENTWISE_UNARY_EPSILON_OPS: Set[str] = {"inverse", "log", "rsqrt"} # Unsupported op for fp16 casting _UNSUPPORTED_FP16_OPS: Set[str] = { "cast", "while_loop", "cond", # TODO: Remove after supporting FP16 dynamic quantize transformation for list ops (rdar://74458192) "make_list", "list_gather", "list_scatter", "list_read", "list_write", "list_length", "read_state", "coreml_update_state", } def __init__(self, op_selector=None): super(FP16ComputePrecision, self).__init__(op_selector=op_selector) @property def origin_dtype(self) -> str: return "fp32" @property def target_dtype(self) -> str: return "fp16" @staticmethod def fp16_overflow(op: Operation) -> bool: """ Determines if any of the op's input will overflow when represented by FP16. This overflow check consists of two parts: 1. For valid fp32 numbers (abs < 1e38), we want their exact values, so we make sure they are within fp16 range [-65504, 65504] 2. For inifinities (abs >= 1e38), their exact values does not matter, so we can always downcast them to fp16 inf. 
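# --- Illustrative sketch (not part of this method; plain numpy) ---------------------
# The two-part check above in miniature: finite fp32 constants must fit within fp16's
# [-65504, 65504] range, while true infinities may safely become fp16 infinities.
import numpy as np

assert np.finfo(np.float16).max == 65504.0
assert np.float16(123.5) == np.float32(123.5)          # in range: the cast is lossless
assert np.isinf(np.float32(7e4).astype(np.float16))    # 70000 > 65504: overflows to inf
assert np.isinf(np.float16(-np.inf))                   # -inf stays -inf, fine for masks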
For example, in attention mask we just want -inf to make the masked entries have 0 probability after softmax """ for _, inputs in op.inputs.items(): is_list_input = isinstance(inputs, (list, tuple)) if not is_list_input: inputs = [inputs] for var in inputs: if ( var.op is not None and var.op.op_type == "const" and var.is_tensor_or_scalar_of(dtype="fp32") ): value = np.expand_dims(var.op.val.val, 0) abs_value = np.abs(value) if np.max(abs_value[np.where(abs_value < 1e38)], initial=0.0) > 65504: return True return False def is_valid_op(self, op: Operation) -> bool: """Determines if op is valid for fp16 casting.""" if op.op_type in self._UNSUPPORTED_FP16_OPS: return False if self.fp16_overflow(op): return False return True def should_cast_parameter(self, op: Operation, param_name: str) -> bool: """Determines if a param of an op should be cast to fp16.""" if not super().should_cast_parameter(op, param_name): return False if is_current_opset_version_compatible_with(AvailableTarget.iOS17): # In IOS17+ activation ops with alpha/beta support mixed precision, and we don't want to # cast alpha/beta to fp16 for better numerical accuracy. if op.op_type in self._ACTIVATION_ALPHA_OPS and param_name == "alpha": return False if op.op_type in self._ACTIVATION_ALPHA_BETA_OPS and param_name in {"alpha", "beta"}: return False # Element-wise unary ops with epsilon also support mixed precision. if op.op_type in self._ELEMENTWISE_UNARY_EPSILON_OPS and param_name == "epsilon": return False return True def _check_underflow_to_zero(self, new_var, var): # We check whether there are casted values that "becomes" 0 which is not ideal for eps purposes. # However we skip arrays with more than 400 in case we compare through a large sparse matrix. if ( new_var.val is not None and len(var.val.flatten()) < 400 and self._close_to_zero(new_var.val, np.float16).any() ): value_modified = False original_val = var.val.flatten() new_val = new_var.val.flatten() for idx in range(len(original_val)): if not self._close_to_zero(original_val[idx], np.float32) and self._close_to_zero( new_val[idx], np.float16 ): new_val[idx] = ( self.type_min[np.float16] if np.sign(original_val[idx]) > 0 else self.type_negmin[np.float16] ) value_modified = True if value_modified: if np.isscalar(new_var.val): new_var._sym_val.val = new_val[0] else: new_var._sym_val.val = new_val.reshape(new_var.val.shape) @register_pass(namespace="common") class add_fp16_cast(FP16ComputePrecision): """ For each input of dtype float32, inject a ``cast`` op to change it to float16 dtype. For each output of dtype float16, inject a ``cast`` op to change it back to float32. This pass is the registered interface for FP16ComputePrecision, which makes it consistent with other passes' interfaces. Support options: - ``skip_ops_by_type``: Skip op types specified by comma-separated string; for example, ``"mul,const"``. """ _skip_ops_by_type: Set[Text] = set() @property def skip_ops_by_type(self): return self._skip_ops_by_type @skip_ops_by_type.setter def skip_ops_by_type(self, criteria: Text): self._skip_ops_by_type = set(criteria.split(",")) @register_pass(namespace="common") class add_int16_cast(CastTypeQuantization): """ This transform does the following, for each op that supports int16/uint16: - For each input of dtype int32 which supports int16/uint16, inject a "cast" op to change it to int16/uint16 dtype. - For each output of dtype int16/uint16, inject a "cast" op to change it back to int32. 
Notice that the cast will not be inserted if the const value is out of int16/uint16 range. """ # Ops that prefer int16 params. # If an op supports 16-bit only in later iOS (e.g. gather started to support 16-bit from iOS16) # then int16 cast will be inserted only if the iOS version is high enough # (e.g. nothing will happen for iOS15 gather) # This is achieved by type domain confirmation in `CastTypeQuantization.should_cast_parameter` _PREFER_INT16_OPS: Set[str] = {"gather", "gather_along_axis", "gather_nd", "squeeze"} def __init__(self, op_selector=None): super().__init__(op_selector=op_selector) # Use variable instead of hard-coded "int16" because the target dtype could be uint16 # depending on if the param is non-negative const and within uint16 range. self._target_dtype: str = "int16" @property def origin_dtype(self) -> str: return "int32" @property def target_dtype(self) -> str: return self._target_dtype @target_dtype.setter def target_dtype(self, target_dtype: str): if target_dtype not in {"int16", "uint16"}: raise ValueError("The target_dtype in add_int16_cast must be int16 or uint16") self._target_dtype = target_dtype def should_cast_parameter(self, op: Operation, param_name: str) -> bool: """ Determine if a parameter should be cast or not. If should be cast, determine whether to use int16 or uint16. """ _INT16_MAX = np.iinfo(np.int16).max _INT16_MIN = np.iinfo(np.int16).min _UINT16_MAX = np.iinfo(np.uint16).max _UINT16_MIN = np.iinfo(np.uint16).min input_var = op.inputs[param_name] if not input_var.is_tensor_or_scalar_of(dtype="int32"): return False input_op = input_var.op if input_op is not None and input_op.op_type == "const": if ( input_op.outputs[0].val.min() >= _UINT16_MIN and input_op.outputs[0].val.max() <= _UINT16_MAX ): self._target_dtype = "uint16" elif ( input_op.outputs[0].val.min() >= _INT16_MIN and input_op.outputs[0].val.max() <= _INT16_MAX ): self._target_dtype = "int16" else: return False # In `gather` and `gather_along_axis`, if the dim size of x is larger than int16 # upperbound, the dynamic indices could overflow, so it shouldn't be cast. if op.op_type in {"gather", "gather_along_axis"} and param_name == "indices": if op.indices.val is None and op.x.shape is not None: dim_size = op.x.shape[op.axis.val] if not is_symbolic(dim_size) and dim_size > _INT16_MAX: return False if not super().should_cast_parameter(op, param_name): return False return True def is_valid_op(self, op: Operation) -> bool: """Determines if op is valid for int16/uint16 casting.""" return op.op_type in self._PREFER_INT16_OPS ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/randomize.py0000644000000000000000000000363014672066616025360 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
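# ------------------------------------------------------------------------------
# Usage sketch for the 16-bit cast pass defined above (illustrative only, not
# part of the original source file that follows). It mirrors the test-utility
# conventions used elsewhere in this package. The exact op sequence produced by
# the pass can vary with the opset version, since e.g. `gather` only accepts
# int16/uint16 indices from iOS16 onwards; iOS17 is assumed here.

import numpy as np

import coremltools as ct
from coremltools.converters.mil.mil import Builder as mb
from coremltools.converters.mil.testing_utils import (
    apply_pass_and_basic_check,
    get_op_types_in_program,
)


@mb.program(
    input_specs=[mb.TensorSpec(shape=(4, 8))],
    opset_version=ct.target.iOS17,
)
def prog(x):
    # int32 indices whose values fit in uint16, so they are eligible for casting.
    return mb.gather(x=x, indices=np.array([1, 0], dtype=np.int32), axis=0)


apply_pass_and_basic_check(prog, "common::add_int16_cast")
# A cast op feeding the gather indices should now be present.
assert "cast" in get_op_types_in_program(prog)
# ------------------------------------------------------------------------------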
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Operation from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass @register_pass(namespace="common") class WeightRandomizer(AbstractGraphPass): """ This graph pass randomizes the weights of each ``const`` op """ def apply(self, prog): for f in prog.functions.values(): self._randomize_weights_block(f) @block_context_manager def _randomize_weights_block(self, block): for op in list(block.operations): for b in op.blocks: self._randomize_weights_block(b) if self.is_valid_op(op): self.transform_op(op) def is_valid_op(self, op: Operation): # lazy import to prevent circular import from coremltools.converters.mil.backend.mil.load import should_use_weight_file if op.op_type == "const" and should_use_weight_file(op.outputs[0].val): return True return False def transform_op(self, op): weight = op.outputs[0].val random_weight = np.random.rand(*weight.shape).astype(weight.dtype) new_var = mb.const( val=random_weight, before_op=op, name=op.name, ) op.enclosing_block.replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=new_var, no_check_var_types=True, ) op.enclosing_block.remove_ops([op]) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/defs/symbol_transform.py0000644000000000000000000004064214672066616026774 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import Dict, Set, Tuple from coremltools import _logger as logger from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Function, Placeholder, Program, Var, types from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.pass_registry import register_pass from coremltools.converters.mil.mil.types.symbolic import any_symbolic, is_symbolic @register_pass(namespace="common") class materialize_symbolic_shape_program(AbstractGraphPass): """ If we realize that only a few fixed shapes are used in a symbolic-shape model, we may prefer materialization into a fixed-shape (multifunction) model, which has the potential to be further optimized Supported options: - ``function_name_to_materialization_map``: Dict[str, Dict[str, Tuple[int]]] A dictionary specifying the name of new functions to be created, and for each new function what is the new fixed shapes for inputs. If a new function has the same name as an old function, then the old function will be overridden - ``source_function_name``: str The name of the source symbolic-shape function to be materialized, default = main Example: Suppose we have a symbolic shape model with 2 symbols ``is0`` and ``is1`` .. code-block:: symbolic_shape_mlmodel: ct.models.MLModel symbolic_shape_prog = symbolic_shape_mlmodel._mil_program We may invoke this graph pass to materialize some fixed shapes (e.g. 
``is0 = 2, is1 = 5`` and ``is0 = 4, is1 = 7``), then run every other optimization passes .. code-block:: pass_pipeline: PassPipeline = ct.PassPipeline.DEFAULT pass_pipeline.insert_pass(0, "common::materialize_symbolic_shape_program") pass_pipeline.set_options( "common::materialize_symbolic_shape_program", { "function_name_to_materialization_map": { # As an example, let us assume the input is x (is0, is1, 1024) "materialization_2_5": {"x": (2, 5, 1024)}, "materialization_4_7": {"x": (4, 7, 1024)}, } }, ) PassPipelineManager.apply_pipeline(symbolic_shape_prog, pass_pipeline) We will arrive at .. code-block:: main[CoreML8](%x: (is0, is1, 1024, fp16)(Tensor)) { block0() { ... } -> (%y) } materialization_2_5[CoreML8](%x: (2, 5, 1024, fp16)(Tensor)) { block5() { ... } -> (%y) } materialization_4_7[CoreML8](%x: (4, 7, 1024, fp16)(Tensor)) { block6() { ... } -> (%y) } """ def __init__(self) -> None: self._function_name_to_materialization_map: Dict[str, Dict[str, Tuple[int]]] = None self._source_function_name: str = "main" @property def function_name_to_materialization_map(self) -> Dict[str, Dict[str, Tuple[int]]]: return self._function_name_to_materialization_map @function_name_to_materialization_map.setter def function_name_to_materialization_map( self, function_name_to_materialization_map_: Dict[str, Dict[str, Tuple[int]]] ) -> None: if not isinstance(function_name_to_materialization_map_, dict): raise ValueError( "function_name_to_materialization_map must be type of dict, " f"but got {type(function_name_to_materialization_map_)}" ) for function_name, materialization_map in function_name_to_materialization_map_.items(): if not isinstance(function_name, str): raise ValueError( f"Materialized new function name must be type of str, but got {type(function_name)}" ) if not isinstance(materialization_map, dict): raise ValueError( f"Materialization map must be type of dict, but got {type(materialization_map)}" ) for input_name, shape in materialization_map.items(): if not isinstance(input_name, str): raise ValueError( f"Materialization map key (input name) must be type of str, but got {type(input_name)}" ) if not isinstance(shape, tuple): raise ValueError( f"Materialization map value (shape) must be type of tuple, but got {type(shape)}" ) for size in shape: if not isinstance(size, int): raise ValueError(f"Shape element must be type of int, but got {type(size)}") self._function_name_to_materialization_map = function_name_to_materialization_map_ @property def source_function_name(self) -> str: return self._source_function_name @source_function_name.setter def source_function_name(self, source_function_name_: str) -> None: if not isinstance(source_function_name_, str): raise ValueError( f"Source function name must be type of str, but got {type(source_function_name_)}" ) self._source_function_name = source_function_name_ @staticmethod def _canonicalize_materialization_map( source_function: Function, function_name_to_materialization_map: Dict[str, Dict[str, Tuple[int]]], ) -> Dict[str, Dict[str, int]]: """ User input ``materialization_map`` maps input names to fixed shapes, but what a rigorous graph pass really need is a map from symbols to integers, so here we construct the canonical materialization map from user input """ function_name_to_canonical_materialization_map: Dict[str, Dict[str, int]] = {} for function_name, materialization_map in function_name_to_materialization_map.items(): canonical_materialization_map: Dict[str, int] = {} for source_input_var in source_function.inputs.values(): input_name: 
str = source_input_var.name if input_name in materialization_map: fixed_shape = materialization_map[input_name] for size, integer in zip(source_input_var.shape, fixed_shape): if is_symbolic(size): if size.name not in canonical_materialization_map: canonical_materialization_map[size.name] = integer else: existing_integer = canonical_materialization_map[size.name] if existing_integer != integer: raise ValueError( f"Inconsistent symbol materialization in new function {function_name}: " f"symbol {size.name} is to be materialized into {existing_integer} and {integer}. " f"Please make sure input {input_name} has compatible shape with others" ) else: if size != integer: raise ValueError( f"Already fixed size cannot be altered: new function {function_name}, " f"input {input_name}, original size is {size}, but user specified new size {integer}" ) else: logger.warning( f"In new function {function_name}, " f"although materialization for input {input_name} is not specified, " f"it may still be materialized if it shares symbol with other inputs" ) function_name_to_canonical_materialization_map[ function_name ] = canonical_materialization_map return function_name_to_canonical_materialization_map @staticmethod def _validate_inputs( source_function: Function, function_name_to_canonical_materialization_map: Dict[str, Dict[str, int]], ) -> None: # Get existing symbols in program symbols: Set[str] = set() for source_input_var in source_function.inputs.values(): for size in source_input_var.shape: if is_symbolic(size): symbols.add(size.name) # Compare existing symbols vs user specified materialization map for ( function_name, canonical_materialization_map, ) in function_name_to_canonical_materialization_map.items(): symbols_to_be_materialized = set(canonical_materialization_map.keys()) # Q: Why we only check symbols is subset of symbols_to_be_materialized, # but not symbols_to_be_materialized is subset of symbols? # A: Since our API has user specify {input name: fixed shape tuple}, # we will not receive any redundant symbol, # i.e. 
symbols_to_be_materialized will always be a subset of symbols if not symbols.issubset(symbols_to_be_materialized): logger.warning( f"In new function {function_name}, these symbols will not be materialized: " f"{symbols - symbols_to_be_materialized}" ) @staticmethod def _maybe_materialize_symbolic_shape( shape: Tuple, canonical_materialization_map: Dict[str, int] ) -> Tuple: if any_symbolic(shape): materialized_shape = [] for size in shape: if is_symbolic(size) and size.name in canonical_materialization_map: materialized_shape.append(canonical_materialization_map[size.name]) else: materialized_shape.append(size) return tuple(materialized_shape) else: return shape @staticmethod def _create_placeholders( source_function: Function, canonical_materialization_map: Dict[str, int] ) -> Dict[str, Placeholder]: placeholders: Dict[str, Placeholder] = {} for source_input_name, source_input_var in source_function.inputs.items(): target_input_shape = ( materialize_symbolic_shape_program._maybe_materialize_symbolic_shape( source_input_var.shape, canonical_materialization_map ) ) if types.is_state(source_input_var.sym_type): placeholders[source_input_name] = mb.state_tensor_placeholder( target_input_shape, source_input_var.dtype ) else: placeholders[source_input_name] = mb.placeholder( target_input_shape, source_input_var.dtype ) return placeholders @staticmethod def _copy_construct_const_var(source_const_var: Var) -> Var: target_const_var = mb.const(val=source_const_var.val, name=source_const_var.name) if ( hasattr(source_const_var.op, "weight_key") and source_const_var.op.weight_key is not None ): target_const_var.op.weight_key = source_const_var.op.weight_key return target_const_var def apply(self, prog: Program) -> None: if self.source_function_name not in prog.functions: raise ValueError( f"Source function {self.source_function_name} not found, " f"available functions are {list(prog.functions.keys())}" ) source_function = prog.functions[self.source_function_name] function_name_to_canonical_materialization_map = self._canonicalize_materialization_map( source_function, self.function_name_to_materialization_map ) self._validate_inputs(source_function, function_name_to_canonical_materialization_map) for ( target_function_name, canonical_materialization_map, ) in function_name_to_canonical_materialization_map.items(): context: Dict[str, Var] = {} with Function( inputs=self._create_placeholders(source_function, canonical_materialization_map), opset_version=source_function.opset_version, ) as target_function: # Extract function input variables for target_input_name, target_input_var in target_function.inputs.items(): context[target_input_name] = target_input_var # Rebuild all operations with new variables for source_operation in source_function.operations: # Instead of building constants as we encounter, # we will build them when we find them in operation input, # otherwise we will mess up with block internal variable if source_operation.op_type == "const": continue else: # prepare operation inputs target_name_to_input: Dict[str, Var] = {} for source_input_name, source_input_vars in source_operation.inputs.items(): # operation input may either be Var or Tuple[Var] is_source_single_input = isinstance(source_input_vars, Var) if is_source_single_input: source_input_vars = [source_input_vars] target_input_vars = [] for source_input_var in source_input_vars: # build const input that is currently missing from context if source_input_var.name not in context: assert ( source_input_var.op.op_type == "const" ), "Only 
const may be absent from context" context[source_input_var.name] = self._copy_construct_const_var( source_input_var ) target_input_vars.append(context[source_input_var.name]) if is_source_single_input: target_name_to_input[source_input_name] = target_input_vars[0] else: target_name_to_input[source_input_name] = tuple(target_input_vars) # build operation outputs = getattr(mb, source_operation.op_type)( **target_name_to_input, name=source_operation.name ) # operation output may either be Var or Tuple[Var] if isinstance(outputs, Var): outputs = [outputs] for output, source_output in zip(outputs, source_operation.outputs): output.set_name(source_output.name) context[output.name] = output # Set function output variables target_function.set_outputs( [ context[source_output_var.name] for source_output_var in source_function.outputs ] ) prog.add_function(target_function_name, target_function) # For some reason, if we run const_deduplication._deduplicate_const_across_functions here, # the populated `const.weight_id` will get lost if we run pass pipeline afterwards, # so we have no choice but to let user manually deduplicate after all passes are done # TODO (rdar://131680531): Investigate why it happens & whether we can change this behavior logger.warning( "(If you are using ct.utils.materialize_dynamic_shape_mlmodel, " "you are safe to ignore this warning message) " "Weights are duplicated in each materialized new function, " "so you may want to run const_deduplication._deduplicate_const_across_functions " "on your pymil program before serialization to milproto" ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/graph_pass.py0000644000000000000000000000530414672066616024576 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from abc import ABC, abstractmethod from typing import Callable, List, Optional, Text, Union from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Operation, Program from coremltools.converters.mil.mil.scope import ScopeInfo, ScopeSource class PassOption: """ Option that will be applied in a graph pass. Each graph pass need to have their own implementation to support the corresponding option. Available options are documented in each pass's docstring. """ # The Callable option_val is for op_selector backward compatibility only. def __init__(self, option_name: Text, option_val: Union[Text, Callable[[Operation], bool]]): if not isinstance(option_name, Text): raise ValueError(f"The option name should be text, but got {type(option_name)}") self._option_name = option_name self._option_val = option_val def __str__(self): return f"{self.option_name}: {self.option_val}" @property def option_name(self): return self._option_name @property def option_val(self): return self._option_val class AbstractGraphPass(ABC): """ Base class for a graph pass. Each graph pass should be a subclass of this and implement the `apply` method. Each graph pass can also implement their own supported options. See examples of `skip_ops_by_type` in `add_fp16_cast` and `skip_const_by_size` in `const_elimination` about how to support new options in each pass. 
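A minimal sketch of a pass with one option (the pass name ``count_ops_by_function``
and its ``skip_functions`` option below are illustrative only, not passes that ship
with coremltools):

.. sourcecode:: python

    @register_pass(namespace="common")
    class count_ops_by_function(AbstractGraphPass):
        def __init__(self):
            self._skip_functions = set()

        # Any settable attribute or property can be driven through a PassOption,
        # because set_options simply calls setattr(self, option_name, option_val).
        @property
        def skip_functions(self):
            return self._skip_functions

        @skip_functions.setter
        def skip_functions(self, names):
            # Option values set through PassPipeline arrive as text,
            # e.g. "main,my_other_function".
            self._skip_functions = set(names.split(","))

        def apply(self, prog):
            for name, func in prog.functions.items():
                if name not in self._skip_functions:
                    print(name, len(list(func.operations)))

Once registered, the pass is retrieved and applied like the built-in ones, for
example ``PASS_REGISTRY["common::count_ops_by_function"](prog)``, or by adding
``"common::count_ops_by_function"`` to a ``PassPipeline``.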
""" def __call__(self, prog: Program): if not prog.skip_all_passes: # we use the scope context manager to populate the graph pass information to the ops # constructed by the pass. with mb.scope(ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=[str(self)])): self.apply(prog) def __str__(self): return type(self).__name__ @abstractmethod def apply(self, prog: Program): pass def set_options(self, pass_options: Optional[List[PassOption]] = None): """Set pass options.""" if pass_options is not None: for pass_option in pass_options: option_name = pass_option.option_name if not hasattr(self, option_name): raise NotImplementedError( f"The graph pass `{self}` doesn't support option `{option_name}`." ) setattr(self, option_name, pass_option.option_val) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/helper.py0000644000000000000000000001013214672066616023721 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import Callable, List, Optional import numpy as np from coremltools.converters.mil.mil import Block, Operation from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass class classproperty(property): """ A decorator class that allow us to have a class-level property """ def __get__(self, owner, cls): return self.fget(cls) def block_context_manager(_func: Optional[Callable] = None): """ This decorator executes a function under the context manager `with block`. For instance, given a function `func` with an input block and other arguments: def func(block, *args): ... with block: op_1 = mb.add(...) ... with block: op_2 = mb.relu...() It can be be streamlined as: @block_context_manager def func(block, *args): ... op_1 = mb.add(...) ... op_2 = mb.relu...() Note that, the first argument of the function must have type Block. It is highly recommended to decorate a function with block_context_manager if it is calling `with block` multiple times, since when the code exit `block`, an expensive _propagate_nonreplaceable_vars() is invoked. The decorator reduces the amount of calling `with block` overally. """ def wrapper(*args): # Make it compatible with class method. if isinstance(args[0], AbstractGraphPass): block = args[1] else: block = args[0] if not isinstance(block, Block): raise ValueError( "The function decorated with block_context_manager must have a Block " "type argument as the first input." ) with block: return _func(*args) return wrapper def _check_child_op_type(op, child_op_type): """ :param op: operation :param child_op_type: str :return: Return True if op has 1 child and type of that child matches child_op_type """ if len(op.outputs) != 1: return False child_ops = list(op.outputs[0].child_ops) if len(child_ops) != 1: return False if child_ops[0].op_type == child_op_type: return True return False def _check_no_output_connection(block: Block, ops: List[Operation]) -> bool: """ Check that none of the op in this pattern is connected to the output (except the last op) :param block: Block :param ops: List of operations to check on. 
""" for op in ops[:-1]: for out in op.outputs: if out in block.outputs: return False return True def _check_var_scalar_value_in_interval(x, lower_bound, upper_bound): """ :param x: var :param lower_bound: a scalar value :param upper_bound: a scalar value :return: True if the value of var is in the interval [lower_bound, upper_bound] """ if x.val is None: return False if not isinstance(x.val, (np.ndarray, np.generic)): return False if isinstance(x.val, np.ndarray): if x.val.size != 1: return False x_val = x.val[:][0] if len(x.val.shape) > 0 else x.val[()] else: x_val = x.val if x_val >= lower_bound and x_val <= upper_bound: return True return False def _check_var_scalar_value(x, val, tol=1e-3): """ :param x: var :param val: a scalar value :return: True if x.val is equal to val otherwise return False """ if x.val is None: return False if not isinstance(x.val, np.ndarray) and not np.isscalar(x.val): return False if isinstance(x.val, np.ndarray): if x.val.size != 1: return False if len(x.val.shape) == 0: x_val = x.val else: x_val = x.val[:][0] if len(x.val.shape) > 0 else x.val[()] else: x_val = x.val if abs(x_val - val) < tol: return True return False ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/pass_pipeline.py0000644000000000000000000005217414672066616025311 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from __future__ import annotations from typing import Dict, List, Optional, Set, Text, Union from tqdm import tqdm from coremltools import _logger as logger from coremltools.converters._profile_utils import _profile from coremltools.converters.mil.mil import Program from coremltools.converters.mil.mil.passes.graph_pass import PassOption from coremltools.converters.mil.mil.passes.helper import classproperty as _classproperty from coremltools.converters.mil.mil.passes.pass_registry import PASS_REGISTRY _COMMON_PASSES: List[Text] = [ "common::lower_complex_dialect_ops", "common::update_output_dtypes", "common::cast_optimization", "common::noop_elimination", # quantization pass 1: canonicalizations # always start quantization passes with canonicalizations "common::int_op_canonicalization", # ops that support int do not need dequantize -> op -> quantize sandwich "common::nullify_redundant_quantization_zero_point", # canonicalize zero point # quantization pass 2: remove redundancy # remove redundancy after canonicalization but before anything else "common::dequantize_quantize_pair_elimination", # the main quantization passes "common::distributive_quantized_binary_op_scale_normalization", # the last quantization pass: replace const dequantize with constexpr # after all quantization passes, since constexpr will not be further optimized # before const elimination, otherwise const dequantize would get bloated "common::dequantize_to_constexpr", "common::canonicalize_quantized_lut_pattern", "common::const_elimination", "common::sanitize_input_output_names", "common::divide_to_multiply", "common::select_optimization", "common::add_conv_transpose_output_shape", "common::const_elimination", "common::const_deduplication", # after all consts have been settled "common::loop_invariant_elimination", "common::remove_symbolic_reshape", "common::noop_elimination", "common::fuse_matmul_weight_bias", "common::fuse_linear_bias", 
"common::fuse_gelu_tanh_approximation", "common::fuse_gelu_exact", "common::fuse_leaky_relu", "common::rank0_expand_dims_swap", "common::fuse_squeeze_expand_dims", "common::compose_conv1d", # compose conv1d before any other conv passes "common::use_reflection_padding", "common::merge_consecutive_paddings", # Should come after use_reflection_padding, which will introduce new padding layers "common::fuse_pad_conv", # Should come after merge_consecutive_paddings "common::image_input_preprocess", "common::replace_stack_reshape", # should come before detect_concat_interleave since it may add concat "common::reduce_transposes", "common::fuse_dilated_conv", "common::fuse_conv_scale", "common::fuse_conv_bias", "common::fuse_onehot_matmul_to_gather", "common::fuse_layernorm_or_instancenorm", # should come after reduce_transposes, to detect instance_norm "common::fuse_elementwise_to_batchnorm", # should come after fuse_layernorm_or_instancenorm "common::fuse_reduce_mean", # should come after fuse_layernorm_or_instancenorm "common::fuse_conv_batchnorm", # should come after fuse_elementwise_to_batchnorm "common::fuse_conv_scale", # Re-run the fuse conv scale pass after the conv and batch_norm are fused "common::fuse_conv_bias", # Re-run the fuse conv bias pass after the conv and batch_norm are fused "common::fuse_conv_batchnorm", # In some cases, we need to run conv / batch_norm fusion again after the fuse_conv_scale and fuse_conv_bias passes "common::detect_concat_interleave", "common::concat_to_pixel_shuffle", # should come after detect_concat_interleave and after replace_stack_reshape "common::fuse_prelu", # reduce_transpose pass should run before and after this pass (the one after will be run during the cleanup passes stage) "common::prelu_to_lrelu", "common::merge_consecutive_relus", "common::merge_consecutive_reshapes", "common::merge_consecutive_transposes", "common::fuse_transpose_matmul", # "expand_high_rank_reshape_and_transpose" must come after "common::merge_consecutive_transposes" "common::expand_high_rank_reshape_and_transpose", "common::reduce_transposes", # "remove_redundant_ops" pass should be applied towards the end, once other graph passes have done their optimizations. # For instance, it should come after passes such as "reduce_transpose" that can introduce redundant transposes # in the network (while reducing the total number of transposes), and after passes such as "fuse_layernorm_or_instancenorm" # which detects patterns that involve redundant ops ("sub") etc. "common::remove_redundant_ops", "common::dedup_op_and_var_names", # Must be applied before "add_fp16_cast" because "add_fp16_cast" use unique name cache. "common::add_fp16_cast", # Will be removed if compute precision is not FP16. "common::add_int16_cast", # Will be removed if compute precision is not FP16. "common::update_output_dtypes", # Must run again after `add_fp16_cast` and `add_int16_cast`. 
"common::const_elimination", "common::dead_code_elimination", "common::cast_optimization", "common::dead_code_elimination", # must follow cast_optimization "common::const_elimination", # After all fusions have settled, start inserting state ops "common::canonicalize_inplace_pattern", # always start with canonicalizations "common::prefer_state_in_downstream", "common::const_elimination", "common::dead_code_elimination", # always end with dce ] _CLEANUP_PASSES: List[Text] = [ "common::dead_code_elimination", "common::const_elimination", "common::cast_optimization", "common::dead_code_elimination", # must follow cast_optimization "common::const_elimination", "common::const_deduplication", # after all consts have been settled "common::dead_code_elimination", # come before merge_affine_dequantize_with_consecutive_ops "common::merge_affine_dequantize_with_consecutive_ops", # after const_deduplication and dead_code_elimination "common::expand_dynamic_linear", # if weight or bias were not merged into constexpr, then expand linear to matmul + add "common::fuse_transpose_matmul", # there might be left over transpose that got created in hoping to use linear, but now can be fused back with matmul "common::dead_code_elimination", # fused transposes become orphans thus can be elimianted "common::const_deduplication", # additional consts may be introduced during merging dequantize and expanding linear "common::loop_invariant_elimination", "common::noop_elimination", "common::dedup_op_and_var_names", "common::reduce_transposes", # fuse_layernorm_or_instancenorm can potentially add transposes "common::remove_redundant_ops", "common::topological_reorder", "common::dead_code_elimination", # always end with dce ] _PALETTIZATION_PASSES: List[Text] = [ "compression::palettize_weights", ] _SPARSIFICATION_PASSES: List[Text] = [ "compression::prune_weights", ] _FRONTEND_TORCH_PASSES: List[Text] = [ "common::dead_code_elimination", "common::loop_invariant_elimination", "common::dead_code_elimination", "torch::torch_upsample_to_core_upsample", "torch::torch_tensor_assign_to_core", ] _FRONTEND_TF1_PASSES: List[Text] = [ "common::dead_code_elimination", "common::loop_invariant_elimination", "tensorflow::backfill_make_list_elem_type", # DCE to reduce tf_lstm_block outputs and allow lstm_rewrite to # ssa lstm "common::dead_code_elimination", # tensorflow::tf_lstm_to_core_lstm must come before # tensorflow::expand_tf_lstm "tensorflow::tf_lstm_to_core_lstm", "tensorflow::expand_tf_lstm", ] _FRONTEND_TF2_PASSES: List[Text] = [ "common::dead_code_elimination", "common::loop_invariant_elimination", # tensorflow2::remove_vacuous_cond should come before # tensorflow::backfill_make_list_elem_type. 
"tensorflow2::remove_vacuous_cond", "tensorflow::backfill_make_list_elem_type", # DCE to reduce tf_lstm_block outputs and allow lstm_rewrite to # ssa lstm "common::dead_code_elimination", # tensorflow::tf_lstm_to_core_lstm must come before # tensorflow::expand_tf_lstm "tensorflow::tf_lstm_to_core_lstm", "tensorflow::expand_tf_lstm", ] _BACKEND_MIL_PASSES: List[Text] = [ "common::const_elimination", "mil_backend::adjust_io_to_supported_types", "mil_backend::insert_image_preprocessing_ops", "mil_backend::fuse_activation_silu", "mil_backend::fuse_pow2_sqrt", "common::const_elimination", # rank0_expand_dims_swap might introduce some new const tensor "common::const_deduplication", # after all consts have been settled "common::cast_optimization", "common::dead_code_elimination", "mil_backend::sanitize_name_strings", "common::dedup_op_and_var_names", "nn_backend::handle_unused_inputs", # must come after dce. ] _BACKEND_NN_PASSES: List[Text] = [ "nn_backend::decompose_conv1d", # at the beginning of nn pass "nn_backend::commingle_loop_vars", "nn_backend::handle_return_inputs_as_outputs", "common::const_elimination", "common::const_deduplication", # after all consts have been settled # "remove_redundant_ops" pass should be applied towards the end, once other graph passes have done their optimizations. # For instance, it should come after passes such as "reduce_transpose" that can introduce redundant transposes # in the network (while reducing the total number of transposes), and after passes such as "fuse_layernorm_or_instancenorm" # which detects patterns that involve redundant ops ("sub") etc. "common::remove_redundant_ops", "common::dead_code_elimination", "nn_backend::handle_unused_inputs", # must come after dce. "nn_backend::alert_return_type_cast", # must be at the end. ] class PassPipeline: """ A pipeline that contains graph passes. Create a default pipeline (with all default graph passes that will operate on the program): .. sourcecode:: python pipeline = ct.PassPipeline.DEFAULT Create an empty pipeline (this will result in no graph passes being applied to the model): .. sourcecode:: python pipeline = ct.PassPipeline.EMPTY Add passes to pipeline: .. sourcecode:: python pipeline = ct.PassPipeline.DEFAULT pipeline.append_pass("common::reduce_transposes") pipeline.insert_pass(index=0, pass_name="common::reduce_transposes") # Can also specify all passes by setting the passes of the pipeline. pipeline.passes = ["common::reduce_transposes", "common::add_fp16_cast"] Remove passes: .. sourcecode:: python # Remove a pass at a specific index. pipeline.remove_pass(index=10) # Remove passes by names. pipeline.remove_passes({"common::add_fp16_cast", "common::reduce_transposes"}) Inspect passes in the pipeline: .. sourcecode:: python # Get all passes. pass_names = pipeline.passes # Find indexes of a specific pass. pass_indexes = [ idx for idx, pass_name in enumerate(pass_names) if pass_names[idx] == "common::reduce_transposes" ] Set options for a specific pass: .. 
sourcecode:: python pipeline = ct.PassPipeline.DEFAULT pipeline.set_options( pass_name="common::const_elimination", options={"skip_const_by_size": "100000"}, ) """ # TODO: rdar://121242189 ([Infra] Have a better way to handle predefined pass pipeline) _PIPELINE_NAME_TO_PASSES = { "default": _COMMON_PASSES + _CLEANUP_PASSES, "cleanup": _CLEANUP_PASSES, "default_palettization": _PALETTIZATION_PASSES + _COMMON_PASSES + _CLEANUP_PASSES, "default_sparsification": _SPARSIFICATION_PASSES + _COMMON_PASSES + _CLEANUP_PASSES, "empty": [], # Frontend pipelines. "frontend_milinternal": [], "frontend_pytorch": _FRONTEND_TORCH_PASSES, "frontend_tensorflow": _FRONTEND_TF1_PASSES, "frontend_tensorflow2": _FRONTEND_TF2_PASSES, # Backend pipelines. "backend_mlprogram": _BACKEND_MIL_PASSES, "backend_neuralnetwork": _BACKEND_NN_PASSES, "backend_milinternal": [], } def __init__(self, pass_names=None, pipeline_name="default"): if pass_names is None: pass_names = _COMMON_PASSES + _CLEANUP_PASSES self._pass_names: List[Text] = pass_names self._pass_options: Dict[Text, List[PassOption]] = dict() self._pipeline_name = pipeline_name def __str__(self): return self._pipeline_name @property def passes(self): return self._pass_names @passes.setter def passes(self, passes: List[Text]): for pass_name in passes: if pass_name not in PASS_REGISTRY: raise ValueError(f"The pass {pass_name} is not registered.") self._pass_names = list(passes) @property def pipeline_name(self): return self._pipeline_name @pipeline_name.setter def pipeline_name(self, pipeline_name: Text): self._pipeline_name = pipeline_name def append_pass(self, pass_name: Text): """Append a pass at the end of the current passes in the pipeline.""" if pass_name not in PASS_REGISTRY: raise ValueError(f"The pass {pass_name} is not registered.") self._pass_names.append(pass_name) def insert_pass(self, index: int, pass_name: Text) -> None: """Adds a pass at a specific index""" if pass_name not in PASS_REGISTRY: raise ValueError(f"The pass {pass_name} is not registered.") self._pass_names.insert(index, pass_name) def remove_pass(self, index: int) -> None: """Removes a pass at a specific index.""" del self._pass_names[index] def remove_passes(self, passes_names: Union[Set[Text], List[Text]]) -> None: """Removes all passes with specific name.""" self._pass_names = [ pass_name for pass_name in self._pass_names if pass_name not in passes_names ] def get_options(self, pass_name: Text) -> Optional[List[PassOption]]: """ Gets options of a pass that has been set by the user. Return None if the pass doesn't have any associated option set by the user. """ return self._pass_options.get(pass_name, None) def get_all_options(self) -> Dict[Text, List[PassOption]]: """Gets all options in the pipeline.""" return self._pass_options def set_options(self, pass_name: Text, options: Dict[Text, Text], override: bool = True): """Sets options for a specific pass.""" if self._pass_options.get(pass_name, None): if not override: raise ValueError(f"The pass {pass_name} already has associated options.") else: logger.warning(f"The pass {pass_name} already has associated options. Override the existing options.") pass_options: List[PassOption] = [] for option_name, option_val in options.items(): pass_option = PassOption(option_name=option_name, option_val=option_val) pass_options.append(pass_option) self._pass_options[pass_name] = pass_options def set_options_by_another_pipeline(self, other_pipeline: PassPipeline): """ Convenience method for setting options from another pipeline's options. 
For each option in other_pipeline, set it if it's also applicable to this pipeline. """ for pass_name, options in other_pipeline.get_all_options().items(): if pass_name in self.passes: self._pass_options[pass_name] = options def validate(self): """Validates the pipeline (including options).""" pass_names_set = set(self._pass_names) for pass_name in self._pass_options.keys(): if pass_name not in pass_names_set: raise ValueError( f"This pass pipeline is not valid. The pass {pass_name} has " f"associated options but it's not in the passes. Passes in this " f"pipeline: {self._pass_names}" ) @classmethod def get_pipeline(cls, pipeline_name: Text) -> PassPipeline: """ Gets a pipeline based on the name. Raises an error if no pipeline is found. Available Pipelines are defined in _PIPELINE_NAME_TO_PASSES """ if pipeline_name not in cls._PIPELINE_NAME_TO_PASSES: raise ValueError( f"There is no pipeline for `{pipeline_name}`. " f"Available pipelines: {cls._PIPELINE_NAME_TO_PASSES.keys()}" ) # We need to copy the pass names when initialize a PassPipeline object, # to prevent the member functions of PassPipeline from potentially modifying the original # data in _PIPELINE_NAME_TO_PASSES. passes = list(cls._PIPELINE_NAME_TO_PASSES[pipeline_name]) return PassPipeline(passes, pipeline_name) @classmethod def list_available_pipelines(cls) -> List[str]: """List all available pipelines.""" return list(cls._PIPELINE_NAME_TO_PASSES.keys()) """ ======================================= Pre-defined PassPipeline configurations ======================================= """ @_classproperty def EMPTY(cls) -> PassPipeline: """Creates an empty pipeline without any pass.""" return PassPipeline(pass_names=[]) @_classproperty def DEFAULT(cls) -> PassPipeline: """Creates a pipeline that the converter uses by default.""" return cls.get_pipeline("default") @_classproperty def CLEANUP(cls) -> PassPipeline: """Create a pipeline that contains cleanup passes.""" return cls.get_pipeline("cleanup") @_classproperty def DEFAULT_PALETTIZATION(cls) -> PassPipeline: """Create a default palettization pipeline to convert a compressed source model""" # We use delayed import to avoid circular import from coremltools.optimize.coreml import OpPalettizerConfig, OptimizationConfig pipeline = cls.get_pipeline("default_palettization") # set default palettization config = OptimizationConfig(global_config=OpPalettizerConfig(mode="unique")) pipeline.set_options("compression::palettize_weights", {"config": config}) return pipeline @_classproperty def DEFAULT_PRUNING(cls) -> PassPipeline: """Create a default sparsification pipeline to convert a compressed source model""" # We use delayed import to avoid circular import from coremltools.optimize.coreml import OpThresholdPrunerConfig, OptimizationConfig pipeline = cls.get_pipeline("default_sparsification") # set default sparsification config = OptimizationConfig( global_config=OpThresholdPrunerConfig( threshold=1e-12, ) ) pipeline.set_options("compression::prune_weights", {"config": config}) return pipeline class PassPipelineManager: @staticmethod @_profile def apply_pipeline(prog: Program, pass_pipeline: PassPipeline): """Apply a pass pipeline to a program, which modifies the program in-place.""" if pass_pipeline is None: raise ValueError("The pass_pipeline cannot be None.") pass_pipeline.validate() prog.validate() logger.debug(f"Program before {pass_pipeline} pipeline:\n{prog}") for pass_name in tqdm( pass_pipeline.passes, desc=f"Running MIL {pass_pipeline} pipeline", unit=" passes", ): 
logger.debug(f'Performing pass: "{pass_name}"') pass_options = pass_pipeline.get_options(pass_name) if pass_options is not None: logger.debug( f"The graph pass options for {pass_name} is set to {pass_options}. " f"It will change the pass behavior. Make sure the option is intended." ) if pass_name.startswith("experimental::"): logger.warning( f"The graph pass {pass_name} is under experimental development, " f"and the API could be changed in the future." ) graph_pass = PASS_REGISTRY[pass_name] graph_pass.set_options(pass_options) try: graph_pass(prog) except Exception as e: logger.error( f"\n\nERROR - '{pass_name}' graph pass produces the following error:\n" ) raise e # re-raise exception # After dead code elimination, we should check if the program misses any essential scope info check_essential_scope = pass_name == "common::dead_code_elimination" prog.validate(check_essential_scope=check_essential_scope) logger.debug(f"Program after {pass_pipeline} pipeline:\n{prog}") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/pass_registry.py0000644000000000000000000000434314672066616025347 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import inspect from typing import Dict, Optional, Text, Type from coremltools import _logger as logger from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass class PassRegistry: def __init__(self): """ Store the pass class instead of instance to avoid the same instance got modified by several callers. """ self.passes: Dict[Text, Type[AbstractGraphPass]] = {} def __getitem__(self, pass_id: Text) -> AbstractGraphPass: """ pass_id: namespace::func_name (e.g., 'common::const_elimination') """ if pass_id not in self.passes: raise KeyError(f"Pass {pass_id} not found") current_pass = self.passes[pass_id] # The current_pass could be a PassContainer instance if registered by register_generic_pass. return current_pass() if inspect.isclass(current_pass) else current_pass def __contains__(self, pass_id: Text) -> bool: return pass_id in self.passes def add( self, namespace: Text, pass_cls: Type[AbstractGraphPass], override: bool, name: Optional[Text], ): cls_name = pass_cls.__name__ if name is None else name pass_id = namespace + "::" + cls_name logger.debug(f"Registering pass {pass_id}") if pass_id in self.passes and not override: raise KeyError(f"Pass {pass_id} already registered.") self.passes[pass_id] = pass_cls PASS_REGISTRY = PassRegistry() def register_pass(namespace: Text, override: bool = False, name: Optional[Text] = None): """ namespaces like {'common', 'nn_backend', , } Params: override: indicate the graph pass can override an existing pass with the same name. name: name of the graph pass. 
Default to class name if not provided """ def class_wrapper(pass_cls: Type[AbstractGraphPass]): PASS_REGISTRY.add(namespace, pass_cls, override, name) return pass_cls return class_wrapper ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1726511965.249547 coremltools-8.0/coremltools/converters/mil/mil/passes/tests/0000755000000000000000000000000014672075535023235 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/tests/__init__.py0000644000000000000000000000033214672066616025344 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/tests/test_cleanup_passes.py0000644000000000000000000030676614672066616027675 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy import itertools import unittest import numpy as np import pytest from mock import patch import coremltools as ct from coremltools.converters.mil import mil from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Function, Symbol, get_new_symbol, types from coremltools.converters.mil.mil.passes.defs.cleanup import topological_reorder from coremltools.converters.mil.mil.passes.defs.cleanup.remove_redundant_ops import ( remove_redundant_ops, ) from coremltools.converters.mil.mil.passes.pass_registry import PASS_REGISTRY from coremltools.converters.mil.testing_utils import ( apply_pass_and_basic_check, assert_model_is_valid, assert_op_count_match, assert_same_output_names, get_op_names_in_program, get_op_types_in_block, get_op_types_in_program, ) from .test_passes import _VALIDATE_MODEL, CONSTEXPR_FUNCS, CONSTEXPR_OPS class TestConstDeduplication: def test_const_deduplication_cross_functions(self): val_1 = np.random.rand( 100, ) val_2 = np.random.rand( 100, ) val_3 = np.random.rand( 100, ) @mb.function( input_specs=[mb.TensorSpec((100,))], ) def func(x): const_1 = mb.const(val=val_1) const_2 = mb.const(val=val_1) const_3 = mb.const(val=val_2) const_4 = mb.const(val=val_3) x = mb.add(x=x, y=const_1) x = mb.add(x=x, y=const_2) x = mb.add(x=x, y=const_3) return mb.add(x=x, y=const_4) @mb.function( input_specs=[mb.TensorSpec((100,))], ) def func_1(x): const_5 = mb.const(val=val_1) const_6 = mb.const(val=val_2) x = mb.add(x=x, y=const_5) return mb.add(x=x, y=const_6) prog = mil.Program() prog.add_function("main", func) prog.add_function("func_1", func_1) # In the above case, const_1 and const_2 in main is going to deduplicated in a single const op first. # And it will share the same weight_id with const_5 in func_1. # const_3 / const_6 are going to share the same weight_id across functions. # While const_6.weight_id remains None. 
graph_pass = PASS_REGISTRY["common::const_deduplication"] graph_pass._deduplicate_const_across_functions(prog) # validate the prog main_func = prog.functions["main"] expected_ops = ["const", "const", "const", "add", "add", "add", "add"] assert get_op_types_in_block(main_func, skip_const_ops=False) == expected_ops const_ops = main_func.find_ops(op_type="const") assert const_ops[0].weight_id == 0 assert const_ops[1].weight_id == 1 assert const_ops[2].weight_id is None func_1 = prog.functions["func_1"] expected_ops = [ "const", "const", "add", "add", ] assert get_op_types_in_block(func_1, skip_const_ops=False) == expected_ops const_ops = func_1.find_ops(op_type="const") assert const_ops[0].weight_id == 0 assert const_ops[1].weight_id == 1 def test_const_deduplication_cross_functions_from_same_source(self): """ In the case of users copying a source function into two functions, same weight should be assigned with the same weighr_id as well. """ val_1 = np.random.rand( 100, ) val_2 = np.random.rand( 100, ) val_3 = np.random.rand( 100, ) @mb.function( input_specs=[mb.TensorSpec((100,))], ) def func(x): const_1 = mb.const(val=val_1) const_2 = mb.const(val=val_1) const_3 = mb.const(val=val_2) const_4 = mb.const(val=val_3) x = mb.add(x=x, y=const_1) x = mb.add(x=x, y=const_2) x = mb.add(x=x, y=const_3) return mb.add(x=x, y=const_4) prog = mil.Program() prog.add_function("func_1", func) prog.add_function("func_2", func) graph_pass = PASS_REGISTRY["common::const_deduplication"] graph_pass._deduplicate_const_across_functions(prog) # validate the prog func_1 = prog.functions["func_1"] expected_ops = ["const", "const", "const", "add", "add", "add", "add"] assert get_op_types_in_block(func_1, skip_const_ops=False) == expected_ops func_2 = prog.functions["func_2"] assert get_op_types_in_block(func_2, skip_const_ops=False) == expected_ops for func in [prog.functions["func_1"], prog.functions["func_2"]]: const_ops = func.find_ops(op_type="const") assert const_ops[0].weight_id == 0 assert const_ops[1].weight_id == 1 assert const_ops[2].weight_id == 2 @staticmethod def test_const_deduplication_with_threshold(): @mb.program( input_specs=[ mb.TensorSpec(shape=(2,)), ] ) def prog(x): # const_1 and const_2 will not be deduplicated const_1 = [0.0] const_2 = [0.0] const_3 = [0.0, 1.0] const_4 = [0.0, 1.0] # 4 add ops x = mb.add(x=x, y=const_1) x = mb.add(x=x, y=const_2) x = mb.add(x=x, y=const_3) return mb.add(x=x, y=const_4) graph_pass = PASS_REGISTRY["common::const_deduplication"] graph_pass.const_threshold = 2 apply_pass_and_basic_check(prog, graph_pass) # check the graph pass assert_op_count_match(prog, expect=3, op="const") const_ops = prog.functions["main"].find_ops(op_type="const") assert const_ops[0].outputs[0].val.tolist() == [0.0] assert const_ops[1].outputs[0].val.tolist() == [0.0] assert const_ops[2].outputs[0].val.tolist() == [0.0, 1.0] @staticmethod def test_const_deduplication_with_threshold_for_pad(): @mb.program( input_specs=[ mb.TensorSpec(shape=(100,)), ] ) def prog(x): # both constant_val and pad inputs for two pad ops are deduplicaed c_zero_scalar = np.float32(0.0) x = mb.pad(x=x, pad=[1, 0], mode="constant", constant_val=c_zero_scalar) return mb.pad(x=x, pad=[1, 0], mode="constant", constant_val=c_zero_scalar) graph_pass = PASS_REGISTRY["common::const_deduplication"] graph_pass.const_threshold = -1 apply_pass_and_basic_check(prog, graph_pass) # check the graph pass assert_op_count_match(prog, expect=4, op="const") const_ops = prog.functions["main"].find_ops(op_type="const") assert 
const_ops[0].outputs[0].val.tolist() == [1, 0] assert const_ops[1].outputs[0].val == "constant" assert const_ops[2].outputs[0].val == 0.0 assert const_ops[3].outputs[0].val == "constant" @staticmethod @pytest.mark.parametrize( "constexpr_op", CONSTEXPR_OPS, ) def test_constexpr_deduplication_with_threshold(constexpr_op): BATCH_DIM = 1 SEQUENCE_LENGTH = 1 ENCODING_DIM = 1 EMBEDDING_DIM = 2 @mb.program( input_specs=[ mb.TensorSpec(shape=(BATCH_DIM, SEQUENCE_LENGTH, ENCODING_DIM)), mb.TensorSpec(shape=(BATCH_DIM, SEQUENCE_LENGTH, ENCODING_DIM)), ] ) def prog(q, k): weight_q = CONSTEXPR_FUNCS[constexpr_op]((EMBEDDING_DIM, ENCODING_DIM), seed=19) weight_k = CONSTEXPR_FUNCS[constexpr_op]((EMBEDDING_DIM, ENCODING_DIM), seed=19) q_e = mb.linear(x=q, weight=weight_q) k_e = mb.linear(x=k, weight=weight_k) return mb.matmul(x=q_e, y=k_e, transpose_y=True) graph_pass = PASS_REGISTRY["common::const_deduplication"] graph_pass.const_threshold = -1 apply_pass_and_basic_check(prog, graph_pass) # check the graph pass assert_op_count_match(prog, expect=1, op=constexpr_op) @staticmethod def test_str_should_not_be_deduplicated(): @mb.program( input_specs=[ mb.TensorSpec(shape=(1,)), ] ) def prog(x): x = mb.cast(x=x, dtype="int32") return mb.cast(x=x, dtype="int32") graph_pass = PASS_REGISTRY["common::const_deduplication"] graph_pass.const_threshold = -1 apply_pass_and_basic_check(prog, graph_pass) # check the graph pass assert_op_count_match(prog, expect=2, op="const") const_ops = prog.functions["main"].find_ops(op_type="const") assert const_ops[0].outputs[0].val == "int32" assert const_ops[1].outputs[0].val == "int32" @staticmethod def test_bool_should_not_be_deduplicated(): @mb.program( input_specs=[ mb.TensorSpec(shape=(2,)), mb.TensorSpec(shape=(2,)), ] ) def prog(x, y): return mb.argsort(x=x, axis=-1, ascending=False), mb.argsort( x=y, axis=-1, ascending=False ) graph_pass = PASS_REGISTRY["common::const_deduplication"] graph_pass.const_threshold = -1 apply_pass_and_basic_check(prog, graph_pass) # check the graph pass assert_op_count_match(prog, expect=3, op="const") const_ops = prog.functions["main"].find_ops(op_type="const") assert const_ops[0].outputs[0].val == -1 assert const_ops[1].outputs[0].val == False assert const_ops[2].outputs[0].val == False @pytest.mark.parametrize( "q_weight_key, k_weight_key", itertools.product( (None, "weight", "q_weight"), (None, "weight", "k_weight"), ), ) def test_const_deduplication(self, q_weight_key, k_weight_key): BATCH_DIM = 5 SEQUENCE_LENGTH = 4 ENCODING_DIM = 256 EMBEDDING_DIM = 128 weight = np.random.rand(EMBEDDING_DIM, ENCODING_DIM) bias = np.random.rand(EMBEDDING_DIM) @mb.program( input_specs=[ mb.TensorSpec(shape=(BATCH_DIM, SEQUENCE_LENGTH, ENCODING_DIM)), mb.TensorSpec(shape=(BATCH_DIM, SEQUENCE_LENGTH, ENCODING_DIM)), ] ) def prog(q, k): q_e = mb.linear(x=q, weight=weight, bias=bias) q_e.op.weight.op.weight_key = q_weight_key k_e = mb.linear(x=k, weight=weight, bias=bias) k_e.op.weight.op.weight_key = k_weight_key attention = mb.matmul(x=q_e, y=k_e, transpose_y=True) return attention prev_prog, _, _ = apply_pass_and_basic_check(prog, "common::const_deduplication") assert_op_count_match(prev_prog, expect=6, op="const") # bias will always be deduplicated # weight will be deduplicated only when q and k have same weight key assert_op_count_match(prog, expect=4 if q_weight_key == k_weight_key else 5, op="const") @pytest.mark.parametrize( "constexpr_op, q_bias_key, k_bias_key", itertools.product( CONSTEXPR_OPS, (None, "weight", "q_weight"), (None, "weight", 
"k_weight"), ), ) def test_constexpr_deduplication(self, constexpr_op, q_bias_key, k_bias_key): BATCH_DIM = 5 SEQUENCE_LENGTH = 4 ENCODING_DIM = 256 EMBEDDING_DIM = 128 @mb.program( input_specs=[ mb.TensorSpec(shape=(BATCH_DIM, SEQUENCE_LENGTH, ENCODING_DIM)), mb.TensorSpec(shape=(BATCH_DIM, SEQUENCE_LENGTH, ENCODING_DIM)), ] ) def prog(q, k): weight_q = CONSTEXPR_FUNCS[constexpr_op]((EMBEDDING_DIM, ENCODING_DIM), seed=19) weight_k = CONSTEXPR_FUNCS[constexpr_op]((EMBEDDING_DIM, ENCODING_DIM), seed=19) bias_q = CONSTEXPR_FUNCS[constexpr_op]((EMBEDDING_DIM,), seed=29) bias_k = CONSTEXPR_FUNCS[constexpr_op]((EMBEDDING_DIM,), seed=29) q_e = mb.linear(x=q, weight=weight_q, bias=bias_q) q_e.op.bias.op.weight_key = q_bias_key k_e = mb.linear(x=k, weight=weight_k, bias=bias_k) k_e.op.bias.op.weight_key = k_bias_key attention = mb.matmul(x=q_e, y=k_e, transpose_y=True) return attention prev_prog, _, _ = apply_pass_and_basic_check(prog, "common::const_deduplication") assert_op_count_match(prev_prog, expect=4, op=constexpr_op) # weight will always be deduplicated # bias will be deduplicated only when q and k have same weight key assert_op_count_match(prog, expect=2 if q_bias_key == k_bias_key else 3, op=constexpr_op) def test_const_deduplication_as_outputs(self): """ If the duplicated constants are block outputs, we should not remove them. """ # case 1: # const_2 can be eliminated since it is not block output const = np.random.rand(40, 20, 30) @mb.program( input_specs=[ mb.TensorSpec( shape=( 40, 20, 30, ) ) ] ) def prog(x): const_1 = mb.const(val=const, name="const_1") const_2 = mb.const(val=const, name="const_2") x = mb.relu(x=x) x = mb.add(x=x, y=const_2) return x, const_1 prev_prog, _, _ = apply_pass_and_basic_check(prog, "common::const_deduplication") assert_op_count_match(prev_prog, expect=2, op="const") assert_op_count_match(prog, expect=1, op="const") assert prog.functions["main"].outputs[1].name == "const_1" # case 2: # const_2 can not be eliminated since it is a block output const = np.random.rand(40, 20, 30) @mb.program( input_specs=[ mb.TensorSpec( shape=( 40, 20, 30, ) ) ] ) def prog(x): const_1 = mb.const(val=const, name="const_1") const_2 = mb.const(val=const, name="const_2") x = mb.relu(x=x) x = mb.add(x=x, y=const_2) return x, const_1, const_2 prev_prog, _, _ = apply_pass_and_basic_check(prog, "common::const_deduplication") assert_op_count_match(prev_prog, expect=2, op="const") assert_op_count_match(prog, expect=2, op="const") assert prog.functions["main"].outputs[1].name == "const_1" assert prog.functions["main"].outputs[2].name == "const_2" @pytest.mark.skip("rdar://109374995 consts are not shared across blocks") def test_const_deduplication_multiple_blocks(self): weight = np.random.rand(5, 3, 2, 2) @mb.program(input_specs=[mb.TensorSpec(shape=(4, 3, 8, 8))]) def prog(x): def _true_fn(): return mb.conv(x=x, weight=weight, pad_type="valid") def _false_fn(): y = mb.mul(x=x, y=2.0) return mb.conv(x=y, weight=weight, pad_type="valid") x_gt_0_tensor = mb.greater(x=x, y=0.0) x_gt_0 = mb.slice_by_index(x=x_gt_0_tensor, begin=(0, 0, 0, 0), end=(1, 1, 1, 1)) return mb.cond(pred=x_gt_0, _true_fn=_true_fn, _false_fn=_false_fn) prev_prog, _, _ = apply_pass_and_basic_check(prog, "common::const_deduplication") assert_op_count_match(prev_prog, expect=8, op="const") assert_op_count_match(prog, expect=6, op="const") class TestConstElimination: def test_const_elimination(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): a = np.random.rand(2, 4).astype(np.float32) double_a = 
mb.add(x=a, y=a) return mb.add(x=x, y=double_a) assert_op_count_match(prog, expect=2, op="const") prev_prog = copy.deepcopy(prog) PASS_REGISTRY["common::const_elimination"](prog) assert_same_output_names(prev_prog, prog) assert_op_count_match(prog, expect=3, op="const") if _VALIDATE_MODEL: assert_model_is_valid(prog, {"x": (2, 4)}) def test_const_elimination_nonreplaceable(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): a = np.random.rand(2, 4).astype(np.float16) constexpr_a = mb.constexpr_cast(source_val=a, output_dtype="fp32") double_a = mb.add(x=constexpr_a, y=a.astype(np.float32)) return mb.add(x=x, y=double_a) prev_prog, _, _ = apply_pass_and_basic_check(prog, "common::const_elimination") assert get_op_types_in_program(prev_prog) == ["constexpr_cast", "add", "add"] # Not folded into const because the upstream constexpr_cast op is non-replaceable. assert get_op_types_in_program(prog) == ["constexpr_cast", "add", "add"] def test_force_const_eliminate_nonreplaceable_ops(self): @mb.program(input_specs=[mb.TensorSpec(shape=(3,), dtype=types.int32)]) def prog(x): a = np.random.rand(2, 3, 5).astype(np.float16) constexpr_a = mb.constexpr_cast(source_val=a, output_dtype="fp32") double_a = mb.add(x=constexpr_a, y=a.astype(np.float32)) a_shape = mb.shape(x=double_a) return mb.add(x=x, y=a_shape) assert get_op_types_in_program(prog) == ["constexpr_cast", "add", "shape", "add"] apply_pass_and_basic_check(prog, "common::const_elimination") # still fold shape into const regardless of the non-replaceable upstream # constexpr_cast op, since it only provides a shape assert get_op_types_in_program(prog) == ["constexpr_cast", "add", "add"] apply_pass_and_basic_check(prog, "common::dead_code_elimination") # constexpr_cast(a) and add(a, a) no longer contribute to the output, # so they should get dead code eliminated assert get_op_types_in_program(prog) == ["add"] def test_force_const_eliminate_nonreplaceable_ops_case_2(self): @mb.program( input_specs=[ mb.TensorSpec(shape=(1,), dtype=types.int32), mb.TensorSpec(shape=(2,), dtype=types.int32), ], opset_version=ct.target.iOS17, ) def prog(x, y): a = np.random.rand(2, 3, 5).astype(np.float16) constexpr_a = mb.constexpr_cast(source_val=a, output_dtype="fp32") reshape_shape = mb.concat(values=[y, [5]], axis=0) reshape = mb.reshape(x=constexpr_a, shape=reshape_shape) a_shape = mb.shape(x=reshape) a_shape_int16 = mb.cast(x=a_shape, dtype="int16") # Even though the gather op has a constexpr_cast op upstream, # it can still be removed by const elimination. gather = mb.gather( x=a_shape, indices=[ 2, ], axis=0, ) gather_int32 = mb.cast(x=gather, dtype="int32") return mb.add(x=x, y=gather) assert get_op_types_in_program(prog) == [ "constexpr_cast", "concat", "reshape", "shape", "cast", "gather", "cast", "add", ] apply_pass_and_basic_check(prog, "common::const_elimination") # still const-fold gather into const regardless of the non-replaceable upstream # constexpr_cast op, since it only provides the metadata (shape) assert get_op_types_in_program(prog) == [ "constexpr_cast", "concat", "reshape", "shape", "cast", "add", ] apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == ["add"] @patch( "coremltools.converters.mil.mil.passes.defs.cleanup.const_elimination._skip_const_by_size", 1000, ) def test_const_elimination_larger_than_threshold(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): # Construct a 10 x 10 matrix (100 elements) which is smaller than the threshold (1000).
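# Illustrative note (not part of the original test): the @patch decorator above overrides the
# pass's _skip_const_by_size threshold to 1000 elements, so the small 10 x 10 result below is
# folded into a const, while the 100 x 100 result in prog_large_const_size stays as a matmul op.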
tmp = mb.range_1d(start=0, end=10, step=1) tmp_x = mb.reshape(x=tmp, shape=[-1, 1]) tmp_y = mb.reshape(x=tmp, shape=[1, -1]) return mb.matmul(x=tmp_x, y=tmp_y) @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog_large_const_size(x): # Construct a 100 x 100 matrix (10000 elements) which is larger than the threshold (1000). tmp = mb.range_1d(start=0, end=100, step=1) tmp_x = mb.reshape(x=tmp, shape=[-1, 1]) tmp_y = mb.reshape(x=tmp, shape=[1, -1]) return mb.matmul(x=tmp_x, y=tmp_y) prev_prog, _, _ = apply_pass_and_basic_check(prog, "common::const_elimination") assert get_op_types_in_program(prev_prog) == [ "range_1d", "reshape", "reshape", "matmul", ] # All ops (range_1d, reshape, matmul) constructing that 10x10 matrix are folded into a const. assert get_op_types_in_program(prog) == [] prev_prog_large_const_size, _, _ = apply_pass_and_basic_check( prog_large_const_size, "common::const_elimination" ) assert get_op_types_in_program(prev_prog_large_const_size) == [ "range_1d", "reshape", "reshape", "matmul", ] # The matmul op constructing the large matrix is kept because its size is larger than the threshold. assert get_op_types_in_program(prog_large_const_size) == ["matmul"] class TestDeadCodeElimination: def test_dead_code_elimination(self): @mb.program( input_specs=[ mb.TensorSpec(shape=(2, 4)), mb.TensorSpec(shape=(2, 4)), ] ) def program0(x, y): # the following three unused ops should be eliminated a = mb.const(val=np.zeros(shape=(1,))) b = mb.const(val=np.zeros(shape=(1,))) _ = mb.add(x=a, y=b) return mb.add(x=x, y=y) assert_op_count_match(program0, expect=4) prev_prog = copy.deepcopy(program0) PASS_REGISTRY["common::dead_code_elimination"](program0) assert_same_output_names(prev_prog, program0) assert_op_count_match(program0, expect=1) if _VALIDATE_MODEL: assert_model_is_valid(program0, {"x": (2, 4), "y": (2, 4)}) @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def program1(x): weights_val = np.random.rand(4, 2).T.astype(np.float32) weights = mb.const(val=weights_val) bias_val = np.random.rand(2).astype(np.float32) bias = mb.const(val=bias_val) # the unused op and its inputs should be eliminated weights_for_matmul = mb.transpose(x=weights, perm=[1, 0]) mb.matmul(x=x, y=weights_for_matmul) return mb.linear(x=x, weight=weights, bias=bias) assert_op_count_match(program1, expect=8) prev_prog = copy.deepcopy(program1) PASS_REGISTRY["common::dead_code_elimination"](program1) assert_same_output_names(prev_prog, program1) assert_op_count_match(program1, expect=3) if _VALIDATE_MODEL: assert_model_is_valid(program1, {"x": (2, 4)}) class TestDedupOpAndVarNames(unittest.TestCase): def test_unchanged(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): x = mb.reshape(x=x, shape=(1, 8), name="reshape") return x prev_prog, _, block = apply_pass_and_basic_check(prog, "common::dedup_op_and_var_names") self.assertEqual(get_op_types_in_program(prev_prog), ["reshape"]) self.assertEqual(get_op_names_in_program(prev_prog), ["reshape"]) self.assertEqual(get_op_types_in_program(prog), ["reshape"]) self.assertEqual(get_op_names_in_program(prog), ["reshape"]) assert_model_is_valid( prog, {"x": (2, 4)}, expected_output_shapes={block.outputs[0].name: (1, 8)}, ) def test_op_name_duplicated_once(self): @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20))]) def prog(x): x = mb.cast(x=x, dtype="fp16", name="castop") x = mb.cast(x=x, dtype="fp32", name="castop") x = mb.square(x=x, name="square_last") return x prev_prog, _, block = apply_pass_and_basic_check(prog, "common::dedup_op_and_var_names")
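# As the assertions below illustrate, the dedup_op_and_var_names pass keeps the first occurrence
# of a duplicated op name and renames later occurrences by appending a numeric suffix
# (e.g. "castop" -> "castop_1").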
self.assertEqual(get_op_types_in_program(prev_prog), ["cast", "cast", "square"]) self.assertEqual(get_op_names_in_program(prev_prog), ["castop", "castop", "square_last"]) self.assertEqual(get_op_types_in_program(prog), ["cast", "cast", "square"]) self.assertEqual(get_op_names_in_program(prog), ["castop", "castop_1", "square_last"]) assert_model_is_valid( prog, {"x": (10, 20)}, expected_output_shapes={block.outputs[0].name: (10, 20)}, ) def test_op_name_duplicated_many(self): @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20))]) def prog(x): x = mb.cast(x=x, dtype="fp16", name="castop") x = mb.cast(x=x, dtype="fp16", name="castop") x = mb.cast(x=x, dtype="int32", name="castop_2") x = mb.cast(x=x, dtype="fp16", name="castop") x = mb.cast(x=x, dtype="fp32", name="castop_2") x = mb.square(x=x, name="square") return x prev_prog, _, block = apply_pass_and_basic_check(prog, "common::dedup_op_and_var_names") self.assertEqual( get_op_types_in_program(prev_prog), ["cast", "cast", "cast", "cast", "cast", "square"] ) self.assertEqual( get_op_names_in_program(prev_prog), ["castop", "castop", "castop_2", "castop", "castop_2", "square"], ) self.assertEqual( get_op_types_in_program(prog), ["cast", "cast", "cast", "cast", "cast", "square"] ) self.assertEqual( get_op_names_in_program(prog), ["castop", "castop_1", "castop_2", "castop_3", "castop_2_1", "square"], ) assert_model_is_valid( prog, {"x": (10, 20)}, expected_output_shapes={block.outputs[0].name: (10, 20)}, ) def test_input_name_shadow(self): @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20))]) def prog(x): # op name "x" results in output var name "x", which shadows prog # input var name "x" x = mb.transpose(x=x, perm=[1, 0], name="x") x = mb.relu(x=x, name="relu") return x prev_prog, _, block = apply_pass_and_basic_check(prog, "common::dedup_op_and_var_names") self.assertEqual(get_op_types_in_program(prev_prog), ["transpose", "relu"]) self.assertEqual(get_op_names_in_program(prev_prog), ["x", "relu"]) self.assertEqual(get_op_types_in_program(prog), ["transpose", "relu"]) self.assertEqual(get_op_names_in_program(prog), ["x", "relu"]) op = prog["main"].find_ops(op_type="transpose")[0] self.assertEqual("x_1", op.outputs[0].name) assert_model_is_valid( prog, {"x": (10, 20)}, expected_output_shapes={block.outputs[0].name: (20, 10)}, ) def test_nested_block(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1,))]) def prog(x): def true_fn(): # returns var with name x shadows input 'x' return mb.add(x=x, y=1.0, name="x") def false_fn(): # two ops with name "x" return mb.add(x=x, y=-1.0, name="x") pred = mb.equal(x=mb.squeeze(x=x), y=1.0) return mb.cond(pred=pred, _true_fn=true_fn, _false_fn=false_fn) cond_op = prog.functions["main"].operations[-1] assert cond_op.blocks[0].outputs[0].name == "x" assert cond_op.blocks[1].outputs[0].name == "x" prev_prog, _, block = apply_pass_and_basic_check(prog, "common::dedup_op_and_var_names") cond_op = prog.functions["main"].operations[-1] assert cond_op.blocks[0].outputs[0].name == "x_1" assert cond_op.blocks[1].outputs[0].name == "x_2" assert_model_is_valid( prog, {"x": (1,)}, expected_output_shapes={block.outputs[0].name: (1,)}, ) class TestExpandDynamicLinear: def test_keep_static_weight_static_bias(self): X_SHAPE = (2, 5) WEIGHT_SHAPE = (3, X_SHAPE[-1]) bias_shape = (WEIGHT_SHAPE[0],) output_shape = (X_SHAPE[0], WEIGHT_SHAPE[0]) quantized_weight = np.random.randint(-127, 128, WEIGHT_SHAPE, np.int8) quantized_bias = np.random.randint(-127, 128, bias_shape, np.int8) @mb.program( 
input_specs=[mb.TensorSpec(shape=X_SHAPE)], opset_version=ct.target.iOS16, ) def prog(x): weight = mb.constexpr_affine_dequantize( quantized_data=quantized_weight, scale=1.2, zero_point=np.int8(3), axis=0, ) bias = mb.constexpr_affine_dequantize( quantized_data=quantized_bias, scale=4.5, zero_point=np.int8(6), axis=0, ) return mb.linear(x=x, weight=weight, bias=bias) prev_prog, _, block = apply_pass_and_basic_check(prog, "common::expand_dynamic_linear") assert get_op_types_in_program(prev_prog) == [ "constexpr_affine_dequantize", "constexpr_affine_dequantize", "linear", ] assert get_op_types_in_program(prog) == get_op_types_in_program(prev_prog) assert_model_is_valid( prog, {"x": X_SHAPE}, expected_output_shapes={block.outputs[0].name: output_shape}, backend=("mlprogram", "fp16"), minimum_deployment_target=ct.target.iOS16, ) def test_expand_static_weight_dynamic_bias(self): X_SHAPE = (2, 5) WEIGHT_SHAPE = (3, X_SHAPE[-1]) bias_shape = (WEIGHT_SHAPE[0],) output_shape = (X_SHAPE[0], WEIGHT_SHAPE[0]) weight = np.random.rand(*WEIGHT_SHAPE) quantized_bias = np.random.randint(-127, 128, bias_shape, np.int8) @mb.program( input_specs=[mb.TensorSpec(shape=X_SHAPE)], opset_version=ct.target.iOS16, ) def prog(x): bias = mb.constexpr_affine_dequantize( quantized_data=quantized_bias, scale=1.2, zero_point=np.int8(3), axis=0, ) screwed_bias = mb.exp(x=bias) return mb.linear(x=x, weight=weight, bias=screwed_bias) prev_prog, _, block = apply_pass_and_basic_check(prog, "common::expand_dynamic_linear") assert get_op_types_in_program(prev_prog) == [ "constexpr_affine_dequantize", "exp", "linear", ] assert get_op_types_in_program(prog) == [ "constexpr_affine_dequantize", "exp", "linear", "add", ] assert_model_is_valid( prog, {"x": X_SHAPE}, expected_output_shapes={block.outputs[0].name: output_shape}, backend=("mlprogram", "fp16"), minimum_deployment_target=ct.target.iOS16, ) def test_expand_dynamic_weight_static_zero_bias(self): X_SHAPE = (2, 5) WEIGHT_SHAPE = (3, X_SHAPE[-1]) output_shape = (X_SHAPE[0], WEIGHT_SHAPE[0]) quantized_weight = np.random.randint(-127, 128, WEIGHT_SHAPE, np.int8) @mb.program( input_specs=[mb.TensorSpec(shape=X_SHAPE)], opset_version=ct.target.iOS16, ) def prog(x): weight = mb.constexpr_affine_dequantize( quantized_data=quantized_weight, scale=1.2, zero_point=np.int8(3), axis=0, ) screwed_weight = mb.exp(x=weight) return mb.linear(x=x, weight=screwed_weight) prev_prog, _, block = apply_pass_and_basic_check(prog, "common::expand_dynamic_linear") assert get_op_types_in_program(prev_prog) == [ "constexpr_affine_dequantize", "exp", "linear", ] assert get_op_types_in_program(prog) == [ "constexpr_affine_dequantize", "exp", "matmul", ] assert_model_is_valid( prog, {"x": X_SHAPE}, expected_output_shapes={block.outputs[0].name: output_shape}, backend=("mlprogram", "fp16"), minimum_deployment_target=ct.target.iOS16, ) def test_expand_dynamic_weight_static_compressed_zero_bias(self): X_SHAPE = (2, 5) WEIGHT_SHAPE = (3, X_SHAPE[-1]) bias_shape = (WEIGHT_SHAPE[0],) output_shape = (X_SHAPE[0], WEIGHT_SHAPE[0]) quantized_weight = np.random.randint(-127, 128, WEIGHT_SHAPE, np.int8) quantized_bias = np.random.randint(-127, 128, bias_shape, np.int8) @mb.program( input_specs=[mb.TensorSpec(shape=X_SHAPE)], opset_version=ct.target.iOS16, ) def prog(x): weight = mb.constexpr_affine_dequantize( quantized_data=quantized_weight, scale=1.2, zero_point=np.int8(3), axis=0, ) bias = mb.constexpr_affine_dequantize( quantized_data=quantized_bias, scale=np.random.rand(*bias_shape), zero_point=quantized_bias, 
axis=0, ) screwed_weight = mb.exp(x=weight) return mb.linear(x=x, weight=screwed_weight, bias=bias) original_prog, _, _ = apply_pass_and_basic_check(prog, "common::expand_dynamic_linear") expanded_prog, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(original_prog) == [ "constexpr_affine_dequantize", "constexpr_affine_dequantize", "exp", "linear", ] assert get_op_types_in_program(expanded_prog) == [ "constexpr_affine_dequantize", "constexpr_affine_dequantize", "exp", "matmul", ] assert get_op_types_in_program(prog) == [ "constexpr_affine_dequantize", "exp", "matmul", ] assert_model_is_valid( prog, {"x": X_SHAPE}, expected_output_shapes={block.outputs[0].name: output_shape}, backend=("mlprogram", "fp16"), minimum_deployment_target=ct.target.iOS16, ) def test_expand_dynamic_weight_static_nonzero_bias(self): X_SHAPE = (2, 5) WEIGHT_SHAPE = (3, X_SHAPE[-1]) bias_shape = (WEIGHT_SHAPE[0],) output_shape = (X_SHAPE[0], WEIGHT_SHAPE[0]) quantized_weight = np.random.randint(-127, 128, WEIGHT_SHAPE, np.int8) bias = np.random.rand(*bias_shape) @mb.program( input_specs=[mb.TensorSpec(shape=X_SHAPE)], opset_version=ct.target.iOS16, ) def prog(x): weight = mb.constexpr_affine_dequantize( quantized_data=quantized_weight, scale=1.2, zero_point=np.int8(3), axis=0, ) screwed_weight = mb.exp(x=weight) return mb.linear(x=x, weight=screwed_weight, bias=bias) prev_prog, _, block = apply_pass_and_basic_check(prog, "common::expand_dynamic_linear") assert get_op_types_in_program(prev_prog) == [ "constexpr_affine_dequantize", "exp", "linear", ] assert get_op_types_in_program(prog) == [ "constexpr_affine_dequantize", "exp", "matmul", "add", ] assert_model_is_valid( prog, {"x": X_SHAPE}, expected_output_shapes={block.outputs[0].name: output_shape}, backend=("mlprogram", "fp16"), minimum_deployment_target=ct.target.iOS16, ) def test_expand_dynamic_weight_dynamic_bias(self): X_SHAPE = (2, 5) WEIGHT_SHAPE = (3, X_SHAPE[-1]) bias_shape = (WEIGHT_SHAPE[0],) output_shape = (X_SHAPE[0], WEIGHT_SHAPE[0]) quantized_weight = np.random.randint(-127, 128, WEIGHT_SHAPE, np.int8) quantized_bias = np.random.randint(-127, 128, bias_shape, np.int8) @mb.program( input_specs=[mb.TensorSpec(shape=X_SHAPE)], opset_version=ct.target.iOS16, ) def prog(x): weight = mb.constexpr_affine_dequantize( quantized_data=quantized_weight, scale=1.2, zero_point=np.int8(3), axis=0, ) bias = mb.constexpr_affine_dequantize( quantized_data=quantized_bias, scale=1.2, zero_point=np.int8(3), axis=0, ) screwed_weight = mb.exp(x=weight) screwed_bias = mb.exp(x=bias) return mb.linear(x=x, weight=screwed_weight, bias=screwed_bias) prev_prog, _, block = apply_pass_and_basic_check(prog, "common::expand_dynamic_linear") assert get_op_types_in_program(prev_prog) == [ "constexpr_affine_dequantize", "constexpr_affine_dequantize", "exp", "exp", "linear", ] assert get_op_types_in_program(prog) == [ "constexpr_affine_dequantize", "constexpr_affine_dequantize", "exp", "exp", "matmul", "add", ] assert_model_is_valid( prog, {"x": X_SHAPE}, expected_output_shapes={block.outputs[0].name: output_shape}, backend=("mlprogram", "fp16"), minimum_deployment_target=ct.target.iOS16, ) class TestReduceMeanFusion: def test_valid_pattern1(self): @mb.program(input_specs=[mb.TensorSpec(shape=(3, 5, 6))]) def prog(x): x1 = mb.reduce_sum(x=x, axes=[-1, -2], keep_dims=True) x1 = mb.mul(x=1.0 / 30, y=x1) return x1 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::fuse_reduce_mean") assert 
get_op_types_in_program(prev_prog) == ["reduce_sum", "mul"] assert get_op_types_in_program(prog) == ["reduce_mean"] assert_model_is_valid( prog, {"x": (3, 5, 6)}, expected_output_shapes={block.outputs[0].name: (3, 1, 1)}, ) def test_valid_pattern2(self): @mb.program(input_specs=[mb.TensorSpec(shape=(4, 5))]) def prog(x): x1 = mb.reduce_sum(x=x, axes=[0], keep_dims=False) x1 = mb.real_div(x=x1, y=4.0) return x1 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::fuse_reduce_mean") assert get_op_types_in_program(prev_prog) == ["reduce_sum", "real_div"] assert get_op_types_in_program(prog) == ["reduce_mean"] assert_model_is_valid( prog, {"x": (4, 5)}, expected_output_shapes={block.outputs[0].name: (5,)}, ) def test_invalid_pattern1(self): """ The mul does not correspond to "1/count" """ @mb.program(input_specs=[mb.TensorSpec(shape=(3, 5, 6))]) def prog(x): x1 = mb.reduce_sum(x=x, axes=[-1, -2], keep_dims=True) x1 = mb.mul(x=5.0, y=x1) return x1 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::fuse_reduce_mean") assert get_op_types_in_program(prog) == ["reduce_sum", "mul"] def test_invalid_pattern2(self): """ The div does not correspond to "count" """ @mb.program(input_specs=[mb.TensorSpec(shape=(3, 5, 6))]) def prog(x): x1 = mb.reduce_sum(x=x, axes=[-1, -2], keep_dims=True) x1 = mb.real_div(x=x1, y=31.0) return x1 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::fuse_reduce_mean") assert get_op_types_in_program(prog) == ["reduce_sum", "real_div"] def test_invalid_pattern3(self): """ One of the reduction dim is symbolic """ @mb.program(input_specs=[mb.TensorSpec(shape=(3, get_new_symbol(), 6))]) def prog(x): x1 = mb.reduce_sum(x=x, axes=[-1, -2], keep_dims=True) x1 = mb.real_div(x=x1, y=30.0) return x1 pass_name = "common::fuse_reduce_mean" PASS_REGISTRY[pass_name](prog) assert get_op_types_in_program(prog) == ["reduce_sum", "real_div"] def test_invalid_pattern4(self): """ output of reduce_sum is model output """ @mb.program(input_specs=[mb.TensorSpec(shape=(3, 5, 6))]) def prog(x): x1 = mb.reduce_sum(x=x, axes=[-1, -2], keep_dims=True) y1 = mb.real_div(x=x1, y=30.0) return y1, x1 pass_name = "common::fuse_reduce_mean" PASS_REGISTRY[pass_name](prog) assert get_op_types_in_program(prog) == ["reduce_sum", "real_div"] def test_invalid_pattern5(self): """ output of reduce_sum is feeding into another op """ @mb.program(input_specs=[mb.TensorSpec(shape=(3, 5, 6))]) def prog(x): x1 = mb.reduce_sum(x=x, axes=[-1, -2], keep_dims=True) y1 = mb.real_div(x=x1, y=30.0) y2 = mb.mul(x=x1, y=10.0) y3 = mb.add(x=y1, y=y2) return y3 pass_name = "common::fuse_reduce_mean" PASS_REGISTRY[pass_name](prog) assert get_op_types_in_program(prog) == ["reduce_sum", "real_div", "mul", "add"] class TestLoopInvariantElimination: def test_loop_invariant_elimination1(self): """ Invariant pattern: Block input vars are returned as block output vars. 
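For example, in the while_loop below the body returns `b` unchanged, so the pass drops `b` from loop_vars and the block inputs, leaving `a` as the only loop variable.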
""" def body(a, b): return mb.add(x=a, y=b), b def cond(a, b): a_mean = mb.reduce_mean(x=a, axes=[0, 1]) b_mean = mb.reduce_mean(x=b, axes=[0, 1]) return mb.less(x=a_mean, y=b_mean) @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 2)), mb.TensorSpec(shape=(1, 2)), ] ) def prog(a, b): # b is loop invariant return mb.while_loop(_cond=cond, _body=body, loop_vars=(a, b)) while_op = prog.find_ops(op_type="while_loop", exactly_one=True)[0] assert len(while_op.blocks[0].inputs) == 2 assert len(while_op.outputs) == 2 assert len(while_op.loop_vars) == 2 assert while_op.blocks[0].inputs[0].name == "a_x0" assert while_op.blocks[0].inputs[1].name == "b_x0" prev_prog = copy.deepcopy(prog) PASS_REGISTRY["common::loop_invariant_elimination"](prog) assert_same_output_names(prev_prog, prog) while_op = prog.find_ops(op_type="while_loop", exactly_one=True)[0] assert len(while_op.blocks[0].inputs) == 1 assert len(while_op.outputs) == 1 assert len(while_op.loop_vars) == 1 assert while_op.blocks[0].inputs[0].name == "a_x0" if _VALIDATE_MODEL: assert_model_is_valid(prog, {"a": (1, 2), "b": (1, 2)}) def test_loop_invariant_elimination2(self): """ Invariant pattern: Block outputs var from outside of the block """ @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 2)), mb.TensorSpec(shape=(1, 2)), ] ) def prog(a, b): def body(a, bx): return mb.add(x=a, y=b), b def cond(a, bx): a_mean = mb.reduce_mean(x=a, axes=[0, 1]) b_mean = mb.reduce_mean(x=bx, axes=[0, 1]) return mb.less(x=a_mean, y=b_mean) # b is loop invariant return mb.while_loop(_cond=cond, _body=body, loop_vars=(a, b)) while_op = prog.find_ops(op_type="while_loop", exactly_one=True)[0] assert len(while_op.blocks[0].inputs) == 2 assert len(while_op.outputs) == 2 assert len(while_op.loop_vars) == 2 assert while_op.blocks[0].inputs[0].name == "a_x0" assert while_op.blocks[0].inputs[1].name == "b_x0" prev_prog = copy.deepcopy(prog) PASS_REGISTRY["common::loop_invariant_elimination"](prog) assert_same_output_names(prev_prog, prog) while_op = prog.find_ops(op_type="while_loop", exactly_one=True)[0] assert len(while_op.blocks[0].inputs) == 1 assert len(while_op.outputs) == 1 assert len(while_op.loop_vars) == 1 assert while_op.blocks[0].inputs[0].name == "a_x0" if _VALIDATE_MODEL: assert_model_is_valid(prog, {"a": (1, 2), "b": (1, 2)}) class TestNoopElimination: @pytest.mark.parametrize("is_block_output", ((True, False))) def test_identity(self, is_block_output): """ Input graph: input -> identity -> (add 1.0 if not is_block_output) -> output Output graph: if is_block_output: input -> identity -> output else: input -> add 1.0 -> output """ SHAPE = (2, 3) @mb.program(input_specs=[mb.TensorSpec(shape=SHAPE)]) def prog(x): y = mb.identity(x=x) if not is_block_output: y = mb.add(x=y, y=1.0) return y prev_prog, _, block = apply_pass_and_basic_check(prog, "common::noop_elimination") if is_block_output: assert get_op_types_in_program(prev_prog) == ["identity"] assert get_op_types_in_program(prog) == ["identity"] else: assert get_op_types_in_program(prev_prog) == ["identity", "add"] assert get_op_types_in_program(prog) == ["add"] output_name = block.outputs[0].name assert_model_is_valid( prog, {"x": SHAPE}, expected_output_shapes={output_name: SHAPE}, ) @pytest.mark.parametrize( "op_type, pos, val", itertools.product( ["add", "mul", "floor_div", "pow", "real_div", "sub"], ["x", "y"], [0.0, 1.0, [0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]], ), ) def test_elementwise_elimination(self, op_type, pos, val): if "div" in op_type and np.prod(val) == 0: return if "pow" in op_type 
and (val != 0 or val != 1): return test_op = getattr(mb, op_type) @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): if pos == "x": r1 = test_op(x=val, y=x) else: r1 = test_op(x=x, y=val) return mb.relu(x=r1) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::noop_elimination") original_program = [op_type, "relu"] new_program = original_program if op_type in {"add"}: if val == 0.0 or val == [0.0, 0.0, 0.0, 0.0]: new_program = ["relu"] elif op_type in {"mul"}: if val == 1.0 or val == [1.0, 1.0, 1.0, 1.0]: new_program = ["relu"] elif op_type in {"real_div"}: if pos == "y" and (val == 1.0 or val == [1.0, 1.0, 1.0, 1.0]): new_program = ["relu"] elif op_type in {"pow", "floor_div"}: if pos == "y" and (val == 1.0 or val == [1.0, 1.0, 1.0, 1.0]): new_program = ["relu"] elif op_type in {"sub"}: if pos == "y" and (val == 0.0 or val == [0.0, 0.0, 0.0, 0.0]): new_program = ["relu"] assert get_op_types_in_program(prev_prog) == original_program assert get_op_types_in_program(prog) == new_program assert_model_is_valid( prog, {"x": (2, 4)}, expected_output_shapes={block.outputs[0].name: (2, 4)}, ) def test_elementwise_broadcast(self): @mb.program(input_specs=[mb.TensorSpec(shape=[4])]) def prog(x): r1 = mb.add(x=x, y=[[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]) return mb.relu(x=r1) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::noop_elimination") original_program = ["add", "relu"] assert get_op_types_in_program(prev_prog) == original_program assert get_op_types_in_program(prog) == original_program assert_model_is_valid( prog, {"x": [4]}, expected_output_shapes={block.outputs[0].name: (2, 4)}, ) def test_elementwise_elimination_fill(self): """ When a fill layer with a dynamic shape is fed into an elementwise-binary operation, the tensor can't be materialized at conversion time, but no-op elimination can still be performed based on the fill value. """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, get_new_symbol()))]) def prog(x): shape = mb.shape(x=x) y = mb.fill(value=0.0, shape=shape) x = mb.add(x=x, y=y) return mb.relu(x=x) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::noop_elimination") assert get_op_types_in_program(prev_prog) == ["shape", "fill", "add", "relu"] assert get_op_types_in_program(prog) == ["shape", "fill", "relu"] apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == ["relu"] assert_model_is_valid( prog, {"x": (2, 4)}, expected_output_shapes={block.outputs[0].name: (2, 4)}, ) def test_reshape_elimination(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): r1 = mb.reshape(x=x, shape=[1, 8]) mb.reshape(x=r1, shape=[1, 8]) return mb.relu(x=r1) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::noop_elimination") assert get_op_types_in_program(prev_prog) == ["reshape", "reshape", "relu"] assert get_op_types_in_program(prog) == ["reshape", "relu"] assert_model_is_valid( prog, {"x": (2, 4)}, expected_output_shapes={block.outputs[0].name: (1, 8)}, ) def test_oneway_split_elimination(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): r1 = mb.split(x=x, num_splits=1, axis=-1) return mb.relu(x=r1) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::noop_elimination") assert get_op_types_in_program(prev_prog) == ["split", "relu"] assert get_op_types_in_program(prog) == ["relu"] assert_model_is_valid( prog, {"x": (2, 4)}, expected_output_shapes={block.outputs[0].name:
(2, 4)}, ) def test_full_split_elimination(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): r1 = mb.split(x=x, split_sizes=[4], axis=-1) return mb.relu(x=r1) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::noop_elimination") assert get_op_types_in_program(prev_prog) == ["split", "relu"] assert get_op_types_in_program(prog) == ["relu"] assert_model_is_valid( prog, {"x": (2, 4)}, expected_output_shapes={block.outputs[0].name: (2, 4)}, ) def test_slicebysize_full_elimination(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): r1 = mb.slice_by_size(x=x, begin=[0, 0], size=[2, 4]) return mb.relu(x=r1) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::noop_elimination") assert get_op_types_in_program(prev_prog) == ["slice_by_size", "relu"] assert get_op_types_in_program(prog) == ["relu"] assert_model_is_valid( prog, {"x": (2, 4)}, expected_output_shapes={block.outputs[0].name: (2, 4)}, ) def test_slicebysize_to_end_elimination(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): r1 = mb.slice_by_size(x=x, begin=[0, 0], size=[-1, -1]) return mb.relu(x=r1) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::noop_elimination") assert get_op_types_in_program(prev_prog) == ["slice_by_size", "relu"] assert get_op_types_in_program(prog) == ["relu"] assert_model_is_valid( prog, {"x": (2, 4)}, expected_output_shapes={block.outputs[0].name: (2, 4)}, ) def test_slicebyindex_full_elimination(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): r1 = mb.slice_by_index(x=x, begin=[0, 0], end=[2, 4]) return mb.relu(x=r1) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::noop_elimination") assert get_op_types_in_program(prev_prog) == ["slice_by_index", "relu"] assert get_op_types_in_program(prog) == ["relu"] assert_model_is_valid( prog, {"x": (2, 4)}, expected_output_shapes={block.outputs[0].name: (2, 4)}, ) def test_slicebyindex_negative_stride(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): r1 = mb.slice_by_index( x=x, begin=[0, 0], end=[0, 0], stride=[1, -1], begin_mask=[True, True], end_mask=[True, True], ) return mb.relu(x=r1) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::noop_elimination") assert get_op_types_in_program(prev_prog) == ["slice_by_index", "relu"] assert get_op_types_in_program(prog) == ["slice_by_index", "relu"] assert_model_is_valid( prog, {"x": (2, 4)}, expected_output_shapes={block.outputs[0].name: (2, 4)}, ) @pytest.mark.parametrize( "begin_mask, end_mask", itertools.product( itertools.product([True, False], [True, False]), itertools.product([True, False], [True, False]), ), ) def test_slicebyindex_mask_elimination(self, begin_mask, end_mask): @mb.program(input_specs=[mb.TensorSpec(shape=(4, 4))]) def prog(x): begin = [1, 1] end = [1, 1] for i in range(2): if not begin_mask[i]: begin[i] = 0 if not end_mask[i]: end[i] = 4 r1 = mb.slice_by_index( x=x, begin=begin, end=end, begin_mask=begin_mask, end_mask=end_mask ) return mb.relu(x=r1) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::noop_elimination") assert get_op_types_in_program(prev_prog) == ["slice_by_index", "relu"] assert get_op_types_in_program(prog) == ["relu"] assert_model_is_valid( prog, {"x": (4, 4)}, expected_output_shapes={block.outputs[0].name: (4, 4)}, ) def test_pad_elimination(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): r1 = 
mb.pad(x=x, pad=[0, 0, 0, 0]) return mb.relu(x=r1) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::noop_elimination") assert get_op_types_in_program(prev_prog) == ["pad", "relu"] assert get_op_types_in_program(prog) == ["relu"] assert_model_is_valid( prog, {"x": (2, 4)}, expected_output_shapes={block.outputs[0].name: (2, 4)}, ) def test_keep_pad(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): r1 = mb.pad(x=x, pad=[4, 4, 2, 2]) return mb.relu(x=r1) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::noop_elimination") assert get_op_types_in_program(prev_prog) == ["pad", "relu"] assert get_op_types_in_program(prog) == ["pad", "relu"] assert_model_is_valid( prog, {"x": (2, 4)}, expected_output_shapes={block.outputs[0].name: (10, 8)}, ) @pytest.mark.parametrize( "dynamic", [True, False], ) def test_tile_elimination(self, dynamic): if dynamic: input_shape = (get_new_symbol(), get_new_symbol()) else: input_shape = (2, 4) @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)]) def prog(x): r1 = mb.tile(x=x, reps=[1, 1]) return mb.relu(x=r1) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::noop_elimination") assert get_op_types_in_program(prev_prog) == ["tile", "relu"] assert get_op_types_in_program(prog) == ["relu"] assert_model_is_valid( prog, {"x": (2, 4)}, expected_output_shapes={block.outputs[0].name: (2, 4)}, ) def test_keep_tile(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): r1 = mb.tile(x=x, reps=[2, 2]) return mb.relu(x=r1) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::noop_elimination") assert get_op_types_in_program(prev_prog) == ["tile", "relu"] assert get_op_types_in_program(prog) == ["tile", "relu"] assert_model_is_valid( prog, {"x": (2, 4)}, expected_output_shapes={block.outputs[0].name: (4, 8)}, ) def test_upsample_nearest_neighbor_elimination(self): @mb.program(input_specs=[mb.TensorSpec(shape=(3, 2, 4))]) def prog(x): r1 = mb.upsample_nearest_neighbor(x=x) return mb.relu(x=r1) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::noop_elimination") assert get_op_types_in_program(prev_prog) == ["upsample_nearest_neighbor", "relu"] assert get_op_types_in_program(prog) == ["relu"] assert_model_is_valid( prog, {"x": (3, 2, 4)}, expected_output_shapes={block.outputs[0].name: (3, 2, 4)}, ) def test_upsample_bilinear_elimination(self): @mb.program(input_specs=[mb.TensorSpec(shape=(3, 2, 4))]) def prog(x): r1 = mb.upsample_bilinear(x=x) return mb.relu(x=r1) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::noop_elimination") assert get_op_types_in_program(prev_prog) == ["upsample_bilinear", "relu"] assert get_op_types_in_program(prog) == ["relu"] assert_model_is_valid( prog, {"x": (3, 2, 4)}, expected_output_shapes={block.outputs[0].name: (3, 2, 4)}, ) def test_resize_bilinear_elimination(self): @mb.program(input_specs=[mb.TensorSpec(shape=(3, 2, 4))]) def prog(x): r1 = mb.resize_bilinear(x=x, target_size_height=2, target_size_width=4) return mb.relu(x=r1) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::noop_elimination") assert get_op_types_in_program(prev_prog) == ["resize_bilinear", "relu"] assert get_op_types_in_program(prog) == ["relu"] assert_model_is_valid( prog, {"x": (3, 2, 4)}, expected_output_shapes={block.outputs[0].name: (3, 2, 4)}, ) def test_crop_elimination(self): @mb.program(input_specs=[mb.TensorSpec(shape=(3, 2, 4))]) def prog(x): r1 = mb.crop(x=x, 
crop_height=[0, 0], crop_width=[0, 0]) return mb.relu(x=r1) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::noop_elimination") assert get_op_types_in_program(prev_prog) == ["crop", "relu"] assert get_op_types_in_program(prog) == ["relu"] assert_model_is_valid( prog, {"x": (3, 2, 4)}, expected_output_shapes={block.outputs[0].name: (3, 2, 4)}, ) def test_linear_elimination(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): r1 = mb.linear_activation(x=x, alpha=1.0, beta=0.0) return mb.relu(x=r1) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::noop_elimination") assert get_op_types_in_program(prev_prog) == ["linear_activation", "relu"] assert get_op_types_in_program(prog) == ["relu"] assert_model_is_valid( prog, {"x": (2, 4)}, expected_output_shapes={block.outputs[0].name: (2, 4)}, ) def test_transpose_elimination(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3, 4))]) def prog(x): r1 = mb.transpose(x=x, perm=[0, 1, 2]) return mb.relu(x=r1) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::noop_elimination") assert get_op_types_in_program(prev_prog) == ["transpose", "relu"] assert get_op_types_in_program(prog) == ["relu"] assert_model_is_valid( prog, {"x": (2, 3, 4)}, expected_output_shapes={block.outputs[0].name: (2, 3, 4)}, ) class TestRemoveRedundantOps: def test_redundant_ops_just_after_input_valid_pattern_1(self): """ Input graph: input----->transpose(perm=[0, 2, 1])--->add---> add ---> out | ^ ^ | | | |---->transpose(perm=[0, 2, 1])---- | | | | | |---->transpose(perm=[0, 2, 1])------------ Output graph: input----->transpose(perm=[0, 2, 1])--->add---> add ----> out | ^ ^ | | | |------------- | | | |-------------------- """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3, 5))]) def prog(x): x1 = mb.transpose(x=x, perm=[0, 2, 1]) x2 = mb.transpose(x=x, perm=[0, 2, 1]) x3 = mb.transpose(x=x, perm=[0, 2, 1]) z = mb.add(x=x1, y=x2) z = mb.add(x=z, y=x3) return z prev_prog, _, block = apply_pass_and_basic_check(prog, "common::remove_redundant_ops") assert get_op_types_in_program(prev_prog) == [ "transpose", "transpose", "transpose", "add", "add", ] assert get_op_types_in_program(prog) == ["transpose", "add", "add"] assert_model_is_valid( prog, {"x": (2, 3, 5)}, expected_output_shapes={block.outputs[0].name: (2, 5, 3)}, ) def test_redundant_ops_just_after_input_valid_pattern_2(self): """ Input graph: input----->leaky_relu(alpha=0.3)--->add---> add ---> out | ^ ^ | | | |----->leaky_relu(alpha=0.3)--- | | | | | |---->leaky_relu(alpha=0.3)------------ Output graph: input--------->leaky_relu(alpha=0.3)--->add---> add ----> out | ^ ^ | | | |------------- | | | |--------------------- """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3, 5))]) def prog(x): x1 = mb.leaky_relu(x=x, alpha=0.3) x2 = mb.leaky_relu(x=x, alpha=0.3) x3 = mb.leaky_relu(x=x, alpha=0.3) z = mb.add(x=x1, y=x2) z = mb.add(x=z, y=x3) return z prev_prog, _, block = apply_pass_and_basic_check(prog, "common::remove_redundant_ops") assert get_op_types_in_program(prev_prog) == [ "leaky_relu", "leaky_relu", "leaky_relu", "add", "add", ] assert get_op_types_in_program(prog) == ["leaky_relu", "add", "add"] assert_model_is_valid( prog, {"x": (2, 3, 5)}, expected_output_shapes={block.outputs[0].name: (2, 3, 5)}, ) def test_redundant_ops_just_after_input_valid_pattern_3(self): """ Input graph: input----->leaky_relu(alpha=0.4)--->add---> add ---> out | ^ ^ | | | |----->leaky_relu(alpha=0.3)--- | | | | | 
|---->leaky_relu(alpha=0.3)------------ Output graph: input----->leaky_relu(alpha=0.4)--->add---> add ---> out | ^ ^ | | | |----->leaky_relu(alpha=0.3)---------- """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3, 5))]) def prog(x): x1 = mb.leaky_relu(x=x, alpha=0.4) x2 = mb.leaky_relu(x=x, alpha=0.3) x3 = mb.leaky_relu(x=x, alpha=0.3) z = mb.add(x=x1, y=x2) z = mb.add(x=z, y=x3) return z prev_prog, _, block = apply_pass_and_basic_check(prog, "common::remove_redundant_ops") assert get_op_types_in_program(prev_prog) == [ "leaky_relu", "leaky_relu", "leaky_relu", "add", "add", ] assert get_op_types_in_program(prog) == ["leaky_relu", "leaky_relu", "add", "add"] leaky_relu_ops = block.find_ops(op_type="leaky_relu") assert leaky_relu_ops[0].alpha.val == np.float32(0.4) assert leaky_relu_ops[1].alpha.val == np.float32(0.3) def test_redundant_ops_just_after_input_invalid_pattern_1(self): """ input----->transpose(perm=[0, 2, 1])---> reshape(shape=[-1]) -----> add ---> out | ^ | | |---->transpose(perm=[1, 0, 2])----> reshape(shape=[-1])------ """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3, 5))]) def prog(x): x1 = mb.transpose(x=x, perm=[0, 2, 1]) x2 = mb.transpose(x=x, perm=[1, 0, 2]) x1 = mb.reshape(x=x1, shape=[-1]) x2 = mb.reshape(x=x2, shape=[-1]) z = mb.add(x=x1, y=x2) return z prev_prog, _, block = apply_pass_and_basic_check(prog, "common::remove_redundant_ops") assert get_op_types_in_program(prev_prog) == [ "transpose", "transpose", "reshape", "reshape", "add", ] assert get_op_types_in_program(prog) == [ "transpose", "transpose", "reshape", "reshape", "add", ] assert_model_is_valid( prog, {"x": (2, 3, 5)}, expected_output_shapes={block.outputs[0].name: (30,)}, ) def test_redundant_ops_just_after_input_invalid_pattern_2(self): """ input----->leaky_relu(alpha=0.3) -----> add ---> out | ^ | | |---->leaky_relu(alpha=0.4)------- """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3, 5))]) def prog(x): x1 = mb.leaky_relu(x=x, alpha=0.3) x2 = mb.leaky_relu(x=x, alpha=0.4) z = mb.add(x=x1, y=x2) return z prev_prog, _, block = apply_pass_and_basic_check(prog, "common::remove_redundant_ops") assert get_op_types_in_program(prev_prog) == ["leaky_relu", "leaky_relu", "add"] assert get_op_types_in_program(prog) == ["leaky_relu", "leaky_relu", "add"] assert_model_is_valid( prog, {"x": (2, 3, 5)}, expected_output_shapes={block.outputs[0].name: (2, 3, 5)}, ) def test_redundant_ops_just_after_input_invalid_pattern_3(self): """ test case, when inputs of 1 op is a subset of the inputs of the other op input----->layer_norm1 -----> add ---> out | ^ | | |---->layer_norm2------- """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 3, 2))]) def prog(x): x1 = mb.layer_norm(x=x, axes=[2], epsilon=1e-4) gamma_val = np.array([1.0, 1.0], dtype=np.float32) beta_val = np.array([1.0, 0.0], dtype=np.float32) x2 = mb.layer_norm(x=x, axes=[2], epsilon=1e-4, gamma=gamma_val, beta=beta_val) z = mb.add(x=x1, y=x2) return z prev_prog, _, block = apply_pass_and_basic_check(prog, "common::remove_redundant_ops") assert get_op_types_in_program(prev_prog) == ["layer_norm", "layer_norm", "add"] assert get_op_types_in_program(prog) == ["layer_norm", "layer_norm", "add"] assert_model_is_valid( prog, {"x": (1, 3, 2)}, expected_output_shapes={block.outputs[0].name: (1, 3, 2)}, ) @staticmethod def _make_repeated_conv_prog(redundant_conv=True, out_channel=2): prog = mil.Program() func_inputs = {"x": mb.placeholder(shape=[1, 4, 5, 5])} with Function(func_inputs) as ssa_fun: x = ssa_fun.inputs["x"] x = mb.relu(x=x) W = 
np.random.rand(out_channel, 4, 3, 3) if redundant_conv: bias = np.random.rand(out_channel) x1 = mb.conv(x=x, weight=W, bias=bias, pad_type="same", strides=[1, 1]) x2 = mb.conv(x=x, weight=W, bias=bias, pad_type="same", strides=[1, 1]) else: x1 = mb.conv( x=x, weight=W, bias=np.random.rand(out_channel), pad_type="same", strides=[1, 1] ) x2 = mb.conv( x=x, weight=W, bias=np.random.rand(out_channel), pad_type="same", strides=[1, 1] ) x1 = mb.relu(x=x1) x2 = mb.relu(x=x2) x1 = mb.avg_pool(x=x1, kernel_sizes=[2, 2], strides=[1, 1], pad_type="same") z = mb.concat(values=(x1, x2), axis=-3) ssa_fun.set_outputs([z]) prog.add_function("main", ssa_fun) return prog def test_redundant_ops_inside_graph_valid_pattern(self): """ Input graph: input--> relu--------->conv------>relu----> pool ---> concat ---> out | ^ | | |---->conv---->relu---------------------------- Output graph: input-> relu--->conv------>relu----> pool ---> concat ---> out | ^ | | |------------------- """ prog = self._make_repeated_conv_prog(redundant_conv=True) prev_prog, _, block = apply_pass_and_basic_check(prog, "common::remove_redundant_ops") assert get_op_types_in_program(prev_prog) == [ "relu", "conv", "conv", "relu", "relu", "avg_pool", "concat", ] assert get_op_types_in_program(prog) == ["relu", "conv", "relu", "avg_pool", "concat"] assert_model_is_valid( prog, {"x": (1, 4, 5, 5)}, expected_output_shapes={block.outputs[0].name: (1, 4, 5, 5)}, ) def test_redundant_ops_inside_graph_with_large_const(self): """ Large constants need to be deduplicated by the const_deduplication pass first. This test makes sure the converter is not doing any "brute force" comparison. Input graph: input--> relu--------->conv------>relu----> pool ---> concat ---> out | ^ | | |---->conv---->relu---------------------------- Output graph: input-> relu--->conv------>relu----> pool ---> concat ---> out | ^ | | |------------------- """ # The remove_redundant_ops pass does not do brute-force array comparison prog = self._make_repeated_conv_prog(redundant_conv=True, out_channel=10) prev_prog, _, block = apply_pass_and_basic_check(prog, "common::remove_redundant_ops") ops_in_prev_prog = [ "relu", "conv", "conv", "relu", "relu", "avg_pool", "concat", ] assert get_op_types_in_program(prev_prog) == ops_in_prev_prog assert get_op_types_in_program(prog) == ops_in_prev_prog # We need to run the const_deduplication pass first.
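# Illustrative ordering of the cleanup pipeline for large weights (see the three passes applied
# below): const_deduplication makes the two convs share one weight const, dead_code_elimination
# drops the now-unused duplicate, and only then can remove_redundant_ops merge the two conv ops.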
prog = self._make_repeated_conv_prog(redundant_conv=True, out_channel=10) _, _, block = apply_pass_and_basic_check(prog, "common::const_deduplication") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") _, _, block = apply_pass_and_basic_check(prog, "common::remove_redundant_ops") assert get_op_types_in_program(prog) == ["relu", "conv", "relu", "avg_pool", "concat"] assert_model_is_valid( prog, {"x": (1, 4, 5, 5)}, expected_output_shapes={block.outputs[0].name: (1, 20, 5, 5)}, ) def test_redundant_ops_inside_graph_invalid_pattern(self): """ input--->relu--------->conv1------>relu----> pool ---> concat ---> out | ^ | | |---->conv2---->relu--------------------------- """ prog = self._make_repeated_conv_prog(redundant_conv=False) prev_prog, _, block = apply_pass_and_basic_check(prog, "common::remove_redundant_ops") assert get_op_types_in_program(prev_prog) == [ "relu", "conv", "conv", "relu", "relu", "avg_pool", "concat", ] assert get_op_types_in_program(prog) == [ "relu", "conv", "conv", "relu", "relu", "avg_pool", "concat", ] assert_model_is_valid( prog, {"x": (1, 4, 5, 5)}, expected_output_shapes={block.outputs[0].name: (1, 4, 5, 5)}, ) def test_redundant_op_as_output_valid_pattern_1(self): """ Input graph: input--------->relu------> out1 | | |---->relu---->tanh---> out2 Output graph: input--------->relu------> out1 | | |---->tanh---> out2 """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3, 5))]) def prog(x): x1 = mb.relu(x=x) x2 = mb.relu(x=x) return x1, mb.tanh(x=x2) prev_prog, _, block = apply_pass_and_basic_check(prog, "common::remove_redundant_ops") assert get_op_types_in_program(prev_prog) == ["relu", "relu", "tanh"] assert get_op_types_in_program(prog) == ["relu", "tanh"] assert_model_is_valid( prog, {"x": (2, 3, 5)}, expected_output_shapes={ block.outputs[0].name: (2, 3, 5), block.outputs[1].name: (2, 3, 5), }, ) def test_redundant_op_as_output_invalid_pattern_1(self): """ Input graph: input--------->relu------> out1 | | |---->relu---> out2 "common::remove_redundant_ops" pass does not remove ops if their outputs are block outputs. """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3, 5))]) def prog(x): x1 = mb.relu(x=x) x2 = mb.relu(x=x) return x1, x2 prev_prog, _, block = apply_pass_and_basic_check( prog, "common::remove_redundant_ops", ) assert get_op_types_in_program(prev_prog) == ["relu", "relu"] assert get_op_types_in_program(prog) == ["relu", "relu"] assert_model_is_valid( prog, {"x": (2, 3, 5)}, expected_output_shapes={ block.outputs[0].name: (2, 3, 5), block.outputs[1].name: (2, 3, 5), }, ) def test_cond_block_program(self): """ - Test identical ops within different blocks are not removed. The "relu" op inside true and false blocks are not removed since they are in different blocks. - Test ops that have blocks inside them are not removed. There are two cond ops here, with identical inputs but they are not removed, since they are ops that have nested block inside them. 
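(In the program below, the two cond ops have identical preds and sub-blocks, yet both are kept, and the shape/cast/add ops inside each sub-block also survive the pass.)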
""" @mb.program(input_specs=[mb.TensorSpec(shape=(1,))]) def prog(x): x1 = mb.cast(x=x, dtype="bool") def true_fn(): x = mb.shape(x=x1) x = mb.cast(x=x, dtype="fp32") return mb.add(x=x, y=1.0) def false_fn(): x = mb.shape(x=x1) x = mb.cast(x=x, dtype="fp32") return mb.add(x=x, y=-1.0) z1 = mb.cond(pred=x1, _true_fn=true_fn, _false_fn=false_fn) z2 = mb.cond(pred=x1, _true_fn=true_fn, _false_fn=false_fn) z = mb.add(x=z1, y=z2) return z prev_prog, _, block = apply_pass_and_basic_check( prog, "common::remove_redundant_ops", ) assert get_op_types_in_program(prev_prog) == ["cast", "cond", "cond", "add"] assert get_op_types_in_program(prog) == ["cast", "cond", "cond", "add"] cond_op = prog.find_ops(op_type="cond")[0] assert cond_op.blocks[0].operations[0].op_type == "shape" assert cond_op.blocks[1].operations[0].op_type == "shape" assert_model_is_valid( prog, {"x": (1,)}, expected_output_shapes={block.outputs[0].name: (1,)}, ) def test_concat_op_pattern(self): """ Input graph: ---------------> concat ------> log ------> out1 | ^ | | input--------->relu------> concat ------> relu----> out2 | ^ | | | | |---->tanh-------------------- Output graph: |------>log ------> out1 | | input--------->relu------> concat ------> relu----> out2 | ^ | | |---->tanh--------- """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 5))]) def prog(x): x1 = mb.relu(x=x) x2 = mb.tanh(x=x) c1 = mb.concat(values=(x1, x2), axis=0) c2 = mb.concat(values=(x1, x2), axis=0) z1 = mb.log(x=c1) z2 = mb.relu(x=c2) return z1, z2 prev_prog, _, block = apply_pass_and_basic_check( prog, "common::remove_redundant_ops", ) assert get_op_types_in_program(prev_prog) == [ "relu", "tanh", "concat", "concat", "log", "relu", ] assert get_op_types_in_program(prog) == ["relu", "tanh", "concat", "log", "relu"] assert_model_is_valid( prog, {"x": (10, 5)}, expected_output_shapes={block.outputs[0].name: (20, 5), block.outputs[1].name: (20, 5)}, ) def test_multiple_redundant_child_ops_pattern(self): """ Input graph input -------------> reshape ----------> add ---------> out1 | ^ | | |-------> reshape --------------- | |------> slice_by_size-----> add ----------> out2 | ^ | | |------> slice_by_size ------- Output graph input -------------> reshape ----------> add ------------> out1 | | ^ | | | | |--------- | |------> slice_by_size----------> add -----------------> out2 | ^ | | |--------------------- """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 5, 4))]) def prog(x): x1 = mb.reshape(x=x, shape=[5, 2, -1]) x2 = mb.reshape(x=x, shape=[5, 2, -1]) x3 = mb.slice_by_size(x=x, begin=[0, 0, 1], size=[2, 4, 3]) x4 = mb.slice_by_size(x=x, begin=[0, 0, 1], size=[2, 4, 3]) z1 = mb.add(x=x1, y=x2) z2 = mb.add(x=x3, y=x4) return z1, z2 prev_prog, _, block = apply_pass_and_basic_check( prog, "common::remove_redundant_ops", ) assert get_op_types_in_program(prev_prog) == [ "reshape", "reshape", "slice_by_size", "slice_by_size", "add", "add", ] assert get_op_types_in_program(prog) == ["reshape", "slice_by_size", "add", "add"] assert_model_is_valid( prog, {"x": (10, 5, 4)}, expected_output_shapes={ block.outputs[0].name: (5, 2, 20), block.outputs[1].name: (2, 4, 3), }, ) def test_random_distribution_op_invalid_pattern(self): """ Identical random ops are not removed input----->cast---->random_uniform------> add ---> out | ^ | | |---->random_uniform------------ """ @mb.program(input_specs=[mb.TensorSpec(shape=(3,))]) def prog(shape): shape = mb.cast(x=shape, dtype="int32") x1 = mb.random_uniform(shape=shape, low=0.0, high=1.0, seed=11) x2 = 
mb.random_uniform(shape=shape, low=0.0, high=1.0, seed=11) return mb.add(x=x1, y=x2) prev_prog, _, block = apply_pass_and_basic_check( prog, "common::remove_redundant_ops", ) assert get_op_types_in_program(prev_prog) == [ "cast", "random_uniform", "random_uniform", "add", ] assert get_op_types_in_program(prog) == ["cast", "random_uniform", "random_uniform", "add"] def test_nonreplaceable_vars(self): """ Nonreplaceable vars shouldn't be removed, e.g. palettized weights const_1----->add---->add_1------| | | input add---->output | | const_2----->add---->add_2------| """ def _constexpr_lut_to_dense(): lut_data = np.array( [-19.0, 4.0, 0.0, -1.0, 1.0, 3.0, 5.0, -8.0, 19, 13, 42, 4.5, 5.4, 2.0, -6, -7] ).astype(np.float32) indices = np.array([212, 21]).astype(np.uint8) shape = np.array([4, 1]).astype(np.uint32) return mb.constexpr_lut_to_dense(lut=lut_data, indices=indices, shape=shape) @mb.program(input_specs=[mb.TensorSpec(shape=(4, 1))]) def prog(x): constexpr_1 = _constexpr_lut_to_dense() constexpr_2 = _constexpr_lut_to_dense() c = mb.add(x=constexpr_1, y=x) d = mb.add(x=constexpr_2, y=x) return mb.add(x=c, y=d) prev_prog, _, _ = apply_pass_and_basic_check( prog, "common::remove_redundant_ops", ) assert get_op_types_in_program(prev_prog) == get_op_types_in_program(prog) def test_redundant_ops_time_complexity(self): """ Test that the graph pass doesn't re-run right away after detecting a redundant pattern, in order to keep the time complexity low. In this example, a program with 26 ops is first traversed, and 5 relu ops are removed. At the time of the second traversal, there are only 21 remaining ops. As a result, the total number of ops visited is 26 + 21 = 47. """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3, 5))]) def prog(x): x = mb.cos(x=x) for i in range(5): x1 = mb.relu(x=x) x2 = mb.relu(x=x) z = mb.add(x=x1, y=x2) z = mb.add(x=z, y=x2) x = mb.sin(x=x) return x graph_pass = remove_redundant_ops() graph_pass.apply(prog) assert get_op_types_in_program(prog) == ["cos"] + ["relu", "add", "add", "sin"] * 5 assert graph_pass._num_of_visited_ops == 47 def test_redundant_ops_time_complexity_pattern_2(self): """ Test that the graph pass doesn't re-run right away after detecting a redundant pattern, in order to keep the time complexity low. In this example, there are three groups of identical leaky_relu ops that can be removed, and the algorithm should only go through the program twice. As a result, the total number of ops visited is 8 + (8 - 3) = 13. """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3, 5))]) def prog(x): x = mb.cos(x=x) x1 = mb.leaky_relu(x=x, alpha=0.2) x2 = mb.leaky_relu(x=x, alpha=0.2) x3 = mb.leaky_relu(x=x, alpha=0.3) x4 = mb.leaky_relu(x=x, alpha=0.3) x5 = mb.leaky_relu(x=x, alpha=0.4) x6 = mb.leaky_relu(x=x, alpha=0.4) return mb.sin(x=x6) graph_pass = remove_redundant_ops() graph_pass.apply(prog) assert get_op_types_in_program(prog) == ["cos"] + ["leaky_relu"] * 3 + ["sin"] assert graph_pass._num_of_visited_ops == 13 class TestRemoveSymbolicReshape: def test_remove_symbolic_reshape(self): sym_b = Symbol("s0") original_shape = (sym_b, Symbol("s1"), 2) reshape_name = "reshape" @mb.program(input_specs=[mb.TensorSpec(shape=(sym_b, 4))]) def prog(x): # const cannot represent symbolic values.
Use _const_symbolic shape = mb._const_symbolic(val=original_shape) return mb.reshape(x=x, shape=shape, name=reshape_name) reshape_op = prog.find_ops(prefix=reshape_name, op_type="reshape", exactly_one=True)[0] shape_var = reshape_op.shape reshaped_var = reshape_op.outputs[0] assert np.all(shape_var.sym_val == original_shape) assert np.all(reshaped_var.shape == (sym_b, 2, 2)) # Note: we cannot deepcopy prog with symbol. prev_outputs = [o.name for o in prog["main"].outputs] PASS_REGISTRY["common::remove_symbolic_reshape"](prog) curr_outputs = [o.name for o in prog["main"].outputs] assert curr_outputs == prev_outputs reshape_op = prog.find_ops(prefix=reshape_name, op_type="reshape", exactly_one=True)[0] shape_var = reshape_op.shape reshaped_var = reshape_op.outputs[0] # shape param cannot be symbolic after the pass assert np.all(shape_var.sym_val == (-1, 2, 2)) # output shape is still symbolic assert np.all(reshaped_var.shape == (sym_b, 2, 2)) if _VALIDATE_MODEL: assert_model_is_valid(prog, {"x": (3, 4)}) class TestTopologicalReorder: def test_move_sink_casts_to_the_end(self): """ Input graph: x (input) ---> square ---> cast (output) | | -----------> log ------> cast (output) | | -----------> relu -----> cast ----> relu (output) """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20))]) def prog(x): x = mb.cast(x=x, dtype="fp16") x1 = mb.square(x=x) x2 = mb.cast(x=x1, dtype="fp32", name="x2") x3 = mb.log(x=x) x4 = mb.cast(x=x3, dtype="fp32", name="x4") x5 = mb.relu(x=x) x6 = mb.cast(x=x5, dtype="fp32", name="x6") x7 = mb.relu(x=x6) return x2, x4, x7 assert get_op_types_in_program(prog) == [ "cast", "square", "cast", "log", "cast", "relu", "cast", "relu", ] apply_pass_and_basic_check(prog, "common::topological_reorder") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == [ "cast", "square", "log", "relu", "cast", "relu", "cast", "cast", ] cast_ops = block.find_ops(op_type="cast") assert cast_ops[1].outputs[0].name == "x6" assert cast_ops[2].outputs[0].name == "x4" assert cast_ops[3].outputs[0].name == "x2" assert_model_is_valid( prog, {"x": (10, 20)}, expected_output_shapes={ block.outputs[0].name: (10, 20), block.outputs[1].name: (10, 20), block.outputs[2].name: (10, 20), }, ) def test_move_sink_cast_transpose_to_the_end(self): """ Input graph: x (input) ---> square ---> transpose ---> cast (output) | | -----------> log ------> transpose ---> cast (output) | | -----------> relu -----> cast ----> relu (output) | | -----------> relu (output) """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20))]) def prog(x): x = mb.cast(x=x, dtype="fp16") x1 = mb.square(x=x) x1_t = mb.transpose(x=x1, perm=[1, 0]) x2 = mb.cast(x=x1_t, dtype="fp32") x3 = mb.log(x=x) x3_t = mb.transpose(x=x3, perm=[1, 0]) x4 = mb.cast(x=x3_t, dtype="fp32") x5 = mb.relu(x=x) x6 = mb.cast(x=x5, dtype="fp32") x7 = mb.relu(x=x6) x8 = mb.relu(x=x) return x2, x4, x7, x8 assert get_op_types_in_program(prog) == [ "cast", "square", "transpose", "cast", "log", "transpose", "cast", "relu", "cast", "relu", "relu", ] apply_pass_and_basic_check(prog, "common::topological_reorder") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == [ "cast", "square", "log", "relu", "cast", "relu", "relu", "transpose", "cast", "transpose", "cast", ] assert_model_is_valid( prog, {"x": (10, 20)}, expected_output_shapes={ block.outputs[0].name: (20, 10), block.outputs[1].name: (20, 10), block.outputs[2].name: (10, 20), 
block.outputs[3].name: (10, 20), }, ) def test_move_multiple_uses_overlapping(self): """ Input graph: x (input) ---> cast ---> cast (output) | |-------> transpose ---> transpose (output) """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20))]) def prog(x): x1 = mb.cast(x=x, dtype="fp16") x2 = mb.cast(x=x1, dtype="fp32") x3 = mb.transpose(x=x1, perm=[1, 0]) x4 = mb.transpose(x=x3, perm=[1, 0]) return x2, x4 assert get_op_types_in_program(prog) == ["cast", "cast", "transpose", "transpose"] apply_pass_and_basic_check(prog, "common::topological_reorder") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == ["cast", "transpose", "transpose", "cast"] assert_model_is_valid( prog, {"x": (10, 20)}, expected_output_shapes={ block.outputs[0].name: (10, 20), block.outputs[1].name: (10, 20), }, ) def test_move_split_to_first_use(self): """ Input graph: x (input) ---> split ---> square ---> add (output) | | | | | --------------------| | | -----------> square --------------> relu (output) """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20))]) def prog(x): s1, s2 = mb.split(x=x, num_splits=2, axis=0) x2 = mb.square(x=x) x3 = mb.relu(x=x2) s1_1 = mb.square(x=s1) s3 = mb.add(x=s1_1, y=s2) return x3, s3 assert get_op_types_in_program(prog) == ["split", "square", "relu", "square", "add"] block = prog.functions["main"] # Reorder `split` op to test op with multiple output case topological_reorder._move_operations_to_the_end_block(block, ["split"]) assert get_op_types_in_program(prog) == ["square", "relu", "split", "square", "add"] assert_model_is_valid( prog, {"x": (10, 20)}, expected_output_shapes={ block.outputs[0].name: (10, 20), block.outputs[1].name: (5, 20), }, ) def test_move_transpose_before_subblock(self): """ Input graph: x (input) ---> cast ---> transpose ---> cast (output) | | -----------> square ------> transpose (x1_t) ---> cast (output) | | -----------> squeeze ----> equal ----> squeeze | (true) <--- / \ ---> (false) | | | /<-(x1_t)->\ | add <-/ \--> add |---------> | <---------| | add ---> cast (output) """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20))]) def prog(x): x = mb.cast(x=x, dtype="fp16") x1 = mb.square(x=x) x1_t = mb.transpose(x=x1, perm=[1, 0]) def true_fn(): return mb.add(x=x1_t, y=np.float16(1), name="x2") def false_fn(): return mb.add(x=x1_t, y=np.float16(2), name="x2") is_one = mb.equal(x=mb.squeeze(x=x), y=np.float16(1.0)) pred = mb.squeeze(x=is_one) x3 = mb.cond(pred=pred, _true_fn=true_fn, _false_fn=false_fn) x4 = mb.add(x=x1_t, y=x3) x5 = mb.cast(x=x4, dtype="fp32") return x5 apply_pass_and_basic_check(prog, "common::topological_reorder") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == [ "cast", "square", "squeeze", "equal", "squeeze", "transpose", "cond", "add", "cast", ] assert_model_is_valid( prog, {"x": (10, 20)}, expected_output_shapes={block.outputs[0].name: (20, 10)}, ) def test_cast_transpose_already_at_the_end(self): """ Input graph: x (input) ---> square ---> transpose ---> cast (output) | | -----------> log ------> transpose ---> cast (output) | | -----------> relu -----> cast ----> relu (output) | | -----------> relu (output) """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20))]) def prog(x): x = mb.cast(x=x, dtype="fp16") x1 = mb.square(x=x) x3 = mb.log(x=x) x5 = mb.relu(x=x) x6 = mb.cast(x=x5, dtype="fp32") x7 = mb.relu(x=x6) x8 = mb.relu(x=x) x1_t = mb.transpose(x=x1, perm=[1, 0]) x2 = mb.cast(x=x1_t, 
dtype="fp32") x3_t = mb.transpose(x=x3, perm=[1, 0]) x4 = mb.cast(x=x3_t, dtype="fp32") return x2, x4, x7, x8 assert get_op_types_in_program(prog) == [ "cast", "square", "log", "relu", "cast", "relu", "relu", "transpose", "cast", "transpose", "cast", ] apply_pass_and_basic_check(prog, "common::topological_reorder") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == [ "cast", "square", "log", "relu", "cast", "relu", "relu", "transpose", "cast", "transpose", "cast", ] assert_model_is_valid( prog, {"x": (10, 20)}, expected_output_shapes={ block.outputs[0].name: (20, 10), block.outputs[1].name: (20, 10), block.outputs[2].name: (10, 20), block.outputs[3].name: (10, 20), }, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/tests/test_lower_complex_dialect_ops.py0000644000000000000000000001256614672066616032105 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy import numpy as np import pytest from coremltools import ComputeUnit from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.passes.defs.lower_complex_dialect_ops import ( _calculate_dft_matrix, ) from coremltools.converters.mil.mil.scope import ScopeInfo, ScopeSource from coremltools.converters.mil.testing_utils import ( apply_pass_and_basic_check, assert_model_is_valid, ct_convert, get_op_types_in_program, ) np.random.seed(9) class TestLowerComplexDialectOps: def test_lower_complex_real(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 3))]) def prog(x): complex_data = mb.complex(real_data=x, imag_data=x) real_data = mb.complex_real(data=complex_data) return real_data prev_prog, _, block = apply_pass_and_basic_check(prog, "common::lower_complex_dialect_ops") assert get_op_types_in_program(prev_prog) == ["complex", "complex_real"] assert get_op_types_in_program(prog) == ["identity"] inputs = {"x": (1, 2, 3)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 2, 3)}, ) def test_lower_fft_with_scope(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 3))]) def prog(x): with mb.scope(ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["m1"])): fft_res = mb.complex_fft(data=x) with mb.scope(ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["m2"])): return mb.complex_real(data=fft_res) prog._add_essential_scope_source(ScopeSource.TORCHSCRIPT_MODULE_TYPE) apply_pass_and_basic_check( prog, "common::lower_complex_dialect_ops", skip_essential_scope_check=True, # this graph pass introduces two subgraphs, while only one of them is used. 
) apply_pass_and_basic_check( prog, "common::dead_code_elimination", ) # since the _replace_var is operated on the output of complex_real, so the scope info should be "m2" block = prog.functions["main"] for op in block.operations: assert op.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["m2"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["lower_complex_dialect_ops"], } def test_lower_fft(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 3))]) def prog(x): fft_res = mb.complex_fft(data=x) real_data = mb.complex_real(data=fft_res) return real_data # Test the apply_pass_and_basic_check utils has the ability to catch errors regarding incomplete scope information with pytest.raises( ValueError, match="is missing essential scopes ScopeSource.TORCHSCRIPT_MODULE_TYPE" ): prev_prog, _, block = apply_pass_and_basic_check( copy.deepcopy(prog), "common::lower_complex_dialect_ops", ) prev_prog, _, block = apply_pass_and_basic_check( prog, "common::lower_complex_dialect_ops", skip_essential_scope_check=True, # this graph pass introduces two subgraphs, while only one of them is used. ) assert get_op_types_in_program(prev_prog) == ["complex_fft", "complex_real"] after_pass_op_types_set = set(get_op_types_in_program(prog)) # Verifies that the complex dialect ops got lowered to core ops. assert "complex_fft" not in after_pass_op_types_set assert "complex_real" not in after_pass_op_types_set apply_pass_and_basic_check( prog, "common::dead_code_elimination", ) # Verifies that the complex dialect ops got lowered to core ops. assert "complex_fft" not in after_pass_op_types_set assert "complex_real" not in after_pass_op_types_set inputs = {"x": (1, 2, 3)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 2, 3)}, ) @pytest.mark.parametrize("onesided", [True, False]) def test_calculate_dft_matrix(self, onesided): expected_C = np.zeros((16, 16)) expected_S = np.zeros((16, 16)) _range = np.arange(16) for k in range(16): expected_C[k, :] = np.cos(2 * np.pi * k * _range / 16) expected_S[k, :] = np.sin(2 * np.pi * k * _range / 16) if onesided: expected_C = expected_C[:9] expected_S = expected_S[:9] @mb.program(input_specs=[mb.TensorSpec(shape=(1,))]) def prog(x): return _calculate_dft_matrix(x, onesided=onesided) model = ct_convert( program=prog, convert_to=("neuralnetwork", "fp32"), compute_units=ComputeUnit.CPU_ONLY ) p = model.predict({"x": np.array([16.0])}) cos_matrix, sin_matrix = p["cos_0"], p["sin_0"] np.testing.assert_allclose(expected_C, cos_matrix, atol=1e-04, rtol=1e-05) np.testing.assert_allclose(expected_S, sin_matrix, atol=1e-04, rtol=1e-05) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/tests/test_optimize_linear_passes.py0000644000000000000000000002567014672066616031430 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy import itertools import numpy as np import pytest from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.passes.pass_registry import PASS_REGISTRY from coremltools.converters.mil.testing_reqs import backends from coremltools.converters.mil.testing_utils import ( apply_pass_and_basic_check, assert_model_is_valid, assert_op_count_match, assert_same_output_names, get_op_types_in_program, ) from .test_passes import _VALIDATE_MODEL class TestFuseLinearBias: @staticmethod def _apply_transform(inputs, func, is_first_input, has_bias): """ Utility function to test the weight/bias transform function in linear bias fusion pass. """ @mb.program(input_specs=[mb.TensorSpec(shape=(3, 4))]) def prog(x): if has_bias: linear = mb.linear( x=x, weight=inputs["linear_weight"], bias=inputs["linear_bias"], ) else: linear = mb.linear( x=x, weight=inputs["linear_weight"], ) if is_first_input: kwargs = { "x": linear, "y": inputs["bias"], } else: kwargs = { "x": inputs["bias"], "y": linear, } x = func(**kwargs) return x apply_pass_and_basic_check( prog, "common::fuse_linear_bias", ) # get the updated weight from the prog linear_op = [] for op in prog["main"].operations: if op.op_type == "const": continue linear_op.append(op) assert len(linear_op) == 1, "should only have one linear layer." return linear_op[0].weight.val, linear_op[0].bias.val @pytest.mark.parametrize( "op_type, is_first_input, has_bias, broadcast", itertools.product( ["add", "sub"], [True, False], [True, False], [True, False], ), ) def test_transform_linear(self, op_type, is_first_input, has_bias, broadcast): """ Test the weight / bias transform function in the linear bias fusion pass """ weight = np.reshape(np.arange(8), (2, 4)).astype(np.float32) linear_bias = ( np.array([1, 2]).astype(np.float32) if has_bias else np.array([0, 0]).astype(np.float32) ) bias = np.array([3, 4]).astype(np.float32) if broadcast: bias = np.reshape(bias, (1, 2)) inputs = { "linear_weight": weight, "linear_bias": linear_bias, "bias": bias, } if op_type == "add": func = mb.add elif op_type == "sub": func = mb.sub new_weight, new_bias = self._apply_transform( inputs, func, is_first_input, has_bias, ) if broadcast: bias = np.reshape(bias, (2,)) if op_type == "sub" and not is_first_input: expected_weight = -weight else: expected_weight = weight if op_type == "sub": if is_first_input: expected_bias = linear_bias - bias else: expected_bias = bias - linear_bias else: expected_bias = linear_bias + bias np.testing.assert_almost_equal(new_weight, expected_weight) np.testing.assert_almost_equal(new_bias, expected_bias) @pytest.mark.parametrize( "rank, op_type, is_first_input, broadcast, backend", itertools.product([1, 2, 3], ["add", "sub"], [True, False], [True, False], backends), ) def test_linear_bias_fusion(self, rank, op_type, is_first_input, broadcast, backend): """ Input graph: Const | V input -----> linear -----> add/sub ---> out Output graph: input -----> linear ----> out """ input_shape = [1, 2, 3] input_shape = input_shape[-rank:] input_shape = tuple(input_shape) @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)]) def prog(x): linear_weight = np.reshape(np.arange(6), (2, 3)).astype(np.float32) linear_bias = np.array([1.0, 2.0]) bias = np.array([3.0, 4.0]) if broadcast: if rank >= 2: bias = np.reshape(bias, (1, 2)) x = mb.linear( x=x, weight=linear_weight, 
bias=linear_bias, ) func = mb.add if op_type == "add" else mb.sub if is_first_input: kwargs = { "x": x, "y": bias, } else: kwargs = { "x": bias, "y": x, } x = func(**kwargs) return x prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::fuse_linear_bias") assert get_op_types_in_program(prev_prog) == ["linear", op_type] assert get_op_types_in_program(prog) == ["linear"] # validate graph pass output_shape = [1, 2, 2] output_shape = tuple(output_shape[-rank:]) assert_model_is_valid( prog, {"x": input_shape}, expected_output_shapes={block.outputs[0].name: output_shape}, backend=backend, ) class TestFuseMatmulWeightBias: def test_fuse_matmul_weight_bias(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): weights_val = np.random.rand(2, 4).T.astype(np.float32) weights = mb.const(val=weights_val) bias_val = np.random.rand(2).astype(np.float32) bias = mb.const(val=bias_val) matmul = mb.matmul(x=x, y=weights) return mb.add(x=matmul, y=bias) assert_op_count_match(prog, expect=1, op="matmul") assert_op_count_match(prog, expect=0, op="linear") prev_prog = copy.deepcopy(prog) PASS_REGISTRY["common::fuse_matmul_weight_bias"](prog) assert_same_output_names(prev_prog, prog) assert_op_count_match(prog, expect=0, op="matmul") assert_op_count_match(prog, expect=1, op="linear") if _VALIDATE_MODEL: assert_model_is_valid(prog, {"x": (2, 4)}) class TestFuseTransposeMatmul: def test_fuse_transposes(self): X_SHAPE = (3, 2) Y_SHAPE = (5, 2) output_shape = (X_SHAPE[0], Y_SHAPE[0]) @mb.program(input_specs=[mb.TensorSpec(shape=X_SHAPE), mb.TensorSpec(shape=Y_SHAPE)]) def prog(x, y): transposed_x = mb.transpose(x=x, perm=(1, 0)) transposed_y = mb.transpose(x=y, perm=(1, 0)) z = mb.matmul(x=transposed_x, y=transposed_y, transpose_x=True, transpose_y=False) return z prev_prog, _, _ = apply_pass_and_basic_check(prog, "common::fuse_transpose_matmul") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prev_prog) == ["transpose", "transpose", "matmul"] assert get_op_types_in_program(prog) == ["matmul"] matmul = prog.find_ops(op_type="matmul")[0] assert not matmul.transpose_x.val assert matmul.transpose_y.val assert_model_is_valid( prog, {"x": X_SHAPE, "y": Y_SHAPE}, expected_output_shapes={block.outputs[0].name: output_shape}, ) def test_fuse_transpose_y(self): X_SHAPE = (3, 2) Y_SHAPE = (2, 5) output_shape = (X_SHAPE[0], Y_SHAPE[1]) @mb.program(input_specs=[mb.TensorSpec(shape=X_SHAPE), mb.TensorSpec(shape=Y_SHAPE)]) def prog(x, y): transposed_y = mb.transpose(x=y, perm=(1, 0)) z = mb.matmul(x=x, y=transposed_y, transpose_y=True) return z prev_prog, _, _ = apply_pass_and_basic_check(prog, "common::fuse_transpose_matmul") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prev_prog) == ["transpose", "matmul"] assert get_op_types_in_program(prog) == ["matmul"] matmul = prog.find_ops(op_type="matmul")[0] assert not matmul.transpose_x.val assert not matmul.transpose_y.val assert_model_is_valid( prog, {"x": X_SHAPE, "y": Y_SHAPE}, expected_output_shapes={block.outputs[0].name: output_shape}, ) def test_fuse_transpose_x_but_unfuseable_transpose_y(self): X_SHAPE = (4, 2, 5, 3) Y_SHAPE = (4, 5, 2, 7) output_shape = (X_SHAPE[0], X_SHAPE[1], X_SHAPE[3], Y_SHAPE[3]) @mb.program(input_specs=[mb.TensorSpec(shape=X_SHAPE), mb.TensorSpec(shape=Y_SHAPE)]) def prog(x, y): transposed_x = mb.transpose(x=x, perm=(0, 1, 3, 2)) transposed_y = mb.transpose(x=y, perm=(0, 2, 1, 3)) z = 
mb.matmul(x=transposed_x, y=transposed_y) return z prev_prog, _, _ = apply_pass_and_basic_check(prog, "common::fuse_transpose_matmul") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prev_prog) == ["transpose", "transpose", "matmul"] assert get_op_types_in_program(prog) == ["transpose", "matmul"] assert_model_is_valid( prog, {"x": X_SHAPE, "y": Y_SHAPE}, expected_output_shapes={block.outputs[0].name: output_shape}, ) def test_unfuseable_transposes(self): X_SHAPE = (3, 2, 5) Y_SHAPE = (5, 2, 7) output_shape = (X_SHAPE[1], X_SHAPE[0], Y_SHAPE[2]) @mb.program(input_specs=[mb.TensorSpec(shape=X_SHAPE), mb.TensorSpec(shape=Y_SHAPE)]) def prog(x, y): transposed_x = mb.transpose(x=x, perm=(1, 0, 2)) transposed_y = mb.transpose(x=y, perm=(1, 0, 2)) z = mb.matmul(x=transposed_x, y=transposed_y) return z prev_prog, _, _ = apply_pass_and_basic_check(prog, "common::fuse_transpose_matmul") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prev_prog) == ["transpose", "transpose", "matmul"] assert get_op_types_in_program(prev_prog) == get_op_types_in_program(prog) assert_model_is_valid( prog, {"x": X_SHAPE, "y": Y_SHAPE}, expected_output_shapes={block.outputs[0].name: output_shape}, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/tests/test_pass_pipeline.py0000644000000000000000000001267614672066616027515 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np import pytest from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.passes.pass_pipeline import PassPipeline, PassPipelineManager from coremltools.converters.mil.testing_utils import assert_model_is_valid, get_op_types_in_program np.random.seed(1984) class TestPassPipeline: def test_add_pass(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): x = mb.relu(x=x) x = mb.relu(x=x) x = mb.add(x=x, y=1.0) return x assert get_op_types_in_program(prog) == ["relu", "relu", "add"] pipeline = PassPipeline.EMPTY pipeline.append_pass("common::merge_consecutive_relus") assert pipeline.passes == ["common::merge_consecutive_relus"] PassPipelineManager.apply_pipeline(prog, pipeline) assert get_op_types_in_program(prog) == ["relu", "add"] inputs = {"x": (2, 3)} assert_model_is_valid( prog, inputs, expected_output_shapes={prog.functions["main"].outputs[0].name: (2, 3)}, ) def test_insert_pass_at_index(self): pipeline = PassPipeline.EMPTY pipeline.insert_pass(index=0, pass_name="common::merge_consecutive_relus") pipeline.insert_pass(index=0, pass_name="common::noop_elimination") pipeline.insert_pass(index=1, pass_name="common::noop_elimination") pipeline.insert_pass(index=1, pass_name="common::merge_consecutive_reshapes") assert pipeline.passes == [ "common::noop_elimination", "common::merge_consecutive_reshapes", "common::noop_elimination", "common::merge_consecutive_relus", ] def test_insert_invalid_pass(self): pipeline = PassPipeline.EMPTY with pytest.raises(ValueError, match="The pass test_pass is not registered."): pipeline.append_pass("test_pass") with pytest.raises(ValueError, match="The pass test_pass is not registered."): pipeline.insert_pass(0, "test_pass") with 
pytest.raises(ValueError, match="The pass invalid_pass is not registered."): pipeline.passes = ["invalid_pass"] def test_remove_passes(self): pipeline = PassPipeline.EMPTY pipeline.passes = [ "common::noop_elimination", "common::merge_consecutive_reshapes", "common::noop_elimination", "common::merge_consecutive_relus", ] pipeline.remove_passes(passes_names=["common::noop_elimination"]) assert pipeline.passes == [ "common::merge_consecutive_reshapes", "common::merge_consecutive_relus", ] pipeline.remove_pass(index=1) assert pipeline.passes == ["common::merge_consecutive_reshapes"] def test_set_pass_options(self): pipeline = PassPipeline.EMPTY pipeline.append_pass("common::add_fp16_cast") assert pipeline.get_options("common::add_fp16_cast") is None pipeline.set_options("common::add_fp16_cast", {"skip_ops_by_type": "matmul,const"}) assert len(pipeline.get_options("common::add_fp16_cast")) == 1 assert pipeline.get_options("common::add_fp16_cast")[0].option_name == "skip_ops_by_type" assert pipeline.get_options("common::add_fp16_cast")[0].option_val == "matmul,const" def test_set_pass_options_already_exist(self): pipeline = PassPipeline() pipeline.set_options("common::add_fp16_cast", {"skip_ops_by_type": "matmul,const"}) with pytest.raises( ValueError, match="The pass common::add_fp16_cast already has associated options." ): pipeline.set_options("common::add_fp16_cast", {"skip_ops_by_type": "concat"}, override=False) # Override the options. pipeline.set_options("common::add_fp16_cast", {"skip_ops_by_type": "concat"}) assert pipeline.get_options("common::add_fp16_cast")[0].option_name == "skip_ops_by_type" assert pipeline.get_options("common::add_fp16_cast")[0].option_val == "concat" def test_set_pass_options_for_pass_not_in_pipeline(self): pipeline = PassPipeline.EMPTY pipeline.set_options("common::add_fp16_cast", {"skip_ops_by_type": "matmul,const"}) with pytest.raises( ValueError, match="This pass pipeline is not valid. The pass common::add_fp16_cast " "has associated options but it's not in the passes.", ): pipeline.validate() def test_get_invalid_pipeline(self): with pytest.raises( ValueError, match="There is no pipeline for `invalid`.", ): PassPipeline.get_pipeline("invalid") def test_list_available_pipelines(self): available_pipelines = PassPipeline.list_available_pipelines() assert len(available_pipelines) == 12 assert "default" in available_pipelines assert "default_palettization" in available_pipelines @staticmethod def test_get_pipeline_should_use_copy(): pipeline = PassPipeline.DEFAULT_PRUNING pipeline.append_pass("compression::palettize_weights") pipeline_2 = PassPipeline.DEFAULT_PRUNING assert "compression::palettize_weights" not in pipeline_2.passes ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/tests/test_passes.py0000644000000000000000000100507414672066616026153 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy import itertools import unittest import numpy as np import pytest import torch import coremltools as ct import coremltools.optimize as cto from coremltools._deps import _IS_MACOS from coremltools.converters.mil import mil from coremltools.converters.mil.experimental.passes.generic_pass_infrastructure import ( register_generic_pass, ) from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Function, types from coremltools.converters.mil.mil.ops.defs.iOS15.elementwise_unary import cast as _cast_iOS14 from coremltools.converters.mil.mil.ops.defs.iOS17.elementwise_unary import cast as _cast_iOS17 from coremltools.converters.mil.mil.passes.defs.optimize_repeat_ops import cast_optimization from coremltools.converters.mil.mil.passes.helper import _check_var_scalar_value from coremltools.converters.mil.mil.passes.pass_registry import PASS_REGISTRY from coremltools.converters.mil.mil.scope import ScopeInfo, ScopeSource from coremltools.converters.mil.mil.types import numpy_type_to_builtin_type from coremltools.converters.mil.mil.types.type_mapping import builtin_to_string from coremltools.converters.mil.testing_reqs import backends from coremltools.converters.mil.testing_utils import ( apply_pass_and_basic_check, assert_model_is_valid, assert_op_count_match, assert_same_output_names, get_op_types_in_block, get_op_types_in_program, ) from coremltools.models.utils import _macos_version np.random.seed(1984) _VALIDATE_MODEL = True def _get_constexpr_cast(shape, seed=None): if seed is not None: np.random.seed(seed) val = np.random.rand(*shape).astype(np.float16) return mb.constexpr_cast(source_val=val, output_dtype="fp32") def _get_constexpr_sparse_to_dense(shape, seed=None): if seed is not None: np.random.seed(seed) val = np.random.rand(*shape) sparse_params = cto.coreml._quantization_passes.prune_weights.compress_by_magnitude( val=val, target_sparsity=0.4 ) return mb.constexpr_sparse_to_dense( nonzero_data=sparse_params.nonzero_data, mask=sparse_params.mask, shape=np.uint32(sparse_params.shape), ) def _get_constexpr_lut_to_dense(shape, seed=None): if seed is not None: np.random.seed(seed) val = np.random.rand(*shape) lut_params = cto.coreml._quantization_passes.palettize_weights.compress( val=val, nbits=4, mode="UNIFORM" ) return mb.constexpr_lut_to_dense( indices=lut_params.indices, lut=lut_params.lut, shape=np.uint32(lut_params.shape), ) def _get_constexpr_affine_dequantize(shape, seed=None): if seed is not None: np.random.seed(seed) val = np.random.rand(*shape) quant_params = cto.coreml._quantization_passes.linear_quantize_weights.compress( val=val, axis=0, mode="LINEAR_SYMMETRIC", dtype=types.uint8 ) return mb.constexpr_affine_dequantize( quantized_data=quant_params.quantized_data, zero_point=quant_params.zero_point, scale=quant_params.scale, axis=quant_params.axis, ) def _get_constexpr_val(constexpr_var): assert "constexpr" in constexpr_var.op.op_type if constexpr_var.val is not None: return constexpr_var.val return constexpr_var.op.materialized_val_inference() CONSTEXPR_FUNCS = { "constexpr_cast": _get_constexpr_cast, "constexpr_sparse_to_dense": _get_constexpr_sparse_to_dense, "constexpr_lut_to_dense": _get_constexpr_lut_to_dense, "constexpr_affine_dequantize": _get_constexpr_affine_dequantize, } CONSTEXPR_OPS = [ "constexpr_cast", "constexpr_sparse_to_dense", 
"constexpr_lut_to_dense", "constexpr_affine_dequantize", ] class TestFuseSqueezeExpandDims: @pytest.mark.parametrize( "rank", [1, 5], ) def test_fuse_squeeze_expand_dims_basic(self, rank): """ Given: %1 = squeeze(%x) %2 = expand_dims(%1) %3 = relu(%2) Result: %3 = relu(%x) """ if rank == 1: input_shape = (1,) axes = (0,) else: assert rank == 5 input_shape = (3, 1, 4, 1, 1) axes = (1, 3, 4) @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)]) def prog(x): x = mb.squeeze(x=x, axes=axes) x = mb.expand_dims(x=x, axes=axes) return mb.relu(x=x) # fuse_squeeze_expand_dims fused squeeze + expand_dims into identity apply_pass_and_basic_check(prog, "common::fuse_squeeze_expand_dims") assert get_op_types_in_program(prog) == ["identity", "relu"] # noop_elimination can further remove the identity op apply_pass_and_basic_check(prog, "common::noop_elimination") assert get_op_types_in_program(prog) == ["relu"] def test_fuse_squeeze_expand_dims_negative(self): """ If squeeze and expand_dims cannot cancel each other, the graph pass does nothing """ @mb.program(input_specs=[mb.TensorSpec(shape=(3, 1, 4, 1, 1))]) def prog(x): x = mb.squeeze(x=x, axes=(1, 2)) x = mb.expand_dims(x=x, axes=(1, 3)) return mb.relu(x=x) apply_pass_and_basic_check(prog, "common::fuse_squeeze_expand_dims") assert get_op_types_in_program(prog) == ["squeeze", "expand_dims", "relu"] def test_fuse_squeeze_expand_dims_connected_output(self): """ If squeeze is connected to block output, it cannot be removed. However, the expand_dims can be a block output. """ # squeeze connected to output. Nothing happens. @mb.program(input_specs=[mb.TensorSpec(shape=(1,))]) def prog(x): squeeze = mb.squeeze(x=x, axes=(0,)) expand_dims = mb.expand_dims(x=squeeze, axes=(0,)) return mb.relu(x=expand_dims), squeeze apply_pass_and_basic_check(prog, "common::fuse_squeeze_expand_dims") assert get_op_types_in_program(prog) == ["squeeze", "expand_dims", "relu"] # expand_dims connected to output. Still good to fuse. @mb.program(input_specs=[mb.TensorSpec(shape=(1,))]) def prog(x): squeeze = mb.squeeze(x=x, axes=(0,)) expand_dims = mb.expand_dims(x=squeeze, axes=(0,)) return mb.relu(x=expand_dims), expand_dims apply_pass_and_basic_check(prog, "common::fuse_squeeze_expand_dims") assert get_op_types_in_program(prog) == ["identity", "relu"] class TestAddConvTransposeOutputShape: def test_add_conv_transpose_output_shape(self): """ Given: %1: (1, 5, 39, fp32) = conv_transpose(...) # no output_shape input. 
Result: %2: (3, i32) = const(val=[1,5,39]) %3: (1, 5, 39, fp32) = conv_transpose(..., output_shape=%2) """ N, C_in, C_out, D1 = 1, 3, 5, 20 @mb.program(input_specs=[mb.TensorSpec(shape=(N, C_in, D1))]) def prog(x): weight = np.random.rand(C_in, C_out, D1).astype(np.float32) return mb.conv_transpose(x=x, weight=weight) prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::add_conv_transpose_output_shape" ) assert get_op_types_in_program(prev_prog) == ["conv_transpose"] assert get_op_types_in_program(prog) == ["conv_transpose"] prev_conv_transpose_op = prev_prog.find_ops(op_type="conv_transpose", exactly_one=True)[0] conv_transpose_op = prog.find_ops(op_type="conv_transpose", exactly_one=True)[0] assert np.all(conv_transpose_op.output_shape.val == prev_conv_transpose_op.outputs[0].shape) class TestChildOrdering: def test_generic_child_ordering(self): """ Checks that the new generic pattern matching infrastructure works regardless of the ordering of an operation's children """ @mb.program(input_specs=[mb.TensorSpec(shape=(3, 5, 6))]) def prog(x): power = mb.pow(x=x, y=3.0, name="thepowerop") add_0 = mb.add(x=power, y=5.0, name="add_0") sub_0 = mb.sub(x=power, y=5.0, name="sub_0") mul_0 = mb.mul(x=power, y=5.0, name="mul_0") add_1 = mb.add(x=add_0, y=mul_0, name="add_1") add_2 = mb.add(x=sub_0, y=add_1, name="add_2") return add_2 @mb.program(input_specs=[mb.TensorSpec(shape=(3, 5, 6))]) def ops_arrangement(x): power = mb.pow(x=x, y=3.0, name="thepowerop") sub_0 = mb.sub(x=power, y=5.0, name="sub_0") add_0 = mb.add(x=power, y=5.0, name="add_0") mul_0 = mb.mul(x=power, y=5.0, name="mul_0") add_1 = mb.add(x=add_0, y=mul_0, name="add_1") add_2 = mb.add(x=sub_0, y=add_1, name="add_2") return add_2 def var_constraints(pattern): constraints_passed = True constraints_passed &= _check_var_scalar_value(pattern.thepowerop.y, 3) constraints_passed &= _check_var_scalar_value(pattern.sub_0.y, 5) constraints_passed &= _check_var_scalar_value( pattern.add_0.x, 5 ) or _check_var_scalar_value(pattern.add_0.y, 5) constraints_passed &= _check_var_scalar_value( pattern.mul_0.x, 5 ) or _check_var_scalar_value(pattern.mul_0.y, 5) return constraints_passed def transform_pattern(pattern): out_name = "new operation" x = mb.gelu( x=pattern.root_var, mode="TANH_APPROXIMATION", name=out_name, before_op=pattern.thepowerop, ) pattern.add_2.enclosing_block.replace_uses_of_var_after_op( anchor_op=pattern.add_2, old_var=pattern.add_2.outputs[0], new_var=x ) pattern.block.remove_ops(pattern.op_list()) register_generic_pass( ops_arrangement=ops_arrangement, var_constraints=var_constraints, transform_pattern=transform_pattern, pass_name="test_generic_child_ordering", namespace="common", ) prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::test_generic_child_ordering" ) assert get_op_types_in_program(prev_prog) == [ "pow", "add", "sub", "mul", "add", "add", ] assert get_op_types_in_program(prog) == ["gelu"] assert_model_is_valid( prog, {"x": (3, 5, 6)}, expected_output_shapes={block.outputs[0].name: (3, 5, 6)}, ) class TestGeluFusion: def test_gelu_tanh_approximation1(self): """ Detect gelu tanh approx pattern, found in the TF bert model. 
y = ( tanh((.0447)x^3 + x ) * (sqrt(2/pi)) + 1 ) * 0.5 * x """ @mb.program(input_specs=[mb.TensorSpec(shape=(3, 5, 6))]) def prog(x): x1 = mb.pow(x=x, y=3.0) x1 = mb.mul(x=0.044715, y=x1) x1 = mb.add(x=x1, y=x) x1 = mb.mul(x=x1, y=np.sqrt(2 / np.pi)) x1 = mb.tanh(x=x1) x1 = mb.add(x=1.0, y=x1) x1 = mb.mul(x=0.5, y=x1) x1 = mb.mul(x=x, y=x1) return x1 prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::fuse_gelu_tanh_approximation" ) assert get_op_types_in_program(prev_prog) == [ "pow", "mul", "add", "mul", "tanh", "add", "mul", "mul", ] assert get_op_types_in_program(prog) == ["gelu"] assert_model_is_valid( prog, {"x": (3, 5, 6)}, expected_output_shapes={block.outputs[0].name: (3, 5, 6)}, ) @pytest.mark.parametrize( "first_op_1, first_op_2, first_op_3, first_op_4, first_op_5, first_op_6", itertools.product( [True, False], [True, False], [True, False], [True, False], [True, False], [True, False] ), ) def test_gelu_tanh_approximation2( self, first_op_1, first_op_2, first_op_3, first_op_4, first_op_5, first_op_6 ): """ Detect gelu tanh approx pattern, found in the TF Sanitized GPT2 model. y = ( tanh((.0447)x^3 + x ) * (sqrt(2/pi)) + 1 ) * 0.5 * x """ @mb.program(input_specs=[mb.TensorSpec(shape=(3, 5, 6))]) def prog(x): firstmul = mb.mul(x=x, y=0.5) if first_op_1 else mb.mul(x=0.5, y=x) x1 = mb.pow(x=x, y=3.0) x1 = mb.mul(x=0.044715, y=x1) if first_op_2 else mb.mul(x=x1, y=0.044715) x1 = mb.add(x=x1, y=x) if first_op_3 else mb.add(x=x, y=x1) x1 = ( mb.mul(x=x1, y=np.sqrt(2 / np.pi)) if first_op_4 else mb.mul(x=np.sqrt(2 / np.pi), y=x1) ) x1 = mb.tanh(x=x1) x1 = mb.add(x=1.0, y=x1) if first_op_5 else mb.add(x=x1, y=1.0) x1 = mb.mul(x=firstmul, y=x1) if first_op_6 else mb.mul(x=x1, y=firstmul) return x1 prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::fuse_gelu_tanh_approximation" ) assert get_op_types_in_program(prev_prog) == [ "mul", "pow", "mul", "add", "mul", "tanh", "add", "mul", ] assert get_op_types_in_program(prog) == ["gelu"] assert_model_is_valid( prog, {"x": (3, 5, 6)}, expected_output_shapes={block.outputs[0].name: (3, 5, 6)}, ) def test_gelu_tanh_multiple_final_operations(self): """ The generic pattern matching only supports one final output operation. For multiple final operations, we want to make sure it just skip the pattern matching instead of failing the whole conversion. """ @mb.program(input_specs=[mb.TensorSpec(shape=(3, 5, 6))]) def prog(x): x_1 = mb.mul(x=x, y=0.5) x_2 = mb.pow(x=x, y=3.0) x_2 = mb.mul(x=x_2, y=0.044715) x_2 = mb.add(x=x, y=x_2) x_2 = mb.mul(x=x_2, y=np.sqrt(2 / np.pi)) x_2 = mb.tanh(x=x_2) x_2 = mb.add(x=x_2, y=1.0) x_2 = mb.mul(x=x_1, y=x_2) x_2 = mb.mul(x=x_2, y=1.0) return x_2 with pytest.warns( UserWarning, match="User defined pattern matched to more than one final operation. " "Skipped the pattern matching.", ): apply_pass_and_basic_check(prog, "common::fuse_gelu_tanh_approximation") @pytest.mark.parametrize( "op_type, is_first_op1, is_first_op2, is_first_op3, is_first_op4, const_mul_first", itertools.product( ["real_div", "mul"], [True, False], [True, False], [True, False], [True, False], [True, False], ), ) def test_gelu_exact( self, op_type, is_first_op1, is_first_op2, is_first_op3, is_first_op4, const_mul_first ): """ Detect gelu exact pattern. 
y = 0.5 * (x * ( 1 + erf ( x / sqrt(2)))) or y = x * (0.5 * ( 1 + erf ( x / sqrt(2)))) """ @mb.program(input_specs=[mb.TensorSpec(shape=(3, 5, 6))]) def prog(x): if op_type == "real_div": x1 = mb.real_div(x=x, y=2**0.5) elif op_type == "mul": x1 = mb.mul(x=x, y=2**-0.5) if is_first_op1 else mb.mul(x=2**-0.5, y=x) x2 = mb.erf(x=x1) x3 = mb.add(x=x2, y=1.0) if is_first_op2 else mb.add(x=1.0, y=x2) if const_mul_first: y1 = mb.const(val=0.5) y2 = x else: y1 = x y2 = mb.const(val=0.5) x4 = mb.mul(x=x3, y=y1) if is_first_op3 else mb.mul(x=y1, y=x3) x5 = mb.mul(x=x4, y=y2) if is_first_op4 else mb.mul(x=y2, y=x4) return x5 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::fuse_gelu_exact") assert get_op_types_in_program(prev_prog) == [ op_type, "erf", "add", "mul", "mul", ] assert get_op_types_in_program(prog) == ["gelu"] assert_model_is_valid( prog, {"x": (3, 5, 6)}, expected_output_shapes={block.outputs[0].name: (3, 5, 6)}, ) @pytest.mark.parametrize( "is_first_op0, is_first_op4", itertools.product( [True, False], [True, False], ), ) def test_gelu_exact_pattern_2(self, is_first_op0, is_first_op4): """ Detect gelu exact pattern. y = (0.5 * x) * ( 1 + erf ( x / sqrt(2))) """ @mb.program(input_specs=[mb.TensorSpec(shape=(3, 5, 6))]) def prog(x): x0 = mb.mul(x=x, y=0.5) if is_first_op0 else mb.mul(x=0.5, y=x) x1 = mb.mul(x=x, y=2**-0.5) x2 = mb.erf(x=x1) x3 = mb.add(x=x2, y=1.0) x4 = mb.mul(x=x0, y=x3) if is_first_op4 else mb.mul(x=x3, y=x0) return x4 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::fuse_gelu_exact") assert get_op_types_in_program(prev_prog) == [ "mul", "mul", "erf", "add", "mul", ] assert get_op_types_in_program(prog) == ["gelu"] assert_model_is_valid( prog, {"x": (3, 5, 6)}, expected_output_shapes={block.outputs[0].name: (3, 5, 6)}, ) class TestLeakyReluFusion: @pytest.mark.parametrize( "swap_mul_input_order, swap_max_input_order", itertools.product( [True, False], [True, False], ), ) def test_valid_leaky_relu_pattern(self, swap_mul_input_order, swap_max_input_order): """ Input graph: const (val = 0.3) | input ----> mul ---------------> maximum -----------> output | | |---------------------------------- Output graph: input --------> leaky_relu ---------> output """ @mb.program(input_specs=[mb.TensorSpec(shape=(3, 5, 6))]) def prog(x): if swap_mul_input_order: x1 = mb.mul(x=x, y=0.3) else: x1 = mb.mul(x=0.3, y=x) if swap_max_input_order: x1 = mb.maximum(x=x1, y=x) else: x1 = mb.maximum(x=x, y=x1) return x1 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::fuse_leaky_relu") assert get_op_types_in_program(prev_prog) == ["mul", "maximum"] assert get_op_types_in_program(prog) == ["leaky_relu"] assert_model_is_valid( prog, {"x": (3, 5, 6)}, expected_output_shapes={block.outputs[0].name: (3, 5, 6)}, ) def test_invalid_leaky_relu_pattern1(self): """ Invalid because the alpha value is greater than 1 Input graph: const (val = 1.3) | input ----> mul ---------------> maximum -----------> output | | |---------------------------------- Output graph: same as input graph """ @mb.program(input_specs=[mb.TensorSpec(shape=(3, 5, 6))]) def prog(x): x1 = mb.mul(x=x, y=1.3) x1 = mb.maximum(x=x1, y=x) return x1 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::fuse_leaky_relu") assert get_op_types_in_program(prev_prog) == ["mul", "maximum"] assert get_op_types_in_program(prog) == ["mul", "maximum"] def test_invalid_leaky_relu_pattern2(self): """ Invalid because the input to the "maximum" op is not the same as the input of the
"mul" op Input graph: const (val = 0.3) | input ----> mul ---------------> maximum -----------> output | const Output graph: same as input graph """ @mb.program(input_specs=[mb.TensorSpec(shape=(3, 5, 6))]) def prog(x): x1 = mb.mul(x=x, y=0.3) x1 = mb.maximum(x=x1, y=0.4) return x1 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::fuse_leaky_relu") assert get_op_types_in_program(prev_prog) == ["mul", "maximum"] assert get_op_types_in_program(prog) == ["mul", "maximum"] class TestPreluFusion: @pytest.mark.parametrize( "swap_input_order, alpha_rank", itertools.product( [True, False], [3, 4], ), ) def test_channel_first_pattern(self, swap_input_order, alpha_rank): """ Input: | ------------> relu --------------------| | V x (BCHW) ------| add -----> y (BCHW) | ^ --------> mul -------> relu -----> mul---| ^ ^ | | Const(val=-1) Const(name=a, shape=(1,C,1,1)) Output: x (BCHW) ------> prelu(alpha=a, shape=(C,)) ---------> y (BCHW) """ B, C, H, W = 2, 3, 5, 6 if alpha_rank == 3: alpha = np.random.rand(C, 1, 1) elif alpha_rank == 4: alpha = np.random.rand(1, C, 1, 1) else: raise NotImplementedError("alpha rank must be 3 or 4") @mb.program(input_specs=[mb.TensorSpec(shape=(B, C, H, W))]) def prog(x): if swap_input_order: neg = mb.mul(x=x, y=-1.0) else: neg = mb.mul(x=-1.0, y=x) relu1 = mb.relu(x=neg) if swap_input_order: mul = mb.mul(x=relu1, y=alpha) else: mul = mb.mul(x=alpha, y=relu1) relu2 = mb.relu(x=x) if swap_input_order: out = mb.add(x=relu2, y=mul) else: out = mb.add(x=mul, y=relu2) return out prev_prog, _, _ = apply_pass_and_basic_check( prog, "common::fuse_prelu", ) assert get_op_types_in_program(prev_prog) == ["mul", "relu", "mul", "relu", "add"] assert get_op_types_in_program(prog) == ["prelu"] @pytest.mark.parametrize( "swap_input_order, alpha_rank", itertools.product( [True, False], [1, 2, 3], ), ) def test_channel_last_transpose_pattern(self, swap_input_order, alpha_rank): """ Input: | ------------> relu --------------------| | V x (shappe=BCHW)-->transpose(out_shape=BHWC)---->| add -----> y (BHWC) | ^ --------> mul -------> relu -----> mul---| ^ ^ | | Const(val=-1) Const(shape=(1,1,C)) Output: x (BCHW) ------> prelu ---------> transpose ------> y (BHWC) """ B, C, H, W = 2, 3, 5, 6 if alpha_rank == 1: alpha = np.random.rand(C) elif alpha_rank == 2: alpha = np.random.rand(1, C) elif alpha_rank == 3: alpha = np.random.rand(1, 1, C) else: raise NotImplementedError("alpha rank must be 1 or 2 or 3") @mb.program(input_specs=[mb.TensorSpec(shape=(B, C, H, W))]) def prog(x): x = mb.transpose(x=x, perm=[0, 2, 3, 1]) if swap_input_order: neg = mb.mul(x=x, y=-1.0) else: neg = mb.mul(x=-1.0, y=x) relu1 = mb.relu(x=neg) if swap_input_order: mul = mb.mul(x=relu1, y=alpha) else: mul = mb.mul(x=alpha, y=relu1) relu2 = mb.relu(x=x) if swap_input_order: out = mb.add(x=relu2, y=mul) else: out = mb.add(x=mul, y=relu2) return out prev_prog, _, block = apply_pass_and_basic_check( prog, "common::fuse_prelu", ) assert get_op_types_in_program(prev_prog) == [ "transpose", "mul", "relu", "mul", "relu", "add", ] assert get_op_types_in_program(prog) == ["prelu", "transpose"] assert_model_is_valid( prog, {"x": (B, C, H, W)}, expected_output_shapes={block.outputs[0].name: (B, H, W, C)}, ) class TestPreluToLrelu: def test_prelu_to_lrelu(self): @mb.program(input_specs=[mb.TensorSpec(shape=(4, 2, 3, 1))]) def prog(x): # Not a common leakage factor. 
alpha_0 = np.array([1.0, 2.0], dtype=np.float32) x = mb.prelu(x=x, alpha=alpha_0) add_val = np.random.rand(4, 2, 3, 1).astype(np.float32) x = mb.add(x=x, y=add_val) # Common leakage factor. alpha_1 = np.array([1.5, 1.5], dtype=np.float32) x = mb.prelu(x=x, alpha=alpha_1) return x assert_op_count_match(prog, expect=2, op="prelu") assert_op_count_match(prog, expect=0, op="leaky_relu") prev_prog, _, _ = apply_pass_and_basic_check(prog, "common::prelu_to_lrelu") assert_same_output_names(prev_prog, prog) # The prelu with a common leakage factor becomes leaky_relu. assert_op_count_match(prog, expect=1, op="prelu") assert_op_count_match(prog, expect=1, op="leaky_relu") if _VALIDATE_MODEL: assert_model_is_valid(prog, {"x": (4, 2, 3, 1)}) class TestSkipConstexprOps: @staticmethod @pytest.mark.parametrize( "constexpr_op", CONSTEXPR_OPS, ) def test_skip_const_elimination(constexpr_op): """ constexpr_op | v const -> linear | v input --------------> add -> output We are testing that: 1. constexpr_op can serve as a const input weight for linear op 2. linear op shouldn't be removed by the const_elimination pass """ @mb.program(input_specs=[mb.TensorSpec(shape=(4,))]) def prog(x): a = np.random.rand( 2, ) constexpr = CONSTEXPR_FUNCS[constexpr_op]((4, 2)) linear = mb.linear(x=a, weight=constexpr) return mb.add(x=x, y=linear) PASS_REGISTRY["common::const_elimination"](prog) assert get_op_types_in_program(prog) == [constexpr_op, "linear", "add"] @staticmethod @pytest.mark.parametrize( "constexpr_op, weight_constexpr, bias_constexpr", itertools.product( CONSTEXPR_OPS, [True, False], [True, False], ), ) def test_skip_fuse_matmul_weight_bias(constexpr_op, weight_constexpr, bias_constexpr): """ const_1 const_2 | | v v input -----> matmul -----> add ---> out In this case, if either const_1 or const_2 is constexpr op, they should be not fused into a single linear op """ def get_matmul(x, weight_constexpr): weight = CONSTEXPR_FUNCS[constexpr_op]((3, 2)) if not weight_constexpr: weight = _get_constexpr_val(weight) return mb.matmul(x=x, y=weight) def get_add(x, bias_constexpr): bias = CONSTEXPR_FUNCS[constexpr_op]((2,)) if not bias_constexpr: bias = _get_constexpr_val(bias) return mb.add(x=x, y=bias) @mb.program(input_specs=[mb.TensorSpec(shape=(1, 3))]) def prog(x): x = get_matmul(x, weight_constexpr) x = get_add(x, bias_constexpr) return x apply_pass_and_basic_check(prog, "common::fuse_matmul_weight_bias") apply_pass_and_basic_check(prog, "common::const_elimination") apply_pass_and_basic_check(prog, "common::dead_code_elimination") if not weight_constexpr and not bias_constexpr: expected_ops = ["linear"] else: expected_ops = [] if weight_constexpr: expected_ops.append(constexpr_op) expected_ops.append("matmul") if bias_constexpr: expected_ops.append(constexpr_op) expected_ops.append("add") assert get_op_types_in_program(prog) == expected_ops @staticmethod @pytest.mark.parametrize( "constexpr_op, op, weight_constexpr, const_constexpr", itertools.product( CONSTEXPR_OPS, ["mul", "add"], [True, False], [True, False], ), ) def test_skip_fuse_conv(constexpr_op, op, weight_constexpr, const_constexpr): """ const_1 const_2 | | v v input -----> conv -----> mul/add ---> out This pattern shouldn't be fused into a single conv layer if one of const_1 or const_2 is a constexpr op. 
""" Cin, Cout = 3, 3 input_shape = (2, Cin, 5, 5) @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)]) def prog(x): conv_weight = CONSTEXPR_FUNCS[constexpr_op]((Cout, Cin, 2, 2)) if not weight_constexpr: conv_weight = _get_constexpr_val(conv_weight) x = mb.conv(x=x, weight=conv_weight) const = CONSTEXPR_FUNCS[constexpr_op]((Cout, 1, 1)) if not const_constexpr: const = _get_constexpr_val(const) return getattr(mb, op)(x=x, y=const) apply_pass_and_basic_check(prog, "common::fuse_conv_scale") apply_pass_and_basic_check(prog, "common::fuse_conv_bias") apply_pass_and_basic_check(prog, "common::const_elimination") apply_pass_and_basic_check(prog, "common::dead_code_elimination") expected_ops = [] if not weight_constexpr and not const_constexpr: expected_ops = ["conv"] else: if weight_constexpr: expected_ops.append(constexpr_op) expected_ops.append("conv") if const_constexpr: expected_ops.append(constexpr_op) if op != "add" or const_constexpr: expected_ops.append(op) assert get_op_types_in_program(prog) == expected_ops @staticmethod @pytest.mark.parametrize( "constexpr_op, weight_constexpr, bias_constexpr", itertools.product( CONSTEXPR_OPS, [True, False], [True, False], ), ) def test_skip_fuse_linear_bias(constexpr_op, weight_constexpr, bias_constexpr): """ const_1 const_2 | | v V input -----> linear -----> add ---> out This pattern shouldn't be fused into a single linear layer if one of const_1 or const_2 is a constexpr op. """ @mb.program(input_specs=[mb.TensorSpec(shape=(2,))]) def prog(x): weight = CONSTEXPR_FUNCS[constexpr_op]((4, 2)) if not weight_constexpr: weight = _get_constexpr_val(weight) linear = mb.linear(x=x, weight=weight) bias = CONSTEXPR_FUNCS[constexpr_op]((4,)) if not bias_constexpr: bias = _get_constexpr_val(bias) return mb.add(x=linear, y=bias) apply_pass_and_basic_check(prog, "common::fuse_linear_bias") apply_pass_and_basic_check(prog, "common::const_elimination") apply_pass_and_basic_check(prog, "common::dead_code_elimination") expected_ops = [] if not weight_constexpr and not bias_constexpr: expected_ops = ["linear"] else: if weight_constexpr: expected_ops.append(constexpr_op) expected_ops.append("linear") if bias_constexpr: expected_ops.append(constexpr_op) expected_ops.append("add") assert get_op_types_in_program(prog) == expected_ops @staticmethod @pytest.mark.parametrize( "constexpr_op, weight_constexpr, bias_constexpr", itertools.product( CONSTEXPR_OPS, [True, False], [True, False], ), ) def test_skip_fuse_conv_batchnorm(constexpr_op, weight_constexpr, bias_constexpr): """ weight bias | | |_____ ____| | | v v input -----> conv -----> batch_norm ---> out This pattern shouldn't be fused into a single conv layer if one of the weight / bias is a constexpr op. 
""" Cin, Cout = 2, 3 input_shape = (2, Cin, 5, 5) @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)]) def prog(x): # conv layer weight = CONSTEXPR_FUNCS[constexpr_op]((Cout, Cin, 2, 2)) if not weight_constexpr: weight = _get_constexpr_val(weight) bias = CONSTEXPR_FUNCS[constexpr_op]((Cout,)) if not bias_constexpr: bias = _get_constexpr_val(bias) x = mb.conv( x=x, weight=weight, bias=bias, ) # batch_norm layer gamma = np.random.rand(Cout) beta = np.random.rand(Cout) mean = np.random.rand(Cout) variance = np.random.rand(Cout) epsilon = 1e-2 return mb.batch_norm( x=x, mean=mean, variance=variance, gamma=gamma, beta=beta, epsilon=epsilon, ) apply_pass_and_basic_check(prog, "common::fuse_conv_batchnorm") apply_pass_and_basic_check(prog, "common::const_elimination") apply_pass_and_basic_check(prog, "common::dead_code_elimination") expected_ops = [] if not weight_constexpr and not bias_constexpr: expected_ops = ["conv"] else: expected_ops = [constexpr_op] * sum([weight_constexpr, bias_constexpr]) + [ "conv", "batch_norm", ] assert get_op_types_in_program(prog) == expected_ops class TestMergeConsecutivePaddings: def test_success_reflect(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 6, 8))]) def prog(x1): pad1 = mb.pad(x=x1, pad=[0, 0, 1, 1], mode="reflect") pad2 = mb.pad(x=pad1, pad=[1, 1, 0, 0], mode="reflect") return pad2 prev_prog, _, block = apply_pass_and_basic_check(prog, "common::merge_consecutive_paddings") assert get_op_types_in_program(prev_prog) == ["pad", "pad"] assert get_op_types_in_program(prog) == ["pad"] inputs = {"x1": (1, 2, 6, 8)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 2, 8, 10)}, ) @pytest.mark.parametrize("swap_axes", [False, True]) def test_success_different_rank1(self, swap_axes): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 6, 8))]) def prog(x1): if swap_axes: pad1 = mb.pad(x=x1, pad=[1, 1], mode="reflect") pad2 = mb.pad(x=pad1, pad=[1, 1, 0, 0], mode="reflect") else: pad1 = mb.pad(x=x1, pad=[1, 1, 0, 0], mode="reflect") pad2 = mb.pad(x=pad1, pad=[1, 1], mode="reflect") return pad2 prev_prog, _, block = apply_pass_and_basic_check(prog, "common::merge_consecutive_paddings") assert get_op_types_in_program(prev_prog) == ["pad", "pad"] assert get_op_types_in_program(prog) == ["pad"] inputs = {"x1": (1, 2, 6, 8)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 2, 8, 10)}, ) def test_success_constant(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 6, 8))]) def prog(x1): pad1 = mb.pad(x=x1, pad=[0, 0, 1, 1], mode="constant", constant_val=3.0) pad2 = mb.pad(x=pad1, pad=[1, 1, 0, 0], mode="constant", constant_val=3.0) return pad2 prev_prog, _, block = apply_pass_and_basic_check(prog, "common::merge_consecutive_paddings") assert get_op_types_in_program(prev_prog) == ["pad", "pad"] assert get_op_types_in_program(prog) == ["pad"] pad_ops = [op for op in prog["main"].operations if op.op_type == "pad"] assert pad_ops[0].inputs["constant_val"].val == 3.0 inputs = {"x1": (1, 2, 6, 8)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 2, 8, 10)}, ) def test_success_3_layers(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 6, 8))]) def prog(x1): pad1 = mb.pad(x=x1, pad=[0, 0, 1, 1], mode="constant", constant_val=3.0) pad2 = mb.pad(x=pad1, pad=[1, 1, 0, 0], mode="constant", constant_val=3.0) pad3 = mb.pad(x=pad2, pad=[1, 1, 0, 0], mode="constant", constant_val=3.0) return pad3 prev_prog, _, block = 
apply_pass_and_basic_check(prog, "common::merge_consecutive_paddings") assert get_op_types_in_program(prev_prog) == ["pad", "pad", "pad"] assert get_op_types_in_program(prog) == ["pad"] pad_ops = [op for op in prog["main"].operations if op.op_type == "pad"] assert pad_ops[0].inputs["constant_val"].val == 3.0 inputs = {"x1": (1, 2, 6, 8)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 2, 10, 10)}, ) def test_failure_different_mode(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 6, 8))]) def prog(x1): pad1 = mb.pad(x=x1, pad=[0, 0, 1, 1], mode="reflect") pad2 = mb.pad(x=pad1, pad=[1, 1, 0, 0], mode="constant") return pad2 prev_prog, _, block = apply_pass_and_basic_check(prog, "common::merge_consecutive_paddings") assert get_op_types_in_program(prev_prog) == ["pad", "pad"] assert get_op_types_in_program(prog) == ["pad", "pad"] inputs = {"x1": (1, 2, 6, 8)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 2, 8, 10)}, ) def test_failure_different_constants(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 6, 8))]) def prog(x1): pad1 = mb.pad(x=x1, pad=[0, 0, 1, 1], mode="constant", constant_val=1.0) pad2 = mb.pad(x=pad1, pad=[1, 1, 0, 0], mode="constant", constant_val=2.0) return pad2 prev_prog, _, block = apply_pass_and_basic_check(prog, "common::merge_consecutive_paddings") assert get_op_types_in_program(prev_prog) == ["pad", "pad"] assert get_op_types_in_program(prog) == ["pad", "pad"] inputs = {"x1": (1, 2, 6, 8)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 2, 8, 10)}, ) def test_failure_repeat_on_same_axis(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 6, 8))]) def prog(x1): pad1 = mb.pad(x=x1, pad=[1, 1], mode="reflect") pad2 = mb.pad(x=pad1, pad=[1, 1], mode="reflect") return pad2 prev_prog, _, block = apply_pass_and_basic_check(prog, "common::merge_consecutive_paddings") assert get_op_types_in_program(prev_prog) == ["pad", "pad"] assert get_op_types_in_program(prog) == ["pad", "pad"] inputs = {"x1": (1, 2, 6, 8)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 2, 6, 12)}, ) class TestMergeConsecutiveTransposes: def test_success_reduce_consecutive_transposes(self): """ Input: |--> transpose_1 -> transpose_2 -> output_1 x - |--> transpose_3 -> tranpose_4 -> transpose_5 -> output_2 Output: |--> transpose_6 -> output_1 x - |--> transpose_7 -> output_2 """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 3, 4))]) def prog(x): x1 = mb.transpose(x=x, perm=[0, 2, 1, 3]) x1 = mb.transpose(x=x1, perm=[3, 2, 0, 1]) x2 = mb.transpose(x=x, perm=[3, 2, 1, 0]) x2 = mb.transpose(x=x2, perm=[2, 3, 0, 1]) x2 = mb.transpose(x=x2, perm=[0, 2, 1, 3]) return x1, x2 prev_prog, _, block = apply_pass_and_basic_check(prog, "common::merge_consecutive_transposes") assert get_op_types_in_program(prev_prog) == ["transpose"] * 5 assert get_op_types_in_program(prog) == ["transpose"] * 2 inputs = {"x": (1, 2, 3, 4)} assert_model_is_valid( prog, inputs, expected_output_shapes={ block.outputs[0].name: (4, 2, 1, 3), block.outputs[1].name: (2, 4, 1, 3), }, ) def test_success_reduce_consecutive_transposes_with_output_constrain(self): """ Input: x --> transpose_1 -> transpose_2 -> transpose_3 -> transpose_4 -> transpose_5 -> add -> output_3 | | v v output_1 output_2 Output: x --> transpose_1 -> transpose_6 -> transpose_7-> add -> output_1 | | v v output_2 output_3 """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 3, 4))]) 
def prog(x): x1 = mb.transpose(x=x, perm=[3, 2, 1, 0], name="output_1") x2 = mb.transpose(x=x1, perm=[1, 3, 2, 0]) x2 = mb.transpose(x=x2, perm=[2, 3, 0, 1], name="output_2") x3 = mb.transpose(x=x2, perm=[0, 2, 1, 3]) x3 = mb.transpose(x=x3, perm=[3, 2, 1, 0]) x3 = mb.add(x=x3, y=1., name="output_3") return x1, x2, x3 prev_prog, _, block = apply_pass_and_basic_check(prog, "common::merge_consecutive_transposes") assert get_op_types_in_program(prev_prog) == ["transpose"] * 5 + ["add"] assert get_op_types_in_program(prog) == ["transpose"] * 3 + ["add"] inputs = {"x": (1, 2, 3, 4)} assert_model_is_valid( prog, inputs, expected_output_shapes={ block.outputs[0].name: (4, 3, 2, 1), block.outputs[1].name: (2, 4, 3, 1), block.outputs[2].name: (1, 4, 3, 2), }, ) assert block.outputs[0].name == "output_1" assert block.outputs[1].name == "output_2" assert block.outputs[2].name == "output_3" def test_not_merge_transposes(self): """ Input: x --> transpose_1 -> add -> transpose_2 -> output Output: x --> transpose_1 -> add -> transpose_2 -> output """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 3, 4))]) def prog(x): x = mb.transpose(x=x, perm=[3, 2, 1, 0]) x = mb.add(x=x, y=1.) x = mb.transpose(x=x, perm=[1, 3, 2, 0]) return x prev_prog, _, block = apply_pass_and_basic_check(prog, "common::merge_consecutive_transposes") assert get_op_types_in_program(prev_prog) == ["transpose", "add", "transpose"] assert get_op_types_in_program(prog) == ["transpose", "add", "transpose"] inputs = {"x": (1, 2, 3, 4)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (3, 1, 2, 4),}, ) class TestExpandHighRankReshapeAndTranspose: @staticmethod def _test_numerical(prog, input_shape, reshape_shape, perm, output_shape): x = np.random.rand(*input_shape) coreml_input = {"x": x} mlmodel = ct.convert(prog, source="milinternal") coreml_output = list(mlmodel.predict(coreml_input).values())[0] gt = np.reshape(x, reshape_shape) gt = np.transpose(gt, perm) gt = np.reshape(gt, output_shape) np.testing.assert_allclose(gt, coreml_output, rtol=1e-03, atol=1e-05) def test_rank6(self): input_shape = (1, 2, 3, 4, 5) reshape_shape = (1, 2, 3, 2, 2, 5) perm = (4, 5, 3, 2, 0, 1) output_shape = (5, 24) @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)]) def prog(x): x = mb.reshape(x=x, shape=reshape_shape) x = mb.transpose(x=x, perm=perm) x = mb.reshape(x=x, shape=output_shape) return x prev_prog, _, block = apply_pass_and_basic_check(prog, "common::expand_high_rank_reshape_and_transpose") prog._check_early_error_out_for_invalid_program() assert get_op_types_in_program(prog) == ["reshape", "transpose", "reshape"] TestExpandHighRankReshapeAndTranspose._test_numerical(prev_prog, input_shape, reshape_shape, perm, output_shape) def test_rank10(self): input_shape = (2, 3, 4, 5, 6) reshape_shape = (1, 2, 1, 3, 2, 2, 1, 5, 2, 3) perm = (0, 1, 2, 3, 4, 9, 5, 6, 7, 8) output_shape = (30, 24) @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)]) def prog(x): x = mb.reshape(x=x, shape=reshape_shape) x = mb.transpose(x=x, perm=perm) x = mb.reshape(x=x, shape=output_shape) return x prev_prog, _, block = apply_pass_and_basic_check(prog, "common::expand_high_rank_reshape_and_transpose") prog._check_early_error_out_for_invalid_program() assert get_op_types_in_program(prog) == ["reshape", "transpose", "reshape"] TestExpandHighRankReshapeAndTranspose._test_numerical(prev_prog, input_shape, reshape_shape, perm, output_shape) @pytest.mark.xfail( reason="rdar://131637870 Why It Randomly Segfaults on CI but Cannot 
Reproduce Locally", run=False, ) def test_rank20(self): input_shape = (4, 6, 8, 20, 40) reshape_shape = (1, 2, 2, 1, 2, 3, 2, 2, 2, 2, 2, 1, 1, 1, 5, 2, 2, 2, 1, 5) perm = (19, 14, 13, 12, 0, 3, 1, 2, 10, 5, 4, 6, 15, 11, 17, 18, 7, 8, 9, 16) output_shape = (24, 160, 40) @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)]) def prog(x): x = mb.reshape(x=x, shape=reshape_shape) x = mb.transpose(x=x, perm=perm) x = mb.reshape(x=x, shape=output_shape) return x prev_prog, _, block = apply_pass_and_basic_check(prog, "common::expand_high_rank_reshape_and_transpose") prog._check_early_error_out_for_invalid_program() assert get_op_types_in_program(prog) == ["reshape", "transpose"] * 16 + ["reshape"] TestExpandHighRankReshapeAndTranspose._test_numerical(prev_prog, input_shape, reshape_shape, perm, output_shape) def test_negative_case(self): input_shape = (4, 6, 8, 20, 40) reshape_shape = (1, 2, 2, 1, 2, 3, 2, 2, 2, 2, 2, 1, 1, 1, 5, 2, 2, 2, 1, 5) perm = (19, 14, 13, 12, 0, 3, 1, 2, 10, 5, 4, 6, 15, 11, 17, 18, 7, 8, 9, 16) output_shape = (24, 160, 40) @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)]) def prog(x): x1 = mb.reshape(x=x, shape=reshape_shape) x2 = mb.transpose(x=x1, perm=perm) x3 = mb.reshape(x=x2, shape=output_shape) return x, x1 prev_prog, _, block = apply_pass_and_basic_check(prog, "common::expand_high_rank_reshape_and_transpose") with pytest.raises(ValueError, match="Core ML only supports tensors with rank <= 5"): prog._check_early_error_out_for_invalid_program() class TestMergeConsecutiveRelus: @pytest.mark.parametrize( "relu_num", [2, 3, 4], ) def test_success_reduce_consecutive_relus(self, relu_num): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 3))]) def prog(x): for _ in range(relu_num): x = mb.relu(x=x) x = mb.add(x=x, y=1.0) return x prev_prog, _, block = apply_pass_and_basic_check(prog, "common::merge_consecutive_relus") assert get_op_types_in_program(prev_prog) == ["relu"] * relu_num + ["add"] assert get_op_types_in_program(prog) == ["relu", "add"] inputs = {"x": (1, 2, 3)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 2, 3)}, ) @pytest.mark.parametrize( "relu_num", [2, 3, 4], ) def test_keep_not_consecutive_relus(self, relu_num): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 3))]) def prog(x): for _ in range(relu_num): x = mb.relu(x=x) x = mb.add(x=x, y=1.0) return x prev_prog, _, block = apply_pass_and_basic_check(prog, "common::merge_consecutive_relus") assert get_op_types_in_program(prev_prog) == ["relu", "add"] * relu_num assert get_op_types_in_program(prog) == get_op_types_in_program(prev_prog) inputs = {"x": (1, 2, 3)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 2, 3)}, ) def test_mix_situation(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 3))]) def prog(x): relu1 = mb.relu(x=x) relu_after_add = mb.add(x=relu1, y=1.0) relu2 = mb.relu(x=relu_after_add) relu3 = mb.relu(x=relu2) return relu3 prev_prog, _, block = apply_pass_and_basic_check(prog, "common::merge_consecutive_relus") assert get_op_types_in_program(prev_prog) == ["relu", "add", "relu", "relu"] assert get_op_types_in_program(prog) == ["relu", "add", "relu"] inputs = {"x": (1, 2, 3)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 2, 3)}, ) def test_name_change_depend_on_output(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 3))]) def prog_output_transpose_2(x): transpose_1 = mb.relu(x=x, name="transpose_1") transpose_2 = 
mb.relu(x=transpose_1, name="transpose_2") transpose_3 = mb.transpose(x=transpose_2, perm=[0, 2, 1], name="transpose_3") return transpose_2, transpose_3 @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 3))]) def prog_output_transpose_3(x): transpose_1 = mb.relu(x=x, name="transpose_1") transpose_2 = mb.relu(x=transpose_1, name="transpose_2") transpose_3 = mb.transpose(x=transpose_2, perm=[0, 2, 1], name="transpose_3") return transpose_3 prev_prog_output_transpose_2, _, block = apply_pass_and_basic_check( prog_output_transpose_2, "common::merge_consecutive_relus" ) assert get_op_types_in_program(prev_prog_output_transpose_2) == [ "relu", "relu", "transpose", ] assert get_op_types_in_program(prog_output_transpose_2) == ["relu", "transpose"] assert prog_output_transpose_2["main"].operations[0].name == "transpose_1" # As the block's output has transpose_2, the original output name of the first operation # is replaced. assert prog_output_transpose_2["main"].operations[0].outputs[0].name == "transpose_2" prev_prog_output_transpose_3, _, block = apply_pass_and_basic_check( prog_output_transpose_3, "common::merge_consecutive_relus" ) assert get_op_types_in_program(prev_prog_output_transpose_3) == [ "relu", "relu", "transpose", ] assert get_op_types_in_program(prog_output_transpose_3) == ["relu", "transpose"] assert prog_output_transpose_3["main"].operations[0].name == "transpose_1" # As the block's output only has transpose_3, the entire transpose_2 gets removed. assert prog_output_transpose_3["main"].operations[0].outputs[0].name == "transpose_1" inputs = {"x": (1, 2, 3)} assert_model_is_valid( prog_output_transpose_2, inputs, expected_output_shapes={block.outputs[0].name: (1, 3, 2)}, ) inputs = {"x": (1, 2, 3)} assert_model_is_valid( prog_output_transpose_3, inputs, expected_output_shapes={block.outputs[0].name: (1, 3, 2)}, ) class TestMergeConsecutiveReshapes: @pytest.mark.parametrize( "backend", backends, ) def test_merge_consecutive_2reshapes(self, backend): INPUT_SHAPE = (2, 3) OUTPUT_SHAPE = (3, 2) @mb.program(input_specs=[mb.TensorSpec(shape=INPUT_SHAPE)]) def prog(x): y1 = mb.reshape(x=x, shape=(-1,)) y2 = mb.reshape(x=y1, shape=OUTPUT_SHAPE) return y2 prev_prog, _, block = apply_pass_and_basic_check(prog, "common::merge_consecutive_reshapes") assert get_op_types_in_program(prev_prog) == ["reshape"] * 2 assert get_op_types_in_program(prog) == ["reshape"] assert_model_is_valid( prog, {"x": INPUT_SHAPE}, expected_output_shapes={block.outputs[0].name: OUTPUT_SHAPE}, backend=backend, ) @pytest.mark.parametrize( "backend", backends, ) def test_merge_consecutive_4reshapes(self, backend): INPUT_SHAPE = (2, 3, 5) OUTPUT_SHAPE = (10, 3) @mb.program(input_specs=[mb.TensorSpec(shape=INPUT_SHAPE)]) def prog(x): y1 = mb.reshape(x=x, shape=(15, 2)) y2 = mb.reshape(x=y1, shape=(2, 5, 3)) y3 = mb.reshape(x=y2, shape=(6, 5)) y4 = mb.reshape(x=y3, shape=OUTPUT_SHAPE) return y4 prev_prog, _, block = apply_pass_and_basic_check(prog, "common::merge_consecutive_reshapes") assert get_op_types_in_program(prev_prog) == ["reshape"] * 4 assert get_op_types_in_program(prog) == ["reshape"] assert_model_is_valid( prog, {"x": INPUT_SHAPE}, expected_output_shapes={block.outputs[0].name: OUTPUT_SHAPE}, backend=backend, ) @pytest.mark.parametrize( "backend", backends, ) def test_keep_separate_reshapes(self, backend): INPUT_SHAPE = (3, 5, 7) OUTPUT_SHAPE = (7, 3, 5) @mb.program(input_specs=[mb.TensorSpec(shape=INPUT_SHAPE)]) def prog(x): y1 = mb.reshape(x=x, shape=(21, 5)) # Note [elementwise op and reshape] # In 
principle, elementwise ops can be swapped with the reshapes, e.g. # in -> reshape1 -> elementwise1 -> reshape2 -> elementwise2 -> reshape3 -> out # is equivalent to # in -> elementwise1 -> elementwise2 -> reshape1 -> reshape2 -> reshape3 -> out # which can then be optimized to # in -> elementwise1 -> elementwise2 -> reshape3 -> out # # so here we divide the reshape sequence with something non-elementwise bias = np.random.rand(5) * 2.0 - 1.0 y2 = mb.add(x=y1, y=bias) y3 = mb.reshape(x=y2, shape=OUTPUT_SHAPE) return y3 prev_prog, _, block = apply_pass_and_basic_check(prog, "common::merge_consecutive_reshapes") assert get_op_types_in_program(prev_prog) == ["reshape", "add", "reshape"] assert get_op_types_in_program(prog) == ["reshape", "add", "reshape"] assert_model_is_valid( prog, {"x": INPUT_SHAPE}, expected_output_shapes={block.outputs[0].name: OUTPUT_SHAPE}, backend=backend, ) @pytest.mark.parametrize("backend", backends) def test_merge_2consecutive_keep_1separate(self, backend): INPUT_SHAPE = (5, 7, 11) OUTPUT_SHAPE = (11, 5, 7) @mb.program(input_specs=[mb.TensorSpec(shape=(INPUT_SHAPE))]) def prog(x): # these 2 reshapes will be merged y1 = mb.reshape(x=x, shape=(35, 11)) y2 = mb.reshape(x=y1, shape=(55, 7)) # see Note [elementwise op and reshape] bias = np.random.rand(7) * 2.0 - 1.0 y3 = mb.sub(x=y2, y=bias) # this reshape is separated, so it will be kept y4 = mb.reshape(x=y3, shape=OUTPUT_SHAPE) return y4 prev_prog, _, block = apply_pass_and_basic_check(prog, "common::merge_consecutive_reshapes") assert get_op_types_in_program(prev_prog) == ["reshape", "reshape", "sub", "reshape"] assert get_op_types_in_program(prog) == ["reshape", "sub", "reshape"] assert_model_is_valid( prog, {"x": INPUT_SHAPE}, expected_output_shapes={block.outputs[0].name: OUTPUT_SHAPE}, backend=backend, ) @pytest.mark.parametrize( "backend", backends, ) def test_keep_block_outputs(self, backend): INPUT_SHAPE = (5, 6) OUTPUT0_SHAPE = (15, 2) OUTPUT1_SHAPE = (3, 10) @mb.program(input_specs=[mb.TensorSpec(shape=INPUT_SHAPE)]) def prog(x): y1 = mb.reshape(x=x, shape=OUTPUT0_SHAPE) y2 = mb.reshape(x=y1, shape=OUTPUT1_SHAPE) return y1, y2 prev_prog, _, block = apply_pass_and_basic_check(prog, "common::merge_consecutive_reshapes") assert get_op_types_in_program(prev_prog) == ["reshape", "reshape"] assert get_op_types_in_program(prog) == ["reshape", "reshape"] assert len(block.outputs) == 2 expected_output_shapes = { block.outputs[0].name: OUTPUT0_SHAPE, block.outputs[1].name: OUTPUT1_SHAPE, } assert_model_is_valid( prog, {"x": INPUT_SHAPE}, expected_output_shapes=expected_output_shapes, backend=backend, ) @pytest.mark.parametrize( "backend", backends, ) def test_keep_nonreshape_child(self, backend): INPUT_SHAPE = (6, 7) OUTPUT_SHAPE = (14, 3) @mb.program(input_specs=[mb.TensorSpec(shape=INPUT_SHAPE)]) def prog(x): y1 = mb.reshape(x=x, shape=(21, 2)) y2 = mb.reshape(x=y1, shape=OUTPUT_SHAPE) # the 1st reshape creating y1 has a non-reshape child op (matmul), # so it will not be merged y3 = mb.matmul(x=y1, y=np.random.rand(2, 5)) return y2, y3 prev_prog, _, block = apply_pass_and_basic_check(prog, "common::merge_consecutive_reshapes") assert get_op_types_in_program(prev_prog) == ["reshape", "reshape", "matmul"] assert get_op_types_in_program(prog) == ["reshape", "reshape", "matmul"] assert len(block.outputs) == 2 assert_model_is_valid( prog, {"x": INPUT_SHAPE}, expected_output_shapes={block.outputs[0].name: OUTPUT_SHAPE}, backend=backend, ) @pytest.mark.parametrize( "backend", backends, ) def 
test_merge_reshape_in_nested_block(self, backend): INPUT_SHAPE = (6, 7) OUTPUT_SHAPE = (7, 6) @mb.program(input_specs=[mb.TensorSpec(shape=INPUT_SHAPE)]) def prog(x): loop_var = np.int32(2) def while_cond(loop_var, _x): return mb.equal(x=loop_var, y=np.int32(0)) def while_body(loop_var, x): # Do reshapes of the input y1 = mb.reshape(x=x, shape=(3, 2, 7)) y2 = mb.reshape(x=y1, shape=(7, 2, 3)) y3 = mb.reshape(x=y2, shape=(14, 3)) y4 = mb.reshape(x=y3, shape=OUTPUT_SHAPE) return mb.add(x=loop_var, y=np.int32(-1)), y4 while_results = mb.while_loop(_cond=while_cond, _body=while_body, loop_vars=(loop_var, x)) return while_results[1] prev_prog, _, block = apply_pass_and_basic_check(prog, "common::merge_consecutive_reshapes") assert get_op_types_in_program(prev_prog, recurse=True) == ["while_loop", "equal", "reshape", "reshape", "reshape", "reshape", "add"] assert get_op_types_in_program(prog, recurse=True) == ["while_loop", "equal", "reshape", "add"] assert len(block.outputs) == 1 assert block.outputs[0].shape == OUTPUT_SHAPE # the runtime is failing and tracked by this radar: # rdar://133783519 ([CI] test_merge_reshape_in_nested_block unittest is crushing) # TODO: After the framework fixes the issue, we should run the below checking again """ assert_model_is_valid( prog, {"x": INPUT_SHAPE}, expected_output_shapes={block.outputs[0].name: OUTPUT_SHAPE}, backend=backend, ) """ class TestCastOptimizationReduendantCastRemoval: """ Test single cast op removal. """ def test_time_complexity(self): """ This test makes sure the cast_optimization's time complexity is O(N) for most of the cases. In this test case, the program consists of 1000 relu ops followed by 100 cast ops. input -> relu -> relu -> ... -> relu -> cast -> cast -> ... -> cast The algorithm goes through the first pass to eliminate all cast ops: input -> relu -> ... -> relu Note that, the total number of visited op is 1000 (relu) + 100 (cast) + 100 (const for the dtype) = 1200. Because the fusion happens, the algorithm goes through the program again. This time, the number of visited op is 1000 (relu) + 100 (const) = 1100. Overally, the number of visited op is 2300. """ @mb.program(input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)]) def prog(x): for _ in range(1000): x = mb.relu(x=x) for _ in range(100): x = mb.cast(x=x, dtype="fp32") return x graph_pass = cast_optimization() graph_pass.apply(prog) assert ( graph_pass._num_of_visited_ops == 2_300 ) # Please refer to the doc string for how 2300 comes from. 
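# (Breakdown, restating the docstring above: the 1st traversal visits 1000 relu + 100 cast + 100 dtype-const ops = 1200; after the casts are removed the pass re-runs and visits 1000 relu + 100 leftover const ops = 1100; 1200 + 1100 = 2300.)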
def test_remove_redundant_cast_smoke(self): """ Input graph: input(fp32) -> cast(dtype=fp32) -> output Output graph: input -> output """ @mb.program(input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)]) def prog(x): x = mb.cast(x=x, dtype="fp32") return x assert get_op_types_in_program(prog) == ["cast"] _, _, block = apply_pass_and_basic_check(prog, "common::cast_optimization") assert len(block.find_ops(op_type="cast")) == 0 assert block.outputs[0].dtype == types.fp32 def test_remove_redundant_cast_negative_smoke(self): """ Input graph: input(fp32) -> cast(dtype=fp16) -> output Output graph: input -> cast -> output """ @mb.program(input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)]) def prog(x): x = mb.cast(x=x, dtype="fp16") return x assert get_op_types_in_program(prog) == ["cast"] _, _, block = apply_pass_and_basic_check(prog, "common::cast_optimization") assert len(block.find_ops(op_type="cast")) == 1 assert block.outputs[0].dtype == types.fp16 @pytest.mark.parametrize( "opset_version", [ct.target.iOS14, ct.target.iOS17], ) def test_remove_redundant_cast_stress(self, opset_version): """ Test all possible dtype combination for each iOS version of cast. Input graph: input(dtype=dtype_a) -> cast(dtype=dtype_b) -> out Output graph: if dtype_a == dtype_b, the cast op can be eliminated input -> out if dtype_a != dtype_b, the cast op should be preserved input -> cast -> out """ def _test_cast_op_cancellation(dtype_a, dtype_b): @mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=dtype_a)], opset_version=opset_version ) def prog(x): x = mb.cast(x=x, dtype=builtin_to_string(dtype_b)) return x assert get_op_types_in_program(prog) == ["cast"] _, _, block = apply_pass_and_basic_check(prog, "common::cast_optimization") cast_ops = block.find_ops(op_type="cast") if dtype_a == dtype_b: assert len(cast_ops) == 0 else: assert len(cast_ops) == 1 assert block.outputs[0].dtype == dtype_b opset_version_to_cast_op = { ct.target.iOS14: _cast_iOS14, ct.target.iOS17: _cast_iOS17, } cast_op = opset_version_to_cast_op[opset_version] for dtype_a in cast_op.type_domains["T"]: for dtype_b in cast_op.type_domains["T"]: _test_cast_op_cancellation(dtype_a, dtype_b) class TestCastOptimizationCastFusion: """ Test consecutive cast ops funsion """ def test_cast_ops_fusion_smoke(self): """ Input graph: input(fp16) --> cast(dtype="fp32") --> cast(dtype="fp16") --> out Output graph: input --> identity --> out This pattern should be fused, since it doesn't affect the computation precision """ @mb.program(input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp16)]) def prog(x): x = mb.cast(x=x, dtype="fp32") x = mb.cast(x=x, dtype="fp16") return x apply_pass_and_basic_check(prog, "common::cast_optimization") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == ["identity"] assert block.outputs[0].dtype == types.fp16 def test_cast_ops_fusion_smoke_2(self): """ Input graph: input(int8) --> cast(dtype="fp16") --> cast(dtype="fp32") --> out Output graph: input --> cast(dtype="fp32") --> out This pattern should be fused, since it doesn't affect the computation precision, given that the precision is limited by the program int8 input. 
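(Every int8 value is exactly representable in fp16, and fp16 -> fp32 is exact, so casting the int8 input straight to fp32 yields the same values as going through the fp16 intermediate.)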
""" @mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.int8)], opset_version=ct.target.iOS17 ) def prog(x): x = mb.cast(x=x, dtype="fp16") x = mb.cast(x=x, dtype="fp32") return x apply_pass_and_basic_check(prog, "common::cast_optimization") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == ["cast"] assert block.find_ops(op_type="cast")[0].outputs[0].dtype == types.fp32 assert block.outputs[0].dtype == types.fp32 def test_cast_ops_fusion_smoke_3(self): """ Input graph: input(fp32) --> cast(dtype="fp16") --> cast(dtype="fp16") --> out Output graph: input --> cast(dtype="fp16") --> out Two identical cast ops can be fused into one. """ @mb.program(input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)]) def prog(x): x = mb.cast(x=x, dtype="fp16") x = mb.cast(x=x, dtype="fp16") return x apply_pass_and_basic_check(prog, "common::cast_optimization") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == ["cast"] assert block.find_ops(op_type="cast")[0].outputs[0].dtype == types.fp16 assert block.outputs[0].dtype == types.fp16 def test_cast_ops_fusion_smoke_4(self): """ Input graph: input(int8) --> cast(dtype="fp32") --> cast(dtype="int8") --> out Output graph: input --> identity --> out There will be two staged of optimization: 1. cast(dtype=fp32) + cast(dtype=int8) fused into a single cast(dtype=int8) 2. cast(dtype=int8) is further removed """ @mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.int8)], opset_version=ct.target.iOS17 ) def prog(x): x = mb.cast(x=x, dtype="fp32") x = mb.cast(x=x, dtype="int8") return x apply_pass_and_basic_check(prog, "common::cast_optimization") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == ["identity"] assert block.outputs[0].dtype == types.int8 def test_cast_ops_fusion_negative_smoke(self): """ Input graph: input(fp32) --> cast(dtype="fp16") --> cast(dtype="fp32") --> out Output graph: input --> cast --> cast --> out This pattern should not be fused, since the precision is lowered. """ @mb.program(input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)]) def prog(x): x = mb.cast(x=x, dtype="fp16") x = mb.cast(x=x, dtype="fp32") return x apply_pass_and_basic_check(prog, "common::cast_optimization") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == ["cast", "cast"] cast_ops = block.find_ops(op_type="cast") assert cast_ops[0].outputs[0].dtype == types.fp16 assert cast_ops[1].outputs[0].dtype == types.fp32 assert block.outputs[0].dtype == types.fp32 def test_cast_ops_fusion_negative_smoke_2(self): """ Input graph: input(int32) --> cast(dtype="uint8") --> cast(dtype="int8") --> out Output graph: input --> cast --> cast --> out This pattern should not be fused, since the data range results from uint8 -> int8 is [0, 127], while a single cast(int8) produces [-128, 127]. The data point between [-128, 0] will have wrong numerical result. 
""" @mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.int32)], opset_version=ct.target.iOS17, ) def prog(x): x = mb.cast(x=x, dtype="uint8") x = mb.cast(x=x, dtype="int8") return x apply_pass_and_basic_check(prog, "common::cast_optimization") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == ["cast", "cast"] cast_ops = block.find_ops(op_type="cast") assert cast_ops[0].outputs[0].dtype == types.uint8 assert cast_ops[1].outputs[0].dtype == types.int8 assert block.outputs[0].dtype == types.int8 @pytest.mark.parametrize( "opset_version", [ct.target.iOS14, ct.target.iOS17], ) def test_cast_ops_fusion_stress(self, opset_version): """ Test all possible dtype combination for each iOS version of cast. Input graph: input(dtype=dtype_a) -> cast(dtype=dtype_b) -> cast(dtype=dtype_c) -> out Output graph: The output graph can have cast ops with number from 0 to 2 """ def _test_cast_op_fusion(dtype_a, dtype_b, dtype_c): @mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=dtype_a)], opset_version=opset_version ) def prog(x): x = mb.cast(x=x, dtype=builtin_to_string(dtype_b)) x = mb.cast(x=x, dtype=builtin_to_string(dtype_c)) return x _, _, block = apply_pass_and_basic_check(prog, "common::cast_optimization") assert block.outputs[0].dtype == dtype_c return cast_ops = block.find_ops(op_type="cast") if dtype_a == dtype_b: assert len(cast_ops) == 0 else: assert len(cast_ops) == 1 opset_version_to_cast_op = { ct.target.iOS14: _cast_iOS14, ct.target.iOS17: _cast_iOS17, } cast_op = opset_version_to_cast_op[opset_version] supported_dtypes = cast_op.type_domains["T"] for dtype_a in supported_dtypes: for dtype_b in supported_dtypes: for dtype_c in supported_dtypes: _test_cast_op_fusion(dtype_a, dtype_b, dtype_c) class TestCastOptimizationComplexPatterns: """ Test cast ops fusion / romoval in some complex graph examples. 
""" def test_linear_consecutive_cast_ops_cancellation(self): """Test the cast optimization pass with more complicated patterns.""" """ Input graph: input(fp16) -----> cast(dtype="fp32") -----> cast(dtype="fp16") ----> square ---> out Output graph: input -----> square -----> out """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20), dtype=types.fp16)]) def prog(x): x = mb.cast(x=x, dtype="fp32") x = mb.cast(x=x, dtype="fp16") x = mb.square(x=x) return x assert get_op_types_in_program(prog) == ["cast", "cast", "square"] apply_pass_and_basic_check(prog, "common::cast_optimization") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == ["square"] assert_model_is_valid( prog, {"x": (10, 20)}, expected_output_shapes={block.outputs[0].name: (10, 20)}, ) def test_linear_consecutive_cast_ops_fusion(self): """ Input graph: input(fp32)---->cast(dtype="fp16")---->cast(dtype="bool")--->identity--->out Output graph: input(fp32)----->cast(dtype="bool")----->identity--->out """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20))]) def prog(x): x = mb.cast(x=x, dtype="fp16") x = mb.cast(x=x, dtype="bool") x = mb.identity(x=x) return x assert get_op_types_in_program(prog) == ["cast", "cast", "identity"] apply_pass_and_basic_check(prog, "common::cast_optimization") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == ["cast", "identity"] assert block.find_ops(op_type="cast")[0].dtype.val == "bool" assert_model_is_valid( prog, {"x": (10, 20)}, expected_output_shapes={block.outputs[0].name: (10, 20)}, ) def test_linear_multiple_consecutive_cast_ops(self): """ Input graph: input(fp16)-->cast(dtype="fp32")-->cast(dtype="fp32")-->cast(dtype="int32")-->cast(dtype="fp32")-->cast(dtype="fp16")-->square->out Output graph: input(fp16)-->cast(dtype="int32")-->cast(dtype="fp16")-->square--->out """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20), dtype=types.fp16)]) def prog(x): x = mb.cast(x=x, dtype="fp32") x = mb.cast(x=x, dtype="fp32") x = mb.cast(x=x, dtype="int32") x = mb.cast(x=x, dtype="fp32") x = mb.cast(x=x, dtype="fp16") x = mb.square(x=x) return x assert get_op_types_in_program(prog) == [ "cast", "cast", "cast", "cast", "cast", "square", ] apply_pass_and_basic_check(prog, "common::cast_optimization") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == ["cast", "cast", "square"] assert block.find_ops(op_type="cast")[0].dtype.val == "int32" assert block.find_ops(op_type="cast")[1].dtype.val == "fp16" assert_model_is_valid( prog, {"x": (10, 20)}, expected_output_shapes={block.outputs[0].name: (10, 20)}, ) def test_same_consecutive_cancelling_casts_on_all_branches(self): """ Input graph: |---->cast(dtype="fp16")---->square--->out_1 | input(fp16)---->cast(dtype="fp32")---->cast(dtype="fp16")---->relu--->out_2 | |---->cast(dtype="fp16")---->log--->out_3 Output graph: |---->square--->out_1 | input---->relu--->out_2 | |---->log--->out_3 """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20), dtype=types.fp16)]) def prog(x): x = mb.cast(x=x, dtype="fp32") x1 = mb.cast(x=x, dtype="fp16") x2 = mb.cast(x=x, dtype="fp16") x3 = mb.cast(x=x, dtype="fp16") x4 = mb.square(x=x1) x5 = mb.relu(x=x2) x6 = mb.log(x=x3) return x4, x5, x6 assert get_op_types_in_program(prog) == [ "cast", "cast", "cast", "cast", "square", "relu", "log", ] apply_pass_and_basic_check(prog, "common::cast_optimization") _, _, block = 
apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == ["square", "relu", "log"] assert_model_is_valid( prog, {"x": (10, 20)}, expected_output_shapes={ block.outputs[0].name: (10, 20), block.outputs[1].name: (10, 20), block.outputs[2].name: (10, 20), }, ) def test_consecutive_fusable_casts_on_all_branches(self): """ Input graph: |---->cast(dtype="int32")---->square--->out_1 | input(fp16)---->cast(dtype="fp32")---->cast(dtype="int32")---->abs--->out_2 | |---->cast(dtype="int32")---->identity--->out_3 Output graph: |-->square-->out_1 | input(fp16)---->cast(dtype="int32")-->abs-->out_2 | |-->identity->out_3 Note that, this result needs the assistant of another pass remove_redundant_ops """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20), dtype=types.fp16)]) def prog(x): x = mb.cast(x=x, dtype="fp32") x1 = mb.cast(x=x, dtype="int32") x2 = mb.cast(x=x, dtype="int32") x3 = mb.cast(x=x, dtype="int32") x4 = mb.square(x=x1) x5 = mb.abs(x=x2) x6 = mb.identity(x=x3) return x4, x5, x6 assert get_op_types_in_program(prog) == [ "cast", "cast", "cast", "cast", "square", "abs", "identity", ] apply_pass_and_basic_check(prog, "common::cast_optimization") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == [ "cast", "cast", "cast", "square", "abs", "identity", ] cast_ops = block.find_ops(op_type="cast") assert all([v.dtype.val == "int32" for v in cast_ops]) apply_pass_and_basic_check(prog, "common::remove_redundant_ops") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == [ "cast", "square", "abs", "identity", ] assert block.find_ops(op_type="cast")[0].dtype.val == "int32" assert_model_is_valid( prog, {"x": (10, 20)}, expected_output_shapes={ block.outputs[0].name: (10, 20), block.outputs[1].name: (10, 20), block.outputs[2].name: (10, 20), }, ) def test_mixed_consecutive_casts_on_different_branches(self): """ Input graph: |---->cast(dtype="fp16")---->square--->out_1 | |---->cast(dtype="int32")---->square--->out_2 | input(fp16)---->cast(dtype="fp32")---->cast(dtype="int32")---->identity--->out_3 | |---->cast(dtype="int32")---->abs--->out_4 | |---->cast(dtype="fp16")---->abs--->out_5 Output graph: |---->square--->out_1 | | |---->square--->out_2 | | input(fp16)---->cast(dtype="int32")---->identity--->out_3 | | | |---->abs--->out_4 | | |---->abs--->out_5 Note that, this result needs the assistant of another pass remove_redundant_ops """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20), dtype=types.fp16)]) def prog(x): x = mb.cast(x=x, dtype="fp32") x1 = mb.cast(x=x, dtype="fp16") x2 = mb.cast(x=x, dtype="int32") x3 = mb.cast(x=x, dtype="int32") x4 = mb.cast(x=x, dtype="int32") x5 = mb.cast(x=x, dtype="fp16") x6 = mb.square(x=x1) x7 = mb.square(x=x2) x8 = mb.identity(x=x3) x9 = mb.abs(x=x4) x10 = mb.abs(x=x5) return x6, x7, x8, x9, x10 assert get_op_types_in_program(prog) == [ "cast", "cast", "cast", "cast", "cast", "cast", "square", "square", "identity", "abs", "abs", ] apply_pass_and_basic_check(prog, "common::cast_optimization") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == [ "cast", "cast", "cast", "square", "square", "identity", "abs", "abs", ] cast_ops = block.find_ops(op_type="cast") assert all([v.dtype.val == "int32" for v in cast_ops]) apply_pass_and_basic_check(prog, "common::remove_redundant_ops") _, _, block = 
apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == [ "cast", "square", "square", "identity", "abs", "abs", ] assert block.find_ops(op_type="cast")[0].dtype.val == "int32" assert_model_is_valid( prog, {"x": (10, 20)}, expected_output_shapes={ block.outputs[0].name: (10, 20), block.outputs[1].name: (10, 20), block.outputs[2].name: (10, 20), }, ) def test_different_consecutive_casts_config_on_different_branches(self): """ Input graph: |---->cast(dtype="fp16")---->square--->out_1 | input(fp16)---->cast(dtype="fp32")---->cast(dtype="int32")---->exp2--->out_2 | |---->abs--->out_3 Output graph: |---->square--->out_1 | | | input(fp16)---->cast(dtype="int32")---->exp2--->out_2 | | | | |---->cast(dtype="fp32")---->abs--->out_3 """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20), dtype=types.fp16)]) def prog(x): x = mb.cast(x=x, dtype="fp32") x1 = mb.cast(x=x, dtype="fp16") x2 = mb.cast(x=x, dtype="int32") x3 = mb.square(x=x1) x4 = mb.exp2(x=x2) x5 = mb.abs(x=x) return x3, x4, x5 assert get_op_types_in_program(prog) == ["cast", "cast", "cast", "square", "exp2", "abs"] apply_pass_and_basic_check(prog, "common::cast_optimization") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == ["cast", "cast", "square", "exp2", "abs"] # Asserting first cast configuration cast_1 = block.find_ops(op_type="cast")[0] assert cast_1.dtype.val == "fp32" assert len(cast_1.outputs) == 1 assert len(cast_1.outputs[0].child_ops) == 1 assert cast_1.outputs[0].child_ops[0].op_type == "abs" # Asserting second cast configuration cast_2 = block.find_ops(op_type="cast")[1] assert cast_2.dtype.val == "int32" assert len(cast_2.outputs) == 1 assert len(cast_2.outputs[0].child_ops) == 1 assert cast_2.outputs[0].child_ops[0].op_type == "exp2" assert_model_is_valid( prog, {"x": (10, 20)}, expected_output_shapes={ block.outputs[0].name: (10, 20), block.outputs[1].name: (10, 20), block.outputs[2].name: (10, 20), }, ) def test_two_casts_at_the_end(self): """ Input graph: input(dtype="fp16")---->relu----->relu | --------| | V cast(dtype="fp32")---->cast(dtype="fp16") | ----------------------| | V cast(dtype="fp32")---->cast(dtype="fp16")---->output(dtype="fp16") Output graph: input(dtype="fp16")---->relu----->relu---->output(dtype="fp16") """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20), dtype=types.fp16)]) def prog(x): x = mb.relu(x=x) x = mb.relu(x=x) x = mb.cast(x=x, dtype="fp32") x = mb.cast(x=x, dtype="fp16") x = mb.cast(x=x, dtype="fp32") x = mb.cast(x=x, dtype="fp16", name="original_output_name") return x assert get_op_types_in_program(prog) == ["relu", "relu", "cast", "cast", "cast", "cast"] apply_pass_and_basic_check(prog, "common::cast_optimization") _, prev_block, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == ["relu", "relu"] assert prev_block.outputs[0].name == "original_output_name" assert block.outputs[0].name == "original_output_name" assert block.outputs[0].dtype == types.fp16 def test_mixed_consecutive_casts_on_different_branches_complex(self): """ Input graph: |->cast(dtype="fp16")->cast(dtype="fp16")->out_1 | input(fp16)---->cast(dtype="fp32")->cast(dtype="uint8")->cast(dtype="int8")->out_2 | |->cast(dtype="int32")->out_3 | |->cast(dtype="int32")->cast(dtype="float32")->out_4 Output graph: |-->out_1 | input(fp16)-->cast(dtype="uint8")-->cast(dtype="int8")-->out_2 | .-->cast(dtype="int32")-->out_3 | 
.-->cast(dtype="float32")-->out_4 Note that, this result needs the assistant of another pass remove_redundant_ops """ @mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp16)], opset_version=ct.target.iOS17 ) def prog(x): x = mb.cast(x=x, dtype="fp32") x1 = mb.cast(x=x, dtype="fp16") x1 = mb.cast(x=x1, dtype="fp16") x2 = mb.cast(x=x, dtype="uint8") x2 = mb.cast(x=x2, dtype="int8") x3 = mb.cast(x=x, dtype="int32") x4 = mb.cast(x=x, dtype="int32") x4 = mb.cast(x=x4, dtype="fp32") return x2, x3, x4 assert get_op_types_in_program(prog) == ["cast"] * 8 apply_pass_and_basic_check(prog, "common::cast_optimization") apply_pass_and_basic_check(prog, "common::remove_redundant_ops") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == ["cast"] * 4 expected_cast_dtype = ["uint8", "int8", "int32", "fp32"] cast_ops = block.find_ops(op_type="cast") assert [v.dtype.val for v in cast_ops] == expected_cast_dtype class TestCastOptimizationAcrossBlocks: """ Test the cast optimization for cast ops at the boundary of inner and outer block. """ def test_cast_ops_fuse_across_block_smoke_1(self): """ Input graph: main[CoreML3](%x: (1,int32)(Tensor)) { main[CoreML3](%x: (1,int32)(Tensor)) { block0() { %cast_0: (1,fp32)(Tensor) = cast(x=%x, dtype="fp32", name="cast_0") %cond_0: (1,fp32)(Tensor) = cond(pred=True, name="cond_0") cond_0_true() { %cast_1: (1,fp32)(Tensor) = cast(x=%cast_0, dtype="fp32", name="cast_1") } -> (%cast_1) cond_0_false() { %cast_2: (1,fp32)(Tensor) = cast(x=%cast_0, dtype="fp32", name="cast_2") } -> (%cast_2) } -> (%cond_0) } Output graph: main[CoreML3](%x: (1,int32)(Tensor)) { block0() { %cast_0: (1,fp32)(Tensor) = cast(x=%x, dtype="fp32", name="cast_0") %cond_0: (1,fp32)(Tensor) = cond(pred=True, name="cond_0") cond_0_true() { } -> (%cast_0) cond_0_false() { } -> (%const_0) } -> (%cond_0) } """ @mb.program(input_specs=[mb.TensorSpec(shape=(1,), dtype=types.int32)]) def prog(x): x = mb.cast(x=x, dtype="fp32") def _true_fn(): return mb.cast(x=x, dtype="fp32") def _false_fn(): return mb.cast(x=x, dtype="fp32") return mb.cond(pred=True, _true_fn=_true_fn, _false_fn=_false_fn) _, _, block = apply_pass_and_basic_check(prog, "common::cast_optimization") assert get_op_types_in_program(prog) == ["cast", "cond"] cast_op = block.find_ops(op_type="cast")[0] assert cast_op.dtype.val == "fp32" cond_op = block.find_ops(op_type="cond")[0] true_block, false_block = cond_op.blocks assert get_op_types_in_block(true_block) == [] assert get_op_types_in_block(false_block) == [] assert true_block.outputs[0] == cast_op.outputs[0] assert false_block.outputs[0] == cast_op.outputs[0] def test_cast_ops_fuse_across_block_smoke_2(self): """ Input graph: main[CoreML3](%x: (1,fp32)(Tensor)) { block0() { %cast_0: (1,fp32)(Tensor) = cast(x=%x, dtype="fp32", name="cast_0") %cond_0: (1,fp32)(Tensor) = cond(pred=True, name="cond_0") cond_0_true() { %cast_1: (1,fp32)(Tensor) = cast(x=%cast_0, dtype="fp32", name="cast_1") } -> (%cast_1) cond_0_false() { %cast_2: (1,fp32)(Tensor) = cast(x=%cast_0, dtype="fp32", name="cast_2") } -> (%cast_2) } -> (%cond_0) } Output graph: main[CoreML3](%x: (1,fp32)(Tensor)) { block0() { %cond_0: (1,fp32)(Tensor) = cond(pred=True, name="cond_0") cond_0_true() { } -> (%x) cond_0_false() { } -> (%x) } -> (%cond_0) } """ @mb.program(input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)]) def prog(x): x = mb.cast(x=x, dtype="fp32") def _true_fn(): return mb.cast(x=x, dtype="fp32") def _false_fn(): return mb.cast(x=x, 
dtype="fp32") return mb.cond(pred=True, _true_fn=_true_fn, _false_fn=_false_fn) _, _, block = apply_pass_and_basic_check(prog, "common::cast_optimization") assert get_op_types_in_program(prog) == ["cond"] cond_op = block.find_ops(op_type="cond")[0] true_block, false_block = cond_op.blocks assert get_op_types_in_block(true_block) == [] assert get_op_types_in_block(false_block) == [] assert true_block.outputs[0] == block.inputs["x"] assert false_block.outputs[0] == block.inputs["x"] def test_cast_ops_fuse_across_block_smoke_3(self): """ Input graph: main[CoreML7](%x: (1,int32)(Tensor)) { block0() { %cast_0: (1,fp32)(Tensor) = cast(x=%x, dtype="fp32", name="cast_0") %cond_0: (1,uint8)(Tensor) = cond(pred=True, name="cond_0") cond_0_true() { %cast_1: (1,int32)(Tensor) = cast(x=%cast_0, dtype="int32", name="cast_1") %cast_2: (1,uint8)(Tensor) = cast(x=%cast_1, dtype="uint8", name="cast_2") %cast_3: (1,fp32)(Tensor) = cast(x=%cast_2, dtype="fp32", name="cast_3") %cast_4: (1,uint8)(Tensor) = cast(x=%cast_3, dtype="uint8", name="cast_4") } -> (%cast_4) cond_0_false() { %cast_5: (1,int8)(Tensor) = cast(x=%cast_0, dtype="int8", name="cast_5") %cast_6: (1,bool)(Tensor) = cast(x=%cast_5, dtype="bool", name="cast_6") %cast_7: (1,uint8)(Tensor) = cast(x=%cast_6, dtype="uint8", name="cast_7") } -> (%cast_7) } -> (%cond_0) } Output graph: main[CoreML7](%x: (1,int32)(Tensor)) { block0() { %cond_0: (1,uint8)(Tensor) = cond(pred=True, name="cond_0") cond_0_true() { %x_to_uint8: (1,uint8)(Tensor) = cast(x=%x, dtype="uint8", name="x_to_uint8") } -> (%x_to_uint8) cond_0_false() { %x_to_bool: (1,bool)(Tensor) = cast(x=%x, dtype="bool", name="x_to_bool") %cast_7: (1,uint8)(Tensor) = cast(x=%x_to_bool, dtype="uint8", name="cast_7") } -> (%cast_7) } -> (%cond_0) } This is a more complex example: First, in the true branch, 4 ``cast`` ops are optimized into a single ``cast(dtype="uint8")``. In the false branch, 3 ``cast`` ops are optimized to ``cast(dtype="bool")->cast(dtype="uint8")`` Second, the first ``cast`` op in each inner block is fused with the outer ``cast_0`` op, resulting in the above output graph. 
""" @mb.program( input_specs=[mb.TensorSpec(shape=(1,), dtype=types.int32)], opset_version=ct.target.iOS17, ) def prog(x): x = mb.cast(x=x, dtype="fp32") def _true_fn(): x1 = mb.cast(x=x, dtype="int32") x1 = mb.cast(x=x1, dtype="uint8") x1 = mb.cast(x=x1, dtype="fp32") return mb.cast(x=x1, dtype="uint8") def _false_fn(): x2 = mb.cast(x=x, dtype="int8") x2 = mb.cast(x=x2, dtype="bool") return mb.cast(x=x2, dtype="uint8") return mb.cond(pred=True, _true_fn=_true_fn, _false_fn=_false_fn) _, _, block = apply_pass_and_basic_check(prog, "common::cast_optimization") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == ["cond"] cond_op = block.find_ops(op_type="cond")[0] true_block, false_block = cond_op.blocks assert get_op_types_in_block(true_block) == ["cast"] assert get_op_types_in_block(false_block) == ["cast"] * 2 expected_true_branch_types = ["uint8"] expected_false_branch_types = ["bool", "uint8"] assert expected_true_branch_types == [ v.dtype.val for v in true_block.find_ops(op_type="cast") ] assert expected_false_branch_types == [ v.dtype.val for v in false_block.find_ops(op_type="cast") ] class TestConv1dCompositionPasses: @pytest.mark.parametrize( "backend, has_strides, pad_type, has_pad, has_dilations, has_bias", itertools.product( backends, (True, False), ("valid", "custom", "same"), (True, False), (True, False), (True, False), ), ) def test_conv1d_composition( self, backend, has_strides, pad_type, has_pad, has_dilations, has_bias ): """ Input graph: input -> expand_dims -> conv2d -> squeeze -> out Output graph: input -> conv1d -> out """ N, L = 2, 8 C_in, C_out = 3, 4 K = 3 conv_kwargs = {"weight": np.random.rand(C_out, C_in, 1, K), "pad_type": pad_type} if has_strides: conv_kwargs["strides"] = (2, 2) if has_pad: # The pad is specially designed to make sure the output of conv has dim_size=1 at axis 1. 
conv_kwargs["pad"] = (0, 0, 1, 1) if pad_type == "custom" else (1, 1, 1, 1) if has_dilations: conv_kwargs["dilations"] = (2, 2) if has_bias: conv_kwargs["bias"] = np.random.rand(C_out) @mb.program(input_specs=[mb.TensorSpec(shape=(N, C_in, L))]) def prog(x): y_expand = mb.expand_dims(x=x, axes=(2,)) y_conv = mb.conv(x=y_expand, **conv_kwargs) y_squeeze = mb.squeeze(x=y_conv, axes=(2,)) return y_squeeze assert get_op_types_in_program(prog) == ["expand_dims", "conv", "squeeze"] prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::compose_conv1d") assert get_op_types_in_program(prog) == ["squeeze", "conv"] prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::const_elimination") assert get_op_types_in_program(prog) == ["conv"] # infer output shape strides = conv_kwargs["strides"] if has_strides else (1, 1) pad = conv_kwargs["pad"] if has_pad else (0, 0, 0, 0) dilations = conv_kwargs["dilations"] if has_dilations else (1, 1) L_out = None if pad_type == "valid": L_out = (L - dilations[-1] * (K - 1) - 1) // strides[-1] + 1 elif pad_type == "custom": L_out = (L + pad[-2] + pad[-1] - dilations[-1] * (K - 1) - 1) // strides[-1] + 1 elif pad_type == "same": L_out = np.ceil(L / strides[-1]) else: raise Exception("unsupported pad type") output_shape = (N, C_out, L_out) assert_model_is_valid( prog, {"x": (N, C_in, L)}, expected_output_shapes={block.outputs[0].name: output_shape}, backend=backend, ) @pytest.mark.parametrize("backend", backends) def test_conv1d_composotion_dynamic_weight(self, backend): """ Input graph: input -> expand_dims -> conv2d -> squeeze -> out Output graph: input -> conv1d -> out """ N, L = 2, 9 C_in, C_out = 4, 3 K = 4 strides = (1, 2) pad = (0, 0, 1, 1) # MIL convolution with dynamic weights does not support dilations != 1 # see coremltools/coremltools/converters/mil/mil/ops/defs/iOS15/conv.py dilations = (1, 1) # infer L_out with pad_type fixed to custom L_out = (L + pad[-2] + pad[-1] - dilations[-1] * (K - 1) - 1) // strides[-1] + 1 conv_kwargs = { "strides": strides, "pad_type": "custom", "pad": pad, "dilations": dilations, } @mb.program( input_specs=[ mb.TensorSpec(shape=(N, C_in, L)), mb.TensorSpec(shape=(C_out, C_in, 1, K)), ] ) def prog(x, weight): y_expand = mb.expand_dims(x=x, axes=(-2,)) y_conv = mb.conv(x=y_expand, weight=weight, **conv_kwargs) y_squeeze = mb.squeeze(x=y_conv, axes=(-2,)) return y_squeeze assert get_op_types_in_program(prog) == ["expand_dims", "conv", "squeeze"] prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::compose_conv1d") assert get_op_types_in_program(prog) == ["squeeze", "conv"] prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::const_elimination") assert get_op_types_in_program(prog) == ["squeeze", "conv"] output_shape = (N, C_out, L_out) assert_model_is_valid( prog, {"x": (N, C_in, L), "weight": (C_out, C_in, 1, K)}, expected_output_shapes={block.outputs[0].name: output_shape}, backend=backend, ) @pytest.mark.parametrize( "backend, has_bias, bias_op_type", itertools.product( backends, (True, False), ("add", "sub"), ), ) def test_conv1d_bias_fusion(self, backend, has_bias, bias_op_type): """ After recomposing the shattered conv1d, conv1d optimization passes should work Input graph: input -> expand_dims -> conv2d -> squeeze -> add/sub a constant -> out Output graph: input -> conv1d -> out """ N, L = 2, 8 C_in, C_out = 3, 5 K = 3 strides = (1, 2) pad = (0, 0, 0, 1) dilations = (1, 2) # infer L_out with pad_type fixed to custom L_out = (L + pad[-2] + pad[-1] - 
dilations[-1] * (K - 1) - 1) // strides[-1] + 1 conv_kwargs = { "weight": np.random.rand(C_out, C_in, 1, K), "strides": strides, "pad_type": "custom", "pad": pad, "dilations": dilations, } if has_bias: conv_kwargs["bias"] = np.random.rand(C_out) bias2 = np.random.rand(C_out, 1) @mb.program(input_specs=[mb.TensorSpec(shape=(N, C_in, L))]) def prog(x): y_expand = mb.expand_dims(x=x, axes=(-2,)) y_conv = mb.conv(x=y_expand, **conv_kwargs) y_squeeze = mb.squeeze(x=y_conv, axes=(-2,)) y_bias2 = ( mb.add(x=y_squeeze, y=bias2) if bias_op_type == "add" else mb.sub(x=y_squeeze, y=bias2) ) return y_bias2 assert get_op_types_in_program(prog) == ["expand_dims", "conv", "squeeze", bias_op_type] prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::compose_conv1d") assert get_op_types_in_program(prog) == ["squeeze", "conv", bias_op_type] prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::fuse_conv_bias") assert get_op_types_in_program(prog) == ["squeeze", "conv"] prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::const_elimination") assert get_op_types_in_program(prog) == ["conv"] output_shape = (N, C_out, L_out) assert_model_is_valid( prog, {"x": (N, C_in, L)}, expected_output_shapes={block.outputs[0].name: output_shape}, backend=backend, ) class TestConv1dChannellastCompositionPasses: @pytest.mark.parametrize( "backend, has_strides, pad_type, has_pad, has_dilations, has_bias", itertools.product( backends, (True, False), ("valid", "custom", "same"), (True, False), (True, False), (True, False), ), ) def test_conv1d_channellast_composition( self, backend, has_strides, pad_type, has_pad, has_dilations, has_bias ): """ Input graph: input -> expand_dims -> transpose -> conv2d -> transpose -> squeeze -> out Output graph: input -> transpose -> conv1d -> transpose -> out """ N, L = 2, 8 C_in, C_out = 5, 3 K = 3 conv_kwargs = { "weight": np.random.rand(C_out, C_in, 1, K), "pad_type": pad_type, } if has_strides: conv_kwargs["strides"] = (2, 2) if has_pad: # The pad is specially designed to make sure the output of conv has dim_size=1 at axis 1. 
conv_kwargs["pad"] = (0, 0, 1, 1) if pad_type == "custom" else (1, 1, 1, 1) if has_dilations: conv_kwargs["dilations"] = (2, 2) if has_bias: conv_kwargs["bias"] = np.random.rand(C_out) @mb.program(input_specs=[mb.TensorSpec(shape=(N, L, C_in))]) def prog(x): y_expand = mb.expand_dims(x=x, axes=(1,)) y_transpose1 = mb.transpose(x=y_expand, perm=(0, 3, 1, 2)) y_conv = mb.conv(x=y_transpose1, **conv_kwargs) y_transpose2 = mb.transpose(x=y_conv, perm=(0, 2, 3, 1)) y_squeeze = mb.squeeze(x=y_transpose2, axes=(1,)) return y_squeeze assert get_op_types_in_program(prog) == [ "expand_dims", "transpose", "conv", "transpose", "squeeze", ] prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::compose_conv1d") assert get_op_types_in_program(prog) == ["transpose", "squeeze", "conv", "transpose"] prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::const_elimination") assert get_op_types_in_program(prog) == ["transpose", "conv", "transpose"] # infer output shape strides = conv_kwargs["strides"] if has_strides else (1, 1) pad = conv_kwargs["pad"] if has_pad else (0, 0, 0, 0) dilations = conv_kwargs["dilations"] if has_dilations else (1, 1) L_out = None if pad_type == "valid": L_out = (L - dilations[-1] * (K - 1) - 1) // strides[-1] + 1 elif pad_type == "custom": L_out = (L + pad[-2] + pad[-1] - dilations[-1] * (K - 1) - 1) // strides[-1] + 1 elif pad_type == "same": L_out = np.ceil(L / strides[-1]) else: raise Exception("unsupported pad type") output_shape = (N, L_out, C_out) assert_model_is_valid( prog, {"x": (N, L, C_in)}, expected_output_shapes={block.outputs[0].name: output_shape}, backend=backend, ) @pytest.mark.parametrize("backend", backends) def test_conv1d_channellast_composotion_dynamic_weight(self, backend): """ Input graph: input -> expand_dims -> transpose -> conv2d -> transpose -> squeeze -> out Output graph: input -> transpose -> conv1d -> transpose -> out """ N, L = 2, 9 C_in, C_out = 4, 5 K = 4 strides = (1, 2) # The pad is specially designed to make sure the output of conv has dim_size=1 at axis 1. 
pad = (0, 0, 0, 1) # MIL convolution with dynamic weights does not support dilations != 1 # see coremltools/coremltools/converters/mil/mil/ops/defs/iOS15/conv.py dilations = (1, 1) # infer L_out with pad_type fixed to custom L_out = (L + pad[-2] + pad[-1] - dilations[-1] * (K - 1) - 1) // strides[-1] + 1 conv_kwargs = { "strides": strides, "pad_type": "custom", "pad": pad, "dilations": dilations, } @mb.program( input_specs=[ mb.TensorSpec(shape=(N, L, C_in)), mb.TensorSpec(shape=(C_out, C_in, 1, K)), ] ) def prog(x, weight): y_expand = mb.expand_dims(x=x, axes=(1,)) y_transpose1 = mb.transpose(x=y_expand, perm=(0, 3, 1, 2)) y_conv = mb.conv(x=y_transpose1, weight=weight, **conv_kwargs) y_transpose2 = mb.transpose(x=y_conv, perm=(0, 2, 3, 1)) y_squeeze = mb.squeeze(x=y_transpose2, axes=(1,)) return y_squeeze assert get_op_types_in_program(prog) == [ "expand_dims", "transpose", "conv", "transpose", "squeeze", ] prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::compose_conv1d") assert get_op_types_in_program(prog) == ["transpose", "squeeze", "conv", "transpose"] prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::const_elimination") assert get_op_types_in_program(prog) == ["transpose", "squeeze", "conv", "transpose"] output_shape = (N, L_out, C_out) assert_model_is_valid( prog, {"x": (N, L, C_in), "weight": (C_out, C_in, 1, K)}, expected_output_shapes={block.outputs[0].name: output_shape}, backend=backend, ) @pytest.mark.parametrize( "backend, has_bias, bias_op_type", itertools.product( backends, (True, False), ("add", "sub"), ), ) def test_conv1d_channellast_bias_fusion(self, backend, has_bias, bias_op_type): """ After recomposing the shattered conv1d, conv1d optimization passes should work Input graph: input -> expand_dims -> transpose -> conv2d -> transpose -> squeeze -> add/sub a constant -> out Output graph: input -> transpose -> conv1d -> transpose -> out """ N, L = 2, 8 C_in, C_out = 5, 4 K = 4 strides = (1, 2) pad = (0, 0, 1, 0) dilations = (1, 2) # infer L_out with pad_type fixed to custom L_out = (L + pad[-2] + pad[-1] - dilations[-1] * (K - 1) - 1) // strides[-1] + 1 conv_kwargs = { "weight": np.random.rand(C_out, C_in, 1, K), "strides": strides, "pad_type": "custom", "pad": pad, "dilations": dilations, } if has_bias: conv_kwargs["bias"] = np.random.rand(C_out) bias2 = np.random.rand(C_out) @mb.program(input_specs=[mb.TensorSpec(shape=(N, L, C_in))]) def prog(x): y_expand = mb.expand_dims(x=x, axes=(-3,)) y_transpose1 = mb.transpose(x=y_expand, perm=(0, 3, 1, 2)) y_conv = mb.conv(x=y_transpose1, **conv_kwargs) y_transpose2 = mb.transpose(x=y_conv, perm=(0, 2, 3, 1)) y_squeeze = mb.squeeze(x=y_transpose2, axes=(-3,)) y_bias2 = ( mb.add(x=y_squeeze, y=bias2) if bias_op_type == "add" else mb.sub(x=y_squeeze, y=bias2) ) return y_bias2 assert get_op_types_in_program(prog) == [ "expand_dims", "transpose", "conv", "transpose", "squeeze", bias_op_type, ] prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::compose_conv1d") assert get_op_types_in_program(prog) == [ "transpose", "squeeze", "conv", "transpose", bias_op_type, ] prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::fuse_conv_bias") assert get_op_types_in_program(prog) == ["transpose", "squeeze", "conv", "transpose"] prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::const_elimination") assert get_op_types_in_program(prog) == ["transpose", "conv", "transpose"] output_shape = (N, L_out, C_out) assert_model_is_valid( prog, 
{"x": (N, L, C_in)}, expected_output_shapes={block.outputs[0].name: output_shape}, backend=backend, ) class TestConvBatchNormFusion: @staticmethod def _apply_weight_transform(inputs, is_deconv, dtype=np.float32): """ Utility function to test the weight transform function in conv batch_norm fusion pass. """ Cin, _, groups = 10, 20, 10 input_shape = (1, Cin, 2, 2) @mb.program( input_specs=[mb.TensorSpec(shape=input_shape, dtype=numpy_type_to_builtin_type(dtype))] ) def prog(x): if is_deconv: x = mb.conv_transpose( x=x, weight=inputs["conv_weight"], bias=inputs["conv_bias"], groups=groups, ) else: x = mb.conv( x=x, weight=inputs["conv_weight"], bias=inputs["conv_bias"], groups=groups, ) x = mb.batch_norm( x=x, mean=inputs["mean"], variance=inputs["variance"], gamma=inputs["gamma"], beta=inputs["beta"], epsilon=inputs["epsilon"], ) return x apply_pass_and_basic_check(prog, "common::fuse_conv_batchnorm") # get the updated weight from the prog conv_op = [] for op in prog["main"].operations: if op.op_type == "const": continue conv_op.append(op) assert len(conv_op) == 1, "should only have one conv / conv_transpose layer." return conv_op[0].weight.val, conv_op[0].bias.val @pytest.mark.parametrize( "conv_type", ["conv", "conv_transpose"], ) def test_weight_transform_conv_identity(self, conv_type): """ Test the weight transform function with an identity batchnorm layer. """ # parameters for conv is_deconv = conv_type == "conv_transpose" conv_weight = np.arange(20).astype(np.float32) conv_weight = ( np.reshape(conv_weight, (10, 2, 1, 1)) if is_deconv else np.reshape(conv_weight, (20, 1, 1, 1)) ) conv_bias = np.arange(20).astype(np.float32) # parameters for batch_norm gamma = np.ones(20).astype(np.float32) beta = np.zeros(20).astype(np.float32) mean = np.zeros(20).astype(np.float32) variance = np.ones(20).astype(np.float32) epsilon = 0.0 inputs = { "conv_weight": conv_weight, "conv_bias": conv_bias, "gamma": gamma, "beta": beta, "mean": mean, "variance": variance, "epsilon": epsilon, } new_conv_weight, new_conv_bias = self._apply_weight_transform(inputs, is_deconv) np.testing.assert_equal(new_conv_weight, conv_weight) np.testing.assert_equal(new_conv_bias, conv_bias) @pytest.mark.parametrize( "conv_type, dtype", itertools.product( ["conv", "conv_transpose"], [np.float16, np.float32], ), ) def test_weight_transform_conv_type(self, conv_type, dtype): """ The weight transform function should return an updated conv weight with correct data type """ # parameters for conv is_deconv = conv_type == "conv_transpose" conv_weight = np.arange(20).astype(dtype) conv_weight = ( np.reshape(conv_weight, (10, 2, 1, 1)) if is_deconv else np.reshape(conv_weight, (20, 1, 1, 1)) ) conv_bias = np.arange(20).astype(dtype) # parameters for batch_norm gamma = np.ones(20).astype(dtype) beta = np.zeros(20).astype(dtype) mean = np.zeros(20).astype(dtype) variance = np.ones(20).astype(dtype) epsilon = dtype(0.1) inputs = { "conv_weight": conv_weight, "conv_bias": conv_bias, "gamma": gamma, "beta": beta, "mean": mean, "variance": variance, "epsilon": epsilon, } new_conv_weight, _ = self._apply_weight_transform(inputs, is_deconv, dtype) assert ( new_conv_weight.dtype == dtype ), "the weight transform function should retain the weight's original dtype." 
@pytest.mark.parametrize( "rank, groups, has_bias, backend", itertools.product([3, 4, 5], [1, 2, 10], [False, True], backends), ) def test_conv(self, rank, groups, has_bias, backend): """ Input graph: input -----> conv -----> batch_norm ---> out Output graph: input -----> conv ----> out Different `rank` represents different conv dimensions: rank=3 for Conv1d, rank=4 for Conv2d, rank=5 for Conv3d. """ Cin, Cout = 10, 30 rank_to_input_shape = {3: (2, Cin, 20), 4: (2, Cin, 20, 24), 5: (2, Cin, 20, 24, 24)} rank_to_conv_weight_shape = { 3: (Cout, Cin // groups, 2), 4: (Cout, Cin // groups, 2, 3), 5: (Cout, Cin // groups, 2, 3, 3), } rank_to_output_shape = {3: (2, Cout, 19), 4: (2, Cout, 19, 22), 5: (2, Cout, 19, 22, 22)} input_shape = rank_to_input_shape[rank] @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)]) def prog(x): # conv layer conv_weight = np.random.rand(*rank_to_conv_weight_shape[rank]) conv_bias = np.random.rand(Cout) if has_bias else None x = mb.conv( x=x, weight=conv_weight, bias=conv_bias, groups=groups, ) # batch_norm layer gamma = np.random.rand(Cout) beta = np.random.rand(Cout) mean = np.random.rand(Cout) variance = np.random.rand(Cout) epsilon = 1e-2 x = mb.batch_norm( x=x, mean=mean, variance=variance, gamma=gamma, beta=beta, epsilon=epsilon, ) return x prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::fuse_conv_batchnorm" ) assert get_op_types_in_program(prev_prog) == ["conv", "batch_norm"] assert get_op_types_in_program(prog) == ["conv"] # validate graph pass output_shape = rank_to_output_shape[rank] assert_model_is_valid( prog, {"x": input_shape}, expected_output_shapes={block.outputs[0].name: output_shape}, backend=backend, ) @pytest.mark.parametrize( "rank, groups, has_bias, backend", itertools.product([3, 4, 5], [1, 2, 10], [False, True], backends), ) def test_conv_transpose(self, rank, groups, has_bias, backend): """ Input graph: input -----> conv_transpose -----> batch_norm ---> out Output graph: input -----> conv_transpose ----> out """ Cin, Cout = 10, 30 rank_to_input_shape = {3: (2, Cin, 20), 4: (2, Cin, 20, 24), 5: (2, Cin, 20, 24, 24)} rank_to_conv_weight_shape = { 3: (Cin, Cout // groups, 2), 4: (Cin, Cout // groups, 2, 3), 5: (Cin, Cout // groups, 2, 3, 3), } rank_to_output_shape = {3: (2, Cout, 21), 4: (2, Cout, 21, 26), 5: (2, Cout, 21, 26, 26)} input_shape = rank_to_input_shape[rank] @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)]) def prog(x): # conv layer conv_weight = np.random.rand(*rank_to_conv_weight_shape[rank]) conv_bias = np.random.rand(Cout) if has_bias else None x = mb.conv_transpose( x=x, weight=conv_weight, bias=conv_bias, groups=groups, ) # batch_norm layer gamma = np.random.rand(Cout) beta = np.random.rand(Cout) mean = np.random.rand(Cout) variance = np.random.rand(Cout) epsilon = 1e-5 x = mb.batch_norm( x=x, mean=mean, variance=variance, gamma=gamma, beta=beta, epsilon=epsilon, ) return x prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::fuse_conv_batchnorm" ) assert get_op_types_in_program(prev_prog) == ["conv_transpose", "batch_norm"] assert get_op_types_in_program(prog) == ["conv_transpose"] # validate graph pass output_shape = rank_to_output_shape[rank] assert_model_is_valid( prog, {"x": input_shape}, expected_output_shapes={block.outputs[0].name: output_shape}, backend=backend, ) class TestConvBiasFusion: @staticmethod def get_conv(x, name, Cin=3, Cout=3): conv_weight = np.random.rand(Cout, Cin, 2, 2) x = mb.conv(x=x, weight=conv_weight, name=name) return x 
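    # --- Editor's note: illustrative sketch, not part of the original test suite. ---
    # fuse_conv_bias, exercised by the tests below, folds "conv -> add/sub(const)"
    # into the conv bias itself:
    #     new_bias = old_bias + squeeze(const)   for add
    #     new_bias = old_bias - squeeze(const)   for sub
    # The helper name is hypothetical; it only demonstrates that bookkeeping in numpy.
    def _sketch_fold_const_add_into_conv_bias():
        import numpy as np

        Cout = 3
        old_bias = np.arange(Cout, dtype=np.float32)
        const = np.random.rand(Cout, 1, 1).astype(np.float32)   # broadcastable over (C, H, W)

        new_bias = old_bias + np.squeeze(const)                  # what the pass stores on the conv

        y0 = np.random.rand(Cout, 2, 2).astype(np.float32)       # pretend conv output before bias
        before = (y0 + old_bias.reshape(Cout, 1, 1)) + const     # original graph: conv -> add
        after = y0 + new_bias.reshape(Cout, 1, 1)                 # fused graph: conv with new bias
        np.testing.assert_allclose(before, after, rtol=1e-6)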
@staticmethod def get_linear(x, name, linear_op, C=3): bias = np.arange(C).astype(np.float32) bias = np.reshape(bias, (C, 1, 1)) x = getattr(mb, linear_op)(x=x, y=bias, name=name) return x @pytest.mark.parametrize( "rank, linear_op", itertools.product([4], ["add", "sub"]), ) def test_conv(self, rank, linear_op): """ Input graph: input -----> conv -----> add/sub ---> out Output graph: If the linear op is trainable, the program is not modified. Otherwise, conv and the linear op will be fused: input -----> conv ----> out """ Cin, Cout = 3, 3 input_shape = (2, Cin, 100, 100) @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)]) def prog(x): x = self.get_conv(x, "conv") x = self.get_linear(x, "linear", linear_op) return x apply_pass_and_basic_check(prog, "common::fuse_conv_bias") apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == ["conv"] def test_scope_back_propagation(self): Cin, Cout = 3, 3 input_shape = (2, Cin, 100, 100) @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)]) def prog(x): with mb.scope(ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_1"])): x = self.get_conv(x, "conv1") with mb.scope(ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_2"])): x = self.get_linear(x, "linear1", "add") with mb.scope(ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_3"])): x = self.get_conv(x, "conv2") with mb.scope(ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_4"])): x = self.get_linear(x, "linear2", "add") return x apply_pass_and_basic_check(prog, "common::fuse_conv_bias") assert get_op_types_in_program(prog) == ["conv", "conv"] conv_ops = prog.functions["main"].find_ops(op_type="conv") assert conv_ops[0].scopes == { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_2", "fuse_conv_bias"] } assert conv_ops[1].scopes == { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_4", "fuse_conv_bias"] } """ Input graph: Const | V input -----> convolution -----> add/sub ----> relu ---> out Output graph: input -----> convolution -----> relu ----> out """ @pytest.mark.parametrize( "conv_dim, \ flip_add_input_order, \ add_batch_dim_to_const, \ use_sub_instead, \ prebuilt_bias, \ scalar_elementwise, \ use_conv_transpose", itertools.product( [2, 3], # 1D conv conversion broken even without the pass: rdar://problem/62960720 [True, False], # flip_add_input_order [True, False], # add_batch_dim_to_const [True, False], # use_sub_instead [True, False], # prebuilt_bias [True, False], # scalar_elementwise [True, False], # use_conv_transpose ), ) def test_fuse_conv_bias( self, conv_dim, flip_add_input_order, add_batch_dim_to_const, use_sub_instead, prebuilt_bias, scalar_elementwise, use_conv_transpose, ): if flip_add_input_order and use_sub_instead: return if use_conv_transpose and conv_dim != 2: return input_shape = None W = None Cout = 8 Cin = 3 D = 10 const = np.random.rand(Cout) if add_batch_dim_to_const else np.random.rand(1, Cout) const = np.expand_dims(const, axis=-1) if conv_dim == 1: input_shape = (1, Cin, D) W = np.random.rand(Cout, Cin, 1) elif conv_dim == 2: input_shape = (1, Cin, D, D) W = np.random.rand(Cout, Cin, 1, 1) const = np.expand_dims(const, axis=-1) elif conv_dim == 3: input_shape = (1, Cin, D, D, D) W = np.random.rand(Cout, Cin, 1, 1, 1) const = np.expand_dims(const, axis=-1) const = np.expand_dims(const, axis=-1) if use_conv_transpose: W = np.swapaxes(W, 0, 1) output_shape = list(input_shape) output_shape[1] = Cout if scalar_elementwise: const = np.random.uniform(0) 
@mb.program(input_specs=[mb.TensorSpec(shape=input_shape)]) def prog(x): kwargs = { "x": x, "weight": W, "pad_type": "valid", "dilations": [1] * conv_dim, "strides": [1] * conv_dim, } if prebuilt_bias: kwargs["bias"] = np.random.rand(Cout) x = mb.conv_transpose(**kwargs) if use_conv_transpose else mb.conv(**kwargs) if use_sub_instead: x = mb.sub(x=x, y=const) else: x = mb.add( x=const if flip_add_input_order else x, y=x if flip_add_input_order else const, ) x = mb.relu(x=x) return x element_op = "sub" if use_sub_instead else "add" conv_op = "conv" if not use_conv_transpose else "conv_transpose" prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::fuse_conv_bias") assert get_op_types_in_program(prev_prog) == [conv_op, element_op, "relu"] assert get_op_types_in_program(prog) == [conv_op, "relu"] old_bias = prev_block.find_ops(op_type=conv_op)[0].inputs.get("bias", None) old_bias_val = 0 if old_bias is None else old_bias.val assert old_bias_val is not None assert block.find_ops(op_type=conv_op)[0].inputs["bias"] is not None new_bias_val = block.find_ops(op_type=conv_op)[0].inputs["bias"].val assert new_bias_val is not None if use_sub_instead: np.testing.assert_almost_equal(old_bias_val - np.squeeze(const), new_bias_val) else: np.testing.assert_almost_equal(old_bias_val + np.squeeze(const), new_bias_val) assert_model_is_valid( prog, {"x": input_shape}, expected_output_shapes={block.outputs[0].name: tuple(output_shape)}, ) """ Input graph: Const | V input -----> convolution -----> transpose -----> add/sub ---> out Output graph: input -----> convolution -----> transpose -----> out """ @pytest.mark.parametrize( "conv_dim, has_bias, is_sub, is_conv_first_input, is_bias_scalar, is_deconv, is_all_1s", itertools.product( [1, 2, 3], # conv_dim [True, False], # has_bias [True, False], # is_sub [True, False], # is_conv_first_input [True, False], # is_bias_scalar [True, False], # is_deconv [True, False], # is_all_1s ), ) def test_fuse_conv_bias_transpose_pattern( self, conv_dim, has_bias, is_sub, is_conv_first_input, is_bias_scalar, is_deconv, is_all_1s, ): if is_all_1s and is_bias_scalar: return # construct the conv weight/bias input_shape = None Cout = 8 Cin = 3 D = 10 conv_weight = None conv_bias = ( np.arange(Cout).astype(np.float32) if has_bias else np.zeros(Cout).astype(np.float32) ) rank = conv_dim + 2 if conv_dim == 1: input_shape = (1, Cin, D) conv_weight = np.random.rand(Cout, Cin, 1) elif conv_dim == 2: input_shape = (1, Cin, D, D) conv_weight = np.random.rand(Cout, Cin, 1, 1) elif conv_dim == 3: input_shape = (1, Cin, D, D, D) conv_weight = np.random.rand(Cout, Cin, 1, 1, 1) if is_deconv: conv_weight = np.swapaxes(conv_weight, 0, 1) output_shape = list(input_shape) output_shape[1] = Cout output_shape = np.array(output_shape) # generate the perm for the transpose op perm = np.arange(rank) np.random.shuffle(perm) output_shape = output_shape[perm] cout_index = np.where(perm == 1)[0][0] # generate the const bias, and reshape it to a random broadcasable shape bias = np.arange(Cout).astype(np.float32) bias_shape = [1] * rank bias_shape[cout_index] = Cout if cout_index != 0: crop_index = np.random.randint(low=0, high=cout_index + 1) bias_shape = bias_shape[crop_index:] bias = np.reshape(bias, bias_shape) # for the scalar case, random generate a number if is_bias_scalar: bias = np.random.uniform(0) # for the all 1s case, random generate a number and reshape it to (1, 1, ..., 1) if is_all_1s: bias = np.array([np.random.uniform(0)]) bias_rank = np.random.randint(low=1, high=rank + 
1) bias_shape = [1] * bias_rank bias = np.reshape(bias, bias_shape) @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)]) def prog(x): # conv or conv_transpose kwargs = { "x": x, "weight": conv_weight, "pad_type": "valid", "dilations": [1] * conv_dim, "strides": [1] * conv_dim, } if has_bias: kwargs["bias"] = conv_bias x = mb.conv_transpose(**kwargs) if is_deconv else mb.conv(**kwargs) # transpose x = mb.transpose(x=x, perm=perm) # elementwise op element_args = {"x": x, "y": bias} if is_conv_first_input else {"x": bias, "y": x} element_op = mb.sub if is_sub else mb.add x = element_op(**element_args) return x element_op = "sub" if is_sub else "add" conv_op = "conv" if not is_deconv else "conv_transpose" prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::fuse_conv_bias") assert get_op_types_in_program(prev_prog) == [conv_op, "transpose", element_op] assert get_op_types_in_program(prog) == [conv_op, "transpose"] # get the value of new weight/bias new_bias_val = block.find_ops(op_type=conv_op)[0].inputs["bias"].val assert new_bias_val is not None new_weight_val = block.find_ops(op_type=conv_op)[0].inputs["weight"].val assert new_weight_val is not None # compare the weight if is_sub and not is_conv_first_input: np.testing.assert_almost_equal(new_weight_val, -conv_weight) else: np.testing.assert_almost_equal(new_weight_val, conv_weight) # compare the bias if is_sub: if is_conv_first_input: bias = -bias else: conv_bias = -conv_bias expected_conv_bias_val = conv_bias + np.squeeze(bias) np.testing.assert_almost_equal(expected_conv_bias_val, new_bias_val, decimal=6) # run the model assert_model_is_valid( prog, {"x": input_shape}, expected_output_shapes={block.outputs[0].name: tuple(output_shape)}, ) class TestConvScaleFusion: @staticmethod def _apply_weight_transform(inputs, is_deconv, is_real_div, is_conv_first_input, const_type): """ Utility function to test the weight transform function in conv scale fusion pass. """ Cin, _, groups = 10, 20, 10 input_shape = (1, Cin, 2, 2) @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)]) def prog(x): # create conv or deconv op if is_deconv: conv = mb.conv_transpose( x=x, weight=inputs["conv_weight"], bias=inputs["conv_bias"], groups=groups, ) else: conv = mb.conv( x=x, weight=inputs["conv_weight"], bias=inputs["conv_bias"], groups=groups, ) # create const op based on different mode scale = inputs["scale"] if const_type == "python_scale": scale = mb.const(val=scale) elif const_type == "numpy_scale": if type(scale) == int: np_value = np.int32(scale) elif type(scale) == float: np_value = np.float32(scale) scale = mb.const(val=np_value) elif const_type == "numpy_0d_array": scale = mb.const(val=np.array(scale)) elif const_type == "numpy_1d_array": scale = mb.const(val=np.array([scale])) else: scale = mb.const(val=scale) # do the scale operation if is_real_div: x = mb.real_div( x=conv, y=scale, ) else: if is_conv_first_input: x = mb.mul( x=conv, y=scale, ) else: x = mb.mul( x=scale, y=conv, ) return x apply_pass_and_basic_check(prog, "common::fuse_conv_scale") # get the updated weight from the prog conv_op = [] for op in prog["main"].operations: if op.op_type == "const": continue conv_op.append(op) assert len(conv_op) == 1, "should only have one conv / conv_transpose layer." 
        return conv_op[0].weight.val, conv_op[0].bias.val

    @pytest.mark.parametrize(
        "conv_type, is_real_div, is_conv_first_input, const_type",
        itertools.product(
            ["conv", "conv_transpose"],
            [True, False],
            [True, False],
            [
                "python_scale",
                "numpy_scale",
                "numpy_0d_array",
                "numpy_1d_array",
                "numpy_3d_array",
                "numpy_4d_array",
            ],
        ),
    )
    def test_weight_transform_conv(self, conv_type, is_real_div, is_conv_first_input, const_type):
        """
        Test the weight transform function in the conv scale fusion pass.
        """
        # parameters for conv
        is_deconv = conv_type == "conv_transpose"
        conv_weight = np.arange(20).astype(np.float32)
        conv_weight = (
            np.reshape(conv_weight, (10, 2, 1, 1))
            if is_deconv
            else np.reshape(conv_weight, (20, 1, 1, 1))
        )
        conv_bias = np.arange(20).astype(np.float32)

        if const_type == "numpy_3d_array":
            scale = np.reshape(np.arange(20).astype(np.float32), (20, 1, 1))
        elif const_type == "numpy_4d_array":
            scale = np.reshape(np.arange(20).astype(np.float32), (1, 20, 1, 1))
        else:
            scale = 12.7

        inputs = {
            "conv_weight": conv_weight,
            "conv_bias": conv_bias,
            "scale": scale,
        }

        new_conv_weight, new_conv_bias = self._apply_weight_transform(
            inputs, is_deconv, is_real_div, is_conv_first_input, const_type
        )

        if is_real_div:
            scale = 1.0 / scale

        if const_type != "numpy_3d_array" and const_type != "numpy_4d_array":
            expected_bias = conv_bias * scale
            expected_weight = conv_weight * scale
        else:
            scale = np.reshape(scale, (20))
            expected_bias = conv_bias * scale
            if is_deconv:
                scale = np.reshape(scale, (20, 1, 1))
                expected_weight = np.reshape(np.arange(20), (20, 1, 1))
                expected_weight = expected_weight * scale
                expected_weight = np.reshape(expected_weight, (10, 2, 1, 1)).astype(np.float32)
            else:
                scale = np.reshape(scale, (20, 1, 1, 1))
                expected_weight = conv_weight * scale

        np.testing.assert_almost_equal(new_conv_weight, expected_weight)
        np.testing.assert_almost_equal(new_conv_bias, expected_bias)

        assert (
            new_conv_weight.dtype == conv_weight.dtype
        ), "weight data type should not be changed after conv_scale_fusion pass."
        assert (
            new_conv_bias.dtype == conv_weight.dtype
        ), "bias data type should be the same as the weight for conv layer."
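    # --- Editor's note: illustrative sketch, not part of the original test suite. ---
    # fuse_conv_scale folds "conv -> mul(scale)" (or real_div, using 1/scale) into
    # the conv weight and bias, scaling both per output channel:
    #     new_w[c] = w[c] * s[c],   new_b[c] = b[c] * s[c]
    # The helper name is hypothetical; it is only a self-contained numpy check of
    # the arithmetic that the expected values above encode.
    def _sketch_fold_scale_into_conv():
        import numpy as np

        Cout, Cin = 4, 3
        w = np.random.rand(Cout, Cin, 1, 1).astype(np.float32)
        b = np.random.rand(Cout).astype(np.float32)
        s = np.random.rand(Cout).astype(np.float32)       # per-channel scale applied after the conv

        new_w = w * s.reshape(Cout, 1, 1, 1)
        new_b = b * s

        x = np.random.rand(Cin).astype(np.float32)         # one spatial position
        y_scaled = (w.reshape(Cout, Cin) @ x + b) * s       # conv followed by mul
        y_fused = new_w.reshape(Cout, Cin) @ x + new_b      # conv with folded weight/bias
        np.testing.assert_allclose(y_scaled, y_fused, rtol=1e-5)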
@pytest.mark.parametrize( "rank, groups, has_bias, scale_op, scale_type, backend", itertools.product( [3, 4], [1, 10], [False, True], ["mul", "real_div"], ["scalar", "vector"], backends ), ) def test_conv(self, rank, groups, has_bias, scale_op, scale_type, backend): """ Input graph: input -----> conv -----> mul/real_div ---> out Output graph: input -----> conv ----> out """ Cin, Cout = 10, 30 input_shape = (2, Cin, 20) if rank == 3 else (2, Cin, 20, 24) @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)]) def prog(x): # conv layer conv_weight = ( np.random.rand(Cout, Cin // groups, 2) if rank == 3 else np.random.rand(Cout, Cin // groups, 2, 3) ) conv_bias = np.random.rand(Cout) if has_bias else None x = mb.conv( x=x, weight=conv_weight, bias=conv_bias, groups=groups, ) if scale_type == "scalar": scale = np.array([2.3]) else: scale = np.arange(Cout).astype(np.float32) scale = np.reshape(scale, (1, Cout, 1) if rank == 3 else (Cout, 1, 1)) # scale layer if scale_op == "mul": x = mb.mul(x=x, y=scale) elif scale_op == "real_div": x = mb.real_div(x=x, y=scale) return x prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::fuse_conv_scale") assert get_op_types_in_program(prev_prog) == ["conv", scale_op] assert get_op_types_in_program(prog) == ["conv"] # validate graph pass output_shape = (2, Cout, 19) if rank == 3 else (2, Cout, 19, 22) assert_model_is_valid( prog, {"x": input_shape}, expected_output_shapes={block.outputs[0].name: output_shape}, backend=backend, ) @pytest.mark.parametrize( "rank, groups, has_bias, scale_op, scale_type, backend", itertools.product( [3, 4], [1, 10], [False, True], ["mul", "real_div"], ["scalar", "vector"], backends ), ) def test_conv_transpose(self, rank, groups, has_bias, scale_op, scale_type, backend): """ Input graph: input -----> conv_transpose -----> mul/real_div ---> out Output graph: input -----> conv_transpose ----> out """ Cin, Cout = 10, 30 input_shape = (2, Cin, 20) if rank == 3 else (2, Cin, 20, 24) @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)]) def prog(x): # conv layer conv_weight = ( np.random.rand(Cin, Cout // groups, 2) if rank == 3 else np.random.rand(Cin, Cout // groups, 2, 3) ) conv_bias = np.random.rand(Cout) if has_bias else None x = mb.conv_transpose( x=x, weight=conv_weight, bias=conv_bias, groups=groups, ) if scale_type == "scalar": scale = np.array([2.3]) else: scale = np.arange(Cout).astype(np.float32) scale = np.reshape(scale, (Cout, 1) if rank == 3 else (1, Cout, 1, 1)) # scale layer if scale_op == "mul": x = mb.mul(x=x, y=scale) elif scale_op == "real_div": x = mb.real_div(x=x, y=scale) return x prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::fuse_conv_scale") assert get_op_types_in_program(prev_prog) == ["conv_transpose", scale_op] assert get_op_types_in_program(prog) == ["conv_transpose"] # validate graph pass output_shape = (2, Cout, 21) if rank == 3 else (2, Cout, 21, 26) assert_model_is_valid( prog, {"x": input_shape}, expected_output_shapes={block.outputs[0].name: output_shape}, backend=backend, ) class TestFusePadConv(unittest.TestCase): """ Input graph: input -----> pad -----> transpose -----> conv -----> transpose ---> out Output graph: input -----> transpose -----> pad ----> conv -----> transpose ----> out """ def test_simple_direct_output(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 16, 20, 24))]) def prog(x): x = mb.pad(x=x, pad=[0, 0, 1, 1, 1, 1, 0, 0]) x = mb.transpose(x=x, perm=[0, 3, 1, 2]) x = mb.conv(x=x, weight=np.random.random([24, 24, 3, 
3]), pad_type="valid") x = mb.transpose(x=x, perm=[0, 2, 3, 1]) return x prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::fuse_pad_conv") self.assertEqual( get_op_types_in_program(prev_prog), ["pad", "transpose", "conv", "transpose"] ) self.assertEqual(get_op_types_in_program(prog), ["transpose", "pad", "conv", "transpose"]) assert_model_is_valid( prog, {"x": (1, 16, 20, 24)}, expected_output_shapes={block.outputs[0].name: (1, 16, 20, 24)}, ) """ Input graph: input -----> pad -----> transpose -----> conv -----> transpose ---> out | | --------> transpose -----> conv -----> transpose ---> out Output graph: input ---------> transpose -----> pad -----> conv -----> transpose ---> out | | ------> transpose -----> pad -----> conv -----> transpose ---> out """ def test_pad_transposed_forked_conv(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 16, 20, 24))]) def prog(x): pad = mb.pad(x=x, pad=[0, 0, 1, 1, 1, 1, 0, 0]) x = mb.transpose(x=pad, perm=[0, 3, 1, 2]) x = mb.conv(x=x, weight=np.random.random([24, 24, 3, 3]), pad_type="valid") x = mb.transpose(x=x, perm=[0, 2, 3, 1]) y = mb.transpose(x=pad, perm=[0, 3, 1, 2]) y = mb.conv(x=y, weight=np.random.random([24, 24, 3, 3]), pad_type="valid") y = mb.transpose(x=y, perm=[0, 2, 3, 1]) return x, y prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::fuse_pad_conv") self.assertEqual( get_op_types_in_program(prev_prog), ["pad", "transpose", "conv", "transpose", "transpose", "conv", "transpose"], ) self.assertEqual( get_op_types_in_program(prog), ["transpose", "pad", "conv", "transpose", "transpose", "pad", "conv", "transpose"], ) assert_model_is_valid( prog, {"x": (1, 16, 20, 24)}, expected_output_shapes={ block.outputs[0].name: (1, 16, 20, 24), block.outputs[1].name: (1, 16, 20, 24), }, ) """ Input graph: input -----> pad -----> transpose -----> conv -----> transpose ---> out | | ---------> out Output graph: No change. 
""" def test_pad_output(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 16, 20, 24))]) def prog(x): pad = mb.pad(x=x, pad=[0, 0, 1, 1, 1, 1, 0, 0]) x = mb.transpose(x=pad, perm=[0, 3, 1, 2]) x = mb.conv(x=x, weight=np.random.random([24, 24, 3, 3]), pad_type="valid") x = mb.transpose(x=x, perm=[0, 2, 3, 1]) return x, pad prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::fuse_pad_conv") self.assertEqual( get_op_types_in_program(prev_prog), ["pad", "transpose", "conv", "transpose"] ) self.assertEqual(get_op_types_in_program(prog), ["pad", "transpose", "conv", "transpose"]) assert_model_is_valid( prog, {"x": (1, 16, 20, 24)}, expected_output_shapes={ block.outputs[0].name: (1, 16, 20, 24), block.outputs[1].name: (1, 18, 22, 24), }, ) class TestFuseDilatedConv(unittest.TestCase): """ Input graph: input -----> space_to_batch -----> conv (2D) -----> batch_to_space ---> out Output graph: input -----> conv (2D w dilations) ----> out """ def test_fusion_with_same_padding(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 384, 48, 48))], opset_version=ct.target.iOS16) def prog(x): x = mb.space_to_batch(x=x, block_shape=[2, 2], paddings=[[2,2], [2,2]]) x = mb.conv(x=x, weight=np.ones((384,1,3,3)), pad_type="valid", groups=384) x = mb.batch_to_space(x=x, block_shape=[2,2], crops=[[0,0], [0,0]]) return x extract_conv_op = lambda prog: [op for op in prog['main'].operations if op.op_type=="conv"][0] prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::fuse_dilated_conv") self.assertEqual( get_op_types_in_program(prev_prog), ['space_to_batch', 'conv', 'batch_to_space'] ) self.assertEqual(get_op_types_in_program(prog), ["conv"]) self.assertEqual(extract_conv_op(prev_prog).pad_type.val, "valid") self.assertEqual(extract_conv_op(prog).pad_type.val, "same") self.assertTrue(np.all(extract_conv_op(prev_prog).dilations.val == 1)) self.assertTrue(np.all(extract_conv_op(prog).dilations.val == 2)) assert_model_is_valid( prog, {"x": (1, 384, 48, 48)}, expected_output_shapes={block.outputs[0].name: (1, 384, 48, 48)}, ) class TestConcatToPixelShuffle(unittest.TestCase): def test_success(self): """ Input graph: input1(1, 2, 3, 4) -----> concat(axis=2, interleave=True) -----> concat(axis=3, interleave=True) ---> out(1, 2, 6, 8) ^ ^ | | input2(1, 2, 3, 4) ------------------- | | input3(1, 2, 3, 4) -----> concat(axis=2, interleave=True) -----------------------| ^ | input4(1, 2, 3, 4) ------------------| Output graph: input1(1, 2, 3, 4) -----> concat(axis=1) ---> pixel_shuffle(upsample_factor=2) ----> out(1, 2, 6, 8) ^ input2(1, 2, 3, 4) ----------| | input3(1, 2, 3, 4) ----------| | input4(1, 2, 3, 4) ----------| """ @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), ] ) def prog(x1, x2, x3, x4): ab = mb.concat(values=[x1, x2], axis=2, interleave=True) cd = mb.concat(values=[x3, x4], axis=2, interleave=True) x = mb.concat(values=[ab, cd], axis=3, interleave=True) return x prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::concat_to_pixel_shuffle" ) self.assertEqual(get_op_types_in_program(prev_prog), ["concat", "concat", "concat"]) self.assertEqual(get_op_types_in_program(prog), ["concat", "pixel_shuffle"]) inputs = {"x1": (1, 2, 3, 4), "x2": (1, 2, 3, 4), "x3": (1, 2, 3, 4), "x4": (1, 2, 3, 4)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 2, 6, 8)}, ) mlmodel = ct.convert( prog, 
source="milinternal", convert_to="neuralnetwork", compute_units=ct.ComputeUnit.CPU_ONLY, ) if not _IS_MACOS: # Can not get predictions unless on macOS. return input_dict = dict() input_dict["x1"] = np.ones(inputs["x1"]) input_dict["x2"] = np.ones(inputs["x2"]) * 2 input_dict["x3"] = np.ones(inputs["x3"]) * 3 input_dict["x4"] = np.ones(inputs["x4"]) * 4 output_name = block.outputs[0].name ab = np.reshape( np.stack((input_dict["x1"], input_dict["x2"]), axis=3), newshape=[1, 2, 6, 4] ) cd = np.reshape( np.stack((input_dict["x3"], input_dict["x4"]), axis=3), newshape=[1, 2, 6, 4] ) old_prediction = np.reshape(np.stack((ab, cd), axis=4), newshape=[1, 2, 6, 8]) prediction = mlmodel.predict(input_dict) np.testing.assert_allclose(old_prediction, prediction[output_name], atol=1e-04, rtol=1e-05) def test_nested(self): """ Two nested blocks that will each be transformed. """ @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), ] ) def prog(x1, x2, x3, x4, x5, x6, x7, x8): ab = mb.concat(values=[x1, x2], axis=2, interleave=True) cd = mb.concat(values=[x3, x4], axis=2, interleave=True) x = mb.concat(values=[ab, cd], axis=3, interleave=True) ef = mb.concat(values=[x5, x6], axis=2, interleave=True) gh = mb.concat(values=[x7, x8], axis=2, interleave=True) y = mb.concat(values=[ef, gh], axis=3, interleave=True) z = mb.concat(values=[x, y], axis=1) return z prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::concat_to_pixel_shuffle" ) self.assertEqual( get_op_types_in_program(prev_prog), ["concat", "concat", "concat", "concat", "concat", "concat", "concat"], ) self.assertEqual( get_op_types_in_program(prog), ["concat", "pixel_shuffle", "concat", "pixel_shuffle", "concat"], ) inputs = { "x1": (1, 2, 3, 4), "x2": (1, 2, 3, 4), "x3": (1, 2, 3, 4), "x4": (1, 2, 3, 4), "x5": (1, 2, 3, 4), "x6": (1, 2, 3, 4), "x7": (1, 2, 3, 4), "x8": (1, 2, 3, 4), } assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 4, 6, 8)}, ) input_dict = dict() for name, shape in inputs.items(): input_dict[name] = np.random.rand(*shape) output_name = block.outputs[0].name ab = np.reshape( np.stack((input_dict["x1"], input_dict["x2"]), axis=3), newshape=[1, 2, 6, 4] ) cd = np.reshape( np.stack((input_dict["x3"], input_dict["x4"]), axis=3), newshape=[1, 2, 6, 4] ) x = np.reshape(np.stack((ab, cd), axis=4), newshape=[1, 2, 6, 8]) ef = np.reshape( np.stack((input_dict["x5"], input_dict["x6"]), axis=3), newshape=[1, 2, 6, 4] ) gh = np.reshape( np.stack((input_dict["x7"], input_dict["x8"]), axis=3), newshape=[1, 2, 6, 4] ) y = np.reshape(np.stack((ef, gh), axis=4), newshape=[1, 2, 6, 8]) old_prediction = np.concatenate((x, y), axis=1) mlmodel = ct.convert( prog, source="milinternal", convert_to="neuralnetwork", compute_units=ct.ComputeUnit.CPU_ONLY, ) if _IS_MACOS: prediction = mlmodel.predict(input_dict) np.testing.assert_allclose( old_prediction, prediction[output_name], atol=1e-04, rtol=1e-05 ) def test_failure_0(self): """ The h_concat has three inputs, so the pattern won't match. 
""" @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), ] ) def prog(x1, x2, x3, x4): ab = mb.concat(values=[x1, x2, x3], axis=2, interleave=True) cd = mb.concat(values=[x3, x4, x1], axis=2, interleave=True) x = mb.concat(values=[ab, cd], axis=3, interleave=True) return x prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::concat_to_pixel_shuffle" ) self.assertEqual(get_op_types_in_program(prev_prog), ["concat", "concat", "concat"]) self.assertEqual(get_op_types_in_program(prog), ["concat", "concat", "concat"]) def test_failure_1(self): """ The first concat is on the wrong axis, so the pattern won't match. """ @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), ] ) def prog(x1, x2, x3, x4): ab = mb.concat(values=[x1, x2], axis=3, interleave=True) cd = mb.concat(values=[x3, x4], axis=3, interleave=True) x = mb.concat(values=[ab, cd], axis=3, interleave=True) return x prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::concat_to_pixel_shuffle" ) self.assertEqual(get_op_types_in_program(prev_prog), ["concat", "concat", "concat"]) self.assertEqual(get_op_types_in_program(prog), ["concat", "concat", "concat"]) def test_failure_2(self): """ The last concat is on the wrong axis, so the pattern won't match. """ @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), ] ) def prog(x1, x2, x3, x4): ab = mb.concat(values=[x1, x2], axis=2, interleave=True) cd = mb.concat(values=[x3, x4], axis=2, interleave=True) x = mb.concat(values=[ab, cd], axis=2, interleave=True) return x prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::concat_to_pixel_shuffle" ) self.assertEqual(get_op_types_in_program(prev_prog), ["concat", "concat", "concat"]) self.assertEqual(get_op_types_in_program(prog), ["concat", "concat", "concat"]) def test_failure_3(self): """ The first concat is not interleaved, so the pattern won't match. """ @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), ] ) def prog(x1, x2, x3, x4): ab = mb.concat(values=[x1, x2], axis=2, interleave=False) cd = mb.concat(values=[x3, x4], axis=2, interleave=True) x = mb.concat(values=[ab, cd], axis=3, interleave=True) return x prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::concat_to_pixel_shuffle" ) self.assertEqual(get_op_types_in_program(prev_prog), ["concat", "concat", "concat"]) self.assertEqual(get_op_types_in_program(prog), ["concat", "concat", "concat"]) def test_failure_4(self): """ The second concat is not interleaved, so the pattern won't match. 
""" @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), ] ) def prog(x1, x2, x3, x4): ab = mb.concat(values=[x1, x2], axis=2, interleave=True) cd = mb.concat(values=[x3, x4], axis=2, interleave=False) x = mb.concat(values=[ab, cd], axis=3, interleave=True) return x prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::concat_to_pixel_shuffle" ) self.assertEqual(get_op_types_in_program(prev_prog), ["concat", "concat", "concat"]) self.assertEqual(get_op_types_in_program(prog), ["concat", "concat", "concat"]) def test_failure_5(self): """ The last concat is not interleaved, so the pattern won't match. """ @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), ] ) def prog(x1, x2, x3, x4): ab = mb.concat(values=[x1, x2], axis=2, interleave=True) cd = mb.concat(values=[x3, x4], axis=2, interleave=True) x = mb.concat(values=[ab, cd], axis=3, interleave=False) return x prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::concat_to_pixel_shuffle" ) self.assertEqual(get_op_types_in_program(prev_prog), ["concat", "concat", "concat"]) self.assertEqual(get_op_types_in_program(prog), ["concat", "concat", "concat"]) def test_failure_6(self): """ The inputs are the wrong rank, so the pattern won't match. """ @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 2, 3, 4, 5)), mb.TensorSpec(shape=(1, 2, 3, 4, 5)), mb.TensorSpec(shape=(1, 2, 3, 4, 5)), mb.TensorSpec(shape=(1, 2, 3, 4, 5)), ] ) def prog(x1, x2, x3, x4): ab = mb.concat(values=[x1, x2], axis=2, interleave=True) cd = mb.concat(values=[x3, x4], axis=2, interleave=True) x = mb.concat(values=[ab, cd], axis=3, interleave=True) return x prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::concat_to_pixel_shuffle" ) self.assertEqual(get_op_types_in_program(prev_prog), ["concat", "concat", "concat"]) self.assertEqual(get_op_types_in_program(prog), ["concat", "concat", "concat"]) def test_failure_7(self): """ Extra input to the w_concats means the pattern won't match. 
""" @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 2, 4, 4)), mb.TensorSpec(shape=(1, 2, 4, 4)), mb.TensorSpec(shape=(1, 2, 4, 4)), mb.TensorSpec(shape=(1, 2, 4, 4)), mb.TensorSpec(shape=(1, 2, 8, 4)), ] ) def prog(x1, x2, x3, x4, x5): ab = mb.concat(values=[x1, x2], axis=2, interleave=True) cd = mb.concat(values=[x3, x4], axis=2, interleave=True) x = mb.concat(values=[ab, cd, x5], axis=3, interleave=True) return x prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::concat_to_pixel_shuffle" ) self.assertEqual(get_op_types_in_program(prev_prog), ["concat", "concat", "concat"]) self.assertEqual(get_op_types_in_program(prog), ["concat", "concat", "concat"]) class TestConcatInterleave: def test_concat_interleave_fusion_pass(self): """ Given: %3 = concat(%1.a, %1.b, axis=-3, interleave=False) #shape = (B, n*C, H, W) %4 = reshape(%3) #shape = (B, n, C, H, W) %5 = transpose(%4, perm=[0, 2, 1, 3, 4]) # shape = (B, C, n, H, W) %6 = reshape(%5) # shape = (B, C*n, H, W) Result: %6 = concat(%1.a, %1.b, axis=-3, interleave=True) """ B, C, H, W = 1, 10, 20, 20 @mb.program( input_specs=[mb.TensorSpec(shape=(B, C, H, W)), mb.TensorSpec(shape=(B, C, H, W))] ) def prog(x, y): z = mb.concat(values=[x, y], axis=1) z = mb.reshape(x=z, shape=(B, 2, C, H, W)) z = mb.transpose(x=z, perm=[0, 2, 1, 3, 4]) z = mb.reshape(x=z, shape=(B, -1, H, W)) return z prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::detect_concat_interleave" ) assert get_op_types_in_program(prev_prog) == ["concat", "reshape", "transpose", "reshape"] assert get_op_types_in_program(prog) == ["concat"] concat_op = prog.find_ops(op_type="concat", exactly_one=True)[0] assert concat_op.interleave.val assert_model_is_valid( prog, {"x": (B, C, H, W), "y": (B, C, H, W)}, expected_output_shapes={block.outputs[0].name: (B, 2 * C, H, W)}, ) class TestFuseOnehotMatmulToGather: @pytest.mark.parametrize( "backend, rank, opset_version", itertools.product(backends, [1, 2, 3, 4], [None, ct.target.iOS17]), ) def test_fuse_onehot_matmul_to_gather(self, backend, rank, opset_version): """ Input: %2 = one_hot(%1, on_value=1, off_value=0, axis=-1) %3 = const() # rank 2 %4 = matmul(%2, %3) Output: %4 = gather(%3, %2, axis=0) """ rank4_shape = (10, 3, 6, 7) input_shape = rank4_shape[-rank:] vocab_size = 15 embedding_size = 12 @mb.program( input_specs=[mb.TensorSpec(shape=input_shape, dtype=types.int32)], opset_version=opset_version, ) def prog(x): x = mb.one_hot( indices=x, on_value=1.0, off_value=0.0, axis=-1, one_hot_vector_size=vocab_size ) x = mb.matmul(x=x, y=np.random.rand(vocab_size, embedding_size)) return x prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::fuse_onehot_matmul_to_gather" ) assert get_op_types_in_program(prev_prog) == ["one_hot", "matmul"] if opset_version == ct.target.iOS17: # Several ops added to make sure indices in iOS17 gather is non-negative. 
assert get_op_types_in_program(prog) == [ "greater_equal", "shape", "slice_by_index", "add", "select", "gather", ] else: assert get_op_types_in_program(prog) == ["gather"] if opset_version == ct.target.iOS17: if backend[0] != "mlprogram" or _macos_version() < (14, 0): pytest.skip("IOS17 target available only on macOS 14+ with mlprogram.") assert_model_is_valid( prog, {"x": input_shape}, backend=backend, expected_output_shapes={block.outputs[0].name: input_shape + (embedding_size,)}, minimum_deployment_target=opset_version, ) class TestReplaceStackReshape(unittest.TestCase): def test_with_interleave(self): """ input1(1, 5, 3, 4) -----> stack(axis=2) -----> reshape(shape=(1, 10, 3, 4)) ---> out(1, 10, 3, 4) ^ | input2(1, 5, 3, 4) ---------- Output graph: input -----> concat ----> out """ @mb.program( input_specs=[mb.TensorSpec(shape=(1, 5, 3, 4)), mb.TensorSpec(shape=(1, 5, 3, 4))] ) def prog(x1, x2): x = mb.stack(values=[x1, x2], axis=2) x = mb.reshape(x=x, shape=[1, 10, 3, 4]) return x prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::replace_stack_reshape" ) self.assertEqual(get_op_types_in_program(prev_prog), ["stack", "reshape"]) self.assertEqual(get_op_types_in_program(prog), ["concat"]) inputs = {"x1": (1, 5, 3, 4), "x2": (1, 5, 3, 4)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 10, 3, 4)}, ) concat_ops = [op for op in block.operations if op.op_type == "concat"] concat_op = concat_ops[0] assert concat_op.interleave.val == True # noqa: E712 output_name = block.outputs[0].name mlmodel = ct.convert( prog, source="milinternal", convert_to="neuralnetwork", compute_units=ct.ComputeUnit.CPU_ONLY, ) if not _IS_MACOS: # Can not get predictions unless on macOS. return input_dict = dict() for name, shape in inputs.items(): input_dict[name] = np.random.rand(*shape) old_prediction = np.reshape( np.stack([input_dict["x1"], input_dict["x2"]], axis=2), newshape=[1, 10, 3, 4] ) prediction = mlmodel.predict(input_dict) np.testing.assert_allclose(old_prediction, prediction[output_name], atol=1e-04, rtol=1e-05) def test_without_interleave(self): """ Input graph: input1(1, 5, 3, 4) -----> stack(axis=1) -----> reshape(shape=(1, 10, 3, 4)) ---> out(1, 10, 3, 4) ^ | input2(1, 5, 3, 4) ---------- Output graph: input -----> concat ----> out """ @mb.program( input_specs=[mb.TensorSpec(shape=(1, 5, 3, 4)), mb.TensorSpec(shape=(1, 5, 3, 4))] ) def prog(x1, x2): x = mb.stack(values=[x1, x2], axis=1) x = mb.reshape(x=x, shape=[1, 10, 3, 4]) return x prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::replace_stack_reshape" ) self.assertEqual(get_op_types_in_program(prev_prog), ["stack", "reshape"]) self.assertEqual(get_op_types_in_program(prog), ["concat"]) inputs = {"x1": (1, 5, 3, 4), "x2": (1, 5, 3, 4)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 10, 3, 4)}, ) concat_ops = [op for op in block.operations if op.op_type == "concat"] concat_op = concat_ops[0] assert concat_op.interleave.val == False # noqa: E712 output_name = block.outputs[0].name mlmodel = ct.convert( prog, source="milinternal", convert_to="neuralnetwork", compute_units=ct.ComputeUnit.CPU_ONLY, ) if not _IS_MACOS: # Can not get predictions unless on macOS. 
return input_dict = dict() for name, shape in inputs.items(): input_dict[name] = np.random.rand(*shape) old_prediction = np.reshape( np.stack([input_dict["x1"], input_dict["x2"]], axis=1), newshape=[1, 10, 3, 4] ) prediction = mlmodel.predict(input_dict) np.testing.assert_allclose(old_prediction, prediction[output_name], atol=1e-04, rtol=1e-05) def test_multiple(self): @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), mb.TensorSpec(shape=(1, 2, 3, 4)), ] ) def prog(x1, x2, x3, x4): a = mb.stack(values=[x1, x2], axis=1) a = mb.reshape(x=a, shape=[1, 4, 3, 4]) b = mb.stack(values=[x3, x4], axis=1) b = mb.reshape(x=b, shape=[1, 4, 3, 4]) c = mb.stack(values=[a, b], axis=2) c = mb.reshape(x=c, shape=[1, 4, 6, 4]) return c prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::replace_stack_reshape" ) self.assertEqual( get_op_types_in_program(prev_prog), ["stack", "reshape", "stack", "reshape", "stack", "reshape"], ) self.assertEqual(get_op_types_in_program(prog), ["concat", "concat", "concat"]) inputs = {"x1": (1, 2, 3, 4), "x2": (1, 2, 3, 4), "x3": (1, 2, 3, 4), "x4": (1, 2, 3, 4)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 4, 6, 4)}, ) output_name = block.outputs[0].name mlmodel = ct.convert( prog, source="milinternal", convert_to="neuralnetwork", compute_units=ct.ComputeUnit.CPU_ONLY, ) if not _IS_MACOS: # Can not get predictions unless on macOS. return input_dict = dict() for name, shape in inputs.items(): input_dict[name] = np.random.rand(*shape) branch_1 = np.reshape( np.stack([input_dict["x1"], input_dict["x2"]], axis=1), newshape=[1, 4, 3, 4] ) branch_2 = np.reshape( np.stack([input_dict["x3"], input_dict["x4"]], axis=1), newshape=[1, 4, 3, 4] ) old_prediction = np.reshape(np.stack([branch_1, branch_2], axis=2), newshape=[1, 4, 6, 4]) prediction = mlmodel.predict(input_dict) np.testing.assert_allclose(old_prediction, prediction[output_name], atol=1e-04, rtol=1e-05) def test_negative_1(self): """ Input graph: input1(1, 5, 3, 4) -----> stack(axis=1) -----> reshape(shape=(-1, 5, 6, 4)) ---> out(1, 5, 6, 4) ^ | input2(1, 5, 3, 4) ---------- Output graph: Unchanged -- this graph is not equivalent to a concat. """ @mb.program( input_specs=[mb.TensorSpec(shape=(1, 5, 3, 4)), mb.TensorSpec(shape=(1, 5, 3, 4))] ) def prog(x1, x2): a = mb.stack(values=[x1, x2], axis=1) a = mb.reshape(x=a, shape=[-1, 5, 6, 4]) return a prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::replace_stack_reshape" ) self.assertEqual(get_op_types_in_program(prev_prog), ["stack", "reshape"]) self.assertEqual(get_op_types_in_program(prog), ["stack", "reshape"]) def test_negative_2(self): """ Input graph: input1(1, 5, 3, 4) -----> stack(axis=1) -----> reshape(shape=(-1, 5, 12, 2)) ---> out(1, 5, 6, 4) ^ | input2(1, 5, 3, 4) ---------- Output graph: Unchanged -- this graph is not equivalent to a concat. 
""" @mb.program( input_specs=[mb.TensorSpec(shape=(1, 5, 3, 4)), mb.TensorSpec(shape=(1, 5, 3, 4))] ) def prog(x1, x2): a = mb.stack(values=[x1, x2], axis=1) a = mb.reshape(x=a, shape=[-1, 5, 12, 2]) return a prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::replace_stack_reshape" ) self.assertEqual(get_op_types_in_program(prev_prog), ["stack", "reshape"]) self.assertEqual(get_op_types_in_program(prog), ["stack", "reshape"]) def test_negative_3(self): """ Input graph: input1(1, 5, 3, 4) -----> stack(axis=1) -----> reshape(shape=(-1, 2, 5, 4, 3)) ---> out(1, 5, 6, 4) ^ | input2(1, 5, 3, 4) ---------- Output graph: Unchanged -- this graph is not equivalent to a concat. """ @mb.program( input_specs=[mb.TensorSpec(shape=(1, 5, 3, 4)), mb.TensorSpec(shape=(1, 5, 3, 4))] ) def prog(x1, x2): a = mb.stack(values=[x1, x2], axis=1) a = mb.reshape(x=a, shape=[-1, 2, 5, 4, 3]) return a prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::replace_stack_reshape" ) self.assertEqual(get_op_types_in_program(prev_prog), ["stack", "reshape"]) self.assertEqual(get_op_types_in_program(prog), ["stack", "reshape"]) def test_negative_4(self): """ More than two inputs to the stack op -- can't be transformed. """ @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 5, 3, 4)), mb.TensorSpec(shape=(1, 5, 3, 4)), mb.TensorSpec(shape=(1, 5, 3, 4)), ] ) def prog(x1, x2, x3): a = mb.stack(values=[x1, x2, x3], axis=1) a = mb.reshape(x=a, shape=[-1, 15, 4, 3]) return a prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::replace_stack_reshape" ) self.assertEqual(get_op_types_in_program(prev_prog), ["stack", "reshape"]) self.assertEqual(get_op_types_in_program(prog), ["stack", "reshape"]) def test_negative_5(self): """ The stack and reshape are not adjacent, so the graph is not transformed. """ @mb.program( input_specs=[mb.TensorSpec(shape=(1, 5, 3, 4)), mb.TensorSpec(shape=(1, 5, 3, 4))] ) def prog(x1, x2): a = mb.stack(values=[x1, x2], axis=1) a = mb.relu(x=a) a = mb.reshape(x=a, shape=[-1, 10, 4, 3]) return a prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::replace_stack_reshape" ) self.assertEqual(get_op_types_in_program(prev_prog), ["stack", "relu", "reshape"]) self.assertEqual(get_op_types_in_program(prog), ["stack", "relu", "reshape"]) def test_negative_6(self): """ The stack op's output is used elsewhere in the graph, so it can't be removed """ @mb.program( input_specs=[mb.TensorSpec(shape=(1, 5, 3, 4)), mb.TensorSpec(shape=(1, 5, 3, 4))] ) def prog(x1, x2): a = mb.stack(values=[x1, x2], axis=1) b = mb.reshape(x=a, shape=[-1, 10, 4, 3]) c = mb.relu(x=a) c = mb.reshape(x=c, shape=[-1, 10, 4, 3]) d = mb.add(x=b, y=c) return d prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::replace_stack_reshape" ) self.assertEqual( get_op_types_in_program(prev_prog), ["stack", "reshape", "relu", "reshape", "add"] ) self.assertEqual( get_op_types_in_program(prog), ["stack", "reshape", "relu", "reshape", "add"] ) def test_negative_7(self): """ The stack op is not followed by any other ops. 
""" @mb.program( input_specs=[mb.TensorSpec(shape=(1, 5, 3, 4)), mb.TensorSpec(shape=(1, 5, 3, 4))] ) def prog(x1, x2): a = mb.stack(values=[x1, x2], axis=1) return a prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::replace_stack_reshape" ) self.assertEqual(get_op_types_in_program(prev_prog), ["stack"]) self.assertEqual(get_op_types_in_program(prog), ["stack"]) class TestUseReflectionPadding: def test_success_w_axis(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 6, 8))]) def prog(x1): left = mb.slice_by_index( x=x1, begin=[0, 0, 0, 1], end=[0, 0, 0, 2], end_mask=[True, True, True, False] ) right = mb.slice_by_index( x=x1, begin=[0, 0, 0, -2], end=[0, 0, 0, -1], end_mask=[True, True, True, False] ) x = mb.concat(values=[left, x1, right], axis=3) return x prev_prog, _, block = apply_pass_and_basic_check(prog, "common::use_reflection_padding") assert get_op_types_in_program(prev_prog) == ["slice_by_index", "slice_by_index", "concat"] assert get_op_types_in_program(prog) == ["pad"] inputs = {"x1": (1, 2, 6, 8)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 2, 6, 10)}, ) def test_success_w_axis_multiple(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 6, 8))]) def prog(x1): left0 = mb.slice_by_index( x=x1, begin=[0, 0, 0, 2], end=[0, 0, 0, 3], end_mask=[True, True, True, False] ) left1 = mb.slice_by_index( x=x1, begin=[0, 0, 0, 1], end=[0, 0, 0, 2], end_mask=[True, True, True, False] ) right0 = mb.slice_by_index( x=x1, begin=[0, 0, 0, -2], end=[0, 0, 0, -1], end_mask=[True, True, True, False] ) right1 = mb.slice_by_index( x=x1, begin=[0, 0, 0, -3], end=[0, 0, 0, -2], end_mask=[True, True, True, False] ) x = mb.concat(values=[left0, left1, x1, right0, right1], axis=3) return x prev_prog, _, block = apply_pass_and_basic_check(prog, "common::use_reflection_padding") assert get_op_types_in_program(prev_prog) == [ "slice_by_index", "slice_by_index", "slice_by_index", "slice_by_index", "concat", ] assert get_op_types_in_program(prog) == ["pad"] inputs = {"x1": (1, 2, 6, 8)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 2, 6, 12)}, ) def test_success_h_axis(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 6, 8))]) def prog(x1): left = mb.slice_by_index( x=x1, begin=[0, 0, 1, 0], end=[0, 0, 2, 0], end_mask=[True, True, False, True] ) right = mb.slice_by_index( x=x1, begin=[0, 0, -2, 0], end=[0, 0, -1, 0], end_mask=[True, True, False, True] ) x = mb.concat(values=[left, x1, right], axis=2) return x prev_prog, _, block = apply_pass_and_basic_check(prog, "common::use_reflection_padding") assert get_op_types_in_program(prev_prog) == ["slice_by_index", "slice_by_index", "concat"] assert get_op_types_in_program(prog) == ["pad"] inputs = {"x1": (1, 2, 6, 8)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 2, 8, 8)}, ) def test_failure_wrong_concat_order(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 6, 8))]) def prog(x1): left = mb.slice_by_index( x=x1, begin=[0, 0, 1, 0], end=[0, 0, 2, 0], end_mask=[True, True, False, True] ) right = mb.slice_by_index( x=x1, begin=[0, 0, -2, 0], end=[0, 0, -1, 0], end_mask=[True, True, False, True] ) # Concat is not in correct order x = mb.concat(values=[left, right, x1], axis=2) return x prev_prog, _, block = apply_pass_and_basic_check(prog, "common::use_reflection_padding") assert get_op_types_in_program(prev_prog) == ["slice_by_index", "slice_by_index", "concat"] assert 
get_op_types_in_program(prog) == ["slice_by_index", "slice_by_index", "concat"] inputs = {"x1": (1, 2, 6, 8)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 2, 8, 8)}, ) def test_failure_wrong_concat_order_2(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 6, 8))]) def prog(x1): left0 = mb.slice_by_index( x=x1, begin=[0, 0, 0, 1], end=[0, 0, 0, 2], end_mask=[True, True, True, False] ) left1 = mb.slice_by_index( x=x1, begin=[0, 0, 0, 2], end=[0, 0, 0, 3], end_mask=[True, True, True, False] ) right0 = mb.slice_by_index( x=x1, begin=[0, 0, 0, -3], end=[0, 0, 0, -2], end_mask=[True, True, True, False] ) right1 = mb.slice_by_index( x=x1, begin=[0, 0, 0, -2], end=[0, 0, 0, -1], end_mask=[True, True, True, False] ) # concat args are out of order x = mb.concat(values=[left0, left1, x1, right1, right0], axis=3) return x prev_prog, _, block = apply_pass_and_basic_check(prog, "common::use_reflection_padding") assert get_op_types_in_program(prev_prog) == [ "slice_by_index", "slice_by_index", "slice_by_index", "slice_by_index", "concat", ] assert get_op_types_in_program(prog) == [ "slice_by_index", "slice_by_index", "slice_by_index", "slice_by_index", "concat", ] inputs = {"x1": (1, 2, 6, 8)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 2, 6, 12)}, ) def test_failure_wrong_slice_size(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 6, 8))]) def prog(x1): # slice is too big left = mb.slice_by_index( x=x1, begin=[0, 0, 1, 0], end=[0, 0, 3, 0], end_mask=[True, True, False, True] ) right = mb.slice_by_index( x=x1, begin=[0, 0, -2, 0], end=[0, 0, -1, 0], end_mask=[True, True, False, True] ) x = mb.concat(values=[left, x1, right], axis=2) return x prev_prog, _, block = apply_pass_and_basic_check(prog, "common::use_reflection_padding") assert get_op_types_in_program(prev_prog) == ["slice_by_index", "slice_by_index", "concat"] assert get_op_types_in_program(prog) == ["slice_by_index", "slice_by_index", "concat"] inputs = {"x1": (1, 2, 6, 8)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 2, 9, 8)}, ) def test_failure_not_all_same_input(self): @mb.program( input_specs=[mb.TensorSpec(shape=(1, 2, 6, 8)), mb.TensorSpec(shape=(1, 2, 6, 8))] ) def prog(x1, x2): left0 = mb.slice_by_index( x=x1, begin=[0, 0, 0, 1], end=[0, 0, 0, 2], end_mask=[True, True, True, False] ) left1 = mb.slice_by_index( x=x1, begin=[0, 0, 0, 2], end=[0, 0, 0, 3], end_mask=[True, True, True, False] ) right0 = mb.slice_by_index( x=x1, begin=[0, 0, 0, -3], end=[0, 0, 0, -2], end_mask=[True, True, True, False] ) # one of the slices consumes a different input from the others right1 = mb.slice_by_index( x=x2, begin=[0, 0, 0, -2], end=[0, 0, 0, -1], end_mask=[True, True, True, False] ) x = mb.concat(values=[left0, left1, x1, right0, right1], axis=3) return x prev_prog, _, block = apply_pass_and_basic_check(prog, "common::use_reflection_padding") assert get_op_types_in_program(prev_prog) == [ "slice_by_index", "slice_by_index", "slice_by_index", "slice_by_index", "concat", ] assert get_op_types_in_program(prog) == [ "slice_by_index", "slice_by_index", "slice_by_index", "slice_by_index", "concat", ] inputs = {"x1": (1, 2, 6, 8), "x2": (1, 2, 6, 8)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (1, 2, 6, 12)}, ) def test_failure_slice_output(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 6, 8))]) def prog(x1): left = mb.slice_by_index( x=x1, begin=[0, 0, 0, 
1], end=[0, 0, 0, 2], end_mask=[True, True, True, False] ) right = mb.slice_by_index( x=x1, begin=[0, 0, 0, -2], end=[0, 0, 0, -1], end_mask=[True, True, True, False] ) x = mb.concat(values=[left, x1, right], axis=3) # slice is an output return x, right prev_prog, _, block = apply_pass_and_basic_check(prog, "common::use_reflection_padding") assert get_op_types_in_program(prev_prog) == ["slice_by_index", "slice_by_index", "concat"] assert get_op_types_in_program(prog) == ["slice_by_index", "slice_by_index", "concat"] inputs = {"x1": (1, 2, 6, 8)} assert_model_is_valid( prog, inputs, expected_output_shapes={ block.outputs[0].name: (1, 2, 6, 10), block.outputs[1].name: (1, 2, 6, 1), }, ) def test_concat_input_only(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 6, 8))]) def prog(x): x = mb.concat(values=[x, x, x], axis=0) return x prev_prog, _, block = apply_pass_and_basic_check(prog, "common::use_reflection_padding") assert get_op_types_in_program(prog) == ["concat"] inputs = {"x": (1, 2, 6, 8)} assert_model_is_valid( prog, inputs, expected_output_shapes={block.outputs[0].name: (3, 2, 6, 8)}, ) class TestDivideToMultiply: def test_divide_to_multiply(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): div_val = np.random.rand(2, 4).astype(np.float32) div_const = mb.const(val=div_val) div_val_1 = np.random.rand(2, 4).astype(np.float32) div_const_1 = mb.const(val=div_val_1) real_div = mb.real_div(x=x, y=div_const) return mb.real_div(x=real_div, y=div_const_1) assert_op_count_match(prog, expect=2, op="real_div") assert_op_count_match(prog, expect=0, op="mul") prev_prog = copy.deepcopy(prog) PASS_REGISTRY["common::divide_to_multiply"](prog) assert_same_output_names(prev_prog, prog) assert_op_count_match(prog, expect=0, op="real_div") assert_op_count_match(prog, expect=2, op="mul") if _VALIDATE_MODEL: assert_model_is_valid(prog, {"x": (2, 4)}) class TestSelectOptimization: @pytest.mark.parametrize( "cond_val, is_cond_scalar, need_broadcast, is_block_output", itertools.product( (True, False), (True, False), (True, False), (True, False), ), ) def test_const_scalar_cond(self, cond_val, is_cond_scalar, need_broadcast, is_block_output): """ Input graph: const(cond) -| | a -----------|-> select -> (add 1.0 if not is_block_output) -> output | b -----------| If a and b need broadcast, then nothing is changed; else output graph becomes: if cond: if is_block_output: a -> identity -> output else: a -> add 1.0 -> output else: if is_block_output: b -> identity -> output else: b -> add 1.0 -> output """ SHAPE = (5, 2, 3) if need_broadcast: a_shape = (5, 2, 1) b_shape = (5, 1, 3) else: a_shape = SHAPE b_shape = SHAPE if is_cond_scalar: cond = cond_val else: cond_shape = (5, 1, 1) cond = np.full(cond_shape, cond_val) @mb.program( input_specs=[ mb.TensorSpec(shape=a_shape), mb.TensorSpec(shape=b_shape), ] ) def prog(a, b): c = mb.select(cond=cond, a=a, b=b) if not is_block_output: c = mb.add(x=c, y=1.0) return c prev_prog, _, _ = apply_pass_and_basic_check(prog, "common::select_optimization") apply_pass_and_basic_check(prog, "common::noop_elimination") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") # check previous program if is_block_output: assert get_op_types_in_program(prev_prog) == ["select"] else: assert get_op_types_in_program(prev_prog) == ["select", "add"] # check passed program if is_block_output: if need_broadcast: assert get_op_types_in_program(prog) == ["select"] else: assert get_op_types_in_program(prog) == ["identity"] else: if 
need_broadcast: assert get_op_types_in_program(prog) == ["select", "add"] else: assert get_op_types_in_program(prog) == ["add"] output_name = block.outputs[0].name assert_model_is_valid( prog, {"a": a_shape, "b": b_shape}, expected_output_shapes={output_name: SHAPE}, ) prev_model = ct.convert( prev_prog, pass_pipeline=ct.PassPipeline.EMPTY, convert_to="mlprogram", compute_units=ct.ComputeUnit.CPU_ONLY, ) model = ct.convert( prog, pass_pipeline=ct.PassPipeline.EMPTY, convert_to="mlprogram", compute_units=ct.ComputeUnit.CPU_ONLY, ) a = np.random.rand(*a_shape) b = np.random.rand(*b_shape) input_dict = {"a": a, "b": b} prev_output = prev_model.predict(input_dict)[output_name] output = model.predict(input_dict)[output_name] np.testing.assert_allclose(prev_output, output, rtol=0.0, atol=0.0) @pytest.mark.parametrize( "is_a_const, is_fill_scalar", itertools.product((True, False), (True, False)), ) def test_inf_const_selection(self, is_a_const, is_fill_scalar): """ Input graph if is_a_const (else input and fill are swapped): const(cond) ------| | input ------------|-> select -> tanh -> output | const(±inf fill) -| Output graph: input -> add -> tanh -> output """ INPUT_SHAPE = (5, 2, 3) cond_shape = (2, 3) while True: cond = np.random.randint(0, 2, size=cond_shape) == 0 if not np.all(cond) and not np.all(np.logical_not(cond)): break if is_fill_scalar: fill = np.float16(-np.inf) else: fill_shape = (5, 2, 1) fill = np.empty(fill_shape, dtype=np.float16) neg_pos = np.random.randint(0, 2, size=fill_shape) fill[np.where(neg_pos == 0)] = -np.inf fill[np.where(neg_pos == 1)] = np.inf output_shape = INPUT_SHAPE @mb.program(input_specs=[mb.TensorSpec(shape=INPUT_SHAPE, dtype=types.fp16)]) def prog(x): if is_a_const: y = mb.select(cond=cond, a=fill, b=x) else: y = mb.select(cond=cond, a=x, b=fill) return mb.tanh(x=y) prev_prog, _, block = apply_pass_and_basic_check(prog, "common::select_optimization") assert get_op_types_in_program(prev_prog) == ["select", "tanh"] assert get_op_types_in_program(prog) == ["add", "tanh"] output_name = block.outputs[0].name assert_model_is_valid( prog, {"x": INPUT_SHAPE}, expected_output_shapes={output_name: output_shape}, ) prev_model = ct.convert( prev_prog, pass_pipeline=ct.PassPipeline.EMPTY, convert_to="mlprogram", ) model = ct.convert( prog, pass_pipeline=ct.PassPipeline.EMPTY, convert_to="mlprogram", ) a = 65500.0 * np.random.rand(*INPUT_SHAPE) input_dict = {"x": a} prev_output = prev_model.predict(input_dict)[output_name] output = model.predict(input_dict)[output_name] np.testing.assert_allclose(prev_output, output, rtol=0.0, atol=0.0) class TestFuseElementwiseToBatchNorm: """ Input graph: Const Const | | V V input -----> transpose -----> mul ----> add ---> out Output graph: input -----> transpose -----> batchnorm ----> out """ @pytest.mark.parametrize( "flip_mul_input_order, flip_add_input_order, rank_3_const_input", itertools.product([False, True], [False, True], [False, True]), ) def test_mul_add_fusion_to_batchnorm( self, flip_mul_input_order, flip_add_input_order, rank_3_const_input ): C = 3 gamma = np.random.rand(1, C, 1, 1) beta = np.random.rand(1, C, 1, 1) if rank_3_const_input: gamma = np.squeeze(gamma, axis=0) beta = np.squeeze(beta, axis=0) @mb.program(input_specs=[mb.TensorSpec(shape=(1, 10, 10, C))]) def prog(x): x = mb.transpose(x=x, perm=[0, 3, 1, 2]) if flip_mul_input_order: x = mb.mul(x=gamma, y=x) else: x = mb.mul(x=x, y=gamma) if flip_add_input_order: x = mb.add(x=beta, y=x) else: x = mb.add(x=x, y=beta) return x prev_prog, prev_block, block = 
apply_pass_and_basic_check( prog, "common::fuse_elementwise_to_batchnorm" ) assert get_op_types_in_program(prev_prog) == ["transpose", "mul", "add"] assert get_op_types_in_program(prog) == ["transpose", "batch_norm"] assert_model_is_valid( prog, {"x": (1, 10, 10, C)}, expected_output_shapes={block.outputs[0].name: (1, C, 10, 10)}, ) class TestRank0ExpandDimsSwap: """ Input graph: 2.0 | v input --> slice_by_index --> sub --> expand_dims --> output Output graph: [2.0] | v input --> slice_by_index --> expand_dims --> sub --> output """ @pytest.mark.skipif( ct.utils._macos_version() < (12, 0), reason="mlprogram predict available only on macOS12+" ) @pytest.mark.parametrize( "reverse_order, elem_op", itertools.product( [True, False], ["add", "sub", "mul", "real_div", "floor_div"], ), ) def test(self, reverse_order, elem_op): x_shape = [ 1, ] @mb.program(input_specs=[mb.TensorSpec(shape=x_shape)]) def program(x): x = mb.slice_by_index(x=x, begin=[0], end=[1], squeeze_mask=[True]) func = getattr(mb, elem_op) if reverse_order: x = func(x=2.0, y=x) else: x = func(x=x, y=2.0) expand = mb.expand_dims(x=x, axes=[0]) other_1 = mb.add(x=x, y=[1.0, 2.0, 3.0]) other_2 = mb.sub(x=x, y=[1.0, 2.0, 3.0]) return expand, other_1, other_2 prev_prog, prev_block, block = apply_pass_and_basic_check( program, "common::rank0_expand_dims_swap" ) assert get_op_types_in_program(prev_prog) == [ "slice_by_index", elem_op, "expand_dims", "add", "sub", ] assert get_op_types_in_program(program) == [ "slice_by_index", "expand_dims", "expand_dims", elem_op, "squeeze", "add", "sub", ] assert_model_is_valid( program=program, inputs={"x": x_shape}, expected_output_shapes={ block.outputs[0].name: tuple(x_shape), block.outputs[1].name: (3,), block.outputs[2].name: (3,), }, ) class TestImageInputPreprocess(unittest.TestCase): """ Input graph: input (format=NHWC) ------> transpose(axis=[0, 3, 1, 2]) ---------> add ----> relu ---> out | ^ | | ---> relu ---> transpose(axis=[0, 3, 1, 2]) --- Intermediate graph: input (format=NCHW) -----> transpose(axis=[0, 2, 3, 1]) ----> transpose(axis=[0, 3, 1, 2]) ---------> add ----> relu ---> out | ^ | | ---> relu ---> transpose(axis=[0, 3, 1, 2]) --- Output graph: input (format=NCHW) -----> relu -----> add -----> relu -----> out | ^ | | ------------------- """ def test_fusion_with_image_intermediate_graph(self): @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20, 30, 3))]) def prog(x): x1 = mb.transpose(x=x, perm=[0, 3, 1, 2]) x2 = mb.relu(x=x) x3 = mb.transpose(x=x2, perm=[0, 3, 1, 2]) x4 = mb.add(x=x1, y=x3) return mb.relu(x=x4) prog.functions["main"].input_types = [ ct.ImageType(name="x", shape=(10, 20, 30, 3), channel_first=False) ] prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::image_input_preprocess" ) self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "relu", "transpose", "add", "relu"] ) self.assertEqual( get_op_types_in_program(prog), ["transpose", "transpose", "relu", "transpose", "add", "relu"], ) def test_fusion_with_image_full(self): # Avoid circular import from coremltools import convert @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20, 30, 3))]) def prog(x): x1 = mb.transpose(x=x, perm=[0, 3, 1, 2]) x2 = mb.relu(x=x) x3 = mb.transpose(x=x2, perm=[0, 3, 1, 2]) x4 = mb.add(x=x1, y=x3) return mb.relu(x=x4) mlmodel = convert( prog, inputs=[ct.ImageType(name="x", shape=(10, 20, 30, 3), channel_first=False)], source="milinternal", convert_to="neuralnetwork", ) assert mlmodel is not None assert len(mlmodel.get_spec().neuralNetwork.layers) 
== 3 class TestSanitizeInputOutputNames: def test_nn_backend_style_sanitization(self): """ Test that intermediate var names are unchanged, and only model input and output names are modified, i.e. sanitized (adhering to the format [a-zA-Z_][a-zA-Z0-9_]*) for the NN backend. """ prog = mil.Program() func_inputs = {"x/0": mb.placeholder(shape=[2, 3]), "y": mb.placeholder(shape=[2, 3])} with Function(func_inputs) as ssa_fun: x, y = ssa_fun.inputs["x/0"], ssa_fun.inputs["y"] x = mb.relu(x=x, name="relu/1") z = mb.add(x=x, y=y, name="out/1") ssa_fun.set_outputs([z]) prog.add_function("main", ssa_fun) prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::sanitize_input_output_names", skip_output_name_check=True, skip_input_name_check=True, ) relu_op = prog.find_ops(op_type="relu", exactly_one=True)[0] assert relu_op.inputs["x"].name == "x_0" # input name: sanitized assert relu_op.outputs[0].name == "relu/1" # intermediate name: unchanged assert block.outputs[0].name == "out_1" # output name: sanitized # convert prev_prog to NN backend mlmodel = ct.convert(prev_prog, convert_to="neuralnetwork") spec = mlmodel._spec assert spec.description.input[0].name == "x_0" assert spec.description.output[0].name == "out_1" relu_layer = spec.neuralNetwork.layers[0] assert relu_layer.output[0] == "relu/1" @staticmethod def test_sanitize_input_named_state(): @mb.program( input_specs=[ mb.StateTensorSpec((2, 3), dtype=types.fp16), ], opset_version=ct.target.iOS18, ) def prog(state): return mb.read_state(input=state) _, _, block = apply_pass_and_basic_check( prog, "common::sanitize_input_output_names", skip_input_name_check=True, ) assert len(block.inputs) == 1 assert "state_workaround" in block.inputs assert block.inputs["state_workaround"].name == "state_workaround" class TestUpdateOutputDtypes: def test_single_output(self): """ Given: ------ main(%input: (1, 20, int32)(Tensor)) { block0() { %abs: (1, 20, int32)(Tensor) = abs(x=%input, name="abs") %output_square: (1, 20, int32)(Tensor) = square(x=%input, name="output_square") } -> (%output_square) } prog.main_output_types = [ct.TensorType(dtype=np.float16)] Result: ------ main(%input: (1, 20, int32)(Tensor)) { block0() { %abs: (1, 20, int32)(Tensor) = abs(x=%input, name="abs") %output_square_type_int32: (1, 20, int32)(Tensor) = square(x=%input, name="output_square") %output_square: (1, 20, fp16)(Tensor) = cast(x=%output_square_type_int32, dtype="fp16", name="cast_0") } -> (%output_square) } """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 20), dtype=types.int32)]) def prog(input): x = mb.abs(x=input, name="abs") x = mb.square(x=input, name="output_square") return x prog.functions["main"].set_output_types([ct.TensorType(dtype=np.float16)]) prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::update_output_dtypes", skip_output_type_check=True ) assert get_op_types_in_program(prev_prog) == ["abs", "square"] assert prev_block.outputs[0].dtype == types.int32 assert get_op_types_in_program(prog) == ["abs", "square", "cast"] assert block.outputs[0].dtype == types.fp16 assert block.outputs[0].name == "output_square" def test_multiple_outputs(self): """ Given: ----- main(%input: (1, 20, int32)(Tensor)) { block0() { %split_0: (1, 10, int32)(Tensor), %split_1: (1, 10, int32)(Tensor) = split(x=%input, num_splits=2, axis=1, name="split") } -> (%split_0, %split_1) } prog.main_output_types = [ct.TensorType(), ct.TensorType(dtype=np.float16)] Result: ------ main(%input: (1, 20, int32)(Tensor)) { block0() { %split_0: (1, 10, 
int32)(Tensor), %split_1_type_int32: (1, 10, int32)(Tensor) = split(x=%input, num_splits=2, axis=1, name="split") %split_1: (1, 10, fp16)(Tensor) = cast(x=%split_1_type_int32, dtype="fp16", name="cast_0") } -> (%split_0, %split_1) } """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 20), dtype=types.int32)]) def prog(input): x1, x2 = mb.split(x=input, num_splits=2, axis=1, name="split") return x1, x2 prog.functions["main"].set_output_types([ct.TensorType(), ct.TensorType(dtype=np.float16)]) _, _, block = apply_pass_and_basic_check( prog, "common::update_output_dtypes", skip_output_type_check=True ) assert get_op_types_in_program(prog) == ["split", "cast"] assert block.outputs[1].dtype == types.fp16 assert block.outputs[1].name == "split_1" def test_output_as_input(self, caplog): """ Given: ----- main(%input: (3, fp32)(Tensor)) { block0() { } -> (input) } prog.main_output_types = [ct.TensorType(dtype=np.float16)] Result: Since the output var is also an input var, the dtype is not changed, and a warning message is thrown ------ main(%input: (3, fp32)(Tensor)) { block0() { } -> (input) } """ @mb.program(input_specs=[mb.TensorSpec(shape=(3,), dtype=types.fp32)]) def prog(input): return input prog.functions["main"].set_output_types([ct.TensorType(dtype=np.float16)]) _, _, block = apply_pass_and_basic_check( prog, "common::update_output_dtypes", ) warning_msg = "Output var 'input' is also an input var, hence the dtype cannot be changed: output var 'input' remains dtype fp32" assert any([warning_msg in rec.message for rec in caplog.records]) assert get_op_types_in_program(prog) == [] assert block.outputs[0].dtype == types.fp32 class TestFuseLayerNormOrInstanceNorm: @pytest.mark.parametrize("axes_size", [1, 2, 3]) def test_layer_norm(self, axes_size): """ Detect layer norm pattern, found in the TF bert model. y = x * [gamma * rsqrt(variance + eps)] + (beta - mean * [gamma * rsqrt(variance + eps)]) where mean and variance are computed along axes [-1] or [-1,-2] and so on and gamma and beta are constants with rank equal to the length of the axes parameter. """ shape = (3, 5, 6) rank = len(shape) axes = list(range(rank - axes_size, rank)) @mb.program(input_specs=[mb.TensorSpec(shape=shape)]) def prog(x): x1 = mb.reduce_mean(x=x, axes=axes, keep_dims=True) x2 = mb.sub(x=x, y=x1) x2 = mb.square(x=x2) x2 = mb.reduce_mean(x=x2, axes=axes, keep_dims=True) x2 = mb.add(x=x2, y=1e-5) x2 = mb.rsqrt(x=x2) x3 = mb.mul(x=np.random.rand(*shape[-len(axes) :]), y=x2) x4 = mb.mul(x=x3, y=x1) x5 = mb.mul(x=x, y=x3) x4 = mb.sub(x=np.random.rand(*shape[-len(axes) :]), y=x4) y = mb.add(x=x4, y=x5) return y prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::fuse_layernorm_or_instancenorm" ) assert get_op_types_in_program(prev_prog) == [ "reduce_mean", "sub", "square", "reduce_mean", "add", "rsqrt", "mul", "mul", "mul", "sub", "add", ] assert get_op_types_in_program(prog) == ["layer_norm"] assert_model_is_valid( prog, {"x": shape}, expected_output_shapes={block.outputs[0].name: shape} ) @pytest.mark.parametrize( "with_affine, constexpr_beta", itertools.product([True, False], [True, False]) ) def test_ane_layer_norm(self, with_affine, constexpr_beta): """ Detect layer norm pattern, found in models based on ml-ane-transformers. ``y = (x - mean(x)) * rsqrt(mean((x - mean(x))^2) + eps)`` ``y = [(x - mean(x)) * rsqrt(mean((x - mean(x))^2) + eps) + beta] * gamma`` Note that beta and gamma in these equations are applied in opposite order compared to the MIL. 
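Algebraically, [(x - mean) * rsqrt(var + eps) + beta] * gamma equals gamma * norm + gamma * beta, so an equivalent MIL layer_norm has to fold the shift into beta * gamma (a purely algebraic note, not a description of the pass internals).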
Only applies when mean and variance are computed along axes [1] """ shape = (3, 5, 1, 6) @mb.program(input_specs=[mb.TensorSpec(shape=shape)], opset_version=ct.target.iOS16) def prog(x): x1 = mb.reduce_mean(x=x, axes=[1], keep_dims=True) # mean x2 = mb.sub(x=x, y=x1) # x - mean x3 = mb.mul(x=x2, y=x2) # (x - mean)^2 x4 = mb.reduce_mean(x=x3, axes=[1], keep_dims=True) # variance x5 = mb.add(x=x4, y=1e-5) # variance + eps x6 = mb.rsqrt(x=x5) # rsqrt(variance + eps) y = mb.mul(x=x2, y=x6) # (x - mean) * rsqrt(variance + eps) if with_affine: beta = np.random.rand(1, shape[1], 1, 1) if constexpr_beta: beta = mb.constexpr_lut_to_dense( lut=np.arange(2, dtype=np.float32), indices=np.ones((int(np.ceil(np.prod(beta.shape) / 8)),), dtype=np.uint8), shape=np.array(beta.shape, dtype=np.uint32), ) y = mb.add(x=y, y=beta) y = mb.mul(x=y, y=np.random.rand(1, shape[1], 1, 1)) return y prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::fuse_layernorm_or_instancenorm" ) prev_expected_ops = [ "reduce_mean", "sub", "mul", "reduce_mean", "add", "rsqrt", "mul", ] if with_affine: if constexpr_beta: prev_expected_ops.append("constexpr_lut_to_dense") prev_expected_ops += ["add", "mul"] assert get_op_types_in_program(prev_prog) == prev_expected_ops if with_affine and constexpr_beta: assert get_op_types_in_program(prog) == get_op_types_in_program(prev_prog) else: assert get_op_types_in_program(prog) == ["layer_norm"] assert_model_is_valid( prog, {"x": shape}, expected_output_shapes={block.outputs[0].name: shape}, minimum_deployment_target=ct.target.iOS16, ) @pytest.mark.parametrize("with_affine", [True, False]) def test_ane_layer_norm_root_var_reuse(self, with_affine): """ Detect layer norm pattern, found in models based on ml-ane-transformers. Cover cases where the input to the layer norm is used as input to an op that occurs after the layer norm. ``y = (x - mean(x)) * rsqrt(mean((x - mean(x))^2) + eps)`` ``y = [(x - mean(x)) * rsqrt(mean((x - mean(x))^2) + eps) + beta] * gamma`` Note that beta and gamma in these equations are applied in opposite order compared to the MIL. 
Only applies when mean and variance are computed along axes [1] """ shape = (3, 5, 1, 6) @mb.program(input_specs=[mb.TensorSpec(shape=shape)]) def prog(x): x1 = mb.add(x=x, y=np.random.rand(*shape)) x2 = mb.reduce_mean(x=x1, axes=[1], keep_dims=True) # mean x3 = mb.sub(x=x1, y=x2) # x - mean x4 = mb.mul(x=x3, y=x3) # (x - mean)^2 x5 = mb.reduce_mean(x=x4, axes=[1], keep_dims=True) # variance x6 = mb.add(x=x5, y=1e-5) # variance + eps x7 = mb.rsqrt(x=x6) # rsqrt(variance + eps) y = mb.mul(x=x3, y=x7) # (x - mean) * rsqrt(variance + eps) if with_affine: y = mb.add(x=y, y=np.random.rand(1, shape[1], 1, 1)) y = mb.mul(x=y, y=np.random.rand(1, shape[1], 1, 1)) y = mb.sub(x=x, y=y) # use x for something after the norm return y prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::fuse_layernorm_or_instancenorm" ) assert get_op_types_in_program(prev_prog) == [ "add", "reduce_mean", "sub", "mul", "reduce_mean", "add", "rsqrt", "mul", ] + (["add", "mul"] if with_affine else []) + ["sub"] assert get_op_types_in_program(prog) == ["add", "layer_norm", "sub"] assert_model_is_valid( prog, {"x": shape}, expected_output_shapes={block.outputs[0].name: shape} ) @pytest.mark.parametrize( "with_affine, reused_name", itertools.chain( itertools.product( [True, False], [None, "x2", "x3", "x4", "x8"], ), [[True, "x9"]] ), ) def test_ane_layer_norm_intermediate_var_reuse(self, with_affine, reused_name): """ Avoid false positive detection of ml-ane-transformers layer norm pattern. In cases where an intermediate value is used after the layer norm is computed, the pattern should not be fused. """ shape = (3, 5, 1, 6) @mb.program(input_specs=[mb.TensorSpec(shape=shape)]) def prog(x): x1 = mb.add(x=x, y=np.random.rand(*shape)) x2 = mb.reduce_mean(x=x1, axes=[1], keep_dims=True) # mean x3 = mb.sub(x=x1, y=x2) # x - mean x4 = mb.mul(x=x3, y=x3) # (x - mean)^2 x5 = mb.reduce_mean(x=x4, axes=[1], keep_dims=True) # variance x6 = mb.add(x=x5, y=1e-5) # variance + eps x7 = mb.rsqrt(x=x6) # rsqrt(variance + eps) x8 = mb.mul(x=x3, y=x7) # (x - mean) * rsqrt(variance + eps) y = x8 if with_affine: x9 = mb.add(x=y, y=np.random.rand(1, shape[1], 1, 1)) y = x9 y = mb.mul(x=y, y=np.random.rand(1, shape[1], 1, 1)) # All the same shape (3,5,1,6) reused = None if reused_name == "x2": reused = x2 elif reused_name == "x3": reused = x3 elif reused_name == "x4": reused = x4 elif reused_name == "x8": reused = x8 elif reused_name == "x9": reused = x9 if reused: y = mb.sub(x=reused, y=y) # reuse an intermediate variable return y prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::fuse_layernorm_or_instancenorm" ) if reused_name == "x8" and not with_affine: # This can be fused since without affine, x8 is the final op in the layer norm. assert get_op_types_in_program(prog) == ["add", "layer_norm", "sub"] elif reused_name: assert "layer_norm" not in get_op_types_in_program(prog) assert get_op_types_in_program(prog)[-1] == "sub" else: # Fusion should still work when nothing is reused. assert get_op_types_in_program(prog) == ["add", "layer_norm"] assert_model_is_valid( prog, {"x": shape}, expected_output_shapes={block.outputs[0].name: shape} ) def test_ane_layer_norm_ambiguous_add(self): """ Avoid false positive detection of ml-ane-transformers layer norm pattern. In cases where the pattern has an add that could be beta, but it does not feed into a gamma mul, it is ambiguous if the pattern should be fused so don't. 
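Without a following gamma mul, such an add could just as well be a residual connection or a bias from a later layer, so the safe choice is to leave the original ops in place.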
""" shape = (3, 5, 1, 6) @mb.program(input_specs=[mb.TensorSpec(shape=shape)]) def prog(x): x1 = mb.add(x=x, y=np.random.rand(*shape)) x2 = mb.reduce_mean(x=x1, axes=[1], keep_dims=True) # mean x3 = mb.sub(x=x1, y=x2) # x - mean x4 = mb.mul(x=x3, y=x3) # (x - mean)^2 x5 = mb.reduce_mean(x=x4, axes=[1], keep_dims=True) # variance x6 = mb.add(x=x5, y=1e-5) # variance + eps x7 = mb.rsqrt(x=x6) # rsqrt(variance + eps) x8 = mb.mul(x=x3, y=x7) # (x - mean) * rsqrt(variance + eps) y = mb.add(x=x8, y=np.random.rand(1, shape[1], 1, 1)) # ambiguous add (maybe beta) return y prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::fuse_layernorm_or_instancenorm" ) assert get_op_types_in_program(prev_prog) == [ "add", "reduce_mean", "sub", "mul", "reduce_mean", "add", "rsqrt", "mul", "add", ] assert "layer_norm" not in get_op_types_in_program(prog) assert_model_is_valid( prog, {"x": shape}, expected_output_shapes={block.outputs[0].name: shape} ) @pytest.mark.parametrize( "first_axes, second_axes", [ [[0], [1]], [[1], [0]], [[2], [1]], [[1], [2]], [[1,2], [1]], [[1], [1,2]], ] ) def test_ane_layer_norm_wrong_axes(self, first_axes, second_axes): """ Avoid false positive detection of ml-ane-transformers layer norm pattern. In cases where the axes != [1] the pattern should not be fused. """ shape = (1, 1, 1, 6) @mb.program(input_specs=[mb.TensorSpec(shape=shape)]) def prog(x): x1 = mb.add(x=x, y=np.random.rand(*shape)) x2 = mb.reduce_mean(x=x1, axes=first_axes, keep_dims=True) # mean x3 = mb.sub(x=x1, y=x2) # x - mean x4 = mb.mul(x=x3, y=x3) # (x - mean)^2 x5 = mb.reduce_mean(x=x4, axes=second_axes, keep_dims=True) # variance x6 = mb.add(x=x5, y=1e-5) # variance + eps x7 = mb.rsqrt(x=x6) # rsqrt(variance + eps) y = mb.mul(x=x3, y=x7) # (x - mean) * rsqrt(variance + eps) return y prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::fuse_layernorm_or_instancenorm" ) assert get_op_types_in_program(prev_prog) == [ "add", "reduce_mean", "sub", "mul", "reduce_mean", "add", "rsqrt", "mul", ] assert "layer_norm" not in get_op_types_in_program(prog) assert_model_is_valid( prog, {"x": shape}, expected_output_shapes={block.outputs[0].name: shape} ) def test_instance_norm_pattern_1(self): """ Detect instance norm pattern y = x * [gamma * rsqrt(variance + eps)] + (beta - mean * [gamma * rsqrt(variance + eps)]) where input is rank 4, (N,C,H,W), axis=[2, 3], along which reduction happens, and gamma and beta are of shape (1,C,1,1) """ shape = (3, 5, 6, 7) @mb.program(input_specs=[mb.TensorSpec(shape=shape)]) def prog(x): x1 = mb.reduce_mean(x=x, axes=[2, 3], keep_dims=True) x2 = mb.sub(x=x, y=x1) x2 = mb.square(x=x2) x2 = mb.reduce_mean(x=x2, axes=[2, 3], keep_dims=True) x2 = mb.add(x=x2, y=1e-5) x2 = mb.rsqrt(x=x2) x3 = mb.mul(x=np.random.rand(1, shape[1], 1, 1), y=x2) x4 = mb.mul(x=x3, y=x1) x5 = mb.mul(x=x, y=x3) x4 = mb.sub(x=np.random.rand(1, shape[1], 1, 1), y=x4) y = mb.add(x=x4, y=x5) return y prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::fuse_layernorm_or_instancenorm" ) assert get_op_types_in_program(prev_prog) == [ "reduce_mean", "sub", "square", "reduce_mean", "add", "rsqrt", "mul", "mul", "mul", "sub", "add", ] assert get_op_types_in_program(prog) == ["instance_norm"] assert_model_is_valid( prog, {"x": shape}, expected_output_shapes={block.outputs[0].name: shape} ) def test_instance_norm_pattern_1_rank_1_gamma_beta(self): """ Detect instance norm pattern y = x * [gamma * rsqrt(variance + eps)] + (beta - mean * [gamma * rsqrt(variance + eps)]) 
where input is rank 4, (N,C,H,W), axis=[2, 3], along which reduction happens, and gamma and beta are of shape (C,) """ shape = (3, 5, 6, 7) @mb.program(input_specs=[mb.TensorSpec(shape=shape)]) def prog(x): x1 = mb.reduce_mean(x=x, axes=[1, 2], keep_dims=True) x2 = mb.sub(x=x, y=x1) x2 = mb.square(x=x2) x2 = mb.reduce_mean(x=x2, axes=[1, 2], keep_dims=True) x2 = mb.add(x=x2, y=1e-5) x2 = mb.rsqrt(x=x2) x3 = mb.mul(x=np.random.rand(shape[3]), y=x2) x4 = mb.mul(x=x3, y=x1) x5 = mb.mul(x=x, y=x3) x4 = mb.sub(x=np.random.rand(shape[3]), y=x4) y = mb.add(x=x4, y=x5) return y prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::fuse_layernorm_or_instancenorm" ) assert get_op_types_in_program(prev_prog) == [ "reduce_mean", "sub", "square", "reduce_mean", "add", "rsqrt", "mul", "mul", "mul", "sub", "add", ] assert get_op_types_in_program(prog) == ["transpose", "instance_norm", "transpose"] assert_model_is_valid( prog, {"x": shape}, expected_output_shapes={block.outputs[0].name: shape} ) def test_instance_norm_pattern_1_with_channel_last_data_format(self): """ Detect instance norm pattern with channel last data format x = transpose(x) # channel first to channel last, NCHW -> NHWC x = x * [gamma * rsqrt(variance + eps)] + (beta - mean * [gamma * rsqrt(variance + eps)]) x = transpose(x) # channel last to channel first, NHWC -> NCHW The input is rank 4 (N, C, H, W) and the input for fused "instance_norm" op is rank 4 (N, H, W, C), and axis=[1, 2] or [-3, -2], along which reduction happens. This is common in TensorFlow model when data format is channel last. PyMIL inserts transposes around "conv" layer to make "conv" channel first. "fuse_layernorm_or_instancenorm" pass is expected to fuse this pattern as well. """ shape = (1, 3, 5, 5) @mb.program(input_specs=[mb.TensorSpec(shape=shape)]) def prog(x): x = mb.transpose(x=x, perm=[0, 2, 3, 1]) x1 = mb.reduce_mean(x=x, axes=[1, 2], keep_dims=True) x2 = mb.sub(x=x, y=x1) x2 = mb.square(x=x2) x2 = mb.reduce_mean(x=x2, axes=[1, 2], keep_dims=True) x2 = mb.add(x=x2, y=1e-5) x2 = mb.rsqrt(x=x2) x3 = mb.mul(x=np.random.rand(1, 1, 1, shape[1]), y=x2) x4 = mb.mul(x=x3, y=x1) x5 = mb.mul(x=x, y=x3) x4 = mb.sub(x=np.random.rand(1, 1, 1, shape[1]), y=x4) x6 = mb.add(x=x4, y=x5) y = mb.transpose(x=x6, perm=[0, 3, 1, 2]) return y prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::fuse_layernorm_or_instancenorm" ) assert get_op_types_in_program(prev_prog) == [ "transpose", "reduce_mean", "sub", "square", "reduce_mean", "add", "rsqrt", "mul", "mul", "mul", "sub", "add", "transpose", ] assert get_op_types_in_program(prog) == [ "transpose", "transpose", "instance_norm", "transpose", "transpose", ] assert_model_is_valid( prog, {"x": shape}, expected_output_shapes={block.outputs[0].name: shape}, ) # reduce transpose pass should remove extra ones prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") assert get_op_types_in_program(prog) == ["instance_norm"] assert_model_is_valid( prog, {"x": shape}, expected_output_shapes={block.outputs[0].name: shape}, ) def test_instance_norm_pattern_2(self): """ Detect instance norm pattern 2 and fusion. |----> sub0 ----| const (0.5) | ^ | | | | V V x ---> mean0 square --> mean1 --> add_eps ---> pow const_gamma const_beta | | | | | | V V V V |----> sub1 --------------------------------> real_div --> mul_gamma --> add_beta --> ... 
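Compared with pattern 1, the variance path here goes through pow(variance + eps, 0.5) followed by real_div instead of rsqrt and mul, and the centered input is computed twice (sub0 feeds the variance, sub1 feeds the division).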
""" shape = (3, 5, 6, 7) @mb.program(input_specs=[mb.TensorSpec(shape=shape)]) def prog(x): mean0 = mb.reduce_mean(x=x, axes=[2, 3], keep_dims=True) sub0 = mb.sub(x=x, y=mean0) sub1 = mb.sub(x=x, y=mean0) square = mb.square(x=sub0) mean1 = mb.reduce_mean(x=square, axes=[2, 3], keep_dims=True) add_eps = mb.add(x=mean1, y=1e-5) # epsilon pow = mb.pow(x=add_eps, y=0.5) div = mb.real_div(x=sub1, y=pow) mul_gamma = mb.mul(x=np.random.rand(1, shape[1], 1, 1), y=div) # add_beta = mb.add(x=np.random.rand(1, shape[1], 1, 1), y=mul_gamma) return add_beta prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::fuse_layernorm_or_instancenorm" ) assert get_op_types_in_program(prev_prog) == [ "reduce_mean", "sub", "sub", "square", "reduce_mean", "add", "pow", "real_div", "mul", "add", ] assert get_op_types_in_program(prog) == ["instance_norm"] assert_model_is_valid( prog, {"x": shape}, expected_output_shapes={block.outputs[0].name: shape} ) def test_instance_norm_pattern_3(self): """ Detect and fuse instance norm pattern 3 (pattern in TensorFlow-Addons). |-------------------------------------------------| | | | V x --> mean square --> mean1 --> add_eps --> rsqrt --> mul2 --> mul_sub | | ^ | | | V | | | | --> sub -----| | | | V V |--------------------------------------------> mul1 -------------> add --> ... """ shape = (3, 5, 6, 7) @mb.program(input_specs=[mb.TensorSpec(shape=shape)]) def prog(x): mean0 = mb.reduce_mean(x=x, axes=[2, 3], keep_dims=True) sub = mb.sub(x=x, y=mean0) square = mb.square(x=sub) mean1 = mb.reduce_mean(x=square, axes=[2, 3], keep_dims=True) add_eps = mb.add(x=mean1, y=1e-5) # epsilon rsqrt = mb.rsqrt(x=add_eps) mul1 = mb.mul(x=rsqrt, y=x) mul2 = mb.mul(x=mean0, y=rsqrt) mul_sub = mb.mul(x=mul2, y=-1.0) add = mb.add(x=mul1, y=mul_sub) return add prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::fuse_layernorm_or_instancenorm" ) assert get_op_types_in_program(prev_prog) == [ "reduce_mean", "sub", "square", "reduce_mean", "add", "rsqrt", "mul", "mul", "mul", "add", ] assert get_op_types_in_program(prog) == ["instance_norm"] assert_model_is_valid( prog, {"x": shape}, expected_output_shapes={block.outputs[0].name: shape} ) def test_instance_norm_pattern_4(self): """ Detect and fuse instance norm pattern 4. |-----------| | V |------> mul_square1 -----> sum1 -----> mul_mean1 | | | V x --> sum --> mul_mean ==> mul_square --> sub_variance --> add_eps --> rsqrt | | | | | V | | mul_gamma | | | | | |----------------| | | | V | |--------------------------------------------+-------------> mul2 | V | |----------------------------------------------------------> mul1 | | V | sub_beta --> add --> [...] 
| ^ |---------------------------| """ shape = (3, 5, 6, 7) @mb.program(input_specs=[mb.TensorSpec(shape=shape)]) def prog(x): mul_square1 = mb.mul(x=x, y=x) sum = mb.reduce_sum(x=x, axes=[2, 3], keep_dims=True) mul_mean = mb.mul(x=sum, y=3.3333334e-05) # dummy value here mul_square = mb.mul(x=mul_mean, y=mul_mean) sum1 = mb.reduce_sum(x=mul_square1, axes=[2, 3], keep_dims=True) mul_mean1 = mb.mul(x=sum1, y=8.333333e-06) # dummy value here sub_variance = mb.sub(x=mul_mean1, y=mul_square) add_eps = mb.add(x=sub_variance, y=1e-5) # epsilon rsqrt = mb.rsqrt(x=add_eps) mul_gamma = mb.mul(x=rsqrt, y=np.random.rand(1, shape[1], 1, 1)) mul1 = mb.mul(x=mul_gamma, y=x) mul2 = mb.mul(x=mul_mean, y=mul_gamma) sub_beta = mb.sub(x=np.random.rand(1, shape[1], 1, 1), y=mul2) add = mb.add(x=mul1, y=sub_beta) return add prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::fuse_layernorm_or_instancenorm" ) assert get_op_types_in_program(prev_prog) == [ "mul", "reduce_sum", "mul", "mul", "reduce_sum", "mul", "sub", "add", "rsqrt", "mul", "mul", "mul", "sub", "add", ] assert get_op_types_in_program(prog) == ["instance_norm"] assert_model_is_valid( prog, {"x": shape}, expected_output_shapes={block.outputs[0].name: shape} ) class TestFuseLinearBias: @staticmethod def _apply_transform(inputs, func, is_first_input, has_bias): """ Utility function to test the weight/bias transform function in linear bias fusion pass. """ @mb.program(input_specs=[mb.TensorSpec(shape=(3, 4))]) def prog(x): if has_bias: linear = mb.linear( x=x, weight=inputs["linear_weight"], bias=inputs["linear_bias"], ) else: linear = mb.linear( x=x, weight=inputs["linear_weight"], ) if is_first_input: kwargs = { "x": linear, "y": inputs["bias"], } else: kwargs = { "x": inputs["bias"], "y": linear, } x = func(**kwargs) return x apply_pass_and_basic_check( prog, "common::fuse_linear_bias", ) # get the updated weight from the prog linear_op = [] for op in prog["main"].operations: if op.op_type == "const": continue linear_op.append(op) assert len(linear_op) == 1, "should only have one linear layer." 
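        # Return the fused weight and bias so callers can check them against the
        # algebraic expectation: for add, W' = W and b' = b_linear + bias; for sub,
        # W' = W and b' = b_linear - bias when the linear output is the first operand,
        # otherwise W' = -W and b' = bias - b_linear (these are exactly the values
        # test_transform_linear computes below).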
return linear_op[0].weight.val, linear_op[0].bias.val @pytest.mark.parametrize( "op_type, is_first_input, has_bias, broadcast", itertools.product( ["add", "sub"], [True, False], [True, False], [True, False], ), ) def test_transform_linear(self, op_type, is_first_input, has_bias, broadcast): """ Test the weight / bias transform function in the linear bias fusion pass """ weight = np.reshape(np.arange(8), (2, 4)).astype(np.float32) linear_bias = ( np.array([1, 2]).astype(np.float32) if has_bias else np.array([0, 0]).astype(np.float32) ) bias = np.array([3, 4]).astype(np.float32) if broadcast: bias = np.reshape(bias, (1, 2)) inputs = { "linear_weight": weight, "linear_bias": linear_bias, "bias": bias, } if op_type == "add": func = mb.add elif op_type == "sub": func = mb.sub new_weight, new_bias = self._apply_transform( inputs, func, is_first_input, has_bias, ) if broadcast: bias = np.reshape(bias, (2,)) if op_type == "sub" and not is_first_input: expected_weight = -weight else: expected_weight = weight if op_type == "sub": if is_first_input: expected_bias = linear_bias - bias else: expected_bias = bias - linear_bias else: expected_bias = linear_bias + bias np.testing.assert_almost_equal(new_weight, expected_weight) np.testing.assert_almost_equal(new_bias, expected_bias) @pytest.mark.parametrize( "rank, op_type, is_first_input, broadcast, backend", itertools.product([1, 2, 3], ["add", "sub"], [True, False], [True, False], backends), ) def test_linear_bias_fusion(self, rank, op_type, is_first_input, broadcast, backend): """ Input graph: Const | V input -----> linear -----> add/sub ---> out Output graph: input -----> linear ----> out """ input_shape = [1, 2, 3] input_shape = input_shape[-rank:] input_shape = tuple(input_shape) @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)]) def prog(x): linear_weight = np.reshape(np.arange(6), (2, 3)).astype(np.float32) linear_bias = np.array([1.0, 2.0]) bias = np.array([3.0, 4.0]) if broadcast: if rank >= 2: bias = np.reshape(bias, (1, 2)) x = mb.linear( x=x, weight=linear_weight, bias=linear_bias, ) func = mb.add if op_type == "add" else mb.sub if is_first_input: kwargs = { "x": x, "y": bias, } else: kwargs = { "x": bias, "y": x, } x = func(**kwargs) return x prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::fuse_linear_bias") assert get_op_types_in_program(prev_prog) == ["linear", op_type] assert get_op_types_in_program(prog) == ["linear"] # validate graph pass output_shape = [1, 2, 2] output_shape = tuple(output_shape[-rank:]) assert_model_is_valid( prog, {"x": input_shape}, expected_output_shapes={block.outputs[0].name: output_shape}, backend=backend, ) class TestFuseMatmulWeightBias: def test_fuse_matmul_weight_bias(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): weights_val = np.random.rand(2, 4).T.astype(np.float32) weights = mb.const(val=weights_val) bias_val = np.random.rand(2).astype(np.float32) bias = mb.const(val=bias_val) matmul = mb.matmul(x=x, y=weights) return mb.add(x=matmul, y=bias) assert_op_count_match(prog, expect=1, op="matmul") assert_op_count_match(prog, expect=0, op="linear") prev_prog = copy.deepcopy(prog) PASS_REGISTRY["common::fuse_matmul_weight_bias"](prog) assert_same_output_names(prev_prog, prog) assert_op_count_match(prog, expect=0, op="matmul") assert_op_count_match(prog, expect=1, op="linear") if _VALIDATE_MODEL: assert_model_is_valid(prog, {"x": (2, 4)}) class TestGraphPassScopePreservation: def test_single_pass(self): """ Input: x -> relu(torch_scope="module_1") -> 
transpose_1(torch_scope="module_1") -> transpose_2(torch_scope="module_2") -> output Output: x -> relu(torch_scope="module_1") -> transpose_3( torch_scope="module_2", pass_scope="merge_consecutive_transposes" ) -> output In the above case, the relu op preserves its original scope information. Since transpose_3 is created by the "merge_consecutive_transposes" pass, the COREMLTOOLS_GRAPH_PASS scope information will be saved in the op. Also, the TORCHSCRIPT_MODULE_TYPE scope info of transpose_2 is back propagated to transpose_3, when the use of output of transpose_2 is replaced by the output of transpose_3. """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 3, 4))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="module_1"), ): x = mb.relu(x=x) x = mb.transpose(x=x, perm=[0, 2, 1, 3]) with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="module_2"), ): return mb.transpose(x=x, perm=[3, 2, 0, 1]) prog._add_essential_scope_source(ScopeSource.TORCHSCRIPT_MODULE_TYPE) apply_pass_and_basic_check(prog, "common::merge_consecutive_transposes") assert get_op_types_in_program(prog) == ["relu", "transpose"] # the scope info in the relu op is not affected relu_op = prog.find_ops(op_type="relu")[0] assert len(relu_op.scopes) == 1 assert relu_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["module_1"] # the new transpose op has the scope information from the graph pass transpose_op = prog.find_ops(op_type="transpose")[0] assert len(transpose_op.scopes) == 2 assert transpose_op.scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS] == [ "merge_consecutive_transposes" ] assert transpose_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["module_2"] def test_single_pass_without_creating_new_var(self): """ Input: x -> relu(torch_scope="module_1") -> relu(torch_scope="module_2") -> relu(torch_scope="module_3") -> output Output: x -> relu(torch_scope="module_1") -> output In the above case, the relu op preserves its original scope information, since the graph pass only reconnects the graph. """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 3, 4))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="module_1"), ): x = mb.relu(x=x) with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="module_2"), ): x = mb.relu(x=x) with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="module_3"), ): return mb.relu(x=x) prog._add_essential_scope_source(ScopeSource.TORCHSCRIPT_MODULE_TYPE) apply_pass_and_basic_check(prog, "common::merge_consecutive_relus") assert get_op_types_in_program(prog) == ["relu"] # the scope info in the relu op is not affected relu_op = prog.find_ops(op_type="relu")[0] assert len(relu_op.scopes) == 1 assert relu_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["module_1"] def test_multiple_passes(self): """ In this case, a program goes through two graph passes. And the resulting program should have scope information from both passes. 
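Concretely, the fused instance_norm op should record "fuse_layernorm_or_instancenorm" and the merged transpose should record "merge_consecutive_transposes" under COREMLTOOLS_GRAPH_PASS, while ops the passes did not touch keep only their original TORCHSCRIPT_MODULE_TYPE scope.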
""" shape = (3, 5, 6, 7) @mb.program(input_specs=[mb.TensorSpec(shape=shape)]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="module_1"), ): # dummy op x = mb.relu(x=x) # pattern for "merge_consecutive_transposes" x = mb.transpose(x=x, perm=[0, 2, 1, 3]) with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="module_2"), ): x = mb.transpose(x=x, perm=[3, 2, 0, 1]) # pattern for "fuse_layernorm_or_instancenorm" mean0 = mb.reduce_mean(x=x, axes=[2, 3], keep_dims=True) sub0 = mb.sub(x=x, y=mean0) sub1 = mb.sub(x=x, y=mean0) square = mb.square(x=sub0) mean1 = mb.reduce_mean(x=square, axes=[2, 3], keep_dims=True) add_eps = mb.add(x=mean1, y=1e-5) # epsilon pow = mb.pow(x=add_eps, y=0.5) div = mb.real_div(x=sub1, y=pow) with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="module_3"), ): mul_gamma = mb.mul(x=np.random.rand(1, shape[1], 1, 1), y=div) return mb.add(x=np.random.rand(1, shape[1], 1, 1), y=mul_gamma) prog._add_essential_scope_source(ScopeSource.TORCHSCRIPT_MODULE_TYPE) apply_pass_and_basic_check(prog, "common::fuse_layernorm_or_instancenorm") apply_pass_and_basic_check(prog, "common::merge_consecutive_transposes") assert get_op_types_in_program(prog) == ["relu", "transpose", "instance_norm"] # the scope info in the relu op is not affected relu_op = prog.find_ops(op_type="relu")[0] assert len(relu_op.scopes) == 1 assert relu_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["module_1"] # the new transpose op has the scope information from the graph pass transpose_op = prog.find_ops(op_type="transpose")[0] assert len(transpose_op.scopes) == 2 assert transpose_op.scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS] == [ "merge_consecutive_transposes" ] assert transpose_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["module_2"] # the new instance_norm op has the scope information from the graph pass instance_norm_op = prog.find_ops(op_type="instance_norm")[0] assert len(instance_norm_op.scopes) == 2 assert instance_norm_op.scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS] == [ "fuse_layernorm_or_instancenorm" ] assert instance_norm_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["module_3"] def test_fp16_scope_preservation(self): """ This test explains step-by-step how the scope information preservation works in the fp32 -> fp16 pass. Input graph: x(fp32) -> relu(torch_scope="module_1") -> sin(torch_scope="module_2") -> output(fp32) (1) "common::add_fp16_cast" First, in the add_fp16_cast graph pass, multiple cast ops are injected in the graph: x(fp32) -> cast(dtype="fp16", torch_scope="module_1", pass_scope="add_fp16_cast") -> relu(torch_scope="module_1", pass_scope="add_fp16_cast") -> cast(dtype="fp32", torch_scope="module_1", pass_scope="add_fp16_cast") -> cast(dtype="fp16", torch_scope="module_2", pass_scope="add_fp16_cast") -> sin(torch_scope="module_2", pass_scope="add_fp16_cast") -> cast(dtype="fp32, torch_scope="module_2", pass_scope="add_fp16_cast") -> output There are 4 cast ops in the graph who has pass_scope = "add_fp16_cast", which indicates they are added by the "add_fp16_cast" pass. Note that, the first cast -> relu -> cast pattern has the same torch scope information as the original relu(torch_scope="module_1"). This is due to the fact that when we replace the use of the original relu output with the output of the second cast op, the scope information is back propagated. The same reason applied for why the cast -> sin -> cast patterns has the torch scope as the original sin op. 
(2) "common::cast_optimization" + "dead-code_elimination" After the cleanup, the graph becomes: x(fp32) -> cast( dtype="fp16", torch_scope="module_1", pass_scope="add_fp16_cast" ) -> relu( torch_scope="module_1", pass_scope="add_fp16_cast] ) -> sin( torch_scope="module_2", pass_scope="add_fp16_cast" ) -> cast( dtype="fp32, torch_scope="module_2", pass_scope="add_fp16_cast" ) -> output We can see that, the fp16 version of relu / sin preserves the original torch scope information. """ shape = (3, 5, 6, 7) @mb.program(input_specs=[mb.TensorSpec(shape=shape)]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="module_1"), ): x = mb.relu(x=x) with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="module_2"), ): return mb.sin(x=x) prog._add_essential_scope_source(ScopeSource.TORCHSCRIPT_MODULE_TYPE) # fp16 cast pass apply_pass_and_basic_check(prog, "common::add_fp16_cast") assert get_op_types_in_program(prog) == ["cast", "relu", "cast", "cast", "sin", "cast"] cast_ops = prog.find_ops(op_type="cast") assert len(cast_ops) == 4 assert len(cast_ops[0].scopes) == 2 assert cast_ops[0].scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS] == ["add_fp16_cast"] assert cast_ops[0].scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["module_1"] assert len(cast_ops[1].scopes) == 2 assert cast_ops[1].scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS] == ["add_fp16_cast"] assert cast_ops[1].scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["module_1"] assert len(cast_ops[2].scopes) == 2 assert cast_ops[2].scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS] == [ "add_fp16_cast", ] assert cast_ops[2].scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["module_2"] assert len(cast_ops[3].scopes) == 2 assert cast_ops[3].scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS] == ["add_fp16_cast"] assert cast_ops[3].scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["module_2"] relu_op = prog.find_ops(op_type="relu")[0] assert len(relu_op.scopes) == 2 assert relu_op.scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS] == ["add_fp16_cast"] assert relu_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["module_1"] sin_op = prog.find_ops(op_type="sin")[0] assert len(sin_op.scopes) == 2 assert sin_op.scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS] == ["add_fp16_cast"] assert sin_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["module_2"] # clean up with cast optimization and dead code elimination apply_pass_and_basic_check(prog, "common::cast_optimization") apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == ["cast", "relu", "sin", "cast"] cast_ops = prog.find_ops(op_type="cast") assert len(cast_ops) == 2 assert len(cast_ops[0].scopes) == 2 assert cast_ops[0].scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS] == ["add_fp16_cast"] assert cast_ops[0].scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["module_1"] assert len(cast_ops[1].scopes) == 2 assert cast_ops[1].scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS] == ["add_fp16_cast"] assert cast_ops[1].scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["module_2"] relu_op = prog.find_ops(op_type="relu")[0] assert len(relu_op.scopes) == 2 assert relu_op.scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS] == [ "add_fp16_cast", ] assert relu_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["module_1"] sin_op = prog.find_ops(op_type="sin")[0] assert len(sin_op.scopes) == 2 assert sin_op.scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS] == ["add_fp16_cast"] assert sin_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["module_2"] def 
test_pass_followed_by_fp16(self): """ Input: x -> transpose_1(torch_scope="module_1") -> transpose_2(torch_scope="module_2") -> output Output: x -> cast( dtype="fp16", torch_scope="module_2", pass_scope=["merge_consecutive_transposes", "add_fp16_cast"] ) -> transpose_3_fp16( torch_scope="module_2", pass_scope=["merge_consecutive_transposes", "add_fp16_cast"] ) -> cast(dtype="fp32", torch_scope="module_2", pass_scope=["merge_consecutive_transposes", "add_fp16_cast"] ) -> output In the above case, two transpose ops first merged into a single transpose op, and the graph is transformed into fp16. Hence, the final transpose op should have scope information from both graph passes. """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 3, 4))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="module_1"), ): x = mb.transpose(x=x, perm=[0, 2, 1, 3]) with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="module_2"), ): return mb.transpose(x=x, perm=[3, 2, 0, 1]) prog._add_essential_scope_source(ScopeSource.TORCHSCRIPT_MODULE_TYPE) apply_pass_and_basic_check(prog, "common::merge_consecutive_transposes") apply_pass_and_basic_check(prog, "common::add_fp16_cast") apply_pass_and_basic_check(prog, "common::cast_optimization") apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == ["cast", "transpose", "cast"] cast_ops = prog.find_ops(op_type="cast") assert len(cast_ops) == 2 assert len(cast_ops[0].scopes) == 2 assert cast_ops[0].scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS] == [ "merge_consecutive_transposes", "add_fp16_cast", ] assert cast_ops[0].scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["module_2"] assert len(cast_ops[1].scopes) == 2 assert cast_ops[1].scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS] == [ "merge_consecutive_transposes", "add_fp16_cast", ] assert cast_ops[1].scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == [ "module_2", ] transpose_op = prog.find_ops(op_type="transpose")[0] assert len(transpose_op.scopes) == 2 assert transpose_op.scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS] == [ "merge_consecutive_transposes", "add_fp16_cast", ] assert transpose_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == [ "module_2", ] class TestRandomizeWeights: @staticmethod def assert_weights_changed(prog1, prog2): changed = False const_ops_before = prog1.find_ops(op_type="const") const_ops_after = prog2.find_ops(op_type="const") assert len(const_ops_before) == len(const_ops_after) for i, op in enumerate(const_ops_before): weight_before = op.outputs[0].val weight_after = const_ops_after[i].outputs[0].val if not np.array_equal(weight_before, weight_after): changed = True break assert changed @staticmethod def test_randomize_weights_pass(): """ Test the WeightRandomizer graph pass const | v input -----> matmul -----> out const needs to large enough that should_use_weight_file==True """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 10))]) def prog(x): weights_val = np.random.rand(2, 10).T.astype(np.float32) weights = mb.const(val=weights_val) return mb.matmul(x=x, y=weights) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::WeightRandomizer") # check ops haven't changed assert get_op_types_in_program(prog) == ["matmul"] # check the weights have changed TestRandomizeWeights.assert_weights_changed(prev_prog, prog) @staticmethod def test_utils_randomize_weights(): """ Test ct.models.utils.randomize_weights method end to end """ # Doing a lazy import because it imports 
`coremltools.converters.mil.mil.ops.tests.iOS18 import backends` # which brings dependencies on backends which shouldn't be needed for most tests in `test_passes.py` from coremltools.test.optimize.coreml.test_post_training_quantization import ( get_test_model_and_data_complex, ) model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data_complex() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", compute_precision=ct.precision.FLOAT32, ) # randomize weights randomized_mlmodel = ct.models.utils.randomize_weights(mlmodel) # get before/after mil prog_before = mlmodel._mil_program prog_after = randomized_mlmodel._mil_program # check ops haven't changed assert get_op_types_in_program(prog_before) == get_op_types_in_program(prog_after) assert prog_before.find_ops(op_type="conv")[1].weight.op.op_type == "const" assert prog_after.find_ops(op_type="conv")[1].weight.op.op_type == "const" # check the weights have changed TestRandomizeWeights.assert_weights_changed(prog_before, prog_after) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/tests/test_quantization_passes.py0000644000000000000000000034215414672066616030763 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools from typing import Tuple import numpy as np import parameterized import pytest from mock import patch import coremltools as ct import coremltools.converters.mil.mil.types as types from coremltools._deps import _HAS_TORCH, _IS_MACOS, MSG_TORCH_NOT_FOUND from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.passes.defs import quantization from coremltools.converters.mil.mil.passes.defs.quantization import add_fp16_cast from coremltools.converters.mil.mil.types import numpy_type_to_builtin_type from coremltools.converters.mil.testing_utils import ( apply_pass_and_basic_check, assert_model_is_valid, get_op_types_in_program, ) if _HAS_TORCH: import torch import torch.nn as nn np.random.seed(1818) class TestTensorwiseAffineDequantizeConstElimination: @pytest.mark.parametrize("axis", (None, 0, 1, -1)) def test_eliminate_transpose(self, axis): """ Input graph: data -> constexpr_affine_dequantize -> transpose Output graph: new_data -> constexpr_affine_dequantize where new_data is the value after applying transpose to data """ SHAPE = (1, 2, 3, 4) quantized_data = np.random.randint(0, 256, SHAPE).astype(np.int8) if axis is None: axis = 0 # although tensor-wise, constexpr_affine_dequantize requires a (dummy) axis scale = np.random.rand() zero_point = np.random.randint(-127, 128, dtype=np.int8) else: size = SHAPE[axis] scale = np.random.rand(size) zero_point = np.random.randint(-127, 128, size, dtype=np.int8) @mb.program(input_specs=[], opset_version=ct.target.iOS16) def prog(): res = mb.constexpr_affine_dequantize( quantized_data=quantized_data, axis=axis, scale=scale, zero_point=zero_point, ) return mb.transpose(x=res, perm=(2, 0, 1, 3)) apply_pass_and_basic_check(prog, "common::merge_affine_dequantize_with_consecutive_ops") assert get_op_types_in_program(prog) == ["constexpr_affine_dequantize"] new_op = prog.find_ops(op_type="constexpr_affine_dequantize", exactly_one=True)[0] expected_quantized_data = 
np.transpose(quantized_data, (2, 0, 1, 3)) np.testing.assert_array_equal(new_op.quantized_data.val, expected_quantized_data) def test_eliminate_reshape(self): """ Input graph: data -> constexpr_affine_dequantize -> reshape Output graph: new_data -> constexpr_affine_dequantize where new_data is the value after applying reshape to data """ quantized_data = np.random.randint(0, 256, (1, 2, 3, 4)).astype(np.int8) @mb.program(input_specs=[], opset_version=ct.target.iOS16) def prog(): res = mb.constexpr_affine_dequantize( quantized_data=quantized_data, axis=0, scale=8.9, zero_point=np.int8(34), ) return mb.reshape(x=res, shape=(3, -1)) apply_pass_and_basic_check(prog, "common::merge_affine_dequantize_with_consecutive_ops") assert get_op_types_in_program(prog) == ["constexpr_affine_dequantize"] new_op = prog.find_ops(op_type="constexpr_affine_dequantize", exactly_one=True)[0] expected_quantized_data = np.reshape(quantized_data, (3, 8)) np.testing.assert_array_equal(new_op.quantized_data.val, expected_quantized_data) def test_eliminate_expand_dims(self): """ Input graph: data -> constexpr_affine_dequantize -> expand_dims Output graph: new_data -> constexpr_affine_dequantize where new_data is the value after applying expand_dims to data """ quantized_data = np.random.randint(0, 256, (2, 3, 4)).astype(np.int8) @mb.program(input_specs=[], opset_version=ct.target.iOS16) def prog(): res = mb.constexpr_affine_dequantize( quantized_data=quantized_data, axis=0, scale=8.9, zero_point=np.int8(34), ) return mb.expand_dims(x=res, axes=(0, 2, 4)) apply_pass_and_basic_check(prog, "common::merge_affine_dequantize_with_consecutive_ops") assert get_op_types_in_program(prog) == ["constexpr_affine_dequantize"] new_op = prog.find_ops(op_type="constexpr_affine_dequantize", exactly_one=True)[0] expected_quantized_data = np.expand_dims(quantized_data, axis=(0, 2, 4)) np.testing.assert_array_equal(new_op.quantized_data.val, expected_quantized_data) @pytest.mark.parametrize("axis", [(0, 3), None]) def test_eliminate_squeeze(self, axis): """ Input graph: data -> constexpr_affine_dequantize -> squeeze Output graph: new_data -> constexpr_affine_dequantize where new_data is the value after applying squeeze to data """ quantized_data = np.random.randint(0, 256, (1, 2, 3, 1, 4)).astype(np.int8) @mb.program(input_specs=[], opset_version=ct.target.iOS16) def prog(): res = mb.constexpr_affine_dequantize( quantized_data=quantized_data, axis=0, scale=8.9, zero_point=np.int8(34), ) return mb.squeeze(x=res, axes=axis) apply_pass_and_basic_check(prog, "common::merge_affine_dequantize_with_consecutive_ops") assert get_op_types_in_program(prog) == ["constexpr_affine_dequantize"] new_op = prog.find_ops(op_type="constexpr_affine_dequantize", exactly_one=True)[0] expected_quantized_data = np.squeeze(quantized_data, axis=axis) np.testing.assert_array_equal(new_op.quantized_data.val, expected_quantized_data) def test_eliminate_multiple_ops(self): """ Input graph: data -> constexpr_affine_dequantize -> transpose -> reshape -> expand_dims -> squeeze Output graph: new_data -> constexpr_affine_dequantize where new_data is the value after applying the same chain of transformations to data """ quantized_data = np.random.randint(0, 256, (1, 2, 3, 4)).astype(np.int8) @mb.program(input_specs=[], opset_version=ct.target.iOS16) def prog(): res = mb.constexpr_affine_dequantize( quantized_data=quantized_data, axis=0, scale=8.9, zero_point=np.int8(34), ) res = mb.transpose(x=res, perm=(1, 0, 3, 2)) res = mb.reshape(x=res, shape=(8, 3)) res = 
mb.expand_dims(x=res, axes=(0, 2, 4)) return mb.squeeze(x=res, axes=(2,)) apply_pass_and_basic_check(prog, "common::merge_affine_dequantize_with_consecutive_ops") assert get_op_types_in_program(prog) == ["constexpr_affine_dequantize"] new_op = prog.find_ops(op_type="constexpr_affine_dequantize", exactly_one=True)[0] expected_quantized_data = np.transpose(quantized_data, (1, 0, 3, 2)) expected_quantized_data = np.reshape(expected_quantized_data, (8, 3)) expected_quantized_data = np.expand_dims(expected_quantized_data, (0, 2, 4)) expected_quantized_data = np.squeeze(expected_quantized_data, (2,)) np.testing.assert_array_equal(new_op.quantized_data.val, expected_quantized_data) def test_negative_non_linked_list_pattern(self): """ If ``quantized_data`` feeds into multiple ``constexpr_affine_dequantize`` ops, the graph will not be changed. """ quantized_data = np.random.randint(0, 256, (2, 3, 4)).astype(np.int8) @mb.program(input_specs=[], opset_version=ct.target.iOS16) def prog(): data = mb.const(val=quantized_data) x = mb.constexpr_affine_dequantize( quantized_data=data, axis=0, scale=8.9, zero_point=np.int8(34), ) y = mb.constexpr_affine_dequantize( quantized_data=data, axis=0, scale=8.1, zero_point=np.int8(56), ) return mb.transpose(x=x, perm=(1, 0, 2)), mb.reshape(x=y, shape=(24,)) apply_pass_and_basic_check(prog, "common::merge_affine_dequantize_with_consecutive_ops") assert get_op_types_in_program(prog) == [ "constexpr_affine_dequantize", "constexpr_affine_dequantize", "transpose", "reshape", ] def test_eliminate_connected_outputs(self): """ The optimization stops when the node is a block output """ quantized_data = np.random.randint(0, 256, (2, 3, 4)).astype(np.int8) @mb.program(input_specs=[], opset_version=ct.target.iOS16) def prog(): x = mb.constexpr_affine_dequantize( quantized_data=quantized_data, axis=0, scale=8.9, zero_point=np.int8(34), ) x = mb.transpose(x=x, perm=(1, 0, 2)) x = mb.reshape(x=x, shape=(2, 2, 3, 2)) y = mb.transpose(x=x, perm=(0, 3, 2, 1)) return x, y apply_pass_and_basic_check(prog, "common::merge_affine_dequantize_with_consecutive_ops") assert get_op_types_in_program(prog) == [ "constexpr_affine_dequantize", "transpose", ] new_op = prog.find_ops(op_type="constexpr_affine_dequantize", exactly_one=True)[0] expected_quantized_data = np.transpose(quantized_data, (1, 0, 2)) expected_quantized_data = np.reshape(expected_quantized_data, (2, 2, 3, 2)) np.testing.assert_array_equal(new_op.quantized_data.val, expected_quantized_data) transpose_op = prog.find_ops(op_type="transpose", exactly_one=True)[0] assert transpose_op.perm.val.tolist() == [0, 3, 2, 1] class QuantizationBaseTest: @staticmethod def generate_random_quantization_params( float_dtype: np.dtype, quant_dtype: np.dtype, input_shape: Tuple[int], is_zp_present: bool = True, is_axis_present: bool = True, ) -> Tuple[np.ndarray, np.ndarray, int]: """ return scale, zero point, axis """ input_rank = len(input_shape) low, high = (-128, 128) if quant_dtype == np.int8 else (0, 256) scale = None zero_point = None axis = ( np.random.randint(-input_rank, input_rank, dtype=np.int32) if is_axis_present else None ) if is_axis_present: scale = np.random.rand(input_shape[axis]).astype(float_dtype) if is_zp_present: zero_point = np.random.randint( low=low, high=high, size=input_shape[axis], dtype=quant_dtype ) else: scale = np.array(np.random.rand()).astype(float_dtype) if is_zp_present: zero_point = np.array(np.random.randint(low=low, high=high, dtype=quant_dtype)) return scale, zero_point, axis @staticmethod def 
generate_random_quantize_input( float_dtype: np.dtype, quant_dtype: np.dtype, scale: np.ndarray, zero_point: np.ndarray, axis: int, shape: Tuple[int], ) -> np.ndarray: assert float_dtype == scale.dtype if zero_point is not None: assert quant_dtype == zero_point.dtype if axis is not None: assert shape[axis] == scale.shape[0] if zero_point is not None and axis is not None: assert shape[axis] == zero_point.shape[0] low, high = (-128, 128) if quant_dtype == np.int8 else (0, 256) x_q = np.random.randint(low=low, high=high, size=shape, dtype=np.int32) if axis is None: if zero_point is None: x_fp = x_q * scale else: x_fp = (x_q - zero_point) * scale else: # prepare broadcast shape for latter dequantize broadcastable_shape = np.ones(len(shape), dtype=np.int32) broadcastable_shape[axis] = shape[axis] broadcasted_scale = np.reshape(scale, broadcastable_shape) if zero_point is None: x_fp = x_q * broadcasted_scale else: broadcasted_zero_point = np.reshape(zero_point, broadcastable_shape) x_fp = (x_q - broadcasted_zero_point) * broadcasted_scale return float_dtype(x_fp) class TestIntOpCanonicalization: @pytest.mark.parametrize("op_type", ["reshape"]) def test_canonicalize_int_op(self, op_type): """ Input graph: input -> quantize -> dequantize -> int op -> quantize -> dequantize -> output Output graph: input -> quantize -> int op -> dequantize -> output """ input_shape = (5, 6) output_shape = (5, 2, 3) @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)], opset_version=ct.target.iOS17) def prog(x): quantize_0 = mb.quantize(input=x, scale=0.1, output_dtype="int8") dequantize_1 = mb.dequantize(input=quantize_0, scale=0.1) if op_type == "reshape": reshape = mb.reshape(x=dequantize_1, shape=output_shape) quantize_1 = mb.quantize(input=reshape, scale=0.1, output_dtype="int8") dequantize_2 = mb.dequantize(input=quantize_1, scale=0.1) return dequantize_2 prev_prog, _, _ = apply_pass_and_basic_check(prog, "common::int_op_canonicalization") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prev_prog) == [ "quantize", "dequantize", "reshape", "quantize", "dequantize", ] assert get_op_types_in_program(prog) == ["quantize", "reshape", "dequantize"] assert_model_is_valid( prog, {"x": input_shape}, expected_output_shapes={block.outputs[0].name: output_shape}, minimum_deployment_target=ct.target.iOS17, backend=("mlprogram", "fp32"), ) @pytest.mark.parametrize("all_are_int", (True, False)) def test_canonicalize_versatile_inputs(self, all_are_int): """ Input graph: |-> int op 0 if all_are_int else add -> quantize -> dequantize -> output_0 input -> quantize -> dequantize -| |-> int op 1 -> quantize -> dequantize -> output_1 Output graph: if all_are_int: |-> int op 0 -> dequantize -> output_0 input -> quantize -| |-> int op 1 -> dequantize -> output_1 else: |-> dequantize -> add -> quantize -> dequantize -> output_0 input -> quantize -| |-> int op 1 -> dequantize -> output_1 """ input_shape = (5, 6) output_shape = (5, 2, 3) @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)], opset_version=ct.target.iOS17) def prog(x): quantize_0 = mb.quantize(input=x, scale=0.1, output_dtype="int8") dequantize_1 = mb.dequantize(input=quantize_0, scale=0.1) # int op 0 (here reshape) path if all_are_int: reshape = mb.reshape(x=dequantize_1, shape=output_shape) quantize_1_0 = mb.quantize(input=reshape, scale=0.1, output_dtype="int8") dequantize_2_0 = mb.dequantize(input=quantize_1_0, scale=0.1) # float op (here add) path else: add = mb.add(x=dequantize_1, y=1.0) 
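                # the add works on float tensors, so this branch keeps its dequantize;
                # only the int-friendly reshape branch below gets canonicalized to run
                # directly on the quantized tensor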
quantize_1_0 = mb.quantize(input=add, scale=0.1, output_dtype="int8") dequantize_2_0 = mb.dequantize(input=quantize_1_0, scale=0.1) # int op 1 (here reshape) path reshape = mb.reshape(x=dequantize_1, shape=output_shape) quantize_1_1 = mb.quantize(input=reshape, scale=0.1, output_dtype="int8") dequantize_2_1 = mb.dequantize(input=quantize_1_1, scale=0.1) return ( dequantize_2_0, dequantize_2_1, ) prev_prog, _, block = apply_pass_and_basic_check(prog, "common::int_op_canonicalization") if all_are_int: _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prev_prog) == [ "quantize", "dequantize", "reshape", "quantize", "dequantize", "reshape", "quantize", "dequantize", ] assert get_op_types_in_program(prog) == [ "quantize", "reshape", "dequantize", "reshape", "dequantize", ] else: assert get_op_types_in_program(prev_prog) == [ "quantize", "dequantize", "add", "quantize", "dequantize", "reshape", "quantize", "dequantize", ] assert get_op_types_in_program(prog) == [ "quantize", "dequantize", "add", "quantize", "dequantize", "reshape", "dequantize", ] assert_model_is_valid( prog, {"x": input_shape}, expected_output_shapes={ block.outputs[0].name: output_shape if all_are_int else input_shape, block.outputs[1].name: output_shape, }, minimum_deployment_target=ct.target.iOS17, backend=("mlprogram", "fp32"), ) def test_canonicalize_consecutive_int_ops(self): """ Input graph: input -> quantize -> dequantize -> int op 0 -> quantize -> dequantize -> int op 1 -> quantize -> dequantize -> output Output graph: input -> quantize -> int op 0 -> int op 1 -> dequantize -> output """ input_shape = (5, 6) activation_shape = (10, 3) output_shape = (5, 2, 3) @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)], opset_version=ct.target.iOS17) def prog(x): quantize_0 = mb.quantize(input=x, scale=0.1, output_dtype="int8") dequantize_1 = mb.dequantize(input=quantize_0, scale=0.1) reshape0 = mb.reshape(x=dequantize_1, shape=activation_shape) quantize_1 = mb.quantize(input=reshape0, scale=0.1, output_dtype="int8") dequantize_2 = mb.dequantize(input=quantize_1, scale=0.1) reshape1 = mb.reshape(x=dequantize_2, shape=output_shape) quantize_2 = mb.quantize(input=reshape1, scale=0.1, output_dtype="int8") dequantize_3 = mb.dequantize(input=quantize_2, scale=0.1) return dequantize_3 prev_prog, _, _ = apply_pass_and_basic_check(prog, "common::int_op_canonicalization") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prev_prog) == [ "quantize", "dequantize", "reshape", "quantize", "dequantize", "reshape", "quantize", "dequantize", ] assert get_op_types_in_program(prog) == ["quantize", "reshape", "reshape", "dequantize"] assert_model_is_valid( prog, {"x": input_shape}, expected_output_shapes={block.outputs[0].name: output_shape}, minimum_deployment_target=ct.target.iOS17, backend=("mlprogram", "fp32"), ) def test_canonicalize_block_output_input(self): """ Input graph: |-> output_0 input -> quantize -> dequantize -| |-> int op -> quantize -> dequantize -> output_1 Output graph: |-> dequantize -> output_0 input -> quantize -| |-> int op -> dequantize -> output_1 """ input_shape = (5, 6) output_shape = (5, 2, 3) @mb.program(input_specs=[mb.TensorSpec(shape=input_shape)], opset_version=ct.target.iOS17) def prog(x): quantize_0 = mb.quantize(input=x, scale=0.1, output_dtype="int8") dequantize_1 = mb.dequantize(input=quantize_0, scale=0.1) reshape = mb.reshape(x=dequantize_1, shape=output_shape) quantize_1 = 
mb.quantize(input=reshape, scale=0.1, output_dtype="int8") dequantize_2 = mb.dequantize(input=quantize_1, scale=0.1) return dequantize_1, dequantize_2 prev_prog, _, block = apply_pass_and_basic_check(prog, "common::int_op_canonicalization") assert get_op_types_in_program(prev_prog) == [ "quantize", "dequantize", "reshape", "quantize", "dequantize", ] assert get_op_types_in_program(prog) == [ "quantize", "dequantize", "reshape", "dequantize", ] assert_model_is_valid( prog, {"x": input_shape}, expected_output_shapes={ block.outputs[0].name: input_shape, block.outputs[1].name: output_shape, }, minimum_deployment_target=ct.target.iOS17, backend=("mlprogram", "fp32"), ) # TODO (rdar://112297858): test the case where `int_op_canonicalization` # refuses to transform because the "int op" is from an older iOS version # that does not support int8 and uint8 class TestNullifyRedundantQuantizationZeroPoint: @staticmethod def np_dtype_to_str(np_dtype: np.dtype) -> str: NP_DTYPE_TO_STR = {np.int8: "int8", np.uint8: "uint8"} return NP_DTYPE_TO_STR.get(np_dtype) @staticmethod def shift_128(input: np.ndarray, quant_dtype: np.dtype) -> np.ndarray: """ shift const input according to zero point shift and dtype change: int8: -128 -> 0, int8 -> uint8 uint8: 128 -> 0, uint8 -> int8 """ shifted_input: np.ndarray if quant_dtype == np.int8: shifted_input = np.uint8(np.int16(input) + 128) else: shifted_input = np.int8(np.int16(input) - 128) return shifted_input @pytest.mark.parametrize( "quant_dtype, is_axis_present", itertools.product( (np.int8, np.uint8), (True, False), ), ) def test_optimize_zp0_quantize(self, quant_dtype, is_axis_present): """ initial graph: input -> quantize(zero_point=0) -> dequantize(scale=1) -> output final graph: input -> quantize() -> dequantize(scale=1) -> output """ SHAPE = (1, 3) rank = len(SHAPE) axis = np.random.randint(-rank, rank) if is_axis_present else None scale = np.random.rand(SHAPE[axis]) if is_axis_present else np.random.rand() zero_point = np.zeros(SHAPE[axis], dtype=quant_dtype) if is_axis_present else quant_dtype(0) @mb.program(input_specs=[mb.TensorSpec(shape=SHAPE)]) def prog(x): quantized = mb.quantize( input=x, scale=scale, zero_point=zero_point, axis=axis, output_dtype=self.np_dtype_to_str(quant_dtype), ) dequantized = mb.dequantize( input=quantized, scale=1.0, ) return dequantized assert get_op_types_in_program(prog) == ["quantize", "dequantize"] quantize_op = prog.find_ops(op_type="quantize")[0] assert np.all(quantize_op.zero_point.val == 0) _, _, block = apply_pass_and_basic_check(prog, "common::nullify_redundant_quantization_zero_point") assert get_op_types_in_program(prog) == ["quantize", "dequantize"] quantize_op = prog.find_ops(op_type="quantize")[0] assert quantize_op.zero_point is None assert_model_is_valid( prog, {"x": SHAPE}, minimum_deployment_target=ct.target.iOS17, backend=("mlprogram", "fp32"), expected_output_shapes={block.outputs[0].name: SHAPE}, ) @pytest.mark.parametrize( "quant_dtype, is_axis_present", itertools.product( (np.int8, np.uint8), (True, False), ), ) def test_optimize_zp0_dequantize(self, quant_dtype, is_axis_present): """ initial graph: input -> quantize(scale=1) -> dequantize(zero_point=0) -> output final graph: input -> quantize(scale=1) -> dequantize() -> output """ SHAPE = (6, 5, 4, 3, 2) rank = len(SHAPE) axis = np.random.randint(-rank, rank) if is_axis_present else None scale = np.random.rand(SHAPE[axis]) if is_axis_present else np.random.rand() zero_point = np.zeros(SHAPE[axis], dtype=quant_dtype) if is_axis_present else 
quant_dtype(0) @mb.program(input_specs=[mb.TensorSpec(shape=SHAPE)]) def prog(x): quantized = mb.quantize( input=x, scale=1.0, output_dtype=self.np_dtype_to_str(quant_dtype), ) dequantized = mb.dequantize( input=quantized, scale=scale, zero_point=zero_point, axis=axis, ) return dequantized assert get_op_types_in_program(prog) == ["quantize", "dequantize"] dequantize_op = prog.find_ops(op_type="dequantize")[0] assert np.all(dequantize_op.zero_point.val == 0) _, _, block = apply_pass_and_basic_check( prog, "common::nullify_redundant_quantization_zero_point" ) assert get_op_types_in_program(prog) == ["quantize", "dequantize"] dequantize_op = prog.find_ops(op_type="dequantize")[0] assert dequantize_op.zero_point is None assert_model_is_valid( prog, {"x": SHAPE}, minimum_deployment_target=ct.target.iOS17, backend=("mlprogram", "fp32"), expected_output_shapes={block.outputs[0].name: SHAPE}, ) @pytest.mark.parametrize( "quant_dtype, is_axis_present", itertools.product( (np.int8, np.uint8), (True, False), ), ) def test_optimize_zp128_quantize_dequantize(self, quant_dtype, is_axis_present): """ initial graph: input -> quantize(zero_point=±128) -> dequantize(zero_point=±128) -> output final graph: input -> quantize() -> dequantize() -> output """ SHAPE = (2, 3) rank = len(SHAPE) axis = np.random.randint(-rank, rank) if is_axis_present else None scale_quantize = np.random.rand(SHAPE[axis]) if is_axis_present else np.random.rand() scale_dequantize = np.random.rand(SHAPE[axis]) if is_axis_present else np.random.rand() zero_point_value = -128 if quant_dtype == np.int8 else 128 zero_point = ( np.full(SHAPE[axis], zero_point_value, dtype=quant_dtype) if is_axis_present else quant_dtype(zero_point_value) ) @mb.program(input_specs=[mb.TensorSpec(shape=SHAPE)]) def prog(x): quantized = mb.quantize( input=x, scale=scale_quantize, zero_point=zero_point, axis=axis, output_dtype=self.np_dtype_to_str(quant_dtype), ) dequantized = mb.dequantize( input=quantized, scale=scale_dequantize, zero_point=zero_point, axis=axis, ) return dequantized assert get_op_types_in_program(prog) == ["quantize", "dequantize"] quantize_op = prog.find_ops(op_type="quantize")[0] dequantize_op = prog.find_ops(op_type="dequantize")[0] assert np.all(quantize_op.zero_point.val == zero_point_value) assert np.all(dequantize_op.zero_point.val == zero_point_value) prev_prog, _, block = apply_pass_and_basic_check( prog, "common::nullify_redundant_quantization_zero_point" ) assert get_op_types_in_program(prog) == ["quantize", "dequantize"] quantize_op = prog.find_ops(op_type="quantize")[0] dequantize_op = prog.find_ops(op_type="dequantize")[0] assert quantize_op.zero_point is None assert dequantize_op.zero_point is None assert_model_is_valid( prog, {"x": SHAPE}, minimum_deployment_target=ct.target.iOS17, backend=("mlprogram", "fp32"), expected_output_shapes={block.outputs[0].name: SHAPE}, ) prev_model = ct.convert(prev_prog, minimum_deployment_target=ct.target.iOS17) model = ct.convert(prog, minimum_deployment_target=ct.target.iOS17) x = np.random.rand(*SHAPE) prev_output = list(prev_model.predict({"x": x}).values())[0] output = list(model.predict({"x": x}).values())[0] assert np.all(prev_output == output) @pytest.mark.parametrize( "quant_dtype, is_axis_present", itertools.product( (np.int8, np.uint8), (True, False), ), ) def test_optimize_zp128_const_dequantize(self, quant_dtype, is_axis_present): """ initial graph: input -----------------------| |-> add -> output dequantize(zero_point=±128) -| apply nullify_redundant_quantization_zero_point: 
input --------| |-> add -> output dequantize() -| final graph: input -----------------------| |-> add -> output constexpr_affine_dequantize -| """ SHAPE = (2, 5, 3) quantized = ( np.random.randint(low=-128, high=128, size=SHAPE, dtype=quant_dtype) if quant_dtype == np.int8 else np.random.randint(low=0, high=256, size=SHAPE, dtype=quant_dtype) ) rank = len(SHAPE) axis = np.random.randint(-rank, rank) if is_axis_present else None scale = np.random.rand(SHAPE[axis]) if is_axis_present else np.random.rand() zero_point_value = -128 if quant_dtype == np.int8 else 128 zero_point = ( np.full(SHAPE[axis], zero_point_value, dtype=quant_dtype) if is_axis_present else quant_dtype(zero_point_value) ) @mb.program(input_specs=[mb.TensorSpec(shape=SHAPE)]) def prog(x): dequantized = mb.dequantize( input=quantized, scale=scale, zero_point=zero_point, axis=axis, ) # Core ML cannot have a model with idle input and constant outputs # so append an `add` op to make the model valid result = mb.add(x=x, y=dequantized) return result assert get_op_types_in_program(prog) == ["dequantize", "add"] dequantize_op = prog.find_ops(op_type="dequantize")[0] assert np.all(dequantize_op.input.val == quantized) assert np.all(dequantize_op.zero_point.val == zero_point_value) prev_prog, _, block = apply_pass_and_basic_check( prog, "common::nullify_redundant_quantization_zero_point" ) assert get_op_types_in_program(prog) == ["dequantize", "add"] dequantize_op = prog.find_ops(op_type="dequantize")[0] assert np.all(dequantize_op.input.val == self.shift_128(quantized, quant_dtype)) assert dequantize_op.zero_point is None _, _, block = apply_pass_and_basic_check(prog, "common::dequantize_to_constexpr") assert get_op_types_in_program(prog) == ["constexpr_affine_dequantize", "add"] assert_model_is_valid( prog, {"x": SHAPE}, minimum_deployment_target=ct.target.iOS17, backend=("mlprogram", "fp32"), expected_output_shapes={block.outputs[0].name: SHAPE}, ) prev_model = ct.convert(prev_prog, minimum_deployment_target=ct.target.iOS17) model = ct.convert(prog, minimum_deployment_target=ct.target.iOS17) x = np.random.rand(*SHAPE) prev_output = list(prev_model.predict({"x": x}).values())[0] output = list(model.predict({"x": x}).values())[0] assert np.all(prev_output == output) @pytest.mark.parametrize( "quant_dtype, is_axis_present", itertools.product( (np.int8, np.uint8), (True, False), ), ) def test_keep_mismatching_quantize_dequantize(self, quant_dtype, is_axis_present): """ initial graph: input -> quantize(zero_point=±128 + perturbation) -> dequantize(zero_point=±128) -> output final graph: input -> quantize(zero_point=±128 + perturbation) -> dequantize(zero_point=±128) -> output perturbation may also be applied to dequantize """ SHAPE = (2, 3) rank = len(SHAPE) axis = np.random.randint(-rank, rank) if is_axis_present else None scale_quantize = np.random.rand(SHAPE[axis]) if is_axis_present else np.random.rand() scale_dequantize = np.random.rand(SHAPE[axis]) if is_axis_present else np.random.rand() zero_point_value = -128 if quant_dtype == np.int8 else 128 perturbation = np.random.randint(1, 10, dtype=quant_dtype) zero_point = ( np.full(SHAPE[axis], zero_point_value, dtype=quant_dtype) if is_axis_present else quant_dtype(zero_point_value) ) zero_point_perturbed = quant_dtype(zero_point + perturbation) perturb_quantize = np.random.rand() < 0.5 if perturb_quantize: zero_point_quantize = zero_point_perturbed zero_point_dequantize = zero_point else: zero_point_quantize = zero_point zero_point_dequantize = zero_point_perturbed 
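# Editorial sketch (not part of the original test): the nullification performed by
# this pass relies on the exact +/-128 identity, e.g. int8 -128 shifts to uint8 0
# and int8 127 shifts to uint8 255. A perturbed zero point such as -125 leaves a
# residue of -125 + 128 = 3, so the quantize/dequantize pair built below must be
# kept unchanged, which is what the assertions in this test verify.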
@mb.program(input_specs=[mb.TensorSpec(shape=SHAPE)]) def prog(x): quantized = mb.quantize( input=x, scale=scale_quantize, zero_point=zero_point_quantize, axis=axis, output_dtype=self.np_dtype_to_str(quant_dtype), ) dequantized = mb.dequantize( input=quantized, scale=scale_dequantize, zero_point=zero_point_dequantize, axis=axis, ) return dequantized assert get_op_types_in_program(prog) == ["quantize", "dequantize"] quantize_op = prog.find_ops(op_type="quantize")[0] dequantize_op = prog.find_ops(op_type="dequantize")[0] if perturb_quantize: assert np.all(quantize_op.zero_point.val == zero_point_perturbed) assert np.all(dequantize_op.zero_point.val == zero_point) else: assert np.all(quantize_op.zero_point.val == zero_point) assert np.all(dequantize_op.zero_point.val == zero_point_perturbed) _, _, block = apply_pass_and_basic_check( prog, "common::nullify_redundant_quantization_zero_point" ) assert get_op_types_in_program(prog) == ["quantize", "dequantize"] quantize_op = prog.find_ops(op_type="quantize")[0] dequantize_op = prog.find_ops(op_type="dequantize")[0] if perturb_quantize: assert np.all(quantize_op.zero_point.val == zero_point_perturbed) assert np.all(dequantize_op.zero_point.val == zero_point) else: assert np.all(quantize_op.zero_point.val == zero_point) assert np.all(dequantize_op.zero_point.val == zero_point_perturbed) assert_model_is_valid( prog, {"x": SHAPE}, minimum_deployment_target=ct.target.iOS17, backend=("mlprogram", "fp32"), expected_output_shapes={block.outputs[0].name: SHAPE}, ) class TestDequantizeQuantizePairElimination: @staticmethod def generate_scale_zp_axis(shape, is_zp_present, is_axis_present): rank = len(shape) axis = None if is_axis_present: axis = np.random.randint(-rank, rank, dtype=np.int32) scale = np.random.rand(shape[axis]) if is_axis_present else np.random.rand() zero_point = None if is_zp_present: zero_point = ( np.random.randint(-128, 128, shape[axis], dtype=np.int8) if is_axis_present else np.random.randint(-128, 128, dtype=np.int8) ) return scale, zero_point, axis @pytest.mark.parametrize( "is_zp_present, is_axis_present", itertools.product( (True, False), (True, False), ), ) def test_eliminate_identical_dequantize_quantize(self, is_zp_present, is_axis_present): """ Input graph: input -> quantize0 -> dequantize1 -> quantize1 -> dequantize2 -> add -> quantize2 -> dequantize3 -> output Output graph: input -> quantize0 -> dequantize2 -> add -> quantize2 -> dequantize3 -> output """ SHAPE = (2, 3) scale, zero_point, axis = self.generate_scale_zp_axis(SHAPE, is_zp_present, is_axis_present) @mb.program(input_specs=[mb.TensorSpec(shape=SHAPE, dtype=types.fp32)]) def prog(x): # quantize input quantized_0 = mb.quantize( input=x, scale=scale, zero_point=zero_point, axis=axis, output_dtype="int8" ) # redundant dequantize-quantize pair dequantized_1 = mb.dequantize( input=quantized_0, scale=scale, zero_point=zero_point, axis=axis ) quantized_1 = mb.quantize( input=dequantized_1, scale=scale, zero_point=zero_point, axis=axis, output_dtype="int8", ) # dequantize-op-quantize sandwich dequantized_2 = mb.dequantize( input=quantized_1, scale=scale, zero_point=zero_point, axis=axis ) y = mb.add(x=dequantized_2, y=dequantized_2) quantized_2 = mb.quantize(input=y, scale=0.1, output_dtype="int8") # dequantize output dequantized_3 = mb.dequantize(input=quantized_2, scale=0.1) return dequantized_3 prev_prog, _, _ = apply_pass_and_basic_check( prog, "common::dequantize_quantize_pair_elimination" ) assert get_op_types_in_program(prev_prog) == [ "quantize", "dequantize", 
"quantize", "dequantize", "add", "quantize", "dequantize", ] # As expected, dequantize_1 -> quantize_1 gets eliminated. # On the other hand, even with same scales and zero points and axes, # quantize_0 -> dequantize_2 and quantize_2 -> dequantize_3 are kept. assert get_op_types_in_program(prog) == [ "quantize", "dequantize", "add", "quantize", "dequantize", ] @pytest.mark.parametrize( "is_zp_present, is_axis_present, is_shifted_zp_present", itertools.product( (True, False), (True, False), (True, False), ), ) def test_keep_unidentical_dequantize_quantize( self, is_zp_present, is_axis_present, is_shifted_zp_present ): """ Input graph: input -> quantize0 -> dequantize1(scale1, zp1) -> quantize1(scale2, zp2) -> dequantize2 -> add -> quantize2 -> dequantize3 -> output Nothing changes when dequantize1 and quantize1 have different parameters """ SHAPE = (2, 3, 5) scale, zero_point, axis = self.generate_scale_zp_axis(SHAPE, is_zp_present, is_axis_present) @mb.program(input_specs=[mb.TensorSpec(shape=SHAPE, dtype=types.fp32)]) def prog(x): # quantize input quantized_0 = mb.quantize( input=x, scale=scale, zero_point=zero_point, axis=axis, output_dtype="int8" ) # non-redundant dequantize-quantize pair # this pattern can emerge from a (future) graph pass dequantized_1 = mb.dequantize( input=quantized_0, scale=scale, zero_point=zero_point, axis=axis ) if is_zp_present: # input graph: # dequantize -> add(y=const) -> quantize # output graph: # dequantize -> quantize(zero_point += const / scale) if is_shifted_zp_present: shifted_zero_point = ( (zero_point + 1.0).astype(np.int8) if is_axis_present else np.int8(zero_point + 1.0) ) quantized_1 = mb.quantize( input=dequantized_1, scale=scale, zero_point=shifted_zero_point, axis=axis, output_dtype="int8", ) else: quantized_1 = mb.quantize( input=dequantized_1, scale=scale, axis=axis, output_dtype="int8" ) else: # input graph: # dequantize(zero_point=0) -> mul(y=const) -> quantize(zero_point=0) # output graph: # dequantize(zero_point=0) -> quantize(scale /= const, zero_point=0) quantized_1 = mb.quantize( input=dequantized_1, scale=scale / 2.0, axis=axis, output_dtype="int8" ) # dequantize-op-quantize sandwich dequantized_2 = mb.dequantize( input=quantized_1, scale=scale, zero_point=zero_point, axis=axis ) y = mb.add(x=dequantized_2, y=dequantized_2) quantized_2 = mb.quantize(input=y, scale=0.1, output_dtype="int8") # dequantize output dequantized_3 = mb.dequantize(input=quantized_2, scale=0.1) return dequantized_3 prev_prog, _, _ = apply_pass_and_basic_check( prog, "common::dequantize_quantize_pair_elimination" ) assert get_op_types_in_program(prev_prog) == [ "quantize", "dequantize", "quantize", "dequantize", "add", "quantize", "dequantize", ] # nothing gets eliminated assert get_op_types_in_program(prog) == [ "quantize", "dequantize", "quantize", "dequantize", "add", "quantize", "dequantize", ] @pytest.mark.parametrize( "is_zp_present, is_axis_present", itertools.product( (True, False), (True, False), ), ) def test_keep_block_output_dequantize(self, is_zp_present, is_axis_present): """ Input graph: input -> quantize0 -> dequantize1 -> quantize1 -> dequantize2 -> add -> quantize2 -> dequantize3 -> output Nothing changes when dequantize1 is a block output """ SHAPE = (2, 3) scale, zero_point, axis = self.generate_scale_zp_axis(SHAPE, is_zp_present, is_axis_present) @mb.program(input_specs=[mb.TensorSpec(shape=SHAPE, dtype=types.fp32)]) def prog(x): # quantize input quantized_0 = mb.quantize( input=x, scale=scale, zero_point=zero_point, axis=axis, 
output_dtype="int8" ) # redundant dequantize-quantize pair dequantized_1 = mb.dequantize( input=quantized_0, scale=scale, zero_point=zero_point, axis=axis ) quantized_1 = mb.quantize( input=dequantized_1, scale=scale, zero_point=zero_point, axis=axis, output_dtype="int8", ) # dequantize-op-quantize sandwich dequantized_2 = mb.dequantize( input=quantized_1, scale=scale, zero_point=zero_point, axis=axis ) y = mb.add(x=dequantized_2, y=dequantized_2) quantized_2 = mb.quantize(input=y, scale=0.1, output_dtype="int8") # dequantize output dequantized_3 = mb.dequantize(input=quantized_2, scale=0.1) return dequantized_1, dequantized_3 prev_prog, _, _ = apply_pass_and_basic_check( prog, "common::dequantize_quantize_pair_elimination" ) assert get_op_types_in_program(prev_prog) == [ "quantize", "dequantize", "quantize", "dequantize", "add", "quantize", "dequantize", ] # nothing gets eliminated assert get_op_types_in_program(prog) == [ "quantize", "dequantize", "quantize", "dequantize", "add", "quantize", "dequantize", ] @pytest.mark.parametrize( "is_zp_present, is_axis_present", itertools.product( (True, False), (True, False), ), ) def test_keep_multichildren_dequantize(self, is_zp_present, is_axis_present): """ Input graph: |-> quantize1 -> dequantize2 -> add -> quantize2 -> dequantize3 -> output1 input -> quantize0 -> dequantize1 -| |-> mul -> quantize -> dequantize -> output2 Output graph: |-> dequantize2 -> add -> quantize2 -> dequantize3 -> output1 input -> quantize0 -> dequantize1 -| |-> mul -> quantize -> dequantize -> output2 As `dequantize1` has multiple children, we don't eliminate it, but can remove the child `quantize1`. """ SHAPE = (2, 3) scale, zero_point, axis = self.generate_scale_zp_axis(SHAPE, is_zp_present, is_axis_present) @mb.program(input_specs=[mb.TensorSpec(shape=SHAPE, dtype=types.fp32)]) def prog(x): # quantize input quantized_0 = mb.quantize( input=x, scale=scale, zero_point=zero_point, axis=axis, output_dtype="int8" ) # redundant dequantize-quantize pair dequantized_1 = mb.dequantize( input=quantized_0, scale=scale, zero_point=zero_point, axis=axis ) quantized_1 = mb.quantize( input=dequantized_1, scale=scale, zero_point=zero_point, axis=axis, output_dtype="int8", ) # dequantize-op-quantize sandwich dequantized_2 = mb.dequantize( input=quantized_1, scale=scale, zero_point=zero_point, axis=axis ) y = mb.add(x=dequantized_2, y=dequantized_2) quantized_2 = mb.quantize(input=y, scale=0.1, output_dtype="int8") # dequantize output dequantized_3 = mb.dequantize(input=quantized_2, scale=0.1) # now add another usage of dequantized_1 z = mb.mul(x=dequantized_1, y=dequantized_1) quantized_z = mb.quantize(input=z, scale=0.2, output_dtype="int8") dequantized_z = mb.dequantize(input=quantized_z, scale=0.2) return dequantized_3, dequantized_z prev_prog, _, _ = apply_pass_and_basic_check( prog, "common::dequantize_quantize_pair_elimination" ) assert get_op_types_in_program(prev_prog) == [ "quantize", "dequantize", "quantize", "dequantize", "add", "quantize", "dequantize", "mul", "quantize", "dequantize", ] # The `quantize` before `add` got eliminated. 
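# (Editorial note) `dequantized_1` now has two consumers -- `quantized_1` and the
# `mul` -- so the pass may only drop its redundant child `quantized_1`; removing
# `dequantized_1` itself would leave the `mul` without a float input.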
assert get_op_types_in_program(prog) == [ "quantize", "dequantize", "dequantize", "add", "quantize", "dequantize", "mul", "quantize", "dequantize", ] @pytest.mark.skipif(ct.utils._macos_version() < (14, 0), reason="Requires Core ML 7") class TestDistributiveQuantizedBinaryOpScaleNormalization(QuantizationBaseTest): @pytest.mark.parametrize( "op_type, has_relu_fusion, input_rank, is_axis_x_present", itertools.product( ("add", "sub"), (True, False), (1, 3, 5), (True, False), ), ) def test_normalize(self, op_type, has_relu_fusion, input_rank, is_axis_x_present): """ Input graph: x -> quantize(scale_x) -> dequantize(scale_x) -| |-> add/sub (-> relu) -> dequantize(scale_z) -> output y -> quantize(scale_y) -> dequantize(scale_y) -| Output graph: x -> quantize(scale_x) -> dequantize(scale_x/scale_y) -| |-> add/sub (-> relu) -> dequantize(scale_z/scale_y) -> output y -> quantize(scale_y) -> dequantize(1.0) -| x and y may get swapped to have the one with scalar scale being new "y" """ # if axis_x is present, then axis_y is not, vice versa, # so that one of scale_x or scale_y is scalar SHAPE = np.random.randint(1, 5, size=input_rank, dtype=np.int32) scale_x, zero_point_x, axis_x = self.generate_random_quantization_params( np.float32, np.int8, SHAPE, True, is_axis_x_present ) scale_y, zero_point_y, axis_y = self.generate_random_quantization_params( np.float32, np.int8, SHAPE, True, not is_axis_x_present ) scale_z, zero_point_z, axis_z = self.generate_random_quantization_params( np.float32, np.int8, SHAPE, True, True ) @mb.program( input_specs=[ mb.TensorSpec(shape=SHAPE, dtype=types.fp32), mb.TensorSpec(shape=SHAPE, dtype=types.fp32), ] ) def prog(x, y): # quantize input quantize_x = mb.quantize( input=x, scale=scale_x, zero_point=zero_point_x, axis=axis_x, output_dtype="int8" ) quantize_y = mb.quantize( input=y, scale=scale_y, zero_point=zero_point_y, axis=axis_y, output_dtype="int8" ) # quantized binary op dequantize_x = mb.dequantize( input=quantize_x, scale=scale_x, zero_point=zero_point_x, axis=axis_x ) dequantize_y = mb.dequantize( input=quantize_y, scale=scale_y, zero_point=zero_point_y, axis=axis_y ) z = None if op_type == "add": z = mb.add(x=dequantize_x, y=dequantize_y) elif op_type == "sub": z = mb.sub(x=dequantize_x, y=dequantize_y) else: raise ValueError("unsupported op type") if has_relu_fusion: z = mb.relu(x=z) quantize_z = mb.quantize( input=z, scale=scale_z, zero_point=zero_point_z, axis=axis_z, output_dtype="int8" ) # dequantize output dequantize_z = mb.dequantize( input=quantize_z, scale=scale_z, zero_point=zero_point_z, axis=axis_z ) return dequantize_z # dequantize_x, dequantize_y, z prev_prog, _, _ = apply_pass_and_basic_check( prog, "common::distributive_quantized_binary_op_scale_normalization" ) # dequantize_x, dequantize_y, dequantize_x_normalized, dequantize_y_normalized, z _, _, _ = apply_pass_and_basic_check(prog, "common::dead_code_elimination") # dequantize_x_normalized, dequantize_y_normalized, z scale_prev_dequantize_x = prev_prog.find_ops(op_type="dequantize")[0].scale.val scale_prev_dequantize_y = prev_prog.find_ops(op_type="dequantize")[1].scale.val scale_prev_quantize_z = prev_prog.find_ops(op_type="quantize")[-1].scale.val assert np.all(scale_prev_dequantize_x == scale_x) assert np.all(scale_prev_dequantize_y == scale_y) assert np.all(scale_prev_quantize_z == scale_z) scale_dequantize_x = prog.find_ops(op_type="dequantize")[0].scale.val scale_dequantize_y = prog.find_ops(op_type="dequantize")[1].scale.val scale_quantize_z = 
prog.find_ops(op_type="quantize")[-1].scale.val # if axis_x is present, then scale_y gets normalized # else, scale_x gets normalized, and x and y will get swapped assert np.all( scale_dequantize_x == scale_x / scale_y if is_axis_x_present else scale_y / scale_x ) assert np.all(scale_dequantize_y == 1.0) assert np.all( scale_quantize_z == scale_z / scale_y if is_axis_x_present else scale_z / scale_x ) prev_model = ct.convert( prev_prog, source="milinternal", convert_to="mlprogram", compute_precision=ct.precision.FLOAT32, minimum_deployment_target=ct.target.iOS17, ) model = ct.convert( prog, source="milinternal", convert_to="mlprogram", compute_precision=ct.precision.FLOAT32, minimum_deployment_target=ct.target.iOS17, ) x = self.generate_random_quantize_input( np.float32, np.int8, scale_x, zero_point_x, axis_x, SHAPE ) y = self.generate_random_quantize_input( np.float32, np.int8, scale_y, zero_point_y, axis_y, SHAPE ) prev_output = list(prev_model.predict({"x": x, "y": y}).values())[0] output = list(model.predict({"x": x, "y": y}).values())[0] assert np.all(prev_output == output) def test_normalize_versatile_inputs(self): """ Input graph: |-> exp -> dequantize(scale_z) | x -> quantize(scale_x) -> dequantize(scale_x) -| |-> add -> dequantize(scale_z) -> output y -> quantize(scale_y) -> dequantize(scale_y) -| Output graph: |-> dequantize(scale_x) -> exp -> dequantize(scale_z) | x -> quantize(scale_x) -> dequantize(scale_x/scale_y) -| |-> add -> dequantize(scale_z/scale_y) -> output y -> quantize(scale_y) -> dequantize(1.0) -| """ SHAPE = (2, 1) scale_x, zero_point_x, axis_x = np.float32(0.2), None, None scale_y, zero_point_y, axis_y = np.float32(0.3), None, None scale_z, zero_point_z, axis_z = np.float32(0.5), None, None @mb.program( input_specs=[ mb.TensorSpec(shape=SHAPE, dtype=types.fp32), mb.TensorSpec(shape=SHAPE, dtype=types.fp32), ] ) def prog(x, y): # quantize input quantize_x = mb.quantize( input=x, scale=scale_x, zero_point=zero_point_x, axis=axis_x, output_dtype="uint8" ) quantize_y = mb.quantize( input=y, scale=scale_y, zero_point=zero_point_y, axis=axis_y, output_dtype="uint8" ) # quantized binary op dequantize_x = mb.dequantize( input=quantize_x, scale=scale_x, zero_point=zero_point_x, axis=axis_x ) dequantize_y = mb.dequantize( input=quantize_y, scale=scale_y, zero_point=zero_point_y, axis=axis_y ) z = mb.add(x=dequantize_x, y=dequantize_y) quantize_z = mb.quantize( input=z, scale=scale_z, zero_point=zero_point_z, axis=axis_z, output_dtype="uint8" ) # another quantized op z1 = mb.exp(x=dequantize_x) quantize_z1 = mb.quantize( input=z1, scale=scale_z, zero_point=zero_point_z, axis=axis_z, output_dtype="uint8" ) # dequantize output dequantize_z = mb.dequantize( input=quantize_z, scale=scale_z, zero_point=zero_point_z, axis=axis_z ) dequantize_z1 = mb.dequantize( input=quantize_z1, scale=scale_z, zero_point=zero_point_z, axis=axis_z ) return dequantize_z, dequantize_z1 prev_prog, _, _ = apply_pass_and_basic_check( prog, "common::distributive_quantized_binary_op_scale_normalization" ) _, _, _ = apply_pass_and_basic_check(prog, "common::dead_code_elimination") scale_prev_dequantize_x = prev_prog.find_ops(op_type="dequantize")[0].scale.val scale_prev_dequantize_y = prev_prog.find_ops(op_type="dequantize")[1].scale.val scale_prev_quantize_z = prev_prog.find_ops(op_type="quantize")[-2].scale.val assert np.all(scale_prev_dequantize_x == scale_x) assert np.all(scale_prev_dequantize_y == scale_y) assert np.all(scale_prev_quantize_z == scale_z) scale_dequantize_x_to_z1 = 
prog.find_ops(op_type="dequantize")[0].scale.val scale_dequantize_x_to_z = prog.find_ops(op_type="dequantize")[1].scale.val scale_dequantize_y = prog.find_ops(op_type="dequantize")[2].scale.val scale_quantize_z = prog.find_ops(op_type="quantize")[-2].scale.val assert np.all(scale_dequantize_x_to_z1 == scale_x) assert np.all(scale_dequantize_x_to_z == scale_x / scale_y) assert np.all(scale_dequantize_y == 1.0) assert np.all(scale_quantize_z == scale_z / scale_y) def test_skip_0_scale(self): """ Input graph: x -> quantize(eps) -> dequantize(eps) -| |-> add -> dequantize -> output y -> quantize(eps) -> dequantize(eps) -| Nothing changes due to underflow scale """ # consider anything underflows fp16 to be 0 SHAPE = (1, 2) scale_x, zero_point_x, axis_x = np.float32(5e-8), None, None scale_y, zero_point_y, axis_y = np.float32(-5e-8), None, None scale_z, zero_point_z, axis_z = np.float32(0.8), None, None @mb.program( input_specs=[ mb.TensorSpec(shape=SHAPE, dtype=types.fp32), mb.TensorSpec(shape=SHAPE, dtype=types.fp32), ] ) def prog(x, y): # quantize input quantize_x = mb.quantize( input=x, scale=scale_x, zero_point=zero_point_x, axis=axis_x, output_dtype="uint8" ) quantize_y = mb.quantize( input=y, scale=scale_y, zero_point=zero_point_y, axis=axis_y, output_dtype="uint8" ) # quantized binary op dequantize_x = mb.dequantize( input=quantize_x, scale=scale_x, zero_point=zero_point_x, axis=axis_x ) dequantize_y = mb.dequantize( input=quantize_y, scale=scale_y, zero_point=zero_point_y, axis=axis_y ) z = mb.add(x=dequantize_x, y=dequantize_y) quantize_z = mb.quantize( input=z, scale=scale_z, zero_point=zero_point_z, axis=axis_z, output_dtype="uint8" ) # dequantize output dequantize_z = mb.dequantize( input=quantize_z, scale=scale_z, zero_point=zero_point_z, axis=axis_z ) return dequantize_z prev_prog, _, _ = apply_pass_and_basic_check( prog, "common::distributive_quantized_binary_op_scale_normalization" ) scale_prev_dequantize_x = prev_prog.find_ops(op_type="dequantize")[0].scale.val scale_prev_dequantize_y = prev_prog.find_ops(op_type="dequantize")[1].scale.val scale_prev_quantize_z = prev_prog.find_ops(op_type="quantize")[-1].scale.val assert np.all(scale_prev_dequantize_x == scale_x) assert np.all(scale_prev_dequantize_y == scale_y) assert np.all(scale_prev_quantize_z == scale_z) scale_dequantize_x = prog.find_ops(op_type="dequantize")[0].scale.val scale_dequantize_y = prog.find_ops(op_type="dequantize")[1].scale.val scale_quantize_z = prog.find_ops(op_type="quantize")[-1].scale.val assert np.all(scale_dequantize_x == scale_x) assert np.all(scale_dequantize_y == scale_y) assert np.all(scale_quantize_z == scale_z) @pytest.mark.parametrize("input_rank", (1, 2, 5)) def test_skip_2_vector_scales(self, input_rank): """ Input graph: x -> quantize(scale_x) -> dequantize(scale_x) -| |-> add -> dequantize(scale_z) -> output y -> quantize(scale_y) -> dequantize(scale_y) -| Nothing changes when both scale_x and scale_y are vectors """ # axis_x and axis_y are both present SHAPE = np.random.randint(1, 5, size=input_rank, dtype=np.int32) scale_x, zero_point_x, axis_x = self.generate_random_quantization_params( np.float16, np.uint8, SHAPE, False, True ) scale_y, zero_point_y, axis_y = self.generate_random_quantization_params( np.float16, np.uint8, SHAPE, False, True ) scale_z, zero_point_z, axis_z = self.generate_random_quantization_params( np.float16, np.uint8, SHAPE, False, False ) @mb.program( input_specs=[ mb.TensorSpec(shape=SHAPE, dtype=types.fp16), mb.TensorSpec(shape=SHAPE, dtype=types.fp16), ] ) def 
prog(x, y): # quantize input quantize_x = mb.quantize( input=x, scale=scale_x, zero_point=zero_point_x, axis=axis_x, output_dtype="uint8" ) quantize_y = mb.quantize( input=y, scale=scale_y, zero_point=zero_point_y, axis=axis_y, output_dtype="uint8" ) # quantized binary op dequantize_x = mb.dequantize( input=quantize_x, scale=scale_x, zero_point=zero_point_x, axis=axis_x ) dequantize_y = mb.dequantize( input=quantize_y, scale=scale_y, zero_point=zero_point_y, axis=axis_y ) z = mb.add(x=dequantize_x, y=dequantize_y) quantize_z = mb.quantize( input=z, scale=scale_z, zero_point=zero_point_z, axis=axis_z, output_dtype="uint8" ) # dequantize output dequantize_z = mb.dequantize( input=quantize_z, scale=scale_z, zero_point=zero_point_z, axis=axis_z ) return dequantize_z prev_prog, _, _ = apply_pass_and_basic_check( prog, "common::distributive_quantized_binary_op_scale_normalization" ) scale_prev_dequantize_x = prev_prog.find_ops(op_type="dequantize")[0].scale.val scale_prev_dequantize_y = prev_prog.find_ops(op_type="dequantize")[1].scale.val scale_prev_quantize_z = prev_prog.find_ops(op_type="quantize")[-1].scale.val assert np.all(scale_prev_dequantize_x == scale_x) assert np.all(scale_prev_dequantize_y == scale_y) assert np.all(scale_prev_quantize_z == scale_z) scale_dequantize_x = prog.find_ops(op_type="dequantize")[0].scale.val scale_dequantize_y = prog.find_ops(op_type="dequantize")[1].scale.val scale_quantize_z = prog.find_ops(op_type="quantize")[-1].scale.val assert np.all(scale_dequantize_x == scale_x) assert np.all(scale_dequantize_y == scale_y) assert np.all(scale_quantize_z == scale_z) class TestDequantizeToConstexpr: @pytest.mark.parametrize( "float_dtype, quant_dtype, is_scalar, is_zp_present", itertools.product( (np.float32, np.float16), (np.int8, np.uint8), (True, False), (True, False), ), ) def test_dequantize_const_to_constexpr( self, float_dtype, quant_dtype, is_scalar, is_zp_present ): """ Input graph: input -> dequantize -> output Output graph: input -> constexpr_affine_dequantize -> output """ @mb.program(input_specs=[]) def prog(): y = None if is_scalar: if is_zp_present: y = mb.dequantize( input=np.array([10, 11], dtype=quant_dtype), scale=float_dtype(0.1), zero_point=quant_dtype(2), ) else: y = mb.dequantize( input=np.array([13, 14, 15], dtype=quant_dtype), scale=float_dtype(0.2) ) else: if is_zp_present: y = mb.dequantize( input=np.array([[10, 11], [12, 13], [14, 15]], dtype=quant_dtype), scale=np.array([0.1, 0.2, 0.3], dtype=float_dtype), zero_point=np.array([6, 7, 8], dtype=quant_dtype), axis=0, ) else: y = mb.dequantize( input=np.array([[19, 20, 21], [22, 23, 24]], dtype=quant_dtype), scale=np.array([0.4, 0.5, 0.6], dtype=float_dtype), axis=1, ) return y assert get_op_types_in_program(prog) == ["dequantize"] dequantize_op = prog.find_ops(op_type="dequantize")[0] assert dequantize_op.outputs[0].val is None assert dequantize_op.can_materialize_val() apply_pass_and_basic_check(prog, "common::dequantize_to_constexpr") assert get_op_types_in_program(prog) == ["constexpr_affine_dequantize"] @pytest.mark.parametrize( "float_dtype, quant_dtype, is_scalar, is_zp_present", itertools.product( (np.float32, np.float16), (np.int8, np.uint8), (True, False), (True, False), ), ) def test_dequantize_variable_unchanged( self, float_dtype, quant_dtype, is_scalar, is_zp_present ): """ Input graph: input -> dequantize -> output Output graph: input -> dequantize -> output """ if is_scalar: if is_zp_present: @mb.program( input_specs=[ mb.TensorSpec( shape=(1, 2, 3, 4, 5), 
dtype=numpy_type_to_builtin_type(quant_dtype) ) ] ) def prog(x): y = mb.dequantize(input=x, scale=float_dtype(0.1), zero_point=quant_dtype(1)) return y else: @mb.program( input_specs=[ mb.TensorSpec( shape=(4, 3, 2, 1), dtype=numpy_type_to_builtin_type(quant_dtype) ) ] ) def prog(x): y = mb.dequantize(input=x, scale=float_dtype(0.2)) return y else: if is_zp_present: @mb.program( input_specs=[ mb.TensorSpec(shape=(3, 2), dtype=numpy_type_to_builtin_type(quant_dtype)) ] ) def prog(x): y = mb.dequantize( input=x, scale=np.array([0.1, 0.2, 0.3], dtype=float_dtype), zero_point=np.array([1, 2, 3], dtype=quant_dtype), axis=0, ) return y else: @mb.program( input_specs=[ mb.TensorSpec(shape=(2, 3), dtype=numpy_type_to_builtin_type(quant_dtype)) ] ) def prog(x): y = mb.dequantize( input=x, scale=np.array([0.4, 0.5, 0.6], dtype=float_dtype), axis=1, ) return y assert get_op_types_in_program(prog) == ["dequantize"] prev_prog, prev_block, block = apply_pass_and_basic_check( prog, "common::dequantize_to_constexpr" ) assert get_op_types_in_program(prog) == ["dequantize"] @pytest.mark.skipif(ct.utils._macos_version() < (15, 0), reason="Only supported on macOS 15+") class TestReorderLutPerChannelScale: @staticmethod def _verify_numerical(prev_prog, prog, block, input_shape, rtol=1e-7, atol=0.0): # Verify the numerical output matches before and after the reordering. prev_model = ct.convert( prev_prog, pass_pipeline=ct.PassPipeline.EMPTY, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS18, ) model = ct.convert( prog, pass_pipeline=ct.PassPipeline.EMPTY, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS18, ) output_name = block.outputs[0].name x_val = np.random.rand(*input_shape).astype(np.float16) input_dict = {"x": x_val} prev_output = prev_model.predict(input_dict)[output_name] output = model.predict(input_dict)[output_name] np.testing.assert_allclose(prev_output, output, rtol=rtol, atol=atol) @staticmethod def _get_lut_pcs_weight(shape: Tuple[int, ...], nbits=4, scale_axis: int = 0): """Get a specific shape of weight produced by lut with per-channel-scale (pcs).""" num_palette = 2**nbits np_dtype = types.nptype_from_builtin(types.string_to_builtin(f"uint{nbits}")) indices = np.arange(np.prod(shape)).reshape(shape).astype(np_dtype) lut_shape = shape + (num_palette, 1) lut = np.arange(np.prod(lut_shape)).reshape(lut_shape).astype(np.float16) lut_op = mb.constexpr_lut_to_dense(indices=indices, lut=lut) scale_shape = [1] * len(shape) scale_shape[scale_axis] = shape[scale_axis] scale_shape = tuple(scale_shape) scale_val = np.arange(1, np.prod(scale_shape) + 1).reshape(scale_shape).astype(np.float16) return mb.constexpr_blockwise_shift_scale( data=lut_op, scale=scale_val, ) @pytest.mark.parametrize( "input_shape, has_bias", itertools.product([(4, 3), (2, 3, 2), (1, 2, 3, 4)], [True, False]) ) def test_reorder_scale_linear(self, input_shape: Tuple[int, ...], has_bias: bool): @mb.program( input_specs=[mb.TensorSpec(shape=input_shape, dtype=types.fp16)], opset_version=ct.target.iOS18, ) def prog(x): scaled_weight = self._get_lut_pcs_weight((2, input_shape[-1])) bias = np.array([20, 50], dtype=np.float16) if has_bias else None output = mb.linear(x=x, weight=scaled_weight, bias=bias) return mb.add(x=output, y=np.float16(1.0)) prev_prog, _, block = apply_pass_and_basic_check( prog, "common::reorder_lut_per_channel_scale", skip_essential_scope_check=True ) assert get_op_types_in_program(prev_prog) == [ "constexpr_lut_to_dense", "constexpr_blockwise_shift_scale", "linear", "add", ] assert 
get_op_types_in_program(prog) == ["constexpr_lut_to_dense", "linear", "mul", "add"] self._verify_numerical(prev_prog, prog, block, input_shape) @pytest.mark.parametrize( "use_y_as_weight, transpose_x, transpose_y", itertools.product([True, False], [True, False], [True, False]), ) def test_reorder_scale_matmul(self, use_y_as_weight, transpose_x, transpose_y): input_shape = (3, 4) @mb.program( input_specs=[mb.TensorSpec(shape=input_shape, dtype=types.fp16)], opset_version=ct.target.iOS18, ) def prog(x): if use_y_as_weight: if transpose_x: # x shape is (4, 3) weight_shape = (2, 3) if transpose_y else (3, 2) else: # x shape is (3, 4) weight_shape = (2, 4) if transpose_y else (4, 2) scaled_weight = self._get_lut_pcs_weight( weight_shape, scale_axis=0 if transpose_y else 1 ) output = mb.matmul( x=x, y=scaled_weight, transpose_x=transpose_x, transpose_y=transpose_y ) else: if transpose_y: # y shape is (4, 3) weight_shape = (4, 2) if transpose_x else (2, 4) else: # y shape is (3, 4) weight_shape = (3, 2) if transpose_x else (2, 3) scaled_weight = self._get_lut_pcs_weight( weight_shape, scale_axis=1 if transpose_x else 0 ) output = mb.matmul( x=scaled_weight, y=x, transpose_x=transpose_x, transpose_y=transpose_y ) return mb.add(x=output, y=np.float16(1.0)) prev_prog, _, block = apply_pass_and_basic_check( prog, "common::reorder_lut_per_channel_scale", skip_essential_scope_check=True ) assert get_op_types_in_program(prev_prog) == [ "constexpr_lut_to_dense", "constexpr_blockwise_shift_scale", "matmul", "add", ] assert get_op_types_in_program(prog) == ["constexpr_lut_to_dense", "matmul", "mul", "add"] self._verify_numerical(prev_prog, prog, block, input_shape) @pytest.mark.parametrize( "pad_type, has_bias, has_strides_dilations", itertools.product(["valid", "same", "same_lower", "custom"], [True, False], [True, False]), ) def test_reorder_scale_conv(self, pad_type, has_bias, has_strides_dilations): input_shape = (4, 3, 4, 3) @mb.program( input_specs=[mb.TensorSpec(shape=input_shape, dtype=types.fp16)], opset_version=ct.target.iOS18, ) def prog(x): scaled_weight = self._get_lut_pcs_weight((2, 3, 2, 2), nbits=6) bias = np.array([20, 50], dtype=np.float16) if has_bias else None pad = [1, 1, 1, 1] if pad_type == "custom" else None strides = [1, 2] if has_strides_dilations else None dilations = [1, 2] if has_strides_dilations else None output = mb.conv( x=x, weight=scaled_weight, strides=strides, pad_type=pad_type, pad=pad, dilations=dilations, bias=bias, ) return mb.add(x=output, y=np.float16(1.0)) prev_prog, _, block = apply_pass_and_basic_check( prog, "common::reorder_lut_per_channel_scale", skip_essential_scope_check=True ) assert get_op_types_in_program(prev_prog) == [ "constexpr_lut_to_dense", "constexpr_blockwise_shift_scale", "conv", "add", ] assert get_op_types_in_program(prog) == ["constexpr_lut_to_dense", "conv", "mul", "add"] self._verify_numerical(prev_prog, prog, block, input_shape) @pytest.mark.parametrize( "input_shape, has_bias", itertools.product([(4, 3), (2, 3, 2), (1, 2, 3, 4)], [True, False]) ) def test_reorder_multiple_usages(self, input_shape: Tuple[int, ...], has_bias: bool): """The scaled weight is used by multiple ops.""" @mb.program( input_specs=[mb.TensorSpec(shape=input_shape, dtype=types.fp16)], opset_version=ct.target.iOS18, ) def prog(x): scaled_weight = self._get_lut_pcs_weight((2, input_shape[-1])) bias = np.array([20, 50], dtype=np.float16) if has_bias else None linear_output = mb.linear(x=x, weight=scaled_weight, bias=bias) matmul_output = mb.matmul(x=x, y=scaled_weight, 
transpose_x=False, transpose_y=True) return mb.add(x=linear_output, y=matmul_output) prev_prog, _, block = apply_pass_and_basic_check( prog, "common::reorder_lut_per_channel_scale", skip_essential_scope_check=True ) assert get_op_types_in_program(prev_prog) == [ "constexpr_lut_to_dense", "constexpr_blockwise_shift_scale", "linear", "matmul", "add", ] assert get_op_types_in_program(prog) == [ "constexpr_lut_to_dense", "linear", "mul", "matmul", "mul", "add", ] self._verify_numerical(prev_prog, prog, block, input_shape) def test_reorder_not_happen(self): """The scale won't be moved when the scaled weight is used in unsupported ops.""" @mb.program( input_specs=[mb.TensorSpec(shape=(4, 16), dtype=types.fp16)], opset_version=ct.target.iOS18, ) def prog(x): scaled_weight = self._get_lut_pcs_weight((2, 16)) linear_output1 = mb.linear(x=x, weight=scaled_weight) add_out = mb.add(x=scaled_weight, y=np.float16(1.0)) linear_output2 = mb.linear(x=x, weight=add_out) return mb.add(x=linear_output1, y=linear_output2) prev_prog, _, block = apply_pass_and_basic_check( prog, "common::reorder_lut_per_channel_scale", skip_essential_scope_check=True ) assert get_op_types_in_program(prog) == get_op_types_in_program(prev_prog) @pytest.mark.skipif(ct.utils._macos_version() < (15, 0), reason="Only supported on macOS 15+") class TestReorderQuantizedLut: @staticmethod def _verify_numerical(prev_prog, prog, block, input_shape, rtol=1e-7, atol=0.0): # Verify the numerical output matches between `prev_prog` and `prog`. prev_model = ct.convert( prev_prog, pass_pipeline=ct.PassPipeline.EMPTY, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS18, ) model = ct.convert( prog, pass_pipeline=ct.PassPipeline.EMPTY, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS18, ) output_name = block.outputs[0].name x_val = np.random.rand(*input_shape).astype(np.float16) input_dict = {"x": x_val} prev_output = prev_model.predict(input_dict)[output_name] output = model.predict(input_dict)[output_name] np.testing.assert_allclose(prev_output, output, rtol=rtol, atol=atol) @staticmethod def _construct_weights_with_two_orders(weight_shape: Tuple[int, ...]): """Construct two quantized lut weights, represented in different quant/lut orders.""" nbits = 4 num_palette = 2**nbits indices_np_dtype = types.nptype_from_builtin(types.string_to_builtin(f"uint{nbits}")) indices = np.random.randint(low=0, high=num_palette, size=weight_shape).astype( indices_np_dtype ) lut_shape = weight_shape + (num_palette, 1) int8_lut = np.random.randint(low=0, high=6, size=lut_shape, dtype=np.int8) scale = np.float16(2.0).reshape([1] * len(weight_shape)) offset = np.int8(1).reshape([1] * len(weight_shape)) lut_weight1 = mb.constexpr_lut_to_dense(indices=indices, lut=int8_lut) quantized_lut_weight1 = mb.constexpr_blockwise_shift_scale( data=lut_weight1, scale=scale, offset=offset ) quantized_weight2 = mb.constexpr_blockwise_shift_scale( data=int8_lut, scale=scale.reshape([1] * len(int8_lut.shape)), offset=offset.reshape([1] * len(int8_lut.shape)), ) quantized_lut_weight2 = mb.constexpr_lut_to_dense(indices=indices, lut=quantized_weight2) return quantized_lut_weight1, quantized_lut_weight2 @pytest.mark.parametrize( "input_shape, dequant_first", itertools.product([(4, 3), (2, 3, 4)], [True, False]) ) def test_dequant_first(self, input_shape, dequant_first): """ When dequant_first is True, the quantized lut ops representation will be reordered to follow lut(int8) -> constexpr_blockwise_shift_scale -> lut(fp16) -> constexpr_lut_to_dense -> 
dense(fp16). When dequant_first is False, the quantized lut ops representation will be reordered to follow lut(int8) -> constexpr_lut_to_dense -> dense(int8) -> constexpr_blockwise_shift_scale -> dense(fp16) """ @mb.program( input_specs=[mb.TensorSpec(shape=input_shape, dtype=types.fp16)], opset_version=ct.target.iOS18, ) def prog(x): quantized_lut_weight1, quantized_lut_weight2 = self._construct_weights_with_two_orders( weight_shape=(8, input_shape[-1]) ) output1 = mb.linear(x=x, weight=quantized_lut_weight1) output2 = mb.linear(x=x, weight=quantized_lut_weight2) return mb.add(x=output1, y=output2) from unittest import mock from coremltools.converters.mil.mil.passes.defs.optimize_quantization import ( canonicalize_quantized_lut_pattern, ) with mock.patch.object(canonicalize_quantized_lut_pattern, "_DEQUANT_FIRST", dequant_first): prev_prog, _, block = apply_pass_and_basic_check( prog, "common::canonicalize_quantized_lut_pattern", skip_essential_scope_check=True ) assert get_op_types_in_program(prev_prog) == [ "constexpr_lut_to_dense", "constexpr_blockwise_shift_scale", "constexpr_blockwise_shift_scale", "constexpr_lut_to_dense", "linear", "linear", "add", ] dequant_ops = prog.functions["main"].find_ops(op_type="constexpr_blockwise_shift_scale") lut_ops = prog.functions["main"].find_ops(op_type="constexpr_lut_to_dense") assert len(dequant_ops) == 2 assert len(lut_ops) == 2 if dequant_first: for dequant_op in dequant_ops: assert dequant_op.outputs[0].child_ops[0].op_type == "constexpr_lut_to_dense" for lut_op in lut_ops: assert lut_op.outputs[0].child_ops[0].op_type == "linear" else: for lut_op in lut_ops: assert lut_op.outputs[0].child_ops[0].op_type == "constexpr_blockwise_shift_scale" for dequant_op in dequant_ops: assert dequant_op.outputs[0].child_ops[0].op_type == "linear" self._verify_numerical(prev_prog, prog, block, input_shape) class TestFP16CastTransform: def assertEqual(self, first, second): """A convenience method to migrate from unittest (self.assertEqual) to pytest.""" assert first == second def test_single_input_to_single_operation(self): """ Input graph: input -> square -> output Output graph: input -> cast(fp32->fp16) -> square -> cast(fp16->fp32) -> output """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20))]) def prog(x): x = mb.square(x=x) return x self.assertEqual(get_op_types_in_program(prog), ["square"]) apply_pass_and_basic_check( prog, quantization.FP16ComputePrecision(op_selector=lambda op: True) ) _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") self.assertEqual(get_op_types_in_program(prog), ["cast", "square", "cast"]) # Asserting first cast configuration cast_1 = block.find_ops(op_type="cast")[0] self.assertEqual(cast_1.dtype.val, "fp16") self.assertEqual(len(cast_1.outputs), 1) self.assertEqual(len(cast_1.outputs[0].child_ops), 1) self.assertEqual(cast_1.outputs[0].child_ops[0].op_type, "square") # Asserting second cast configuration cast_2 = block.find_ops(op_type="cast")[1] self.assertEqual(cast_2.dtype.val, "fp32") self.assertEqual(len(cast_2.outputs), 1) self.assertEqual(len(cast_2.outputs[0].child_ops), 0) assert_model_is_valid( prog, {"x": (10, 20)}, expected_output_shapes={block.outputs[0].name: (10, 20)}, ) @parameterized.parameterized.expand([[1.0], [-1.0]]) def test_inf(self, sign): """ Input graph: input -> add(±2e38) -> tanh -> output Output graph: input -> cast(fp32->fp16) -> add(±inf) -> tanh -> cast(fp16->fp32) -> output """ SHAPE = (2, 3) @mb.program(input_specs=[mb.TensorSpec(shape=SHAPE)]) def prog(x): y = 
mb.add(x=x, y=np.float32(sign * 2e38)) z = mb.tanh(x=y) return z assert get_op_types_in_program(prog) == ["add", "tanh"] prev_prog, _, _ = apply_pass_and_basic_check(prog, "common::add_fp16_cast") apply_pass_and_basic_check(prog, "common::cast_optimization") apply_pass_and_basic_check(prog, "common::const_elimination") _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == ["cast", "add", "tanh", "cast"] cast_to_fp16, cast_to_fp32 = prog.find_ops(op_type="cast") assert cast_to_fp16.dtype.val == "fp16" assert cast_to_fp32.dtype.val == "fp32" output_name = block.outputs[0].name assert_model_is_valid(prog, {"x": SHAPE}, expected_output_shapes={output_name: SHAPE}) prev_model = ct.convert(prev_prog) model = ct.convert(prog) x = 65500.0 * np.random.rand(*SHAPE) prev_output = prev_model.predict({"x": x})[output_name] output = model.predict({"x": x})[output_name] np.allclose(prev_output, output) def test_fp16_overflow(self): """ Input graph: input -> clip(-77777, 88888) -> output Nothing gets changed due to fp16 overflow """ SHAPE = (2, 1, 3, 7, 5) @mb.program(input_specs=[mb.TensorSpec(shape=SHAPE)]) def prog(x): y = mb.clip(x=x, alpha=np.float32(-77777), beta=np.float32(88888)) return y assert get_op_types_in_program(prog) == ["clip"] apply_pass_and_basic_check(prog, "common::add_fp16_cast") assert get_op_types_in_program(prog) == ["clip"] def test_divide_by_zero_operation(self): """ Input graph: input ------| |-> div -> output const(eps) -| Output graph: input ------> cast(fp32->fp16) -| |-> div -> cast(fp16->fp32) -> output const(eps) -> cast(fp32->fp16) -| """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20))]) def prog(x): eps = mb.const(val=1e-10) x = mb.real_div(x=x, y=eps) return x prev_prog, prev_block, block = apply_pass_and_basic_check( prog, quantization.FP16ComputePrecision(op_selector=lambda op: True) ) mlmodel = ct.convert(prog, compute_units=ct.ComputeUnit.CPU_ONLY) input_dict = {"x": np.random.rand(10, 20) * 1e-3} if _IS_MACOS: prediction = mlmodel.predict(input_dict) assert not np.isnan(prediction["real_div_0"]).any() assert np.isfinite(prediction["real_div_0"]).all() def test_multiple_inputs_to_single_operation(self): """ Input graph: input1 -| |-> concat -> output input2 -| Output graph: input1 -> cast(fp32->fp16) -| |-> concat -> cast(fp16->fp32) -> output input2 -> cast(fp32->fp16) -| """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20)), mb.TensorSpec(shape=(10, 20))]) def prog(x, y): x = mb.concat(values=(x, y), axis=0) return x self.assertEqual(get_op_types_in_program(prog), ["concat"]) apply_pass_and_basic_check( prog, quantization.FP16ComputePrecision(op_selector=lambda op: True) ) _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") self.assertEqual(get_op_types_in_program(prog), ["cast", "cast", "concat", "cast"]) # Asserting first cast configuration cast_1 = block.find_ops(op_type="cast")[0] self.assertEqual(cast_1.dtype.val, "fp16") self.assertEqual(len(cast_1.outputs), 1) self.assertEqual(len(cast_1.outputs[0].child_ops), 1) self.assertEqual(cast_1.outputs[0].child_ops[0].op_type, "concat") # Asserting second cast configuration cast_2 = block.find_ops(op_type="cast")[1] self.assertEqual(cast_2.dtype.val, "fp16") self.assertEqual(len(cast_2.outputs), 1) self.assertEqual(len(cast_2.outputs[0].child_ops), 1) self.assertEqual(cast_2.outputs[0].child_ops[0].op_type, "concat") # Asserting third cast configuration cast_3 = block.find_ops(op_type="cast")[2] 
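# (Editorial note) cast_3 is the cast that restores the fp32 graph output after the
# fp16 concat, so it is expected to carry dtype "fp32" and to have no child ops.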
self.assertEqual(cast_3.dtype.val, "fp32") self.assertEqual(len(cast_3.outputs), 1) self.assertEqual(len(cast_3.outputs[0].child_ops), 0) assert_model_is_valid( prog, {"x": (10, 20), "y": (10, 20)}, expected_output_shapes={block.outputs[0].name: (20, 20)}, ) def test_multiple_outputs_from_single_operation(self): """ Input graph: |-> output_1 input -> split -| |-> output_2 Output graph: |-> cast(fp16->fp32) -> output_1 input -> cast(fp32->fp16) -> split -| |-> cast(fp16->fp32) -> output_2 """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20))]) def prog(x): x = mb.split(x=x, axis=0, num_splits=2) return x self.assertEqual(get_op_types_in_program(prog), ["split"]) apply_pass_and_basic_check( prog, quantization.FP16ComputePrecision(op_selector=lambda op: True) ) _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") self.assertEqual(get_op_types_in_program(prog), ["cast", "split", "cast", "cast"]) # Asserting first cast configuration cast_1 = block.find_ops(op_type="cast")[0] self.assertEqual(cast_1.dtype.val, "fp16") self.assertEqual(len(cast_1.outputs), 1) self.assertEqual(len(cast_1.outputs[0].child_ops), 1) self.assertEqual(cast_1.outputs[0].child_ops[0].op_type, "split") # Asserting second cast configuration cast_2 = block.find_ops(op_type="cast")[1] self.assertEqual(cast_2.dtype.val, "fp32") self.assertEqual(len(cast_2.outputs), 1) self.assertEqual(len(cast_2.outputs[0].child_ops), 0) # Asserting third cast configuration cast_3 = block.find_ops(op_type="cast")[2] self.assertEqual(cast_3.dtype.val, "fp32") self.assertEqual(len(cast_3.outputs), 1) self.assertEqual(len(cast_3.outputs[0].child_ops), 0) assert_model_is_valid( prog, {"x": (10, 20)}, expected_output_shapes={block.outputs[0].name: (5, 20), block.outputs[1].name: (5, 20)}, ) def test_single_input_to_multiple_operations(self): """ Input graph: |-> square -> output_1 input -| |-> relu -> output_2 Output graph: |-> square -> cast(fp16->fp32) -> output_1 input -> cast(fp32->fp16) -| |-> relu -> cast(fp16->fp32) -> output_2 """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20))]) def prog(x): y = mb.square(x=x) z = mb.relu(x=x) return y, z self.assertEqual(get_op_types_in_program(prog), ["square", "relu"]) apply_pass_and_basic_check( prog, quantization.FP16ComputePrecision(op_selector=lambda op: True) ) _, _, block = apply_pass_and_basic_check(prog, "common::dead_code_elimination") self.assertEqual(get_op_types_in_program(prog), ["cast", "square", "cast", "relu", "cast"]) # Asserting first cast configuration cast_1 = block.find_ops(op_type="cast")[0] self.assertEqual(cast_1.dtype.val, "fp16") self.assertEqual(len(cast_1.outputs), 1) self.assertEqual(len(cast_1.outputs[0].child_ops), 2) self.assertEqual(cast_1.outputs[0].child_ops[0].op_type, "square") self.assertEqual(cast_1.outputs[0].child_ops[1].op_type, "relu") # Asserting second cast configuration cast_2 = block.find_ops(op_type="cast")[1] self.assertEqual(cast_2.dtype.val, "fp32") self.assertEqual(len(cast_2.outputs), 1) self.assertEqual(len(cast_2.outputs[0].child_ops), 0) # Asserting third cast configuration cast_3 = block.find_ops(op_type="cast")[2] self.assertEqual(cast_3.dtype.val, "fp32") self.assertEqual(len(cast_3.outputs), 1) self.assertEqual(len(cast_3.outputs[0].child_ops), 0) assert_model_is_valid( prog, {"x": (10, 20)}, expected_output_shapes={ block.outputs[0].name: (10, 20), block.outputs[1].name: (10, 20), }, ) def test_duplicate_output_vars(self): """ Input graph: |-> output_1 input -> relu -| |-> output_2 Output graph: |-> 
output_1 input -> cast(fp32->fp16) -> relu -> cast(fp16->fp32) -| |-> output_2 """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2))]) def prog(x): relu1 = mb.relu(x=x) return relu1, relu1 _, _, block = apply_pass_and_basic_check( prog, quantization.FP16ComputePrecision(op_selector=lambda op: True) ) self.assertEqual(get_op_types_in_program(prog), ["cast", "relu", "cast"]) assert_model_is_valid( prog, {"x": (1, 2)}, expected_output_shapes={block.outputs[0].name: (1, 2), block.outputs[1].name: (1, 2)}, backend=("mlprogram", "fp16"), ) @pytest.mark.parametrize( "opset_version, op_name", itertools.product( [None, ct.target.iOS17], ["inverse", "log", "rsqrt"], ), ) def test_epsilon_mixed_precision(self, opset_version, op_name): """The IOS17+ elementwise unary ops with epsilon support mixed precision.""" @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))], opset_version=opset_version) def prog(x): return getattr(mb, op_name)(x=x, epsilon=0.1) _, _, block = apply_pass_and_basic_check(prog, "common::add_fp16_cast") expected_ops = ["cast", "cast", op_name, "cast"] if opset_version is not None and opset_version >= ct.target.iOS17: # Allow mixed precision, so the epsilon is not cast to fp16. expected_ops = ["cast", op_name, "cast"] assert get_op_types_in_program(prog) == expected_ops assert_model_is_valid( prog, {"x": (2, 3)}, expected_output_shapes={block.outputs[0].name: (2, 3)}, backend=("mlprogram", "fp16"), minimum_deployment_target=opset_version, ) class TestTransformFunctionSignatures: @staticmethod def test_empty(): """ Case where the input var is also a block output. """ # case 1 @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): return x graph_pass = add_fp16_cast() block = prog.functions["main"] graph_pass.transform_function_signatures(block) apply_pass_and_basic_check(prog, "common::dead_code_elimination") assert get_op_types_in_program(prog) == [] assert block.inputs["x"].dtype == types.fp16 assert len(block.outputs) == 1 assert block.outputs[0].dtype == types.fp16 assert block.outputs[0] is block.inputs["x"] # case 2 @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): return x, mb.relu(x=x), x, x graph_pass = add_fp16_cast() block = prog.functions["main"] graph_pass.transform_function_signatures(block) assert block.inputs["x"].dtype == types.fp16 assert len(block.outputs) == 4 assert block.outputs[0].dtype == types.fp16 assert block.outputs[2].dtype == types.fp16 assert block.outputs[3].dtype == types.fp16 assert block.outputs[1].dtype == types.fp32 assert block.outputs[0] is block.inputs["x"] assert block.outputs[2] is block.inputs["x"] assert block.outputs[3] is block.inputs["x"] assert all([x.dtype == types.fp16 for x in block.output_types]) assert get_op_types_in_program(prog) == ["cast", "relu"] cast_op = block.find_ops(op_type="cast")[0] assert cast_op.dtype.val == "fp32" @staticmethod def test_simple(): """ Input graph: input(fp32) -> relu -> output Output graph: input(fp16) -> cast(dtype="fp32") -> relu -> output, with function.output_types = [ct.TesorType(dtype=types.fp16)] """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): return mb.relu(x=x) graph_pass = add_fp16_cast() block = prog.functions["main"] graph_pass.transform_function_signatures(block) assert block.inputs["x"].dtype == types.fp16 assert get_op_types_in_program(prog) == ["cast", "relu"] cast_op = block.find_ops(op_type="cast")[0] assert cast_op.dtype.val == "fp32" assert len(block.outputs) == 1 assert block.outputs[0].dtype == types.fp32 assert 
len(block.output_types) == 1 assert block.output_types[0].dtype == types.fp16 @staticmethod def test_simple_2(): """ Input graph: input(fp32) -> identity -> cast(dtype="int32") -> output_1 | .-> output_2 Output graph: input(fp16) -> cast(dtype="fp32") -> identity -> cast(dtype="int32") -> output_1 | .-> output_2, with function.output_types = [ct.TesorType(dtype=types.int32), ct.TesorType(dtype=types.fp16)] """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): x = mb.identity(x=x) return mb.cast(x=x, dtype="int32"), x graph_pass = add_fp16_cast() block = prog.functions["main"] graph_pass.transform_function_signatures(block) assert block.inputs["x"].dtype == types.fp16 assert get_op_types_in_program(prog) == ["cast", "identity", "cast"] cast_ops = block.find_ops(op_type="cast") assert cast_ops[0].dtype.val == "fp32" assert cast_ops[1].dtype.val == "int32" assert len(block.outputs) == 2 assert block.outputs[0].dtype == types.int32 assert block.outputs[1].dtype == types.fp32 assert len(block.output_types) == 2 assert block.output_types[0].dtype == types.int32 assert block.output_types[1].dtype == types.fp16 class TestInt32CastToInt16: @pytest.mark.parametrize( "x_dtype, dynamic, has_neg, opset_version", itertools.product( [np.int32, np.float32], [True, False], [True, False], [ct.target.iOS15, ct.target.iOS16, ct.target.iOS17], ), ) def test_gather_int16_indices(self, x_dtype, dynamic, has_neg, opset_version): @mb.program(opset_version=opset_version) def prog_static(): params = np.array([[1, 2, 3], [4, 5, 6]], dtype=x_dtype) indices = np.array([-2, 0] if has_neg else [1, 0], dtype=np.int32) return mb.gather(x=params, indices=indices, axis=-1) @mb.program( [ mb.TensorSpec(shape=(2, 3), dtype=types.numpy_type_to_builtin_type(x_dtype)), mb.TensorSpec(shape=(2,), dtype=types.int32), ], opset_version=opset_version, ) def prog_dynamic(x, indices): return mb.gather(x=x, indices=indices, axis=0) prog = prog_dynamic if dynamic else prog_static assert get_op_types_in_program(prog) == ["gather"] prev_prog, _, block = apply_pass_and_basic_check(prog, "common::add_int16_cast") if opset_version <= ct.target.iOS16: # iOS15 gather op's ``indices`` doesn't support int16, so this pass doesn't have effect. # iOS16 cast op doesn't support int16, so this pass doesn't have effect. assert get_op_types_in_program(prog) == get_op_types_in_program(prev_prog) else: # When input ``x`` is float32, the output is also float32, so no cast for output. # When input ``x`` is int32 and cast to int16, the output will also be int16, so there # is another cast op to cast it back to int32. 
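            # With float32 ``x`` only the indices get a cast (to int16/uint16); with int32 ``x``
            # the data is cast to int16 as well, and a trailing cast restores the int32 output.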
expected_ops = ["cast", "gather"] if x_dtype == np.int32: expected_ops = ["cast", "cast", "gather", "cast"] assert get_op_types_in_program(prog) == expected_ops indices_cast_op_idx = 1 if x_dtype == np.int32 else 0 cast_op = block.find_ops(op_type="cast")[indices_cast_op_idx] assert cast_op.dtype.val == "int16" if has_neg else "uint16" assert len(cast_op.outputs) == 1 assert len(cast_op.outputs[0].child_ops) == 1 assert cast_op.outputs[0].child_ops[0].op_type == "gather" assert cast_op.outputs[0] == block.find_ops(op_type="gather")[0].indices if not dynamic: np.testing.assert_allclose( np.array([[2, 1], [5, 4]], dtype=np.float32), prog.functions["main"].find_ops(op_type="gather")[0].outputs[0].val, atol=1e-04, rtol=1e-05, ) def test_gather_int16_scalar_indices(self): @mb.program(input_specs=[], opset_version=ct.target.iOS17) def prog_static(): params = np.array([1, 2, 3, 4], dtype=np.int32) res = mb.gather(x=params, indices=0, axis=0, batch_dims=0, validate_indices=False) return res @mb.program( input_specs=[mb.TensorSpec(shape=(4,), dtype=types.int32)], opset_version=ct.target.iOS17, ) def prog_dynamic(x): return mb.gather(x=x, indices=0, axis=0) for prog in (prog_static, prog_dynamic): assert get_op_types_in_program(prog) == ["gather"] prev_prog, _, block = apply_pass_and_basic_check(prog, "common::add_int16_cast") expected_ops = ["cast", "cast", "gather", "cast"] assert get_op_types_in_program(prog) == expected_ops @pytest.mark.parametrize( "x_dtype, dynamic, has_neg, opset_version", itertools.product( [np.int32, np.float32], [True, False], [True, False], [ct.target.iOS15, ct.target.iOS16, ct.target.iOS17], ), ) def test_gather_along_axis_int16_indices(self, x_dtype, dynamic, has_neg, opset_version): @mb.program(opset_version=opset_version) def prog_static(): params = np.array([[1, 2, 3], [4, 5, 6]], dtype=x_dtype) indices = np.array( [[-2, 0, -2], [-2, -2, 0]] if has_neg else [[1, 0, 1], [1, 1, 0]], dtype=np.int32 ) return mb.gather_along_axis(x=params, indices=indices, axis=-1) @mb.program( [ mb.TensorSpec(shape=(2, 3), dtype=types.numpy_type_to_builtin_type(x_dtype)), mb.TensorSpec(shape=(2, 3), dtype=types.int32), ], opset_version=opset_version, ) def prog_dynamic(x, indices): return mb.gather_along_axis(x=x, indices=indices, axis=0) prog = prog_dynamic if dynamic else prog_static assert get_op_types_in_program(prog) == ["gather_along_axis"] prev_prog, _, block = apply_pass_and_basic_check(prog, "common::add_int16_cast") if opset_version <= ct.target.iOS16: # iOS15 gather op's ``indices`` doesn't support int16, so this pass doesn't have effect. # iOS16 cast op doesn't support int16, so this pass doesn't have effect. assert get_op_types_in_program(prog) == get_op_types_in_program(prev_prog) else: # When input ``x`` is float32, the output is also float32, so no cast for output. # When input ``x`` is int32 and cast to int16, the output will also be int16, so there # is another cast op to cast it back to int32. 
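            # Same cast layout as the gather test above, with gather_along_axis in the middle.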
expected_ops = ["cast", "gather_along_axis"] if x_dtype == np.int32: expected_ops = ["cast", "cast", "gather_along_axis", "cast"] assert get_op_types_in_program(prog) == expected_ops indices_cast_op_idx = 1 if x_dtype == np.int32 else 0 cast_op = block.find_ops(op_type="cast")[indices_cast_op_idx] assert cast_op.dtype.val == "int16" if has_neg else "uint16" assert len(cast_op.outputs) == 1 assert len(cast_op.outputs[0].child_ops) == 1 assert cast_op.outputs[0].child_ops[0].op_type == "gather_along_axis" assert cast_op.outputs[0] == block.find_ops(op_type="gather_along_axis")[0].indices if not dynamic: np.testing.assert_allclose( np.array([[2, 1, 2], [5, 5, 4]], dtype=np.float32), prog.functions["main"].find_ops(op_type="gather_along_axis")[0].outputs[0].val, atol=1e-04, rtol=1e-05, ) @pytest.mark.parametrize("overflow", [True, False]) def test_gather_dynamic_overflow_int16(self, overflow): """Dynamic input indices should also be cast if x dim size doesn't overflow int16 range.""" @mb.program( input_specs=[ mb.TensorSpec(shape=(32769 if overflow else 2, 3)), mb.TensorSpec(shape=(2,), dtype=types.int32), ], opset_version=ct.target.iOS17, ) def prog(x, indices): return mb.gather(x=x, indices=indices, axis=0) prev_prog, _, block = apply_pass_and_basic_check(prog, "common::add_int16_cast") if overflow: assert get_op_types_in_program(prog) == get_op_types_in_program(prev_prog) else: assert get_op_types_in_program(prog) == ["cast", "gather"] cast_op = block.find_ops(op_type="cast")[0] assert cast_op.dtype.val == "int16" assert cast_op.outputs[0] == block.find_ops(op_type="gather")[0].indices @pytest.mark.parametrize("overflow_uint16", [True, False]) def test_gather_static_overflow_int16(self, overflow_uint16): """Indices cannot be represented by int16 range, but might be represented by uint16.""" max_index = 65536 if overflow_uint16 else 32768 @mb.program(opset_version=ct.target.iOS17) def prog(): params = np.array([[1, 2]] * (max_index + 1), dtype=np.float32) indices = np.array([max_index, 0], dtype=np.int32) return mb.gather(x=params, indices=indices, axis=0) prev_prog, _, block = apply_pass_and_basic_check(prog, "common::add_int16_cast") if overflow_uint16: assert get_op_types_in_program(prog) == get_op_types_in_program(prev_prog) else: assert get_op_types_in_program(prog) == ["cast", "gather"] cast_op = block.find_ops(op_type="cast")[0] assert cast_op.dtype.val == "uint16" assert cast_op.outputs[0] == block.find_ops(op_type="gather")[0].indices @pytest.mark.parametrize( "dtype, opset_version", itertools.product( [types.int32, types.fp32], [ct.target.iOS15, ct.target.iOS16, ct.target.iOS17, ct.target.iOS18], ), ) def test_squeeze(self, dtype, opset_version): @mb.program( input_specs=[mb.TensorSpec(shape=(1, 1), dtype=dtype)], opset_version=opset_version, ) def prog(x): return mb.squeeze(x=x) prev_prog, _, block = apply_pass_and_basic_check(prog, "common::add_int16_cast") if opset_version < ct.target.iOS17: # Prior to iOS 17, `squeeze` does not support int16, so this pass has no effect assert get_op_types_in_program(prog) == get_op_types_in_program(prev_prog) else: if dtype == types.int32: # When `x` is int32, it will be cast to int16 then feed into `squeeze`, # then `squeeze(x)` will be cast back to int32 for output assert get_op_types_in_program(prog) == ["cast", "squeeze", "cast"] cast_int16, cast_int32 = block.find_ops(op_type="cast") assert cast_int16.dtype.val == "int16" assert cast_int32.dtype.val == "int32" assert cast_int16.outputs[0].child_ops[0].op_type == "squeeze" else: # When `x` 
is float, this int pass has no effect assert get_op_types_in_program(prog) == ["squeeze"] @patch( "coremltools.converters.mil.mil.passes.defs.quantization.add_int16_cast._PREFER_INT16_OPS", set(), ) def test_int16_no_effect(self): """After patching the pass, no op should be cast to int16""" @mb.program( input_specs=[mb.TensorSpec(shape=(2, 3)), mb.TensorSpec(shape=(2,), dtype=types.int32)], opset_version=ct.target.iOS17, ) def prog(x, indices): return mb.gather(x=x, indices=indices, axis=0) prev_prog, _, block = apply_pass_and_basic_check(prog, "common::add_int16_cast") assert get_op_types_in_program(prog) == get_op_types_in_program(prev_prog) @pytest.mark.skipif(not _HAS_TORCH, reason=MSG_TORCH_NOT_FOUND) @pytest.mark.parametrize( "compute_precision, num_embeddings, minimum_deployment_target, symbolic", itertools.product( [ct.precision.FLOAT16, ct.precision.FLOAT32], [10, 32769], [ct.target.iOS15, ct.target.iOS16, ct.target.iOS17], [True, False], ), ) def test_int16_embedding_e2e( self, compute_precision, num_embeddings, minimum_deployment_target, symbolic ): """End-to-end conversion from a torch embedding model.""" class EmbeddingModel(nn.Module): def __init__(self): super(EmbeddingModel, self).__init__() self.embedding = torch.nn.Embedding(num_embeddings=num_embeddings, embedding_dim=2) def forward(self, x): return self.embedding(x) input_data = np.random.randint(low=0, high=num_embeddings, size=(3, 5)) input_data = torch.from_numpy(input_data) model = EmbeddingModel() model.eval() traced_model = torch.jit.trace(model, input_data) input_shape = (ct.RangeDim(1, 32), ct.RangeDim(1, 32)) if symbolic else input_data.shape converted_model = ct.convert( traced_model, inputs=[ct.TensorType(shape=input_shape, name="input", dtype=np.int32)], convert_to="mlprogram", compute_precision=compute_precision, compute_units=ct.ComputeUnit.CPU_ONLY, minimum_deployment_target=minimum_deployment_target, ) prog = converted_model._mil_program # The embedding layer is lowered to `gather` op. expected_ops = ["gather"] if ( compute_precision == ct.precision.FLOAT16 and minimum_deployment_target < ct.target.iOS16 ): # Cast from fp16 to fp32 because fp16 is not supported in I/O before iOS16. expected_ops.append("cast") if ( minimum_deployment_target >= ct.target.iOS17 and compute_precision == ct.precision.FLOAT16 and num_embeddings <= np.iinfo(np.int16).max ): # The int16 cast only happens for iOS17+ with fp16 precision and there is no overflow. expected_ops.insert(0, "cast") cast_op = prog["main"].find_ops(op_type="cast")[0] assert cast_op.dtype.val == "int16" assert cast_op.outputs[0] == prog["main"].find_ops(op_type="gather")[0].indices assert get_op_types_in_program(prog) == expected_ops ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/tests/test_reduce_transposes_pass.py0000644000000000000000000023177514672066616031443 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import unittest import numpy as np import pytest from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import get_new_symbol from coremltools.converters.mil.mil.passes.defs.optimize_repeat_ops import TransformAxisUpdateOps from coremltools.converters.mil.mil.passes.pass_registry import PASS_REGISTRY from coremltools.converters.mil.testing_utils import ( apply_pass_and_basic_check, assert_model_is_valid, get_op_types_in_program, ) np.random.seed(1984) class TransposeOptimizationPass(unittest.TestCase): """ Input graph: input -----> transpose(axis=[1,0]) -----> transpose(axis=[1,0]) ---> out Output graph: input -----> identity -----> out """ def test_simple_consecutive_ops_fusion_direct_output(self): @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20))]) def prog(x): x = mb.transpose(x=x, perm=[1, 0]) x = mb.transpose(x=x, perm=[1, 0]) return x prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual(get_op_types_in_program(prev_prog), ["transpose", "transpose"]) self.assertEqual(get_op_types_in_program(prog), ["identity"]) assert_model_is_valid( prog, {"x": (10, 20)}, expected_output_shapes={block.outputs[0].name: (10, 20)}, ) """ Input graph: input -----> transpose(axis=[1,0]) -----> transpose(axis=[1,0]) ----> relu ---> out Output graph: input -----> relu -----> out """ def test_simple_consecutive_ops_fusion(self): @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20))]) def prog(x): x = mb.transpose(x=x, perm=[1, 0]) x = mb.transpose(x=x, perm=[1, 0]) x = mb.relu(x=x) return x prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual(get_op_types_in_program(prev_prog), ["transpose", "transpose", "relu"]) self.assertEqual(get_op_types_in_program(prog), ["relu"]) assert_model_is_valid( prog, {"x": (10, 20)}, expected_output_shapes={block.outputs[0].name: (10, 20)}, ) """ Input graph: input---->transpose(axis=[0,3,1,2])---->relu---->log--->transpose(axis=[0,2,3,1])--->relu--->out Output graph: input----->relu----->log----->relu--->out """ def test_linear_graph_two_op_fusion(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 3, 4))]) def prog(x): x = mb.transpose(x=x, perm=[0, 3, 1, 2]) x = mb.relu(x=x) x = mb.log(x=x) x = mb.transpose(x=x, perm=[0, 2, 3, 1]) x = mb.relu(x=x) return x prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "relu", "log", "transpose", "relu"], ) self.assertEqual(get_op_types_in_program(prog), ["relu", "log", "relu"]) assert_model_is_valid( prog, {"x": (1, 2, 3, 4)}, expected_output_shapes={block.outputs[0].name: (1, 2, 3, 4)}, ) """ Input graph: input---->transpose(axis=[0,3,1,2])---->relu---->identity--->transpose(axis=[0,2,3,1])--->relu--->out Output graph: input----->relu----->identity----->relu--->out """ def test_linear_graph_two_op_fusion_1(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 3, 4))]) def prog(x): x = mb.transpose(x=x, perm=[0, 3, 1, 2]) x = mb.relu(x=x) x = mb.identity(x=x) x = mb.transpose(x=x, perm=[0, 2, 3, 1]) x = mb.relu(x=x) return x prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "relu", "identity", 
"transpose", "relu"], ) self.assertEqual(get_op_types_in_program(prog), ["relu", "identity", "relu"]) assert_model_is_valid( prog, {"x": (1, 2, 3, 4)}, expected_output_shapes={block.outputs[0].name: (1, 2, 3, 4)}, ) """ Input graph: input(shape=1,2,3,4)---->transpose(axis=[0,3,1,2])---->relu---->log--->transpose(axis=[0,2,3,1])--->relu--->out1(shape=1,2,3,4) | v out2(shape=1,4,2,3) Output graph: input(shape=1,2,3,4)---->relu---->log--->relu--->out1(shape=1,2,3,4) | |----->transpose(axis=[0,3,1,2])----->out2(shape=1,4,2,3) """ def test_fusion_with_output_edge_inbetween(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 3, 4))]) def prog(x): x = mb.transpose(x=x, perm=[0, 3, 1, 2]) x1 = mb.relu(x=x) x2 = mb.log(x=x1) x3 = mb.transpose(x=x2, perm=[0, 2, 3, 1]) x4 = mb.relu(x=x3) return x4, x1 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "relu", "log", "transpose", "relu"], ) self.assertEqual(get_op_types_in_program(prog), ["relu", "log", "relu", "transpose"]) assert_model_is_valid( prog, {"x": (1, 2, 3, 4)}, expected_output_shapes={ block.outputs[0].name: (1, 2, 3, 4), block.outputs[1].name: (1, 4, 2, 3), }, ) """ Input graph: input---->transpose(axis=[0,3,1,2])---->relu---->transpose(axis=[0,2,3,1])--->out Output graph: input----->relu----->out """ def test_linear_graph_two_op_fusion_with_last_op_removal(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 3, 4))]) def prog(x): x = mb.transpose(x=x, perm=[0, 3, 1, 2]) x = mb.relu(x=x) x = mb.transpose(x=x, perm=[0, 2, 3, 1]) return x prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual(get_op_types_in_program(prev_prog), ["transpose", "relu", "transpose"]) self.assertEqual(get_op_types_in_program(prog), ["relu"]) assert_model_is_valid( prog, {"x": (1, 2, 3, 4)}, expected_output_shapes={block.outputs[0].name: (1, 2, 3, 4)}, ) """ Input graph: input(shape=10,2,3)--->transpose(axis=[0,2,1])----->relu---->transpose(axis=[0,2,1])---->out1 | | --->relu----->log---->transpose(axis=[0,2,1])---->out2 Output graph: input(shape=10,2,3)----->relu---->out1 | | --->relu----->log---->out2 """ def test_multiple_fusions(self): @mb.program(input_specs=[mb.TensorSpec(shape=(10, 2, 3))]) def prog(x): x = mb.transpose(x=x, perm=[0, 2, 1]) x1 = mb.relu(x=x) x2 = mb.relu(x=x) y1 = mb.transpose(x=x1, perm=[0, 2, 1]) x3 = mb.log(x=x2) y2 = mb.transpose(x=x3, perm=[0, 2, 1]) return y1, y2 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "relu", "relu", "transpose", "log", "transpose"], ) self.assertEqual(get_op_types_in_program(prog), ["relu", "relu", "log"]) assert prev_block.inputs["x"] == prev_block.find_ops(op_type="transpose")[0].inputs["x"] assert block.find_ops(op_type="log")[0].outputs[0] in block.outputs assert_model_is_valid( prog, {"x": (10, 2, 3)}, expected_output_shapes={ block.outputs[0].name: (10, 2, 3), block.outputs[1].name: (10, 2, 3), }, ) """ Input graph: input(shape=10,2,3,5)--->transpose(axis=[0,2,3,1])----->relu---->pool----->out1 | | --->relu----->log---->transpose(axis=[0,3,1,2])---->out2 Output graph: input(shape=10,2,3,5)----->relu---->transpose(axis=[0,2,3,1])---->pool----->out1 | | --->relu----->log---->out2 """ def test_partial_fusion_0(self): @mb.program(input_specs=[mb.TensorSpec(shape=(10, 2, 3, 5))]) def prog(x): x = mb.transpose(x=x, 
perm=[0, 2, 3, 1]) x1 = mb.relu(x=x) x2 = mb.relu(x=x) y1 = mb.avg_pool(x=x1, kernel_sizes=[1, 1], strides=[1, 1], pad_type="valid") x3 = mb.log(x=x2) y2 = mb.transpose(x=x3, perm=[0, 3, 1, 2]) return y1, y2 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "relu", "relu", "avg_pool", "log", "transpose"], ) self.assertEqual( get_op_types_in_program(prog), ["relu", "relu", "transpose", "avg_pool", "log"], ) assert prev_block.inputs["x"] == prev_block.find_ops(op_type="transpose")[0].inputs["x"] assert block.find_ops(op_type="log")[0].outputs[0] == block.outputs[1] assert ( block.find_ops(op_type="transpose")[0].outputs[0] == block.find_ops(op_type="avg_pool")[0].inputs["x"] ) assert list(block.find_ops(op_type="transpose")[0].perm.val) == [0, 2, 3, 1] assert_model_is_valid( prog, {"x": (10, 2, 3, 5)}, expected_output_shapes={ block.outputs[0].name: (10, 3, 5, 2), block.outputs[1].name: (10, 2, 3, 5), }, ) """ Input graph: input(shape=10,2,3,5)--->transpose(axis=[0,2,1,3])----->relu---->transpose(axis=[0,2,1,3])---->out1 | | --->pool--->log---->transpose(axis=[0,2,1,3])---->out2 Output graph: input(shape=10,2,3,5)----->relu---->out1 | | --->transpose(axis=[0,2,1,3])---->pool----->log---->transpose(axis=[0,2,1,3])---->out2 """ def test_partial_fusion_1(self): @mb.program(input_specs=[mb.TensorSpec(shape=(10, 2, 3, 5))]) def prog(x): x = mb.transpose(x=x, perm=[0, 2, 1, 3]) x1 = mb.relu(x=x) x2 = mb.avg_pool(x=x, kernel_sizes=[1, 1], strides=[1, 1], pad_type="valid") y1 = mb.transpose(x=x1, perm=[0, 2, 1, 3]) x3 = mb.log(x=x2) y2 = mb.transpose(x=x3, perm=[0, 2, 1, 3]) return y1, y2 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "relu", "avg_pool", "transpose", "log", "transpose"], ) self.assertEqual( get_op_types_in_program(prog), ["relu", "transpose", "avg_pool", "log", "transpose"], ) assert block.inputs["x"] == block.find_ops(op_type="relu")[0].inputs["x"] assert block.outputs[0] == block.find_ops(op_type="relu")[0].outputs[0] assert_model_is_valid( prog, {"x": (10, 2, 3, 5)}, expected_output_shapes={ block.outputs[0].name: (10, 2, 3, 5), block.outputs[1].name: (10, 2, 3, 5), }, ) """ Input graph: |-------> transpose(axis=[0,2,1,3]) ---->out1(shape=10,2,3,5) | input(shape=10,2,3,5)-->relu-->transpose(axis=[0,2,1,3])--->relu--->transpose(axis=[0,2,1,3]) ---->out2(shape=10,2,3,5) | |----->pool--------------->out3(shape=10,3,2,5) | |----->pool--------------->out4(shape=10.3.2.5) Output graph: |---->out1(shape=10,2,3,5) | input---->relu---------->relu------->out2(shape=10,2,3,5) | |----->transpose(axis=[0,2,1,3])--->pool---->out3(shape=10,3,2,5) | |----->transpose(axis=[0,2,1,3])---->pool--->out4(shape=10.3.2.5) """ def test_partial_fusion_2(self): @mb.program(input_specs=[mb.TensorSpec(shape=(10, 2, 3, 5))]) def prog(x): x = mb.relu(x=x) x = mb.transpose(x=x, perm=[0, 2, 1, 3]) y1 = mb.transpose(x=x, perm=[0, 2, 1, 3]) x1 = mb.relu(x=x) y2 = mb.transpose(x=x1, perm=[0, 2, 1, 3]) y3 = mb.avg_pool(x=x1, kernel_sizes=[1, 1], strides=[1, 1], pad_type="valid") y4 = mb.avg_pool(x=x1, kernel_sizes=[1, 1], strides=[1, 1], pad_type="valid") return y1, y2, y3, y4 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), [ "relu", "transpose", "transpose", "relu", "transpose", "avg_pool", 
"avg_pool", ], ) self.assertEqual( get_op_types_in_program(prog), ["relu", "relu", "transpose", "avg_pool", "transpose", "avg_pool"], ) assert block.outputs[0] == block.find_ops(op_type="relu")[0].outputs[0] assert block.outputs[1] == block.find_ops(op_type="relu")[1].outputs[0] assert_model_is_valid( prog, {"x": (10, 2, 3, 5)}, expected_output_shapes={ # Two consecutive relus are merged, so the first two outputs have the same name. See # `test_name_change_depend_on_output` in TestMergeConsecutiveRelus. block.outputs[1].name: (10, 2, 3, 5), block.outputs[2].name: (10, 3, 2, 5), block.outputs[3].name: (10, 3, 2, 5), }, # rdar://100243127 ([PyTorch] Duplicate Output Tensor Doesn't work for neuralnetwork). backend=("mlprogram", "fp16"), ) """ Input graph: input(shape=10,2,3,5)-->relu--->transpose(axis=[0,2,1,3])----->transpose(axis=[0,2,1,3])---->out1(shape=10,2,3,5) | ---->relu------>out2(shape=10,3,2,5) Output graph: input(shape=10,2,3,5)-->relu---->out1(shape=10,2,3,5) | ---->relu--->transpose(axis=[0,2,1,3])------>out2(shape=10,3,2,5) """ def test_partial_fusion_3(self): @mb.program(input_specs=[mb.TensorSpec(shape=(10, 2, 3, 5))]) def prog(x): x = mb.relu(x=x) x = mb.transpose(x=x, perm=[0, 2, 1, 3]) x1 = mb.transpose(x=x, perm=[0, 2, 1, 3]) x2 = mb.relu(x=x) return x1, x2 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["relu", "transpose", "transpose", "relu"], ) self.assertEqual(get_op_types_in_program(prog), ["relu", "relu", "transpose"]) assert block.outputs[0] == block.find_ops(op_type="relu")[0].outputs[0] assert_model_is_valid( prog, {"x": (10, 2, 3, 5)}, expected_output_shapes={ block.outputs[0].name: (10, 2, 3, 5), block.outputs[1].name: (10, 3, 2, 5), }, ) """ Input graph: input(shape=10,2,3,5)-->relu--->transpose(axis=[0,2,1,3])----->transpose(axis=[0,2,1,3])---->out1(shape=10,2,3,5) | ------>out2(shape=10,3,2,5) Output graph: same as input graph as one of the optimizing transpose is connected to model output """ def test_partial_fusion_4(self): @mb.program(input_specs=[mb.TensorSpec(shape=(10, 2, 3, 5))]) def prog(x): x = mb.relu(x=x) out2 = mb.transpose(x=x, perm=[0, 2, 1, 3]) out1 = mb.transpose(x=out2, perm=[0, 2, 1, 3]) return out1, out2 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual(get_op_types_in_program(prev_prog), ["relu", "transpose", "transpose"]) self.assertEqual(get_op_types_in_program(prog), ["relu", "transpose", "transpose"]) assert block.outputs[1] == block.find_ops(op_type="transpose")[0].outputs[0] assert_model_is_valid( prog, {"x": (10, 2, 3, 5)}, expected_output_shapes={ block.outputs[0].name: (10, 2, 3, 5), block.outputs[1].name: (10, 3, 2, 5), }, ) """ Input graph: input(shape=10,2,3,5)-->relu-->transpose(axis=[0,2,1,3])--->relu--->transpose(axis=[0,2,1,3]) ---->out1(shape=10,2,3,5) | |--->relu-->pool--------------->out2(shape=10,3,2,5) | |----->pool--------------->out3(shape=10.3.2.5) Output graph: same as the input graph as materialization ops are greater than cancel ops """ def test_no_fusion_more_materialization_ops(self): @mb.program(input_specs=[mb.TensorSpec(shape=(10, 2, 3, 5))]) def prog(x): x = mb.relu(x=x) x = mb.transpose(x=x, perm=[0, 2, 1, 3]) x1 = mb.relu(x=x) y2 = mb.transpose(x=x1, perm=[0, 2, 1, 3]) x2 = mb.relu(x=x1) y3 = mb.avg_pool(x=x2, kernel_sizes=[1, 1], strides=[1, 1], pad_type="valid") y4 = mb.avg_pool(x=x1, kernel_sizes=[1, 1], strides=[1, 1], pad_type="valid") 
return y2, y3, y4 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["relu", "transpose", "relu", "transpose", "relu", "avg_pool", "avg_pool"], ) self.assertEqual( get_op_types_in_program(prog), ["relu", "transpose", "relu", "transpose", "relu", "avg_pool", "avg_pool"], ) assert_model_is_valid( prog, {"x": (10, 2, 3, 5)}, expected_output_shapes={ block.outputs[0].name: (10, 2, 3, 5), block.outputs[1].name: (10, 3, 2, 5), block.outputs[2].name: (10, 3, 2, 5), }, ) """ Input graph: input(shape=10,2,3)--->transpose(axis=[0,2,1])----->relu---->transpose(axis=[0,2,1])---->out1 | | --->reduce(axis=2)----->log---->transpose(axis=[0,2,1])---->out2 Output graph: input(shape=10,2,3)----->relu---->out1 | | --->reduce(axis=1)----->log---->out2 """ def test_fusion_with_axis_op(self): @mb.program(input_specs=[mb.TensorSpec(shape=(10, 2, 3))]) def prog(x): x = mb.transpose(x=x, perm=[0, 2, 1]) x1 = mb.relu(x=x) x2 = mb.reduce_mean(x=x, axes=[2], keep_dims=True) y1 = mb.transpose(x=x1, perm=[0, 2, 1]) x3 = mb.log(x=x2) y2 = mb.transpose(x=x3, perm=[0, 2, 1]) return y1, y2 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "relu", "reduce_mean", "transpose", "log", "transpose"], ) self.assertEqual(get_op_types_in_program(prog), ["relu", "reduce_mean", "log"]) assert list(block.find_ops(op_type="reduce_mean")[0].inputs["axes"].val) == [1] assert_model_is_valid( prog, {"x": (10, 2, 3)}, expected_output_shapes={ block.outputs[0].name: (10, 2, 3), block.outputs[1].name: (10, 1, 3), }, ) """ Input graph: input(shape=11,2,3,6)--->transpose(axis=[0,3,1,2])--- | | --->pad(pad=[0,0,0,0,1,2,3,4]) | |-->log--->transpose(axis=[0,2,3,1])-->out1(shape=11,5,10,6) Output graph: same as input graph, as transpose cannot be pushed through the pad op since "reflect" mode is only supported along the last two axis """ def test_fusion_with_pad_reflective_op_0(self): @mb.program(input_specs=[mb.TensorSpec(shape=(11, 2, 3, 6))]) def prog(x): x = mb.transpose(x=x, perm=[0, 3, 1, 2]) x2 = mb.pad(x=x, pad=[0, 0, 0, 0, 1, 2, 3, 4], mode="reflect") x3 = mb.log(x=x2) y2 = mb.transpose(x=x3, perm=[0, 2, 3, 1]) return y2 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "pad", "log", "transpose"] ) self.assertEqual(get_op_types_in_program(prog), ["transpose", "pad", "log", "transpose"]) assert list(block.find_ops(op_type="pad")[0].inputs["pad"].val.flatten()) == [ 0, 0, 0, 0, 1, 2, 3, 4, ] assert_model_is_valid( prog, {"x": (11, 2, 3, 6)}, expected_output_shapes={block.outputs[0].name: (11, 5, 10, 6)}, ) """ Input graph: input(shape=11,2,3,6)--->transpose(axis=[0,1,3,2])--- | | --->pad(pad=[0,0,0,0,1,2,3,4]) | |-->log--->transpose(axis=[0,1,3,2])-->out1(shape=11,2,10,9) Output graph: input(shape=11,2,3,6)--->pad(pad=[0,0,0,0,3,4,1,2])-->log-->out1(shape=11,2,10,9) """ def test_fusion_with_pad_reflective_op_1(self): @mb.program(input_specs=[mb.TensorSpec(shape=(11, 2, 3, 6))]) def prog(x): x = mb.transpose(x=x, perm=[0, 1, 3, 2]) x2 = mb.pad(x=x, pad=[0, 0, 0, 0, 1, 2, 3, 4], mode="reflect") x3 = mb.log(x=x2) y2 = mb.transpose(x=x3, perm=[0, 1, 3, 2]) return y2 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "pad", 
"log", "transpose"] ) self.assertEqual(get_op_types_in_program(prog), ["pad", "log"]) assert list(block.find_ops(op_type="pad")[0].inputs["pad"].val.flatten()) == [ 0, 0, 0, 0, 3, 4, 1, 2, ] assert_model_is_valid( prog, {"x": (11, 2, 3, 6)}, expected_output_shapes={block.outputs[0].name: (11, 2, 10, 9)}, ) """ Input graph: input(shape=11,2,3,6)--->transpose(axis=[0,3,1,2])--- | | --->pad(pad=[0,0,0,0,1,2,3,4]) | |-->log--->transpose(axis=[0,2,3,1])-->out1(shape=11,5,10,6) Output graph: input(shape=11,2,3,6)--->pad(pad=[0,0,1,2,3,4,0,0])-->log-->out1(shape=11,5,10,6) """ def test_fusion_with_pad_constant_op(self): @mb.program(input_specs=[mb.TensorSpec(shape=(11, 2, 3, 6))]) def prog(x): x = mb.transpose(x=x, perm=[0, 3, 1, 2]) x2 = mb.pad(x=x, pad=[0, 0, 0, 0, 1, 2, 3, 4], mode="constant", constant_val=3.0) x3 = mb.log(x=x2) y2 = mb.transpose(x=x3, perm=[0, 2, 3, 1]) return y2 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "pad", "log", "transpose"] ) self.assertEqual(get_op_types_in_program(prog), ["pad", "log"]) assert list(block.find_ops(op_type="pad")[0].inputs["pad"].val.flatten()) == [ 0, 0, 1, 2, 3, 4, 0, 0, ] assert_model_is_valid( prog, {"x": (11, 2, 3, 6)}, expected_output_shapes={block.outputs[0].name: (11, 5, 10, 6)}, ) """ Input graph: const(shape=2) | V input(shape=1,2,5,5)--->transpose(axis=[0,2,3,1])--->add---->transpose(axis=[0,3,1,2])--->out(shape=1,2,5,5) Output graph: const(shape=1,2,1,1) | V input(shape=1,2,5,5)--->add--->out(shape=1,2,5,5) """ def test_fusion_with_add_constant_op(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 5, 5))]) def prog(x): x = mb.transpose(x=x, perm=[0, 2, 3, 1]) x = mb.add(x=x, y=np.array([10.0, 100.0])) x = mb.transpose(x=x, perm=[0, 3, 1, 2]) return x prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual(get_op_types_in_program(prev_prog), ["transpose", "add", "transpose"]) self.assertEqual(get_op_types_in_program(prog), ["add"]) assert_model_is_valid( prog, {"x": (1, 2, 5, 5)}, expected_output_shapes={block.outputs[0].name: (1, 2, 5, 5)}, ) """ Input graph: const(scalar) | V input(shape=1,2,5,5)--->transpose(axis=[0,2,3,1])--->add---->transpose(axis=[0,3,1,2])--->out(shape=1,2,5,5) Output graph: const(scalar) | V input(shape=1,2,5,5)--->add--->out(shape=1,2,5,5) """ def test_fusion_with_add_scalar_constant_op(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 5, 5))]) def prog(x): x = mb.transpose(x=x, perm=[0, 2, 3, 1]) x = mb.add(x=5.0, y=x) x = mb.transpose(x=x, perm=[0, 3, 1, 2]) return x prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual(get_op_types_in_program(prev_prog), ["transpose", "add", "transpose"]) self.assertEqual(get_op_types_in_program(prog), ["add"]) assert_model_is_valid( prog, {"x": (1, 2, 5, 5)}, expected_output_shapes={block.outputs[0].name: (1, 2, 5, 5)}, ) """ Input graph: input(shape=1,2,5,5)----->transpose(axis=[0,2,3,1])--->add---->transpose(axis=[0,3,1,2])--->out(shape=1,2,5,5) | ^ | | |---->relu---->transpose(axis=[0,2,3,1]) Output graph: input(shape=1,2,5,5)----->add--->out(shape=1,2,5,5) | ^ | | |------>relu """ def test_fusion_with_add_broadcastable_0(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 5, 5))]) def prog(x): x1 = mb.transpose(x=x, perm=[0, 2, 3, 1]) x2 = mb.relu(x=x) x2 = mb.transpose(x=x2, perm=[0, 2, 3, 1]) x3 = mb.add(x=x1, y=x2) y = 
mb.transpose(x=x3, perm=[0, 3, 1, 2]) return y prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "relu", "transpose", "add", "transpose"], ) self.assertEqual(get_op_types_in_program(prog), ["relu", "add"]) assert block.find_ops(op_type="relu")[0].inputs["x"] == block.inputs["x"] assert block.find_ops(op_type="add")[0].inputs["x"] == block.inputs["x"] assert ( block.find_ops(op_type="add")[0].inputs["y"] == block.find_ops(op_type="relu")[0].outputs[0] ) assert_model_is_valid( prog, {"x": (1, 2, 5, 5)}, expected_output_shapes={block.outputs[0].name: (1, 2, 5, 5)}, ) """ Input graph: input(shape=1,2,5,5)----->transpose(axis=[0,2,3,1])--->add---->transpose(axis=[0,3,1,2])--->out(shape=1,2,5,5) | ^ | | |----------------------->transpose(axis=[0,2,3,1]) Output graph: input(shape=1,2,5,5)----->add--->out(shape=1,2,5,5) | ^ | | |--------- """ def test_fusion_with_add_broadcastable_1(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 5, 5))]) def prog(x): x1 = mb.transpose(x=x, perm=[0, 2, 3, 1]) x2 = mb.transpose(x=x, perm=[0, 2, 3, 1]) x3 = mb.add(x=x1, y=x2) y = mb.transpose(x=x3, perm=[0, 3, 1, 2]) return y prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "transpose", "add", "transpose"], ) self.assertEqual(get_op_types_in_program(prog), ["add"]) assert block.find_ops(op_type="add")[0].inputs["x"] == block.inputs["x"] assert block.find_ops(op_type="add")[0].inputs["y"] == block.inputs["x"] assert_model_is_valid( prog, {"x": (1, 2, 5, 5)}, expected_output_shapes={block.outputs[0].name: (1, 2, 5, 5)}, ) """ Input graph: input(shape=1,2,5,5)--->transpose(axis=[0,2,3,1])---> relu---->concat(axis=3)----->transpose(axis=[0,3,1,2])----->out1(shape=1,4,5,5) | ^ | | |->transpose(axis=[0,2,3,1])--->relu------------ Output graph: input(shape=1,2,5,5)------> relu---->concat(axis=1)--->out1(shape=1,4,5,5) | ^ | | |---->relu------------ """ def test_concat_pattern_0(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 5, 5))]) def prog(x): x1 = mb.transpose(x=x, perm=[0, 2, 3, 1]) x2 = mb.transpose(x=x, perm=[0, 2, 3, 1]) x1 = mb.relu(x=x1) x2 = mb.relu(x=x2) x3 = mb.concat(values=[x1, x2], axis=3) x4 = mb.transpose(x=x3, perm=[0, 3, 1, 2]) return x4 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "transpose", "relu", "relu", "concat", "transpose"], ) self.assertEqual(get_op_types_in_program(prog), ["relu", "relu", "concat"]) assert_model_is_valid( prog, {"x": (1, 2, 5, 5)}, expected_output_shapes={block.outputs[0].name: (1, 4, 5, 5)}, ) """ Input graph: input(shape=1,2,5,5)--->transpose(axis=[0,2,3,1])---> relu---->concat(axis=3)----->transpose(axis=[0,3,1,2])----->out1(shape=1,4,5,5) | ^ | | |->transpose(axis=[0,2,3,1])------->relu-------- | V pool--->out2(shape=1,5,5,2) Output graph: input(shape=1,2,5,5)------> relu---->concat(axis=1)--->out1(shape=1,4,5,5) | ^ | | |---->relu------------ | |--->transpose(axis=[0,2,3,1])---->pool--->out2(shape=1,5,5,2) """ def test_concat_pattern_1(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 5, 5))]) def prog(x): x1 = mb.transpose(x=x, perm=[0, 2, 3, 1]) x2 = mb.transpose(x=x, perm=[0, 2, 3, 1]) x1 = mb.relu(x=x1) x2 = mb.relu(x=x2) x3 = mb.concat(values=[x1, x2], axis=3) x4 = mb.transpose(x=x3, perm=[0, 3, 1, 2]) 
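            # x2 also feeds the avg_pool below, so the pass materializes a transpose in front of
            # the pool while the transposes around the concat are removed.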
x5 = mb.avg_pool(x=x2, kernel_sizes=[1, 1], strides=[1, 1], pad_type="valid") return x4, x5 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), [ "transpose", "transpose", "relu", "relu", "concat", "transpose", "avg_pool", ], ) self.assertEqual( get_op_types_in_program(prog), ["relu", "relu", "concat", "transpose", "avg_pool"], ) assert_model_is_valid( prog, {"x": (1, 2, 5, 5)}, expected_output_shapes={ block.outputs[0].name: (1, 4, 5, 5), block.outputs[1].name: (1, 5, 5, 2), }, ) """ Input graph: input(shape=1,2,5,5)--->transpose(axis=[0,2,3,1])---> relu---->concat(axis=3)----->transpose(axis=[0,3,1,2])----->out1(shape=1,4,5,5) | ^ | | |->transpose(axis=[0,2,3,1])------->relu-------- | V relu--->out2(shape=1,5,5,2) Output graph: input(shape=1,2,5,5)------> relu---->concat(axis=1)--->out1(shape=1,4,5,5) | ^ | | |---->relu------------ | |--->relu---->transpose(axis=[0,2,3,1])---->out2(shape=1,5,5,2) """ def test_concat_pattern_2(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 5, 5))]) def prog(x): x1 = mb.transpose(x=x, perm=[0, 2, 3, 1]) x2 = mb.transpose(x=x, perm=[0, 2, 3, 1]) x1 = mb.relu(x=x1) x2 = mb.relu(x=x2) x3 = mb.concat(values=[x1, x2], axis=3) x4 = mb.transpose(x=x3, perm=[0, 3, 1, 2]) x5 = mb.relu(x=x2) return x4, x5 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "transpose", "relu", "relu", "concat", "transpose", "relu"], ) self.assertEqual( get_op_types_in_program(prog), ["relu", "relu", "concat", "relu", "transpose"], ) assert_model_is_valid( prog, {"x": (1, 2, 5, 5)}, expected_output_shapes={ block.outputs[0].name: (1, 4, 5, 5), block.outputs[1].name: (1, 5, 5, 2), }, ) """ Input graph: input(shape=1,2,5,5)--->transpose(axis=[0,2,3,1])---> relu---->concat(axis=3)----->transpose(axis=[0,3,1,2])----->out1(shape=1,4,5,5) | ^ | | |->transpose(axis=[0,2,3,1])------->relu-------- | V out2(shape=1,5,5,2) Output graph: input(shape=1,2,5,5)------> relu---->concat(axis=1)--->out1(shape=1,4,5,5) | ^ | | |---->relu------------ | |--->transpose(axis=[0,2,3,1])---->out2(shape=1,5,5,2) """ def test_concat_pattern_3(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 5, 5))]) def prog(x): x1 = mb.transpose(x=x, perm=[0, 2, 3, 1]) x2 = mb.transpose(x=x, perm=[0, 2, 3, 1]) x1 = mb.relu(x=x1) x2 = mb.relu(x=x2) x3 = mb.concat(values=[x1, x2], axis=3) x4 = mb.transpose(x=x3, perm=[0, 3, 1, 2]) return x4, x2 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "transpose", "relu", "relu", "concat", "transpose"], ) self.assertEqual(get_op_types_in_program(prog), ["relu", "relu", "concat", "transpose"]) assert_model_is_valid( prog, {"x": (1, 2, 5, 5)}, expected_output_shapes={ block.outputs[0].name: (1, 4, 5, 5), block.outputs[1].name: (1, 5, 5, 2), }, ) """ Input graph: input(shape=1,2,5,5)--->transpose(axis=[0,2,3,1])---> relu---->concat(axis=3)----->transpose(axis=[0,3,1,2])----->out1(shape=1,4,5,5) | ^ | | |->transpose(axis=[0,2,3,1])------->relu-------- | V transpose(axis=[0,3,1,2]) -----> out2(shape=1,2,5,5) Output graph: input(shape=1,2,5,5)---> relu---->concat(axis=1)----->out1(shape=1,4,5,5) | ^ | | |------------------->relu-------->out2(shape=1,2,5,5) """ def test_concat_pattern_4(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 5, 5))]) def 
prog(x): x1 = mb.transpose(x=x, perm=[0, 2, 3, 1]) x2 = mb.transpose(x=x, perm=[0, 2, 3, 1]) x1 = mb.relu(x=x1) x2 = mb.relu(x=x2) x3 = mb.concat(values=[x1, x2], axis=3) x4 = mb.transpose(x=x3, perm=[0, 3, 1, 2]) x5 = mb.transpose(x=x2, perm=[0, 3, 1, 2]) return x4, x5 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), [ "transpose", "transpose", "relu", "relu", "concat", "transpose", "transpose", ], ) self.assertEqual(get_op_types_in_program(prog), ["relu", "relu", "concat"]) assert_model_is_valid( prog, {"x": (1, 2, 5, 5)}, expected_output_shapes={ block.outputs[0].name: (1, 4, 5, 5), block.outputs[1].name: (1, 2, 5, 5), }, ) """ Input graph: constant(shape=[30,10,5]) | V input(shape=10,20,30)--->transpose(axis=[2,0,1])--->concat(axis=2)----->transpose(axis=[1,2,0])----->out1(shape=10,25,30) Output graph: constant(shape=[10,5,30]) | V input(shape=10,20,30)--->concat(axis=1)----->out1(shape=10,25,30) """ def test_concat_pattern_5(self): const = np.random.rand(30, 10, 5) @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20, 30))]) def prog(x): x1 = mb.transpose(x=x, perm=[2, 0, 1]) c = mb.const(val=const) x2 = mb.concat(values=[x1, c], axis=2) x3 = mb.transpose(x=x2, perm=[1, 2, 0]) return x3 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual(get_op_types_in_program(prev_prog), ["transpose", "concat", "transpose"]) self.assertEqual(get_op_types_in_program(prog), ["concat"]) assert_model_is_valid( prog, {"x": (10, 20, 30)}, expected_output_shapes={block.outputs[0].name: (10, 25, 30)}, ) """ Input graph: input2(shape=30,10,20)-----| | input(shape=10,20,30)--->transpose(axis=[2,0,1])----->relu-----|----->concat(axis=2)------>out1(shape=90,10,20) | | |-->relu-----| | |-->relu---->transpose(axis=[1,2,0])---->out2(shape=10,20,30) | |-->relu---->transpose(axis=[1,2,0])---->out3(shape=10,20,30) | |-->relu---->transpose(axis=[1,2,0])---->out4(shape=10,20,30) Output graph: input2(shape=30,10,20)-----| | input(shape=10,20,30)----->relu--->transpose(axis=[2,0,1])-----|----->concat(axis=2)------>out1(shape=90,10,20) | | |-->relu--->transpose(axis=[2,0,1])-----| | |-->relu---->out2(shape=10,20,30) | |-->relu---->out3(shape=10,20,30) | |-->relu---->out4(shape=10,20,30) Output graph: """ def test_concat_pattern_6(self): @mb.program( input_specs=[ mb.TensorSpec(shape=(10, 20, 30)), mb.TensorSpec(shape=(30, 10, 20)), ] ) def prog(x, y): x1 = mb.transpose(x=x, perm=[2, 0, 1]) r1 = mb.relu(x=x1) r2 = mb.relu(x=x1) r3 = mb.relu(x=x1) r4 = mb.relu(x=x1) r5 = mb.relu(x=x1) x2 = mb.concat(values=[r1, r2, y], axis=0) x3 = mb.transpose(x=r3, perm=[1, 2, 0]) x4 = mb.transpose(x=r4, perm=[1, 2, 0]) x5 = mb.transpose(x=r5, perm=[1, 2, 0]) return x2, x3, x4, x5 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), [ "transpose", "relu", "relu", "relu", "relu", "relu", "concat", "transpose", "transpose", "transpose", ], ) self.assertEqual( get_op_types_in_program(prog), [ "relu", "relu", "relu", "relu", "relu", "transpose", "transpose", "concat", ], ) assert_model_is_valid( prog, {"x": (10, 20, 30), "y": (30, 10, 20)}, expected_output_shapes={ block.outputs[0].name: (90, 10, 20), block.outputs[1].name: (10, 20, 30), block.outputs[2].name: (10, 20, 30), block.outputs[3].name: (10, 20, 30), }, ) """ Input graph: 
input(shape=1,4,5,6)--->transpose(axis=[0,3,2,1])--->relu---->split(axis=1, num_splits=2)----->transpose(axis=[0,3,2,1])----->out1(shape=1,4,5,3) | v transpose(axis[0,3,2,1])-------------------------->out2(shape=1,4,5,3) Output graph: input(shape=1,4,5,6)------> relu ---->split(axis=3)--->out1(shape=1,4,5,3) | v out2(shape=1,4,5,3) """ def test_split_nd_pattern_0(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 4, 5, 6))]) def prog(x): x1 = mb.transpose(x=x, perm=[0, 3, 2, 1]) x1 = mb.relu(x=x1) x2, x3 = mb.split(x=x1, axis=1, num_splits=2) x4 = mb.transpose(x=x2, perm=[0, 3, 2, 1]) x5 = mb.transpose(x=x3, perm=[0, 3, 2, 1]) return x4, x5 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "relu", "split", "transpose", "transpose"], ) self.assertEqual(get_op_types_in_program(prog), ["relu", "split"]) assert_model_is_valid( prog, {"x": (1, 4, 5, 6)}, expected_output_shapes={ block.outputs[0].name: (1, 4, 5, 3), block.outputs[1].name: (1, 4, 5, 3), }, ) self.assertEqual(block.find_ops(op_type="split")[0].axis.val, 3) """ Input graph: input(shape=1,4,5,6)--->transpose(axis=[0,3,2,1])--->relu---->splitd(axis=1, num_splits=6)----->transpose(axis=[0,3,2,1])----->out1(shape=1,4,5,3) | v transpose(axis[0,3,2,1])-------------------------------------->out2(shape=1,4,5,3) Output graph: input(shape=1,4,5,6)------>relu---->split(axis=3)--->out1(shape=1,4,5,3) | v out2(shape=1,4,5,3) """ def test_split_nd_pattern_1(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 4, 5, 6))]) def prog(x): x1 = mb.transpose(x=x, perm=[0, 3, 2, 1]) x1 = mb.relu(x=x1) x2, x3, x4, x5, x6, x7 = mb.split(x=x1, axis=1, num_splits=6) x2 = mb.transpose(x=x2, perm=[0, 3, 2, 1]) x3 = mb.transpose(x=x3, perm=[0, 3, 2, 1]) x4 = mb.transpose(x=x4, perm=[0, 3, 2, 1]) x5 = mb.transpose(x=x5, perm=[0, 3, 2, 1]) x6 = mb.transpose(x=x6, perm=[0, 3, 2, 1]) x7 = mb.transpose(x=x7, perm=[0, 3, 2, 1]) return x2, x3, x4, x5, x6, x7 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), [ "transpose", "relu", "split", "transpose", "transpose", "transpose", "transpose", "transpose", "transpose", ], ) self.assertEqual(get_op_types_in_program(prog), ["relu", "split"]) assert_model_is_valid( prog, {"x": (1, 4, 5, 6)}, expected_output_shapes={ block.outputs[0].name: (1, 4, 5, 1), block.outputs[1].name: (1, 4, 5, 1), block.outputs[2].name: (1, 4, 5, 1), block.outputs[3].name: (1, 4, 5, 1), block.outputs[4].name: (1, 4, 5, 1), block.outputs[5].name: (1, 4, 5, 1), }, ) self.assertEqual(block.find_ops(op_type="split")[0].axis.val, 3) """ Input graph: input(shape=1,4,5,6)--->transpose(axis=[0,3,2,1])---> split(axis=1, num_splits=2) ----> concat(axis=1) ----->transpose(axis=[0,3,2,1]) ----->out1(shape=1,4,5,6) | ^ v | relu() ---------------------- Output graph: input(shape=1,4,5,6)------>split(axis=3)--->concat(axis=3) -------> out1(shape=1,4,5,6) | ^ v | relu() -------------- """ def test_split_nd_pattern_2(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 4, 5, 6))]) def prog(x): x1 = mb.transpose(x=x, perm=[0, 3, 2, 1]) x2, x3 = mb.split(x=x1, axis=1, num_splits=2) x4 = mb.relu(x=x2) x5 = mb.concat(values=[x4, x3], axis=1) x6 = mb.transpose(x=x5, perm=[0, 3, 2, 1]) return x6 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", 
"split", "relu", "concat", "transpose"], ) self.assertEqual(get_op_types_in_program(prog), ["split", "relu", "concat"]) assert_model_is_valid( prog, {"x": (1, 4, 5, 6)}, expected_output_shapes={block.outputs[0].name: (1, 4, 5, 6)}, ) self.assertEqual(block.find_ops(op_type="split")[0].axis.val, 3) """ Input graph: input(shape=1,5,5,3)----->transpose(axis=[0,3,1,2]) | ---->relu-------------->transpose(axis=[0,2,3,1]) | | | V | relu | | | V | transpose(axis=[0,3,1,2]) | | | V ----------------> add --------> relu---->pool---->out(shape=1,3,5,5) Output graph: input(shape=1,5,5,3)---->relu------------------------> relu | | | V ----------------> add | V relu | V transpose(axis=[0,3,1,2])-->pool---->out(shape=1,3,5,5) """ def test_skip_connection_pattern_0(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 5, 5, 3))]) def prog(x): x = mb.transpose(x=x, perm=[0, 3, 1, 2]) x = mb.relu(x=x) x1 = mb.transpose(x=x, perm=[0, 2, 3, 1]) x2 = mb.relu(x=x1) x3 = mb.transpose(x=x2, perm=[0, 3, 1, 2]) x4 = mb.add(x=x, y=x3) x5 = mb.relu(x=x4) x6 = mb.avg_pool(x=x5, kernel_sizes=[1, 1], strides=[1, 1], pad_type="valid") return x6 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), [ "transpose", "relu", "transpose", "relu", "transpose", "add", "relu", "avg_pool", ], ) self.assertEqual( get_op_types_in_program(prog), ["relu", "relu", "add", "relu", "transpose", "avg_pool"], ) assert_model_is_valid( prog, {"x": (1, 5, 5, 3)}, expected_output_shapes={block.outputs[0].name: (1, 3, 5, 5)}, ) """ Input graph: input(shape=1,5,5,3)----->transpose(axis=[0,3,1,2]) | ---->relu-------------->transpose(axis=[0,2,3,1]) | | | V | relu | | | V | transpose(axis=[0,3,1,2]) | | | V ----------------> add -->transpose(axis=[0,2,3,1]) | V relu---->pool---->out(shape=1,5,5,3) Output graph: input(shape=1,5,5,3)---->relu------------------------> relu | | | V ----------------> add | V relu | V pool---->out(shape=1,5,5,3) """ def test_skip_connection_pattern_1(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 5, 5, 3))]) def prog(x): x = mb.transpose(x=x, perm=[0, 3, 1, 2]) x = mb.relu(x=x) x1 = mb.transpose(x=x, perm=[0, 2, 3, 1]) x2 = mb.relu(x=x1) x3 = mb.transpose(x=x2, perm=[0, 3, 1, 2]) x4 = mb.add(x=x, y=x3) x4 = mb.transpose(x=x4, perm=[0, 2, 3, 1]) x5 = mb.relu(x=x4) x6 = mb.avg_pool(x=x5, kernel_sizes=[1, 1], strides=[1, 1], pad_type="valid") return x6 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), [ "transpose", "relu", "transpose", "relu", "transpose", "add", "transpose", "relu", "avg_pool", ], ) self.assertEqual(get_op_types_in_program(prog), ["relu", "relu", "add", "relu", "avg_pool"]) assert_model_is_valid( prog, {"x": (1, 5, 5, 3)}, expected_output_shapes={block.outputs[0].name: (1, 5, 5, 3)}, ) """ Input graph: input(shape=2,5)--->transpose(axis=[1,0])--->transpose(axis=[1,0])-->reduce(axis=1) | | | V | transpose(axis=[1,0]) | | | V -------------------------------------------->add------->out(shape=5,2) Output graph: input(shape=2,5)--->reduce(axis=1)---->add---->transpose(axis=[1,0])--->out(shape=5,2) | ^ | | ------------------------ """ def test_residual_with_unmaterialized_output(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 5))]) def prog(x): x1 = mb.transpose(x=x, perm=[1, 0]) t1 = mb.transpose(x=x1, perm=[1, 0]) x2 = mb.reduce_mean(x=t1, axes=[1], keep_dims=True) t2 = mb.transpose(x=x2, perm=[1, 0]) 
return mb.add(x=x1, y=t2) prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "transpose", "reduce_mean", "transpose", "add"], ) self.assertEqual(get_op_types_in_program(prog), ["reduce_mean", "add", "transpose"]) assert_model_is_valid( prog, {"x": (2, 5)}, expected_output_shapes={block.outputs[0].name: (5, 2)} ) """ Input graph: input(shape=2,5)--->transpose(axis=[1,0])--->transpose(axis=[1,0])-->reduce(axis=1) | | | V | transpose(axis=[1,0]) | | | V -------------------------------------------->add------->out1(shape=5,2) | V relu------->out2(shape=5,2) Output graph: input(shape=2,5)--->reduce(axis=1)----> add ----->transpose(axis=[1,0])----->out1(shape=5,2) | | | V ---------------------> relu----->transpose(axis=[1,0])----->out2(shape=5,2) """ def test_residual_with_unmaterialized_multiple_output(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 5))]) def prog(x): x1 = mb.transpose(x=x, perm=[1, 0]) t1 = mb.transpose(x=x1, perm=[1, 0]) x2 = mb.reduce_mean(x=t1, axes=[1], keep_dims=True) t2 = mb.transpose(x=x2, perm=[1, 0]) out1 = mb.add(x=x1, y=t2) out2 = mb.relu(x=out1) return out1, out2 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "transpose", "reduce_mean", "transpose", "add", "relu"], ) self.assertEqual( get_op_types_in_program(prog), ["reduce_mean", "add", "relu", "transpose", "transpose"] ) assert_model_is_valid( prog, {"x": (2, 5)}, expected_output_shapes={block.outputs[0].name: (5, 2), block.outputs[1].name: (5, 2)}, ) """ Input graph: input(shape=2,5)---->transpose(axis=[1,0])------>relu----->transpose(axis=[1,0])------>out2(shape=2,5) | ------->out1(shape=5,2) Output graph: input(shape=2,5)---->relu-----> out2(shape=2,5) | V transpose(axis=[1,0]) -----> out1(shape=5,2) """ def test_materialized_output_reuse(self): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 5))]) def prog(x): x1 = mb.transpose(x=x, perm=[1, 0]) y1 = mb.relu(x=x1) y2 = mb.transpose(x=y1, perm=[1, 0]) return y1, y2 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), [ "transpose", "relu", "transpose", ], ) self.assertEqual(get_op_types_in_program(prog), ["relu", "transpose"]) assert_model_is_valid( prog, {"x": (2, 5)}, expected_output_shapes={block.outputs[0].name: (5, 2), block.outputs[1].name: (2, 5)}, ) """ Input graph: input(shape=1,2,5,5)----->transpose(axis=[0,2,3,1])------->add------------>transpose(axis=[0,3,1,2])--->out1(shape=1,2,5,5) | ^ | | | | ---->relu ----->transpose(axis=[0,3,1,2])--->out2(shape=1,2,5,5) Output graph: input(shape=1,2,5,5)----->add------->out1(shape=1,2,5,5) | ^ | | | | |------>relu ------identity(renaming)---->out2(shape=1,2,5,5) """ def test_fusion_with_double_outputs(self): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 5, 5))]) def prog(x): x1 = mb.transpose(x=x, perm=[0, 2, 3, 1]) x2 = mb.relu(x=x1) x3 = mb.add(x=x1, y=x2) y1 = mb.transpose(x=x3, perm=[0, 3, 1, 2]) y2 = mb.transpose(x=x3, perm=[0, 3, 1, 2]) return y1, y2 prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "relu", "add", "transpose", "transpose"], ) self.assertEqual(get_op_types_in_program(prog), ["relu", "add", "identity"]) assert 
block.find_ops(op_type="relu")[0].inputs["x"] == block.inputs["x"] assert block.find_ops(op_type="add")[0].inputs["x"] == block.inputs["x"] assert ( block.find_ops(op_type="add")[0].inputs["y"] == block.find_ops(op_type="relu")[0].outputs[0] ) assert_model_is_valid( prog, {"x": (1, 2, 5, 5)}, expected_output_shapes={block.outputs[0].name: (1, 2, 5, 5)}, ) def test_pass_through_broadcasted_binary_op(self): """ Input graph: const (shape=(1,1,1,3)) | input (shape=(1,4,3,2)) --> transpose (shape=(1,2,4,3)) --> add --> transpose --> relu Output graph: const (shape=(1,1,3,1)) | input (shape=(1,4,3,2)) --> add --> relu """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 4, 3, 2))]) def prog(x): x = mb.transpose(x=x, perm=[0, 3, 1, 2]) x = mb.add(x=x, y=np.array(np.ones(shape=(1, 1, 1, 3)))) x = mb.transpose(x=x, perm=[0, 2, 3, 1]) x = mb.relu(x=x) return x prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "add", "transpose", "relu"], ) self.assertEqual(get_op_types_in_program(prog), ["add", "relu"]) assert_model_is_valid( prog, {"x": (1, 4, 3, 2)}, expected_output_shapes={block.outputs[0].name: (1, 4, 3, 2)}, ) def test_binary_op_with_constant_input(self): """ Input graph: const (shape=(4,3)) | input (shape=(1,4,3,2)) --> transpose (shape=(1,2,4,3)) --> add --> transpose --> relu Output graph: const (shape=(1,4,3,1)) | input (shape=(1,4,3,2)) --> add --> relu """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 4, 3, 2))]) def prog(x): x = mb.transpose(x=x, perm=[0, 3, 1, 2]) x = mb.add(x=x, y=np.array(np.ones(shape=(4, 3)))) x = mb.transpose(x=x, perm=[0, 2, 3, 1]) x = mb.relu(x=x) return x prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "add", "transpose", "relu"], ) self.assertEqual(get_op_types_in_program(prog), ["add", "relu"]) assert_model_is_valid( prog, {"x": (1, 4, 3, 2)}, expected_output_shapes={block.outputs[0].name: (1, 4, 3, 2)}, ) def test_binary_op_with_non_constant_input1(self): """ Input graph: input (shape=(3,)) | input (shape=(1,4,3,2)) --> transpose (shape=(1,2,4,3)) --> add --> transpose --> relu Output graph: input (shape=(3,)) | reshape (shape=(1,1,3,1)) | input (shape=(1,4,3,2)) --> add --> relu """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 4, 3, 2)), mb.TensorSpec(shape=(3,))]) def prog(x, y): x = mb.transpose(x=x, perm=[0, 3, 1, 2]) x = mb.add(x=x, y=y) x = mb.transpose(x=x, perm=[0, 2, 3, 1]) x = mb.relu(x=x) return x prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "add", "transpose", "relu"], ) self.assertEqual(get_op_types_in_program(prog), ["reshape", "add", "relu"]) reshape_op = prog.find_ops(op_type="reshape", exactly_one=True)[0] assert reshape_op.outputs[0].shape == (1, 1, 3, 1) assert_model_is_valid( prog, {"x": (1, 4, 3, 2), "y": (3,)}, expected_output_shapes={block.outputs[0].name: (1, 4, 3, 2)}, ) def test_binary_op_with_non_constant_input2(self): """ Input graph: input (shape=(3,1,2)) | input (shape=(5,3,4,2)) --> transpose (shape=(4,3,5,2)) --> add --> transpose --> relu Output graph: input (shape=(3,1,2)) | reshape (shape=(1,3,1,2)) | input (shape=(5,3,4,2)) --> add --> relu """ @mb.program(input_specs=[mb.TensorSpec(shape=(5, 3, 4, 2)), mb.TensorSpec(shape=(3, 1, 2))]) def prog(x, y): x = mb.transpose(x=x, perm=[2, 
1, 0, 3]) x = mb.add(x=x, y=y) x = mb.transpose(x=x, perm=[2, 1, 0, 3]) x = mb.relu(x=x) return x prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "add", "transpose", "relu"], ) self.assertEqual(get_op_types_in_program(prog), ["reshape", "add", "relu"]) reshape_op = prog.find_ops(op_type="reshape", exactly_one=True)[0] assert reshape_op.outputs[0].shape == (1, 3, 1, 2) assert_model_is_valid( prog, {"x": (5, 3, 4, 2), "y": (3, 1, 2)}, expected_output_shapes={block.outputs[0].name: (5, 3, 4, 2)}, ) def test_binary_op_with_non_constant_input3(self): """ Input graph: input (shape=(3,1,2)) | input (shape=(s,3,4,2)) --> transpose (shape=(4,3,s,2)) --> add --> transpose --> relu Output graph: input (shape=(3,1,2)) | reshape (shape=(1,3,1,2)) | input (shape=(s,3,4,2)) --> add --> relu """ @mb.program( input_specs=[ mb.TensorSpec(shape=(get_new_symbol(), 3, 4, 2)), mb.TensorSpec(shape=(3, 1, 2)), ] ) def prog(x, y): x = mb.transpose(x=x, perm=[2, 1, 0, 3]) x = mb.add(x=x, y=y) x = mb.transpose(x=x, perm=[2, 1, 0, 3]) x = mb.relu(x=x) return x pass_name = "common::reduce_transposes" PASS_REGISTRY[pass_name](prog) self.assertEqual(get_op_types_in_program(prog), ["reshape", "add", "relu"]) reshape_op = prog.find_ops(op_type="reshape", exactly_one=True)[0] assert reshape_op.outputs[0].shape == (1, 3, 1, 2) block = prog.functions["main"] assert_model_is_valid( prog, {"x": (5, 3, 4, 2), "y": (3, 1, 2)}, expected_output_shapes={block.outputs[0].name: (5, 3, 4, 2)}, ) def test_binary_op_with_non_constant_input4(self): """ Input graph: input (shape=(3,s,2)) | input (shape=(1,3,4,2)) --> transpose (shape=(4,3,1,2)) --> add --> transpose --> relu Output graph: same as input graph since the non-transpose input of the add op has symbolic shape """ @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 3, 4, 2)), mb.TensorSpec(shape=(3, get_new_symbol(), 2)), ] ) def prog(x, y): x = mb.transpose(x=x, perm=[2, 1, 0, 3]) x = mb.add(x=x, y=y) x = mb.transpose(x=x, perm=[2, 1, 0, 3]) x = mb.relu(x=x) return x pass_name = "common::reduce_transposes" PASS_REGISTRY[pass_name](prog) self.assertEqual(get_op_types_in_program(prog), ["transpose", "add", "transpose", "relu"]) block = prog.functions["main"] assert_model_is_valid( prog, {"x": (1, 3, 4, 2), "y": (3, 10, 2)}, expected_output_shapes={block.outputs[0].name: (10, 3, 4, 2)}, ) def test_binary_op_with_non_constant_input5(self): """ Input graph: input (shape=(3,4)) | input (shape=(5,3,4,2)) --> transpose (shape=(5,2,3,4)) --> add --> transpose --> relu Output graph: same as input graph since transpose compliment for 2nd input of add cannot be represented as a static reshape """ @mb.program(input_specs=[mb.TensorSpec(shape=(5, 3, 4, 2)), mb.TensorSpec(shape=(3, 4))]) def prog(x, y): x = mb.transpose(x=x, perm=[0, 3, 1, 2]) x = mb.add(x=x, y=y) x = mb.transpose(x=x, perm=[0, 2, 3, 1]) x = mb.relu(x=x) return x prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual( get_op_types_in_program(prev_prog), ["transpose", "add", "transpose", "relu"], ) self.assertEqual( get_op_types_in_program(prog), ["transpose", "add", "transpose", "relu"], ) assert_model_is_valid( prog, {"x": (5, 3, 4, 2), "y": (3, 4)}, expected_output_shapes={block.outputs[0].name: (5, 3, 4, 2)}, ) def test_input_duplicate_output(self): """ Input graph: input -----> out (consist of duplicated input) Output graph: input -----> out (consist of 
duplicated input) Notice that a temp identity sink is added for all outputs, so the block before going through the pass is: function[CoreML3](%x: (2, 2, 1, 1, fp32)(Tensor)) { block0() { %identity_0: (2, 2, 1, 1, fp32)(Tensor) = identity(x=%x, name="identity_0") } -> (%identity_0, %identity_0) } """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 2, 1, 1))]) def prog(x): return x, x prev_prog, prev_block, block = apply_pass_and_basic_check(prog, "common::reduce_transposes") self.assertEqual(get_op_types_in_program(prev_prog), []) self.assertEqual(get_op_types_in_program(prog), []) assert_model_is_valid( prog, {"x": (2, 2, 1, 1)}, backend=("mlprogram", "fp16"), expected_output_shapes={ block.outputs[0].name: (2, 2, 1, 1), block.outputs[1].name: (2, 2, 1, 1), }, ) class TestTransposePassUtilityMethods: @staticmethod @pytest.mark.parametrize("rank", [1, 2, 3, 4, 5]) def test_transpose_compliment_method(rank): x = np.random.rand(*np.random.randint(low=1, high=15, size=rank)) perm = np.random.permutation(rank) reverse_perm = TransformAxisUpdateOps._find_transpose_compliment(perm) x_transpose = np.transpose(x, perm) x_transpose_transpose = np.transpose(x_transpose, reverse_perm) np.testing.assert_equal(x, x_transpose_transpose) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/tests/test_state_passes.py0000644000000000000000000002445614672066616027357 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools.converters.mil._deployment_compatibility import AvailableTarget from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.testing_utils import ( apply_pass_and_basic_check, get_op_types_in_program, ) class TestCanonicalizeInplacePattern: @staticmethod def test_simple(): """ Given: mul = mul(state, x) add = add(mul, 1.0) update = coreml_update_state(state, mul) Return: mul = mul(state, x) update = coreml_update_state(state, mul) add = add(mul, 1.0) """ SHAPE = (2,) @mb.program( input_specs=[ mb.TensorSpec(shape=SHAPE, dtype=types.fp16), mb.StateTensorSpec(shape=SHAPE, dtype=types.fp16), ], opset_version=AvailableTarget.iOS18, ) def prog(x, state): read = mb.read_state(input=state) mul = mb.mul(x=read, y=x) add = mb.add(x=mul, y=np.float16(1.0)) update = mb.coreml_update_state(state=state, value=mul) return add assert get_op_types_in_program(prog) == ["read_state", "mul", "add", "coreml_update_state"] apply_pass_and_basic_check(prog, "common::canonicalize_inplace_pattern") assert get_op_types_in_program(prog) == ["read_state", "mul", "coreml_update_state", "add"] @staticmethod def test_irrelevant_ops_jam(): """ Given: relu = relu(x) mul = mul(state, x) tanh = tanh(x) add = add(mul, 1.0) update = coreml_update_state(state, mul) softmax = softmax(x) Where ``relu``, ``tanh``, and ``softmax`` are irrelevant to state Return: relu = relu(x) mul = mul(state, x) update = coreml_update_state(state, mul) tanh = tanh(x) add = add(mul, 1.0) softmax = softmax(x) """ SHAPE = (2, 3) @mb.program( input_specs=[ mb.TensorSpec(shape=SHAPE, dtype=types.fp16), mb.StateTensorSpec(shape=SHAPE, dtype=types.fp16), ], opset_version=AvailableTarget.iOS18, ) def prog(x, state): read = mb.read_state(input=state) relu = mb.relu(x=x) mul = 
mb.mul(x=read, y=x) tanh = mb.tanh(x=x) add = mb.add(x=mul, y=np.float16(1.0)) update = mb.coreml_update_state(state=state, value=mul) softmax = mb.softmax(x=x) return add, relu, tanh, softmax assert get_op_types_in_program(prog) == [ "read_state", "relu", "mul", "tanh", "add", "coreml_update_state", "softmax", ] apply_pass_and_basic_check(prog, "common::canonicalize_inplace_pattern") assert get_op_types_in_program(prog) == [ "read_state", "relu", "mul", "coreml_update_state", "tanh", "add", "softmax", ] class TestPreferStateInDownstream: @staticmethod def test_simple(): """ Given: mul = mul(state, x) update = coreml_update_state(state, mul) add = add(mul, y) Return: mul = mul(state, x) update = coreml_update_state(state, mul) add = add(update, y) """ SHAPE = (2, 3, 5) @mb.program( input_specs=[ mb.TensorSpec(shape=SHAPE, dtype=types.fp16), mb.StateTensorSpec(shape=SHAPE, dtype=types.fp16), ], opset_version=AvailableTarget.iOS18, ) def prog(x, state): read = mb.read_state(input=state) mul = mb.mul(x=x, y=read) update = mb.coreml_update_state(state=state, value=mul) add = mb.add(x=x, y=mul) return add mul_op = prog.find_ops(op_type="mul")[0] add_op = prog.find_ops(op_type="add")[0] assert add_op.y is mul_op.outputs[0] apply_pass_and_basic_check(prog, "common::prefer_state_in_downstream") coreml_update_state_op = prog.find_ops(op_type="coreml_update_state")[0] add_op = prog.find_ops(op_type="add")[0] assert add_op.y is coreml_update_state_op.outputs[0] @staticmethod def test_no_affect_if_var_is_input_and_output(): """ If the value of the coreml_update_state op is both a block input and a block output, the graph pass should have no effect on it. """ @mb.program( input_specs=[ mb.TensorSpec(shape=(1,), dtype=types.fp16), mb.StateTensorSpec(shape=(1,), dtype=types.fp16), ], opset_version=AvailableTarget.iOS18, ) def prog(x, state): mb.coreml_update_state(state=state, value=x) return x apply_pass_and_basic_check(prog, "common::prefer_state_in_downstream") block = prog.functions["main"] assert block.outputs[0] == list(block.inputs.values())[0] @staticmethod def test_no_other_child_op(): """ If the value of the coreml_update_state doesn't feed into any other op, and only serves as a block output, the graph pass has no effect.
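Sketch of the pattern exercised below: x -> sin -> coreml_update_state, where the sin output is also returned directly as the block output, so there is no downstream consumer for the pass to rewire onto the state.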
""" @mb.program( input_specs=[ mb.TensorSpec(shape=(1,), dtype=types.fp16), mb.StateTensorSpec(shape=(1,), dtype=types.fp16), ], opset_version=AvailableTarget.iOS18, ) def prog(x, state): x = mb.sin(x=x) mb.coreml_update_state(state=state, value=x) return x apply_pass_and_basic_check(prog, "common::prefer_state_in_downstream") block = prog.functions["main"] sin_op = prog.find_ops(op_type="sin")[0] assert block.outputs[0] == sin_op.outputs[0] @staticmethod def test_output_with_affect(): @mb.program( input_specs=[ mb.TensorSpec(shape=(1,), dtype=types.fp16), mb.StateTensorSpec(shape=(1,), dtype=types.fp16), ], opset_version=AvailableTarget.iOS18, ) def prog(x, state): x = mb.sin(x=x) mb.coreml_update_state(state=state, value=x) cos = mb.cos(x=x) return x, cos apply_pass_and_basic_check(prog, "common::prefer_state_in_downstream") block = prog.functions["main"] update_state_op = prog.find_ops(op_type="coreml_update_state")[0] assert block.outputs[0] == update_state_op.outputs[0] cos_op = prog.find_ops(op_type="cos")[0] assert update_state_op.outputs[0] == cos_op.x @staticmethod def test_only_feeds_in_update_state(): """ If value only feeds into multiple coreml_update_state ops, the graph pass has no affects """ @mb.program( input_specs=[ mb.TensorSpec(shape=(1,), dtype=types.fp16), mb.StateTensorSpec(shape=(1,), dtype=types.fp16), ], opset_version=AvailableTarget.iOS18, ) def prog(x, state): x = mb.sin(x=x) mb.coreml_update_state(state=state, value=x) mb.coreml_update_state(state=state, value=x) mb.coreml_update_state(state=state, value=x) return x apply_pass_and_basic_check(prog, "common::prefer_state_in_downstream") block = prog.functions["main"] sin_op = prog.find_ops(op_type="sin")[0] update_state_ops = prog.find_ops(op_type="coreml_update_state") for op in update_state_ops: assert op.value == sin_op.outputs[0] assert block.outputs[0] == sin_op.outputs[0] @staticmethod def test_feeds_in_update_state_and_other_op(): @mb.program( input_specs=[ mb.TensorSpec(shape=(1,), dtype=types.fp16), mb.StateTensorSpec(shape=(1,), dtype=types.fp16), ], opset_version=AvailableTarget.iOS18, ) def prog(x, state): x = mb.sin(x=x) mb.coreml_update_state(state=state, value=x) mb.coreml_update_state(state=state, value=x) return x, mb.identity(x=x) apply_pass_and_basic_check(prog, "common::prefer_state_in_downstream") block = prog.functions["main"] sin_op = prog.find_ops(op_type="sin")[0] update_state_ops = prog.find_ops(op_type="coreml_update_state") assert update_state_ops[1].value == update_state_ops[0].outputs[0] assert block.outputs[0] == update_state_ops[1].outputs[0] identity_op = prog.find_ops(op_type="identity")[0] assert identity_op.x == update_state_ops[1].outputs[0] @staticmethod def test_invalid_if_not_canonical(): """ Since the inplace op is not in canonical pattern, there is nothing this graph pass can do """ SHAPE = (2, 3, 5, 7) @mb.program( input_specs=[ mb.TensorSpec(shape=SHAPE, dtype=types.fp16), mb.StateTensorSpec(shape=SHAPE, dtype=types.fp16), ], opset_version=AvailableTarget.iOS18, ) def prog(x, state): read = mb.read_state(input=state) mul = mb.mul(x=x, y=read) add = mb.add(x=x, y=mul) update = mb.coreml_update_state(state=state, value=mul) return add mul_op = prog.find_ops(op_type="mul")[0] add_op = prog.find_ops(op_type="add")[0] assert add_op.y is mul_op.outputs[0] apply_pass_and_basic_check(prog, "common::prefer_state_in_downstream") mul_op = prog.find_ops(op_type="mul")[0] coreml_update_state_op = prog.find_ops(op_type="coreml_update_state")[0] add_op = 
prog.find_ops(op_type="add")[0] assert add_op.y is mul_op.outputs[0] assert add_op.y is not coreml_update_state_op.outputs[0] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/passes/tests/test_symbol_transform.py0000644000000000000000000001343214672066616030251 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import pytest from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import get_new_symbol from coremltools.converters.mil.mil.passes.pass_registry import PASS_REGISTRY from coremltools.converters.mil.testing_utils import ( apply_pass_and_basic_check, get_op_types_in_program, ) class TestMaterializeSymbolicShapeProgram: @pytest.mark.parametrize("override_main_function", (True, False)) def test_simple(self, override_main_function): """ Input graph: x -> shape -> add Output graph: const """ symbolic_shape = (get_new_symbol(), get_new_symbol()) fixed_shape = (2, 3) new_function_name = "main" if override_main_function else "materialization" @mb.program(input_specs=[mb.TensorSpec(shape=symbolic_shape)]) def prog(x): shape = mb.shape(x=x) return mb.add(x=shape, y=1) graph_pass = PASS_REGISTRY["common::materialize_symbolic_shape_program"] graph_pass.function_name_to_materialization_map = {new_function_name: {"x": fixed_shape}} prev_prog, _, _ = apply_pass_and_basic_check( prog, graph_pass, skip_output_shape_check=True, skip_function_name_check=True ) apply_pass_and_basic_check(prog, "common::const_elimination") if override_main_function: assert set(prog.functions.keys()) == {"main"} else: assert set(prog.functions.keys()) == {"main", "materialization"} assert prog.functions["main"].inputs["x"].shape == symbolic_shape assert get_op_types_in_program(prev_prog, "main") == get_op_types_in_program( prog, "main" ) assert prog.functions[new_function_name].inputs["x"].shape == fixed_shape assert len(get_op_types_in_program(prog, new_function_name)) == 0 @pytest.mark.parametrize( "source_function_name, override_source_function", itertools.product( ("main", "func2"), (True, False), ), ) def test_multifunction_source_program(self, source_function_name, override_source_function): """ Input graph: x -> shape -> sub Output graph: const """ symbolic_shape = (get_new_symbol(),) fixed_shape = (5,) new_function_name = source_function_name if override_source_function else "materialization" @mb.program(input_specs=[mb.TensorSpec(shape=symbolic_shape)]) def prog(x): shape = mb.shape(x=x) return mb.sub(x=shape, y=1) @mb.function(input_specs=[mb.TensorSpec(shape=symbolic_shape)]) def func2(x): shape = mb.shape(x=x) return mb.sub(x=shape, y=2) prog.add_function("func2", func2) graph_pass = PASS_REGISTRY["common::materialize_symbolic_shape_program"] graph_pass.function_name_to_materialization_map = {new_function_name: {"x": fixed_shape}} if source_function_name != "main": graph_pass.souce_function_name = source_function_name prev_prog, _, _ = apply_pass_and_basic_check( prog, graph_pass, skip_output_name_check=True, skip_output_shape_check=True, skip_function_name_check=True, ) apply_pass_and_basic_check(prog, "common::const_elimination") if override_source_function: assert set(prog.functions.keys()) == {"main", "func2"} else: assert set(prog.functions.keys()) == {"main", "func2", 
"materialization"} assert prog.functions[new_function_name].inputs["x"].shape == fixed_shape assert len(get_op_types_in_program(prog, new_function_name)) == 0 for function_name in ("main", "func2"): if function_name == source_function_name and override_source_function: continue else: assert prog.functions[function_name].inputs["x"].shape == symbolic_shape assert get_op_types_in_program(prev_prog, function_name) == get_op_types_in_program( prog, function_name ) @pytest.mark.parametrize( "inconsistency_in_single_input", (True, False), ) def test_inconsistent_materialization(self, inconsistency_in_single_input): symbol = get_new_symbol() graph_pass = PASS_REGISTRY["common::materialize_symbolic_shape_program"] if inconsistency_in_single_input: @mb.program(input_specs=[mb.TensorSpec(shape=(symbol, symbol))]) def prog(x): return mb.add(x=x, y=1.0) graph_pass.function_name_to_materialization_map = {"materialization": {"x": (2, 3)}} else: @mb.program( input_specs=[mb.TensorSpec(shape=(2, symbol)), mb.TensorSpec(shape=(symbol, 4))] ) def prog(x, y): return mb.matmul(x=x, y=y) graph_pass.function_name_to_materialization_map = { "materialization": {"x": (2, 3), "y": (5, 4)} } with pytest.raises( ValueError, match=( r"Inconsistent symbol materialization in new function .*: " r"symbol [a-zA-Z]+[0-9]+ is to be materialized into [0-9]+ and [0-9]+\. " r"Please make sure input (.*) has compatible shape with others" ), ): apply_pass_and_basic_check( prog, graph_pass, skip_output_shape_check=True, skip_function_name_check=True ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/program.py0000644000000000000000000004425614672066616022631 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from collections import defaultdict from typing import Dict, List, Optional, Union import numpy as _np import sympy as _sm from coremltools import _logger as logger from coremltools.converters.mil._deployment_compatibility import AvailableTarget as _target from coremltools.converters.mil.mil.input_type import InternalInputType from coremltools.converters.mil.mil.ops.helper import _get_version_of_op from coremltools.converters.mil.mil.var import ListVar from . import types from .block import Function from .operation import Operation from .scope import ScopeSource from .types.symbolic import k_num_internal_syms, k_used_symbols from .var import Var class Program: @staticmethod def _get_opset_str_value(op): return f"coremltools.target.{op.name}" def __init__(self): self.functions = {} self.skip_all_passes = False self.default_function_name = "main" def _add_essential_scope_source( self, scope_source: Union[ScopeSource, List[ScopeSource]] ) -> None: """ Add essential scope sources to functions. """ for func in self.functions.values(): func._add_essential_scope_source(scope_source) def _get_dialect_namespaces(self) -> Dict[str, List[Operation]]: """ Return a dict which maps the dialect namespace into a list of corresponding operations. 
""" res = defaultdict(list) def get_dialect_namespaces_block(block): for op in block.operations: for b in op.blocks: get_dialect_namespaces_block(b) if hasattr(op, "_dialect_namespace"): dialect_namespace = op._dialect_namespace res[dialect_namespace].append(op) for func in self.functions.values(): get_dialect_namespaces_block(func) return res def _get_max_opset_version_and_op(self): max_opset_version = _target.iOS13 op_with_max_opset_version = None for func in self.functions.values(): cur_max_opset, cur_op = func.get_max_opset_version_and_op() if cur_max_opset > max_opset_version: max_opset_version = cur_max_opset op_with_max_opset_version = cur_op return max_opset_version, op_with_max_opset_version def _check_ops_version_compatibility(self, max_opset_version): def check_version_compatibility_block(block): for op in block.operations: for b in op.blocks: check_version_compatibility_block(b) if not hasattr(op, "_op_variants") or not isinstance(op._op_variants, dict): continue expected_op_cls = _get_version_of_op(op._op_variants, max_opset_version) if type(op) is not expected_op_cls: msg = ( "Op {} with an out of date version {} is detected. Please use @mb.program(input_specs=..., " "opset_version={})" ).format( op.op_type, self._get_opset_str_value(op.opset_version), self._get_opset_str_value(max_opset_version), ) raise ValueError(msg) for func in self.functions.values(): check_version_compatibility_block(func) def _check_or_set_functions_opset_version(self, max_opset_version): funcs = list(self.functions.values()) for func in funcs: if func.opset_version is None: func.opset_version = max_opset_version else: if func.opset_version < max_opset_version: msg = "function should have at least opset_version {}. Got {}".format( self._get_opset_str_value(max_opset_version), self._get_opset_str_value(func.opset_version), ) raise ValueError(msg) for func in funcs: if func.opset_version != funcs[0].opset_version: msg = "all functions must have the same opset_version. Got {} and {}.".format( self._get_opset_str_value(func.opset_version), self._get_opset_str_value(funcs[0].opset_version), ) raise ValueError(msg) def _check_program_opset_version(self): max_opset_version, _ = self._get_max_opset_version_and_op() self._check_ops_version_compatibility(max_opset_version) self._check_or_set_functions_opset_version(max_opset_version) @staticmethod def _get_runtime_supported_dialect_opset() -> List[str]: """ Return a list of supported dialect opsets at runtime. Right now, we are allowing ``coreml``, until we fix this radar: rdar://114737210 ([Infra] Handle control flow mechanism in coremltools) """ return ["coreml"] def _check_invalid_opset(self): """ Check if the program consists of opsets not supported by runtime. """ dialect_namespaces = self._get_dialect_namespaces() if len(dialect_namespaces) != 0: for dialect_key in list(dialect_namespaces.keys()): if dialect_key not in self._get_runtime_supported_dialect_opset(): invalid_op = dialect_namespaces[dialect_key][0] raise ValueError( f'Core ML only support core opset. Got unsupported op "{invalid_op.name}" with type "{invalid_op.op_type}" of dialect namespace "{invalid_op._dialect_namespace}".' ) def _check_invalid_tensor_rank(self): """ Check if the program consists of tensors with rank >= 6. 
""" def _check_invalid_tensor_rank_block(block): for op in block.operations: for b in op.blocks: _check_invalid_tensor_rank_block(b) for o in op.outputs: if not isinstance(o, ListVar) and (o.rank < 0 or o.rank >= 6): if op.op_type == "const" or op.op_type.startswith("constexpr_"): if all( child_op.op_type.startswith("constexpr_") for child_op in o.child_ops ): # For const/constexpr op's constexpr output, tensor with rank > 5 is ok. continue raise ValueError( f'Core ML only supports tensors with rank <= 5. Layer "{op.name}", ' f'with type "{op.op_type}", outputs a rank {o.rank} tensor. ' ) for f in self.functions.values(): _check_invalid_tensor_rank_block(f) def _check_invalid_const_tensor_input(self): """ Check if non const tensor feed into const input. This might happen in the early stage of conversion, for instance: constexpr_ -> reshape -> transpose -> linear However, the pattern is optimized into the following in a graph pass. constexpr_ -> linear """ def _check_invalid_const_tensor_input_block(block): for op in block.operations: for b in op.blocks: _check_invalid_const_tensor_input_block(b) for k, v in op.inputs.items(): input_type = op.input_spec.input_types[k] if ( input_type.const and not isinstance(input_type, InternalInputType) and not (v.op.op_type.startswith("constexpr_") or v.val is not None) ): raise ValueError( f"In op {op.name}. Input {k} ({v.name}) must be const or constexpr ops." ) for f in self.functions.values(): _check_invalid_const_tensor_input_block(f) def _check_early_error_out_for_invalid_program(self): """ Early error out for 1. tensor with rank >= 6 2. non const tensor feed into const input 3. program consist of non mil core ops """ self._check_invalid_tensor_rank() self._check_invalid_const_tensor_input() self._check_invalid_opset() def add_function(self, name, ssa_func): if not isinstance(ssa_func, Function): raise ValueError("Only Function can be added to Program.") self.functions[name] = ssa_func self._check_program_opset_version() def add_parameters(self, name, ssa_val): raise NotImplementedError() def find_ops(self, prefix=None, op_type=None, exactly_one=False): """ Return list of ops with name matching `prefix` if specified, and op_type, if specified. At least one of {prefix, op_type} must be specified. If `exactly_one` == True, raise ValueError if we find <1 or >1 ops satisfying the criteria. prefix: str Return list[Operation]. Empty list if no op satisfies. """ found_ops = [] for f_name, f in self.functions.items(): found_ops.extend(f.find_ops(prefix=prefix, op_type=op_type)) if exactly_one and len(found_ops) != 1: msg = "Found matching ops not exactly one. Found ops: {}" raise ValueError(msg.format(found_ops)) return found_ops def validate(self, check_essential_scope: Optional[bool] = False) -> None: for f in self.functions.values(): f.validate(force_validate=True, check_essential_scope=check_essential_scope) def stringify_stack_trace(self) -> str: result = "" for function_name, function in self.functions.items(): if ScopeSource.EXIR_STACK_TRACE not in function._essential_scope_sources: raise NotImplementedError( f"Function ({function_name}) must have EXIR_STACK_TRACE as an essential scope source." 
) for operation in function.operations: # TODO (rdar://115846569): Handle multi-block case from EXIR if len(operation.blocks) > 0: raise NotImplementedError("Multi-block case has not been supported yet") stack_trace = operation.scopes[ScopeSource.EXIR_STACK_TRACE] if stack_trace is None: continue stack_trace = stack_trace[0] result += ( f"{operation.op_type} : {operation.outputs[0].name}\n" f"{stack_trace}\n" ) return result def construct_debug_handle_to_ops_mapping(self) -> Dict: """ For PyMIL program translated from ExecuTorch only: Based on scope info inherited from EXIR, construct a debug handle to ops mapping. The mapping format is something like { 1: [ {"Type": "Program"}, {"Type": "Function", "Name": "main"}, {"Type": "Block"}, {"Type": "Operation", "Operator": "add", "Output": "z"} ] } where `1`, `"main"`, `"add"`, and `"z"` are example values of the debug handle, function name, operation type, and output var name (or the name of the first output var, if multiple outputs) """ debug_handle_to_ops_mapping = {} for function_name, function in self.functions.items(): if ScopeSource.EXIR_DEBUG_HANDLE not in function._essential_scope_sources: raise NotImplementedError( f"Function ({function_name}) must have EXIR_DEBUG_HANDLE as an essential scope source." ) for operation in function.operations: # TODO (rdar://115846569): Handle multi-block case from EXIR if len(operation.blocks) > 0: raise NotImplementedError("Multi-block case has not been supported yet") debug_handle = operation.scopes[ScopeSource.EXIR_DEBUG_HANDLE] if debug_handle is None: continue debug_handle = debug_handle[0] if debug_handle not in debug_handle_to_ops_mapping: debug_handle_to_ops_mapping[debug_handle] = [] debug_handle_to_ops_mapping[debug_handle].append( [ {"Type": "Program"}, {"Type": "Function", "Name": function_name}, {"Type": "Block"}, { "Type": "Operation", "Operator": operation.op_type, "Output": operation.outputs[0].name, }, ] ) return debug_handle_to_ops_mapping def __getitem__(self, func_name): if func_name not in self.functions: msg = "Function {} not found in among functions {}." raise KeyError(msg.format(func_name, self.functions.keys())) return self.functions[func_name] def __repr__(self): return self.__str__() def __str__(self, print_attr: Optional[bool] = False) -> str: s = "" for f_name, f in self.functions.items(): s += "\n" s += f.to_str(f_name, print_attr=print_attr) return s class Placeholder: counter = 0 def __init__(self, sym_shape, dtype=None, name=None, allow_rank0_input=False): """ sym_shape: () or [] for scalar. list, tuple, np.ndarray for tensor. May contain Symbol as symbolic shape (but not string). dtype: types.float or other scalar builtin types. allow_rank0_input: A flag that allows the rank 0 placeholder. """ if not isinstance(sym_shape, (list, tuple, _np.ndarray)): raise ValueError("Illegal shape for Placeholder: {}".format(sym_shape)) if len(sym_shape) == 0: if not allow_rank0_input: raise ValueError('Rank-0 (input {}) is unsupported'.format(name)) else: logger.warning('Rank-0 (input {}) is unsupported in coreml. 
You might run into error while\ running this model'.format(name)) for i, d in enumerate(sym_shape): if not isinstance(d, (_np.generic, int, Symbol)): msg = 'Placeholder dim {} in {} is not integer or symbol' raise ValueError(msg.format(i, sym_shape)) self.sym_shape = sym_shape self.dtype = dtype if self.dtype is None: self.dtype = types.float self.name = name self._infer_output_var() def set_name(self, name): self.name = name self.outputs[0].name = name def type_inference(self): if len(self.sym_shape) == 0: return self.dtype return types.tensor(self.dtype, self.sym_shape) def __str__(self): return str(self.outputs[0]) def _infer_output_var(self): sym_type = self.type_inference() # Globally unique var name for placeholders if self.name is None: self.name = f"{self.__class__.__name__}_{self.__class__.counter}" self.__class__.counter += 1 # List of output vars (consistent w/ other ops) self.outputs = [Var(self.name, sym_type)] class StateTensorPlaceholder(Placeholder): counter = 0 def __init__(self, sym_shape, dtype=None, name=None): """ A placeholder with a state wrapping a tensor. Parameters ---------- sym_shape: list, tuple * shape of the tensor. dtype: type * types.float or other scalar builtin types. name: str * name of the placeholder """ self.sym_shape = sym_shape if dtype is None: dtype = types.fp32 self.dtype = dtype self.name = name self._infer_output_var() def type_inference(self): wrapped_tensor_type = types.tensor(self.dtype, self.sym_shape) return types.state(wrapped_tensor_type) def get_new_variadic_symbol(): global k_num_internal_syms s = Symbol("*is" + str(k_num_internal_syms)) k_num_internal_syms += 1 return s def get_new_symbol(name=None): """ Returns a new symbol, optionally named. name: str (optional) Optional name that provides more readability. If the name specified is not available, an extra integer will be appended. """ global k_used_symbols global k_num_internal_syms if name is not None: s = Symbol(name) if s in k_used_symbols: new_name = name + k_num_internal_syms msg = 'Symbol name "{}" already occupied. Renaming to {}' logger.warning(msg.format(name, new_name)) s = Symbol(new_name) else: s = Symbol("is" + str(k_num_internal_syms)) k_num_internal_syms += 1 return s def get_existing_symbol(name): global k_used_symbols if name not in k_used_symbols: msg = 'Symbol name {} does not exist' raise ValueError(msg.format(name)) return k_used_symbols[name] class Symbol(_sm.Symbol): def __init__(self, sym_name): """ Essentially sympy.Symbol representing an i32 value in shape. sym_name: str. If first character is *, then this symbol represents variadic rank. Otherwise the symbol name should start with a alpha character. `sym_name` must be unique if specified, or it'd be auto generated (to a non-variadic symbol). Furthermore, sym_name may not start with 'is' (internal symbol) """ if not (sym_name[0].isalpha() or sym_name[0] == "*"): msg = "Symbol name must start with a letter or *. Got {}" raise ValueError(msg.format(sym_name)) global k_used_symbols if sym_name in k_used_symbols: msg = "Symbol `{}` is used already." raise ValueError(msg.format(sym_name)) k_used_symbols[sym_name] = self self.name = sym_name ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/scope.py0000644000000000000000000003056414672066616022270 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy from collections import defaultdict from enum import Enum from typing import Dict, List, Union from attrs import define, field, validators class ScopeSource(Enum): """ Pre-defined scope source enum: # Torch script related: TORCHSCRIPT_MODULE_TYPE: * Torchscript module type of a scope, which usually corresponds to the submodule object class type. * If provided as str, it denotes a single scope, and cannot be an empty str. * Nested scopes are represented by a list of str. TORCHSCRIPT_MODULE_NAME: * Unique torchscript identifier for a scope, which usually corresponds to the submodule object name. * If provided as str, it denotes a single scope. * Nested scopes are represented by a list of str. # Core ML converter graph passes related: COREMLTOOLS_GRAPH_PASS: * This scope traces the graph transformations (graph passes) applied on the program. * For instance, operations constructed under the "fuse_conv_batchnorm" pass, is going to have the scopes attribute of ``{COREMLTOOLS_GRAPH_PASS: ["fuse_conv_batchnorm"]}``. * If the op went through multiple graph pass transformations, it is represetned by a list of str. For instance: ["fuse_conv_batchnorm", "add_fp16_cast"] means the op is created by "fuse_conv_batchnorm" and then undergoes "add_fp16_cast". # Torch export related: EXIR_STACK_TRACE: * The ``stack_trace`` metadata inherited from torch.fx.Node.meta in EXIR * This metadata traces the MIL op back to original python source code EXIR_DEBUG_HANDLE: * The ``debug_handle`` metadata inherited from torch.fx.Node.meta in EXIR * This metadata enables post-run analysis in ExecuTorch integration * ExecuTorch uses integer as debug handle. When a MIL op can be traced back to ExecuTorch (e.g. translated from torch op), we inherit the integer value * If a MIL op cannot be traced back to ExecuTorch (e.g. created by graph pass), then we use None to denote "no debug handle" Examples -------- Here is an example of torchscript related scope enum: .. sourcecode:: python class SubModule(torch.nn.Module): pass class MainModule(torch.nn.Module): def __init__(self): self.submodule_1 = SubModule() def forward(self, x): node = self.submodule_1(x) return node my_model = MainModule() when the above model is translated into pymil, the Operation corresponding to ``node`` would have: * TORCHSCRIPT_MODULE_TYPE: ["SubModule", ...] * TORCHSCRIPT_MODULE_NAME: ["submodule_1", ...] in their scope attributes. """ TORCHSCRIPT_MODULE_TYPE = 0 TORCHSCRIPT_MODULE_NAME = 1 COREMLTOOLS_GRAPH_PASS = 2 EXIR_STACK_TRACE = 3 # no serialization for such debug info should be allowed yet EXIR_DEBUG_HANDLE = 4 class ScopeStack(defaultdict): """ A utility class to handle the scope context manager """ def __init__(self): super().__init__(list) def get_curr_scopes(self) -> Dict[ScopeSource, List[str]]: """ Returns the current scope information as a dictionary. """ res = defaultdict(list) for key, val in self.items(): if len(val) == 0: continue scope_for_one_source = [] for v in val: scope_for_one_source.extend(v.data) res[key] = scope_for_one_source return res SCOPE_STACK = ScopeStack() VALID_OPS_TO_COPY_SCOPE_INFO = [] def add_graph_pass_scope( src_scopes: Dict[ScopeSource, List[str]], graph_pass_scopes: Dict[ScopeSource, List[str]] ) -> Dict[ScopeSource, List[str]]: res = {} """ Construct a scope by adding graph pass scopes from ``graph_pass_scopes`` to ``src_scopes``. 
The rules are the following: (1) We append the COREMLTOOLS_GRAPH_PASS ScopeSource in ``graph_pass_scopes`` to the ``src_scopes``. This will allow us to keep tracking the history of transformation. For instance: Input: src_scopes = { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } graph_pass_scopes = { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_2", "pass_3"], } Output: res = { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1", "pass_2", "pass_3"], } (2) Only COREMLTOOLS_GRAPH_PASS ScopeSource is allowed in ``graph_pass_scopes``. (3) Other ScopeSource will be passed down from ``src_scopes``. Input: src_scopes = { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.TORCHSCRIPT_MODULE_NAME: ["a1"], ScScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } graph_pass_scopes = { ScScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_2", "pass_3"], } Output: res = { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.TORCHSCRIPT_MODULE_NAME: ["a1"], ScScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1", "pass_2", "pass_3"], } """ res = defaultdict(list) for scope_source_key in ScopeSource: if scope_source_key in graph_pass_scopes: assert ( scope_source_key == ScopeSource.COREMLTOOLS_GRAPH_PASS ), "Only ScopeSource.COREMLTOOLS_GRAPH_PASS is allowed in the graph_pass_scopes." if ScopeSource.COREMLTOOLS_GRAPH_PASS in src_scopes: old_graph_pass_data = copy.copy(src_scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS]) else: old_graph_pass_data = [] new_graph_pass_data = copy.copy(graph_pass_scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS]) res[ScopeSource.COREMLTOOLS_GRAPH_PASS] = old_graph_pass_data + new_graph_pass_data elif scope_source_key in src_scopes: res[scope_source_key] = copy.copy(src_scopes[scope_source_key]) return res @define class ScopeInfo: """ Parameters ---------- source: str * Source of the scope. For instance, it could be a frontend framework like torchsccript, or a converter graph pass, etc. * Must be type of ScopeSource Enum. data: Union[str, List[str]] * Scope data. * It could be type of str or List[str]. Examples -------- Here are examples of creating a ScopeInfo: .. sourcecode:: python # A scope for a single torchscript module type scope_info = ScopeInfo( source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="Module_1", ) # A scope for a two layers torchscript model hierarchy type scope_info = ScopeInfo( source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["Module_1", "Module_2"], ) """ source: str = field(validator=validators.instance_of(ScopeSource)) data: Union[str, List[str]] = field(validator=validators.instance_of((str, list))) def __attrs_post_init__(self): # cleanup scope info if self.source in ( ScopeSource.TORCHSCRIPT_MODULE_NAME, ScopeSource.TORCHSCRIPT_MODULE_TYPE, ScopeSource.COREMLTOOLS_GRAPH_PASS, ): if not isinstance(self.data, list): self.data = [self.data] for i, val in enumerate(self.data): if not isinstance(val, str): raise ValueError( f"Scope must be type of List[str]. Got element {val} with type {type(val)}." ) self.data[i] = val.replace(" ", "") elif self.source == ScopeSource.EXIR_DEBUG_HANDLE: if not isinstance(self.data, list): self.data = [self.data] for val in self.data: if val is not None and not isinstance(val, int): raise ValueError( f"Scope must be None or type of List[int]. Got element {val} with type {type(val)}." ) if self.source == ScopeSource.COREMLTOOLS_GRAPH_PASS: if len(self.data) > 1: raise ValueError( f"COREMLTOOLS_GRAPH_PASS scope cannot have len > 1. Got {self.data}." 
) if self.source == ScopeSource.TORCHSCRIPT_MODULE_TYPE: if "" in self.data: raise ValueError( "TORCHSCRIPT_MODULE_TYPE scope info cannot contains empty string." ) if self.source == ScopeSource.EXIR_DEBUG_HANDLE: if len(self.data) > 1: raise ValueError(f"EXIR_DEBUG_HANDLE scope cannot have len > 1. Got {self.data}.") class ScopeContextManger: def __init__( self, *scopes: List[ScopeInfo], ): """ A context manager pushes/pops the scope information, which makes the operations created within it have the corresponding scope information. Parameters ---------- scopes: Optional[List[ScopeInfo]] (Optional) * A list of ScopeInfo under the context manager. * The source in each ScopeInfo cannot be duplicated. * If not provided, this context manager does no affects. Examples -------- Here is an example of creating a scope for torchscript module heirarchy with type and name information. .. sourcecode:: python @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["Module1"]), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=["module_1"]), ): return mb.add(x=x, y=4.3, name="add_1") In the above example, the "add_1" op will have two scope attributes, for torchscipt module type and name: * TORCHSCRIPT_MODULE_TYPE: ["Module1"] * TORCHSCRIPT_MODULE_NAME: ["module_1"] Here is an example of creating nested scopes: .. sourcecode:: python @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["Module1"]), ): x = mb.add(x=x, y=4.3, name="add_1") with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["Module2"]), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=["module_2"]), ): return mb.add(x=x, y=3.2, name="add_2") In the above example, the "add_1" op would have a scope attribute: * TORCHSCRIPT_MODULE_TYPE: ["Module1"] while the "add_2" op would have scope attributes: * TORCHSCRIPT_MODULE_TYPE: ["Module1", "Module2"] * TORCHSCRIPT_MODULE_NAME: ["module_2"] """ self.scopes = scopes # Validate scopes are type of ScopeInfo for scope in self.scopes: if not isinstance(scope, ScopeInfo): raise ValueError( f"mb.scope only accepts inputs of type ScopeInfo. Got {type(scope)}." ) # validate there is no duplicated scope source visited_scope_sources = set() for scope in self.scopes: if scope.source in visited_scope_sources: raise ValueError(f"Scope source {scope.source} duplicated.") visited_scope_sources.add(scope.source) def __enter__(self): for scope in self.scopes: SCOPE_STACK[scope.source].append(scope) if scope.source == ScopeSource.COREMLTOOLS_GRAPH_PASS: VALID_OPS_TO_COPY_SCOPE_INFO.append(set()) def __exit__(self, type, value, traceback): for scope in self.scopes: SCOPE_STACK[scope.source].pop() if scope.source == ScopeSource.COREMLTOOLS_GRAPH_PASS: VALID_OPS_TO_COPY_SCOPE_INFO.pop() ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1726511965.249547 coremltools-8.0/coremltools/converters/mil/mil/tests/0000755000000000000000000000000014672075535021737 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/tests/__init__.py0000644000000000000000000000033114672066616024045 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/tests/test_block.py0000644000000000000000000004211214672066616024442 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy import numpy as np import pytest from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil.passes.tests.test_passes import CONSTEXPR_FUNCS from coremltools.converters.mil.mil.utils import CacheDoublyLinkedList from coremltools.converters.mil.testing_utils import ( assert_same_output_names, assert_same_output_shapes, get_op_types_in_program, ) """ Test manipulating variable and operations in the Block. In the test, we are actually testing Function, which is a child class of Block. Technically Function should not inherit from Block, which is a debt to be resolved in the future. Function has some different behaviors from Block that are irrelevant to the core API being tested here. """ def test_empty_block(): """ Test an empty program """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x0): return x0 block = prog.functions["main"] assert len(block.operations) == 0 assert len(block.inputs) == 1 assert len(block.outputs) == 1 assert block.inputs["x0"] == block.outputs[0] def test_add_op(): """ Test add statement to an empty program, also change the output """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x0): return x0 print("before:\n{}".format(prog)) block = prog.functions["main"] x0 = block.inputs["x0"] with block: x1 = mb.log(x=x0) block.set_outputs([x1]) print("after:\n{}".format(prog)) assert block.inputs["x0"] == block.find_ops(op_type="log")[0].inputs["x"] assert len(block.operations) == 2 # const op for epsilon + log assert list(block.operations)[1].op_type == "log" assert block.outputs[0] == x1 def test_remove_op(): """ Test remove all ops and return empty program """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x0): x1 = mb.log(x=x0) return x1 print("before:\n{}".format(prog)) block = prog.functions["main"] assert len(block.operations) == 2 x0 = block.inputs["x0"] ops = block.find_ops(op_type="log") block.set_outputs([x0]) block.remove_ops(ops) print("after:\n{}".format(prog)) assert len(block.operations) == 1 assert len(block.inputs) == 1 assert len(block.outputs) == 1 assert block.inputs["x0"] == block.outputs[0] def test_remove_op2(): """ Test remove ops with multiple identical inputs """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x0): x1 = mb.add(x=x0, y=x0) return x1 print("before:\n{}".format(prog)) block = prog.functions["main"] x0 = block.inputs["x0"] ops = block.find_ops(op_type="add") block.set_outputs([x0]) block.remove_ops(ops) print("after:\n{}".format(prog)) assert len(block.operations) == 0 assert len(block.inputs) == 1 assert len(block.outputs) == 1 assert block.inputs["x0"] == block.outputs[0] def test_remove_duplicate_ops(): """Test remove duplicated ops.""" @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x0): x1 = mb.add(x=x0, y=x0) return x1 block = prog.functions["main"] x0 = block.inputs["x0"] ops = block.find_ops(op_type="add") 
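# Deliberately pass the same add op twice to remove_ops below: the underlying op should be deleted once, without erroring on the duplicate entry.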
duplicate_ops = ops + ops block.set_outputs([x0]) block.remove_ops(duplicate_ops) assert len(block.operations) == 0 assert len(block.inputs) == 1 assert len(block.outputs) == 1 assert block.inputs["x0"] == block.outputs[0] def test_remove_duplicate_ops_not_affect_others(): """ Test remove duplicated ops doesn't affect other ops. We add another `add` op here, but keep the input to remove_ops only restricted to the first `add` op. This test is for checking that the second add op doesn't get removed. """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x0): x1 = mb.add(x=x0, y=x0) x2 = mb.add(x=x0, y=x0) return x1, x2 block = prog.functions["main"] x0 = block.inputs["x0"] ops = [block.find_ops(op_type="add")[0]] block.set_outputs([x0]) block.remove_ops(ops) # Deleting one add operation should not affect the other one. assert len(block.operations) == 1 assert len(block.inputs) == 1 assert len(block.outputs) == 1 def test_remove_ops_fail_for_block_output(): """Block's output cannot be removed.""" @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x0): x1 = mb.add(x=x0, y=x0) x2 = mb.add(x=x0, y=x0) return x1, x2 block = prog.functions["main"] ops = block.find_ops(op_type="add") expected_err_str = "cannot delete op add_.* with output 0: add_.* that's block block.*'s output" with pytest.raises(ValueError, match=expected_err_str): block.remove_ops(ops) assert len(block.operations) == 2 assert len(block.inputs) == 1 assert len(block.outputs) == 2 def test_op_removal_and_insertion(): """ Remove a transpose pair and materialize one transpose before another op Given: %x1 = transpose(%x) %x2 = relu(%x1) %out1 = avg_pool(%x2) %x3 = transpose(%x2) %out2 = log(%x3) After removing both transposes: %x2 = relu(%x) %out1 = avg_pool(%x2) %out2 = log(%x2) After inserting a transpose: %x2 = relu(%x) %x4 = transpose(%x2) %out1 = avg_pool(%x4) %out2 = log(%x2) """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 6, 6))]) def prog(x): x1 = mb.transpose(x=x, perm=[0, 2, 3, 1]) x2 = mb.relu(x=x1) out1 = mb.avg_pool(x=x2, kernel_sizes=[1, 1], strides=[1, 1], pad_type="valid") x3 = mb.transpose(x=x2, perm=[0, 3, 1, 2]) out2 = mb.log(x=x3) return out1, out2 prev_prog = copy.deepcopy(prog) print("before:\n{}".format(prog)) assert get_op_types_in_program(prog) == [ "transpose", "relu", "avg_pool", "transpose", "log", ] block = prog.functions["main"] def remove_transpose(block): op = block.find_ops(op_type="transpose")[0] block.replace_uses_of_var_after_op( anchor_op=op.inputs["x"].op, old_var=op.outputs[0], new_var=op.inputs["x"], no_check_var_types=True, ) block.remove_ops([op]) with block: # remove 1st transpose remove_transpose(block) assert get_op_types_in_program(prog) == ["relu", "avg_pool", "transpose", "log"] # remove 2nd transpose remove_transpose(block) assert get_op_types_in_program(prog) == ["relu", "avg_pool", "log"] print("after transpose ops removal:\n{}".format(prog)) # insert transpose before pool pool_op = block.find_ops(op_type="avg_pool")[0] with block: y = mb.transpose(x=pool_op.inputs["x"], perm=[0, 2, 3, 1], before_op=pool_op) block.replace_uses_of_var_after_op( anchor_op=y.op, end_op=pool_op, old_var=pool_op.inputs["x"], new_var=y, no_check_var_types=True, ) print("after transpose insertion:\n{}".format(prog)) assert get_op_types_in_program(prog) == ["relu", "transpose", "avg_pool", "log"] for op in block.operations: op.type_value_inference(overwrite_output=True) assert_same_output_names(prev_prog, prog) assert_same_output_shapes(prev_prog, prog) def 
test_replace_nonreplaceable_vars(): """ The conversion should error out if an invalid replacement is invoked with nonreplaceable vars """ constexpr_op = "constexpr_sparse_to_dense" @mb.program(input_specs=[mb.TensorSpec(shape=(4, 2))]) def prog(x): constexpr = CONSTEXPR_FUNCS[constexpr_op]((4, 2)) return mb.add(x=x, y=constexpr) block = prog.functions["main"] constexpr_op = block.find_ops(op_type=constexpr_op)[0] with block: const = mb.const(val=np.random.rand(4, 2), before_op=constexpr_op) expected_err_str = "might potentially be removed during the replacement of those vars." with pytest.raises(ValueError, match=expected_err_str): block.replace_uses_of_var_after_op( anchor_op=constexpr_op, old_var=constexpr_op.outputs[0], new_var=const ) def test_replace_nonreplaceable_vars_force(): """ The conversion should not error out if the replace_uses_of_vars_after_op is executed with force_replace=True Also we test that, the new nonreplaceable_vars_upstream is propagated after the code exist `with block`. """ constexpr_op = "constexpr_sparse_to_dense" @mb.program(input_specs=[mb.TensorSpec(shape=(4, 2))]) def prog(x): constexpr = CONSTEXPR_FUNCS[constexpr_op]((4, 2)) return mb.add(x=x, y=constexpr) block = prog.functions["main"] constexpr_op = block.find_ops(op_type=constexpr_op)[0] add_op = block.find_ops(op_type="add")[0] assert len(add_op.outputs[0].nonreplaceable_vars_upstream) == 1 with block: const = mb.const(val=np.random.rand(4, 2), before_op=constexpr_op) block.replace_uses_of_var_after_op( anchor_op=constexpr_op, old_var=constexpr_op.outputs[0], new_var=const, force_replace=True, ) block.remove_ops([constexpr_op]) assert len(add_op.outputs[0].nonreplaceable_vars_upstream) == 0 def test_simple_substituion(): """ Replace log(x+y) with log(x*y) """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4)), mb.TensorSpec(shape=(2, 4))]) def prog(x0, y0): x1 = mb.add(x=x0, y=y0) z = mb.log(x=x1) return z print("before:\n{}".format(prog)) block = prog.functions["main"] assert len(block.find_ops(op_type="log")) == 1 assert len(block.find_ops(op_type="add")) == 1 assert len(block.find_ops(op_type="mul")) == 0 add = block.find_ops(op_type="add")[0] x0 = add.inputs["x"] y0 = add.inputs["y"] x1 = add.outputs[0] with block: # It's important to add 'mul' before 'add' (its even better to do it # immediately after 'add' but we don't have the API) # because we need to replace any op affected by add with 'mul' x2 = mb.mul(x=x0, y=y0, before_op=add) assert len(block.find_ops(op_type="mul")) == 1 assert len(block.find_ops(op_type="add")) == 1 assert len(block.find_ops(op_type="log")) == 1 # It's important to set anchor_op = 'mul' because new_var is only visible # after 'mul'. 
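# (x2 is defined by the 'mul' op, so only ops ordered after it can legally consume x2; anchoring at an earlier op would rewrite uses where x2 does not yet exist.)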
block.replace_uses_of_var_after_op(anchor_op=x2.op, old_var=x1, new_var=x2) block.remove_ops([add]) print("after:\n{}".format(prog)) assert len(block.find_ops(op_type="add")) == 0 assert len(block.find_ops(op_type="mul")) == 1 assert len(block.find_ops(op_type="log")) == 1 def test_substitute_nested_op(): """" Replace an conditional op with nested block """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4)), mb.TensorSpec(shape=(2, 4))]) def prog(x0, y0): pred = mb.less(x=x0, y=y0) z = mb.cond( pred=pred, _true_fn=lambda: mb.abs(x=x0), _false_fn=lambda: mb.abs(x=y0) ) z1 = mb.log(x=z) return z1 print("before:\n{}".format(prog)) block = prog.functions["main"] assert len(block.find_ops(op_type="less")) == 1 assert len(block.find_ops(op_type="abs")) == 2 assert len(block.find_ops(op_type="cond")) == 1 assert len(block.find_ops(op_type="log")) == 1 cond = block.find_ops(op_type="cond")[0] x0 = block.inputs["x0"] z = cond.outputs[0] block.replace_uses_of_var_after_op(anchor_op=None, old_var=z, new_var=x0) # removing cond will also remove the abs ops within its block block.remove_ops([cond]) print("after:\n{}".format(prog)) assert len(block.find_ops(op_type="less")) == 1 assert len(block.find_ops(op_type="log")) == 1 assert len(block.find_ops(op_type="cond")) == 0 assert len(block.find_ops(op_type="abs")) == 0 def test_simple_transpose_squash(): """ Test eliminate consecutive transpose can be canceled """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x0): x1 = mb.transpose(x=x0, perm=[1, 0]) x2 = mb.transpose(x=x1, perm=[1, 0]) x3 = mb.log(x=x2) x4 = mb.transpose(x=x3, perm=[1, 0]) x5 = mb.transpose(x=x4, perm=[1, 0]) x6 = mb.transpose(x=x5, perm=[1, 0]) x7 = mb.transpose(x=x6, perm=[1, 0]) return x7 print("before:\n{}".format(prog)) block = prog.functions["main"] assert len(block.find_ops(op_type="transpose")) == 6 def can_squash(trans1, trans2): return ( len(trans1.outputs) == 1 and len(trans2.outputs) == 1 and all(trans1.perm.val == trans2.perm.val) ) # Find all candidate pairs transposes # we ignore all const (transpose_perm_%x), and add pairs of transpose op as # candidate. This won't generalize to more complicated program with other # shape invariant ops in between. 
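# Note: equal perms cancel out here only because [1, 0] is its own inverse; in general a pair of transposes cancels when the second perm is the complement (inverse) of the first.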
candidates = [] non_const_ops = [op for op in block.operations if op.op_type != "const"] for i in range(len(non_const_ops) - 1): op = non_const_ops[i] if len(candidates) and op == candidates[-1][1]: # op is already a squash candidate continue next_op = non_const_ops[i + 1] if ( op.op_type == "transpose" and next_op.op_type == "transpose" and can_squash(op, next_op) ): candidates.append((op, next_op)) # Remove each candidate pairs for (trans1, trans2) in candidates: before = trans1.inputs["x"] after = trans2.outputs[0] block.replace_uses_of_var_after_op( anchor_op=trans2, old_var=after, new_var=before ) block.remove_ops([trans1, trans2]) print("after:\n{}".format(prog)) assert len(block.find_ops(op_type="transpose")) == 0 def test_duplicate_outputs_add_consuming_block_once(): """The same consuming block should only be added once.""" @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4)), mb.TensorSpec(shape=(2, 4))]) def prog(x0, y0): x1 = mb.add(x=x0, y=y0) return x1, x1, x1 block = prog.functions["main"] assert len(block.outputs[0].consuming_blocks) == 1 assert len(block.outputs[1].consuming_blocks) == 1 assert len(block.outputs[2].consuming_blocks) == 1 def test_duplicate_outputs_substituion(): """Replaces var that appears more than once in outputs.""" @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4)), mb.TensorSpec(shape=(2, 4))]) def prog(x0, y0): x1 = mb.add(x=x0, y=y0) z = mb.log(x=x1) return x1, x1, z block = prog.functions["main"] add = block.find_ops(op_type="add")[0] x0 = add.inputs["x"] y0 = add.inputs["y"] x1 = add.outputs[0] with block: x2 = mb.mul(x=x0, y=y0, before_op=add, name="new_output") block.replace_uses_of_var_after_op(anchor_op=x2.op, old_var=x1, new_var=x2) block.remove_ops([add]) assert block.outputs[0].op.name == "new_output" assert block.outputs[1].op.name == "new_output" assert len(block.outputs[0].consuming_blocks) == 1 class TestCacheDoublyLinkedList: def test_basic(self): operations = CacheDoublyLinkedList() operations.insert_op_before(1) assert list(operations) == [1] operations.insert_op_before(2, before_op=1) assert list(operations) == [2, 1] operations.insert_op_before(3) assert list(operations) == [2, 1, 3] operations.insert_op_before(4, before_op=1) assert list(operations) == [2, 4, 1, 3] operations.remove(2) assert list(operations) == [4, 1, 3] operations.remove(3) assert list(operations) == [4, 1] operations.remove(4) assert list(operations) == [1] node = operations._get_node_from_op(1) operations.remove(1) assert list(operations) == [] assert node.prev is CacheDoublyLinkedList.INVALID_NODE assert node.next is CacheDoublyLinkedList.INVALID_NODE operations.insert_op_before(0) assert list(operations) == [0] operations = CacheDoublyLinkedList([1, 2, 3]) assert list(operations) == [1, 2, 3] operations = CacheDoublyLinkedList([]) assert list(operations) == [] def test_reversed(self): operations = CacheDoublyLinkedList([1, 2, 3]) assert list(reversed(operations)) == [3, 2, 1] def test_error(self): operations = CacheDoublyLinkedList([1, 2, 3]) assert operations[0] == 1 assert operations[-1] == 3 # Indexing doubly linked list is super expensive, we need to error out. with pytest.raises( ValueError, match="Doubly linked list does not support indexing other than 0, -1." ): operations[1] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/tests/test_debug.py0000644000000000000000000002416614672066616024447 0ustar00rootroot# Copyright (c) 2023, Apple Inc. 
All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import os import tempfile import pytest import numpy as np import coremltools as ct from coremltools.converters.mil import Builder as mb from coremltools.converters.mil.debugging_utils import extract_submodel from coremltools.converters.mil.mil import get_new_symbol from coremltools.converters.mil.mil.types.symbolic import is_symbolic from coremltools.converters.mil.testing_utils import get_op_types_in_program def get_simple_program(): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 3, 4)),]) def prog(x): x = mb.add(x=x, y=1.2, name="add") x = mb.transpose(x=x, perm=[0, 2, 3, 1]) x = mb.square(x=x, name="output_0") x = mb.tanh(x=x, name="output_1") x = mb.transpose(x=x, perm=[0, 2, 3, 1]) return x return prog def compute_ground_truth_answer(input): x = input + 1.2 x = np.transpose(x, axes=[0, 2, 3, 1]) square = x * x tanh = np.tanh(square) return {"output_0": square, "output_1":tanh} class TestExtractSubModel: def test_extract_submodel_error_handling(self): prog = get_simple_program() mlmodel = ct.convert(prog, convert_to="neuralnetwork") invalid_outputs = set() with pytest.raises(ValueError, match="outputs must be of type list/tuple. Got "): extract_submodel(mlmodel, outputs=invalid_outputs) invalid_outputs = ["output_1", 1] with pytest.raises(ValueError, match="outputs must be a list of str. Got element 1 with type ."): extract_submodel(mlmodel, outputs=invalid_outputs) invalid_outputs = ["output_1", "output_1"] with pytest.raises(ValueError, match="outputs must be a list of unique elements. 'output_1' occurs 2 times"): extract_submodel(mlmodel, outputs=invalid_outputs) invalid_outputs = ["error"] with pytest.raises(ValueError, match="outputs \['error'\] not found in the function."): extract_submodel(mlmodel, outputs=invalid_outputs) model_dir = tempfile.TemporaryDirectory() mlmodel_path = os.path.join(model_dir.name, "model.mlmodel") mlmodel.save(mlmodel_path) mlmodel = ct.models.MLModel(mlmodel_path) with pytest.raises(ValueError, match="NeuralNetwork model loaded from the disk is not supported by the extract_submodel util"): extract_submodel(mlmodel, outputs=["output_0", "output_1"]) def test_extract_submodel_symbolic_input(self): """ Input graph: x -> sin ---> sub -> output_1 | v mul -> tan -> output_2 If x has symbolic shape, then the subgraph mil -> tan should also have symbolic shape """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, get_new_symbol()))]) def prog(x): sin = mb.sin(x=x, name="sin") sub = mb.sub(x=sin, y=1.5, name="sub") mul = mb.mul(x=sin, y=1.2, name="mul") tan = mb.tan(x=mul, name="tan") return sub, tan model = ct.convert(prog, convert_to="neuralnetwork") submodel = extract_submodel(model, outputs=["tan"], inputs=["mul"]) func = submodel._mil_program.functions["main"] input = list(func.inputs.values())[0] assert input.shape[0] == 1 assert is_symbolic(input.shape[1]) output = func.outputs[0] assert output.shape[0] == 1 assert is_symbolic(output.shape[1]) def test_extract_submodel_complex(self): """ Input graph: x -> sin ------> sub -> output_1 | | v v y -> add -> mul -> tan -> realdiv -> output_2 """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2)), mb.TensorSpec(shape=(1, 2))]) def prog(x, y): sin = mb.sin(x=x, name="sin") add = mb.add(x=sin, y=y, name="add") sub = mb.sub(x=sin, y=1.5, name="sub") mul = mb.mul(x=sin, y=add, name="mul") tan = mb.tan(x=mul, 
name="tan") realdiv = mb.real_div(x=tan, y=4.7, name="realdiv") return sub, realdiv model = ct.convert(prog, convert_to="neuralnetwork") """ Case 1: inputs = None outputs = [sin, mul] Output graph: x -> sin ------> output_1 | | v v y -> add -> mul -> output_2 """ submodel = extract_submodel(model, outputs=["sin", "mul"]) assert get_op_types_in_program(submodel._mil_program) == ["sin", "add", "mul"] """ Case 2: inputs = None outputs = [sin, add] Output graph: x -> sin -> output_1 | v y -> add -> output_2 """ submodel = extract_submodel(model, outputs=["sin", "add"]) assert get_op_types_in_program(submodel._mil_program) == ["sin", "add"] """ Case 3: inputs = None outputs = [mul] Output graph: x -> sin ----- | | v v y -> add -> mul -> output_1 """ submodel = extract_submodel(model, outputs=["mul"]) assert get_op_types_in_program(submodel._mil_program) == ["sin", "add", "mul"] """ Case 4: inputs = None outputs = [sin, sub] Output graph: x -> sin -> sub -> output_2 | V output_1 y """ submodel = extract_submodel(model, outputs=["sin", "sub"]) assert get_op_types_in_program(submodel._mil_program) == ["sin", "sub"] """ Case 5: inputs = [x, y] outputs = [mul] Output graph: x -> sin ----- | | v v y -> add -> mul -> output_1 """ submodel = extract_submodel(model, outputs=["mul"], inputs=["x", "y"]) assert get_op_types_in_program(submodel._mil_program) == ["sin", "add", "mul"] """ Case 6: inputs = [mul] outputs = [tan] mul -> tan -> output_1 """ submodel = extract_submodel(model, outputs=["tan"], inputs=["mul"]) assert get_op_types_in_program(submodel._mil_program) == ["tan"] """ Case 7: inputs = [sin, add] outputs = [sub, mul] sin ------> sub -> output_1 | v add -> mul -> output_2 """ submodel = extract_submodel(model, outputs=["sub", "mul"], inputs=["sin", "add"]) assert get_op_types_in_program(submodel._mil_program) == ["sub", "mul"] """ Case 8 (Negative): inputs = [sin] outputs = [mul] mul not reachable merely through sin """ with pytest.raises(ValueError, match="output mul not reachable from inputs"): submodel = extract_submodel(model, outputs=["mul"], inputs=["sin"]) """ Case 9 (Negative): inputs = [mul] outputs = [sin] sin not reachable merely through sin """ with pytest.raises(ValueError, match="output sin not reachable from inputs"): submodel = extract_submodel(model, outputs=["sin"], inputs=["mul"]) @pytest.mark.parametrize( "compute_unit", [ ct.ComputeUnit.ALL, ct.ComputeUnit.CPU_ONLY, ], ) def test_extract_submodel_neuralnetwork(self, compute_unit): prog = get_simple_program() model = ct.convert(prog, convert_to="neuralnetwork", compute_units=compute_unit) submodel = extract_submodel(model, outputs=["output_0", "output_1"]) # check that the submodel retains the same backend assert submodel.get_spec().WhichOneof("Type") == "neuralNetwork" # check that the submodel retains the same compute unit assert submodel.compute_unit == compute_unit # check the subgraph assert get_op_types_in_program(submodel._mil_program) == ["add", "transpose", "square", "tanh"] # check the numerical outputs coreml_in = np.random.rand(1, 2, 3, 4) coreml_out = submodel.predict({"x": coreml_in}) gt = compute_ground_truth_answer(coreml_in) assert len(coreml_out) == len(gt) for k, v in gt.items(): np.testing.assert_allclose(v, coreml_out[k], atol=0.2) @pytest.mark.parametrize( "compute_unit, store_to_disk", itertools.product( [ ct.ComputeUnit.ALL, ct.ComputeUnit.CPU_ONLY, ], [True, False], ) ) def test_extract_submodel_mlprogram(self, compute_unit, store_to_disk): prog = get_simple_program() model = ct.convert( prog, 
convert_to="mlprogram", compute_units=compute_unit, compute_precision=ct.precision.FLOAT32 ) if store_to_disk: model_dir = tempfile.TemporaryDirectory() mlmodel_path = os.path.join(model_dir.name, "model.mlpackage") model.save(mlmodel_path) model = ct.models.MLModel(mlmodel_path, compute_units=compute_unit) submodel = extract_submodel(model, outputs=["output_0", "output_1"]) # check that the submodel retains the same backend assert submodel.get_spec().WhichOneof("Type") == "mlProgram" # check that the submodel retains the same compute unit assert submodel.compute_unit == compute_unit # check the subgraph assert get_op_types_in_program(submodel._mil_program) == ["add", "transpose", "square", "tanh"] # check the numerical outputs coreml_in = np.random.rand(1, 2, 3, 4) coreml_out = submodel.predict({"x": coreml_in}) gt = compute_ground_truth_answer(coreml_in) assert len(coreml_out) == len(gt) for k, v in gt.items(): np.testing.assert_allclose(v, coreml_out[k], atol=0.2) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/tests/test_programs.py0000644000000000000000000022766214672066616025221 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np import pytest import coremltools as ct from coremltools import _logger as logger from coremltools.converters.mil import mil from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Function, Program, types from coremltools.converters.mil.mil.passes.tests.test_passes import CONSTEXPR_FUNCS from coremltools.converters.mil.mil.scope import ScopeInfo, ScopeSource, add_graph_pass_scope np.random.seed(0) def test_single_layer_example(): batch_size, input_dim, output_dim = 2, 4, 2 @mb.program( input_specs=[mb.TensorSpec(shape=(batch_size, input_dim)),] ) def prog(x): # Weight W_val = ( np.array([0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]) .reshape(input_dim, output_dim) .T.astype(np.float32) ) W = mb.const(val=W_val, name="const_W") # bias b_val = np.array([-0.5, 0.5]).astype(np.float32) b = mb.const(val=b_val, name="const_b") return mb.linear(x=x, weight=W, bias=b, name="lin") logger.info("prog:\n" + str(prog)) mlmodel = ct.convert(prog, source="milinternal", convert_to="neuralnetwork") feed_dict = { "x": np.random.rand(batch_size, input_dim).astype(np.float32), } assert mlmodel is not None if ct.utils._is_macos(): prediction = mlmodel.predict(feed_dict) assert len(prediction) == 1 def test_conv_example(): batch, C_in, C_out, H, W = 2, 2, 3, 7, 10 kH, kW = 3, 5 img_shape, seq_shape = (batch, C_in, H, W), (batch, C_in, H) @mb.program( input_specs=[mb.TensorSpec(shape=img_shape), mb.TensorSpec(shape=seq_shape),] ) def prog(img, seq): ## 2D convolution # Weight W_2d = np.random.rand(C_out, C_in, kH, kW).astype(np.float32) W_2d = mb.const(val=W_2d, name="const_W") # Test 1: provide only required arguments. 
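        # (Added sketch, not an assertion of the original test) With
        # pad_type="valid" and the default stride of 1, the expected spatial
        # output is (H - kH + 1, W - kW + 1) = (5, 6), i.e. conv1 should have
        # shape (batch, C_out, 5, 6).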
conv1 = mb.conv(x=img, weight=W_2d, pad_type="valid") logger.info("conv1 shape: {}".format(conv1.shape)) # Test 2: stride > 1 conv2 = mb.conv(x=img, weight=W_2d, pad_type="valid", strides=[2, 3]) logger.info("conv2 shape: {}".format(conv2.shape)) # Test 3: same padding conv3 = mb.conv(x=img, weight=W_2d, pad_type="same", strides=[2, 3]) logger.info("conv3 shape: {}".format(conv3.shape)) # Test max_pool pool1 = mb.max_pool( x=img, kernel_sizes=[kH, kW], pad_type="valid", strides=[2, 3] ) logger.info("pool1 shape: {}".format(pool1.shape)) # Test max_pool pool2 = mb.max_pool( x=img, kernel_sizes=[kH, kW], pad_type="same", strides=[2, 3] ) logger.info("pool2 shape: {}".format(pool2.shape)) ## 1D convolution W_1d = np.random.rand(C_out, C_in, kH).astype(np.float32) W_1d = mb.const(val=W_1d, name="const_W_1d") logger.info("W_1d val: {}".format(W_1d.val)) # Test 4: provide only required arguments for 1D. conv4 = mb.conv(x=seq, weight=W_1d, pad_type="valid") logger.info("conv4 shape: {}".format(conv4.shape)) return conv1, conv2, conv3, pool1, pool2, conv4 # rdar://105988903 ([Infra] re-enable the test_conv_example unit test on M1 with compute_units=ALL) mlmodel = ct.convert(prog, source="milinternal", convert_to="neuralnetwork", compute_units=ct.ComputeUnit.CPU_ONLY) feed_dict = { "img": np.random.rand(*img_shape).astype(np.float32), "seq": np.random.rand(*seq_shape).astype(np.float32), } assert mlmodel is not None if ct.utils._is_macos(): prediction = mlmodel.predict(feed_dict) assert len(prediction) == 6 def test_while_example(): def body(a, b): return mb.add(x=a, y=b), b def cond(a, b): a_mean = mb.reduce_mean(x=a, axes=[0, 1]) b_mean = mb.reduce_mean(x=b, axes=[0, 1]) return mb.less(x=a_mean, y=b_mean) @mb.program( input_specs=[mb.TensorSpec(shape=(1, 2)), mb.TensorSpec(shape=(1, 2)),] ) def prog(a, b): return mb.while_loop(_cond=cond, _body=body, loop_vars=(a, b)) logger.info("prog:\n" + str(prog)) mlmodel = ct.convert(prog, source="milinternal", convert_to="neuralnetwork") feed_dict = { "a": np.random.rand(1, 2).astype(np.float32), "b": np.random.rand(1, 2).astype(np.float32), } assert mlmodel is not None if ct.utils._is_macos(): prediction = mlmodel.predict(feed_dict) assert len(prediction) == 2 def test_reserved_node_names(): @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20))]) def prog(x): return mb.square(x=x, name="tensor") mlmodel = ct.convert( prog, source="milinternal", convert_to="mlprogram", compute_units=ct.ComputeUnit.CPU_ONLY ) feed_dict = { "x": np.random.rand(10, 20).astype(np.float32), } assert mlmodel is not None if ct.utils._is_macos(): prediction = mlmodel.predict(feed_dict) assert len(prediction) == 1 def get_simple_topk_program(opset_version=None): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 1, 4, 4))], opset_version=opset_version) def prog(x): x = mb.topk(x=x, k=1, axis=-1, ascending=True) return x return prog def get_simple_pixel_unshuffle_program(opset_version=None): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 1, 4, 4))], opset_version=opset_version) def prog(x): x = mb.pixel_unshuffle(x=x, downscale_factor=np.uint32(2)) return x return prog def get_simple_topk_pixel_unshuffle_program(opset_version=None): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 1, 4, 4))], opset_version=opset_version) def prog(x): x = mb.pixel_unshuffle(x=x, downscale_factor=np.uint32(2)) x = mb.topk(x=x, k=1, axis=-1, ascending=True) return x return prog def get_simple_nested_block_program(opset_version=None): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 1, 4, 4))], 
opset_version=opset_version) def prog(x): def true_fn(): topk, _ = mb.topk(x=x, k=1, axis=-1, ascending=True) return mb.add(x=topk, y=1.) def false_fn(): topk, _ = mb.topk(x=x, k=1, axis=-1, ascending=True) return mb.add(x=topk, y=2.) shape = mb.shape(x=x) rank = mb.shape(x=shape) pred = mb.squeeze(x=rank) return mb.cond(pred=mb.cast(x=pred, dtype="bool"), _true_fn=true_fn, _false_fn=false_fn) return prog class TestMILProgramVersionHandling: """ Test basic functionality of opset version handling in pymil """ @staticmethod def test_multi_versions_op_selection(): ''' Builder should pick up the right version of op based on opset_version ''' # pick up the oldest version (iOS13) topk by default prog = get_simple_topk_program() main_func = prog.functions["main"] topk_op = main_func.find_ops(op_type="topk")[0] assert topk_op.opset_version == ct.target.iOS13 # pick up iOS13 version topk prog = get_simple_topk_program(opset_version=ct.target.iOS15) main_func = prog.functions["main"] topk_op = main_func.find_ops(op_type="topk")[0] assert topk_op.opset_version == ct.target.iOS13 # pick up iOS16 version topk prog = get_simple_topk_program(opset_version=ct.target.iOS16) main_func = prog.functions["main"] topk_op = main_func.find_ops(op_type="topk")[0] assert topk_op.opset_version == ct.target.iOS16 @staticmethod def test_pymil_front_end_conversion(): prog = get_simple_topk_pixel_unshuffle_program(opset_version=ct.target.iOS16) mlmodel = ct.convert( prog, minimum_deployment_target=ct.target.iOS16, compute_units=ct.ComputeUnit.CPU_ONLY ) @staticmethod def test_nested_block_opset_version_selection(): # pick up the oldest version (iOS13) topk by default prog = get_simple_nested_block_program() main_func = prog.functions["main"] topk_ops = main_func.find_ops(op_type="topk") assert all([topk.opset_version == ct.target.iOS13 for topk in topk_ops]) # pick up iOS16 version topk prog = get_simple_nested_block_program(opset_version=ct.target.iOS16) main_func = prog.functions["main"] topk_ops = main_func.find_ops(op_type="topk") assert all([topk.opset_version == ct.target.iOS16 for topk in topk_ops]) @staticmethod def test_pymil_opset_version_inference(): ''' The program consist of pixel_unshuffle should be inferred as an iOS16 version program ''' prog = get_simple_pixel_unshuffle_program() assert prog.functions["main"].opset_version == ct.target.iOS16 expected_err_str = ( "Please update the minimum_deployment_target to coremltools.target.iOS16, " "since op pixel_unshuffle is only available in opset coremltools.target.iOS16 or newer." ) with pytest.raises(ValueError, match=expected_err_str): mlmodel = ct.convert( prog, convert_to="mlprogram", compute_units=ct.ComputeUnit.CPU_ONLY ) @staticmethod def test_pymil_front_end_conversion_early_error_out(): prog = get_simple_topk_pixel_unshuffle_program(opset_version=ct.target.iOS16) expected_err_str = ( "Please update the minimum_deployment_target to coremltools.target.iOS16, " "since op pixel_unshuffle is only available in opset coremltools.target.iOS16 or newer." ) with pytest.raises(ValueError, match=expected_err_str): mlmodel = ct.convert( prog, minimum_deployment_target=ct.target.iOS15, compute_units=ct.ComputeUnit.CPU_ONLY, ) @staticmethod def test_unsupported_op_early_error_out(): ''' We should error out at the point when Builder tries to add an op which is only supported in a newer spec version ''' expected_err_str = ( "No available version for pixel_unshuffle in the coremltools.target.iOS15 opset. 
" "Please update the minimum_deployment_target to at least coremltools.target.iOS16" ) with pytest.raises(ValueError, match=expected_err_str): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 1, 4, 4))], opset_version=ct.target.iOS15) def prog(x): x = mb.pixel_unshuffle(x=x, downscale_factor=np.uint32(2)) return x @staticmethod def test_bulid_non_compatible_program_early_error_out(): ''' `mb.program` API should detect potential non compatible ops in the program, and error out early In this example, `pixel_unshuffle` is an iO16 op, and `topk` has iOS13 and iOS16 version. If the builder version is not set, it is picking up the iOS13 version of topk, which would potentially create an invalid program. In this case, `mb.program` should error out, and tell the user to set `opset_version=target.iOS16` ''' expected_err_str = ( "Op topk with an out of date version coremltools.target.iOS13 is detected. Please use @mb.program\(input_specs=..., opset_version=coremltools.target.iOS16\)" ) with pytest.raises(ValueError, match=expected_err_str): get_simple_topk_pixel_unshuffle_program() class TestMILBuilderAPI: """ Test the basic builder API. """ def test_create_function(self): """ Test mb.function API """ @mb.function(input_specs=[mb.TensorSpec(shape=(2, 4))]) def func(x): return mb.add(x=x, y=0.0) assert isinstance(func, Function) assert len(func.operations) == 2 # add, const assert len(func.inputs) == 1 assert len(func.outputs) == 1 def test_create_program(self): """ Test mb.program API """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): return mb.add(x=x, y=0.0) assert isinstance(prog, Program) func = prog.functions["main"] assert len(func.operations) == 2 # add, const assert len(func.inputs) == 1 assert len(func.outputs) == 1 def test_create_program_function_name(self): """ If ``function_name`` is not provide, mb.program creates function with name "main" by default. """ # defaults to "main" @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x0): return x0 assert len(prog.functions) == 1 assert "main" in prog.functions # user can also provide function_name @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))], function_name="good_function") def prog(x0): return x0 assert len(prog.functions) == 1 assert "good_function" in prog.functions def test_program_with_multiple_functions(self): """ Basic creation of a program with multiple functions """ @mb.function(input_specs=[mb.TensorSpec(shape=(2, 4))]) def func_1(x): return x @mb.function(input_specs=[mb.TensorSpec(shape=(2, 4))]) def func_2(x): return x @mb.function(input_specs=[mb.TensorSpec(shape=(2, 4))]) def func_3(x): return x prog = mil.Program() prog.add_function("func_1", func_1) prog.add_function("func_2", func_2) prog.add_function("func_3", func_3) assert set(prog.functions.keys()) == set(["func_1", "func_2", "func_3"]) def test_error_out_incompatible_functions(self): """ ``add_function`` should error out when a function with different opset is added to a program. """ @mb.function(input_specs=[mb.TensorSpec(shape=(2, 4))], opset_version=ct.target.iOS13) def func_1(x): return x @mb.function(input_specs=[mb.TensorSpec(shape=(2, 4))], opset_version=ct.target.iOS17) def func_2(x): return x err_msg = "all functions must have the same opset_version." 
prog = mil.Program() prog.add_function("func_1", func_1) with pytest.raises(ValueError, match=err_msg): prog.add_function("func_2", func_2) prog = mil.Program() prog.add_function("func_2", func_2) with pytest.raises(ValueError, match=err_msg): prog.add_function("func_1", func_1) class TestMILBasic: """ Test the basic error handling / validation in pymil. """ @staticmethod def test_type_domain_validation(): ''' The builder should error out early when detecting the input type violation against the defined type_domain ''' expected_err_str = ( "In op, of type rsqrt, named rsqrt_0, the named input `epsilon` must have the same data type as the named input `x`. However, epsilon has dtype int32 whereas x has dtype fp32" ) with pytest.raises(ValueError, match=expected_err_str): @mb.program(input_specs=[mb.TensorSpec(shape=(2,), dtype=types.fp32)]) def prog(x): res = mb.rsqrt(x=x, epsilon=1) return res @staticmethod def test_get_dialect_namespaces(): """ Test we can get a dict of dialect namespaces in the program. """ # The pymil program is mixed of torch / complex dialect opset @mb.program(input_specs=[mb.TensorSpec(shape=(2, 2, 3, 4), dtype=types.fp32)]) def prog(x): real_data = mb.torch_upsample_nearest_neighbor( x=x, output_height=10, output_width=5, name="op_1" ) imag_data = mb.add(x=real_data, y=8.9, name="op_2") return mb.complex(real_data=real_data, imag_data=imag_data, name="op_3") dialect_namespaces = prog._get_dialect_namespaces() assert len(dialect_namespaces["torch"]) == 1 assert dialect_namespaces["torch"][0].name == "op_1" assert len(dialect_namespaces["complex"]) == 1 assert dialect_namespaces["complex"][0].name == "op_3" # The pymil program with only core ops returns an empty dict @mb.program(input_specs=[mb.TensorSpec(shape=(2, 2, 3, 4), dtype=types.fp32)]) def prog(x): return mb.add(x=x, y=8.9) assert len(prog._get_dialect_namespaces()) == 0 @staticmethod def test_invalid_dialect_namespaces_error_out(): """ The converter should early error out if dialect opset is detected in the pymil program. """ # The pymil program of torch dialect opset cannot be lowered to backend @mb.program(input_specs=[mb.TensorSpec(shape=(2, 2, 3, 4), dtype=types.fp32)]) def prog(x): return mb.torch_upsample_nearest_neighbor( x=x, output_height=10, output_width=5, name="op_1" ) expected_err_str = 'Core ML only support core opset. Got unsupported op "op_1" with type "torch_upsample_nearest_neighbor" of dialect namespace "torch".' with pytest.raises(ValueError, match=expected_err_str): ct.convert(prog, convert_to="mlprogram", pass_pipeline=ct.PassPipeline.EMPTY) @staticmethod def test_rank6_tensor_early_error_out(): ''' The builder should error out early when detecting a rank 6 (or higher) tensor which cannot be eliminated by graph passes ''' @mb.program(input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)]) def prog(x): res = mb.reshape(x=x, shape=(1, 1, 1, 1, 1, 1), name="reshape_0") return res expected_err_str = ( "Core ML only supports tensors with rank <= 5. Layer \"reshape_0\", with type \"reshape\", outputs a rank 6 tensor" ) with pytest.raises(ValueError, match=expected_err_str): ct.convert( prog, source="milinternal", convert_to="neuralnetwork", compute_units=ct.ComputeUnit.CPU_ONLY, ) @staticmethod def test_rank5_list_early_error_out(): ''' The builder should error out early when detecting a list of rank 5 (or higher) tensors is created ''' expected_err_str = ( "Core ML only supports list of elements with rank <= 4. Layer \"list_0\", with type \"make_list\", outputs a list of rank 5 tensors." 
) with pytest.raises(ValueError, match=expected_err_str): @mb.program(input_specs=[mb.TensorSpec(shape=(1,), dtype=types.fp32)]) def prog(x): ls = mb.make_list( init_length=1, dtype="fp32", elem_shape=(1, 1, 1, 1, 1), dynamic_length=True, name="list_0", ) return ls @staticmethod def test_invalid_const_input_early_error_out(): """ The following program: constexpr -> transpose -> linear will not error out during the front end conversion, even though the weight of linear op needs to be const / constexpr directly. It is going to error out after all the optimization graph passes are finished, and transpose remains. However, if transpose can be removed, the conversion goes through. """ # Test a simple constexpr op fed into linear @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): constexpr = CONSTEXPR_FUNCS["constexpr_affine_dequantize"]((4, 3)) return mb.linear(x=x, weight=constexpr) for compute_precision in [ct.precision.FLOAT32, ct.precision.FLOAT16]: mlmodel = ct.convert( prog, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS16, compute_units=ct.ComputeUnit.CPU_ONLY, compute_precision=compute_precision, ) # Additional pattern (transpose) after constexpr will cause an early error out @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): constexpr = CONSTEXPR_FUNCS["constexpr_affine_dequantize"]((3, 4)) constexpr = mb.transpose(x=constexpr, perm=[1, 0]) return mb.linear(x=x, weight=constexpr) for compute_precision in [ct.precision.FLOAT32, ct.precision.FLOAT16]: with pytest.raises(ValueError, match="must be const or constexpr ops"): mlmodel = ct.convert( prog, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS16, pass_pipeline=ct.PassPipeline.EMPTY, compute_units=ct.ComputeUnit.CPU_ONLY, compute_precision=compute_precision, ) # If the transpose is removed by graph pass merge_affine_dequantize_with_consecutive_ops, # the conversion goes through @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): constexpr = CONSTEXPR_FUNCS["constexpr_affine_dequantize"]((4, 3)) constexpr = mb.transpose(x=constexpr, perm=[0, 1]) return mb.linear(x=x, weight=constexpr) mlmodel = ct.convert( prog, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS16, compute_units=ct.ComputeUnit.CPU_ONLY, compute_precision=compute_precision, ) class TestScope: @staticmethod def test_basic_single_TorchScript_scope(): # single scope with scope_name and scope_type @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data="module_1"), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="Module1"), ): return mb.add(x=x, y=5.4) add_op = prog.find_ops(op_type="add")[0] assert add_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_NAME] == ["module_1"] assert add_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["Module1"] # single scope with scope_name and scope_type with list type @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=["module_1"]), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["Module1"]), ): return mb.add(x=x, y=5.4) add_op = prog.find_ops(op_type="add")[0] assert add_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_NAME] == ["module_1"] assert add_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["Module1"] # single scope with scope_type and no scope_name @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( 
ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_1"]), ): return mb.add(x=x, y=5.4) add_op = prog.find_ops(op_type="add")[0] assert add_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["module_1"] assert ScopeSource.TORCHSCRIPT_MODULE_NAME not in add_op.scopes # nested scope in a single mb.scope call. Both scope_name and scope_type provided @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo( source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=["module_1", "module_2"] ), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["Module1", "Module2"]), ): return mb.add(x=x, y=5.4) add_op = prog.find_ops(op_type="add")[0] assert add_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_NAME] == [ "module_1", "module_2", ] assert add_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == [ "Module1", "Module2", ] # nested scope in a single mb.scope call. Only scope_type provided @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo( source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_1", "module_2"] ), ): return mb.add(x=x, y=5.4) add_op = prog.find_ops(op_type="add")[0] assert add_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == [ "module_1", "module_2", ] assert ScopeSource.TORCHSCRIPT_MODULE_NAME not in add_op.scopes @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=["", ""]), ScopeInfo( source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_1", "module_2"] ), ): return mb.add(x=x, y=5.4) add_op = prog.find_ops(op_type="add")[0] assert add_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_NAME] == ["", ""] assert add_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == [ "module_1", "module_2", ] @staticmethod def test_basic_nested_TorchScript_scope(): # nested scope with scope_name and scope_type @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data="module_1"), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="Module1"), ): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data="module_2"), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="Module2"), ): x = mb.add(x=x, y=5.4) return mb.add(x=x, y=0.0) add_op_1 = prog.find_ops(op_type="add")[0] assert add_op_1.scopes[ScopeSource.TORCHSCRIPT_MODULE_NAME] == [ "module_1", "module_2", ] assert add_op_1.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == [ "Module1", "Module2", ] add_op_2 = prog.find_ops(op_type="add")[1] assert add_op_2.scopes[ScopeSource.TORCHSCRIPT_MODULE_NAME] == ["module_1"] assert add_op_2.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["Module1"] # nested scope with scope_name and scope_type with list type @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=["module_1"]), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["Module1"]), ): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=["module_2"]), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["Module2"]), ): x = mb.add(x=x, y=5.4) return mb.add(x=x, y=0.0) add_op_1 = prog.find_ops(op_type="add")[0] assert add_op_1.scopes[ScopeSource.TORCHSCRIPT_MODULE_NAME] == [ "module_1", "module_2", ] assert add_op_1.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == [ "Module1", "Module2", ] add_op_2 = prog.find_ops(op_type="add")[1] assert 
add_op_2.scopes[ScopeSource.TORCHSCRIPT_MODULE_NAME] == ["module_1"] assert add_op_2.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["Module1"] # nested scope with scope_name and no scope_type @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_1"]), ): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_2"]), ): x = mb.add(x=x, y=5.4) return mb.add(x=x, y=0.0) add_op_1 = prog.find_ops(op_type="add")[0] assert ScopeSource.TORCHSCRIPT_MODULE_NAME not in add_op_1.scopes assert add_op_1.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == [ "module_1", "module_2", ] add_op_2 = prog.find_ops(op_type="add")[1] assert ScopeSource.TORCHSCRIPT_MODULE_NAME not in add_op_2.scopes assert add_op_2.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["module_1"] # nested scope in a nested mb.scope call. Both scope_name and scope_type provided @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo( source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=["module_1", "module_2"] ), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["Module1", "Module2"]), ): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data="module_3"), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="Module3"), ): x = mb.add(x=x, y=5.4) return mb.add(x=x, y=0.0) add_op_1 = prog.find_ops(op_type="add")[0] assert add_op_1.scopes[ScopeSource.TORCHSCRIPT_MODULE_NAME] == [ "module_1", "module_2", "module_3", ] assert add_op_1.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == [ "Module1", "Module2", "Module3", ] add_op_2 = prog.find_ops(op_type="add")[1] assert add_op_2.scopes[ScopeSource.TORCHSCRIPT_MODULE_NAME] == [ "module_1", "module_2", ] assert add_op_2.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == [ "Module1", "Module2", ] # nested scope in a single mb.scope call. 
Only scope_type provided @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo( source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_1", "module_2"] ), ): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_3"]), ): x = mb.add(x=x, y=5.4) return mb.add(x=x, y=0.0) add_op_1 = prog.find_ops(op_type="add")[0] assert ScopeSource.TORCHSCRIPT_MODULE_NAME not in add_op_1.scopes assert add_op_1.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == [ "module_1", "module_2", "module_3", ] add_op_2 = prog.find_ops(op_type="add")[1] assert ScopeSource.TORCHSCRIPT_MODULE_NAME not in add_op_2.scopes assert add_op_2.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == [ "module_1", "module_2", ] @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo( source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_1", "module_2"] ), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=["", ""]), ): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_3"]), ): x = mb.add(x=x, y=5.4) return mb.add(x=x, y=0.0) add_op_1 = prog.find_ops(op_type="add")[0] assert add_op_1.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == [ "module_1", "module_2", "module_3", ] assert add_op_1.scopes[ScopeSource.TORCHSCRIPT_MODULE_NAME] == ["", ""] add_op_2 = prog.find_ops(op_type="add")[1] assert add_op_2.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == [ "module_1", "module_2", ] assert add_op_2.scopes[ScopeSource.TORCHSCRIPT_MODULE_NAME] == ["", ""] @staticmethod def test_graph_pass_scope_handling(): # default list type @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo( source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data="pass_1", ), ): return mb.add(x=x, y=0.0) add_op_1 = prog.find_ops(op_type="add")[0] assert add_op_1.scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS] == [ "pass_1", ] # data cannot have len > 1 @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with pytest.raises( ValueError, match="COREMLTOOLS_GRAPH_PASS scope cannot have len > 1." 
): with mb.scope( ScopeInfo( source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_1", "pass_2"], ), ): return mb.add(x=x, y=0.0) return x # nested graph pass scope is allowed @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo( source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data="pass_1", ), ): with mb.scope( ScopeInfo( source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data="pass_2", ), ): return mb.add(x=x, y=0.0) add_op_1 = prog.find_ops(op_type="add")[0] assert add_op_1.scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS] == [ "pass_1", "pass_2", ] @staticmethod def test_EXIR_scope_handling(): # default list type @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.EXIR_STACK_TRACE, data=["x + 0.0"]), ScopeInfo(source=ScopeSource.EXIR_DEBUG_HANDLE, data=[1]), ): return mb.add(x=x, y=0.0) add_op_1 = prog.find_ops(op_type="add")[0] assert add_op_1.scopes[ScopeSource.EXIR_STACK_TRACE] == ["x + 0.0"] assert add_op_1.scopes[ScopeSource.EXIR_DEBUG_HANDLE] == [1] # debug handle data cannot have len > 1 @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with pytest.raises(ValueError, match="EXIR_DEBUG_HANDLE scope cannot have len > 1."): with mb.scope(ScopeInfo(source=ScopeSource.EXIR_DEBUG_HANDLE, data=[2, 3])): return mb.add(x=x, y=0.0) return x # nested graph pass scope is allowed @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope(ScopeInfo(source=ScopeSource.EXIR_DEBUG_HANDLE, data=[None])): with mb.scope(ScopeInfo(source=ScopeSource.EXIR_DEBUG_HANDLE, data=[0])): with mb.scope(ScopeInfo(source=ScopeSource.EXIR_STACK_TRACE, data=["x + 0.0"])): with mb.scope(ScopeInfo(source=ScopeSource.EXIR_STACK_TRACE, data=[None])): return mb.add(x=x, y=0.0) add_op_1 = prog.find_ops(op_type="add")[0] assert add_op_1.scopes[ScopeSource.EXIR_STACK_TRACE] == ["x + 0.0", None] assert add_op_1.scopes[ScopeSource.EXIR_DEBUG_HANDLE] == [None, 0] @staticmethod def test_invalid_dtype_error_out(): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with pytest.raises( ValueError, match="Scope must be type of List\[str\]. Got element 9 with type \.", ): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=["m1", 9]), ScopeInfo( source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["Module1", "Module2"] ), ): return mb.add(x=x, y=5.4) with pytest.raises( ValueError, match="Scope must be type of List\[str\]. 
Got element 0 with type \.", ): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=["m1", "m2"]), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["Module1", 0]), ): return mb.add(x=x, y=5.4) return x @staticmethod def test_empty_scope(): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope(): return mb.add(x=x, y=5.4) add_op = prog.find_ops(op_type="add")[0] assert ScopeSource.TORCHSCRIPT_MODULE_TYPE not in add_op.scopes assert ScopeSource.TORCHSCRIPT_MODULE_NAME not in add_op.scopes @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope(): with mb.scope(): return mb.add(x=x, y=5.4) add_op = prog.find_ops(op_type="add")[0] assert ScopeSource.TORCHSCRIPT_MODULE_TYPE not in add_op.scopes assert ScopeSource.TORCHSCRIPT_MODULE_NAME not in add_op.scopes @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope(): with mb.scope(ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="m1")): with mb.scope(): return mb.add(x=x, y=5.4) add_op = prog.find_ops(op_type="add")[0] assert add_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == ["m1"] assert ScopeSource.TORCHSCRIPT_MODULE_NAME not in add_op.scopes @staticmethod def test_empty_scope_type_error_out(): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with pytest.raises( ValueError, match="TORCHSCRIPT_MODULE_TYPE scope info cannot contains empty string." ): with mb.scope(ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="")): with mb.scope(): return mb.add(x=x, y=5.4) with pytest.raises( ValueError, match="TORCHSCRIPT_MODULE_TYPE scope info cannot contains empty string." ): with mb.scope( ScopeInfo( source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["a", ""], ) ): with mb.scope(): return mb.add(x=x, y=5.4) return x @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo( source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_1"], ) ): with pytest.raises( ValueError, match="TORCHSCRIPT_MODULE_TYPE scope info cannot contains empty string.", ): with mb.scope( ScopeInfo( source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=[""], ) ): return mb.add(x=x, y=5.4) with pytest.raises( ValueError, match="TORCHSCRIPT_MODULE_TYPE scope info cannot contains empty string.", ): with mb.scope( ScopeInfo( source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["a", "", ""], ) ): return mb.add(x=x, y=5.4) return x @staticmethod def test_white_space_handling(): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=[" module_1 "]), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=[" Module1"]), ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=[" pass_1"]), ): return mb.add(x=x, y=5.4) add_op = prog.find_ops(op_type="add")[0] assert add_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_NAME] == [ "module_1", ] assert add_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == [ "Module1", ] assert add_op.scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS] == [ "pass_1", ] @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=[" Module1 ", " "]), ScopeInfo( source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=[" module_1 ", " module_2 "] ), ): return mb.add(x=x, y=5.4) add_op = prog.find_ops(op_type="add")[0] assert add_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] == [ "module_1", "module_2", ] assert 
add_op.scopes[ScopeSource.TORCHSCRIPT_MODULE_NAME] == ["Module1", ""] @staticmethod def test_duplicated_scope_source_error_out(): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with pytest.raises( ValueError, match="Scope source ScopeSource.TORCHSCRIPT_MODULE_TYPE duplicated." ): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="a1"), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data="a2"), ): return mb.add(x=x, y=5.4) return x @staticmethod def test_check_prog_has_scope_error_out(): def get_prog(): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=["module_1"]), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["Module1"]), ): x = mb.add(x=x, y=5.4) x = mb.relu(x=x, name="invalid_op") return x return prog prog = get_prog() prog._add_essential_scope_source( [ScopeSource.TORCHSCRIPT_MODULE_TYPE, ScopeSource.TORCHSCRIPT_MODULE_NAME] ) with pytest.raises( ValueError, match="is missing essential scopes ScopeSource.TORCHSCRIPT_MODULE_TYPE" ): prog.validate(check_essential_scope=True) # If check_essential_scope is not passes, it will not error out prog.validate() # No error if no essential scope source are set prog = get_prog() prog.validate(check_essential_scope=True) @staticmethod def test_invalid_scope_source_type(): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with pytest.raises(TypeError, match="'source' must be \"): with mb.scope( ScopeInfo(source="invalid_source", data="a1"), ): return mb.add(x=x, y=5.4) return x @staticmethod def test_invalid_scope_info_type(): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with pytest.raises( ValueError, match="mb.scope only accepts inputs of type ScopeInfo. Got \.", ): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=["module_1"]), "invalid", ): return mb.add(x=x, y=5.4) return x @staticmethod def test_scope_setter_immutable(): """ When setting the `scopes` property for an op, the value should be deep copied. """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=["module_1"]), ): x = mb.add(x=x, y=5.4) with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=["module_2"]), ): y = mb.add(x=x, y=5.4) x.scopes = y.scopes y.scopes[ScopeSource.TORCHSCRIPT_MODULE_NAME][0] = "invalid" assert x.scopes[ScopeSource.TORCHSCRIPT_MODULE_NAME][0] == "module_2" return x @staticmethod def test_scopes_for_function_inputs(): """ If a var's parent op is a placeholder, we cannot set its scopes. And its scopes is an empty dictionary. """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3))]) def prog(x): assert len(x.scopes) == 0 with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=["module_1"]), ): y = mb.add(x=x, y=5.4) with pytest.raises( ValueError, match="Cannot set scopes to a function input var", ): x.scopes = y.scopes return y @staticmethod def test_add_graph_pass_scope(): """ Test the rules of merging two scopes. 
""" # Rule of merging COREMLTOOLS_GRAPH_PASS old_scopes = { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } new_scopes = { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_2", "pass_3"], } res = dict(add_graph_pass_scope(old_scopes, new_scopes)) assert res == { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1", "pass_2", "pass_3"], } # Ensure we make a copy of the list old_scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS][0] = "invalid" assert res == { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1", "pass_2", "pass_3"], } new_scopes[ScopeSource.COREMLTOOLS_GRAPH_PASS][0] = "invalid" assert res == { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1", "pass_2", "pass_3"], } # Another test old_scopes = { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.TORCHSCRIPT_MODULE_NAME: ["a1"], } new_scopes = { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } res = add_graph_pass_scope(old_scopes, new_scopes) assert res == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.TORCHSCRIPT_MODULE_NAME: ["a1"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } # Ensure we make a copy of the list old_scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE][0] = "invalid" assert res == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.TORCHSCRIPT_MODULE_NAME: ["a1"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } old_scopes[ScopeSource.TORCHSCRIPT_MODULE_NAME][0] = "invalid" assert res == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.TORCHSCRIPT_MODULE_NAME: ["a1"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } # Test for other scope source old_scopes = { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.TORCHSCRIPT_MODULE_NAME: ["a1"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } new_scopes = { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_2"], } with pytest.raises( AssertionError, match="Only ScopeSource.COREMLTOOLS_GRAPH_PASS is allowed in the graph_pass_scopes.", ): add_graph_pass_scope(old_scopes, new_scopes) @staticmethod def test_scope_preservation_when_reconnect_graph(): """ If the _replace_var is doing reconnection of the graph, without any new op introduced, no scope information is going to change. """ def get_prog(): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_1"]), ): relu = mb.relu(x=x) with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_2"]), ): sin = mb.sin(x=x) with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_3"]), ): return mb.relu(x=relu) return prog # Case 1: No graph pass is involved, and only reconnect graph is done. # Scope information will not change. prog = get_prog() block = prog.functions["main"] ops = list(block.operations) var_1, var_2 = ops[0].outputs[0], ops[1].outputs[0] assert var_1.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], } assert var_2.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_2"], } block._replace_var(var_1, var_2) assert var_1.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], } assert var_2.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_2"], } # Case 2: Even the reconnection happens under graph pass, nothing will change. 
prog = get_prog() block = prog.functions["main"] ops = list(block.operations) var_1, var_2 = ops[0].outputs[0], ops[1].outputs[0] with mb.scope(ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["dummy_pass"])): block._replace_var(var_1, var_2) assert var_1.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], } assert var_2.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_2"], } # Case 3: old_var and new_var are created under a graph pass, and the reconnection happens under the # same graph pass. Nothing will change still. with mb.scope(ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["dummy_pass"])): prog = get_prog() block = prog.functions["main"] ops = list(block.operations) var_1, var_2 = ops[0].outputs[0], ops[1].outputs[0] assert var_1.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["dummy_pass"], } assert var_2.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_2"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["dummy_pass"], } block._replace_var(var_1, var_2) assert var_2.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_2"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["dummy_pass"], } # Case 4: Ops are created under a graph pass, and the reconnection happens outside the graph pass. # Nothing happens. with mb.scope(ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["dummy_pass"])): prog = get_prog() block = prog.functions["main"] ops = list(block.operations) var_1, var_2 = ops[0].outputs[0], ops[1].outputs[0] assert var_1.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["dummy_pass"], } assert var_2.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_2"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["dummy_pass"], } block._replace_var(var_1, var_2) assert var_1.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["dummy_pass"], } assert var_2.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_2"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["dummy_pass"], } # Case 5: Ops are created under a graph pass 1, and the reconnection happens under graph pass2. # Nothing happens. with mb.scope(ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["dummy_pass"])): prog = get_prog() block = prog.functions["main"] ops = list(block.operations) var_1, var_2 = ops[0].outputs[0], ops[1].outputs[0] assert var_1.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["dummy_pass"], } assert var_2.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_2"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["dummy_pass"], } with mb.scope(ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["dummy_pass_2"])): block._replace_var(var_1, var_2) assert var_1.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["dummy_pass"], } assert var_2.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_2"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["dummy_pass"], } # Case 6. 
old_var and new_var are created under the same graph pass @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_1"]), ): relu = mb.relu(x=x) with mb.scope( ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_1"]), ): sin = mb.sin(x=x) with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_3"]), ): return mb.relu(x=relu) block = prog.functions["main"] ops = list(block.operations) var_1, var_2 = ops[0].outputs[0], ops[1].outputs[0] with mb.scope(ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_1"])): block._replace_var(var_1, var_2) assert var_1.scopes == { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } assert var_2.scopes == { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } @staticmethod def test_scope_passdown_when_new_var_created_under_graph_pass(): """ If a new_var is created by a graph pass, and the _replace_var happens under the same graph pass, the scope information from the old_var is passed to new_var. """ def get_prog(): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_1"]), ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_1"]), ): # This op is created by pass_1 relu = mb.relu(x=x) with mb.scope( ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_2"]), ): # This op is newly created by a pass_2 sin = mb.sin(x=x) with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_3"]), ): return mb.relu(x=relu) return prog # Case 1: _replace_var happens outside the graph pass. Nothing happens prog = get_prog() block = prog.functions["main"] ops = list(block.operations) var_1, var_2 = ops[0].outputs[0], ops[1].outputs[0] assert var_1.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } assert var_2.scopes == { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_2"], } block._replace_var(var_1, var_2) assert var_1.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } assert var_2.scopes == { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_2"], } # Case 2: new_var created under a pass_2, and _replace_var happens under pass_2. 
Scope info is passed from the old_var # to the new_var @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_1"]), ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_1"]), ): # This op is created by pass_1 relu = mb.relu(x=x) with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_3"]), ): return mb.relu(x=relu) with prog.functions["main"] as block: op_1, op_2 = list(block.operations) with mb.scope( ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_2"]), ): # This op is newly created by a pass_2 sin = mb.sin(x=block.inputs["x"], before_op=op_2) block._replace_var(op_1.outputs[0], sin) block = prog.functions["main"] ops = list(block.operations) var_1, var_2 = ops[0].outputs[0], ops[1].outputs[0] assert var_1.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } assert var_2.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1", "pass_2"], } # Case 3: new_var created under a pass_2, but _replace_var happens under pass_3. # Nothing happens. prog = get_prog() block = prog.functions["main"] ops = list(block.operations) var_1, var_2 = ops[0].outputs[0], ops[1].outputs[0] with mb.scope(ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_3"])): block._replace_var(var_1, var_2) assert var_1.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } assert var_2.scopes == { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_2"], } # Case 4: new_var created under pass_2, and be passed down some scope info, # so even though _replace_var happens under pass_2 again, nothing happens. @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_1"]), ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_1"]), ): # This op is created by pass_1 relu = mb.relu(x=x) with mb.scope( ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_2"]), ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_2"]), ): # This op is newly created by a pass_2, and other scope info already passed down sin = mb.sin(x=x) with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_3"]), ): return mb.relu(x=relu) block = prog.functions["main"] ops = list(block.operations) var_1, var_2 = ops[0].outputs[0], ops[1].outputs[0] assert var_1.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } assert var_2.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_2"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_2"], } with mb.scope(ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_2"])): block._replace_var(var_1, var_2) assert var_1.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } assert var_2.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_2"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_2"], } # Case 5: new_var created under pass_2, but the graph pass already finished, # so even though _replace_var happens under pass_2 again, nothing happens. 
@mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_1"]), ): # This op is created by pass_1 relu = mb.relu(x=x) with mb.scope( ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_2"]), ): # This op is newly created by a pass_2, and other scope info already passed down sin = mb.sin(x=x) with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_3"]), ): return mb.relu(x=relu) block = prog.functions["main"] ops = list(block.operations) var_1, var_2 = ops[0].outputs[0], ops[1].outputs[0] assert var_1.scopes == { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } assert var_2.scopes == { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_2"], } with mb.scope(ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_2"])): block._replace_var(var_1, var_2) assert var_1.scopes == { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } assert var_2.scopes == { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_2"], } # Case 6: new_var created under nested graph passes scope. And graph pass happens under pass_3. @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_1"]), ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_1"]), ): # This op is created by pass_1 relu = mb.relu(x=x) with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_3"]), ): return mb.relu(x=relu) block = prog.functions["main"] ops = list(block.operations) with block: with mb.scope( ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_2"]), ): with mb.scope( ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_3"]), ): sin = mb.sin(x=block.inputs["x"], before_op=ops[1]) block._replace_var(ops[0].outputs[0], sin) ops = list(block.operations) var_1, var_2 = ops[0].outputs[0], ops[1].outputs[0] assert var_1.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } assert var_2.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1", "pass_2", "pass_3"], } # Case 7: new_var created under nested graph passes scope. And graph pass happens under pass_2. Nothing will happen in this case, since new_var is created under pass_3. 
@mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_1"]), ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_1"]), ): # This op is created by pass_1 relu = mb.relu(x=x) with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_3"]), ): return mb.relu(x=relu) block = prog.functions["main"] ops = list(block.operations) with block: with mb.scope( ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_2"]), ): with mb.scope( ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_3"]), ): sin = mb.sin(x=block.inputs["x"], before_op=ops[1]) block._replace_var(ops[0].outputs[0], sin) ops = list(block.operations) var_1, var_2 = ops[0].outputs[0], ops[1].outputs[0] assert var_1.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } assert var_2.scopes == { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_2", "pass_3"], } @staticmethod def test_scope_passdown_resursive(): """ Test the resursive back propagation when passing down scope info. """ # Case 1 @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_1"]), ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_1"]), ): # This op is created by pass_1 relu = mb.relu(x=x) with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_3"]), ): return mb.relu(x=relu) block = prog.functions["main"] ops = list(block.operations) with block: with mb.scope( ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_2"]), ): # The subgraph is constructed under pass_2 y = mb.leaky_relu(x=block.inputs["x"], alpha=0.8, before_op=ops[1]) y = mb.add(x=y, y=y, before_op=ops[1]) y = mb.leaky_relu(x=y, alpha=0.4, before_op=ops[1]) block._replace_var(ops[0].outputs[0], y) ops = list(block.operations) assert ops[0].outputs[0].scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } add_ops = block.find_ops(op_type="add") const_ops = block.find_ops(op_type="const") leaky_relu_ops = block.find_ops(op_type="leaky_relu") assert len(add_ops) == 1 assert len(const_ops) == 2 assert len(leaky_relu_ops) == 2 for op in leaky_relu_ops + add_ops + const_ops: assert op.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1", "pass_2"], } # Case 2: Test for VALID_OPS_TO_COPY_SCOPE_INFO in the scope back propagation # The same var cannot be visited twice @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_1"]), ): # This op is created by pass_1 relu = mb.relu(x=x) with mb.scope( ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_1"]), ): return mb.relu(x=relu) block = prog.functions["main"] ops = list(block.operations) with block: with mb.scope( ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_2"]), ): # The subgraph is constructed under pass_2 relu = ops[0].outputs[0] y = mb.leaky_relu(x=relu, alpha=0.8, before_op=ops[1]) y = mb.concat(values=[y, y, relu, y], axis=0, before_op=ops[1]) y1, y2, y3, y4 = mb.split(x=y, axis=0, num_splits=4, before_op=ops[1]) block._replace_var(relu, y1, anchor_op=y1.op) ops = list(block.operations) relu_ops = block.find_ops(op_type="relu") leaky_relu_op = 
block.find_ops(op_type="leaky_relu")[0] concat_op = block.find_ops(op_type="concat")[0] split_op = block.find_ops(op_type="split")[0] for op in [leaky_relu_op, concat_op, split_op]: assert op.scopes == { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1", "pass_2"], } for op in relu_ops: assert op.scopes == { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } # Case 3: Similar to case 2, but the relu op has torch scope. @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_1"]), ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_1"]), ): # This op is created by pass_1 relu = mb.relu(x=x) with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_1"]), ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_1"]), ): return mb.relu(x=relu) block = prog.functions["main"] ops = list(block.operations) with block: with mb.scope( ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_2"]), ): # The subgraph is constructed under pass_2 relu = ops[0].outputs[0] y = mb.leaky_relu(x=relu, alpha=0.8, before_op=ops[1]) y = mb.concat(values=[y, y, relu, y], axis=0, before_op=ops[1]) y1, y2, y3, y4 = mb.split(x=y, axis=0, num_splits=4, before_op=ops[1]) block._replace_var(relu, y1, anchor_op=y1.op) ops = list(block.operations) relu_ops = block.find_ops(op_type="relu") leaky_relu_op = block.find_ops(op_type="leaky_relu")[0] concat_op = block.find_ops(op_type="concat")[0] split_op = block.find_ops(op_type="split")[0] for op in [leaky_relu_op, concat_op, split_op]: assert op.scopes == { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1", "pass_2"], ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], } for op in relu_ops: assert op.scopes == { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["module_1"], } @staticmethod def test_scope_passdown_function_input_var(): """ If the old_var is function input var, and then the converter sets some default value for each scope source. 
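        Based on the cases exercised below, the defaults appear to be:
          * TORCHSCRIPT_MODULE_TYPE  -> "__COREML__::TORCHSCRIPT_PLACEHOLDER"
          * TORCHSCRIPT_MODULE_NAME  -> "__COREML__::TORCHSCRIPT_PLACEHOLDER_" + input var name
          * EXIR_STACK_TRACE / EXIR_DEBUG_HANDLE -> None
        and a default is only filled in for scope sources registered as essential via
        _add_essential_scope_source.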
""" # Case 1: with no essential scope set, no scope information is passed down def get_prog(): @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_TYPE, data=["module_1"]), ): return mb.sin(x=x) return prog prog = get_prog() block = prog.functions["main"] ops = list(block.operations) with block: with mb.scope( ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_1"]), ): # This op is created by pass_1 relu = mb.relu(x=block.inputs["x"], before_op=ops[0]) block._replace_var(block.inputs["x"], relu) assert relu.scopes == { ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } # Case 2: essential scope set to TORCHSCRIPT_MODULE_TYPE prog = get_prog() prog._add_essential_scope_source(ScopeSource.TORCHSCRIPT_MODULE_TYPE) block = prog.functions["main"] ops = list(block.operations) with block: with mb.scope( ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_1"]), ): # This op is created by pass_1 relu = mb.relu(x=block.inputs["x"], before_op=ops[0]) block._replace_var(block.inputs["x"], relu) assert relu.scopes == { ScopeSource.TORCHSCRIPT_MODULE_TYPE: ["__COREML__::TORCHSCRIPT_PLACEHOLDER"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } # Case 3: essential scope set to TORCHSCRIPT_MODULE_NAME @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.TORCHSCRIPT_MODULE_NAME, data=["module_1"]), ): return mb.sin(x=x) prog._add_essential_scope_source(ScopeSource.TORCHSCRIPT_MODULE_NAME) block = prog.functions["main"] ops = list(block.operations) with block: with mb.scope( ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_1"]), ): # This op is created by pass_1 relu = mb.relu(x=block.inputs["x"], before_op=ops[0]) block._replace_var(block.inputs["x"], relu) assert relu.scopes == { ScopeSource.TORCHSCRIPT_MODULE_NAME: ["__COREML__::TORCHSCRIPT_PLACEHOLDER_x"], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } # Case 4: essential scope set to EXIR_STACK_TRACE and EXIR_DEBUG_HANDLE @mb.program(input_specs=[mb.TensorSpec(shape=(2, 4))]) def prog(x): with mb.scope( ScopeInfo(source=ScopeSource.EXIR_STACK_TRACE, data=["torch.sin(x)"]), ScopeInfo(source=ScopeSource.EXIR_DEBUG_HANDLE, data=[1]), ): return mb.sin(x=x) prog._add_essential_scope_source( [ScopeSource.EXIR_STACK_TRACE, ScopeSource.EXIR_DEBUG_HANDLE] ) block = prog.functions["main"] ops = list(block.operations) with block: with mb.scope( ScopeInfo(source=ScopeSource.COREMLTOOLS_GRAPH_PASS, data=["pass_1"]), ): # This op is created by pass_1 relu = mb.relu(x=block.inputs["x"], before_op=ops[0]) block._replace_var(block.inputs["x"], relu) assert relu.scopes == { ScopeSource.EXIR_STACK_TRACE: [None], ScopeSource.EXIR_DEBUG_HANDLE: [None], ScopeSource.COREMLTOOLS_GRAPH_PASS: ["pass_1"], } ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/tests/test_types.py0000644000000000000000000001203114672066616024511 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np import pytest from coremltools import ImageType, StateType, TensorType from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.types import type_mapping from coremltools.optimize.coreml import _utils as optimize_utils class TestTypes: def test_sub_byte_type(self): assert types.is_int(types.int4) assert types.is_int(types.uint1) assert types.is_int(types.uint2) assert types.is_int(types.uint3) assert types.is_int(types.uint4) assert types.is_int(types.uint6) assert types.is_int(types.int8) assert types.is_sub_byte(types.int4) assert types.is_sub_byte(types.uint1) assert types.is_sub_byte(types.uint2) assert types.is_sub_byte(types.uint3) assert types.is_sub_byte(types.uint4) assert types.is_sub_byte(types.uint6) assert not types.is_sub_byte(types.int8) assert not types.is_sub_byte(types.uint8) int4_instance = types.int4() uint1_instance = types.uint1() uint2_instance = types.uint2() uint3_instance = types.uint3() uint4_instance = types.uint4() uint6_instance = types.uint6() int8_instance = types.int8() assert types.is_sub_byte(int4_instance) assert types.is_sub_byte(uint1_instance) assert types.is_sub_byte(uint2_instance) assert types.is_sub_byte(uint3_instance) assert types.is_sub_byte(uint4_instance) assert types.is_sub_byte(uint6_instance) assert not types.is_sub_byte(int8_instance) def test_state_type_with_tensor(self): state_wrapped_type = types.tensor(types.int32, (2, 3)) state_type = types.state(state_wrapped_type) assert types.is_state(state_type) assert state_type.wrapped_type() == state_wrapped_type def test_numpy_type_to_builtin_type(self): assert types.numpy_type_to_builtin_type(np.float32) == types.fp32 assert types.numpy_type_to_builtin_type(np.float16) == types.fp16 assert types.numpy_type_to_builtin_type(np.int32) == types.int32 assert types.numpy_type_to_builtin_type(np.int16) == types.int16 assert types.numpy_type_to_builtin_type(np.int8) == types.int8 assert types.numpy_type_to_builtin_type(types.np_int4_dtype) == types.int4 assert types.numpy_type_to_builtin_type(types.np_uint4_dtype) == types.uint4 assert types.numpy_type_to_builtin_type(types.np_uint3_dtype) == types.uint3 class TestTypeMapping: def test_promote_dtypes_basic(self): assert type_mapping.promote_dtypes([types.int32, types.int32]) == types.int32 assert type_mapping.promote_dtypes([types.int32, types.int64, types.int16]) == types.int64 assert type_mapping.promote_dtypes([types.fp16, types.fp32, types.fp64]) == types.fp64 assert type_mapping.promote_dtypes([types.fp16, types.int32, types.int64]) == types.fp16 @pytest.mark.parametrize( "input_size", [10, 10000], ) def test_promote_dtypes_different_input_sizes(self, input_size): assert ( type_mapping.promote_dtypes([types.int32, types.int64, types.int16] * input_size) == types.int64 ) def test_np_val_to_py_type(self): assert types.type_mapping.np_val_to_py_type(np.array([True, False])) == (True, False) assert types.type_mapping.np_val_to_py_type(np.array(32, dtype=np.int32)) == 32 # Sub-byte conversion. int4_array = np.array([1, 2]).reshape([1, 2, 1]).astype(types.np_int4_dtype) py_bytes = types.type_mapping.np_val_to_py_type(int4_array) assert len(py_bytes) == 1 # Two 4-bit elements should only take 1 byte. 
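        # Round-trip sketch of what the helper call below verifies: the two int4 values
        # (1 and 2) were packed into the two nibbles of that single uint8 byte, and
        # restore_elements_from_packed_bits unpacks them back into individual signed
        # 4-bit elements for comparison against the original array.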
restored_array = optimize_utils.restore_elements_from_packed_bits( np.frombuffer(py_bytes, dtype=np.uint8), nbits=4, element_num=2, are_packed_values_signed=True, ) np.testing.assert_array_equal(restored_array.reshape([1, 2, 1]), int4_array) class TestInputTypes: def test_state_type(self): state_type = StateType(name="x", wrapped_type=TensorType(shape=(2, 3), dtype=np.float32)) assert state_type.name == "x" assert state_type.shape.shape == (2, 3) def test_state_type_invalid_wrapped_type(self): wrapped_type = ImageType(shape=(1, 3, 3, 3)) with pytest.raises(ValueError, match="StateType only supports"): StateType(wrapped_type=wrapped_type) with pytest.raises(ValueError, match="name cannot be set in the state wrapped_type"): StateType(wrapped_type=TensorType(name="x", shape=(2, 3))) with pytest.raises( ValueError, match="default_value cannot be set in the state wrapped_type" ): StateType(wrapped_type=TensorType(shape=(3,), default_value=np.array([0.0, 0.0, 0.0]))) ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1726511965.253547 coremltools-8.0/coremltools/converters/mil/mil/types/0000755000000000000000000000000014672075535021741 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/types/__init__.py0000644000000000000000000000350714672066616024057 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from .annotate import annotate, apply_delayed_types, class_annotate, delay_type from .get_type_info import get_type_info from .global_methods import global_remap from .type_bool import bool, is_bool from .type_complex import complex, complex64, complex128, is_complex from .type_dict import dict, empty_dict from .type_double import double, float, fp16, fp32, fp64, is_float from .type_globals_pseudo_type import globals_pseudo_type from .type_int import ( _SUB_BYTE_TYPES, SUB_BYTE_DTYPE_METADATA_KEY, int4, int8, int16, int32, int64, is_int, is_sub_byte, np_int4_dtype, np_uint1_dtype, np_uint2_dtype, np_uint3_dtype, np_uint4_dtype, np_uint6_dtype, uint, uint1, uint2, uint3, uint4, uint6, uint8, uint16, uint32, uint64, ) from .type_list import empty_list, is_list, list from .type_mapping import ( BUILTIN_TO_PROTO_TYPES, PROTO_TO_BUILTIN_TYPE, builtin_to_string, get_nbits_int_builtin_type, is_builtin, is_dict, is_primitive, is_scalar, is_str, is_subtype, is_tensor, is_tuple, np_dtype_to_py_type, nptype_from_builtin, numpy_type_to_builtin_type, numpy_val_to_builtin_val, promote_dtypes, promote_types, string_to_builtin, type_to_builtin_type, ) from .type_state import is_state, state from .type_str import str from .type_tensor import ( is_compatible_type, is_tensor_and_is_compatible, tensor, tensor_has_complete_shape, ) from .type_tuple import tuple from .type_unknown import unknown from .type_void import void apply_delayed_types() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/types/annotate.py0000644000000000000000000000652314672066616024132 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause class delay_type_cls: def __getattr__(self, t): return t # this delay type thingee is useful for class annotations. # for instance: the following code is invalid because when the annotate # function is invoked, the "double" class does not yet exist # # class double: # @annotate(double, other=double) # def __add__(self, other): # # So it is necessary to add one level of laziness and delay the type # # class double: # @annotate(delay_type.double, other=delay_type.double) # def __add__(self, other): # # This basically replaces the annotation with the string "double" which we will # then replace with the actual type later # delay_type = delay_type_cls() annotated_function_list = [] annotated_class_list = {} class _invalid_placeholder_type: pass def annotate(return_type=_invalid_placeholder_type, **kwargs): """ A decorator that informs the compyler about the return type of a function and a collection of hint for other variable names. These can include - captured variables - function arguments - other variables within the function Ex: @annotate(compyler.double, a=compyler.double, b=compyler.double) def add(a, b): In certain cases when the class members are annotated this does not work. For instance this fails because the annotate decorator is called before the class double is fully defined. class double: @annotate(double, other=double) def __add__(self, other): So it is necessary to add one level of laziness and delay the type @class_annotate() class double: @annotate(delay_type.double, other=delay_type.double) def __add__(self, other): After which apply_delayed_types() must be called to fill in the delayed type. """ global annotated_function_list def decorator(func): global annotated_function_list func.type_annotations = kwargs if return_type is not _invalid_placeholder_type: func.return_type = return_type annotated_function_list += [func] return func return decorator def class_annotate(): """ Registers a class to be used by delay_type. See annotate() """ global annotated_class_list def decorator(cls): global annotated_class_list annotated_class_list[cls.__name__] = cls return cls return decorator def apply_delayed_types( type_map=annotated_class_list, fnlist=annotated_function_list ): # pylint: disable=dangerous-default-value """ Apply all delayed types. See annotate() """ # pylint: disable=no-member # type name is a dict from str to type for func in fnlist: if ( hasattr(func, "return_type") and isinstance(func.return_type, str) and func.return_type in type_map ): func.return_type = type_map[func.return_type] if hasattr(func, "type_annotations"): for key in func.type_annotations: if func.type_annotations[key] in type_map: func.type_annotations[key] = type_map[func.type_annotations[key]] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/types/get_type_info.py0000644000000000000000000000411314672066616025145 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from .type_spec import FunctionType, Type from .type_void import void def get_python_method_type(py_function): # given a python class method, parse the annotations to figure out the type function_inputs = [] function_output = get_type_info(void) annotations = {} if hasattr(py_function, "type_annotations"): annotations = { k: get_type_info(v) for k, v in py_function.type_annotations.items() } if hasattr(py_function, "return_type"): function_output = get_type_info(py_function.return_type) try: if hasattr(py_function, "__func__"): argcount = py_function.__func__.__code__.co_argcount argnames = py_function.__func__.__code__.co_varnames[:argcount] else: argcount = py_function.__code__.co_argcount argnames = py_function.__code__.co_varnames[:argcount] except: raise TypeError( "Unable to derive type information from method %s. " "You might have a misspecified type. Ex: use compyler.int and not int" % py_function ) for arg in argnames: if arg in annotations: function_inputs.append(annotations[arg]) elif arg != "self": raise TypeError( "Function " + str(py_function) + " insufficient annotations. " + arg + " needs a type" ) typeinfo = FunctionType(function_inputs, function_output, py_function) return typeinfo def get_type_info(t): if hasattr(t, "__type_info__"): ret = t.__type_info__() assert ret.python_class is not None return ret elif isinstance(t, type): return Type(t.__name__, python_class=t) elif hasattr(t, "__call__"): return get_python_method_type(t) raise TypeError("Unsupported type %s" % t) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/types/global_methods.py0000644000000000000000000000267414672066616025307 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause """ This defines a list of all the "global methods" like len. Or type cast operators like int, list, double, etc. The difficulty with some of these methods is that they don't have fixed types. For instance len(x) allows x to be list or a dictionary. However we don't support function overloading based on types, and we don't intend to. (It is complicated, requires the parser to be far more intelligent and do good type inference; will either require genre to support overloading or do name mangling. The final quirk is that we probably should not call these functions "len" or "int" because that will conflict with the existing python methods. So what we will simply do is to rewrite them to things like __len__, __str__ and __int__ and __double__ """ global_remap = { "len": "__len__", "str": "__str__", "int": "__int__", "double": "__double__", "float": "__double__", "bool": "__bool__", "log": "__log__", "exp": "__exp__", "max": "__max__", "min": "__min__", } global_invremap = { "__len__": "len", "__str__": "str", "__int__": "int", "__double__": "float", "__bool__": "bool", "__log__": "math.log", "__exp__": "math.exp", "__max__": "max", "__min__": "min", } ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/types/symbolic.py0000644000000000000000000000427514672066616024144 0ustar00rootroot# Copyright (c) 2020, Apple Inc. 
All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np import sympy as sm k_used_symbols = {} k_num_internal_syms = 0 _OBJECT_DTYPE = np.empty(0, dtype=object).dtype def is_compatible_symbolic_vector(val_a, val_b): """ compare two vector and check if they are compatible. ([is0, 4], [9, 4]), ([is0, 1],[is1, is2]) are twp compatible examples. """ val_a = tuple(val_a) val_b = tuple(val_b) if len(val_a) != len(val_b): return False for a, b in zip(val_a, val_b): if not is_symbolic(a) and not is_symbolic(b): if a != b: return False return True def is_symbolic(val): return issubclass(type(val), sm.Basic) # pylint: disable=consider-using-ternary def is_variadic(val): return ( issubclass(type(val), sm.Symbol) and val.name[0] == "*" ) # pylint: disable=consider-using-ternary def num_symbolic(val): """ Return the number of symbols in val """ if is_symbolic(val): return 1 elif isinstance(val, np.ndarray) and val.dtype.type != _OBJECT_DTYPE: return 0 elif hasattr(val, "__iter__"): return sum(any_symbolic(i) for i in val) return 0 def any_symbolic(val): if is_symbolic(val): return True if isinstance(val, np.ndarray) and val.ndim == 0: return is_symbolic(val[()]) elif isinstance(val, np.ndarray) and val.dtype.type != _OBJECT_DTYPE: return False elif isinstance(val, str): # string is iterable return False elif hasattr(val, "__iter__"): return any(any_symbolic(i) for i in val) return False def any_variadic(val): if is_variadic(val): return True elif isinstance(val, np.ndarray) and val.dtype.type != _OBJECT_DTYPE: return False elif isinstance(val, str): # string is iterable return False elif hasattr(val, "__iter__"): return any(any_variadic(i) for i in val) return False def isscalar(val): return np.isscalar(val) or issubclass(type(val), sm.Basic) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/types/type_bool.py0000644000000000000000000000231614672066616024311 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from .annotate import annotate, class_annotate, delay_type from .type_spec import Type @class_annotate() class bool: def __init__(self, v=False): self.val = v @classmethod def __type_info__(cls): return Type("bool", python_class=cls) @annotate(delay_type.bool, other=delay_type.bool) def __eq__(self, other): return bool(self.val == other.val) @annotate(delay_type.bool, other=delay_type.bool) def __ne__(self, other): return bool(self.val != other.val) @annotate(delay_type.bool) def __not__(self, other): return bool(not other.val) @annotate(delay_type.bool) def __bool__(self): return self.val @annotate(delay_type.int) def __int__(self): return int(self) @annotate(delay_type.double) def __double__(self): return float(self.val) @annotate(delay_type.str) def __str__(self): return str(self.val) def is_bool(t): return t is bool or isinstance(t, bool) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/types/type_complex.py0000644000000000000000000001311114672066616025020 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from coremltools import _logger as logger from .annotate import annotate, class_annotate, delay_type from .type_bool import bool from .type_spec import Type def make_complex(width): delay_type_complex = getattr(delay_type, "complex" + str(width)) @class_annotate() class complex: _width = width def __init__(self, v=0 + 0j): self._val: np.complexfloating = ( np.complex64(v) if width == 64 else np.complex128(v) ) @property def val(self): return self._val @val.setter def val(self, v): from .type_mapping import ( builtin_to_string, nptype_from_builtin, numpy_type_to_builtin_type, ) if not isinstance(v, np.generic): if isinstance(v, np.ndarray) and v.ndim == 0: # Rank zero tensor case. Use as a scalar. self._val = v.item() else: raise ValueError( f"Types should have zero-rank ndarray input, got {v} instead." ) elif isinstance(v, np.complexfloating): v_type = numpy_type_to_builtin_type(v.dtype) if v_type.get_bitwidth() <= self.get_bitwidth(): self._val = v else: self._val = v.astype(nptype_from_builtin(self.__class__)) logger.warning( "Saving value type of {} into a builtin type of {}, might lose precision!".format( v.dtype, builtin_to_string(self.__class__) ) ) else: self._val = v.astype(nptype_from_builtin(self.__class__)) logger.warning( "Saving value type of {} into a builtin type of {}, might be incompatible or " "loses precision!".format( v.dtype, builtin_to_string(self.__class__) ) ) @classmethod def __type_info__(cls): return Type("complex" + str(cls._width), python_class=cls) @classmethod def get_bitwidth(cls): return cls._width @annotate(delay_type_complex, other=delay_type_complex) def __add__(self, other): assert isinstance(other, complex) return complex(self.val + other.val) @annotate(delay_type_complex, other=delay_type_complex) def __sub__(self, other): assert isinstance(other, complex) return complex(self.val - other.val) @annotate(delay_type_complex, other=delay_type_complex) def __mul__(self, other): assert isinstance(other, complex) return complex(self.val * other.val) @annotate(delay_type_complex, other=delay_type_complex) def __div__(self, other): assert isinstance(other, complex) return complex(self.val / other.val) @annotate(delay_type_complex, other=delay_type_complex) def __mod__(self, other): raise ValueError("Can't mod complex numbers.") @annotate(delay_type.bool, other=delay_type_complex) def __lt__(self, other): return bool(self.val < other.val) @annotate(delay_type.bool, other=delay_type_complex) def __gt__(self, other): return bool(self.val > other.val) @annotate(delay_type.bool, other=delay_type_complex) def __le__(self, other): return bool(self.val <= other.val) @annotate(delay_type.bool, other=delay_type_complex) def __ge__(self, other): return bool(self.val >= other.val) @annotate(delay_type.bool, other=delay_type_complex) def __eq__(self, other): return bool(self.val == other.val) @annotate(delay_type.bool, other=delay_type_complex) def __ne__(self, other): return bool(self.val != other.val) @annotate(delay_type.bool) def __bool__(self): return self.val @annotate(delay_type.int) def __int__(self): logger.warning( "ComplexWarning: Casting complex to real discards the imaginary part." 
) return int(np.real(self.val)) @annotate(delay_type_complex) def __complex__(self): return complex(self.val) @annotate(delay_type.str) def __str__(self): return str(self.val) @annotate(delay_type_complex) def __log__(self): # The `math.log` doesn't support complex numbers yet. return np.log(self.val) @annotate(delay_type_complex) def __exp__(self): return np.exp(self.val) @annotate(delay_type_complex) def __neg__(self): return complex(-self.val) complex.__name__ = "complex%d" % complex.get_bitwidth() return complex # We keep consistent with PyTorch and Tensorflow: # - complex64 consists of a fp32 real and a fp32 imag. # - complex128 consists of a fp64 real and a fp64 imag. complex64 = make_complex(64) complex128 = make_complex(128) complex = complex64 def is_complex(t): complex_types_set = (complex64, complex128) return (t in complex_types_set) or isinstance(t, complex_types_set) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/types/type_dict.py0000644000000000000000000000334514672066616024304 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from . import type_bool, type_int from .annotate import annotate from .get_type_info import get_type_info from .type_spec import Type from .type_void import void def memoize(f): memo = {} def helper(x, y): if (x, y) not in memo: memo[(x, y)] = f(x, y) return memo[(x, y)] return helper class empty_dict: @classmethod def __type_info__(cls): return Type("empty_dict", python_class=cls) @memoize def dict(keytype, valuetype): class dict: T = [keytype, valuetype] def __init__(self): self.val = {} @classmethod def __type_info__(cls): return Type("dict", [get_type_info(keytype), get_type_info(valuetype)], cls) @annotate(T[1], key=T[0]) def __getitem__(self, key): assert isinstance(key, self.T[0]) return self.val[key] @annotate(void, key=T[0], newval=T[1]) def __setitem__(self, key, newval): assert isinstance(key, self.T[0]) assert isinstance(newval, self.T[1]) self.val[key] = newval @annotate(type_int.int64) def __len__(self): return type_int.int64(len(self.val)) @annotate(type_bool.bool, key=T[0]) def __contains__(self, key): return key in self.val[key] dict.__template_name__ = "dict[" + keytype.__name__ + "," + valuetype.__name__ + "]" return dict def is_dict(t): if t is None: return False return get_type_info(t).name == "dict" ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/types/type_double.py0000644000000000000000000001177714672066616024643 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import math import numpy as np from coremltools import _logger as logger from .annotate import annotate, class_annotate, delay_type from .type_bool import bool from .type_spec import Type def make_float(width): delay_type_float = getattr(delay_type, "fp" + str(width)) @class_annotate() class double: _width = width def __init__(self, v=0.0): self._val = v @property def val(self): return self._val @val.setter def val(self, v): from .type_mapping import (builtin_to_string, nptype_from_builtin, numpy_type_to_builtin_type) if not isinstance(v, np.generic): if isinstance(v, np.ndarray) and v.ndim == 0: # Rank zero tensor case. Use as a scalar. self._val = v.item() else: raise ValueError( f"Types should have zero-rank ndarray input, got {v} instead." ) elif isinstance(v, np.floating): v_type = numpy_type_to_builtin_type(v.dtype) if v_type.get_bitwidth() <= self.get_bitwidth(): self._val = v else: self._val = v.astype(nptype_from_builtin(self.__class__)) logger.warning( "Saving value type of {} into a builtin type of {}, might lose precision!".format( v.dtype, builtin_to_string(self.__class__) ) ) else: self._val = v.astype(nptype_from_builtin(self.__class__)) logger.warning( "Saving value type of {} into a builtin type of {}, might be incompatible or loses precision!".format( v.dtype, builtin_to_string(self.__class__) ) ) @classmethod def __type_info__(cls): return Type("fp" + str(cls._width), python_class=cls) @classmethod def get_bitwidth(cls): return cls._width @annotate(delay_type_float, other=delay_type_float) def __add__(self, other): assert isinstance(other, double) return double(self.val + other.val) @annotate(delay_type_float, other=delay_type_float) def __sub__(self, other): assert isinstance(other, double) return double(self.val - other.val) @annotate(delay_type_float, other=delay_type_float) def __mul__(self, other): assert isinstance(other, double) return double(self.val * other.val) @annotate(delay_type_float, other=delay_type_float) def __div__(self, other): assert isinstance(other, double) return double(self.val / other.val) @annotate(delay_type_float, other=delay_type_float) def __mod__(self, other): assert isinstance(other, double) return double(self.val % other.val) @annotate(delay_type.bool, other=delay_type_float) def __lt__(self, other): return bool(self.val < other.val) @annotate(delay_type.bool, other=delay_type_float) def __gt__(self, other): return bool(self.val > other.val) @annotate(delay_type.bool, other=delay_type_float) def __le__(self, other): return bool(self.val <= other.val) @annotate(delay_type.bool, other=delay_type_float) def __ge__(self, other): return bool(self.val >= other.val) @annotate(delay_type.bool, other=delay_type_float) def __eq__(self, other): return bool(self.val == other.val) @annotate(delay_type.bool, other=delay_type_float) def __ne__(self, other): return bool(self.val != other.val) @annotate(delay_type.bool) def __bool__(self): return self.val != 0 @annotate(delay_type.int) def __int__(self): return int(self) @annotate(delay_type_float) def __double__(self): return float(self.val) @annotate(delay_type.str) def __str__(self): return str(self.val) @annotate(delay_type_float) def __log__(self): return math.log(self.val) @annotate(delay_type_float) def __exp__(self): return math.exp(self.val) @annotate(delay_type_float) def __neg__(self): return double(-self.val) double.__name__ = "fp%d" 
% double.get_bitwidth() return double fp16 = make_float(16) fp32 = make_float(32) fp64 = make_float(64) float = fp32 double = fp64 def is_float(t): return any(t is i or isinstance(t, i) for i in [fp16, fp32, fp64]) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/types/type_globals_pseudo_type.py0000644000000000000000000000056214672066616027422 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from .type_spec import Type class globals_pseudo_type: @classmethod def __type_info__(cls): return Type("globals", python_class=cls) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/types/type_int.py0000644000000000000000000001472214672066616024154 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import math import numpy as np import sympy as sm from coremltools import _logger as logger from .annotate import annotate, class_annotate, delay_type from .type_bool import bool from .type_spec import Type def make_int(width, unsigned): delay_type_int = getattr(delay_type, unsigned + "int" + str(width)) @class_annotate() class int: _width = width _unsigned = unsigned @annotate(v=delay_type_int) def __init__(self, v=0): self._val = v @property def val(self): return self._val @val.setter def val(self, v): from .type_mapping import (builtin_to_string, nptype_from_builtin, numpy_type_to_builtin_type) if not isinstance(v, (np.generic, np.ndarray, sm.Basic)): try: v = np.array(v) except Exception: raise ValueError( f"types should have value of numpy type or Symbols, got {type(v)} instead" ) if isinstance(v, sm.Basic): self._val = v elif isinstance(v, np.integer): v_type = numpy_type_to_builtin_type(v.dtype) if v_type.get_bitwidth() <= self.get_bitwidth() and ( v >= 0 or v < 0 and not self.is_unsigned() ): self._val = v else: self._val = v.astype(nptype_from_builtin(self.__class__)) logger.warning( f"Saving value type of {v.dtype} into a builtin type of " f"{builtin_to_string(self.__class__)}, might overflow or loses precision!" ) else: self._val = v.astype(nptype_from_builtin(self.__class__)) logger.warning( f"Saving value type of {v.dtype} into a builtin type of " f"{builtin_to_string(self.__class__)}, might be incompatible or loses precision!" 
) @classmethod def __type_info__(cls): return Type(cls._unsigned + "int" + str(cls._width), python_class=cls) @classmethod def get_bitwidth(cls): return cls._width @classmethod def is_unsigned(cls): return cls._unsigned == "u" @annotate(delay_type_int, other=delay_type_int) def __add__(self, other): assert isinstance(other, int) return int(self.val + other.val) @annotate(delay_type_int, other=delay_type_int) def __sub__(self, other): assert isinstance(other, int) return int(self.val - other.val) @annotate(delay_type_int, other=delay_type_int) def __mul__(self, other): assert isinstance(other, int) return int(self.val * other.val) @annotate(delay_type_int, other=delay_type_int) def __div__(self, other): assert isinstance(other, int) return int(self.val // other.val) @annotate(delay_type_int, other=delay_type_int) def __mod__(self, other): assert isinstance(other, int) return int(self.val % other.val) @annotate(delay_type.bool, other=delay_type_int) def __lt__(self, other): return bool(self.val < other.val) @annotate(delay_type.bool, other=delay_type_int) def __gt__(self, other): return bool(self.val > other.val) @annotate(delay_type.bool, other=delay_type_int) def __le__(self, other): return bool(self.val <= other.val) @annotate(delay_type.bool, other=delay_type_int) def __ge__(self, other): return bool(self.val >= other.val) @annotate(delay_type.bool, other=delay_type_int) def __eq__(self, other): return bool(self.val == other.val) @annotate(delay_type.bool, other=delay_type_int) def __ne__(self, other): return bool(self.val != other.val) @annotate(delay_type.bool) def __bool__(self): return self.val != 0 @annotate(delay_type_int) def __int__(self): return int(self) @annotate(delay_type.double) def __double__(self): return float(self.val) @annotate(delay_type.str) def __str__(self): return str(self.val) @annotate(delay_type.double) def __log__(self): return math.log(self.val) @annotate(delay_type.double) def __exp__(self): return math.exp(self.val) @annotate(delay_type_int) def __neg__(self): return int(-self.val) return int int4 = make_int(4, "") int8 = make_int(8, "") int16 = make_int(16, "") int32 = make_int(32, "") int64 = make_int(64, "") uint1 = make_int(1, "u") uint2 = make_int(2, "u") uint3 = make_int(3, "u") uint4 = make_int(4, "u") uint6 = make_int(6, "u") uint8 = make_int(8, "u") uint16 = make_int(16, "u") uint32 = make_int(32, "u") uint64 = make_int(64, "u") uint = uint64 _INT_TYPES = ( int4, int8, int16, int32, int64, uint1, uint2, uint3, uint4, uint6, uint8, uint16, uint32, uint64, ) # The key name for storing type info in `np.dtype.metadata`. SUB_BYTE_DTYPE_METADATA_KEY = "true_dtype" # Uses np.int8/uint8 as np doesn't natively support sub-byte type (such as int4/uint4) yet. 
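# Each dtype below is therefore a plain 8-bit numpy dtype whose ``metadata`` dict records the
# true sub-byte builtin under SUB_BYTE_DTYPE_METADATA_KEY, e.g.
#     np_int4_dtype.metadata[SUB_BYTE_DTYPE_METADATA_KEY] is int4
# which is how numpy_type_to_builtin_type (in type_mapping.py) recovers int4 instead of int8.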
np_int4_dtype = np.dtype(np.int8, metadata={SUB_BYTE_DTYPE_METADATA_KEY: int4}) np_uint1_dtype = np.dtype(np.uint8, metadata={SUB_BYTE_DTYPE_METADATA_KEY: uint1}) np_uint2_dtype = np.dtype(np.uint8, metadata={SUB_BYTE_DTYPE_METADATA_KEY: uint2}) np_uint3_dtype = np.dtype(np.uint8, metadata={SUB_BYTE_DTYPE_METADATA_KEY: uint3}) np_uint4_dtype = np.dtype(np.uint8, metadata={SUB_BYTE_DTYPE_METADATA_KEY: uint4}) np_uint6_dtype = np.dtype(np.uint8, metadata={SUB_BYTE_DTYPE_METADATA_KEY: uint6}) _SUB_BYTE_TYPES = (int4, uint1, uint2, uint3, uint4, uint6) def is_int(t): return any(t is i or isinstance(t, i) for i in _INT_TYPES) def is_sub_byte(t): """Determines if a type (or instance) is sub-byte (less than 8-bit data type).""" return t in _SUB_BYTE_TYPES or isinstance(t, _SUB_BYTE_TYPES) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/types/type_list.py0000644000000000000000000000366614672066616024342 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from . import type_int from .annotate import annotate from .get_type_info import get_type_info from .type_spec import Type from .type_void import void def memoize(f): memo = {} def helper(x, init_length=None, dynamic_length=True): if x not in memo: memo[(x, init_length, dynamic_length)] = f(x, init_length, dynamic_length) return memo[(x, init_length, dynamic_length)] return helper class empty_list: @classmethod def __type_info__(cls): return Type("empty_list", python_class=cls) @memoize def list(arg, init_length=None, dynamic_length=True): class list: T = [arg, init_length, dynamic_length] def __init__(self): self.val = [] @classmethod def __type_info__(cls): return Type("list", [get_type_info(arg)], python_class=cls) @annotate(void, other=T[0]) def append(self, other): assert isinstance(other, self.T[0]) self.val.append(other) @annotate(T[0], index=type_int.int64) def __getitem__(self, index): assert isinstance(index, type_int.int64) return self.val[index.val] @annotate(void, index=type_int.int64, newval=T[0]) def __setitem__(self, index, newval): assert isinstance(index, type_int.int64) assert isinstance(newval, self.T[0]) self.val[index.val] = newval @annotate(type_int.int64) def __len__(self): return type_int.int64(len(self.val)) if self.T[1] is None else self.T[1] list.__template_name__ = "list[" + arg.__name__ + "]" return list def is_list(t): if t is None: return False return get_type_info(t).name == "list" ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/types/type_mapping.py0000644000000000000000000004362614672066616025022 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from collections import namedtuple from typing import Optional, Union import numpy as _np import numpy as np import sympy as sm import coremltools.converters.mil.backend.mil.helper as mil_helper import coremltools.proto.MIL_pb2 as _mil_pm from .get_type_info import get_type_info from .type_bool import bool as types_bool from .type_bool import is_bool from .type_complex import complex64 as types_complex64 from .type_complex import complex128 as types_complex128 from .type_complex import is_complex from .type_dict import is_dict from .type_double import fp16 as types_fp16 from .type_double import fp32 as types_fp32 from .type_double import fp64 as types_fp64 from .type_double import is_float from .type_int import SUB_BYTE_DTYPE_METADATA_KEY from .type_int import int4 as types_int4 from .type_int import int8 as types_int8 from .type_int import int16 as types_int16 from .type_int import int32 as types_int32 from .type_int import int64 as types_int64 from .type_int import ( is_int, is_sub_byte, np_int4_dtype, np_uint1_dtype, np_uint2_dtype, np_uint3_dtype, np_uint4_dtype, np_uint6_dtype, ) from .type_int import uint1 as types_uint1 from .type_int import uint2 as types_uint2 from .type_int import uint3 as types_uint3 from .type_int import uint4 as types_uint4 from .type_int import uint6 as types_uint6 from .type_int import uint8 as types_uint8 from .type_int import uint16 as types_uint16 from .type_int import uint32 as types_uint32 from .type_int import uint64 as types_uint64 from .type_list import is_list from .type_str import str as types_str from .type_unknown import unknown _TYPES_TO_NPTYPES = { types_bool: np.bool_, types_int4: np_int4_dtype, types_int8: np.int8, types_int16: np.int16, types_int32: np.int32, types_int64: np.int64, types_uint1: np_uint1_dtype, types_uint2: np_uint2_dtype, types_uint3: np_uint3_dtype, types_uint4: np_uint4_dtype, types_uint6: np_uint6_dtype, types_uint8: np.uint8, types_uint16: np.uint16, types_uint32: np.uint32, types_uint64: np.uint64, types_fp16: np.float16, types_fp32: np.float32, types_fp64: np.float64, types_complex64: np.complex64, types_complex128: np.complex128, types_str: np.str_, } _NPTYPES_TO_STRINGS = { np.bool_: "bool", np.int8: "int8", np.int16: "int16", np.int32: "int32", np.int64: "int64", np.uint8: "uint8", np.uint16: "uint16", np.uint32: "uint32", np.uint64: "uint64", np.float16: "fp16", np.float32: "fp32", np.float64: "fp64", np.complex64: "complex64", np.complex128: "complex128", np.str_: "string", } _TYPES_TO_STRINGS = { types_bool: "bool", types_int4: "int4", types_int8: "int8", types_int16: "int16", types_int32: "int32", types_int64: "int64", types_uint1: "uint1", types_uint2: "uint2", types_uint3: "uint3", types_uint4: "uint4", types_uint6: "uint6", types_uint8: "uint8", types_uint16: "uint16", types_uint32: "uint32", types_uint64: "uint64", types_fp16: "fp16", types_fp32: "fp32", types_fp64: "fp64", types_complex64: "complex64", types_complex128: "complex128", types_str: "string", } _TYPES_TO_RESOLUTION = { types_bool: 1, types_int4: 1, types_int8: 1, types_uint1: 1, types_uint2: 1, types_uint3: 1, types_uint4: 1, types_uint6: 1, types_uint8: 1, types_int16: 1, types_uint16: 1, types_int32: 1, types_int64: 1, types_fp16: np.finfo(np.float16).resolution, types_fp32: np.finfo(np.float32).resolution, types_fp64: np.finfo(np.float64).resolution, } RangeTuple = 
namedtuple("RangeTuple", "low high") _TYPES_TO_RANGE = { types_bool: RangeTuple(0, 1), types_int4: RangeTuple(np.iinfo(np.int8).min >> 4, np.iinfo(np.int8).max >> 4), types_int8: RangeTuple(np.iinfo(np.int8).min, np.iinfo(np.int8).max), types_uint1: RangeTuple(np.iinfo(np.uint8).min >> 7, np.iinfo(np.uint8).max >> 7), types_uint2: RangeTuple(np.iinfo(np.uint8).min >> 6, np.iinfo(np.uint8).max >> 6), types_uint3: RangeTuple(np.iinfo(np.uint8).min >> 5, np.iinfo(np.uint8).max >> 5), types_uint4: RangeTuple(np.iinfo(np.uint8).min >> 4, np.iinfo(np.uint8).max >> 4), types_uint6: RangeTuple(np.iinfo(np.uint8).min >> 2, np.iinfo(np.uint8).max >> 2), types_uint8: RangeTuple(np.iinfo(np.uint8).min, np.iinfo(np.uint8).max), types_int16: RangeTuple(np.iinfo(np.int16).min, np.iinfo(np.int16).max), types_uint16: RangeTuple(np.iinfo(np.uint16).min, np.iinfo(np.uint16).max), types_int32: RangeTuple(np.iinfo(np.int32).min, np.iinfo(np.int32).max), types_int64: RangeTuple(np.iinfo(np.int64).min, np.iinfo(np.int64).max), types_fp16: RangeTuple(np.finfo(np.float16).min, np.finfo(np.float16).max), types_fp32: RangeTuple(np.finfo(np.float32).min, np.finfo(np.float32).max), types_fp64: RangeTuple(np.finfo(np.float64).min, np.finfo(np.float64).max), } BUILTIN_TO_PROTO_TYPES = { # bool: types_bool: _mil_pm.BOOL, # fp types_fp16: _mil_pm.FLOAT16, types_fp32: _mil_pm.FLOAT32, types_fp64: _mil_pm.FLOAT64, # int types_uint1: _mil_pm.UINT1, types_uint2: _mil_pm.UINT2, types_uint3: _mil_pm.UINT3, types_uint4: _mil_pm.UINT4, types_uint6: _mil_pm.UINT6, types_uint8: _mil_pm.UINT8, types_int4: _mil_pm.INT4, types_int8: _mil_pm.INT8, types_uint16: _mil_pm.UINT16, types_int16: _mil_pm.INT16, types_uint32: _mil_pm.UINT32, types_int32: _mil_pm.INT32, types_uint64: _mil_pm.UINT64, types_int64: _mil_pm.INT64, # str types_str: _mil_pm.STRING, } def np_dtype_to_py_type(np_dtype): # Can't use dict, as hash(np.int32) != hash(val.dtype) if np_dtype in [np.int32, np.int64]: return int if np_dtype in [bool, np.bool_]: return bool if np_dtype in [np.float32, np.float64]: return float if np_dtype in [np.complex64, np.complex128]: return complex raise NotImplementedError('{} is not supported'.format(np_dtype)) PROTO_TO_BUILTIN_TYPE = {v: k for k, v in BUILTIN_TO_PROTO_TYPES.items()} _STRINGS_TO_TYPES = {v: k for k, v in _TYPES_TO_STRINGS.items()} _STRINGS_TO_NPTYPES = {v: k for k, v in _NPTYPES_TO_STRINGS.items()} _STRINGS_TO_NPTYPES.update( { "int4": np_int4_dtype, "uint1": np_uint1_dtype, "uint2": np_uint2_dtype, "uint3": np_uint3_dtype, "uint4": np_uint4_dtype, "uint6": np_uint6_dtype, } ) def string_to_builtin(s): """ Given a str, return its corresponding builtin type. """ return _STRINGS_TO_TYPES[s] def builtin_to_string(builtin_type): """ Given a builtin type, return its corresponding string representation. """ if is_dict(builtin_type): return "dict" return _TYPES_TO_STRINGS[builtin_type] def string_to_nptype(s: str): """ Given a str, return its corresponding numpy type. """ return _STRINGS_TO_NPTYPES[s] def nptype_from_builtin(btype): """ Given a builtin type, return its corresponding Numpy dtype. """ return _TYPES_TO_NPTYPES[btype] def builtin_to_resolution(builtin_type: type): """ Given a builtin type, return its corresponding resolution. """ return _TYPES_TO_RESOLUTION[builtin_type] def builtin_to_range(builtin_type: type) -> RangeTuple: """ Given a builtin type, return its corresponding range. 
""" return _TYPES_TO_RANGE[builtin_type] def promote_types(dtype1, dtype2): """ Get the smallest type to which the given scalar types can be cast. Args: dtype1 (builtin): dtype2 (builtin): Returns: A builtin datatype or None. Examples: >>> promote_types(int32, int64) builtin('int64') >>> promote_types(fp16, fp32) builtin('fp32') >>> promote_types(fp16, int32) builtin('fp16') """ nptype1 = nptype_from_builtin(dtype1) nptype2 = nptype_from_builtin(dtype2) # Circumvent the undesirable np type promotion: # >> np.promote_types(np.float32, np.int32) # dtype('float64') if np.issubdtype(nptype1, np.floating) and np.issubdtype(nptype2, np.signedinteger): nppromoted = nptype1 elif np.issubdtype(nptype2, np.floating) and np.issubdtype( nptype1, np.signedinteger ): nppromoted = nptype2 else: nppromoted = np.promote_types(nptype1, nptype2) return numpy_type_to_builtin_type(nppromoted) def promote_dtypes(dtypes): """ Get the smallest promoted dtype, to which all scalar dtypes (provided through dtypes list argument) can be casted. Args: List [dtype (builtin)] Returns: A builtin datatype or None. Examples: >>> promote_dtypes([int32, int64, int16]) builtin('int64') >>> promote_dtypes([fp16, fp32, fp64]) builtin('fp64') >>> promote_dtypes([fp16, int32, int64]) builtin('fp16') """ if not isinstance(dtypes, (list, tuple)) or len(dtypes) < 1: raise ValueError("dtypes needs to be a list/tuple of at least 1 element") # Deduplicate inputs to avoid redundant calculations. # Without dedup, too large input will cause maximum recursion depth exceeded error. dtypes = list(set(dtypes)) if len(dtypes) == 1: return dtypes[0] return promote_types(dtypes[0], promote_dtypes(dtypes[1:])) def is_primitive(btype): """ Is the indicated builtin type a primitive? """ return ( btype is types_bool or btype is types_str or is_float(btype) or is_int(btype) or is_complex(btype) ) def is_scalar(btype): """ Is the given builtin type a scalar integer, float, boolean or string? """ return ( is_bool(btype) or is_int(btype) or is_float(btype) or is_str(btype) or is_complex(btype) ) def is_tensor(tensor_type): if tensor_type is None: return False try: type_info = get_type_info(tensor_type).name except TypeError: return False return type_info == "tensor" def is_str(t): if t is None: return False try: type_info = get_type_info(t).name except TypeError: return False return type_info == "str" def is_tuple(t): if t is None: return False try: type_info = get_type_info(t).name except TypeError: return False return type_info == "tuple" def is_dict(t): if t is None: return False try: type_info = get_type_info(t).name except TypeError: return False return type_info == "dict" def is_builtin(t): return is_scalar(t) or is_tensor(t) or is_str(t) or is_tuple(t) def _numpy_dtype_instance_to_builtin_type(np_dtype: np.dtype) -> Optional[type]: metadata_dict = np_dtype.metadata if metadata_dict is not None and SUB_BYTE_DTYPE_METADATA_KEY in metadata_dict: return metadata_dict[SUB_BYTE_DTYPE_METADATA_KEY] if np_dtype in _NPTYPES_TO_STRINGS: return string_to_builtin(_NPTYPES_TO_STRINGS[np_dtype]) return None def numpy_type_to_builtin_type(nptype) -> type: """ Converts a numpy type to its builtin `types` equivalent. Supports Python native types and numpy types. """ if isinstance(nptype, np.dtype): builtin_type = _numpy_dtype_instance_to_builtin_type(nptype) if builtin_type is not None: return builtin_type # If this is a data type object, use the corresponding scalar data type. 
if issubclass(type(nptype), np.dtype): nptype = nptype.type if issubclass(nptype, (bool, np.bool_)): # numpy as 2 bool types it looks like. what is the difference? return types_bool # Because np.uint is a subclass of int, # we need to first check for np.uint before # checking for int elif issubclass(nptype, np.uint8): return types_uint8 elif issubclass(nptype, np.int8): return types_int8 elif issubclass(nptype, np.uint16): return types_uint16 elif issubclass(nptype, np.int16): return types_int16 elif issubclass(nptype, np.uint32): return types_uint32 elif issubclass(nptype, np.int32): return types_int32 elif issubclass(nptype, np.uint64): return types_uint64 elif issubclass(nptype, np.int64): return types_int64 elif issubclass(nptype, int) or nptype == int: # Catch all int return types_int32 elif issubclass(nptype, np.object_): # symbolic shape is considered int32 return types_int32 elif issubclass(nptype, np.float16): return types_fp16 elif ( issubclass(nptype, (np.float32, np.single)) or nptype == float ): return types_fp32 elif issubclass(nptype, (np.float64, np.double)): return types_fp64 elif issubclass(nptype, np.complex64): return types_complex64 elif issubclass(nptype, (np.complex128, complex)): return types_complex128 elif issubclass(nptype, (str, np.bytes_, np.str_)): return types_str else: raise TypeError(f"Unsupported numpy type: {nptype}.") # Tries to get the equivalent builtin type of a # numpy or python type. def type_to_builtin_type(type): # Infer from numpy type if it is one if type.__module__ == np.__name__: return numpy_type_to_builtin_type(type) # Otherwise, try to infer from a few generic python types if issubclass(type, bool): return types_bool elif issubclass(type, int): return types_int32 elif issubclass(type, str): return types_str elif issubclass(type, float): return types_fp32 elif issubclass(type, complex): return types_complex64 else: raise TypeError("Could not determine builtin type for " + str(type)) def numpy_val_to_builtin_val(npval): if np.isscalar(npval): ret_type = type_to_builtin_type(type(npval)) ret = ret_type() ret.val = npval return ret, ret_type else: builtintype = numpy_type_to_builtin_type(npval.dtype) from . import tensor as types_tensor ret_type = types_tensor(builtintype, npval.shape) ret = ret_type() ret.val = npval return ret, ret_type def is_subtype_tensor(type1, type2): # requires primitive types match if type1.get_primitive() != type2.get_primitive(): return False shape1 = type1.get_shape() shape2 = type2.get_shape() # Same rank if len(shape1) != len(shape2): return False for d1, d2 in zip(shape1, shape2): if d1 == d2: continue # tensor with shape (3, s0) is not a subtype of tensor with shape (3, # 1), but is a subtype of tensor with shape (3, s1) d1_is_symbolic = issubclass(type(d1), sm.Basic) d2_is_symbolic = issubclass(type(d2), sm.Basic) if d1_is_symbolic and d2_is_symbolic: continue if d1_is_symbolic and not d2_is_symbolic: return False if not d1_is_symbolic and not d2_is_symbolic and d1 != d2: return False return True def is_subtype(type1, type2): """ Return True if type1 is a subtype of type2. False otherwise. """ if type2 == unknown: return True # any class is a subclass of unknown (None) type. if is_list(type2): return is_list(type1) and is_subtype(type1.T[0], type2.T[0]) if is_tensor(type1) and is_tensor(type2): return is_subtype_tensor(type1, type2) return type1 == type2 def _numpy_val_to_bytes(val: Union[np.ndarray, np.generic]) -> bytes: # Import here to avoid circular import. 
from coremltools.optimize.coreml import _utils as optimize_utils builtin_type = numpy_type_to_builtin_type(val.dtype) if is_sub_byte(builtin_type): val = optimize_utils.pack_elements_into_bits(val, builtin_type.get_bitwidth()) return val.tobytes() def np_val_to_py_type(val): """Convert numpy val to python primitive equivalent. Ex: Given: val = np.array([True, False]) Returns: (True, False) Given: val = np.array(32, dtype=np.int32) Returns 32 """ if not isinstance(val, (_np.ndarray, _np.generic)): return val builtin_type = numpy_type_to_builtin_type(val.dtype) if builtin_type in mil_helper.IMMEDIATE_VALUE_TYPES_IN_BYTES: return _numpy_val_to_bytes(val) else: if val.dtype in (_np.uint16, _np.int16): # TODO (rdar://111797203): Serialize to byte after MIL changes to read from byte field. val = val.astype(np.int32) is_np_scalar = isinstance(val, _np.generic) or val.shape == () py_type = np_dtype_to_py_type(val.dtype) return py_type(val) if is_np_scalar else tuple(py_type(v) for v in val.flatten()) def infer_complex_dtype(real_dtype, imag_dtype): """Infers the complex dtype from real and imaginary part's dtypes.""" promoted_dtype = promote_types(real_dtype, imag_dtype) if promoted_dtype == types_fp32: return types_complex64 elif promoted_dtype == types_fp64: return types_complex128 else: raise ValueError( f"Unsupported real/imag dtype ({real_dtype}/{imag_dtype}) to construct a " f"complex dtype." ) def infer_fp_dtype_from_complex(complex_dtype): """Infers the fp dtype of real and imaginary part from the complex dtype.""" if complex_dtype == types_complex64: return types_fp32 elif complex_dtype == types_complex128: return types_fp64 else: raise ValueError(f"Unsupported complex dtype ({complex_dtype}).") def get_nbits_int_builtin_type(nbits: int, signed: True) -> type: """Get the nbits int built-in type.""" type_prefix = "u" if not signed else "" return string_to_builtin(f"{type_prefix}int{nbits}") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/types/type_spec.py0000644000000000000000000000566214672066616024317 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause class Type: """ - Type.name : A string with the name of the object - Type.tparam : For classes with template parameters, (list, dict), this contains a list of Type objects of the template parameters - Type.python_class : The original python class implementing this type. 
Two Type objects compare equal only on name and tparam and not python_class """ __slots__ = ["name", "tparam", "python_class"] def __init__(self, name, tparam=None, python_class=None): if tparam is None: tparam = [] assert isinstance(name, str) assert isinstance(tparam, list) self.name = name self.tparam = tparam self.python_class = python_class def __hash__(self): return hash((self.name, tuple(self.tparam))) def __eq__(self, other): return self.name == other.name and self.tparam == other.tparam def __ne__(self, other): return not self.__eq__(other) def __repr__(self): ret = self.name if len(self.tparam) > 0: ret += "[" + ",".join(repr(x) for x in self.tparam) + "]" return ret def __str__(self): return self.__repr__() def sexp(self): if len(self.tparam) == 0: return self.name else: ret = [self.name] ret.append([a.sexp() if hasattr(a, "sexp") else a for a in self.tparam]) return ret class FunctionType: """ - FunctionType.inputs : A list of Type objects defining the types of the input - FunctionType.output: A Type object defining the type of the output - FunctionType.python_function : The original python function implementing this type. Two FunctionType objects compare equal only on inputs and output and not python_function """ __slots__ = ["inputs", "output", "python_function"] def __init__(self, inputs, output, python_function=None): assert isinstance(inputs, list) assert isinstance(output, (FunctionType, Type)) self.inputs = inputs self.output = output self.python_function = python_function def __hash__(self): return hash((tuple(self.inputs), self.output)) def __eq__(self, other): return self.inputs == other.inputs and self.output == other.output def __repr__(self): return "(" + ",".join(repr(x) for x in self.inputs) + ")->" + repr(self.output) def __str__(self): return self.__repr__() def return_sexp(self): return self.output.sexp() def inputs_sexp(self): return [i.sexp() for i in self.inputs] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/types/type_state.py0000644000000000000000000000221414672066616024473 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.converters.mil.mil.types.get_type_info import get_type_info from coremltools.converters.mil.mil.types.type_spec import Type def memoize(f): memo = {} def helper(state_type): if state_type not in memo: memo[state_type] = f(state_type) return memo[state_type] return helper @memoize def state(state_type): class state: T = [state_type] def __init__(self): self.val = [] @property def val(self): return self._val @classmethod def wrapped_type(cls): return state_type @classmethod def __type_info__(cls): return Type("state", [get_type_info(state_type)], python_class=cls) state.__template_name__ = f"state[{get_type_info(state_type).name}]" return state def is_state(t): if t is None: return False return get_type_info(t).name == "state" ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/types/type_str.py0000644000000000000000000000120114672066616024156 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
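# A minimal sketch (not part of the original sources) of how the memoized ``state`` factory
# defined in type_state.py behaves, assuming ``state``, ``tensor``, ``fp32`` and ``is_state``
# are re-exported from the ``types`` package:
#
#     from coremltools.converters.mil.mil import types
#     StateT = types.state(types.tensor(types.fp32, (2, 3)))
#     assert StateT is types.state(types.tensor(types.fp32, (2, 3)))  # memoized: same class object
#     assert types.is_state(StateT)
#     assert types.is_tensor(StateT.wrapped_type())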
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from .annotate import annotate, class_annotate, delay_type from .type_spec import Type @class_annotate() class str: def __init__(self, v=""): self.val = v @classmethod def __type_info__(cls): return Type("str", python_class=cls) @annotate(delay_type.str, other=delay_type.str) def __add__(self, other): assert isinstance(other, str) return str(self.val + other.val) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/types/type_tensor.py0000644000000000000000000001501414672066616024667 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np import sympy as sm from coremltools import _logger as logger from .get_type_info import get_type_info from .symbolic import is_symbolic from .type_mapping import ( builtin_to_string, is_subtype, is_tensor, nptype_from_builtin, numpy_type_to_builtin_type, promote_types, ) from .type_spec import Type def memoize(f): memo = {} def helper(x, y): y = tuple(y) if (x, y,) not in memo: memo[(x, y,)] = f(x, y,) return memo[(x, y,)] return helper def canonical_shape(shape): """ Return shape as tuple of int or Symbol. This utility function ensures the shape tuple using a single integer type (to its best effort). Args: shape: tuple(int|long|np.int*|Symbol|SymbolExpr...) """ def try_cast(x): try: # In python2.7, long and int are different types. # If we cast a long int whose value is out of the range of int, # the result is still long, avoiding overflow: # # `type(2<<64) == long # True` # `type(int(2<<64)) == long # True` x = int(x) except TypeError: # ignore symbolic value (sm.Symbol or sm.Expr) pass return x return tuple(try_cast(x) for x in shape) @memoize def tensor(primitive, shape): shape = canonical_shape(shape) class tensor: T = [primitive, shape] def __init__(self): self._val = [] @classmethod def __type_info__(cls): return Type( "tensor", list(shape) + [get_type_info(primitive)], python_class=cls ) @classmethod def get_primitive(cls): return primitive @classmethod def get_shape(cls): return shape @property def val(self): return self._val @val.setter def val(self, v): if not isinstance(v, np.ndarray): try: v = np.array(v) except: raise ValueError( f"tensor value type should be compatible with type np.ndarray, " f"got {type(v)} instead" ) v_type = numpy_type_to_builtin_type(v.dtype) promoted_type = promote_types(v_type, primitive) primitive_np_type = nptype_from_builtin(primitive) if v_type == primitive or v.dtype == np.dtype("O") or v.dtype == primitive_np_type: # np.array of symbolic has object type. Don't cast type. 
self._val = v elif promoted_type == primitive: self._val = v.astype(primitive_np_type) else: logger.warning( "Saving value type of {} into a builtin type of {}, might lose precision!".format( v.dtype, builtin_to_string(primitive) ) ) self._val = v.astype(primitive_np_type) tensor.__template_name__ = ( "tensor[" + primitive.__name__ + "," + ",".join(str(s) for s in shape) + "]" ) tensor.__name__ = ( "tensor[" + ",".join(str(s) for s in shape) + "," + primitive.__name__ + "]" ) return tensor def tensor_has_complete_shape(tensor_type): if not is_tensor(tensor_type): return True s = tensor_type.get_shape() if -1 in s: return False elif len(s) == 0: return False else: return True def is_tensor_and_is_compatible(tensor_type1, tensor_type2, allow_promotion=False): """ Try to find a tensor type compatible with both input types. Compatible means that the tensors have the same rank and matching or unspecified dimensions. For example, (10, -1) is compatible with (-1, 20) with the compatible shape (10, 20). Args: tensor_type1 (types.tensor) tensor_type2 (types.tensor) allow_promotion (bool): If True, allow primitive types to be promoted. Returns: A pair of (bool, type). If the given types are not tensor types with (1) compatible shapes and (2) either identical primitive types or allow_promition=True, return is False, None. Otherwise, return True and the compatible shape. Note that the returned shape may not be the same as either input. For example, is_tensor_and_is_compatible( tensor[fp32,[10,-1]], tensor[fp32,[-1,20]]) --> tensor[fp32, [10,20]] """ if not is_tensor(tensor_type1) or not is_tensor(tensor_type2): return False, None shape1 = tensor_type1.get_shape() shape2 = tensor_type2.get_shape() primitive_type = tensor_type1.get_primitive() if primitive_type != tensor_type2.get_primitive(): promoted_type = promote_types(primitive_type, tensor_type2.get_primitive()) if allow_promotion: primitive_type = promoted_type else: return False, promoted_type if len(shape1) == 0: return True, tensor_type2 if len(shape2) == 0: return True, tensor_type1 if len(shape1) != len(shape2): return False, None most_specific_shape = [] for i in range(len(shape1)): if shape1[i] == -1 or issubclass(type(shape1[i]), sm.Basic): most_specific_shape.append(shape2[i]) elif shape2[i] == -1 or issubclass(type(shape2[i]), sm.Basic): most_specific_shape.append(shape1[i]) elif shape1[i] == shape2[i]: most_specific_shape.append(shape1[i]) elif is_symbolic(shape1[i]) or is_symbolic(shape2[i]): most_specific_shape.append(shape1[i] if is_symbolic(shape2[i]) else shape2[i]) elif shape1[i] != shape2[i]: return False, None return True, tensor(primitive_type, most_specific_shape) def is_compatible_type(type1, type2): """ Return if type1 and type2 are compatible. """ # For single-element tensor, it's compatible with scalar. if is_tensor(type1) and len(type1.get_shape()) == 0: type1 = type1.get_primitive() if is_tensor(type2) and len(type2.get_shape()) == 0: type2 = type2.get_primitive() if not is_subtype(type1, type2): is_comp, _ = is_tensor_and_is_compatible(type1, type2) return is_comp return True ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/types/type_tuple.py0000644000000000000000000000235714672066616024514 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from . 
import type_int, type_unknown from .annotate import annotate from .get_type_info import get_type_info from .type_spec import Type _global_tuple = tuple def memoize(f): memo = {} def helper(x): x = _global_tuple(x) if x not in memo: memo[x] = f(x) return memo[x] return helper class empty_list: @classmethod def __type_info__(cls): return Type("empty_list", python_class=cls) @memoize def tuple(args): args = _global_tuple(i if i is not None else type_unknown.unknown for i in args) class tuple: T = args def __init__(self): self.val = [arg() for arg in args] @classmethod def __type_info__(cls): return Type("tuple", [get_type_info(arg) for arg in args], python_class=cls) @annotate(type_int.int64) def __len__(self): return len(args) tuple.__template_name__ = ( "tuple[" + ",".join([get_type_info(arg).name for arg in args]) + "]" ) return tuple ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/types/type_unknown.py0000644000000000000000000000072414672066616025056 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from .type_spec import Type class unknown: """ unknown is basically Any type. """ @classmethod def __type_info__(cls): return Type("unknown", python_class=cls) def __init__(self, val=None): self.val = val ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/types/type_void.py0000644000000000000000000000054014672066616024314 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from .type_spec import Type class void: @classmethod def __type_info__(cls): return Type("void", python_class=cls) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/utils.py0000644000000000000000000001030714672066616022310 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import Dict, List, Optional from .operation import Operation class OpNode: """ A helper node class for the doubly linked list. It contains an Operation data and pointers to the previous and the next node. """ def __init__(self, op: Operation): self.op = op self.next: Optional[OpNode] = None self.prev: Optional[OpNode] = None class CacheDoublyLinkedList: """ This array-like data structure is useful to implement pymil's core program transformations, including: 1. Insert an op at a target location (before a target op) 2. Remove an op from the program Given the fact that each op in the list must be unique, a hash table is maintained in this data structure, and hence the insert / pop can both be performed in O(1). 
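    Example (an illustrative sketch; ``op1``, ``op2``, ``op3`` stand for pymil Operation objects):

        dll = CacheDoublyLinkedList([op1, op2])
        dll.insert_op_before(op3, before_op=op2)   # order is now op1, op3, op2
        dll.remove(op1)                            # O(1) removal through the op -> node cache
        assert list(dll) == [op3, op2]
        assert len(dll) == 2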
""" INVALID_NODE = OpNode(None) def __init__(self, array: Optional[List[Operation]] = None): self.start: OpNode = None self.end: OpNode = None self.op_to_node: Dict[Operation, OpNode] = {} if array is not None: for op in array: self.insert_op_before(op) def insert_op_before(self, new_op: Operation, before_op: Optional[Operation] = None): """ Insert an op right before before_op. If before_op is None, then the new op is appended in the end. """ if new_op in self.op_to_node: raise ValueError(f"{new_op} already exisits.") new_node = OpNode(new_op) if before_op is None: # If before op is None, the new node is appended in the end. if self.start is None: self.start = self.end = new_node else: self.end.next = new_node new_node.prev = self.end self.end = new_node else: anchor_node = self.op_to_node[before_op] prev_node = anchor_node.prev if prev_node is None: self.start = new_node else: prev_node.next = new_node new_node.prev = prev_node new_node.next = anchor_node anchor_node.prev = new_node self.op_to_node[new_op] = new_node def remove(self, op: Operation): """ Remove an op from the data structure. """ node = self.op_to_node[op] prev_node, next_node = node.prev, node.next # reconnect the linked list if prev_node is None: self.start = next_node else: prev_node.next = next_node if next_node is None: self.end = prev_node else: next_node.prev = prev_node node.prev = node.next = self.INVALID_NODE # remove op from the cache del self.op_to_node[op] def __getitem__(self, idx: int) -> Operation: """ The indexing is expensive in doubly linked list, we should prevent direct access besides [0] and [-1]. """ if self.start is None: raise ValueError("Cannot index an empty list.") if idx >= len(self): raise ValueError("Index out of range") if idx == 0: return self.start.op elif idx == -1: return self.end.op raise ValueError("Doubly linked list does not support indexing other than 0, -1.") def _get_node_from_op(self, op: Operation) -> OpNode: return self.op_to_node[op] def __iter__(self): cursor = self.start while cursor is not None: if cursor is self.INVALID_NODE: raise ValueError("Invalid iterator on CacheDoublyLinkedList.") yield cursor.op cursor = cursor.next def __reversed__(self): cursor = self.end while cursor is not None: if cursor is self.INVALID_NODE: raise ValueError("Invalid iterator on CacheDoublyLinkedList.") yield cursor.op cursor = cursor.prev def __len__(self) -> int: return len(self.op_to_node) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/var.py0000644000000000000000000003633014672066616021744 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy from collections import defaultdict from typing import Dict, List, Optional, Union import numpy as np from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.types.symbolic import any_symbolic from .scope import ScopeSource class Var: """ Var represents the outputs of an Operation. Most Vars are derived from an Operation (including const), and all Vars must have `sym_type`. 
Example Usage: from coremltools.converters.mil.mil import ( Builder as mb, Function, types ) func_inputs = {"a": mb.placeholder(shape=(1,2)), "b": mb.placeholder(shape=(1,2)) } with Function(func_inputs) as ssa_func: a, b = ssa_func.inputs["a"], ssa_func.inputs["b"] res = mb.add(x=a, y=b) # res is Var assert types.is_tensor(res.sym_type) assert res.rank == 2 assert res.dtype == types.float # since a, b are by default float # value is not available at compile time in this case. If # materializable, res.val would be a numpy / primitive value assert res.val is None Comment: Except InternalVar and Vars created in while_loop and by placeholder, all Var should only be constructed by Operation to represent outputs. Comment: Var hides the details of sym_type vs sym_val vs materialized value, which was represented by 2 objects prior to refactoring. # Properties: name: (str) name in MIL proto NamedValueType. Name is assigned by the parent Operation. sym_type [_sym_type]: (builtin type class) All Var must have a (possibly symbolic) type, usually derived from type inference of upstream ops or from default values in _Input. sym_val [_sym_val]: (builtin type instance) Possibly symbolic value. val [_sym_val]: (np.ndarray or python primitive scalar) Numpy (scalar / tensor) value. `val` is not None iff `sym_val` is not None and does not contain symbols. Read-only. op [_op]: (Operation) The Operation this Var is derived from. May not be None except for InternalVar. Read-only. op_output_idx: (int) Idx of the output from Operation corresponding to _Input. May be None. child_ops [_child_ops]: list[Operation] Ops that take this Var as an input. nonreplaceable_vars_upstream: set[Var] Set that consists of nonreplaceable vars upstream """ __slots__ = [ "name", "_sym_type", "_sym_val", "_op", "op_output_idx", "_child_ops", "consuming_blocks", "_nonreplaceable_vars_upstream", "is_descendant_of_const", ] def __init__( self, name, sym_type, sym_val=None, op=None, op_output_idx=None, ): """ sym_type (builtin type) sym_val (builtin value) op (Operation) op_output_idx (int) """ self.name = name self._sym_type = sym_type self._sym_val = sym_val self._op = op self.op_output_idx = op_output_idx # An op can appear twice if it consumes a var twice (e.g., # add(%1, %1), while_loop(loop_vars=(%1, %1)). self._child_ops = list() # A variable may not be consumed by any op (i.e. len(self._child_ops) # == 0) but is still used as block output. A var can be output of # multiple blocks (e.g., both current block and nested blocks) self.consuming_blocks = list() # replaceability self._nonreplaceable_vars_upstream = set() self._set_nonreplaceable_vars_upstream() self._adjust_sym_val() # Track vars constness, which requires a var to satisfy one of the following: # 1. var.val is not None, which's mean the converter already has its compile time value through value inference. # 2. Is a descendant of ``constexpr_`` ops. We don't compute the value inference of those ``constexpr_`` ops, # due to the fact it can potentially results in memory issue. 
self.is_descendant_of_const = Var._propagate_constness_upstream(self) def _adjust_sym_val(self): """For sub-byte dtype var, adjust the sym_val to make sure it reflects the true dtype.""" if types.is_list(self.sym_type): return if not types.is_sub_byte(self.dtype): return if isinstance(self.sym_val, (np.generic, np.ndarray)): np_val = self._sym_val.val if ( np_val.dtype.metadata is None or types.SUB_BYTE_DTYPE_METADATA_KEY not in np_val.dtype.metadata ): target_np_dtype = types.nptype_from_builtin(self.dtype) self._sym_val.val = np_val.astype(target_np_dtype) @property def nonreplaceable_vars_upstream(self): return self._nonreplaceable_vars_upstream @nonreplaceable_vars_upstream.setter def nonreplaceable_vars_upstream(self, val): assert isinstance(val, set) self._nonreplaceable_vars_upstream = val @staticmethod def _is_nonreplaceable_var(var): op = var.op if op is None: return False return op.op_type.startswith("constexpr_") @staticmethod def _propagate_constness_upstream(var): op = var.op if op is None: return False if ( op.op_type.startswith("constexpr_") or (op.op_type == "dequantize" and op.can_materialize_val()) or var.val is not None ): return True flattened_inputs = op.get_flattened_inputs() return all([x.is_descendant_of_const for x in flattened_inputs]) def _set_nonreplaceable_vars_upstream(self): """ A utility function to set the value of the "nonreplaceable_vars_upstream" property. If self is a non-replaceable var, then "nonreplaceable_vars_upstream" is a single element set, containing self. Otherwise, it is a union of the "nonreplaceable_vars_upstream" sets of all the input vars of its parent ops. """ op = self.op if op is None: return if op.op_type == "shape": # For the meta data ops, like shape, we stop propogate the nonreplaceable_vars. self.nonreplaceable_vars_upstream = set() return if Var._is_nonreplaceable_var(self): self.nonreplaceable_vars_upstream = set([self]) else: flattened_inputs = op.get_flattened_inputs() inputs_nonreplaceable_vars_upstream = [p.nonreplaceable_vars_upstream for p in flattened_inputs] if len(inputs_nonreplaceable_vars_upstream) > 0: self.nonreplaceable_vars_upstream = set.union(*inputs_nonreplaceable_vars_upstream) def _reset_nonreplaceable_vars_upstream(self): self.nonreplaceable_vars_upstream = set() def can_be_replaced_by_var(self, new_var): """ A var can be replaced by a new var only if the new var's nonreplaceable_vars_upstream is the super set of the old one """ return self.nonreplaceable_vars_upstream.issubset(new_var.nonreplaceable_vars_upstream) def can_be_folded_to_const(self) -> bool: """ When translating frontend ops to PyMIL ops, some vars could be directly folded into a const. For example, in PyTorch's `to()` op, the input could be converted by `cast` op, or directly be folded to const. We only fold the var to a const when its value is known AND it doesn't have any non-replaceable vars in the upstream. """ return self.val is not None and not self.nonreplaceable_vars_upstream @property def sym_type(self): return self._sym_type @property def shape(self): if types.is_tensor(self._sym_type): return self._sym_type.get_shape() if types.is_state(self._sym_type): wrapped_type = self._sym_type.wrapped_type() assert types.is_tensor(wrapped_type), "only tensor type is supported in state type." 
return wrapped_type.get_shape() return tuple() @property def rank(self): return len(self.shape) @property def dtype(self): if types.is_tensor(self._sym_type): return self._sym_type.get_primitive() if types.is_state(self._sym_type): wrapped_type = self._sym_type.wrapped_type() assert types.is_tensor(wrapped_type), "only tensor type is supported in state type." return wrapped_type.get_primitive() return self._sym_type @property def sym_val(self): if self._sym_val is None: return None return self._sym_val.val @property def val(self): if self._sym_val is None or any_symbolic(self._sym_val.val): return None return self._sym_val.val @property def op(self): return self._op @property def child_ops(self): return self._child_ops def add_child_op(self, new_op): self._child_ops.append(new_op) def remove_child_op(self, target_op, no_check=False): if target_op not in self._child_ops: if no_check: return # no-op msg = "Op {} does not takes Var {} as input" raise ValueError(msg.format(target_op.name, self.name)) self._child_ops.remove(target_op) def shape_str(self): annotation = "" if self.val is not None: annotation = "*" elif self.sym_val is not None: annotation = "^" shape_str = str(self.shape)[:-1] # trim the ")" if self.rank > 1: shape_str += ", " if types.builtin_to_string(self.dtype) is None: shape_str += ")" + annotation else: shape_str += types.builtin_to_string(self.dtype) + ")" + annotation return shape_str def type_str(self): is_tensor = types.is_tensor(self.sym_type) is_list = types.is_list(self.sym_type) is_state = types.is_state(self.sym_type) if is_tensor: type_string = "(Tensor)" elif is_list: type_string = "(List)" elif is_state: type_string = "(State)" else: type_string = "(Scalar)" return type_string def set_name(self, name): self.name = name def is_tensor_or_scalar_of(self, dtype: Union[str, type]): if isinstance(dtype, type): dtype = types.builtin_to_string(dtype) return ( types.is_tensor(self.sym_type) or types.is_scalar(self.sym_type) ) and types.builtin_to_string(self.dtype) == dtype def __str__(self): return "%" + self.name + ": " + self.shape_str() + self.type_str() @property def scopes(self) -> Dict[ScopeSource, List[str]]: if self.op is None: # An empty dictionary is returned for function input vars. return defaultdict(list) return self.op.scopes @scopes.setter def scopes(self, scopes: Dict[ScopeSource, List[str]]): if self.op is None: raise ValueError(f"Cannot set scopes to a function input var {self}.") self.op.scopes = copy.deepcopy(scopes) class ListVar(Var): __slots__ = ["_elem_type", "init_length", "dynamic_length"] def __init__( self, name, elem_type=None, init_length=None, dynamic_length=True, sym_val=None, **kwargs ): """ elem_type (builtin.tensor) init_length (int): initial length dynamic_length (bool): True to allow list to grow. False uses init_length as the fixed size (init_length is runtime length). 
sym_val: value of the list, if available """ super().__init__( name=name, sym_type=types.list(elem_type, init_length, dynamic_length), sym_val=sym_val, **kwargs ) self._elem_type = elem_type self.init_length = init_length self.dynamic_length = dynamic_length @property def shape(self): raise ValueError("shape not applicable to ListVar '{}'.".format(self.name)) @property def rank(self): raise ValueError("rank not applicable to ListVar '{}'".format(self.name)) @property def dtype(self): raise ValueError("dtype not applicable to ListVar '{}'".format(self.name)) @property def elem_type(self): return self._elem_type @property def elem_shape(self): if self._elem_type == types.unknown: return None elif types.is_tensor(self._elem_type): return self._elem_type.get_shape() return () def shape_str(self): length = "?" if not self.dynamic_length: length = str(self.init_length) if self._elem_type == types.unknown: return "List[{}, unknown]".format(length) if self._elem_type == types.str: return "List[{}, str]".format(length) elif self._elem_type == types.int64: return "List[{}, int]".format(length) else: elem_shape = self._elem_type.get_shape() elem_dtype = self._elem_type.get_primitive() shape_str = str(elem_shape)[:-1] # trim the ")" if len(elem_shape) > 1: shape_str += ", " shape_str += types.builtin_to_string(elem_dtype) + ")" return "List[{}, {}]".format(length, shape_str) class InternalVar(Var): """ Internal Var (with '__' prefix and won't appear in SSA) will ALWAYS have `sym_val == builtin.unknown`. InternalVar are constructed by builder only. Comment: Internal Var can be used to represent diverse types such as enum type `DataType.FLOAT32`. """ def __init__(self, val, name=None): super().__init__( name=name, sym_type=types.unknown, sym_val=types.unknown(val) ) class ComplexVar(Var): """Var to handle complex data.""" __slots__ = ["_real", "_imag"] def __init__( self, name, sym_type, sym_val=None, op=None, op_output_idx=None, real: Optional[Var] = None, imag: Optional[Var] = None, ): super().__init__( name=name, sym_type=sym_type, sym_val=sym_val, op=op, op_output_idx=op_output_idx, ) # Handle complex data types. self._real: Optional[Var] = real self._imag: Optional[Var] = imag @property def real(self): return self._real @property def imag(self): return self._imag @real.setter def real(self, real): if not types.is_complex(self.dtype): raise ValueError( f"Only complex number can set `real`. This var is {self.dtype}." ) self._real = real @imag.setter def imag(self, imag): if not types.is_complex(self.dtype): raise ValueError( f"Only complex number can set `imag`. This var is {self.dtype}." ) self._imag = imag ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1726511965.253547 coremltools-8.0/coremltools/converters/mil/mil/visitors/0000755000000000000000000000000014672075535022457 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/visitors/__init__.py0000644000000000000000000000033214672066616024566 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
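# An illustrative sketch (variable names are hypothetical) of the replaceability rule
# implemented by Var.can_be_replaced_by_var above: a var may only be replaced when every
# non-replaceable ancestor it depends on also appears upstream of the candidate replacement.
#
#     # old_var is derived from a constexpr_ op, new_var is a freshly materialized constant
#     old_var.nonreplaceable_vars_upstream      # {constexpr_var}
#     new_var.nonreplaceable_vars_upstream      # set()
#     old_var.can_be_replaced_by_var(new_var)   # False: the constexpr ancestor would be lost
#     new_var.can_be_replaced_by_var(old_var)   # True: set() is a subset of {constexpr_var}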
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/mil/visitors/dot_visitor.py0000644000000000000000000001372414672066616025405 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ..var import Var def _get_input_vars(op, only_nonconst_vars=False): """ Return type : List[Var] """ input_vars = [] for name, val in op.inputs.items(): if isinstance(val, Var): if only_nonconst_vars: if val.op and val.op.op_type == "const": continue input_vars.append(val) elif isinstance(val, (list, tuple)): for var in val: if not isinstance(var, Var): msg = "unrecognized input type of op='{}', input='{}'" raise ValueError(msg.format(op.name, name)) if only_nonconst_vars: if var.op and var.op.op_type == "const": continue input_vars.append(var) else: msg = "unrecognized input type of op='{}', input='{}'" raise ValueError(msg.format(op.name, name)) return input_vars class DotVisitor: """ Generates a dot description of a ssa block """ def __init__(self, annotation=True): self.result = [] self.visited_memo = {} self.highlights = {} self.alternate_labeller = lambda o: o.op_type + ": " + o.name self.annotation = annotation def labeller(self, labeller): self.alternate_labeller = labeller return self def highlight_nodes(self, nodeset, color="yellow"): for i in nodeset: self.highlights[i] = color return self def visit(self, block, op, nodename_prefix=""): """ Append edges connecting parents of op to the op """ if op in self.visited_memo: return self label = self.alternate_labeller(op) self.visited_memo[op] = 1 if op.name in self.highlights and op.name not in [ o.name for o in block.outputs ]: self.result.append( '"' + nodename_prefix + "op: " + op.name + '"' + '[label="' + label + '",fillcolor=%s,style=filled,fontcolor=%s]' % (self.highlights[op.name], "violetred") ) else: self.result.append( '"' + nodename_prefix + "op: " + op.name + '"' + '[label="' + label + '",fontcolor=%s]' % ("violetred") ) for input_var in _get_input_vars(op, only_nonconst_vars=True): if input_var.op is not None: input_name = "op: " + input_var.op.name else: input_name = input_var.name edge = ( '"' + nodename_prefix + input_name + '"' + " -> " + '"' + nodename_prefix + "op: " + op.name + '"' ) self.result.append(edge) if input_var.op is not None: self.visit(block, input_var.op, nodename_prefix) else: self.visit_input_var(input_var, nodename_prefix) return self def visit_input_var(self, var, nodename_prefix=""): label = "input: " + var.name if var.name in self.highlights: self.result.append( '"' + nodename_prefix + var.name + '"' + '[label="' + label + '",fillcolor=%s,style=filled,fontcolor=%s]' % (self.highlights[var.name], "violetred") ) else: self.result.append( '"' + nodename_prefix + var.name + '"' + '[label="' + label + '",fontcolor=%s]' % ("violetred") ) def visit_output_vars(self, block, var, nodename_prefix=""): label = "output: " + var.name if var.name in self.highlights: self.result.append( '"' + nodename_prefix + var.name + '"' + '[label="' + label + '",fillcolor=%s,style=filled,fontcolor=%s]' % (self.highlights[var.name], "violetred") ) else: self.result.append( '"' + nodename_prefix + var.name + '"' + '[label="' 
+ label + '",fontcolor=%s]' % ("violetred") ) parent_op = var.op edge = ( '"' + nodename_prefix + "op: " + parent_op.name + '"' + " -> " + '"' + nodename_prefix + var.name + '"' ) self.result.append(edge) self.visit(block, parent_op, nodename_prefix=nodename_prefix) def visit_all(self, block, nodename_prefix=""): for out_var in block.outputs: self.visit_output_vars(block, out_var, nodename_prefix=nodename_prefix) for op in block.operations: if op.op_type != "const": self.visit(block, op, nodename_prefix=nodename_prefix) return self def get_result(self, graphtype="digraph", graph_name="g"): return ( graphtype + " " + graph_name + " {\n\t" + "\n\t".join(str(i) for i in self.result) + ';\n\tlabel="' + graph_name[8:] + '";\n\tfontsize=96;\n}' ) def __str__(self): return self.get_result() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/test_inputs_outputs_shape.py0000644000000000000000000006727014672066616025746 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import os import tempfile import numpy as _np import PIL.Image import pytest import coremltools as ct from coremltools._deps import _HAS_TF_2, _HAS_TORCH, MSG_TF2_NOT_FOUND, MSG_TORCH_NOT_FOUND from coremltools.converters.mil import Builder as mb from coremltools.converters.mil.testing_reqs import backends, compute_units if _HAS_TORCH: import torch torch.manual_seed(10) class TestConvModule(torch.nn.Module): def __init__(self, in_channels=3, out_channels=10, kernel_size=3): super(TestConvModule, self).__init__() self.conv = torch.nn.Conv2d(in_channels, out_channels, kernel_size) def forward(self, x): return self.conv(x) class TestSimpleModule(torch.nn.Module): def forward(self, x): x = x + 1.0 y = x - 9.0 z = torch.sum(x) return x, y, z if _HAS_TF_2: import tensorflow as tf def _numpy_array_to_pil_image(x): """ convert x of shape (1, 3, H, W) to PIL image """ assert len(x.shape) == 4 assert list(x.shape[:2]) == [1, 3] x = x[0, :, :, :] # (3, H, W) x = _np.transpose(x, [1, 2, 0]) # (H, W, 3) x = x.astype(_np.uint8) return PIL.Image.fromarray(x) def _compute_snr(arr1, arr2): arr1 = arr1.flatten() arr2 = arr2.flatten() noise = arr1 - arr2 noise_var = _np.sum(noise**2) / len(noise) + 1e-7 signal_energy = _np.sum(arr2**2) / len(arr2) max_signal_energy = _np.amax(arr2**2) snr = 10 * _np.log10(signal_energy / noise_var) psnr = 10 * _np.log10(max_signal_energy / noise_var) return snr, psnr def _assert_torch_coreml_output_shapes( coreml_model, spec, torch_model, torch_example_input, is_image_input=False ): torch_out = torch_model(torch_example_input) input_name = spec.description.input[0].name output_name = spec.description.output[0].name input_dict = {} if is_image_input: input_dict[input_name] = _numpy_array_to_pil_image(torch_example_input.numpy()) else: input_dict[input_name] = torch_example_input.numpy() coreml_out = coreml_model.predict(input_dict)[output_name] assert torch_out.shape == coreml_out.shape snr, psnr = _compute_snr(torch_out.cpu().detach().numpy(), coreml_out) _np.testing.assert_array_less(20, snr) _np.testing.assert_array_less(30, psnr) class TestOutputShapes: @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_static_output_shapes(backend): @mb.program( input_specs=[ mb.TensorSpec( shape=(2, 3), ) ] ) def prog(x): x = 
mb.add(x=x, y=1.0) y = mb.sub(x=x, y=3.0) z = mb.reduce_sum(x=x, axes=[0, 1], keep_dims=False) return x, y, z model = ct.convert(prog, convert_to=backend[0]) spec = model.get_spec() expected_output_shape = [2, 3] if backend[0] == "mlprogram" else [] assert spec.description.output[0].type.multiArrayType.shape == expected_output_shape assert spec.description.output[1].type.multiArrayType.shape == expected_output_shape # scalar outputs have shape () assert spec.description.output[2].type.multiArrayType.shape == [] coreml_in = {"x": _np.random.rand(2, 3)} model.predict(coreml_in) @staticmethod @pytest.mark.parametrize( "backend", backends, ) def test_dynamic_output_shapes(backend): example_input = torch.rand(2, 3) traced_model = torch.jit.trace(TestSimpleModule().eval(), example_input) input_shape = ct.Shape(shape=(2, ct.RangeDim(3, 5))) model = ct.convert( traced_model, inputs=[ct.TensorType(shape=input_shape)], convert_to=backend[0] ) spec = model.get_spec() # We don't put the shape information for dynamic output shapes, # otherwise a runtime validation error would raise assert spec.description.output[0].type.multiArrayType.shape == [] assert spec.description.output[1].type.multiArrayType.shape == [] # scalar outputs have shape () assert spec.description.output[2].type.multiArrayType.shape == [] coreml_in = {"x_1": _np.random.rand(2, 3)} model.predict(coreml_in) @pytest.mark.skipif(not _HAS_TORCH or not ct.utils._is_macos(), reason=MSG_TORCH_NOT_FOUND) class TestFlexibleInputShapesTorch: @pytest.mark.parametrize( "backend, compute_unit", itertools.product( backends, compute_units, ), ) def test_multiarray_input_rangedim(self, backend, compute_unit): convert_to = backend[0] if convert_to == "mlprogram" and ct.utils._macos_version() < (12, 0): return example_input = torch.rand(1, 3, 50, 50) * 100 traced_model = torch.jit.trace(TestConvModule().eval(), example_input) input_shape = ct.Shape( shape=(1, 3, ct.RangeDim(25, 100, default=45), ct.RangeDim(25, 100, default=45)) ) model = ct.convert( traced_model, inputs=[ct.TensorType(shape=input_shape)], convert_to=convert_to, compute_units=compute_unit, ) spec = model.get_spec() assert list(spec.description.input[0].type.multiArrayType.shape) == [1, 3, 45, 45] assert ( spec.description.input[0].type.multiArrayType.shapeRange.sizeRanges[2].lowerBound == 25 ) assert ( spec.description.input[0].type.multiArrayType.shapeRange.sizeRanges[2].upperBound == 100 ) _assert_torch_coreml_output_shapes(model, spec, traced_model, example_input) @pytest.mark.parametrize( "backend, compute_unit, explicitly_set", itertools.product( backends, compute_units, [True, False], ), ) def test_multiarray_input_rangedim_infinite(self, backend, compute_unit, explicitly_set): convert_to = backend[0] example_input = torch.rand(1, 3, 50, 50) * 100 traced_model = torch.jit.trace(TestConvModule().eval(), example_input) second_dim = ct.RangeDim() if explicitly_set: second_dim.upper_bound = -1 input_shape = ct.Shape(shape=(1, 3, second_dim, ct.RangeDim(25, 100, default=45))) if convert_to == "mlprogram": with pytest.raises( ValueError, match="For mlprogram, inputs with infinite upper_bound is not allowed. 
Please set " 'upper_bound to a positive value in "RangeDim\(\)" for the "inputs" param in ' "ct.convert\(\).", ): ct.convert( traced_model, inputs=[ct.TensorType(shape=input_shape)], convert_to=convert_to, compute_units=compute_unit, ) else: model = ct.convert( traced_model, inputs=[ct.TensorType(shape=input_shape)], convert_to=convert_to, compute_units=compute_unit, ) spec = model.get_spec() assert list(spec.description.input[0].type.multiArrayType.shape) == [1, 3, 1, 45] assert ( spec.description.input[0].type.multiArrayType.shapeRange.sizeRanges[2].lowerBound == 1 ) assert ( spec.description.input[0].type.multiArrayType.shapeRange.sizeRanges[2].upperBound == -1 ) _assert_torch_coreml_output_shapes(model, spec, traced_model, example_input) @pytest.mark.parametrize( "backend, compute_unit", itertools.product( backends, compute_units, ), ) def test_multiarray_input_enumerated(self, backend, compute_unit): convert_to = backend[0] if convert_to == "mlprogram" and ct.utils._macos_version() < (12, 0): return example_input = torch.rand(1, 3, 50, 50) * 100 traced_model = torch.jit.trace(TestConvModule().eval(), example_input) input_shape = ct.EnumeratedShapes( shapes=[[1, 3, 25, 25], [1, 3, 50, 50], [1, 3, 67, 67]], default=[1, 3, 67, 67] ) model = ct.convert( traced_model, inputs=[ct.TensorType(shape=input_shape)], convert_to=convert_to, compute_units=compute_unit, ) spec = model.get_spec() assert list(spec.description.input[0].type.multiArrayType.shape) == [1, 3, 67, 67] assert list( spec.description.input[0].type.multiArrayType.enumeratedShapes.shapes[0].shape ) == [1, 3, 67, 67] assert len(spec.description.input[0].type.multiArrayType.enumeratedShapes.shapes) == 3 _assert_torch_coreml_output_shapes(model, spec, traced_model, example_input) @pytest.mark.skipif( ct.utils._macos_version() < (12, 0), reason="Image input with RangeDim works correctly on macOS12+", ) @pytest.mark.parametrize( "backend, compute_unit", itertools.product( backends, compute_units, ), ) def test_image_input_rangedim(self, backend, compute_unit): convert_to = backend[0] example_input = torch.rand(1, 3, 50, 50) * 255 traced_model = torch.jit.trace(TestConvModule().eval(), example_input) input_shape = ct.Shape( shape=(1, 3, ct.RangeDim(25, 100, default=35), ct.RangeDim(25, 100, default=45)) ) model = ct.convert( traced_model, inputs=[ct.ImageType(shape=input_shape)], convert_to=convert_to, compute_units=compute_unit, ) spec = model.get_spec() assert spec.description.input[0].type.imageType.width == 45 assert spec.description.input[0].type.imageType.height == 35 assert spec.description.input[0].type.imageType.imageSizeRange.widthRange.lowerBound == 25 assert spec.description.input[0].type.imageType.imageSizeRange.widthRange.upperBound == 100 _assert_torch_coreml_output_shapes( model, spec, traced_model, example_input, is_image_input=True ) @pytest.mark.skipif( ct.utils._macos_version() < (12, 0), reason="Image input with RangeDim works correctly on macOS12+", ) @pytest.mark.parametrize( "backend, compute_unit, explicitly_set", itertools.product( backends, compute_units, [True, False], ), ) def test_image_input_rangedim_infinite(self, backend, compute_unit, explicitly_set): convert_to = backend[0] example_input = torch.rand(1, 3, 50, 50) * 255 traced_model = torch.jit.trace(TestConvModule().eval(), example_input) second_dim = ct.RangeDim(upper_bound=-1) if explicitly_set else ct.RangeDim() input_shape = ct.Shape(shape=(1, 3, second_dim, ct.RangeDim(25, 100, default=45))) if convert_to == "mlprogram": with pytest.raises( 
ValueError, match="For mlprogram, inputs with infinite upper_bound is not allowed. Please set " 'upper_bound to a positive value in "RangeDim\(\)" for the "inputs" param in ' "ct.convert\(\).", ): ct.convert( traced_model, inputs=[ct.ImageType(shape=input_shape)], convert_to=convert_to, compute_units=compute_unit, ) else: model = ct.convert( traced_model, inputs=[ct.ImageType(shape=input_shape)], convert_to=convert_to, compute_units=compute_unit, ) spec = model.get_spec() assert spec.description.input[0].type.imageType.width == 45 assert spec.description.input[0].type.imageType.height == 1 assert ( spec.description.input[0].type.imageType.imageSizeRange.heightRange.lowerBound == 1 ) assert ( spec.description.input[0].type.imageType.imageSizeRange.heightRange.upperBound == -1 ) _assert_torch_coreml_output_shapes( model, spec, traced_model, example_input, is_image_input=True ) @pytest.mark.parametrize( "backend, compute_unit", itertools.product( backends, compute_units, ), ) def test_image_input_enumerated(self, backend, compute_unit): convert_to = backend[0] if convert_to == "mlprogram" and ct.utils._macos_version() < (12, 0): return example_input = torch.rand(1, 3, 50, 50) * 255 traced_model = torch.jit.trace(TestConvModule().eval(), example_input) input_shape = ct.EnumeratedShapes( shapes=[[1, 3, 25, 25], [1, 3, 50, 50], [1, 3, 67, 67]], default=[1, 3, 67, 67] ) model = ct.convert( traced_model, inputs=[ct.ImageType(shape=input_shape)], convert_to=convert_to, compute_units=compute_unit, ) spec = model.get_spec() assert spec.description.input[0].type.imageType.width == 67 assert spec.description.input[0].type.imageType.height == 67 assert len(spec.description.input[0].type.imageType.enumeratedSizes.sizes) == 3 assert spec.description.input[0].type.imageType.enumeratedSizes.sizes[0].width == 25 assert spec.description.input[0].type.imageType.enumeratedSizes.sizes[0].height == 25 _assert_torch_coreml_output_shapes( model, spec, traced_model, example_input, is_image_input=True ) @pytest.mark.skipif(not _HAS_TF_2 or not ct.utils._is_macos(), reason=MSG_TF2_NOT_FOUND) class TestFlexibleInputShapesTF: @classmethod def setup_class(cls): """Prepares tf model in different formats (keras model, h5 file, saved_model dir).""" input_1 = tf.keras.Input(shape=(None, None, 16), name="input_1") input_2 = tf.keras.Input(shape=(None, None, 4), name="input_2") x = tf.keras.layers.Dense(4, activation=tf.nn.relu)(input_1) + input_2 outputs = tf.keras.layers.Dense(5, activation=tf.nn.softmax)(x) cls.model = tf.keras.Model(inputs=[input_1, input_2], outputs=outputs) cls.temp_dir = tempfile.TemporaryDirectory() cls.h5_model_path = os.path.join(cls.temp_dir.name, "tf_keras_model.h5") cls.model.save(cls.h5_model_path) cls.saved_model_path = os.path.join(cls.temp_dir.name, "saved_model") cls.model.save(cls.saved_model_path, save_format="tf") @classmethod def teardown_class(cls): """CLean up temp dir that stores the TF models.""" cls.temp_dir.cleanup() @staticmethod def _find_unknown_dim_warning(raised_warnings: pytest.WarningsRecorder) -> bool: """Find if pytest catches any warning message about the unknown dim warning.""" for raised_warning in raised_warnings: if raised_warning.message.args[0].startswith( "Some dimensions in the input shape are unknown, hence they are set to flexible ranges" ): return True return False @pytest.mark.parametrize( "backend, compute_unit, model_format", itertools.product( backends, compute_units, ["keras_model", "h5", "saved_model"], ), ) def test_dynamic_shape_no_inputs(self, backend, 
compute_unit, model_format): """ The `inputs` param in `ct.convert` is not provided, so all inputs in the TF model with `None` dim will have a range shape where lower-bound/default/upper-bound are sanitized to finite numbers and warns users. """ convert_to = backend[0] model_param = self.model if model_format == "h5": model_param = self.h5_model_path elif model_format == "saved_model": model_param = self.saved_model_path if convert_to == "mlprogram": with pytest.warns( UserWarning, match="Some dimensions in the input shape are unknown, hence they are set to " "flexible ranges with lower bound and default value = 1, and upper bound = 2. " "To set different values for the default shape and upper bound, please use " "the ct.RangeDim.*", ): mlmodel = ct.convert( model_param, source="tensorflow", convert_to=convert_to, compute_units=compute_unit, ) else: mlmodel = ct.convert( model_param, source="tensorflow", convert_to=convert_to, compute_units=compute_unit, ) spec = mlmodel.get_spec() assert list(spec.description.input[0].type.multiArrayType.shape) == [1, 1, 1, 16] assert ( spec.description.input[0].type.multiArrayType.shapeRange.sizeRanges[2].lowerBound == 1 ) assert ( spec.description.input[0].type.multiArrayType.shapeRange.sizeRanges[2].upperBound == -1 if convert_to == "neuralnetwork" else 2 ) @pytest.mark.parametrize( "backend, compute_unit, specify_input", itertools.product( backends, compute_units, ["input_1", "input_2"], ), ) def test_dynamic_shape_partial_inputs(self, backend, compute_unit, specify_input): """ The `inputs` param in `ct.convert` is partially provided, where the TF model has two inputs while we only provide one in `inputs` param. So another input in the TF model with `None` dim will have a range shape where lower-bound/default/upper-bound are sanitized to finite numbers and warns users. """ convert_to = backend[0] last_dim = 16 if specify_input == "input_1" else 4 inputs = [ ct.TensorType( shape=ct.Shape( shape=( 1, 3, ct.RangeDim(2, 10, default=8), ct.RangeDim(4, 20, default=last_dim), ) ), name=specify_input, ) ] if convert_to == "mlprogram": with pytest.warns( UserWarning, match="Some dimensions in the input shape are unknown, hence they are set to " "flexible ranges with lower bound and default value = 1, and upper bound = 2. " "To set different values for the default shape and upper bound, please use " "the ct.RangeDim.*", ): mlmodel = ct.convert( self.model, source="tensorflow", inputs=inputs, convert_to=convert_to, compute_units=compute_unit, ) else: mlmodel = ct.convert( self.model, source="tensorflow", inputs=inputs, convert_to=convert_to, compute_units=compute_unit, ) spec = mlmodel.get_spec() # Notice the input in spec is not ordered, so need to use name to find input_1 and input_2. 
for input_spec in spec.description.input: if input_spec.name == "input_1": input_1_spec = input_spec elif input_spec.name == "input_2": input_2_spec = input_spec assert ( list(input_1_spec.type.multiArrayType.shape) == [1, 3, 8, 16] if specify_input == "input_1" else [1, 1, 1, 16] ) assert ( list(input_2_spec.type.multiArrayType.shape) == [1, 3, 8, 4] if specify_input == "input_2" else [1, 1, 1, 4] ) assert ( input_1_spec.type.multiArrayType.shapeRange.sizeRanges[2].lowerBound == 2 if specify_input == "input_1" else 1 ) assert ( input_2_spec.type.multiArrayType.shapeRange.sizeRanges[2].lowerBound == 2 if specify_input == "input_2" else 1 ) default_upper_bound = -1 if convert_to == "neuralnetwork" else 2 assert ( input_1_spec.type.multiArrayType.shapeRange.sizeRanges[2].upperBound == 10 if specify_input == "input_1" else default_upper_bound ) assert ( input_2_spec.type.multiArrayType.shapeRange.sizeRanges[2].upperBound == 10 if specify_input == "input_2" else default_upper_bound ) @pytest.mark.parametrize( "backend, compute_unit", itertools.product( backends, compute_units, ), ) def test_multiarray_input_rangedim(self, backend, compute_unit): input_shape_1 = ct.Shape( shape=(1, 3, ct.RangeDim(8, 20, default=8), ct.RangeDim(10, 100, default=16)) ) input_shape_2 = ct.Shape( shape=(1, 3, ct.RangeDim(4, 16, default=16), ct.RangeDim(1, 10, default=4)) ) with pytest.warns() as raised_warnings: model = ct.convert( self.model, source="tensorflow", inputs=[ ct.TensorType(shape=input_shape_1, name="input_1"), ct.TensorType(shape=input_shape_2, name="input_2"), ], convert_to=backend[0], compute_units=compute_unit, ) assert not self._find_unknown_dim_warning(raised_warnings) spec = model.get_spec() assert list(spec.description.input[0].type.multiArrayType.shape) == [1, 3, 8, 16] assert ( spec.description.input[0].type.multiArrayType.shapeRange.sizeRanges[2].lowerBound == 8 ) assert ( spec.description.input[0].type.multiArrayType.shapeRange.sizeRanges[2].upperBound == 20 ) assert list(spec.description.input[1].type.multiArrayType.shape) == [1, 3, 16, 4] assert ( spec.description.input[1].type.multiArrayType.shapeRange.sizeRanges[2].lowerBound == 4 ) assert ( spec.description.input[1].type.multiArrayType.shapeRange.sizeRanges[2].upperBound == 16 ) @pytest.mark.parametrize( "backend, compute_unit, explicitly_set", itertools.product( backends, compute_units, [True, False], ), ) def test_multiarray_input_rangedim_infinite(self, backend, compute_unit, explicitly_set): convert_to = backend[0] second_dim = ct.RangeDim(upper_bound=-1) if explicitly_set else ct.RangeDim() input_shape = ct.Shape(shape=(1, 3, second_dim, ct.RangeDim(10, 100, default=16))) if convert_to == "mlprogram": with pytest.raises( ValueError, match="For mlprogram, inputs with infinite upper_bound is not allowed. 
Please set " 'upper_bound to a positive value in "RangeDim\(\)" for the "inputs" param in ' "ct.convert\(\).", ): ct.convert( self.model, source="tensorflow", inputs=[ct.TensorType(shape=input_shape, name="input_1")], convert_to=convert_to, compute_units=compute_unit, ) else: model = ct.convert( self.model, source="tensorflow", inputs=[ct.TensorType(shape=input_shape, name="input_1")], convert_to=convert_to, compute_units=compute_unit, ) spec = model.get_spec() assert list(spec.description.input[0].type.multiArrayType.shape) == [1, 3, 1, 16] assert ( spec.description.input[0].type.multiArrayType.shapeRange.sizeRanges[2].lowerBound == 1 ) assert ( spec.description.input[0].type.multiArrayType.shapeRange.sizeRanges[2].upperBound == -1 ) @pytest.mark.parametrize( "backend, compute_unit", itertools.product( backends, compute_units, ), ) def test_multiarray_single_input_rangedim(self, backend, compute_unit): input_1 = tf.keras.Input(shape=(None, None, 16), name="input_1") x = tf.keras.layers.Dense(4, activation=tf.nn.relu)(input_1) outputs = tf.keras.layers.Dense(5, activation=tf.nn.softmax)(x) single_input_model = tf.keras.Model(inputs=input_1, outputs=outputs) # The `inputs` will work without specifying the name. model = ct.convert( single_input_model, source="tensorflow", inputs=[ ct.TensorType( shape=(1, 3, ct.RangeDim(8, 20, default=8), ct.RangeDim(10, 100, default=16)) ) ], convert_to=backend[0], compute_units=compute_unit, ) spec = model.get_spec() assert list(spec.description.input[0].type.multiArrayType.shape) == [1, 3, 8, 16] assert ( spec.description.input[0].type.multiArrayType.shapeRange.sizeRanges[2].lowerBound == 8 ) assert ( spec.description.input[0].type.multiArrayType.shapeRange.sizeRanges[2].upperBound == 20 ) @pytest.mark.skipif( ct.utils._macos_version() < (12, 0), reason="Image input with RangeDim works correctly on macOS12+", ) @pytest.mark.parametrize( "backend, compute_unit, explicitly_set", itertools.product( backends, compute_units, [True, False], ), ) def test_image_input_rangedim_infinite(self, backend, compute_unit, explicitly_set): convert_to = backend[0] second_dim = ct.RangeDim(upper_bound=-1) if explicitly_set else ct.RangeDim() input_shape = ct.Shape(shape=(1, 2, second_dim, ct.RangeDim(1, 10, default=3))) if convert_to == "mlprogram": with pytest.raises( ValueError, match="For mlprogram, inputs with infinite upper_bound is not allowed. Please set " 'upper_bound to a positive value in "RangeDim\(\)" for the "inputs" param in ' "ct.convert\(\).", ): ct.convert( self.model, source="tensorflow", inputs=[ct.ImageType(shape=input_shape, name="input_1")], convert_to=convert_to, compute_units=compute_unit, ) else: model = ct.convert( self.model, source="tensorflow", inputs=[ct.ImageType(shape=input_shape, name="input_1")], convert_to=convert_to, compute_units=compute_unit, ) spec = model.get_spec() assert spec.description.input[0].type.imageType.width == 1 assert spec.description.input[0].type.imageType.height == 2 assert ( spec.description.input[0].type.imageType.imageSizeRange.widthRange.lowerBound == 1 ) assert ( spec.description.input[0].type.imageType.imageSizeRange.widthRange.upperBound == -1 ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/testing_reqs.py0000644000000000000000000001460514672066616023103 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. 
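# A condensed sketch of the flexible-shape conversion pattern exercised by the tests above
# (assuming ``traced_model`` is a traced torch module as in TestConvModule). Note that for
# the "mlprogram" backend every RangeDim must carry a finite upper_bound, otherwise
# ct.convert raises a ValueError:
#
#     import coremltools as ct
#     flexible_shape = ct.Shape(
#         shape=(1, 3, ct.RangeDim(25, 100, default=45), ct.RangeDim(25, 100, default=45))
#     )
#     mlmodel = ct.convert(
#         traced_model,
#         inputs=[ct.TensorType(shape=flexible_shape)],
#         convert_to="mlprogram",
#     )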
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os from typing import List import numpy as np import pytest from attrs import define, field, validators import coremltools as ct from coremltools._deps import _HAS_TF_1, _HAS_TF_2, _HAS_TORCH from coremltools.converters.mil.testing_utils import macos_compatible_with_deployment_target # Setting up backend / precision / op version _SUPPORTED_BACKENDS = ("neuralnetwork", "mlprogram") _SUPPORTED_PRECISIONS = ("fp32", "fp16") _SUPPORTED_OPSET_VERSIONS_NN = (ct.target.iOS14,) _SUPPORTED_OPSET_VERSIONS_MLPROGRAM = ( ct.target.iOS15, ct.target.iOS16, ct.target.iOS17, ct.target.iOS18, ) @define(frozen=True) class BackendConfig: """ Parameters ---------- backend: str "neuralnetwork" or "mlprogram" precision: str "fp16" or "fp32" opset_version: ct.target minimum_deployment_target for the ct.convert function """ backend: str = field(validator=validators.instance_of(str)) precision: str = field(validator=validators.instance_of(str)) opset_version: ct.target = field(validator=validators.instance_of(ct.target)) @backend.validator def check_backend(self, attr, backend): if backend not in _SUPPORTED_BACKENDS: raise ValueError( f"backend {backend} not supported. Please pass one of the following values: {_SUPPORTED_BACKENDS}" ) @precision.validator def check_precision(self, attr, precision): if precision not in _SUPPORTED_PRECISIONS: raise ValueError( f"precision {precision} not supported. Please pass one of the following values: {_SUPPORTED_PRECISIONS}" ) if precision == "fp16" and self.backend == "neuralnetwork": raise ValueError("fp16 precision is only supported in mlprogram backend.") @opset_version.validator def check_opset_version(self, attr, opset_version): if self.backend == "neuralnetwork" and opset_version not in _SUPPORTED_OPSET_VERSIONS_NN: raise ValueError( f"opset_version {opset_version} not supported in neuralnetwork backend. Supported opset versions are {_SUPPORTED_OPSET_VERSIONS_NN}" ) if self.backend == "mlprogram" and opset_version not in _SUPPORTED_OPSET_VERSIONS_MLPROGRAM: raise ValueError( f"opset_version {opset_version} not supported in mlprogram backend. 
Supported opset versions are {_SUPPORTED_OPSET_VERSIONS_MLPROGRAM}" ) if 'PYMIL_TEST_TARGETS' in os.environ: targets = os.environ['PYMIL_TEST_TARGETS'].split(',') for i in range(len(targets)): targets[i] = targets[i].strip() else: targets = ["mlprogram", "neuralnetwork"] # new backends using the new infrastructure backends_internal = [] if "mlprogram" in targets: for v in _SUPPORTED_OPSET_VERSIONS_MLPROGRAM: precisions = ["fp16"] if os.getenv('INCLUDE_MIL_FP32_UNIT_TESTS') == '1': precisions.append("fp32") for p in precisions: backends_internal.append( BackendConfig(backend="mlprogram", precision=p, opset_version=v) ) if "neuralnetwork" in targets: for v in _SUPPORTED_OPSET_VERSIONS_NN: backends_internal.append( BackendConfig( backend="neuralnetwork", precision="fp32", opset_version=v, ) ) # old backends approach backends = [] if "mlprogram" in targets: backends.append(("mlprogram", "fp16")) if os.getenv("INCLUDE_MIL_FP32_UNIT_TESTS") == "1": backends.append(("mlprogram", "fp32")) if "neuralnetwork" in targets: backends.append(("neuralnetwork", "fp32")) if not backends or not backends_internal: raise ValueError("PYMIL_TEST_TARGETS can be set to one or more of: neuralnetwork, mlprogram") def clean_up_backends( backends: List[BackendConfig], minimum_opset_version: ct.target, force_include_iOS15_test: bool = False, ) -> List[BackendConfig]: """ Given a list of BackendConfig objects, this utility function filters out the invalid elements. For instance, given a list of configs with opset_versions ranging from iOS14 to iOS17, with minimum_opset_version set to iOS16 and the environment variable `RUN_BACKWARD_COMPATIBILITY=1`, the iOS14/iOS15 configs are removed, and the iOS16/iOS17 configs are preserved. To be more specific, a config is removed if one of the following conditions is matched: 1. opset_version is not compatible with the current macOS version. 2. opset_version < minimum_opset_version. 3. For a non-backward-compatibility run, opset_version > minimum_opset_version. Note the corner case that when `force_include_iOS15_test=True`, the iOS15 configs are forced to be preserved. 
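A typical call, sketched with assumed values (mirroring how the module-level `backends_internal` list above is built), might look like `clean_up_backends(backends_internal, minimum_opset_version=ct.target.iOS16)`: on a sufficiently new macOS this keeps only the iOS16 configs, or the iOS16/iOS17/iOS18 configs when `RUN_BACKWARD_COMPATIBILITY=1` is set. 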
""" test_all_opset_versions = os.getenv("RUN_BACKWARD_COMPATIBILITY") == "1" res = [] for config in backends: # First check if the macOS are able to run the test if not macos_compatible_with_deployment_target(config.opset_version): continue if force_include_iOS15_test and config.opset_version == ct.target.iOS15: res.append(config) continue if config.opset_version < minimum_opset_version: continue if not test_all_opset_versions and config.opset_version > minimum_opset_version: continue res.append(config) if len(res) == 0: pytest.skip( f"Tests are not runnable under {minimum_opset_version.name}.", allow_module_level=True ) return res # Setting up compute unit compute_units = [] if "COMPUTE_UNITS" in os.environ: for cur_str_val in os.environ["COMPUTE_UNITS"].split(","): cur_str_val = cur_str_val.strip().upper() if cur_str_val not in ct.ComputeUnit.__members__: raise ValueError("Compute unit \"{}\" not supported in coremltools.".format(cur_str_val)) compute_units.append(ct.ComputeUnit[cur_str_val]) else: compute_units = [ct.ComputeUnit.CPU_ONLY] np.random.seed(1984) if _HAS_TF_1: tf = pytest.importorskip("tensorflow") tf.compat.v1.set_random_seed(1234) if _HAS_TF_2: tf = pytest.importorskip("tensorflow") tf.random.set_seed(1234) if _HAS_TORCH: torch = pytest.importorskip("torch") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/mil/testing_utils.py0000644000000000000000000007564614672066616023305 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy import itertools import os from functools import partial from pathlib import Path from typing import Dict, List, Optional, Tuple, Union import numpy as np import pytest from PIL import Image import coremltools as ct import coremltools.models.utils as coremltoolsutils from coremltools._deps import _IS_MACOS from coremltools.converters.mil import mil from coremltools.converters.mil.mil import Block, Function, Program from coremltools.converters.mil.mil.passes.defs.preprocess import NameSanitizer as _NameSanitizer from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass from coremltools.converters.mil.mil.passes.pass_registry import PASS_REGISTRY from coremltools.converters.mil.mil.scope import ScopeSource from coremltools.proto import FeatureTypes_pb2 as ft np.random.seed(10) DTYPE_TO_FEATURE_TYPE_MAP: Dict[str, ft.ArrayFeatureType] = { "int32": ft.ArrayFeatureType.INT32, "fp32": ft.ArrayFeatureType.FLOAT32, "fp16": ft.ArrayFeatureType.FLOAT16, } # The minimum macOS version for an IOS target. For example, iOS16 target requires macOS13+. 
IOS_TO_MINIMUM_MACOS_VERSION: Dict[ct.target, int] = { ct.target.iOS14: 11, ct.target.iOS15: 12, ct.target.iOS16: 13, ct.target.iOS17: 14, ct.target.iOS18: 15, } _COREMLTOOLS_DEBUG_SAVE_MLMODEL_DIRECTORY = "/tmp/coremltools_debug_save_mlmodel" debug_save_mlmodels = set() debug_save_mlmodel_config_file_name = os.environ.get("DEBUG_SAVE_MLMODEL", "0") if debug_save_mlmodel_config_file_name != "0": if not os.path.isfile(debug_save_mlmodel_config_file_name): raise ValueError("DEBUG_SAVE_MLMODEL must be the name of a config file with tests to save") with open(debug_save_mlmodel_config_file_name, "r") as f: lines = f.readlines() for line in lines: if line[0] == "#" or line == "\n": continue debug_save_mlmodels.add(line[:-1]) hardcoded_einsum_equations: List[str] = [ # hardcoded cases "abcd,adce->abce", "abc,cbd->abd", "bnqd,bnkd->bnqk", "abc,cd->abd", "abc,cde->abde", "btnh,bfnh->bnft", "bnft,btnh->bfnh", "abcd,cde->abe", "a b c d , a d c e -> a b c e", ] einsum_equations: List[str] = hardcoded_einsum_equations + [ # with-diagonal generic cases "jiii,ijjk->jk", "iji,ji->j", "jii,ijk->jk", "ijij,iij->ij", # no-diagonal generic cases "i,j->ij", # outer product "a,a->a", # batched outer product "ija,la->ijal", # batched outer product "i,i->", # inner product "ia,ia->a", # batched inner product "ai,ia->a", # batched inner product "abi,abi->ab", # batched inner product "iab,iab->ab", # batched inner product "abi,bai->ba", # batched inner product "ij,j->i", # matrix-vector multiplication "i,ij->j", # vector-matrix multiplication "ai,ija->aj", # batched vector-matrix multiplication "aibj,bi->jba", # batched matrix-vector multiplication "ij,jk->ik", # matrix multiplication "aij,ajk->iak", # batched matrix multiplication "abij,abjk->abik", # batched matrix multiplication "aijb,bajk->abik", # batched matrix multiplication "ij,ij->", # double-inner product "ij,ji->", # double-inner product "aij,aij->a", # batched double-inner product "ija,ija->a", # batched double-inner product "ija,jia->a", # batched double-inner product "aijb,ajbi->ab", # batched double-inner product "aibj,cdij->cadb", # batched double-inner product "ijk,lmj->iklm", # 3rd-order tensor contraction "ijak,akl->aijl", # batched 3rd-order tensor and matrix contraction # Generic with sum "ij,j->ij", "ij,kjl->j", "iijj,j->j", ] def macos_compatible_with_deployment_target(minimum_deployment_target): if coremltoolsutils._is_macos(): macos_major_version = coremltoolsutils._macos_version()[0] if macos_major_version < IOS_TO_MINIMUM_MACOS_VERSION[minimum_deployment_target]: return False return True def _create_current_pytest_serialization_path() -> str: serialization_path = _COREMLTOOLS_DEBUG_SAVE_MLMODEL_DIRECTORY + "/" PYTEST_CURRENT_TEST = os.environ.get("PYTEST_CURRENT_TEST").split("(call)")[0].strip() test_name_fragments = PYTEST_CURRENT_TEST.split("::") for test_name_fragment in test_name_fragments[:-1]: serialization_path += f"{test_name_fragment.strip()}/" test_name = test_name_fragments[-1] # For a parameterized test, further decompose parameters into directories if "[" in test_name and test_name[-1] == "]": # Split test name with [] bra_index = test_name.index("[") test_function_name = test_name[:bra_index] parameters = test_name[bra_index + 1 : -1].split("-") # Append test function name and parameter to mlpackage path serialization_path += f"{test_function_name}/" for parameter in parameters: serialization_path += f"{parameter}/" else: serialization_path += f"{test_name}/" return serialization_path def 
_serialize_current_pytest_mlmodel(mlmodel) -> None: """ Usually pytest test name is of format file::class::test_function[param0-param1] (call)... Assume each test produces only one Core ML model, then file::class::test_function[param0-param1] is enough to determine unique name {_COREMLTOOLS_DEBUG_SAVE_MLMODEL_DIRECTORY}/file/class/test_function/param0/param1/model.mlpackage """ mlpackage_path = _create_current_pytest_serialization_path() + "model.mlpackage" Path(mlpackage_path).mkdir(parents=True, exist_ok=True) mlmodel.save(mlpackage_path) def assert_op_count_match(program, expect, op=None, verbose=False): """ Assert number of ops match expected number. If op is not specified, Count total number of ops and match with expect. """ if verbose: print(program) count = 0 for _, func in program.functions.items(): for o in func.operations: if not op: count += 1 elif o.op_type.lower() == op.lower(): count += 1 np.testing.assert_equal(count, expect) def assert_model_is_valid( program, inputs, backend=("neuralnetwork", "fp32"), verbose=True, expected_output_shapes=None, minimum_deployment_target: ct.target = None, ): """ Assert Core ML model is valid. Inputs: - input: str -> shape tuple. All program input names need to appear in str. shape tuple can only contain positive integers. """ if minimum_deployment_target is not None: validate_minimum_deployment_target(minimum_deployment_target, backend) # Avoid circular import from coremltools.converters.mil.testing_reqs import ct input_dict = dict() for name, shape in inputs.items(): input_dict[name] = np.random.rand(*shape) mlmodel = ct_convert( program, source="milinternal", convert_to=backend, compute_units=ct.ComputeUnit.CPU_ONLY, minimum_deployment_target=minimum_deployment_target, ) assert mlmodel is not None if verbose: from coremltools.models.neural_network.printer import print_network_spec print_network_spec(mlmodel.get_spec(), style="coding") if _IS_MACOS and (not mlmodel.is_package or coremltoolsutils._macos_version() >= (12, 0)): prediction = mlmodel.predict(input_dict) assert prediction is not None if expected_output_shapes is not None: for out_name, out_shape in expected_output_shapes.items(): assert out_name in prediction assert out_shape == prediction[out_name].shape, \ "{} != {}".format(out_shape, prediction[out_name].shape) def assert_same_input_names(prog1, prog2, func_name="main"): # check the input keys prog1_input_keys = list(prog1[func_name].inputs.keys()) prog2_input_keys = list(prog2[func_name].inputs.keys()) assert prog1_input_keys == prog2_input_keys # check the input var name prog1_input_names = [x.name for x in list(prog1[func_name].inputs.values())] prog2_input_names = [x.name for x in list(prog2[func_name].inputs.values())] assert prog1_input_names == prog2_input_names def assert_numerical_value(mil_var, expected_value): if mil_var is None: assert expected_value is None else: np.testing.assert_allclose(mil_var.val, expected_value) def assert_same_input_types(prog1, prog2, func_name="main"): prog1_input_types = [x.dtype for x in list(prog1[func_name].inputs.values())] prog2_input_types = [x.dtype for x in list(prog2[func_name].inputs.values())] assert prog1_input_types == prog2_input_types def assert_same_output_names(prog1, prog2, func_name="main"): prog1_outputs = [o.name for o in prog1[func_name].outputs] prog2_outputs = [o.name for o in prog2[func_name].outputs] assert prog1_outputs == prog2_outputs def assert_same_output_types(prog1: Program, prog2: Program, func_name: str = "main"): """ Check ``prog1`` and ``prog2`` have 
the same output dtypes. """ prog1_output_types = [o.dtype for o in prog1[func_name].outputs] prog2_output_types = [o.dtype for o in prog2[func_name].outputs] assert prog1_output_types == prog2_output_types def assert_same_output_shapes(prog1, prog2, func_name="main"): prog1_output_shapes = [o.shape for o in prog1[func_name].outputs] prog2_output_shapes = [o.shape for o in prog2[func_name].outputs] assert prog1_output_shapes == prog2_output_shapes def get_op_names_in_program(prog, func_name="main", skip_const_ops=True): """ Return the operation names in prog[func_name], in the same order as they are stored (topological) """ op_names_in_program = [] for op in prog[func_name].operations: if skip_const_ops: if op.op_type == "const": continue op_names_in_program.append(op.name) return op_names_in_program def get_op_types_in_block(block: Block, skip_const_ops: bool = True, recurse: bool = False): """ Return the operation types in block, in the same order as they are stored (topological) """ op_types_in_block = [] for op in block.operations: if skip_const_ops: if op.op_type == "const": continue op_types_in_block.append(op.op_type) if recurse: for child_block in op.blocks: child_ops = get_op_types_in_block(child_block, skip_const_ops, recurse) op_types_in_block += child_ops return op_types_in_block def get_op_types_in_program(prog: Program, func_name: str = "main", skip_const_ops: bool = True, recurse: bool = False): """ Return the operation types in prog[func_name], in the same order as they are stored (topological) If ``skip_const_ops = True``, const ops are not returned. If ``recurse = True``, the ops of all nested blocks are returned. """ return get_op_types_in_block(prog[func_name], skip_const_ops, recurse) def random_gen( shape, rand_min=0.0, rand_max=1.0, eps_from_int=0.0, allow_duplicate=True, dtype=np.float32, ): """ This helper function generates a random array of shape `shape`. The range of the generated numbers is [rand_min, rand_max). Each generated value is more than `eps_from_int` away from any integer. If allow_duplicate is set to False, the generated values are guaranteed to be all different. Default data type is np.float32. """ elem = np.prod(shape).astype(np.int32) # Since this function is also extensively used for the fp16 precision models, # we make sure that the numerical values can be represented exactly in fp16. 
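# Illustrative consequence: a float32 draw such as 0.1 has no exact float16 representation, so drawing
# in float16 first and only casting back to the requested dtype at the end keeps every reference value
# exactly representable in both precisions, avoiding spurious mismatches against fp16 model outputs.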
gen_dtype = np.float16 if dtype == np.float32 else dtype ret = [] for _ in range(elem): while True: r = gen_dtype((rand_max - rand_min) * np.random.random() + rand_min) if not allow_duplicate and r in ret: continue if np.issubdtype(gen_dtype, np.integer) or np.fabs(np.round(r) - r) > eps_from_int: ret.append(r) break ret = np.array(ret).reshape(shape) return ret.astype(dtype) def ssa_fn(func): """ Deprecated: use @mb.program() """ def wrapper(*args, **kwargs): prog = mil.Program() with Function({}) as ssa_func: func(*args, **kwargs) return wrapper def to_tuple(v): if not isinstance(v, (list, tuple)): return tuple([v]) return tuple(v) def run_core_ml_predict(mlmodel, input_key_values, state=None): for k, v in input_key_values.items(): if isinstance(v, Image.Image): continue elif not np.isscalar(v) and not v.shape == (): input_key_values[k] = v.astype(np.float32) else: input_key_values[k] = np.array([v], dtype=np.float32) return mlmodel.predict(input_key_values, state=state) def _get_coreml_out_from_dict(out_dict, out_name): if out_name in out_dict: return out_dict[out_name] sanitized_out_name = _NameSanitizer._replace_invalid_char_with_underscore(out_name) if sanitized_out_name in out_dict: return out_dict[sanitized_out_name] else: raise KeyError(f"{out_name} output not found in Core ML outputs") def _get_proto_output_shape(desc, out_name): sanitized_out_name = _NameSanitizer._replace_invalid_char_with_underscore(out_name) for coreml_o in desc.output: if coreml_o.name == sanitized_out_name: return coreml_o.type.multiArrayType.shape raise KeyError(f"{out_name} output not found in Core ML outputs") def compare_backend( mlmodel, input_key_values, expected_outputs, dtype="fp32", atol=1e-04, rtol=1e-05, also_compare_shapes=True, state=None, ): """ Inputs: - mlmodel: MLModel. - input_key_values: str -> np.array. Keys must match those in input_placeholders. - expected_outputs: dict[str, np.array]. Required iff frontend_only is False """ if _IS_MACOS and (not mlmodel.is_package or coremltoolsutils._macos_version() >= (12, 0)): if dtype not in ["fp32", "fp16"]: raise ValueError("Unsupported dtype config") pred = run_core_ml_predict(mlmodel, input_key_values, state) if also_compare_shapes: compare_shapes( mlmodel, input_key_values, expected_outputs, pred=pred, ) if mlmodel.compute_unit != ct.ComputeUnit.CPU_ONLY or (dtype == "fp16"): atol = max(atol * 100.0, 5e-1) rtol = max(rtol * 100.0, 5e-2) for o, expected in expected_outputs.items(): coreml_out = _get_coreml_out_from_dict(pred, o) if isinstance(coreml_out, np.ndarray): np.testing.assert_allclose(coreml_out, expected, atol=atol, rtol=rtol) elif isinstance(coreml_out, dict): for k, v in coreml_out.items(): assert k in expected assert expected[k] == v else: assert coreml_out == expected return pred return None def compare_shapes(mlmodel, input_key_values, expected_outputs, pred=None): """ Inputs: - mlmodel: MLModel. - input_key_values: str -> np.array or PIL.Image. Keys must match those in input_placeholders. - expected_outputs: dict[str, np.array]. - pred: Prediction to use, if it has already been computed. 
""" if _IS_MACOS: if not pred: pred = run_core_ml_predict(mlmodel, input_key_values) for o, expected in expected_outputs.items(): coreml_out = _get_coreml_out_from_dict(pred, o) # output is dictionary (for classifier) if isinstance(coreml_out, dict) and isinstance(expected, dict): assert len(coreml_out) == len(expected) continue # output is numpy objects np_types = (np.generic, np.ndarray) if isinstance(coreml_out, np_types) and isinstance(expected, np_types): msg = "Output: {}. expected shape {} != actual shape {}".format( o, expected.shape, coreml_out.shape ) # Core ML does not support scalar as output # remove this special case when support is added if expected.shape == () and coreml_out.shape == (1,): continue assert coreml_out.shape == expected.shape, msg # Validate the shape consistency across runtime returned values and # the output information in the mlprogram proto. spec = mlmodel.get_spec() if spec.WhichOneof("Type") == "mlProgram": if mlmodel._is_multifunction(): desc = mlmodel._get_function_description(mlmodel.function_name) else: desc = spec.description # The proto output and the runtime outputs are different for classifier if desc.predictedFeatureName != "": continue proto_shape = _get_proto_output_shape(desc, o) if proto_shape != []: assert proto_shape == list( coreml_out.shape ), f"the output shape, for output named {o}, returned by the model is {coreml_out.shape} which does match with the shape present in the proto spec, which is {proto_shape}" continue # output is other types (for classifier) assert type(coreml_out) == type(expected) def ct_convert( program, source="auto", inputs=None, outputs=None, classifier_config=None, minimum_deployment_target=None, convert_to=None, compute_precision=None, skip_model_load=False, converter=ct.convert, **kwargs, ): """ Overloaded ct.convert function with the only difference being in the argument `convert_to` which in this overloaded call accepts a tuple of (target, dtype). Ex: ("neuralnetwork", "fp32"), ("mlprogram", "fp16") """ if isinstance(converter, partial): raise ValueError("Partial function is not supported for function-parameter 'converter' since its keywords arguments could get overridden.") target, dtype = convert_to if dtype not in ["fp32", "fp16"]: raise ValueError("Unsupported dtype config") compute_precision = ct.precision.FLOAT16 if dtype == "fp16" else ct.precision.FLOAT32 if target == "neuralnetwork": compute_precision = None PYTEST_CURRENT_TEST = os.environ.get("PYTEST_CURRENT_TEST").split("(call)")[0].strip() is_current_test_to_be_debugged = PYTEST_CURRENT_TEST in debug_save_mlmodels if is_current_test_to_be_debugged: # If current test is to be debugged, then it is probably buggy in Core ML framework, # so we skip its load to dodge potential bug which might kill python process skip_model_load = True mlmodel = converter( program, source=source, inputs=inputs, outputs=outputs, classifier_config=classifier_config, minimum_deployment_target=minimum_deployment_target, convert_to=target, compute_precision=compute_precision, skip_model_load=skip_model_load, **kwargs, ) if is_current_test_to_be_debugged: _serialize_current_pytest_mlmodel(mlmodel) pytest.xfail("This test is to be debugged") return mlmodel def get_core_ml_prediction( build, input_placeholders, input_values, backend, compute_unit=ct.ComputeUnit.CPU_ONLY ): """ Return predictions of the given model. 
""" minimum_deployment_target = backend.opset_version program = mil.Program() with Function(input_placeholders, opset_version=minimum_deployment_target) as ssa_func: output_vars = build(**ssa_func.inputs) if isinstance(output_vars, tuple): output_vars = list(output_vars) elif not isinstance(output_vars, list): output_vars = [output_vars] ssa_func.set_outputs(output_vars) program.add_function("main", ssa_func) mlmodel = ct_convert( program, source="milinternal", convert_to=(backend.backend, backend.precision), compute_units=compute_unit, minimum_deployment_target=minimum_deployment_target, ) return mlmodel.predict(input_values) def _decorate_prog_with_scope_if_not_present(prog: Program): """ For a program without any scope info, we manually add scopes to every op, in ordere to test that all graph passes can preserve the source scope info. """ def _is_scopes_present_in_program(prog: Program) -> bool: """ Return True is any op already has the scopes info. """ def _is_scopes_present_in_block(block: Block) -> bool: for op in block.operations: for b in op.blocks: if _is_scopes_present_in_block(b): return True if len(op.scopes) > 0: return True for func in prog.functions.values(): if _is_scopes_present_in_block(func): return True def _decorate_prog_with_default_torch_scope(prog: Program): """ Decorate every op in the program with a default TORCHSCRIPT_MODULE_TYPE scope info. """ def _decorate_block_with_default_torch_scope(block: Block): for op in block.operations: for b in op.blocks: _decorate_block_with_default_torch_scope(b) assert ScopeSource.TORCHSCRIPT_MODULE_TYPE not in op.scopes op.scopes[ScopeSource.TORCHSCRIPT_MODULE_TYPE] = ["dummy"] for func in prog.functions.values(): _decorate_block_with_default_torch_scope(func) prog._add_essential_scope_source(ScopeSource.TORCHSCRIPT_MODULE_TYPE) if not _is_scopes_present_in_program(prog): _decorate_prog_with_default_torch_scope(prog) def apply_pass_and_basic_check( prog: Program, pass_name: Union[str, AbstractGraphPass], skip_output_name_check: Optional[bool] = False, skip_output_type_check: Optional[bool] = False, skip_output_shape_check: Optional[bool] = False, skip_input_name_check: Optional[bool] = False, skip_input_type_check: Optional[bool] = False, skip_function_name_check: Optional[bool] = False, func_name: Optional[str] = "main", skip_essential_scope_check: Optional[bool] = False, ) -> Tuple[Program, Block, Block]: """ Apply pass to the program """ prev_prog = copy.deepcopy(prog) graph_pass = pass_name if isinstance(pass_name, AbstractGraphPass) else PASS_REGISTRY[pass_name] _decorate_prog_with_scope_if_not_present(prog) graph_pass(prog) prog.validate(check_essential_scope=not skip_essential_scope_check) if not skip_function_name_check: if prev_prog.functions.keys() != prog.functions.keys(): raise ValueError("function names changed during {pass_name}.") for name in prev_prog.functions: if not skip_output_name_check: assert_same_output_names(prev_prog, prog, name) if not skip_output_type_check: assert_same_output_types(prev_prog, prog, name) if not skip_output_shape_check: assert_same_output_shapes(prev_prog, prog, name) if not skip_input_name_check: assert_same_input_names(prev_prog, prog, name) if not skip_input_type_check: assert_same_input_types(prev_prog, prog, name) return prev_prog, prev_prog.functions[func_name], prog.functions[func_name] def assert_prog_input_type(prog, expected_dtype_str, expected_name=None, index=0): block = prog.functions["main"] if expected_name is None: input_var = list(block.inputs.values())[index] assert 
input_var.is_tensor_or_scalar_of(dtype=expected_dtype_str) else: for input_var in block.inputs.values(): if input_var.name == expected_name: assert input_var.is_tensor_or_scalar_of(dtype=expected_dtype_str) def assert_spec_input_type(spec, expected_feature_type, expected_name=None, index=0): if expected_name is None: assert spec.description.input[index].type.multiArrayType.dataType == expected_feature_type else: for input in spec.description.input: if input.name == expected_name: assert input.type.multiArrayType.dataType == expected_feature_type def assert_input_dtype(mlmodel, expected_type_str, expected_name=None, index=0): assert_prog_input_type(mlmodel._mil_program, expected_type_str, expected_name=expected_name, index=index) assert_spec_input_type(mlmodel._spec, DTYPE_TO_FEATURE_TYPE_MAP[expected_type_str], expected_name=expected_name, index=index) def assert_spec_output_type(spec, expected_feature_type, expected_name=None, index=0): assert spec.description.output[index].type.multiArrayType.dataType == expected_feature_type if expected_name is not None: assert spec.description.output[index].name == expected_name def assert_prog_output_type(prog, expected_dtype_str, expected_name=None, index=0): block = prog.functions["main"] output_var = block.outputs[index] assert output_var.is_tensor_or_scalar_of(dtype=expected_dtype_str) if expected_name is not None: assert output_var.name == expected_name def assert_output_dtype(mlmodel, expected_type_str, expected_name=None, index=0): assert_prog_output_type(mlmodel._mil_program, expected_type_str, expected_name=expected_name, index=index) assert_spec_output_type(mlmodel._spec, DTYPE_TO_FEATURE_TYPE_MAP[expected_type_str], expected_name=expected_name, index=index) def random_gen_input_feature_type(input_desc): if input_desc.type.WhichOneof("Type") == "multiArrayType": shape = [s for s in input_desc.type.multiArrayType.shape] if input_desc.type.multiArrayType.dataType == ft.ArrayFeatureType.FLOAT32: dtype = np.float32 elif input_desc.type.multiArrayType.dataType == ft.ArrayFeatureType.INT32: dtype = np.int32 elif input_desc.type.multiArrayType.dataType == ft.ArrayFeatureType.FLOAT16: dtype = np.float16 elif input_desc.type.multiArrayType.dataType == ft.ArrayFeatureType.FLOAT64: dtype = np.float64 else: raise ValueError("unsupported type") return np.random.rand(*shape).astype(dtype) elif input_desc.type.WhichOneof("Type") == "imageType": if input_desc.type.imageType.colorSpace in (ft.ImageFeatureType.BGR, ft.ImageFeatureType.RGB): shape = [3, input_desc.type.imageType.height, input_desc.type.imageType.width] x = np.random.randint(low=0, high=256, size=shape) return Image.fromarray(np.transpose(x, [1, 2, 0]).astype(np.uint8)) elif input_desc.type.imageType.colorSpace == ft.ImageFeatureType.GRAYSCALE: shape = [input_desc.type.imageType.height, input_desc.type.imageType.width] x = np.random.randint(low=0, high=256, size=shape) return Image.fromarray(x.astype(np.uint8), 'L') elif input_desc.type.imageType.colorSpace == ft.ImageFeatureType.GRAYSCALE_FLOAT16: shape = (input_desc.type.imageType.height, input_desc.type.imageType.width) x = np.random.rand(*shape) return Image.fromarray(x.astype(np.float32), 'F') else: raise ValueError("unrecognized image type") else: raise ValueError('unsupported type') def gen_input_shapes_einsum(equation: str, dynamic: bool, backend: Tuple[str, str]): equation = equation.replace(" ", "") left = equation.split("->")[0] var_descs = left.split(",") converter_shapes = {} shapes = {} cur_default_shape = 2 for symbol in 
itertools.chain.from_iterable(var_descs): if symbol not in shapes: shapes[symbol] = cur_default_shape if dynamic: converter_shapes[symbol] = ct.RangeDim( default=cur_default_shape, upper_bound=cur_default_shape if backend[0] == "mlprogram" else -1, ) else: converter_shapes[symbol] = cur_default_shape cur_default_shape += 1 var_shapes = [[shapes[symbol] for symbol in var_desc] for var_desc in var_descs] converted_shapes = [ct.TensorType(shape=[converter_shapes[symbol] for symbol in var_desc], dtype=np.float32) for var_desc in var_descs] return var_shapes, converted_shapes def verify_prediction(mlmodel, multiarray_type=None): spec = mlmodel._spec input_dict = {} for input_desc in spec.description.input: input_dict[input_desc.name] = random_gen_input_feature_type(input_desc) if multiarray_type is not None: input_dict[input_desc.name] = input_dict[input].astype(multiarray_type) state = mlmodel.make_state() if mlmodel._is_stateful() else None res = mlmodel.predict(input_dict, state=state) assert isinstance(res, dict) assert len(res) == len(spec.description.output) def assert_spec_input_image_type(spec, expected_feature_type): assert spec.description.input[0].type.imageType.colorSpace == expected_feature_type def assert_spec_output_image_type(spec, expected_feature_type): assert spec.description.output[0].type.imageType.colorSpace == expected_feature_type def assert_cast_ops_count(mlmodel, expected_count): block = mlmodel._mil_program.functions["main"] assert len(block.find_ops(op_type="cast")) == expected_count def assert_ops_in_mil_program(mlmodel, expected_op_list): assert expected_op_list == get_op_types_in_program(mlmodel._mil_program) def validate_minimum_deployment_target( minimum_deployment_target: ct.target, backend: Tuple[str, str] ): """ Validates the minimum deployment target based on backend and macOS version. Only used in tests. """ if minimum_deployment_target >= ct.target.iOS15 and backend[0] != "mlprogram": pytest.skip("IOS15+ target only compatible with mlprogram.") if not macos_compatible_with_deployment_target(minimum_deployment_target): pytest.skip( f"IOS{minimum_deployment_target} target is not runnable on this macOS {coremltoolsutils._macos_version()}" ) def compute_snr_and_psnr(x, y): assert len(x) == len(y) eps = 1e-5 eps2 = 1e-10 noise = x - y noise_var = np.sum(noise**2) / len(noise) signal_energy = np.sum(y**2) / len(y) max_signal_energy = np.amax(y**2) snr = 10 * np.log10((signal_energy + eps) / (noise_var + eps2)) psnr = 10 * np.log10((max_signal_energy + eps) / (noise_var + eps2)) return snr, psnr ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2575471 coremltools-8.0/coremltools/converters/sklearn/0000755000000000000000000000000014672075535020672 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_LinearSVC.py0000644000000000000000000000273414672066616023177 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ..._deps import _HAS_SKLEARN from ...models import MLModel as _MLModel if _HAS_SKLEARN: from sklearn.svm import LinearSVC as _LinearSVC sklearn_class = _LinearSVC from . import _sklearn_util from . 
import _logistic_regression model_type = "classifier" def convert(model, feature_names, target): """Convert a LinearSVC model to the protobuf spec. Parameters ---------- model: LinearSVC A trained LinearSVC model. feature_names: [str] Name of the input columns. target: str Name of the output column. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) _sklearn_util.check_expected_type(model, _LinearSVC) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "coef_")) return _MLModel(_logistic_regression._convert(model, feature_names, target)) def supports_output_scores(model): return True def get_output_classes(model): return _logistic_regression.get_output_classes(model) def get_input_dimension(model): return _logistic_regression.get_input_dimension(model) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_LinearSVR.py0000644000000000000000000000257514672066616023221 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ..._deps import _HAS_SKLEARN from ...models import MLModel as _MLModel if _HAS_SKLEARN: import sklearn from sklearn.svm import LinearSVR as _LinearSVR from . import _sklearn_util sklearn_class = sklearn.svm.LinearSVR from . import _linear_regression model_type = "regressor" def convert(model, features, target): """Convert a LinearSVR model to the protobuf spec. Parameters ---------- model: LinearSVR A trained LinearSVR model. feature_names: [str] Name of the input columns. target: str Name of the output column. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) # Check the scikit learn model _sklearn_util.check_expected_type(model, _LinearSVR) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "coef_")) return _MLModel(_linear_regression._convert(model, features, target)) def get_input_dimension(model): return _linear_regression.get_input_dimension(model) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_NuSVC.py0000644000000000000000000000353214672066616022344 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ..._deps import _HAS_SKLEARN from . import _SVC as _SVC if _HAS_SKLEARN: from sklearn.svm import NuSVC as _NuSVC from . import _sklearn_util from ._sklearn_util import check_fitted sklearn_class = _NuSVC model_type = "classifier" def convert(model, feature_names, target): """Convert a Nu-Support Vector Classification (NuSVC) model to the protobuf spec. Parameters ---------- model: NuSVC A trained NuSVC encoder model. feature_names: [str], optional (default=None) Name of the input columns. target: str, optional (default=None) Name of the output column. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. 
scikit-learn conversion API is disabled." ) _sklearn_util.check_expected_type(model, _NuSVC) return _SVC.convert(model, feature_names, target) def supports_output_scores(model): return _SVC.supports_output_scores(model) def get_output_classes(model): if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) check_fitted(model, lambda m: hasattr(m, "support_vectors_")) return _SVC.get_output_classes(model) def get_input_dimension(model): if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) check_fitted(model, lambda m: hasattr(m, "support_vectors_")) return _SVC.get_input_dimension(model) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_NuSVR.py0000644000000000000000000000266214672066616022366 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ..._deps import _HAS_SKLEARN from . import _SVR as _SVR if _HAS_SKLEARN: from sklearn.svm import NuSVR as _NuSVR from . import _sklearn_util from ._sklearn_util import check_fitted sklearn_class = _NuSVR model_type = "regressor" def convert(model, feature_names, target): """Convert a Nu Support Vector Regression (NuSVR) model to the protobuf spec. Parameters ---------- model: NuSVR A trained NuSVR encoder model. feature_names: [str] Name of the input columns. target: str Name of the output column. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) _sklearn_util.check_expected_type(model, _NuSVR) return _SVR.convert(model, feature_names, target) def get_input_dimension(model): if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) check_fitted(model, lambda m: hasattr(m, "support_vectors_")) return _SVR.get_input_dimension(model) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_SVC.py0000644000000000000000000000766514672066616022054 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ... import SPECIFICATION_VERSION as _SPECIFICATION_VERSION from ..._deps import _HAS_SKLEARN from ...models import MLModel as _MLModel from ...models._interface_management import set_classifier_interface_params from ...proto import Model_pb2 as _Model_pb2 if _HAS_SKLEARN: from sklearn.svm import SVC as _SVC from ._sklearn_util import check_fitted sklearn_class = _SVC model_type = "classifier" from ._svm_common import _set_kernel def _generate_base_svm_classifier_spec(model): """ Takes an SVM classifier produces a starting spec using the parts. that are shared between all SVMs. """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." 
) check_fitted(model, lambda m: hasattr(m, "support_vectors_")) spec = _Model_pb2.Model() spec.specificationVersion = _SPECIFICATION_VERSION svm = spec.supportVectorClassifier _set_kernel(model, svm) for cur_rho in model.intercept_: if len(model.classes_) == 2: # For some reason Scikit Learn doesn't negate for binary classification svm.rho.append(cur_rho) else: svm.rho.append(-cur_rho) for i in range(len(model._dual_coef_)): svm.coefficients.add() for cur_alpha in model._dual_coef_[i]: svm.coefficients[i].alpha.append(cur_alpha) for cur_src_vector in model.support_vectors_: cur_dest_vector = svm.denseSupportVectors.vectors.add() for i in cur_src_vector: cur_dest_vector.values.append(i) return spec def convert(model, feature_names, target): """Convert a Support Vector Classtion (SVC) model to the protobuf spec. Parameters ---------- model: SVC A trained SVC encoder model. feature_names: [str], optional (default=None) Name of the input columns. target: str, optional (default=None) Name of the output column. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) spec = _generate_base_svm_classifier_spec(model) spec = set_classifier_interface_params( spec, feature_names, model.classes_, "supportVectorClassifier", output_features=target, ) svm = spec.supportVectorClassifier for i in model.n_support_: svm.numberOfSupportVectorsPerClass.append(int(i)) if len(model.probA_) != 0 and len(model.classes_) == 2: print( "[WARNING] Scikit Learn uses a technique to normalize pairwise probabilities even for binary classification. " "This can cause differences in predicted probabilities, usually less than 0.5%." ) # If this is an empty list, then model.probA_ will be an empty list. if len(model.probA_) != 0: for i in model.probA_: svm.probA.append(i) for i in model.probB_: svm.probB.append(i) return _MLModel(spec) def supports_output_scores(model): return len(model.probA_) != 0 def get_output_classes(model): if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) check_fitted(model, lambda m: hasattr(m, "support_vectors_")) return list(model.classes_) def get_input_dimension(model): if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) check_fitted(model, lambda m: hasattr(m, "support_vectors_")) return len(model.support_vectors_[0]) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_SVR.py0000644000000000000000000000450414672066616022060 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ... import SPECIFICATION_VERSION from ..._deps import _HAS_SKLEARN from ...models import MLModel as _MLModel from ...models._interface_management import set_regressor_interface_params from ...proto import Model_pb2 as _Model_pb2 if _HAS_SKLEARN: from sklearn.svm import SVR as _SVR from ._sklearn_util import check_fitted sklearn_class = _SVR model_type = "regressor" from ._svm_common import _set_kernel def _generate_base_svm_regression_spec(model): """ Takes an SVM regression model produces a starting spec using the parts. that are shared between all SVMs. 
""" if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) spec = _Model_pb2.Model() spec.specificationVersion = SPECIFICATION_VERSION svm = spec.supportVectorRegressor _set_kernel(model, svm) svm.rho = -model.intercept_[0] for i in range(len(model._dual_coef_)): for cur_alpha in model._dual_coef_[i]: svm.coefficients.alpha.append(cur_alpha) for cur_src_vector in model.support_vectors_: cur_dest_vector = svm.denseSupportVectors.vectors.add() for i in cur_src_vector: cur_dest_vector.values.append(i) return spec def convert(model, features, target): """Convert a Support Vector Regressor (SVR) model to the protobuf spec. Parameters ---------- model: SVR A trained SVR encoder model. feature_names: [str] Name of the input columns. target: str Name of the output column. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ spec = _generate_base_svm_regression_spec(model) spec = set_regressor_interface_params(spec, features, target) return _MLModel(spec) def get_input_dimension(model): if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) check_fitted(model, lambda m: hasattr(m, "support_vectors_")) return len(model.support_vectors_[0]) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/__init__.py0000644000000000000000000000044614672066616023007 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause # A single function to manage the importing. from ._converter import convert ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_converter.py0000644000000000000000000001343014672066616023413 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools import __version__ as ct_version from coremltools.models import _METADATA_SOURCE, _METADATA_VERSION """ Defines the primary function for converting scikit-learn models. """ def convert(sk_obj, input_features=None, output_feature_names=None): """ Convert scikit-learn pipeline, classifier, or regressor to Core ML format. Parameters ---------- sk_obj: model | [model] of scikit-learn format. Scikit learn model(s) to convert to a Core ML format. The input model may be a single scikit learn model, a scikit learn pipeline model, or a list of scikit learn models. Currently supported scikit learn models are: - Linear and Logistic Regression - LinearSVC and LinearSVR - Ridge Regression - SVC and SVR - NuSVC and NuSVR - Gradient Boosting Classifier and Regressor - Decision Tree Classifier and Regressor - Random Forest Classifier and Regressor - Normalizer - Imputer - Standard Scaler - DictVectorizer - One Hot Encoder - KNeighborsClassifier The input model, or the last model in a pipeline or list of models, determines whether this is exposed as a Transformer, Regressor, or Classifier. Note that there may not be a one-to-one correspondence between scikit learn models and the Core ML models chosen to represent them. 
For example, many scikit learn models are embedded in a pipeline to handle processing of input features. input_features: str | dict | list Optional name(s) that can be given to the inputs of the scikit-learn model. Defaults to ``"input"``. Input features can be specified in a number of forms. - Single string: In this case, the input is assumed to be a single array, with the number of dimensions set using ``num_dimensions``. - List of strings: In this case, the overall input dimensions to the scikit-learn model are assumed to be the length of the list. If neighboring names are identical, they are assumed to be an input array of that length. For example: ``["a", "b", "c"]`` resolves to: ``[("a", Double), ("b", Double), ("c", Double)]``. In addition: ``["a", "a", "b"]`` resolves to: ``[("a", Array(2)), ("b", Double)]``. - Dictionary: Where the keys are the names and the indices or ranges of feature indices. In this case, the Dictionary is presented as a mapping from keys to indices or ranges of contiguous indices. For example: ``{"a" : 0, "b" : [2,3], "c" : 1}`` resolves to: ``[("a", Double), ("c", Double), ("b", Array(2))]``. Note that the ordering is determined by the indices. - List of tuples of the form ``(name, datatype)``, in which ``name`` is the name of the exposed feature, and ``datatype`` is an instance of ``String``, ``Double``, ``Int64``, ``Array``, or ``Dictionary``. output_feature_names: string or list of strings Optional name(s) that can be given to the inputs of the scikit-learn model. The ``output_feature_names`` is interpreted according to the model type: - If the scikit-learn model is a transformer, it is the name of the array feature output by the final sequence of the transformer (defaults to ``"output"``). - If it is a classifier, it should be a 2-tuple of names giving the top class prediction and the array of scores for each class (defaults to ``"classLabel"`` and ``"classScores"``). - If it is a regressor, it should give the name of the prediction value (defaults to ``"prediction"``). Returns ------- model:MLModel Returns an MLModel instance representing a Core ML model. Examples -------- .. sourcecode:: python >>> from sklearn.linear_model import LinearRegression >>> import pandas as pd # Load data >>> data = pd.read_csv('houses.csv') # Train a model >>> model = LinearRegression() >>> model.fit(data[["bedroom", "bath", "size"]], data["price"]) # Convert and save the scikit-learn model >>> import coremltools >>> coreml_model = coremltools.converters.sklearn.convert(model, ["bedroom", "bath", "size"], "price") >>> coreml_model.save('HousePricer.mlmodel') """ # This function is just a thin wrapper around the internal converter so # that sklearn isn't actually imported unless this function is called from ...models import MLModel # NOTE: Providing user-defined class labels will be enabled when # several issues with the ordering of the classes are worked out. For now, # to use custom class labels, directly import the internal function below. 
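# An illustrative (hypothetical) sketch of that usage, with made-up class labels; the signature of
# _convert_sklearn_model is the one defined in _converter_internal.py:
#     from coremltools.converters.sklearn._converter_internal import _convert_sklearn_model
#     spec = _convert_sklearn_model(sk_obj, input_features, output_feature_names, class_labels=["cat", "dog"])
#     custom_model = MLModel(spec)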
from ._converter_internal import _convert_sklearn_model spec = _convert_sklearn_model( sk_obj, input_features, output_feature_names, class_labels=None ) model = MLModel(spec) from sklearn import __version__ as sklearn_version model.user_defined_metadata[_METADATA_VERSION] = ct_version model.user_defined_metadata[_METADATA_SOURCE] = "scikit-learn=={0}".format( sklearn_version ) return model ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_converter_internal.py0000644000000000000000000003114414672066616025311 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause """ The primary file for converting Scikit-learn models. """ from ..._deps import _HAS_SKLEARN from ...models import _feature_management as _fm from ...models import datatypes from ...models.feature_vectorizer import create_feature_vectorizer from ...models.pipeline import Pipeline, PipelineClassifier, PipelineRegressor if _HAS_SKLEARN: from sklearn.pipeline import Pipeline as sk_Pipeline from collections import namedtuple as _namedtuple from . import (_SVC, _SVR, _decision_tree_classifier, _decision_tree_regressor, _dict_vectorizer, _gradient_boosting_classifier, _gradient_boosting_regressor, _imputer, _k_neighbors_classifier, _linear_regression, _LinearSVC, _LinearSVR, _logistic_regression, _normalizer, _NuSVC, _NuSVR, _one_hot_encoder, _random_forest_classifier, _random_forest_regressor, _standard_scaler, _ridge_regression) _PIPELINE_INTERNAL_FEATURE_NAME = "__feature_vector__" _converter_module_list = [ _dict_vectorizer, _one_hot_encoder, _normalizer, _standard_scaler, _imputer, _NuSVC, _NuSVR, _SVC, _SVR, _linear_regression, _LinearSVC, _LinearSVR, _logistic_regression, _random_forest_classifier, _random_forest_regressor, _decision_tree_classifier, _decision_tree_regressor, _gradient_boosting_classifier, _gradient_boosting_regressor, _k_neighbors_classifier, _ridge_regression ] def _test_module(m): assert m.model_type in ["transformer", "regressor", "classifier"], m.__name__ if m.model_type == "transformer": assert hasattr(m, "update_dimension"), m.__name__ if m.model_type == "classifier": assert hasattr(m, "supports_output_scores"), m.__name__ assert hasattr(m, "get_output_classes"), m.__name__ assert hasattr(m, "sklearn_class"), m.__name__ assert hasattr(m, "get_input_dimension"), m.__name__ return True assert all(_test_module(m) for m in _converter_module_list) _converter_lookup = dict( (md.sklearn_class, i) for i, md in enumerate(_converter_module_list) ) _converter_functions = [md.convert for md in _converter_module_list] def _get_converter_module(sk_obj): """ Returns the module holding the conversion functions for a particular model). """ try: cv_idx = _converter_lookup[sk_obj.__class__] except KeyError: raise ValueError( "Transformer '%s' not supported; supported transformers are %s." % (repr(sk_obj), ",".join(k.__name__ for k in _converter_module_list)) ) return _converter_module_list[cv_idx] def _is_sklearn_model(sk_obj): if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." 
) from sklearn.pipeline import Pipeline as sk_Pipeline return isinstance(sk_obj, sk_Pipeline) or sk_obj.__class__ in _converter_lookup def _convert_sklearn_model( input_sk_obj, input_features=None, output_feature_names=None, class_labels=None ): """ Converts a generic sklearn pipeline, transformer, classifier, or regressor into an coreML specification. """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) from sklearn.pipeline import Pipeline as sk_Pipeline if input_features is None: input_features = "input" if isinstance(input_sk_obj, sk_Pipeline): sk_obj_list = input_sk_obj.steps else: sk_obj_list = [("SKObj", input_sk_obj)] if len(sk_obj_list) == 0: raise ValueError("No SKLearn transformers supplied.") # Put the transformers into a pipeline list to hold them so that they can # later be added to a pipeline object. (Hold off adding them to the # pipeline now in case it's a single model at the end, in which case it # gets returned as is.) # # Each member of the pipeline list is a tuple of the proto spec for that # model, the input features, and the output features. pipeline_list = [] # These help us keep track of what's going on a bit easier. Input = _namedtuple("InputTransformer", ["name", "sk_obj", "module"]) Output = _namedtuple( "CoreMLTransformer", ["spec", "input_features", "output_features"] ) # Get a more information rich representation of the list for convenience. # obj_list is a list of tuples of (name, sk_obj, and the converter module for # that step in the list. obj_list = [ Input(sk_obj_name, sk_obj, _get_converter_module(sk_obj)) for sk_obj_name, sk_obj in sk_obj_list ] # Various preprocessing steps. # If the first component of the object list is the sklearn dict vectorizer, # which is unique in that it accepts a list of dictionaries, then we can # get the feature type mapping from that. This then may require the addition # of several OHE steps, so those need to be processed in the first stage. if isinstance(obj_list[0].sk_obj, _dict_vectorizer.sklearn_class): dv_obj = obj_list[0].sk_obj output_dim = len(_dict_vectorizer.get_input_feature_names(dv_obj)) if not isinstance(input_features, str): raise TypeError( "If the first transformer in a pipeline is a " "DictVectorizer, then the input feature must be the name " "of the input dictionary." ) input_features = [(input_features, datatypes.Dictionary(str))] if len(obj_list) > 1: output_feature_name = _PIPELINE_INTERNAL_FEATURE_NAME else: if output_feature_names is None: output_feature_name = "transformed_features" elif isinstance(output_feature_names, str): output_feature_name = output_feature_names else: raise TypeError( "For a transformer pipeline, the " "output_features needs to be None or a string " "for the predicted value." ) output_features = [(output_feature_name, datatypes.Array(output_dim))] spec = _dict_vectorizer.convert(dv_obj, input_features, output_features)._spec pipeline_list.append(Output(spec, input_features, output_features)) # Set up the environment for the rest of the pipeline current_input_features = output_features current_num_dimensions = output_dim # In the corner case that it's only the dict vectorizer here, just return # and exit with that at this point. if len(obj_list) == 1: return spec else: del obj_list[0] else: # First, we need to resolve the input feature types as the sklearn pipeline # expects just an array as input, but what we want to expose to the coreML # user is an interface with named variables. 
This resolution has to handle # a number of cases. # Can we get the number of features from the model? If so, pass that # information into the feature resolution function. If we can't, then this # function should return None. first_sk_obj = obj_list[0].sk_obj num_dimensions = _get_converter_module(first_sk_obj).get_input_dimension( first_sk_obj ) # Resolve the input features. features = _fm.process_or_validate_features(input_features, num_dimensions) current_num_dimensions = _fm.dimension_of_array_features(features) # Add in a feature vectorizer that consolodates all of the feature inputs # into the form expected by scipy's pipelines. Essentially this is a # translation layer between the coreML form with named arguments and the # scikit learn variable form. if len(features) == 1 and isinstance(features[0][1], datatypes.Array): current_input_features = features else: spec, _output_dimension = create_feature_vectorizer( features, _PIPELINE_INTERNAL_FEATURE_NAME ) assert _output_dimension == current_num_dimensions ft_out_features = [ ( _PIPELINE_INTERNAL_FEATURE_NAME, datatypes.Array(current_num_dimensions), ) ] pipeline_list.append(Output(spec, features, ft_out_features)) current_input_features = ft_out_features # Now, validate the sequence of transformers to make sure we have something # that can work with all of this. for i, (_, _, m) in enumerate(obj_list[:-1]): if m.model_type != "transformer": raise ValueError( "Only a sequence of transformer classes followed by a " "single transformer, regressor, or classifier is currently supported. " "(object in position %d interpreted as %s)" % (i, m.model_type) ) overall_mode = obj_list[-1].module.model_type assert overall_mode in ("transformer", "regressor", "classifier") # Now, go through each transformer in the sequence of transformers and add # it to the pipeline. for _, sk_obj, sk_m in obj_list[:-1]: next_dimension = sk_m.update_dimension(sk_obj, current_num_dimensions) output_features = [ (_PIPELINE_INTERNAL_FEATURE_NAME, datatypes.Array(next_dimension)) ] spec = sk_m.convert(sk_obj, current_input_features, output_features)._spec pipeline_list.append(Output(spec, current_input_features, output_features)) current_input_features = output_features current_num_dimensions = next_dimension # Now, handle the final transformer. This is where we need to have different # behavior depending on whether it's a classifier, transformer, or regressor. _, last_sk_obj, last_sk_m = obj_list[-1] if overall_mode == "classifier": supports_output_scores = last_sk_m.supports_output_scores(last_sk_obj) _internal_output_classes = list(last_sk_m.get_output_classes(last_sk_obj)) if class_labels is None: class_labels = _internal_output_classes output_features = _fm.process_or_validate_classifier_output_features( output_feature_names, class_labels, supports_output_scores ) elif overall_mode == "regressor": if output_feature_names is None: output_features = [("prediction", datatypes.Double())] elif isinstance(output_feature_names, str): output_features = [(output_feature_names, datatypes.Double())] else: raise TypeError( "For a regressor object or regressor pipeline, the " "output_features needs to be None or a string for the predicted value." 
) else: # transformer final_output_dimension = last_sk_m.update_dimension( last_sk_obj, current_num_dimensions ) if output_feature_names is None: output_features = [ ("transformed_features", datatypes.Array(final_output_dimension)) ] elif isinstance(output_feature_names, str): output_features = [ (output_feature_names, datatypes.Array(final_output_dimension)) ] else: raise TypeError( "For a transformer object or transformer pipeline, the " "output_features needs to be None or a string for the " "name of the transformed value." ) last_spec = last_sk_m.convert( last_sk_obj, current_input_features, output_features )._spec pipeline_list.append(Output(last_spec, current_input_features, output_features)) # Now, create the pipeline and return the spec for it. # If it's just one element, we can return it. if len(pipeline_list) == 1: return pipeline_list[0].spec original_input_features = pipeline_list[0].input_features if overall_mode == "regressor": pipeline = PipelineRegressor(original_input_features, output_features) elif overall_mode == "classifier": pipeline = PipelineClassifier( original_input_features, class_labels, output_features ) else: pipeline = Pipeline(original_input_features, output_features) # Okay, now we can build the pipeline spec. for spec, input_features, output_features in pipeline_list: pipeline.add_model(spec) return pipeline.spec ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_decision_tree_classifier.py0000644000000000000000000000335714672066616026433 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ..._deps import _HAS_SKLEARN from ...models import MLModel as _MLModel from ._tree_ensemble import convert_tree_ensemble if _HAS_SKLEARN: import sklearn.tree as _tree from . import _sklearn_util model_type = "classifier" sklearn_class = _tree.DecisionTreeClassifier def convert(model, input_name, output_features): """Convert a decision tree model to protobuf format. Parameters ---------- decision_tree : DecisionTreeClassifier A trained scikit-learn tree model. input_name: str Name of the input columns. output_name: str Name of the output columns. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) _sklearn_util.check_expected_type(model, _tree.DecisionTreeClassifier) _sklearn_util.check_fitted( model, lambda m: hasattr(m, "tree_") and model.tree_ is not None ) return _MLModel( convert_tree_ensemble( model, input_name, output_features, mode="classifier", class_labels=model.classes_, ) ) def supports_output_scores(model): return True def get_output_classes(model): return list(model.classes_) def get_input_dimension(model): if hasattr(model, "n_features_in_"): return model.n_features_in_ else: return model.n_features_ ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_decision_tree_regressor.py0000644000000000000000000000277214672066616026322 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. 
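A hedged usage sketch of the pipeline conversion path implemented in _converter_internal.py above (an editor's illustration, not part of the package source). The dataset, the pipeline steps, and the "features"/"prediction" names are placeholders; the entry point is the public coremltools.converters.sklearn.convert wrapper shown earlier, which delegates to _convert_sklearn_model.

import coremltools
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Fit a transformer + regressor pipeline; the converter chains one spec per step
# and finishes with a GLM regressor, as in the code above.
X, y = load_diabetes(return_X_y=True)
pipe = Pipeline([("scale", StandardScaler()), ("regress", LinearRegression())]).fit(X, y)

# A single name exposes one array input; for regressors the output name defaults
# to "prediction" when omitted.
mlmodel = coremltools.converters.sklearn.convert(pipe, "features", "prediction")
mlmodel.save("diabetes_pipeline.mlmodel")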
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ..._deps import _HAS_SKLEARN from ...models import MLModel as _MLModel from ._tree_ensemble import convert_tree_ensemble as _convert_tree_ensemble if _HAS_SKLEARN: import sklearn.tree as _tree from . import _sklearn_util model_type = "regressor" sklearn_class = _tree.DecisionTreeRegressor def convert(model, feature_names, target): """Convert a decision tree model to protobuf format. Parameters ---------- decision_tree : DecisionTreeRegressor A trained scikit-learn tree model. feature_names: [str] Name of the input columns. target: str Name of the output column. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) _sklearn_util.check_expected_type(model, _tree.DecisionTreeRegressor) _sklearn_util.check_fitted( model, lambda m: hasattr(m, "tree_") and model.tree_ is not None ) return _MLModel(_convert_tree_ensemble(model, feature_names, target)) def get_input_dimension(model): if hasattr(model, "n_features_"): return model.n_features_ else: return model.n_features_in_ ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_dict_vectorizer.py0000644000000000000000000000677514672066616024615 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ... import SPECIFICATION_VERSION from ..._deps import _HAS_SKLEARN from ...models import MLModel as _MLModel from ...models._feature_management import process_or_validate_features from ...models._interface_management import set_transform_interface_params from ...models.feature_vectorizer import create_feature_vectorizer from ...proto import Model_pb2 as _Model_pb2 if _HAS_SKLEARN: from sklearn.feature_extraction import DictVectorizer sklearn_class = DictVectorizer from ...models import datatypes from ...models.pipeline import Pipeline model_type = "transformer" def convert(model, input_features, output_features): """Convert a DictVectorizer model to the protobuf spec. Parameters ---------- model: DictVectorizer A fitted DictVectorizer model. input_features: str Name of the input column. output_features: str Name of the output column. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ _INTERMEDIATE_FEATURE_NAME = "__sparse_vector_features__" n_dimensions = len(model.feature_names_) input_features = process_or_validate_features(input_features) # Ensure that the output_features are also solid. output_features = process_or_validate_features(output_features, n_dimensions) # The DictVectorizer in the framework outputs a sparse dictionary # of index to value due to other considerations, but we are expecting # the output of this to be a dense feature vector. To make that happen, # put a feature_vectorizer immediately after the dict vectorizer. pline = Pipeline(input_features, output_features) # Set the basic model parameters of the dict vectorizer component.
dv_spec = _Model_pb2.Model() dv_spec.specificationVersion = SPECIFICATION_VERSION # Set up the dict vectorizer parameters tr_spec = dv_spec.dictVectorizer is_str = None for feature_name in model.feature_names_: if isinstance(feature_name, str): if is_str is False: raise ValueError("Mapping of DictVectorizer mixes int and str types.") tr_spec.stringToIndex.vector.append(feature_name) is_str = True if isinstance(feature_name, int): if is_str is True: raise ValueError("Mapping of DictVectorizer mixes int and str types.") tr_spec.int64ToIndex.vector.append(feature_name) is_str = False intermediate_features = [ (_INTERMEDIATE_FEATURE_NAME, datatypes.Dictionary(key_type=int)) ] # Set the interface for the dict vectorizer with the input and the # intermediate output set_transform_interface_params(dv_spec, input_features, intermediate_features) pline.add_model(dv_spec) # Follow the dict vectorizer by a feature_vectorizer to change the sparse # output layer into a dense vector as expected. fvec, _num_out_dim = create_feature_vectorizer( intermediate_features, output_features[0][0], {"__sparse_vector_features__": n_dimensions}, ) pline.add_model(fvec) return _MLModel(pline.spec) def update_dimension(m, current_num_dimensions): return len(m.feature_names_) def get_input_dimension(m): return None def get_input_feature_names(m): return m.feature_names_ ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_gradient_boosting_classifier.py0000644000000000000000000000670114672066616027314 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from ..._deps import _HAS_SKLEARN from ...models import MLModel as _MLModel from ._tree_ensemble import convert_tree_ensemble as _convert_tree_ensemble, get_input_dimension if _HAS_SKLEARN: import sklearn.ensemble as _ensemble from . import _sklearn_util sklearn_class = _ensemble.GradientBoostingClassifier model_type = "classifier" def convert(model, feature_names, target): """Convert a boosted tree model to protobuf format. Parameters ---------- decision_tree : GradientBoostingClassifier A trained scikit-learn tree model. feature_names: [str] Name of the input columns. target: str Name of the output column. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) _sklearn_util.check_expected_type(model, _ensemble.GradientBoostingClassifier) def is_gbr_model(m): if len(m.estimators_) == 0: return False if hasattr(m, "estimators_") and m.estimators_ is not None: for t in m.estimators_.flatten(): if not hasattr(t, "tree_") or t.tree_ is None: return False return True else: return False _sklearn_util.check_fitted(model, is_gbr_model) post_evaluation_transform = None if model.n_classes_ == 2: post_evaluation_transform = "Regression_Logistic" else: post_evaluation_transform = "Classification_SoftMax" # Here we enumerate known methods GradientBoostingClassifier uses for initializing the raw predictions. # Alternatively we can enumerate known estimators/strategies combinations.
# This covers more combinations with less hacks base_prediction = None if hasattr(model, "n_features_in_"): num_input_features = model.n_features_in_ else: num_input_features = model.n_features_ dummy_x = np.zeros((1, num_input_features)) for base_init_func in ('_init_decision_function', '_raw_predict_init'): if not hasattr(model, base_init_func): continue raw_predictions = getattr(model, base_init_func)(dummy_x)[0, :] if '_init_decision_function' == base_init_func and model.n_classes_ > 2: # fix initial default prediction for multiclass classification # https://github.com/scikit-learn/scikit-learn/pull/12983 raw_predictions = np.log(raw_predictions) base_prediction = list(raw_predictions) break if base_prediction is None: raise ValueError("We don't support your classifier: cannot initialize base_prediction. " "Please file a bug report.") return _MLModel( _convert_tree_ensemble( model, feature_names, target, mode="classifier", base_prediction=base_prediction, class_labels=model.classes_, post_evaluation_transform=post_evaluation_transform, ) ) def supports_output_scores(model): return True def get_output_classes(model): return list(model.classes_) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_gradient_boosting_regressor.py0000644000000000000000000000427214672066616027204 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from packaging.version import Version from ..._deps import _HAS_SKLEARN, _SKLEARN_VERSION from ...models import MLModel as _MLModel from ._tree_ensemble import convert_tree_ensemble as _convert_tree_ensemble, get_input_dimension if _HAS_SKLEARN: import sklearn.ensemble as _ensemble from . import _sklearn_util sklearn_class = _ensemble.GradientBoostingRegressor model_type = "regressor" def convert(model, input_features, output_features): """Convert a boosted tree model to protobuf format. Parameters ---------- decision_tree : GradientBoostingRegressor A trained scikit-learn tree model. input_feature: [str] Name of the input columns. output_features: str Name of the output column. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) _sklearn_util.check_expected_type(model, _ensemble.GradientBoostingRegressor) def is_gbr_model(m): if len(m.estimators_) == 0: return False if hasattr(m, "estimators_") and m.estimators_ is not None: for t in m.estimators_.flatten(): if not hasattr(t, "tree_") or t.tree_ is None: return False return True else: return False _sklearn_util.check_fitted(model, is_gbr_model) if model.loss == "huber": base_prediction = model.init_.quantile else: # >= 0.22 GradientBoostingRegressor deprecated "mean" in favor of "constant_" attribute if _SKLEARN_VERSION < Version("0.22"): base_prediction = model.init_.mean else: base_prediction = model.init_.constant_ return _MLModel( _convert_tree_ensemble( model, input_features, output_features, base_prediction=base_prediction ) ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_imputer.py0000644000000000000000000000651514672066616023077 0ustar00rootroot# Copyright (c) 2017, Apple Inc. 
All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from packaging.version import Version from ... import SPECIFICATION_VERSION from ..._deps import _HAS_SKLEARN, _SKLEARN_VERSION from ...models import MLModel as _MLModel from ...models import datatypes from ...models._interface_management import set_transform_interface_params from ...proto import Model_pb2 as _Model_pb2 from . import _sklearn_util if _HAS_SKLEARN: import sklearn try: # scikit-learn >= 0.21 from sklearn.impute import SimpleImputer as Imputer sklearn_class = sklearn.impute.SimpleImputer except ImportError: # scikit-learn < 0.21 from sklearn.preprocessing import Imputer sklearn_class = sklearn.preprocessing.Imputer model_type = "transformer" def convert(model, input_features, output_features): """Convert an Imputer model to the protobuf spec. Parameters ---------- model: Imputer A fitted Imputer model. input_features: str Name of the input column. output_features: str Name of the output column. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) # Set the interface params. spec = _Model_pb2.Model() spec.specificationVersion = SPECIFICATION_VERSION assert len(input_features) == 1 assert isinstance(input_features[0][1], datatypes.Array) # feature name in and out are the same here spec = set_transform_interface_params(spec, input_features, output_features) # Test the scikit-learn model _sklearn_util.check_expected_type(model, Imputer) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "statistics_")) # model.axis deprecated in SimpleImputer >= 0.22, which now imputes only # along columns as desired here. if _SKLEARN_VERSION < Version("0.22"): if model.axis != 0: raise ValueError("Imputation is only supported along axis = 0.") # The imputer in our framework only works on single columns, so # we need to translate that over. The easiest way to do that is to # put it in a nested pipeline with a feature extractor and a tr_spec = spec.imputer for v in model.statistics_: tr_spec.imputedDoubleArray.vector.append(v) try: tr_spec.replaceDoubleValue = float(model.missing_values) except ValueError: raise ValueError( "Only scalar values or NAN as missing_values " "in _imputer are supported." ) return _MLModel(spec) def update_dimension(model, input_dimension): """ Given a model that takes an array of dimension input_dimension, returns the output dimension. """ # This doesn't expand anything. return input_dimension def get_input_dimension(model): if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "statistics_")) return len(model.statistics_) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_k_neighbors_classifier.py0000644000000000000000000002246514672066616026102 0ustar00rootroot# Copyright (c) 2019, Apple Inc. All rights reserved.
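A hedged usage sketch for the KNeighborsClassifier converter that follows (an editor's illustration; the dataset and the "features"/"species" names are placeholders). Per the checks in this module, only brute-force or kd-tree indexes, uniform weighting, and the Euclidean metric (or Minkowski with p=2) convert, and class labels are stored as int64.

import coremltools
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

# Integer class labels, as required by the int64 class label list in the spec.
X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(
    n_neighbors=3, algorithm="kd_tree", weights="uniform", metric="euclidean"
).fit(X, y)

mlmodel = coremltools.converters.sklearn.convert(knn, "features", "species")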
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import coremltools from coremltools.proto import FeatureTypes_pb2 from ..._deps import _HAS_SCIPY, _HAS_SKLEARN from ...models import MLModel as _MLModel if _HAS_SKLEARN: import sklearn.neighbors as _neighbors from . import _sklearn_util if _HAS_SCIPY: import scipy as sp import numpy as np model_type = "classifier" sklearn_class = _neighbors.KNeighborsClassifier def convert(model, input_name, output_name): """Convert a scikit KNeighborsClassifier to protobuf format. Parameters ---------- model : KNeighborsClassifier A trained scikit-learn KNeighborsClassifier model. input_name: str Name of the input column. output_name: str Name of the output column. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) _sklearn_util.check_expected_type(model, sklearn_class) _check_fitted(model) _check_algorithm(model) _check_weighting_scheme(model) _check_distance_metric(model) return _MLModel(_convert_k_neighbors_classifier(model, input_name, output_name)) def supports_output_scores(model): """KNeighborsClassifier models do not support output scores.""" return False def get_output_classes(model): """Get the candidate classes for the model.""" _check_fitted(model) return list(model.classes_) def _convert_k_neighbors_classifier(model, input_name, output_name): """Convert the scikit KNeighborsClassifier to CoreML. Assumes initial validation of the scikit model has been done.""" spec = coremltools.proto.Model_pb2.Model() spec.specificationVersion = coremltools.SPECIFICATION_VERSION spec.kNearestNeighborsClassifier.numberOfNeighbors.defaultValue = model.n_neighbors spec.kNearestNeighborsClassifier.numberOfNeighbors.range.minValue = 1 spec.kNearestNeighborsClassifier.numberOfNeighbors.range.maxValue = _number_of_samples( model, spec ) # is there a better heuristic to use here? number_of_dimensions = 0 if _is_algorithm_brute(model): number_of_dimensions = model._fit_X.shape[1] spec.kNearestNeighborsClassifier.nearestNeighborsIndex.linearIndex.MergeFromString( b"" ) elif _is_algorithm_kd_tree(model): npdata = np.asarray(model._tree.data) number_of_dimensions = get_input_dimension(model) spec.kNearestNeighborsClassifier.nearestNeighborsIndex.singleKdTreeIndex.leafSize = ( model.leaf_size ) else: raise TypeError( "KNeighbors algorithm not supported for CoreML conversion: {}".format( model.algorithm ) ) spec.kNearestNeighborsClassifier.nearestNeighborsIndex.numberOfDimensions = ( number_of_dimensions ) # Make sure the distance function is set spec.kNearestNeighborsClassifier.nearestNeighborsIndex.squaredEuclideanDistance.MergeFromString( b"" ) input_features = spec.description.input.add() input_features.name = input_name[0][0] input_features.type.multiArrayType.shape.extend([number_of_dimensions]) input_features.type.multiArrayType.dataType = ( FeatureTypes_pb2.ArrayFeatureType.FLOAT32 ) output_label = spec.description.output.add() output_label.name = output_name[0][0] # predictedFeatureName is required since KNN is a classifier and it should be same as outputName. 
spec.description.predictedFeatureName = output_label.name # Need to confirm if scikit only accepts integer labels output_label.type.int64Type.MergeFromString(b"") spec.kNearestNeighborsClassifier.uniformWeighting.MergeFromString(b"") _extract_training_data(model, spec) return spec def _number_of_samples(model, spec): """Get the number of samples the model is fitted to.""" if _is_algorithm_brute(model): return model._fit_X.shape[0] elif _is_algorithm_kd_tree(model): return len(np.asarray(model._tree.data)) return 0 def _extract_training_data(model, spec): """Extract the training data from the scikit model and add it to the CoreML spec""" if _is_algorithm_brute(model): X = model._fit_X if _is_valid_sparse_format(X): X = _unpack_sparse(X) for sample in X: coreml_sample = ( spec.kNearestNeighborsClassifier.nearestNeighborsIndex.floatSamples.add() ) for feature in sample: coreml_sample.vector.append(feature) elif _is_algorithm_kd_tree(model): # sklearn guarantees that tree data is not stored in a sparse format npdata = np.asarray(model._tree.data) for sample in npdata: coreml_sample = ( spec.kNearestNeighborsClassifier.nearestNeighborsIndex.floatSamples.add() ) for feature in sample: coreml_sample.vector.append(feature) for label in model._y: spec.kNearestNeighborsClassifier.int64ClassLabels.vector.append(label) def get_input_dimension(model): """Get the input dimension for the model""" _check_fitted(model) number_of_dimensions = 0 if _is_algorithm_brute(model): number_of_dimensions = model._fit_X.shape[1] elif _is_algorithm_kd_tree(model): npdata = np.asarray(model._tree.data) number_of_dimensions = len(npdata[0]) else: raise TypeError( "KNeighbors algorithm not supported for CoreML conversion: {}".format( model.algorithm ) ) return number_of_dimensions def _check_fitted(model): """Simple wrapper to check if the KNeighborsClassifier has been fitted.""" return _sklearn_util.check_fitted( model, lambda m: hasattr(m, "_fit_method") or hasattr(m, "_fit_X") ) def _check_algorithm(model): """Ensure the kNeighbors algorithm for the given scikit model is a supported type""" is_valid = False print_name = "" if model.algorithm == "brute" or model.algorithm == "kd_tree": is_valid = True print_name = model.algorithm elif model.algorithm == "auto" and model._fit_method == "kd_tree": is_valid = True print_name = "kd_tree" elif model.algorithm == "auto" and model._fit_method == "brute": is_valid = True print_name = "brute" if not is_valid: raise TypeError( "KNeighbors algorithm not supported for CoreML conversion: {}".format( print_name ) ) def _check_weighting_scheme(model): """Simple wrapper to ensure the weighting scheme is valid for CoreML conversion""" is_valid = False if model.weights == "uniform": is_valid = True # Other cases CoreML doesn't support include weighting by distance or a user-provided 'callable' object. if not is_valid: print_name = "" if _is_printable(model.weights): print_name = model.weights else: print_name = getattr(model.weights, "__name__", repr(model.weights)) raise TypeError( "KNeighbors weight function not supported for CoreML conversion: {}".format( print_name ) ) def _check_distance_metric(model): """Simple wrapper to ensure the distance metric is valid for CoreML conversion""" is_valid = False if model.metric == "euclidean": is_valid = True elif model.metric == "minkowski" and model.p == 2: is_valid = True # There are a number of other distance metrics supported by scikit that CoreML doesn't currently support. 
if not is_valid: print_name = "" if _is_printable(model.metric): print_name = model.metric else: print_name = getattr(model.metric, "__name__", repr(model.metric)) raise TypeError( "KNeighbors distance metric not supported for CoreML conversion: {}".format( print_name ) ) def _is_algorithm_brute(model): """Checks if the algorithm for the scikit model is set to 'brute'.""" return model.algorithm == "brute" or ( model.algorithm == "auto" and model._fit_method == "brute" ) def _is_algorithm_kd_tree(model): """Checks if the algorithm for the scikit model is set to 'kd_tree'.""" return model.algorithm == "kd_tree" or ( model.algorithm == "auto" and model._fit_method == "kd_tree" ) def _is_printable(obj): """Check if the object is a valid text type.""" return isinstance(obj, str) def _is_valid_sparse_format(obj): """Check if the object is in CSR sparse format (the only valid type for KNeighborsClassifier)""" if not _HAS_SCIPY: return False return isinstance(obj, sp.sparse.csr_matrix) def _unpack_sparse(obj): """Unpack the sparse matrix into a format that we can easily iterate over for insertion into a CoreML model.""" if not _HAS_SCIPY and not sp.sparse.issparse(obj): raise TypeError("Object {} is not a scipy sparse matrix type".format(type(obj))) return obj.toarray() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_linear_regression.py0000644000000000000000000000451014672066616025115 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as _np from ... import SPECIFICATION_VERSION from ..._deps import _HAS_SKLEARN from ...models import MLModel as _MLModel from ...models._interface_management import set_regressor_interface_params from ...proto import Model_pb2 as _Model_pb2 if _HAS_SKLEARN: import sklearn from sklearn.linear_model import LinearRegression from . import _sklearn_util model_type = "regressor" sklearn_class = sklearn.linear_model.LinearRegression def convert(model, features, target): """Convert a linear regression model to the protobuf spec. Parameters ---------- model: LinearRegression A trained linear regression encoder model. feature_names: [str] Name of the input columns. target: str Name of the output column. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) # Check the scikit learn model _sklearn_util.check_expected_type(model, LinearRegression) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "coef_")) return _MLModel(_convert(model, features, target)) def _convert(model, features, target): # Set the model class (regressor) spec = _Model_pb2.Model() spec.specificationVersion = SPECIFICATION_VERSION spec = set_regressor_interface_params(spec, features, target) # Add parameters for the linear regression. lr = spec.glmRegressor if isinstance(model.intercept_, _np.ndarray): assert len(model.intercept_) == 1 lr.offset.append(model.intercept_[0]) else: lr.offset.append(model.intercept_) weights = lr.weights.add() for i in model.coef_: weights.value.append(i) return spec def get_input_dimension(model): if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." 
) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "coef_")) return model.coef_.size ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_logistic_regression.py0000644000000000000000000000606014672066616025462 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from collections.abc import Iterable from ..._deps import _HAS_SKLEARN from ...models import MLModel as _MLModel if _HAS_SKLEARN: from sklearn.linear_model import LogisticRegression from . import _sklearn_util sklearn_class = LogisticRegression from ... import SPECIFICATION_VERSION from ...models._interface_management import set_classifier_interface_params from ...proto import Model_pb2 as _Model_pb2 model_type = "classifier" def convert(model, feature_names, target): """Convert a Logistic Regression model to the protobuf spec. Parameters ---------- model: LogisticRegression A trained LogisticRegression model. feature_names: [str], optional (default=None) Name of the input columns. target: str, optional (default=None) Name of the output column. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) _sklearn_util.check_expected_type(model, LogisticRegression) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "coef_")) return _MLModel(_convert(model, feature_names, target)) def _convert(model, feature_names, target): spec = _Model_pb2.Model() spec.specificationVersion = SPECIFICATION_VERSION set_classifier_interface_params( spec, feature_names, model.classes_, "glmClassifier", output_features=target ) glmClassifier = spec.glmClassifier if model.multi_class == "ovr": glmClassifier.classEncoding = glmClassifier.OneVsRest else: print( '[ERROR] Currently "One Vs Rest" is the only supported multiclass option.' ) return None glmClassifier.postEvaluationTransform = glmClassifier.Logit if isinstance(model.intercept_, Iterable): for val in model.intercept_: glmClassifier.offset.append(val) else: for _ in model.coef_: glmClassifier.offset.append(model.intercept_) for cur_in_row in model.coef_: cur_out_row = glmClassifier.weights.add() for val in cur_in_row: cur_out_row.value.append(val) return spec def supports_output_scores(model): return True def get_output_classes(model): if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "coef_")) return list(model.classes_) def get_input_dimension(model): if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "coef_")) return len(model.coef_[0]) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_normalizer.py0000644000000000000000000000441314672066616023567 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ... 
import SPECIFICATION_VERSION from ..._deps import _HAS_SKLEARN from ...models import MLModel as _MLModel from ...models._interface_management import \ set_transform_interface_params as _set_transform_interface_params from ...proto import Model_pb2 as _Model_pb2 from ...proto.Normalizer_pb2 import Normalizer as _proto__normalizer if _HAS_SKLEARN: from sklearn.preprocessing import Normalizer from . import _sklearn_util sklearn_class = Normalizer model_type = "transformer" def convert(model, input_features, output_features): """Convert a normalizer model to the protobuf spec. Parameters ---------- model: Normalizer A Normalizer. input_features: str Name of the input column. output_features: str Name of the output column. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) # Test the scikit-learn model _sklearn_util.check_expected_type(model, Normalizer) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "norm")) # Set the interface params. spec = _Model_pb2.Model() spec.specificationVersion = SPECIFICATION_VERSION spec = _set_transform_interface_params(spec, input_features, output_features) # Set the one hot encoder parameters _normalizer_spec = spec.normalizer if model.norm == "l1": _normalizer_spec.normType = _proto__normalizer.L1 elif model.norm == "l2": _normalizer_spec.normType = _proto__normalizer.L2 elif model.norm == "max": _normalizer_spec.normType = _proto__normalizer.LMax return _MLModel(spec) def update_dimension(model, input_dimension): """ Given a model that takes an array of dimension input_dimension, returns the output dimension. """ # No change return input_dimension def get_input_dimension(model): # Cannot determine this now. return None ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_one_hot_encoder.py0000644000000000000000000002325614672066616024545 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ... import SPECIFICATION_VERSION from ..._deps import _HAS_SKLEARN, _SKLEARN_VERSION from ...models import MLModel as _MLModel from ...models import datatypes from ...models._interface_management import set_transform_interface_params from ...models.array_feature_extractor import create_array_feature_extractor from ...models.feature_vectorizer import create_feature_vectorizer from ...models.pipeline import Pipeline from ...proto import Model_pb2 as _Model_pb2 from ...proto import OneHotEncoder_pb2 as _OHE_pb2 from . import _sklearn_util if _HAS_SKLEARN: from packaging.version import Version from sklearn.preprocessing import OneHotEncoder sklearn_class = OneHotEncoder # model type determines the behavior of this module. model_type = "transformer" def convert(model, input_features, output_features): """Convert a one-hot-encoder model to the protobuf spec. Parameters ---------- model: OneHotEncoder A trained one-hot encoder model. input_features: str, optional Name of the input column. output_features: str, optional Name of the output column. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." 
) # Make sure the model is fitted. _sklearn_util.check_expected_type(model, OneHotEncoder) if _SKLEARN_VERSION >= Version("0.22"): _sklearn_util.check_fitted(model, lambda m: hasattr(m, "categories_")) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "n_features_in_")) else: _sklearn_util.check_fitted(model, lambda m: hasattr(m, "active_features_")) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "n_values_")) input_dimension = get_input_dimension(model) if input_dimension is not None: # Make sure that our starting dimensions are correctly managed. assert len(input_features) == 1 assert input_features[0][1] == datatypes.Array(input_dimension) input_dimension = input_features[0][1].num_elements expected_output_dimension = update_dimension(model, input_dimension) assert output_features[0][1] == datatypes.Array(expected_output_dimension) if _SKLEARN_VERSION >= Version("0.22"): model.categorical_features = "all" model.active_features_ = range(expected_output_dimension) model.feature_indices_ = [0] t = 0 for i in model._n_features_outs: t = t + i model.feature_indices_.append(t) # Create a pipeline that can do all of the subsequent feature extraction. feature_vectorizer_input_features = [] feature_vectorizer_size_map = {} if model.categorical_features == "all": _categorical_features = set(range(input_dimension)) _cat_feature_idx_mapping = dict((i, i) for i in range(input_dimension)) else: _categorical_features = set(model.categorical_features) _cat_feature_idx_mapping = dict( (_idx, i) for i, _idx in enumerate(sorted(model.categorical_features)) ) pline = Pipeline(input_features, output_features) # Track the overall packing index, which determines the output ordering. pack_idx = 0 # First, go through all the columns that are encoded. The sklearn OHE puts # all of these first, regardless of their original ordering. for idx in range(input_dimension): f_name = "__OHE_%d__" % pack_idx if idx in _categorical_features: # This input column is one hot encoded feature_extractor_spec = create_array_feature_extractor( input_features, f_name, idx, output_type="Int64" ) pline.add_model(feature_extractor_spec) _cat_feature_idx = _cat_feature_idx_mapping[idx] ohe_input_features = [(f_name, datatypes.Int64())] ohe_output_features = [(f_name, datatypes.Dictionary("Int64"))] # Create a one hot encoder per column o_spec = _Model_pb2.Model() o_spec.specificationVersion = SPECIFICATION_VERSION o_spec = set_transform_interface_params( o_spec, ohe_input_features, ohe_output_features ) ohe_spec = o_spec.oneHotEncoder ohe_spec.outputSparse = True if model.handle_unknown == "error": ohe_spec.handleUnknown = _OHE_pb2.OneHotEncoder.HandleUnknown.Value( "ErrorOnUnknown" ) else: ohe_spec.handleUnknown = _OHE_pb2.OneHotEncoder.HandleUnknown.Value( "IgnoreUnknown" ) # Need to do a quick search to find the part of the active_features_ mask # that represents the categorical variables in our part. Could do this # with binary search, but we probably don't need speed so much here. def bs_find(a, i): lb, k = 0, len(a) while k > 0: _idx = lb + (k // 2) if a[_idx] < i: lb = _idx + 1 k -= 1 k = k // 2 return lb # Here are the indices we are looking for f_idx_bottom = model.feature_indices_[_cat_feature_idx] f_idx_top = model.feature_indices_[_cat_feature_idx + 1] # Now find where in the active features list we should look. 
cat_feat_idx_bottom = bs_find(model.active_features_, f_idx_bottom) cat_feat_idx_top = bs_find(model.active_features_, f_idx_top) n_cat_values = cat_feat_idx_top - cat_feat_idx_bottom for i in range(cat_feat_idx_bottom, cat_feat_idx_top): # The actual categorical value is stored as an offset in the active_features list. cat_idx = model.active_features_[i] - f_idx_bottom ohe_spec.int64Categories.vector.append(cat_idx) # Add the ohe to the pipeline pline.add_model(o_spec) # Add the result to the feature_vectorizer at the end. feature_vectorizer_input_features.append( (f_name, datatypes.Dictionary("Int64")) ) feature_vectorizer_size_map[f_name] = n_cat_values pack_idx += 1 # Now go through all the columns that are not encoded as the sklearn OHE puts # these after the encoded ones. For speed, we can put these all in a single # ArrayFeatureExtractor # pass_through_features = [ idx for idx in range(input_dimension) if idx not in _categorical_features ] if pass_through_features: f_name = "__OHE_pass_through__" # This input column is not one hot encoded feature_extractor_spec = create_array_feature_extractor( input_features, f_name, pass_through_features ) pline.add_model(feature_extractor_spec) feature_vectorizer_input_features.append( (f_name, datatypes.Array(len(pass_through_features))) ) # Finally, add the feature vectorizer to the pipeline. output_feature_name = output_features[0][0] output_feature_dimension = output_features[0][1].num_elements fvec, _num_out_dim = create_feature_vectorizer( feature_vectorizer_input_features, output_features[0][0], feature_vectorizer_size_map, ) # Make sure that the feature vectorizer input actually matches up with the assert _num_out_dim == output_features[0][1].num_elements pline.add_model(fvec) return _MLModel(pline.spec) def update_dimension(model, input_dimension): """ Given a model that takes an array of dimension input_dimension, returns the output dimension. """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) if _SKLEARN_VERSION >= Version("0.22"): _sklearn_util.check_fitted(model, lambda m: hasattr(m, "categories_")) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "n_features_in_")) return sum(model._n_features_outs) else: _sklearn_util.check_fitted(model, lambda m: hasattr(m, "active_features_")) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "n_values_")) if model.categorical_features == "all": return len(model.active_features_) else: out_dimension = len(model.active_features_) + ( input_dimension - len(model.n_values_) ) return out_dimension def get_input_dimension(model): if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." 
) if _SKLEARN_VERSION >= Version("0.22"): _sklearn_util.check_fitted(model, lambda m: hasattr(m, "categories_")) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "n_features_in_")) return model.n_features_in_ else: _sklearn_util.check_fitted(model, lambda m: hasattr(m, "active_features_")) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "n_values_")) if model.categorical_features == "all": return len(model.feature_indices_) - 1 else: # This can't actually be determined from the model as indices after the # rest of the categorical values don't seem to be tracked return None ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_random_forest_classifier.py0000644000000000000000000000357414672066616026462 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ..._deps import _HAS_SKLEARN from ...models import MLModel as _MLModel from ._tree_ensemble import convert_tree_ensemble as _convert_tree_ensemble, get_input_dimension if _HAS_SKLEARN: import sklearn.ensemble as _ensemble from . import _sklearn_util sklearn_class = _ensemble.RandomForestClassifier model_type = "classifier" def convert(model, feature_names, target): """Convert a boosted tree model to protobuf format. Parameters ---------- decision_tree : RandomForestClassifier A trained scikit-learn tree model. feature_names: [str] Name of the input columns. target: str Name of the output column. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) _sklearn_util.check_expected_type(model, _ensemble.RandomForestClassifier) def is_rf_model(m): if len(m.estimators_) == 0: return False if hasattr(m, "estimators_") and m.estimators_ is not None: for t in m.estimators_: if not hasattr(t, "tree_") or t.tree_ is None: return False return True else: return False _sklearn_util.check_fitted(model, is_rf_model) return _MLModel( _convert_tree_ensemble( model, feature_names, target, mode="classifier", class_labels=model.classes_ ) ) def supports_output_scores(model): return True def get_output_classes(model): return list(model.classes_) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_random_forest_regressor.py0000644000000000000000000000325614672066616026346 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ..._deps import _HAS_SKLEARN from ...models import MLModel as _MLModel from ._tree_ensemble import convert_tree_ensemble as _convert_tree_ensemble, get_input_dimension if _HAS_SKLEARN: import sklearn.ensemble as _ensemble from . import _sklearn_util sklearn_class = _ensemble.RandomForestRegressor model_type = "regressor" def convert(model, feature_names, target): """Convert a boosted tree model to protobuf format. Parameters ---------- decision_tree : RandomForestRegressor A trained scikit-learn tree model. feature_names: [str] Name of the input columns. target: str Name of the output column. 
Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) _sklearn_util.check_expected_type(model, _ensemble.RandomForestRegressor) def is_rf_model(m): if len(m.estimators_) == 0: return False if hasattr(m, "estimators_") and m.estimators_ is not None: for t in m.estimators_: if not hasattr(t, "tree_") or t.tree_ is None: return False return True else: return False _sklearn_util.check_fitted(model, is_rf_model) return _MLModel(_convert_tree_ensemble(model, feature_names, target)) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_ridge_regression.py0000644000000000000000000000261614672066616024742 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ..._deps import _HAS_SKLEARN from ...models import MLModel as _MLModel if _HAS_SKLEARN: import sklearn from sklearn.linear_model import Ridge as _Ridge from . import _sklearn_util sklearn_class = sklearn.linear_model.Ridge from . import _linear_regression model_type = "regressor" def convert(model, features, target): """Convert a Ridge Regression model to the protobuf spec. Parameters ---------- model: LinearSVR A trained Ridge Regression model. feature_names: [str] Name of the input columns. target: str Name of the output column. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) # Check the scikit learn model _sklearn_util.check_expected_type(model, _Ridge) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "coef_")) return _MLModel(_linear_regression._convert(model, features, target)) def get_input_dimension(model): return _linear_regression.get_input_dimension(model) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_sklearn_util.py0000644000000000000000000000201014672066616024070 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause def check_fitted(model, func): """Check if a model is fitted. Raise error if not. Parameters ---------- model: model Any scikit-learn model func: model Function to check if a model is not trained. """ if not func(model): raise TypeError("Expected a 'fitted' model for conversion") def check_expected_type(model, expected_type): """Check if a model is of the right type. Raise error if not. Parameters ---------- model: model Any scikit-learn model expected_type: Type Expected type of the scikit-learn. """ if model.__class__.__name__ != expected_type.__name__: raise TypeError( "Expected model of type '%s' (got %s)" % (expected_type.__name__, model.__class__.__name__) ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_standard_scaler.py0000644000000000000000000000510214672066616024532 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. 
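A hedged usage sketch for the StandardScaler converter below (an editor's illustration; the random input matrix and the "input"/"scaled" names are placeholders). As the convert() in that module shows, the Core ML scaler stores shiftValue = -mean_ and scaleValue = 1 / scale_, reproducing (x - mean) / scale.

import numpy as np
import coremltools
from sklearn.preprocessing import StandardScaler

X = np.random.rand(50, 3)
scaler = StandardScaler().fit(X)

# A lone transformer converts by itself; if the output name were omitted, the
# pipeline converter above would default it to "transformed_features".
mlmodel = coremltools.converters.sklearn.convert(scaler, "input", "scaled")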
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ... import SPECIFICATION_VERSION from ..._deps import _HAS_SKLEARN from ...models import MLModel as _MLModel from ...models._interface_management import \ set_transform_interface_params as _set_transform_interface_params from ...proto import Model_pb2 as _Model_pb2 if _HAS_SKLEARN: from sklearn.preprocessing import StandardScaler from . import _sklearn_util sklearn_class = StandardScaler model_type = "transformer" def convert(model, input_features, output_features): """Convert a StandardScaler model to the protobuf spec. Parameters ---------- model: StandardScaler A fitted StandardScaler model. input_features: str Name of the input column. output_features: str Name of the output column. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) # Test the scikit-learn model _sklearn_util.check_expected_type(model, StandardScaler) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "mean_")) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "scale_")) # Set the interface params. spec = _Model_pb2.Model() spec.specificationVersion = SPECIFICATION_VERSION spec = _set_transform_interface_params(spec, input_features, output_features) # Set the parameters tr_spec = spec.scaler for x in model.mean_: tr_spec.shiftValue.append(-x) for x in model.scale_: tr_spec.scaleValue.append(1.0 / x) return _MLModel(spec) def update_dimension(model, input_dimension): if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "mean_")) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "scale_")) # Nothing to do for this model return input_dimension def get_input_dimension(model): if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "mean_")) _sklearn_util.check_fitted(model, lambda m: hasattr(m, "scale_")) return len(model.mean_) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_svm_common.py0000644000000000000000000000227214672066616023563 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause """ Common stuff for SVMs """ def _set_kernel(model, spec): """ Takes the sklearn SVM model and returns the spec with the protobuf kernel for that model. """ def gamma_value(model): return model._gamma result = None if model.kernel == "linear": spec.kernel.linearKernel.MergeFromString( b"" ) # hack to set kernel to an empty type elif model.kernel == "rbf": spec.kernel.rbfKernel.gamma = gamma_value(model) elif model.kernel == "poly": spec.kernel.polyKernel.gamma = gamma_value(model) spec.kernel.polyKernel.c = model.coef0 spec.kernel.polyKernel.degree = model.degree elif model.kernel == "sigmoid": spec.kernel.sigmoidKernel.gamma = gamma_value(model) spec.kernel.sigmoidKernel.c = model.coef0 else: raise ValueError( "Unsupported kernel. The following kernels are supported: linear, RBF, polynomial and sigmoid."
) return result ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/sklearn/_tree_ensemble.py0000644000000000000000000001747614672066616024233 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ..._deps import _HAS_SKLEARN from ...models._feature_management import process_or_validate_features from ...models.tree_ensemble import (TreeEnsembleClassifier, TreeEnsembleRegressor) if _HAS_SKLEARN: from sklearn.tree import _tree def _get_value(scikit_value, mode="regressor", scaling=1.0, n_classes=2, tree_index=0): """ Get the right value from the scikit-tree """ # Regression if mode == "regressor": return scikit_value[0] * scaling # Binary classification if n_classes == 2: # Decision tree if len(scikit_value[0]) != 1: value = scikit_value[0][1] * scaling / scikit_value[0].sum() # boosted tree else: value = scikit_value[0][0] * scaling if value == 0.5: value = value - 1e-7 # Multiclass classification else: # Decision tree if len(scikit_value[0]) != 1: value = scikit_value[0] / scikit_value[0].sum() # boosted tree else: value = {tree_index: scikit_value[0] * scaling} return value def _recurse( coreml_tree, scikit_tree, tree_id, node_id, scaling=1.0, mode="regressor", n_classes=2, tree_index=0, ): """Traverse through the tree and append to the tree spec. """ if not (_HAS_SKLEARN): raise RuntimeError( "scikit-learn not found. scikit-learn conversion API is disabled." ) ## Recursion should not be called on the leaf node. if node_id == _tree.TREE_LEAF: raise ValueError("Invalid node_id %s" % _tree.TREE_LEAF) # Add a branch node to the tree if scikit_tree.children_left[node_id] != _tree.TREE_LEAF: branch_mode = "BranchOnValueLessThanEqual" feature_index = scikit_tree.feature[node_id] feature_value = scikit_tree.threshold[node_id] left_child_id = scikit_tree.children_left[node_id] right_child_id = scikit_tree.children_right[node_id] # Add a branch node coreml_tree.add_branch_node( tree_id, node_id, feature_index, feature_value, branch_mode, left_child_id, right_child_id, ) # Now recurse _recurse( coreml_tree, scikit_tree, tree_id, left_child_id, scaling, mode, n_classes, tree_index, ) _recurse( coreml_tree, scikit_tree, tree_id, right_child_id, scaling, mode, n_classes, tree_index, ) # Add a leaf node to the tree else: # Get the scikit-learn value if scikit_tree.n_outputs != 1: raise ValueError("Expected only 1 output in the scikit-learn tree.") value = _get_value( scikit_tree.value[node_id], mode, scaling, n_classes, tree_index ) coreml_tree.add_leaf_node(tree_id, node_id, value) def get_input_dimension(model): if hasattr(model, "n_features_in_"): return model.n_features_in_ elif hasattr(model, "n_features_"): return model.n_features_ elif hasattr(model, "n_estimators"): if model.n_estimators == 0: raise ValueError("model not trained.") try: estimator = model.estimators_[0] if hasattr(estimator, "n_features_in_"): return estimator.n_features_in_ return estimator.n_features_ except IndexError: raise ValueError("Model not trained or invalid model.") else: raise ValueError("Unable to obtain input dimension from model.") def convert_tree_ensemble( model, input_features, output_features=("predicted_class", float), mode="regressor", base_prediction=None, class_labels=None, post_evaluation_transform=None, ): """ Convert a generic tree regressor 
model to the protobuf spec. This currently supports: * Decision tree regression * Gradient boosted tree regression * Random forest regression * Decision tree classifier. * Gradient boosted tree classifier. * Random forest classifier. ---------- Parameters model: [DecisionTreeRegressor | GradientBoostingRegression | RandomForestRegressor] A scikit learn tree model. feature_names : list of strings, optional (default=None) Names of each of the features. target: str Name of the output column. base_prediction: double Base prediction value. mode: str in ['regressor', 'classifier'] Mode of the tree model. class_labels: list[int] List of classes post_evaluation_transform: list[int] Post evaluation transform Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ num_dimensions = get_input_dimension(model) features = process_or_validate_features(input_features, num_dimensions) n_classes = None if mode == "classifier": n_classes = model.n_classes_ if class_labels is None: class_labels = range(n_classes) else: if len(class_labels) != n_classes: raise ValueError( "Number of classes in model (%d) does not match " "length of supplied class list (%d)." % (n_classes, len(class_labels)) ) coreml_tree = TreeEnsembleClassifier( input_features, class_labels, output_features ) if post_evaluation_transform is not None: coreml_tree.set_post_evaluation_transform(post_evaluation_transform) # Base prediction not provided if base_prediction is None: if n_classes == 2: base_prediction = [0.0] else: base_prediction = [0.0 for c in range(n_classes)] coreml_tree.set_default_prediction_value(base_prediction) else: if base_prediction is None: base_prediction = 0.0 coreml_tree = TreeEnsembleRegressor(input_features, output_features) coreml_tree.set_default_prediction_value(base_prediction) # Single tree if hasattr(model, "tree_"): _recurse( coreml_tree, model.tree_, tree_id=0, node_id=0, mode=mode, n_classes=n_classes, ) # Multiple trees elif hasattr(model, "estimators_"): is_ensembling_in_separate_trees = False if type(model.estimators_) != list: is_ensembling_in_separate_trees = ( len(model.estimators_.shape) > 0 and model.estimators_.shape[1] > 1 ) estimators = model.estimators_.flatten() else: estimators = model.estimators_ scaling = ( model.learning_rate if hasattr(model, "learning_rate") else 1.0 / len(estimators) ) for tree_id, base_model in enumerate(estimators): if is_ensembling_in_separate_trees: tree_index = tree_id % n_classes else: tree_index = 0 _recurse( coreml_tree, base_model.tree_, tree_id, node_id=0, scaling=scaling, mode=mode, n_classes=n_classes, tree_index=tree_index, ) else: raise TypeError("Unknown scikit-learn tree model type.") return coreml_tree.spec ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2575471 coremltools-8.0/coremltools/converters/xgboost/0000755000000000000000000000000014672075535020720 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/xgboost/__init__.py0000644000000000000000000000036314672066616023033 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ._tree import convert ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/xgboost/_tree.py0000644000000000000000000000531014672066616022367 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools import __version__ as ct_version from coremltools.models import _METADATA_SOURCE, _METADATA_VERSION from ...models import MLModel as _MLModel from ._tree_ensemble import convert_tree_ensemble as _convert_tree_ensemble def convert( model, feature_names=None, target="target", force_32bit_float=True, mode="regressor", class_labels=None, n_classes=None, ): """ Convert a trained XGBoost model to Core ML format. Parameters ---------- decision_tree : Booster A trained XGboost tree model. feature_names: [str] | str Names of input features that will be exposed in the Core ML model interface. Can be set to one of the following: - ``None`` for using the feature names from the model. - List of names of the input features that should be exposed in the interface to the Core ML model. These input features are in the same order as the XGboost model. target: str Name of the output feature name exposed to the Core ML model. force_32bit_float: bool If ``True``, then the resulting CoreML model will use 32 bit floats internally. mode: str in ['regressor', 'classifier'] Mode of the tree model. class_labels: list[int] or None List of classes. When set to None, the class labels are just the range from 0 to ``n_classes - 1``. n_classes: int or None Number of classes in classification. When set to ``None``, the number of classes is expected from the model or ``class_labels`` should be provided. Returns ------- model:MLModel Returns an MLModel instance representing a Core ML model. Examples -------- .. sourcecode:: python # Convert it with default input and output names >>> import coremltools >>> coreml_model = coremltools.converters.xgboost.convert(model) # Saving the Core ML model to a file. >>> coreml_model.save('my_model.mlmodel') """ model = _MLModel( _convert_tree_ensemble( model, feature_names, target, force_32bit_float=force_32bit_float, mode=mode, class_labels=class_labels, n_classes=n_classes, ) ) from xgboost import __version__ as xgboost_version model.user_defined_metadata[_METADATA_VERSION] = ct_version model.user_defined_metadata[_METADATA_SOURCE] = "xgboost=={0}".format( xgboost_version ) return model ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/converters/xgboost/_tree_ensemble.py0000644000000000000000000002660614672066616024254 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from copy import deepcopy import numpy as _np from ..._deps import _HAS_XGBOOST from ...models.tree_ensemble import TreeEnsembleClassifier from ...models.tree_ensemble import \ TreeEnsembleRegressor as _TreeEnsembleRegressor if _HAS_XGBOOST: import xgboost as _xgboost def recurse_json( mlkit_tree, xgb_tree_json, tree_id, node_id, feature_map, force_32bit_float, mode="regressor", tree_index=0, n_classes=2, ): """Traverse through the tree and append to the tree spec. """ relative_hit_rate = None try: relative_hit_rate = xgb_tree_json["cover"] except KeyError: pass # Fill node attributes if "leaf" not in xgb_tree_json: branch_mode = "BranchOnValueLessThan" split_name = xgb_tree_json["split"] feature_index = split_name if not feature_map else feature_map[split_name] # xgboost internally uses float32, but the parsing from json pulls it out # as a 64bit double. To trigger the internal float32 detection in the # tree ensemble compiler, we need to explicitly cast it to a float 32 # value, then back to the 64 bit float that protobuf expects. This is # controlled with the force_32bit_float flag. feature_value = xgb_tree_json["split_condition"] if force_32bit_float: feature_value = float(_np.float32(feature_value)) true_child_id = xgb_tree_json["yes"] false_child_id = xgb_tree_json["no"] # Get the missing value behavior correct missing_value_tracks_true_child = False try: if xgb_tree_json["missing"] == true_child_id: missing_value_tracks_true_child = True except KeyError: pass mlkit_tree.add_branch_node( tree_id, node_id, feature_index, feature_value, branch_mode, true_child_id, false_child_id, relative_hit_rate=relative_hit_rate, missing_value_tracks_true_child=missing_value_tracks_true_child, ) else: value = xgb_tree_json["leaf"] if force_32bit_float: value = float(_np.float32(value)) if mode == "classifier" and n_classes > 2: value = {tree_index: value} mlkit_tree.add_leaf_node( tree_id, node_id, value, relative_hit_rate=relative_hit_rate ) # Now recurse if "children" in xgb_tree_json: for child in xgb_tree_json["children"]: recurse_json( mlkit_tree, child, tree_id, child["nodeid"], feature_map, force_32bit_float, mode=mode, tree_index=tree_index, n_classes=n_classes, ) def convert_tree_ensemble( model, feature_names, target, force_32bit_float, mode="regressor", class_labels=None, n_classes=None, ): """Convert a generic tree model to the protobuf spec. This currently supports: * Decision tree regression Parameters ---------- model: str | Booster Path on disk where the XGboost JSON representation of the model is or a handle to the XGboost model. feature_names : list of strings or None Names of each of the features. When set to None, the feature names are extracted from the model. target: str, Name of the output column. force_32bit_float: bool If True, then the resulting CoreML model will use 32 bit floats internally. mode: str in ['regressor', 'classifier'] Mode of the tree model. class_labels: list[int] or None List of classes. When set to None, the class labels are just the range from 0 to n_classes - 1. n_classes: int or None Number of classes in classification. When set to None, the number of classes is expected from the model or class_labels should be provided. Returns ------- model_spec: An object of type Model_pb. Protobuf representation of the model """ if not (_HAS_XGBOOST): raise RuntimeError("xgboost not found. 
xgboost conversion API is disabled.") accepted_modes = ["regressor", "classifier"] if mode not in accepted_modes: raise ValueError("mode should be in %s" % accepted_modes) import json import os feature_map = None if isinstance( model, (_xgboost.core.Booster, _xgboost.XGBRegressor, _xgboost.XGBClassifier) ): model = _booster_feature_names_workaround(model, feature_names) # Testing a few corner cases that we don't support if isinstance(model, _xgboost.XGBRegressor): if mode == "classifier": raise ValueError("mode is classifier but provided a regressor") try: objective = model.get_xgb_params()["objective"] except: objective = None if objective in ["reg:gamma", "reg:tweedie"]: raise ValueError( "Regression objective '%s' not supported for export." % objective ) if isinstance(model, _xgboost.XGBClassifier): if mode == "regressor": raise ValueError("mode is regressor but provided a classifier") n_classes = model.n_classes_ if class_labels is not None: if len(class_labels) != n_classes: raise ValueError( "Number of classes in model (%d) does not match " "length of supplied class list (%d)." % (n_classes, len(class_labels)) ) else: class_labels = list(range(n_classes)) # Now use the booster API. if isinstance(model, (_xgboost.XGBRegressor, _xgboost.XGBClassifier)): # Name change in 0.7 if hasattr(model, "get_booster"): model = model.get_booster() else: model = model.booster() # Xgboost sometimes has feature names in there. Sometimes does not. if (feature_names is None) and (model.feature_names is None): raise ValueError( "The XGBoost model does not have feature names. They must be provided in convert method." ) feature_names = model.feature_names if feature_names is None: feature_names = model.feature_names xgb_model_str = model.get_dump(with_stats=True, dump_format="json") if model.feature_names: feature_map = {f: i for i, f in enumerate(model.feature_names)} # Path on the file system where the XGboost model exists. elif isinstance(model, str): if not os.path.exists(model): raise TypeError("Invalid path %s." % model) with open(model) as f: xgb_model_str = json.load(f) if feature_names is None: raise ValueError( "feature names must be provided in convert method if the model is a path on file system." ) else: feature_map = {f: i for i, f in enumerate(feature_names)} else: raise TypeError("Unexpected type. 
Expecting XGBoost model.") if mode == "classifier": if n_classes is None and class_labels is None: raise ValueError( "You must provide class_labels or n_classes when not providing the XGBClassifier" ) elif n_classes is None: n_classes = len(class_labels) elif class_labels is None: class_labels = range(n_classes) if n_classes == 2: # if we have only 2 classes we only have one sequence of estimators base_prediction = [0.0] else: base_prediction = [0.0 for c in range(n_classes)] # target here is the equivalent of output_features in scikit learn mlkit_tree = TreeEnsembleClassifier(feature_names, class_labels, target) mlkit_tree.set_default_prediction_value(base_prediction) if n_classes == 2: mlkit_tree.set_post_evaluation_transform("Regression_Logistic") else: mlkit_tree.set_post_evaluation_transform("Classification_SoftMax") else: mlkit_tree = _TreeEnsembleRegressor(feature_names, target) mlkit_tree.set_default_prediction_value(0.5) for xgb_tree_id, xgb_tree_str in enumerate(xgb_model_str): if mode == "classifier" and n_classes > 2: tree_index = xgb_tree_id % n_classes else: tree_index = 0 try: # this means that the xgb_tree_str is a json dump and needs to be loaded xgb_tree_json = json.loads(xgb_tree_str) except: # this means that the xgb_tree_str is loaded from a path in file system already and does not need to be reloaded xgb_tree_json = xgb_tree_str recurse_json( mlkit_tree, xgb_tree_json, xgb_tree_id, node_id=0, feature_map=feature_map, force_32bit_float=force_32bit_float, mode=mode, tree_index=tree_index, n_classes=n_classes, ) return mlkit_tree.spec def _booster_feature_names_workaround(model, feature_names): """ Removes booster feature names. This is intended as a workaround for faulty JSON dump generated by get_dump(dump_format='json') and dump_model(..., dump_format='json') methods in XGBoost. 
Parameters ---------- model: Booster or XGBRegressor or XGBClassifier model from which feature names are to be removed feature_names: list list of feature names Returns ------- Booster or XGBRegressor or XGBClassifier a copy of a model with removed feature names """ # make sure feature names are not None assert feature_names is not None if isinstance(model, _xgboost.core.Booster): # if feature names are already nulled return booster if model.feature_names is None: return model # copy booster to avoid modifying the original model_copy = model.copy() # make sure feature names match _np.testing.assert_array_equal(model.feature_names, feature_names), \ ValueError('`feature_names` param does not match booster feature names') # Remove feature names from the booster model_copy.feature_names = None elif isinstance(model, (_xgboost.XGBRegressor, _xgboost.XGBClassifier)): # if feature names are already nulled return booster if model.get_booster().feature_names is None: return model # copy estimator with deepcopy to get identical object without modifying the original model_copy = deepcopy(model) # make sure feature names match _np.testing.assert_array_equal(model.get_booster().feature_names, feature_names), \ ValueError('`feature_names` param does not match booster feature names') # Remove feature names from the sklearn wrapper model_copy.get_booster().feature_names = None else: raise TypeError(f"Invalid model object type: {type(model)}") return model_copy ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1726511965.261547 coremltools-8.0/coremltools/models/0000755000000000000000000000000014672075535016324 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/__init__.py0000644000000000000000000000221414672066616020434 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from . import array_feature_extractor from . import datatypes from . import _feature_management from . import nearest_neighbors from . import pipeline from . import tree_ensemble from . import feature_vectorizer from . import _interface_management from .model import MLModel from .model import ( _MLMODEL_FULL_PRECISION, _MLMODEL_HALF_PRECISION, _MLMODEL_QUANTIZED, _VALID_MLMODEL_PRECISION_TYPES, _SUPPORTED_QUANTIZATION_MODES, _QUANTIZATION_MODE_LINEAR_QUANTIZATION, _QUANTIZATION_MODE_LINEAR_SYMMETRIC, _QUANTIZATION_MODE_LOOKUP_TABLE_LINEAR, _QUANTIZATION_MODE_LOOKUP_TABLE_KMEANS, _QUANTIZATION_MODE_CUSTOM_LOOKUP_TABLE, _QUANTIZATION_MODE_DEQUANTIZE, _LUT_BASED_QUANTIZATION, _QUANTIZATION_MODE_DEQUANTIZE, _METADATA_VERSION, _METADATA_SOURCE, _METADATA_SOURCE_DIALECT, ) from . import neural_network from . import ml_program from ._compiled_model import CompiledMLModel ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/_compiled_model.py0000644000000000000000000001307414672066616022016 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from os.path import expanduser as _expanduser from typing import Optional as _Optional from coremltools import ComputeUnit as _ComputeUnit from coremltools.models.model import MLState as _MLState from .model import ( _verify_optimization_hint_input, MLModel as _MLModel, ) from .utils import _macos_version try: from ..libcoremlpython import _MLModelProxy except: _MLModelProxy = None class CompiledMLModel: @staticmethod def _init_check( path: str, compute_units: _ComputeUnit, function_name: str, optimization_hints: _Optional[dict] = None, ): if _macos_version() < (10, 13): raise Exception("Loading compiled Core ML models is only support on macOS 10.13 or higher.") if _MLModelProxy is None: raise Exception("Unable to load any compiled models. This is most likely because" " coremltools was installed from an egg rather than a wheel.") if not isinstance(path, str): raise TypeError('The "path" parameter must be of type "str".') if not isinstance(compute_units, _ComputeUnit): raise TypeError('The "compute_units" parameter must be of type: "coremltools.ComputeUnit".') if not isinstance(function_name, str): raise TypeError('The "function_name" parameter must be of type "str".') _verify_optimization_hint_input(optimization_hints) def __init__( self, path: str, compute_units: _ComputeUnit = _ComputeUnit.ALL, function_name: _Optional[str] = None, optimization_hints: _Optional[dict] = None, ): """ Loads a compiled Core ML model. Parameters ---------- path : str The path to a compiled model directory, ending in ``.mlmodelc``. compute_units : coremltools.ComputeUnit An enum with the following possible values: - ``coremltools.ComputeUnit.ALL``: Use all compute units available, including the neural engine. - ``coremltools.ComputeUnit.CPU_ONLY``: Limit the model to only use the CPU. - ``coremltools.ComputeUnit.CPU_AND_GPU``: Use both the CPU and GPU, but not the neural engine. - ``coremltools.ComputeUnit.CPU_AND_NE``: Use both the CPU and neural engine, but not the GPU. Available only for macOS >= 13.0. optimization_hints : dict or None Keys are the names of the optimization hint, either 'reshapeFrequency' or 'specializationStrategy'. Values are enumeration values of type ``coremltools.ReshapeFrequency`` or ``coremltools.SpecializationStrategy``. Examples -------- .. sourcecode:: python my_compiled_model = ct.models.CompiledMLModel("my_model_path.mlmodelc") y = my_compiled_model.predict({"x": 3}) See Also -------- predict """ if function_name is None: function_name = "" self._init_check(path, compute_units, function_name, optimization_hints) self.compute_unit = compute_units self.function_name = function_name if optimization_hints is not None: self.optimization_hints = optimization_hints.copy() else: self.optimization_hints = None path = _expanduser(path) if self.optimization_hints is not None: optimization_hints_str_vals = {k: v.name for k, v in self.optimization_hints.items()} else: optimization_hints_str_vals = {} self._proxy = _MLModelProxy(path, compute_units.name, function_name, optimization_hints_str_vals) def predict(self, data, state: _Optional[_MLState] = None): """ Return predictions for the model. Parameters ---------- data: dict[str, value] or list[dict[str, value]] Dictionary of data to use for predictions, where the keys are the names of the input features. For batch predictons, use a list of such dictionaries. 
state : MLState Optional state object as returned by ``make_state()``. Returns ------- dict[str, value] Predictions as a dictionary where each key is the output feature name. list[dict[str, value]] For batch prediction, returns a list of the above dictionaries. Examples -------- .. sourcecode:: python data = {"bedroom": 1.0, "bath": 1.0, "size": 1240} predictions = model.predict(data) data = [ {"bedroom": 1.0, "bath": 1.0, "size": 1240}, {"bedroom": 4.0, "bath": 2.5, "size": 2400}, ] batch_predictions = model.predict(data) """ _MLModel._check_predict_data(data) return _MLModel._get_predictions( self._proxy, _MLModel._update_float16_multiarray_input_to_float32, data, state ) def make_state(self) -> _MLState: """ Returns a new state object, which can be passed to the ``predict`` method. Examples -------- .. sourcecode:: python state = model.make_state() predictions = model.predict(x, state) See Also -------- predict """ return _MLState(self._proxy.newState()) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/_deprecation.py0000644000000000000000000000217514672066616021337 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import functools import warnings def deprecated(obj=None, suffix="", version="", obj_prefix=""): """ Decorator to mark a function or a class as deprecated """ def decorator_deprecation_warning(obj): @functools.wraps(obj) def wrapped(*args, **kwargs): if isinstance(obj, type): msg = ( f"Class {obj_prefix}{obj.__name__} is deprecated and will be removed in {version}." ) else: msg = ( f"Function {obj_prefix}{obj.__name__} is deprecated and will be removed in {version}." ) if suffix: msg += f"; {suffix}" warnings.warn(msg, category=DeprecationWarning) return obj(*args, **kwargs) return wrapped if obj is None: return decorator_deprecation_warning return decorator_deprecation_warning(obj) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/_feature_management.py0000644000000000000000000002676214672066616022701 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import operator as op from collections import defaultdict from copy import copy from functools import reduce import numpy as _np from . import datatypes def process_or_validate_classifier_output_features( output_features, class_labels, supports_class_scores=True ): """ Given a list of class labels and a list of output_features, validate the list and return a valid version of output_features with all the correct data type information included. """ def raise_error(msg): raise ValueError("Classifier error: %s" % msg) class_labels = list(class_labels) # First, we need to determine the type of the classes. 
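    # Integer-like labels (Python int/bool and the numpy integer/bool types
    # listed below) produce an Int64 output class feature; string labels
    # produce a String output class feature. Mixed int/string label sets are
    # rejected.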
_int_types = (int, bool, _np.bool_, _np.int32, _np.int64) if all(isinstance(cl, _int_types) for cl in class_labels): output_class_type = datatypes.Int64() elif all(isinstance(cl, str) for cl in class_labels): output_class_type = datatypes.String() else: raise ValueError("Class labels must be all of type int or all of type string.") if output_features is None: out = [("classLabel", output_class_type)] if supports_class_scores: out += [("classProbability", datatypes.Dictionary(output_class_type))] elif isinstance(output_features, str): out = [(output_features, output_class_type)] if supports_class_scores: out += [("classProbability", datatypes.Dictionary(output_class_type))] elif ( isinstance(output_features, (list, tuple)) and all(isinstance(fn, str) for fn in output_features) and len(output_features) == 2 ): if supports_class_scores: out = [ (output_features[0], output_class_type), (output_features[1], datatypes.Dictionary(output_class_type)), ] else: raise ValueError( "Classifier model (as trained) does not support output scores for classes." ) elif is_valid_feature_list(output_features): output_features = [ (k, datatypes._normalize_datatype(dt)) for k, dt in output_features ] if len(output_features) == 1 or not supports_class_scores: if not output_features[0][1] == output_class_type: raise ValueError( "Type of output class feature does not match type of class labels." ) else: # Make sure the first two output features specified give the output # class field and the output class scores dictionary field if isinstance(output_features[0][1], datatypes.Dictionary) and isinstance( output_features[1][1], output_class_type ): output_features[0], output_features[1] = ( output_features[1], output_features[0], ) if not isinstance(output_features[1][1], datatypes.Dictionary): raise_error("Output features class scores should be dictionary type.") if output_features[1][1].key_type != output_class_type: raise_error( "Class scores dictionary key type does not match type of class labels." ) if output_features[0][1] != output_class_type: raise_error( "Specified type of output class does not match type of class labels." ) # NOTE: We are intentionally allowing the case where additional fields are allowed # beyond the original two features. out = output_features else: raise_error("Form of output features not recognized") return out def is_valid_feature_list(features): # Just test all the ways this could be return ( type(features) is list and len(features) >= 1 and all(type(t) is tuple and len(t) == 2 for t in features) and all(isinstance(n, str) for n, td in features) and all(datatypes._is_valid_datatype(td) for n, td in features) ) def dimension_of_array_features(features): if not is_valid_feature_list(features): raise ValueError("Expected feature list in valid form.") dim = 0 for n, td in features: if isinstance(td, (datatypes.Int64, datatypes.Double)): dim += 1 elif isinstance(td, datatypes.Array): dim += reduce(op.mul, td.dimensions, 1) else: raise ValueError( "Unable to determine number of dimensions from feature list." ) return dim def process_or_validate_features(features, num_dimensions=None, feature_type_map={}): """ Puts features into a standard form from a number of different possible forms. The standard form is a list of 2-tuples of (name, datatype) pairs. The name is a string and the datatype is an object as defined in the _datatype module. The possible input forms are as follows: * A list of strings. in this case, the overall dimension is assumed to be the length of the list. 
If neighboring names are identical, they are assumed to be an input array of that length. For example: ["a", "b", "c"] resolves to [("a", Double), ("b", Double), ("c", Double)]. And: ["a", "a", "b"] resolves to [("a", Array(2)), ("b", Double)]. * A dictionary of keys to indices or ranges of feature indices. In this case, it's presented as a mapping from keys to indices or ranges of contiguous indices. For example, {"a" : 0, "b" : [2,3], "c" : 1} Resolves to [("a", Double), ("c", Double), ("b", Array(2))]. Note that the ordering is determined by the indices. * A single string. In this case, the input is assumed to be a single array, with the number of dimensions set using num_dimensions. Notes: If the features variable is in the standard form, it is simply checked and returned. If num_dimensions is given, it is used to check against the existing features, or fill in missing information in the case when features is a single string. """ original_features = copy(features) if num_dimensions is not None and not isinstance(num_dimensions, int): raise TypeError( "num_dimensions must be None, an integer or a long, not '%s'" % str(type(num_dimensions)) ) def raise_type_error(additional_msg): raise TypeError( "Error processing feature list: %s\nfeatures = %s" % (additional_msg, str(original_features)) ) if type(features) is dict and is_valid_feature_list(features.items()): features = features.items() # First, see if the features are already in the correct form. If they are, # then we if is_valid_feature_list(features): if num_dimensions is not None: try: feature_dims = dimension_of_array_features(features) except ValueError: feature_dims = None if feature_dims is not None and feature_dims != num_dimensions: raise_type_error("Dimension mismatch.") # We may need to translate some parts of this back to the actual # datatype class -- e.g. translate str to datatypes.String(). return [(k, datatypes._normalize_datatype(dt)) for k, dt in features] if isinstance(features, str): if num_dimensions is None: raise_type_error( "If a single feature name is given, then " "num_dimensions must be provided." ) features = {features: range(num_dimensions)} if isinstance(features, (list, tuple, _np.ndarray)): # Change this into a dictionary mapping = defaultdict(lambda: []) for i, k in enumerate(features): if not isinstance(k, str): raise_type_error( "List of feature names must either be a list of strings, or a list of (name, datatypes.Array instance) tuples." ) if num_dimensions is not None and len(features) != num_dimensions: raise_type_error( ("List of feature names has wrong length; " "%d required, %d provided.") % (num_dimensions, len(features)) ) for i, k in enumerate(features): mapping[k].append(i) # Replace the features features = mapping if not isinstance(features, dict): raise_type_error( "features must be either a list of feature names " "or a dictionary of feature names to ranges." ) # We'll be invasive here so make a copy. features = copy(features) for k, v in list(features.items()): if not isinstance(k, str): raise_type_error("Feature names must be strings.") def test_index(val): error = False try: if val != int(val): error = True except: error = True if error: raise_type_error( "Specified indices for feature %s must be integers." % k ) if val < 0 or (num_dimensions is not None and val >= num_dimensions): raise_type_error("Index in feature %s out of range." 
% k) iterable_types = [tuple, list, set] iterable_types.append(range) if isinstance(v, tuple(iterable_types)): for idx in v: test_index(idx) # Replace and update features[k] = v = list(sorted(v)) elif isinstance(v, int): test_index(v) features[k] = v = [v] else: raise_type_error( ( "Value type for feature %s not recognized; " "values must be either integers, lists or range objects." ) % k ) # check to make sure things are contiguous if v != list(range(v[0], v[-1] + 1)): raise_type_error( "Index list for feature %s must consist of " "a contiguous range of indices." % k ) if len(set(v)) != len(v): raise_type_error("Index list for feature %s contains duplicates." % k) # Now, set num dimensions from the list if it's actually None if num_dimensions is None: from itertools import chain num_dimensions = 1 + max(chain.from_iterable(features.values())) if ( set().union(*features.values()) != set(range(num_dimensions)) or sum(len(v) for v in features.values()) != num_dimensions ): raise_type_error( "Supplied indices must cover entire range of 0, ..., num_dimensions-1." ) # Define the output feature types output_features = [None] * len(features) # Finally, go through and map all these things out as types. # Sort by first value of the index range. for i, (k, v) in enumerate(sorted(features.items(), key=lambda t: t[1][0])): if k in feature_type_map: output_features[i] = (k, feature_type_map[k]) elif len(v) == 1: output_features[i] = (k, datatypes.Double()) else: output_features[i] = (k, datatypes.Array(len(v))) return output_features ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/_interface_management.py0000644000000000000000000001563414672066616023202 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ..proto import Model_pb2 from . import _feature_management as _fm from . import datatypes def set_classifier_interface_params( spec, features, class_labels, model_accessor_for_class_labels, output_features=None, training_features=None, ): """ Common utilities to set the regression interface params. """ # Normalize the features list. features = _fm.process_or_validate_features(features) if class_labels is None: raise ValueError("List of class labels must be provided.") n_classes = len(class_labels) output_features = _fm.process_or_validate_classifier_output_features( output_features, class_labels ) if len(output_features) == 1: predicted_class_output, pred_cl_type = output_features[0] score_output = None elif len(output_features) == 2: predicted_class_output, pred_cl_type = output_features[0] score_output, score_output_type = output_features[1] else: raise ValueError( "Provided output classes for a classifier must be " "a list of features, predicted class and (optionally) class_score." ) spec.description.predictedFeatureName = predicted_class_output # Are they out of order? if not (pred_cl_type == datatypes.Int64() or pred_cl_type == datatypes.String()): raise ValueError( "Provided predicted class output type not Int64 or String (%s)." % repr(pred_cl_type) ) if score_output is not None: if not isinstance(score_output_type, datatypes.Dictionary): raise ValueError( "Provided class score output type not a Dictionary (%s)." 
% repr(score_output_type) ) if score_output_type.key_type != pred_cl_type: raise ValueError( ( "Provided class score output (%s) key_type (%s) does not " "match type of class prediction (%s)." ) % (score_output, repr(score_output_type.key_type), repr(pred_cl_type)) ) spec.description.predictedProbabilitiesName = score_output # add input for index, (cur_input_name, input_type) in enumerate(features): input_ = spec.description.input.add() input_.name = cur_input_name datatypes._set_datatype(input_.type, input_type) # add output for index, (cur_output_name, output_type) in enumerate(output_features): output_ = spec.description.output.add() output_.name = cur_output_name datatypes._set_datatype(output_.type, output_type) # Add training features if training_features is not None: spec = set_training_features(spec, training_features) # Worry about the class labels if pred_cl_type == datatypes.String(): try: for c in class_labels: getattr( spec, model_accessor_for_class_labels ).stringClassLabels.vector.append(str(c)) # Not all the classifiers have class labels; in particular the pipeline # classifier. Thus it's not an error if we can't actually set them. except AttributeError: pass else: for c in class_labels: conv_error = False try: if not (int(c) == c): conv_error = True except: conv_error = True if conv_error: raise TypeError( ("Cannot cast '%s' class to an int type " % str(c)) + "(class type determined by type of first class)." ) try: getattr( spec, model_accessor_for_class_labels ).int64ClassLabels.vector.append(int(c)) # Not all the classifiers have class labels; in particular the pipeline # classifier. Thus it's not an error if we can't actually set them. except AttributeError: break # And we are done! return spec def set_regressor_interface_params( spec, features, output_features, training_features=None ): """ Common utilities to set the regressor interface params. """ if output_features is None: output_features = [("predicted_class", datatypes.Double())] else: output_features = _fm.process_or_validate_features(output_features, 1) if len(output_features) != 1: raise ValueError( "Provided output features for a regressor must be " "one Double feature." ) if output_features[0][1] != datatypes.Double(): raise ValueError("Output type of a regressor must be a Double.") prediction_name = output_features[0][0] spec.description.predictedFeatureName = prediction_name # Normalize the features list. features = _fm.process_or_validate_features(features) # add input and output features for cur_input_name, feature_type in features: input_ = spec.description.input.add() input_.name = cur_input_name datatypes._set_datatype(input_.type, feature_type) # Add training features if training_features is not None: spec = set_training_features(spec, training_features) output_ = spec.description.output.add() output_.name = prediction_name datatypes._set_datatype(output_.type, "Double") return spec def set_transform_interface_params( spec, input_features, output_features, are_optional=False, training_features=None, array_datatype=Model_pb2.ArrayFeatureType.DOUBLE, ): """ Common utilities to set transform interface params. 
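    Both ``input_features`` and ``output_features`` are run through
    ``process_or_validate_features``, so any of the forms accepted there may
    be used. An illustrative sketch (hypothetical feature names, not part of
    the original source):

    .. sourcecode:: python

        spec = Model_pb2.Model()
        spec = set_transform_interface_params(
            spec,
            input_features=[("x", datatypes.Array(3))],
            output_features=[("y", datatypes.Array(3))],
        )
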
""" input_features = _fm.process_or_validate_features(input_features) output_features = _fm.process_or_validate_features(output_features) # Add input and output features for (fname, ftype) in input_features: input_ = spec.description.input.add() input_.name = fname datatypes._set_datatype(input_.type, ftype, array_datatype=array_datatype) if are_optional: input_.type.isOptional = are_optional for (fname, ftype) in output_features: output_ = spec.description.output.add() output_.name = fname datatypes._set_datatype(output_.type, ftype, array_datatype=array_datatype) # Add training features if training_features is not None: spec = set_training_features(spec, training_features) return spec def set_training_features(spec, training_features): for (fname, ftype) in training_features: training_input_ = spec.description.trainingInput.add() training_input_.name = fname if ftype: datatypes._set_datatype(training_input_.type, ftype) return spec ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/array_feature_extractor.py0000644000000000000000000000374214672066616023630 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from .. import SPECIFICATION_VERSION from ..proto import Model_pb2 as _Model_pb2 from . import datatypes from ._interface_management import set_transform_interface_params def create_array_feature_extractor( input_features, output_name, extract_indices, output_type=None ): """ Creates a feature extractor from an input array ``(feature, return)``. Parameters ---------- input_features: A list of one ``(name, array)`` tuple. extract_indices: Either an integer or a list. If it's an integer, the output type is by default a double (but may also be an integer). If a list, the output type is an array. """ # Make sure that our starting stuff is in the proper form. assert len(input_features) == 1 assert isinstance(input_features[0][1], datatypes.Array) # Create the model. spec = _Model_pb2.Model() spec.specificationVersion = SPECIFICATION_VERSION if isinstance(extract_indices, int): extract_indices = [extract_indices] if output_type is None: output_type = datatypes.Double() elif isinstance(extract_indices, (list, tuple)): if not all(isinstance(x, int) for x in extract_indices): raise TypeError("extract_indices must be an integer or a list of integers.") if output_type is None: output_type = datatypes.Array(len(extract_indices)) else: raise TypeError("extract_indices must be an integer or a list of integers.") output_features = [(output_name, output_type)] for idx in extract_indices: assert idx < input_features[0][1].num_elements spec.arrayFeatureExtractor.extractIndex.append(idx) set_transform_interface_params(spec, input_features, output_features) return spec ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/datatypes.py0000644000000000000000000001515114672066616020677 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause """ Basic Data Types. 
""" import numpy as _np from ..proto import Model_pb2 class _DatatypeBase: def __init__(self, type_tag, full_tag, num_elements): self.type_tag, self.full_tag = type_tag, full_tag self.num_elements = num_elements def __eq__(self, other): return hasattr(other, "full_tag") and self.full_tag == other.full_tag def __ne__(self, other): return not self.__eq__(other) def __hash__(self): return hash(self.full_tag) def __repr__(self): return self.full_tag class Int64(_DatatypeBase): """ Int64 Data Type """ def __init__(self): _DatatypeBase.__init__(self, "Int64", "Int64", 1) class Double(_DatatypeBase): """ Double Data Type """ def __init__(self): _DatatypeBase.__init__(self, "Double", "Double", 1) class String(_DatatypeBase): """ String Data Type """ def __init__(self): _DatatypeBase.__init__(self, "String", "String", 1) class Array(_DatatypeBase): """ Array Data Type """ def __init__(self, *dimensions): """ Constructs a Array, given its dimensions Parameters ---------- dimensions: ints | longs Examples -------- # Create a single dimensions array of length five >>> arr = coremltools.models.datatypes.Array(5) # Create a multi dimension array five by two by ten. >>> multi_arr = coremltools.models.datatypes.Array(5, 2, 10) """ assert len(dimensions) >= 1 assert all( isinstance(d, (int, _np.int64, _np.int32)) for d in dimensions ), "Dimensions must be ints, not {}".format(str(dimensions)) self.dimensions = dimensions num_elements = 1 for d in self.dimensions: num_elements *= d _DatatypeBase.__init__( self, "Array", "Array({%s})" % (",".join("%d" % d for d in self.dimensions)), num_elements, ) class Dictionary(_DatatypeBase): """ Dictionary Data Type """ def __init__(self, key_type=None): """ Constructs a Dictionary, given its key type Parameters ---------- key_type: Int64 | String Examples -------- >>> from coremltools.models.datatypes import Dictionary, Int64, String # Create a dictionary with string keys >>> str_key_dict = Dictionary(key_type=String) # Create a dictionary with int keys >>> int_key_dict = Dictionary(Int64) """ # Resolve it to a class if it's global _simple_type_remap if key_type in _simple_type_remap: key_type = _simple_type_remap[key_type] if not isinstance(key_type, (Int64, String)): raise TypeError("Key type for dictionary must be either string or integer.") self.key_type = key_type _DatatypeBase.__init__( self, "Dictionary", "Dictionary(%s)" % repr(self.key_type), None ) _simple_type_remap = { int: Int64(), str: String(), float: Double(), Double: Double(), Int64: Int64(), String: String(), "Double": Double(), "Int64": Int64(), "String": String(), } def _is_valid_datatype(datatype_instance): """ Returns true if datatype_instance is a valid datatype object and false otherwise. """ # Remap so we can still use the python types for the simple cases global _simple_type_remap if datatype_instance in _simple_type_remap: return True # Now set the protobuf from this interface. if isinstance(datatype_instance, (Int64, Double, String, Array)): return True elif isinstance(datatype_instance, Dictionary): kt = datatype_instance.key_type if isinstance(kt, (Int64, String)): return True return False def _normalize_datatype(datatype_instance): """ Translates a user specified datatype to an instance of the ones defined above. Valid data types are passed through, and the following type specifications are translated to the proper instances: str, "String" -> String() int, "Int64" -> Int64() float, "Double" -> Double() If a data type is not recognized, then an error is raised. 
""" global _simple_type_remap if datatype_instance in _simple_type_remap: return _simple_type_remap[datatype_instance] # Now set the protobuf from this interface. if isinstance(datatype_instance, (Int64, Double, String, Array)): return datatype_instance elif isinstance(datatype_instance, Dictionary): kt = datatype_instance.key_type if isinstance(kt, (Int64, String)): return datatype_instance raise ValueError("Datatype instance not recognized.") def _set_datatype( proto_type_obj, datatype_instance, array_datatype=Model_pb2.ArrayFeatureType.DOUBLE ): # Remap so we can still use the python types for the simple cases global _simple_type_remap if datatype_instance in _simple_type_remap: datatype_instance = _simple_type_remap[datatype_instance] # Now set the protobuf from this interface. if isinstance(datatype_instance, Int64): proto_type_obj.int64Type.MergeFromString(b"") elif isinstance(datatype_instance, Double): proto_type_obj.doubleType.MergeFromString(b"") elif isinstance(datatype_instance, String): proto_type_obj.stringType.MergeFromString(b"") elif isinstance(datatype_instance, Array): proto_type_obj.multiArrayType.MergeFromString(b"") proto_type_obj.multiArrayType.dataType = array_datatype for n in datatype_instance.dimensions: proto_type_obj.multiArrayType.shape.append(n) elif isinstance(datatype_instance, Dictionary): proto_type_obj.dictionaryType.MergeFromString(b"") kt = datatype_instance.key_type if isinstance(kt, Int64): proto_type_obj.dictionaryType.int64KeyType.MergeFromString(b"") elif isinstance(kt, String): proto_type_obj.dictionaryType.stringKeyType.MergeFromString(b"") else: raise ValueError("Dictionary key type must be either string or int.") else: raise TypeError( "Datatype parameter not recognized; must be an instance " "of datatypes.{Double, Int64, String, Dictionary, Array}, or " "python int, float, or str types." ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/feature_vectorizer.py0000644000000000000000000000720614672066616022612 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from .. import SPECIFICATION_VERSION from ..proto import Model_pb2 as _Model_pb2 from . import datatypes from ._feature_management import (is_valid_feature_list, process_or_validate_features) from ._interface_management import set_transform_interface_params def create_feature_vectorizer(input_features, output_feature_name, known_size_map={}): """ Create a feature vectorizer from input features. This returns a 2-tuple ``(spec, num_dimension)`` for a feature vectorizer that puts everything into a single array with a length equal to the total size of all the input features. Parameters ---------- input_features: [list of 2-tuples] Name(s) of the input features, given as a list of ``('name', datatype)`` tuples. The datatypes entry is one of the data types defined in the ``datatypes`` module. Allowed ``datatypes`` are ``datatype.Int64``, ``datatype.Double``, ``datatypes.Dictionary``, and ``datatype.Array``. If the feature is a dictionary type, then the dictionary must have integer keys, and the number of dimensions to expand it into must be provided by ``known_size_map``. Feature indices in the final array are counted sequentially from the from 0 through the total number of features. output_feature_name: str The name of the output feature. 
The type is an Array List of the output features of the network. known_size_map: A dictionary mapping the feature name to the expanded size in the final array. This is most useful for specifying the size of sparse vectors given as dictionaries of index to value. """ spec = _Model_pb2.Model() spec.specificationVersion = SPECIFICATION_VERSION input_features = process_or_validate_features(input_features) feature_vectorizer = spec.featureVectorizer num_output_dimensions = 0 for n, ft in input_features: if n in known_size_map: dim = known_size_map[n] if ft.num_elements is not None: if dim != ft.num_elements: raise ValueError( "In feature {}, override size {} not compatible with inherent " "value size {}.".format(n, dim, ft.num_elements) ) else: if ft.num_elements is None: raise ValueError( "In feature {}, inherent size unknown so must be manually supplied.".format( n ) ) dim = ft.num_elements num_output_dimensions += dim new_feature = feature_vectorizer.inputList.add() new_feature.inputColumn = n new_feature.inputDimensions = dim if not isinstance(output_feature_name, str): if ( is_valid_feature_list(output_feature_name) and len(output_feature_name) == 1 and output_feature_name[0][1] == datatypes.Array(num_output_dimensions) ): output_feature_name = output_feature_name[0][0] else: raise TypeError( "Output feature must be specified as a feature name or correct output feature list." ) output_features = [(output_feature_name, datatypes.Array(num_output_dimensions))] set_transform_interface_params(spec, input_features, output_features) return spec, num_output_dimensions ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1726511965.261547 coremltools-8.0/coremltools/models/ml_program/0000755000000000000000000000000014672075535020463 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/ml_program/__init__.py0000644000000000000000000000036714672066616022602 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from . import compression_utils././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/ml_program/compression_utils.py0000644000000000000000000001064414672066616024623 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as _np from coremltools.converters.mil.mil import Operation as _Operation from coremltools.models._deprecation import deprecated as _deprecated from coremltools.optimize.coreml import ( OpLinearQuantizerConfig as _OpLinearQuantizerConfig, OpMagnitudePrunerConfig as _OpMagnitudePrunerConfig, OpPalettizerConfig as _OpPalettizerConfig, OpThresholdPrunerConfig as _OpThresholdPrunerConfig, OptimizationConfig as _OptimizationConfig, ) from coremltools.optimize.coreml import ( linear_quantize_weights as _linear_quantize_weights, decompress_weights as _decompress_weights, palettize_weights as _palettize_weights, prune_weights as _prune_weights, ) _DEFAULT_MIN_WEIGHT_SIZE_TO_COMPRESS = 2048 def _default_op_selector(const_op): if not isinstance(const_op, _Operation) or const_op.op_type != "const": raise ValueError("Input of the op_selector must be type of const Operation, got {}.".format(type(const_op))) return const_op.val.val.size > _DEFAULT_MIN_WEIGHT_SIZE_TO_COMPRESS @_deprecated( suffix="Please use coremltools.optimize.coreml.affine_quantize_weights", version="7.0", obj_prefix="coremltools.compression_utils.", ) def affine_quantize_weights(mlmodel, mode="linear_symmetric", op_selector=None, dtype=_np.int8): """ ``coremltools.compression_utils.affine_quantize_weights`` is deprecated and will be removed in the future. Please use :py:class:`coremltools.optimize.coreml.linear_quantize_weights`. """ if op_selector is None: op_selector = _default_op_selector op_config = _OpLinearQuantizerConfig(mode=mode, dtype=dtype, weight_threshold=None) config = _OptimizationConfig(global_config=op_config, is_deprecated=True, op_selector=op_selector) return _linear_quantize_weights(mlmodel, config) @_deprecated( suffix="Please use coremltools.optimize.coreml.palettize_weights", version="7.0", obj_prefix="coremltools.compression_utils.", ) def palettize_weights(mlmodel, nbits=None, mode="kmeans", op_selector=None, lut_function=None): """ ``coremltools.compression_utils.palettize_weights`` is deprecated and will be removed in the future. Please use :py:class:`coremltools.optimize.coreml.palettize_weights`. """ if op_selector is None: op_selector = _default_op_selector op_config = _OpPalettizerConfig(nbits=nbits, mode=mode, lut_function=lut_function, weight_threshold=None) config = _OptimizationConfig(global_config=op_config, is_deprecated=True, op_selector=op_selector) return _palettize_weights(mlmodel, config) @_deprecated( suffix="Please use coremltools.optimize.coreml.sparsify_weights", version="7.0", obj_prefix="coremltools.compression_utils.", ) def sparsify_weights( mlmodel, mode="threshold_based", threshold=1e-12, target_percentile=1.0, op_selector=None ): """ ``coremltools.compression_utils.sparsify_weights`` is deprecated and will be removed in the future. Please use :py:class:`coremltools.optimize.coreml.prune_weights`. """ if op_selector is None: op_selector = _default_op_selector if mode.upper() == "THRESHOLD_BASED": op_config = _OpThresholdPrunerConfig( threshold=threshold, minimum_sparsity_percentile=0.0, weight_threshold=None, ) elif mode.upper() == "PERCENTILE_BASED": op_config = _OpMagnitudePrunerConfig( target_sparsity=target_percentile, weight_threshold=None, ) else: raise ValueError( 'Only modes "THRESHOLD_BASED" and "PERCENTILE_BASED" are supported for weight sparsification.' f' Got mode: "{mode}".' 
) config = _OptimizationConfig(global_config=op_config, is_deprecated=True, op_selector=op_selector) return _prune_weights(mlmodel, config) @_deprecated( suffix="Please use coremltools.optimize.coreml.decompress_weights", version="7.0", obj_prefix="coremltools.compression_utils.", ) def decompress_weights(mlmodel): """ ``coremltools.compression_utils.decompress_weights`` is deprecated and will be removed in the future. Please use :py:class:`coremltools.optimize.coreml.decompress_weights`. """ return _decompress_weights(mlmodel) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/model.py0000644000000000000000000010734014672066616020003 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import atexit as _atexit import json import os as _os import shutil as _shutil import tempfile as _tempfile import warnings as _warnings from copy import deepcopy as _deepcopy from typing import Optional as _Optional import numpy as _np import numpy as _numpy from coremltools import ( ComputeUnit as _ComputeUnit, _logger as logger, proto as _proto, SpecializationStrategy as _SpecializationStrategy, ReshapeFrequency as _ReshapeFrequency, ) from coremltools._deps import _HAS_TF_1, _HAS_TF_2, _HAS_TORCH from coremltools.converters.mil.mil.program import Program as _Program from coremltools.converters.mil.mil.scope import ScopeSource as _ScopeSource from .utils import ( _MLMODEL_EXTENSION, _MLPACKAGE_AUTHOR_NAME, _MLPACKAGE_EXTENSION, _MODEL_FILE_NAME, _create_mlpackage, _has_custom_layer, _is_macos, _macos_version, _try_get_weights_dir_path, ) from .utils import load_spec as _load_spec from .utils import save_spec as _save_spec if _HAS_TORCH: import torch as _torch if _HAS_TF_1 or _HAS_TF_2: import tensorflow as _tf try: from ..libmodelpackage import ModelPackage as _ModelPackage except: _ModelPackage = None try: from ..libcoremlpython import _MLModelProxy except Exception as e: logger.warning(f"Failed to load _MLModelProxy: {e}") _MLModelProxy = None _HAS_PIL = True try: from PIL import Image as _PIL_IMAGE except: _HAS_PIL = False _MLMODEL_FULL_PRECISION = "float32" _MLMODEL_HALF_PRECISION = "float16" _MLMODEL_QUANTIZED = "quantized_model" _VALID_MLMODEL_PRECISION_TYPES = [ _MLMODEL_FULL_PRECISION, _MLMODEL_HALF_PRECISION, _MLMODEL_QUANTIZED, ] # Linear quantization _QUANTIZATION_MODE_LINEAR_QUANTIZATION = "_linear_quantization" # Linear quantization represented as a lookup table _QUANTIZATION_MODE_LOOKUP_TABLE_LINEAR = "_lookup_table_quantization_linear" # Lookup table quantization generated by K-Means _QUANTIZATION_MODE_LOOKUP_TABLE_KMEANS = "_lookup_table_quantization_kmeans" # Custom lookup table quantization _QUANTIZATION_MODE_CUSTOM_LOOKUP_TABLE = "_lookup_table_quantization_custom" # Dequantization _QUANTIZATION_MODE_DEQUANTIZE = "_dequantize_network" # used for testing # Symmetric linear quantization _QUANTIZATION_MODE_LINEAR_SYMMETRIC = "_linear_quantization_symmetric" _SUPPORTED_QUANTIZATION_MODES = [ _QUANTIZATION_MODE_LINEAR_QUANTIZATION, _QUANTIZATION_MODE_LOOKUP_TABLE_LINEAR, _QUANTIZATION_MODE_LOOKUP_TABLE_KMEANS, _QUANTIZATION_MODE_CUSTOM_LOOKUP_TABLE, _QUANTIZATION_MODE_DEQUANTIZE, _QUANTIZATION_MODE_LINEAR_SYMMETRIC, ] _LUT_BASED_QUANTIZATION = [ _QUANTIZATION_MODE_LOOKUP_TABLE_LINEAR, _QUANTIZATION_MODE_LOOKUP_TABLE_KMEANS, 
_QUANTIZATION_MODE_CUSTOM_LOOKUP_TABLE, ] _METADATA_VERSION = "com.github.apple.coremltools.version" _METADATA_SOURCE = "com.github.apple.coremltools.source" _METADATA_SOURCE_DIALECT = "com.github.apple.coremltools.source_dialect" def _verify_optimization_hint_input(optimization_hint_input: _Optional[dict] = None) -> None: """ Throws an exception if ``optimization_hint_input`` is not valid. """ if optimization_hint_input is None: return if not isinstance(optimization_hint_input, dict): raise TypeError('"optimization_hint_input" must be a dictionary or None') if optimization_hint_input != {} and _macos_version() < (15, 0): raise ValueError('Optimization hints are only available on macOS >= 15.0') for k in optimization_hint_input.keys(): if k not in ('reshapeFrequency', 'specializationStrategy'): raise ValueError(f"Unrecognized key in optimization_hint dictionary: {k}") if "specializationStrategy" in optimization_hint_input and not isinstance(optimization_hint_input["specializationStrategy"], _SpecializationStrategy): raise TypeError('"specializationStrategy" value of "optimization_hint_input" dictionary must be of type coremltools.SpecializationStrategy') if "reshapeFrequency" in optimization_hint_input and not isinstance(optimization_hint_input["reshapeFrequency"], _ReshapeFrequency): raise TypeError('"reshapeFrequency" value of "optimization_hint_input" dictionary must be of type coremltools.ReshapeFrequency') class _FeatureDescription: def __init__(self, fd_spec): self._fd_spec = fd_spec def __repr__(self): return "Features(%s)" % ",".join(map(lambda x: x.name, self._fd_spec)) def __len__(self): return len(self._fd_spec) def __getitem__(self, key): for f in self._fd_spec: if key == f.name: return f.shortDescription raise KeyError("No feature with name %s." % key) def __contains__(self, key): for f in self._fd_spec: if key == f.name: return True return False def __setitem__(self, key, value): for f in self._fd_spec: if key == f.name: f.shortDescription = value return raise AttributeError("No feature with name %s." % key) def __iter__(self): for f in self._fd_spec: yield f.name class MLState: def __init__(self, proxy): """ Holds state for an MLModel. This is an opaque object. Nothing can be done with it except pass it to MLModel.predict. See Also -------- ct.MLModel.predict """ self.__proxy__ = proxy class MLModel: """ This class defines the minimal interface to a Core ML object in Python. At a high level, the protobuf specification consists of: - Model description: Encodes names and type information of the inputs and outputs to the model. - Model parameters: The set of parameters required to represent a specific instance of the model. - Metadata: Information about the origin, license, and author of the model. With this class, you can inspect a Core ML model, modify metadata, and make predictions for the purposes of testing (on select platforms). Examples -------- .. sourcecode:: python # Load the model model = MLModel("HousePricer.mlmodel") # Set the model metadata model.author = "Author" model.license = "BSD" model.short_description = "Predicts the price of a house in the Seattle area." 
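            # (Illustrative addition, not part of the original example.) The
            # ``version`` and ``user_defined_metadata`` fields exposed by this
            # class can be set the same way; the values below are placeholders.
            model.version = "1.0"
            model.user_defined_metadata["com.example.note"] = "example value"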
# Get the interface to the model model.input_description model.output_description # Set feature descriptions manually model.input_description["bedroom"] = "Number of bedrooms" model.input_description["bathrooms"] = "Number of bathrooms" model.input_description["size"] = "Size (in square feet)" # Set model.output_description["price"] = "Price of the house" # Make predictions predictions = model.predict({"bedroom": 1.0, "bath": 1.0, "size": 1240}) # Get the spec of the model spec = model.get_spec() # Save the model model.save("HousePricer.mlpackage") # Load the model from the spec object spec = model.get_spec() # modify spec (e.g. rename inputs/outputs etc) model = MLModel(spec) # if model type is mlprogram, i.e. spec.WhichOneof('Type') == "mlProgram", then: model = MLModel(spec, weights_dir=model.weights_dir) # Load a non-default function from a multifunction .mlpackage model = MLModel("MultifunctionModel.mlpackage", function_name="deep_features") See Also -------- predict """ def __init__( self, model, is_temp_package=False, mil_program=None, skip_model_load=False, compute_units=_ComputeUnit.ALL, weights_dir=None, function_name=None, optimization_hints: _Optional[dict] = None, ): """ Construct an MLModel from an ``.mlmodel``. Parameters ---------- model: str or Model_pb2 For an ML program (``mlprogram``), the model can be a path string (``.mlpackage``) or ``Model_pb2``. If it is a path string, it must point to a directory containing bundle artifacts (such as ``weights.bin``). If it is of type ``Model_pb2`` (spec), then you must also provide ``weights_dir`` if the model has weights, because both the proto spec and the weights are required to initialize and load the model. The proto spec for an ``mlprogram``, unlike a neural network (``neuralnetwork``), does not contain the weights; they are stored separately. If the model does not have weights, you can provide an empty ``weights_dir``. For non- ``mlprogram`` model types, the model can be a path string (``.mlmodel``) or type ``Model_pb2``, such as a spec object. is_temp_package: bool Set to ``True`` if the input model package dir is temporary and can be deleted upon interpreter termination. mil_program: coremltools.converters.mil.Program Set to the MIL program object, if available. It is available whenever an MLModel object is constructed using the unified converter API `coremltools.convert() `_. skip_model_load: bool Set to ``True`` to prevent Core ML Tools from calling into the Core ML framework to compile and load the model. In that case, the returned model object cannot be used to make a prediction. This flag may be used to load a newer model type on an older Mac, to inspect or load/save the spec. Example: Loading an ML program model type on a macOS 11, since an ML program can be compiled and loaded only from macOS12+. Defaults to ``False``. compute_units: coremltools.ComputeUnit The set of processing units the model can use to make predictions. An enum with four possible values: - ``coremltools.ComputeUnit.ALL``: Use all compute units available, including the neural engine. - ``coremltools.ComputeUnit.CPU_ONLY``: Limit the model to only use the CPU. - ``coremltools.ComputeUnit.CPU_AND_GPU``: Use both the CPU and GPU, but not the neural engine. - ``coremltools.ComputeUnit.CPU_AND_NE``: Use both the CPU and neural engine, but not the GPU. Available only for macOS >= 13.0. 
weights_dir: str Path to the weight directory, required when loading an MLModel of type ``mlprogram``, from a spec object, such as when the argument ``model`` is of type ``Model_pb2``. function_name : str The name of the function from ``model`` to load. If not provided, ``function_name`` will be set to the ``defaultFunctionName`` in the proto. optimization_hints : dict or None Keys are the names of the optimization hint, either 'reshapeFrequency' or 'specializationStrategy'. Values are enumeration values of type ``coremltools.ReshapeFrequency`` or ``coremltools.SpecializationStrategy``. Notes ----- Internally this maintains the following: - ``_MLModelProxy``: A pybind wrapper around CoreML::Python::Model (see `coremltools/coremlpython/CoreMLPython.mm `_) - ``package_path`` (mlprogram only): Directory containing all artifacts (``.mlmodel``, weights, and so on). - ``weights_dir`` (mlprogram only): Directory containing weights inside the package_path. Examples -------- .. sourcecode:: python loaded_model = MLModel("my_model.mlmodel") loaded_model = MLModel("my_model.mlpackage") """ def cleanup(package_path): if _os.path.exists(package_path): _shutil.rmtree(package_path) def does_model_contain_mlprogram(model) -> bool: """ Is this an mlprogram or is it a pipeline with at least one mlprogram? """ model_type = model.WhichOneof("Type") if model_type == "mlProgram": return True elif model_type not in ("pipeline", "pipelineClassifier", "pipelineRegressor"): return False # Does this pipeline contain an mlprogram? if model_type == "pipeline": pipeline_models = model.pipeline.models elif model_type == "pipelineClassifier": pipeline_models = model.pipelineClassifier.pipeline.models else: assert model_type == "pipelineRegressor" pipeline_models = model.pipelineRegressor.pipeline.models for m in pipeline_models: if does_model_contain_mlprogram(m): return True return False if not isinstance(compute_units, _ComputeUnit): raise TypeError('"compute_units" parameter must be of type: coremltools.ComputeUnit') elif (compute_units == _ComputeUnit.CPU_AND_NE and _is_macos() and _macos_version() < (13, 0) ): raise ValueError( 'coremltools.ComputeUnit.CPU_AND_NE is only available on macOS >= 13.0' ) _verify_optimization_hint_input(optimization_hints) self.compute_unit = compute_units self.function_name = function_name if optimization_hints is not None: self.optimization_hints = optimization_hints.copy() else: self.optimization_hints = None self.is_package = False self.is_temp_package = False self.package_path = None self._weights_dir = None if mil_program is not None and not isinstance(mil_program, _Program): raise ValueError('"mil_program" must be of type "coremltools.converters.mil.Program"') self._mil_program = mil_program if isinstance(model, str): model = _os.path.abspath(_os.path.expanduser(_os.path.expandvars(model))) if _os.path.isdir(model): self.is_package = True self.package_path = model self.is_temp_package = is_temp_package self._weights_dir = _try_get_weights_dir_path(model) self.__proxy__, self._spec, self._framework_error = self._get_proxy_and_spec( model, compute_units, skip_model_load=skip_model_load, optimization_hints=optimization_hints, ) elif isinstance(model, _proto.Model_pb2.Model): if does_model_contain_mlprogram(model): if model.WhichOneof("Type") == "mlProgram" and weights_dir is None: raise Exception( "MLModel of type mlProgram cannot be loaded just from the model spec object. " "It also needs the path to the weights file. 
Please provide that as well, " "using the 'weights_dir' argument." ) self.is_package = True self.is_temp_package = True filename = _create_mlpackage(model, weights_dir) self.package_path = filename self._weights_dir = _try_get_weights_dir_path(filename) else: filename = _tempfile.mktemp(suffix=_MLMODEL_EXTENSION) _save_spec(model, filename) self.__proxy__, self._spec, self._framework_error = self._get_proxy_and_spec( filename, compute_units, skip_model_load=skip_model_load, optimization_hints=optimization_hints ) try: _os.remove(filename) except OSError: pass else: raise TypeError( "Expected model to be a .mlmodel file, .mlpackage file or a Model_pb2 object" ) self._input_description = _FeatureDescription(self._spec.description.input) self._output_description = _FeatureDescription(self._spec.description.output) self._model_input_names_set = set([i.name for i in self._spec.description.input]) if self.is_package and self.is_temp_package: _atexit.register(cleanup, self.package_path) # If function_name is not passed, self.function_name defaults to defaultFunctionName in the proto. default_function_name = self._spec.description.defaultFunctionName if self.function_name is None and len(default_function_name) > 0: self.function_name = default_function_name if self.function_name is not None: if not self._is_multifunction() and self.function_name != "main": raise ValueError('function_name must be "main" for non multifunction model') # Updated self._model_input_names_set based on self.function_name. # self._model_input_names_set defines the allowed input keys for the data dictionary passed to self.predict(). if self.function_name is not None and self._is_multifunction(): f = self._get_function_description(self.function_name) self._model_input_names_set = set([i.name for i in f.input]) def _get_proxy_and_spec( self, filename: str, compute_units: _ComputeUnit, skip_model_load: _Optional[bool] = False, optimization_hints: _Optional[dict] = None, ): filename = _os.path.expanduser(filename) specification = _load_spec(filename) if _MLModelProxy and not skip_model_load: # check if the version is supported engine_version = _MLModelProxy.maximum_supported_specification_version() if specification.specificationVersion > engine_version: # in this case the specification is a newer kind of .mlmodel than this # version of the engine can support so we'll not try to have a proxy object return None, specification, None function_name = "" if self.function_name is None else self.function_name if optimization_hints is not None: optimization_hints_str_vals = {k: v.name for k, v in optimization_hints.items()} else: optimization_hints_str_vals = {} try: return ( _MLModelProxy(filename, compute_units.name, function_name, optimization_hints_str_vals), specification, None, ) except RuntimeError as e: _warnings.warn( "You will not be able to run predict() on this Core ML model." 
+ " Underlying exception message was: " + str(e), RuntimeWarning, ) return None, specification, e return None, specification, None @property def short_description(self): return self._spec.description.metadata.shortDescription @short_description.setter def short_description(self, short_description): self._spec.description.metadata.shortDescription = short_description @property def input_description(self): return self._input_description @property def output_description(self): return self._output_description @property def user_defined_metadata(self): return self._spec.description.metadata.userDefined @property def author(self): return self._spec.description.metadata.author @author.setter def author(self, author): self._spec.description.metadata.author = author @property def license(self): return self._spec.description.metadata.license @license.setter def license(self, license): self._spec.description.metadata.license = license @property def version(self): return self._spec.description.metadata.versionString @property def weights_dir(self): return self._weights_dir @version.setter def version(self, version_string): self._spec.description.metadata.versionString = version_string def __repr__(self): return self._spec.description.__repr__() def __str__(self): return self.__repr__() def save(self, save_path: str): """ Save the model to an ``.mlmodel`` format. For an MIL program, the ``save_path`` is a package directory containing the ``mlmodel`` and weights. Parameters ---------- save_path: Target file path / bundle directory for the model. Examples -------- .. sourcecode:: python model.save("my_model_file.mlmodel") loaded_model = MLModel("my_model_file.mlmodel") """ save_path = _os.path.expanduser(save_path) # Clean up existing file or directory. if _os.path.exists(save_path): if _os.path.isdir(save_path): _shutil.rmtree(save_path) else: _os.remove(save_path) if self.is_package: name, ext = _os.path.splitext(save_path) if not ext: save_path = "{}{}".format(save_path, _MLPACKAGE_EXTENSION) elif ext != _MLPACKAGE_EXTENSION: raise Exception( "For an ML Program, extension must be {} (not {}). Please see https://coremltools.readme.io/docs/unified-conversion-api#target-conversion-formats to see the difference between neuralnetwork and mlprogram model types.".format( _MLPACKAGE_EXTENSION, ext ) ) _shutil.copytree(self.package_path, save_path) if self._mil_program is not None and all( [ _ScopeSource.EXIR_DEBUG_HANDLE in function._essential_scope_sources for function in self._mil_program.functions.values() ] ): debug_handle_to_ops_mapping = ( self._mil_program.construct_debug_handle_to_ops_mapping() ) if len(debug_handle_to_ops_mapping) > 0: debug_handle_to_ops_mapping_as_json = json.dumps( { "version" : self.user_defined_metadata[_METADATA_VERSION], "mapping" : debug_handle_to_ops_mapping, } ) saved_debug_handle_to_ops_mapping_path = _os.path.join( save_path, "executorch_debug_handle_mapping.json" ) with open(saved_debug_handle_to_ops_mapping_path, "w") as f: f.write(debug_handle_to_ops_mapping_as_json) saved_spec_path = _os.path.join( save_path, "Data", _MLPACKAGE_AUTHOR_NAME, _MODEL_FILE_NAME ) _save_spec(self._spec, saved_spec_path) else: _save_spec(self._spec, save_path) def get_compiled_model_path(self): """ Returns the path for the underlying compiled ML Model. **Important**: This path is available only for the lifetime of this Python object. If you want the compiled model to persist, you need to make a copy. 
""" return self.__proxy__.get_compiled_model_path() def get_spec(self): """ Get a deep copy of the protobuf specification of the model. Returns ------- model: Model_pb2 Protobuf specification of the model. Examples -------- .. sourcecode:: python spec = model.get_spec() """ return _deepcopy(self._spec) def predict(self, data, state: _Optional[MLState] = None): """ Return predictions for the model. Parameters ---------- data: dict[str, value] or list[dict[str, value]] Dictionary of data to use for predictions, where the keys are the names of the input features. For batch predictons, use a list of such dictionaries. The following dictionary values types are acceptable: list, array, numpy.ndarray, tensorflow.Tensor and torch.Tensor. state : MLState Optional state object as returned by ``make_state()``. Returns ------- dict[str, value] Predictions as a dictionary where each key is the output feature name. list[dict[str, value]] For batch prediction, returns a list of the above dictionaries. Examples -------- .. sourcecode:: python data = {"bedroom": 1.0, "bath": 1.0, "size": 1240} predictions = model.predict(data) data = [ {"bedroom": 1.0, "bath": 1.0, "size": 1240}, {"bedroom": 4.0, "bath": 2.5, "size": 2400}, ] batch_predictions = model.predict(data) """ def verify_and_convert_input_dict(d): self._verify_input_dict(d) self._convert_tensor_to_numpy(d) # TODO: remove the following call when this is fixed: rdar://92239209 self._update_float16_multiarray_input_to_float32(d) if self.is_package and _is_macos() and _macos_version() < (12, 0): raise Exception( "predict() for .mlpackage is not supported in macOS version older than 12.0." ) MLModel._check_predict_data(data) if self.__proxy__: return self._get_predictions(self.__proxy__, verify_and_convert_input_dict, data, state) else: # Error case if _macos_version() < (10, 13): raise Exception( "Model prediction is only supported on macOS version 10.13 or later." ) if not _MLModelProxy: raise Exception("Unable to load CoreML.framework. Cannot make predictions.") elif ( _MLModelProxy.maximum_supported_specification_version() < self._spec.specificationVersion ): engineVersion = _MLModelProxy.maximum_supported_specification_version() raise Exception( "The specification has version " + str(self._spec.specificationVersion) + " but the Core ML framework version installed only supports Core ML model specification version " + str(engineVersion) + " or older." ) elif _has_custom_layer(self._spec): raise Exception( "This model contains a custom neural network layer, so predict is not supported." ) else: if self._framework_error: raise self._framework_error else: raise Exception("Unable to load CoreML.framework. 
Cannot make predictions.") @staticmethod def _check_predict_data(data): if type(data) not in (list, dict): raise TypeError("\"data\" parameter must be either a dict or list of dict.") if type(data) == list and not all(map(lambda x: type(x) == dict, data)): raise TypeError("\"data\" list must contain only dictionaries") @staticmethod def _get_predictions(proxy, preprocess_method, data, state): if type(data) == dict: preprocess_method(data) state = None if state is None else state.__proxy__ return proxy.predict(data, state) else: assert type(data) == list assert state is None, "State can only be used for unbatched predictions" for i in data: preprocess_method(i) return proxy.batchPredict(data) def _is_stateful(self) -> bool: model_desc = self._spec.description # For a single function model, we check if len(state) > 0 if len(model_desc.functions) == 0: return len(model_desc.state) > 0 # For a multifunction model, we first get the corresponding function description, # and check the state field. f = list(filter(lambda f: f.name == self.function_name, model_desc.functions)) return len(f.state) > 0 def _is_multifunction(self) -> bool: return len(self._spec.description.functions) > 0 def _get_function_description(self, function_name: str) -> _proto.Model_pb2.FunctionDescription: f = list(filter(lambda f: f.name == function_name, self._spec.description.functions)) if len(f) == 0: raise ValueError(f"function_name {function_name} not found in the model.") assert len(f) == 1, f"Invalid proto: two functions with the same name {function_name}." return f[0] def make_state(self) -> MLState: """ Returns a new state object, which can be passed to the ``predict`` method. Returns _______ state: MLState Holds state for an MLModel. State functionality is only supported on macOS 15+. Examples -------- .. sourcecode:: python state = model.make_state() predictions = model.predict(x, state) See Also -------- predict """ if not _is_macos() or _macos_version() < (15, 0): raise Exception("State functionality is only supported on macOS 15+") return MLState(self.__proxy__.newState()) def _input_has_infinite_upper_bound(self) -> bool: """Check if any input has infinite upper bound (-1).""" for input_spec in self.input_description._fd_spec: for size_range in input_spec.type.multiArrayType.shapeRange.sizeRanges: if size_range.upperBound == -1: return True return False def _set_build_info_mil_attributes(self, metadata): if self._spec.WhichOneof('Type') != "mlProgram": # No MIL attributes to set return ml_program_attributes = self._spec.mlProgram.attributes build_info_proto = ml_program_attributes["buildInfo"] # Set ValueType to dictionary of string to string str_type = _proto.MIL_pb2.ValueType() str_type.tensorType.dataType = _proto.MIL_pb2.DataType.STRING dict_type_str_to_str = _proto.MIL_pb2.ValueType() dict_type_str_to_str.dictionaryType.keyType.CopyFrom(str_type) dict_type_str_to_str.dictionaryType.valueType.CopyFrom(str_type) build_info_proto.type.CopyFrom(dict_type_str_to_str) # Copy the metadata build_info_dict = build_info_proto.immediateValue.dictionary for k, v in metadata.items(): key_pair = _proto.MIL_pb2.DictionaryValue.KeyValuePair() key_pair.key.immediateValue.tensor.strings.values.append(k) key_pair.key.type.CopyFrom(str_type) key_pair.value.immediateValue.tensor.strings.values.append(v) key_pair.value.type.CopyFrom(str_type) build_info_dict.values.append(key_pair) def _get_mil_internal(self): """ Get a deep copy of the MIL program object, if available. 
It's available whenever an MLModel object is constructed using the unified converter API [``coremltools.convert()``](https://apple.github.io/coremltools/source/coremltools.converters.mil.html#coremltools.converters._converters_entry.convert). Returns ------- program: coremltools.converters.mil.Program Examples -------- .. sourcecode:: python mil_prog = model._get_mil_internal() """ return _deepcopy(self._mil_program) def _verify_input_dict(self, input_dict): # Check if the input name given by the user is valid. # Although this is checked during prediction inside CoreML Framework, # we still check it here to return early and # return a more verbose error message self._verify_input_name_exists(input_dict) # verify that the pillow image modes are correct, for image inputs self._verify_pil_image_modes(input_dict) def _verify_pil_image_modes(self, input_dict): if not _HAS_PIL: return for input_desc in self._spec.description.input: if input_desc.type.WhichOneof("Type") == "imageType": input_val = input_dict.get(input_desc.name, None) if not isinstance(input_val, _PIL_IMAGE.Image): msg = "Image input, '{}' must be of type PIL.Image.Image in the input dict" raise TypeError(msg.format(input_desc.name)) if input_desc.type.imageType.colorSpace in ( _proto.FeatureTypes_pb2.ImageFeatureType.BGR, _proto.FeatureTypes_pb2.ImageFeatureType.RGB, ): if input_val.mode != "RGB": msg = "RGB/BGR image input, '{}', must be of type PIL.Image.Image with mode=='RGB'" raise TypeError(msg.format(input_desc.name)) elif ( input_desc.type.imageType.colorSpace == _proto.FeatureTypes_pb2.ImageFeatureType.GRAYSCALE ): if input_val.mode != "L": msg = "GRAYSCALE image input, '{}', must be of type PIL.Image.Image with mode=='L'" raise TypeError(msg.format(input_desc.name)) elif ( input_desc.type.imageType.colorSpace == _proto.FeatureTypes_pb2.ImageFeatureType.GRAYSCALE_FLOAT16 ): if input_val.mode != "F": msg = "GRAYSCALE_FLOAT16 image input, '{}', must be of type PIL.Image.Image with mode=='F'" raise TypeError(msg.format(input_desc.name)) def _verify_input_name_exists(self, input_dict): for given_input in input_dict.keys(): if given_input not in self._model_input_names_set: err_msg = "Provided key \"{}\", in the input dict, " \ "does not match any of the model input name(s), which are: {}" raise KeyError(err_msg.format(given_input, self._model_input_names_set)) @staticmethod def _update_float16_multiarray_input_to_float32(input_data: dict): for k, v in input_data.items(): if isinstance(v, _np.ndarray) and v.dtype == _np.float16: input_data[k] = v.astype(_np.float32) def _convert_tensor_to_numpy(self, input_dict): def convert(given_input): if isinstance(given_input, _numpy.ndarray): sanitized_input = given_input elif _HAS_TORCH and isinstance(given_input, _torch.Tensor): sanitized_input = given_input.detach().numpy() elif (_HAS_TF_1 or _HAS_TF_2) and isinstance(given_input, _tf.Tensor): sanitized_input = given_input.eval(session=_tf.compat.v1.Session()) else: sanitized_input = _numpy.array(given_input) return sanitized_input model_input_to_types = {} for inp in self._spec.description.input: type_value = inp.type.multiArrayType.dataType type_name = inp.type.multiArrayType.ArrayDataType.Name(type_value) if type_name != "INVALID_ARRAY_DATA_TYPE": model_input_to_types[inp.name] = type_name for given_input_name, given_input in input_dict.items(): if given_input_name not in model_input_to_types: continue input_dict[given_input_name] = convert(given_input) ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 
mtime=1726511965.261547 coremltools-8.0/coremltools/models/nearest_neighbors/0000755000000000000000000000000014672075535022025 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/nearest_neighbors/__init__.py0000644000000000000000000000042014672066616024132 0ustar00rootroot# Copyright (c) 2019, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from .builder import KNearestNeighborsClassifierBuilder ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/nearest_neighbors/builder.py0000644000000000000000000005150214672066616024030 0ustar00rootroot# Copyright (c) 2019, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as _np import coremltools from ...proto import FeatureTypes_pb2 from .. import datatypes class KNearestNeighborsClassifierBuilder: """ Construct a CoreML KNearestNeighborsClassifier specification. Please see the Core ML Nearest Neighbors protobuf message for more information on KNearestNeighborsClassifier parameters. Examples -------- .. sourcecode:: python from coremltools.models.nearest_neighbors import KNearestNeighborsClassifierBuilder from coremltools.models.utils import save_spec # Create a KNearestNeighborsClassifier model that takes 4-dimensional input data and outputs a string label. >>> builder = KNearestNeighborsClassifierBuilder(input_name='input', ... output_name='output', ... number_of_dimensions=4, ... default_class_label='default_label') # save the spec by the builder >>> save_spec(builder.spec, 'knnclassifier.mlmodel') """ _VALID_INDEX_TYPES = ["linear", "kd_tree"] _VALID_WEIGHTING_SCHEMES = ["uniform", "inverse_distance"] _VALID_DISTANCE_METRICS = ["squared_euclidean"] # Optional parameter keys for constructor _PARAMETER_KEY_NUMBER_OF_NEIGHBORS = "number_of_neighbors" _PARAMETER_KEY_WEIGHTING_SCHEME = "weighting_scheme" _PARAMETER_KEY_INDEX_TYPE = "index_type" _PARAMETER_KEY_LEAF_SIZE = "leaf_size" _PARAMETER_KEY_INPUT_TYPE = "input_type" # Optional parameter default values _PARAMETER_DEFAULT_NUMBER_OF_NEIGHBORS = 5 _PARAMETER_DEFAULT_WEIGHTING_SCHEME = "uniform" _PARAMETER_DEFAULT_INDEX_TYPE = "linear" _PARAMETER_DEFAULT_LEAF_SIZE = 30 _PARAMETER_DEFAULT_INPUT_TYPE = "NotSpecified" def __init__( self, input_name, output_name, number_of_dimensions, default_class_label, **kwargs ): """ Create a KNearestNeighborsClassifierBuilder object. Parameters ---------- input_name Name of the model input. output_name Name of the output. number_of_dimensions Number of dimensions of the input data. default_class_label The default class label to use for predictions. Must be either an int64 or a string. number_of_neighbors Number of neighbors to use for predictions. Default = 5 with allowed values between 1-1000. weighting_scheme Weight function used in prediction. One of ``'uniform'`` (default) or ``'inverse_distance'``. index_type Algorithm to compute nearest neighbors. One of ``'linear'`` (default), or ``'kd_tree'``. leaf_size Leaf size for the kd-tree. Ignored if index type is ``'linear'``. Default = 30. 
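        The following is an illustrative sketch of passing the optional keyword
        arguments described above; the chosen values are arbitrary.

        Examples
        --------
        .. sourcecode:: python

            builder = KNearestNeighborsClassifierBuilder(
                input_name="input",
                output_name="output",
                number_of_dimensions=4,
                default_class_label="default_label",
                number_of_neighbors=3,
                weighting_scheme="inverse_distance",
                index_type="kd_tree",
                leaf_size=30,
            )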
""" self.spec = coremltools.proto.Model_pb2.Model() self.spec.specificationVersion = ( coremltools._MINIMUM_NEAREST_NEIGHBORS_SPEC_VERSION ) # the model is initially empty - assume it's updatable self.is_updatable = True if number_of_dimensions <= 0: raise ValueError("number_of_dimensions must be >= 0") self.spec.kNearestNeighborsClassifier.nearestNeighborsIndex.numberOfDimensions = ( number_of_dimensions ) input_type = kwargs.get( self._PARAMETER_KEY_INPUT_TYPE, self._PARAMETER_DEFAULT_INPUT_TYPE ) input_feature_type = FeatureTypes_pb2.ArrayFeatureType.FLOAT32 if input_type == datatypes.Double: input_feature_type = FeatureTypes_pb2.ArrayFeatureType.DOUBLE input_feature = self.spec.description.input.add() input_feature.name = input_name input_feature.type.multiArrayType.dataType = input_feature_type input_feature.type.multiArrayType.shape.extend([number_of_dimensions]) training_features = self.spec.description.trainingInput.add() training_features.name = input_name training_features.type.multiArrayType.dataType = input_feature_type training_features.type.multiArrayType.shape.extend([number_of_dimensions]) output_label = self.spec.description.output.add() output_label.name = output_name output_label_probs = self.spec.description.output.add() output_label_probs.name = output_name + "Probs" training_features = self.spec.description.trainingInput.add() training_features.name = output_name if self._is_valid_text_type(default_class_label): output_label.type.stringType.MergeFromString(b"") training_features.type.stringType.MergeFromString(b"") output_label_probs.type.dictionaryType.stringKeyType.MergeFromString(b"") self.spec.kNearestNeighborsClassifier.stringClassLabels.MergeFromString(b"") self.spec.kNearestNeighborsClassifier.defaultStringLabel = ( default_class_label ) elif self._is_valid_number_type(default_class_label): output_label.type.int64Type.MergeFromString(b"") training_features.type.int64Type.MergeFromString(b"") output_label_probs.type.dictionaryType.int64KeyType.MergeFromString(b"") self.spec.kNearestNeighborsClassifier.int64ClassLabels.MergeFromString(b"") self.spec.kNearestNeighborsClassifier.defaultInt64Label = ( default_class_label ) else: raise TypeError( "default_class_label type ({}) is invalid. Must be either string or int64".format( type(default_class_label) ) ) self.spec.description.predictedFeatureName = output_label.name self.spec.description.predictedProbabilitiesName = output_label_probs.name number_of_neighbors = kwargs.get( self._PARAMETER_KEY_NUMBER_OF_NEIGHBORS, self._PARAMETER_DEFAULT_NUMBER_OF_NEIGHBORS, ) self.set_number_of_neighbors_with_bounds( number_of_neighbors, allowed_range=(1, 1000) ) # Can we think of a more sensible default value? self.weighting_scheme = kwargs.get( self._PARAMETER_KEY_WEIGHTING_SCHEME, self._PARAMETER_DEFAULT_WEIGHTING_SCHEME, ) index_type = kwargs.get( self._PARAMETER_KEY_INDEX_TYPE, self._PARAMETER_DEFAULT_INDEX_TYPE ) leaf_size = kwargs.get( self._PARAMETER_KEY_LEAF_SIZE, self._PARAMETER_DEFAULT_LEAF_SIZE ) self.set_index_type(index_type, leaf_size) # SED is currently the only supported distance metric self.spec.kNearestNeighborsClassifier.nearestNeighborsIndex.squaredEuclideanDistance.MergeFromString( b"" ) @property def author(self): """ Get the author for the KNearestNeighborsClassifier model. Returns ------- The author """ return self.spec.description.metadata.author @author.setter def author(self, author): """ Add an author for the KNearestNeighborsClassifier model. Parameters ---------- author The author. 
Returns ------- None """ self.spec.description.metadata.author = author @property def license(self): """ Get the license for the KNearestNeighborsClassifier model. Returns ------- The license. """ return self.spec.description.metadata.license @author.setter def license(self, license): """ Add a license for the KNearestNeighborsClassifier model. Parameters ---------- license The license. Returns ------- None """ self.spec.description.metadata.license = license @property def description(self): """ Get the description for the KNearestNeighborsClassifier model. Returns ------- The description. """ return self.spec.description.metadata.shortDescription @description.setter def description(self, description): """ Add a description for the model. Parameters ---------- description The description Returns ------- None """ self.spec.description.metadata.shortDescription = description @property def is_updatable(self): """ Check if the KNearestNeighborsClassifier is updatable. Returns ------- Is updatable. """ return self.spec.isUpdatable @is_updatable.setter def is_updatable(self, is_updatable): """ Set the KNearestNeighborsClassifier to be updatable. Parameters ---------- is_updatable Boolean Returns ------- None """ self.spec.isUpdatable = is_updatable @property def weighting_scheme(self): """ Get the weighting scheme for the KNearestNeighborsClassifier model. Returns ------- The weighting scheme. """ return self._weighting_scheme @weighting_scheme.setter def weighting_scheme(self, weighting_scheme): """ Set the weighting scheme for the KNearestNeighborsClassifier model. Parameters ---------- weighting_scheme One of [ ``'uniform'``, ``'inverse_distance'`` ]. Returns ------- None """ weighting_scheme = weighting_scheme.lower() if weighting_scheme not in self._VALID_WEIGHTING_SCHEMES: raise TypeError("Invalid weighting scheme") if weighting_scheme == "inverse_distance": self.spec.kNearestNeighborsClassifier.inverseDistanceWeighting.MergeFromString( b"" ) else: self.spec.kNearestNeighborsClassifier.uniformWeighting.MergeFromString(b"") # storing this in the object is just a convenience self._weighting_scheme = weighting_scheme @property def index_type(self): """ Get the index type for the KNearestNeighborsClassifier model. Returns ------- The index type. """ return self._index_type def set_index_type(self, index_type, leaf_size=30): """ Set the index type for the KNearestNeighborsClassifier model. Parameters ---------- index_type One of [ ``'linear'``, ``'kd_tree'`` ]. leaf_size For kd_tree indexes, the leaf size to use (default = 30). Returns ------- None """ index_type = index_type.lower() if index_type not in self._VALID_INDEX_TYPES: raise TypeError("Invalid index type") if index_type == "kd_tree": if leaf_size <= 0: raise TypeError("leaf_size must be > 0") self.spec.kNearestNeighborsClassifier.nearestNeighborsIndex.singleKdTreeIndex.leafSize = ( leaf_size ) else: self.spec.kNearestNeighborsClassifier.nearestNeighborsIndex.linearIndex.MergeFromString( b"" ) # storing this in the object is just a convenience self._index_type = index_type @property def leaf_size(self): """ Get the leaf size for the KNearestNeighborsClassifier. Returns ------- The leaf size. """ return ( self.spec.kNearestNeighborsClassifier.nearestNeighborsIndex.singleKdTreeIndex.leafSize ) @leaf_size.setter def leaf_size(self, leaf_size): """ Set the leaf size for a KNearestNeighborsClassifier model. Only for kd-tree indexes. Parameters ---------- leaf_size The leaf size. 
Returns ------- None """ if leaf_size <= 0: raise ValueError("leaf_size must be > 0") self.spec.kNearestNeighborsClassifier.nearestNeighborsIndex.singleKdTreeIndex.leafSize = ( leaf_size ) @property def number_of_dimensions(self): """ Get the number of dimensions of the input data for the KNearestNeighborsClassifier model. Returns ------- Number of dimensions. """ return ( self.spec.kNearestNeighborsClassifier.nearestNeighborsIndex.numberOfDimensions ) @property def number_of_neighbors(self): """ Get the number of neighbors value for the KNearestNeighborsClassifier model. Returns ------- The number of neighbors default value. """ return self.spec.kNearestNeighborsClassifier.numberOfNeighbors.defaultValue def set_number_of_neighbors_with_bounds( self, number_of_neighbors, allowed_range=None, allowed_set=None ): """ Set the numberOfNeighbors parameter for the KNearestNeighborsClassifier model. Parameters ---------- allowed_range Tuple of (``min_value``, ``max_value``) defining the range of allowed values. allowed_values Set of allowed values for the number of neighbors. Returns ------- None """ if number_of_neighbors <= 0: raise ValueError("number_of_neighbors must be > 0") if allowed_range is None and allowed_set is None: raise ValueError( "Exactly one of allowed_range or allowed_values must be provided" ) if allowed_range is not None and allowed_set is not None: raise ValueError( "Exactly one of allowed_range or allowed_values must be provided" ) if allowed_range is not None: if not isinstance(allowed_range, tuple): raise TypeError( "allowed_range expects a tuple of (min_value, max_value)" ) if len(allowed_range) != 2: raise TypeError( "allowed_range expects a tuple of (min_value, max_value)" ) (min_value, max_value) = allowed_range if min_value <= 0: raise ValueError("allowed_range minimum must be > 0") if max_value < min_value: raise ValueError("allowed_range max_value must be >= min_value") if number_of_neighbors < min_value or number_of_neighbors > max_value: raise ValueError("number_of_neighbors is not within allowed range") self.spec.kNearestNeighborsClassifier.numberOfNeighbors.defaultValue = ( number_of_neighbors ) self.spec.kNearestNeighborsClassifier.numberOfNeighbors.range.minValue = ( min_value ) self.spec.kNearestNeighborsClassifier.numberOfNeighbors.range.maxValue = ( max_value ) elif allowed_set is not None: if not isinstance(allowed_set, set): raise TypeError("allowed_values expects 'set' type") if len(allowed_set) == 0: raise TypeError("allowed_values cannot be empty") found_match = False for v in allowed_set: if not self._is_valid_number_type(v): raise TypeError("allowed_values must contain only integer types") if v <= 0: raise TypeError("allowed_values must only contain values > 0") if number_of_neighbors == v: found_match = True if found_match: self.spec.kNearestNeighborsClassifier.numberOfNeighbors.defaultValue = ( number_of_neighbors ) for v in allowed_set: self.spec.kNearestNeighborsClassifier.numberOfNeighbors.set.values.append( v ) else: raise ValueError("number_of_neighbors is not a valid value") def number_of_neighbors_allowed_range(self): """ Get the range of allowed values for the numberOfNeighbors parameter. Returns ------- Tuple of (``min_value``, ``max_value``) or ``None`` if the range hasn't been set. 
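        The sketch below is illustrative: it constrains the parameter with
        ``set_number_of_neighbors_with_bounds`` and then reads the stored
        range back.

        Examples
        --------
        .. sourcecode:: python

            builder.set_number_of_neighbors_with_bounds(5, allowed_range=(1, 50))
            assert builder.number_of_neighbors_allowed_range() == (1, 50)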
""" if self.spec.kNearestNeighborsClassifier.numberOfNeighbors.HasField("range"): return ( self.spec.kNearestNeighborsClassifier.numberOfNeighbors.range.minValue, self.spec.kNearestNeighborsClassifier.numberOfNeighbors.range.maxValue, ) return None def number_of_neighbors_allowed_set(self): """ Get the set of allowed values for the numberOfNeighbors parameter. Returns ------- Set of allowed values or ``None`` if the set of allowed values hasn't been populated. """ if self.spec.kNearestNeighborsClassifier.numberOfNeighbors.HasField("set"): spec_values = ( self.spec.kNearestNeighborsClassifier.numberOfNeighbors.set.values ) allowed_values = set() for v in spec_values: allowed_values.add(v) return allowed_values return None def add_samples(self, data_points, labels): """ Add some samples to the KNearestNeighborsClassifier model. Parameters ---------- data_points List of input data points. labels List of corresponding labels. Returns ------- None """ if len(data_points) == 0: raise TypeError("data_points is empty") if len(labels) == 0: raise TypeError("labels is empty") if len(data_points[0]) != self.number_of_dimensions: raise TypeError( "dimensionality of data_points != expected number of dimensions" ) if len(data_points) != len(labels): raise TypeError("len(data_points) != len(labels)") # Validate the types of the labels before adding any points. self._validate_label_types(labels) for data_point in data_points: sample = ( self.spec.kNearestNeighborsClassifier.nearestNeighborsIndex.floatSamples.add() ) for feature in data_point: sample.vector.append(feature) if self.spec.kNearestNeighborsClassifier.HasField("int64ClassLabels"): for label in labels: self.spec.kNearestNeighborsClassifier.int64ClassLabels.vector.append( label ) else: # string labels for label in labels: self.spec.kNearestNeighborsClassifier.stringClassLabels.vector.append( label ) def _validate_label_types(self, labels): """ Ensure the label types matched the expected types. Parameters ---------- spec The spec. labels The list of labels. Returns ------- None, throws a TypeError if not expected. """ if self.spec.kNearestNeighborsClassifier.HasField("int64ClassLabels"): check_is_valid = KNearestNeighborsClassifierBuilder._is_valid_number_type else: check_is_valid = KNearestNeighborsClassifierBuilder._is_valid_text_type for label in labels: if not check_is_valid(label): raise TypeError("Invalid type for label: {}".format(type(label))) @staticmethod def _is_valid_text_type(obj): """ Checks if the object is a valid text type. Parameters ---------- obj The object to check. Returns ------- True if a valid text type, False otherwise. """ return isinstance(obj, str) @staticmethod def _is_valid_number_type(obj): """ Checks if the object is a valid number type. Parameters ---------- obj The object to check. Returns ------- True if a valid number type, False otherwise. """ return isinstance(obj, (int, _np.integer)) ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1726511965.261547 coremltools-8.0/coremltools/models/neural_network/0000755000000000000000000000000014672075535021363 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/neural_network/__init__.py0000644000000000000000000000074614672066616023503 0ustar00rootroot# Copyright (c) 2018, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from . import (flexible_shape_utils, optimization_utils, printer, quantization_utils, spec_inspection_utils, update_optimizer_utils, utils) from .builder import NeuralNetworkBuilder from .update_optimizer_utils import AdamParams, SgdParams ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/neural_network/builder.py0000644000000000000000000122661714672066616023402 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause """ Neural network builder class to construct Core ML models. """ from math import floor as _math_floor import numpy as _np from ... import ( _MINIMUM_NDARRAY_SPEC_VERSION, _MINIMUM_UPDATABLE_SPEC_VERSION, _SPECIFICATION_VERSION_IOS_14, ) from ... import SPECIFICATION_VERSION as _SPECIFICATION_VERSION from ... import proto as _proto from .. import datatypes from .._interface_management import set_training_features, set_transform_interface_params from .quantization_utils import _convert_array_to_nbit_quantized_bytes, _unpack_to_bytes from .spec_inspection_utils import _summarize_network_layer_info from .update_optimizer_utils import AdamParams, SgdParams _SUPPORTED_UPDATABLE_LAYERS = ["innerProduct", "convolution"] def _set_recurrent_activation(param, activation): if isinstance(activation, bytes): activation = activation.decode("utf8") activation = ( activation.upper() if isinstance(activation, str) else activation ) if activation == "SIGMOID": param.sigmoid.MergeFromString(b"") elif activation == "TANH": param.tanh.MergeFromString(b"") elif activation == "LINEAR": param.linear.MergeFromString(b"") elif activation == "SIGMOID_HARD" or activation == "HARD_SIGMOID": # The standard name is "hard_sigmoid", but in nn there are still usages of "sigmoid_hard". param.sigmoidHard.MergeFromString(b"") elif activation == "SCALED_TANH": param.scaledTanh.MergeFromString(b"") elif activation == "RELU": param.ReLU.MergeFromString(b"") else: raise TypeError( "Unsupported activation type with Recurrent layer: %s." 
% activation ) def _verify_quantization_arguments(weight=bytes(), output_channels=1, **kwargs): quantization_type = kwargs.get("quantization_type", "").lower() nbits = kwargs.get("nbits", 8) quant_scale = kwargs.get("quant_scale", None) quant_bias = kwargs.get("quant_bias", None) quant_lut = kwargs.get("quant_lut", None) int_8_dynamic_quantize = kwargs.get("int_8_dynamic_quantize", False) if int_8_dynamic_quantize and nbits != 8: raise ValueError("nbits must be 8 when 'int_8_dynamic_quantize' is true ") if int_8_dynamic_quantize and quant_bias is not None: raise ValueError( "quant_bias must be empty when 'int_8_dynamic_quantize' is true " ) if int_8_dynamic_quantize and quant_scale.size != 1: raise ValueError( "quant_scale must be of size 1 when 'int_8_dynamic_quantize' is true " ) if not isinstance(weight, bytes): raise ValueError("Weight must be of type bytes() for quantization") if quantization_type == "linear": if not int_8_dynamic_quantize: if quant_scale is None or quant_bias is None: raise ValueError( "quant_scale and quant_bias parameters must be provided for linear quantization type" ) if not _np.isscalar(quant_scale) and (len(quant_scale) != 1 and len(quant_scale) != output_channels): raise ValueError( "quant_scale should be of type float or an array of length outputChannels" ) if not int_8_dynamic_quantize: if not _np.isscalar(quant_scale) and len(quant_bias) != 1 and len(quant_bias) != output_channels: raise ValueError( "quant_bias should be of type float or an array of length outputChannels" ) elif quantization_type == "lut": if quant_lut is None: raise ValueError( "quant_lut must be provided for look up table quantization type" ) if len(quant_lut) != 2 ** nbits: raise ValueError("quant_lut must be an array of length 2^nbits") else: raise ValueError("quantization_type must be either linear or lut") if quantization_type == "linear" or "lut": if nbits > 8 or nbits < 1: raise ValueError("nbits must be between 1 and 8") def _fill_quantized_weights(weights_message=None, W=bytes(), use_int_8=False, **kwargs): if use_int_8: weights_message.int8RawValue = bytes() weights_message.int8RawValue += W else: weights_message.rawValue = bytes() weights_message.rawValue += W nbits = kwargs.get("nbits", 8) weights_message.quantization.numberOfBits = nbits quantization_type = kwargs.get("quantization_type", "").lower() if quantization_type == "linear": quant_scale = kwargs.get("quant_scale", [1.0]) quant_bias = kwargs.get("quant_bias", [0.0]) weights_message.quantization.linearQuantization.scale.extend(quant_scale) if not use_int_8: weights_message.quantization.linearQuantization.bias.extend(quant_bias) else: quant_lut = kwargs.get("quant_lut", [0.0, 1.0]) weights_message.quantization.lookupTableQuantization.floatValue.extend( quant_lut ) def _get_nn_spec(spec): if spec.HasField("neuralNetworkClassifier"): return spec.neuralNetworkClassifier elif spec.HasField("neuralNetworkRegressor"): return spec.neuralNetworkRegressor elif spec.HasField("neuralNetwork"): return spec.neuralNetwork else: return None def _get_lstm_weight_fields(lstm_wp): """ Get LSTM weight fields. 
lstm_wp: _proto.NeuralNetwork_pb2.LSTMWeightParams """ return [ lstm_wp.inputGateWeightMatrix, lstm_wp.forgetGateWeightMatrix, lstm_wp.blockInputWeightMatrix, lstm_wp.outputGateWeightMatrix, lstm_wp.inputGateRecursionMatrix, lstm_wp.forgetGateRecursionMatrix, lstm_wp.blockInputRecursionMatrix, lstm_wp.outputGateRecursionMatrix, lstm_wp.inputGateBiasVector, lstm_wp.forgetGateBiasVector, lstm_wp.blockInputBiasVector, lstm_wp.outputGateBiasVector, lstm_wp.inputGatePeepholeVector, lstm_wp.forgetGatePeepholeVector, lstm_wp.outputGatePeepholeVector, ] def _fill_tensor_fields(tensor_field, ranks=None, shapes=None): """ Fill the tensor fields. ranks - ``NONE`` or a list of integers with the same length of number of inputs/outputs shapes - ``NONE`` or a list of shapes the same length of number of inputs/outputs. Each shape is a list or tuple """ if ranks is None and shapes is None: return if ranks is None and shapes is not None: ranks = [len(shape) for shape in shapes] # Fill ranks only for rank in ranks: if rank is None: continue if not issubclass(type(rank), (int, _np.integer)): rank = -1 # Variable rank set to -1 field = tensor_field.add() field.rank = rank if ranks is not None and shapes is not None: if len(ranks) != len(shapes): raise ValueError("Number of rank and shape of tensor field does not match.") for i in range(0, len(ranks)): shape = shapes[i] rank = ranks[i] # Ignore incomplete info if shape is None or rank is None: continue # Raise error on inconsistent input if rank != len(shape): raise ValueError("Rank and shape does not match") # Add the shape to the proto is_symbolic = False for s in shape: if not issubclass(type(s), (int, _np.integer)): s = -1 # Symbolic shape set to -1 tensor_field[i].dimValue.append(s) class NeuralNetworkBuilder: """ Neural network builder class to construct Core ML models. The NeuralNetworkBuilder constructs a Core ML neural network specification layer by layer. The layers should be added in such an order that the inputs to each layer (referred to as blobs of each layer) have been previously defined. The builder can also set preprocessing steps to handle specialized input formats (such as images), and set class labels for neural network classifiers. Refer to the protobuf messages in the specification (NeuralNetwork.proto) for more details. Examples -------- .. 
sourcecode:: python import numpy as np from coremltools.models import datatypes from coremltools.models.neural_network import NeuralNetworkBuilder from coremltools.models.utils import save_spec # Create a neural network binary classifier that classifies # 3-dimensional data points # Specify input and output dimensions input_dim = (3,) output_dim = (2,) # Specify input and output features input_features = [("data", datatypes.Array(*input_dim))] output_features = [("probs", datatypes.Array(*output_dim))] # Create random weights and bias weights = np.random.rand(2, 3) bias = np.random.rand(2) # Build a simple neural network with 1 inner product layer builder = NeuralNetworkBuilder(input_features, output_features) builder.add_inner_product( name="ip_layer", W=weights, b=bias, input_channels=3, output_channels=2, has_bias=True, input_name="data", output_name="probs", ) # save the spec by the builder save_spec(builder.spec, "network.mlmodel") """ def __init__( self, input_features=None, output_features=None, mode=None, spec=None, nn_spec=None, disable_rank5_shape_mapping=False, training_features=None, use_float_arraytype=False, ): """ Construct a NeuralNetworkBuilder object to build an MLModel specification with a model interface, or a NeuralNetwork protobuf message, either from scratch or using an existing specification. Parameters ---------- input_features: [(str, datatypes.Array)] or None List of input feature of the network. Each feature is a ``(name, array)`` tuple, where ``name`` is the name of the feature, and ``array`` is a ``datatype.Array`` object describing the feature type. * When ``spec`` is ``None`` (building from scratch), ``input_features`` must not be ``None``. output_features: [(str, datatypes.Array or None)] or None List of output feature of the network. Each feature is a ``(name, array)`` tuple, where ``name`` is the name of the feature, and ``array`` is a ``datatypes.Array`` object describing the feature type. * The ``array`` can be ``None`` if not known. * When ``spec`` is ``None`` (building from scratch), ``output_features`` must not be ``None``. mode: str ('classifier', 'regressor' or None) Mode (one of ``'classifier'``, ``'regressor'``, or ``None``). When ``mode = 'classifier'``, a NeuralNetworkClassifier spec will be constructed. When ``mode = 'regressor'``, a NeuralNetworkRegressor spec will be constructed. disable_rank5_shape_mapping: bool Only applicable for neural networks. If True, inputs are no longer forced to map to rank 5 tensors (rank is equal to the length of the shape of the tensor). Instead, for multi-array inputs ``"EXACT_ARRAY_MAPPING"`` mapping is used, whereas for image inputs ``"RANK4_IMAGE_MAPPING"`` is used. For details, see description of enums ``NeuralNetworkMultiArrayShapeMapping`` and ``NeuralNetworkImageShapeMapping`` in NeuralNetwork.proto. When ``spec`` is not ``None``, this argument will be ignored. spec: None or coremltools.proto.Model_pb2 If ``None``, a new MLModel spec will be created by the builder with input and output features. Otherwise, the builder will continue to build on ``spec``. This is useful when the MLModel is built incrementally. nn_spec: None or coremltools.proto.NeuralNetwork_pb2 If ``None``, a new, empty NeuralNetwork proto will be created for spec. If ``nn_spec`` is not ``None`` and ``spec`` is ``None``, the builder will build a NeuralNetwork spec without wrapping it within an MLModel. This is useful to create nested NeuralNetworks for models with control flow operations. 
use_float_arraytype: bool If true, the datatype of input/output multiarrays is set to Float32 instead of double. Examples -------- .. sourcecode:: python # Construct a builder that builds a neural network classifier with a 299 x 299 x 3 # dimensional input and 1000 dimensional output input_features = [("data", datatypes.Array((299, 299, 3)))] output_features = [("probs", datatypes.Array((1000,)))] builder = NeuralNetworkBuilder(input_features, output_features, mode="classifier") See Also -------- set_input, set_output, set_class_labels """ self.spec = spec self.nn_spec = nn_spec self._disable_rank5_shape_mapping = disable_rank5_shape_mapping self.layers = [] self.layer_specs = {} self.named_parameters = [] self.rank_dict = {} if self.spec is not None: # Existing spec if self.nn_spec is None: self.nn_spec = _get_nn_spec(self.spec) for layer_spec in self.nn_spec.layers: self.layers.append(layer_spec.name) self.layer_specs[layer_spec.name] = layer_spec else: # Both spec and nn_spec are not None raise ValueError( "Attempting to assign another NeuralNetwork Spec to an existing MLModel Spec" ) if input_features is None and output_features is None: return if ( self.spec is None and self.nn_spec is not None ): # Building nested Neural Network return # Set the interface params. if self.spec is None: self.spec = _proto.Model_pb2.Model() self.spec.specificationVersion = _SPECIFICATION_VERSION if disable_rank5_shape_mapping: self.spec.specificationVersion = _MINIMUM_NDARRAY_SPEC_VERSION # When output_features in None, use some dummy sized type out_features_with_shape = [] for out_feature in output_features: feat_name, feat_type = out_feature if feat_type is None: out_features_with_shape.append((str(feat_name), datatypes.Array(1))) else: out_features_with_shape.append(out_feature) # Set interface inputs and outputs if len(self.spec.description.input) > 0: del self.spec.description.input[:] if len(self.spec.description.output) > 0: del self.spec.description.output[:] if use_float_arraytype: array_datatype = _proto.Model_pb2.ArrayFeatureType.FLOAT32 else: array_datatype = _proto.Model_pb2.ArrayFeatureType.DOUBLE self.spec = set_transform_interface_params( self.spec, input_features, out_features_with_shape, training_features=training_features, array_datatype=array_datatype, ) for input in input_features: self.rank_dict[input[0]] = len(input[1].dimensions) for idx, output_feature in enumerate(output_features): if output_features[idx][1] is None: self.spec.description.output[idx].type.multiArrayType.ClearField( "shape" ) if self.nn_spec is None: if mode == "classifier": nn_spec = self.spec.neuralNetworkClassifier elif mode == "regressor": nn_spec = self.spec.neuralNetworkRegressor else: nn_spec = self.spec.neuralNetwork self.nn_spec = nn_spec if disable_rank5_shape_mapping and self.nn_spec: self.nn_spec.arrayInputShapeMapping = ( _proto.NeuralNetwork_pb2.NeuralNetworkMultiArrayShapeMapping.Value( "EXACT_ARRAY_MAPPING" ) ) self.nn_spec.imageInputShapeMapping = ( _proto.NeuralNetwork_pb2.NeuralNetworkImageShapeMapping.Value("RANK4_IMAGE_MAPPING") ) def set_input(self, input_names, input_dims): """ Set the inputs of the network spec. Parameters ---------- input_names: list of str The input names of the network. input_dims: [tuple] The input dimensions of the network. The ordering of ``input_dims`` is the same as ``input_names``. Examples -------- .. sourcecode:: python # Set the neural network spec inputs to be 3 dimensional vector data1 and # 4 dimensional vector data2. 
builder.set_input(input_names=["data1", "data2"], input_dims=[(3,), (4,)]) See Also -------- set_output, set_class_labels """ if len(input_names) != len(input_dims): raise ValueError("input_names and input_dims must be of the same sizes.") spec = self.spec for idx, dim in enumerate(input_dims): if ( hasattr(self, "_disable_rank5_shape_mapping") and self._disable_rank5_shape_mapping ): input_shape = dim else: if len(dim) == 3: input_shape = (dim[0], dim[1], dim[2]) elif len(dim) == 2: input_shape = (dim[1],) elif len(dim) == 1: input_shape = tuple(dim) else: raise RuntimeError( "Attempting to add a neural network " + "input with rank " + str(len(dim)) + ". All networks should take inputs of rank 1 or 3." ) spec.description.input[idx].type.multiArrayType.ClearField("shape") spec.description.input[idx].type.multiArrayType.shape.extend(input_shape) # TODO: if it's an embedding, this should be integer spec.description.input[ idx ].type.multiArrayType.dataType = _proto.Model_pb2.ArrayFeatureType.DOUBLE spec.description.input[idx].name = input_names[idx] def set_output(self, output_names, output_dims): """ Set the outputs of the network spec. Parameters ---------- output_names: list of str The output names of the network. output_dims: [tuple] The output dimensions of the network. The ordering of ``output_dims`` is the same as ``output_names``. Examples -------- .. sourcecode:: python # Set the neural network spec outputs to be 3 dimensional vector feature1 and # 4 dimensional vector feature2. builder.set_output(output_names=["feature1", "feature2"], output_dims=[(3,), (4,)]) See Also -------- set_input, set_class_labels """ if len(output_names) != len(output_dims): raise ValueError("output_names and output_dims must be of the same sizes.") spec = self.spec for idx, dim in enumerate(output_dims): spec.description.output[idx].type.multiArrayType.ClearField("shape") spec.description.output[idx].type.multiArrayType.shape.extend(dim) spec.description.output[ idx ].type.multiArrayType.dataType = _proto.Model_pb2.ArrayFeatureType.DOUBLE spec.description.output[idx].name = output_names[idx] def set_training_input(self, training_input): """ Set the training inputs of the network spec. Parameters ---------- training_input: [tuple] The training input names and type of the network. Examples -------- .. sourcecode:: python # Set the neural network spec training inputs to be 3 dimensional vector for 'input' and # Double for 'target'. builder.set_training_input([("input", datatypes.Array(3)), ("target", "Double")]) """ spec = self.spec set_training_features(spec, training_input) def set_class_labels( self, class_labels, predicted_feature_name="classLabel", prediction_blob="" ): """ Set class labels to the model spec to make it a neural network classifier. Parameters ---------- class_labels: list of int or list of str A list of integers or strings that map the index of the output of a neural network to labels in a classifier. predicted_feature_name: str Name of the output feature for the class labels exposed in the Core ML neural network classifier, defaults: ``'classLabel'``. prediction_blob: str If provided, then this is the name of the neural network blob which generates the probabilities for each class label (typically the output of a softmax layer). If not provided, then the last output layer is assumed. 
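Examples
--------
A minimal sketch, assuming ``builder`` already contains a softmax layer whose
output blob is named ``"probs"`` and produces one probability per class
(the blob name and labels are placeholders):

.. sourcecode:: python

    # Turn the network into a classifier that maps the two softmax
    # outputs to the string labels "cat" and "dog".
    builder.set_class_labels(
        class_labels=["cat", "dog"],
        predicted_feature_name="classLabel",
        prediction_blob="probs",
    )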
See Also -------- set_input, set_output, set_pre_processing_parameters """ spec = self.spec nn_spec = self.nn_spec if len(spec.description.output) == 0: raise ValueError( "Model should have at least one output (the probabilities) to automatically make it a classifier." ) probOutput = spec.description.output[0] probOutput.type.dictionaryType.MergeFromString(b"") if len(class_labels) == 0: return class_type = type(class_labels[0]) if not isinstance(class_labels[0], (int, str)): raise TypeError( "Class labels must be of type Integer or String. (not %s)" % class_type ) spec.description.predictedProbabilitiesName = probOutput.name spec.description.predictedFeatureName = predicted_feature_name classLabel = spec.description.output.add() classLabel.name = predicted_feature_name if class_type == int: nn_spec.ClearField("int64ClassLabels") probOutput.type.dictionaryType.int64KeyType.MergeFromString(b"") classLabel.type.int64Type.MergeFromString(b"") for c in class_labels: nn_spec.int64ClassLabels.vector.append(c) else: nn_spec.ClearField("stringClassLabels") probOutput.type.dictionaryType.stringKeyType.MergeFromString(b"") classLabel.type.stringType.MergeFromString(b"") for c in class_labels: nn_spec.stringClassLabels.vector.append(c) if prediction_blob != "": # correctness here will be checked in the validator -- i.e. to # make sure this string corresponds to a real blob nn_spec.labelProbabilityLayerName = prediction_blob else: # not provided # assume it's the last blob produced in the network nn_spec.labelProbabilityLayerName = nn_spec.layers[-1].output[0] def set_optional_input(self, input_idx, value=None, format="float"): """ Marks given input as optional input. Optionally, sets default value for optional input if value is not ``None``. Parameters ---------- input_idx: int Index of input to be marked and fill with default value. value: int/double/float/None Value to be fill as default value. format: str Format of default value. Must be one of ``'float'``, ``'double'``, or ``'int'``. """ if input_idx >= len(self.spec.description.input): msg = ( str(input_idx) + " out of " + str(len(self.spec.description.input)) + " inputs!" ) raise ValueError("Setting invalid input as optional! {}".format(msg)) self.spec.description.input[input_idx].type.isOptional = True if value is None: return # Default value is supported from CoreML 4 onwards. self.spec.specificationVersion = max( self.spec.specificationVersion, _SPECIFICATION_VERSION_IOS_14 ) format = format.lower() if format == "float": self.spec.description.input[ input_idx ].type.multiArrayType.floatDefaultValue = value elif format == "double": self.spec.description.input[ input_idx ].type.multiArrayType.doubleDefaultValue = value elif format == "int": self.spec.description.input[ input_idx ].type.multiArrayType.intDefaultValue = value else: raise ValueError( "Incorrect format for optional inputs! Expecting int/float/double, got {}!".format( format ) ) def add_optionals(self, optionals_in, optionals_out): """ Add optional inputs and outputs to the model spec. Parameters ---------- optionals_in: list of str List of inputs that are optionals. optionals_out: list of str List of outputs that are optionals. 
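Examples
--------
Note that the implementation unpacks each entry as a ``(name, dim)`` tuple,
where ``dim`` is an ``int`` or a tuple of ints. A minimal sketch (the blob
names ``"opt_in"`` and ``"opt_out"`` are placeholders for blobs that already
exist in the network):

.. sourcecode:: python

    # Append an optional 3-dimensional input and output to the
    # model interface.
    builder.add_optionals(
        optionals_in=[("opt_in", 3)],
        optionals_out=[("opt_out", 3)],
    )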
See Also -------- set_input, set_output """ spec = self.spec if (not optionals_in) and (not optionals_out): return input_types = [ datatypes.Array(dim) if isinstance(dim, int) else datatypes.Array(*dim) for (name, dim) in optionals_in ] output_types = [] for name, dim in optionals_out: if not dim: output_types.append(None) elif isinstance(dim, int): output_types.append(datatypes.Array(dim)) else: output_types.append(datatypes.Array(*dim)) input_names = [str(name) for (name, dim) in optionals_in] output_names = [str(name) for (name, dim) in optionals_out] input_features = list(zip(input_names, input_types)) output_features = list(zip(output_names, output_types)) len_before_in = len(spec.description.input) len_before_out = len(spec.description.output) # this appends to the existing model interface set_transform_interface_params(spec, input_features, output_features, True) # add types for any extra hidden inputs for idx in range(len_before_in, len(spec.description.input)): spec.description.input[ idx ].type.multiArrayType.dataType = _proto.Model_pb2.ArrayFeatureType.DOUBLE for idx in range(len_before_out, len(spec.description.output)): spec.description.output[ idx ].type.multiArrayType.dataType = _proto.Model_pb2.ArrayFeatureType.DOUBLE def _check_fp16_weight_params_lstms(self, lstm_wp, has_peephole=True): """ Checks if an LSTM layer has at least one ``weight_param`` which is in FP16 format. Parameters ---------- lstm_wp: LSTM weights. has_peephole: if the LSTM has a peephole. """ if len(lstm_wp.inputGateWeightMatrix.float16Value) > 0: return True if len(lstm_wp.forgetGateWeightMatrix.float16Value) > 0: return True if len(lstm_wp.blockInputWeightMatrix.float16Value) > 0: return True if len(lstm_wp.outputGateWeightMatrix.float16Value) > 0: return True if len(lstm_wp.inputGateRecursionMatrix.float16Value) > 0: return True if len(lstm_wp.forgetGateRecursionMatrix.float16Value) > 0: return True if len(lstm_wp.blockInputRecursionMatrix.float16Value) > 0: return True if len(lstm_wp.outputGateRecursionMatrix.float16Value) > 0: return True if len(lstm_wp.inputGateWeightMatrix.float16Value) > 0: return True if has_peephole: if len(lstm_wp.inputGatePeepholeVector.float16Value) > 0: return True if len(lstm_wp.forgetGatePeepholeVector.float16Value) > 0: return True if len(lstm_wp.outputGatePeepholeVector.float16Value) > 0: return True return False def _check_fp16_weight_param_exists(self, layers): """ Checks if the network has at least one ``weight_param`` which is in FP16 format. Parameters ---------- layers: list of nn_spec.layer List of layers. 
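Returns ``True`` if any layer in ``layers`` stores weights or biases in the
``float16Value`` field, and ``False`` otherwise. A minimal sketch of the
internal check performed by ``make_updatable`` (assuming ``builder`` is a
populated ``NeuralNetworkBuilder``):

.. sourcecode:: python

    # Models holding FP16 weight parameters cannot be marked updatable.
    if builder._check_fp16_weight_param_exists(builder.nn_spec.layers):
        raise ValueError("FP16 weights cannot be marked updatable.")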
""" for layer in layers: layer_type = layer.WhichOneof("layer") # Convolution if layer_type == "convolution": if len(layer.convolution.weights.float16Value) > 0: return True if layer.convolution.hasBias and len(layer.convolution.bias.float16Value) > 0: return True # Batchnorm elif layer_type == "batchnorm": if len(layer.batchnorm.mean.float16Value) > 0: return True # InnerProduct elif layer_type == "innerProduct": if len(layer.innerProduct.weights.float16Value) > 0: return True if layer.innerProduct.hasBias and len(layer.innerProduct.bias.float16Value) > 0: return True # BatchedMatmul elif layer_type == "batchedMatmul": if len(layer.batchedMatmul.weights.float16Value) > 0: return True if layer.batchedMatmul.hasBias and len(layer.batchedMatmul.bias.float16Value) > 0: return True # Embedding layer elif layer_type == "embedding": if len(layer.embedding.weights.float16Value) > 0: return True if layer.embedding.hasBias and len(layer.embedding.bias.float16Value) > 0: return True # Embedding ND layer elif layer_type == "embeddingND": if len(layer.embeddingND.weights.float16Value) > 0: return True if layer.embeddingND.hasBias and len(layer.embeddingND.bias.float16Value) > 0: return True # Scale layer elif layer_type == "scale": if len(layer.scale.shapeScale.float16Value) > 0: return True if layer.scale.hasBias and len(layer.scale.bias.float16Value) > 0: return True # Bias layer elif layer_type == "bias": if len(layer.bias.bias.float16Value) > 0: return True # LoadConstant layer elif layer_type == "loadConstant": if len(layer.loadConstant.data.float16Value) > 0: return True # Simple Recurrent elif layer_type == "simpleRecurrent": if len(layer.simpleRecurrent.weightMatrix.float16Value) > 0: return True if layer.simpleRecurrent.hasBiasVector and len(layer.simpleRecurrent.biasVector.float16Value) > 0: return True # GRU elif layer_type == "gru": if len(layer.gru.updateGateWeightMatrix.float16Value) > 0: return True if layer.gru.hasBiasVectors and len(layer.gru.outputGateBiasVector.float16Value) > 0: return True # uniDirectionalLSTM Layers elif layer_type == "uniDirectionalLSTM": return self._check_fp16_weight_params_lstms(lstm_wp=layer.uniDirectionalLSTM.weightParams, has_peephole=layer.uniDirectionalLSTM.params.hasPeepholeVectors) # biDirectionalLSTM Layers elif layer_type == "biDirectionalLSTM": for lstm_wp in layer.biDirectionalLSTM.weightParams: if self._check_fp16_weight_params_lstms(lstm_wp=lstm_wp, has_peephole=layer.biDirectionalLSTM.params.hasPeepholeVectors): return True # branch Layers elif layer_type == "branch": if len(layer.branch.ifBranch.float16Value) > 0: return True if len(layer.branch.elseBranch.float16Value) > 0: return True # loop Layers elif layer_type == "loop": if len(layer.loop.conditionNetwork.float16Value) > 0: return True if len(layer.loop.bodyNetwork.float16Value) > 0: return True return False def make_updatable(self, trainables): """ Make the builder's NeuralNetwork spec updatable. Parameters ---------- trainables: list of str List of layer names to be set trainable. """ if self.spec is None: return # check if any layer weights/biases is in FP16 format if self._check_fp16_weight_param_exists(self.nn_spec.layers): raise ValueError("This model has at least one layer with FP16 weights or bias formats. These networks will " "always be optimized to a full FP16 model format which is not supported to be marked " "updatable. 
Either make sure the model has no FP16 WeightParams or split the " "network to two models with updatable part of the model as a separate model with no FP16 " "WeightParams. Note that updatable pipelines model can only have the last sub model marked " "as updatable.") self.spec.isUpdatable = True if ( not self.spec.specificationVersion or self.spec.specificationVersion < _MINIMUM_UPDATABLE_SPEC_VERSION ): self.spec.specificationVersion = _MINIMUM_UPDATABLE_SPEC_VERSION self.nn_spec.updateParams.MergeFromString(b"") self.set_shuffle() for trainable in trainables: if trainable not in self.layer_specs: raise ValueError("Layer %s does not exist." % trainable) spec_layer = self.layer_specs[trainable] spec_layer_type = spec_layer.WhichOneof("layer") if spec_layer_type not in _SUPPORTED_UPDATABLE_LAYERS: raise ValueError( "Layer %s is not supported to be marked as updatable. Only %s layers " "are supported to be marked updatable." % (trainable, _SUPPORTED_UPDATABLE_LAYERS) ) spec_layer.isUpdatable = True typed_layer = getattr(spec_layer, spec_layer.WhichOneof("layer")) for fd in typed_layer.DESCRIPTOR.fields: field = getattr(typed_layer, fd.name) if type(field) == _proto.NeuralNetwork_pb2.LSTMWeightParams: wfs = _get_lstm_weight_fields(field) for wf in wfs: wf.isUpdatable = True elif type(field) == _proto.NeuralNetwork_pb2.WeightParams: field.isUpdatable = True else: pass def set_categorical_cross_entropy_loss(self, name, input): r""" Categorical Cross Entropy is used for single label categorization (only one category is applicable for each data point). Parameters ---------- name: The name of the loss layer input: The name of the input The ``input`` should be a vector of length N representing the distribution over N categories. This must be the output of a softmax. Notes ----- .. math:: Loss_ {CCE}(input, target) = -\sum_{i = 1} ^ {N}(target == i) log(input[i]) = - log(input[target]) """ if self.spec is None: return if name in self.layer_specs: raise ValueError("Name %s is already used." % name) if input is None: raise ValueError("Loss Layer input must be specified") target = input + "_true" if len(self.nn_spec.layers) < 1: raise ValueError( "Loss layer (%s) cannot be attached to an empty model." % name ) # validate input # input must be a softmax layer output input_validated = False for _, layer in enumerate(self.nn_spec.layers[::-1]): layer_outputs = list(layer.output) layer_type = layer.WhichOneof("layer") if input in layer_outputs and layer_type == "softmax": input_validated = True break if not input_validated: raise ValueError( "Categorical Cross Entropy loss layer input (%s) must be a softmax layer output." % input ) # validate target output_names = [x.name for x in self.spec.description.output] if target in output_names: raise ValueError( "Loss layer target (%s) must not be a model output." 
% target ) updating_classifier = False predicted_probabilities_name = self.spec.description.predictedProbabilitiesName predicted_feature_name = self.spec.description.predictedFeatureName if ( self.spec.HasField("neuralNetworkClassifier") and input == predicted_probabilities_name ): updating_classifier = True loss_layer = self.nn_spec.updateParams.lossLayers.add() self.layers.append(name) self.layer_specs[name] = loss_layer loss_layer.name = name loss_layer.categoricalCrossEntropyLossLayer.input = input loss_layer.categoricalCrossEntropyLossLayer.target = target training_inputs = self.spec.description.trainingInput training_inputs.extend(self.spec.description.input) training_input = training_inputs.add() if updating_classifier: training_input.name = predicted_feature_name classifier_output_type = [ x.type for x in self.spec.description.output if x.name == predicted_feature_name ] model_type = classifier_output_type[0].WhichOneof("Type") if model_type == "stringType": datatypes._set_datatype(training_input.type, datatypes.String()) elif model_type == "int64Type": datatypes._set_datatype(training_input.type, datatypes.Int64()) else: training_input.name = target datatypes._set_datatype(training_input.type, datatypes.Array(1)) training_input.type.multiArrayType.dataType = _proto.Model_pb2.ArrayFeatureType.INT32 print( "Now adding input {} as target for categorical cross-entropy loss layer.".format( target ) ) def set_mean_squared_error_loss(self, name, input_feature=None): """ input_feature: [(str, datatypes.Array)] or None The input feature of the loss layer. Each feature is a ``(name, array)`` tuple, where ``name`` is the name of the model's tensor our loss will be attached to, and ``array`` is a ``datatypes.Array`` object describing the shape of that tensor. Both the name and the array's shape must be provided in the tuple. Examples -------- feature = [('output_tensor', datatypes.Array((299, 299, 3)))] """ if self.spec is None: return if name in self.layer_specs: raise ValueError("Name %s is already used." 
% name) if input_feature is None: raise ValueError("Loss Layer input must be specified") if not isinstance(input_feature, tuple): raise ValueError( "Loss layer input must be a tuple of type (string, datatype)" ) (fname, ftype) = input_feature if not isinstance(fname, str): raise ValueError( "Loss layer input must be a tuple of type (string, datatype)" ) if not isinstance(ftype, datatypes.Array): raise ValueError( "Loss layer input must be a tuple of type (string, datatype)" ) target = fname + "_true" loss_layer = self.nn_spec.updateParams.lossLayers.add() self.layers.append(name) self.layer_specs[name] = loss_layer loss_layer.name = name output_names = [x.name for x in self.spec.description.output] if target in output_names: raise ValueError( "Loss Layer target (%s) must not be a model output" % target ) loss_layer.meanSquaredErrorLossLayer.input = input_feature[0] loss_layer.meanSquaredErrorLossLayer.target = target training_inputs = self.spec.description.trainingInput training_inputs.extend(self.spec.description.input) training_input = training_inputs.add() training_input.name = target datatypes._set_datatype(training_input.type, input_feature[1]) training_input.type.multiArrayType.dataType = _proto.Model_pb2.ArrayFeatureType.DOUBLE print( "Now adding input {} as target for mean squared error loss layer.".format( target ) ) def set_sgd_optimizer(self, sgd_params): if self.spec is None: return if not isinstance(sgd_params, SgdParams): raise Exception("sgd_params must be of instance SgdParams") sgd_optimizer = self.nn_spec.updateParams.optimizer.sgdOptimizer # set learning rate sgd_optimizer.learningRate.defaultValue = sgd_params.lr.value sgd_optimizer.learningRate.range.minValue = sgd_params.lr.min sgd_optimizer.learningRate.range.maxValue = sgd_params.lr.max # set mini batch size sgd_optimizer.miniBatchSize.defaultValue = sgd_params.batch.value sgd_optimizer.miniBatchSize.set.values.extend(sgd_params.batch.allowed_set) # set momentum sgd_optimizer.momentum.defaultValue = sgd_params.momentum.value sgd_optimizer.momentum.range.minValue = sgd_params.momentum.min sgd_optimizer.momentum.range.maxValue = sgd_params.momentum.max def set_adam_optimizer(self, adam_params): if self.spec is None: return if not isinstance(adam_params, AdamParams): raise Exception("adam_params must be of instance AdamParams") adam_optimizer = self.nn_spec.updateParams.optimizer.adamOptimizer # set learning rate adam_optimizer.learningRate.defaultValue = adam_params.lr.value adam_optimizer.learningRate.range.minValue = adam_params.lr.min adam_optimizer.learningRate.range.maxValue = adam_params.lr.max # set mini batch size adam_optimizer.miniBatchSize.defaultValue = adam_params.batch.value adam_optimizer.miniBatchSize.set.values.extend(adam_params.batch.allowed_set) # set beta1 adam_optimizer.beta1.defaultValue = adam_params.beta1.value adam_optimizer.beta1.range.minValue = adam_params.beta1.min adam_optimizer.beta1.range.maxValue = adam_params.beta1.max # set beta2 adam_optimizer.beta2.defaultValue = adam_params.beta2.value adam_optimizer.beta2.range.minValue = adam_params.beta2.min adam_optimizer.beta2.range.maxValue = adam_params.beta2.max # set eps adam_optimizer.eps.defaultValue = adam_params.eps.value adam_optimizer.eps.range.minValue = adam_params.eps.min adam_optimizer.eps.range.maxValue = adam_params.eps.max def set_epochs(self, epochs=1, allowed_set=None): if self.spec is None: return self.nn_spec.updateParams.epochs.defaultValue = epochs if allowed_set is None: 
self.nn_spec.updateParams.epochs.set.values.extend([epochs]) else: self.nn_spec.updateParams.epochs.set.values.extend(allowed_set) def set_shuffle(self, seed=None): if self.spec is None: return # Validate that seed passed in is integer if seed is not None: if not isinstance(seed, int): raise TypeError("Shuffle seed value must be integer") self.nn_spec.updateParams.shuffle.defaultValue = True if seed is not None: self.nn_spec.updateParams.seed.defaultValue = seed def _add_generic_layer( self, name, input_names, output_names, input_ranks=None, input_shapes=None, output_ranks=None, output_shapes=None, ): generic_layer = self.nn_spec.layers.add() generic_layer.name = name if input_names is not None: generic_layer.input.extend(input_names) generic_layer.output.extend(output_names) self.layers.append(name) if name in self.layer_specs: raise ValueError( 'Layer with name "%s" has already been added. Please use a unique name.' % name ) self.layer_specs[name] = generic_layer _fill_tensor_fields(generic_layer.inputTensor, input_ranks, input_shapes) _fill_tensor_fields(generic_layer.outputTensor, output_ranks, output_shapes) # Pass Rank Information # Generic Layer copies rank of first input to all of its output # All the layers that modifies rank apart from first input must override if input_names is not None and len(input_names) > 0: for output_ in output_names: self.rank_dict[output_] = self._get_rank(input_names[0]) return generic_layer def inspect_layers(self, last=-1, verbose=False): """ Prints the summary for last "last" number of layers. Parameters ---------- last: int The numbers of layers to inspect, starting from the last one. verbose: bool Whether to display layer-specific parameters or not. """ n_layers = len(self.nn_spec.layers) if last < 0: last = n_layers for i, alayer in enumerate(self.nn_spec.layers[::-1]): if i >= last: break ( layer_type, name, in_blobs, out_blobs, params_info, ) = _summarize_network_layer_info(alayer) print( "[Id: {}], Name: {} (Type: {})".format( n_layers - i - 1, name, layer_type ) ) print(" " * 10 + "Updatable: {}".format(alayer.isUpdatable)) print(" " * 10 + "Input blobs: {}".format(in_blobs)) print(" " * 10 + "Output blobs: {}".format(out_blobs)) if verbose and len(params_info) > 0: print(" " * 10 + "Parameters: ") for param in params_info: print(" " * 14 + "{} = {}".format(param[0], param[1])) def inspect_loss_layers(self): """ Prints the summary for the loss layer. """ n_loss_layers = len(self.nn_spec.updateParams.lossLayers) if n_loss_layers < 1: print("no loss layer detected.") for i, loss_layer in enumerate(self.nn_spec.updateParams.lossLayers[::-1]): loss_type = loss_layer.WhichOneof("LossLayerType") loss_name = loss_layer.name loss_input = None loss_target = None if loss_type == "categoricalCrossEntropyLossLayer": loss_input = loss_layer.categoricalCrossEntropyLossLayer.input loss_target = loss_layer.categoricalCrossEntropyLossLayer.target elif loss_type == "meanSquaredErrorLossLayer": loss_input = loss_layer.meanSquaredErrorLossLayer.input loss_target = loss_layer.meanSquaredErrorLossLayer.target print( "[Id: {}], Name: {} (Type: {})".format( n_loss_layers - i - 1, loss_name, loss_type ) ) print(" " * 10 + "Loss Input: {}".format(loss_input)) print(" " * 10 + "Loss Target: {}".format(loss_target)) def inspect_optimizer(self): """ Prints the summary for the optimizer. 
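Examples
--------
A short sketch, assuming ``SgdParams`` is importable from
``coremltools.models.neural_network.update_optimizer_utils`` and that its
constructor defaults are acceptable:

.. sourcecode:: python

    from coremltools.models.neural_network.update_optimizer_utils import SgdParams

    # Attach an SGD optimizer and print its learning rate,
    # mini-batch size, and momentum settings.
    builder.set_sgd_optimizer(SgdParams())
    builder.inspect_optimizer()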
""" optimizer = self.nn_spec.updateParams.optimizer optimizer_type = optimizer.WhichOneof("OptimizerType") print("Optimizer Type: {}".format(optimizer_type)) if optimizer_type == "sgdOptimizer": lr = optimizer.sgdOptimizer.learningRate batch = optimizer.sgdOptimizer.miniBatchSize momentum = optimizer.sgdOptimizer.momentum print( "lr: {}, min: {}, max: {}".format( lr.defaultValue, lr.range.minValue, lr.range.maxValue ) ) print( "batch: {}, allowed_set: {}".format( batch.defaultValue, batch.set.values ) ) print( "momentum: {}, min: {}, max: {}".format( momentum.defaultValue, momentum.range.minValue, momentum.range.maxValue, ) ) elif optimizer_type == "adamOptimizer": lr = optimizer.adamOptimizer.learningRate batch = optimizer.adamOptimizer.miniBatchSize beta1 = optimizer.adamOptimizer.beta1 beta2 = optimizer.adamOptimizer.beta2 eps = optimizer.adamOptimizer.eps print( "lr: {}, min: {}, max: {}".format( lr.defaultValue, lr.range.minValue, lr.range.maxValue ) ) print( "batch: {}, allowed_set: {}".format( batch.defaultValue, batch.set.values ) ) print( "beta1: {}, min: {}, max: {}".format( beta1.defaultValue, beta1.range.minValue, beta1.range.maxValue ) ) print( "beta2: {}, min: {}, max: {}".format( beta2.defaultValue, beta2.range.minValue, beta2.range.maxValue ) ) print( "epsilon: {}, min: {}, max: {}".format( eps.defaultValue, eps.range.minValue, eps.range.maxValue ) ) def inspect_updatable_layers(self): """ Prints all updatable layers with their inputs and outputs. """ for _, layer in enumerate(self.nn_spec.layers[::-1]): if layer.isUpdatable: ( layer_type, name, in_blobs, out_blobs, _, ) = _summarize_network_layer_info(layer) print("Name: {} (Type: {})".format(name, layer_type)) print(" " * 10 + "Input blobs: {}".format(in_blobs)) print(" " * 10 + "Output blobs: {}".format(out_blobs)) def inspect_input_features(self): """ Prints the name and type of input features. """ input_features = self.spec.description.input n_input_features = len(input_features) if n_input_features < 1: return for i, input_feature in enumerate(input_features[::-1]): print( "[Id: {}] Name: {}".format(n_input_features - i - 1, input_feature.name) ) print(" " * 10 + "Type: {}".format(input_feature.type)) def inspect_output_features(self): """ Prints the name and type of output features. """ output_features = self.spec.description.output n_output_features = len(output_features) if n_output_features < 1: return for i, output_feature in enumerate(output_features[::-1]): print( "[Id: {}] Name: {}".format( n_output_features - i - 1, output_feature.name ) ) print(" " * 10 + "Type: {}".format(output_feature.type)) def inspect_conv_channels(self, layer_name): """ Prints the output and kernel channels of a convolution layer. """ if self.spec is None: return if layer_name not in self.layer_specs: raise ValueError("Layer %s does not exist." % (layer_name)) spec_layer = self.layer_specs[layer_name] if spec_layer.WhichOneof("layer") != "convolution": raise ValueError("Layer %s is not a convolution layer." % (layer_name)) output_channels = spec_layer.convolution.outputChannels kernel_channels = spec_layer.convolution.kernelChannels print("outputChannels: {}".format(output_channels)) print("kernelChannels: {}".format(kernel_channels)) def inspect_innerproduct_channels(self, layer_name): """ Prints the output and kernel channels of an innerProduct layer. """ if self.spec is None: return if layer_name not in self.layer_specs: raise ValueError("Layer %s does not exist." 
% (layer_name)) spec_layer = self.layer_specs[layer_name] if spec_layer.WhichOneof("layer") != "innerProduct": raise ValueError("Layer %s is not an innerProduct layer." % (layer_name)) input_channels = spec_layer.innerProduct.inputChannels output_channels = spec_layer.innerProduct.outputChannels print("inputChannels: {}".format(input_channels)) print("outputChannels: {}".format(output_channels)) def _get_rank(self, name): return self.rank_dict[name] if name in self.rank_dict else -1 def _set_max_input_rank(self, input_names, output_name): if len(input_names) == 0: raise ValueError("Input name list empty for collecting rank information") self.rank_dict[output_name] = -1 for i in range(0, len(input_names)): input_rank = self._get_rank(input_names[i]) if input_rank == -1: self.rank_dict[output_name] = -1 return self.rank_dict[output_name] = max(self._get_rank(output_name), input_rank) def _set_rank_for_reduce_op( self, input_name, output_name, axes, keepdims, reduce_all ): if keepdims: self.rank_dict[output_name] = self._get_rank(input_name) else: if reduce_all or self._get_rank(input_name) == 1: self.rank_dict[output_name] = 1 elif axes is not None and len(axes) > 0: rank = self._get_rank(input_name) - len(axes) self.rank_dict[output_name] = rank if rank != 0 else 1 else: raise ValueError( "Reduce Ops must provide axes to reduce on if reduce_all is False" ) def add_inner_product( self, name, W, b, input_channels, output_channels, has_bias, input_name, output_name, int_8_dynamic_quantize=False, is_quantized_weight=False, quantization_type="linear", nbits=8, quant_scale=None, quant_bias=None, quant_lut=None, ): """ Add an inner product layer to the model. Refer to the ``InnerProductLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. W: numpy.array or bytes() Weight matrix of shape ``(output_channels, input_channels)``. If ``W`` is of type ``bytes()`` (quantized), other quantization related arguments must be provided as well (see below). b: numpy.array Bias vector of shape: ``(output_channels, )``. input_channels: int Number of input channels. output_channels: int Number of output channels. has_bias: boolean Whether the bias vector of this layer is ignored in the spec. - If True, the bias vector of this layer is not ignored. - If False, the bias vector is ignored. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. Quantization arguments, used when ``W`` is of type ``bytes()``: int_8_dynamic_quantize: boolean Whether to quantize and dequantize before and after inner product, respectively. Expects byte weights, representing int8 values, if True. See NeuralNetwork.proto for other validation conditions. is_quantized_weight: bool, optional Set it to true when ``W`` is of type ``bytes()``, representing quantized weights, default: false. quantization_type: str When weights are quantized (that is, ``W`` is of type ``bytes()``), this should be either ``"linear"`` or ``"lut"``. nbits: int Should be between 1 and 8 (inclusive). Number of bits per weight value. Only applicable when weights are quantized. quant_scale: numpy.array(dtype=numpy.float32) scale vector to be used with linear quantization. Must be of length either 1 or output_channels. quant_bias: numpy.array(dtype=numpy.float32) bias vector to be used with linear quantization. Must be of length either 1 or output_channels. 
quant_lut: numpy.array(dtype=numpy.float32) the LUT (look up table) to be used with LUT quantization. Must be of length 2^n bits. See Also -------- add_embedding, add_convolution, add_batched_mat_mul """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.innerProduct # Fill in the parameters spec_layer_params.inputChannels = input_channels spec_layer_params.outputChannels = output_channels spec_layer_params.hasBias = has_bias spec_layer_params.int8DynamicQuantize = int_8_dynamic_quantize weights = spec_layer_params.weights if not is_quantized_weight and isinstance(W, _np.ndarray): weights.floatValue.extend(W.flatten()) else: _verify_quantization_arguments( weight=W, output_channels=output_channels, quantization_type=quantization_type, nbits=nbits, quant_scale=quant_scale, quant_bias=quant_bias, quant_lut=quant_lut, int_8_dynamic_quantize=int_8_dynamic_quantize, ) _fill_quantized_weights( weights_message=weights, W=W, use_int_8=int_8_dynamic_quantize, quantization_type=quantization_type, nbits=nbits, quant_scale=quant_scale, quant_bias=quant_bias, quant_lut=quant_lut, ) if has_bias: bias = spec_layer_params.bias bias.floatValue.extend(b.flatten()) return spec_layer def add_embedding( self, name, W, b, input_dim, output_channels, has_bias, input_name, output_name, is_quantized_weight=False, quantization_type="linear", nbits=8, quant_scale=None, quant_bias=None, quant_lut=None, ): """ Add an embedding layer to the model. Refer to the ``EmbeddingLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. W: float32 numpy.array or bytes() Weight matrix of shape ``(output_channels, input_dim)``. If ``W`` is of type ``bytes()`` (quantized to 1-8 bits), other quantization related arguments must be provided as well (see below). b: numpy.array Bias vector of shape ``(output_channels, )``. input_dim: int Size of the vocabulary (1 + maximum integer index of the words). output_channels: int Number of output channels. has_bias: boolean Whether the bias vector of this layer is ignored in the ``spec``. - If True, the bias vector of this layer is not ignored. - If False, the bias vector is ignored. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. Quantization arguments expected, when ``W`` is of type ``bytes()``: is_quantized_weight: bool Set it to true when ``W`` is of type ``bytes()``, representing quantized weights. quantization_type: str When weights are quantized (that is, ``W`` is of type ``bytes()``), this should be either ``"linear"`` or ``"lut"``. nbits: int Should be between 1 and 8 (inclusive). Number of bits per weight value. quant_scale: numpy.array(dtype=numpy.float32) Scale vector to be used with linear quantization. Must be of length either 1 or output_channels. quant_bias: numpy.array(dtype=numpy.float32) Bias vector to be used with linear quantization. Must be of length either 1 or output_channels. quant_lut: numpy.array(dtype=numpy.float32) The LUT (look up table) to be used with LUT quantization. Must be of length 2^n bits. 
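Examples
--------
A minimal sketch mapping a vocabulary of 100 integer indices to
8-dimensional vectors (the blob names are placeholders):

.. sourcecode:: python

    import numpy as np

    # W has shape (output_channels, input_dim).
    embedding_weights = np.random.rand(8, 100)
    builder.add_embedding(
        name="embedding",
        W=embedding_weights,
        b=None,
        input_dim=100,
        output_channels=8,
        has_bias=False,
        input_name="token_ids",
        output_name="token_vectors",
    )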
See Also -------- add_inner_product """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) # Fill in the parameters spec_layer_params = spec_layer.embedding spec_layer_params.inputDim = input_dim spec_layer_params.outputChannels = output_channels spec_layer_params.hasBias = has_bias weights = spec_layer_params.weights if not is_quantized_weight: weights.floatValue.extend(W.flatten()) else: _verify_quantization_arguments( weight=W, output_channels=output_channels, quantization_type=quantization_type, nbits=nbits, quant_scale=quant_scale, quant_bias=quant_bias, quant_lut=quant_lut, ) _fill_quantized_weights( weights_message=weights, W=W, quantization_type=quantization_type, nbits=nbits, quant_scale=quant_scale, quant_bias=quant_bias, quant_lut=quant_lut, ) if has_bias: bias = spec_layer_params.bias bias.floatValue.extend(b.flatten()) return spec_layer def add_softmax(self, name, input_name, output_name): """ Add a softmax layer to the model. Refer to the ``SoftmaxLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. See Also -------- add_activation, add_inner_product, add_convolution """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.softmax.MergeFromString(b"") return spec_layer def add_activation( self, name, non_linearity, input_name, output_name, params=None, input_rank=None, input_shape=None, output_rank=None, output_shape=None, ): """ Add an activation layer to the model. Refer to the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. non_linearity: str The ``non_linearity`` (activation) function of this layer. It can be one of the following: - ``'RELU'``: Rectified Linear Unit (ReLU) function. - ``'SIGMOID'``: sigmoid function. - ``'TANH'``: tanh function. - ``'SCALED_TANH'``: scaled tanh function, defined as: ``f(x) = alpha * tanh(beta * x)`` where ``alpha`` and ``beta`` are constant scalars. - ``'SOFTPLUS'``: softplus function. - ``'SOFTSIGN'``: softsign function. - ``'SIGMOID_HARD'``: hard sigmoid function, defined as: ``f(x) = min(max(alpha * x + beta, -1), 1)`` where ``alpha`` and ``beta`` are constant scalars. - ``'LEAKYRELU'``: leaky relu function, defined as: ``f(x) = (x >= 0) * x + (x < 0) * alpha * x`` where ``alpha`` is a constant scalar. - ``'PRELU'``: Parametric ReLU function, defined as: ``f(x) = (x >= 0) * x + (x < 0) * alpha * x`` where ``alpha`` is a multi-dimensional array of same size as ``x``. - ``'ELU'``: Exponential linear unit function, defined as: ``f(x) = (x >= 0) * x + (x < 0) * (alpha * exp(x) - 1)`` where ``alpha`` is a constant scalar. - ``'PARAMETRICSOFTPLUS'``: Parametric softplus function, defined as: ``f(x) = alpha * log(1 + exp(beta * x))`` where ``alpha`` and ``beta`` are two multi-dimensional arrays of same size as ``x``. - ``'THRESHOLDEDRELU'``: Thresholded ReLU function, defined as: ``f(x) = (x >= alpha) * x`` where ``alpha`` is a constant scalar. - ``'LINEAR'``: linear function. ``f(x) = alpha * x + beta`` input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. params: list of float or numpy.array Parameters for the activation, depending on non_linearity. - When ``non_linearity`` is one of [``'RELU'``, ``'SIGMOID'``, ``'TANH'``, ``'SCALED_TANH'``, ``'SOFTPLUS'``, ``'SOFTSIGN'``], params is ignored. 
- When ``non_linearity`` is one of [``'SCALED_TANH'``, ``'SIGMOID_HARD'``, ``'LINEAR'``], param is a list of 2 floats ``[alpha, beta]``. - When ``non_linearity`` is one of [``'LEAKYRELU'``, ``'ELU'``, ``'THRESHOLDEDRELU'``], param is a list of 1 float ``[alpha]``. - When ``non_linearity`` is ``'PRELU'``, param is a list of 1 numpy array ``[alpha]``. The shape of ``alpha`` is ``(C,)``, where ``C`` is either the number of input channels or 1. When ``C = 1``, same ``alpha`` is applied to all channels. - When ``non_linearity`` is ``'PARAMETRICSOFTPLUS'``, param is a list of 2 numpy arrays ``[alpha, beta]``. The shape of ``alpha`` and `beta` is ``(C, )``, where ``C`` is either the number of input channels or 1. When ``C = 1``, same ``alpha`` and ``beta`` are applied to all channels. See Also -------- add_convolution, add_softmax """ input_rank = ( len(input_shape) if (input_shape and not input_rank) else input_rank ) output_rank = ( len(output_shape) if (output_shape and not output_rank) else output_rank ) spec_layer = self._add_generic_layer( name, [input_name], [output_name], [input_rank] if input_rank else None, [input_shape] if input_shape else None, [output_rank] if output_rank else None, [output_shape] if output_shape else None, ) spec_layer_params = spec_layer.activation # Fill in the parameters non_linearity = ( non_linearity.upper() if isinstance(non_linearity, str) else non_linearity ) if non_linearity == "RELU": spec_layer_params.ReLU.MergeFromString(b"") elif non_linearity == "SIGMOID": spec_layer_params.sigmoid.MergeFromString(b"") elif non_linearity == "TANH": spec_layer_params.tanh.MergeFromString(b"") elif non_linearity == "SCALED_TANH": spec_layer_params.scaledTanh.MergeFromString(b"") if params is None: alpha, beta = (0.0, 0.0) else: alpha, beta = params[0], params[1] spec_layer_params.scaledTanh.alpha = alpha spec_layer_params.scaledTanh.beta = beta elif non_linearity == "SOFTPLUS": spec_layer_params.softplus.MergeFromString(b"") elif non_linearity == "SOFTSIGN": spec_layer_params.softsign.MergeFromString(b"") elif non_linearity == "SIGMOID_HARD": if params is None: alpha, beta = (0.2, 0.5) else: alpha, beta = params[0], params[1] spec_layer_params.sigmoidHard.alpha = alpha spec_layer_params.sigmoidHard.beta = beta elif non_linearity == "LEAKYRELU": if params is None: alpha = 0.3 else: alpha = params[0] spec_layer_params.leakyReLU.alpha = float(alpha) elif non_linearity == "PRELU": # PReLU must provide an np array in params[0] spec_layer_params.PReLU.alpha.floatValue.extend(params.flatten()) elif non_linearity == "ELU": # ELU must provide an alpha in params[0] spec_layer_params.ELU.alpha = float(params) elif non_linearity == "PARAMETRICSOFTPLUS": # Parametric softplus must provide two np arrays for alpha and beta alphas, betas = (params[0], params[1]) # Weight alignment: Keras [H,W,C,F] spec_layer_params.parametricSoftplus.alpha.floatValue.extend( alphas.flatten() ) spec_layer_params.parametricSoftplus.beta.floatValue.extend(betas.flatten()) elif non_linearity == "THRESHOLDEDRELU": if params is None: theta = 1.0 else: theta = params spec_layer_params.thresholdedReLU.alpha = float(theta) elif non_linearity == "LINEAR": if params is None: alpha, beta = (1.0, 0.0) else: alpha, beta = params[0], params[1] spec_layer_params.linear.alpha = alpha spec_layer_params.linear.beta = beta else: raise TypeError("Unknown activation type %s." 
% non_linearity) return spec_layer def add_elementwise(self, name, input_names, output_name, mode, alpha=None): """ Add an element-wise operation layer to the model. Refer to the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str A list of input blob names of this layer. The input blobs should have the same shape. output_name: str The output blob name of this layer. mode: str A string specifying the mode of the elementwise layer. It can be one of the following: - ``'CONCAT'``: Concatenate input blobs along the channel axis. - ``'SEQUENCE_CONCAT'``: Concatenate input blobs along the sequence axis. - ``'ADD'``: Perform an element-wise summation over the input blobs. - ``'MULTIPLY'``: Perform an element-wise multiplication over the input blobs. - ``'DOT'``: Compute the dot product of the two input blobs. In this mode, the length of ``input_names`` should be 2. - ``'COS'``: Compute the cosine similarity of the two input blobs. In this mode, the length of ``input_names`` should be 2. - ``'MAX'``: Compute the element-wise maximum over the input blobs. - ```'MIN'```: Compute the element-wise minimum over the input blobs. - ``'AVE'``: Compute the element-wise average over the input blobs. alpha: float * if ``mode == 'ADD'`` and there is only one ``input_name``, ``alpha`` is added to the input. * if ``mode == 'MULTIPLY'`` and there is only one ``input_name``, ``alpha`` is multiplied to the input. See Also -------- add_upsample, add_sequence_repeat """ input_names = input_names if isinstance(input_names, list) else [input_names] spec_layer = self._add_generic_layer(name, input_names, [output_name]) # add one of the following layers mode = mode.upper() if isinstance(mode, str) else mode if mode == "CONCAT": spec_layer.concat.sequenceConcat = False elif mode == "SEQUENCE_CONCAT": spec_layer.concat.sequenceConcat = True elif mode == "ADD": spec_layer.add.MergeFromString(b"") if alpha: spec_layer.add.alpha = alpha elif mode == "MULTIPLY": spec_layer.multiply.MergeFromString(b"") if alpha: spec_layer.multiply.alpha = alpha elif mode == "COS": spec_layer.dot.cosineSimilarity = True elif mode == "DOT": spec_layer.dot.cosineSimilarity = False elif mode == "MAX": spec_layer.max.MergeFromString(b"") elif mode == "MIN": spec_layer.min.MergeFromString(b"") elif mode == "AVE": spec_layer.average.MergeFromString(b"") else: raise ValueError("Unsupported elementwise mode %s" % mode) return spec_layer def add_upsample( self, name, scaling_factor_h, scaling_factor_w, input_name, output_name, mode="NN", linear_upsample_mode="DEFAULT", ): """ Add an upsample layer to the model. Refer to the ``UpsampleLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. scaling_factor_h: int or float Scaling factor on the vertical direction. Float values only supported with ``BILINEAR`` and ``ALIGN_CORNERS_*``. scaling_factor_w: int or float Scaling factor on the horizontal direction. Float values only supported with ``BILINEAR`` and ``ALIGN_CORNERS_*``. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. mode: str Overall interpolation mode. The following values are supported: * ``'NN'``: nearest neighbour * ``'BILINEAR'``: bilinear interpolation linear_upsample_mode: str Specifies the behavior for linear upsampling. Only valid when Interpolation Mode is ``BILINEAR``. 
If input grid is ``[0, Xin-1]`` (corresponding to an input size of ``Xin``), and if the output size is ``Xout``, then the grid points are sampled in the following manner: 'DEFAULT': - ``spacing = (Xin-Xin/Xout) / (Xout-1)`` - ``grid_point[i] = min(Xin-1, max(0, i * spacing)), for i = 0,1,2,..,Xout-1`` 'ALIGN_CORNERS_TRUE': - ``spacing = (Xin-1) / (Xout-1)`` - ``grid_point[i] = min(Xin-1, max(0, i * spacing)), for i = 0,1,2,..,Xout-1`` 'ALIGN_CORNERS_FALSE': - ``spacing = Xin / Xout`` - ``grid_point[i] = min(Xin-1, max(0, i * spacing + 0.5 * spacing - 0.5)), for i = 0,1,2,..,Xout-1`` See Also -------- add_resize_bilinear """ mode = mode.upper() if isinstance(mode, str) else mode linear_upsample_mode = ( linear_upsample_mode.upper() if isinstance(linear_upsample_mode, str) else linear_upsample_mode ) if mode not in ["NN", "BILINEAR"]: raise ValueError("Unsupported upsampling mode %s" % mode) if linear_upsample_mode not in ["DEFAULT", "ALIGN_CORNERS_TRUE", "ALIGN_CORNERS_FALSE"]: raise ValueError( "Unsupported linear upsampling mode %s" % linear_upsample_mode ) # Default linear upsample mode is backwards compatible, else set spec to iOS14 if ( linear_upsample_mode != "DEFAULT" and self.spec and ( not self.spec.specificationVersion or self.spec.specificationVersion < _SPECIFICATION_VERSION_IOS_14 ) ): self.spec.specificationVersion = _SPECIFICATION_VERSION_IOS_14 spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.upsample if ( scaling_factor_h - _math_floor(scaling_factor_h) > 0.001 or scaling_factor_w - _math_floor(scaling_factor_w) > 0.001 ): if mode != "BILINEAR" or linear_upsample_mode not in [ "ALIGN_CORNERS_TRUE", "ALIGN_CORNERS_FALSE", ]: raise ValueError( "Fractional upsampling only compatible with BILINEAR and ALIGN_CORNERS_TRUE or ALIGN_CORNERS_FALSE" ) spec_layer_params.fractionalScalingFactor.append(float(scaling_factor_h)) spec_layer_params.fractionalScalingFactor.append(float(scaling_factor_w)) else: spec_layer_params.scalingFactor.append(int(scaling_factor_h)) spec_layer_params.scalingFactor.append(int(scaling_factor_w)) spec_layer_params.mode = ( _proto.NeuralNetwork_pb2.UpsampleLayerParams.InterpolationMode.Value(mode) ) spec_layer_params.linearUpsampleMode = ( _proto.NeuralNetwork_pb2.UpsampleLayerParams.LinearUpsampleMode.Value( linear_upsample_mode ) ) return spec_layer def add_scale( self, name, W, b, has_bias, input_name, output_name, shape_scale=None, shape_bias=None, ): """ Add a scale layer to the model. Refer to the ``ScaleLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. W: int or numpy.array Scale of the input. b: int or numpy.array Bias to add to the input. has_bias: boolean Whether the bias vector of this layer is ignored in the ``spec``. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. shape_scale: list of int or tuple of int List of ints that specifies the shape of the scale parameter. Can be ``[1]``, ``[C]``, ``[1,H,W]``, or ``[C,H,W]``. shape_bias: list of int List of ints that specifies the shape of the bias parameter (if present). Can be ``[1]``, ``[C]``, ``[1,H,W]``, or ``[C,H,W]``. 
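Examples
--------
A minimal sketch applying a per-channel scale and bias to a 3-channel
blob (the blob names are placeholders):

.. sourcecode:: python

    import numpy as np

    # Multiply each of the 3 channels by its own scale and add a
    # per-channel bias.
    builder.add_scale(
        name="scale_layer",
        W=np.array([0.5, 1.0, 2.0]),
        b=np.array([0.1, 0.0, -0.1]),
        has_bias=True,
        input_name="features",
        output_name="scaled_features",
        shape_scale=[3],
        shape_bias=[3],
    )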
See Also -------- add_bias """ if not shape_scale: shape_scale = [1] if not shape_bias: shape_bias = [1] spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.scale spec_layer_params.hasBias = has_bias # add scale and its shape scale = spec_layer_params.scale spec_layer_params.shapeScale.extend(shape_scale) if isinstance(W, int): scale.floatValue.append(float(W)) else: scale.floatValue.extend(W.flatten()) if len(scale.floatValue) != _np.prod(shape_scale): raise ValueError( "Dimensions of 'shape_scale' do not match the size of the provided 'scale' parameter" ) # add bias and its shape if has_bias: bias = spec_layer_params.bias spec_layer_params.shapeBias.extend(shape_bias) if isinstance(b, int): bias.floatValue.append(float(b)) else: bias.floatValue.extend(b.flatten()) if len(bias.floatValue) != _np.prod(shape_bias): raise ValueError( "Dimensions of 'shape_bias' do not match the size of the provided 'b' parameter" ) return spec_layer def add_bias(self, name, b, input_name, output_name, shape_bias=None): """ Add a bias layer to the model. Refer to the ``BiasLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. b: int or numpy.array Bias to add to the input. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. shape_bias: list of int List of ints that specifies the shape of the bias parameter (if present). Can be ``[1]``, ``[C]``, ``[1,H,W]``, or ``[C,H,W]``. See Also -------- add_scale """ if not shape_bias: shape_bias = [1] spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.bias # add bias and its shape bias = spec_layer_params.bias if len(shape_bias) != 1 and len(shape_bias) != 3: raise ValueError("Shape of bias layer must have length 1 or 3.") spec_layer_params.shape.extend(shape_bias) if isinstance(b, int): bias.floatValue.append(float(b)) else: bias.floatValue.extend(b.flatten()) if len(bias.floatValue) != _np.prod(shape_bias): raise ValueError( "Dimensions of 'shape_bias' do not match the size" "of the provided 'b' parameter" ) return spec_layer def add_sequence_repeat(self, name, nrep, input_name, output_name): """ Add a sequence repeat layer to the model. Refer to the ``SequenceRepeatLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. nrep: int Number of repetitions of the input blob along the sequence axis. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. See Also -------- add_upsample, add_elementwise """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.sequenceRepeat spec_layer_params.nRepetitions = nrep return spec_layer def add_convolution( self, name, kernel_channels, output_channels, height, width, stride_height, stride_width, border_mode, groups, W, b, has_bias, is_deconv=False, output_shape=None, input_name="data", output_name="out", dilation_factors=[1, 1], padding_top=0, padding_bottom=0, padding_left=0, padding_right=0, same_padding_asymmetry_mode="BOTTOM_RIGHT_HEAVY", **kwargs ): """ Add a convolution layer to the network. Refer to the ``ConvolutionLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. 
kernel_channels: int Number of channels for the convolution kernels. output_channels: int Number of filter kernels. This is equal to the number of channels in the output blob. height: int Height of each kernel. width: int Width of each kernel. stride_height: int Stride along the height direction. stride_width: int Stride along the height direction. border_mode: str Option for the padding type and output blob shape. Can be either 'valid' or 'same'. groups: int Number of kernel groups. Input is divided into groups along the channel axis. Each kernel group share the same weights. W: numpy.array or bytes() or None Weight of the convolution kernels. * If ``is_deconv`` is False, ``W`` should have shape ``(height, width, kernel_channels, output_channels)``, where:: kernel_channel = input_channels / groups * If ``is_deconv`` is True, ``W`` should have shape ``(height, width, kernel_channels, output_channels / groups)``, where:: kernel_channel = input_channels * If ``W`` is of type ``bytes()`` (quantized), other quantization-related arguments must be provided as well (see below). * For Core ML specification version >=4, ``W`` can be ``None``. In this case, the convolution layer takes 2 inputs, where the 1st input represents the input feature map, and the 2nd input represents the weight blob. b: numpy.array Biases of the convolution kernels. ``b`` should have shape ``(outputChannels, )``. has_bias: boolean Whether bias is ignored. - If True, bias is not ignored. - If False, bias is ignored. is_deconv: boolean Whether the convolution layer is performing a convolution or a transposed convolution (deconvolution). - If True, the convolution layer is performing transposed convolution. - If False, the convolution layer is performing regular convolution. output_shape: tuple or None Either ``None`` or a 2-tuple, specifying the output shape ``(output_height, output_width)``. - Used only when ``is_deconv == True``. - When ``is_deconv == False``, this parameter is ignored. - If it is ``None``, the output shape is calculated automatically using the ``border_mode``. input_name: str or list of str The input blob name(s) of this layer. output_name: str The output blob name of this layer. dilation_factors: list of int Dilation factors across height and width directions. Must be a list of two positive integers. Defaults to ``[1, 1]``. padding_top, padding_bottom, padding_left, padding_right: int Values of height (top, bottom) and width (left, right) padding to be used if ``border_more`` is ``"valid"``. same_padding_asymmetry_mode: str Type of asymmetric padding to be used when ``border_mode`` is ``'same'``. Can be either ``'BOTTOM_RIGHT_HEAVY'`` or ``'TOP_LEFT_HEAVY'``. Quantization Quantization arguments expected in ``kwargs``, when ``W`` is of type ``bytes()``. quantization_type: str When weights are quantized (that is, ``W`` is of type ``bytes()``), this should be either ``"linear"`` or ``"lut"``. nbits: int Should be between 1 and 8 (inclusive). Number of bits per weight value. Only applicable when weights are quantized. quant_scale: numpy.array(dtype=numpy.float32) scale vector to be used with linear quantization. Must be of length either 1 or ``output_channels``. quant_bias: numpy.array(dtype=numpy.float32) bias vector to be used with linear quantization. Must be of length either 1 or ``output_channels``. quant_lut: numpy.array(dtype=numpy.float32) the LUT (look up table) to be used with LUT quantization. Must be of length 2^n bits. 
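A minimal sketch of a regular (non-quantized) 3x3 convolution with
``'same'`` padding; the blob names and channel counts are placeholders:

.. sourcecode:: python

    import numpy as np

    # 3 input channels -> 16 output channels; W has layout
    # (height, width, kernel_channels, output_channels).
    conv_weights = np.random.rand(3, 3, 3, 16)
    conv_bias = np.random.rand(16)
    builder.add_convolution(
        name="conv1",
        kernel_channels=3,
        output_channels=16,
        height=3,
        width=3,
        stride_height=1,
        stride_width=1,
        border_mode="same",
        groups=1,
        W=conv_weights,
        b=conv_bias,
        has_bias=True,
        input_name="image",
        output_name="conv1_out",
    )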
Depthwise convolution Depthwise convolution is a special case of convolution, in which: * ``kernel_channels = 1 (== input_channels / groups)`` * ``output_channels = channel_multiplier * input_channels`` * ``groups = input_channels`` * ``W``: ``[Kernel_height, Kernel_width, 1, channel_multiplier * input_channels]`` See Also -------- add_convolution3d, add_pooling, add_activation, add_batchnorm """ if isinstance(input_name, tuple): input_names = list(input_name) elif isinstance(input_name, list): input_names = input_name else: input_names = [input_name] spec_layer = self._add_generic_layer(name, input_names, [output_name]) # Set the layer params spec_layer_params = spec_layer.convolution spec_layer_params.isDeconvolution = is_deconv if is_deconv and output_shape: spec_layer_params.outputShape.append(output_shape[0]) spec_layer_params.outputShape.append(output_shape[1]) spec_layer_params.outputChannels = output_channels spec_layer_params.kernelChannels = kernel_channels spec_layer_params.kernelSize.append(height) spec_layer_params.kernelSize.append(width) spec_layer_params.stride.append(stride_height) spec_layer_params.stride.append(stride_width) border_mode = ( border_mode.lower() if isinstance(border_mode, str) else border_mode ) same_padding_asymmetry_mode = ( same_padding_asymmetry_mode.upper() if isinstance(same_padding_asymmetry_mode, str) else same_padding_asymmetry_mode ) if border_mode == "valid": height_border = spec_layer_params.valid.paddingAmounts.borderAmounts.add() height_border.startEdgeSize = padding_top height_border.endEdgeSize = padding_bottom width_border = spec_layer_params.valid.paddingAmounts.borderAmounts.add() width_border.startEdgeSize = padding_left width_border.endEdgeSize = padding_right elif border_mode == "same": if not ( same_padding_asymmetry_mode == "BOTTOM_RIGHT_HEAVY" or same_padding_asymmetry_mode == "TOP_LEFT_HEAVY" ): raise ValueError( "Invalid value %d of same_padding_asymmetry_mode parameter" % same_padding_asymmetry_mode ) spec_layer_params.same.asymmetryMode = ( _proto.NeuralNetwork_pb2.SamePadding.SamePaddingMode.Value( same_padding_asymmetry_mode ) ) else: raise NotImplementedError( "Border mode %s is not implemented." 
% border_mode ) spec_layer_params.nGroups = groups spec_layer_params.hasBias = has_bias # add dilation factors spec_layer_params.dilationFactor.append(dilation_factors[0]) spec_layer_params.dilationFactor.append(dilation_factors[1]) # If weight comes from another tensor just return if len(input_names) > 1: return # Weight assignments quantization = len(kwargs) > 0 and ('quantization_type' in kwargs and kwargs.get('quantization_type') is not None) if quantization: _verify_quantization_arguments( weight=W, output_channels=output_channels, **kwargs ) nbits = kwargs.get("nbits", 8) num_weights = (output_channels * kernel_channels * height * width) / groups if nbits < 8: byte_arr = _np.frombuffer(W, dtype=_np.uint8) W = _unpack_to_bytes(byte_arr, num_weights, nbits) else: W = _np.frombuffer(W, dtype=_np.uint8) if is_deconv: W = _np.reshape( W, (height, width, kernel_channels, output_channels / groups) ) else: W = _np.reshape(W, (height, width, kernel_channels, output_channels)) # Weight alignment: MLModel Spec requires following weight arrangement: # is_deconv == False ==> (output_channels, kernel_channels, height, width), where kernel_channel = input_channels / groups # is_deconv == True ==> (kernel_channels, output_channels / groups, height, width), where kernel_channel = input_channels if not is_deconv: Wt = W.transpose((3, 2, 0, 1)) Wt = Wt.flatten() else: Wt = W.transpose((2, 3, 0, 1)).flatten() # Assign weights weights = spec_layer_params.weights if not quantization: # no quantization weights.floatValue.extend(Wt.flatten()) else: # there is quantization W_bytes = bytes() if nbits == 8: W_bytes += Wt.flatten().tobytes() else: W_bytes += _convert_array_to_nbit_quantized_bytes( Wt.flatten(), nbits ).tobytes() _fill_quantized_weights(weights_message=weights, W=W_bytes, **kwargs) # Assign biases if has_bias: bias = spec_layer_params.bias for f in range(output_channels): bias.floatValue.append(float(b[f])) return spec_layer def add_convolution3d( self, name, input_channels, output_channels, depth, height, width, W, b, has_bias, groups=1, stride_depth=1, stride_height=1, stride_width=1, dilation_width=1, dilation_height=1, dilation_depth=1, is_deconv=False, output_shape=None, padding_mode="valid", padding_front=0, padding_back=0, padding_top=0, padding_bottom=0, padding_left=0, padding_right=0, input_name="data", output_name="out", ): """ Add a 3 dimensional convolution layer to the network. Refer to the ``Convolution3DLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_channels: int Number of input channels. output_channels: int Number of filter kernels. This is equal to the number of channels in the output blob. depth: int Depth of each kernel. height: int Height of each kernel. width: int Width of each kernel. W: numpy.array or bytes() Weight of the convolution kernels. ``W`` should have shape: - If ``deconv`` is False: ``(output_channels, kernel_channels, depth, height, width)``, where: ``kernel_channels = input_channels / groups`` - If ``deconv`` is True: ``(output_channels / groups, kernel_channels, depth, height, width)``, where: ``kernel_channels = input_channels`` b: numpy.array Biases of the convolution kernels. ``b`` should have shape ``(outputChannels, )``. has_bias: boolean Whether bias is ignored. - If True, bias is not ignored. - If False, bias is ignored. groups: int Number of kernel groups. Input is divided into groups along the channel axis. Each kernel group share the same weights. 
Defaults to 1. stride_depth, stride_height, stride_width: int Stride along the depth, height, and width directions, respectively. Must all be positive integers. Defaults to 1. dilation_depth, dilation_width, dilation_height: int Dilation factors across depth, height, and width directions. Must all be positive integers. Defaults to 1 in each dimension. is_deconv: bool True if this is Convolution Transpose, otherwise False. output_shape: None or Tuple of int Applicable only for Deconvolution layer. ``None`` if Convolution. Tuple of length 3 if Convolution Transpose. padding_mode: str Option for the padding type and output blob shape. Can be ``'custom'``, ``'valid'``, or ``'same'``. Defaults to ``'valid'``. Case-insensitive. padding_front, padding_back, padding_top, padding_bottom, padding_left, padding_right: int Values of depth (front, back), height (top, bottom), and width (left, right) padding to be used. Must all be positive integers. All default to 0. input_name: str or list of str The input blob name(s) of this layer. output_name: str The output blob name of this layer. Depthwise convolution Depthwise convolution is a special case of convolution, in which: * ``kernel_channels = 1`` (``== input_channels / groups``) * ``output_channels = channel_multiplier * input_channels`` * ``groups = input_channels`` * ``W``: ``[Kernel_depth, Kernel_height, Kernel_width, 1, channel_multiplier * input_channels]`` See Also -------- add_convolution, add_pooling, add_activation, add_batchnorm """ # Update spec version if necessary if self.spec and ( not self.spec.specificationVersion or self.spec.specificationVersion < _SPECIFICATION_VERSION_IOS_14 ): self.spec.specificationVersion = _SPECIFICATION_VERSION_IOS_14 if isinstance(input_name, tuple): input_names = list(input_name) elif isinstance(input_name, list): input_names = input_name else: input_names = [input_name] # 3D convolution doesn't currently support 2-inputs if len(input_names) > 1: raise ValueError("3D convolution only supports 1 input.") spec_layer = self._add_generic_layer(name, input_names, [output_name]) # Set the layer params spec_layer_params = spec_layer.convolution3d spec_layer_params.isDeconvolution = is_deconv spec_layer_params.nGroups = groups spec_layer_params.outputChannels = output_channels spec_layer_params.inputChannels = input_channels spec_layer_params.kernelDepth = depth spec_layer_params.kernelHeight = height spec_layer_params.kernelWidth = width spec_layer_params.strideDepth = stride_depth spec_layer_params.strideHeight = stride_height spec_layer_params.strideWidth = stride_width if is_deconv and output_shape: spec_layer_params.outputShape.append(output_shape[0]) spec_layer_params.outputShape.append(output_shape[1]) spec_layer_params.outputShape.append(output_shape[2]) supported_padding_modes = {"CUSTOM", "VALID", "SAME"} if padding_mode.upper() not in supported_padding_modes: raise ValueError( "Unsupported padding mode: %s. 
Must be one of %s" % (padding_mode, supported_padding_modes) ) if padding_mode.upper() == "CUSTOM": spec_layer_params.customPaddingFront = padding_front spec_layer_params.customPaddingBack = padding_back spec_layer_params.customPaddingTop = padding_top spec_layer_params.customPaddingBottom = padding_bottom spec_layer_params.customPaddingLeft = padding_left spec_layer_params.customPaddingRight = padding_right spec_layer_params.paddingType = ( _proto.NeuralNetwork_pb2.Convolution3DLayerParams.PaddingType.Value( padding_mode.upper() ) ) spec_layer_params.dilationDepth = dilation_depth spec_layer_params.dilationHeight = dilation_height spec_layer_params.dilationWidth = dilation_width # Weight alignment: MLModel Spec requires following weight arrangement: # is_deconv == False ==> (output_channels, kernel_channels, depth, height, width), where kernel_channel = input_channels / groups # is_deconv == True ==> (kernel_channels, output_channels / groups, height, width), where kernel_channel = input_channels if is_deconv: W = W.transpose((1, 0, 2, 3, 4)) # Assign weights weights = spec_layer_params.weights weights.floatValue.extend(W.flatten()) # Assign biases spec_layer_params.hasBias = has_bias if has_bias: bias = spec_layer_params.bias for f in range(output_channels): bias.floatValue.append(float(b[f])) return spec_layer def add_pooling( self, name, height, width, stride_height, stride_width, layer_type, padding_type, input_name, output_name, exclude_pad_area=True, is_global=False, padding_top=0, padding_bottom=0, padding_left=0, padding_right=0, same_padding_asymmetry_mode="BOTTOM_RIGHT_HEAVY", ): """ Add a pooling layer to the model that performs spatial pooling. Refer to the ``PoolingLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. height: int Height of pooling region. width: int Width of pooling region. stride_height: int Stride along the height direction. stride_width: int Stride along the width direction. layer_type: str Type of pooling performed. Can either be ``'MAX'``, ``'AVERAGE'``, or ``'L2'``. padding_type: str Option for the type of padding and output blob shape. Can be either ``'VALID'``, ``'SAME'``, or ``'INCLUDE_LAST_PIXEL'``. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. exclude_pad_area: boolean Whether to exclude padded area in the ``'AVERAGE'`` pooling operation, default: true. This flag is only used with average pooling. - If True, the value of the padded area will be excluded. - If False, the padded area will be included. is_global: boolean Whether the pooling operation is global. Defaults to False. - If True, the pooling operation is global. The pooling region is of the same size of the input blob. Parameters ``height``, ``width``, ``stride_height``, and ``stride_width`` will be ignored. - If False, the pooling operation is not global. padding_top, padding_bottom, padding_left, padding_right: int Values of height (top, bottom) and width (left, right) padding to be used if padding type is ``"VALID"`` or ``"INCLUDE_LAST_PIXEL"``. same_padding_asymmetry_mode: str. Type of asymmetric padding to be used when ``padding_type = 'SAME'``. Can be either ``'BOTTOM_RIGHT_HEAVY'`` or ``'TOP_LEFT_HEAVY'``. 
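# A usage sketch for add_pooling, assuming the standard NeuralNetworkBuilder
# constructor; blob names and shapes are illustrative assumptions. The first
# call uses 'SAME' padding (default BOTTOM_RIGHT_HEAVY asymmetry); the second
# is a global average pool, for which the kernel/stride arguments are ignored.
from coremltools.models import datatypes
from coremltools.models.neural_network import NeuralNetworkBuilder

builder = NeuralNetworkBuilder(
    input_features=[("feat", datatypes.Array(16, 32, 32))],
    output_features=[("pooled", datatypes.Array(16, 1, 1))],
)
builder.add_pooling(
    name="max_pool",
    height=2, width=2,
    stride_height=2, stride_width=2,
    layer_type="MAX",
    padding_type="SAME",
    input_name="feat",
    output_name="feat_pooled",
)
builder.add_pooling(
    name="global_avg_pool",
    height=1, width=1, stride_height=1, stride_width=1,   # ignored when is_global=True
    layer_type="AVERAGE",
    padding_type="VALID",
    input_name="feat_pooled",
    output_name="pooled",
    is_global=True,
)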
See Also -------- add_pooling3d, add_convolution, add_activation """ # Create spec layer spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.pooling # Set the parameters spec_layer_params.type = _proto.NeuralNetwork_pb2.PoolingLayerParams.PoolingType.Value( layer_type.upper() ) padding_type = ( padding_type.upper() if isinstance(padding_type, str) else padding_type ) same_padding_asymmetry_mode = ( same_padding_asymmetry_mode.upper() if isinstance(same_padding_asymmetry_mode, str) else same_padding_asymmetry_mode ) if padding_type == "VALID": height_border = spec_layer_params.valid.paddingAmounts.borderAmounts.add() height_border.startEdgeSize = padding_top height_border.endEdgeSize = padding_bottom width_border = spec_layer_params.valid.paddingAmounts.borderAmounts.add() width_border.startEdgeSize = padding_left width_border.endEdgeSize = padding_right elif padding_type == "SAME": if not ( same_padding_asymmetry_mode == "BOTTOM_RIGHT_HEAVY" or same_padding_asymmetry_mode == "TOP_LEFT_HEAVY" ): raise ValueError( "Invalid value %d of same_padding_asymmetry_mode parameter" % same_padding_asymmetry_mode ) spec_layer_params.same.asymmetryMode = ( _proto.NeuralNetwork_pb2.SamePadding.SamePaddingMode.Value( same_padding_asymmetry_mode ) ) elif padding_type == "INCLUDE_LAST_PIXEL": if padding_top != padding_bottom or padding_left != padding_right: raise ValueError( "Only symmetric padding is supported with the INCLUDE_LAST_PIXEL padding type" ) spec_layer_params.includeLastPixel.paddingAmounts.append(padding_top) spec_layer_params.includeLastPixel.paddingAmounts.append(padding_left) else: raise ValueError("Unknown padding_type %s in pooling" % padding_type) spec_layer_params.kernelSize.append(height) spec_layer_params.kernelSize.append(width) spec_layer_params.stride.append(stride_height) spec_layer_params.stride.append(stride_width) spec_layer_params.avgPoolExcludePadding = exclude_pad_area spec_layer_params.globalPooling = is_global return spec_layer def add_pooling3d( self, name, input_name, output_name, pooling_type, kernel_depth, kernel_height, kernel_width, stride_depth, stride_height, stride_width, padding_mode="valid", custom_padding_front=0, custom_padding_back=0, custom_padding_top=0, custom_padding_bottom=0, custom_padding_left=0, custom_padding_right=0, average_pooling_count_excludes_padding=False, ): """ Add a pooling layer to the model that performs spatial pooling across three dimensions. Refer to the ``Pooling3DLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. pooling_type: str Type of pooling performed. Can either be ``'MAX'`` OR ``'AVERAGE'``. kernel_depth: int Depth of the pooling region. kernel_height: int Height of pooling region. kernel_width: int Width of pooling region. stride_depth: int Stride along the depth direction stride_height: int Stride along the height direction. stride_width: int Stride along the width direction. padding_mode: str Option for the padding type and output blob shape. Can be ``'VALID'``, ``'SAME'``, or ``'CUSTOM'``. custom_padding_front: int Padding before the input in the depth direction. custom_padding_back: int Padding after the input in the depth direction. custom_padding_top: int Padding before the input in the height direction. custom_padding_bottom: int Padding after the input in the height direction. 
custom_padding_left: int Padding before the input in the width direction. custom_padding_right: int Padding after the input in the width direction. average_pooling_count_excludes_padding: boolean If true, exclude zeros from padding in average pooling. Can only be true for ``AVERAGE`` padding. See Also -------- add_pooling, add_global_pooling3d """ if self.spec and ( not self.spec.specificationVersion or self.spec.specificationVersion < _SPECIFICATION_VERSION_IOS_14 ): self.spec.specificationVersion = _SPECIFICATION_VERSION_IOS_14 spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.pooling3d spec_layer_params.type = _proto.NeuralNetwork_pb2.Pooling3DLayerParams.PoolingType3D.Value( pooling_type.upper() ) spec_layer_params.kernelDepth = kernel_depth spec_layer_params.kernelHeight = kernel_height spec_layer_params.kernelWidth = kernel_width spec_layer_params.strideDepth = stride_depth spec_layer_params.strideHeight = stride_height spec_layer_params.strideWidth = stride_width supported_padding_modes = {"CUSTOM", "VALID", "SAME"} if padding_mode.upper() not in supported_padding_modes: raise ValueError( "Unsupported padding mode: %s. Must be one of %s" % (padding_mode, supported_padding_modes) ) if padding_mode.upper() == "CUSTOM": spec_layer_params.customPaddingFront = custom_padding_front spec_layer_params.customPaddingBack = custom_padding_back spec_layer_params.customPaddingTop = custom_padding_top spec_layer_params.customPaddingBottom = custom_padding_bottom spec_layer_params.customPaddingLeft = custom_padding_left spec_layer_params.customPaddingRight = custom_padding_right spec_layer_params.paddingType = ( _proto.NeuralNetwork_pb2.Pooling3DLayerParams.Pooling3DPaddingType.Value( padding_mode.upper() ) ) spec_layer_params.countExcludePadding = average_pooling_count_excludes_padding return spec_layer def add_global_pooling3d(self, name, input_name, output_name, pooling_type): """ Add a layer to pool three spatial dimensions down to one value. This behaves like a special case of Pooling3DLayerParams in which the Kernel is the size of the input and there is no padding. Refer to the ``GlobalPooling3DLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. pooling_type: str Type of pooling performed. Can either be ``'MAX'`` OR ``'AVERAGE'``. See Also -------- add_pooling, add_pooling3d """ if self.spec and ( not self.spec.specificationVersion or self.spec.specificationVersion < _SPECIFICATION_VERSION_IOS_14 ): self.spec.specificationVersion = _SPECIFICATION_VERSION_IOS_14 spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.globalPooling3d spec_layer_params.type = ( _proto.NeuralNetwork_pb2.GlobalPooling3DLayerParams.GlobalPoolingType3D.Value( pooling_type.upper() ) ) return spec_layer def add_padding( self, name, left=0, right=0, top=0, bottom=0, value=0, input_name="data", output_name="out", padding_type="constant", ): """ Add a padding layer to the model that performs padding along spatial dimensions. Refer to the ``PaddingLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. left: int Number of elements to be padded on the left side of the input blob. 
right: int Number of elements to be padded on the right side of the input blob. top: int Number of elements to be padded on the top of the input blob. bottom: int Number of elements to be padded on the bottom of the input blob. value: float Value of the elements padded. Used only when ``padding_type = 'constant'``. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. padding_type: str Type of the padding. Can be one of ``'constant'``, ``'reflection'``, or ``'replication'``. See Also -------- add_crop, add_convolution, add_pooling, add_constant_pad """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.padding # Set the parameters padding_type = ( padding_type.lower() if isinstance(padding_type, str) else padding_type ) if padding_type == "constant": spec_layer_params.constant.value = value elif padding_type == "reflection": spec_layer_params.reflection.MergeFromString(b"") elif padding_type == "replication": spec_layer_params.replication.MergeFromString(b"") else: raise ValueError("Unknown padding_type %s" % padding_type) height_border = spec_layer_params.paddingAmounts.borderAmounts.add() height_border.startEdgeSize = top height_border.endEdgeSize = bottom width_border = spec_layer_params.paddingAmounts.borderAmounts.add() width_border.startEdgeSize = left width_border.endEdgeSize = right return spec_layer def add_crop( self, name, left, right, top, bottom, offset, input_names, output_name ): """ Add a cropping layer to the model. The cropping layer have two functional modes: - When it has 1 input blob, it crops the input blob based on the 4 parameters ``[left, right, top, bottom]``. - When it has 2 input blobs, it crops the first input blob based on the dimension of the second blob with an offset. Refer to the ``CropLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. left: int Number of elements to be cropped on the left side of the input blob. When the crop layer takes 2 inputs, this parameter is ignored. right: int Number of elements to be cropped on the right side of the input blob. When the crop layer takes 2 inputs, this parameter is ignored. top: int Number of elements to be cropped on the top of the input blob. When the crop layer takes 2 inputs, this parameter is ignored. bottom: int Number of elements to be cropped on the bottom of the input blob. When the crop layer takes 2 inputs, this parameter is ignored. offset: list of int Offset along the height and width directions when the crop layer takes 2 inputs. Must be a list of length 2. When the crop layer takes 1 input, this parameter is ignored. input_names: list of str The input blob names of this layer. Must be either a list of 1 string (1 input crop layer), or a list of 2 strings (2-input crop layer). output_name: str The output blob name of this layer. 
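# A usage sketch for the two add_crop modes described above, assuming the
# standard NeuralNetworkBuilder constructor; blob names and shapes are
# illustrative assumptions. With one input the [left, right, top, bottom]
# amounts are used; with two inputs the first blob is cropped to the spatial
# size of the second at the given offset, and the border amounts are ignored.
from coremltools.models import datatypes
from coremltools.models.neural_network import NeuralNetworkBuilder

builder = NeuralNetworkBuilder(
    input_features=[("x", datatypes.Array(3, 10, 10)),
                    ("ref", datatypes.Array(3, 6, 6))],
    output_features=[("cropped_fixed", datatypes.Array(3, 6, 6)),
                     ("cropped_like_ref", datatypes.Array(3, 6, 6))],
)
# 1-input mode: remove 2 rows/columns from every border of 'x'.
builder.add_crop(
    name="crop_fixed",
    left=2, right=2, top=2, bottom=2,
    offset=[0, 0],
    input_names=["x"],
    output_name="cropped_fixed",
)
# 2-input mode: crop 'x' to the height/width of 'ref', starting at offset (1, 1).
builder.add_crop(
    name="crop_like_ref",
    left=0, right=0, top=0, bottom=0,
    offset=[1, 1],
    input_names=["x", "ref"],
    output_name="cropped_like_ref",
)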
See Also -------- add_padding, add_convolution, add_pooling """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer_params = spec_layer.crop # Set the parameters offset = [0, 0] if len(input_names) == 1 else offset spec_layer_params.offset.extend(offset) height_border = spec_layer_params.cropAmounts.borderAmounts.add() height_border.startEdgeSize = top height_border.endEdgeSize = bottom width_border = spec_layer_params.cropAmounts.borderAmounts.add() width_border.startEdgeSize = left width_border.endEdgeSize = right return spec_layer def add_simple_rnn( self, name, W_h, W_x, b, hidden_size, input_size, activation, input_names, output_names, output_all=False, reverse_input=False, ): """ Add a simple recurrent layer to the model. Refer to the ``SimpleRecurrentLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. W_h: numpy.array Weights of the recurrent layer's hidden state. Must be of shape ``(hidden_size, hidden_size)``. W_x: numpy.array Weights of the recurrent layer's input. Must be of shape ``(hidden_size, input_size)``. b: numpy.array or None Bias of the recurrent layer's output. If ``None``, bias is ignored. Otherwise it must be of shape ``(hidden_size, )``. hidden_size: int Number of hidden units. This is equal to the number of channels of output shape. input_size: int Number of the number of channels of input shape. activation: str Activation function name. Can be one of the following option: [``'RELU'``, ``'TANH'``, ``'SIGMOID'``, ``'SCALED_TANH'``, ``'SIGMOID_HARD'``, ``'LINEAR'``]. See add_activation for more detailed description. input_names: list of str The input blob names list of this layer, in the order of ``[x, h_input]``. output_names: list of str The output blob names list of this layer, in the order of ``[y, h_output]``. output_all: boolean Whether the recurrent layer should output at every time step. - If False, the output is the result after the final state update. - If True, the output is a sequence, containing outputs at all time steps. reverse_input: boolean Whether the recurrent layer should process the input sequence in the reverse order. - If False, the input sequence order is not reversed. - If True, the input sequence order is reversed. See Also -------- add_activation, add_gru, add_unilstm, add_bidirlstm """ spec_layer = self._add_generic_layer(name, input_names, output_names) spec_layer_params = spec_layer.simpleRecurrent spec_layer_params.reverseInput = reverse_input # set the parameters spec_layer_params.inputVectorSize = input_size spec_layer_params.outputVectorSize = hidden_size if b is not None: spec_layer_params.hasBiasVector = True spec_layer_params.sequenceOutput = output_all activation_f = spec_layer_params.activation _set_recurrent_activation(activation_f, activation) # Write the weights spec_layer_params.weightMatrix.floatValue.extend(W_x.flatten()) spec_layer_params.recursionMatrix.floatValue.extend(W_h.flatten()) if b is not None: spec_layer_params.biasVector.floatValue.extend(b.flatten()) return spec_layer def add_gru( self, name, W_h, W_x, b, hidden_size, input_size, input_names, output_names, activation="TANH", inner_activation="SIGMOID_HARD", output_all=False, reverse_input=False, ): """ Add a Gated-Recurrent Unit (GRU) layer to the model. Refer to the ``GRULayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. 
W_h: [numpy.array] List of recursion weight matrices. The ordering is ``[R_z, R_r, R_o]``, where ``R_z``, ``R_r`` and ``R_o`` are weight matrices at update gate, reset gate and output gate. The shapes of these matrices are ``(hidden_size, hidden_size)``. W_x: [numpy.array] List of input weight matrices. The ordering is ``[W_z, W_r, W_o]``, where ``W_z``, ``W_r``, and ``W_o`` are weight matrices at update gate, reset gate and output gate. The shapes of these matrices are ``(hidden_size, input_size)``. b: [numpy.array] or None List of biases of the GRU layer. The ordering is ``[b_z, b_r, b_o]``, where ``b_z``, ``b_r``, and ``b_o`` are biases at update gate, reset gate and output gate. If ``None``, biases are ignored. Otherwise the shapes of the biases are ``(hidden_size, )``. hidden_size: int Number of hidden units. This is equal to the number of channels of output shape. input_size: int Number of the number of channels of input shape. activation: str Activation function used at the output gate. Can be one of the following options: [``'RELU'``, ``'TANH'``, ``'SIGMOID'``, ``'SCALED_TANH'``, ``'SIGMOID_HARD'``, ``'LINEAR'``]. Defaults to ``'TANH'``. See add_activation for more detailed description. inner_activation: str Inner activation function used at update and reset gates. Can be one of the following options: [``'RELU'``, ``'TANH'``, ``'SIGMOID'``, ``'SCALED_TANH'``, ``'SIGMOID_HARD'``, ``'LINEAR'``]. Defaults to ``'SIGMOID_HARD'``. See add_activation for more detailed description. input_names: list of str The input blob names list of this layer, in the order of ``[x, h_input]``. output_names: list of str The output blob names list of this layer, in the order of ``[y, h_output]``. output_all: boolean Whether the recurrent layer should output at every time step. - If False, the output is the result after the final state update. - If True, the output is a sequence, containing outputs at all time steps. reverse_input: boolean Whether the recurrent layer should process the input sequence in the reverse order. - If False, the input sequence order is not reversed. - If True, the input sequence order is reversed. 
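# A usage sketch for add_gru using the weight ordering documented above:
# W_h = [R_z, R_r, R_o] with shapes (hidden_size, hidden_size) and
# W_x = [W_z, W_r, W_o] with shapes (hidden_size, input_size). The builder
# construction, blob names, and sizes are illustrative assumptions.
import numpy as np
from coremltools.models import datatypes
from coremltools.models.neural_network import NeuralNetworkBuilder

hidden_size, input_size = 32, 16
builder = NeuralNetworkBuilder(
    input_features=[("x", datatypes.Array(input_size)),
                    ("h_in", datatypes.Array(hidden_size))],
    output_features=[("y", datatypes.Array(hidden_size)),
                     ("h_out", datatypes.Array(hidden_size))],
)
W_h = [np.random.rand(hidden_size, hidden_size) for _ in range(3)]  # [R_z, R_r, R_o]
W_x = [np.random.rand(hidden_size, input_size) for _ in range(3)]   # [W_z, W_r, W_o]
b = [np.random.rand(hidden_size) for _ in range(3)]                 # [b_z, b_r, b_o]
builder.add_gru(
    name="gru1",
    W_h=W_h, W_x=W_x, b=b,
    hidden_size=hidden_size,
    input_size=input_size,
    input_names=["x", "h_in"],
    output_names=["y", "h_out"],
    output_all=True,
)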
See Also -------- add_activation, add_simple_rnn, add_unilstm, add_bidirlstm """ spec_layer = self._add_generic_layer(name, input_names, output_names) spec_layer_params = spec_layer.gru # set the parameters spec_layer_params.inputVectorSize = input_size spec_layer_params.outputVectorSize = hidden_size if b is not None: spec_layer_params.hasBiasVectors = True spec_layer_params.sequenceOutput = output_all spec_layer_params.reverseInput = reverse_input activation_f = spec_layer_params.activations.add() activation_g = spec_layer_params.activations.add() _set_recurrent_activation(activation_f, inner_activation) _set_recurrent_activation(activation_g, activation) # Write the weights R_z, R_r, R_o = W_h W_z, W_r, W_o = W_x spec_layer_params.updateGateWeightMatrix.floatValue.extend(W_z.flatten()) spec_layer_params.resetGateWeightMatrix.floatValue.extend(W_r.flatten()) spec_layer_params.outputGateWeightMatrix.floatValue.extend(W_o.flatten()) spec_layer_params.updateGateRecursionMatrix.floatValue.extend(R_z.flatten()) spec_layer_params.resetGateRecursionMatrix.floatValue.extend(R_r.flatten()) spec_layer_params.outputGateRecursionMatrix.floatValue.extend(R_o.flatten()) if b is not None: b_z, b_r, b_o = b spec_layer_params.updateGateBiasVector.floatValue.extend(b_z.flatten()) spec_layer_params.resetGateBiasVector.floatValue.extend(b_r.flatten()) spec_layer_params.outputGateBiasVector.floatValue.extend(b_o.flatten()) return spec_layer def add_unilstm( self, name, W_h, W_x, b, hidden_size, input_size, input_names, output_names, inner_activation="SIGMOID", cell_state_update_activation="TANH", output_activation="TANH", peep=None, output_all=False, forget_bias=False, coupled_input_forget_gate=False, cell_clip_threshold=50000.0, reverse_input=False, ): """ Add a Uni-directional LSTM layer to the model. Refer to the ``UniDirectionalLSTMLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. W_h: [numpy.array] List of recursion weight matrices. The ordering is [R_i, R_f, R_o, R_z], where R_i, R_f, R_o, R_z are weight matrices at input gate, forget gate, output gate and cell gate. The shapes of these matrices are (hidden_size, hidden_size). W_x: [numpy.array] List of input weight matrices. The ordering is [W_i, W_f, W_o, W_z], where W_i, W_f, W_o, W_z are weight matrices at input gate, forget gate, output gate and cell gate. The shapes of these matrices are (hidden_size, input_size). b: [numpy.array] or None List of biases. The ordering is [b_i, b_f, b_o, b_z], where b_i, b_f, b_o, b_z are biases at input gate, forget gate, output gate and cell gate. If ``None``, biases are ignored. Otherwise the shapes of the biases are (hidden_size, ). hidden_size: int Number of hidden units. This is equal to the number of channels of output shape. input_size: int Number of the number of channels of input shape. input_names: list of str The input blob names list of this layer, in the order of [x, h_input, c_input]. output_names: list of str The output blob names list of this layer, in the order of [y, h_output, c_output]. inner_activation: str Inner activation function used at input and forget gate. Can be one of the following option: ['RELU', 'TANH', 'SIGMOID', 'SCALED_TANH', 'SIGMOID_HARD', 'LINEAR']. cell_state_update_activation: str Cell state update activation function used at the cell state update gate. ['RELU', 'TANH', 'SIGMOID', 'SCALED_TANH', 'SIGMOID_HARD', 'LINEAR']. 
output_activation: str Activation function used at the output gate. Can be one of the following option: ['RELU', 'TANH', 'SIGMOID', 'SCALED_TANH', 'SIGMOID_HARD', 'LINEAR']. peep: [numpy.array] or None List of peephole vectors. The ordering is [p_i, p_f, p_o], where p_i, p_f, and p_o are peephole vectors at input gate, forget gate, output gate. The shapes of the peephole vectors are (hidden_size,). output_all: boolean Whether the LSTM layer should output at every time step. - If False, the output is the result after the final state update. - If True, the output is a sequence, containing outputs at all time steps. forget_bias: boolean If True, a vector of 1s is added to forget gate bias. coupled_input_forget_gate: boolean If True, the input gate and forget gate is coupled. i.e. forget gate is not used. cell_clip_threshold: float The limit on the maximum and minimum values on the cell state. If not provided, it is defaulted to 50.0. reverse_input: boolean Whether the LSTM layer should process the input sequence in the reverse order. - If False, the input sequence order is not reversed. - If True, the input sequence order is reversed. See Also -------- add_activation, add_simple_rnn, add_gru, add_bidirlstm """ spec_layer = self._add_generic_layer(name, input_names, output_names) spec_layer_params = spec_layer.uniDirectionalLSTM params = spec_layer_params.params weight_params = spec_layer_params.weightParams # set the parameters spec_layer_params.inputVectorSize = input_size spec_layer_params.outputVectorSize = hidden_size params.sequenceOutput = output_all params.forgetBias = False if b is not None: params.hasBiasVectors = True if peep is not None: params.hasPeepholeVectors = True params.coupledInputAndForgetGate = coupled_input_forget_gate params.cellClipThreshold = cell_clip_threshold params.forgetBias = forget_bias spec_layer_params.reverseInput = reverse_input activation_f = spec_layer_params.activations.add() activation_g = spec_layer_params.activations.add() activation_h = spec_layer_params.activations.add() _set_recurrent_activation(activation_f, inner_activation) _set_recurrent_activation(activation_g, cell_state_update_activation) _set_recurrent_activation(activation_h, output_activation) # Write the weights R_i, R_f, R_o, R_z = W_h W_i, W_f, W_o, W_z = W_x weight_params.inputGateWeightMatrix.floatValue.extend(W_i.flatten()) weight_params.forgetGateWeightMatrix.floatValue.extend(W_f.flatten()) weight_params.outputGateWeightMatrix.floatValue.extend(W_o.flatten()) weight_params.blockInputWeightMatrix.floatValue.extend(W_z.flatten()) weight_params.inputGateRecursionMatrix.floatValue.extend(R_i.flatten()) weight_params.forgetGateRecursionMatrix.floatValue.extend(R_f.flatten()) weight_params.outputGateRecursionMatrix.floatValue.extend(R_o.flatten()) weight_params.blockInputRecursionMatrix.floatValue.extend(R_z.flatten()) if b is not None: b_i, b_f, b_o, b_z = b weight_params.inputGateBiasVector.floatValue.extend(b_i.flatten()) weight_params.forgetGateBiasVector.floatValue.extend(b_f.flatten()) weight_params.outputGateBiasVector.floatValue.extend(b_o.flatten()) weight_params.blockInputBiasVector.floatValue.extend(b_z.flatten()) if peep is not None: p_i, p_f, p_o = peep weight_params.inputGatePeepholeVector.floatValue.extend(p_i.flatten()) weight_params.forgetGatePeepholeVector.floatValue.extend(p_f.flatten()) weight_params.outputGatePeepholeVector.floatValue.extend(p_o.flatten()) return spec_layer def add_bidirlstm( self, name, W_h, W_x, b, W_h_back, W_x_back, b_back, hidden_size, 
input_size, input_names, output_names, inner_activation="SIGMOID", cell_state_update_activation="TANH", output_activation="TANH", peep=None, peep_back=None, output_all=False, forget_bias=False, coupled_input_forget_gate=False, cell_clip_threshold=50000.0, ): """ Add a Bi-directional LSTM layer to the model. Refer to the ``BiDirectionalLSTMLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. W_h: [numpy.array] List of recursion weight matrices for the forward layer. The ordering is ``[R_i, R_f, R_o, R_z]``, where ``R_i``, ``R_f``, ``R_o``, and ``R_z`` are weight matrices at input gate, forget gate, output gate and cell gate. The shapes of these matrices are ``(hidden_size, hidden_size)``. W_x: [numpy.array] List of input weight matrices for the forward layer. The ordering is ``[W_i, W_f, W_o, W_z]``, where ``W_i``, ``W_f``, ``W_o``, and ``W_z`` are weight matrices at input gate, forget gate, output gate and cell gate. The shapes of these matrices are ``(hidden_size, input_size)``. b: [numpy.array] List of biases for the forward layer. The ordering is ``[b_i, b_f, b_o, b_z]``, where ``b_i``, ``b_f``, ``b_o``, and ``b_z`` are biases at input gate, forget gate, output gate and cell gate. If ``None``, biases are ignored. Otherwise the shapes of the biases are ``(hidden_size, )``. W_h_back: [numpy.array] List of recursion weight matrices for the backward layer. The ordering is ``[R_i, R_f, R_o, R_z]``, where ``R_i``, ``R_f``, ``R_o``, and ``R_z`` are weight matrices at input gate, forget gate, output gate and cell gate. The shapes of these matrices are ``(hidden_size, hidden_size)``. W_x_back: [numpy.array] List of input weight matrices for the backward layer. The ordering is `[W_i, W_f, W_o, W_z]``, where ``W_i``, ``W_f``, ``W_o``, and ``W_z`` are weight matrices at input gate, forget gate, output gate and cell gate. The shapes of these matrices are ``(hidden_size, input_size)``. b_back: [numpy.array] List of biases for the backward layer. The ordering is ``[b_i, b_f, b_o, b_z]``, where ``b_i``, ``b_f``, ``b_o``, and ``b_z`` are biases at input gate, forget gate, output gate and cell gate. The shapes of the biases ``(hidden_size)``. hidden_size: int Number of hidden units. This is equal to the number of channels of output shape. input_size: int Number of the number of channels of input shape. input_names: list of str The input blob names of this layer, in the order of ``[x, h_input, c_input, h_reverse_input, c_reverse_input]``. output_names: list of str The output blob names of this layer, in the order of ``[y, h_output, c_output, h_reverse_output, c_reverse_output]``. inner_activation: str Inner activation function used at input and forget gate. Can be one of the following options: [``'RELU'``, ``'TANH'``, ``'SIGMOID'``, ``'SCALED_TANH'``, ``'SIGMOID_HARD'``, ``'LINEAR'``]. Defaults to ``'SIGMOID'``. cell_state_update_activation: str Cell state update activation function used at the cell state update gate. Can be one of the following options: [``'RELU'``, ``'TANH'``, ``'SIGMOID'``, ``'SCALED_TANH'``, ``'SIGMOID_HARD'``, ``'LINEAR'``]. Defaults to ``'TANH'``. output_activation: str Activation function used at the output gate. Can be one of the following options: [``'RELU'``, ``'TANH'``, ``'SIGMOID'``, ``'SCALED_TANH'``, ``'SIGMOID_HARD'``, ``'LINEAR'``]. Defaults to ``'TANH'``. peep: [numpy.array] or None List of peephole vectors for the forward layer. 
The ordering is ``[p_i, p_f, p_o]``, where ``p_i``, ``p_f``, and ``p_o`` are peephole vectors at input gate, forget gate, and output gate. The shapes of the peephole vectors are ``(hidden_size,)``. Defaults to ``None``. peep_back: [numpy.array] or None List of peephole vectors for the backward layer. The ordering is ``[p_i, p_f, p_o]``, where ``p_i``, ``p_f``, and ``p_o`` are peephole vectors at input gate, forget gate, and output gate. The shapes of the peephole vectors are ``(hidden_size,)``. Defaults to ``None``. output_all: boolean Whether the LSTM layer should output at every time step. Defaults to ``False``. - If ``False``, the output is the result after the final state update. - If ``True``, the output is a sequence, containing outputs at all time steps. forget_bias: boolean If ``True``, a vector of 1s is added to forget gate bias. Defaults to ``False``. coupled_input_forget_gate: boolean If ``True``, the input gate and forget gate is coupled. That is, the forget gate is not used. Defaults to ``False``. cell_clip_threshold: float The limit on the maximum and minimum values on the cell state. Defaults to 50.0. See Also -------- add_activation, add_simple_rnn, add_unilstm, add_bidirlstm """ spec_layer = self._add_generic_layer(name, input_names, output_names) spec_layer_params = spec_layer.biDirectionalLSTM params = spec_layer_params.params weight_params = spec_layer_params.weightParams.add() weight_params_back = spec_layer_params.weightParams.add() # set the parameters spec_layer_params.inputVectorSize = input_size spec_layer_params.outputVectorSize = hidden_size if b is not None: params.hasBiasVectors = True params.sequenceOutput = output_all params.forgetBias = forget_bias if peep is not None: params.hasPeepholeVectors = True params.coupledInputAndForgetGate = coupled_input_forget_gate params.cellClipThreshold = cell_clip_threshold # set activations activation_f = spec_layer_params.activationsForwardLSTM.add() activation_g = spec_layer_params.activationsForwardLSTM.add() activation_h = spec_layer_params.activationsForwardLSTM.add() _set_recurrent_activation(activation_f, inner_activation) _set_recurrent_activation(activation_g, cell_state_update_activation) _set_recurrent_activation(activation_h, output_activation) activation_f_back = spec_layer_params.activationsBackwardLSTM.add() activation_g_back = spec_layer_params.activationsBackwardLSTM.add() activation_h_back = spec_layer_params.activationsBackwardLSTM.add() _set_recurrent_activation(activation_f_back, inner_activation) _set_recurrent_activation(activation_g_back, cell_state_update_activation) _set_recurrent_activation(activation_h_back, output_activation) # Write the forward lstm weights R_i, R_f, R_o, R_z = W_h W_i, W_f, W_o, W_z = W_x weight_params.inputGateWeightMatrix.floatValue.extend(W_i.flatten()) weight_params.forgetGateWeightMatrix.floatValue.extend(W_f.flatten()) weight_params.outputGateWeightMatrix.floatValue.extend(W_o.flatten()) weight_params.blockInputWeightMatrix.floatValue.extend(W_z.flatten()) weight_params.inputGateRecursionMatrix.floatValue.extend(R_i.flatten()) weight_params.forgetGateRecursionMatrix.floatValue.extend(R_f.flatten()) weight_params.outputGateRecursionMatrix.floatValue.extend(R_o.flatten()) weight_params.blockInputRecursionMatrix.floatValue.extend(R_z.flatten()) if b is not None: b_i, b_f, b_o, b_z = b weight_params.inputGateBiasVector.floatValue.extend(b_i.flatten()) weight_params.forgetGateBiasVector.floatValue.extend(b_f.flatten()) 
weight_params.outputGateBiasVector.floatValue.extend(b_o.flatten()) weight_params.blockInputBiasVector.floatValue.extend(b_z.flatten()) if peep is not None: p_i, p_f, p_o = peep weight_params.inputGatePeepholeVector.floatValue.extend(p_i.flatten()) weight_params.forgetGatePeepholeVector.floatValue.extend(p_f.flatten()) weight_params.outputGatePeepholeVector.floatValue.extend(p_o.flatten()) # Write the backward lstm weights R_i, R_f, R_o, R_z = W_h_back W_i, W_f, W_o, W_z = W_x_back weight_params_back.inputGateWeightMatrix.floatValue.extend(W_i.flatten()) weight_params_back.forgetGateWeightMatrix.floatValue.extend(W_f.flatten()) weight_params_back.outputGateWeightMatrix.floatValue.extend(W_o.flatten()) weight_params_back.blockInputWeightMatrix.floatValue.extend(W_z.flatten()) weight_params_back.inputGateRecursionMatrix.floatValue.extend(R_i.flatten()) weight_params_back.forgetGateRecursionMatrix.floatValue.extend(R_f.flatten()) weight_params_back.outputGateRecursionMatrix.floatValue.extend(R_o.flatten()) weight_params_back.blockInputRecursionMatrix.floatValue.extend(R_z.flatten()) if b_back is not None: b_i, b_f, b_o, b_z = b_back weight_params_back.inputGateBiasVector.floatValue.extend(b_i.flatten()) weight_params_back.forgetGateBiasVector.floatValue.extend(b_f.flatten()) weight_params_back.outputGateBiasVector.floatValue.extend(b_o.flatten()) weight_params_back.blockInputBiasVector.floatValue.extend(b_z.flatten()) if peep_back is not None: p_i, p_f, p_o = peep_back weight_params_back.inputGatePeepholeVector.floatValue.extend(p_i.flatten()) weight_params_back.forgetGatePeepholeVector.floatValue.extend(p_f.flatten()) weight_params_back.outputGatePeepholeVector.floatValue.extend(p_o.flatten()) return spec_layer def add_flatten(self, name, mode, input_name, output_name): """ Add a flatten layer. Only flattens the channel, height and width axis. Leaves the sequence axis as is. Refer to the ``FlattenLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. mode: int - If mode == 0, the flatten layer is in CHANNEL_FIRST mode. - If mode == 1, the flatten layer is in CHANNEL_LAST mode. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. See Also -------- add_permute, add_reshape """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.flatten # Set the parameters if mode == 0: spec_layer_params.mode = _proto.NeuralNetwork_pb2.FlattenLayerParams.FlattenOrder.Value( "CHANNEL_FIRST" ) elif mode == 1: spec_layer_params.mode = _proto.NeuralNetwork_pb2.FlattenLayerParams.FlattenOrder.Value( "CHANNEL_LAST" ) else: raise NotImplementedError("Unknown flatten mode %d " % mode) return spec_layer def add_slice( self, name, input_name, output_name, axis, start_index=0, end_index=-1, stride=1 ): """ Add a slice layer. Equivalent to to numpy slice [start_index:end_index:stride], start_index is included, while end_index is exclusive. Refer to the ``SliceLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. axis: str axis along which input is sliced. allowed values: 'channel', 'height', 'width' start_index: int must be non-negative. end_index: int negative indexing is supported. stride: int must be positive. 
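# A numpy reading of the slice semantics described above (a sketch, not the
# layer implementation): for a CHW blob, axis='height' with start_index=1,
# end_index=-1, stride=2 corresponds to x[:, 1:-1:2, :]. The array below is
# made up purely for illustration.
import numpy as np

x = np.arange(3 * 8 * 8).reshape(3, 8, 8)   # a CHW blob
sliced = x[:, 1:-1:2, :]                    # axis='height', start=1, end=-1, stride=2
print(sliced.shape)                         # (3, 3, 8)
# The corresponding layer would be added with something like:
#     builder.add_slice(name="slice_h", input_name="x", output_name="x_sliced",
#                       axis="height", start_index=1, end_index=-1, stride=2)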
See Also -------- add_permute, add_reshape """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.slice # Set the parameters if start_index < 0: raise ValueError( "Invalid start_index value %d. Must be non-negative." % start_index ) if stride < 1: raise ValueError("Invalid stride value %d. Must be positive." % stride) spec_layer_params.startIndex = start_index spec_layer_params.endIndex = end_index spec_layer_params.stride = stride axis = axis.lower() if isinstance(axis, str) else axis if axis == "channel": spec_layer_params.axis = _proto.NeuralNetwork_pb2.SliceLayerParams.SliceAxis.Value( "CHANNEL_AXIS" ) elif axis == "height": spec_layer_params.axis = _proto.NeuralNetwork_pb2.SliceLayerParams.SliceAxis.Value( "HEIGHT_AXIS" ) elif axis == "width": spec_layer_params.axis = _proto.NeuralNetwork_pb2.SliceLayerParams.SliceAxis.Value( "WIDTH_AXIS" ) else: raise NotImplementedError("Unsupported Slice axis %s " % axis) return spec_layer def add_slice_by_size(self, name, input_names, output_name, axis, size): """ Add a slice layer. Equivalent to the numpy slice [start_index: start_index+size]. Input is a list of str: [input_tensor, begin_id]. Assume input_tensor has shape (2, 3, 4), and axis=1, size=2. This would produce input_tensor[:, begin_id:begin_id+2, :] Refer to the ``SliceBySizeLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. axis: int axis along which input is sliced. size: int The size of the slice to be taken from the input along ``axis``. See Also -------- add_slice, add_slice_static, add_slice_dynamic """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer_params = spec_layer.sliceBySize if size < 1: raise ValueError("Invalid size value %d. Must be positive." % size) spec_layer_params.axis = axis spec_layer_params.size = size return spec_layer def add_reorganize_data( self, name, input_name, output_name, mode="SPACE_TO_DEPTH", block_size=2 ): """ Add a data reorganization layer of type "SPACE_TO_DEPTH" or "DEPTH_TO_SPACE". Refer to the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. mode: str - If mode == 'SPACE_TO_DEPTH': data is moved from the spatial to the channel dimension. Input is spatially divided into non-overlapping blocks of size block_size X block_size and data from each block is moved to the channel dimension. Output CHW dimensions are: [C * block_size * block_size, H/block_size, W/block_size]. - If mode == 'DEPTH_TO_SPACE': data is moved from the channel to the spatial dimension. Reverse of the operation 'SPACE_TO_DEPTH'. Output CHW dimensions are: [C/(block_size * block_size), H * block_size, W * block_size]. - If mode == 'PIXEL_SHUFFLE': data is moved from the channel to the spatial dimension. Reverse of the operation 'SPACE_TO_DEPTH'. Output CHW dimensions are: [C/(block_size * block_size), H * block_size, W * block_size]. block_size: int Must be greater than 1. Must divide H and W, when mode is 'SPACE_TO_DEPTH'. (block_size * block_size) must divide C when mode is 'DEPTH_TO_SPACE' or 'PIXEL_SHUFFLE'.
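# A numpy check of the SPACE_TO_DEPTH shape relation documented above: a
# (C, H, W) blob becomes (C * block_size**2, H / block_size, W / block_size).
# This is only a shape sketch; the exact channel ordering used by Core ML may
# differ from this reference arrangement.
import numpy as np

def space_to_depth_shape_demo(x, block_size):
    c, h, w = x.shape
    x = x.reshape(c, h // block_size, block_size, w // block_size, block_size)
    x = x.transpose(2, 4, 0, 1, 3)
    return x.reshape(c * block_size * block_size, h // block_size, w // block_size)

x = np.random.rand(3, 4, 6)
print(space_to_depth_shape_demo(x, 2).shape)   # (12, 2, 3)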
See Also -------- add_flatten, add_reshape """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.reorganizeData # Set the parameters if block_size < 2: raise ValueError( "Invalid block_size value %d. Must be greater than 1." % block_size ) spec_layer_params.blockSize = block_size mode = mode.upper() if isinstance(mode, str) else mode if mode == "SPACE_TO_DEPTH": spec_layer_params.mode = ( _proto.NeuralNetwork_pb2.ReorganizeDataLayerParams.ReorganizationType.Value( "SPACE_TO_DEPTH" ) ) elif mode == "DEPTH_TO_SPACE": spec_layer_params.mode = ( _proto.NeuralNetwork_pb2.ReorganizeDataLayerParams.ReorganizationType.Value( "DEPTH_TO_SPACE" ) ) elif mode == "PIXEL_SHUFFLE": if self.spec and ( not self.spec.specificationVersion or self.spec.specificationVersion < _SPECIFICATION_VERSION_IOS_14 ): self.spec.specificationVersion = _SPECIFICATION_VERSION_IOS_14 spec_layer_params.mode = ( _proto.NeuralNetwork_pb2.ReorganizeDataLayerParams.ReorganizationType.Value( "PIXEL_SHUFFLE" ) ) else: raise NotImplementedError("Unknown reorganization mode %s." % mode) return spec_layer def add_batchnorm( self, name, channels, gamma, beta, mean=None, variance=None, input_name="data", output_name="out", compute_mean_var=False, instance_normalization=False, epsilon=1e-5, ): """ Add a batch normalization layer. Batch normalization operation is defined as: ``y = gamma * (x - mean) / sqrt(variance + epsilon) + beta`` Refer to the ``BatchnormLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. channels: int Number of channels of the input blob. gamma: numpy.array Values of gamma. Must be numpy array of shape ``(channels, )``. beta: numpy.array Values of beta. Must be numpy array of shape ``(channels, )``. mean: numpy.array Means of the input blob on each channel. Must be numpy array of shape ``(channels, )``. variance: Variances of the input blob on each channel. Must be numpy array of shape ``(channels, )``. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. compute_mean_var: bool Set to ``True`` if mean and variance is to be computed from the input data. instance_normalization: bool Set compute_mean_var and this to ``True`` to perform instance normalization. That is, mean and variance are computed from the single input instance. epsilon: float Value of epsilon. Defaults to ``1e-5`` if not specified. See Also -------- add_convolution, add_pooling, add_inner_product """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.batchnorm # Set the parameters spec_layer_params.channels = channels spec_layer_params.gamma.floatValue.extend(gamma.flatten()) spec_layer_params.beta.floatValue.extend(beta.flatten()) spec_layer_params.epsilon = epsilon spec_layer_params.computeMeanVar = compute_mean_var spec_layer_params.instanceNormalization = instance_normalization if compute_mean_var: if not instance_normalization: raise NotImplementedError( "Batch-instance norm is currently not supported" ) if not compute_mean_var: spec_layer_params.mean.floatValue.extend(mean.flatten()) spec_layer_params.variance.floatValue.extend(variance.flatten()) return spec_layer def add_permute(self, name, dim, input_name, output_name): """ Add a permute layer. 
Assumes that the input has dimensions in the order [Seq, C, H, W] Refer to the ``PermuteLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. dim: tuple The order in which to permute the input dimensions = [seq,C,H,W]. Must have length 4 and a permutation of ``[0, 1, 2, 3]``. examples: Lets say input has shape: [seq, C, H, W]. If ``dim`` is set to ``[0, 3, 1, 2]``, then the output has shape ``[W,C,H]`` and has the same sequence length that of the input. If ``dim`` is set to ``[3, 1, 2, 0]``, and the input is a sequence of data with length ``Seq`` and shape ``[C, 1, 1]``, then the output is a unit sequence of data with shape ``[C, 1, Seq]``. If ``dim`` is set to ``[0, 3, 2, 1]``, the output is a reverse of the input: ``[C, H, W] -> [W, H, C]``. If ``dim`` is not set, or is set to ``[0, 1, 2, 3]``, the output is the same as the input. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. See Also -------- add_flatten, add_reshape """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.permute spec_layer_params.axis.extend(list(dim)) if len(dim) != 4: raise ValueError("Length of the 'dim' parameter must be equal to 4") return spec_layer def add_reshape(self, name, input_name, output_name, target_shape, mode): """ Add a reshape layer. Refer to the ``ReshapeLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. target_shape: tuple Shape of the output blob. The product of target_shape must be equal to the shape of the input blob. Can be either length 3 (C,H,W) or length 4 (Seq,C,H,W). mode: int - If mode == 0, the reshape layer is in CHANNEL_FIRST mode. - If mode == 1, the reshape layer is in CHANNEL_LAST mode. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. See Also -------- add_flatten, add_permute """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.reshape spec_layer_params.targetShape.extend(target_shape) if mode == 0: spec_layer_params.mode = _proto.NeuralNetwork_pb2.ReshapeLayerParams.ReshapeOrder.Value( "CHANNEL_FIRST" ) else: spec_layer_params.mode = _proto.NeuralNetwork_pb2.ReshapeLayerParams.ReshapeOrder.Value( "CHANNEL_LAST" ) if len(target_shape) != 4 and len(target_shape) != 3: raise ValueError( "Length of the 'target-shape' parameter must be equal to 3 or 4" ) self.rank_dict[output_name] = len(target_shape) return spec_layer def add_reduce(self, name, input_name, output_name, axis, mode, epsilon=1e-6): """ Add a reduce layer. Applies the function specified by the parameter mode, along dimension(s) specified by the parameter axis. Refer to the ``ReduceLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. axis: str dimensions along which the reduction operation is applied. Allowed values: 'CHW', 'HW', 'C', 'H', 'W' mode: str Reduction operation to be applied. Allowed values: 'sum', 'avg', 'prod', 'logsum', 'sumsquare', 'L1', 'L2', 'max', 'min', 'argmax'. 'argmax' is only supported with axis values 'C', 'H' and 'W'. epsilon: float number that is added to the input when 'logsum' function is applied. 
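# A numpy reading of the reduce semantics described above (a sketch, not the
# layer implementation): mode='sum' with axis='HW' sums a CHW blob over its
# height and width dimensions, leaving one value per channel. Blob names in
# the commented layer call are illustrative assumptions.
import numpy as np

x = np.random.rand(4, 5, 5)               # CHW
per_channel_sum = x.sum(axis=(1, 2))      # what mode='sum', axis='HW' computes
print(per_channel_sum.shape)              # (4,)
# The corresponding layer would be added with something like:
#     builder.add_reduce(name="reduce_hw", input_name="x", output_name="x_sum",
#                        axis="HW", mode="sum")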
See Also -------- add_activation """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.reduce spec_layer_params.epsilon = epsilon mode = mode.lower() if isinstance(mode, str) else mode if mode == "sum": spec_layer_params.mode = ( _proto.NeuralNetwork_pb2.ReduceLayerParams.ReduceOperation.Value("SUM") ) elif mode == "avg": spec_layer_params.mode = ( _proto.NeuralNetwork_pb2.ReduceLayerParams.ReduceOperation.Value("AVG") ) elif mode == "prod": spec_layer_params.mode = ( _proto.NeuralNetwork_pb2.ReduceLayerParams.ReduceOperation.Value("PROD") ) elif mode == "logsum": spec_layer_params.mode = ( _proto.NeuralNetwork_pb2.ReduceLayerParams.ReduceOperation.Value("LOGSUM") ) elif mode == "sumsquare": spec_layer_params.mode = ( _proto.NeuralNetwork_pb2.ReduceLayerParams.ReduceOperation.Value("SUMSQUARE") ) elif mode == "l1": spec_layer_params.mode = ( _proto.NeuralNetwork_pb2.ReduceLayerParams.ReduceOperation.Value("L1") ) elif mode == "l2": spec_layer_params.mode = ( _proto.NeuralNetwork_pb2.ReduceLayerParams.ReduceOperation.Value("L2") ) elif mode == "max": spec_layer_params.mode = ( _proto.NeuralNetwork_pb2.ReduceLayerParams.ReduceOperation.Value("MAX") ) elif mode == "min": spec_layer_params.mode = ( _proto.NeuralNetwork_pb2.ReduceLayerParams.ReduceOperation.Value("MIN") ) elif mode == "argmax": spec_layer_params.mode = ( _proto.NeuralNetwork_pb2.ReduceLayerParams.ReduceOperation.Value("ARGMAX") ) else: raise NotImplementedError("Unknown reduction operation %s." % mode) axis = axis.upper() if isinstance(axis, str) else axis if axis == "CHW": spec_layer_params.axis = _proto.NeuralNetwork_pb2.ReduceLayerParams.ReduceAxis.Value( "CHW" ) elif axis == "HW": spec_layer_params.axis = _proto.NeuralNetwork_pb2.ReduceLayerParams.ReduceAxis.Value( "HW" ) elif axis == "C": spec_layer_params.axis = _proto.NeuralNetwork_pb2.ReduceLayerParams.ReduceAxis.Value( "C" ) elif axis == "H": spec_layer_params.axis = _proto.NeuralNetwork_pb2.ReduceLayerParams.ReduceAxis.Value( "H" ) elif axis == "W": spec_layer_params.axis = _proto.NeuralNetwork_pb2.ReduceLayerParams.ReduceAxis.Value( "W" ) else: raise NotImplementedError("Unknown reduction axis %s." % axis) return spec_layer def add_lrn(self, name, input_name, output_name, alpha, beta, local_size, k=1.0): """ Add a LRN (local response normalization) layer. Supports "across" channels normalization. Refer to the ``LRNLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. alpha: float multiplicative constant in the denominator. beta: float exponent of the normalizing term in the denominator. k: float bias term in the denominator. Must be positive. local_size: int size of the neighborhood along the channel axis. See Also -------- add_l2_normalize, add_mvn """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.lrn spec_layer_params.alpha = alpha spec_layer_params.beta = beta spec_layer_params.localSize = local_size spec_layer_params.k = k return spec_layer def add_mvn( self, name, input_name, output_name, across_channels=True, normalize_variance=True, epsilon=1e-5, ): """ Add an MVN (mean variance normalization) layer. Computes mean, variance and normalizes the input. Refer to the ``MeanVarianceNormalizeLayerParams`` message in the specification (NeuralNetwork.proto) for more details. 
Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. across_channels: boolean If False, each channel plane is normalized separately. If True, mean/variance is computed across all C, H and W dimensions. normalize_variance: boolean If False, only mean subtraction is performed. epsilon: float small bias to avoid division by zero. See Also -------- add_l2_normalize, add_lrn """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.mvn spec_layer_params.acrossChannels = across_channels spec_layer_params.normalizeVariance = normalize_variance spec_layer_params.epsilon = epsilon return spec_layer def add_l2_normalize(self, name, input_name, output_name, epsilon=1e-5): """ Add L2 normalize layer. Normalizes the input by the L2 norm, i.e. divides by the square root of the sum of squares of all elements of the input along C, H and W dimensions. Refer to the ``L2NormalizeLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. epsilon: float small bias to avoid division by zero. See Also -------- add_mvn, add_lrn """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.l2normalize spec_layer_params.epsilon = epsilon return spec_layer def add_unary( self, name, input_name, output_name, mode, alpha=1.0, shift=0, scale=1.0, epsilon=None, ): """ Add a Unary layer. Applies the specified function (mode) to all the elements of the input. Prior to the application of the function, the input can be scaled and shifted by using the 'scale', 'shift' parameters. Refer to the ``UnaryFunctionLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. mode: str Unary function. Allowed values: 'sqrt', 'rsqrt', 'inverse', 'power', 'exp', 'log', 'abs', 'threshold'. alpha: float constant used with modes 'power' and 'threshold'. shift, scale: float input is modified by scale and shift prior to the application of the unary function. epsilon: float small bias to prevent division by zero.
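A minimal usage sketch (illustrative only; assumes ``builder`` is an existing ``NeuralNetworkBuilder`` with a blob named ``"x"``)::

    # Element-wise max(2.0 * x + 1.0, 0.0): the input is first scaled and
    # shifted, then the 'threshold' function with constant alpha is applied.
    builder.add_unary(
        name="thresholded",
        input_name="x",
        output_name="y",
        mode="threshold",
        alpha=0.0,
        shift=1.0,
        scale=2.0,
    )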
See Also -------- add_activation """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.unary if epsilon is None: # Use the default value of epsilon to be 1e-4, instead of 1e-6, if mode = "rsqrt" or "inverse" if mode == "inverse" or mode == "rsqrt": epsilon = 1e-4 elif mode == "log": epsilon = 1e-45 else: epsilon = 1e-6 spec_layer_params.epsilon = epsilon spec_layer_params.alpha = alpha spec_layer_params.shift = shift spec_layer_params.scale = scale mode = mode.lower() if isinstance(mode, str) else mode if mode == "sqrt": spec_layer_params.type = ( _proto.NeuralNetwork_pb2.UnaryFunctionLayerParams.Operation.Value("SQRT") ) elif mode == "rsqrt": spec_layer_params.type = ( _proto.NeuralNetwork_pb2.UnaryFunctionLayerParams.Operation.Value("RSQRT") ) elif mode == "inverse": spec_layer_params.type = ( _proto.NeuralNetwork_pb2.UnaryFunctionLayerParams.Operation.Value("INVERSE") ) elif mode == "power": spec_layer_params.type = ( _proto.NeuralNetwork_pb2.UnaryFunctionLayerParams.Operation.Value("POWER") ) elif mode == "exp": spec_layer_params.type = ( _proto.NeuralNetwork_pb2.UnaryFunctionLayerParams.Operation.Value("EXP") ) elif mode == "log": spec_layer_params.type = ( _proto.NeuralNetwork_pb2.UnaryFunctionLayerParams.Operation.Value("LOG") ) elif mode == "abs": spec_layer_params.type = ( _proto.NeuralNetwork_pb2.UnaryFunctionLayerParams.Operation.Value("ABS") ) elif mode == "threshold": spec_layer_params.type = ( _proto.NeuralNetwork_pb2.UnaryFunctionLayerParams.Operation.Value("THRESHOLD") ) else: raise NotImplementedError("Unknown unary function %s " % mode) return spec_layer def add_split(self, name, input_name, output_names): """ Add a split layer that uniformly splits the input along the channel dimension to produce multiple outputs. Refer to the ``SplitLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_names: list of str List of output blob names of this layer. See Also -------- add_elementwise """ spec_layer = self._add_generic_layer(name, [input_name], output_names) spec_layer_params = spec_layer.split spec_layer_params.nOutputs = len(output_names) return spec_layer def add_load_constant(self, name, output_name, constant_value, shape): """ Add a load constant layer. Refer to the ``LoadConstantLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. output_name: str The output blob name of this layer. constant_value: numpy.array value of the constant as a numpy array. shape: list of int or tuple of int List of ints representing the shape of the constant. Must be of length 3: [C,H,W] See Also -------- add_elementwise """ spec_layer = self._add_generic_layer(name, [], [output_name]) spec_layer_params = spec_layer.loadConstant data = spec_layer_params.data data.floatValue.extend(constant_value.flatten()) spec_layer_params.shape.extend(shape) self.rank_dict[output_name] = 5 if len(data.floatValue) != _np.prod(shape): raise ValueError( "Dimensions of 'shape' do not match the size of the provided constant" ) if not self._disable_rank5_shape_mapping: if len(shape) != 3: raise ValueError("'shape' must be of length 3") return spec_layer def add_custom(self, name, input_names, output_names, custom_proto_spec=None): """ Add a custom layer. 
Refer to the ``CustomLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names to this layer. output_names: list of str The output blob names from this layer. custom_proto_spec: CustomLayerParams A protobuf CustomLayerParams message. This can also be left blank and filled in later. """ # custom layers require a newer specification version from coremltools import _MINIMUM_CUSTOM_LAYER_SPEC_VERSION if self.spec: self.spec.specificationVersion = max( self.spec.specificationVersion, _MINIMUM_CUSTOM_LAYER_SPEC_VERSION ) spec_layer = self._add_generic_layer(name, input_names, output_names) spec_layer.custom.MergeFromString(b"") if custom_proto_spec: spec_layer.custom.CopyFrom(custom_proto_spec) return spec_layer def add_resize_bilinear( self, name, input_name, output_name, target_height=1, target_width=1, mode="ALIGN_ENDPOINTS_MODE", ): """ Add a resize bilinear layer to the model. A layer that resize the input to a given spatial size using bilinear interpolation. Refer to the ``ResizeBilinearLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. target_height: int Output height dimension. target_width: int Output width dimension. mode: str Following values are supported: 'STRICT_ALIGN_ENDPOINTS_MODE', 'ALIGN_ENDPOINTS_MODE', 'UPSAMPLE_MODE', 'ROI_ALIGN_MODE'. This parameter determines the sampling grid used for bilinear interpolation. See Also -------- add_upsample """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.resizeBilinear spec_layer_params.targetSize.append(target_height) spec_layer_params.targetSize.append(target_width) mode = mode.upper() if isinstance(mode, str) else mode if mode == "ALIGN_ENDPOINTS_MODE": spec_layer_params.mode.samplingMethod = ( _proto.NeuralNetwork_pb2.SamplingMode.Method.Value("ALIGN_ENDPOINTS_MODE") ) elif mode == "STRICT_ALIGN_ENDPOINTS_MODE": spec_layer_params.mode.samplingMethod = ( _proto.NeuralNetwork_pb2.SamplingMode.Method.Value("STRICT_ALIGN_ENDPOINTS_MODE") ) elif mode == "UPSAMPLE_MODE": spec_layer_params.mode.samplingMethod = ( _proto.NeuralNetwork_pb2.SamplingMode.Method.Value("UPSAMPLE_MODE") ) elif mode == "ROI_ALIGN_MODE": spec_layer_params.mode.samplingMethod = ( _proto.NeuralNetwork_pb2.SamplingMode.Method.Value("ROI_ALIGN_MODE") ) else: raise ValueError("Unsupported resize bilinear mode %s" % mode) return spec_layer def add_crop_resize( self, name, input_names, output_name, target_height=1, target_width=1, mode="STRICT_ALIGN_ENDPOINTS_MODE", normalized_roi=False, box_indices_mode="CORNERS_HEIGHT_FIRST", spatial_scale=1.0, ): """ Add a crop resize layer to the model. A layer that extracts cropped spatial patches or RoIs (regions of interest) from the input and resizes them to a pre-specified size using bilinear interpolation. Note that RoI Align layer can be implemented with this layer followed by a pooling layer. Refer to the ``CropResizeLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str * Must be a list of two names: image feature map and crop indices/RoI input. * First input corresponds to a blob with shape ``[1, Batch, C, H_in, W_in]``. 
This represents a batch of input image feature data with ``C`` channels. * The second input shape must be ``[N, 1, 4, 1, 1]`` or ``[N, 1, 5, 1, 1]``. This represents the bounding box coordinates for ``N`` patches/RoIs. * ``N``: number of patches/RoIs to be extracted. * If RoI shape = ``[N, 1, 4, 1, 1]``, the channel axis corresponds to the four coordinates specifying the bounding box. All the ``N`` RoIs are extracted from all the batches of the input. * If RoI shape = ``[N, 1, 5, 1, 1]``, the first element of the channel axis specifies the input batch id from which to extract the RoI and must be in the interval ``[0, Batch - 1]``. That is, the ``n``-th RoI is extracted from the ``RoI[n,0,0,0]``-th input batch id. The last four elements of the channel axis specify the bounding box coordinates. output_name: str The output blob name of this layer. target_height: int Output height dimension. target_width: int Output width dimension. mode: str * The following values are supported: ``'STRICT_ALIGN_ENDPOINTS_MODE'``, ``'ALIGN_ENDPOINTS_MODE'``, ``'UPSAMPLE_MODE'``, ``'ROI_ALIGN_MODE'``. * This parameter determines the sampling grid used for bilinear interpolation. normalized_roi: bool * If true, the bounding box coordinates must be in the interval ``[0, 1]``. They are scaled by ``(input_height - 1)``, ``(input_width - 1)``; that is, based on the input spatial dimensions. * If false, the bounding box coordinates must be in the interval ``[0, input_height - 1]`` and ``[0, input_width - 1]``, respectively for height and width dimensions. box_indices_mode: str * The following values are supported: ``'CORNERS_HEIGHT_FIRST'``, ``'CORNERS_WIDTH_FIRST'``, ``'CENTER_SIZE_HEIGHT_FIRST'``, ``'CENTER_SIZE_WIDTH_FIRST'``. * Representation used to interpret the bounding box coordinates (RoI) input. * ``'CORNERS_HEIGHT_FIRST'``: ``[h_start, w_start, h_end, w_end]`` * ``'CORNERS_WIDTH_FIRST'``: ``[w_start, h_start, w_end, h_end]`` * ``'CENTER_SIZE_HEIGHT_FIRST'``: ``[h_center, w_center, box_height, box_width]`` * ``'CENTER_SIZE_WIDTH_FIRST'``: ``[w_center, h_center, box_width, box_height]`` spatial_scale: float Additional spatial scale that multiplies the bounding box coordinates. Generally used while implementing the RoI Align layer, which uses unnormalized RoI coordinates along with a spatial scale less than or equal to 1.
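A minimal usage sketch (illustrative only; assumes ``builder`` already contains a feature-map blob ``"feature_map"`` of shape ``[1, Batch, C, H_in, W_in]`` and an RoI blob ``"rois"`` of shape ``[N, 1, 4, 1, 1]``)::

    # Crop the N regions described by "rois" and resize each to 7 x 7.
    builder.add_crop_resize(
        name="roi_crop",
        input_names=["feature_map", "rois"],
        output_name="cropped_features",
        target_height=7,
        target_width=7,
        mode="ROI_ALIGN_MODE",
        normalized_roi=False,
        box_indices_mode="CORNERS_HEIGHT_FIRST",
        spatial_scale=1.0 / 16.0,
    )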
See Also -------- add_resize_bilinear, add_crop """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer_params = spec_layer.cropResize spec_layer_params.targetSize.append(target_height) spec_layer_params.targetSize.append(target_width) spec_layer_params.normalizedCoordinates = normalized_roi spec_layer_params.spatialScale = spatial_scale mode = mode.upper() if isinstance(mode, str) else mode box_indices_mode = ( box_indices_mode.upper() if isinstance(box_indices_mode, str) else box_indices_mode ) if mode == "ALIGN_ENDPOINTS_MODE": spec_layer_params.mode.samplingMethod = ( _proto.NeuralNetwork_pb2.SamplingMode.Method.Value("ALIGN_ENDPOINTS_MODE") ) elif mode == "STRICT_ALIGN_ENDPOINTS_MODE": spec_layer_params.mode.samplingMethod = ( _proto.NeuralNetwork_pb2.SamplingMode.Method.Value("STRICT_ALIGN_ENDPOINTS_MODE") ) elif mode == "UPSAMPLE_MODE": spec_layer_params.mode.samplingMethod = ( _proto.NeuralNetwork_pb2.SamplingMode.Method.Value("UPSAMPLE_MODE") ) elif mode == "ROI_ALIGN_MODE": spec_layer_params.mode.samplingMethod = ( _proto.NeuralNetwork_pb2.SamplingMode.Method.Value("ROI_ALIGN_MODE") ) else: raise ValueError("Unsupported crop resize mode %s" % mode) if box_indices_mode == "CORNERS_HEIGHT_FIRST": spec_layer_params.boxIndicesMode.boxMode = ( _proto.NeuralNetwork_pb2.BoxCoordinatesMode.Coordinates.Value( "CORNERS_HEIGHT_FIRST" ) ) elif box_indices_mode == "CORNERS_WIDTH_FIRST": spec_layer_params.boxIndicesMode.boxMode = ( _proto.NeuralNetwork_pb2.BoxCoordinatesMode.Coordinates.Value("CORNERS_WIDTH_FIRST") ) elif box_indices_mode == "CENTER_SIZE_HEIGHT_FIRST": spec_layer_params.boxIndicesMode.boxMode = ( _proto.NeuralNetwork_pb2.BoxCoordinatesMode.Coordinates.Value( "CENTER_SIZE_HEIGHT_FIRST" ) ) elif box_indices_mode == "CENTER_SIZE_WIDTH_FIRST": spec_layer_params.boxIndicesMode.boxMode = ( _proto.NeuralNetwork_pb2.BoxCoordinatesMode.Coordinates.Value( "CENTER_SIZE_WIDTH_FIRST" ) ) else: raise ValueError( "Unsupported crop resize box indices mode %s" % box_indices_mode ) return spec_layer def set_pre_processing_parameters( self, image_input_names=None, is_bgr=False, red_bias=0.0, green_bias=0.0, blue_bias=0.0, gray_bias=0.0, image_scale=1.0, image_format="NCHW", ): """ Add a pre-processing parameters layer to the neural network object. Parameters ---------- image_input_names: list of str Name of input blobs that are images is_bgr: boolean or dict() Channel order for input blobs that are images. BGR if True else RGB. To specify a different value for each image input, provide a dictionary with input names as keys. red_bias: float or dict() Image re-centering parameter (red channel) blue_bias: float or dict() Image re-centering parameter (blue channel) green_bias: float or dict() Image re-centering parameter (green channel) gray_bias: float or dict() Image re-centering parameter (for grayscale images) image_scale: float or dict() Value by which to scale the images. image_format: str Image format, either 'NCHW' / 'NHWC' See Also -------- set_input, set_output, set_class_labels """ if not image_input_names: return # nothing to do here image_format = ( image_format.upper() if isinstance(image_format, str) else image_format ) if image_format != "NCHW" and image_format != "NHWC": raise ValueError( "Input image format must be either 'NCHW' or 'NHWC'. 
Provided {}".format( image_format ) ) if not isinstance(is_bgr, dict): is_bgr = dict.fromkeys(image_input_names, is_bgr) if not isinstance(red_bias, dict): red_bias = dict.fromkeys(image_input_names, red_bias) if not isinstance(blue_bias, dict): blue_bias = dict.fromkeys(image_input_names, blue_bias) if not isinstance(green_bias, dict): green_bias = dict.fromkeys(image_input_names, green_bias) if not isinstance(gray_bias, dict): gray_bias = dict.fromkeys(image_input_names, gray_bias) if not isinstance(image_scale, dict): image_scale = dict.fromkeys(image_input_names, image_scale) # Raise error if any key in image preprocessing parameters # are not in image_input_names. def check_valid_preprocessing_keys(input, target, input_name): for key in input: if key not in target: raise ValueError("Invalid key {} in {}.".format(key, input_name)) target = image_input_names check_valid_preprocessing_keys(is_bgr, target, "is_bgr") check_valid_preprocessing_keys(red_bias, target, "red_bias") check_valid_preprocessing_keys(blue_bias, target, "blue_bias") check_valid_preprocessing_keys(green_bias, target, "green_bias") check_valid_preprocessing_keys(gray_bias, target, "gray_bias") check_valid_preprocessing_keys(image_scale, target, "image_scale") spec = self.spec # Add image inputs for input_ in spec.description.input: if input_.name in image_input_names: if input_.type.WhichOneof("Type") == "multiArrayType": array_shape = tuple(input_.type.multiArrayType.shape) if len(array_shape) == 4: input_indices = ( [0, 1, 2, 3] if image_format == "NCHW" else [0, 3, 1, 2] ) elif len(array_shape) == 3: # Adding dummy index for 'batch' for compatibility input_indices = ( [0, 0, 1, 2] if image_format == "NCHW" else [0, 2, 0, 1] ) else: raise ValueError( "Invalid input shape. Input of rank {}, but expecting input of either rank 3 or rank 4".format( len(array_shape) ) ) # Extract image shape depending on input format _, channels, height, width = [array_shape[e] for e in input_indices] if image_format == "NHWC": # If input format is 'NHWC' for TF model, it will be # 'NCHW' for CoreML model. 
Therefore, add transpose to # NHWC after the input and replace all use of input layers = self.nn_spec.layers complement_transpose = True transpose_names = set() transpose_outputs = [] for layer_ in layers: if ( layer_.HasField("transpose") and layer_.input[0] == input_.name ): transpose_order = list(layer_.transpose.axes) if transpose_order == [ 0, 3, 1, 2, ] or transpose_order == [2, 0, 1]: transpose_names.add(layer_.name) transpose_outputs += list(layer_.output) else: complement_transpose = False break else: for i in layer_.input: if i == input_.name: complement_transpose = False break if complement_transpose: for layer_ in layers: for i in range(len(layer_.input)): if layer_.input[i] in transpose_names: layer_.input[i] = input_.name for layer_ in layers: if layer_.name == input_.name: del layer_.output[:] layer_.output.extend(transpose_outputs) break while len(transpose_names) > 0: for idx, layer_ in enumerate(layers): if layer_.name in transpose_names: del layers[idx] transpose_names.remove(layer_.name) else: axes = [1, 2, 0] if len(array_shape) == 4: axes = [0, 2, 3, 1] input_transpose = input_.name + "_to_nhwc" transpose_layer = self.add_transpose( name=input_transpose, axes=axes, input_name=input_.name, output_name=input_transpose, ) layers.insert(0, layers.pop()) for layer_ in layers: for i in range(len(layer_.input)): if layer_.name == input_transpose: continue if layer_.input[i] == input_.name: layer_.input[i] = input_transpose # TODO: If input is not rank 3 or 4, then accordingly handle # e.g. for rank-2 input, squeeze additional dimension in case of Gray scale image if channels == 1: input_.type.imageType.colorSpace = ( _proto.FeatureTypes_pb2.ImageFeatureType.ColorSpace.Value("GRAYSCALE") ) elif channels == 3: if input_.name in is_bgr: if is_bgr[input_.name]: input_.type.imageType.colorSpace = ( _proto.FeatureTypes_pb2.ImageFeatureType.ColorSpace.Value("BGR") ) else: input_.type.imageType.colorSpace = ( _proto.FeatureTypes_pb2.ImageFeatureType.ColorSpace.Value("RGB") ) else: input_.type.imageType.colorSpace = ( _proto.FeatureTypes_pb2.ImageFeatureType.ColorSpace.Value("RGB") ) else: raise ValueError( "Channel Value %d not supported for image inputs" % channels ) input_.type.imageType.width = width input_.type.imageType.height = height preprocessing = self.nn_spec.preprocessing.add() preprocessing.featureName = input_.name scaler = preprocessing.scaler if input_.name in image_scale: scaler.channelScale = image_scale[input_.name] else: scaler.channelScale = 1.0 if input_.name in red_bias: scaler.redBias = red_bias[input_.name] if input_.name in blue_bias: scaler.blueBias = blue_bias[input_.name] if input_.name in green_bias: scaler.greenBias = green_bias[input_.name] if input_.name in gray_bias: scaler.grayBias = gray_bias[input_.name] def add_transpose(self, name, axes, input_name, output_name): """ Add a N-D transpose layer with axes as a parameter. Refer to the ``TransposeLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. axes: list of int or tuple of int The list containing a permutation of "[0,1,2,...,N-1]" where N is the rank of input/output tensor. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. 
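A minimal usage sketch (illustrative only; assumes ``builder`` holds a rank-4 blob named ``"nchw_input"``)::

    # Permute a [N, C, H, W] tensor to [N, H, W, C].
    builder.add_transpose(
        name="to_nhwc",
        axes=[0, 2, 3, 1],
        input_name="nchw_input",
        output_name="nhwc_output",
    )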
See Also -------- add_permute, add_reshape """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) rank = len(axes) axes = [rank + axis if axis < 0 else axis for axis in axes] spec_layer.transpose.axes.extend(axes) return spec_layer def add_softmax_nd(self, name, input_name, output_name, axis): """ Add a softmax_nd layer to the model that performs softmax operation along the given axis. Refer to the ``SoftmaxNDLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. axis: int Axis to perform the softmax operation on. """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.softmaxND spec_layer_params.axis = axis return spec_layer def add_concat_nd(self, name, input_names, output_name, axis, interleave=False): """ Add a concat_nd layer to the model that performs concatenation along the given axis. Refer to the ``ConcatNDLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. axis: int Axis to perform the concat operation on. interleave : bool (Only available in Core ML Specification >= 5 (iOS >= 14, macOS >= 11.0) If true, concatenate by interleaving the inputs """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer_params = spec_layer.concatND spec_layer_params.axis = axis if interleave: spec_layer_params.interleave = True if self.spec: self.spec.specificationVersion = max(self.spec.specificationVersion, _SPECIFICATION_VERSION_IOS_14) return spec_layer def add_erf(self, name, input_name, output_name): """ Add an erf function (gaussian error function) layer to the model. Refer to the ``ErfLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.erf.MergeFromString(b"") return spec_layer def add_gelu(self, name, input_name, output_name, mode="EXACT"): """ Add a GELU (gaussian error linear unit) activation layer, which is: ``0.5 * x * (1 + erf(x / sqrt(2)))``. Refer to the ``GeluLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. mode: str, optional Gelu mode in [EXACT | TANH_APPROXIMATION | SIGMOID_APPROXIMATION], default EXACT. 
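A minimal usage sketch (illustrative only; assumes ``builder`` has a blob named ``"pre_activation"``)::

    # Exact GELU: 0.5 * x * (1 + erf(x / sqrt(2))).
    builder.add_gelu(
        name="gelu_1",
        input_name="pre_activation",
        output_name="activation",
        mode="EXACT",
    )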
""" spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.gelu if mode == "EXACT": spec_layer_params.mode = _proto.NeuralNetwork_pb2.GeluLayerParams.GeluMode.Value( "EXACT" ) elif mode == "TANH_APPROXIMATION": spec_layer_params.mode = _proto.NeuralNetwork_pb2.GeluLayerParams.GeluMode.Value( "TANH_APPROXIMATION" ) elif mode == "SIGMOID_APPROXIMATION": spec_layer_params.mode = _proto.NeuralNetwork_pb2.GeluLayerParams.GeluMode.Value( "SIGMOID_APPROXIMATION" ) else: raise ValueError("Unsupported Gelu mode %s" % mode) return spec_layer def add_sin(self, name, input_name, output_name): """ Add a sin layer to the model that computes element-wise sine for the input tensor. Refer to the ``SinLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. See Also -------- add_sinh, add_asin, add_asinh """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.sin.MergeFromString(b"") return spec_layer def add_cos(self, name, input_name, output_name): """ Add a cos layer to the model that computes element-wise cosine for the input tensor. Refer to the ``CosLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. See Also -------- add_cosh, add_acos, add_acosh """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.cos.MergeFromString(b"") return spec_layer def add_tan(self, name, input_name, output_name): """ Add a tan layer to the model that computes element-wise tangent for the input tensor. Refer to the ``TanLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. See Also -------- add_tanh, add_atan, add_atanh """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.tan.MergeFromString(b"") return spec_layer def add_asin(self, name, input_name, output_name): """ Add an asin layer to the model that computes element-wise arc-sine for the input tensor. Refer to the ``AsinLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. See Also -------- add_sin, add_sinh, add_asinh """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.asin.MergeFromString(b"") return spec_layer def add_acos(self, name, input_name, output_name): """ Add an acos layer to the model that computes element-wise arc-cosine for the input tensor. Refer to the ``AcosLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. 
See Also -------- add_cos, add_cosh, add_acosh """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.acos.MergeFromString(b"") return spec_layer def add_atan(self, name, input_name, output_name): """ Add an atan layer to the model that computes element-wise arc-tangent for the input tensor. Refer to the ``AtanLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. See Also -------- add_tan, add_tanh, add_atanh """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.atan.MergeFromString(b"") return spec_layer def add_sinh(self, name, input_name, output_name): """ Add a sinh layer to the model that computes element-wise hyperbolic sine for the input tensor. Refer to the ``SinhLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. See Also -------- add_sin, add_asin, add_asinh """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.sinh.MergeFromString(b"") return spec_layer def add_cosh(self, name, input_name, output_name): """ Add a cosh layer to the model that computes element-wise hyperbolic cosine for the input tensor. Refer to the ``CoshLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. See Also -------- add_cos, add_acos, add_acosh """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.cosh.MergeFromString(b"") return spec_layer def add_tanh(self, name, input_name, output_name): """ Add a tanh layer to the model that computes element-wise hyperbolic tangent for the input tensor. Refer to the ``TanhLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. See Also -------- add_tan, add_atan, add_atanh """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.tanh.MergeFromString(b"") return spec_layer def add_asinh(self, name, input_name, output_name): """ Add an asinh layer to the model that computes element-wise inverse hyperbolic sine for the input tensor. Refer to the ``AsinhLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. See Also -------- add_sin, add_sinh, add_asin """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.asinh.MergeFromString(b"") return spec_layer def add_acosh(self, name, input_name, output_name): """ Add an acosh layer to the model that computes element-wise inverse hyperbolic cosine for the input tensor. Refer to the ``AcoshLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer.
output_name: str The output blob name of this layer. See Also -------- add_cos, add_cosh, add_acos """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.acosh.MergeFromString(b"") return spec_layer def add_atanh(self, name, input_name, output_name): """ Add an atanh layer to the model that computes element-wise inverse hyperbolic tangent for the input tensor. Refer to the ``AtanhLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. See Also -------- add_tan, add_tanh, add_atan """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.atanh.MergeFromString(b"") return spec_layer def add_exp2(self, name, input_name, output_name): """ Add an exp2 layer to the model that performs an element-wise base-2 exponential (``2^x``) operation. Refer to the ``Exp2LayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.exp2.MergeFromString(b"") return spec_layer def add_add_broadcastable(self, name, input_names, output_name): """ Add an add_broadcastable layer to the model that performs element-wise addition operation with broadcast support. Refer to the ``AddBroadcastableLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.addBroadcastable.MergeFromString(b"") self._set_max_input_rank(input_names, output_name) return spec_layer def add_multiply_broadcastable(self, name, input_names, output_name): """ Add a multiply_broadcastable layer to the model that performs element-wise multiplication operation with broadcast support. Refer to the ``MultiplyBroadcastableLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.multiplyBroadcastable.MergeFromString(b"") self._set_max_input_rank(input_names, output_name) return spec_layer def add_divide_broadcastable(self, name, input_names, output_name): """ Add a divide_broadcastable layer to the model that performs element-wise division operation with broadcast support. Refer to the ``DivideBroadcastableLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer.
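A minimal usage sketch (illustrative only; assumes ``builder`` contains a blob ``"scores"`` of shape ``(B, 10)`` and a blob ``"row_sums"`` of shape ``(B, 1)``)::

    # Divide each row of "scores" by its row sum; the second input is
    # broadcast along the last axis, numpy-style.
    builder.add_divide_broadcastable(
        name="normalize_rows",
        input_names=["scores", "row_sums"],
        output_name="normalized_scores",
    )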
""" spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.divideBroadcastable.MergeFromString(b"") self._set_max_input_rank(input_names, output_name) return spec_layer def add_subtract_broadcastable(self, name, input_names, output_name): """ Add a subtract_broadcastable layer to the model that performs element-wise subtraction operation with broadcast support. Refer to the ``SubtractBroadcastableLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.subtractBroadcastable.MergeFromString(b"") self._set_max_input_rank(input_names, output_name) return spec_layer def add_max_broadcastable(self, name, input_names, output_name): """ Add a max_broadcastable layer to the model that performs element-wise maximum operation with broadcast support. Refer to the ``MaxBroadcastableLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.maxBroadcastable.MergeFromString(b"") self._set_max_input_rank(input_names, output_name) return spec_layer def add_min_broadcastable(self, name, input_names, output_name): """ Add a min_broadcastable layer to the model that performs element-wise minimum operation with broadcast support. Refer to the ``MinBroadcastableLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.minBroadcastable.MergeFromString(b"") self._set_max_input_rank(input_names, output_name) return spec_layer def add_floor_div_broadcastable(self, name, input_names, output_name): """ Add a floor_div_broadcastable layer to the model that performs floor division operation with broadcast support. Refer to the ``FloorDivBroadcastableLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. See Also -------- add_divide_broadcastable """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.floorDivBroadcastable.MergeFromString(b"") self._set_max_input_rank(input_names, output_name) return spec_layer def add_mod_broadcastable(self, name, input_names, output_name): """ Add a mod_broadcastable layer to the model that performs element-wise modular operation with broadcast support. Refer to the ``ModBroadcastableLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. 
""" spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.modBroadcastable.MergeFromString(b"") self._set_max_input_rank(input_names, output_name) return spec_layer def add_pow_broadcastable(self, name, input_names, output_name): """ Add a pow_broadcastable layer to the model that performs element-wise power operation with broadcast support. Refer to the ``PowBroadcastableLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.powBroadcastable.MergeFromString(b"") self._set_max_input_rank(input_names, output_name) return spec_layer def add_stack(self, name, input_names, output_name, axis=0): """ Add a stack layer to the model that performs stack operation on a list of tensors into one rank+1 tensor on the given axis. Refer to the ``StackLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. axis: int, optional The axis to perform stack operation, default: 0. """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.stack.axis = axis self.rank_dict[output_name] = self._get_rank(input_names[0]) + 1 return spec_layer def add_ceil(self, name, input_name, output_name): """ Add a ceil layer to the model that performs element-wise ceil operation on the input tensor that rounds the value to the smallest integer not less than x. Refer to the ``CeilLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. See Also -------- add_floor, add_clip """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.ceil.MergeFromString(b"") return spec_layer def add_floor(self, name, input_name, output_name): """ Add a floor layer to the model that performs element-wise floor operation on the input tensor that rounds the value to the largest integer not greater than x. Refer to the ``FloorLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. See Also -------- add_ceil, add_clip """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.floor.MergeFromString(b"") return spec_layer def add_round(self, name, input_name, output_name): """ Add a round layer to the model that performs element-wise round operation on the input tensor that rounds the value to the nearest integer. Refer to the ``RoundLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. 
""" spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.round.MergeFromString(b"") return spec_layer def add_sign(self, name, input_name, output_name): """ Add a sign layer to the model that performs element-wise sign operation (+1 for positive values, -1 for negative values, 0 for zeroes). Refer to the ``SignLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.sign.MergeFromString(b"") return spec_layer def add_clip(self, name, input_name, output_name, min_value=0.0, max_value=1.0): """ Add a clip layer to the model that performs element-wise clip operation. Clip the values in the input tensor to the range [min_value, max_value]. Refer to the ``ClipLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. min_value: float, optional Lower bound / minimum value for clip, default: 0.0. max_value: float, optional Upper bound / maximum value for clip, default: 1.0. See Also -------- add_floor, add_ceil """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.clip.MergeFromString(b"") spec_params = spec_layer.clip spec_params.minVal = float(min_value) spec_params.maxVal = float(max_value) return spec_layer def add_split_nd( self, name, input_name, output_names, axis, num_splits=2, split_sizes=None ): """ Add a split layer to the model that splits the input tensor into multiple output tensors. Either uniformly split the input tensor into ``num_splits`` tensors, or split into given size list ``split_sizes`` output tensors. Refer to the ``SplitNDLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_names: list of str The output blob names of this layer. axis: int Axis to perform split on. num_splits: int, optional Number of splits, default: 2. split_sizes: list of int or tuple of int, optional List of size to split, default ``[]`` or ``None``. """ if not split_sizes: split_sizes = [] spec_layer = self._add_generic_layer(name, [input_name], output_names) spec_layer_params = spec_layer.splitND spec_layer_params.axis = axis if split_sizes and len(split_sizes) > 0: spec_layer_params.splitSizes.extend(split_sizes) spec_layer_params.numSplits = len(split_sizes) else: spec_layer_params.numSplits = num_splits assert len(output_names) == spec_layer_params.numSplits return spec_layer def add_slice_static( self, name, input_name, output_name, begin_ids, end_ids, strides, begin_masks, end_masks, squeeze_masks=None, ): """ Add a slice_static layer to the model that extracts a slice of size ``(end - begin) / stride`` from the given input tensor. Refer to the ``SliceStaticLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. begin_ids: list of int or tuple of int Begin offsets for slice layer. end_ids: list of int or tuple of int End offsets for slice layer. 
strides: list of int or tuple of int Strides for slice layer. begin_masks: list of bool Boolean masks for begin offsets. end_masks: list of bool Boolean masks for end offsets. squeeze_masks: list of bool Boolean masks for squeezing axis. See Also -------- add_slice_dynamic """ rank = len(begin_ids) assert len(end_ids) == rank assert len(strides) == rank assert len(begin_masks) == rank assert len(end_masks) == rank assert squeeze_masks is None or len(squeeze_masks) == rank spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.sliceStatic spec_layer_params.beginIds.extend(begin_ids) spec_layer_params.endIds.extend(end_ids) spec_layer_params.strides.extend(strides) spec_layer_params.beginMasks.extend(begin_masks) spec_layer_params.endMasks.extend(end_masks) if not (squeeze_masks and any(squeeze_masks)): return spec_layer if self.spec and ( not self.spec.specificationVersion or self.spec.specificationVersion < _SPECIFICATION_VERSION_IOS_14 ): self.spec.specificationVersion = _SPECIFICATION_VERSION_IOS_14 spec_layer_params.squeezeMasks.extend(squeeze_masks) return spec_layer def add_slice_dynamic( self, name, input_names, output_name, end_ids=None, strides=None, begin_masks=None, end_masks=None, squeeze_masks=None, ): """ Add a slice_dynamic layer to the model that extracts a slice of size ``(end - begin) / stride`` from the given input tensor. Refer to the ``SliceDynamicLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. end_ids: list of int or tuple of int, optional End offsets for slice layer, default: [1]. strides: list of int or tuple of int, optional Strides for slice layer, default: [1]. begin_masks: list of bool, optional Boolean masks for begin offsets, default: [false]. end_masks: list of bool, optional Boolean masks for end offsets, default: [false]. squeeze_masks: list of bool, optional Boolean masks for squeezing axis, default: [false]. See Also -------- add_slice_static """ if not end_ids: end_ids = [1 for _ in range(5)] if not strides: strides = [1 for _ in range(5)] if not begin_masks: begin_masks = [False for _ in range(5)] if not end_masks: end_masks = [False for _ in range(5)] if not squeeze_masks: squeeze_masks = [False for _ in range(5)] spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer_params = spec_layer.sliceDynamic spec_layer_params.endIds.extend(end_ids) spec_layer_params.strides.extend(strides) spec_layer_params.beginMasks.extend(begin_masks) spec_layer_params.endMasks.extend(end_masks) if not any(squeeze_masks): return spec_layer if self.spec and ( not self.spec.specificationVersion or self.spec.specificationVersion < _SPECIFICATION_VERSION_IOS_14 ): self.spec.specificationVersion = _SPECIFICATION_VERSION_IOS_14 spec_layer_params.squeezeMasks.extend(squeeze_masks) return spec_layer def add_tile(self, name, input_name, output_name, reps=[]): """ Add a tile layer to the model that construct a tensor by repeating the input tensor multiple number of times. Refer to the ``TileLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str or list[str] The input blob name of this layer. If second input is provided, reps parameter is ignored. output_name: str The output blob name of this layer. 
reps: list of int or tuple of int Number of times to replicate. If `input_name` provides two inputs, second input is used as reps and this parameter is ignored. See Also -------- add_stack, add_concat_nd """ if isinstance(input_name, tuple): input_names = list(input_name) elif isinstance(input_name, list): input_names = input_name else: input_names = [input_name] spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer_params = spec_layer.tile # If two inputs are provided, # ignore reps attribute. if len(input_names) == 2: reps = [] if self.spec and ( not self.spec.specificationVersion or self.spec.specificationVersion < _SPECIFICATION_VERSION_IOS_14 ): self.spec.specificationVersion = _SPECIFICATION_VERSION_IOS_14 assert all([i > 0 for i in reps]) spec_layer_params.reps.extend(reps) return spec_layer def add_range_static( self, name, output_name, input_names=None, end=1, start=0, step=1 ): """ Add a range_static layer that returns a tensor that contains evenly spaced values. This layer has no input and three parameters. Refer to the ``RangeStaticLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. output_name: str The output blob name of this layer. input_names: list of str The input blob names of this layer. end: int, optional Range parameter: end, default: 1. start: int, optional Range parameter: start, default: 0. step: int, optional Range parameter: step size, default: 1. See Also -------- add_range_dynamic """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.rangeStatic.MergeFromString(b"") spec_params = spec_layer.rangeStatic spec_params.endValue = float(end) spec_params.startValue = float(start) spec_params.stepSizeValue = float(step) self.rank_dict[output_name] = 1 return spec_layer def add_range_dynamic(self, name, input_names, output_name, start=0, step=1): """ Add a range_dynamic layer that returns a tensor that contains evenly spaced values. This layer has up to three inputs or no input and three parameters. Refer to the ``RangeDynamicLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names. If input size == 1: end is input, start and step are read from parameters If input size == 2: end, start are inputs, step is read from parameters If input size == 3: start, end, step are all inputs, none of the parameters are used. output_name: str The output blob name of this layer. start: int, optional Range parameter: start. Ignored if start is provided as input, default: 0. step: int, optional Range parameter: step. Ignored if step is provided as input, default: 1. See Also -------- add_range_static """ if len(input_names) < 1 or len(input_names) > 3: raise ValueError("RangeDynamic layer must have either 1, 2 or 3 inputs.") spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.rangeDynamic.MergeFromString(b"") spec_params = spec_layer.rangeDynamic spec_params.startValue = float(start) spec_params.stepSizeValue = float(step) self.rank_dict[output_name] = 1 return spec_layer def add_branch(self, name, input_name, if_branch=None, else_branch=None): """ Add a branch layer to the model that provides the functionality of branching or an ``if-else`` block. Refer to the ``BranchLayerParams`` message in the specification (NeuralNetwork.proto) for more details. 
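A usage sketch of the typical pattern (hedged: wrapping the nested branch networks with ``NeuralNetworkBuilder(nn_spec=...)`` and all blob names here are illustrative assumptions, not prescribed by this method)::

    # "cond" is a scalar blob; the if-branch runs when |cond| > 1e-6.
    branch_layer = builder.add_branch(name="branch_1", input_name="cond")
    if_builder = NeuralNetworkBuilder(nn_spec=branch_layer.branch.ifBranch)
    if_builder.add_copy(name="copy_a", input_name="a", output_name="out")
    else_builder = NeuralNetworkBuilder(nn_spec=branch_layer.branch.elseBranch)
    else_builder.add_copy(name="copy_b", input_name="b", output_name="out")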
Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. if_branch: NeuralNetwork Neural network to execute if the absolute value of the input tensor is greater than 1e-6. else_branch: NeuralNetwork, optional Neural network to execute if the absolute value of the input tensor is less than 1e-6. See Also -------- add_loop, add_loop_continue, add_loop_break """ layer = self._add_generic_layer(name, [input_name], []) branch = layer.branch if if_branch: branch.ifBranch = if_branch else: branch.ifBranch.MergeFromString(b"") if else_branch: branch.elseBranch = else_branch else: branch.elseBranch.MergeFromString(b"") return layer def add_loop( self, name, body_network=None, input_name=None, condition=None, condition_network=None, max_iterations=None, ): """ Add a loop layer to the model that provides the functionality of a ``for`` loop, or a ``while`` loop. Refer to the ``LoopLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. body_network: NeuralNetwork Neural network to execute for the body of the loop. input_name: str The input blob name of this layer. condition: str, optional Condition of the loop. condition_network: NeuralNetwork, optional Neural network to execute for the condition of the loop. max_iterations: int, optional Maximum number of iterations of the loop. See Also -------- add_loop_break, add_loop_continue, add_branch """ input_names = [] if input_name is None else [input_name] spec_layer = self._add_generic_layer(name, input_names, []) loop = spec_layer.loop if condition_network is None: loop.conditionNetwork.MergeFromString(b"") else: loop.conditionNetwork = condition_network if condition is not None: loop.conditionVar = str(condition) if max_iterations is not None: loop.maxLoopIterations = ( max_iterations if max_iterations is not None else -1 ) if body_network is None: loop.bodyNetwork.MergeFromString(b"") else: loop.bodyNetwork = body_network return spec_layer def add_loop_break(self, name): """ Add a loop_break layer to the model that terminates the loop that contains this layer. Must reside in the ``bodyNetwork`` of the loop layer. Refer to the ``LoopBreakLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. See Also -------- add_loop, add_loop_continue, add_branch """ spec_layer = self.nn_spec.layers.add() spec_layer.name = name spec_layer.loopBreak.MergeFromString(b"") return spec_layer def add_loop_continue(self, name): """ Add a loop_continue layer to the model that stops the current loop iteration and continue on the next iteration. Must reside in the ``bodyNetwork`` of the loop layer. Refer to the ``LoopContinueLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. See Also -------- add_loop, add_loop_break, add_branch """ spec_layer = self.nn_spec.layers.add() spec_layer.name = name spec_layer.loopContinue.MergeFromString(b"") return spec_layer def add_copy(self, name, input_name, output_name): """ Add a copy layer to the model that copies its input tensor to the output tensor. Input tensor and output tensor must have distinct names. Refer to the ``CopyLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. 
output_name: str The output blob name of this layer. """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.copy.MergeFromString(b"") # If output name rank is different than earlier, # mark it as unknown if output_name in self.rank_dict and self._get_rank( output_name ) != self._get_rank(input_name): self.rank_dict[output_name] = -1 else: self.rank_dict[output_name] = self._get_rank(input_name) return spec_layer def add_greater_than( self, name, input_names, output_name, use_greater_than_equal=False, alpha=0.0 ): """ Add a greater_than layer to the model that performs the element-wise greater-than (>) operation or greater-than-or-equal-to (>=) operation. Broadcasting is supported. Refer to the ``GreaterThanLayerParams``, ``GreaterEqualLayerParams`` messages in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. use_greater_than_equal: bool, optional Whether or not to allow greater than or equal to, default: false. alpha: float, optional y = x1 > alpha (or y = x1 >= alpha if use_greater_than_equal is true), if only one input is provided, default: 0. See Also -------- add_equal, add_not_equal, add_less_than """ if isinstance(input_names, str): input_names = [input_names] spec_layer = self._add_generic_layer(name, input_names, [output_name]) if use_greater_than_equal: spec_layer.greaterEqual.MergeFromString(b"") if len(input_names) == 1: spec_layer.greaterEqual.alpha = alpha else: spec_layer.greaterThan.MergeFromString(b"") if len(input_names) == 1: spec_layer.greaterThan.alpha = alpha return spec_layer def add_less_than( self, name, input_names, output_name, use_less_than_equal=False, alpha=0.0 ): """ Add a less_than layer to the model that performs the element-wise less-than (<) operation or less-than-or-equal-to (<=) operation. Broadcasting is supported. Refer to the ``LessThanLayerParams``, ``LessEqualLayerParams`` messages in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. use_less_than_equal: bool, optional Whether or not to allow less than or equal to, default: false. alpha: float, optional y = x1 < alpha (or y = x1 <= alpha if use_less_than_equal is true), if only one input is provided, default: 0. See Also -------- add_equal, add_not_equal, add_greater_than """ if isinstance(input_names, str): input_names = [input_names] spec_layer = self._add_generic_layer(name, input_names, [output_name]) if use_less_than_equal: spec_layer.lessEqual.MergeFromString(b"") if len(input_names) == 1: spec_layer.lessEqual.alpha = alpha else: spec_layer.lessThan.MergeFromString(b"") if len(input_names) == 1: spec_layer.lessThan.alpha = alpha return spec_layer def add_equal(self, name, input_names, output_name, alpha=0.0): """ Add an equal layer to the model that performs the element-wise equal (==) operation. Broadcasting is supported. Refer to the ``EqualLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. alpha: float, optional y = x1 == alpha, if only one input is provided, default: 0.
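A minimal usage sketch (illustrative only; assumes ``builder`` holds a blob named ``"logits"``)::

    # Single-input form: compare every element against the constant alpha,
    # producing 1.0 where logits == 0.0 and 0.0 elsewhere.
    builder.add_equal(
        name="is_zero",
        input_names=["logits"],
        output_name="zero_mask",
        alpha=0.0,
    )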
See Also -------- add_not_equal, add_greater_than, add_less_than """ if isinstance(input_names, str): input_names = [input_names] spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.equal.MergeFromString(b"") if len(input_names) == 1: spec_layer.equal.alpha = alpha return spec_layer
def add_not_equal(self, name, input_names, output_name, alpha=0.0): """ Add a not_equal layer to the model that performs the element-wise not equal (!=) operation. Broadcasting is supported. Refer to the ``NotEqualLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. alpha: float, optional y = x1 != alpha, if only one input is provided, default: 0. See Also -------- add_equal, add_greater_than, add_less_than """ if isinstance(input_names, str): input_names = [input_names] spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.notEqual.MergeFromString(b"") if len(input_names) == 1: spec_layer.notEqual.alpha = alpha return spec_layer
def add_logical(self, name, input_names, output_name, mode): """ Add a logical layer to the model that performs an element-wise logical AND/OR/XOR/NOT operation. Broadcasting is supported. Refer to the ``LogicalAndLayerParams``, ``LogicalOrLayerParams``, ``LogicalXorLayerParams``, and ``LogicalNotLayerParams`` messages in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. mode: str Logical operation mode in [AND | OR | XOR | NOT]. """ if isinstance(input_names, str): input_names = [input_names] spec_layer = self._add_generic_layer(name, input_names, [output_name]) if mode in ["AND", "OR", "XOR"] and len(input_names) != 2: raise ValueError('Logical operation "%s" requires 2 inputs' % name) if mode in ["NOT"] and len(input_names) != 1: raise ValueError('Logical operation "%s" requires 1 input' % name) if mode == "AND": spec_layer.logicalAnd.MergeFromString(b"") elif mode == "OR": spec_layer.logicalOr.MergeFromString(b"") elif mode == "XOR": spec_layer.logicalXor.MergeFromString(b"") elif mode == "NOT": spec_layer.logicalNot.MergeFromString(b"") else: raise ValueError('Logical operation "%s" is not supported' % mode) return spec_layer
def add_sliding_windows( self, name, input_name, output_name, axis, window_size, step=1 ): """ Add a sliding_windows layer to the model that returns a tensor containing all windows of size ``window_size``, separated by ``step``, along the dimension ``axis``. Refer to the ``SlidingWindowsLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. axis: int Axis to perform the operation. window_size: int Number of elements in the sliding window. step: int, optional The stride of the input elements in the sliding window, default: 1.
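A hedged sketch (assumed blob names; ``builder`` as in the earlier example): for a rank-1 input of length 10, ``window_size=3`` with ``step=2`` produces windows starting at indices 0, 2, 4, and 6, that is, 4 windows of 3 elements each.

    # Sketch only: "seq" is an assumed rank-1 blob already present in the network.
    builder.add_sliding_windows(
        "windows", input_name="seq", output_name="seq_windows",
        axis=0, window_size=3, step=2,
    )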
See Also -------- add_slice, add_slice_static, add_slice_dynamic """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.slidingWindows spec_layer_params.axis = axis spec_layer_params.windowSize = window_size spec_layer_params.step = step self.rank_dict[output_name] = self._get_rank(input_name) + 1 return spec_layer def add_reverse(self, name, input_name, output_name, reverse_dim=None): """ Add a reverse layer to the model that reverses specific dimensions of the input tensor. Refer to the ``ReverseLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. reverse_dim: list of int or tuple of int Reverse along the dimension, default [1]. See Also -------- add_reverse_sequence """ if not reverse_dim: reverse_dim = [1] spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.reverse spec_layer_params.reverseDim.extend(map(bool, reverse_dim)) return spec_layer def add_reverse_sequence( self, name, input_names, output_name, batch_axis=0, seq_axis=-1 ): """ Add a reverse sequence layer to the model that reverses variable length slices. Refer to the ``ReverseSeqLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. batch_axis: int, optional Slices input along the dimension batch_axis, default 0. seq_axis: int, optional Reverse along the dimension seq_axis, default: -1. See Also -------- add_reverse """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.reverseSeq.batchAxis = batch_axis spec_layer.reverseSeq.sequenceAxis = seq_axis return spec_layer def add_gather(self, name, input_names, output_name, axis=0): """ Add a gather layer to the model that gathers elements or slices from data and store to a tensor whose shape is defined by indices from the input. Refer to the ``GatherLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. axis: int, optional The axis the operation perform on, default: 0. See Also -------- add_gather_nd, add_gather_along_axis, add_scatter, add_scatter_nd, add_scatter_along_axis """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.gather.axis = axis self.rank_dict[output_name] = ( self._get_rank(input_names[0]) - 1 + self._get_rank(input_names[1]) ) return spec_layer def add_scatter(self, name, input_names, output_name, axis=0, mode="UPDATE"): """ Add a scatter layer to the model that scatters data into a new tensor according to indices from the input. Refer to the ``ScatterLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. axis: int The axis the operation perform on, default: 0. mode: str, optional Scatter accumulation mode in [UPDATE | ADD | SUB | MUL | DIV | MAX | MIN], default: UPDATE. 
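A hedged sketch (blob names assumed; ``builder`` as above): scatter ``updates`` into ``data`` at the positions given by ``indices`` along axis 0, accumulating with ADD.

    # Sketch only: "data", "indices", and "updates" are assumed existing blobs;
    # the inputs are ordered as (container, indices, updates).
    builder.add_scatter(
        "scatter_add", input_names=["data", "indices", "updates"],
        output_name="scattered", axis=0, mode="ADD",
    )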
See Also -------- add_scatter_nd, add_scatter_along_axis, add_gather, add_gather_nd, add_gather_along_axis """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer_params = spec_layer.scatter spec_layer_params.axis = axis mode = mode.upper() if isinstance(mode, str) else mode if mode == "UPDATE": spec_layer_params.mode = _proto.NeuralNetwork_pb2.ScatterMode.Value("SCATTER_UPDATE") elif mode == "ADD": spec_layer_params.mode = _proto.NeuralNetwork_pb2.ScatterMode.Value("SCATTER_ADD") elif mode == "SUB": spec_layer_params.mode = _proto.NeuralNetwork_pb2.ScatterMode.Value("SCATTER_SUB") elif mode == "MUL": spec_layer_params.mode = _proto.NeuralNetwork_pb2.ScatterMode.Value("SCATTER_MUL") elif mode == "DIV": spec_layer_params.mode = _proto.NeuralNetwork_pb2.ScatterMode.Value("SCATTER_DIV") elif mode == "MAX": spec_layer_params.mode = _proto.NeuralNetwork_pb2.ScatterMode.Value("SCATTER_MAX") elif mode == "MIN": spec_layer_params.mode = _proto.NeuralNetwork_pb2.ScatterMode.Value("SCATTER_MIN") else: raise ValueError("Unsupported Scatter mode %s" % mode) return spec_layer def add_gather_along_axis(self, name, input_names, output_name, axis=0): """ Add a gather_along_axis layer to the model that gathers elements or slices from data and store to a tensor whose shape is defined by indices from the input along the given axis into the output tensor. Refer to the ``GatherAlongAxisLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. axis: int, optional The axis the operation perform on, default: 0. See Also -------- add_gather, add_gather_nd, add_scatter, add_scatter_nd, add_scatter_along_axis """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.gatherAlongAxis.axis = axis self.rank_dict[output_name] = self._get_rank(input_names[1]) return spec_layer def add_scatter_along_axis( self, name, input_names, output_name, axis=0, mode="UPDATE" ): """ Add a scatter_along_axis layer to the model that scatters data into a new tensor according to indices from the input along the given axis into the output tensor. Refer to the ``ScatterAlongAxisLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. axis: int The axis to perform on, default: 0. 
mode: str, optional Scatter accumulation mode in [UPDATE | ADD | SUB | MUL | DIV | MAX | MIN], default: UPDATE See Also -------- add_scatter, add_scatter_nd, add_gather, add_gather_nd, add_gather_along_axis """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer_params = spec_layer.scatterAlongAxis spec_layer_params.axis = axis mode = mode.upper() if isinstance(mode, str) else mode if mode == "UPDATE": spec_layer_params.mode = _proto.NeuralNetwork_pb2.ScatterMode.Value("SCATTER_UPDATE") elif mode == "ADD": spec_layer_params.mode = _proto.NeuralNetwork_pb2.ScatterMode.Value("SCATTER_ADD") elif mode == "SUB": spec_layer_params.mode = _proto.NeuralNetwork_pb2.ScatterMode.Value("SCATTER_SUB") elif mode == "MUL": spec_layer_params.mode = _proto.NeuralNetwork_pb2.ScatterMode.Value("SCATTER_MUL") elif mode == "DIV": spec_layer_params.mode = _proto.NeuralNetwork_pb2.ScatterMode.Value("SCATTER_DIV") elif mode == "MAX": spec_layer_params.mode = _proto.NeuralNetwork_pb2.ScatterMode.Value("SCATTER_MAX") elif mode == "MIN": spec_layer_params.mode = _proto.NeuralNetwork_pb2.ScatterMode.Value("SCATTER_MIN") else: raise ValueError("Unsupported scatter_along_axis mode %s" % mode) return spec_layer def add_gather_nd(self, name, input_names, output_name): """ Add a gather layer to the model that gathers elements or slices from data and store to a tensor whose shape is defined by indices from the input. This is the reverse operation of the scatter operation. Refer to the ``GatherNDLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. See Also -------- add_gather, add_gather_along_axis, add_scatter, add_scatter_nd, add_scatter_along_axis """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.gatherND.MergeFromString(b"") # NOTE: ideally, following is formula for computing output rank # self.rank_dict[output_name] = self._get_rank(input_names[1]) - 1 + self._get_rank(input_names[0]) # + shape_dict[input_names[1]][-1] # But, shape of indices (input_names[1]) is unknown and hence marking as -1 # Converter should update rank if indices are known self.rank_dict[output_name] = -1 return spec_layer def add_scatter_nd(self, name, input_names, output_name, mode="UPDATE"): """ Add a scatter layer to the model that scatters data into a new tensor according to indices from input. This is the reverse operation of the gather operation. Refer to the ``ScatterNDLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. 
mode: str, optional Scatter accumulation mode in [UPDATE | ADD | SUB | MUL | DIV | MAX | MIN], default: UPDATE See Also -------- add_scatter, add_scatter_along_axis, add_gather, add_gather_nd, add_gather_along_axis """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer_params = spec_layer.scatterND mode = mode.upper() if isinstance(mode, str) else mode if mode == "UPDATE": spec_layer_params.mode = _proto.NeuralNetwork_pb2.ScatterMode.Value("SCATTER_UPDATE") elif mode == "ADD": spec_layer_params.mode = _proto.NeuralNetwork_pb2.ScatterMode.Value("SCATTER_ADD") elif mode == "SUB": spec_layer_params.mode = _proto.NeuralNetwork_pb2.ScatterMode.Value("SCATTER_SUB") elif mode == "MUL": spec_layer_params.mode = _proto.NeuralNetwork_pb2.ScatterMode.Value("SCATTER_MUL") elif mode == "DIV": spec_layer_params.mode = _proto.NeuralNetwork_pb2.ScatterMode.Value("SCATTER_DIV") elif mode == "MAX": spec_layer_params.mode = _proto.NeuralNetwork_pb2.ScatterMode.Value("SCATTER_MAX") elif mode == "MIN": spec_layer_params.mode = _proto.NeuralNetwork_pb2.ScatterMode.Value("SCATTER_MIN") else: raise ValueError("Unsupported scatter mode %s" % mode) return spec_layer def add_topk( self, name, input_names, output_names, k=0, axis=0, use_bottom_k=False ): """ Add a topk layer to the model that returns top or bottom k values and the corresponding indices of the input tensor along a given axis. Refer to the ``TopKLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. It must be of length 1 or 2. The optional second input corresponds to value of K. output_names: list of str The output blob names of this layer. First and second correspond to values and indices, respectively. k: int, optional number of values/indices to be computed along the axis. Need not be given of there are two inputs, default: 0. axis: int, optional axis along which the topk values/indices are computed. negative indexing is supported, default: 0 use_bottom_k: bool, optional if true, bottom k values are computed instead, default: false. """ spec_layer = self._add_generic_layer(name, input_names, output_names) spec_layer_params = spec_layer.topK spec_layer_params.axis = axis spec_layer_params.K = k spec_layer_params.useBottomK = use_bottom_k return spec_layer def add_argmax(self, name, input_name, output_name, axis, keepdims=True): """ Add an argmax layer to the model that returns the indices of the maximum value along a specified axis in the input tensor. Refer to the ``ArgMaxLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. axis: int axis along which the argmax is computed. Negative indexing is supported. keepdims: bool, optional if true, output rank is same as input rank, default: true. 
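A short sketch (assumed blob; ``builder`` as above): for a rank-2 ``scores`` blob of shape ``(3, 5)``, argmax along the last axis with ``keepdims=False`` yields a rank-1 output of length 3.

    # Sketch only: "scores" is an assumed rank-2 blob.
    builder.add_argmax(
        "best_index", input_name="scores", output_name="best_idx",
        axis=-1, keepdims=False,
    )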
See Also -------- add_argmin """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.argMax spec_layer_params.axis = axis spec_layer_params.removeDim = not keepdims input_rank = self._get_rank(input_name) if input_rank == 1: self.rank_dict[output_name] = 1 else: if keepdims: self.rank_dict[output_name] = input_rank else: self.rank_dict[output_name] = input_rank - 1 return spec_layer def add_argmin(self, name, input_name, output_name, axis, keepdims=True): """ Add an argmin layer to the model that returns the indices of the minimum value along a specified axis in the input tensor. Refer to the ``ArgMinLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. axis: int axis along which the argmin is computed. Negative indexing is supported. keepdims: bool, optional if true, output rank is same as input rank, default: true. See Also -------- add_argmax """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.argMin spec_layer_params.axis = axis spec_layer_params.removeDim = not keepdims input_rank = self._get_rank(input_name) if input_rank == 1: self.rank_dict[output_name] = 1 else: if keepdims: self.rank_dict[output_name] = input_rank else: self.rank_dict[output_name] = input_rank - 1 return spec_layer def add_constant_pad( self, name, input_names, output_name, value=0.0, pad_to_given_output_size_mode=False, pad_amounts=[], ): """ Add a constant pad layer. Refer to the ``ConstantPaddingLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob name(s) of this layer. output_name: str The output blob name of this layer. value: float value to be used for padding. pad_to_given_output_size_mode: bool if true, pad_amounts are interpreted as output shapes (see example in NeuralNetwork.proto) pad_amounts: [int], optional must be non negative. Amount to pad in each dimension. Length of the list must be twice the input/output rank. Not required if second input is present. See Also -------- add_padding """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer_params = spec_layer.constantPad spec_layer_params.value = value spec_layer_params.padToGivenOutputSizeMode = pad_to_given_output_size_mode if len(pad_amounts) > 0: spec_layer_params.padAmounts.extend(map(int, pad_amounts)) if len(input_names) == 1 and len(pad_amounts) == 0: raise ValueError( "Constant_pad layer: pad_amounts must be provided when there is a single input" ) return spec_layer def add_nms( self, name, input_names, output_names, iou_threshold=0.5, score_threshold=0.0, max_boxes=1, per_class_suppression=False, ): """ Add a non maximum suppression layer. Refer to the ``NonMaximumSuppressionLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. Must be at least 2, and maximum 5. output_names: list of str The output blob names of this layer. Must be of length 4 exactly. iou_threshold: float intersection over union threshold for suppression. Ignored if 3rd input is present. score_threshold: float threshold for selecting boxes to be used for NMS algorithm. 
Ignored if 4th input is present. max_boxes: int maximum number of boxes to output. Ignored if 5th input is present. per_class_suppression: bool If true, boxes are organized into classes and suppression is applied to each class group separately See Also -------- add_constant_pad """ spec_layer = self._add_generic_layer(name, input_names, output_names) spec_layer_params = spec_layer.NonMaximumSuppression spec_layer_params.iouThreshold = iou_threshold spec_layer_params.scoreThreshold = score_threshold spec_layer_params.maxBoxes = max_boxes spec_layer_params.perClassSuppression = per_class_suppression self.rank_dict[output_names[0]] = 3 self.rank_dict[output_names[1]] = 3 self.rank_dict[output_names[2]] = 2 self.rank_dict[output_names[3]] = 1 return spec_layer def add_embedding_nd( self, name, input_name, output_name, vocab_size, embedding_size, W, b=None, is_quantized_weight=False, quantization_type="linear", nbits=8, quant_scale=None, quant_bias=None, quant_lut=None, ): """ Add an embedding layer to the model that performs a matrix lookup and optionally adds a bias. Refer to the ``EmbeddingNDLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. vocab_size: int Size of the vocabulary (1 + maximum integer index of the words). embedding_size: int Size of the embedded vector. W: float32 numpy.array or bytes() Weight matrix of shape (embedding_size, vocab_size). If W is of type bytes(), i.e. quantized to 1-8 bits, other quantization related arguments must be provided as well (see below). b: numpy.array , optional Bias vector of shape (embedding_size, ). Quantization arguments expected, when W is of type bytes(): is_quantized_weight: bool Set it to true when W is of type bytes(), representing quantized weights quantization_type: str When weights are quantized (i.e. W is of type bytes()), this should be either "linear" or "lut". nbits: int Should be between 1 and 8 (inclusive). Number of bits per weight value. quant_scale: numpy.array(dtype=numpy.float32) scale vector to be used with linear quantization. Must be of length either 1 or embedding_size. quant_bias: numpy.array(dtype=numpy.float32) bias vector to be used with linear quantization. Must be of length either 1 or embedding_size. quant_lut: numpy.array(dtype=numpy.float32) the LUT (look up table) to be used with LUT quantization. Must be of length 2^nbits. 
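A float-weight sketch with the quantization arguments omitted (sizes and blob names assumed; ``builder`` as above). Note the expected weight layout of ``(embedding_size, vocab_size)``.

    import numpy as np

    # Sketch only: "token_ids" is an assumed blob of integer indices.
    W = np.random.rand(8, 100).astype(np.float32)  # (embedding_size, vocab_size)
    b = np.zeros(8, dtype=np.float32)              # (embedding_size,)
    builder.add_embedding_nd(
        "embed", input_name="token_ids", output_name="embedded",
        vocab_size=100, embedding_size=8, W=W, b=b,
    )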
See Also -------- add_inner_product, add_embedding """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) # Fill in the parameters spec_layer_params = spec_layer.embeddingND spec_layer_params.vocabSize = vocab_size spec_layer_params.embeddingSize = embedding_size spec_layer_params.hasBias = b is not None weights = spec_layer_params.weights if not is_quantized_weight: weights.floatValue.extend(W.flatten()) else: _verify_quantization_arguments( weight=W, output_channels=embedding_size, quantization_type=quantization_type, nbits=nbits, quant_scale=quant_scale, quant_bias=quant_bias, quant_lut=quant_lut, ) _fill_quantized_weights( weights_message=weights, W=W, quantization_type=quantization_type, nbits=nbits, quant_scale=quant_scale, quant_bias=quant_bias, quant_lut=quant_lut, ) if b is not None: bias = spec_layer_params.bias bias.floatValue.extend(b.flatten()) return spec_layer def add_batched_mat_mul( self, name, input_names, output_name, transpose_a=False, transpose_b=False, weight_matrix_rows=0, weight_matrix_columns=0, W=None, bias=None, int_8_dynamic_quantize=False, is_quantized_weight=False, quantization_type="linear", nbits=8, quant_scale=None, quant_bias=None, quant_lut=None, ): """ Add a N-D Batched Matrix Multiplication layer with NumPy-like broadcasting. Refer to the ``BatchedMatMulLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. transpose_a: bool, optional Whether or not to transpose A, default: false. transpose_b: bool, optional Whether or not to transpose B, default: false. weight_matrix_rows: int, optional Must be equal to the last dimension of the input, default: 0. weight_matrix_columns: int, optional Must be equal to the last dimension of the output, default: 0. W: float32 numpy.array or bytes(), optional Weight matrix of shape ``(weight_matrix_rows, weight_matrix_columns)``. If ``W`` is of type ``bytes()`` (quantized to 1-8 bits), other quantization-related arguments must be provided as well (see below). bias: float32 numpy.array, optional Bias vector of shape (weight_matrix_columns,). Quantization Quantization arguments, used when ``W`` is of type ``bytes()``: is_quantized_weight: bool, optional Set it to true when ``W`` is of type ``bytes()``, representing quantized weights, default: false. quantization_type: str, optional When weights are quantized (that is, ``W`` is of type ``bytes()``), this should be either ``"linear"`` or ``"lut"``, default: ``"linear"``. nbits: int, optional Should be between 1 and 8 (inclusive). Number of bits per weight value, default: 8. quant_scale: numpy.array(dtype=numpy.float32), optional Scale vector to be used with linear quantization. Must be of length either 1 or ``weight_matrix_columns``, default: ``None``. quant_bias: numpy.array(dtype=numpy.float32), optional Bias vector to be used with linear quantization. Must be of length either 1 or ``weight_matrix_columns``, default: ``None``. quant_lut: numpy.array(dtype=numpy.float32), optional The LUT (look up table) to be used with LUT quantization. Must be of length 2^n bits, default: ``None``. int_8_dynamic_quantize: bool Whether to quantize and dequantize before and after batched matmul, respectively. Expects byte weights, representing int8 values, if True. See NeuralNetwork.proto for other validation conditions. 
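A float-weight sketch (quantization arguments omitted; names and sizes assumed; ``builder`` as above): project an input whose last dimension is 4 through a fixed ``4 x 2`` weight matrix.

    import numpy as np

    # Sketch only: "features" is an assumed blob whose last dimension is 4.
    W = np.random.rand(4, 2).astype(np.float32)  # (weight_matrix_rows, weight_matrix_columns)
    builder.add_batched_mat_mul(
        "project", input_names=["features"], output_name="projected",
        weight_matrix_rows=4, weight_matrix_columns=2, W=W,
    )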
See Also -------- add_inner_product """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer_params = spec_layer.batchedMatmul spec_layer_params.transposeA = transpose_a spec_layer_params.transposeB = transpose_b spec_layer_params.int8DynamicQuantize = int_8_dynamic_quantize if ((W is not None) or (bias is not None)) and len(input_names) == 2: raise ValueError( "batched_mat_mul: Weight and/or bias are ignored when there are two inputs" ) if (W is None) and len(input_names) == 1: raise ValueError( "batched_mat_mul: Weight parameter must be provided when there is one input" ) self.rank_dict[output_name] = 2 for input_ in input_names: self.rank_dict[output_name] = max( self._get_rank(output_name), self._get_rank(input_) ) if len(input_names) == 1: spec_layer_params.weightMatrixFirstDimension = weight_matrix_rows spec_layer_params.weightMatrixSecondDimension = weight_matrix_columns spec_layer_params.hasBias = bias is not None weights = spec_layer_params.weights if not is_quantized_weight: weights.floatValue.extend(_np.transpose(W).flatten()) else: _verify_quantization_arguments( weight=W, output_channels=weight_matrix_columns, quantization_type=quantization_type, nbits=nbits, quant_scale=quant_scale, quant_bias=quant_bias, quant_lut=quant_lut, int_8_dynamic_quantize=int_8_dynamic_quantize, ) if nbits < 8: num_weights = weight_matrix_rows * weight_matrix_columns byte_arr = _np.frombuffer(W, dtype=_np.uint8) W = _unpack_to_bytes(byte_arr, num_weights, nbits) elif int_8_dynamic_quantize: W = _np.frombuffer(W, dtype=_np.int8) else: W = _np.frombuffer(W, dtype=_np.uint8) W = _np.reshape(W, (weight_matrix_rows, weight_matrix_columns)) W = _np.transpose(W) W_bytes = bytes() if nbits == 8: W_bytes += W.flatten().tobytes() else: W_bytes += _convert_array_to_nbit_quantized_bytes( W.flatten(), nbits ).tobytes() _fill_quantized_weights( weights_message=weights, W=W_bytes, use_int_8=int_8_dynamic_quantize, quantization_type=quantization_type, nbits=nbits, quant_scale=quant_scale, quant_bias=quant_bias, quant_lut=quant_lut, ) if bias is not None: bias_param = spec_layer_params.bias bias_param.floatValue.extend(bias.flatten()) return spec_layer def add_get_shape(self, name, input_name, output_name): """ Add a get_shape layer to the model. Refer to the ``GetShapeLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. See Also -------- add_reshape, add_reshape_like, add_reshape_static, add_reshape_dynamic """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.getShape.MergeFromString(b"") self.rank_dict[output_name] = 1 return spec_layer def add_load_constant_nd(self, name, output_name, constant_value, shape): """ Add a load_constant layer that loads data as a parameter and provides it as an output. Refer to the ``LoadConstantNDLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. output_name: str The output blob name of this layer. constant_value: numpy.array() value of the constant as a numpy array. shape: list of int or tuple of int List of ints representing the shape of the constant. 
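A short sketch (names assumed; ``builder`` as above): load a constant ``2 x 3`` matrix into the graph as the blob ``const``. The flattened ``constant_value`` must contain exactly ``prod(shape)`` elements, otherwise the builder raises a ``ValueError``.

    import numpy as np

    const_val = np.arange(6, dtype=np.float32).reshape(2, 3)
    builder.add_load_constant_nd(
        "load_const", output_name="const", constant_value=const_val, shape=[2, 3],
    )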
See Also -------- add_elementwise """ spec_layer = self._add_generic_layer(name, [], [output_name]) spec_layer_params = spec_layer.loadConstantND data = spec_layer_params.data data.floatValue.extend(constant_value.flatten()) spec_layer_params.shape.extend(shape) # Rank information self.rank_dict[output_name] = len(shape) if len(data.floatValue) != _np.prod(shape): raise ValueError( "Dimensions of 'shape' do not match the size of the provided constant" ) return spec_layer def add_fill_like(self, name, input_name, output_name, value=0.0): """ Add a fill_like layer to the model outputs a tensor filled with a scalar value. Refer to the ``FillLikeLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. value: float, optional A scalar value for the fill operation, default 0. See Also -------- add_fill_static, add_fill_dynamic """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.fillLike spec_layer_params.value = value return spec_layer def add_fill_static(self, name, output_name, output_shape, value=0.0): """ Add a fill_static layer to the model that outputs a tensor filled with a scalar value given shape as parameter. Refer to the ``FillStaticLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. output_name: str The output blob name of this layer. output_shape: list of int or tuple of int The target shape of the output tensor. value: float, optional A scalar value for the fill operation, default 0. See Also -------- add_fill_like, add_fill_static """ spec_layer = self._add_generic_layer(name, [], [output_name]) spec_layer_params = spec_layer.fillStatic spec_layer_params.value = value spec_layer_params.targetShape.extend(output_shape) self.rank_dict[output_name] = len(output_shape) return spec_layer def add_fill_dynamic(self, name, input_name, output_name, value=0.0): """ Add a fill_dynamic layer to the model that outputs a tensor filled with a scalar value. Refer to the ``FillDynamicLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. value: float, optional A scalar value for the fill operation, default: 0. See Also -------- add_fill_like, add_fill_static """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.fillDynamic spec_layer_params.value = value self.rank_dict[output_name] = -1 return spec_layer def add_broadcast_to_like(self, name, input_names, output_name): """ Add a broadcast_to_like layer to the model that broadcasts a tensor to a compatible shape. Refer to the ``BroadcastToLikeLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. 
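A hedged sketch (blob names assumed; ``builder`` as above): exactly two inputs are required, the tensor to broadcast first and the reference tensor whose shape is matched second.

    # Sketch only: broadcast "const" to the runtime shape of the assumed blob "ref".
    builder.add_broadcast_to_like(
        "bcast", input_names=["const", "ref"], output_name="const_bcast",
    )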
See Also -------- add_broadcast_to_static, add_broadcast_to_dynamic """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.broadcastToLike.MergeFromString(b"") if len(input_names) != 2: raise ValueError("BroadcastToLikeLayer must have two inputs") self.rank_dict[output_name] = self._get_rank(input_names[1]) return spec_layer def add_broadcast_to_static(self, name, input_name, output_name, output_shape): """ Add a broadcast_to_static layer to the model that broadcasts a tensor to a compatible shape. Refer to the ``BroadcastToStaticLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. output_shape: list of int or tuple of int The target shape of the output tensor. See Also -------- add_broadcast_to_like, add_broadcast_to_dynamic """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.broadcastToStatic spec_layer_params.targetShape.extend(output_shape) self.rank_dict[output_name] = len(output_shape) return spec_layer def add_broadcast_to_dynamic(self, name, input_names, output_name): """ Add a broadcast_to_dynamic layer to the model that broadcasts a tensor to a compatible shape. Refer to the ``BroadcastToDynamicLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. See Also -------- add_broadcast_to_like, add_broadcast_to_static """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.broadcastToDynamic.MergeFromString(b"") # Setting rank to -1 is a hint that Rank was not computed # converter can modify if it's a constant and known self.rank_dict[output_name] = -1 return spec_layer def add_expand_dims(self, name, input_name, output_name, axes): """ Add an expand dims layer to the model that increases the rank of the input tensor by adding unit dimensions. Refer to the ``ExpandDimsLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. axes: list of int or tuple of int Dimensions the operation perform on. See Also -------- add_squeeze """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.expandDims spec_layer_params.axes.extend(axes) self.rank_dict[output_name] = self._get_rank(input_name) + len(axes) return spec_layer def add_squeeze(self, name, input_name, output_name, axes=None, squeeze_all=False): """ Add a squeeze layer to the model that decrease the rank of the input tensor by removing unit dimensions. Refer to the ``SqueezeLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. axes: list of int or tuple of int, optional Dimensions to perform the operation, default: ``None`` (squeeze_all). squeeze_all: bool, optional If true, all dimensions that are 1 are squeezed, default: false. 
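A short sketch (assumed blob; ``builder`` as above): drop the unit dimension at axis 0 of a ``(1, 4, 5)`` blob; passing ``squeeze_all=True`` (or leaving ``axes=None``) would remove every length-1 dimension instead.

    # Sketch only: "x3d" is an assumed blob of shape (1, 4, 5).
    builder.add_squeeze("drop_unit_dim", input_name="x3d", output_name="x2d", axes=[0])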
See Also -------- add_expand_dims """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.squeeze if axes is not None: spec_layer_params.axes.extend(axes) spec_layer_params.squeezeAll = squeeze_all if squeeze_all or axes is None: # All the dimensions that are 1 will be squeezed # converter should update rank if shape is known self.rank_dict[output_name] = -1 else: rank = self._get_rank(input_name) - len(axes) self.rank_dict[output_name] = rank if rank != 0 else 1 return spec_layer def add_flatten_to_2d(self, name, input_name, output_name, axis=1): """ Add a flatten_to_2d layer to the model that flattens the input tensor into a 2-dimensional matrix. Refer to the ``FlattenTo2DLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The of input blob name of this layer. output_name: str The output blob name of this layer. axis: int, optional Axis to perform the operation, default: 1. See Also -------- add_flatten """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.flattenTo2D spec_layer_params.axis = axis self.rank_dict[output_name] = 2 return spec_layer def add_reshape_like(self, name, input_names, output_name): """ Add a reshape_like layer to the model that reshapes a tensor. Refer to the ``ReshapeLikeLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. See Also -------- add_reshape, add_reshape_static, add_reshape_dynamic, add_rank_preserving_reshape """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.reshapeLike.MergeFromString(b"") self.rank_dict[output_name] = self._get_rank(input_names[1]) return spec_layer def add_reshape_static(self, name, input_name, output_name, output_shape): """ Add a reshape_static layer to the model that reshapes a tensor. Refer to the ``ReshapeStaticLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. output_shape: list of int or tuple of int Target shape of the output tensor. See Also -------- add_reshape, add_reshape_like, add_reshape_dynamic, add_rank_preserving_reshape """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.reshapeStatic spec_layer_params.targetShape.extend(output_shape) self.rank_dict[output_name] = len(output_shape) return spec_layer def add_reshape_dynamic(self, name, input_names, output_name): """ Add a reshape_dynamic layer to the model that reshapes a tensor. Refer to the ``ReshapeDynamicLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. 
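A hedged sketch (assumed blobs; ``builder`` as above) pairing ``add_get_shape`` with ``add_reshape_dynamic``, so the target shape is supplied at runtime by the second input.

    # Sketch only: reshape "x" to the runtime shape of the assumed blob "template".
    builder.add_get_shape("template_shape", input_name="template", output_name="shape_vec")
    builder.add_reshape_dynamic(
        "reshape_to_template", input_names=["x", "shape_vec"], output_name="x_reshaped",
    )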
See Also -------- add_reshape, add_reshape_like, add_reshape_static, add_rank_preserving_reshape """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.reshapeDynamic.MergeFromString(b"") # Setting rank to -1 is a hint that Rank was not computed # converter can modify if it's a constant and known self.rank_dict[output_name] = -1 return spec_layer def add_rank_preserving_reshape(self, name, input_name, output_name, output_shape): """ Add a rank_preserving_reshape layer to the model that reshapes the input tensor without altering the rank of the tensor. Refer to the ``RankPreservingReshapeLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. output_shape: list of int or tuple of int Determines the shape of the output blob. 0: copy the dimension of the input to output -1: calculate dimensions from the rest of the shape See Also -------- add_reshape, add_reshape_like, add_reshape_static, add_reshape_dynamic """ spec_layer = self._add_generic_layer( name, [input_name], [output_name], input_ranks=[len(output_shape)], input_shapes=[[int(x) for x in output_shape]], output_ranks=[len(output_shape)], output_shapes=[[int(x) for x in output_shape]], ) spec_layer_params = spec_layer.rankPreservingReshape spec_layer_params.targetShape.extend(map(int, output_shape)) return spec_layer def add_random_normal_like( self, name, input_name, output_name, mean=0.0, stddev=0.0, seed=-1 ): """ Add a random_normal_like layer to the model that fills the output tensor with random values from normal distribution. Refer to the ``RandomNormalLikeLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. mean: float, optional The mean of the normal distribution, default: 0.0. stddev: float, optional The standard deviation of the normal distribution, default: 1.0. seed: int, optional Used to create a random seed for the distribution, default -1 (random). See Also -------- add_random_normal_static, add_random_normal_dynamic """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.randomNormalLike spec_layer_params.mean = mean spec_layer_params.stdDev = stddev spec_layer_params.seed = seed return spec_layer def add_random_normal_static( self, name, output_name, output_shape, mean=0.0, stddev=0.0, seed=-1 ): """ Add a random_normal_static layer to the model that fills the output tensor with random values from normal distribution. Refer to the ``RandomNormaStaticLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. output_name: str The output blob name of this layer. output_shape: list of int or tuple of int Target shape of the output tensor. mean: float, optional The mean of the normal distribution, default: 0.0. stddev: float, optional The standard deviation of the normal distribution, default: 1.0. seed: int, optional Used to create a random seed for the distribution. Default -1 (random). 
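A short sketch (``builder`` as above): generate a ``2 x 3`` tensor of normal samples with a fixed seed. Note that the Python default for ``stddev`` in the signature above is ``0.0``, so pass a nonzero value explicitly to obtain non-degenerate noise.

    builder.add_random_normal_static(
        "noise", output_name="noise_out", output_shape=(2, 3),
        mean=0.0, stddev=1.0, seed=42,
    )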
See Also -------- add_random_normal_like, add_random_normal_dynamic """ spec_layer = self._add_generic_layer(name, [], [output_name]) spec_layer_params = spec_layer.randomNormalStatic spec_layer_params.outputShape.extend(output_shape) spec_layer_params.mean = mean spec_layer_params.stdDev = stddev spec_layer_params.seed = seed self.rank_dict[output_name] = len(output_shape) return spec_layer def add_random_normal_dynamic( self, name, input_names, output_name, mean=0.0, stddev=0.0, seed=-1 ): """ Add a random_normal_dynamic layer to the model that fills the output tensor with random values from normal distribution. Refer to the ``RandomNormalDynamicLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. mean: float, optional The mean of the normal distribution, default: 0.0. stddev: float, optional The standard deviation of the normal distribution, default: 1.0. seed: int, optional Used to create a random seed for the distribution. Default -1 (random). See Also -------- add_random_normal_like, add_random_normal_static """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer_params = spec_layer.randomNormalDynamic spec_layer_params.mean = mean spec_layer_params.stdDev = stddev spec_layer_params.seed = seed # Setting rank to -1 is a hint that Rank was not computed # converter can modify if it's a constant and known self.rank_dict[output_name] = -1 return spec_layer def add_random_uniform_like( self, name, input_name, output_name, minval=0.0, maxval=1.0, seed=-1 ): """ Add a random_uniform_like layer to the model that fills the output tensors with random values from uniform distribution. Refer to the ``RandomUniformLikeLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. minval: float, optional Lower bound / minimum value of the uniform distribution, default: 0.0. maxval: float, optional Upper bound / maximum value of the uniform distribution, default: 1.0. seed: int, optional Used to create a random seed for the distribution. default -1 (random). See Also -------- add_random_uniform_static, add_random_uniform_dynamic """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.randomUniformLike spec_layer_params.minVal = minval spec_layer_params.maxVal = maxval spec_layer_params.seed = seed return spec_layer def add_random_uniform_static( self, name, output_name, output_shape, minval=0.0, maxval=1.0, seed=-1 ): """ Add a random_uniform_static layer to the model that fills the output tensors with random values from uniform distribution. Refer to the ``RandomUniformStaticLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. output_name: str The output blob name of this layer. output_shape: list of int or tuple of int Target shape of the output tensor. minval: float, optional Lower bound / minimum value of the uniform distribution, default: 0.0. maxval: float, optional Upper bound / maximum value of the uniform distribution, default: 1.0. seed: int, optional Used to create a random seed for the distribution. default -1 (random). 
See Also -------- add_random_uniform_like, add_random_uniform_dynamic """ spec_layer = self._add_generic_layer(name, [], [output_name]) spec_layer_params = spec_layer.randomUniformStatic spec_layer_params.outputShape.extend(output_shape) spec_layer_params.minVal = minval spec_layer_params.maxVal = maxval spec_layer_params.seed = seed self.rank_dict[output_name] = len(output_shape) return spec_layer def add_random_uniform_dynamic( self, name, input_names, output_name, minval=0.0, maxval=1.0, seed=-1 ): """ Add a random_uniform_dynamic layer to the model that fills the output tensors with random values from uniform distribution. Refer to the ``RandomUniformDynamicLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. minval: float, optional Lower bound / minimum value of the uniform distribution, default: 0.0. maxval: float, optional Upper bound / maximum value of the uniform distribution, default: 1.0. seed: int, optional Used to create a random seed for the distribution. default -1 (random). See Also -------- add_random_uniform_like, add_random_uniform_static """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer_params = spec_layer.randomUniformDynamic spec_layer_params.minVal = minval spec_layer_params.maxVal = maxval spec_layer_params.seed = seed # Setting rank to -1 is a hint that Rank was not computed # converter can modify if it's a constant and known self.rank_dict[output_name] = -1 return spec_layer def add_random_bernoulli_like( self, name, input_name, output_name, prob=0.5, seed=-1 ): """ Add a random_bernoulli_like layer to the model that fills the output tensor with random values from Bernoulli distribution. Refer to the ``RandomBernoulliLikeLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. prob: float, optional Probabilities for Bernoulli distribution, default: 0.5. seed: int, optional Used to create a random seed for the distribution. default -1 (random). See Also -------- add_random_bernoulli_static, add_random_bernoulli_dynamic """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.randomBernoulliLike spec_layer_params.prob = prob spec_layer_params.seed = seed return spec_layer def add_random_bernoulli_static( self, name, output_name, output_shape, prob=0.5, seed=-1 ): """ Add a random_bernoulli_static layer to the model that fills the output tensor with random values from Bernoulli distribution. Refer to the ``RandomBernoulliStaticLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. output_name: str The output blob name of this layer. output_shape: list of int or tuple of int Target shape of the output tensor. prob: float, optional Probabilities for Bernoulli distribution, default: 0.5. seed: int, optional Used to create a random seed for the distribution. default -1 (random). 
See Also -------- add_random_bernoulli_like, add_random_bernoulli_dynamic """ spec_layer = self._add_generic_layer(name, [], [output_name]) spec_layer_params = spec_layer.randomBernoulliStatic spec_layer_params.outputShape.extend(output_shape) spec_layer_params.prob = prob spec_layer_params.seed = seed self.rank_dict[output_name] = len(output_shape) return spec_layer
def add_random_bernoulli_dynamic( self, name, input_names, output_name, prob=0.5, seed=-1 ): """ Add a random_bernoulli_dynamic layer to the model that fills the output tensor with random values from a Bernoulli distribution. Refer to the ``RandomBernoulliDynamicLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. prob: float, optional Probabilities for Bernoulli distribution, default: 0.5. seed: int, optional Used to create a random seed for the distribution. default -1 (random). See Also -------- add_random_bernoulli_like, add_random_bernoulli_static """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer_params = spec_layer.randomBernoulliDynamic spec_layer_params.prob = prob spec_layer_params.seed = seed # Setting rank to -1 is a hint that Rank was not computed # converter can modify if it's a constant and known self.rank_dict[output_name] = -1 return spec_layer
def add_categorical_distribution( self, name, input_name, output_name, num_samples, is_logits=True, eps=1e-10, temperature=1.0, seed=-1, ): """ Add a categorical_distribution layer to the model that fills the output tensor with random values from a categorical distribution. Refer to the ``CategoricalDistributionLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. num_samples: int Number of samples to draw from the categorical distribution. is_logits: bool, optional If true, the input is log probabilities. If false, the input is probabilities, default: True. eps: float, optional Epsilon parameter for categorical distribution, default 1e-10. temperature: float, optional Temperature parameter for categorical distribution, default 1.0. seed: int, optional Used to create a random seed for the distribution. default -1 (random). """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.categoricalDistribution spec_layer_params.numSamples = num_samples spec_layer_params.isLogits = is_logits spec_layer_params.eps = eps spec_layer_params.temperature = temperature spec_layer_params.seed = seed return spec_layer
def add_reduce_sum( self, name, input_name, output_name, axes=None, keepdims=True, reduce_all=False ): """ Add a reduce_sum layer to the model that reduces the input tensor using ``sum(elements across given dimensions)``. Refer to the ``ReduceSumLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. axes: list of int or tuple of int, optional List of dimensions for the reduce operations. Each should be in range ``[-rank(input), rank(input))``, default: ``None`` (``reduce_all``).
keepdims: bool, optional Whether or not to retain the reduced dimensions with length 1, default: true. reduce_all: bool, optional Whether or not to reduce on all axes, default: false. See Also -------- add_reduce_l1, add_reduce_l2, add_reduce_min, add_reduce_prod, add_reduce_max, add_reduce_mean, add_reduce_logsum, add_reduce_logsumexp, add_reduce_sumsquare """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.reduceSum if axes is not None and len(axes) != 0: spec_layer_params.axes.extend(map(int, axes)) else: reduce_all = True spec_layer_params.keepDims = keepdims spec_layer_params.reduceAll = reduce_all self._set_rank_for_reduce_op( input_name, output_name, axes, keepdims, reduce_all ) return spec_layer def add_reduce_prod( self, name, input_name, output_name, axes=None, keepdims=True, reduce_all=False ): """ Add a reduce_prod layer to the model that reduces the input tensor using ``prod(elements across given dimensions)``. Refer to the ``ReduceProdLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. axes: list of int or tuple of int, optional List of dimensions for the reduce operations. Each should be in range [-rank(input), rank(input)), default: ``None`` (reduce_all) keepdims: bool, optional Whether or not to retain the reduced dimensions with length 1, default: true. reduce_all: bool, optional Whether or not to reduce on all axes. If axes list is empty, it will be set to true, default: false. See Also -------- add_reduce_l1, add_reduce_l2, add_reduce_sum, add_reduce_min, add_reduce_max, add_reduce_mean, add_reduce_logsum, add_reduce_logsumexp, add_reduce_sumsquare """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.reduceProd if axes is not None and len(axes) != 0: spec_layer_params.axes.extend(map(int, axes)) else: reduce_all = True spec_layer_params.keepDims = keepdims spec_layer_params.reduceAll = reduce_all self._set_rank_for_reduce_op( input_name, output_name, axes, keepdims, reduce_all ) return spec_layer def add_reduce_mean( self, name, input_name, output_name, axes=None, keepdims=True, reduce_all=False ): """ Add a reduce_mean layer to the model that reduces the input tensor using ``mean(elements across given dimensions)``. Refer to the ``ReduceMeanLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. axes: list of int or tuple of int, optional List of dimensions for the reduce operations. Each should be in range [-rank(input), rank(input)), default: ``None`` (reduce_all) keepdims: bool, optional Whether or not to retain the reduced dimensions with length 1, default: true. reduce_all: bool, optional Whether or not to reduce on all axes, default: false. 
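A short sketch (assumed blob; ``builder`` as above): average a rank-3 activation over its last two axes while keeping them as length-1 dimensions, so a ``(2, 4, 4)`` input becomes ``(2, 1, 1)``.

    # Sketch only: "act" is an assumed rank-3 blob.
    builder.add_reduce_mean(
        "global_avg", input_name="act", output_name="act_mean",
        axes=[-2, -1], keepdims=True,
    )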
See Also -------- add_reduce_l1, add_reduce_l2, add_reduce_sum, add_reduce_min, add_reduce_prod add_reduce_max, add_reduce_logsum, add_reduce_logsumexp, add_reduce_sumsquare """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.reduceMean if axes is not None and len(axes) != 0: spec_layer_params.axes.extend(map(int, axes)) else: reduce_all = True spec_layer_params.keepDims = keepdims spec_layer_params.reduceAll = reduce_all self._set_rank_for_reduce_op( input_name, output_name, axes, keepdims, reduce_all ) return spec_layer def add_reduce_max( self, name, input_name, output_name, axes=None, keepdims=True, reduce_all=False ): """ Add a reduce_max layer to the model that reduces the input tensor using ``max(elements across given dimensions)``. Refer to the ``ReduceMaxLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. axes: list of int or tuple of int, optional List of dimensions for the reduce operations. Each should be in range [-rank(input), rank(input)), default: ``None`` (reduce_all) keepdims: bool, optional Whether or not to retain the reduced dimensions with length 1, default: true. reduce_all: bool, optional Whether or not to reduce on all axes, default: false. See Also -------- add_reduce_l1, add_reduce_l2, add_reduce_sum, add_reduce_min, add_reduce_prod add_reduce_mean, add_reduce_logsum, add_reduce_logsumexp, add_reduce_sumsquare """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.reduceMax if axes is not None and len(axes) != 0: spec_layer_params.axes.extend(map(int, axes)) else: reduce_all = True spec_layer_params.keepDims = keepdims spec_layer_params.reduceAll = reduce_all self._set_rank_for_reduce_op( input_name, output_name, axes, keepdims, reduce_all ) return spec_layer def add_reduce_min( self, name, input_name, output_name, axes=None, keepdims=True, reduce_all=False ): """ Add a reduce_min layer to the model that reduces the input tensor using ``min(elements across given dimensions)``. Refer to the ``ReduceMinLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. axes: list of int or tuple of int, optional List of dimensions for the reduce operations. Each should be in range [-rank(input), rank(input)), default: ``None`` (reduce_all) keepdims: bool, optional Whether or not to retain the reduced dimensions with length 1, default: true. reduce_all: bool, optional Whether or not to reduce on all axes, default: false. 
See Also -------- add_reduce_l1, add_reduce_l2, add_reduce_sum, add_reduce_max, add_reduce_prod add_reduce_mean, add_reduce_logsum, add_reduce_logsumexp, add_reduce_sumsquare """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.reduceMin if axes is not None and len(axes) != 0: spec_layer_params.axes.extend(map(int, axes)) else: reduce_all = True spec_layer_params.keepDims = keepdims spec_layer_params.reduceAll = reduce_all self._set_rank_for_reduce_op( input_name, output_name, axes, keepdims, reduce_all ) return spec_layer def add_reduce_l2( self, name, input_name, output_name, axes=None, keepdims=True, reduce_all=False ): """ Add a reduce_l2 layer to the model that reduces the input tensor using ``l2_normalization(elements across given dimensions)``. Refer to the ``ReduceL2LayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. axes: list of int or tuple of int, optional List of dimensions for the reduce operations. Each should be in range [-rank(input), rank(input)), default: ``None`` (reduce_all) keepdims: bool, optional Whether or not to retain the reduced dimensions with length 1, default: true. reduce_all: bool, optional Whether or not to reduce on all axes, default: false. See Also -------- add_reduce_l1, add_reduce_sum, add_reduce_min, add_reduce_max, add_reduce_prod add_reduce_mean, add_reduce_logsum, add_reduce_logsumexp, add_reduce_sumsquare """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.reduceL2 if axes is not None and len(axes) != 0: spec_layer_params.axes.extend(map(int, axes)) else: reduce_all = True spec_layer_params.keepDims = keepdims spec_layer_params.reduceAll = reduce_all self._set_rank_for_reduce_op( input_name, output_name, axes, keepdims, reduce_all ) return spec_layer def add_reduce_l1( self, name, input_name, output_name, axes=None, keepdims=True, reduce_all=False ): """ Add a reduce_l1 layer to the model that reduces the input tensor using ``l1_normalization(elements across given dimensions)``. Refer to the ``ReduceL1LayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. axes: list of int or tuple of int, optional List of dimensions for the reduce operations. Each should be in range [-rank(input), rank(input)), default: ``None`` (reduce_all) keepdims: bool, optional Whether or not to retain the reduced dimensions with length 1, default: true. reduce_all: bool, optional Whether or not to reduce on all axes, default: false. 
See Also -------- add_reduce_l2, add_reduce_sum, add_reduce_min, add_reduce_max, add_reduce_prod add_reduce_mean, add_reduce_logsum, add_reduce_logsumexp, add_reduce_sumsquare """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.reduceL1 if axes is not None and len(axes) != 0: spec_layer_params.axes.extend(map(int, axes)) else: reduce_all = True spec_layer_params.keepDims = keepdims spec_layer_params.reduceAll = reduce_all self._set_rank_for_reduce_op( input_name, output_name, axes, keepdims, reduce_all ) return spec_layer def add_reduce_sumsquare( self, name, input_name, output_name, axes=None, keepdims=True, reduce_all=False ): """ Add a reduce_sumsquare layer to the model that reduces the input tensor using ``sum(square(elements across given dimensions))``. Refer to the ``ReduceSumSquareLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. axes: list of int or tuple of int, optional List of dimensions for the reduce operations. Each should be in range [-rank(input), rank(input)), default: ``None`` (reduce_all) keepdims: bool, optional Whether or not to retain the reduced dimensions with length 1, default: true. reduce_all: bool, optional Whether or not to reduce on all axes, default: false. See Also -------- add_reduce_l1, add_reduce_l2, add_reduce_sum, add_reduce_min, add_reduce_prod add_reduce_max, add_reduce_mean, add_reduce_logsum, add_reduce_logsumexp """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.reduceSumSquare if axes is not None and len(axes) != 0: spec_layer_params.axes.extend(map(int, axes)) else: reduce_all = True spec_layer_params.keepDims = keepdims spec_layer_params.reduceAll = reduce_all self._set_rank_for_reduce_op( input_name, output_name, axes, keepdims, reduce_all ) return spec_layer def add_reduce_logsum( self, name, input_name, output_name, axes=None, keepdims=True, reduce_all=False ): """ Add a reduce_logsum layer to the model that reduces the input tensor using log(sum(elements across given dimensions)). Refer to the ``ReduceLogSumLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. axes: list of int or tuple of int, optional List of dimensions for the reduce operations. Each should be in range [-rank(input), rank(input)), default: ``None`` (reduce_all) keepdims: bool, optional Whether or not to retain the reduced dimensions with length 1, default: true. reduce_all: bool, optional Whether or not to reduce on all axes, default: false. 
See Also -------- add_reduce_l1, add_reduce_l2, add_reduce_sum, add_reduce_min, add_reduce_prod add_reduce_max, add_reduce_mean, add_reduce_logsumexp, add_reduce_sumsquare """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.reduceLogSum if axes is not None and len(axes) != 0: spec_layer_params.axes.extend(map(int, axes)) else: reduce_all = True spec_layer_params.keepDims = keepdims spec_layer_params.reduceAll = reduce_all self._set_rank_for_reduce_op( input_name, output_name, axes, keepdims, reduce_all ) return spec_layer def add_reduce_logsumexp( self, name, input_name, output_name, axes=None, keepdims=True, reduce_all=False ): """ Add a reduce_logsumexp layer to the model that computes ``log(sum(exp(tensor)))`` and reduces along the given axis. Refer to the ``ReduceLogSumExpLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. axes: list of int or tuple of int, optional List of dimensions for the reduce operations. Each should be in range [-rank(input), rank(input)), default: ``None`` (reduce_all) keepdims: bool, optional Whether or not to retain the reduced dimensions with length 1, default: true. reduce_all: bool, optional Whether or not to reduce on all axes, default: false. See Also -------- add_reduce_l1, add_reduce_l2, add_reduce_sum, add_reduce_min, add_reduce_prod add_reduce_max, add_reduce_mean, add_reduce_logsum, add_reduce_sumsquare """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.reduceLogSumExp if axes is not None and len(axes) != 0: spec_layer_params.axes.extend(map(int, axes)) else: reduce_all = True spec_layer_params.keepDims = keepdims spec_layer_params.reduceAll = reduce_all self._set_rank_for_reduce_op( input_name, output_name, axes, keepdims, reduce_all ) return spec_layer def add_where_nonzero(self, name, input_name, output_name): """ Add a where_nonzero layer to the model that returns a tensor containing the indices of all non-zero elements of input tensor. Refer to the ``WhereNonZeroLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. See Also -------- add_where_broadcastable """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.whereNonZero.MergeFromString(b"") self.rank_dict[output_name] = 2 return spec_layer def add_matrix_band_part( self, name, input_name, output_name, num_lower=-1, num_upper=-1 ): """ Add a matrix_band_part layer to the model that copies a tensor setting everything outside a central band in each inner-most matrix to zero. Refer to the ``MatrixBandPartLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The of input blob name of this layer. output_name: str The output blob name of this layer. num_lower: int, optional Number of lower sub-diagonals to keep. Default: -1 (keep entire lower triangle). num_upper: int, optional Number of upper sub-diagonals to keep. Default: -1 (keep entire upper triangle). 
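For example, a minimal sketch (assumes ``builder`` is an existing ``NeuralNetworkBuilder``; the blob and layer names are hypothetical):

.. sourcecode:: python

    # Keep a tridiagonal band: the main diagonal plus one sub- and one
    # super-diagonal; everything else in each inner-most matrix becomes zero.
    builder.add_matrix_band_part(
        "band", input_name="mat", output_name="mat_band", num_lower=1, num_upper=1
    )
    # num_lower=0, num_upper=-1 would instead keep the entire upper triangle.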
See Also -------- add_lower_triangular, add_upper_triangular """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.matrixBandPart spec_layer_params.numLower = num_lower spec_layer_params.numUpper = num_upper return spec_layer def add_lower_triangular(self, name, input_name, output_name, k=0): """ Add a lower_triangular layer to the model that copies a tensor, setting everything outside the lower triangle to zero. Refer to the ``LowerTriangularLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. k: int, optional Diagonal below which to zero elements, default: 0 (the main diagonal); k < 0 selects a diagonal below the main diagonal and k > 0 one above it. See Also -------- add_upper_triangular, add_matrix_band_part """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.lowerTriangular spec_layer_params.k = k return spec_layer def add_upper_triangular(self, name, input_name, output_name, k=0): """ Add an upper_triangular layer to the model that copies a tensor, setting everything outside the upper triangle to zero. Refer to the ``UpperTriangularLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. k: int, optional Diagonal above which to zero elements, default: 0 (the main diagonal); k < 0 selects a diagonal below the main diagonal and k > 0 one above it. See Also -------- add_lower_triangular, add_matrix_band_part """ spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.upperTriangular spec_layer_params.k = k return spec_layer def add_where_broadcastable(self, name, input_names, output_name): """ Add a where_broadcastable layer to the model that returns the elements either from tensor x or tensor y, depending on the value in the condition tensor. Refer to the ``WhereBroadcastableLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. See Also -------- add_where_nonzero """ spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer.whereBroadcastable.MergeFromString(b"") self._set_max_input_rank(input_names, output_name) return spec_layer def add_layer_normalization( self, name, input_name, output_name, normalized_shape, gamma, beta, eps=1e-5 ): """ Add a layer normalization layer to the model that applies layer normalization over the input tensor. Refer to the ``LayerNormalizationLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. normalized_shape: list of int or tuple of int Shape of the trailing input dimensions over which normalization is applied. gamma: WeightParams Weight parameters. beta: WeightParams Bias parameters. eps: float, optional Constant value added to the denominator, default: 1e-5.
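For example, a minimal sketch (assumes ``builder`` is an existing ``NeuralNetworkBuilder``, the blob names are hypothetical, and the last input dimension has size 8):

.. sourcecode:: python

    import numpy as np

    normalized_shape = (8,)
    gamma = np.ones(normalized_shape, dtype=np.float32)   # per-element scale
    beta = np.zeros(normalized_shape, dtype=np.float32)   # per-element shift
    builder.add_layer_normalization(
        "layer_norm",
        input_name="x",
        output_name="x_normalized",
        normalized_shape=normalized_shape,
        gamma=gamma,
        beta=beta,
        eps=1e-5,
    )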
""" if gamma.shape != tuple(normalized_shape): raise ValueError("Shape of parameter gamma should match normalized_shape") if beta.shape != tuple(normalized_shape): raise ValueError("Shape of parameter beta should match normalized_shape") spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer_params = spec_layer.layerNormalization spec_layer_params.normalizedShape.extend(normalized_shape) weights = spec_layer_params.gamma weights.floatValue.extend(gamma.flatten()) bias = spec_layer_params.beta bias.floatValue.extend(beta.flatten()) spec_layer_params.eps = eps return spec_layer def add_one_hot( self, name, input_names, output_name, one_hot_vector_size=None, axis=-1, on_value=1.0, off_value=0.0, ): """ Add a one hot layer to the model that computes the one hot representation of the input tensor. Refer to the ``OneHotLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. one_hot_vector_size: int > 0 size of the one hot vector. axis: int, optional refers to the axis in the output tensor, default: -1. on_value: float, optional Constant value on locations represented by first input, default: 1.0. off_value: float, optional Constant value at all other locations, default: 0.0. """ if self.spec and ( not self.spec.specificationVersion or self.spec.specificationVersion < _SPECIFICATION_VERSION_IOS_14 ): self.spec.specificationVersion = _SPECIFICATION_VERSION_IOS_14 spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer_params = spec_layer.oneHot spec_layer_params.axis = axis if one_hot_vector_size: spec_layer_params.oneHotVectorSize = one_hot_vector_size spec_layer_params.onValue = on_value spec_layer_params.offValue = off_value return spec_layer def add_cumsum( self, name, input_names, output_name, axis=-1, reverse=False, exclusive=False ): """ Add a cum sum layer to the model computes the cumulative sum values of the input along a given axis. Refer to the ``CumSumLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_names: list of str The input blob names of this layer. output_name: str The output blob name of this layer. axis: int, optional Axis to perform the operation, default: -1. reverse: bool, optional if true, cumsum is performed in the opposite direction, default: False. exclusive: bool, optional whether to perform exclusive or inclusive cumulative summation, default: False. """ if self.spec and ( not self.spec.specificationVersion or self.spec.specificationVersion < _SPECIFICATION_VERSION_IOS_14 ): self.spec.specificationVersion = _SPECIFICATION_VERSION_IOS_14 spec_layer = self._add_generic_layer(name, input_names, [output_name]) spec_layer_params = spec_layer.cumSum spec_layer_params.axis = axis spec_layer_params.reverse = reverse spec_layer_params.excludeFinalSum = exclusive return spec_layer def add_clamped_relu(self, name, input_name, output_name, alpha=0.0, beta=6.0): """ Add a clamped relu layer to the model. Clamped relu formula is f(x) = min((x >= 0 ? x : alpha * x), beta) Refer to the ``ClampedReluLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. 
alpha: float, optional slope of the output when input is negative, default: 0.0. beta: float, optional Upper bound on the output value, default: 6.0. See Also -------- add_clip """ if self.spec and ( not self.spec.specificationVersion or self.spec.specificationVersion < _SPECIFICATION_VERSION_IOS_14 ): self.spec.specificationVersion = _SPECIFICATION_VERSION_IOS_14 spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.clampedReLU.MergeFromString(b"") spec_params = spec_layer.clampedReLU spec_params.alpha = float(alpha) spec_params.beta = float(beta) return spec_layer def add_argsort(self, name, input_name, output_name, axis=0, descending=False): """ Add an argsort layer to the model. Refer to the ``ArgsortLayerParams`` message in the specification (NeuralNetwork.proto) for more details. Parameters ---------- name: str The name of this layer. input_name: str The input blob name of this layer. output_name: str The output blob name of this layer. axis: int, optional axis along which to compute the sorting indices descending: bool, optional order of sorting See Also -------- add_topk """ if self.spec and ( not self.spec.specificationVersion or self.spec.specificationVersion < _SPECIFICATION_VERSION_IOS_14 ): self.spec.specificationVersion = _SPECIFICATION_VERSION_IOS_14 spec_layer = self._add_generic_layer(name, [input_name], [output_name]) spec_layer.argSort.MergeFromString(b"") spec_params = spec_layer.argSort spec_params.axis = int(axis) spec_params.descending = descending return spec_layer ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/neural_network/flexible_shape_utils.py0000644000000000000000000007136214672066616026140 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause """ Utilities to annotate neural network features with flexible shape information. """ from typing import List as _List from typing import Tuple as _Tuple from coremltools.proto import Model_pb2 as _ml from ... import _MINIMUM_FLEXIBLE_SHAPES_SPEC_VERSION, _MINIMUM_NDARRAY_SPEC_VERSION from ..utils import _get_feature _SEQUENCE_KEY = "S" _BATCH_KEY = "B" _CHANNEL_KEY = "C" _HEIGHT_KEY = "H" _WIDTH_KEY = "W" _CONSTRAINED_KEYS = [_CHANNEL_KEY, _HEIGHT_KEY, _WIDTH_KEY] class Shape: def __init__(self, shape_value): if shape_value < 1: raise Exception("Invalid value. Size/Shape values must be > 0") self._value = shape_value @property def value(self): return self._value class Size(Shape): def __init__(self, size_value): super(Size, self).__init__(size_value) class NeuralNetworkMultiArrayShape: """ An object representing a shape for a multiArray feature in a neural network. 
Valid shapes must have have only the Channel ``[C]`` shape or the Channel, Height and Width ``[C, H, W]`` shapes populated """ def __init__(self, channel=None, height=None, width=None): self._shape = { _CHANNEL_KEY: Shape(int(channel)) if channel else None, _HEIGHT_KEY: Shape(int(height)) if height else None, _WIDTH_KEY: Shape(int(width)) if width else None, } def set_channel_shape(self, channel_shape): self._shape[_CHANNEL_KEY] = Shape(channel_shape) def set_height_shape(self, height_shape): self._shape[_HEIGHT_KEY] = Shape(height_shape) def set_width_shape(self, width_shape): self._shape[_WIDTH_KEY] = Shape(width_shape) def _validate_multiarray_shape(self): num_dims = len([v for v in self._shape.values() if v]) if num_dims != 1 and num_dims != 3: raise Exception( "For neural networks, shape must be of length 1 or 3" ", representing input shape [C] or [C,H,W], respectively" ) if num_dims == 1: if not self._shape["C"]: raise Exception("Channel Shape not specified") @property def multiarray_shape(self): num_dims = len([v for v in self._shape.values() if v]) if num_dims == 1: return [self._shape[_CHANNEL_KEY].value] elif num_dims == 3: return [ self._shape[_CHANNEL_KEY].value, self._shape[_HEIGHT_KEY].value, self._shape[_WIDTH_KEY].value, ] else: raise Exception("Invalid multiarray shape for neural network") class NeuralNetworkImageSize: """ An object representing a size for an image feature inside a neural network. Valid sizess for height and width are > 0. """ def __init__(self, height=None, width=None): self._height = Size(height) self._width = Size(width) def set_width(self, width): self._width = Size(width) def set_height(self, height): self._height = Size(height) @property def width(self): return self._width.value @property def height(self): return self._height.value class ShapeRange: def __init__(self, lowerBound, upperBound): unBounded = False if upperBound == -1: unBounded = True if not unBounded and lowerBound > upperBound: raise Exception( "lowerBound > upperBound for range ({},{})".format( lowerBound, upperBound ) ) if not unBounded and upperBound < 1: raise Exception("Invalid upperBound: {} ".format(upperBound)) if lowerBound == 0: lowerBound = 1 if lowerBound < 1: raise Exception("Invalid lowerBound: {}".format(lowerBound)) self._lowerBound = lowerBound self._upperBound = upperBound self._unBounded = unBounded @property def lowerBound(self): return self._lowerBound @property def upperBound(self): return self._upperBound @property def isUnbounded(self): return self._unBounded @property def isFlexible(self): return not (self._lowerBound == self._upperBound) class NeuralNetworkMultiArrayShapeRange: """ An object representing a range of shapes for a multiArray feature in a neural network. Valid shape ranges must have have only the Channel ``[C]`` range or the Channel, Height and Width ``[C, H, W]`` ranges populated. A ``-1`` value in an upper bound represents an unbounded range. """ def __init__(self, input_ranges=None): self.arrayShapeRange = {} if input_ranges: if not isinstance(input_ranges, dict): raise Exception( "Attempting to initialize a shape range with something other than a dictionary of shapes." 
) self.arrayShapeRange = {} for key, value in input_ranges.items(): if key in _CONSTRAINED_KEYS: self.arrayShapeRange[key] = self._create_shape_range(value) self.validate_array_shape_range() def _create_shape_range(self, r): if not isinstance(r, tuple): raise Exception("Range should be a ShapeRange or a tuple object") elif len(r) != 2: raise Exception("Range tuple should be at least length 2") return ShapeRange(r[0], r[1]) def add_channel_range(self, channel_range): if not isinstance(channel_range, ShapeRange): channel_range = self._create_shape_range(channel_range) self.arrayShapeRange[_CHANNEL_KEY] = channel_range def add_height_range(self, height_range): if not isinstance(height_range, ShapeRange): height_range = self._create_shape_range(height_range) self.arrayShapeRange[_HEIGHT_KEY] = height_range def add_width_range(self, width_range): if not isinstance(width_range, ShapeRange): width_range = self._create_shape_range(width_range) self.arrayShapeRange[_WIDTH_KEY] = width_range def get_shape_range_dims(self): return len(self.arrayShapeRange.keys()) def validate_array_shape_range(self): num_dims = self.get_shape_range_dims() if num_dims != 1 and num_dims != 3: raise Exception( "For neural networks, shape must be of length 1 or 3" ", representing input shape [C] or [C,H,W], respectively" ) if num_dims == 1: if _CHANNEL_KEY not in self.arrayShapeRange.keys(): raise Exception("Channel Shape Range not specified") if num_dims == 3: if ( _CHANNEL_KEY not in self.arrayShapeRange.keys() or _HEIGHT_KEY not in self.arrayShapeRange.keys() or _WIDTH_KEY not in self.arrayShapeRange.keys() ): raise Exception( "Shape range constraint missing for either channel, height, or width." ) def get_channel_range(self): return self.arrayShapeRange[_CHANNEL_KEY] def get_height_range(self): return self.arrayShapeRange[_HEIGHT_KEY] def get_width_range(self): return self.arrayShapeRange[_WIDTH_KEY] def isFlexible(self): """ Returns true if any one of the channel, height, or width ranges of this shape allow more than one input value. """ for key, value in self.arrayShapeRange.items(): if key in _CONSTRAINED_KEYS: if value.isFlexible: return True return False class NeuralNetworkImageSizeRange: """ An object representing a range of sizes for an image feature inside a neural network. Valid ranges for height and width are > 0. A ``-1`` upper bound value for either width or height represents an unbounded size for that dimension. 
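For example, a minimal sketch of constructing such a range (the bounds are illustrative):

.. sourcecode:: python

    from coremltools.models.neural_network import flexible_shape_utils

    # Heights from 64 to 128 are allowed; width is at least 64 and has
    # no upper bound (indicated by -1).
    size_range = flexible_shape_utils.NeuralNetworkImageSizeRange(
        height_range=(64, 128), width_range=(64, -1)
    )
    # The same object can also be built incrementally with
    # add_height_range((64, 128)) and add_width_range((64, -1)).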
""" def __init__(self, height_range=None, width_range=None): if height_range and not isinstance(height_range, ShapeRange): if not isinstance(height_range, tuple): raise Exception("Height range should be a ShapeRange or a tuple object") elif len(height_range) != 2: raise Exception("Height range tuple should be at least length 2") height_range = ShapeRange(height_range[0], height_range[1]) if width_range and not isinstance(width_range, ShapeRange): if not isinstance(width_range, tuple): raise Exception("Width range should be a ShapeRange or a tuple object") elif len(width_range) != 2: raise Exception("Width range tuple should be at least length 2") width_range = ShapeRange(width_range[0], width_range[1]) self._height_range = height_range self._width_range = width_range def add_width_range(self, width_range): if not isinstance(width_range, ShapeRange): if not isinstance(width_range, tuple): raise Exception("Width range should be a ShapeRange or a tuple object") elif len(width_range) != 2: raise Exception("Width range tuple should be at least length 2") self._width_range = ShapeRange(width_range[0], width_range[1]) def add_height_range(self, height_range): if not isinstance(height_range, ShapeRange): if not isinstance(height_range, tuple): raise Exception("Height range should be a ShapeRange or a tuple object") elif len(height_range) != 2: raise Exception("Height range tuple should be at least length 2") self._height_range = ShapeRange(height_range[0], height_range[1]) def get_width_range(self): return self._width_range def get_height_range(self): return self._height_range def _set_multiarray_ndshape_range_for_feature( feature: _ml.FeatureDescription, lower_bounds: _List[int], upper_bounds: _List[int], ): if not isinstance(lower_bounds, list): raise Exception("lower_bounds must be a list") if not isinstance(upper_bounds, list): raise Exception("upper_bounds must be a list") if feature.type.WhichOneof("Type") != "multiArrayType": raise Exception("Trying to update shape range for " "a non-multiArray feature type") shape = feature.type.multiArrayType.shape if len(shape) != len(lower_bounds): raise Exception( "Length of lower_bounds is not equal to the number of dimensions in the default shape" ) if len(shape) != len(upper_bounds): raise Exception( "Length of upper_bounds is not equal to the number of dimensions in the default shape" ) feature.type.multiArrayType.ClearField("ShapeFlexibility") for i in range(len(lower_bounds)): if shape[i] < lower_bounds[i]: raise Exception( "Default shape in %d-th dimension, which is %d, is smaller" " than the lower bound of %d" % (i, int(shape[i]), lower_bounds[i]) ) if upper_bounds[i] != -1: if shape[i] > upper_bounds[i]: raise Exception( "Default shape in %d-th dimension, which is %d, is greater" " than the upper bound of %d" % (i, int(shape[i]), upper_bounds[i]) ) s = feature.type.multiArrayType.shapeRange.sizeRanges.add() s.lowerBound = lower_bounds[i] s.upperBound = upper_bounds[i] def _update_image_size_range_for_feature( feature: _ml.FeatureDescription, size_range: NeuralNetworkImageSizeRange, ): if not isinstance(size_range, NeuralNetworkImageSizeRange): raise Exception("Shape ranges should be of type NeuralNetworkImageSizeRange") if feature.type.WhichOneof("Type") != "imageType": raise Exception("Trying to add size ranges for " "a non-image feature type") feature.type.imageType.ClearField("SizeFlexibility") feature.type.imageType.imageSizeRange.heightRange.lowerBound = ( size_range.get_height_range().lowerBound ) 
feature.type.imageType.imageSizeRange.heightRange.upperBound = ( size_range.get_height_range().upperBound ) feature.type.imageType.imageSizeRange.widthRange.lowerBound = ( size_range.get_width_range().lowerBound ) feature.type.imageType.imageSizeRange.widthRange.upperBound = ( size_range.get_width_range().upperBound ) def _add_multiarray_ndshape_enumeration_for_feature( feature: _ml.FeatureDescription, enumerated_shapes: _List[_Tuple[int]], ): if not isinstance(enumerated_shapes, list): raise Exception("enumerated_shapes must be a list") if len(enumerated_shapes) == 0: raise Exception("enumerated_shapes is empty") if feature.type.WhichOneof("Type") != "multiArrayType": raise Exception("Trying to update shape range for " "a non-multiArray feature type") shape = feature.type.multiArrayType.shape if feature.type.multiArrayType.WhichOneof("ShapeFlexibility") != "enumeratedShapes": feature.type.multiArrayType.ClearField("ShapeFlexibility") eshape_len = len(feature.type.multiArrayType.enumeratedShapes.shapes) shapes_added_so_far = [] # Add default array shape to list of enumerated shapes if enumerated shapes # field is currently empty if eshape_len == 0: fixed_shape = feature.type.multiArrayType.shape s = feature.type.multiArrayType.enumeratedShapes.shapes.add() s.shape.extend(fixed_shape) shapes_added_so_far.append(list(fixed_shape)) for shape in enumerated_shapes: if not isinstance(shape, tuple): raise Exception("An element in 'enumerated_shapes' is not a tuple") if list(shape) not in shapes_added_so_far: s = feature.type.multiArrayType.enumeratedShapes.shapes.add() s.shape.extend(list(shape)) shapes_added_so_far.append(list(shape)) def _add_enumerated_image_sizes_for_feature( feature: _ml.FeatureDescription, sizes: _List[NeuralNetworkImageSize], ): if not isinstance(sizes, list): sizes = [sizes] for size in sizes: if not isinstance(size, NeuralNetworkImageSize): raise Exception("Shape ranges should be of type NeuralNetworkImageSize") if feature.type.WhichOneof("Type") != "imageType": raise Exception("Trying to add enumerated sizes to " "a non-image feature type") if feature.type.imageType.WhichOneof("SizeFlexibility") != "enumeratedSizes": feature.type.imageType.ClearField("SizeFlexibility") esizes_len = len(feature.type.imageType.enumeratedSizes.sizes) # Add default image size to list of enumerated sizes if enumerated sizes # field is currently empty if esizes_len == 0: fixed_height = feature.type.imageType.height fixed_width = feature.type.imageType.width sizes.append(NeuralNetworkImageSize(fixed_height, fixed_width)) shapes_added_so_far = [] for size in sizes: if [size.height, size.width] not in shapes_added_so_far: s = feature.type.imageType.enumeratedSizes.sizes.add() s.height = size.height s.width = size.width shapes_added_so_far.append([s.height, s.width]) def add_enumerated_multiarray_shapes(spec, feature_name, shapes): """ Annotate an input or output multiArray feature in a neural network spec to accommodate a list of enumerated array shapes. :param spec: MLModel The MLModel spec containing the feature. :param feature_name: str The name of the multiArray feature for which to add shape information. If the feature is not found in the input or output descriptions then an exception is thrown. :param shapes: [] | NeuralNetworkMultiArrayShape A single NeuralNetworkMultiArrayShape object, or a list of them, which encode valid shape information for a multiArray feature. Examples -------- ..
sourcecode:: python import coremltools from coremltools.models.neural_network import flexible_shape_utils spec = coremltools.utils.load_spec("mymodel.mlmodel") array_shapes = [flexible_shape_utils.NeuralNetworkMultiArrayShape(3)] second_shape = flexible_shape_utils.NeuralNetworkMultiArrayShape() second_shape.set_channel_shape(3) second_shape.set_height_shape(10) second_shape.set_width_shape(15) array_shapes.append(second_shape) flexible_shape_utils.add_enumerated_multiarray_shapes( spec, feature_name="my_multiarray_featurename", shapes=array_shapes ) :return: None. The spec object is updated """ if not isinstance(shapes, list): shapes = [shapes] for shape in shapes: if not isinstance(shape, NeuralNetworkMultiArrayShape): raise Exception( "Shape ranges should be of type NeuralNetworkMultiArrayShape" ) shape._validate_multiarray_shape() feature = _get_feature(spec, feature_name) if feature.type.WhichOneof("Type") != "multiArrayType": raise Exception( "Trying to add enumerated shapes to " "a non-multiArray feature type" ) if feature.type.multiArrayType.WhichOneof("ShapeFlexibility") != "enumeratedShapes": feature.type.multiArrayType.ClearField("ShapeFlexibility") eshape_len = len(feature.type.multiArrayType.enumeratedShapes.shapes) # Add default array shape to list of enumerated shapes if enumerated shapes # field is currently empty if eshape_len == 0: fixed_shape = feature.type.multiArrayType.shape if len(fixed_shape) == 1: fs = NeuralNetworkMultiArrayShape(fixed_shape[0]) shapes.append(fs) elif len(fixed_shape) == 3: fs = NeuralNetworkMultiArrayShape() fs.set_channel_shape(fixed_shape[0]) fs.set_height_shape(fixed_shape[1]) fs.set_width_shape(fixed_shape[2]) shapes.append(fs) else: raise Exception( "Original fixed multiArray shape for {} is invalid".format(feature_name) ) for shape in shapes: s = feature.type.multiArrayType.enumeratedShapes.shapes.add() s.shape.extend(shape.multiarray_shape) # Bump up specification version spec.specificationVersion = max( _MINIMUM_FLEXIBLE_SHAPES_SPEC_VERSION, spec.specificationVersion ) def add_enumerated_image_sizes(spec, feature_name, sizes): """ Annotate an input or output image feature in a neural network spec to to accommodate a list of enumerated image sizes. :param spec: MLModel The MLModel spec containing the feature. :param feature_name: str The name of the image feature for which to add size information. If the feature is not found in the input or output descriptions then an exception is thrown. :param sizes: [] | NeuralNetworkImageSize A single or a list of NeuralNetworkImageSize objects which encode valid size information for a image feature. Examples -------- .. sourcecode:: python import coremltools from coremltools.models.neural_network import flexible_shape_utils spec = coremltools.utils.load_spec("mymodel.mlmodel") image_sizes = [flexible_shape_utils.NeuralNetworkImageSize(128, 128)] image_sizes.append(flexible_shape_utils.NeuralNetworkImageSize(256, 256)) flexible_shape_utils.add_enumerated_image_sizes( spec, feature_name="my_multiarray_featurename", sizes=image_sizes ) :return: None. 
The spec object is updated """ if not isinstance(sizes, list): sizes = [sizes] for size in sizes: if not isinstance(size, NeuralNetworkImageSize): raise Exception("Shape ranges should be of type NeuralNetworkImageSize") feature = _get_feature(spec, feature_name) if feature.type.WhichOneof("Type") != "imageType": raise Exception("Trying to add enumerated sizes to " "a non-image feature type") if feature.type.imageType.WhichOneof("SizeFlexibility") != "enumeratedSizes": feature.type.imageType.ClearField("SizeFlexibility") esizes_len = len(feature.type.imageType.enumeratedSizes.sizes) # Add default image size to list of enumerated sizes if enumerated sizes # field is currently empty if esizes_len == 0: fixed_height = feature.type.imageType.height fixed_width = feature.type.imageType.width sizes.append(NeuralNetworkImageSize(fixed_height, fixed_width)) shapes_added_so_far = [] for size in sizes: if [size.height, size.width] not in shapes_added_so_far: s = feature.type.imageType.enumeratedSizes.sizes.add() s.height = size.height s.width = size.width shapes_added_so_far.append([s.height, s.width]) # Bump up specification version spec.specificationVersion = max( _MINIMUM_FLEXIBLE_SHAPES_SPEC_VERSION, spec.specificationVersion ) def update_image_size_range(spec, feature_name, size_range): """ Annotate an input or output Image feature in a neural network spec to to accommodate a range of image sizes. :param spec: MLModel The MLModel spec containing the feature. :param feature_name: str The name of the Image feature for which to add shape information. If the feature is not found in the input or output descriptions then an exception is thrown. :param size_range: NeuralNetworkImageSizeRange A NeuralNetworkImageSizeRange object with the populated image size range information. Examples -------- .. sourcecode:: python import coremltools from coremltools.models.neural_network import flexible_shape_utils spec = coremltools.utils.load_spec("mymodel.mlmodel") img_size_ranges = flexible_shape_utils.NeuralNetworkImageSizeRange() img_size_ranges.add_height_range(64, 128) img_size_ranges.add_width_range(128, -1) flexible_shape_utils.update_image_size_range( spec, feature_name="my_multiarray_featurename", size_range=img_size_ranges ) :return: None. The spec object is updated """ feature = _get_feature(spec, feature_name) _update_image_size_range_for_feature(feature, size_range) # Bump up specification version spec.specificationVersion = max( _MINIMUM_FLEXIBLE_SHAPES_SPEC_VERSION, spec.specificationVersion ) def update_multiarray_shape_range(spec, feature_name, shape_range): """ Annotate an input or output MLMultiArray feature in a neural network spec to accommodate a range of shapes. :param spec: MLModel The MLModel spec containing the feature. :param feature_name: str The name of the feature for which to add shape range information. If the feature is not found in the input or output descriptions then an exception is thrown. :param shape_range: NeuralNetworkMultiArrayShapeRange A NeuralNetworkMultiArrayShapeRange object with the populated shape range information. The shape_range object must either contain only shape information for channel or channel, height and width. If the object is invalid then an exception is thrown. Examples -------- .. 
sourcecode:: python import coremltools from coremltools.models.neural_network import flexible_shape_utils spec = coremltools.utils.load_spec("mymodel.mlmodel") shape_range = flexible_shape_utils.NeuralNetworkMultiArrayShapeRange() shape_range.add_channel_range((1, 3)) shape_range.add_width_range((128, 256)) shape_range.add_height_range((128, 256)) flexible_shape_utils.update_multiarray_shape_range( spec, feature_name="my_multiarray_featurename", shape_range=shape_range ) :return: None. The spec is updated. """ if not isinstance(shape_range, NeuralNetworkMultiArrayShapeRange): raise Exception("Shape range should be of type MultiArrayShapeRange") shape_range.validate_array_shape_range() feature = _get_feature(spec, feature_name) if feature.type.WhichOneof("Type") != "multiArrayType": raise Exception( "Trying to update shape range for " "a non-multiArray feature type" ) # Add channel range feature.type.multiArrayType.ClearField("ShapeFlexibility") s = feature.type.multiArrayType.shapeRange.sizeRanges.add() s.lowerBound = shape_range.get_channel_range().lowerBound s.upperBound = shape_range.get_channel_range().upperBound if shape_range.get_shape_range_dims() > 1: # Add height range s = feature.type.multiArrayType.shapeRange.sizeRanges.add() s.lowerBound = shape_range.get_height_range().lowerBound s.upperBound = shape_range.get_height_range().upperBound # Add width range s = feature.type.multiArrayType.shapeRange.sizeRanges.add() s.lowerBound = shape_range.get_width_range().lowerBound s.upperBound = shape_range.get_width_range().upperBound # Bump up specification version spec.specificationVersion = max( _MINIMUM_FLEXIBLE_SHAPES_SPEC_VERSION, spec.specificationVersion ) def set_multiarray_ndshape_range(spec, feature_name, lower_bounds, upper_bounds): """ Annotate an input or output MLMultiArray feature in a neural network spec to accommodate a range of shapes. This is different from ``update_multiarray_shape_range``, which works with rank 5 SBCHW mapping. :param spec: MLModel The MLModel spec containing the feature. :param feature_name: str The name of the feature for which to add shape range information. If the feature is not found in the input or output descriptions then an exception is thrown. :param lower_bounds: List[int] list of integers specifying the lower bounds of each dimension. Length must be same as the rank (length of shape) of the ``feature_name``. :param upper_bounds: List[int] list of integers specifying the upper bounds of each dimension. ``-1`` corresponds to unbounded range. Length must be same as the rank (length of shape) of the ``feature_name``. Examples -------- .. sourcecode:: python import coremltools from coremltools.models.neural_network import flexible_shape_utils spec = coremltools.utils.load_spec("mymodel.mlmodel") # say, the default shape of "my_multiarray_featurename" is (2,3) flexible_shape_utils.set_multiarray_ndshape_range( spec, feature_name="my_multiarray_featurename", lower_bounds=[1, 2], upper_bounds=[10, -1], ) :return: None. The spec is updated. """ feature = _get_feature(spec, feature_name) _set_multiarray_ndshape_range_for_feature(feature, lower_bounds, upper_bounds) # Bump up specification version spec.specificationVersion = max( _MINIMUM_NDARRAY_SPEC_VERSION, spec.specificationVersion ) def add_multiarray_ndshape_enumeration(spec, feature_name, enumerated_shapes): """ Annotate an input or output MLMultiArray feature in a neural network spec to accommodate a range of shapes. Add provided enumerated shapes to the list of shapes already present. 
This method is different from ``add_enumerated_multiarray_shapes``, which is applicable for rank 5 mapping, SBCHW, and arrays. :param spec: MLModel The MLModel spec containing the feature :param feature_name: str The name of the feature for which to add shape range information. If the feature is not found in the input or output descriptions then an exception is thrown :param enumerated_shapes: List[Tuple(int)] list of shapes, where each shape is specified as a tuple of integers. Examples -------- .. sourcecode:: python import coremltools from coremltools.models.neural_network import flexible_shape_utils spec = coremltools.utils.load_spec("mymodel.mlmodel") # say, the default shape of "my_multiarray_featurename" is (2,3) flexible_shape_utils.add_multiarray_ndshape_enumeration( spec, feature_name="my_multiarray_featurename", enumerated_shapes=[(2, 4), (2, 6)] ) :return: None. The spec is updated. """ feature = _get_feature(spec, feature_name) _add_multiarray_ndshape_enumeration_for_feature(feature, enumerated_shapes) # Bump up specification version spec.specificationVersion = max( _MINIMUM_NDARRAY_SPEC_VERSION, spec.specificationVersion ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/neural_network/optimization_utils.py0000644000000000000000000002000214672066616025675 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause """ Neural Network optimization utilities. """ import numpy as _np def _fuse_layer_with_scale_layer(layer_idx, scale_idx, layers): layer_type = layers[layer_idx].WhichOneof("layer") if layer_type == "convolution": layer = layers[layer_idx].convolution elif layer_type == "innerProduct": layer = layers[layer_idx].innerProduct else: raise Exception( "Scale fusion not supper for layer " "type {} ".format(layer_type) ) scale = layers[scale_idx].scale # Update weights sw = _np.array(scale.scale.floatValue) w = _np.array(layer.weights.floatValue) w = w.reshape(layer.outputChannels, int(len(w) / layer.outputChannels)) wp = w * sw[:, None] del layer.weights.floatValue[:] layer.weights.floatValue.extend(wp.flatten()) # Update biases if scale.hasBias: sb = _np.array(scale.bias.floatValue) if not layer.hasBias: layer.bias.floatValue.extend(sb) layer.hasBias = True else: lb = _np.array(layer.bias.floatValue) bp = sw * lb + sb del layer.bias.floatValue[:] layer.bias.floatValue.extend(bp) # re-wire outputs and delete scale layer print("Fused {}->{}".format(layers[layer_idx].name, layers[scale_idx].name)) del layers[layer_idx].output[:] layers[layer_idx].output.extend(layers[scale_idx].output) del layers[scale_idx] def _fuse_layer_with_bias_layer(layer_idx, bias_idx, layers): layer_type = layers[layer_idx].WhichOneof("layer") if layer_type == "convolution": layer = layers[layer_idx].convolution elif layer_type == "innerProduct": layer = layers[layer_idx].innerProduct else: raise Exception( "Bias fusion not supper for layer " "type {} ".format(layer_type) ) bias = layers[bias_idx].bias bb = _np.array(bias.bias.floatValue) if not layer.hasBias: layer.bias.floatValue.extend(bb) layer.hasBias = True else: lb = _np.array(layer.bias.floatValue) bp = lb + bb del layer.bias.floatValue[:] layer.bias.floatValue.extend(bp) # re-wire outputs and delete bias layer print("Fused {}->{}".format(layers[layer_idx].name, layers[bias_idx].name)) del 
layers[layer_idx].output[:] layers[layer_idx].output.extend(layers[bias_idx].output) del layers[bias_idx] def _bn_scale_fusion(bn_idx, scale_idx, layers): bn = layers[bn_idx].batchnorm scale = layers[scale_idx].scale gamma = _np.array(bn.gamma.floatValue) beta = _np.array(bn.beta.floatValue) sw = _np.array(scale.scale.floatValue) gamma = gamma * sw beta = beta * sw if scale.hasBias: sb = _np.array(scale.bias.floatValue) beta = beta + sb del bn.gamma.floatValue[:] del bn.beta.floatValue[:] bn.gamma.floatValue.extend(gamma) bn.beta.floatValue.extend(beta) # re-wire outputs and delete scale layer print("Fused {}->{}".format(layers[bn_idx].name, layers[scale_idx].name)) del layers[bn_idx].output[:] layers[bn_idx].output.extend(layers[scale_idx].output) del layers[scale_idx] def _conv_bn_fusion(conv_idx, bn_idx, layers): conv = layers[conv_idx].convolution bn = layers[bn_idx].batchnorm mean = _np.array(bn.mean.floatValue) variance = _np.array(bn.variance.floatValue) + bn.epsilon gamma = _np.array(bn.gamma.floatValue) beta = _np.array(bn.beta.floatValue) w = _np.array(conv.weights.floatValue) if conv.hasBias: b = _np.array(conv.bias.floatValue) else: b = _np.zeros(conv.outputChannels) w = w.reshape(conv.outputChannels, int(len(w) / conv.outputChannels)) wp = (gamma / _np.sqrt(variance))[:, None] * w bp = (gamma * b / _np.sqrt(variance)) - (gamma * mean / _np.sqrt(variance)) + beta del conv.weights.floatValue[:] if conv.hasBias: del conv.bias.floatValue[:] conv.weights.floatValue.extend(wp.flatten()) conv.bias.floatValue.extend(bp) conv.hasBias = True print("Fused {}->{}".format(layers[conv_idx].name, layers[bn_idx].name)) # re-wire outputs and delete batchnorm layer del layers[conv_idx].output[:] layers[conv_idx].output.extend(layers[bn_idx].output) del layers[bn_idx] def _get_nn_mappings(layers): layer_map = {} type_map = {} output_map = {} input_map = {} for idx, layer in enumerate(layers): layer_name = "{}".format(idx) layer_map[layer_name] = {"outputs": [], "inputs": []} layer_type = layer.WhichOneof("layer") if layer_type not in type_map.keys(): type_map[layer_type] = [] type_map[layer_type].append(layer_name) # Add inputs and outputs for layer for o in layer.output: layer_map[layer_name]["outputs"].append(o) for i in layer.input: layer_map[layer_name]["inputs"].append(i) # Construct input/output graph dict for l in layer_map.keys(): output_map[l] = [] input_map[l] = [] for cl in layer_map.keys(): if any(x in layer_map[l]["outputs"] for x in layer_map[cl]["inputs"]): output_map[l].append(cl) if any(x in layer_map[l]["inputs"] for x in layer_map[cl]["outputs"]): input_map[l].append(cl) return type_map, output_map, input_map def _optimize_nn(layers): type_map, output_map, input_map = _get_nn_mappings(layers) bn_layers = [] conv_layers = [] ip_layers = [] bias_layers = [] scale_layers = [] # Only fuse with non-instance batchnorm layers if "batchnorm" in type_map.keys(): for bn_layer_idx in type_map["batchnorm"]: if not layers[int(bn_layer_idx)].batchnorm.instanceNormalization: bn_layers.append(bn_layer_idx) if "convolution" in type_map.keys(): conv_layers = type_map["convolution"] if "innerProduct" in type_map.keys(): ip_layers = type_map["innerProduct"] if "bias" in type_map.keys(): bias_layers = type_map["bias"] if "scale" in type_map.keys(): scale_layers = type_map["scale"] # Convolution optimizations for conv_idx in conv_layers: if len(output_map[conv_idx]) != 1: continue output_idx = output_map[conv_idx][0] if len(input_map[output_idx]) != 1: continue # Batchnorm fusion if output_idx in 
bn_layers: _conv_bn_fusion(int(conv_idx), int(output_idx), layers) return _optimize_nn(layers) # Scale fusion if output_idx in scale_layers: _fuse_layer_with_scale_layer(int(conv_idx), int(output_idx), layers) return _optimize_nn(layers) # Bias fusion if output_idx in bias_layers: _fuse_layer_with_bias_layer(int(conv_idx), int(output_idx), layers) return _optimize_nn(layers) # Inner Product optimizations for ip_idx in ip_layers: if len(output_map[ip_idx]) != 1: continue output_idx = output_map[ip_idx][0] if len(input_map[output_idx]) != 1: continue # Scale Fusion if output_idx in scale_layers: _fuse_layer_with_scale_layer(int(ip_idx), int(output_idx), layers) return _optimize_nn(layers) # Bias Fusion if output_idx in bias_layers: _fuse_layer_with_bias_layer(int(ip_idx), int(output_idx), layers) return _optimize_nn(layers) # Batchnorm optimizations for bn_idx in bn_layers: if len(output_map[bn_idx]) != 1: continue output_idx = output_map[bn_idx][0] if len(input_map[output_idx]) != 1: continue # Scale Fusion if output_idx in scale_layers: _bn_scale_fusion(int(bn_idx), int(output_idx), layers) return _optimize_nn(layers) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/neural_network/printer.py0000644000000000000000000000724414672066616023427 0ustar00rootroot# Copyright (c) 2018, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from .spec_inspection_utils import (_get_feature_description_summary, _summarize_neural_network_spec, _summarize_neural_network_spec_code_style) def _print_network_spec_parameter_info_style(mlmodel_spec, interface_only=False): """ Print the network information summary. 
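A minimal usage sketch of the public ``print_network_spec`` entry point that wraps this helper (the model path below is hypothetical):

.. sourcecode:: python

    import coremltools
    from coremltools.models.neural_network.printer import print_network_spec

    spec = coremltools.utils.load_spec("my_model.mlmodel")  # hypothetical path
    print_network_spec(spec, interface_only=True)  # inputs/outputs only
    print_network_spec(spec, style="coding")       # compact, code-style summary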
Args: mlmodel_spec : the mlmodel spec interface_only : Shows only the input and output of the network """ inputs, outputs, layers_info = _summarize_neural_network_spec(mlmodel_spec) print("Inputs:") for i in inputs: name, description = i print(" {} {}".format(name, description)) print("Outputs:") for o in outputs: name, description = o print(" {} {}".format(name, description)) if layers_info is None: print( "\n(This MLModel is not a neural network model or does not contain any layers)" ) if layers_info and not interface_only: print("\nLayers:") for idx, l in enumerate(layers_info): layer_type, name, in_blobs, out_blobs, params_info = l print("[{}] ({}) {}".format(idx, layer_type, name)) print(" Input blobs: {}".format(in_blobs)) print(" Output blobs: {}".format(out_blobs)) if len(params_info) > 0: print(" Parameters: ") for param in params_info: print(" {} = {}".format(param[0], param[1])) print("\n") def _print_network_spec_coding_style(mlmodel_spec, interface_only=False): """ Args: mlmodel_spec : the mlmodel spec interface_only : Shows only the input and output of the network """ inputs = [ (blob.name, _get_feature_description_summary(blob)) for blob in mlmodel_spec.description.input ] outputs = [ (blob.name, _get_feature_description_summary(blob)) for blob in mlmodel_spec.description.output ] input_names = [] print("Inputs:") for i in inputs: name, description = i print(" {} {}".format(name, description)) input_names.append(name) output_names = [] print("Outputs:") for o in outputs: name, description = o print(" {} {}".format(name, description)) output_names.append(name) if interface_only: return nn_spec = None if mlmodel_spec.HasField("neuralNetwork"): nn_spec = mlmodel_spec.neuralNetwork elif mlmodel_spec.HasField("neuralNetworkClassifier"): nn_spec = mlmodel_spec.neuralNetworkClassifier elif mlmodel_spec.HasField("neuralNetworkRegressor"): nn_spec = mlmodel_spec.neuralNetworkRegressor if nn_spec is None: print("\n(This MLModel is not a neural network model)") return print("\n") _summarize_neural_network_spec_code_style( nn_spec, input_names=input_names, output_names=output_names ) def print_network_spec(mlmodel_spec, interface_only=False, style=""): """ Print the network information summary. Args: mlmodel_spec : the mlmodel spec interface_only : Shows only the input and output of the network style : str. Either 'coding' or default, which prints information on parameters of layers. """ if style == "coding": _print_network_spec_coding_style(mlmodel_spec, interface_only=interface_only) else: _print_network_spec_parameter_info_style( mlmodel_spec, interface_only=interface_only ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/neural_network/quantization_utils.py0000644000000000000000000016374714672066616025725 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause """ Utilities to compress Neural Network Models. 
Only available in coremltools 2.0b1 and onwards """ from os import listdir as _listdir from sys import stdout as _stdout from typing import Optional as _Optional import numpy as _np from coremltools import ComputeUnit as _ComputeUnit from coremltools import _logger from coremltools._deps import _HAS_KMEANS1D, _kmeans1d from coremltools.models import ( _QUANTIZATION_MODE_CUSTOM_LOOKUP_TABLE, _QUANTIZATION_MODE_DEQUANTIZE, _QUANTIZATION_MODE_LINEAR_QUANTIZATION, _QUANTIZATION_MODE_LINEAR_SYMMETRIC, _QUANTIZATION_MODE_LOOKUP_TABLE_KMEANS, _QUANTIZATION_MODE_LOOKUP_TABLE_LINEAR, _SUPPORTED_QUANTIZATION_MODES, ) from coremltools.models import MLModel as _MLModel from ... import ( _MINIMUM_FP16_SPEC_VERSION, _MINIMUM_QUANTIZED_MODEL_SPEC_VERSION, _SPECIFICATION_VERSION_IOS_14, ) from ..utils import _get_model, _macos_version, _wp_to_fp16wp from .optimization_utils import _optimize_nn class QuantizedLayerSelector: """ This is the base class to implement custom selectors to skip certain layers during quantization. To implement a custom selector, create a class that inherits this class and override `do_quantize()` method. Examples -------- .. highlight:: python .. code-block:: python class MyLayerSelector(QuantizedLayerSelector): def __init__(self): super().__init__() def do_quantize(self, layer, **kwargs): ret = super().do_quantize(layer) if not ret or layer.name == "dense_2": return False return True selector = MyLayerSelector() quantized_model = quantize_weights( mlmodel, 8, quantization_mode="linear", selector=selector ) """ def __init__(self): self.quantizable_layer_types = { "convolution", "innerProduct", "embedding", "embeddingND", "batchnorm", "scale", "bias", "loadConstant", "loadConstantND", "simpleRecurrent", "gru", "uniDirectionalLSTM", "biDirectionalLSTM", "batchedMatmul", "depthwiseConv", "loop", "branch", } def do_quantize(self, layer, **kwargs): return layer.WhichOneof("layer") in self.quantizable_layer_types class AdvancedQuantizedLayerSelector(QuantizedLayerSelector): """Quantized layer selector allowing the user to specify some types of layers to skip during quantization process and the minimum size parameters in quantized convolution layers. Examples -------- .. highlight:: python .. 
code-block:: python from coremltools.models.neural_network.quantization_utils import ( AdvancedQuantizedLayerSelector, ) selector = AdvancedQuantizedLayerSelector( skip_layer_types=["batchnorm", "bias", "depthwiseConv"], minimum_conv_kernel_channels=4, minimum_conv_weight_count=4096, ) quantized_model = quantize_weights(model, 8, selector=selector) """ def __init__( self, skip_layer_types=[], minimum_conv_kernel_channels=4, minimum_conv_weight_count=4096, ): super().__init__() self.skip_layer_types = skip_layer_types # Error checking invalid_skip_types = [] for lt in skip_layer_types: if lt not in self.quantizable_layer_types: invalid_skip_types.append(lt) if len(invalid_skip_types) > 0: err_msg = "Skip quantization layer types ({}) is not supported.\n".format( ",".join(invalid_skip_types) ) err_msg += "Supported quantization layers: ({})".format( ",".join(self.quantizable_layer_types) ) raise ValueError(err_msg) self.minimum_conv_kernel_channels = minimum_conv_kernel_channels self.minimum_conv_weight_count = minimum_conv_weight_count def do_quantize(self, layer, weight_param=None): """ weight_param - should be name of the WeightParam field """ ret = super().do_quantize(layer) if not ret: return False layer_type = layer.WhichOneof("layer") if layer_type in self.skip_layer_types: return False if layer_type == "convolution": oc = layer.convolution.outputChannels kc = layer.convolution.kernelChannels kh = layer.convolution.kernelSize[0] kw = layer.convolution.kernelSize[1] groups = layer.convolution.nGroups counts = oc * kc * kh * kw has_bias = layer.convolution.hasBias if weight_param is None or weight_param == "weights": if "depthwiseConv" in self.skip_layer_types and kc == 1 and groups > 1: return False if ( kc < self.minimum_conv_kernel_channels or counts < self.minimum_conv_weight_count ): return False elif weight_param == "bias": return "bias" not in self.skip_layer_types else: raise ValueError( "Unrecognized quantization weight field {}".format(weight_param) ) elif layer_type == "innerProduct" or "batchedMatmul": if weight_param is None or weight_param == "weights": return True if weight_param == "bias": return "bias" not in self.skip_layer_types else: raise ValueError( "Unrecognized quantization weight field {}".format(weight_param) ) return True class MatrixMultiplyLayerSelector(QuantizedLayerSelector): """ Layer selector object that allows users to select matrix multiplication layers with one of the matrices being constant, based on some criterions like total numbers of parameters/weights, number of input or output channels and/or layer names. If any of the criterion is not valid, the corresponding layer is not selected. 
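For example, a minimal sketch in the spirit of the selector examples above (assumes ``mlmodel`` is an existing ``MLModel``; the thresholds are illustrative):

.. sourcecode:: python

    from coremltools.models.neural_network.quantization_utils import (
        MatrixMultiplyLayerSelector,
        quantize_weights,
    )

    # Quantize only fairly large constant matrix multiplications; smaller
    # innerProduct/batchedMatmul layers are left in full precision.
    selector = MatrixMultiplyLayerSelector(
        minimum_weight_count=1024 * 1024,
        minimum_output_channels=1024,
    )
    quantized_model = quantize_weights(
        mlmodel, 8, quantization_mode="linear", selector=selector
    )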
""" def __init__( self, minimum_weight_count=1, minimum_input_channels=1, minimum_output_channels=1, maximum_input_channels=None, maximum_output_channels=None, include_layers_with_names=None, ): super().__init__() # weight count refers to number of parameters/weights and is equal to product of input & output channels self.minimum_weight_count = minimum_weight_count self.minimum_input_channels = minimum_input_channels self.minimum_output_channels = minimum_output_channels self.maximum_input_channels = maximum_input_channels self.maximum_output_channels = maximum_output_channels if include_layers_with_names is None: self.include_layers_with_names = [] if not ( isinstance(self.include_layers_with_names, (list, tuple)) and all( [isinstance(s, str) for s in self.include_layers_with_names] ) ): raise ValueError( "Property 'include_layers_with_names' must be a list/tuple of str objects" ) def do_quantize(self, layer, weight_param=None): """ weight_param - should be name of the WeightParam field """ ret = super().do_quantize(layer) if not ret: return False layer_type = layer.WhichOneof("layer") if layer_type in ["innerProduct", "batchedMatmul"]: if weight_param == "bias": return True elif weight_param is None or weight_param == "weights": if layer_type == "innerProduct": ic = layer.innerProduct.inputChannels oc = layer.innerProduct.outputChannels else: ic = layer.batchedMatmul.weightMatrixFirstDimension oc = layer.batchedMatmul.weightMatrixSecondDimension wc = ic * oc if wc < self.minimum_weight_count: return False if ic < self.minimum_input_channels: return False if oc < self.minimum_output_channels: return False if self.maximum_input_channels and ic > self.maximum_input_channels: return False if self.maximum_output_channels and oc > self.maximum_output_channels: return False if ( self.include_layers_with_names and layer.name not in self.include_layers_with_names ): return False return True else: raise ValueError( "Unrecognized quantization weight field {}".format(weight_param) ) elif layer_type in ["loop", "branch"]: return True return False def _convert_1bit_array_to_byte_array(arr): """ Convert bit array to byte array. arr: list Bits as a list where each element is an integer of 0 or 1 Returns ------- numpy.array 1D numpy array of type uint8 """ # Padding if necessary while len(arr) < 8 or len(arr) % 8: arr.append(0) arr = _np.array(arr, dtype="uint8") bit_arr = [] idx = 0 # Iterate and combine 8-bits into a uint8 for arr_idx in range(int(len(arr) / 8)): bit_arr.append( ((arr[idx] << 7) & (1 << 7)) | ((arr[idx + 1] << 6) & (1 << 6)) | ((arr[idx + 2] << 5) & (1 << 5)) | ((arr[idx + 3] << 4) & (1 << 4)) | ((arr[idx + 4] << 3) & (1 << 3)) | ((arr[idx + 5] << 2) & (1 << 2)) | ((arr[idx + 6] << 1) & (1 << 1)) | ((arr[idx + 7] << 0) & (1 << 0)) ) idx += 8 return _np.array(bit_arr, dtype="uint8") def _convert_array_to_nbit_quantized_bytes(arr, nbits): split_arr = [] for idx in range(len(arr)): for i in reversed(range(nbits)): split_arr.append((arr[idx] >> i) & (1 << 0)) return _convert_1bit_array_to_byte_array(split_arr) def _decompose_bytes_to_bit_arr(arr): """ Unpack bytes to bits arr: list Byte Stream, as a list of uint8 values Returns ------- bit_arr: list Decomposed bit stream as a list of 0/1s of length (len(arr) * 8) """ bit_arr = [] for idx in range(len(arr)): for i in reversed(range(8)): bit_arr.append((arr[idx] >> i) & (1 << 0)) return bit_arr def _get_linear_lookup_table_and_weight(nbits, wp): """ Generate a linear lookup table. 
nbits: int Number of bits to represent a quantized weight value wp: numpy.array Weight blob to be quantized Returns ------- lookup_table: numpy.array Lookup table of shape (2^nbits, ) qw: numpy.array Decomposed bit stream as a list of 0/1s of length (len(arr) * 8) """ w = wp.reshape(1, -1) qw, scales, biases = _quantize_channelwise_linear(w, nbits, axis=0) indices = _np.array(range(0, 2 ** nbits)) lookup_table = indices * scales[0] + biases[0] return lookup_table, qw def _get_kmeans_lookup_table_and_weight( nbits, weight, force_kmeans1d=False, cluster_dim: int = 1, vector_axis: _Optional[int] = None ): """ Generate K-Means lookup table given weights nbits: Number of bits for quantization weight: Weights as numpy array force_kmeans1d: Use kmeans1d regardless of number of weights Returns ------- lut: numpy.array Lookup table, numpy array of shape (1 << nbits, ) wq: numpy.array Quantized weight """ if force_kmeans1d and cluster_dim > 1: raise ValueError("Cannot force kmeans1d for vector palettization (cluster_dim > 1).") num_weights = _np.prod(weight.shape) lut_len = 1 << nbits if cluster_dim > 1: # Import here to avoid circular import. from coremltools.optimize.coreml import _utils as optimize_utils weight = optimize_utils.reshape_weight_for_vector_lut(weight, cluster_dim, vector_axis) weight = weight.reshape(-1, cluster_dim) lut = _np.zeros((lut_len, cluster_dim)) is_better_to_use_kmeans1d = ( weight.shape[1] == 1 and num_weights >= 10_000 and weight.dtype == _np.float16 ) if (is_better_to_use_kmeans1d and _HAS_KMEANS1D) or force_kmeans1d: # Cluster with kmeans1d assert _HAS_KMEANS1D, "Unable to import kmeans1d, please make sure it's installed." values, indices, counts = _np.unique(weight, return_inverse=True, return_counts=True) indices = indices.flatten() n_clusters = min(len(values), lut_len) kmeans_results = _kmeans1d.cluster(values, n_clusters, weights=counts) lut = lut.squeeze(-1) lut[:n_clusters] = kmeans_results.centroids wq = _np.array(kmeans_results.clusters)[indices] else: # Cluster with scikit-learn try: from sklearn.cluster import KMeans except: raise ModuleNotFoundError( "scikit-learn is required for k-means quantization." " To install, run: \"pip install scikit-learn\"." ) if is_better_to_use_kmeans1d: _logger.warning("It would be better to use kmeans1d but that is not available." " Using scikit-learn for K-means.") n_clusters = min(num_weights, lut_len) kmeans = KMeans(n_clusters, init="k-means++", tol=1e-2, n_init=1, random_state=0).fit( weight ) wq = kmeans.labels_[:num_weights] lut[:n_clusters] = kmeans.cluster_centers_ return lut, wq def _quantize_channelwise_linear(weight, nbits, axis=0, symmetric=False): """ Linearly quantize weight blob. weight: numpy.array Weight to be quantized. nbits: int Number of bits per weight element axis: int Axis of the weight blob to compute channel-wise quantization, can be 0 or 1 symmetric: bool If true, set quantization range to be symmetrical to 0. Otherwise, set quantization range to be the minimum and maximum of weight parameters. 
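Concretely, the asymmetric mode uses ``scale = (max - min) / (2**nbits - 1)``
and ``bias = min`` per channel, so that ``q = round((w - bias) / scale)``;
the symmetric mode instead derives the scale from ``r = max(|min|, |max|)``
as ``scale = r / (2**(nbits - 1) - 1)``.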
Returns ------- quantized_weight: numpy.array quantized weight as float numpy array, with the same shape as weight scale: numpy.array per channel scale bias: numpy.array per channel bias """ if len(weight.shape) == 1: # vector situation, treat as 1 channel weight = weight.reshape((1, weight.shape[0])) rank = len(weight.shape) if axis == 1: transposed_axis_order = (1, 0) + tuple(range(2, rank)) weight = _np.transpose(weight, transposed_axis_order) num_channels = weight.shape[0] shape = weight.shape weight = weight.reshape((num_channels, -1)) # [C, L] a = _np.amin(weight, axis=-1) # [C,] b = _np.amax(weight, axis=-1) # [C,] if symmetric: r = _np.maximum(_np.abs(a), _np.abs(b)) scale = r / ((1 << nbits) / 2.0 - 1) bias = -(1 << nbits) / 2.0 * scale num = weight - bias[:, None] denom = scale[:, None] qw = _np.divide( num, denom, out=_np.zeros_like(num), where=(_np.abs(denom) > 1e-6) ) qw = _np.round(qw) else: qb = (1 << nbits) - 1 scale = (b - a) / qb inv_scale = _np.divide( 1.0, scale, out=_np.zeros_like(scale), where=(_np.abs(scale) > 1e-6) ) bias = a qw = (weight - a[:, None]) * inv_scale[:, None] qw = _np.round(qw) # Reshape quantized_weight = qw.reshape(shape) if axis == 1: quantized_weight = _np.transpose(quantized_weight, transposed_axis_order) return (quantized_weight, scale, bias) def _quantize_wp(wp, nbits, qm, axis=0, **kwargs): """ Quantize the weight blob wp: numpy.array Weight parameters nbits: int Number of bits qm: Quantization mode lut_function: (``callable function``) Python callable representing a look-up table Returns ------- scale: numpy.array Per-channel scale bias: numpy.array Per-channel bias lut: numpy.array Lookup table quantized_wp: numpy.array Quantized weight of same shape as wp, with dtype numpy.uint8 """ scale = bias = lut = None # Linear Quantization if qm in [ _QUANTIZATION_MODE_LINEAR_QUANTIZATION, _QUANTIZATION_MODE_LINEAR_SYMMETRIC, ]: symmetric = qm == _QUANTIZATION_MODE_LINEAR_SYMMETRIC qw, scale, bias = _quantize_channelwise_linear(wp, nbits, axis, symmetric) # Lookup tables elif qm == _QUANTIZATION_MODE_LOOKUP_TABLE_KMEANS: lut, qw = _get_kmeans_lookup_table_and_weight(nbits, wp) elif qm == _QUANTIZATION_MODE_CUSTOM_LOOKUP_TABLE: if "lut_function" not in kwargs.keys(): raise Exception( "Custom lookup table quantization mode " "selected but no lookup table function passed" ) lut_function = kwargs["lut_function"] if not callable(lut_function): raise Exception( "Argument for Lookup Table passed in but is " "not callable" ) try: lut, qw = lut_function(nbits, wp) except Exception as e: raise Exception( "{}\nCall to Lookup Table function failed".format(e.message) ) elif qm == _QUANTIZATION_MODE_LOOKUP_TABLE_LINEAR: lut, qw = _get_linear_lookup_table_and_weight(nbits, wp) else: raise NotImplementedError('Quantization method "{}" not supported'.format(qm)) quantized_wp = _np.uint8(qw) return scale, bias, lut, quantized_wp def _quantize_wp_field(wp, nbits, qm, shape, axis=0, **kwargs): """ Quantize WeightParam field in Neural Network Protobuf wp: MLModel.NeuralNetwork.WeightParam WeightParam field nbits: int Number of bits to be quantized qm: str Quantization mode shape: tuple Tensor shape held by wp axis: int Axis over which quantization is performed on, can be either 0 or 1 lut_function: (``callable function``) Python callable representing a LUT table function """ # De-quantization if qm == _QUANTIZATION_MODE_DEQUANTIZE: return _dequantize_wp(wp, shape, axis) # If the float32 field is empty do nothing and return if len(wp.floatValue) == 0: return # Half 
precision (16-bit) quantization if nbits == 16: return _wp_to_fp16wp(wp) if nbits > 8: raise Exception("Only 8-bit and lower quantization is supported") if qm not in _SUPPORTED_QUANTIZATION_MODES: raise Exception("Quantization mode {} not supported".format(qm)) # axis parameter check if axis == 1 and len(shape) != 4: raise Exception( "Quantization on second axis is only supported " "for rank-4 weight blob." ) if axis != 0 and axis != 1: raise Exception( "Invalid quantization axis {} passed in. Allowed" "values are 0 (first axis) and 1 (second axis)".format(axis) ) # WeightParam size check - non-linear quantizations are applied on layer level num_channels = ( shape[axis] if qm in [_QUANTIZATION_MODE_LINEAR_QUANTIZATION, _QUANTIZATION_MODE_LINEAR_SYMMETRIC] else 1 ) if len(wp.floatValue) % num_channels: raise Exception( "Number of quantization channels does not divide evenly into weights" ) qparams = wp.quantization qparams.numberOfBits = nbits weights = _np.array(wp.floatValue).reshape(shape) scale, bias, lut, uint8_weights = _quantize_wp(weights, nbits, qm, axis, **kwargs) uint8_weights = uint8_weights.flatten() if qm in [ _QUANTIZATION_MODE_LINEAR_QUANTIZATION, _QUANTIZATION_MODE_LINEAR_SYMMETRIC, ]: qparams.linearQuantization.scale.extend(scale) qparams.linearQuantization.bias.extend(bias) else: qparams.lookupTableQuantization.floatValue.extend(lut) wp.rawValue = bytes() if nbits == 8: wp.rawValue += uint8_weights.tobytes() else: wp.rawValue += _convert_array_to_nbit_quantized_bytes( uint8_weights, nbits ).tobytes() del wp.floatValue[:] def _unpack_to_bytes(byte_arr, num_weights, nbits): assert num_weights % 1 == 0 num_weights = int(num_weights) bit_arr = _decompose_bytes_to_bit_arr(byte_arr.flatten().tolist()) bit_arr = _np.array(bit_arr[: num_weights * nbits]).reshape((num_weights, nbits)) expo = 2 ** _np.array(list(reversed(range(0, nbits)))) byte_arr = _np.sum(bit_arr * expo, axis=1) return byte_arr def _dequantize_linear(weight_8bit, scale, bias, axis=0): if len(weight_8bit.shape) == 1: # vector situation, treat as 1 channel weight_8bit = weight_8bit.reshape((1, weight_8bit.shape[0])) rank = len(weight_8bit.shape) if axis == 1: transposed_axis_order = (1, 0) + tuple(range(2, rank)) weight_8bit = _np.transpose(weight_8bit, transposed_axis_order) num_channels = weight_8bit.shape[0] broadcast_shape = (num_channels,) + (1,) * (rank - 1) scale = scale.reshape(broadcast_shape) bias = bias.reshape(broadcast_shape) weight = weight_8bit.astype("float") * scale + bias if axis == 1: weight = _np.transpose(weight, transposed_axis_order) return weight def _dequantize_lut(weight_8bit, lut): return lut[weight_8bit.astype("uint8")] def _dequantize_wp(wp, shape, axis=0): if len(wp.floatValue) != 0: return is_linear = wp.quantization.WhichOneof("QuantizationType") == "linearQuantization" if is_linear: if len(wp.quantization.linearQuantization.scale) != len( wp.quantization.linearQuantization.bias ): raise Exception( "Linear quantization scale and bias vectors are " "different lengths" ) # axis parameter check if axis == 1 and len(shape) != 4: raise Exception( "Dequantization on second axis is only supported " "for rank-4 weight blob." ) if axis != 0 and axis != 1: raise Exception( "Invalid quantization axis {} passed in. 
Allowed" "values are 0 (first axis) and 1 (second axis)".format(axis) ) nbits = wp.quantization.numberOfBits num_weights = _np.prod(shape) byte_arr = _np.frombuffer(wp.rawValue, dtype=_np.uint8) weight_8bit = ( byte_arr if nbits == 8 else _unpack_to_bytes(byte_arr, num_weights, nbits) ) weight_8bit = weight_8bit.reshape(shape) if is_linear: scale = _np.array(wp.quantization.linearQuantization.scale) bias = _np.array(wp.quantization.linearQuantization.bias) dequantized_weight = _dequantize_linear(weight_8bit, scale, bias, axis) else: lut = _np.array(wp.quantization.lookupTableQuantization.floatValue) dequantized_weight = _dequantize_lut(weight_8bit, lut) wp.rawValue = bytes() wp.quantization.Clear() wp.floatValue.extend(dequantized_weight.flatten()) def _dequantize_nn_spec(spec): """ Dequantize weights in NeuralNetwork type mlmodel specifications. """ _quantize_nn_spec(spec, None, _QUANTIZATION_MODE_DEQUANTIZE) def _quantize_nn_spec(nn_spec, nbits, qm, **kwargs): """ Quantize weights in NeuralNetwork type mlmodel specifications. """ selector = kwargs.get("selector", QuantizedLayerSelector()) if qm not in _SUPPORTED_QUANTIZATION_MODES: raise Exception("Quantization mode {} not supported".format(qm)) if qm != _QUANTIZATION_MODE_DEQUANTIZE: if nbits is None: raise Exception('Missing argument "nbits"') if not (nbits > 0 and nbits <= 8 or nbits == 16): raise Exception( "Only half precision (16-bit), 1 to 8-bit " "quantization is supported" ) if qm == _QUANTIZATION_MODE_LINEAR_SYMMETRIC and nbits != 8: raise Exception("Symmetric quantization is only applicable for 8 bit" "linear") layers = nn_spec.layers # Perform optimization step if nbits is not None and nbits < 16 and qm != _QUANTIZATION_MODE_DEQUANTIZE: print("Optimizing Neural Network before Quantization:") _optimize_nn(layers) print("Finished optimizing network. 
Quantizing neural network..") # Quantize each layer for layer in layers: layer_type = layer.WhichOneof("layer") if not selector.do_quantize(layer): continue print("Quantizing layer {} of type {}".format(layer.name, layer_type)) # Convolution if layer_type == "convolution": output_channels = layer.convolution.outputChannels kernel_channels = layer.convolution.kernelChannels kernel_height = layer.convolution.kernelSize[0] kernel_width = layer.convolution.kernelSize[1] groups = layer.convolution.nGroups counts = output_channels * kernel_channels * kernel_height * kernel_width has_bias = layer.convolution.hasBias if layer.convolution.isDeconvolution: shape = ( kernel_channels, int(output_channels / groups), kernel_height, kernel_width, ) _quantize_wp_field( layer.convolution.weights, nbits, qm, shape, axis=1, **kwargs ) else: shape = (output_channels, kernel_channels, kernel_height, kernel_width) _quantize_wp_field( layer.convolution.weights, nbits, qm, shape, **kwargs ) if has_bias and selector.do_quantize(layer, weight_param="bias"): _quantize_wp_field( layer.convolution.bias, nbits, qm, shape=(output_channels,), **kwargs ) # Batchnorm elif layer_type == "batchnorm": nw = layer.batchnorm.channels _quantize_wp_field(layer.batchnorm.gamma, nbits, qm, shape=(nw,), **kwargs) _quantize_wp_field(layer.batchnorm.beta, nbits, qm, shape=(nw,), **kwargs) _quantize_wp_field(layer.batchnorm.mean, nbits, qm, shape=(nw,), **kwargs) _quantize_wp_field( layer.batchnorm.variance, nbits, qm, shape=(nw,), **kwargs ) # InnerProduct elif layer_type == "innerProduct": output_channels = layer.innerProduct.outputChannels input_channels = layer.innerProduct.inputChannels _quantize_wp_field( layer.innerProduct.weights, nbits, qm, shape=(output_channels, input_channels), **kwargs ) has_bias = layer.innerProduct.hasBias if has_bias and selector.do_quantize(layer, weight_param="bias"): _quantize_wp_field( layer.innerProduct.bias, nbits, qm, shape=(output_channels,), **kwargs ) # BatchedMatmul elif layer_type == "batchedMatmul": x1 = layer.batchedMatmul.weightMatrixFirstDimension x2 = layer.batchedMatmul.weightMatrixSecondDimension _quantize_wp_field( layer.batchedMatmul.weights, nbits, qm, shape=(x2, x1), **kwargs ) has_bias = layer.batchedMatmul.hasBias if has_bias and selector.do_quantize(layer, weight_param="bias"): _quantize_wp_field( layer.batchedMatmul.bias, nbits, qm, shape=(x2,), **kwargs ) # Embedding layer elif layer_type == "embedding": output_channels = layer.embedding.outputChannels input_channels = layer.embedding.inputDim _quantize_wp_field( layer.embedding.weights, nbits, qm, shape=(output_channels, input_channels), **kwargs ) if layer.embedding.hasBias: _quantize_wp_field( layer.embedding.bias, nbits, qm, shape=(output_channels,), **kwargs ) # Embedding ND layer elif layer_type == "embeddingND": output_channels = layer.embeddingND.embeddingSize input_channels = layer.embeddingND.vocabSize _quantize_wp_field( layer.embeddingND.weights, nbits, qm, shape=(output_channels, input_channels), **kwargs ) if layer.embeddingND.hasBias: _quantize_wp_field( layer.embeddingND.bias, nbits, qm, shape=(output_channels,), **kwargs ) # Scale layer elif layer_type == "scale": nw = _np.prod(layer.scale.shapeScale) _quantize_wp_field(layer.scale.scale, nbits, qm, shape=(nw,), **kwargs) if layer.scale.hasBias: nw = _np.prod(layer.scale.shapeBias) _quantize_wp_field(layer.scale.bias, nbits, qm, shape=(nw,), **kwargs) # Bias layer elif layer_type == "bias": nw = _np.prod(layer.bias.shape) _quantize_wp_field(layer.bias.bias, 
nbits, qm, shape=(nw,), **kwargs) # LoadConstant layer elif layer_type == "loadConstant": nw = _np.prod(layer.loadConstant.shape) _quantize_wp_field( layer.loadConstant.data, nbits, qm, shape=(nw,), **kwargs ) # LoadConstantND layer elif layer_type == "loadConstantND": nw = _np.prod(layer.loadConstantND.shape) _quantize_wp_field( layer.loadConstantND.data, nbits, qm, shape=(nw,), **kwargs ) # Simple Recurrent elif layer_type == "simpleRecurrent": i_size = layer.simpleRecurrent.inputVectorSize o_size = layer.simpleRecurrent.outputVectorSize _quantize_wp_field( layer.simpleRecurrent.weightMatrix, nbits, qm, shape=(o_size, i_size), **kwargs ) _quantize_wp_field( layer.simpleRecurrent.recursionMatrix, nbits, qm, shape=(o_size, o_size), **kwargs ) if layer.simpleRecurrent.hasBiasVector: _quantize_wp_field( layer.simpleRecurrent.biasVector, nbits, qm, shape=(o_size,), **kwargs ) # GRU elif layer_type == "gru": i_size = layer.gru.inputVectorSize o_size = layer.gru.outputVectorSize # Weight Matrix _quantize_wp_field( layer.gru.updateGateWeightMatrix, nbits, qm, shape=(o_size, i_size), **kwargs ) _quantize_wp_field( layer.gru.resetGateWeightMatrix, nbits, qm, shape=(o_size, i_size), **kwargs ) _quantize_wp_field( layer.gru.outputGateWeightMatrix, nbits, qm, shape=(o_size, i_size), **kwargs ) # Recursion Weights _quantize_wp_field( layer.gru.updateGateRecursionMatrix, nbits, qm, shape=(o_size, o_size), **kwargs ) _quantize_wp_field( layer.gru.resetGateRecursionMatrix, nbits, qm, shape=(o_size, o_size), **kwargs ) _quantize_wp_field( layer.gru.outputGateRecursionMatrix, nbits, qm, shape=(o_size, o_size), **kwargs ) # Bias if layer.gru.hasBiasVectors: _quantize_wp_field( layer.gru.updateGateBiasVector, nbits, qm, shape=(o_size,), **kwargs ) _quantize_wp_field( layer.gru.resetGateBiasVector, nbits, qm, shape=(o_size,), **kwargs ) _quantize_wp_field( layer.gru.outputGateBiasVector, nbits, qm, shape=(o_size,), **kwargs ) # LSTM Layers elif layer_type in ["uniDirectionalLSTM", "biDirectionalLSTM"]: def _lstmwp_to_fp16_lstmwp( lstm_wp, nbits, qm, i_size, o_size, has_peephole=True ): assert lstm_wp _quantize_wp_field( lstm_wp.inputGateWeightMatrix, nbits, qm, shape=(o_size, i_size), **kwargs ) _quantize_wp_field( lstm_wp.forgetGateWeightMatrix, nbits, qm, shape=(o_size, i_size), **kwargs ) _quantize_wp_field( lstm_wp.blockInputWeightMatrix, nbits, qm, shape=(o_size, i_size), **kwargs ) _quantize_wp_field( lstm_wp.outputGateWeightMatrix, nbits, qm, shape=(o_size, i_size), **kwargs ) _quantize_wp_field( lstm_wp.inputGateRecursionMatrix, nbits, qm, shape=(o_size, o_size), **kwargs ) _quantize_wp_field( lstm_wp.forgetGateRecursionMatrix, nbits, qm, shape=(o_size, o_size), **kwargs ) _quantize_wp_field( lstm_wp.blockInputRecursionMatrix, nbits, qm, shape=(o_size, o_size), **kwargs ) _quantize_wp_field( lstm_wp.outputGateRecursionMatrix, nbits, qm, shape=(o_size, o_size), **kwargs ) _quantize_wp_field( lstm_wp.inputGateBiasVector, nbits, qm, shape=(o_size,), **kwargs ) _quantize_wp_field( lstm_wp.forgetGateBiasVector, nbits, qm, shape=(o_size,), **kwargs ) _quantize_wp_field( lstm_wp.blockInputBiasVector, nbits, qm, shape=(o_size,), **kwargs ) _quantize_wp_field( lstm_wp.outputGateBiasVector, nbits, qm, shape=(o_size,), **kwargs ) if has_peephole: _quantize_wp_field( lstm_wp.inputGatePeepholeVector, nbits, qm, shape=(o_size,), **kwargs ) _quantize_wp_field( lstm_wp.forgetGatePeepholeVector, nbits, qm, shape=(o_size,), **kwargs ) _quantize_wp_field( lstm_wp.outputGatePeepholeVector, nbits, qm, shape=(o_size,), 
**kwargs ) if layer_type == "uniDirectionalLSTM": _lstmwp_to_fp16_lstmwp( lstm_wp=layer.uniDirectionalLSTM.weightParams, nbits=nbits, qm=qm, i_size=layer.uniDirectionalLSTM.inputVectorSize, o_size=layer.uniDirectionalLSTM.outputVectorSize, has_peephole=layer.uniDirectionalLSTM.params.hasPeepholeVectors, ) elif layer_type == "biDirectionalLSTM": for lstm_wp in layer.biDirectionalLSTM.weightParams: _lstmwp_to_fp16_lstmwp( lstm_wp=lstm_wp, nbits=nbits, qm=qm, i_size=layer.biDirectionalLSTM.inputVectorSize, o_size=layer.biDirectionalLSTM.outputVectorSize, has_peephole=layer.biDirectionalLSTM.params.hasPeepholeVectors, ) elif layer_type == "custom": print( "Skipping custom layer {}. Weights for this layer need to" "be converted manually".format(layer.name) ) elif layer_type == "branch": _quantize_nn_spec(layer.branch.ifBranch, nbits, qm, **kwargs) _quantize_nn_spec(layer.branch.elseBranch, nbits, qm, **kwargs) elif layer_type == "loop": _quantize_nn_spec(layer.loop.conditionNetwork, nbits, qm, **kwargs) _quantize_nn_spec(layer.loop.bodyNetwork, nbits, qm, **kwargs) else: raise Exception("Unknown layer " + layer_type + " to be quantized") def _quantize_spec_weights(spec, nbits, quantization_mode, **kwargs): nn_model_types = [ "neuralNetwork", "neuralNetworkClassifier", "neuralNetworkRegressor", ] model_type = spec.WhichOneof("Type") # Neural network models if model_type in nn_model_types: # Bump up to appropriate spec version if required if nbits == 16: spec.specificationVersion = max( _MINIMUM_FP16_SPEC_VERSION, spec.specificationVersion ) else: spec.specificationVersion = max( _MINIMUM_QUANTIZED_MODEL_SPEC_VERSION, spec.specificationVersion ) if spec.WhichOneof("Type") == "neuralNetwork": _quantize_nn_spec(spec.neuralNetwork, nbits, quantization_mode, **kwargs) elif spec.WhichOneof("Type") in "neuralNetworkClassifier": _quantize_nn_spec( spec.neuralNetworkClassifier, nbits, quantization_mode, **kwargs ) elif spec.WhichOneof("Type") in "neuralNetworkRegressor": _quantize_nn_spec( spec.neuralNetworkRegressor, nbits, quantization_mode, **kwargs ) # Recursively convert all pipeline models elif spec.WhichOneof("Type") == "pipeline": for model_spec in spec.pipeline.models: _quantize_spec_weights(model_spec, nbits, quantization_mode, **kwargs) elif spec.WhichOneof("Type") in ["pipelineClassifier", "pipelineRegressor"]: _quantize_spec_weights(spec.pipeline, nbits, quantization_mode, **kwargs) return spec def _load_and_resize_image(image_path, size): from PIL import Image img = Image.open(image_path) return img.resize(size, Image.LANCZOS) class TopKMetrics: def __init__(self, topk): self._topk = topk self._correct_count = 0 self._total_count = 0 def add_metric(self, output1, output2): self._total_count += 1 if self._topk == 1: if output1 == output2: self._correct_count += 1 else: self._topk = min(len(output1.keys()), self._topk) out1_topk = sorted(output1, key=output1.get, reverse=True)[: self._topk] out2_topk = sorted(output2, key=output2.get, reverse=True)[: self._topk] if out1_topk[0] in out2_topk: self._correct_count += 1 def display_metrics(self): pcorrect = (float(self._correct_count) / float(self._total_count)) * 100 pcorrect = _np.round(pcorrect, decimals=2) if self._topk == 1: print("Top 1 Agreement: {}%\n".format(pcorrect)) else: print("Top {} Agreement: {}%\n".format(self._topk, pcorrect)) class NoiseMetrics: def __init__(self): self._snr = [] self._psnr = [] @staticmethod def _compute_snr(arr1, arr2): noise = arr1 - arr2 noise_var = _np.sum(noise ** 2) / len(noise) + 1e-7 signal_energy = 
_np.sum(arr2 ** 2) / len(arr2) max_signal_energy = _np.amax(arr2 ** 2) snr = 10 * _np.log10(signal_energy / noise_var) psnr = 10 * _np.log10(max_signal_energy / noise_var) return snr, psnr def add_metric(self, output1, output2): import PIL # Output is Image if isinstance(output1, PIL.Image.Image): if output1.mode == "RGBA": output1 = output1.convert("RGB") output2 = output2.convert("RGB") arr1 = _np.array(output1).flatten() arr2 = _np.array(output2).flatten() snr, psnr = self._compute_snr(arr1, arr2) self._snr.append(snr) self._psnr.append(psnr) # Output is multiArray else: arr1 = output1.flatten() arr2 = output2.flatten() snr, psnr = self._compute_snr(arr1, arr2) self._snr.append(snr) self._psnr.append(psnr) def display_metrics(self): print("SNR: {} +/- {}".format(_np.mean(self._snr), _np.var(self._snr))) print("PSNR: {} +/- {}\n".format(_np.mean(self._psnr), _np.var(self._psnr))) class OutputMetric: """ Utility class to calculate and hold metrics between two model outputs """ def __init__(self, name, type): self.name = name self._metrics = [] if type == "stringType": self._metrics.append(TopKMetrics(topk=1)) elif type == "dictionaryType": self._metrics.append(TopKMetrics(topk=5)) elif type == "imageType" or type == "multiArrayType": self._metrics.append(NoiseMetrics()) else: raise Exception( """Unable to determine which metric to compute for output: {}""".format( name ) ) def add_metric(self, output1, output2): for metric in self._metrics: metric.add_metric(output1, output2) def display_metrics(self): for metric in self._metrics: metric.display_metrics() class ModelMetrics: """ A utility class to hold evaluation metrics """ def __init__(self, spec): self.model_metrics = {} for output in spec.description.output: output_type = output.type.WhichOneof("Type") self.model_metrics[output.name] = OutputMetric(output.name, output_type) def add_metrics(self, model1_output, model2_output): outputs = model1_output.keys() for output in outputs: self.model_metrics[output].add_metric( model1_output[output], model2_output[output] ) def display_metrics(self): for metric in self.model_metrics: print("Output {}:".format(metric)) dash = "----------" for x in range(0, len(metric)): dash += "-" print(dash) self.model_metrics[metric].display_metrics() def _characterize_qmodel_perf_with_data_dir(fpmodel, qspec, data_dir): supported_image_exts = ["jpg", "bmp", "png", "jpeg"] test_image_paths = [ "{}/{}".format(data_dir, fn) for fn in _listdir(data_dir) if any(fn.endswith(ext) for ext in supported_image_exts) ] if not test_image_paths: raise Exception( "{} contains no supported image files. 
" "Supported file types include jpg, bmp, png and jpeg.".format( data_dir ) ) qmodel = _get_model(qspec, compute_units=_ComputeUnit.CPU_ONLY) model_metrics = ModelMetrics(qspec) input_name = qspec.description.input[0].name input_size = ( qspec.description.input[0].type.imageType.width, qspec.description.input[0].type.imageType.height, ) print("\n\n") print("Analyzing {} images".format(len(test_image_paths))) print("Running Analysis this may take a while ...") print("\n") analyzed = 0 tried = 0 if fpmodel.compute_unit != _ComputeUnit.CPU_ONLY: fpmodel = _MLModel(fpmodel.get_spec(), compute_units=_ComputeUnit.CPU_ONLY) for image in test_image_paths: try: input = {input_name: _load_and_resize_image(image, input_size)} fp_pred = fpmodel.predict(input) q_pred = qmodel.predict(input) analyzed += 1 model_metrics.add_metrics(fp_pred, q_pred) except Exception as e: print(e) continue # Update Progress tried += 1 if tried % 10 == 0: _stdout.write("\r") _stdout.write("Analyzed {}/{}".format(tried, len(test_image_paths))) _stdout.flush() print("\n") model_metrics.display_metrics() def _characterize_quantized_model_perf(fpmodel, qspec, sample_data): qmodel = _get_model(qspec) model_metrics = ModelMetrics(qspec) print("\n\n") print("Analyzing {} samples".format(len(sample_data))) print("Running Analysis this may take a while ...") print("\n") analyzed = 0 tried = 0 fpmodel = _MLModel(fpmodel.get_spec(), compute_units=_ComputeUnit.CPU_ONLY) qmodel = _MLModel(qmodel.get_spec(), compute_units=_ComputeUnit.CPU_ONLY) for data in sample_data: try: fp_pred = fpmodel.predict(data) q_pred = qmodel.predict(data) analyzed += 1 model_metrics.add_metrics(fp_pred, q_pred) except Exception as e: print(e) continue # Update Progress tried += 1 if tried % 10 == 0: _stdout.write("\r") _stdout.write("Analyzed {}/{}".format(tried, len(sample_data))) _stdout.flush() print("\n") model_metrics.display_metrics() def compare_models(full_precision_model, quantized_model, sample_data): """ Utility function to compare the performance of a full precision vs quantized model full_precision_model: MLModel The full precision model with float32 weights quantized_model: MLModel Quantized version of the model with quantized weights sample_data: str | [dict] Data used to characterize performance of the quantized model in comparison to the full precision model. Either a list of sample input dictionaries or an absolute path to a directory containing images. Path to a directory containing images is only valid for models with one image input. For all other models a list of sample inputs must be provided. :return: None. Performance metrics are printed out """ emessage = """ Invalid sample data provided. Only a list of dictionaries containing sample data or path to a folder containing images is supported""" spec = full_precision_model.get_spec() num_inputs = len(spec.description.input) if isinstance(sample_data, str): input_type = spec.description.input[0].type.WhichOneof("Type") if num_inputs != 1 or input_type != "imageType": raise Exception( """Unable to analyze quantized models. Sample data was a path to a directory which is only supported with models with one image type input. Please try passing in a list of sample inputs as sample data. 
""" ) _characterize_qmodel_perf_with_data_dir( full_precision_model, quantized_model.get_spec(), sample_data ) elif isinstance(sample_data, list): if not all(type(d) is dict for d in sample_data): raise Exception(emessage) _characterize_quantized_model_perf( full_precision_model, quantized_model.get_spec(), sample_data ) else: raise Exception(emessage) def activate_int8_int8_matrix_multiplications(spec, selector=None): """ Utility function that takes in either a full precision (float) spec or an nbit quantized spec to selectively enable int8 activation + weight quantization of matrix multiplication operations where the second matrix represents a constant weight. spec: MLModel.get_spec() Currently conversion for only neural network models is supported. If a pipeline model is passed in then all embedded neural network models embedded within will be modified. selector: (optional) MatrixMultiplyLayerSelector A MatrixMultiplyLayerSelector object that enables int8 activation + weight quantization only on those layers for which the user-specified criterion on the minimum/maximum number of size/channels in constant weight parameters is met. It can also be derived to provide custom selection. """ # Recursively convert all pipeline models if spec.WhichOneof("Type") == "pipeline": for model_spec in spec.pipeline.models: activate_int8_int8_matrix_multiplications(model_spec, selector=selector) return spec elif spec.WhichOneof("Type") in ["pipelineClassifier", "pipelineRegressor"]: activate_int8_int8_matrix_multiplications(spec.pipeline, selector=selector) return spec # Neural network models elif spec.WhichOneof("Type") in [ "neuralNetwork", "neuralNetworkClassifier", "neuralNetworkRegressor", ]: if selector is None: selector = MatrixMultiplyLayerSelector() # Dequantize all the selected matrix multiplication layers spec = _quantize_spec_weights( spec, nbits=None, quantization_mode=_QUANTIZATION_MODE_DEQUANTIZE, selector=selector, ) def _quantized_weight_and_scale(W): W_max = max(_np.abs(_np.min(W)), _np.abs(_np.max(W))) W_normalized = W / W_max # [-1,1] W_quantized_int8 = 127.0 * W_normalized # [-127, 127] W_quantized_int8 = W_quantized_int8.astype(_np.int8) quant_scale = W_max / 127.0 return W_quantized_int8, quant_scale if spec.WhichOneof("Type") == "neuralNetwork": nn_spec = spec.neuralNetwork elif spec.WhichOneof("Type") in "neuralNetworkClassifier": nn_spec = spec.neuralNetworkClassifier elif spec.WhichOneof("Type") in "neuralNetworkRegressor": nn_spec = spec.neuralNetworkRegressor def _process_nn_layers(nn_spec): layers = nn_spec.layers # Replacing each matrix multiplication for layer in layers: layer_type = layer.WhichOneof("layer") if not selector.do_quantize(layer): continue if layer_type == "branch": _process_nn_layers(layer.branch.ifBranch) _process_nn_layers(layer.branch.elseBranch) elif layer_type == "loop": _process_nn_layers(layer.loop.conditionNetwork) _process_nn_layers(layer.loop.bodyNetwork) elif layer_type in ["innerProduct", "batchedMatmul"]: # Bump up to appropriate spec version if at least one replacement occurs spec.specificationVersion = max( _SPECIFICATION_VERSION_IOS_14, spec.specificationVersion, ) # InnerProduct if layer_type == "innerProduct": matmul_layer = layer.innerProduct # BatchedMatmul elif layer_type == "batchedMatmul": matmul_layer = layer.batchedMatmul wp = matmul_layer.weights if len(wp.floatValue) == 0: continue else: qw, qs = _quantized_weight_and_scale(wp.floatValue) print( "Modifying layer {} with size of weights {}, to use Int8 * Int8 matrix 
multiplication".format( layer.name, qw.size ) ) matmul_layer.int8DynamicQuantize = True wp.quantization.numberOfBits = 8 wp.quantization.linearQuantization.scale.extend(map(float, [qs])) wp.int8RawValue = bytes() wp.int8RawValue += qw.tobytes() del wp.floatValue[:] _process_nn_layers(nn_spec) return spec else: raise ValueError("Model Type {} not supported.".format(spec.WhichOneof("Type"))) def quantize_weights( full_precision_model, nbits, quantization_mode="linear", sample_data=None, **kwargs ): """ Utility function to convert a full precision (float) MLModel to a nbit quantized MLModel (float16). full_precision_model: MLModel Model which will be converted to half precision. Currently conversion for only neural network models is supported. If a pipeline model is passed in then all embedded neural network models embedded within will be converted. nbits: int Number of bits per quantized weight. Only 16-bit float point and 1-8 bit is supported quantization_mode: str One of the following: "linear": Linear quantization with scale and bias assuming the range of weight values is [A, B], where A = min(weight), B = max(weight) "linear_lut": Simple linear quantization represented as a lookup table "kmeans_lut": LUT based quantization, where LUT is generated by K-Means clustering "custom_lut": LUT quantization where LUT and quantized weight params are calculated using a custom function. If this mode is selected then a custom function must be passed in kwargs with key lut_function. The function must have input params (nbits, wp) where nbits is the number of quantization bits and wp is the list of weights for a given layer. The function should return two parameters (lut, qw) where lut is an array of length (2^n bits)containing LUT values and qw is the list of quantized weight parameters. See ``_get_linear_lookup_table_and_weight`` for a sample implementation. "linear_symmetric": Linear quantization with scale and bias assuming the range of weight values is [-A, A], where A = max(abs(weight)). sample_data: str | [dict] Data used to characterize performance of the quantized model in comparison to the full precision model. Either a list of sample input dictionaries or an absolute path to a directory containing images. Path to a directory containing images is only valid for models with one image input. For all other models a list of sample inputs must be provided. kwargs: keyword arguments *lut_function* : (``callable function``) A callable function provided when quantization mode is set to ``_QUANTIZATION_MODE_CUSTOM_LOOKUP_TABLE``. See ``quantization_mode`` for more details. *selector*: QuantizedLayerSelector A QuanatizedLayerSelector object that can be derived to provide custom quantization selection. Returns ------- model: MLModel The quantized MLModel instance if running on macOS 10.14 or later, otherwise the quantized model specification is returned Examples -------- .. 
sourcecode:: python import coremltools from coremltools.models.neural_network import quantization_utils model = coremltools.models.MLModel("my_model.mlmodel") quantized_model = quantization_utils.quantize_weights(model, 8, "linear") """ qmode_mapping = { "linear": _QUANTIZATION_MODE_LINEAR_QUANTIZATION, "kmeans": _QUANTIZATION_MODE_LOOKUP_TABLE_KMEANS, "kmeans_lut": _QUANTIZATION_MODE_LOOKUP_TABLE_KMEANS, "linear_lut": _QUANTIZATION_MODE_LOOKUP_TABLE_LINEAR, "custom_lut": _QUANTIZATION_MODE_CUSTOM_LOOKUP_TABLE, "dequantization": _QUANTIZATION_MODE_DEQUANTIZE, "linear_symmetric": _QUANTIZATION_MODE_LINEAR_SYMMETRIC, } try: qmode = qmode_mapping[quantization_mode] except KeyError: # kmeans is deprecated. Instead kmeans_lut is used. No need to show it. del qmode_mapping["kmeans"] raise Exception( "Invalid quantization mode. Quantization mode must be " "one of {}".format(qmode_mapping) ) print("Quantizing using {} quantization".format(quantization_mode)) spec = full_precision_model.get_spec() if nbits == 16 and spec.isUpdatable: raise Exception("updatable models cannot get quantized to FP16.") qspec = _quantize_spec_weights(spec, nbits, qmode, **kwargs) quantized_model = _get_model(qspec, compute_units=full_precision_model.compute_unit) if _macos_version() >= (10, 14) and sample_data: compare_models(full_precision_model, quantized_model, sample_data) return quantized_model ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/neural_network/spec_inspection_utils.py0000644000000000000000000002502414672066616026345 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ...proto import NeuralNetwork_pb2 as _NeuralNetwork_pb2 def _get_weight_param_summary(wp): """Get a summary of _NeuralNetwork_pb2.WeightParams Args: wp : _NeuralNetwork_pb2.WeightParams - the _NeuralNetwork_pb2.WeightParams message to display Returns: a str summary for wp """ summary_str = "" if wp.HasField("quantization"): nbits = wp.quantization.numberOfBits quant_type = ( "linearly" if wp.quantization.HasField("linearQuantization") else "lookup-table" ) summary_str += "{}-bit {} quantized".format(nbits, quant_type) if len(wp.floatValue) > 0: summary_str += "({} floatValues)".format(len(wp.floatValue)) if len(wp.float16Value) > 0: summary_str += "({} bytes float16Values)".format(len(wp.float16Value)) if len(wp.rawValue) > 0: summary_str += "({} bytes rawValues)".format(len(wp.rawValue)) return summary_str def _get_lstm_weight_param_summary(lstm_wp): weight_name_list = [ "W_i", "W_f", "W_z", "W_o", "H_i", "H_f", "H_z", "H_o", "b_i", "b_f", "b_z", "b_o", "p_i", "p_f", "p_o", ] wp_summary_list = [ _get_weight_param_summary(lstm_wp.inputGateWeightMatrix), _get_weight_param_summary(lstm_wp.forgetGateWeightMatrix), _get_weight_param_summary(lstm_wp.blockInputWeightMatrix), _get_weight_param_summary(lstm_wp.outputGateWeightMatrix), _get_weight_param_summary(lstm_wp.inputGateRecursionMatrix), _get_weight_param_summary(lstm_wp.forgetGateRecursionMatrix), _get_weight_param_summary(lstm_wp.blockInputRecursionMatrix), _get_weight_param_summary(lstm_wp.outputGateRecursionMatrix), _get_weight_param_summary(lstm_wp.inputGateBiasVector), _get_weight_param_summary(lstm_wp.forgetGateBiasVector), _get_weight_param_summary(lstm_wp.blockInputBiasVector), 
_get_weight_param_summary(lstm_wp.outputGateBiasVector), _get_weight_param_summary(lstm_wp.inputGatePeepholeVector), _get_weight_param_summary(lstm_wp.forgetGatePeepholeVector), _get_weight_param_summary(lstm_wp.outputGatePeepholeVector), ] lstm_wp_summary_list = [] for idx, summary in enumerate(wp_summary_list): if len(summary) > 0: lstm_wp_summary_list.append(weight_name_list[idx] + ", " + summary) return ("\n" + " " * 8).join(lstm_wp_summary_list) def _get_feature_description_summary(feature): if feature.type.HasField("multiArrayType"): shape = list(feature.type.multiArrayType.shape) int_shape = [int(x) for x in shape] return str(int_shape) else: return ("({})".format(str(feature.type))).replace("\n", "") def _summarize_network_layer_info(layer): """ Args: layer - an MLModel NeuralNetwork Layer protobuf message Returns: layer_type : str - type of layer layer_name : str - name of the layer layer_inputs : list[str] - a list of strings representing input blobs of the layer layer_outputs : list[str] - a list of strings representing output blobs of the layer layer_field_content : list[(str, str)] - a list of two-tuple of (parameter_name, content) """ layer_type_str = layer.WhichOneof("layer") layer_name = layer.name layer_inputs = list(layer.input) layer_outputs = list(layer.output) typed_layer = getattr(layer, layer_type_str) layer_field_names = [l.name for l in typed_layer.DESCRIPTOR.fields] layer_field_content = [] for name in layer_field_names: field = getattr(typed_layer, name) summary_str = "" if type(field) == _NeuralNetwork_pb2.LSTMWeightParams: summary_str = _get_lstm_weight_param_summary(field) elif type(field) == _NeuralNetwork_pb2.WeightParams: summary_str = _get_weight_param_summary(field) else: field_str = str(field) if len(field_str) > 0: summary_str = field_str.replace("\n", " ") if len(summary_str) > 0: layer_field_content.append([name, summary_str]) return layer_type_str, layer_name, layer_inputs, layer_outputs, layer_field_content def _summarize_neural_network_spec(mlmodel_spec): """ Summarize network into the following structure. Args: mlmodel_spec : mlmodel spec Returns: inputs : list[(str, str)] - a list of two tuple (name, descriptor) for each input blob. 
outputs : list[(str, str)] - a list of two tuple (name, descriptor) for each output blob layers : list[(str, list[str], list[str], list[(str, str)])] - a list of layers represented by layer name, input blobs, output blobs, a list of (parameter name, content) """ inputs = [ (blob.name, _get_feature_description_summary(blob)) for blob in mlmodel_spec.description.input ] outputs = [ (blob.name, _get_feature_description_summary(blob)) for blob in mlmodel_spec.description.output ] nn = None if mlmodel_spec.HasField("neuralNetwork"): nn = mlmodel_spec.neuralNetwork elif mlmodel_spec.HasField("neuralNetworkClassifier"): nn = mlmodel_spec.neuralNetworkClassifier elif mlmodel_spec.HasField("neuralNetworkRegressor"): nn = mlmodel_spec.neuralNetworkRegressor layers = ( [_summarize_network_layer_info(layer) for layer in nn.layers] if nn is not None else None ) return (inputs, outputs, layers) def _prRed(skk, end=None): print("\033[91m {}\033[00m".format(skk), end=end) def _prLightPurple(skk, end=None): print("\033[94m {}\033[00m".format(skk), end=end) def _prPurple(skk, end=None): print("\033[95m {}\033[00m".format(skk), end=end) def _prGreen(skk, end=None): print("\033[92m {}\033[00m".format(skk), end=end) def _print_layer_type_and_arguments( layer_type_str, layer_inputs, indentation, to_indent=True, shape=None, value=None ): if to_indent: _prRed(indentation * "\t" + "{}".format(layer_type_str), end="") else: _prRed("{}".format(layer_type_str), end="") if shape is None: _prLightPurple("({})".format(", ".join(layer_inputs))) elif value is not None: _prLightPurple("(shape = ", end="") print("{}, ".format(str(shape)), end="") _prLightPurple("value = ", end="") values = ",".join(["{0: 0.1f}".format(v) for v in value]).lstrip() print("[{}]".format(values), end="") _prLightPurple(")") else: _prLightPurple("(shape = ", end="") print("{}".format(str(shape)), end="") _prLightPurple(")") def _find_size(arr): s = 1 for a in arr: s *= a return s def _summarize_neural_network_spec_code_style( nn_spec, indentation=0, input_names=None, output_names=None ): """ print nn_spec as if writing code """ indentation_size = 1 if input_names: print("def model({}):".format(", ".join(input_names))) indentation += indentation_size for i, layer in enumerate(nn_spec.layers): layer_type_str = layer.WhichOneof("layer") layer_inputs = list(layer.input) layer_outputs = list(layer.output) if layer_type_str == "loop": if len(layer.loop.conditionNetwork.layers) > 0: _prPurple(indentation * "\t" + "Condition Network: ") _summarize_neural_network_spec_code_style( layer.loop.conditionNetwork, indentation=indentation ) if layer.loop.conditionVar: layer_inputs.append(layer.loop.conditionVar) _print_layer_type_and_arguments(layer_type_str, layer_inputs, indentation) indentation += indentation_size _summarize_neural_network_spec_code_style( layer.loop.bodyNetwork, indentation=indentation ) if len(layer.loop.conditionNetwork.layers) > 0: _prPurple(indentation * "\t" + "Condition Network: ") _summarize_neural_network_spec_code_style( layer.loop.conditionNetwork, indentation=indentation ) indentation -= indentation_size continue if layer_type_str == "branch": _print_layer_type_and_arguments(layer_type_str, layer_inputs, indentation) _prRed(indentation * "\t" + "IfBranch:") indentation += indentation_size _summarize_neural_network_spec_code_style( layer.branch.ifBranch, indentation=indentation ) indentation -= indentation_size if len(layer.branch.elseBranch.layers) > 0: _prRed(indentation * "\t" + "ElseBranch:") indentation += indentation_size 
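                # Summarize the else-branch sub-network one level deeper,
                # then restore the indentation level afterwards.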
_summarize_neural_network_spec_code_style( layer.branch.elseBranch, indentation=indentation ) indentation -= indentation_size continue if layer_type_str == "loopBreak" or layer_type_str == "loopContinue": _prRed(indentation * "\t" + layer_type_str) continue shape = None value = None if layer_type_str == "loadConstant": shape = layer.loadConstant.shape shape = list(shape) int_shape = [int(x) for x in shape] shape = tuple([1, 1] + int_shape) size = _find_size(shape) if size < 4 and len(layer.loadConstant.data.floatValue) > 0: value = map(float, list(layer.loadConstant.data.floatValue)) if layer_type_str == "loadConstantND": shape = layer.loadConstantND.shape shape = tuple(map(int, list(shape))) size = _find_size(shape) if size < 4 and len(layer.loadConstantND.data.floatValue) > 0: value = map(float, list(layer.loadConstantND.data.floatValue)) print(indentation * "\t", end="") print("{} =".format(", ".join(layer_outputs)), end="") _print_layer_type_and_arguments( layer_type_str, layer_inputs, indentation, to_indent=False, shape=shape, value=value, ) if output_names: _prRed("\n" + indentation * "\t" + "return ", end="") print("{}".format(", ".join(output_names))) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/neural_network/update_optimizer_utils.py0000644000000000000000000001124714672066616026546 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause """ Neural Network optimizer utilities. """ class AdamParams: """ Adam - A Method for Stochastic Optimization. Attributes ---------- lr: float The learning rate that controls learning step size. Adjustable in progress, default: 0.01. batch: int The mini-batch size, number of examples used to compute single gradient step, default: 10. beta1: float Controls the exponential decay rate for the first moment estimates, default: 0.9. beta2: float Controls the exponential decay rate for the second moment estimates, default: 0.999. eps: float The epsilon, a very small number to prevent any division by zero in the implementation, default: 1e-8. Methods ------- set_lr(value, min, max) Set value for learning rate. set_batch(value, allow_set) Set value for batch size. set_beta1(value, min, max) Set value for beta1. set_beta2(value, min, max) Set value for beta2. set_eps(value, min, max) Set value for epsilon. """ def __init__(self, lr=1e-2, batch=10, beta1=0.9, beta2=0.999, eps=1e-8): self._lr = RangeParam(lr) self._batch = Batch(batch) self._beta1 = RangeParam(beta1) self._beta2 = RangeParam(beta2) self._eps = RangeParam(eps) def set_lr(self, value, min, max): self._lr = RangeParam(value, min, max) def set_batch(self, value, allowed_set): self._batch = Batch(value, allowed_set) def set_beta1(self, value, min, max): self._beta1 = RangeParam(value, min, max) def set_beta2(self, value, min, max): self._beta2 = RangeParam(value, min, max) def set_eps(self, value, min, max): self._eps = RangeParam(value, min, max) @property def lr(self): return self._lr @property def batch(self): return self._batch @property def beta1(self): return self._beta1 @property def beta2(self): return self._beta2 @property def eps(self): return self._eps class SgdParams: """ SGD - Stochastic Gradient Descent optimizer. Attributes ---------- lr: float The learning rate that controls learning step size. 
Adjustable in progress, default: 0.01. batch: int The mini-batch size, number of examples used to compute single gradient step, default: 10. momentum: float The momentum factor that helps accelerate gradients vectors in the right direction, default 0. Methods ------- set_lr(value, min, max) Set value for learning rate. set_batch(value, allow_set) Set value for batch size. set_momentum(value, min, max) Set value for momentum. """ def __init__(self, lr=1e-2, batch=10, momentum=0): self._lr = RangeParam(lr) self._batch = Batch(batch) self._momentum = RangeParam(momentum) def set_lr(self, value, min, max): self._lr = RangeParam(value, min, max) def set_batch(self, value, allowed_set): self._batch = Batch(value, allowed_set) def set_momentum(self, value, min, max): self._momentum = RangeParam(value, min, max) @property def lr(self): return self._lr @property def batch(self): return self._batch @property def momentum(self): return self._momentum class RangeParam: """ Range Parameter optimizer. Attributes ---------- value: float min: float max: float """ def __init__(self, value, min=0, max=1): self._value = value if min >= max: raise ValueError("min value must be less than max value.") self._min = min self._max = max @property def value(self): return self._value @property def min(self): return self._min @property def max(self): return self._max class Batch: """ Batch optimizer. Attributes ---------- value: float allowed_set: float """ def __init__(self, value, allowed_set=None): self._value = value if allowed_set is None: self._allowed_set = [value] else: if len(allowed_set) > len(set(allowed_set)): raise ValueError("values in allowed_set must be unique.") self._allowed_set = allowed_set @property def value(self): return self._value @property def allowed_set(self): return self._allowed_set ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/neural_network/utils.py0000644000000000000000000000757714672066616023115 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy as _copy from coremltools.models.utils import _get_model from .builder import NeuralNetworkBuilder def make_image_input( model, input_name, is_bgr=False, red_bias=0.0, blue_bias=0.0, green_bias=0.0, gray_bias=0.0, scale=1.0, image_format="NHWC", ): """ Convert input of type multiarray to type image Parameters ---------- TODO Returns ------- model: MLModel A coreML MLModel object Examples -------- TODO """ spec = model.get_spec() if spec.WhichOneof("Type") not in [ "neuralNetwork", "neuralNetworkClassifier", "neuralNetworkRegressor", ]: raise ValueError( "Provided model must be of type neuralNetwork, neuralNetworkClassifier or neuralNetworkRegressor" ) if not isinstance(input_name, list): input_name = [input_name] spec_inputs = [i.name for i in spec.description.input] for name in input_name: if name not in spec_inputs: msg = "Provided input_name: {}, is not an existing input to the model" raise ValueError(msg.format(name)) builder = NeuralNetworkBuilder(spec=spec) builder.set_pre_processing_parameters( image_input_names=input_name, is_bgr=is_bgr, red_bias=red_bias, green_bias=green_bias, blue_bias=blue_bias, gray_bias=gray_bias, image_scale=scale, image_format=image_format, ) return _get_model(spec) def make_nn_classifier( model, class_labels, predicted_feature_name=None, predicted_probabilities_output=None, ): """ Convert a model of type "neuralNetwork" to type "neuralNetworkClassifier" Parameters ---------- TODO Returns ------- model: MLModel A coreML MLModel object Examples -------- TODO """ spec = model.get_spec() if spec.WhichOneof("Type") != "neuralNetwork": raise ValueError('Provided model must be of type "neuralNetwork"') # convert type to "neuralNetworkClassifier" and copy messages from "neuralNetwork" nn_spec = _copy.deepcopy(spec.neuralNetwork) spec.ClearField("neuralNetwork") for layer in nn_spec.layers: spec.neuralNetworkClassifier.layers.add().CopyFrom(layer) for preprocessing in nn_spec.preprocessing: spec.neuralNetworkClassifier.preprocessing.add().CopyFrom(preprocessing) spec.neuralNetworkClassifier.arrayInputShapeMapping = nn_spec.arrayInputShapeMapping spec.neuralNetworkClassifier.imageInputShapeMapping = nn_spec.imageInputShapeMapping spec.neuralNetworkClassifier.updateParams.CopyFrom(nn_spec.updateParams) # set properties related to classifier builder = NeuralNetworkBuilder(spec=spec) message = "Class labels must be a list of integers / strings or a file path" classes_in = class_labels if isinstance(classes_in, str): import os if not os.path.isfile(classes_in): raise ValueError("Path to class labels (%s) does not exist." % classes_in) with open(classes_in, "r") as f: classes = f.read() classes = classes.splitlines() elif isinstance(classes_in, list): # list[int or str] classes = classes_in assert all([isinstance(x, \ (int, str)) for x in classes]), message else: raise ValueError(message) kwargs = {} if predicted_feature_name is not None: kwargs["predicted_feature_name"] = predicted_feature_name if predicted_probabilities_output is not None: kwargs["prediction_blob"] = predicted_probabilities_output builder.set_class_labels(classes, **kwargs) return _get_model(spec) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/pipeline.py0000644000000000000000000002524414672066616020512 0ustar00rootroot# Copyright (c) 2017, Apple Inc. 
All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause """ Pipeline utils for this package. """ from .. import SPECIFICATION_VERSION as _SPECIFICATION_VERSION from ..proto import Model_pb2 as _Model_pb2 from . import _feature_management from . import model as _model from ._interface_management import (set_classifier_interface_params, set_regressor_interface_params, set_training_features, set_transform_interface_params) class Pipeline: """ A pipeline model that exposes a sequence of models as a single model, It requires a set of inputs, a sequence of other models and a set of outputs. This class is the base class for :py:class:`PipelineClassifier` and :py:class:`PipelineRegressor`, which contain a sequence ending in a classifier or regressor and themselves behave like a classifier or regressor. This class may be used directly for a sequence of feature transformer objects. """ def __init__(self, input_features, output_features, training_features=None): """ Create a pipeline of models to be executed sequentially. Parameters ---------- input_features: [list of 2-tuples] Name(s) of the input features, given as a list of `('name', datatype)` tuples. The datatypes entry can be any of the data types defined in the :py:mod:`models.datatypes` module. output_features: [list of features] Name(s) of the output features, given as a list of `('name',datatype)` tuples. The datatypes entry can be any of the data types defined in the :py:mod:`models.datatypes` module. All features must be either defined in the inputs or be produced by one of the contained models. """ spec = _Model_pb2.Model() spec.specificationVersion = _SPECIFICATION_VERSION # Access this to declare it as a pipeline spec.pipeline spec = set_transform_interface_params( spec, input_features, output_features, training_features ) # Save the spec as a member variable. self.spec = spec def _validate_updatable_pipeline_on_add_model(self, spec): if spec.isUpdatable: raise ValueError( "New sub-models cannot be added after the pipeline has been marked as updatable" ) def add_model(self, spec): """ Add a protobuf spec or :py:class:`models.MLModel` instance to the pipeline. All input features of this model must either match the input_features of the pipeline, or match the outputs of a previous model. Parameters ---------- spec: [MLModel, Model_pb2] A protobuf spec or MLModel instance containing a model. """ self._validate_updatable_pipeline_on_add_model(self.spec) if isinstance(spec, _model.MLModel): spec = spec._spec pipeline = self.spec.pipeline step_spec = pipeline.models.add() step_spec.CopyFrom(spec) def _validate_sub_models_and_make_updatable(self, pipeline, spec): num_models = len(pipeline.models) if num_models < 1: raise ValueError( "Pipeline does not seem to have any models. It should be marked as updatable only after adding all sub-models." ) for model in pipeline.models[:-1]: if model.isUpdatable: raise ValueError( "Only the last model can be updatable in an updatable pipeline." ) last_model = pipeline.models[num_models - 1] if not last_model.isUpdatable: raise ValueError( "A pipeline can be made updatable only if the last model is updatable." ) spec.isUpdatable = True def make_updatable(self): self._validate_sub_models_and_make_updatable(self.spec.pipeline, self.spec) def set_training_input(self, training_input): """ Set the training inputs of the network spec. 
Parameters ---------- training_input: [tuple] List of training input names and type of the network. """ spec = self.spec set_training_features(spec, training_input) class PipelineRegressor(Pipeline): """ A pipeline model that exposes a sequence of models as a single model, It requires a set of inputs, a sequence of other models and a set of outputs. In this case the pipeline itself behaves as a regression model by designating a real valued output feature as its 'predicted feature'. """ def __init__(self, input_features, output_features, training_features=None): """ Create a set of pipeline models given a set of model specs. The final output model must be a regression model. Parameters ---------- input_features: [list of 2-tuples] Name(s) of the input features, given as a list of `('name', datatype)` tuples. The datatypes entry can be any of the data types defined in the :py:mod:`models.datatypes` module. output_features: [list of features] Name(s) of the output features, given as a list of `('name',datatype)` tuples. The datatypes entry can be any of the data types defined in the :py:mod:`models.datatypes` module. All features must be either defined in the inputs or be produced by one of the contained models. """ spec = _Model_pb2.Model() spec.specificationVersion = _SPECIFICATION_VERSION # Access this to declare it as a pipeline spec.pipelineRegressor spec = set_regressor_interface_params( spec, input_features, output_features, training_features ) # Save as a member variable self.spec = spec def add_model(self, spec): """ Add a protobuf spec or :py:class:`models.MLModel` instance to the pipeline. All input features of this model must either match the input_features of the pipeline, or match the outputs of a previous model. Parameters ---------- spec: [MLModel, Model_pb2] A protobuf spec or MLModel instance containing a model. """ super()._validate_updatable_pipeline_on_add_model(self.spec) if isinstance(spec, _model.MLModel): spec = spec._spec pipeline = self.spec.pipelineRegressor.pipeline step_spec = pipeline.models.add() step_spec.CopyFrom(spec) def make_updatable(self): super()._validate_sub_models_and_make_updatable( self.spec.pipelineRegressor.pipeline, self.spec ) def set_training_input(self, training_input): """ Set the training inputs of the network spec. Parameters ---------- training_input: [tuple] List of training input names and type of the network. """ spec = self.spec set_training_features(spec, training_input) class PipelineClassifier(Pipeline): """ A pipeline model that exposes a sequence of models as a single model, It requires a set of inputs, a sequence of other models and a set of outputs. In this case the pipeline itself behaves as a classification model by designating a discrete categorical output feature as its 'predicted feature'. """ def __init__( self, input_features, class_labels, output_features=None, training_features=None ): """ Create a set of pipeline models given a set of model specs. The last model in this list must be a classifier model. Parameters ---------- input_features: [list of 2-tuples] Name(s) of the input features, given as a list of `('name', datatype)` tuples. The datatypes entry can be any of the data types defined in the :py:mod:`models.datatypes` module. class_labels: [list] A list of string or integer class labels to use in making predictions. 
This list must match the class labels in the model outputting the categorical predictedFeatureName output_features: [list] A string or a list of two strings specifying the names of the two output features, the first being a class label corresponding to the class with the highest predicted score, and the second being a dictionary mapping each class to its score. If `output_features` is a string, it specifies the predicted class label and the class scores is set to the default value of `"classProbability."` """ output_features = _feature_management.process_or_validate_classifier_output_features( output_features, class_labels ) spec = _Model_pb2.Model() spec.specificationVersion = _SPECIFICATION_VERSION spec = set_classifier_interface_params( spec, input_features, class_labels, "pipelineClassifier", output_features, training_features, ) # Access this to declare it as a pipeline spec.pipelineClassifier # Save as a member variable self.spec = spec def add_model(self, spec): """ Add a protobuf spec or :py:class:`models.MLModel` instance to the pipeline. All input features of this model must either match the input_features of the pipeline, or match the outputs of a previous model. Parameters ---------- spec: [MLModel, Model_pb2] A protobuf spec or MLModel instance containing a model. """ super()._validate_updatable_pipeline_on_add_model(self.spec) if isinstance(spec, _model.MLModel): spec = spec._spec pipeline = self.spec.pipelineClassifier.pipeline step_spec = pipeline.models.add() step_spec.CopyFrom(spec) def make_updatable(self): super(PipelineClassifier, self)._validate_sub_models_and_make_updatable( self.spec.pipelineClassifier.pipeline, self.spec ) def set_training_input(self, training_input): """ Set the training inputs of the network spec. Parameters ---------- training_input: [tuple] List of training input names and type of the network. """ spec = self.spec set_training_features(spec, training_input) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/tree_ensemble.py0000644000000000000000000003662414672066616021522 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause """ Tree ensemble builder class to construct CoreML models. """ import collections as _collections from .. import SPECIFICATION_VERSION as _SPECIFICATION_VERSION from ..proto import Model_pb2 as _Model_pb2 from ..proto import TreeEnsemble_pb2 as _TreeEnsemble_pb2 from ._interface_management import (set_classifier_interface_params, set_regressor_interface_params) class TreeEnsembleBase: """ Base class for the tree ensemble builder class. This should be instantiated either through the :py:class:`TreeEnsembleRegressor` or :py:class:`TreeEnsembleClassifier` classes. """ def __init__(self): """ High level Python API to build a tree ensemble model for Core ML. """ # Set inputs and outputs spec = _Model_pb2.Model() spec.specificationVersion = _SPECIFICATION_VERSION # Save the spec in the protobuf self.spec = spec def set_default_prediction_value(self, values): """ Set the default prediction value(s). The values given here form the base prediction value that the values at activated leaves are added to. If values is a scalar, then the output of the tree must also be 1 dimensional; otherwise, values must be a list with length matching the dimension of values in the tree. 
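For example, a tree ensemble with a two-dimensional prediction (such as a two-class classifier using the softmax transform) would typically call ``set_default_prediction_value([0.0, 0.0])``, as shown in the :py:class:`TreeEnsembleClassifier` examples below.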
Parameters ---------- values: [int | double | list[double]] Default values for predictions. """ if type(values) is not list: values = [float(values)] self.tree_parameters.numPredictionDimensions = len(values) for value in values: self.tree_parameters.basePredictionValue.append(value) def set_post_evaluation_transform(self, value): r""" Set the post processing transform applied after the prediction value from the tree ensemble. Parameters ---------- value: str A value denoting the transform applied. Possible values are: - ``"NoTransform"`` (default). Do not apply a transform. - ``"Classification_SoftMax"``. Apply a softmax function to the outcome to produce normalized, non-negative scores that sum to 1. The transformation applied to dimension `i` is equivalent to: .. math:: \frac{e^{x_i}}{\sum_j e^{x_j}} Note: This is the output transformation applied by the XGBoost package with multiclass classification. - ``"Regression_Logistic"``. Applies a logistic transform the predicted value, specifically: .. math:: (1 + e^{-v})^{-1} This is the transformation used in binary classification. """ self.tree_spec.postEvaluationTransform = _TreeEnsemble_pb2.TreeEnsemblePostEvaluationTransform.Value( value ) def add_branch_node( self, tree_id, node_id, feature_index, feature_value, branch_mode, true_child_id, false_child_id, relative_hit_rate=None, missing_value_tracks_true_child=False, ): """ Add a branch node to the tree ensemble. Parameters ---------- tree_id: int ID of the tree to add the node to. node_id: int ID of the node within the tree. feature_index: int Index of the feature in the input being split on. feature_value: double or int The value used in the feature comparison determining the traversal direction from this node. branch_mode: str Branch mode of the node, specifying the condition under which the node referenced by ``true_child_id`` is called next. Must be one of the following: - ``"BranchOnValueLessThanEqual"``. Traverse to node ``true_child_id`` if ``input[feature_index] <= feature_value``, and ``false_child_id`` otherwise. - ``"BranchOnValueLessThan"``. Traverse to node ``true_child_id`` if ``input[feature_index] < feature_value``, and ``false_child_id`` otherwise. - ``"BranchOnValueGreaterThanEqual"``. Traverse to node ``true_child_id`` if ``input[feature_index] >= feature_value``, and ``false_child_id`` otherwise. - ``"BranchOnValueGreaterThan"``. Traverse to node ``true_child_id`` if ``input[feature_index] > feature_value``, and ``false_child_id`` otherwise. - ``"BranchOnValueEqual"``. Traverse to node ``true_child_id`` if ``input[feature_index] == feature_value``, and ``false_child_id`` otherwise. - ``"BranchOnValueNotEqual"``. Traverse to node ``true_child_id`` if ``input[feature_index] != feature_value``, and ``false_child_id`` otherwise. true_child_id: int ID of the child under the true condition of the split. An error will be raised at model validation if this does not match the ``node_id`` of a node instantiated by ``add_branch_node`` or ``add_leaf_node`` within this ``tree_id``. false_child_id: int ID of the child under the false condition of the split. An error will be raised at model validation if this does not match the ``node_id`` of a node instantiated by ``add_branch_node`` or ``add_leaf_node`` within this ``tree_id``. relative_hit_rate: float [optional] When the model is converted compiled by CoreML, this gives hints to Core ML about which node is more likely to be hit on evaluation, allowing for additional optimizations. 
The values can be on any scale, with the values between child nodes being compared relative to each other. missing_value_tracks_true_child: bool [optional] If the training data contains NaN values or missing values, then this flag determines which direction a NaN value traverses. """ spec_node = self.tree_parameters.nodes.add() spec_node.treeId = tree_id spec_node.nodeId = node_id spec_node.branchFeatureIndex = int(feature_index) spec_node.branchFeatureValue = feature_value spec_node.trueChildNodeId = true_child_id spec_node.falseChildNodeId = false_child_id spec_node.nodeBehavior = _TreeEnsemble_pb2.TreeEnsembleParameters.TreeNode.TreeNodeBehavior.Value( branch_mode ) if relative_hit_rate is not None: spec_node.relativeHitRate = relative_hit_rate spec_node.missingValueTracksTrueChild = missing_value_tracks_true_child def add_leaf_node(self, tree_id, node_id, values, relative_hit_rate=None): """ Add a leaf node to the tree ensemble. Parameters ---------- tree_id: int ID of the tree to add the node to. node_id: int ID of the node within the tree. values: [float | int | list | dict] Value(s) at the leaf node to add to the prediction when this node is activated. If the prediction dimension of the tree is 1, then the value is specified as a float or integer value. For multidimensional predictions, the values can be a list of numbers with length matching the dimension of the predictions or a dictionary mapping index to value added to that dimension. Note that the dimension of any tree must match the dimension given when :py:meth:`set_default_prediction_value` is called. """ spec_node = self.tree_parameters.nodes.add() spec_node.treeId = tree_id spec_node.nodeId = node_id spec_node.nodeBehavior = _TreeEnsemble_pb2.TreeEnsembleParameters.TreeNode.TreeNodeBehavior.Value( "LeafNode" ) if not isinstance(values, _collections.abc.Iterable): values = [values] if relative_hit_rate is not None: spec_node.relativeHitRate = relative_hit_rate if type(values) == dict: iter = values.items() else: iter = enumerate(values) for index, value in iter: ev_info = spec_node.evaluationInfo.add() ev_info.evaluationIndex = index ev_info.evaluationValue = float(value) spec_node.nodeBehavior = _TreeEnsemble_pb2.TreeEnsembleParameters.TreeNode.TreeNodeBehavior.Value( "LeafNode" ) class TreeEnsembleRegressor(TreeEnsembleBase): """ Tree Ensemble builder class to construct a Tree Ensemble regression model. The TreeEnsembleRegressor class constructs a Tree Ensemble model incrementally using methods to add branch and leaf nodes specifying the behavior of the model. Examples -------- In the following example, the code saves the model to disk, which is a recommended practice but not required. .. sourcecode:: python >>> # Required inputs >>> import coremltools >>> from coremltools.models import datatypes >>> from coremltools.models.tree_ensemble import TreeEnsembleRegressor >>> import numpy as np >>> # Define input features >>> input_features = [("a", datatypes.Array(3)), ("b", (datatypes.Double()))] >>> # Define output_features >>> output_features = [("predicted_values", datatypes.Double())] >>> tm = TreeEnsembleRegressor(features = input_features, target = output_features) >>> # Split on a[2] <= 3 >>> tm.add_branch_node(0, 0, 2, 3, "BranchOnValueLessThanEqual", 1, 2) >>> # Add leaf to the true branch of node 0 that subtracts 1. 
>>> tm.add_leaf_node(0, 1, -1) >>> # Add split on b == 0 to the false branch of node 0, which is index 3 >>> tm.add_branch_node(0, 2, 3, 0, "BranchOnValueEqual", 3, 4) >>> # Add leaf to the true branch of node 2 that adds 1 to the result. >>> tm.add_leaf_node(0, 3, 1) >>> # Add leaf to the false branch of node 2 that subtracts 1 from the result. >>> tm.add_leaf_node(0, 4, -1) >>> tm.set_default_prediction_value([0, 0]) >>> # save the model to a .mlmodel file >>> model_path = './tree.mlmodel' >>> coremltools.models.utils.save_spec(tm.spec, model_path) >>> # load the .mlmodel >>> mlmodel = coremltools.models.MLModel(model_path) >>> # make predictions >>> test_input = { >>> 'a': np.array([0, 1, 2]).astype(np.float32), >>> "b": 3.0, >>> } >>> predictions = mlmodel.predict(test_input) """ def __init__(self, features, target): """ Create a Tree Ensemble regression model that takes one or more input features and maps them to an output feature. Parameters ---------- features: [list of features] Name(s) of the input features, given as a list of ``('name', datatype)`` tuples. The features are one of ``models.datatypes.Int64``, ``datatypes.Double``, or ``models.datatypes.Array``. Feature indices in the nodes are counted sequentially from 0 through the features. target: (default = None) Name of the target feature predicted. """ super().__init__() spec = self.spec spec = set_regressor_interface_params(spec, features, target) self.tree_spec = spec.treeEnsembleRegressor self.tree_parameters = self.tree_spec.treeEnsemble class TreeEnsembleClassifier(TreeEnsembleBase): """ Tree Ensemble builder class to construct a Tree Ensemble classification model. The TreeEnsembleClassifier class constructs a Tree Ensemble model incrementally using methods to add branch and leaf nodes specifying the behavior of the model. Examples -------- In the following example, the code saves the model to disk, which is a recommended practice but not required. .. sourcecode:: python >>> input_features = [("a", datatypes.Array(3)), ("b", datatypes.Double())] >>> tm = TreeEnsembleClassifier(features = input_features, class_labels = [0, 1], output_features = "predicted_class") >>> # Split on a[2] <= 3 >>> tm.add_branch_node(0, 0, 2, 3, "BranchOnValueLessThanEqual", 1, 2) >>> # Add leaf to the true branch of node 0 that subtracts 1. >>> tm.add_leaf_node(0, 1, -1) >>> # Add split on b == 0 to the false branch of node 0. >>> tm.add_branch_node(0, 2, 3, 0, "BranchOnValueEqual", 3, 4) >>> # Add leaf to the true branch of node 2 that adds 1 to the result. >>> tm.add_leaf_node(0, 3, 1) >>> # Add leaf to the false branch of node 2 that subtracts 1 from the result. >>> tm.add_leaf_node(0, 4, -1) >>> # Put in a softmax transform to translate these into probabilities. >>> tm.set_post_evaluation_transform("Classification_SoftMax") >>> tm.set_default_prediction_value([0, 0]) >>> # save the model to a .mlmodel file >>> model_path = './tree.mlmodel' >>> coremltools.models.utils.save_spec(tm.spec, model_path) >>> # load the .mlmodel >>> mlmodel = coremltools.models.MLModel(model_path) >>> # make predictions >>> test_input = { >>> 'a': np.array([0, 1, 2]).astype(np.float32), >>> "b": 3.0, >>> } >>> predictions = mlmodel.predict(test_input) """ def __init__(self, features, class_labels, output_features): """ Create a tree ensemble classifier model. Parameters ---------- features: [list of features] Name(s) of the input features, given as a list of ``('name', datatype)`` tuples. 
The features are one of ``models.datatypes.Int64``, ``datatypes.Double``, or ``models.datatypes.Array``. Feature indices in the nodes are counted sequentially from 0 through the features. class_labels: [list] A list of string or integer class labels to use in making predictions. The length of this must match the dimension of the tree model. output_features: [list] A string or a list of two strings specifying the names of the two output features, the first being a class label corresponding to the class with the highest predicted score, and the second being a dictionary mapping each class to its score. If ``output_features`` is a string, it specifies the predicted class label and the class scores is set to the default value of ``"classProbability"``. """ super().__init__() spec = self.spec spec = set_classifier_interface_params( spec, features, class_labels, "treeEnsembleClassifier", output_features ) self.tree_spec = spec.treeEnsembleClassifier self.tree_parameters = self.tree_spec.treeEnsemble ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/models/utils.py0000644000000000000000000023313614672066616020046 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause """ Utilities for the entire package. """ from collections import OrderedDict as _OrderedDict import copy as _copy import gc as _gc import math as _math import os as _os import shutil as _shutil import subprocess as _subprocess import sys as _sys import tempfile as _tempfile import warnings as _warnings from collections.abc import Iterable as _Iterable from functools import lru_cache as _lru_cache from typing import Callable as _Callable from typing import Dict as _Dict from typing import List as _List from typing import Optional as _Optional from typing import Tuple as _Tuple from typing import Union as _Union import numpy as _np import coremltools as _ct from coremltools import _logger from coremltools import _SPECIFICATION_VERSION_IOS_16, _SPECIFICATION_VERSION_IOS_18 from coremltools import ComputeUnit as _ComputeUnit from coremltools import proto as _proto from coremltools.converters.mil import mil as _mil from coremltools.converters.mil.frontend.milproto import load as _milproto_to_pymil from coremltools.converters.mil.mil import Builder as _mb from coremltools.converters.mil.mil import Program as _Program from coremltools.converters.mil.mil.passes.defs.preprocess import NameSanitizer as _NameSanitizer from coremltools.converters.mil.mil.passes.defs.randomize import ( WeightRandomizer as _WeightRandomizer, ) from coremltools.converters.mil.mil.passes.graph_pass import AbstractGraphPass as _AbstractGraphPass from coremltools.converters.mil.mil.passes.helper import block_context_manager as _block_context_manager from coremltools.converters.mil.mil.passes.pass_pipeline import ( PassPipelineManager as _PassPipelineManager, ) from coremltools.converters.mil.mil.passes.pass_registry import PASS_REGISTRY as _PASS_REGISTRY from coremltools.converters.mil.mil.program import Placeholder as _Placeholder from .._deps import _HAS_SCIPY _MLMODEL_EXTENSION = ".mlmodel" _MLPACKAGE_EXTENSION = ".mlpackage" _MODEL_FILE_NAME = 'model.mlmodel' _WEIGHTS_FILE_NAME = 'weight.bin' _WEIGHTS_DIR_NAME = 'weights' _MLPACKAGE_AUTHOR_NAME = "com.apple.CoreML" try: from ..libmodelpackage import ModelPackage as 
_ModelPackage except: _ModelPackage = None if _HAS_SCIPY: import scipy.sparse as _sp def _to_unicode(x): if isinstance(x, bytes): return x.decode() else: return x def _remove_invalid_keys(input_dict, model): # make sure that input_dict does not contain an input name, which # is not present in the list of model inputs input_dict_keys = list(input_dict.keys()) model_input_names = set([inp.name for inp in model._spec.description.input]) for k in input_dict_keys: if k not in model_input_names: del input_dict[k] def _create_mlpackage( proto_spec: _proto.Model_pb2, weights_dir: _Optional[str] = None, package_path: _Optional[str] = None, ) -> str: """ Parameters ---------- proto_spec The proto spec of the model. weights_dir Copy weights from this path to the ``mlpackage``. package_path Place the created ``mlpackage`` at this path. Error out if this path is a non-empty directory. Returns ------- path to the ``mlpackage``. """ if package_path is None: package_path = _tempfile.mkdtemp(suffix=_MLPACKAGE_EXTENSION) if _os.path.exists(package_path): if _os.listdir(package_path): raise FileExistsError( f"The package_path is invalid because it's a non-empty directory: {package_path}" ) # If package_path is an empty dir, the ModelPackage load will error out with `manifest.json not found` issue. _shutil.rmtree(package_path) _, ext = _os.path.splitext(package_path) if ext != _MLPACKAGE_EXTENSION: raise Exception( f"For an ML Package, extension must be {_MLPACKAGE_EXTENSION} (not {ext})" ) package = _ModelPackage(package_path) # Save proto to disk as the root model file, and copy into the model package. spec_file = _tempfile.NamedTemporaryFile(suffix=_MLMODEL_EXTENSION) spec_file.write(proto_spec.SerializeToString()) spec_file.flush() package.setRootModel(spec_file.name, _MODEL_FILE_NAME, _MLPACKAGE_AUTHOR_NAME, "CoreML Model Specification") # Spec file is auto cleaned after close, which is fine because it is already added to the model package. spec_file.close() # Add weights bundle into the model package. if weights_dir is not None: package.addItem( weights_dir, _WEIGHTS_DIR_NAME, _MLPACKAGE_AUTHOR_NAME, "CoreML Model Weights", ) return package_path def save_spec(spec, filename, auto_set_specification_version=False, weights_dir=None): """ Save a protobuf model specification to file. Parameters ---------- spec: Model_pb Protobuf representation of the model. filename: str File path where the spec is saved. auto_set_specification_version: bool If ``True``, will always try to set specification version automatically. weights_dir: str Path to the directory containing the weights.bin file. This is required when the spec has model type ``mlprogram``. If the ``mlprogram`` does not contain any weights, this path can be an empty directory. Examples -------- .. 
sourcecode:: python coremltools.utils.save_spec(spec, "HousePricer.mlmodel") coremltools.utils.save_spec(spec, "HousePricer.mlpackage") coremltools.utils.save_spec( spec, "mlprogram_model.mlpackage", weights_dir="/path/to/weights/directory" ) See Also -------- load_spec """ name, ext = _os.path.splitext(filename) is_package = False if not ext: filename = "{}{}".format(filename, _MLMODEL_EXTENSION) elif ext == _MLPACKAGE_EXTENSION: is_package = True elif ext == _MLMODEL_EXTENSION: is_package = False else: raise Exception("Extension must be {} or {} (not {})".format(_MLMODEL_EXTENSION, _MLPACKAGE_EXTENSION, ext)) if auto_set_specification_version: try: # always try to downgrade the specification version to the # minimal version that supports everything in this mlmodel from ..libcoremlpython import _MLModelProxy spec = _MLModelProxy.auto_set_specification_version(spec) except Exception as e: print(e) _warnings.warn( "Failed to automatic set specification version for this model.", RuntimeWarning, ) if is_package: if _ModelPackage is None: raise Exception( "Unable to load libmodelpackage. Cannot save spec" ) if spec.WhichOneof('Type') == "mlProgram" and weights_dir is None: raise Exception('spec of type mlProgram cannot be saved without the' ' weights file. Please provide the path to the weights file as well, ' 'using the \'weights_dir\' argument.') _create_mlpackage(spec, weights_dir=weights_dir, package_path=filename) else: with open(filename, "wb") as f: f.write(spec.SerializeToString()) def load_spec(model_path: str) -> _proto.Model_pb2: """ Load a protobuf model specification from file (``mlmodel``) or directory (``mlpackage``). Parameters ---------- model_path: Path to the model from which the protobuf spec is loaded. Returns ------- model_spec: Model_pb Protobuf representation of the model. Examples -------- .. sourcecode:: python spec = coremltools.utils.load_spec("HousePricer.mlmodel") spec = coremltools.utils.load_spec("HousePricer.mlpackage") See Also -------- save_spec """ if _os.path.isdir(model_path): if _ModelPackage is None: raise Exception("Unable to load libmodelpackage. Cannot make save spec.") specfile = _ModelPackage(model_path).getRootModel().path() else: specfile = model_path spec = _proto.Model_pb2.Model() with open(specfile, "rb") as f: spec.ParseFromString(f.read()) return spec def _get_nn_layers(spec): """ Returns a list of neural network layers if the model contains any. Parameters ---------- spec: Model_pb A model protobuf specification. Returns ------- [NN layer] list of all layers (including layers from elements of a pipeline). 
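Examples
--------
A minimal sketch (``_get_nn_layers`` is an internal helper, so this is
illustrative only):

.. sourcecode:: python

    spec = mlmodel.get_spec()
    layers = _get_nn_layers(spec)
    print([layer.name for layer in layers])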
""" layers = [] if spec.WhichOneof("Type") == "pipeline": layers = [] for model_spec in spec.pipeline.models: if not layers: return _get_nn_layers(model_spec) else: layers.extend(_get_nn_layers(model_spec)) elif spec.WhichOneof("Type") in ["pipelineClassifier", "pipelineRegressor"]: layers = [] for model_spec in spec.pipeline.models: if not layers: return _get_nn_layers(model_spec) else: layers.extend(_get_nn_layers(model_spec)) elif spec.neuralNetwork.layers: layers = spec.neuralNetwork.layers elif spec.neuralNetworkClassifier.layers: layers = spec.neuralNetworkClassifier.layers elif spec.neuralNetworkRegressor.layers: layers = spec.neuralNetworkRegressor.layers return layers def _fp32_to_reversed_fp16_byte_array(fp32_arr): raw_fp16 = _np.float16(fp32_arr) x = "" for fp16 in raw_fp16: all_bytes = _np.fromstring(fp16.tobytes(), dtype="int8") x += all_bytes[1].tobytes() x += all_bytes[0].tobytes() return x def _fp32_to_fp16_byte_array(fp32_arr): if _np.amax(fp32_arr) >= 65504 or _np.amin(fp32_arr) <= -65504: raise Exception( "Model cannot be converted as " "it has weights that cannot be represented in " "half precision.\n" ) if _sys.byteorder == "little": return _np.float16(fp32_arr).tobytes() else: return _fp32_to_reversed_fp16_byte_array(fp32_arr) def _wp_to_fp16wp(wp): assert wp # If the float32 field is empty do nothing. if len(wp.floatValue) == 0: return wp.float16Value = _fp32_to_fp16_byte_array(wp.floatValue) del wp.floatValue[:] def _convert_neural_network_spec_weights_to_fp16(fp_spec): from .neural_network.quantization_utils import ( _QUANTIZATION_MODE_LINEAR_QUANTIZATION, _quantize_spec_weights, ) qspec = _quantize_spec_weights(fp_spec, 16, _QUANTIZATION_MODE_LINEAR_QUANTIZATION) return qspec def _convert_neural_network_weights_to_fp16(full_precision_model): """ Utility function to convert a full-precision (float) MLModel to a half-precision MLModel (float16). Parameters ---------- full_precision_model: MLModel Model which will be converted to half precision. Currently conversion for only neural network models is supported. If a pipeline model is passed in, then all embedded neural network models embedded within will be converted. Returns ------- model: MLModel The converted half precision MLModel. """ spec = full_precision_model.get_spec() return _get_model(_convert_neural_network_spec_weights_to_fp16(spec)) def _get_model(spec, compute_units=_ComputeUnit.ALL): """ Utility to get the model and the data. """ from . import MLModel if isinstance(spec, MLModel): return spec else: return MLModel(spec, compute_units=compute_units) def evaluate_regressor(model, data, target="target", verbose=False): """ Evaluate a Core ML regression model and compare against predictions from the original framework (for testing correctness of conversion). Parameters ---------- model: MLModel or str A loaded MLModel or a path to a saved MLModel. data: Dataframe Test data on which to evaluate the models. target: str Name of the column in the dataframe to be compared against the prediction. verbose: bool Set to true for a more verbose output. See Also -------- evaluate_classifier Examples -------- .. 
sourcecode:: python metrics = coremltools.utils.evaluate_regressor( spec, "data_and_predictions.csv", "target" ) print(metrics) {"samples": 10, "rmse": 0.0, max_error: 0.0} """ model = _get_model(model) if verbose: print("") print("Other Framework\t\tPredicted\t\tDelta") max_error = 0 error_squared = 0 for _, row in data.iterrows(): input_dict = dict(row) _remove_invalid_keys(input_dict, model) predicted = model.predict(input_dict)[_to_unicode(target)] other_framework = row[target] delta = predicted - other_framework if verbose: print("{}\t\t\t\t{}\t\t\t{:0.4f}".format(other_framework, predicted, delta)) max_error = max(abs(delta), max_error) error_squared = error_squared + (delta * delta) ret = { "samples": len(data), "rmse": _math.sqrt(error_squared / len(data)), "max_error": max_error, } if verbose: print("results: {}".format(ret)) return ret def evaluate_classifier(model, data, target="target", verbose=False): """ Evaluate a Core ML classifier model and compare against predictions from the original framework (for testing correctness of conversion). Use this evaluation for models that don't deal with probabilities. Parameters ---------- filename: list of str or list of MLModel File to load the model from, or a loaded version of the MLModel. data: list of str or list of Dataframe Test data on which to evaluate the models (dataframe, or path to a CSV file). target: str Column to interpret as the target column. verbose: bool Set to true for more verbose output. See Also -------- evaluate_regressor, evaluate_classifier_with_probabilities Examples -------- .. sourcecode:: python metrics = coremltools.utils.evaluate_classifier( spec, "data_and_predictions.csv", "target" ) print(metrics) {"samples": 10, num_errors: 0} """ model = _get_model(model) if verbose: print("") print("Other Framework\t\tPredicted") num_errors = 0 for _, row in data.iterrows(): input_dict = dict(row) _remove_invalid_keys(input_dict, model) predicted = model.predict(input_dict)[_to_unicode(target)] other_framework = row[target] if predicted != other_framework: num_errors += 1 if verbose: print("{}\t\t\t\t{}".format(other_framework, predicted)) ret = {"num_samples": len(data), "num_errors": num_errors} if verbose: print("results: {}".format(ret)) return ret def evaluate_classifier_with_probabilities( model, data, probabilities="probabilities", verbose=False ): """ Evaluate a classifier specification for testing. Parameters ---------- filename: [str | Model] File to load the model from, or a loaded version of the MLModel. data: [str | Dataframe] Test data on which to evaluate the models (dataframe, or path to a CSV file). probabilities: str Column to interpret as the probabilities column. verbose: bool Verbosity levels of the predictions. 
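See Also
--------
evaluate_regressor, evaluate_classifier

Examples
--------
A call mirroring ``evaluate_classifier``; the CSV path and the
``"probabilities"`` column name below are placeholders:

.. sourcecode:: python

    metrics = coremltools.utils.evaluate_classifier_with_probabilities(
        spec, "data_and_predictions.csv", "probabilities"
    )
    print(metrics)
    # {"num_samples": 10, "max_probability_error": 0.0, "num_key_mismatch": 0}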
""" model = _get_model(model) if verbose: print("") print("Other Framework\t\tPredicted") max_probability_error, num_key_mismatch = 0, 0 for _, row in data.iterrows(): input_dict = {k: v for k, v in dict(row).items() if k != probabilities} _remove_invalid_keys(input_dict, model) predicted_values = model.predict(input_dict)[_to_unicode(probabilities)] other_values = row[probabilities] if set(predicted_values.keys()) != set(other_values.keys()): if verbose: print( "Different classes: ", str(predicted_values.keys()), str(other_values.keys()), ) num_key_mismatch += 1 continue for cur_class, cur_predicted_class_values in predicted_values.items(): delta = cur_predicted_class_values - other_values[cur_class] if verbose: print(delta, cur_predicted_class_values, other_values[cur_class]) max_probability_error = max(abs(delta), max_probability_error) if verbose: print("") ret = { "num_samples": len(data), "max_probability_error": max_probability_error, "num_key_mismatch": num_key_mismatch, } if verbose: print("results: {}".format(ret)) return ret def rename_feature( spec, current_name, new_name, rename_inputs=True, rename_outputs=True ): """ Rename a feature in the specification. Parameters ---------- spec: Model_pb The specification containing the feature to rename. current_name: str Current name of the feature. If this feature doesn't exist, the rename is a no-op. new_name: str New name of the feature. rename_inputs: bool Search for ``current_name`` only in the input features (that is, ignore output features). rename_outputs: bool Search for ``current_name`` only in the output features (that is, ignore input features). Examples -------- .. sourcecode:: python # In-place rename of spec model = MLModel("model.mlmodel") spec = model.get_spec() coremltools.utils.rename_feature(spec, "old_feature", "new_feature_name") # re-initialize model model = MLModel(spec) model.save("model.mlmodel") # Rename a spec when the model is an mlprogram, in that case, weights are stored outside of the spec model = coremltools.convert(torch_model, convert_to="mlprogram") spec = model.get_spec() # print info about inputs and outputs print(spec.description) coremltools.utils.rename_feature(spec, "old_feature", "new_feature_name") # re-initialize model model = MLModel(spec, weights_dir=model.weights_dir) model.save("model.mlpackage") """ if not rename_inputs and not rename_outputs: return changed_input = False changed_output = False if rename_inputs: for input in spec.description.input: if input.name == current_name: input.name = new_name changed_input = True if rename_outputs: for output in spec.description.output: if output.name == current_name: output.name = new_name changed_output = True if spec.description.predictedFeatureName == current_name: spec.description.predictedFeatureName = new_name if spec.description.predictedProbabilitiesName == current_name: spec.description.predictedProbabilitiesName = new_name if not changed_input and not changed_output: return # Rename internally in NN model nn = None for nn_type in [ "neuralNetwork", "neuralNetworkClassifier", "neuralNetworkRegressor", ]: if spec.HasField(nn_type): nn = getattr(spec, nn_type) if nn is not None: for layer in nn.layers: if rename_inputs: for index, name in enumerate(layer.input): if name == current_name: layer.input[index] = new_name if rename_outputs: for index, name in enumerate(layer.output): if name == current_name: layer.output[index] = new_name if rename_inputs: for preprocess_params in nn.preprocessing: if preprocess_params.featureName == 
current_name: preprocess_params.featureName = new_name if spec.HasField("neuralNetworkClassifier"): if nn.labelProbabilityLayerName == current_name: nn.labelProbabilityLayerName = new_name # Rename internally for feature vectorizer if spec.HasField("featureVectorizer") and rename_inputs: for input in spec.featureVectorizer.inputList: if input.inputColumn == current_name: input.inputColumn = new_name changed_input = True # Rename for pipeline models pipeline = None if spec.HasField("pipeline"): pipeline = spec.pipeline elif spec.HasField("pipelineClassifier"): pipeline = spec.pipelineClassifier.pipeline elif spec.HasField("pipelineRegressor"): pipeline = spec.pipelineRegressor.pipeline if pipeline is not None: for index, model in enumerate(pipeline.models): rename_feature( model, current_name, new_name, rename_inputs or (index != 0), rename_outputs or (index < len(spec.pipeline.models)), ) # Rename for mlProgram if spec.HasField("mlProgram"): new_name_sanitized = _NameSanitizer().sanitize_name(new_name) if new_name != new_name_sanitized: raise ValueError("Input/output names for ML Program must be of the format [a-zA-Z_][a-zA-Z0-9_]*. " "That is, it must start with a letter and only contain numerals, underscore or letters. " "Provided feature name, \"{}\" does not satisfy these requirements.".format(new_name)) mil = spec.mlProgram for function in mil.functions.values(): for name_value_type in function.inputs: if name_value_type.name == current_name: name_value_type.name = new_name for block in function.block_specializations.values(): for i, out_name in enumerate(block.outputs): if out_name == current_name: block.outputs[i] = new_name for op in block.operations: for argument in op.inputs.values(): for binding in argument.arguments: if binding.HasField("name"): if binding.name == current_name: binding.name = new_name for name_value_type in op.outputs: if name_value_type.name == current_name: name_value_type.name = new_name def _sanitize_value(x): """ Performs cleaning steps on the data so various type comparisons can be performed correctly. """ if isinstance(x, (str, int, float,)): return x elif _HAS_SCIPY and _sp.issparse(x): return x.todense() elif isinstance(x, _np.ndarray): return x elif isinstance(x, tuple): return (_sanitize_value(v) for v in x) elif isinstance(x, list): return [_sanitize_value(v) for v in x] elif isinstance(x, dict): return dict((_sanitize_value(k), _sanitize_value(v)) for k, v in x.items()) else: assert False, str(x) def _element_equal(x, y): """ Performs a robust equality test between elements. """ if isinstance(x, _np.ndarray) or isinstance(y, _np.ndarray): try: return (abs(_np.asarray(x) - _np.asarray(y)) < 1e-5).all() except: return False elif isinstance(x, dict): return ( isinstance(y, dict) and _element_equal(x.keys(), y.keys()) and all(_element_equal(x[k], y[k]) for k in x.keys()) ) elif isinstance(x, float): return abs(x - y) < 1e-5 * (abs(x) + abs(y)) elif isinstance(x, (list, tuple)): return x == y else: return bool(x == y) def evaluate_transformer(model, input_data, reference_output, verbose=False): """ Evaluate a transformer specification for testing. Parameters ---------- model: list of str or list of MLModel File to load the Model from, or a loaded version of the MLModel. input_data: list of dict Test data on which to evaluate the models. reference_output: list of dict Expected results for the model. verbose: bool Verbosity levels of the predictions. Examples -------- .. 
sourcecode:: python input_data = [{"input_1": 1, "input_2": 2}, {"input_1": 3, "input_2": 3}] expected_output = [{"input_1": 2.5, "input_2": 2.0}, {"input_1": 1.3, "input_2": 2.3}] metrics = coremltools.utils.evaluate_transformer( scaler_spec, input_data, expected_output ) See Also -------- evaluate_regressor, evaluate_classifier """ model = _get_model(model) if verbose: print(model) print("") print("Other Framework\t\tPredicted") num_errors = 0 for index, row in enumerate(input_data): assert isinstance(row, dict) sanitized_row = _sanitize_value(row) ref_data = _sanitize_value(reference_output[index]) if verbose: print("Input:\n\t", str(row)) print("Correct output:\n\t", str(ref_data)) predicted = _sanitize_value(model.predict(sanitized_row)) assert isinstance(ref_data, dict) assert isinstance(predicted, dict) predicted_trimmed = dict((k, predicted[k]) for k in ref_data.keys()) if verbose: print("Predicted:\n\t", str(predicted_trimmed)) if not _element_equal(predicted_trimmed, ref_data): num_errors += 1 ret = {"num_samples": len(input_data), "num_errors": num_errors} if verbose: print("results: {}".format(ret)) return ret def _has_custom_layer(spec): """ Returns true if the given protobuf specification has a custom layer, and false otherwise. Parameters ---------- spec: mlmodel spec Returns ------- ``True`` if the protobuf specification contains a neural network with a custom layer, ``False`` otherwise. """ layers = _get_nn_layers(spec) for layer in layers: if layer.WhichOneof("layer") == "custom": return True return False def _get_custom_layer_names(spec): """ Returns a list of ``className`` fields which appear in the given protobuf spec. Parameters ---------- spec: mlmodel spec Returns ------- set(str) A set of unique ``className`` fields of custom layers that appear in the model. """ layers = _get_nn_layers(spec) layers_out = set() for layer in layers: if layer.WhichOneof("layer") == "custom": layers_out.add(layer.custom.className) return layers_out def _get_custom_layers(spec): """ Returns a list of all neural network custom layers in the spec. Parameters ---------- spec: mlmodel spec Returns ------- [NN layer] A list of custom layer implementations. """ layers = _get_nn_layers(spec) layers_out = [] for layer in layers: if layer.WhichOneof("layer") == "custom": layers_out.append(layer) return layers_out def _replace_custom_layer_name(spec, oldname, newname): """ Substitutes ``newname`` for ``oldname`` in the ``className`` field of custom layers. If there are no custom layers, or no layers with ``className`` = ``oldname``, then the spec is unchanged. Parameters ---------- spec: mlmodel spec oldname: str The custom layer ``className`` to be replaced. newname: str The new ``className`` value to replace ``oldname``. Returns ------- An mlmodel spec. """ layers = _get_custom_layers(spec) for layer in layers: if layer.custom.className == oldname: layer.custom.className = newname def _is_macos(): """Returns True if current platform is MacOS, False otherwise.""" return _sys.platform == "darwin" @_lru_cache() def _macos_version(): """ Returns macOS version as a tuple of integers, making it easy to do proper version comparisons. On non-Macs, it returns an empty tuple. 
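Examples
--------
.. sourcecode:: python

    # Version tuples compare element-wise, so version checks read naturally.
    if _is_macos() and _macos_version() >= (13, 0):
        ...  # running on macOS 13 or newer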
""" if _is_macos(): try: ver_str = _subprocess.run(["sw_vers", "-productVersion"], stdout=_subprocess.PIPE).stdout.decode('utf-8').strip('\n') return tuple([int(v) for v in ver_str.split(".")]) except: raise Exception("Unable to determine the macOS version") return () def _python_version(): """ Return python version as a tuple of integers """ version = _sys.version.split(" ")[0] version = list(map(int, list(version.split(".")))) return tuple(version) def _get_feature(spec, feature_name): for input_feature in spec.description.input: if input_feature.name == feature_name: return input_feature for output_feature in spec.description.output: if output_feature.name == feature_name: return output_feature raise Exception("Feature with name {} does not exist".format(feature_name)) def _get_input_names(spec): """ Returns a list of the names of the inputs to this model. :param spec: The model protobuf specification :return: list of str A list of input feature names """ retval = [feature.name for feature in spec.description.input] return retval def convert_double_to_float_multiarray_type(spec): """ Convert all double multiarrays feature descriptions (input, output, training input) to float multiarrays. Parameters ---------- spec: Model_pb The specification containing the multiarrays types to convert. Examples -------- .. sourcecode:: python # In-place convert multiarray type of spec spec = mlmodel.get_spec() coremltools.utils.convert_double_to_float_multiarray_type(spec) model = coremltools.models.MLModel(spec) """ def _convert_to_float(feature): if feature.type.HasField("multiArrayType"): if feature.type.multiArrayType.dataType == _proto.Model_pb2.ArrayFeatureType.DOUBLE: feature.type.multiArrayType.dataType = _proto.Model_pb2.ArrayFeatureType.FLOAT32 for feature in spec.description.input: _convert_to_float(feature) for feature in spec.description.output: _convert_to_float(feature) for feature in spec.description.trainingInput: _convert_to_float(feature) if spec.WhichOneof("Type") == "pipeline": for model_spec in spec.pipeline.models: convert_double_to_float_multiarray_type(model_spec) def compile_model(model: _proto.Model_pb2.Model, destination_path: _Optional[str] = None) -> str: """ Compiles a Core ML model spec. Parameters ---------- model: Model_pb2 Spec/protobuf to compile. Note: an ``mlprogam`` which uses a blob file is not supported. destination_path: str Path where the compiled model will be saved. Returns ------- str : Path to compiled model directory If the ``destination_path`` is specified, that is the value that will be returned. Examples -------- .. 
sourcecode:: python from coremltools.models import CompiledMLModel from coremltools.models.utils import compile_model from coremltools.proto import Model_pb2 spec = Model_pb2.Model() spec.specificationVersion = 1 input_ = spec.description.input.add() input_.name = "x" input_.type.doubleType.MergeFromString(b"") output_ = spec.description.output.add() output_.name = "y" output_.type.doubleType.MergeFromString(b"") spec.description.predictedFeatureName = "y" lr = spec.glmRegressor lr.offset.append(0.1) weights = lr.weights.add() weights.value.append(2.0) compiled_model_path = compile_model(spec) model = CompiledMLModel(compiled_model_path) y = model.predict({"x": 2}) See Also -------- coremltools.models.CompiledMLModel """ # Check environment if _macos_version() < (10, 13): raise Exception("Compiling a Core ML models is only support on macOS 10.13 or higher.") try: from ..libcoremlpython import _MLModelProxy except: raise Exception("Unable to compile any Core ML models.") # Check model parameter if isinstance(model, str): raise TypeError("To get a compiled model from a saved MLModel, first load the model, " " then call \"get_compiled_model_path\".") if isinstance(model, _ct.models.MLModel): raise TypeError("This model has already been compiled. Call \"get_compiled_model_path\"" " to get the compiled model.") if not isinstance(model, _proto.Model_pb2.Model): raise TypeError("Unrecognized input for \"model\" parameter. It should be a spec.") # Check file extension of destination_path parameter if destination_path is not None and not destination_path.rstrip('/').endswith(".mlmodelc"): raise Exception("\"destination_path\" parameter must have \".mlmodelc\" file extension.") # Compile model with _tempfile.TemporaryDirectory() as save_dir: spec_file_path = save_dir + '/spec.mlmodel' save_spec(model, spec_file_path) original_compiled_model_path = _MLModelProxy.compileModel(spec_file_path) # Move the compiled model if needed if destination_path is None: return original_compiled_model_path _shutil.move(original_compiled_model_path, destination_path) return destination_path def make_pipeline( *models: '_ct.models.MLModel', compute_units: _Union[None, _ct.ComputeUnit] = None ) -> '_ct.models.MLModel': """ Makes a pipeline with the given models. Parameters ---------- *models : Two or more instances of ``ct.models.MLModel``. compute_units : The set of processing units that all models in the pipeline can use to make predictions. Can be ``None`` or ``coremltools.ComputeUnit``. * If ``None``, the ``compute_unit`` will be inferred from the ``compute_unit`` values of the models. If all models do not have the same ``compute_unit`` values, this parameter must be specified. * ``coremltools.ComputeUnit`` is an enum with four possible values: - ``coremltools.ComputeUnit.ALL``: Use all compute units available, including the neural engine. - ``coremltools.ComputeUnit.CPU_ONLY``: Limit the model to only use the CPU. - ``coremltools.ComputeUnit.CPU_AND_GPU``: Use both the CPU and GPU, but not the neural engine. - ``coremltools.ComputeUnit.CPU_AND_NE``: Use both the CPU and neural engine, but not the GPU. Available only for macOS >= 13.0. Returns ------- ct.models.MLModel Examples -------- .. 
sourcecode:: python my_model1 = ct.models.MLModel("/tmp/m1.mlpackage") my_model2 = ct.models.MLModel("/tmp/m2.mlmodel") my_pipeline_model = ct.utils.make_pipeline(my_model1, my_model2) y = my_pipeline_model.predict({"x": 12}) my_pipeline_model.save("/tmp/my_pipeline.mlpackage") new_my_pipeline = ct.model.MLModel("/tmp/my_pipeline.mlpackage") """ def updateBlobFileName(proto_message, new_path): if type(proto_message) == _proto.MIL_pb2.Value: # Value protobuf message. This is what might need to be updated. if proto_message.WhichOneof('value') == 'blobFileValue': assert proto_message.blobFileValue.fileName == "@model_path/weights/weight.bin" proto_message.blobFileValue.fileName = new_path elif hasattr(proto_message, 'ListFields'): # Normal protobuf message for f in proto_message.ListFields(): updateBlobFileName(f[1], new_path) elif hasattr(proto_message, 'values'): # Protobuf map for v in proto_message.values(): updateBlobFileName(v, new_path) elif isinstance(proto_message, _Iterable) and not isinstance(proto_message, str): # Repeated protobuf message for e in proto_message: updateBlobFileName(e, new_path) assert len(models) > 1 if compute_units is not None and not isinstance(compute_units, _ComputeUnit): raise TypeError('"compute_units" parameter must be None or of type coremltools.ComputeUnit') if compute_units is None: all_compute_units_the_same = all(map( lambda m: models[0].compute_unit is m.compute_unit, models[1:] )) if not all_compute_units_the_same: raise ValueError( 'Models have different compute_unit values. The "compute_units" parameter must be specified.' ) compute_units = models[0].compute_unit if (compute_units == _ComputeUnit.CPU_AND_NE and _is_macos() and _macos_version() < (13, 0) ): raise ValueError( 'coremltools.ComputeUnit.CPU_AND_NE is only available on macOS >= 13.0' ) input_specs = list(map(lambda m: m.get_spec(), models)) pipeline_spec = _ct.proto.Model_pb2.Model() pipeline_spec.specificationVersion = max( map(lambda spec: spec.specificationVersion, input_specs) ) # If a later model doesn't get an input from a previous model, it must be # an input to the pipeline. available_as_input = set() for cur_spec in input_specs: for cur_input in cur_spec.description.input: if cur_input.name not in available_as_input: pipeline_spec.description.input.add().MergeFrom(cur_input) available_as_input.add(cur_input.name) available_as_input.update([i.name for i in cur_spec.description.output]) # If an output for a model is not used as input for a later model, assume it # should be an output to the pipeline. 
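    # For example, with models A: {x} -> {y} and B: {y} -> {z}, "y" is consumed
    # by B and stays internal to the pipeline, while "z" is never consumed and
    # becomes a pipeline output. Iterating over the specs in reverse makes this
    # a single pass, since every later consumer has already been recorded.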
used_as_input = set() for cur_spec in input_specs[::-1]: # iterate overs specs in reverse for cur_output in cur_spec.description.output: if cur_output.name not in used_as_input: pipeline_spec.description.output.add().MergeFrom(cur_output) used_as_input.update([i.name for i in cur_spec.description.input]) # Map input shapes to output shapes var_name_to_type = {} for i in range(len(input_specs) - 1): for j in input_specs[i + 1].description.input: var_name_to_type[j.name] = j.type for j in input_specs[i].description.output: # If shape is already present, don't override it if j.type.WhichOneof('Type') == 'multiArrayType' and len(j.type.multiArrayType.shape) != 0: continue if j.name in var_name_to_type: j.type.CopyFrom(var_name_to_type[j.name]) # Update each model's spec to have a unique weight filename for i, cur_spec in enumerate(input_specs): if cur_spec.WhichOneof("Type") == "mlProgram": new_file_path = f"@model_path/weights/{i}-weight.bin" updateBlobFileName(cur_spec.mlProgram, new_file_path) pipeline_spec.pipeline.models.append(cur_spec) mlpackage_path = _create_mlpackage(pipeline_spec) dst = mlpackage_path + '/Data/' + _MLPACKAGE_AUTHOR_NAME + '/' + _WEIGHTS_DIR_NAME _os.mkdir(dst) # Copy and rename each model's weight file for i, cur_model in enumerate(models): if cur_model.weights_dir is not None: weight_file_path = cur_model.weights_dir + "/" + _WEIGHTS_FILE_NAME if _os.path.exists(weight_file_path): _shutil.copyfile(weight_file_path, dst + f"/{i}-weight.bin") return _ct.models.MLModel(pipeline_spec, compute_units=compute_units, weights_dir=dst) def _convert_model_spec_to_pymil_prog( mlmodel: "_ct.models.MLModel", specification_version: int, pymil_load_func: _Callable, ) -> _Program: """ A utility that converts an ``mlprogram`` model into PyMIL program. """ model_spec = mlmodel.get_spec() model_type = model_spec.WhichOneof("Type") if model_type in ( "neuralNetwork", "neuralNetworkClassifier", "neuralNetworkRegressor", "pipeline", "PipelineClassifier", "PipelineRegressor", ): msg = ( "coremltools.optimize.coreml are meant to be used only with mlprogram typed coreml models. " "This model has type {}. Please use coremltools.models.neural_network.quantization_utils.quantize_weights" "instead to compress the weights of the model." ) raise TypeError(msg.format(model_type)) elif model_type == "mlProgram": pass else: raise TypeError("weight compression not applicable for model type {}".format(model_type)) prog = pymil_load_func( model_spec=model_spec, specification_version=specification_version, file_weights_dir=mlmodel.weights_dir, ) return prog def _apply_graph_pass( mlmodel: "_ct.models.MLModel", graph_pass: _AbstractGraphPass, spec_version: int = _SPECIFICATION_VERSION_IOS_16, skip_model_load: _Optional[bool] = None, pymil_load_func: _Callable = _milproto_to_pymil.load, return_pymil_prog: bool = False, ) -> _Union["_ct.models.MLModel", _Program]: # We do the lazy import to prevent circular import from coremltools.converters.mil.converter import mil_convert as _mil_convert if skip_model_load is None: # Determine if skip the model load by the original mlmodel. skip_model_load = mlmodel.__proxy__ is None # Utility function which compresses a Core ML model # Converts the full precision mlmodel into a pymil program model_spec = mlmodel.get_spec() specification_version = max(model_spec.specificationVersion, spec_version) prog = _convert_model_spec_to_pymil_prog(mlmodel, specification_version, pymil_load_func) # Apply graph pass. 
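    # The pass is applied to the PyMIL program in place; the (possibly modified)
    # `prog` is then either returned directly (when return_pymil_prog=True) or
    # converted back into an mlprogram MLModel below.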
assert isinstance( graph_pass, _AbstractGraphPass ), f"graph pass must be an AbstractGraphPass instance, but got {type(graph_pass)}" graph_pass.apply(prog) # An early return can prevent running all other optimization paths triggered by _mil_convert. if return_pymil_prog: return prog # Convert the pymil program back to mlmodel compressed_mlmodel = _mil_convert( prog, convert_to="mlprogram", convert_from="milinternal", specification_version=specification_version, compute_units=mlmodel.compute_unit, model_description=model_spec.description, skip_model_load=skip_model_load, ) return compressed_mlmodel def _try_get_weights_dir_path(mlpackage_path): """ Try to find the weights in mlpackage and return the path to the weights directory if found. Return None if not found. :param mlpackage_path: str, path to the mlpackage directory :return: path to the weights directory inside the mlpackage directory """ weights_dir = None try: if _ModelPackage.isValid(mlpackage_path): item_info = _ModelPackage(mlpackage_path).findItemByNameAuthor( _WEIGHTS_DIR_NAME, _MLPACKAGE_AUTHOR_NAME ) if item_info is not None: weights_dir = item_info.path() except: pass return weights_dir class MultiFunctionDescriptor: """ This data class defines how to construct a multifunction model from different model sources. Use the ``add_function`` method to specify the path to the source ``mlpackage``, along with the source and target function names. After setting the ``default_function_name`` to the ``MultiFunctionDescriptor`` instance, you can export a multifunction model using the ``save_multifunction`` method. Examples -------- .. sourcecode:: python from coremltools.utils import MultiFunctionDescriptor, save_multifunction # Initialize a MultiFunctionDescriptor instance with functions in an existing mlpackage. # desc will contain all functions in "my_model.mlpackage" desc = MultiFunctionDescriptor("my_model.mlpackage") # Construct a MultiFunctionDescriptor instance from scratch. # The below code inserts the "main" function from "my_model.mlpackage" as "main_1", # and inserts the "main" function from "my_model_2.mlpackage" as "main_2". desc = MultiFunctionDescriptor() desc.add_function( model_path="my_model.mlpackage", source_function_name="main", target_function_name="main_1", ) desc.add_function( model_path="my_model_2.mlpackage", source_function_name="main", target_function_name="main_2", ) # Each MultiFunctionDescriptor instance must have a default function name # so it can be saved as a multifunction mlpackage on disk. desc.default_function_name = "main_1" save_multifunction(desc, "my_multifunction_model.mlpackage") See Also -------- save_multifunction """ def __init__(self, model_path: _Optional[str] = None): """ If ``model_path`` is passed to the constructor, it must be a :obj:`str` pointing to an existing ``mlpackage`` on disk. The :py:class:`MultiFunctionDescriptor` instance will be initiated with the functions in ``model_path``. """ self._default_function_name = None self._name_to_source_function = {} self._modelpath_to_functions = {} self._modelpath_to_spec = {} if model_path is not None: self.add_model(model_path) def _functions(self) -> _Dict[str, _Tuple[str, str]]: """ Returns ``self._name_to_source_function`` """ return _copy.copy(self._name_to_source_function) def _add_modelpath_to_cache(self, model_path: str) -> None: """ Given an ``mlpackage`` path ``model_path``, this function caches related metadata. 
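Specifically, it records the model's function names in ``self._modelpath_to_functions`` and its loaded spec in ``self._modelpath_to_spec``; a model whose description has no ``functions`` field is treated as exposing a single ``"main"`` function.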
""" if model_path in self._modelpath_to_functions: return try: spec = load_spec(model_path) except Exception as err: raise ValueError(f"invalid model_path {model_path} with error {err} while loading.") desc = spec.description # For the protobuf in iOS17 and below, there was no `functions` field, # so "main" is the only function associated with the model in those cases. if len(desc.functions) == 0: self._modelpath_to_functions[model_path] = ["main"] else: self._modelpath_to_functions[model_path] = [func.name for func in desc.functions] self._modelpath_to_spec[model_path] = spec @property def default_function_name(self) -> _Union[str, None]: return self._default_function_name @default_function_name.setter def default_function_name(self, val: str) -> None: if not isinstance(val, str): raise ValueError(f"default_function_name must be type of str. Got {val}.") self._default_function_name = val def add_function( self, model_path: str, src_function_name: str, target_function_name: str ) -> None: """ Insert a ``src_function_name`` function from ``model_path`` as the ``target_function_name`` function in the multifunction descriptor. """ self._add_modelpath_to_cache(model_path) if src_function_name not in self._modelpath_to_functions[model_path]: raise ValueError(f"src_function_name {src_function_name} not found in {model_path}.") if target_function_name in self._name_to_source_function: raise ValueError(f"function {target_function_name} already exist.") self._name_to_source_function[target_function_name] = (model_path, src_function_name) def add_model(self, model_path: str) -> None: """ Insert all functions from the model in ``model_path`` into the multifunction descriptor. The function names will remain the same as in the original model. """ self._add_modelpath_to_cache(model_path) for func_name in self._modelpath_to_functions[model_path]: self.add_function(model_path, func_name, func_name) def remove_function(self, function_name: str) -> None: """ Remove a function ``function_name`` from the multifunction descriptor. """ if function_name not in self._name_to_source_function: raise ValueError(f"function_name {function_name} not found.") del self._name_to_source_function[function_name] def _multifunction_program_append_unifunction_program( multifunction_prog: _mil.Program, unifunction_prog: _mil.Program, src_func_name: str, target_func_name: str, ) -> None: multifunction_prog.add_function(target_func_name, unifunction_prog.functions[src_func_name]) def save_multifunction( desc: MultiFunctionDescriptor, destination_path: str, ): """ Save a :py:class:`MultiFunctionDescriptor` instance into a multifunction ``mlpackage``. This function also performs constant deduplication across functions to allow for weight sharing. Parameters ---------- desc: MultiFunctionDescriptor Multifunction descriptor to save on the disk. destination_path: str The path where the new ``mlpackage`` will be saved. Examples -------- .. 
sourcecode:: python from coremltools.utils import MultiFunctionDescriptor, save_multifunction desc = MultiFunctionDescriptor("my_model_1.mlpackage") desc.add_function("my_model_2.mlpackage", "main", "main_2") desc.default_function_name = "main_2" save_multifunction(desc, "multifunction_model.mlpackage") See Also -------- MultiFunctionDescriptor """ # We do the lazy import to prevent circular import from coremltools.converters.mil.converter import mil_convert as _mil_convert def get_function_spec( spec: _proto.Model_pb2, func_name: str ) -> _proto.Model_pb2.FunctionDescription: """ Utils to construct a FunctionDescription from the source spec. """ model_desc = spec.description # For single function model, we construct the FunctionDescription ourselves if len(model_desc.functions) == 0: assert func_name == "main", f"invalid function name {func_name}" return _proto.Model_pb2.FunctionDescription( input=model_desc.input, output=model_desc.output, state=model_desc.state, predictedFeatureName=model_desc.predictedFeatureName, predictedProbabilitiesName=model_desc.predictedProbabilitiesName, ) # For multifunction model, we look for the corresponding FunctionDescription for func_desc in model_desc.functions: if func_desc.name != func_name: continue res = _proto.Model_pb2.FunctionDescription() res.CopyFrom(func_desc) res.name = "" return res # compile model information: spec / weight_dir modelpath_to_spec_and_weightdir = {} for k, v in desc._name_to_source_function.items(): model_path = v[0] if model_path in modelpath_to_spec_and_weightdir: continue spec = desc._modelpath_to_spec[model_path] weight_dir = _try_get_weights_dir_path(model_path) if weight_dir is None: raise ValueError(f"weight_dir for model_path {model_path} not found.") modelpath_to_spec_and_weightdir[model_path] = (spec, weight_dir) # min spec version to support multi-functions model is iOS18 # we also make the target spec version the max among the input models spec_version = max( map(lambda val: val[0].specificationVersion, modelpath_to_spec_and_weightdir.values()) ) spec_version = max(spec_version, _SPECIFICATION_VERSION_IOS_18) # convert spec into pymil program modelpath_to_pymil = {} for model_path, (spec, weight_dir) in modelpath_to_spec_and_weightdir.items(): prog = _milproto_to_pymil.load( spec, spec_version, weight_dir, ) modelpath_to_pymil[model_path] = prog # construct a multifunction pymil program multifunction_prog = _mil.Program() function_to_desc = {} for target_func_name, v in desc._name_to_source_function.items(): model_path = v[0] src_func_name = v[1] prog = modelpath_to_pymil[model_path] _ct.utils._multifunction_program_append_unifunction_program( multifunction_prog, prog, src_func_name, target_func_name ) # get the corresponding function description from the spec spec = modelpath_to_spec_and_weightdir[model_path][0] function_spec = get_function_spec(spec, src_func_name) assert function_spec.name == "", "function_spec should not have name set" function_spec.name = target_func_name function_to_desc[target_func_name] = function_spec # Here we deduplicate the same weights across functions, to allow consts to use # the same blob file value when lowered into milproto. # By weight sharing, we can make the model size as small as we could. 
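# Illustrative note (assumption, not executed here): once ``save_multifunction(desc, path)``
# finishes, each exported function can be loaded individually by name, for example
#     ct.models.MLModel("my_multifunction_model.mlpackage", function_name="main_2")
# as also shown in the ``materialize_dynamic_shape_mlmodel`` docstring below.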
graph_pass = _PASS_REGISTRY["common::const_deduplication"] graph_pass._deduplicate_const_across_functions(multifunction_prog) # set default function name default_function_name = desc.default_function_name if default_function_name is None: raise ValueError( "default_function_name must be set for the MultiFunctionDescriptor instance before calling save_multifunction." ) if default_function_name not in multifunction_prog.functions: raise ValueError( f"default_function_name {default_function_name} not found in the program. Available function names are {list(multifunction_prog.functions.keys())}" ) multifunction_prog.default_function_name = default_function_name # export the program into a multifunction Core ML model functions = [] for func in multifunction_prog.functions: functions.append(function_to_desc[func]) model_description = _proto.Model_pb2.ModelDescription( functions=functions, defaultFunctionName=default_function_name, ) multifunction_prog.skip_all_passes = True mlmodel = _mil_convert( multifunction_prog, convert_to="mlprogram", convert_from="milinternal", specification_version=spec_version, compute_units=_ct.ComputeUnit.CPU_ONLY, model_description=model_description, export_multi_functions=True, skip_model_load=True, ) mlmodel.save(destination_path) def materialize_dynamic_shape_mlmodel( dynamic_shape_mlmodel: "_ct.models.MLModel", function_name_to_materialization_map: _Dict[str, _Dict[str, _Tuple[int]]], destination_path: str, source_function_name: str = "main", ) -> None: """ Given a dynamic-shape mlmodel, materialize symbols to create fixed-shape functions, then save as an .mlpackage to the destination path. To save memory, the pymil program of the input dynamic-shape mlmodel is reused. Constant deduplication across functions is performed to allow weight sharing. Parameters ---------- dynamic_shape_mlmodel : ct.models.MLModel A dynamic-shape mlmodel to be materialized function_name_to_materialization_map: Dict[str, Dict[str, Tuple[int]]] A dictionary specifying the names of the new functions to be created, and for each new function the new fixed shapes for its inputs. If a new function has the same name as an old function, then the old function will be overridden destination_path : str The saved .mlpackage model path source_function_name: str The name of the source symbolic-shape function to be materialized. Default is ``"main"`` Examples -------- .. sourcecode:: python from coremltools.utils import materialize_dynamic_shape_mlmodel # A dynamic-shape mlmodel you have converted dynamic_shape_mlmodel: ct.models.MLModel # As an example, let us assume the inputs are # 1. ``input_ids (1, query_length)`` # 2. ``mask (query_length, context_length)`` function_name_to_materialization_map = { "materialization_2_3": {"input_ids": (1, 2), "mask": (2, 3)}, "materialization_4_5": {"input_ids": (1, 4), "mask": (4, 5)}, } materialize_dynamic_shape_mlmodel( dynamic_shape_mlmodel, function_name_to_materialization_map, "materialized_model.mlpackage", ) To make predictions from the materialized mlmodel, load the desired materialized function ..
sourcecode:: python materialization_2_3 = ct.models.MLModel( "materialized_model.mlpackage", function_name="materialization_2_3" ) materialization_4_5 = ct.models.MLModel( "materialized_model.mlpackage", function_name="materialization_4_5" ) See Also -------- coremltools.converters.mil.mil.passes.defs.experiment.materialize_symbolic_shape_program """ # We do the lazy import to prevent circular import from coremltools.converters.mil.converter import mil_convert as _mil_convert if not isinstance(dynamic_shape_mlmodel, _ct.models.MLModel): raise ValueError( "Dynamic shape mlmodel must be type of ct.models.MLModel, " f"but got {type(dynamic_shape_mlmodel)}" ) for input in dynamic_shape_mlmodel._spec.description.input: if input.type.WhichOneof("Type") != "multiArrayType": raise NotImplementedError("Only tensor input is handled yet") for output in dynamic_shape_mlmodel._spec.description.output: if output.type.WhichOneof("Type") != "multiArrayType": raise NotImplementedError("Only tensor output is handled yet") if dynamic_shape_mlmodel._mil_program is not None: dynamic_shape_prog = dynamic_shape_mlmodel._mil_program else: dynamic_shape_prog = _milproto_to_pymil.load( dynamic_shape_mlmodel._spec, dynamic_shape_mlmodel._spec.specificationVersion, dynamic_shape_mlmodel.weights_dir, ) # Materialize symbolic shapes, then run all optimization passes pass_pipeline = _ct.PassPipeline.DEFAULT pass_pipeline.insert_pass(0, "common::materialize_symbolic_shape_program") pass_pipeline.set_options( "common::materialize_symbolic_shape_program", { "function_name_to_materialization_map": function_name_to_materialization_map, "source_function_name": source_function_name, }, ) _PassPipelineManager.apply_pipeline(dynamic_shape_prog, pass_pipeline) # Weights are duplicated in each materialized new function # By default, graph pass const_deduplication will not deduplicate across functions, # so we need to call it explicitly here const_deduplication_pass = _PASS_REGISTRY["common::const_deduplication"] const_deduplication_pass._deduplicate_const_across_functions(dynamic_shape_prog) export_multi_functions = True # If source function is the only function in source model, # and source function is replaced with materialization, # and materialization does not create other functions, # then we will end up with a unifunction model # Core ML distinguishs "unifunction model" and "multifunction model with only 1 function" if ( len(dynamic_shape_prog.functions) == 1 and len(function_name_to_materialization_map) == 1 and source_function_name in function_name_to_materialization_map ): export_multi_functions = False # Multifunciton is added in iOS 18, so # * if export multifunction, then specification version has to be iOS 18+ # * else, specification version can be the same as original version specification_version = dynamic_shape_mlmodel._spec.specificationVersion if export_multi_functions: specification_version = max(_ct.target.iOS18, specification_version) dynamic_shape_prog.skip_all_passes = True materialized_mlmodel = _mil_convert( dynamic_shape_prog, convert_from="milinternal", convert_to="mlprogram", specification_version=specification_version, compute_units=_ct.ComputeUnit.CPU_ONLY, export_multi_functions=export_multi_functions, skip_model_load=True, ) materialized_mlmodel.save(destination_path) def randomize_weights(mlmodel: "_ct.models.MLModel"): """ Utility function to randomize weights Parameters ---------- mlmodel: MLModel Model which will be randomized. Returns ------- model: MLModel The MLModel with randomized weights. 
Examples -------- .. sourcecode:: python import coremltools as ct model = ct.models.MLModel("my_model.mlpackage") randomized_mlmodel = ct.models.utils.randomize_weights(model) """ randomized_mlmodel = _apply_graph_pass( mlmodel, graph_pass=_WeightRandomizer(), skip_model_load=True ) return randomized_mlmodel def bisect_model( model: _Union[str, "_ct.models.MLModel"], output_dir: str, merge_chunks_to_pipeline: _Optional[bool] = False, check_output_correctness: _Optional[bool] = True, ): """ Utility function to split an mlpackage model into two mlpackages of approximately the same file size. Parameters ---------- model: str or MLModel Path to the mlpackage file, or a Core ML model, to be split into two mlpackages of approximately the same file size. output_dir: str Path to the output directory where the two model chunks / pipeline model would be saved. If the `model` is `{path}/{model_name}.mlpackage`, the chunk models are going to be saved as: 1. first chunk model: `{output_dir}/{model_name}_chunk1.mlpackage` 2. second chunk model: `{output_dir}/{model_name}_chunk2.mlpackage` 3. chunked pipeline model: `{output_dir}/{model_name}_chunked_pipeline.mlpackage` If the `model` is an `MLModel` object, the chunk models are saved as: 1. first chunk model: `{output_dir}/chunk1.mlpackage` 2. second chunk model: `{output_dir}/chunk2.mlpackage` 3. chunked pipeline model: `{output_dir}/chunked_pipeline.mlpackage` merge_chunks_to_pipeline: bool If True, model chunks are managed inside a single pipeline model for easier asset maintenance. check_output_correctness: bool - If True, compares the outputs of the original Core ML model with those of the pipelined Core ML model chunks and reports PSNR in dB. - Enabling this feature uses more memory. Disable it if your machine runs out of memory. Examples -------- .. sourcecode:: python import coremltools as ct model_path = "my_model.mlpackage" output_dir = "./output/" # The following code will produce two smaller models: # `./output/my_model_chunk1.mlpackage` and `./output/my_model_chunk2.mlpackage` # It also compares the numerical outputs of the original Core ML model with those of the chunked models. ct.models.utils.bisect_model( model_path, output_dir, ) # The following code will produce a single pipeline model `./output/my_model_chunked_pipeline.mlpackage` ct.models.utils.bisect_model( model_path, output_dir, merge_chunks_to_pipeline=True, ) # You can also pass the MLModel object directly mlmodel = ct.models.MLModel(model_path) ct.models.utils.bisect_model( mlmodel, output_dir, merge_chunks_to_pipeline=True, ) """ # We do the lazy import to prevent circular import from . import MLModel from coremltools.converters.mil.converter import mil_convert as _mil_convert def get_pymil_prog_and_spec_from_model(model): # get the model spec and weights directory if isinstance(model, str): spec = load_spec(model) weights_dir = _try_get_weights_dir_path(model) else: spec = model._spec weights_dir = model.weights_dir # convert the model spec into a pymil program, # we also convert the operations into a list prog = _milproto_to_pymil.load( spec, spec.specificationVersion, weights_dir, ) if len(prog.functions) > 1 or "main" not in prog.functions: raise ValueError("'bisect_model' only supports models with a single 'main' function.") func = prog.functions["main"] func.operations = list(func.operations) return prog, spec # check the input type of model if not isinstance(model, (str, MLModel)): raise ValueError(f"'model' must be of type [str, MLModel].
Got {type(model)}.") # The below implementation assumes that the model is single function, with a "main" function. prog, spec = get_pymil_prog_and_spec_from_model(model) spec_version = spec.specificationVersion # Compute the incision point by bisecting the program based on weights size op_idx, first_chunk_weights_size, total_weights_size = _get_op_idx_split_location(prog) main_block = prog.functions["main"] incision_op = main_block.operations[op_idx] _logger.info( f"The incision op: name={incision_op.name}, type={incision_op.op_type}, index={op_idx}/{len(main_block.operations)}" ) _logger.info(f"First chunk size = {first_chunk_weights_size:.2f} MB") _logger.info(f"Second chunk size = {total_weights_size - first_chunk_weights_size:.2f} MB") # Build first chunk (in-place modifies prog by declaring early exits and removing unused subgraph) prog_chunk1 = _make_first_chunk_prog(prog, op_idx) # Build the second chunk # when the first chunk is created, the prog is modified in-place, so we need to re-convert a new pymil # program for the second chunk. prog_chunk2 = _make_second_chunk_prog( get_pymil_prog_and_spec_from_model(model)[0], op_idx, ) # Convert the MIL Program objects into MLModels # We skip_model_load if check_output_correctness=False _logger.info("Converting the two programs") model_chunk1 = _mil_convert( prog_chunk1, convert_to="mlprogram", convert_from="milinternal", specification_version=spec_version, compute_units=_ct.ComputeUnit.CPU_ONLY, skip_model_load=(not check_output_correctness), ) del prog_chunk1 _gc.collect() _logger.info("Conversion of first chunk done.") model_chunk2 = _mil_convert( prog_chunk2, convert_to="mlprogram", convert_from="milinternal", specification_version=spec_version, compute_units=_ct.ComputeUnit.CPU_ONLY, skip_model_load=(not check_output_correctness), ) del prog_chunk2 _gc.collect() _logger.info("Conversion of second chunk done.") # Verify output correctness if check_output_correctness: _logger.info("Verifying output correctness of chunks") if isinstance(model, str): mlmodel = _ct.models.MLModel(model, compute_units=_ct.ComputeUnit.CPU_ONLY) else: mlmodel = model _verify_output_correctness_of_chunks( full_model=mlmodel, first_chunk_model=model_chunk1, second_chunk_model=model_chunk2, ) # save model chunks _os.makedirs(output_dir, exist_ok=True) if isinstance(model, str): mlpackage_name = _os.path.basename(model) name, _ = _os.path.splitext(mlpackage_name) name += "_" else: name = "" if merge_chunks_to_pipeline: # Make a single pipeline model to manage the model chunks pipeline_model = make_pipeline(model_chunk1, model_chunk2) out_path_pipeline = _os.path.join(output_dir, name + "chunked_pipeline.mlpackage") pipeline_model.save(out_path_pipeline) # reload to ensure CPU placement if check_output_correctness: _logger.info("Verifying output correctness of pipeline model") pipeline_model = _ct.models.MLModel( out_path_pipeline, compute_units=_ct.ComputeUnit.CPU_ONLY ) _verify_output_correctness_of_chunks( full_model=mlmodel, pipeline_model=pipeline_model, ) else: # Save the chunked models to disk out_path_chunk1 = _os.path.join(output_dir, name + "chunk1.mlpackage") out_path_chunk2 = _os.path.join(output_dir, name + "chunk2.mlpackage") model_chunk1.save(out_path_chunk1) model_chunk2.save(out_path_chunk2) _logger.info( f"Saved chunks in {output_dir} with the suffix _chunk1.mlpackage and _chunk2.mlpackage" ) def _verify_output_correctness_of_chunks( full_model: "_ct.models.MLModel", first_chunk_model: _Optional["_ct.models.MLModel"] = None, second_chunk_model: 
_Optional["_ct.models.MLModel"] = None, pipeline_model: _Optional["_ct.models.MLModel"] = None, ) -> None: """Verifies the end-to-end output correctness of full (original) model versus chunked models""" # lazy import avoids circular error from coremltools.converters.mil.testing_utils import random_gen_input_feature_type as random_gen_input_feature_type from coremltools.converters.mil.testing_utils import compute_snr_and_psnr def report_correctness(original_outputs: _np.ndarray, final_outputs: _np.ndarray, log_prefix: str): """ Report PSNR values across two compatible tensors. This util is from https://github.com/apple/ml-stable-diffusion/blob/main/python_coreml_stable_diffusion/torch2coreml.py#L80, with a slightly modification. """ ABSOLUTE_MIN_PSNR = 35 _, original_psnr = compute_snr_and_psnr(original_outputs, original_outputs) _, final_psnr = compute_snr_and_psnr(original_outputs, final_outputs) dB_change = final_psnr - original_psnr _logger.info( f"{log_prefix}: PSNR changed by {dB_change:.1f} dB ({original_psnr:.1f} -> {final_psnr:.1f})" ) if final_psnr < ABSOLUTE_MIN_PSNR: _logger.warning(f"{final_psnr:.1f} dB is low!") else: _logger.info( f"{final_psnr:.1f} dB > {ABSOLUTE_MIN_PSNR} dB (minimum allowed) parity check passed" ) return final_psnr # Generate inputs for first chunk and full model input_dict = {} for input_desc in full_model._spec.description.input: input_dict[input_desc.name] = random_gen_input_feature_type(input_desc) # Generate outputs for full model outputs_from_full_model = full_model.predict(input_dict) if pipeline_model is not None: outputs_from_pipeline_model = pipeline_model.predict(input_dict) final_outputs = outputs_from_pipeline_model elif first_chunk_model is not None and second_chunk_model is not None: # Generate outputs for first chunk outputs_from_first_chunk_model = first_chunk_model.predict(input_dict) # Prepare inputs for second chunk model from first chunk's outputs and regular inputs second_chunk_input_dict = {} for input_desc in second_chunk_model._spec.description.input: if input_desc.name in outputs_from_first_chunk_model: second_chunk_input_dict[input_desc.name] = outputs_from_first_chunk_model[ input_desc.name ] else: second_chunk_input_dict[input_desc.name] = input_dict[input_desc.name] # Generate output for second chunk model outputs_from_second_chunk_model = second_chunk_model.predict(second_chunk_input_dict) final_outputs = outputs_from_second_chunk_model else: raise ValueError("Either a single Pipeline model or two model chunks should be provided.") # Verify correctness across all outputs from second chunk and full model for out_name in outputs_from_full_model.keys(): report_correctness( original_outputs=outputs_from_full_model[out_name], final_outputs=final_outputs[out_name], log_prefix=f"{out_name}", ) def _get_op_idx_split_location(prog: _mil.Program) -> _Tuple[int, int, int]: """Find the op that approximately bisects the graph as measure by weights size on each side""" main_block = prog.functions["main"] total_size_in_mb = 0 for op in main_block.operations: if op.op_type == "const" and isinstance(op.val.val, _np.ndarray): size_in_mb = op.val.val.size * op.val.val.itemsize / (1024 * 1024) total_size_in_mb += size_in_mb half_size = total_size_in_mb / 2 # Find the first non const op (single child), where the total cumulative size exceeds # the half size for the first time cumulative_size_in_mb = 0 for op in main_block.operations: if op.op_type == "const" and isinstance(op.val.val, _np.ndarray): size_in_mb = op.val.val.size * 
op.val.val.itemsize / (1024 * 1024) cumulative_size_in_mb += size_in_mb # Note: The condition "not op.op_type.startswith("const")" is to make sure that the # incision op is neither of type "const" nor "constexpr_*" ops that # are used to store compressed weights if ( cumulative_size_in_mb >= half_size and not op.op_type.startswith("const") and len(op.outputs) == 1 and len(op.outputs[0].child_ops) == 1 ): op_idx = main_block.operations.index(op) return op_idx, cumulative_size_in_mb, total_size_in_mb raise ValueError("Not able to find the bisect point in the model.") def _get_first_chunk_outputs(block: _mil.Block, op_idx: int) -> _List[_mil.Var]: # Get the list of all vars that go across from first program (all ops from 0 to op_idx (inclusive)) # to the second program (all ops from op_idx+1 till the end). These all vars need to be made the output # of the first program and the input of the second program boundary_vars = set() for i in range(op_idx + 1): op = block.operations[i] if not op.op_type.startswith("const"): for var in op.outputs: if var.val is None: # only consider non const vars for child_op in var.child_ops: child_op_idx = block.operations.index(child_op) if child_op_idx > op_idx: boundary_vars.add(var) return list(boundary_vars) @_block_context_manager def _add_fp32_casts(block: _mil.Block, boundary_vars: _List[_mil.Var]) -> None: new_boundary_vars = [] for var in boundary_vars: if var.dtype != _mil.types.fp16: new_boundary_vars.append(var) else: fp32_var = _mb.cast(x=var, dtype="fp32", name=var.name) new_boundary_vars.append(fp32_var) return new_boundary_vars def _make_first_chunk_prog( prog: _mil.Program, op_idx: int, ) -> _mil.Program: """Build first chunk by declaring early outputs and removing unused subgraph""" block = prog.functions["main"] boundary_vars = _get_first_chunk_outputs(block, op_idx) # Due to possible numerical issues, cast any fp16 var to fp32 new_boundary_vars = _add_fp32_casts(block, boundary_vars) block.outputs.clear() block.set_outputs(new_boundary_vars) _PASS_REGISTRY["common::dead_code_elimination"](prog) return prog def _make_second_chunk_prog(prog: _mil.Program, op_idx: int) -> _mil.Program: """Build second chunk by rebuilding a pristine MIL Program from MLModel""" block = prog.functions["main"] block.opset_version = _ct.target.iOS16 # First chunk outputs are second chunk inputs (e.g. skip connections) boundary_vars = _get_first_chunk_outputs(block, op_idx) # This op will not be included in this program. Its output var will be made into an input boundary_op = block.operations[op_idx] # Add all boundary ops as inputs with block: for var in boundary_vars: new_placeholder = _Placeholder( sym_shape=var.shape, dtype=var.dtype if var.dtype != _mil.types.fp16 else _mil.types.fp32, name=var.name, ) block._input_dict[new_placeholder.outputs[0].name] = new_placeholder.outputs[0] block.function_inputs = tuple(block._input_dict.values()) new_var = None if var.dtype == _mil.types.fp16: new_var = _mb.cast(x=new_placeholder.outputs[0], dtype="fp16", before_op=var.op) else: new_var = new_placeholder.outputs[0] block.replace_uses_of_var_after_op( anchor_op=boundary_op, old_var=var, new_var=new_var, # This is needed if the program contains "constexpr_*" ops. In normal cases, there are stricter # rules for removing them, and their presence may prevent replacing this var. # However in this case, since we want to remove all the ops in chunk 1, we can safely # set this to True. 
force_replace=True, ) _PASS_REGISTRY["common::dead_code_elimination"](prog) # Remove any unused inputs new_input_dict = _OrderedDict() for k, v in block._input_dict.items(): if len(v.child_ops) > 0: new_input_dict[k] = v block._input_dict = new_input_dict block.function_inputs = tuple(block._input_dict.values()) return prog ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1726511965.261547 coremltools-8.0/coremltools/optimize/0000755000000000000000000000000014672075535016701 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/__init__.py0000644000000000000000000000054014672066616021011 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools._deps import _IMPORT_CT_OPTIMIZE_TORCH from . import coreml if _IMPORT_CT_OPTIMIZE_TORCH: from . import torch ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1726511965.261547 coremltools-8.0/coremltools/optimize/coreml/0000755000000000000000000000000014672075535020162 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/coreml/__init__.py0000644000000000000000000000115314672066616022273 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ._config import ( OpLinearQuantizerConfig, OpMagnitudePrunerConfig, OpPalettizerConfig, OpThresholdPrunerConfig, OptimizationConfig, ) from ._post_training_quantization import ( CoreMLOpMetaData, CoreMLWeightMetaData, decompress_weights, get_weights_metadata, linear_quantize_weights, palettize_weights, prune_weights, ) from . import experimental././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/coreml/_config.py0000644000000000000000000014365214672066616022153 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from __future__ import annotations import sys from abc import ABC, abstractmethod from collections import OrderedDict from enum import Enum from typing import IO, Any, Callable, Dict, List, Optional, Tuple, Union import cattrs import numpy as np import yaml from attrs import define, field, validators from coremltools.converters.mil.mil import operation, types from coremltools.converters.mil.mil.types.type_mapping import is_builtin, numpy_type_to_builtin_type # TODO: Share the enum between cto.coreml and cto.torch (rdar://124409664). class CompressionGranularity(Enum): PER_TENSOR = 1 PER_GROUPED_CHANNEL = 2 PER_CHANNEL = 3 PER_BLOCK = 4 class OpCompressorConfig(ABC): """ An abstract class for the compressor configuration """ def _validate_op_type(self, op_type): """ A utility function checking if an op type is valid for the configuration """ pass @classmethod @abstractmethod def _from_dict(cls, config_dict: Dict[str, Any]) -> "OpCompressorConfig": """ An abstract method that construct an OpCompressorConfig from a dictionary. 
It must be implemented in the child class. """ raise ValueError("_from_dict must be implemented in the subclasses of OpCompressorConfig.") def _check_weight_threshold(instance, attribute, value): if value is not None and value < 0: raise ValueError(f"\"weight_threshold\" must be a non-negative integer. Got {value}.") def _normalize_dtype(dtype: Union[str, type]) -> type: if isinstance(dtype, str): try: dtype = types.string_to_builtin(dtype) except KeyError: raise ValueError(f"Invalid dtype {dtype}. Only support int8/uint8/int4/uint4.") elif np.issubdtype(dtype, np.integer): dtype = types.numpy_type_to_builtin_type(dtype) elif not types.is_builtin(dtype): raise ValueError(f"dtype={dtype} is unsupported for OpLinearQuantizerConfig.") return dtype """ Linear Quantization configuration """ def _normalize_granularity( granularity: Union[str, CompressionGranularity] ) -> CompressionGranularity: if isinstance(granularity, CompressionGranularity): return granularity if granularity == "per_tensor": return CompressionGranularity.PER_TENSOR elif granularity == "per_grouped_channel": return CompressionGranularity.PER_GROUPED_CHANNEL elif granularity == "per_channel": return CompressionGranularity.PER_CHANNEL elif granularity == "per_block": return CompressionGranularity.PER_BLOCK else: raise ValueError(f"Invalid granularity={granularity}") def check_block_size(instance, attr, block_size): """ Validator for block_size. Note the `instance` and `attr` are not used but required by attrs interface. """ if block_size is not None: if isinstance(block_size, int): if block_size < 0: raise ValueError( f"The block_size must be non-negative values, but got {block_size}" ) elif isinstance(block_size, (list, tuple)): for it_block_size in block_size: if not isinstance(it_block_size, int) or it_block_size < 0: raise ValueError("All values in block_size must be non-negative values.") else: raise ValueError( f"The block_size should be int or list/tuple of int, but got {type(block_size)}." ) def _structure_block_size_type(block_size, dtype): """ The block_size's type Union[int, List[int], Tuple[int, ...]] need a custom structure hook for attrs yaml conversion. Note the `dtype` parameter is not used but required by attrs interface. """ if isinstance(block_size, int): return block_size else: if not isinstance(block_size, (list, tuple)): raise ValueError( f'"block_size" must be int or list/tuple of int. Got {type(block_size)}' ) return block_size @define class OpLinearQuantizerConfig(OpCompressorConfig): """ Parameters ---------- mode: str Mode for linear quantization: * ``"linear_symmetric"`` (default): Input data are quantized in the range ``[-R, R]``, where :math:`R = max(abs(w_r))`. * ``"linear"``: Input data are quantized in the range :math:`[min(w_r), max(w_r)]`. dtype: str or np.generic or mil.type Determines the quantized data type (int8/uint8/int4/uint4). * The allowed values are: * ``np.int8`` (the default) * ``np.uint8`` * ``coremltools.converters.mil.mil.types.int8`` * ``coremltools.converters.mil.mil.types.uint8`` * ``coremltools.converters.mil.mil.types.int4`` * ``coremltools.converters.mil.mil.types.uint4`` * strings to specify dtype such as "int4", "uint4", etc granularity: str Granularity for quantization. * ``"per_tensor"`` * ``"per_channel"`` (default) * ``"per_block"`` block_size: int or List/Tuple of int * Only effective when granularity is set to "per_block". * Determines size of the block, where all elements in a block share the same scale and zero_point. 
* If it's int, the block size on each axis is auto determined for best performance. More specifially, the block will have ``block_size`` on input axis and ``1`` on output axis, where input/output axis is auto picked based on op type. For example, if weight has shape [Cout, Cin], the block will have shape [1, block_size]; If the weight has shape [C_out, C_in, KH, KW], the block will has shape [1, block_size, KH, KW]. * If it's a tuple of int, it must have the same rank as the weight, which specify the block size on each axis. * The value 0 means block size equal to dim size at the corresponding axis. * If the dim size on any axis is not divisible by the corresponding block size, the op will be skipped. The tuple input of ``block_size`` provides users fully control about the block. Here are some examples about how different granularities could be achieved: Given the weight of a 2D Conv which has shape [C_out, C_in, KH, KW]: |------------------------|--------------------------|---------------------------|----------------------------| | Granularity | output_channel_block_size| input_channel_block_size | Weight Shape of Each Block | |------------------------|--------------------------|---------------------------|----------------------------| | Per Tensor | 0 | 0 | [C_out, C_in, KH, KW] | | Per Input Channel | 0 | 1 | [C_out, 1, KH, KW] | | Per Output Channel | 1 | 0 | [1, C_in, KH, KW] | | Per Block | 1 | 32 | [1, 32, KH, KW] | |------------------------|--------------------------|---------------------------|----------------------------| Given the weight of a linear layer which has shape [C_out, C_in]: |------------------------|--------------------------|---------------------------|----------------------------| | Granularity | output_channel_block_size| input_channel_block_size | Weight Shape of Each Block | |------------------------|--------------------------|---------------------------|----------------------------| | Per Tensor | 0 | 0 | [C_out, C_in] | | Per Input Channel | 0 | 1 | [C_out, 1] | | Per Output Channel | 1 | 0 | [1, C_in] | | Per Block | 1 | 32 | [1, 32] | |------------------------|--------------------------|---------------------------|----------------------------| Given the weight of matmul's y (transpose_y=False) which has shape [..., C_in, C_out]: |------------------------|--------------------------|---------------------------|----------------------------| | Granularity | output_channel_block_size| input_channel_block_size | Weight Shape of Each Block | |------------------------|--------------------------|---------------------------|----------------------------| | Per Tensor | 0 | 0 | [..., C_in, C_out] | | Per Input Channel | 0 | 1 | [..., 1, C_out] | | Per Output Channel | 1 | 0 | [..., C_in, 1] | | Per Block | 1 | 32 | [..., 32, 1] | |------------------------|--------------------------|---------------------------|----------------------------| weight_threshold: int The size threshold, above which weights are pruned. That is, a weight tensor is pruned only if its total number of elements are greater than ``weight_threshold``. Default to 2048. For example, if ``weight_threshold = 1024`` and a weight tensor is of shape ``[10, 20, 1, 1]``, hence ``200`` elements, it will not be pruned. 
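The following is a minimal usage sketch (the model path is hypothetical) showing how this config can be passed to ``coremltools.optimize.coreml.linear_quantize_weights``:

.. code-block:: python

    import coremltools as ct
    import coremltools.optimize.coreml as cto

    # "my_model.mlpackage" is a hypothetical mlprogram model path.
    mlmodel = ct.models.MLModel("my_model.mlpackage")
    op_config = cto.OpLinearQuantizerConfig(
        mode="linear_symmetric", dtype="int4", granularity="per_block", block_size=32
    )
    config = cto.OptimizationConfig(global_config=op_config)
    compressed_mlmodel = cto.linear_quantize_weights(mlmodel, config)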
""" mode: str = field(default="linear_symmetric", validator=validators.instance_of(str)) dtype: Union[str, type] = field(default=types.int8, converter=_normalize_dtype) granularity: Union[str, CompressionGranularity] = field( default=CompressionGranularity.PER_CHANNEL, validator=validators.instance_of(CompressionGranularity), converter=_normalize_granularity, ) block_size: Union[int, List[int], Tuple[int, ...]] = field( default=32, validator=check_block_size ) weight_threshold: Optional[int] = field(default=2048, validator=validators.optional([validators.instance_of(int), _check_weight_threshold])) _WEIGHT_AFFINE_QUANTIZATION_MODES = ("LINEAR_SYMMETRIC", "LINEAR") _VALID_GRANULARITIES = ( CompressionGranularity.PER_TENSOR, CompressionGranularity.PER_CHANNEL, CompressionGranularity.PER_BLOCK, ) @mode.validator def check_mode(self, attr, mode): if not mode.upper() in self._WEIGHT_AFFINE_QUANTIZATION_MODES: raise ValueError(f"Only mode {self._WEIGHT_AFFINE_QUANTIZATION_MODES} supported for weight affine quantization. Got mode: \"{mode}\".") @dtype.validator def check_dtype(self, attr, dtype): if not types.is_builtin(dtype): raise ValueError(f"Invalid dtype. Should be builtin dtype, but got {type(dtype)}") if not (types.is_int(dtype) and dtype.get_bitwidth() in {4, 8}): raise ValueError( f"Invalid dtype. Should be int4/8 or uint4/8, but got {types.builtin_to_string(dtype)}" ) @granularity.validator def check_granularity(self, attr, granularity): if granularity not in self._VALID_GRANULARITIES: raise ValueError( f'"granularity" must be one of {self._VALID_GRANULARITIES}, but got {granularity}' ) def __attrs_post_init__(self): self.mode = self.mode.upper() if not is_builtin(self.dtype): self.dtype = numpy_type_to_builtin_type(self.dtype) # Set nbits and signed for backward compatibility with existing code. if types.is_int(self.dtype): self.nbits = self.dtype.get_bitwidth() self.signed = not self.dtype.is_unsigned() @classmethod def _from_dict(cls, config_dict: Dict[str, Any]) -> OpLinearQuantizerConfig: converter = cattrs.Converter(forbid_extra_keys=True) converter.register_structure_hook( Union[int, List[int], Tuple[int, ...]], _structure_block_size_type ) return converter.structure(config_dict, cls) """ Pruner configurations """ @define class OpThresholdPrunerConfig(OpCompressorConfig): """ All weights with absolute value smaller than ``threshold`` are changed to ``0``, and the tensor is stored in a sparse format. For example, given the following: * ``weight = [0.3, -0.2, -0.01, 0.05]`` * ``threshold = 0.03`` The sparsified weight would be ``[0.3, -0.2, 0, 0.05]``. Parameters ---------- threshold: float All weight values above this threshold are set to ``0``. * Default value is ``1e-12``. minimum_sparsity_percentile: float The sparsity level must be above this value for the weight representation to be stored in the sparse format rather than the dense format. For example, if ``minimum_sparsity_percentile = 0.6`` and the sparisty level is ``0.54``; that is, ``54%`` of the weight values are exactly ``0``, then the resulting weight tensor will be stored as a dense const op, and not converted to the ``constsexpr_sparse_to_dense`` op (which stores the weight values in a sparse format). * Must be a value between ``0`` and ``1``. * Default value is ``0.5``. weight_threshold: int The size threshold, above which weights are pruned. That is, a weight tensor is pruned only if its total number of elements are greater than ``weight_threshold``. 
For example, if ``weight_threshold = 1024`` and a weight tensor is of shape ``[10, 20, 1, 1]``, hence ``200`` elements, it will not be pruned. * If not provided, it will be set to ``2048``, in which weights bigger than ``2048`` elements are compressed. """ threshold: float = field(default=1e-12, validator=validators.instance_of(float)) minimum_sparsity_percentile: float = field(default=0.5, validator=validators.instance_of(float)) weight_threshold: Optional[int] = field( default=2048, validator=validators.optional([validators.instance_of(int), _check_weight_threshold]) ) @threshold.validator def check_threshold(self, attr, threshold): if threshold < 0: raise ValueError( f"Invalid value of \"threshold\": {threshold}. Needs to be in [0, inf)" ) @minimum_sparsity_percentile.validator def check_minimum_sparsity_percentile(self, attr, minimum_sparsity_percentile): if minimum_sparsity_percentile < 0 or minimum_sparsity_percentile > 1: raise ValueError( f"Invalid value of \"minimum_sparsity_percentile\": {minimum_sparsity_percentile}. Needs to be in [0, 1]" ) @classmethod def _from_dict(cls, config_dict: Dict[str, Any]) -> "OpThresholdPrunerConfig": converter = cattrs.Converter(forbid_extra_keys=True) return converter.structure(config_dict, cls) @define class OpMagnitudePrunerConfig(OpCompressorConfig): """ Prune the weight with a constant sparsity percentile, which can be specified by either ``target_sparsity`` or ``n_m_ratio``. If ``target_sparsity`` is set, where ``n = floor(size_of_weight_tensor * target_sparsity)``, the ``n`` lowest absolute weight values are changed to ``0``. For example, given the following: * ``weight = [0.3, -0.2, -0.01, 0.05]`` * ``target_sparsity = 0.75`` The sparsified weight would be ``[0.3, 0, 0, 0]``. If ``block_size`` is set, then weights are pruned in a block structured manner; that is, chunks of weight values, as big as the ``block_size``, will be set to ``0``. Block sparsity can only be applied to ``linear`` and ``conv`` layers. For example: .. code-block:: python # Given a 4 x 2 weight with the following value, and block_size = 2, dim = 0. [ [1, 3], [-6, -7], [0, 3], [-9, 2], ] # We first flatten the matrix along axis = 0. [1, -6, 0, -9, 3, -7, 3, 2] # For block size 2, the L2 norm will be compute of first 2 elements, then the second and 3rd element and so on. [6.08, 9.00, 7.62, 3.61] # Then the smallest values will be picked to prune. So if target_sparsity = 0.5, then the blocks that will be # pruned will be with ones with L2 norm value of 6.08 and 3.61. And hence, the elements in the first and third # block are pruned. Resulting in the following flatten pruned tensor: [0, 0, 0, -9, 3, -7, 0, 0] # The final pruned tensor is: [ [0, 3], [0, -7], [0, 0], [-9, 0], ] The ``n_m_ratio`` triggers ``n:m`` pruning along the ``dim`` axis. In ``n:m`` pruning, out of every ``m`` elements, ``n`` with lowest magnitude are set to ``0``. For more information, see `Learning N:M Fine-Grained Structured Sparse Neural Networks From Scratch `_. ``n:m`` pruning can be applied only to ``linear`` and ``conv`` layers. Example: .. 
code-block:: python # Given a 4 x 4 weight of [ [3, 4, 7, 6], [1, 8, -3, -8], [-2, -3, -4, 0], [5, 4, -3, -2], ] # For n_m_ratio = (1, 2) with axis = 1 (default), the resulting pruned weight is [ [0, 4, 7, 0], [0, 8, 0, -8], [0, -3, -4, 0], [5, 0, -3, 0], ] # For axis = 0, we get [ [3, 0, 7, 0], [0, 8, 0, -8], [0, 0, -4, 0], [5, 4, 0, -2], ] Parameters ---------- target_sparsity: float The percentage of sparsity for compression, which needs to be in the range ``[0, 1]``. When ``0``, no sparsification occurs. For ``1``, all weights become ``0``. block_size: int Block size for inducing block sparsity. This is applied on the ``dim`` dimension of the parameter. Having the zeros aligned in the parameter helps gain latency/memory performance on-device. * If set, must be greater than ``1`` to enable block sparsity. * Block sparsity can be applied only to ``linear`` and ``conv`` layers. * The channel will be padded with ``0`` if it is not divisible by ``block_size``. n_m_ratio: tuple[int] A tuple of two integers which specify the ratio for ``n:m`` pruning. * ``n`` must be smaller or equal to ``m``. * The channel would be padded with ``0`` if it is not divisible by ``m``. dim: int Dimension where the block sparsity or ``n:m`` sparsity is applied. * Must be either ``0`` or ``1``. * The default value for block sparsity is ``0`` (output channel). * The default value for ``n:m`` sparsity is ``1`` (input channel). weight_threshold: int The size threshold, above which weights are pruned. That is, a weight tensor is pruned only if its total number of elements is greater than ``weight_threshold``. For example, if ``weight_threshold = 1024`` and a weight tensor is of shape ``[10, 20, 1, 1]``, hence ``200`` elements, it will not be pruned. * If not provided, it will be set to ``2048``, in which weights bigger than ``2048`` elements are compressed. """ target_sparsity: Optional[float] = field(default=None, validator=validators.optional(validators.instance_of(float))) block_size: Optional[int] = field(default=None, validator=validators.optional(validators.instance_of(int))) n_m_ratio: Optional[Tuple[int, int]] = field(default=None, validator=validators.optional(validators.instance_of((list, tuple)))) dim: Optional[int] = field(default=None, validator=validators.optional(validators.instance_of(int))) weight_threshold: Optional[int] = field( default=2048, validator=validators.optional([validators.instance_of(int), _check_weight_threshold]) ) _SUPPORTED_OPS_FOR_STRUCTURAL_PRUNING = { "conv": ["weight"], "linear": ["weight"], } def _is_structural_pruning(self): return self.n_m_ratio is not None or self.block_size is not None def _validate_op_type(self, op_type): """ Structural sparsity can only be applied to conv / linear weight. 
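For example, the following sketch (op type names are only for illustration) shows how this validation surfaces through ``OptimizationConfig.set_op_type``:

.. code-block:: python

    from coremltools.optimize.coreml import OpMagnitudePrunerConfig, OptimizationConfig

    config = OptimizationConfig()
    prune_config = OpMagnitudePrunerConfig(n_m_ratio=(2, 4))
    config.set_op_type("linear", prune_config)  # accepted: linear supports n:m pruning
    # config.set_op_type("softmax", prune_config) would raise ValueError, since
    # n:m pruning and block sparsity are only supported for "conv" and "linear".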
""" if self._is_structural_pruning() and op_type not in self._SUPPORTED_OPS_FOR_STRUCTURAL_PRUNING: raise ValueError(f"block sparsity or n:m pruning does not support op type {op_type}.") def _check_const_op_is_valid(self, op): def _get_child_op_and_input(op): assert op.op_type == "const" res = [] for child in op.outputs[0].child_ops: child_op_type = child.op_type child_op_input = "" for k, v in child.inputs.items(): if v is op.outputs[0]: child_op_input = k break assert child_op_input != "" res.append((child_op_type, child_op_input)) return res if not self._is_structural_pruning(): return True child_op_type_and_input = _get_child_op_and_input(op) for op_type, input in child_op_type_and_input: if op_type not in self._SUPPORTED_OPS_FOR_STRUCTURAL_PRUNING: return False if input not in self._SUPPORTED_OPS_FOR_STRUCTURAL_PRUNING[op_type]: return False return True @target_sparsity.validator def check_target_sparsity(self, attr, target_sparsity): msg = "Either \"target_sparsity\" or \"n_m_ratio\" need to be set. They cannot be set at the same time." if target_sparsity is not None and self.n_m_ratio is not None: raise ValueError(msg) if target_sparsity is None and self.n_m_ratio is None: raise ValueError(msg) if target_sparsity is None: return if target_sparsity < 0 or target_sparsity > 1: raise ValueError( f"Invalid value of \"target_sparsity\": {target_sparsity}. Needs to be in [0, 1]." ) @block_size.validator def check_block_size(self, attr, block_size): if block_size is not None and self.n_m_ratio is not None: raise ValueError( "\"block_size\" and \"n_m_ratio\" cannot be set at the same time." ) if block_size is None: return if block_size is not None and block_size <= 1: raise ValueError(f"\"block_size\" must be an integer > 1. Got {block_size}.") @n_m_ratio.validator def check_n_m_ratio(self, attr, n_m_ratio): if n_m_ratio is None: return if len(n_m_ratio) != 2 or n_m_ratio[0] > n_m_ratio[1]: raise ValueError(f"\"n_m_ratio\" must be a tuple of two integers (n, m). n <= m. Got {n_m_ratio}") @dim.validator def check_dim(self, attr, dim): if dim is None: return if self.block_size is None and self.n_m_ratio is None: raise ValueError("\"dim\" can only be set along with \"block_size\" or \"n_m_ratio\".") if dim not in [0, 1]: raise ValueError(f"\"dim\" must be 1 or 0. Got {dim}.") def __attrs_post_init__(self): if self.block_size is not None and self.dim is None: self.dim = 0 if self.n_m_ratio is not None and self.dim is None: self.dim = 1 @classmethod def _from_dict(cls, config_dict: Dict[str, Any]) -> "OpMagnitudePrunerConfig": converter = cattrs.Converter(forbid_extra_keys=True) return converter.structure(config_dict, cls) """ Palettizer configuration """ @define class OpPalettizerConfig(OpCompressorConfig): """ Parameters ---------- nbits: int Number of bits per weight. Required for ``kmeans`` or ``uniform`` mode, but must not be set for ``unique`` or ``custom`` mode. A LUT would have 2\ :sup:`nbits` entries, where `nbits` can be ``{1, 2, 3, 4, 6, 8}``. mode: str Determine how the LUT is constructed by specifying one of the following: * ``"kmeans"`` (default): The LUT is generated by `k-means clustering`, a method of vector quantization that groups similar data points together to discover underlying patterns by using a fixed number (`k`) of clusters in a dataset. A cluster refers to a collection of data points aggregated together because of certain similarities. `nbits` is required. * ``"uniform"``: The LUT is generated by a linear histogram. 
- ``[v_min, v_min + scale, v_min + 2 * scale, ..., v_max]`` - Where the weight is in the range ``[v_min, v_max]``, and ``scale = (v_max - v_min) / (1 << nbits - 1)``. - ``nbits`` is required. A `histogram` is a representation of the distribution of a continuous variable, in which the entire range of values is divided into a series of intervals (or `bins`) and the representation displays how many values fall into each bin. Linear histograms have one bin at even intervals, such as one bin per integer. * ``"unique"``: The LUT is generated by unique values in the weights. The weights are assumed to be on a discrete lattice but stored in a float data type. This parameter identifies the weights and converts them into the palettized representation. Do not provide ``nbits`` for this mode. ``nbits`` is picked up automatically, with the smallest possible value in ``{1, 2, 4, 6, 8}`` such that the number of the unique values is ``<= (1 << nbits)``. If the weight has ``> 256`` unique values, the compression is skipped. For example: * If the weights are ``{0.1, 0.2, 0.3, 0.4}`` and ``nbits=2``, the weights are converted to ``{00b, 01b, 10b, 11b}``, and the generated LUT is ``[0.1, 0.2, 0.3, 0.4]``. * If the weights are ``{0.1, 0.2, 0.3, 0.4}`` and ``nbits=1``, nothing happens because the weights are not a 1-bit lattice. * If the weights are ``{0.1, 0.2, 0.3, 0.4, 0.5}`` and ``nbits=2``, nothing happens because the weights are not a 2-bit lattice. * ``"custom"``: The LUT and palettization parameters are calculated using a custom function. If this mode is selected then ``lut_function`` must be provided. Do not provide ``nbits`` for this mode. The user should customize ``nbits`` in the ``lut_function`` implementation. lut_function: callable A callable function which computes the weight palettization parameters. This must be provided if the mode is set to ``"custom"``. weight: np.ndarray A float precision numpy array. Returns: lut: list[float] The lookup table. indices: list[int] A list of indices for each element. The following is an example that extract the ``top_k`` elements as the LUT. Given that ``weight = [0.1, 0.5, 0.3, 0.3, 0.5, 0.6, 0.7]``, the ``lut_function`` produces ``lut = [0, 0.5, 0.6, 0.7], indices = [0, 1, 0, 0, 2, 3]``. .. sourcecode:: python def lut_function(weight): # In this example, we assume elements in the weights >= 0 weight = weight.flatten() nbits = 4 # Get the LUT, from extracting top k maximum unique elements in the weight to be the LUT # Note that k = 1 << nbits - 1, so we have the first element be 0 unique_elements = np.unique(weight) k = (1 << nbits) - 1 top_k = np.partition(weight, -k)[-k:] np.sort(top_k) lut = [0.0] + top_k.tolist() # Compute the indices mapping = {v: idx for idx, v in enumerate(lut)} indices = [mapping[v] if v in mapping else 0 for v in weight] return lut, indices granularity: str Granularity for quantization. * ``"per_tensor"`` (default) * ``"per_grouped_channel"`` group_size: int * Specify the number of channels in a group. Only effective when granularity is per_grouped_channel. * Default to 32. channel_axis: Optional[int] = None * Specify the channel axis to form a group of channels. Only effective when granularity is per_grouped_channel. * Default to None, where the axis is automatically picked based on op type. cluster_dim: int * The dimension of centroids for each look up table. When cluster_dim == 1, it's scalar palettization, where each entry in the lookup table is a scalar element. 
When cluster_dim > 1, it's vector palettization, where each entry in the lookup table is a vector of length cluster_dim. * More specifically, when ``cluster_dim > 1``, each ``cluster_dim`` length of weight vectors along the channel axis are palettized using the same 2-D centroid. . * Default to 1. enable_per_channel_scale: bool * When set to True, weights are normalized along the output channels using per channel scales before being palettized. num_kmeans_workers: int * Number of worker processes to use for performing k-means. It is recommended to use more than one worker process to parallelize the clustering, especially when multiple CPUs are available. * Default to 1. weight_threshold: int The size threshold, above which weights are pruned. That is, a weight tensor is pruned only if its total number of elements are greater than ``weight_threshold``. For example, if ``weight_threshold = 1024`` and a weight tensor is of shape ``[10, 20, 1, 1]``, hence ``200`` elements, it will not be pruned. * If not provided, it will be set to ``2048``, in which weights bigger than ``2048`` elements are compressed. """ mode: str = field(default="kmeans", validator=validators.instance_of(str)) nbits: Optional[int] = field(default=None) lut_function: Optional[Callable] = field(default=None) granularity: Union[str, CompressionGranularity] = field( default=CompressionGranularity.PER_TENSOR, validator=validators.instance_of(CompressionGranularity), converter=_normalize_granularity, ) group_size: int = field(default=32) channel_axis: Optional[int] = field(default=None) cluster_dim: int = field(default=1, validator=validators.instance_of(int)) enable_per_channel_scale: bool = field(default=False, validator=validators.instance_of(bool)) num_kmeans_workers: int = field(default=1, validator=validators.instance_of(int)) weight_threshold: Optional[int] = field(default=2048, validator=validators.optional([validators.instance_of(int), _check_weight_threshold])) _WEIGHT_PALETTIZATION_MODES = ("KMEANS", "UNIFORM", "UNIQUE", "CUSTOM") _VALID_NBITS = (1, 2, 3, 4, 6, 8) _VALID_GRANULARITIES = ( CompressionGranularity.PER_TENSOR, CompressionGranularity.PER_GROUPED_CHANNEL, ) @nbits.validator def check_nbits(self, attr, nbits): mode = self.mode.upper() if nbits is None and mode in ("KMEANS", "UNIFORM"): raise ValueError(f"\"nbits\" must be provided for {self.mode} mode") if nbits is not None and mode in ("UNIQUE", "CUSTOM"): raise ValueError(f"\"nbits\" must NOT be provided for {self.mode} mode") if nbits is not None and nbits not in self._VALID_NBITS: raise ValueError( f'Invalid value of "nbits" ({nbits}) for palettization. Supported "nbits" are {self._VALID_NBITS}' ) @mode.validator def check_mode(self, attr, mode): if not mode.upper() in self._WEIGHT_PALETTIZATION_MODES: raise ValueError(f"Only modes {self._WEIGHT_PALETTIZATION_MODES} are supported for weight palettization. Got \"mode\": \"{mode}\".") @lut_function.validator def check_lut_function(self, attr, lut_function): mode = self.mode.upper() if lut_function is None and mode == "CUSTOM": raise ValueError("\"lut_function\" can not be None, if \"mode\" is \"custom\".") if lut_function is not None and mode != "CUSTOM": raise ValueError("\"lut_function\" must be None, if \"mode\" is not \"custom\".") if lut_function is not None and not callable(lut_function): raise ValueError(f"A function object must be provided as \"lut_function\". 
Got a \"lut_function\" as type {type(self.lut_function)}") @granularity.validator def check_granularity(self, attr, granularity): if granularity not in self._VALID_GRANULARITIES: raise ValueError( f'"granularity" must be one of {self._VALID_GRANULARITIES}, but got {granularity}' ) def __attrs_post_init__(self): self.mode = self.mode.upper() @classmethod def _from_dict(cls, config_dict: Dict[str, Any]) -> OpPalettizerConfig: if "lut_function" in config_dict: raise ValueError( "_from_dict method does not support lut_function. Please create the OpPalettizerConfig from scratch." ) converter = cattrs.Converter(forbid_extra_keys=True) return converter.structure(config_dict, cls) @define class OptimizationConfig: """ A configuration wrapper that enables fine-grained control when compressing a model, Providing the following levels: `global`, `op type`, and `op name`. 1. ``global_config``: The default configuration applied to all ops / consts. 2. ``op_type_configs``: Configurations applied to specific op type. It overrides ``global_config``. 3. ``op_name_configs``: Configurations applied to specific constant or op instance. It overrides ``global_config`` and ``op_type_configs``. The following is an example that constructs an optimization config for weight palettization. .. code-block:: python from coremltools.optimize.coreml import OpPalettizerConfig, OptimizationConfig # The default global configuration is 8 bits palettization with kmeans global_config = OpPalettizerConfig(mode="kmeans", nbits=8) # We use 2 bits palettization for convolution layers, and skip the compression for linear layers op_type_configs = { "conv": OpPalettizerConfig(mode="kmeans", nbits=2), "linear": None, } # We want a convolution layer named "conv_1" to have a 4 bits palettization with a different mode op_name_configs = { "conv_1": OpPalettizerConfig(mode="uniform", nbits=4), } # Now we can put all configuration across three levels to construct an OptimizationConfig object config = OptimizationConfig( global_config=global_config, op_type_configs=op_type_configs, op_name_configs=op_name_configs, ) Parameters ---------- global_config: OpCompressorConfig Config to be applied globally to all supported ops. op_type_configs: dict[str, OpCompressorConfig] Op type level configs applied to a specific op class. * The keys of the dictionary are the string of the op type, and the values are the corresponding :py:class:`OpCompressorConfig`. * An op type will not be compressed if the value is set to ``None``. op_name_configs: dict[str, OpCompressorConfig] Op instance level configs applied to a specific constant or op. * The keys of the dictionary are the name of a constant or an op instance, and the values are the corresponding :py:class:`OpCompressorConfig`. * An op instance will not be compressed if the value is set to ``None``. * You can use ``coremltools.optimize.coreml.get_weights_metadata`` to get the name of the constants / op instances in the model. 
""" global_config: Optional[OpCompressorConfig] = field(default=None) op_type_configs: Optional[OpCompressorConfig] = field(default=None) op_name_configs: Optional[OpCompressorConfig] = field(default=None) # The following two private attributes is aim for backward compatibility for ct.compression_utils implementation # They need to be removed in the future once we deprecate ct.compression_utils _is_deprecated: bool = field(default=False, validator=validators.instance_of(bool)) _op_selector: Optional[Callable] = field(default=None) @staticmethod def _check_op_config_type(config): if config is None: return if not isinstance(config, OpCompressorConfig): raise ValueError(f"config must be type of OpCompressorConfig. Got {type(config)}.") def set_global(self, op_config: OpCompressorConfig): """ Sets the global config that would be applied to all constant ops. .. code-block:: python from coremltools.optimize.coreml import OpPalettizerConfig, OptimizationConfig config = OptimizationConfig() global_config = OpPalettizerConfig(mode="kmeans", nbits=8) config.set_global(global_config) Parameters ---------- op_config: OpCompressorConfig Config to be applied globally to all supported ops. """ self._check_op_config_type(op_config) self.global_config = op_config def set_op_type( self, op_type: str, op_config: OpCompressorConfig, ): """ Sets the compression config at the level of op type. .. code-block:: python from coremltools.optimize.coreml import OpPalettizerConfig, OptimizationConfig config = OptimizationConfig() conv_config = OpPalettizerConfig(mode="kmeans", nbits=2) config.set_op_type("conv", conv_config) Parameters ---------- op_type: str The type of an op. For instance, ``"conv", "linear"``. op_config: OpCompressorConfig Op type level config applied to a specific op class ``op_type``. """ if self._is_deprecated: raise ValueError("set_op_type is not exposed through the coremltools.compression_utils API.") self._check_op_config_type(op_config) if op_config is not None: op_config._validate_op_type(op_type) self.op_type_configs[op_type] = op_config def set_op_name( self, op_name: str, op_config: OpCompressorConfig, ): """ Sets the compression config at the level of constant / op instance by name. .. code-block:: python from coremltools.optimize.coreml import OpPalettizerConfig, OptimizationConfig config = OptimizationConfig() op_config = OpPalettizerConfig(mode="kmeans", nbits=2) config.set_op_name("conv_1", op_config) Note that, in order to get the name of a constant or an op instance, please refer to the ``coremltools.optimize.coreml.get_weights_metadata`` API. Parameters ---------- op_name: str The name of a constant or an op instance. op_config: OpCompressorConfig Op instance level config applied to a specific constant or op with name ``op_name``. 
""" if self._is_deprecated: raise ValueError("set_op_name is not exposed through the coremltools.compression_utils API.") self._check_op_config_type(op_config) self.op_name_configs[op_name] = op_config @_is_deprecated.validator def check_is_deprecated(self, attr, _is_deprecated): if not _is_deprecated and self._op_selector is not None: raise ValueError("op_selector is supported only through the coremltools.compression_utils API.") @op_type_configs.validator def check_op_type_configs(self, attr, op_type_configs): if op_type_configs is None: return for v in op_type_configs.values(): self._check_op_config_type(v) for k, v in op_type_configs.items(): if v is not None: v._validate_op_type(k) @op_name_configs.validator def check_op_name_configs(self, attr, op_name_configs): if op_name_configs is None: return for v in op_name_configs.values(): self._check_op_config_type(v) @global_config.validator def check_global_configs(self, attr, global_config): if global_config is None: return self._check_op_config_type(global_config) def _get_op_config(self, op: operation.Operation): """ This utility function retrieve the compression config for an non-const operation.Operation instance. The priority is by: op name -> op type -> global """ if not isinstance(op, operation.Operation): raise TypeError(f"op must be type of operation.Operation. Got {type(op)}") if op.op_type == "const": raise TypeError("op must not be of type const") if op.name in self.op_name_configs: return self.op_name_configs[op.name] elif op.op_type in self.op_type_configs: return self.op_type_configs[op.op_type] return self.global_config def _get_const_op_config(self, op: operation.Operation): """ This utility function retrieves the compression config by an const operation.Operation instance. If the const is fed into multiple operations, an error would be thrown if a conflict is detected. """ if not isinstance(op, operation.Operation): raise TypeError(f"op must be type of operation.Operation. Got {type(op)}") if not (op.op_type == "const" or op.op_type.startswith("constexpr_")): raise TypeError(f"op must be of type const or constexpr. Got {op.op_type}") if op.name in self.op_name_configs: return self.op_name_configs[op.name] if op.op_type in self.op_type_configs: # We don't allow users to call set_op_type for "const" ops. # The users are supposed to use set_global instead raise ValueError("const ops cannot be set by the `set_op_type` function. Please use `set_global`") # If the constant's output is only connected to the block output, we don't do compression # Due to this bug: rdar://108274019 ([Bug] constexpr ops cannot be directly fed to block output) child_ops = [child_op for op_output in op.outputs for child_op in op_output.child_ops] if len(child_ops) == 0: return None # If the const is fed into constexpr ops, we follow the chain to get the non-constexpr. if all(child_op.op_type.startswith("constexpr_") for child_op in child_ops): return self._get_const_op_config(child_ops[0]) op_configs = [self._get_op_config(op) for op in child_ops] for i, config in enumerate(op_configs): if config != op_configs[0]: raise ValueError( f"compression config conflict detected between ops {child_ops[0]} and {child_ops[i]}. " f"{child_ops[0]} has config {op_configs[0]} while {child_ops[i]} has {config}." 
) return op_configs[0] def __attrs_post_init__(self): if self.op_type_configs is None: self.op_type_configs = {} if self.op_name_configs is None: self.op_name_configs = {} @classmethod def from_dict(cls, config_dict: Dict[str, Any]) -> "OptimizationConfig": """ Construct an ``OptimizationConfig`` instance from a nested dictionary. The dictionary should have the structure that only contains (if any) the following four ``str`` keys: * ``"config_type"``: Specify the configuration class type. * ``"global_config"``: Parameters for ``global_config``. * ``"op_type_configs"``: A nested dictionary for ``op_type_configs``. * ``"op_name_config"``: A nested dictionary for ``op_name_configs``. The following is a nested dictionary that creates an optimization config for weight palettization: .. code-block:: python config_dict = { "config_type": "OpPalettizerConfig", "global_config": { "mode": "kmeans", "nbits": 4, }, "op_type_configs": { "conv": { "mode": "uniform", "nbits": 1, } }, "op_name_configs": { "conv_1": { "mode": "unique", } }, } Note that you can override the ``config_type``. For instance, if you want to do threshold-based pruning to the model in addition to the convolution layers in which magnitude pruning is applied, the following is an example of the nested dictionary: .. code-block:: python config_dict = { "config_type": "OpThresholdPrunerConfig", "global_config": { "threshold": 0.01, }, "op_type_configs": { "conv": { "config_type": "OpMagnitudePrunerConfig", "n_m_ratio": [3, 4], } }, } Parameters ---------- config_dict: dict[str, Any] A dictionary that represents the configuration structure. """ def _get_cls_instance(cls_type, cls_attrs): if cls_attrs is None: return None converter = cattrs.Converter(forbid_extra_keys=True) if "config_type" in cls_attrs: cls_type = cls_attrs["config_type"] del cls_attrs["config_type"] class_type = getattr(sys.modules[__name__], cls_type) return class_type._from_dict(cls_attrs) def _check_config_dict(config_dict): valid_keys = ("config_type", "global_config", "op_name_configs", "op_type_configs") for k in config_dict: if k not in valid_keys: raise ValueError( f"Invalid key {k} to construct an OptimizationConfig object. Supported keys are {valid_keys}." ) _check_config_dict(config_dict) config_type = config_dict.get("config_type", None) if config_type is None or not isinstance(config_type, str): raise ValueError("config_type must be provided with type of string.") cls_attrs = {} if config_dict.get("global_config", None) is not None: cls_attrs["global_config"] = _get_cls_instance( config_type, config_dict["global_config"] ) for key in ["op_type_configs", "op_name_configs"]: if config_dict.get(key, None) is None: continue if not isinstance(config_dict[key], dict): raise ValueError(f"{key} must be type of dict. Got {type(config_dict[key])}") cls_attrs[key] = { k: _get_cls_instance(config_type, v) for k, v in config_dict[key].items() } return cls(**cls_attrs) @classmethod def from_yaml(cls, yml: Union[IO, str]) -> "OptimizationConfig": """ Construct an ``OptimizationConfig`` instance from a YAML file. The YAML file should have the structure that only contains (if any) the following four ``str`` keys: * ``"config_type"``: Specify the configuration class type. * ``"global_config"``: Parameters for ``global_config``. * ``"op_type_configs"``: A nested dictionary for ``op_type_configs``. * ``"op_name_config"``: A nested dictionary for ``op_name_configs``. 
The following is a YAML file that creates an optimization config for weight palettization: :: config_type: OpPalettizerConfig global_config: mode: kmeans nbits: 4 op_type_configs: conv: mode: uniform nbits: 1 op_name_configs: conv_1: mode: unique Note that you can override the ``config_type``. For instance, if you want to do threshold-based pruning to the model in addition to the convolution layers in which magnitude pruning is applied, the following is an example of the YAML file: :: config_type: OpThresholdPrunerConfig global_config: threshold: 0.01 op_type_configs: conv: config_type: OpMagnitudePrunerConfig n_m_ratio: [3, 4] Parameters ---------- yml: str, IO A YAML file or the path to the file. """ if isinstance(yml, str): with open(yml, "r") as file: config_dict = yaml.safe_load(file) else: config_dict = yaml.safe_load(yml) return cls.from_dict(config_dict) class _MetaDataDict(OrderedDict): """ A dictionary class with nice print out str """ def __init__(self, mapping=None, str_prefix=""): super().__init__(mapping) self._str_prefix = str_prefix def __str__(self): res = "" for k, v in self.items(): res += f"{self._str_prefix}{k}\n" res += f"{v}\n" return res ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/coreml/_post_training_quantization.py0000644000000000000000000005767714672066616026407 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from collections import OrderedDict from typing import Dict, List, Optional import numpy as np from attrs import define, field, validators from tqdm import tqdm from coremltools.converters.mil.frontend.milproto import load as _milproto_to_pymil from coremltools.converters.mil.mil.passes.graph_pass import PassOption from coremltools.converters.mil.mil.passes.pass_registry import PASS_REGISTRY from coremltools.models import model as _model from coremltools.models import utils as _model_utils from coremltools.optimize.coreml import OptimizationConfig as _OptimizationConfig from coremltools.optimize.coreml._config import _MetaDataDict from ._quantization_passes import WeightDecompressor as _WeightDecompressor def _is_valid_const(val, weight_threshold): return isinstance(val, np.ndarray) and val.size >= weight_threshold def _multifunction_unsupported(func): """ The decorator marks the PTQ API that doesn't support the multifunction model. We should use this decorator until the radar is fixed: rdar://126084385 ([Infra] Figure out the story of PTQ or other passes operate on loaded Mutli-function model) Note that the API must take `mlmodel` with type of `MLModel` as an input. """ def decorator(*args, **kwargs): num_args = func.__code__.co_argcount arg_names = list(func.__code__.co_varnames)[:num_args] param_dict = {k: v for k, v in zip(arg_names, args)} model = param_dict.get("mlmodel", None) if model is None: raise ValueError( f'Function {func} decorated with _multifunction_unsupported must takes "mlmodel" as an input.' 
) if model._is_multifunction(): raise ValueError(f"{func} is not supported for a multifunction model.") return func(*args, **kwargs) decorator.__doc__ = func.__doc__ return decorator @_multifunction_unsupported def linear_quantize_weights( mlmodel: _model.MLModel, config: _OptimizationConfig, joint_compression: bool = False ): """ Utility function to convert a float precision MLModel of type ``mlprogram``, which uses float-precision weights, into a compressed MLModel that uses n-bit weights (currently only support n=4 and n=8). This is achieved by converting the float weight values that are stored in the ``const`` op into the ``constexpr_affine_dequantize`` or ``constexpr_blockwise_shift_scale`` op (based on model's minimum deployment target). This function uses linear quantization on the float weights, providing up to 4x (for 4-bit) savings in storage compared to float 16, or up to 4x savings compared to float 32. All computation at runtime uses float precision; the precision of the intermediate tensors and the compute precision of the ops are not altered. For each weight, this utility function converts the weight into the int4/8 or uint4/8 type using either `linear interpolation` (``"linear"`` mode) or `linear symmetric interpolation` (``"linear_symmetric"`` mode, the default). **Linear interpolation** The following description uses 8-bit quantization to illustrate, and 4-bit is similar to it. Linear interpolation (``"linear"`` mode) maps the min/max of the float range to the 8-bit integer range ``[low, high]`` using a zero point (also called quantization bias, or offset) and a scale factor. For the int8 quantization, ``[low, high] = [-128, 127]``, while uint8 quantization uses range ``[0, 255]``. ``"linear"`` mode uses the quantization formula: .. math:: w_r = s * (w_q - z) Where: * :math:`w_r` and :math:`s` are of type float. * :math:`w_r`` represents the float precision weight. * :math:`s` represents the scale. * :math:`w_q` and :math:`z` are of type 8-bit integer. * :math:`w_q` represents quantized weight. * :math:`z` represents the zero point. Quantized weights are computed as follows: .. math:: w_q = cast\_to\_8\_bit\_integer(w_r / s + cast\_to\_float(z)) Note: :math:`cast\_to\_8\_bit\_integer` is the process of clipping the input to range ``[low, high]`` followed by rounding and casting to 8-bit integer. In ``"linear"`` mode, ``s, z`` are computed by mapping the original float range ``[A, B]`` into the 8-bit integer range ``[-128, 127]`` or ``[0, 255]``. That is, you are solving the following linear equations: * ``B = s * (high - z)`` * ``A = s * (low - z)`` The equations result in the following: * ``s = (B - A) / (high - low)`` * ``z = cast_to_8_bit_integer((low * B - high * A) / (B - A))`` When the rank of weight ``w`` is 1, then ``s`` and ``z`` are both scalars. When the rank of the weight is greater than 1, then ``s`` and ``z`` are both vectors. In that case, scales are computed per `channel`, in which `channel` is the output dimension, which corresponds to the first dimension for ops such as ``conv`` and ``linear``, and the second dimension for the ``conv_transpose`` op. For ``"linear"`` mode, :math:`A = min(w_r)`, :math:`B = max(w_r)`. **Linear symmetric interpolation** With linear symmetric interpolation (``"linear_symmetric"`` mode, the default), rather than mapping the exact min/max of the float range to the quantized range, the function chooses the maximum absolute value between the min/max, which results in a floating-point range that is symmetric with respect to zero. 
This also makes the resulting zero point ``0`` for int8 weight and ``127`` for uint8 weight. For ``"linear_symmetric"`` mode: * :math:`A = -R` and :math:`B = R`, where :math:`R = max(abs(w_r))`. * This function maps to the range of ``[-127, 127]`` for int8 weight and ``[0, 254]`` for uint8 weight. * The result is ``s=(B-A)/254`` -> ``s=2R/254`` -> ``s=R/127``. * Solving for ``z``: * int8: ``z = (-127 * R + 127 * R)/2R`` -> ``z=0``. * uint8: ``z = (0 * R + 254 * R)/2R`` -> ``z=127``. Parameters ---------- mlmodel: MLModel Model to be quantized. This MLModel should be of type ``mlprogram``. config: OptimizationConfig An :py:class:`OptimizationConfig` object that specifies the parameters for weight quantization. joint_compression: bool Specification of whether or not to further compress the already-compressed input MLModel to a jointly compressed MLModel. See the `blockwise_palettize_weights` graph pass for information about which compression schemas could be further jointly palettized. Take "palettize + quantize" as an example of joint compression, where the input MLModel is already palettized, and the palettization's lookup table will be further quantized. In such an example, the weight values are represented by ``constexpr_blockwise_shift_scale`` + ``constexpr_lut_to_dense`` ops: lut(int8) -> constexpr_blockwise_shift_scale -> lut(fp16) -> constexpr_lut_to_dense -> dense(fp16) Returns ------- model: MLModel The quantized MLModel instance. Examples -------- .. sourcecode:: python import coremltools as ct import coremltools.optimize as cto model = ct.coreml.models.MLModel("my_model.mlpackage") config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpLinearQuantizerConfig(mode="linear_symmetric") ) compressed_model = cto.coreml.linear_quantize_weights(model, config) """ blockwise_weight_quantizer = PASS_REGISTRY["compression::linear_quantize_weights"] blockwise_weight_quantizer.set_options( [PassOption("config", config), PassOption("joint_compression", joint_compression)] ) return _model_utils._apply_graph_pass(mlmodel, blockwise_weight_quantizer) @_multifunction_unsupported def palettize_weights( mlmodel: _model.MLModel, config: _OptimizationConfig, joint_compression: bool = False ): """ Utility function to convert a float precision MLModel of type ``mlprogram`` to a compressed MLModel by reducing the overall number of weights using one or more lookup tables (LUT). A LUT contains a list of float values. An ``n-bit`` LUT has :math:`2^{n-bits}` entries. For example, a float weight vector such as ``{0.3, 0.3, 0.5, 0.5}`` can be compressed using a 1-bit LUT: ``{0.3, 0.5}``. In this case the float vector can be replaced with a 1-bit vector ``{0, 0, 1, 1}``. This function iterates over all the weights in the ``mlprogram``, discretizes its values, and constructs the LUT according to the algorithm specified in ``mode``. The float values are then converted to the ``n-bit`` values, and the LUT is saved alongside each weight. The ``const`` ops storing weight values are replaced by ``constexpr_lut_to_dense`` ops. At runtime, the LUT and the ``n-bit`` values are used to reconstruct the float weight values, which are then used to perform the float operation the weight is feeding into. Consider the following example of ``"uniform"`` mode (a linear histogram): * ``nbits = 4`` * ``mode = "uniform"`` * ``weight = [0.11, 0.19, 0.3, 0.08, 0.0, 0.02]`` The weight can be converted to a palette with indices ``[0, 1, 2, 3]`` (2 bits). The indices are a byte array. 
The data range ``[0.0, 0.3]`` is divided into four partitions linearly, which is ``[0.0, 0.1, 0.2, 0.3]``. * The LUT would be ``[0.0, 0.1, 0.2, 0.3]``. * The weight is rounded to ``[0.1, 0.2, 0.3, 0.1, 0.0, 0.0]`` and represented in the palette as indices ``[01b, 10b, 11b, 01b, 00b, 00b]``. Parameters ---------- mlmodel: MLModel Model to be converted by a LUT. This MLModel should be of type ``mlprogram``. config: OptimizationConfig An :py:class:`OptimizationConfig` object that specifies the parameters for weight palettization. joint_compression: bool Specification of whether or not to further compress the already-compressed input MLModel to a jointly compressed MLModel. See the `channelwise_palettize_weights` graph pass for information about which compression schemas could be further jointly palettized. Take "prune + palettize" as an example of joint compression, where the input MLModel is already pruned, and the non-zero entries will be further palettized. In such an example, the weight values are represented by ``constexpr_lut_to_sparse`` + ``constexpr_sparse_to_dense`` ops: ``lut(sparse)`` -> ``constexpr_lut_to_sparse`` -> ``weight(sparse)`` -> ``constexpr_sparse_to_dense`` -> ``weight(dense)`` Returns ------- model: MLModel The palettized MLModel instance. Examples -------- .. sourcecode:: python import coremltools as ct import coremltools.optimize as cto model = ct.models.MLModel("my_model.mlpackage") config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig(mode="kmeans", nbits=4) ) compressed_model = cto.coreml.palettize_weights(model, config) """ weight_palettizer = PASS_REGISTRY["compression::palettize_weights"] weight_palettizer.set_options( [PassOption("config", config), PassOption("joint_compression", joint_compression)] ) return _model_utils._apply_graph_pass(mlmodel, weight_palettizer) @_multifunction_unsupported def prune_weights( mlmodel: _model.MLModel, config: _OptimizationConfig, joint_compression: bool = False ): """ Utility function to convert a float precision MLModel of type ``mlprogram`` to a compressed MLModel using sparse representation. The ``const`` ops storing weight values are replaced by ``constexpr_sparse_to_dense`` ops. This function is useful if the model is trained with pruning techniques so that a lot of weights have zero values. If a large percentage of weight values are zero, a sparse representation is more efficient than a dense one (the default). The sparsified weights are stored in a bit mask. If the weight values are ``{0, 0, 0, 0, 0, 0, 0, 56.3}``, its sparse representation contains a bit mask with ones on locations where the value is non-zero: ``00000001b``. This is accompanied by non-zero data, which is a size-1 vector of value ``{56.3}``. For example, given the following: * ``weight = [0.3, 0, 0, 0.5, 0, 0]`` * ``non_zero_data, bit_mask = sparsify(weight)`` The indices of the non-zero elements are: * ``non_zero_data = [0.3, 0.5]`` * ``bit_mask = "100100"`` Parameters ---------- mlmodel: MLModel Model to be sparsified. This MLModel should be of type ``mlprogram``. config: OptimizationConfig An :py:class:`OptimizationConfig` object that specifies the parameters for weight pruning. joint_compression: bool Specification of whether or not to further prune the already-compressed input MLModel to a jointly compressed MLModel. See the `prune_weights` graph pass for information about which compression schemas could be further pruned. 
Take "quantize + prune" as an example of joint compression, where the input MLModel is already quantized, and it will be further pruned. In such an example, the weight values are represented by ``constexpr_sparse_blockwise_shift_scale`` + ``constexpr_sparse_to_dense`` ops: quantized(sparse) -> constexpr_sparse_blockwise_shift_scale -> weight(sparse) -> constexpr_sparse_to_dense -> weight(dense) Returns ------- model: MLModel The sparse MLModel instance. Examples -------- .. sourcecode:: python import coremltools as ct import coremltools.optimize as cto model = ct.models.MLModel("my_model.mlpackage") config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpThresholdPrunerConfig(threshold=1e-12) ) compressed_model = cto.coreml.prune_weights(model, config) """ weight_pruner = PASS_REGISTRY["compression::prune_weights"] weight_pruner.set_options( [PassOption("config", config), PassOption("joint_compression", joint_compression)] ) return _model_utils._apply_graph_pass(mlmodel, weight_pruner) @_multifunction_unsupported def decompress_weights(mlmodel: _model.MLModel): """ Utility function to convert weights that are sparse or palettized or affine quantized, back to the float format. That is, convert any of the following three ops to ``mb.const``: (1) ``constexpr_affine_dequantize`` (2) ``constexpr_lut_to_dense`` (3) ``constexpr_sparse_to_dense`` Parameters ---------- mlmodel: MLModel Model which will be decompressed. Returns ------- model: MLModel The MLModel with no ``constexpr`` ops included. Examples -------- .. sourcecode:: python import coremltools as ct model = ct.models.MLModel("my_compressed_model.mlpackage") decompressed_model = ct.optimize.coreml.decompress_weights(model) """ weight_decompressor = _WeightDecompressor(op_selector=lambda op: True) return _model_utils._apply_graph_pass(mlmodel, weight_decompressor) @_multifunction_unsupported def get_weights_metadata(mlmodel: _model.MLModel, weight_threshold: int = 2048): """ Utility function to get the weights metadata as a dictionary, which maps the weight's name to its corresponding CoreMLWeightMetaData. CoreMLWeightMetaData contains the following attributes: 1. ``val``: The weight data. 2. ``sparsity``: the percentile of the element whose absolute value ``<= 1e-12``. 3. ``unique_values``: number of unique values in the weight. 4. ``child_ops``: meta information of the child ops in which the weight is feeding into. Parameters ---------- mlmodel: MLModel Model in which the weight metadata is retrieved from. weight_threshold: int * The size threshold, above which weights are returned. That is, a weight tensor is included in the resulting dictionary only if its total number of elements are greater than ``weight_threshold``. For example, if ``weight_threshold = 1024`` and a weight tensor is of shape ``[10, 20, 1, 1]``, hence ``200`` elements, it will not be returned by the ``get_weights_metadata`` API. * If not provided, it will be set to ``2048``, in which weights bigger than ``2048`` elements are returned. Returns ------- dict[str, CoreMLWeightMetaData] A dict that maps weight's name to its metadata. Examples -------- In this example, there are two weights whose sizes are greater than ``2048``. A weight named ``conv_1_weight`` is feeding into a ``conv`` op named ``conv_1``, while another weight named ``linear_1_weight`` is feeding into a ``linear`` op named ``linear_1``. You can access the metadata by ``weight_metadata_dict["conv_1_weight"]``, and so on. .. 
sourcecode:: python import coremltools as ct mlmodel = ct.models.MLModel("my_model.mlpackage") weight_metadata_dict = ct.optimize.coreml.get_weights_metadata( mlmodel, weight_threshold=2048 ) # get the weight names with size > 25600 large_weights = [] for k, v in weight_metadata_dict.items(): if v.val.size >= 25600: large_weights.append(k) # get the weight names with sparsity >= 50% sparse_weights = [] for k, v in weight_metadata_dict.items(): if v.sparsity >= 0.5: sparse_weights.append(k) # get the weight names with unique elements <= 16 palettized_weights = [] for k, v in weight_metadata_dict.items(): if v.unique_values <= 16: palettized_weights.append(k) # print out the dictionary print(weight_metadata_dict) The output from the above example would be: :: conv_1_weight [ val: np.ndarray(shape=(32, 64, 2, 2), dtype=float32) sparsity: 0.5 unique_values: 4097 child_ops: [ conv(name=conv_1, weight=conv_1_weight, ...) ] ] linear_1_weight [ val: np.ndarray(shape=(128, 64), dtype=float32) sparsity: 0.2501220703125 unique_values: 4 child_ops: [ linear(name=linear_1, weight=linear_1_weight, ...) ] ] """ def _get_weight_metadata(op): """ Returns a CoreMLWeightMetaData object given a const operation. """ assert op.op_type == "const", f"Expect op be type of 'const', got '{op.op_type}'" child_ops = [] visited = set() for child_op in op.outputs[0].child_ops: if child_op in visited: continue visited.add(child_op) params_name_mapping = OrderedDict() for k, v in child_op.inputs.items(): if _is_valid_const(v.val, weight_threshold): params_name_mapping[k] = v.op.name child_ops.append( CoreMLOpMetaData( op_type=child_op.op_type, name=child_op.name, params_name_mapping=params_name_mapping, ) ) return CoreMLWeightMetaData(op.val.val, child_ops=child_ops) prog = _model_utils._convert_model_spec_to_pymil_prog( mlmodel, mlmodel.get_spec().specificationVersion, _milproto_to_pymil.load ) res = _MetaDataDict({}) def get_weights_meta_block(block): # get the candidates ops with the given op_type candidate_ops = [] for op in block.operations: for b in op.blocks: get_weights_meta_block(b) if op.op_type == "const" and _is_valid_const(op.val.val, weight_threshold): candidate_ops.append(op) for op in tqdm( candidate_ops, desc="Getting Core ML weights meta data", unit=" ops", ): res[op.name] = _get_weight_metadata(op) for f in prog.functions.values(): get_weights_meta_block(f) return res @define(frozen=True) class CoreMLOpMetaData: """ A container class that stores op meta data. The class has the following attributes: Parameters ---------- op_type: str The type of the op. For instance: ``conv``, ``linear``, and so on. name: str The name of the op. params_name_mapping: dict[str, str] A dict that maps the op's constant parameters to its corresponding weight name. For instance, given a ``conv`` op with ``params_name_mapping``, .. sourcecode:: python { "weight": "conv_1_weight", "bias": "conv_1_bias", } means that the weight and bias of this op are named ``conv_1_weight``, ``conv_1_bias``, respectively. """ op_type: str = field(validator=validators.instance_of(str)) name: str = field(validator=validators.instance_of(str)) params_name_mapping: Dict[str, str] = field(validator=validators.instance_of(dict)) def __str__(self): res = f"{self.op_type}(name={self.name}" for k, v in self.params_name_mapping.items(): res += f", {k}={v}" res += ", ...)" return res @define(frozen=True) class CoreMLWeightMetaData: """ A container class that stores weight meta data. 
The class has the following attributes: Parameters ---------- val: numpy.ndarray The weight data. sparsity: float The percentile of the element whose absolute value ``<= 1e-12``. unique_values: int Number of unique values in the weight. child_ops: list[CoreMLOpMetaData] A list of ``CoreMLOpMetaData`` which contains information of child ops in which the weight is feeding into. The attributes can be accessed by: ``child_ops[idx].op_type``: The operation type of the ``idx`` 'th child op. ``child_ops[idx].name``: The name of the ``idx`` 'th child op. Other op-dependant attributes also can be accessed. For instance, if ``idx`` 'th child op is a ``conv`` layer, ``child_ops[idx].weight`` will return its weight name. For more details, please refer to the ``CoreMLOpMetaData`` doc string. Examples -------- .. sourcecode:: python import numpy as np from coremltools.optimize.coreml import CoreMLWeightMetaData data = np.array([[1.0, 0.0], [0.0, 6.0]], dtype=np.float32) meta_data = CoreMLWeightMetaData(data) print(meta_data) Outputs:: [ val: np.ndarray(shape=(2, 2), dtype=float32) sparsity: 0.5 unique_values: 3 ] """ val: np.ndarray = field(validator=validators.instance_of(np.ndarray)) sparsity: Optional[float] = field(validator=validators.instance_of(float)) unique_values: Optional[int] = field(validator=validators.instance_of(int)) child_ops: Optional[List[CoreMLOpMetaData]] = field( default=None, validator=validators.optional(validators.instance_of(list)) ) @sparsity.default def _get_sparsity(self): num_of_zeros = np.sum(np.abs(self.val) <= 1e-12) return num_of_zeros / self.val.size @unique_values.default def _get_unique_values(self): return len(np.unique(self.val)) def __str__(self): res = "[ \n" res += f" val: np.ndarray(shape={self.val.shape}, dtype={self.val.dtype})\n" res += f" sparsity: {self.sparsity}\n" res += f" unique_values: {self.unique_values}\n" if self.child_ops is not None: res += " child_ops: [\n" for child_op in self.child_ops: res += f" {child_op}\n" res += " ]\n" res += "]" return res ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/coreml/_quantization_passes.py0000644000000000000000000017317214672066616025012 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import atexit from itertools import repeat from multiprocessing import Pool from typing import Callable, List, Optional, Tuple, Union import numpy as np from tqdm import tqdm from coremltools import _logger as logger from coremltools.converters.mil._deployment_compatibility import AvailableTarget from coremltools.converters.mil.backend.mil.load import should_use_weight_file from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Operation, Program, types from coremltools.converters.mil.mil.block import is_current_opset_version_compatible_with from coremltools.converters.mil.mil.ops.defs.iOS16 import constexpr_affine_dequantize from coremltools.converters.mil.mil.ops.defs.iOS16 import ( constexpr_lut_to_dense as constexpr_lut_to_dense_ios16, ) from coremltools.converters.mil.mil.ops.defs.iOS16 import ( constexpr_sparse_to_dense as constexpr_sparse_to_dense_ios16, ) from coremltools.converters.mil.mil.ops.defs.iOS18 import ( constexpr_blockwise_shift_scale, constexpr_lut_to_dense, constexpr_sparse_to_dense, ) from coremltools.converters.mil.mil.passes.defs.quantization import AbstractQuantizationPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass from coremltools.converters.mil.mil.var import Var from coremltools.models._deprecation import deprecated from coremltools.models.neural_network.quantization_utils import _get_kmeans_lookup_table_and_weight from coremltools.optimize.coreml import _utils as optimize_utils from coremltools.optimize.coreml._config import ( CompressionGranularity, OpLinearQuantizerConfig, OpMagnitudePrunerConfig, OpPalettizerConfig, OpThresholdPrunerConfig, OptimizationConfig, ) class AbstractCompressionPass(AbstractQuantizationPass): """ The abstract class for the compression graph passes. """ _MINIMUM_OPSET_VERSION = AvailableTarget.iOS16 # Graph pass option for setting compression config. _config: Optional[OptimizationConfig] = None # Graph pass option for enabling joint compressions. _joint_compression: bool = False @property def config(self) -> OptimizationConfig: return self._config @config.setter def config(self, value: OptimizationConfig): self._check_config_type(value) self._config = value if value._op_selector is not None: self.op_selector = value._op_selector @property def joint_compression(self): return self._joint_compression @joint_compression.setter def joint_compression(self, joint_compression: bool): if not isinstance(joint_compression, bool): raise ValueError( f"joint_compression only supports bool, but got {type(joint_compression)}" ) self._joint_compression = joint_compression def __init__(self, config: OptimizationConfig = None, fake_compression: bool = False): if not isinstance(config, (OptimizationConfig, type(None))): raise ValueError(f"config must be of type OptimizationConfig. 
Got {type(config)}.") op_selector = None if config is None else config._op_selector super().__init__(op_selector=op_selector) self.fake_compression = fake_compression self._config = config if config is not None: self._check_config_type(config) def apply(self, prog): if not isinstance(prog, Program): raise TypeError('Transform "{}" can only be applied on PyMIL programs.'.format(self)) @block_context_manager def apply_block(block): if not is_current_opset_version_compatible_with(self._MINIMUM_OPSET_VERSION): logger.warning( f"The program's opset is not compatible with {self._MINIMUM_OPSET_VERSION}. " f"Skipped the compression pass {self.__class__}.") return if self._joint_compression and not is_current_opset_version_compatible_with( AvailableTarget.iOS18 ): raise ValueError( "Joint compression is only supported since iOS18. Please set the " "minimum deployment target to iOS18 if you want to use it." ) valid_consts = [] for op in list(block.operations): for b in op.blocks: apply_block(b) if self.is_valid_op(op): need_transform = True if self.op_selector is not None: need_transform = self.op_selector(op) if need_transform: valid_consts.append(op) for op in tqdm( valid_consts, desc=f"Running compression pass {self.__class__.__name__}", unit=" ops", ): self.transform_op(op) for f in prog.functions.values(): apply_block(f) def need_compress_const( self, op: Operation, _is_deprecated: bool, weight_threshold: float ) -> bool: """ The utility function is checking whether a const op can be compressed. If ``_is_deprecated = True``, the user is using the ``ct.compression_utils``, in which the ops are already filtered by ``op_selector``. For the new ``ct.optimize.coreml`` API, ``op_selector`` is no longer supported, so the ``weight_threshold`` is checked explicitly instead. """ val = self._get_const_value(op) if _is_deprecated and weight_threshold != None: raise ValueError("weight_threshold cannot be set through the deprecated ct.compression_util API") if _is_deprecated: return should_use_weight_file(val) if not self._validate_child_constexpr_for_compress(op): return False if weight_threshold is None: raise ValueError("weight_threshold cannot be None") # Disable 1D tensor compression due to MIL 1D Tensor bug (rdar://113860800). if ( not op.outputs[0].child_ops[0].op_type.startswith("constexpr_") and op.outputs[0].rank <= 1 ): return False return ( should_use_weight_file(val) and self._get_weight_to_compress_size(op) > weight_threshold ) def _validate_child_constexpr_for_compress(self, op: Operation) -> bool: """Check if child constexpr ops support current op to be compressed.""" for child_op in op.outputs[0].child_ops: if child_op.op_type.startswith("constexpr_"): # Const fed into constexpr_ ops cannot be further compressed. return False return True def _check_config_type(self, config: OptimizationConfig): """ The utility function is checking the OptimizationConfig is holding correct type of op config. 
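For example, the ``palettize_weights`` pass only accepts ``OpPalettizerConfig`` entries (its ``_SUPPORTED_CONFIG_TYPE``), so an ``OptimizationConfig`` that carries, say, an ``OpLinearQuantizerConfig`` at the global, op type, or op name level is rejected here with a ``ValueError``.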
""" def get_supported_types_as_str(supported_type): if not isinstance(supported_type, (tuple, list)): supported_type = [supported_type] return ", ".join([f"{val.__name__}" for val in supported_type]) all_configs = [] if config.global_config is not None: all_configs.append(config.global_config) all_configs.extend(list(config.op_type_configs.values())) all_configs.extend(list(config.op_name_configs.values())) for config in all_configs: if not isinstance(config, self._SUPPORTED_CONFIG_TYPE) and config is not None: supported_type_str = get_supported_types_as_str(self._SUPPORTED_CONFIG_TYPE) raise ValueError(f"{self.__class__.__name__} only accept {supported_type_str} type config. Got {config.__class__.__name__}.") def is_valid_op(self, op: Operation): if op.op_type == "const" and should_use_weight_file(self._get_const_value(op)): return True return False def _get_const_value(self, op: Operation) -> np.ndarray: if op.op_type != "const": raise ValueError(f"The op {op} is not a const") return op.outputs[0].val def _get_weight_to_compress_size(self, op: Operation) -> int: """ For joint compression, the constexpr op is the intermediate compressed result, so we need to go along the constexpr op chain to get the op which actually is the weight need to be compressed. For example, the op could be a const feed into constexpr_lut_to_dense as indices, and the constexpr_lut_to_dense is fed into a conv op. In this case, we need to find the original weight of the conv op, instead of using the const indices to determine if we want to compress the op. """ if not (op.op_type == "const" or op.op_type.startswith("constexpr_")): raise ValueError(f"Only support const or constexpr ops, but got {op.op_type}") if self.joint_compression: for op_output in op.outputs: # If the current const/constexpr is used in multiple ops, we do a depth-first # search to find the endpoint of the chained const/constexpr ops. for child_op in op_output.child_ops: if child_op.op_type.startswith("constexpr_"): return self._get_weight_to_compress_size(child_op) else: # The child op is not constexpr, which means the current op is the real # weight (not intermediate constexpr) that need compression. return np.prod(op.outputs[0].shape) if op.op_type != "const": raise ValueError("Only const weight can be compressed") return np.prod(op.outputs[0].shape) @register_pass(namespace="compression") class prune_weights(AbstractCompressionPass): """ This transform works for each ``const`` op if: - ``_is_deprecated=True`` and the ``op_selector`` returns ``True``. - ``_is_deprecated=False`` and the ``const`` value size ``> weight_threshold``. The transform performs the following: - The fraction of values with the least absolute value are zeroed out (self.sparsity). - If ``fake_compression=False``, the zeroed-out value is encoded using the ``constexpr_sparse_to_dense`` op. - If ``fake_compression=True``, the zeroed-out value is encoded using the ``const`` op. - Old ``const`` is replaced by a new operation with zeroed-out value. When the `joint_compression` option is set, for each existing compressed constexpr op, it will check if the result is sparse. If the result is sparse, it will replace the constexpr op by the corresponding sparse version to support joint compression. More specifically: - For quantization, `constexpr_blockwise_shift_scale` is replaced by `constexpr_sparse_blockwise_shift_scale` + `constexpr_sparse_to_dense` if the dequantized result is sparse. 
- For palettization, `constexpr_lut_to_dense` is replaced by `constexpr_lut_to_sparse` + `constexpr_sparse_to_dense` if the depalettized result is sparse. .. code-block:: Input graph: constexpr_blockwise_shift_scale -> downstream op Output graph: constexpr_sparse_blockwise_shift_scale -> constexpr_sparse_to_dense -> downstream op Support Options: - ``joint_compression``: Enable joint compression. Similar to blockwise_quantize_weights and """ _SUPPORTED_CONFIG_TYPE = (OpMagnitudePrunerConfig, OpThresholdPrunerConfig) # Ops to be further pruned for joint compression. _JOINT_SUPPORT_OPS = {"constexpr_blockwise_shift_scale", "constexpr_lut_to_dense"} def is_valid_op(self, op: Operation): if not self.joint_compression: return super().is_valid_op(op) if op.op_type in self._JOINT_SUPPORT_OPS and should_use_weight_file( self._get_const_value(op) ): return True return False def _get_const_value(self, op: Operation) -> np.ndarray: if op.op_type == "const" or not self.joint_compression: return super()._get_const_value(op) elif op.op_type.startswith("constexpr_"): # The materialized_val_inference is expensive, so only do it for joint compression, as # we need to get the de-compressed value and prune it. return op.materialized_val_inference() else: raise ValueError(f"The op {op} is not a const/constexpr.") @staticmethod def _produce_sparse_param(val) -> optimize_utils.SparseParamsIos16: flattened_val = val.flatten() return optimize_utils.SparseParamsIos16( nonzero_data=flattened_val[np.where(flattened_val != 0)], mask=np.packbits(np.where(flattened_val != 0, 1, 0), bitorder="little"), shape=val.shape, ) @staticmethod def compress_by_threshold( val, threshold, minimum_sparsity_percentile ) -> Optional[optimize_utils.SparseParamsIos16]: val = np.where(np.abs(val) <= threshold, 0, val) sparsity_percentile = np.sum(val == 0.0) / val.size if sparsity_percentile < minimum_sparsity_percentile: msg = (f"weight value has sparsity of {sparsity_percentile} < " f"minimum_sparsity_percentile {minimum_sparsity_percentile}. Skipped." ) logger.warning(msg) return None return prune_weights._produce_sparse_param(val) @staticmethod def compress_by_magnitude( val, target_sparsity, block_size=None, dim=None ) -> Optional[optimize_utils.SparseParamsIos16]: def _apply_block_sparsity(val, block_size, dim): shape = val.shape rank = len(shape) assert dim in [0, 1], "bock sparsity pruning only supports dim [0, 1]." assert rank in [2, 3, 4, 5], "block sparsity only supports weights of rank [2, 3, 4, 5]" """ Block sparsity follows these steps: 1. Input tensor with shape of ``[C_out, Cin, *K]``. 2. If ``dim = 1``, the tensor is transposed to ``[Cin, C_out, *K]``. The following example assumes ``dim = 0``. 3. Pad ``C_out`` so that it can be divided by ``block_size``: ``[C_out_pad, Cin, *K]``. 4. Divide the output channel by ``block_size`` and reshape: ``[C_out_pad // block_size, block_size, C_in, *K]``. 5. Compute the magnitude for each block: ``[C_out_pad // block_size, 1, C_in, *K]``. 6. Replicate the magnitude values for each block: ``[C_out_pad // block_size, block_size, C_in, *K]``. 7. Reshape the tensor back to ``[Cout_pad, C_in, *K]``. 8. Crop the tensor to ``[C_out, C_in, *K]``. 9. If ``dim = 1``, transpose the tensor back to the original layout. 
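As a concrete sketch (shapes are illustrative): with ``dim = 0``, ``block_size = 2``, and a weight of shape ``[4, 6]``, no padding is needed, the tensor is viewed as ``[2, 2, 6]``, the per-block L2 magnitude of shape ``[2, 1, 6]`` is tiled back to ``[4, 6]``, and the caller then zeroes every element whose block magnitude falls below the ``target_sparsity`` percentile, so elements are removed in blocks of two consecutive output channels at each input-channel position.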
""" if dim == 1: perm = [1, 0] + list(range(2, rank)) val = np.transpose(val, axes=perm) channel = val.shape[0] if channel % block_size != 0: pad_size = block_size - channel % block_size pad_value = [(0, pad_size)] + [(0, 0)] * (rank - 1) val = np.pad(val, pad_value) shape_padded = val.shape assert shape_padded[0] % block_size == 0 new_shape = list(shape_padded) new_shape.insert(1, block_size) new_shape[0] = new_shape[0] // block_size val = np.reshape(val, (new_shape)) val = val * val val = np.sum(val, axis=1, keepdims=True) val = np.sqrt(val) reps = [1] * (rank + 1) reps[1] = block_size val = np.tile(val, reps) val = np.reshape(val, shape_padded) val = val[:channel] if dim == 1: val = np.transpose(val, axes=perm) return val magnitude_map = np.abs(val) if block_size is not None: channel = magnitude_map.shape[dim] if block_size > channel / 2: logger.warning( f"block_size > channel / 2 is not applicable for block sparsity. Got block_size = {block_size}, channel = {channel}. Skipped." ) return None magnitude_map = _apply_block_sparsity(magnitude_map, block_size, dim) q = target_sparsity * 100 if q == 100: val = 0 * val elif q != 0: val = np.where(magnitude_map <= np.percentile(magnitude_map, q), 0, val) return prune_weights._produce_sparse_param(val) @staticmethod def compress_by_nm_sparsity(val, n_m_ratio, dim) -> Optional[optimize_utils.SparseParamsIos16]: n, m = n_m_ratio assert n <= m shape = val.shape rank = len(shape) assert dim in [0, 1], "n:m pruning only supports dim [0, 1]." assert rank in [2, 3, 4, 5], "m:m pruning only supports weights of rank [2, 3, 4, 5]" """ The `n-m` pruning process follows these steps: 1. Input tensor with shape of ``[C_out, C_in, *K]``, where ``K`` is the spatial dimension from ``0`` to ``3``. 2. If ``axis = 1``, transpose the tensor to shape ``[*K, C_out, C_in]``; otherwise, ``(axis = 0)`` to ``[*K, C_in, C_out]``. 3. For the case of ``axis = 1``, reshape input to a 2D tensor ``[*K*C_out, C_in]``. Similar for ``axis = 0``. 4. Pad the last dimension with ``0`` so that it can be divided by ``m``: ``[*K*C_out, C_in_pad]``. 5. Reshape the tensor to have the last dimension ``m``: ``[*K*C_out*C_in_pad//m, m]``. 6. For each vector of length ``m``, we set the lowest ``n`` magnitute elements to ``0``. 7. Reshape the tensor back to the shape of ``[*K*C_out, C_in_pad]``. 8. Crop the last dimension to match the original shape of ``[*K*C_out, C_in]``. 9. Reshape the tensor to shape ``[*K, C_out, C_in]``. 10. Transpose the tensor back to ``[C_out, C_in, K]``. """ perm = list(range(2, rank)) + [0, 1] if dim == 0: perm[-2], perm[-1] = 1, 0 weight = np.copy(np.transpose(val, axes=perm)) shape_begin = weight.shape weight = np.reshape(weight, (-1, weight.shape[-1])) channel = weight.shape[-1] if m > channel / 2: logger.warning( f"m > channel / 2 is not applicable for n:m pruning. Got m = {m}, channel = {channel}. Skipped." 
) return None if channel % m != 0: pad_size = m - channel % m weight = np.pad(weight, ((0, 0), (0, pad_size))) shape_padded = weight.shape assert shape_padded[-1] % m == 0 weight = np.reshape(weight, (-1, m)) magnitute = np.abs(weight) indices = np.argsort(magnitute, axis=-1)[:, :n] n_m_mask = np.zeros(weight.shape).astype(val.dtype) np.put_along_axis(n_m_mask, indices, 1.0, axis=-1) n_m_mask = np.reshape(n_m_mask, shape_padded) n_m_mask = n_m_mask[:, :channel] n_m_mask = np.reshape(n_m_mask, shape_begin) perm_back = [perm.index(i) for i in range(rank)] n_m_mask = np.transpose(n_m_mask, axes=perm_back) val = val * (1 - n_m_mask) return prune_weights._produce_sparse_param(val) @staticmethod def decompress( params: Union[optimize_utils.SparseParamsIos16, optimize_utils.SparseParams] ) -> np.ndarray: if isinstance(params, optimize_utils.SparseParamsIos16): return constexpr_sparse_to_dense_ios16.decompress( params.nonzero_data, params.mask, params.shape ) elif isinstance(params, optimize_utils.SparseParams): return constexpr_sparse_to_dense.decompress(params.nonzero_data, params.mask) else: raise ValueError("Invalid type of params") @staticmethod def _create_constexpr_var( op: Operation, sparse_params: optimize_utils.SparseParams, joint_compression: bool = False ) -> Var: if not is_current_opset_version_compatible_with(AvailableTarget.iOS18): sparse_params_ios16 = optimize_utils.ios18_sparse_params_to_ios16(sparse_params) return mb.constexpr_sparse_to_dense( nonzero_data=sparse_params_ios16.nonzero_data, mask=sparse_params_ios16.mask, shape=np.uint32(sparse_params_ios16.shape), before_op=op, name=op.name + "_sparsified", ) mask = sparse_params.mask nonzero_data = sparse_params.nonzero_data if joint_compression: if op.op_type == "constexpr_blockwise_shift_scale": mask, nonzero_data = mb.constexpr_sparse_blockwise_shift_scale( data_mask=mask, nonzero_data=op.data.val[mask != 0].flatten(), scale=op.scale, offset=op.offset, before_op=op, ) elif op.op_type == "constexpr_lut_to_dense": mask, nonzero_data = mb.constexpr_lut_to_sparse( indices_mask=mask, indices_nonzero_data=op.indices.val[mask != 0].flatten(), lut=op.lut, vector_axis=op.vector_axis, before_op=op, ) return mb.constexpr_sparse_to_dense( nonzero_data=nonzero_data, mask=mask, before_op=op, name=op.name + "_sparsified", ) def transform_op(self, op: Operation): op_config = self.config._get_const_op_config(op) if op_config is None: return if not self.need_compress_const(op, self.config._is_deprecated, op_config.weight_threshold): return const_val = self._get_const_value(op) if not isinstance(const_val, (np.ndarray, np.generic)): raise ValueError("Only numpy arrays are supported") sparse_params: Optional[optimize_utils.SparseParamsIos16] = None skip_msg = f"op named {op.name} not applicable for {op_config} configuration. Skipped." 
if isinstance(op_config, OpThresholdPrunerConfig): sparse_params = self.compress_by_threshold( val=const_val, threshold=op_config.threshold, minimum_sparsity_percentile=op_config.minimum_sparsity_percentile, ) elif isinstance(op_config, OpMagnitudePrunerConfig): # Structural sparsity can only be applied to conv / linear weight # For non applicable constant, we skip the compression, # we do allow the user to do structural pruning for non applicable constant, # if it is explicitly set by set_op_name, if not op_config._check_const_op_is_valid(op): if op.name not in self.config.op_name_configs: logger.warning(skip_msg) return if op_config.target_sparsity is not None: sparse_params = self.compress_by_magnitude( val=const_val, target_sparsity=op_config.target_sparsity, block_size=op_config.block_size, dim=op_config.dim, ) elif op_config.n_m_ratio is not None: sparse_params = self.compress_by_nm_sparsity( val=const_val, n_m_ratio=op_config.n_m_ratio, dim=op_config.dim, ) if sparse_params is None: logger.warning(skip_msg) return sparse_params: optimize_utils.SparseParams = optimize_utils.ios16_sparse_params_to_ios18( sparse_params ) if not self.fake_compression: new_var = self._create_constexpr_var( op, sparse_params, joint_compression=self.joint_compression and op.op_type in self._JOINT_SUPPORT_OPS, ) else: decompressed_val = self.decompress(sparse_params) new_var = mb.const( val=decompressed_val, before_op=op, name=op.name + "_fake_sparsified", ) op.enclosing_block.replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=new_var, no_check_var_types=True, force_replace=True, # Need force_replace to replace the constexpr. ) op.enclosing_block.remove_ops([op]) @register_pass(namespace="compression") class palettize_weights(AbstractCompressionPass): """ This transform works for each ``const`` op if: - ``_is_deprecated=True`` and the ``op_selector`` returns ``True``. - ``_is_deprecated=False`` and the ``const`` value size ``> weight_threshold``. The transform performs the following: - A linear look-up table (LUT) with 2\ :sup:`nbits` entries is created with values represented by indexing into this LUT. - If ``fake_compression=False``, compressed value is encoded using the ``constexpr_lut_to_dense`` op. - If ``fake_compression=True``, compressed value is decompressed and then encoded using the ``const`` op. - Old ``const`` op is replaced by a newly created operation. Here is an example for input and output graph of this graph pass: .. code-block:: Input graph: const -> downstream op Output graph: constexpr_lut_to_dense -> downstream op Support Options: - ``joint_compression``: Enable joint compression by quantizing an already compressed model. What op could be further quantized is in `_validate_child_constexpr_for_compress`. Using pruning + palettization as an example, for each existing ``constexpr_sparse_to_dense`` op, it tries to palettize the non-sparse elements in the spasified data, which could be represented as: - For each existing ``constexpr_sparse_to_dense`` op, it tries to palettize the non-sparse elements in the spasified data, which could be represented as: .. code-block:: Input graph: sparse weight(fp16) -> constexpr_sparse_to_dense -> dense weight(fp16) Output graph: sparse lut(int8) -> constexpr_lut_to_sparse -> sparse weight(fp16) -> constexpr_sparse_to_dense -> dense weight(fp16) For details about different palettization schemas, see `OpPalettizerConfig` for more details. 
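The following is a minimal sketch (the model path is a placeholder) of driving this
pass through the public ``coremltools.optimize.coreml.palettize_weights`` API on a model
whose weights are already stored sparsely, so that the non-zero entries are palettized
on top of the existing pruning. Note that joint compression requires a minimum
deployment target of iOS18.

.. code-block:: python

    import coremltools as ct
    import coremltools.optimize as cto

    # Placeholder for a model that was previously pruned (weights stored via
    # constexpr_sparse_to_dense ops).
    pruned_model = ct.models.MLModel("my_pruned_model.mlpackage")

    config = cto.coreml.OptimizationConfig(
        global_config=cto.coreml.OpPalettizerConfig(mode="kmeans", nbits=4)
    )
    jointly_compressed_model = cto.coreml.palettize_weights(
        pruned_model, config, joint_compression=True
    )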
""" _SUPPORTED_CONFIG_TYPE = OpPalettizerConfig _SUPPORTED_NBITS = (1, 2, 3, 4, 6, 8) _compress_pool: Optional[Pool] = None def __del__(self): if palettize_weights._compress_pool is not None: palettize_weights._compress_pool.close() def _validate_child_constexpr_for_compress(self, op: Operation) -> bool: """ Determines which pattern supports joint compression. In iOS18 joint compression, the quantized/sparsified data could be further palettized. For each specific op, we only palettize the specific input: - constexpr_sparse_to_dense's nonzero_data - constexpr_blockwise_shift_scale's data """ if ( is_current_opset_version_compatible_with(AvailableTarget.iOS18) and self.joint_compression ): if len(op.outputs[0].child_ops) == 1: child_op = op.outputs[0].child_ops[0] if ( child_op.op_type == "constexpr_sparse_to_dense" and child_op.nonzero_data == op.outputs[0] ): return True elif ( child_op.op_type == "constexpr_blockwise_shift_scale" and child_op.data == op.outputs[0] ): return True return super()._validate_child_constexpr_for_compress(op) @staticmethod def _get_nbits_for_unique_mode( val: np.ndarray, allowed_nbits: Tuple[int, ...], cluster_dim: int = 1, vector_axis: Optional[int] = None, ) -> int: """ Try each nbit in allowed_nbits to find one that can represent number of unique values in val. If cluster_dim > 1, it's for vector palettization, where the unique means vector unique. The vector_axis is only effective for vector palettization, which indicates on which axis the vector is. Note that the values in `allowed_nbits` need to be in ascending order. """ if cluster_dim == 1: val = val.flatten() unique_vals_num = len(np.unique(val)) else: # Vector palettization where each cluster_dim elements form a vector on vector_axis. if vector_axis is None: raise ValueError("The `vector_axis` must be specified when cluster_dim > 1") val = np.swapaxes(val, -1, vector_axis).reshape((-1, cluster_dim)) unique_vals_num = len(np.unique(val, axis=0)) for nbits in allowed_nbits: if unique_vals_num <= 1 << nbits: return nbits raise ValueError( f"Unique values in weight cannot be represented by {allowed_nbits[-1]} " "bits palettization." 
) @staticmethod def _get_lut_and_indices( val: np.ndarray, mode: str, nbits: Optional[int], lut_function: Optional[Callable], cluster_dim: int = 1, vector_axis: Optional[int] = None, ) -> Tuple[np.ndarray, np.ndarray]: """Calculate look-up-table (LUT) and indices.""" def compress_kmeans(val, nbits, cluster_dim, vector_axis): lut, indices = _get_kmeans_lookup_table_and_weight( nbits, val, force_kmeans1d=False, cluster_dim=cluster_dim, vector_axis=vector_axis ) lut = lut.astype(val.dtype) indices = indices.astype(np.uint8) return lut, indices def compress_uniform(val, nbits): val = val.flatten() val_min = np.amin(val) val_max = np.amax(val) scale = (val_max - val_min) / ((1 << nbits) - 1) indices = np.round(((val - val_min) / (val_max - val_min)) * ((1 << nbits) - 1)).astype( np.uint8 ) lut = np.array(range(0, 1 << nbits)) * scale + val_min lut = lut.astype(val.dtype) return lut, indices def compress_unique(val, nbits, cluster_dim, vector_axis): if nbits is None: nbits = palettize_weights._get_nbits_for_unique_mode( val, palettize_weights._SUPPORTED_NBITS, cluster_dim, vector_axis, ) if cluster_dim > 1: val = optimize_utils.reshape_weight_for_vector_lut(val, cluster_dim, vector_axis) val = val.reshape((-1, cluster_dim)) unique_vals, unique_inverse = np.unique(val, axis=0, return_inverse=True) lut = np.zeros((1 << nbits, cluster_dim)) lut[: len(unique_vals)] = unique_vals indices = unique_inverse indices = indices.flatten() if cluster_dim == 1: # Squeeze the last dim to make behaviors back compatible with scalar palettization. lut = lut.squeeze(-1) return lut.astype(val.dtype), indices.astype(np.uint8) if mode == "KMEANS": lut, indices = compress_kmeans(val, nbits, cluster_dim, vector_axis) elif mode == "UNIFORM": if cluster_dim > 1: raise NotImplementedError( "Vector palettization (cluster_dim > 1) doesn't support UNIFORM mode." ) lut, indices = compress_uniform(val, nbits) elif mode == "UNIQUE": lut, indices = compress_unique(val, nbits, cluster_dim, vector_axis) else: if mode != "CUSTOM": raise AssertionError(f"Invalid mode {mode}") lut, indices = lut_function(val) return lut, indices @staticmethod @deprecated( suffix="Please use coremltools.optimize.coreml.palettize_weights.blockwise_compress", version="8.2", obj_prefix="coremltools.optimize.coreml.palettize_weights.", ) def compress(val, mode, nbits=None, lut_function=None) -> optimize_utils.LutParamsIos16: """ [Legacy] Per-tensor palletization. This API is for backward compatibility only. It's no longer used inside the coremltools. It's recommended to use `blockwise_compress` instead, which is more general. """ def check_lut_parameters_are_valid(val, lut, indices): if not isinstance(lut, np.ndarray) or not isinstance(indices, np.ndarray): raise ValueError("LUT and indices must be type of numpy array.") if indices.size != val.size: msg = "Indices size ({}) mismatched with the original weight({}).".format( indices.size, val.size ) raise ValueError(msg) if len(indices.shape) != 1 or indices.dtype != np.uint8: msg = "Indices must be a numpy vector of type uint8. Found shape {} with type {}".format( indices.shape, indices.dtype ) raise ValueError(msg) if lut.dtype != val.dtype: msg = "Dtype mismatched between LUT ({}) and weight ({})".format( lut.dtype, val.dtype ) raise ValueError(msg) if not isinstance(val, (np.ndarray, np.generic)): raise ValueError(f"Only numpy arrays are supported. 
Got {type(val)}") lut, indices = palettize_weights._get_lut_and_indices(val, mode, nbits, lut_function) check_lut_parameters_are_valid(val, lut, indices) params = optimize_utils.LutParamsIos16( lut=lut, indices=optimize_utils.pack_elements_into_bits(indices, int(np.log2(lut.shape[0]))), shape=val.shape, ) return params @staticmethod def blockwise_compress( original_data: np.ndarray, mode: str, nbits: Optional[int], block_sizes: List[int], lut_function: Optional[Callable] = None, cluster_dim: int = 1, channel_axis: Optional[int] = None, num_kmeans_workers: int = 1, ) -> Optional[optimize_utils.LutParams]: """ Compress original_data into n-bit representation by palettization. Supported nbits: 1, 2, 3, 4, 6, 8 Supported mode: KMEANS, UNIFORM, UNIQUE, CUSTOM block_sizes: Each element is the block size on corresponding axis for original_data. cluster_dim: Dimension of each cluster centroid, which is the length of each element in the lookup table. channel_axis: Only useful for vector palettization (cluster_dim > 1). If not provided, we will try to infer it from `block_sizes`. Returns None if the weight cannot be compressed (for example, the dim size on an axis is not divisible by the corresponding block_size). """ # TODO (rdar://127342739): Support more general blockwise palettization. # As general blockwise palettization hasn't been supported yet, we try to infer channel axis # and channel group size from block_sizes, and use grouped channelwise palettization instead. channel_group_size = 0 for axis, block_size in enumerate(block_sizes): if block_size != 0 and block_size != original_data.shape[axis]: if channel_axis is not None and channel_axis != axis: raise NotImplementedError( "General block-wise palettization is not supported. Please use " "'per_grouped_channel' or 'per_tensor' for the 'granularity' in config." ) channel_axis = axis channel_group_size = block_size if channel_axis is None: if cluster_dim > 1: raise ValueError( "Cannot infer channel axis, which is required for vector palettization." ) # Per-tensor compression, just need to pick a dummy axis. channel_axis = 0 return palettize_weights.grouped_channelwise_compress( original_data, mode, nbits, channel_axis, channel_group_size, lut_function, cluster_dim, num_kmeans_workers, ) @staticmethod def grouped_channelwise_compress( original_data: np.ndarray, mode: str, nbits: Optional[int], channel_axis: int, channel_group_size: int, lut_function: Optional[Callable] = None, cluster_dim: int = 1, num_kmeans_workers: int = 1, ) -> Optional[optimize_utils.LutParams]: """ Compress original_data into n-bit representation by grouped channelwise palettization. Supported nbits: 1, 2, 3, 4, 6, 8 Supported mode: KMEANS, UNIFORM, UNIQUE, CUSTOM block_sizes: Each element is the block size on corresponding axis for original_data. cluster_dim: Dimension of each cluster centroid, which is the length of each element in the lookup table. Returns None if the weight cannot be compressed (for example, the dim size on an axis is not divisible by the corresponding channel_group_size). """ if not isinstance(original_data, np.ndarray): raise ValueError(f"Only numpy arrays are supported, but got {type(original_data)}") if nbits is not None and nbits not in palettize_weights._SUPPORTED_NBITS: raise ValueError( f"Invalid nbits. Support {palettize_weights._SUPPORTED_NBITS}, but got {nbits}" ) data_rank = len(original_data.shape) if not (-data_rank <= channel_axis < data_rank): raise ValueError( "Invalid channel_axis. 
Should be in range " f"[{-data_rank}, {data_rank}), but got {channel_axis}" ) if channel_axis < 0: channel_axis += len(original_data.shape) channel_num = original_data.shape[channel_axis] if channel_group_size == 0: channel_group_size = channel_num if channel_num % channel_group_size != 0: logger.warning( f"Can't perform palettization: The number of channels at {channel_axis}th axis " f"({channel_num}) is not divisible by channel_group_size ({channel_group_size})." ) return None channel_group_num = channel_num // channel_group_size if channel_group_size % cluster_dim != 0: logger.warning( f"Can't perform palettization: The channel_group_size at {channel_axis}th axis " f"({channel_group_size}) is not divisible by cluster_dim ({cluster_dim})." ) return None if channel_axis != 0: original_data = np.swapaxes(original_data, 0, channel_axis) grouped_channel_data = np.split(original_data, channel_group_num, axis=0) # As the channel axis has been swapped to 0th axis, use 0 for vector_axis. vector_axis = 0 # If mode is UNIQUE, infer nbits from the number of unique values in each group. if mode.upper() == "UNIQUE": try: for per_group_data in grouped_channel_data: per_group_nbits = palettize_weights._get_nbits_for_unique_mode( per_group_data, palettize_weights._SUPPORTED_NBITS, cluster_dim, vector_axis ) # Pick the largest per-channel nbits to be used as the nbits for the whole op. if nbits is None or per_group_nbits > nbits: nbits = per_group_nbits except ValueError as e: logger.warning(f"Can't perform palettization:{e}") return None # The subprocesses have overhead, so only use it for expensive computations (k-means). if mode.upper() == "KMEANS" and num_kmeans_workers > 1: if palettize_weights._compress_pool is None: palettize_weights._compress_pool = Pool(processes=num_kmeans_workers) atexit.register(lambda: palettize_weights._compress_pool.terminate()) lut, indices = zip( *palettize_weights._compress_pool.starmap( palettize_weights._get_lut_and_indices, zip( grouped_channel_data, repeat(mode), repeat(nbits), repeat(lut_function), repeat(cluster_dim), repeat(vector_axis), ), ) ) else: lut, indices = zip( *[ palettize_weights._get_lut_and_indices( per_channel_group_data, mode, nbits, lut_function, cluster_dim, vector_axis ) for per_channel_group_data in grouped_channel_data ] ) lut = np.stack(lut, axis=0) indices = np.stack(indices, axis=0) if mode.upper() == "CUSTOM": # The custom lut_function provided by users should have nbits info. # The current `lut` has shape [group_num, lut_entry_num, Optional[cluster_dim]]. nbits = int(np.ceil(np.log2(lut.shape[1]))) # The lut and indices from `_get_lut_and_indices` is flattened. The desired result should be # `lut` with shape [channel_group_num, palette_num], and `indices` with same shape as the # original_data. 
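# Shape walk-through with illustrative numbers (not tied to any particular model): for a
# rank-2 weight of shape [32, 16] with channel_axis = 0, channel_group_size = 8, nbits = 4
# and cluster_dim = 1, the reshapes below yield
#   indices: shape [32, 16] (same layout as the weight), uint4 dtype
#   lut:     shape [4, 1, 16, 1] == [channel_group_num, 1, palette_num, cluster_dim]
# matching the iOS18 constexpr_lut_to_dense convention that the LUT's rank is the data's
# rank plus 2.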
palette_num = 2**nbits indices_target_shape = list(original_data.shape) if cluster_dim > 1: indices_target_shape[vector_axis] //= cluster_dim indices = indices.reshape(indices_target_shape) lut_target_shape = [1] * (len(original_data.shape) + 2) lut_target_shape[0] = channel_group_num lut_target_shape[-1] = cluster_dim lut_target_shape[-2] = palette_num lut = lut.reshape(lut_target_shape) if channel_axis != 0: lut = np.swapaxes(lut, 0, channel_axis) indices = np.swapaxes(indices, 0, channel_axis) indices_np_dtype = types.nptype_from_builtin(types.string_to_builtin(f"uint{nbits}")) return optimize_utils.LutParams( indices.astype(indices_np_dtype), lut, None if cluster_dim == 1 else channel_axis ) @staticmethod def decompress(params: Union[optimize_utils.LutParamsIos16, optimize_utils.LutParams]): if isinstance(params, optimize_utils.LutParamsIos16): return constexpr_lut_to_dense_ios16.decompress(params.lut, params.indices, params.shape) elif isinstance(params, optimize_utils.LutParams): return constexpr_lut_to_dense.decompress(params.indices, params.lut, None) else: raise ValueError("Invalid type of params") @staticmethod def _create_constexpr_var(op: Operation, lut_params: optimize_utils.LutParams) -> Var: """Create constexpr lut op based on opset version.""" if not is_current_opset_version_compatible_with(AvailableTarget.iOS18): lut_params_ios16 = optimize_utils.ios18_lut_params_to_ios16(lut_params) return mb.constexpr_lut_to_dense( indices=lut_params_ios16.indices, lut=lut_params_ios16.lut, shape=np.uint32(lut_params_ios16.shape), before_op=op, name=op.name + "_palettized", ) return mb.constexpr_lut_to_dense( indices=lut_params.indices, lut=lut_params.lut, vector_axis=lut_params.vector_axis, before_op=op, name=op.name + "_palettized", ) def transform_op(self, op: Operation): op_config = self.config._get_const_op_config(op) if op_config is None: return if not self.need_compress_const(op, self.config._is_deprecated, op_config.weight_threshold): return weight_to_compress = op.outputs[0].val restore_original_dtype = None if self.joint_compression: child_op = op.outputs[0].child_ops[0] if child_op.op_type == "constexpr_sparse_to_dense": # When the child op is sparse_to_dense op, the weight_to_compress is the sparse # representation, which need to be restored to dense representation for compression. weight_to_compress = constexpr_sparse_to_dense.decompress( weight_to_compress, child_op.mask.val ) if types.is_int(op.outputs[0].dtype) and op.outputs[0].dtype.get_bitwidth() <= 8: # For small range int weights (e.g. int8 weight produced by quantization), convert # it to int32 first to avoid overflow during palettization. restore_original_dtype = op.outputs[0].dtype weight_to_compress = weight_to_compress.astype(np.int32) block_sizes, channel_axis = optimize_utils.infer_block_sizes( op, op_config, weight_to_compress, return_channel_axis=True ) if block_sizes is None: logger.warning( f"Cannot perform palettization on {op.name} as block_sizes is None. Skipped this op." ) return if op_config.cluster_dim > 1: if not optimize_utils.is_cluster_dim_valid(op, op_config.cluster_dim, channel_axis): logger.warning(f"The `cluster_dim` is invalid for {op.name}. Skipped this op.") return if op_config.enable_per_channel_scale: # Normalize by per channel scales before doing palettization. 
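# With toy numbers: per_channel_scale below holds max(|w|) taken along `channel_axis` (with
# keepdims). If that maximum is 4.0 for some slice, the slice is divided by 4.0 so the values
# handed to palettization lie in [-1, 1]; the scale is re-applied at load time through the
# constexpr_blockwise_shift_scale op emitted further below (iOS18+ only), or multiplied back
# directly in the fake-compression path.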
per_channel_scale = np.max(np.abs(weight_to_compress), axis=channel_axis, keepdims=True) per_channel_scale[per_channel_scale == 0] = 1 weight_to_compress /= per_channel_scale lut_params = self.blockwise_compress( weight_to_compress, op_config.mode, op_config.nbits, block_sizes, op_config.lut_function, op_config.cluster_dim, channel_axis=channel_axis, num_kmeans_workers=op_config.num_kmeans_workers, ) if lut_params is None: logger.warning(f"Cannot perform palettization on {op.name}. Skipped this op.") return if restore_original_dtype is not None: lut_params = lut_params._replace( lut=lut_params.lut.astype(types.nptype_from_builtin(restore_original_dtype)) ) if not self.fake_compression: new_var: Optional[Var] = None # Specially handle sparse-related compression ops chaining. if self.joint_compression: child_op = op.outputs[0].child_ops[0] if child_op.op_type == "constexpr_sparse_to_dense": mask, nonzero_data = mb.constexpr_lut_to_sparse( indices_mask=child_op.mask, indices_nonzero_data=lut_params.indices[child_op.mask.val != 0].flatten(), lut=lut_params.lut, vector_axis=lut_params.vector_axis, before_op=child_op, name=op.name + "_palettized", ) # Feed the sparse lut's nonzero_data output to the child sparse op. new_var = nonzero_data # For other cases, the new lut var could be constructed directly from lut_params. if new_var is None: new_var = self._create_constexpr_var(op, lut_params) if op_config.enable_per_channel_scale: if not is_current_opset_version_compatible_with(AvailableTarget.iOS18): raise ValueError( "Palettization with per-channel-scale is only supported since " "iOS18. Please set minimum_deployment_target accordingly." ) new_var = mb.constexpr_blockwise_shift_scale( data=new_var, scale=per_channel_scale, offset=None, before_op=op, name=op.name + "_palettized_pcs", ) else: decompressed_val = self.decompress(lut_params) if op_config.enable_per_channel_scale: decompressed_val *= per_channel_scale new_var = mb.const( val=decompressed_val, before_op=op, name=op.name + "_fake_palettized", ) op.enclosing_block.replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=new_var, no_check_var_types=True, ) op.enclosing_block.remove_ops([op]) @register_pass(namespace="compression") class linear_quantize_weights(AbstractCompressionPass): """ This transform works for each ``const`` op if: - ``_is_deprecated=True`` and the ``op_selector`` returns ``True``. - ``_is_deprecated=False`` and the ``const`` value size ``> weight_threshold``. The transform performs the following: - Values are linearly quantized into n-bit. - If ``fake_compression=False``, compressed value is encoded using the ``constexpr_affine_dequantize`` op (pre-iOS18) or the ``constexpr_blockwise_shift_scale`` op (iOS18). - If ``fake_compression=True``, compressed value is decompressed and then encoded using the ``const`` op. Here is an example for input and output graph of this graph pass: .. code-block:: Input graph: const -> downstream op Output graph: constexpr_blockwise_shift_scale -> downstream op Support Options: - ``joint_compression``: Enable joint compression by quantizing an already compressed model. What op could be further quantized is in `_validate_child_constexpr_for_compress`. Using palettization + quantization as an example, for each existing ``constexpr_lut_to_dense`` op, it tries to quantize the elements in the lut, which could be represented as: .. 
code-block:: Input graph: lut(fp16) -> constexpr_lut_to_dense -> dense(fp16) -> downstream op Output graph: lut(int8) -> constexpr_blockwise_shift_scale -> lut(fp16) -> constexpr_lut_to_dense -> dense(fp16) -> downstream op For details about different quantization schemas, see `OpLinearQuantizerConfig` for more details. """ _SUPPORTED_CONFIG_TYPE = OpLinearQuantizerConfig _MODE_DTYPE_TO_RANGE = { (types.int8, "LINEAR"): (-128, 127), (types.int8, "LINEAR_SYMMETRIC"): (-127, 127), (types.uint8, "LINEAR"): (0, 255), (types.uint8, "LINEAR_SYMMETRIC"): (0, 254), } def _validate_child_constexpr_for_compress(self, op: Operation) -> bool: """ Overrides external method to support joint compression for iOS18+. In iOS18 joint compression, the palettized/sparsified data could be further quantized. For each specific op, we only quantize the specific input: - constexpr_lut_to_dense's lut - constexpr_lut_to_sparse's lut - constexpr_sparse_to_dense's nonzero_data """ if ( is_current_opset_version_compatible_with(AvailableTarget.iOS18) and self.joint_compression ): if len(op.outputs[0].child_ops) == 1: child_op = op.outputs[0].child_ops[0] if child_op.op_type == "constexpr_lut_to_dense" and child_op.lut == op.outputs[0]: return True elif ( child_op.op_type == "constexpr_lut_to_sparse" and child_op.lut == op.outputs[0] ): return True elif ( child_op.op_type == "constexpr_sparse_to_dense" and child_op.nonzero_data == op.outputs[0] ): return True return super()._validate_child_constexpr_for_compress(op) @classmethod @deprecated( suffix="Please use coremltools.optimize.coreml.linear_quantize_weights.blockwise_compress", version="8.2", obj_prefix="coremltools.optimize.coreml.linear_quantize_weights.", ) def compress( cls, val: np.ndarray, axis: int, mode: str, dtype: type ) -> optimize_utils.QuantParamsIos16: """ [Legacy] Per-channel quantization on axis. This API is for backward compatibility only. It's no longer used inside the coremltools. It's recommended to use `blockwise_compress` instead, which is more general. """ if not isinstance(val, (np.ndarray, np.generic)): raise ValueError("Only numpy arrays are supported") if isinstance(dtype, np.dtype): dtype = types.numpy_type_to_builtin_type(dtype) if not types.is_builtin(dtype): raise ValueError(f"The input dtype is should be a built-in type, but got {type(dtype)}") block_sizes = [0] * len(val.shape) block_sizes[axis] = 1 quant_params = cls.blockwise_compress( val, nbits=dtype.get_bitwidth(), mode=mode, signed=not dtype.is_unsigned(), block_sizes=block_sizes, ) if quant_params is None: raise ValueError("Failed to quantize.") return optimize_utils.ios18_quant_params_to_ios16(quant_params) @classmethod def blockwise_compress( cls, original_data: np.ndarray, nbits: int, mode: str, signed: bool, block_sizes: List[int], ) -> Optional[optimize_utils.QuantParams]: """ Compress original_data into n-bit representation by quantization. mode: "LINEAR_SYMMETRIC" or "LINEAR". block_sizes: Each element is the block size on corresponding axis for original_data. Returns None if the weight cannot be compressed (for example, the dim size on an axis is not divisible by the corresponding block_size). 
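        A usage sketch (this is an internal helper; signature as defined here):

        .. code-block::

            weight = np.random.rand(64, 32).astype(np.float32)
            # 8-bit symmetric quantization with one scale per output channel (axis 0):
            params = linear_quantize_weights.blockwise_compress(
                weight, nbits=8, mode="LINEAR_SYMMETRIC", signed=True, block_sizes=[1, 0]
            )
            # params.data is int8 with the weight's shape, params.scale has shape [64, 1],
            # and params.offset is None for the symmetric signed case.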
""" if not isinstance(original_data, np.ndarray): raise ValueError("Only numpy arrays are supported") result = optimize_utils.compute_qparams( original_data, nbits, signed, mode, types.nptype_from_builtin(types.get_nbits_int_builtin_type(nbits, signed)), block_sizes, ) if result is None: return None quantized_data, scale, zero_point = result return optimize_utils.QuantParams( data=quantized_data, scale=scale, offset=zero_point, nbits=np.uint8(nbits) ) @staticmethod def decompress(params: Union[optimize_utils.QuantParamsIos16, optimize_utils.QuantParams]): if isinstance(params, optimize_utils.QuantParamsIos16): return constexpr_affine_dequantize.decompress( params.quantized_data, params.zero_point, params.scale, params.axis ) elif isinstance(params, optimize_utils.QuantParams): return constexpr_blockwise_shift_scale.decompress( params.data, params.scale, params.offset, ) else: raise ValueError("Invalid type of params") @staticmethod def _create_constexpr_var(op: Operation, quant_params: optimize_utils.QuantParams) -> Var: """Create constexpr quant op based on opset version.""" if not is_current_opset_version_compatible_with(AvailableTarget.iOS18): quant_params_ios16 = optimize_utils.ios18_quant_params_to_ios16(quant_params) return mb.constexpr_affine_dequantize( quantized_data=quant_params_ios16.quantized_data, zero_point=quant_params_ios16.zero_point, scale=quant_params_ios16.scale, axis=quant_params_ios16.axis, before_op=op, name=op.name + "_quantized", ) return mb.constexpr_blockwise_shift_scale( data=quant_params.data, scale=quant_params.scale, offset=quant_params.offset, before_op=op, name=op.name + "_quantized", ) def transform_op(self, op: Operation): op_config: Optional[OpLinearQuantizerConfig] = self.config._get_const_op_config(op) if op_config is None: return if not self.need_compress_const(op, self.config._is_deprecated, op_config.weight_threshold): return weight_to_compress = op.outputs[0].val if self.joint_compression: child_op = op.outputs[0].child_ops[0] if child_op.op_type == "constexpr_sparse_to_dense": # When the child op is sparse_to_dense op, the weight_to_compress is the sparse # representation, which need to be restored to dense representation for compression. weight_to_compress = constexpr_sparse_to_dense.decompress( weight_to_compress, child_op.mask.val ) elif child_op.op_type.startswith("constexpr_lut_to_"): if not op_config.granularity == CompressionGranularity.PER_TENSOR: raise NotImplementedError( "When use joint compression for palettization-quantization, please make " "sure to use per-tensor quantization, because the axis for the data to be" "quantized (palettization's lut) is different from the original weight." ) block_sizes = optimize_utils.infer_block_sizes(op, op_config, weight_to_compress) if block_sizes is None: logger.warning( f"Cannot perform quantization on {op.name} as block_sizes is None. Skipped this op." ) return quant_params = self.blockwise_compress( weight_to_compress, op_config.nbits, op_config.mode, op_config.signed, block_sizes, ) if quant_params is None: logger.warning(f"Cannot perform quantization on {op.name}. Skipped this op.") return if not self.fake_compression: new_var: Optional[Var] = None # Specially handle sparse-related compression ops chaining. 
if self.joint_compression: child_op = op.outputs[0].child_ops[0] if child_op.op_type == "constexpr_sparse_to_dense": mask, nonzero_data = mb.constexpr_sparse_blockwise_shift_scale( data_mask=child_op.mask, nonzero_data=quant_params.data[child_op.mask.val != 0].flatten(), scale=quant_params.scale, offset=quant_params.offset, before_op=child_op, name=op.name + "_quantized", ) # Feed the sparse quantization op's nonzero_data output to the child sparse op. new_var = nonzero_data elif child_op.op_type == "constexpr_lut_to_sparse": # Here we only quantize the lut itself, which is a dense data, so we cannot use # the sparse version of the quant op; instead we just use the dense version of # the quant op. Will change if backends don't support it. pass # For other cases, the new quant var could be constructed directly from quant_params. if new_var is None: new_var = self._create_constexpr_var(op, quant_params) else: decompressed_val = self.decompress(quant_params) new_var = mb.const( val=decompressed_val, before_op=op, name=op.name + "_fake_quantized", ) op.enclosing_block.replace_uses_of_var_after_op( anchor_op=op, old_var=op.outputs[0], new_var=new_var, no_check_var_types=True, ) op.enclosing_block.remove_ops([op]) @register_pass(namespace="compression") class WeightDecompressor(AbstractQuantizationPass): """ This graph pass transforms the ``constexpr`` op back into ``mb.const`` op. The ``constexpr`` op has op_type starts with the "constexpr_" prefix. """ def __init__(self, op_selector): super().__init__(op_selector=op_selector) def is_valid_op(self, op): return op.op_type is not None and op.op_type.startswith("constexpr_") def transform_op(self, op): decompressed_val = op.materialized_val_inference() if not isinstance(decompressed_val, (list, tuple)): decompressed_val = [decompressed_val] if len(decompressed_val) != len(op.outputs): raise ValueError( "The number of decompressed value should match the number of op outputs. " f"But got {len(decompressed_val)} vs {len(op.outputs)}" ) for decomp_val, output_var in zip(decompressed_val, op.outputs): new_const = mb.const(val=decomp_val, before_op=op, name=op.name) op.enclosing_block.replace_uses_of_var_after_op( anchor_op=op, old_var=output_var, new_var=new_const, no_check_var_types=True, force_replace=True, ) op.enclosing_block.remove_ops([op]) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/coreml/_utils.py0000644000000000000000000006451214672066616022043 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import math from collections import namedtuple from typing import List, Optional, Tuple, Union import numpy as np from coremltools import _getLogger from coremltools.converters.mil.mil import Operation, types from coremltools.optimize.coreml._config import ( CompressionGranularity, OpLinearQuantizerConfig, OpPalettizerConfig, ) _logger = _getLogger() SparseParamsIos16 = namedtuple("SparseParamsIos16", "nonzero_data mask shape") LutParamsIos16 = namedtuple("LutParamsIos16", "lut indices shape") QuantParamsIos16 = namedtuple("QuantParamsIos16", "quantized_data zero_point scale axis") SparseParams = namedtuple("SparseParams", "nonzero_data mask") LutParams = namedtuple("LutParams", "indices lut vector_axis") QuantParams = namedtuple("QuantParams", "data scale offset nbits") def get_quant_range(n_bits: int, signed: bool, mode: str) -> Tuple[int, int]: """ Utility to get the quantization range for a given quantization config Adapted from phoenix/quatization/_utils.py """ max_q = 2**n_bits if not signed: quant_min = 0 quant_max = max_q - 1 if mode == "LINEAR_SYMMETRIC": quant_max -= 1 else: quant_min = -max_q / 2 quant_max = max_q / 2 - 1 if mode == "LINEAR_SYMMETRIC": quant_min += 1 return int(quant_min), int(quant_max) def quantize_weight( weight: np.ndarray, axes: Tuple[int, ...], nbits: int, signed: bool, quantization_mode: str, dtype: np.dtype, ) -> Tuple[np.ndarray, np.ndarray, Optional[np.ndarray]]: """Get quantized data along with metadata (scale, zero_point).""" if not np.issubdtype(weight.dtype, np.floating): raise ValueError("Only floating numpy arrays are supported.") val_min = np.amin(weight, axis=axes, keepdims=True) val_max = np.amax(weight, axis=axes, keepdims=True) q_val_min, q_val_max = get_quant_range(nbits, signed, quantization_mode) zero_point = None if quantization_mode == "LINEAR_SYMMETRIC": # For the linear_symmetric quantization_mode, the range is symmetrical to 0 max_abs = np.maximum(np.abs(val_min), np.abs(val_max)) val_min = -max_abs val_max = max_abs if not signed: zero_point_shift = q_val_max // 2 zero_point = zero_point_shift * np.ones(val_min.shape) else: assert quantization_mode == "LINEAR" # For the linear quantization_mode, we need to make sure the data range contains `0` val_min = np.minimum(0.0, val_min) val_max = np.maximum(0.0, val_max) zero_point = (q_val_min * val_max - q_val_max * val_min) / (val_max - val_min) zero_point = np.round(zero_point) zero_point = np.clip(zero_point, q_val_min, q_val_max) scale = (val_max - val_min) / (q_val_max - q_val_min) quantized_data = np.round(weight / scale) if zero_point is not None: quantized_data += zero_point zero_point = zero_point.squeeze().astype(dtype) quantized_data = np.clip(quantized_data, q_val_min, q_val_max).astype(dtype) scale = scale.astype(weight.dtype).squeeze() return quantized_data, scale, zero_point def compute_qparams( weight: np.ndarray, nbits: int, signed: bool, quantization_mode: str, dtype: np.dtype, block_sizes: List[int], ) -> Optional[Tuple[np.ndarray, np.ndarray, Optional[np.ndarray]]]: """ Compress the given weight matrix by quantizing the weights. Provide different configurations of quantization by specifying a ``block_sizes`` which is a list containing the block size for each dimension of the weight or 0 otherwise. 
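    For instance (illustrative shapes only), for a ``[64, 32]`` linear weight:
    ``block_sizes=[0, 0]`` produces a single per-tensor scale, ``block_sizes=[1, 0]`` produces
    one scale per output channel (scale shape ``[64, 1]``), and ``block_sizes=[1, 8]`` produces
    one scale per block of 8 input channels within each output channel (scale shape ``[64, 4]``).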
Note that per-tensor, per-channel, channelwise-grouped and per-block are just variants of specifying the block sizes for each dimension. """ if len(block_sizes) != len(weight.shape): raise AssertionError( "Each axis should have a block size, which means len(block_sizes) must be " f"equal to weight's rank, but got {len(block_sizes)} vs {len(weight.shape)}" ) new_shape, scale_shape, axes_to_skip = [], [], [] for axis, (dim_size, block_size) in enumerate(zip(weight.shape, block_sizes)): if block_size > 0: if dim_size % block_size != 0: _logger.warning( f"Invalid block_sizes; On {axis}th axis, the dim size {dim_size} is " f"not divisible by block size {block_size}. Unable to perform " "structured quantization." ) return None # Skip this axis while computing min & max axes_to_skip.append(len(new_shape)) # channel dim now will be (num_blocks, block_size) num_blocks = dim_size // block_size new_shape.extend([num_blocks, block_size]) scale_shape.append(num_blocks) else: new_shape.append(dim_size) scale_shape.append(1) # Axes to reduce while compute min & max values axes = tuple(filter(lambda x: x not in axes_to_skip, range(len(new_shape)))) quantized_data, scale, zero_point = quantize_weight( weight.reshape(new_shape), axes, nbits, signed, quantization_mode, dtype ) quantized_data = quantized_data.reshape(weight.shape) scale = scale.reshape(scale_shape) if zero_point is not None: zero_point = zero_point.reshape(scale_shape) return quantized_data, scale, zero_point def reshape_weight_for_vector_lut( weight: np.ndarray, vector_size: int, vector_axis: int ) -> np.ndarray: """ For vector palettization, we need to extract vectors and move them to the last dim. If the input weight has shape [s0, s1, s2, ... , sn], the output shape should have shape [s0, ..., si // vector_size, ..., sn, vector_size] where i == vector_axis. For example, starting from weight `a` which has shape `[4, 4]` and `vector_size=2` and `vector_axis=0`, we want to reshape the matrix into `c` with shape `[2, 4, 2]`, where `c[0, 0]` contains `a[0,0], a[1, 0]`, and `c[1, 0]` contains `a[2, 0], a[3, 0]`, etc. To achieve this, we need to first swap the vector_axis to last dim, and split out the vector_size, and finally swap it back. Here is a concrete exmaple: a = np.array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15]]) b = np.swapaxes(a, 0, -1).reshape((4, 2, 2)) c = np.swapaxes(b, 0, 1) # c[0, 0] is array([0, 4]) # c[1, 0] is array([8, 12]) """ weight = np.swapaxes(weight, -1, vector_axis) weight = weight.reshape((*weight.shape[:-1], weight.shape[-1] // vector_size, vector_size)) return np.swapaxes(weight, -2, vector_axis) def find_indices_for_lut( data: np.ndarray, lut: np.ndarray, vector_axis: Optional[int] = None ) -> np.ndarray: """ Given a data and a look-up-table (LUT), find the closest indices in LUT that each element in data correspond to. It's the reverse process of "Given a LUT and indices, produce data using indices to fetch elements in LUT". Note the elements in data may not exactly match the elements in lut due to numerical instability. So we use fuzzy match to find the closest one instead of doing exact match. Parameters - data: Arbitrary numpy array. - lut: [block_num1, ..., 2**nbits, vector_size]. LUT's rank is K + 2, where K is the rank of data. Each dimension of data should be divisible by each corresponding dimension of the LUT. 
e.g., when data's shape is [2, 3, 4], the first three elements in lut's shape is [1, 1, 2], it means that there are two lookup tables over the last axis, and each of them have their own LUT values. See details in the iOS18 `constexpr_lut_to_dense` op. - vector_axis: Only effective when lut's last dim (vector_size) > 1. It denotes which axis the vector is along. """ if len(lut.shape) != len(data.shape) + 2: raise ValueError( "The lut's rank should be data's rank + 2. See constexpr_lut_to_dense op definition." ) if lut.shape[-1] > 1: if vector_axis is None: raise ValueError("The vector_axis must be provided for vector palettization.") if not len(data.shape) > vector_axis >= -len(data.shape): raise ValueError(f"Invalid vector_axis ({vector_axis})") if vector_axis < 0: vector_axis += len(data.shape) vector_size = lut.shape[-1] if data.shape[vector_axis] % vector_size != 0: raise ValueError( f"The data dim on {vector_axis}th axis ({data.shape[vector_axis]}) " f"must be divisible by vector_size ({vector_size})" ) data = reshape_weight_for_vector_lut(data, vector_size, vector_axis) # lut has shape [block_num0, block_num1, ..., 2**nbits, vector_size], so need to interleaved # repeat it to make each block match the weight. repeated_lut = lut for axis, block_num in enumerate(lut.shape[:-2]): weight_dim_size = data.shape[axis] if weight_dim_size % block_num != 0: raise ValueError( "The weight dim size in each axis must be divisible by the number " f"of luts. Got invalid lut {lut.shape} for weight shape " f"{data.shape[axis]} at axis {axis}" ) block_size = weight_dim_size // block_num # Can use np.kron for higher efficiency, but repeat is easier to understand. if block_size > 1: repeated_lut = np.repeat(repeated_lut, block_size, axis=axis) if lut.shape[-1] == 1: # For scalar palettization, we can simply find the closest value for each element. indices = np.argmin( np.abs(np.expand_dims(data, axis=-1) - np.squeeze(repeated_lut, axis=-1)), axis=-1 ) else: # For vector palettization, find the closest vector by Euclidean distance. dist = np.linalg.norm(np.expand_dims(data, axis=-2) - repeated_lut, axis=-1) indices = np.argmin(dist, axis=-1) nbits = int(math.log2(lut.shape[-2])) indices = indices.astype(types.nptype_from_builtin(types.string_to_builtin(f"uint{nbits}"))) return indices def infer_block_sizes( op: "Operation", op_config: Union[OpLinearQuantizerConfig, OpPalettizerConfig], weight_to_compress: np.ndarray, return_channel_axis: bool = False, ) -> Union[Optional[List[int]], Tuple[Optional[List[int]], Optional[int]]]: """ Infer block size on each axis based on the op and compression config. For per-channel, the channel axis is auto-picked. For per-block, the input/output axis is auto-picked if block_size is int. See the docstring of OpLinearQuantizerConfig for more details. """ input_channel_axis, output_channel_axis = select_input_output_channel_axis(op) if op_config.granularity == CompressionGranularity.PER_BLOCK and not isinstance( op_config.block_size, int ): if len(op_config.block_size) != len(weight_to_compress.shape): raise ValueError( "The block_size in config must has one element for each axis. However, for op " f"{op.name}, there are {len(op_config.block_size)} elements in block_size, " f"but there are {len(weight_to_compress.shape)} axes in the weight." 
) channel_axis_candidates = [] for axis, (b_size, dim_size) in enumerate( zip(op_config.block_size, weight_to_compress.shape) ): if b_size != 0 and b_size != dim_size: channel_axis_candidates.append(axis) if len(channel_axis_candidates) == 1: # Set channel axis if we can infer it from block sizes; else just use the default one # inferred by op type. output_channel_axis = channel_axis_candidates[0] if return_channel_axis: return list(op_config.block_size), output_channel_axis return list(op_config.block_size) if input_channel_axis is None or output_channel_axis is None: if return_channel_axis: return None, output_channel_axis return None if ( op_config.granularity == CompressionGranularity.PER_GROUPED_CHANNEL and op_config.channel_axis is not None ): output_channel_axis = op_config.channel_axis block_sizes = [0] * len(weight_to_compress.shape) if op_config.granularity == CompressionGranularity.PER_TENSOR: input_channel_block_size = 0 output_channel_block_size = 0 elif op_config.granularity == CompressionGranularity.PER_CHANNEL: input_channel_block_size = 0 output_channel_block_size = 1 elif op_config.granularity == CompressionGranularity.PER_GROUPED_CHANNEL: input_channel_block_size = 0 output_channel_block_size = op_config.group_size else: assert op_config.granularity == CompressionGranularity.PER_BLOCK and isinstance( op_config.block_size, int ) input_channel_block_size = op_config.block_size output_channel_block_size = 1 if input_channel_axis < len(block_sizes): block_sizes[input_channel_axis] = input_channel_block_size if output_channel_axis < len(block_sizes): block_sizes[output_channel_axis] = output_channel_block_size if return_channel_axis: return block_sizes, output_channel_axis return block_sizes def select_input_output_channel_axis(op: "Operation") -> Tuple[Optional[int], Optional[int]]: """ Here are some representative ops: - linear: [D_out, D_in] - matmul's y: [..., D_in, D_out] if transpose_y is False, else [..., D_out, D_in] - conv: [C_out, C_in_div_group, KH, KW] - conv_transpose: [C_in, C_out_div_group, KH, KW] The input output channel axis selection criteria is: - For conv_transpose the output channel is 1 and input channel is 0. - For matmul's y: - When transpose_y=False, output channel is -1 and input channel is -2 - When transpose_y=True, output channel is -2 and input channel is -1 - For matmul's x: - When transpose_x=False, output channel is -2 and input channel is -1 - When transpose_y=True, output channel is -1 and input channel is -2 - For all other ops, output channel is 0 and input channel is 1. If cannot determine the input/output axis, return None to denote unknown. """ var = op.outputs[0] # The op could be fed into multiple ops, so we traverse all children ops to see if they # have consistent input/output axis, otherwise set the axis to None. output_channel_axis_set = set() input_channel_axis_set = set() for child_op in var.child_ops: # By default, output channel axis is 0 and input channel axis is 1. output_channel_axis, input_channel_axis = 0, 1 if child_op.op_type == "conv_transpose": output_channel_axis = 1 input_channel_axis = 0 elif child_op.op_type == "matmul": if child_op.y == var: if child_op.transpose_y.val: output_channel_axis = -2 input_channel_axis = -1 else: output_channel_axis = -1 input_channel_axis = -2 else: # var is used as matmul's x. 
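                    # Rationale: when this var is matmul's x and transpose_x is False, the
                    # contraction in x @ y runs over x's last axis, so -1 is the input-channel
                    # axis and -2 the output-channel axis; transpose_x simply swaps the two
                    # (mirroring the transpose_y handling for y above).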
if child_op.transpose_x.val: output_channel_axis = -1 input_channel_axis = -2 else: output_channel_axis = -2 input_channel_axis = -1 elif child_op.op_type.startswith("constexpr_"): # In joint compression constexpr op could be chained together. input_channel_axis, output_channel_axis = select_input_output_channel_axis(child_op) if output_channel_axis < 0: output_channel_axis += var.rank if input_channel_axis < 0: input_channel_axis += var.rank output_channel_axis_set.add(output_channel_axis) input_channel_axis_set.add(input_channel_axis) output_channel_axis, input_channel_axis = 0, 1 if len(output_channel_axis_set) > 1: _logger.warning( f"Can't decide output axis for op {op.name}, because it's fed " f"into multiple downstream ops which require different output axes." ) output_channel_axis = None elif len(output_channel_axis_set) == 1: output_channel_axis = output_channel_axis_set.pop() if len(input_channel_axis_set) > 1: _logger.warning( f"Can't decide input axis for op {op.name}, because it's fed " f"into multiple downstream ops which require different input axes." ) input_channel_axis = None elif len(input_channel_axis_set) == 1: input_channel_axis = input_channel_axis_set.pop() return input_channel_axis, output_channel_axis def is_cluster_dim_valid(op: "Operation", cluster_dim: int, channel_axis: int) -> bool: """ Check op-dependent restrictions for cluster_dim. For example, the conv's weight has shape [C_out, C_in/groups], but the effective shape in each group is actually [C_out/groups, C_in/groups], so we need to make sure the effective dim on channel_axis is divisible by `cluster_dim`. Similarly, for conv_transpose the weight has shape [C_in, C_out/groups], but the effective shape in each group is [C_in/groups, C_out/groups]. Returns True if the cluster_dim is valid, False otherwise. """ var = op.outputs[0] if channel_axis < 0: channel_axis += var.rank for child_op in var.child_ops: if child_op.op_type in {"conv", "conv_transpose"}: effective_shape = list(var.shape) if child_op.groups.val is not None and child_op.groups.val > 1: effective_shape[0] //= child_op.groups.val if effective_shape[channel_axis] % cluster_dim != 0: return False return True def ios16_sparse_params_to_ios18(sparse_params: SparseParamsIos16) -> SparseParams: """ The iOS18 constexpr_sparse_to_dense no longer accepts `shape` param. Instead, the `mask` param has shape info. So we need to convert the old bit-packed `mask` to new uint1 `mask`. """ if not isinstance(sparse_params, SparseParamsIos16): raise ValueError("Invalid type of params") mask = ( np.unpackbits(sparse_params.mask, count=np.prod(sparse_params.shape), bitorder="little") .reshape(sparse_params.shape) .astype(types.np_uint1_dtype) ) return SparseParams(nonzero_data=sparse_params.nonzero_data, mask=mask) def ios18_sparse_params_to_ios16(sparse_params: SparseParams) -> SparseParamsIos16: """The iOS16 sparse params pack mask into bytes, and need a `shape` parameter.""" return SparseParamsIos16( nonzero_data=sparse_params.nonzero_data, mask=np.packbits(sparse_params.mask, bitorder="little"), shape=sparse_params.mask.shape, ) def ios16_lut_params_to_ios18(lut_params: LutParamsIos16) -> LutParams: """ The iOS18 constexpr_lut_to_dense no longer accepts `shape` param. We need to convert the iOS16 params to the format acceptable by the iOS18 op. """ num_palettes = lut_params.lut.shape[0] nbits = int(math.log2(num_palettes)) if 2**nbits != num_palettes: raise AssertionError( f"Invalid number of palettes in lut_params. 
It should be 2**nbits, but got {num_palettes}" ) # Notice that the indices in iOS16 is packed, so we need to unpack first. unpacked_indices = restore_elements_from_packed_bits( lut_params.indices, nbits, np.prod(lut_params.shape) ) indices = unpacked_indices.reshape(lut_params.shape).astype( types.type_mapping.string_to_nptype(f"uint{nbits}") ) lut_shape = [1] * len(lut_params.shape) + [num_palettes, 1] lut = lut_params.lut.reshape(lut_shape) return LutParams(indices=indices, lut=lut, vector_axis=None) def ios18_lut_params_to_ios16(lut_params: LutParams) -> LutParamsIos16: """The iOS16 lut params pack indices into bytes, and need a `shape` parameter.""" for idx, dim_size in enumerate(lut_params.lut.shape[:-2]): if dim_size > 1: raise AssertionError( "The iOS16 only supports per-tensor lut, but got more than one " f"lut on {idx}th axis. LUT shape: {lut_params.lut.shape}" ) num_palettes = lut_params.lut.shape[-2] nbits = int(math.log2(num_palettes)) return LutParamsIos16( lut=lut_params.lut.reshape((num_palettes,)), indices=pack_elements_into_bits(lut_params.indices, nbits), shape=lut_params.indices.shape, ) def ios18_quant_params_to_ios16(quant_params: QuantParams) -> QuantParamsIos16: """ Transform iOS18 quant params to iOS16 version. The iOS16 constexpr_affine_dequantize op requires axis, and it requires scale and zero_point to have rank 0 or 1. """ # Infer the axis based on scale's shape. non_single_dim = [dim for dim, dim_size in enumerate(quant_params.scale.shape) if dim_size > 1] if len(non_single_dim) > 2: raise AssertionError( "The constexpr_affine_dequantize op doesn't support scale which " "have more than one non-single dimensions. Got scale with shape " f"{quant_params.scale.shape}" ) # If non_single_dim is empty, it means it's per-tensor quantization, just use a dummy axis. axis = 0 if len(non_single_dim) == 0 else non_single_dim[0] scale = quant_params.scale zero_point = quant_params.offset if zero_point is None: # The constexpr_affine_dequantize op requires zero_point. zero_point = np.zeros_like(scale).astype(quant_params.data.dtype) # The constexpr_affine_dequantize op requires scale and zero_point to have rank 0 or 1. if isinstance(scale, (np.ndarray, np.generic)): scale = np.squeeze(scale) if isinstance(zero_point, (np.ndarray, np.generic)): zero_point = np.squeeze(zero_point) return QuantParamsIos16( quantized_data=quant_params.data, zero_point=zero_point, scale=scale, axis=np.int32(axis) ) def pack_elements_into_bits(elements: np.ndarray, nbits: int) -> np.ndarray: """ Pack elements into nbits representation, by starting with the least significant bit (LSB) and moving upward to the most significant bit (MSB). Returns packed elements as np.uint8. """ if not np.issubdtype(elements.dtype, np.integer): raise ValueError(f"Only support packing integers elements, but got {elements.dtype}") # Adjust allowed value range based on if the input is signed or unsigned. if np.issubdtype(elements.dtype, np.signedinteger): max_val = 2 ** (nbits - 1) - 1 min_val = -max_val - 1 else: max_val = 2**nbits - 1 min_val = 0 if np.max(elements) > max_val: raise ValueError( f"To pack elements into {nbits}-bit, the max value is {max_val}, but got {np.max(elements)}" ) if np.min(elements) < min_val: raise ValueError( f"To pack elements into {nbits}-bit, the min value is {min_val}, but got {np.min(elements)}" ) # As np.unpackbits only supports uint8, convert to uint8 first. # Notice that it will not lose information, because the bits are unchanged when converting int8 # to uint8. 
For example, the signed int -6 has bit representation '11111010', and when we unpackbits # we get [0, 1, 0, 1, 1, 1, 1, 1], where only first 4 elements are needed for 4-bit representation. elements = elements.astype(np.uint8) bitarray = np.unpackbits(elements.reshape(-1, 1), bitorder="little", axis=-1)[:, :nbits] return np.packbits(bitarray.flatten(), bitorder="little") def restore_elements_from_packed_bits( packed_values: np.ndarray, nbits: int, element_num: int, are_packed_values_signed: bool = False ) -> np.ndarray: """ Restore elements from packed bits. Requires values that are packed by starting with the least significant bit (LSB) and moving upward to the most significant bit (MSB), which is the method used in `pack_elements_into_bits`. are_packed_values_signed: Indicates if the packed_values were packed from signed integers. If True, the n-bit number unpacked from packed_values will be interpreted as signed integers, and the returned ndarray will have dtype np.int8. Otherwise, np.uint8 will be used. """ if len(packed_values.shape) != 1: raise NotImplementedError( f"Only support 1-rank packed_values. But got {len(packed_values.shape)}" ) if packed_values.dtype == np.int8: # As np.unpackbits only supports uint8, need to convert first. packed_values = packed_values.astype(np.uint8) elif packed_values.dtype != np.uint8: raise NotImplementedError( f"Only support int8 or uint8 packed_values, but got {packed_values.dtype}" ) bitarray = np.unpackbits(packed_values, bitorder="little") pad_required = bitarray.size % nbits != 0 if pad_required: bitarray = np.concatenate([bitarray, np.zeros(nbits - bitarray.size % nbits)]).astype( bitarray.dtype ) if bitarray.size % nbits != 0: raise ValueError( f"The length of bitarray ({bitarray.size}) should be divisible by " f"nbits ({nbits})." ) bitarray = bitarray.reshape(-1, nbits)[:element_num, :] # The np.packbits doesn't work well for signed int if we feed `bitarray` to it directly. # For example, the original signed int is -6, which is packed as 1010 for 4-bit representation, # and here `bitarray` is [[0, 1, 0, 1]], where the value will be interpreted as 10 (b'1010') # by np.packbits. # To make np.packbits work correctly, we need to repeat the sign bit. For example, 1010 will # become 11111010, where np.packbits can correctly handle and after converting to int8 it's -6. if are_packed_values_signed: # Repeat the sign bit to make uint8 to int8 works. bitarray = np.repeat(bitarray, [1] * (nbits - 1) + [8 - nbits + 1], axis=1) restored_elements = np.packbits(bitarray, bitorder="little", axis=-1).reshape(-1) if are_packed_values_signed: restored_elements = restored_elements.astype(np.int8) return restored_elements ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1726511965.265547 coremltools-8.0/coremltools/optimize/coreml/experimental/0000755000000000000000000000000014672075535022657 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/coreml/experimental/__init__.py0000644000000000000000000000052714672066616024774 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from ._config import OpActivationLinearQuantizerConfig from ._post_training_quantization import linear_quantize_activations ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/coreml/experimental/_config.py0000644000000000000000000000763414672066616024647 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from __future__ import annotations from typing import Any, Dict, Optional, Union import cattrs import numpy as np from attrs import define, field, validators from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.types.type_mapping import is_builtin, numpy_type_to_builtin_type from .._config import OpCompressorConfig, _check_weight_threshold, _normalize_dtype """ Activation Linear Quantization configuration """ # TODO: This should be refactored to reuse OpLinearQuantizerConfig (rdar://129257210). @define class OpActivationLinearQuantizerConfig(OpCompressorConfig): """ Parameters ---------- mode: str Mode for linear quantization: * ``"linear_symmetric"`` (default): Input data are quantized in the range ``[-R, R]``, where :math:`R = max(abs(w_r))`. dtype: str or np.generic or mil.type Determines the quantized data type. * The allowed values are: * ``np.int8`` (the default) * ``coremltools.converters.mil.mil.types.int8`` weight_threshold: int If the operation has weight, above which activation are compressed. Set the same ``weight_threshold`` for activation as for weight linear quantization can guarantee valid operations get both weight and activation quantization to improve efficiency. * If not provided, it will be set to ``2048``, in which operations with weights bigger than ``2048`` elements are compressed. """ # TODO: enable more modes/dtypes (rdar://129257210). mode: str = field(default="linear_symmetric", validator=validators.instance_of(str)) dtype: Union[str, type] = field(default=types.int8, converter=_normalize_dtype) # Set the same ``weight_threshold`` for activation linear quantization as for weight linear quantization can guarantee # valid operations get both the weight (if weight exists) and activation linear quantized to improve efficiency. weight_threshold: Optional[int] = field( default=2048, validator=validators.optional([validators.instance_of(int), _check_weight_threshold]), ) _ACTIVATION_AFFINE_QUANTIZATION_MODES = ("LINEAR_SYMMETRIC",) @mode.validator def check_mode(self, attr, mode): if not mode.upper() in self._ACTIVATION_AFFINE_QUANTIZATION_MODES: raise ValueError( f'Only mode {self._ACTIVATION_AFFINE_QUANTIZATION_MODES} supported for activation affine quantization. Got mode: "{mode}".' ) @dtype.validator def check_dtype(self, attr, dtype): if not types.is_builtin(dtype): raise ValueError(f"Invalid dtype. Should be builtin dtype, but got {type(dtype)}") if not (types.is_int(dtype) and dtype.get_bitwidth() in {8} and not dtype.is_unsigned()): raise ValueError( f"Invalid dtype. 
Should be int8, but got {types.builtin_to_string(dtype)}" ) def __attrs_post_init__(self): self.mode = self.mode.upper() if not is_builtin(self.dtype): self.dtype = numpy_type_to_builtin_type(self.dtype) @classmethod def _from_dict(cls, config_dict: Dict[str, Any]) -> "OpActivationLinearQuantizerConfig": def _structure_type(value, dtype): if isinstance(value, type): return value else: if not isinstance(value, str) or value not in ("int8",): raise ValueError(f'"dtype" must be type of type or str ["int8"]. Got {value}') return getattr(np, value) converter = cattrs.Converter(forbid_extra_keys=True) converter.register_structure_hook(type, _structure_type) return converter.structure(config_dict, cls) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/coreml/experimental/_model_debugger.py0000644000000000000000000003420014672066616026333 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import List import numpy as np import coremltools as ct class OperationInfo: def __init__(self, spec): self.dependants = [] self.dependencies = [] outputs = dict([(output.name, output) for output in spec.outputs]) self.outputs = outputs self.spec = spec class BlockInfo: def __init__(self, name, operations, spec): self.name = name self.operations = operations self.spec = spec class FunctionInfo: def __init__(self, name, blocks, spec): self.name = name self.blocks = blocks self.spec = spec class ProgramInfo: def __init__(self, functions, spec): self.functions = functions self.spec = spec class ModelInfo: def __init__(self, program_info, spec): self.program_info = program_info self.spec = spec class ModelDebugger: @classmethod def batch(cls, iterable, n=1): l = len(iterable) for index in range(0, l, n): yield iterable[index : min(index + n, l)] @classmethod def unique(cls, sequence): seen = set() return [x for x in sequence if not (x in seen or seen.add(x))] @classmethod def split_list(cls, list): half = len(list) // 2 return list[:half], list[half:] @classmethod def get_block_info(cls, block_name, block_spec): operations = {} for operation_spec in block_spec.operations: operation = OperationInfo(operation_spec) dependencies = [] for input_name in operation_spec.inputs: arguments = operation_spec.inputs[input_name].arguments input_dependencies = [ operations.get(argument.name, None) for argument in arguments if argument.name is not None ] input_dependencies = [ input_dependency for input_dependency in input_dependencies if input_dependency is not None ] dependencies.extend(input_dependencies) dependencies = cls.unique(dependencies) for dependency in dependencies: dependency.dependants.append(operation) operation.dependencies = dependencies output_names = [output.name for output in operation_spec.outputs] for output_name in output_names: operations[output_name] = operation return BlockInfo(block_name, operations, block_spec) @classmethod def get_function_info(cls, function_name, function_spec): blocks = {} for block_name, block_spec in function_spec.block_specializations.items(): blocks[block_name] = cls.get_block_info(block_name, block_spec) return FunctionInfo(function_name, blocks, function_spec) @classmethod def get_program_info(cls, program_spec): functions = {} for function_name, function_spec in program_spec.functions.items(): 
functions[function_name] = cls.get_function_info(function_name, function_spec) return ProgramInfo(functions, program_spec) @classmethod def get_model_info(cls, model): model_spec = model.get_spec() return ModelInfo(cls.get_program_info(model_spec.mlProgram), model_spec) @classmethod def populate_outputs(cls, output_names, all_operations, acc): if len(output_names) == 0: return next_output_names = [] operations = [all_operations.get(output_name, None) for output_name in output_names] operations = [operation for operation in operations if operation is not None] acc.extend([output for operation in operations for output in operation.outputs.values()]) prev_output_names = [ output_name for operation in operations for dependency in operation.dependencies for output_name in dependency.outputs.keys() ] prev_output_names = cls.unique(prev_output_names) cls.populate_outputs(prev_output_names, all_operations, acc) @classmethod def get_all_outputs(cls, block_info): acc = [] output_names = block_info.spec.outputs cls.populate_outputs(output_names, block_info.operations, acc) return acc @classmethod def get_any_function(cls, model_info): program_info = model_info.program_info function_name = list(program_info.functions.keys())[0] return program_info.functions[function_name] @classmethod def get_any_block(cls, model_info): function_info = cls.get_any_function(model_info) block_specialization_name = list(function_info.blocks.keys())[0] return function_info.blocks[block_specialization_name] @classmethod def clone_spec(cls, spec): spec_class = spec.__class__ new_spec = spec_class() new_spec.CopyFrom(spec) return new_spec @classmethod def get_output_feature_type(cls, output_name, operations): operation = operations[output_name] data_type = operation.outputs[output_name].type.tensorType.dataType # Valid data type as model outputs. data_type_to_feature_type = { ct.proto.MIL_pb2.DataType.FLOAT16: ct.proto.FeatureTypes_pb2.ArrayFeatureType.FLOAT16, ct.proto.MIL_pb2.DataType.FLOAT64: ct.proto.FeatureTypes_pb2.ArrayFeatureType.DOUBLE, ct.proto.MIL_pb2.DataType.FLOAT32: ct.proto.FeatureTypes_pb2.ArrayFeatureType.FLOAT32, ct.proto.MIL_pb2.DataType.INT32: ct.proto.FeatureTypes_pb2.ArrayFeatureType.INT32, } # Return None for invalid data type as model outputs (e.g. bool). 
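        # Only FLOAT16/FLOAT32/FLOAT64/INT32 tensors are representable as multi-array model
        # outputs via the mapping above; for any other MIL dtype the caller skips the tensor.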
if data_type not in data_type_to_feature_type: return None return data_type_to_feature_type[data_type] def __init__(self, model): self.weights_dir = model.weights_dir self.model_info = self.__class__.get_model_info(model) self.block_info = self.__class__.get_any_block(self.model_info) model_outputs = [output for output in self.model_info.spec.description.output] output_names = set([output.name for output in model_outputs]) all_outputs = self.__class__.get_all_outputs(self.block_info) intermediate_outputs = [output for output in all_outputs if output.name not in output_names] self.__model_outputs = model_outputs self.__all_outputs = all_outputs self.__intermediate_outputs = intermediate_outputs self.__intermediate_output_names = self.__class__.unique( [output_spec.name for output_spec in intermediate_outputs] ) self.__cached_models = {} @property def output_names(self): return self.__class__.unique([output.name for output in self.outputs]) def get_intermediate_output_names( self, op_include_fn=(lambda op: not (op.spec.type == "const")) ): all_operations = self.block_info.operations intermediate_output_names = list( filter( lambda name: op_include_fn(all_operations[name]), self.__intermediate_output_names ) ) intermediate_output_names.reverse() return self.__class__.unique(intermediate_output_names) def _get_concat_op_info(self) -> List[List[str]]: """ Return a list of lists of input/output names of concat ops. """ intermediate_output_names_list = self.get_intermediate_output_names( lambda op: (op.spec.type == "concat") ) all_operations = self.block_info.operations concat_op_info_list = [] for concat_output_name in intermediate_output_names_list: # Get a list of input names (to "values") of current concat op. arguments = all_operations[concat_output_name].spec.inputs["values"].arguments argument_list = [val.name for val in arguments if val.name is not None] # Append the output name of current concat op. argument_list.append(concat_output_name) # Append a list of input/output names of current concat op. concat_op_info_list.append(argument_list) return concat_op_info_list def get_model_with_intermediate_outputs( self, intermediate_output_names, compute_units=ct.ComputeUnit.ALL ): model_key = frozenset(intermediate_output_names) model = self.__cached_models.get(model_key) if model is not None: # Found cached model. return model cloned_spec = self.__class__.clone_spec(self.model_info.spec) cloned_model_info = ModelInfo( ModelDebugger.get_program_info(cloned_spec.mlProgram), cloned_spec ) cloned_spec.specificationVersion = max(self.model_info.spec.specificationVersion, 7) cloned_block_info = self.__class__.get_any_block(cloned_model_info) for output_name in intermediate_output_names: cloned_output_type = self.__class__.get_output_feature_type( output_name, self.block_info.operations ) # Some intermediate tensors cannot be appended to outputs since their data type is not valid as an output data type. # For example, an intermediate tensor with bool type cannot be appended to outputs (which will cause compilation error). 
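            # get_output_feature_type returned None for such tensors, so they are skipped here
            # instead of being surfaced as additional model outputs.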
if cloned_output_type is None: continue cloned_block_info.spec.outputs.append(output_name) cloned_output = ct.proto.Model_pb2.FeatureDescription() cloned_output.name = output_name cloned_output.type.multiArrayType.dataType = cloned_output_type cloned_model_info.spec.description.output.append(cloned_output) model = ct.models.MLModel( cloned_spec, weights_dir=self.weights_dir, compute_units=compute_units ) self.__cached_models[model_key] = model return model def get_models_with_intermediate_outputs_safely( self, intermediate_output_names, compute_units=ct.ComputeUnit.ALL ): if len(intermediate_output_names) == 0: return [] models = [] output_names = [intermediate_output_names] while len(output_names) > 0: curr_output_names = output_names[0] del output_names[0] model = None try: # This could fail compilation model = self.get_model_with_intermediate_outputs(curr_output_names, compute_units) except ValueError as ex: print( f"Failed to create model with intermediate outputs={intermediate_output_names}, error={ex}" ) if len(curr_output_names) > 1: print("Retrying") # split in two and then retry xs = self.__class__.split_list(curr_output_names) output_names.insert(0, xs[1]) output_names.insert(0, xs[0]) if model is not None: models.append(model) return models # Clears all cached models def clear_cached_models(self): self.__cached_models.clear() # The function will get called for each intermediate output, return `False` if you want to stop the enumeration otherwise `True`. def check_intermediate_output(output_value, output_name, operation, activation_stats_dict): tensor_min = np.min(output_value.flatten()) tensor_max = np.max(output_value.flatten()) activation_stats_dict[output_name]["rmin"] = tensor_min activation_stats_dict[output_name]["rmax"] = tensor_max if output_name in activation_stats_dict: activation_stats_dict[output_name]["rmin"] = min( tensor_min, activation_stats_dict[output_name]["rmin"] ) activation_stats_dict[output_name]["rmax"] = max( tensor_max, activation_stats_dict[output_name]["rmax"] ) else: activation_stats_dict[output_name]["rmin"] = tensor_min activation_stats_dict[output_name]["rmax"] = tensor_max return True def step( self, step_fn, inputs, activation_stats_dict, intermediate_output_names=None, compute_units=ct.ComputeUnit.CPU_ONLY, batch_size=500, ): if intermediate_output_names is None: intermediate_output_names = self.get_intermediate_output_names() model_output_names = [output.name for output in self.__model_outputs] model_outputs = None batch_size = len(intermediate_output_names) for output_names in self.__class__.batch(intermediate_output_names, batch_size): models = self.get_models_with_intermediate_outputs_safely(output_names, compute_units) for model in models: outputs = model.predict(inputs) # cache model outputs if model_outputs is None: model_outputs = { key: value for key, value in outputs.items() if key in model_output_names } # remove model outputs outputs = { key: value for key, value in outputs.items() if key not in model_output_names } output_names = list(outputs.keys()) for output_name in output_names: output_value = outputs[output_name] del outputs[output_name] operation = self.block_info.operations.get(output_name, None) if not step_fn(output_value, output_name, operation, activation_stats_dict): return outputs = {} for (output_name, output_value) in model_outputs.items(): operation = self.block_info.operations.get(output_name, None) if not step_fn(output_value, output_name, operation, activation_stats_dict): return 
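# --- Illustrative sketch (added; not part of the original file) ---
# A minimal, hedged example of how ModelDebugger could be driven to collect per-tensor
# min/max statistics for calibration. The helper name and the structure of ``sample_inputs``
# are hypothetical; only the ModelDebugger API calls below come from this module.
def _example_collect_activation_ranges(mlmodel, sample_inputs):
    """Return {tensor_name: {"rmin": ..., "rmax": ...}} for non-const intermediate tensors.

    ``mlmodel`` is an mlprogram-typed ct.models.MLModel; ``sample_inputs`` is a list of
    input dictionaries in the same format accepted by ``mlmodel.predict``.
    """
    from collections import defaultdict

    debugger = ModelDebugger(mlmodel)
    stats = defaultdict(dict)
    names = debugger.get_intermediate_output_names(lambda op: op.spec.type != "const")
    for inputs in sample_inputs:
        # check_intermediate_output records the running min/max of every surfaced tensor.
        debugger.step(
            step_fn=ModelDebugger.check_intermediate_output,
            inputs=inputs,
            activation_stats_dict=stats,
            intermediate_output_names=names,
        )
    return stats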
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/coreml/experimental/_post_training_quantization.py0000644000000000000000000003222714672066616031064 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from collections import defaultdict from typing import Dict, List, Optional, Union import numpy as np from coremltools import _SPECIFICATION_VERSION_IOS_17 from coremltools import _logger as logger from coremltools.converters.mil.converter import mil_convert as _mil_convert from coremltools.converters.mil.frontend.milproto import load as _milproto_to_pymil from coremltools.converters.mil.mil.passes.graph_pass import PassOption from coremltools.converters.mil.mil.passes.pass_registry import PASS_REGISTRY from coremltools.models import MLModel as _MLModel from coremltools.models import utils as _model_utils from coremltools.optimize.coreml import OptimizationConfig as _OptimizationConfig from ._model_debugger import ModelDebugger from ._quantization_passes import ( insert_prefix_quantize_dequantize_pair as _insert_prefix_quantize_dequantize_pair, ) def linear_quantize_activations( mlmodel: _MLModel, config: _OptimizationConfig, sample_data: List[Dict[Optional[str], np.ndarray]], ): """ Utility function to convert a float precision MLModel of type ``mlprogram``, which uses float-precision activations, into a compressed MLModel that uses n-bit activations. Currently, only n=8 is suppported. This is achieved by feeding real sample data into the input MLModel, calibrating the resulting float activation values, converting the calibrated values into ``quantize`` and ``dequantize`` op pairs, and inserting those op pairs into the new MLModel instance where activations get quantized. Use this function with ``linear_quantize_weights`` for 8-bit activation and 8-bit weight linear quantization. It's also compatible for use with other weight compression methods. Parameters ---------- mlmodel: MLModel Model to be quantized. This MLModel should be of type ``mlprogram``. config: OptimizationConfig An :py:class:`OptimizationConfig` object that specifies the parameters for activation quantization. sample_data: List Data used to characterize statistics of the activation values of the original float precision model. Expects a list of sample input dictionaries, which should have the same format as the data used in `.predict` method for the mlmodel. More specifically, the input name need to be specified in the data, unless it's a single input model where the name will be auto inferred. Returns ------- model: MLModel The activation quantized MLModel instance. Examples -------- .. sourcecode:: python import coremltools as ct import coremltools.optimize as cto model = ct.coreml.models.MLModel("my_model.mlpackage") activation_config = cto.coreml.OptimizationConfig( global_config=cto.coreml.experimental.OpActivationLinearQuantizerConfig( mode="linear_symmetric" ) ) compressed_model_a8 = cto.coreml.experimental.linear_quantize_activations( model, activation_config, sample_data ) # (Optional) It's recommended to use with linear_quantize_weights. 
weight_config = cto.coreml.OptimizationConfig( global_config=cto.OpLinearQuantizerConfig(mode="linear_symmetric") ) compressed_model_w8a8 = cto.linear_quantize_weights(compressed_model_a8, weight_config) """ # Validate Sample data. If the sample data name is not provided, try to infer it. for sample in sample_data: if None in sample.keys(): input_spec = mlmodel.get_spec().description.input if len(sample.keys()) > 1 or len(input_spec) > 1: raise ValueError( "When the model has multiple inputs, please provide the name for each data in `sample_data`" ) inferred_input_name = input_spec[0].name sample[inferred_input_name] = sample[None] del sample[None] ### Apply four major graph passes in order. # Insert prefix quantize/dequantize pairs to valid patterns. logger.info("Running compression pass linear_quantize_activations phase 1/4 ...") linear_activation_quantizer = PASS_REGISTRY[ "compression::insert_prefix_quantize_dequantize_pair" ] linear_activation_quantizer = _insert_prefix_quantize_dequantize_pair( config, fake_compression=False ) linear_activation_quantizer.set_options([PassOption("config", config)]) prog = _model_utils._apply_graph_pass( mlmodel, linear_activation_quantizer, spec_version=_SPECIFICATION_VERSION_IOS_17, pymil_load_func=_milproto_to_pymil.load, skip_model_load=True, # Save memony return_pymil_prog=True, ) # Insert suffix quantize/dequantize pairs to valid patterns. logger.info("Running compression pass linear_quantize_activations phase 2/4 ...") graph_pass = PASS_REGISTRY["compression::insert_suffix_quantize_dequantize_pair"] graph_pass.set_options([PassOption("config", config)]) graph_pass(prog) prog.validate() # Updating scale/zero_point in all quantize/dequantize ops calculated by calibration data. logger.info("Running compression pass linear_quantize_activations phase 3/4 ...") activation_stats = _get_activation_calibration_stats(mlmodel, sample_data) graph_pass = PASS_REGISTRY["compression::update_quantize_dequantize"] graph_pass.set_options([PassOption("activation_stats", activation_stats)]) graph_pass(prog) prog.validate() # Re-use exsiting path to dedup quantize/dequantize operations. logger.info("Running compression pass linear_quantize_activations phase 4/4 ...") graph_pass = PASS_REGISTRY["common::dequantize_quantize_pair_elimination"] graph_pass(prog) prog.validate() # Convert the pymil program (prog) back to mlmodel model_spec = mlmodel.get_spec() specification_version = max(model_spec.specificationVersion, _SPECIFICATION_VERSION_IOS_17) mlmodel_activation_quantized = _mil_convert( prog, convert_to="mlprogram", convert_from="milinternal", specification_version=specification_version, compute_units=mlmodel.compute_unit, model_description=model_spec.description, skip_model_load=False, # Must be False to avoid manually re-loading from disk before running prediction. 
) return mlmodel_activation_quantized def _update_tensor_range( tensor_name: str, tensor_value: Union[int, float], activation_stats_dict: Dict[str, Dict[str, float]], ) -> None: tensor_min = np.min(np.array(tensor_value).flatten()) tensor_max = np.max(np.array(tensor_value).flatten()) activation_stats_dict[tensor_name]["rmin"] = tensor_min activation_stats_dict[tensor_name]["rmax"] = tensor_max if tensor_name in activation_stats_dict: activation_stats_dict[tensor_name]["rmin"] = min( tensor_min, activation_stats_dict[tensor_name]["rmin"] ) activation_stats_dict[tensor_name]["rmax"] = max( tensor_max, activation_stats_dict[tensor_name]["rmax"] ) else: activation_stats_dict[tensor_name]["rmin"] = tensor_min activation_stats_dict[tensor_name]["rmax"] = tensor_max def _combine_lists_with_common_elements(data: List[List[str]]) -> List[List[str]]: """ Parameters ---------- data: list[list[]] data is a list of lists with strings. Returns ------- merged: combined lists with common elements. Example ------- input: [["conv0", "conv1", "conv2"], ["conv0", "conv3"], ["relu0"]] output: [["conv0", "conv1", "conv2", "conv3"], ["relu0"]] """ merged = [] for item in data: item_set = set(item) not_exsit = True for result in merged: if result & item_set: result.update(item_set) not_exsit = False break if not_exsit: merged.append(item_set) return merged def _adjust_concat_surrounding_activation_stats( concat_op_info_list: List, activation_stats_dict: Dict[str, Dict[str, float]] ) -> None: """ Adjust the activation calibration stats of inputs/outputs to the same concat ops to maximize hardware efficiency. Tensor values of inputs/outputs to the same concat op should share same range (same min/max), so the quantized concat could be surrounded by quantize/dequantize pairs with same scale and zero point values. Example ------- - concat 1 - inputs: "input_1", "input_2", "input_3" output: "output_1" - concat 2 - inputs: "input_1", "input_4" output: "output_2" Input/output tensors range of concat 1 should be identical. Input/output tensors range of concat 2 should be identical. "input_1" is in both, which means activation calibration stats of all 6 tensors above should be identical. """ if concat_op_info_list is None: return # Merge tensor names which should have identical values, to the same list. concat_list_adjusted = _combine_lists_with_common_elements(concat_op_info_list) for concat_group in concat_list_adjusted: group_rmin_list, group_rmax_list = [], [] for tensor_name in concat_group: # Some tensor_name may not have rmin/rmax if the calibration failed before. if tensor_name in activation_stats_dict: group_rmin_list.append(activation_stats_dict[tensor_name]["rmin"]) group_rmax_list.append(activation_stats_dict[tensor_name]["rmax"]) if len(group_rmin_list) == 0: raise ValueError( "None of the calibration run succeeded. Please check logs about calibrating sample failures." ) group_rmin, group_rmax = min(group_rmin_list), max(group_rmax_list) for tensor_name in concat_group: if tensor_name in activation_stats_dict: activation_stats_dict[tensor_name]["rmin"] = group_rmin activation_stats_dict[tensor_name]["rmax"] = group_rmax def _get_activation_calibration_stats( fpmodel: _MLModel, sample_data: List[Dict[str, np.ndarray]] ) -> Dict[str, Dict[str, float]]: """ Calibration and store a dict of intermediate tensor stats. E.g. activation_stats_dict = {tensor_0: {rmin: 0.2, rmax: 3.8}, tensor_1: {rmin: 4.5, rmax: 12.6}}} Parameters ---------- fpmodel: MLModel Path to fp16/fp32 "model.mlpackage". 
(Expect the orginal mlmodel, not the one with quantize and dequant op) sample_data: list[dict] Data for calibration. Returns ------- activation_calibration_stats: dict """ logger.warning( "Running compression pass linear_quantize_activations: start calibrating {} samples".format( len(sample_data) ) ) logger.warning( "Running compression pass linear_quantize_activations: calibration may take a while ..." ) analyzed = 0 tried = 0 debugger = ModelDebugger(fpmodel) activation_stats_dict = defaultdict(dict) intermediate_output_names = debugger.get_intermediate_output_names( lambda op: (op.spec.type != "const") ) # Get data ranges for all inputs. for data in sample_data: for input_name in data: _update_tensor_range(input_name, data[input_name], activation_stats_dict) # The last few elements in intermediate_output_names might be output. # We don't maintain min/max value for an output tensor. # If it's an output tensor we exclude it, otherwise include it. model_spec = fpmodel.get_spec() output_count = len(fpmodel.get_spec().description.output) output_names = [] for i in range(0, output_count): output_name = model_spec.description.output[i].name output_names.append(output_name) for intermediate_output_name in intermediate_output_names: if intermediate_output_name in output_names: intermediate_output_names.remove(intermediate_output_name) # Get data ranges for all intermeditate outputs. for data in sample_data: tried += 1 try: debugger.step( step_fn=ModelDebugger.check_intermediate_output, inputs=data, activation_stats_dict=activation_stats_dict, intermediate_output_names=intermediate_output_names, ) analyzed += 1 logger.warning( "Running compression pass linear_quantize_activations: calibrating sample {}/{} succeeds.".format( tried, len(sample_data) ) ) except Exception as e: logger.error(e) logger.error( "Running compression pass linear_quantize_activations: calibrating sample {}/{} fails.".format( tried, len(sample_data) ) ) continue # Handle a special case - concat ops. _adjust_concat_surrounding_activation_stats( debugger._get_concat_op_info(), activation_stats_dict ) return activation_stats_dict ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/coreml/experimental/_quantization_passes.py0000644000000000000000000002213414672066616027476 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np from tqdm import tqdm from coremltools import _logger as logger from coremltools.converters.mil._deployment_compatibility import AvailableTarget from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Operation, Program, types from coremltools.converters.mil.mil.block import is_current_opset_version_compatible_with from coremltools.converters.mil.mil.passes.defs.quantization import AbstractQuantizationPass from coremltools.converters.mil.mil.passes.helper import block_context_manager from coremltools.converters.mil.mil.passes.pass_registry import register_pass from coremltools.optimize.coreml._config import OptimizationConfig from coremltools.optimize.coreml.experimental._config import OpActivationLinearQuantizerConfig """ ----------------------------------- Activation compression graph pass - ----------------------------------- """ class AbstractActCompressionPass(AbstractQuantizationPass): """ The abstract class for the activation compression graph passes. """ _MINIMUM_OPSET_VERSION = AvailableTarget.iOS17 def __init__(self, config: OptimizationConfig = None, fake_compression: bool = False): if not isinstance(config, (OptimizationConfig, type(None))): raise ValueError(f"config must be of type OptimizationConfig. Got {type(config)}.") op_selector = None if config is None else config._op_selector super().__init__(op_selector=op_selector) self.fake_compression = fake_compression self._config = config if config is not None: self._check_config_type(config) def apply(self, prog): if not isinstance(prog, Program): raise TypeError('Transform "{}" can only be applied on PyMIL programs.'.format(self)) @block_context_manager def apply_block(block): if not is_current_opset_version_compatible_with(self._MINIMUM_OPSET_VERSION): logger.warning( f"The program's opset is not compatible with {self._MINIMUM_OPSET_VERSION}. " f"Skipped the compression pass {self.__class__}." ) return valid_consts = [] for op in list(block.operations): for b in op.blocks: apply_block(b) if self.is_valid_op(op): need_transform = True if self.op_selector is not None: need_transform = self.op_selector(op) if need_transform: valid_consts.append(op) for op in tqdm( valid_consts, desc=f"Running activation compression pass {self.__class__.__name__}", unit=" ops", ): self.transform_op(op) for f in prog.functions.values(): apply_block(f) @property def config(self) -> OptimizationConfig: return self._config @config.setter def config(self, value: OptimizationConfig): self._check_config_type(value) self._config = value if value._op_selector is not None: self.op_selector = value._op_selector def _check_config_type(self, config: OptimizationConfig): """ The utility function is checking the OptimizationConfig is holding correct type of op config. 
""" def get_supported_types_as_str(supported_type): if not isinstance(supported_type, (tuple, list)): supported_type = [supported_type] return ", ".join([f"{val.__name__}" for val in supported_type]) all_configs = [] if config.global_config is not None: all_configs.append(config.global_config) all_configs.extend(list(config.op_type_configs.values())) all_configs.extend(list(config.op_name_configs.values())) for config in all_configs: if not isinstance(config, self._SUPPORTED_CONFIG_TYPE) and config is not None: supported_type_str = get_supported_types_as_str(self._SUPPORTED_CONFIG_TYPE) raise ValueError( f"{self.__class__.__name__} only accept {supported_type_str} type config. Got {config.__class__.__name__}." ) def is_valid_op(self, op: Operation): return True @register_pass(namespace="compression") class insert_prefix_quantize_dequantize_pair(AbstractActCompressionPass): """ This graph pass applies transform on each valid activation quantization pattern. A valid activation quantization pattern should be surrounded by a quantize/dequantize pair before and after this pattern. This transform adds a quantize/dequantize pair before valid activation quantization patterns. .. code-block:: Input graph: ... -> downstream op Output graph: quantize -> dequantize -> downstream op """ _SUPPORTED_CONFIG_TYPE = OpActivationLinearQuantizerConfig _MODE_DTYPE_TO_RANGE = { (types.int8, "LINEAR_SYMMETRIC"): (-127, 127), } SUPPORTED_UNARY_OP_TYPES = ["conv", "avg_pool", "max_pool"] SUPPORTED_BINARY_OP_TYPES = ["add"] SUPPORTED_OP_TYPES = SUPPORTED_UNARY_OP_TYPES + SUPPORTED_BINARY_OP_TYPES def transform_op(self, op: Operation): if op.op_type not in self.SUPPORTED_OP_TYPES: return False # Checking op-level config. Skip if we disable compression on certain ops. op_config = self.config._get_op_config(op) if op_config is None: return scale_dtype = None if op.inputs["x"].dtype == types.fp16: scale_dtype = np.float16 else: scale_dtype = np.float32 # Copy kargs from ``op`` to ``new_core_op``. kargs = {} for k, v in op.inputs.items(): kargs[k] = v if op.op_type in self.SUPPORTED_UNARY_OP_TYPES: new_quantize_op = mb.quantize( input=op.inputs["x"], scale=np.array(1).astype(scale_dtype), zero_point=np.int8(0), output_dtype="int8", before_op=op, ) new_dequantize_op = mb.dequantize( input=new_quantize_op, scale=np.array(1).astype(scale_dtype), zero_point=np.int8(0), before_op=op, ) # Update kargs (input) of ``new_core_op``. kargs["x"] = new_dequantize_op elif op.op_type in self.SUPPORTED_BINARY_OP_TYPES: """ For op with two live inputs (e.g. add): Input graph: ... ->| |-> downstream op ... ->| Output graph: quantize -> dequantize | |-> downstream op quantize -> dequantize | """ # Validation check. # Both inputs x and y need to be non-const. # Reject when either input is const. 
x_is_const = op.inputs["x"].op is not None and op.inputs["x"].op.op_type == "const" y_is_const = op.inputs["y"].op is not None and op.inputs["y"].op.op_type == "const" if x_is_const != y_is_const: return new_quantize_op_x = mb.quantize( input=op.inputs["x"], scale=np.array(1).astype(scale_dtype), zero_point=np.int8(0), output_dtype="int8", before_op=op, ) new_dequantize_op_x = mb.dequantize( input=new_quantize_op_x, scale=np.array(1).astype(scale_dtype), zero_point=np.int8(0), before_op=op, ) new_quantize_op_y = mb.quantize( input=op.inputs["y"], scale=np.array(1).astype(scale_dtype), zero_point=np.int8(0), output_dtype="int8", before_op=op, ) new_dequantize_op_y = mb.dequantize( input=new_quantize_op_y, scale=np.array(1).astype(scale_dtype), zero_point=np.int8(0), before_op=op, ) # Update kargs (inputs) of ``new_core_op``. kargs["x"] = new_dequantize_op_x kargs["y"] = new_dequantize_op_y # Update other kargs of ``new_core_op``. # These are the same regardless of whether it's a unary or binary op. kargs["name"] = op.name kargs["before_op"] = op new_core_op = getattr(mb, op.op_type)(**kargs) new_core_op.name = op.outputs[0].name if new_core_op.op.enclosing_block.try_replace_uses_of_var_after_op( old_var=op.outputs[0], new_var=new_core_op, anchor_op=new_core_op.op, end_op=new_core_op, ): new_core_op.op.enclosing_block.remove_ops([op]) ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1726511965.265547 coremltools-8.0/coremltools/optimize/torch/0000755000000000000000000000000014672075535020020 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/__init__.py0000644000000000000000000000074214672066616022134 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from coremltools.optimize.torch import ( base_model_optimizer, layerwise_compression, optimization_config, palettization, pruning, quantization, ) from ._logging import init_root_logger as _init_root_logger _logger = _init_root_logger() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/_logging.py0000644000000000000000000000270014672066616022156 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import logging import os def init_root_logger(): logger = get_root_logger() logger.propagate = False for handler in logger.handlers: logger.removeHandler(handler) logger.addHandler(logging.StreamHandler()) level = os.environ.get("COREMLTOOLS_OPTIMIZE_TORCH_LOG_LEVEL", "info").upper() logger.setLevel(level) set_logger_formatter(logger) return logger def get_root_logger(): return logging.getLogger("coremltools.optimize.torch") def set_logger_formatter(logger, rank=None): rank_component = f"rank {rank}:" if rank is not None else "" fmt = f"{rank_component}%(asctime)s:%(name)s:%(lineno)s:%(levelname)s: %(message)s" formatter = logging.Formatter(fmt=fmt) for handler in logger.handlers: handler.setFormatter(formatter) def set_logger_filters(logger, rank=None): for handler in logger.handlers: handler.addFilter(RankZeroFilter(rank)) def set_rank_for_root_logger(rank): logger = get_root_logger() set_logger_formatter(logger, rank) set_logger_filters(logger, rank) class RankZeroFilter(logging.Filter): def __init__(self, rank): super().__init__() self.rank = rank def filter(self, record): return self.rank == 0 ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/_typing.py0000644000000000000000000000066614672066616022053 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import Any as _Any from typing import Callable as _Callable from typing import Dict as _Dict import torch as _torch ParamsDict = _Dict[str, _Any] TensorCallable = _Callable[[_torch.Tensor], _torch.Tensor] ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1726511965.265547 coremltools-8.0/coremltools/optimize/torch/_utils/0000755000000000000000000000000014672075535021317 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/_utils/__init__.py0000644000000000000000000000033214672066616023426 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/_utils/dist_utils.py0000644000000000000000000000211414672066616024052 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os as _os import torch as _torch import torch.distributed as _dist def ddp_setup(rank: int, world_size: int): """ Set environment variables which are used for initializing distributed process group for :py:class:`DistributedDataParallel`. 
Args: rank: Unique identifier of each process world_size: Total number of processes """ _os.environ["MASTER_ADDR"] = "localhost" _os.environ["MASTER_PORT"] = "12355" _os.environ["WORLD_SIZE"] = f"{world_size}" _os.environ["RANK"] = f"{rank}" _os.environ["LOCAL_RANK"] = f"{rank}" _torch.cuda.set_device(f"cuda:{rank}") _dist.init_process_group("nccl", rank=rank, world_size=world_size) def is_leader(): """ Returns ``True`` if the rank of the current process is 0. """ if _dist.is_initialized(): return _dist.get_rank() == 0 return True ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/_utils/fsdp_utils.py0000644000000000000000000000563414672066616024055 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from abc import ABC as _ABC from abc import abstractmethod as _abstractmethod from functools import partial as _partial from typing import Iterable as _Iterable from typing import Type as _Type import torch as _torch from attr import define as _define from torch.distributed.fsdp.wrap import ModuleWrapPolicy as _TorchModuleWrapPolicy from torch.distributed.fsdp.wrap import size_based_auto_wrap_policy as _size_based_auto_wrap_policy class FSDPAutoWrapPolicy(_ABC): """ An abstract base class for implementing an `FSDP `_ auto wrap policy. Wrapping a model with ``FSDP`` wrapper, ``FSDP(model)``, results in a single FSDP unit for the entire model. Thus, during the model's execution, the ``all-gather`` operation collects all the parameters of the model on all GPUs and hence, parameter sharding doesn't save any CUDA memory. To avoid this, one can specify a :py:class:`FSDPAutoWrapPolicy`, which automatically creates multiple FSDP units nested within the top level FSDP unit, based on certain criteria such as a minimum size limit for each FSDP unit or based on the class structure of the model. This way, only one FSDP unit needs to collect full parameters at a time, and one can compute gradients for a much larger model, which wouldn't be possible otherwise. For more details, please refer to `FSDP documentation `_ """ @_abstractmethod def get_policy(self): """ Return a policy for wrapping different submodules of a model with FSDP wrapper. """ @_define class ModuleWrapPolicy(FSDPAutoWrapPolicy): """ An auto wrap policy which wraps instances of modules with classes specified by ``module_classes`` into separate FSDP units. This policy is useful for transformer like models which can be naturally split into distinct submodules. For example, for a GPT style decoder model, with ``Attention`` and ``FeedForward`` as the two types of layers in it, one can specify ``module_classes = [Attention, FeedForward]``. This would lead to each instance of ``Attention`` and ``FeedForward`` layer in the model to be wrapped into an individual FSDP unit. """ module_classes: _Iterable[_Type[_torch.nn.Module]] def get_policy(self): return _TorchModuleWrapPolicy(self.module_classes) @_define class SizeBasedWrapPolicy: """ An auto wrap policy which creates a new FSDP instances when the number of parameters in the the current FSDP unit exceeds ``min_num_params``. 
""" min_num_params: int def get_policy(self): return _partial(_size_based_auto_wrap_policy, min_num_params=self.min_num_params) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/_utils/graph_utils.py0000644000000000000000000005504314672066616024221 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import operator as _operator from typing import Any as _Any from typing import Dict as _Dict from typing import Iterable as _Iterable from typing import List as _List from typing import Tuple as _Tuple import torch as _torch from coremltools.optimize.torch._utils.registry import BaseRegistry as _BaseRegistry from coremltools.optimize.torch._utils.transforms import _LINEAR_CONV_DECONV_FUNCS from coremltools.optimize.torch._utils.transforms import ReplacementType as _ReplacementType from coremltools.optimize.torch._utils.transforms import TransformRegistry as _TransformRegistry from coremltools.optimize.torch._utils.transforms import fetch_argument as _fetch_argument from coremltools.optimize.torch._utils.transforms import fetch_attr as _fetch_attr from coremltools.optimize.torch._utils.transforms import fetch_func_params as _fetch_func_params from coremltools.optimize.torch._utils.transforms import load_arg as _load_arg from coremltools._deps import _HAS_TORCH_VISION, MSG_TORCH_VISION_NOT_FOUND if _HAS_TORCH_VISION: import torchvision as _torchvision def count_model_params(model: _torch.nn.Module) -> int: """ Return the number of trainable parameters in the provided model. """ return sum(param.numel() for param in model.parameters()) def generate_env( model: _torch.nn.Module, traced_model: _torch.fx.GraphModule, input_shape: _Tuple[int], ) -> _Dict[str, _Any]: """ Computes environment dictionary by going through the provided model and traced_model with specified input_shape. The environment dictionary maps from node name to the actual function/method/module/attr/placeholder executed by the node. Returns environment dictionary. """ env = dict() sample_input = _torch.rand(input_shape) count_params = count_model_params(model) if count_params > 0 and next(model.parameters()).is_cuda: sample_input = sample_input.cuda() modules = dict(model.named_modules()) for node in traced_model.graph.nodes: if node.op == "placeholder": # Only one graph-level input result = sample_input elif node.op == "get_attr": result = _fetch_attr(model, node.target) elif node.op == "call_function": result = node.target( *_load_arg(node.args, env), **_load_arg(node.kwargs, env) ) elif node.op == "call_method": self_obj, *args = _load_arg(node.args, env) kwargs = _load_arg(node.kwargs, env) result = getattr(self_obj, node.target)(*args, **kwargs) elif node.op == "call_module": ld_args = _load_arg(node.args, env) result = modules[node.target]( *_load_arg(node.args, env), **_load_arg(node.kwargs, env) ) env[node.name] = result return env def volume(items: _Iterable[int]) -> int: """ Given an input shape, computes the volume or total number of elements contained by that shape. Note that an empty tensor will always have a volume of 0. """ ret = 1 if len(items) >= 1 else 0 for item in items: ret *= item return ret def num_bytes(dtype: _torch.dtype) -> int: """ Computes the number of bytes required to represent the input datatype. 
""" num_bits_in_byte = 8 if dtype.is_floating_point: return _torch.finfo(dtype).bits / num_bits_in_byte else: return _torch.iinfo(dtype).bits / num_bits_in_byte def parse_call_function_target(target: _Any) -> str: """ Parses the call function target for the function-name and returns it. Initially, will try to lookup the function inside a dictionary that maps known functions to function-names. If the function is not found in the dictionary, then will use custom parsing code to get the function-name. """ if not _HAS_TORCH_VISION: raise ImportError(MSG_TORCH_VISION_NOT_FOUND) func_map = { _torch.nn.functional.conv1d: "conv1d", _torch.nn.functional.conv2d: "conv2d", _torch.nn.functional.conv3d: "conv3d", _torch.nn.functional.conv_transpose1d: "conv_transpose1d", _torch.nn.functional.conv_transpose2d: "conv_transpose2d", _torch.nn.functional.conv_transpose3d: "conv_transpose3d", _torch.nn.functional.unfold: "unfold", _torch.nn.functional.fold: "fold", _torch.nn.functional.avg_pool1d: "avg_pool1d", _torch.nn.functional.avg_pool2d: "avg_pool2d", _torch.nn.functional.avg_pool3d: "avg_pool3d", _torch.nn.functional.max_pool1d: "max_pool1d", _torch.nn.functional.max_pool2d: "max_pool2d", _torch.nn.functional.max_pool3d: "max_pool3d", _torch.nn.functional.max_unpool1d: "max_unpool1d", _torch.nn.functional.max_unpool2d: "max_unpool2d", _torch.nn.functional.max_unpool3d: "max_unpool3d", _torch.nn.functional.lp_pool1d: "lp_pool1d", _torch.nn.functional.lp_pool2d: "lp_pool2d", _torch.nn.functional.adaptive_max_pool1d: "adaptive_max_pool1d", _torch.nn.functional.adaptive_max_pool2d: "adaptive_max_pool2d", _torch.nn.functional.adaptive_max_pool3d: "adaptive_max_pool3d", _torch.nn.functional.adaptive_avg_pool1d: "adaptive_avg_pool1d", _torch.nn.functional.adaptive_avg_pool2d: "adaptive_avg_pool2d", _torch.nn.functional.adaptive_avg_pool3d: "adaptive_avg_pool3d", _torch.nn.functional.fractional_max_pool2d: "fractional_max_pool2d", _torch.nn.functional.fractional_max_pool3d: "fractional_max_pool3d", _torch.nn.functional.scaled_dot_product_attention: "scaled_dot_product_attention", _torch.nn.functional.threshold: "threshold", _torch.nn.functional.threshold_: "threshold_", _torch.nn.functional.relu: "relu", _torch.nn.functional.relu_: "relu_", _torch.nn.functional.hardtanh: "hardtanh", _torch.nn.functional.hardtanh_: "hardtanh_", _torch.nn.functional.hardswish: "hardswish", _torch.nn.functional.relu6: "relu6", _torch.nn.functional.elu: "elu", _torch.nn.functional.elu_: "elu_", _torch.nn.functional.selu: "selu", _torch.nn.functional.celu: "celu", _torch.nn.functional.leaky_relu: "leaky_relu", _torch.nn.functional.leaky_relu_: "leaky_relu_", _torch.nn.functional.prelu: "prelu", _torch.nn.functional.rrelu: "rrelu", _torch.nn.functional.rrelu_: "rrelu_", _torch.nn.functional.glu: "glu", _torch.nn.functional.gelu: "gelu", _torch.nn.functional.logsigmoid: "logsigmoid", _torch.nn.functional.hardshrink: "hardshrink", _torch.nn.functional.tanhshrink: "tanhshrink", _torch.nn.functional.softsign: "softsign", _torch.nn.functional.softplus: "softplus", _torch.nn.functional.softmin: "softmin", _torch.nn.functional.softmax: "softmax", _torch.nn.functional.softshrink: "softshrink", _torch.nn.functional.gumbel_softmax: "gumbel_softmax", _torch.nn.functional.log_softmax: "log_softmax", _torch.nn.functional.tanh: "tanh", _torch.nn.functional.sigmoid: "sigmoid", _torch.nn.functional.hardsigmoid: "hardsigmoid", _torch.nn.functional.silu: "silu", _torch.nn.functional.mish: "mish", _torch.nn.functional.batch_norm: "batch_norm", 
_torch.nn.functional.group_norm: "group_norm", _torch.nn.functional.instance_norm: "instance_norm", _torch.nn.functional.layer_norm: "layer_norm", _torch.nn.functional.local_response_norm: "local_response_norm", _torch.nn.functional.normalize: "normalize", _torch.nn.functional.linear: "linear", _torch.nn.functional.bilinear: "bilinear", _torch.nn.functional.dropout: "dropout", _torch.nn.functional.alpha_dropout: "alpha_dropout", _torch.nn.functional.feature_alpha_dropout: "feature_alpha_dropout", _torch.nn.functional.dropout1d: "dropout1d", _torch.nn.functional.dropout2d: "dropout2d", _torch.nn.functional.dropout3d: "dropout3d", _torch.nn.functional.embedding: "embedding", _torch.nn.functional.embedding_bag: "embedding_bag", _torch.nn.functional.one_hot: "one_hot", _torch.nn.functional.pairwise_distance: "pairwise_distance", _torch.nn.functional.cosine_similarity: "cosine_similarity", _torch.nn.functional.pdist: "pdist", _torch.nn.functional.pixel_shuffle: "pixel_shuffle", _torch.nn.functional.pixel_unshuffle: "pixel_unshuffle", _torch.nn.functional.pad: "pad", _torch.nn.functional.interpolate: "interpolate", _torch.nn.functional.upsample: "upsample", _torch.nn.functional.upsample_nearest: "upsample_nearest", _torch.nn.functional.upsample_bilinear: "upsample_bilinear", _torch.nn.functional.grid_sample: "grid_sample", _torch.nn.functional.affine_grid: "affine_grid", _torch.Size: "Size", _torch.cat: "cat", _torch.concat: "concat", _torch.flatten: "flatten", _torch.add: "add", _torch.mul: "mul", _torch.sub: "sub", _torch.div: "div", _torchvision.ops.stochastic_depth: "stochastic_depth", _torch.permute: "permute", _torch.swapaxes: "swapaxes", _torch.einsum: "einsum", _torch.squeeze: "squeeze", _torch.unsqueeze: "unsqueeze", _torch.floor_divide: "floor_divide", _torch.Tensor.size: "size", _torch.Tensor.view: "view", _torch.Tensor.contiguous: "contiguous", _torch.transpose: "transpose", _torch.chunk: "chunk", _torch.mean: "mean", _torch.eq: "eq", _torch.Tensor.eq: "eq", _torch._assert: "_assert", _torch.reshape: "reshape", _torch.Tensor.reshape: "reshape", _torch.Tensor.expand: "expand", _torch.Tensor.dim: "dim", _operator.add: "add", _operator.sub: "sub", _operator.mul: "mul", _operator.truediv: "truediv", _torch.Tensor.flatten: "flatten", } if target in func_map: return func_map[target] # Examples: # # # .fn at 0x108fdc900> # # tokens = str(target).split(" ") # Look for token after "method" or "function" token_keywords = ( "method", "function", " i + 1: func_name = tokens[i + 1] if func_name[-1] == ">": func_name = func_name[:-1] return func_name return "" class NodeSummary: """ A data structure for nodes that contains the following information: node name, op kind, node shape, param count and param size in MB. """ def __init__( self, node_name: str, op_kind: str, node_shape: _List[int], param_count: int, param_size: float, ): self.node_name = node_name self.op_kind = op_kind self.node_shape = node_shape self.param_count = param_count self.param_size = param_size class ModelSummary: """ Data structure that represents a summary of each node/module/function/attribute of some model. Contains a list of NodeSummary instances. 
""" def __init__(self): self.nodes = list() def append( self, node_name: str, op_kind: str, node_shape: _List[int], param_count: int, param_size: float, ): node_summary = NodeSummary( node_name, op_kind, node_shape, param_count, param_size ) self.nodes.append(node_summary) def __str__(self) -> str: from tabulate import tabulate as _tabulate fmt_table = [ [ node_summary.node_name, node_summary.op_kind, node_summary.node_shape, "{:,}".format(node_summary.param_count) if node_summary.param_count != 0 else "--", "{:0.3f}".format(node_summary.param_size) if node_summary.param_size != 0.0 else "--", ] for node_summary in self.nodes ] return str( _tabulate( fmt_table, headers=["name", "kind", "shape", "param #", "param size (MB)"], ) ) def model_summary( model: _torch.nn.Module, traced_model: _torch.fx.GraphModule, input_shape: _Tuple[int], ) -> ModelSummary: """ Method that takes in a model, traced model and expected input shape, and returns a ModelSummary data structure that summarizes the model. Note that this method will add a total node entry at the end of the ModelSummary. """ sample_input = _torch.randn(input_shape) count_params = count_model_params(model) if count_params > 0 and next(model.parameters()).is_cuda: sample_input = sample_input.cuda() _torch.fx.passes.shape_prop.ShapeProp(traced_model).propagate(sample_input) modules = dict(model.named_modules()) summary = ModelSummary() total_param_count = 0 total_param_size = 0.0 for node in traced_model.graph.nodes: op_kind = None if node.op == "placeholder": op_kind = "Input" elif node.op == "output": op_kind = "Output" elif node.op == "get_attr": op_kind = "Attr" elif node.op == "call_module": op_kind = modules[node.target].__class__.__name__ elif node.op == "call_function": op_kind = parse_call_function_target(node.target) else: assert node.op == "call_method", "unsupported node op" op_kind = node.target param_count = 0 param_size = 0 if node.op == "call_module": for param_name, param in modules[node.target].named_parameters(): param_count += volume(param.shape) param_size += volume(param.shape) * num_bytes(param.dtype) elif node.op == "call_function": weight_node = None bias_node = None if node.target in _LINEAR_CONV_DECONV_FUNCS: weight_node = node.args[1] bias_node = _fetch_argument(node.args, node.kwargs, None, 2, "bias") elif node.target == _torch.nn.functional.batch_norm: weight_node = _fetch_argument(node.args, node.kwargs, None, 3, "weight") bias_node = _fetch_argument(node.args, node.kwargs, None, 4, "bias") elif node.target == _torch.nn.functional.layer_norm: weight_node = _fetch_argument(node.args, node.kwargs, None, 2, "weight") bias_node = _fetch_argument(node.args, node.kwargs, None, 3, "bias") # If weight_node is None, then weight will be None # If bias_node is None, then bias will be None (weight, bias) = _fetch_func_params(model, weight_node, bias_node) if weight is not None: param_count += volume(weight.shape) param_size += volume(weight.shape) * num_bytes(weight.dtype) if bias is not None: param_count += volume(bias.shape) param_size += volume(bias.shape) * num_bytes(weight.dtype) param_size = param_size / 1e6 # 1 MB is 1000 * 1000 bytes node_shape = None if "tensor_meta" in node.meta: if not isinstance( node.meta["tensor_meta"], _torch.fx.passes.shape_prop.TensorMetadata ): # Multiple outputs. 
Simply use the first output shape node_shape = list(node.meta["tensor_meta"][0].shape) else: node_shape = list(node.meta["tensor_meta"].shape) summary.append(node.name, op_kind, node_shape, param_count, param_size) total_param_count += param_count total_param_size += param_size summary.append("Total", None, None, total_param_count, total_param_size) return summary class Rewriter: """ Graph utility which takes in the expected input shape to a model and the model, and rewrites the model so that relevant functions are mapped to the corresponding module operator types. Will not rewrite the model in-place. """ def __init__( self, input_shape: _Tuple[int], model: _torch.nn.Module, traced_model: _torch.fx.GraphModule = None, ): self.input_shape = input_shape self.model = model for layer in self.model.modules(): if hasattr(layer, "inplace"): layer.inplace = False if traced_model is None: self.graph_module = _torch.fx.symbolic_trace(self.model) else: self.graph_module = traced_model summary = model_summary(self.model, self.graph_module, self.input_shape) self.shapes = dict() # No need to process total entry at the end of the table for i in range(len(summary.nodes) - 1): node_summary = summary.nodes[i] self.shapes[node_summary.node_name] = node_summary.node_shape self.env = generate_env(self.model, self.graph_module, self.input_shape) def _replace_node_module(self, node: _torch.fx.Node, repl: _torch.nn.Module): """ Method which takes in an input node and replacement module, and properly replaces the input node in the graph with a new node created for the replacement module. Preserves the node name. Intended to be a subgraph replacement with no global effect on the rest of the graph. """ with self.graph_module.graph.inserting_after(node): # TODO: Check if node_name is correct for call_module function # Looks like "qualified" name needed here for the target parameter # https://pytorch.org/docs/stable/fx.html#torch.fx.Graph.call_module node_name = node.name self.model.__setattr__(node_name, repl) self.graph_module.__setattr__(node_name, repl) new_args = node.args if node.op == "call_module" else (node.args[0],) new_node = self.graph_module.graph.call_module(node_name, args=new_args) node.replace_all_uses_with(new_node) self.graph_module.graph.erase_node(node) new_node.name = node_name def _replace_node_function( self, node: _torch.fx.Node, target: _Any, fun_args: _Tuple[_Any], fun_kwargs: _Dict[str, _Any], ): """ Method which takes in an input node and replacement target function, and properly replaces the input node in the graph with a new node created for the replacement function. Preserves the node name. Intended to be a subgraph replacement with no global effect on the rest of the graph. """ with self.graph_module.graph.inserting_after(node): node_name = node.name new_node = self.graph_module.graph.call_function( target, args=fun_args, kwargs=fun_kwargs ) node.replace_all_uses_with(new_node) self.graph_module.graph.erase_node(node) new_node.name = node_name def rewrite(self, debug: bool = False, custom_registry: _BaseRegistry = None): """ Method which goes through each node in the graph and then each transform in the registry. If the transform pattern matches against the node, will compute the replacement module and replace the node with this module in the graph. This method then goes through the modified graph to delete unused operators and dead code. Finally, it recompiles and sanity-checks the graph before preparing the rewritten model. 
There is a debug flag which enables the printing of detailed information about the rewritten model and modified graph. """ modules = dict(self.model.named_modules()) registry = ( _TransformRegistry.get_registry_values() if custom_registry is None else custom_registry ) # fx represents its graph as an ordered list of # nodes so we can iterate through them. for node in self.graph_module.graph.nodes: for transform in registry: if transform.match_pattern(node, modules, self.env): repl = transform.get_replacement( self.model, node, self.shapes[node.name], modules, self.env ) if transform.get_replacement_type() is _ReplacementType.kMODULE: self._replace_node_module(node, repl) elif transform.get_replacement_type() is _ReplacementType.kFUNCTION: self._replace_node_function(node, repl[0], repl[1], repl[2]) else: raise Exception("Replacement type is not supported") # Regenerate modules modules = dict(self.model.named_modules()) self.graph_module.delete_all_unused_submodules() self.graph_module.graph.eliminate_dead_code() self.graph_module.recompile() # Does some checks to make sure the graph is well-formed. self.graph_module.graph.lint() self.model = _torch.fx.GraphModule(self.model, self.graph_module.graph) if debug: print("=======================================================") print("Printing human-readable graph module") self.graph_module.print_readable() print("=======================================================") print("Printing graph from graph module") print(self.graph_module.graph) print("=======================================================") print("Printing tabular graph") self.graph_module.graph.print_tabular() print("=======================================================") print("Printing named modules") for name, _ in self.model.named_modules(): print("Found module with name", name) print("=======================================================") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/_utils/k_means.py0000644000000000000000000012002014672066616023301 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import logging as _logging import queue as _queue from abc import abstractmethod as _abstractmethod from typing import Any as _Any from typing import Dict as _Dict from typing import List as _List from typing import Optional as _Optional from typing import Tuple as _Tuple from typing import Type as _Type from typing import Union as _Union import torch as _torch import torch.multiprocessing as _mp from attr import define as _define from coremltools._deps import _kmeans1d from coremltools.converters.mil.mil.ops.defs.iOS18 import ( constexpr_blockwise_shift_scale as _quantize_op, ) from coremltools.optimize.coreml._utils import compute_qparams as _compute_qparams from coremltools.optimize.torch._utils.metadata_utils import ( CompressionMetadata as _CompressionMetadata, ) from coremltools.optimize.torch._utils.metadata_utils import ( register_metadata_version as _register_metadata_version, ) from coremltools.optimize.torch._utils.python_utils import ClassRegistryMixin as _ClassRegistryMixin from coremltools.optimize.torch._utils.torch_utils import ( get_atomic_layers, get_n_bits_from_dtype, get_sign_from_dtype, ) from coremltools.optimize.torch.palettization._efficient_kmeans import _EfficientKMeans _logger = _logging.getLogger(__name__) @_define(frozen=True) class KMeansConfig: n_bits: int = 4 axis: int = 0 lut_dtype: _torch.dtype = None block_size: _Optional[int] = None cluster_dim: _Optional[int] = None enable_per_channel_scale: bool = False mask: _Optional[_torch.Tensor] = None importance: _Optional[_torch.Tensor] = None class KMeansSupportedModulesRegistry(_ClassRegistryMixin): """ A registry of :py:class:`KMeansModule` classes """ REGISTRY: _Dict[str, _Type["KMeansModule"]] @classmethod def get_kmeans_module(cls, module: _torch.nn.Module) -> _Optional[_Type["KMeansModule"]]: """ Returns the :py:class:`KMeansModule` class which implements k-means for the given module. """ for _, layer_cls in cls.REGISTRY.items(): if layer_cls.is_supported_module(module): return layer_cls return None @classmethod def get_supported_modules(cls) -> _Tuple[_Type[_torch.nn.Module]]: """ Returns all supported module types for k-means. """ return tuple(layer_cls.layer_type for _, layer_cls in cls.REGISTRY.items()) class KMeansModule: """ An interface for adding support for a given module class for running k-means. Implements methods to retrieve parameters which can be clustered and to update them with new values after clustering. """ layer_type: _Type[ _torch.nn.Module ] # The layer type which this interface supports clustering for parameter_names: _List[str] = [] # List of parameters which are clustered for this layer type def __init_subclass__(cls): KMeansSupportedModulesRegistry.register(cls.__name__)(cls) def __init__(self, module: _torch.nn.Module, config: _Dict[str, KMeansConfig]): self.module = module self.config = config self._parameter_metadata = None self._init_parameter_metadata() @_abstractmethod def _init_parameter_metadata(self): """ Initialize metadata for k-means clustering for this layer type. The metadata is a dictionary from parameter name to a dictionary of metadata name and its value. This method should add the shape of the parameters as the metadata for each parameter which should be clustered. 
""" @_abstractmethod def _get_parameters_impl(self) -> _Dict[str, _torch.Tensor]: """ Returns a dictionary of parameter name to the parameter tensor which should be clustered for this layer type. """ @_abstractmethod def _update_parameters_impl(self, param_name: str, new_value: _torch.Tensor): """ Update the parameter corresponding to this parameter name with the new value after reshaping to original parameter shape. """ @_abstractmethod def _reshape_for_kmeans(self, param_name: str, value: _torch.Tensor) -> _torch.Tensor: """ Reshape any value of original parameter shape to flattened shape for k-means. """ @_abstractmethod def _reshape_to_original(self, param_name: str, value: _torch.Tensor) -> _torch.Tensor: """ Reshape any value flattened for k-means back to original parameter shape. """ def _compute_lut_and_indices(self, param_name: str, param: _torch.Tensor): """ Compute LUT and indices from parameter. For 4-bit palettization and param shape (32, 16, 3, 3), Case-1: If block_size = 4 and axis = 0, then LUT has shape (8, 1, 1, 1, 16, 1) Case-2: If block_size = 4 and axis = 1, then LUT has shape (1, 4, 1, 1, 16, 1) Case-3: If cluster_dim = 4, then LUT has shape (1, 1, 1, 1, 16, 4) """ axis = self.config[param_name].axis num_channels = param.shape[axis] mask = self.config[param_name].mask block_size = self.config[param_name].block_size block_size = num_channels if block_size is None else block_size cluster_dim = self.config[param_name].cluster_dim orig_param_shape = self._parameter_metadata[param_name]["shape"] cluster_dim = 1 if cluster_dim is None else cluster_dim lut, indices = [], [] if cluster_dim == 1: # Scalar palettization for block_idx in range(0, num_channels, block_size): if axis == 0: lut_idx, ind_idx = _torch.unique( param[block_idx : block_idx + block_size, :], return_inverse=True, ) else: lut_idx, ind_idx = _torch.unique( param[:, block_idx : block_idx + block_size], return_inverse=True, ) # Ensure param was correctly palettized # Unless a mask was applied, number of unique values cannot exceed 2^nbits max_unique_val = 2 ** self.config[param_name].n_bits assert mask is not None or len(lut_idx) <= max_unique_val, ( f"Found more than expected unique values in {self.module} " f"for {param_name}, expected <= {max_unique_val}, found = {len(lut_idx)}" ) # Pad lut with zeros if fewer than 2^n_bit unique values are found if len(lut_idx) < max_unique_val: padded_lut_idx = _torch.zeros(max_unique_val) padded_lut_idx[: len(lut_idx)] = lut_idx lut_idx = padded_lut_idx lut.append(lut_idx) indices.append(ind_idx) lut = _torch.stack(lut).unsqueeze(1 - axis).unsqueeze(-1) indices = _torch.cat(indices, dim=axis) indices = self._reshape_to_original(param_name, indices) else: # Vector palettization # Reshape param for 2D clustering if axis == 0: param_reshaped = param.transpose(0, 1).reshape(-1, cluster_dim) else: param_reshaped = param.reshape(-1, cluster_dim) lut, indices = _torch.unique(param_reshaped, dim=0, return_inverse=True) # Undo reshaping in indices done for 2D clustering if axis == 0: indices = indices.reshape(param.shape[0] // cluster_dim, param.shape[1]) else: indices = indices.reshape(param.shape[0], param.shape[1] // cluster_dim) # Incorporate param dimensions in lut shape for i in range(len(orig_param_shape) - lut.dim() + 2): lut = lut.unsqueeze(-3) return lut, indices def _scale_by_per_channel_scale(self, param_name: str, param: _torch.Tensor) -> _torch.Tensor: """ Compute per channel scales for scaling the parameter in the range ``[-1, 1]`` and store them in the parameter 
metadata. Also scale the parameter using the computed scales. """ if self.config[param_name].enable_per_channel_scale: flattened_param = param.flatten(1) per_channel_scale = _torch.max(_torch.abs(flattened_param), dim=1, keepdim=True).values # Handle zero scales per_channel_scale[per_channel_scale == 0] = 1 flattened_param /= per_channel_scale param = flattened_param.reshape(param.shape) self._parameter_metadata[param_name]["per_channel_scale"] = per_channel_scale return param def _get_compression_metadata( self, param_name: str, param: _torch.Tensor ) -> _CompressionMetadata: """ Return compression metadata to be stored in the model for this parameter """ metadata = _CompressionMetadata(param_name) compression_type = ["palettization"] # LUT metadata.lut, _ = self._compute_lut_and_indices(param_name, param) # Per channel scale if self.config[param_name].enable_per_channel_scale: per_channel_scale = self._parameter_metadata[param_name]["per_channel_scale"] reshaped_param = self._reshape_to_original(param_name, param) for _ in range(reshaped_param.dim() - per_channel_scale.dim()): per_channel_scale = per_channel_scale.unsqueeze(-1) metadata.palettization_scale = per_channel_scale # LUT quantization if self.config[param_name].lut_dtype is not None: dtype = self.config[param_name].lut_dtype compression_type.append("quantization") metadata.quantization_n_bits = get_n_bits_from_dtype(dtype) scale = self._parameter_metadata[param_name]["lut_quantization_scale"] # match scale rank to lut rank for i in range(metadata.lut.dim() - scale.dim()): scale = scale.unsqueeze(-1) metadata.quantization_scale = scale zp = self._parameter_metadata[param_name]["lut_quantization_zp"] if zp is not None: # match zp rank to lut rank for i in range(metadata.lut.dim() - zp.dim()): zp = zp.unsqueeze(-1) metadata.zero_point = zp # vector axis for cluster_dim > 1 cluster_dim = self.config[param_name].cluster_dim if cluster_dim is not None and cluster_dim > 1: metadata.vector_axis = self.config[param_name].axis # Compression type metadata.compression_type = compression_type return metadata def _register_compression_metadata(self, param_name: str, param: _torch.Tensor): """ Register compression metadata on the model so that it can be serialized. """ metadata = self._get_compression_metadata(param_name, param) metadata.register(self.module) def _unscale_by_per_channel_scale(self, param_name: str, param: _torch.Tensor) -> _torch.Tensor: """ Re-scale the parameter with ``param_name`` back to its original range by multiplying per channel scales. """ if self.config[param_name].enable_per_channel_scale: per_channel_scale = self._parameter_metadata[param_name]["per_channel_scale"] flattened_param = param.flatten(1) flattened_param *= per_channel_scale param = flattened_param.reshape(param.shape) return param @classmethod def is_supported_module(cls, module: _torch.nn.Module) -> bool: """ Returns ``True`` if clustering this module is supported by this interface. """ return isinstance(module, cls.layer_type) def get_parameters(self) -> _Dict[str, _torch.Tensor]: """ Returns a dictionary of parameter name to the parameter tensor which should be clustered for this layer type. Scales the weights in the range ``[-1, 1]`` if ``per_channel_scale`` is enabled. """ return self._get_parameters_impl() def update_parameters(self, param_name: str, new_value: _torch.Tensor): """ Update the parameter corresponding to this parameter name with the new value. 
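Compression metadata for the parameter is registered on the module before the stored parameter value is overwritten.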
""" self._register_compression_metadata(param_name, new_value) self._update_parameters_impl(param_name, new_value) def get_param_config(self, param_name: str, param: _torch.Tensor) -> KMeansConfig: """ Returns KMeansConfig for the specified parameter """ config = self.config[param_name] block_size = param.shape[config.axis] if config.block_size is None else config.block_size cluster_dim = 1 if config.cluster_dim is None else config.cluster_dim importance = self._reshape_for_kmeans(param_name, config.importance) mask = self._reshape_for_kmeans(param_name, config.mask) return KMeansConfig( n_bits=config.n_bits, axis=config.axis, lut_dtype=config.lut_dtype, block_size=block_size, cluster_dim=cluster_dim, enable_per_channel_scale=config.enable_per_channel_scale, mask=mask, importance=importance, ) class Linear(KMeansModule): layer_type: _Type = _torch.nn.Linear parameter_names: _List[str] = ["weight"] def _init_parameter_metadata(self): self._parameter_metadata = { "weight": { "shape": self.module.weight.shape, } } def _get_parameters_impl(self): scaled_param = self._scale_by_per_channel_scale("weight", self.module.weight.data) return {"weight": self._reshape_for_kmeans("weight", scaled_param)} def _update_parameters_impl(self, param_name: str, new_value: _torch.Tensor): param = self._reshape_to_original(param_name, new_value) self.module.weight.data = self._unscale_by_per_channel_scale(param_name, param) def _reshape_for_kmeans(self, param_name: str, value: _torch.Tensor) -> _torch.Tensor: return value def _reshape_to_original(self, param_name: str, value: _torch.Tensor) -> _torch.Tensor: return value class Embedding(KMeansModule): layer_type: _Type = _torch.nn.Embedding parameter_names: _List[str] = ["weight"] def _init_parameter_metadata(self): self._parameter_metadata = { "weight": { "shape": self.module.weight.shape, } } def _get_parameters_impl(self): scaled_param = self._scale_by_per_channel_scale("weight", self.module.weight.data) return {"weight": self._reshape_for_kmeans("weight", scaled_param)} def _update_parameters_impl(self, param_name: str, new_value: _torch.Tensor): param = self._reshape_to_original(param_name, new_value) self.module.weight.data = self._unscale_by_per_channel_scale(param_name, param) def _reshape_for_kmeans(self, param_name: str, value: _torch.Tensor) -> _torch.Tensor: return value def _reshape_to_original(self, param_name: str, value: _torch.Tensor) -> _torch.Tensor: return value class Conv2d(KMeansModule): layer_type: _Type = _torch.nn.Conv2d parameter_names: _List[str] = ["weight"] def _init_parameter_metadata(self): self._parameter_metadata = { "weight": { "shape": self.module.weight.shape, } } def _get_parameters_impl(self): scaled_param = self._scale_by_per_channel_scale("weight", self.module.weight.data) return {"weight": self._reshape_for_kmeans("weight", scaled_param)} def _update_parameters_impl(self, param_name: str, new_value: _torch.Tensor): param = self._reshape_to_original(param_name, new_value) self.module.weight.data = self._unscale_by_per_channel_scale(param_name, param) def _reshape_for_kmeans(self, param_name: str, value: _torch.Tensor) -> _torch.Tensor: if value is None: return value if self.config[param_name].axis == 0: new_value = value.flatten(1) else: new_value = value.transpose(0, 1).flatten(1).transpose(0, 1) return new_value def _reshape_to_original(self, param_name: str, value: _torch.Tensor) -> _torch.Tensor: if value is None: return value weight_shape = self._parameter_metadata[param_name]["shape"] if self.config[param_name].axis 
== 0: new_value = value.reshape(weight_shape) else: new_value = ( value.transpose(0, 1) .reshape( ( weight_shape[1], weight_shape[0], weight_shape[2], weight_shape[3], ) ) .transpose(0, 1) ) return new_value class MultiheadAttention(KMeansModule): layer_type: _Type = _torch.nn.MultiheadAttention parameter_names: _List[str] = ["in_proj_weight"] def _init_parameter_metadata(self): self._parameter_metadata = { "in_proj_weight": { "shape": self.module.in_proj_weight.shape, }, } def _get_parameters_impl(self): scaled_param = self._scale_by_per_channel_scale( "in_proj_weight", self.module.in_proj_weight.data ) return {"in_proj_weight": self._reshape_for_kmeans("in_proj_weight", scaled_param)} def _update_parameters_impl(self, param_name: str, new_value: _torch.Tensor): param = self._reshape_to_original(param_name, new_value) self.module.in_proj_weight.data = self._unscale_by_per_channel_scale(param_name, param) def _reshape_for_kmeans(self, param_name: str, value: _torch.Tensor) -> _torch.Tensor: return value def _reshape_to_original(self, param_name: str, value: _torch.Tensor) -> _torch.Tensor: return value class KMeans: @classmethod def _get_block_to_cluster(cls, weight: _torch.Tensor, config: KMeansConfig, block_idx: int): """ Extract block weight to cluster. """ if config.axis == 0: block_importance = ( config.importance[block_idx : block_idx + config.block_size, :] if config.importance is not None else None ) block_weight = weight[block_idx : block_idx + config.block_size, :] block_mask = ( config.mask[block_idx : block_idx + config.block_size, :].flatten() if config.mask is not None else None ) else: block_importance = ( config.importance[:, block_idx : block_idx + config.block_size] if config.importance is not None else None ) block_weight = weight[:, block_idx : block_idx + config.block_size] block_mask = ( config.mask[:, block_idx : block_idx + config.block_size].flatten() if config.mask is not None else None ) return block_weight, block_importance, block_mask @classmethod def _cluster_weights_with_masking( cls, block_weight: _torch.Tensor, block_importance: _torch.Tensor, block_mask: _torch.Tensor, config: KMeansConfig, ) -> _Tuple[_Optional[_torch.Tensor], _Optional[_torch.Tensor]]: """ Cluster block weight with clustering only applied to masked weight elements. """ num_clusters = 2**config.n_bits block_weight_flatten = block_weight.flatten() block_weight_flatten_masked = block_weight_flatten[block_mask] if len(block_weight_flatten_masked) > 0: if block_importance is not None: block_importance_flatten = block_importance.flatten() kmeans_results = _kmeans1d.cluster( block_weight_flatten_masked.numpy(), num_clusters, weights=block_importance_flatten[block_mask].numpy(), ) else: kmeans_results = _kmeans1d.cluster( block_weight_flatten_masked.numpy(), num_clusters ) return _torch.tensor(kmeans_results.centroids), _torch.tensor(kmeans_results.clusters) return None, None @classmethod def _cluster_weights_1d( cls, block_weight: _torch.Tensor, block_importance: _torch.Tensor, config: KMeansConfig, ) -> _Tuple[_torch.Tensor, _torch.Tensor]: """ Cluster weights such that each centroid is a 1d scalar, i.e., cluster_dim == 1. 
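The flattened block weight is clustered into at most ``2 ** n_bits`` scalar centroids, optionally weighted by the flattened importance values.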
""" num_clusters = 2**config.n_bits block_weight_flatten = block_weight.flatten() if block_importance is not None: block_importance_flatten = block_importance.flatten() kmeans_results = _kmeans1d.cluster( block_weight_flatten.numpy(), num_clusters, weights=block_importance_flatten.numpy(), ) else: kmeans_results = _kmeans1d.cluster(block_weight_flatten.numpy(), num_clusters) return _torch.tensor(kmeans_results.centroids), _torch.tensor(kmeans_results.clusters) @classmethod def _cluster_weights_2d( cls, block_weight: _torch.Tensor, block_importance: _torch.Tensor, config: KMeansConfig, rank: int, ) -> _Tuple[_torch.Tensor, _torch.Tensor]: """ Cluster weights such that each centroid is a 2d vector, i.e., cluster_dim > 1. If axis = 0: vectors are chosen with elements along the output channel dimension. Example: weight = [ [1, 2, 3, 4], [5, 6, 7, 8], ] axis = 0 ======== clustering is done for the 4 points below: [ [1, 5], ---> point 1 [2, 6], ---> point 2 [3, 7], ---> point 3 [4, 8], ---> point 4 ] axis = 1 ======== clustering is done for the 4 points below: [ [1, 2], ---> point 1 [3, 4], ---> point 2 [5, 6], ---> point 3 [7, 8], ---> point 4 ] """ num_clusters = 2**config.n_bits # Convert weight from N-D to 2-D. # Apply 2-D k-means clustering on 2-D weights. if config.axis == 0: # (C_out, C_in, H, W) -> (C_in, C_out, H, W) # (C_in, C_out, H, W) -> (C_in * H * W * C_out // cluster_dim, cluster_dim) weight_2d = block_weight.transpose(0, 1).reshape(-1, config.cluster_dim) importance_2d = ( block_importance.transpose(0, 1) .reshape(-1, config.cluster_dim) .sum(dim=1, keepdim=True) if block_importance is not None else None ) else: # (C_out, C_in, H, W) -> (C_in, C_out * H * W) -> (C_out * H * W, C_in) # (C_out * H * W, C_in) -> (C_out * H * W * C_in // cluster_dim, cluster_dim) weight_2d = block_weight.reshape(-1, config.cluster_dim) importance_2d = ( block_importance.reshape(-1, config.cluster_dim).sum(dim=1, keepdim=True) if block_importance is not None else None ) # Optionally move tensors to GPU if _torch.cuda.is_available(): device_id = rank % _torch.cuda.device_count() weight_2d = weight_2d.to(f"cuda:{device_id}") importance_2d = ( importance_2d.to(f"cuda:{device_id}") if importance_2d is not None else None ) kmeans_results = _EfficientKMeans( n_clusters=num_clusters, init="kmeans++", n_init=5, max_iter=300, ).fit(weight_2d, sample_weight=importance_2d) weight_2d.cpu() if importance_2d is not None: importance_2d.cpu() return kmeans_results.cluster_centers_.cpu(), kmeans_results.labels_.cpu() @classmethod def _update_clustered_block_weight( cls, block_weight: _torch.Tensor, block_mask: _torch.Tensor, depalett_block_weight: _torch.Tensor, new_weight: _torch.Tensor, config: KMeansConfig, block_idx: int, ): """ Write back clustered weight in new weight. 
""" block_weight_flatten = block_weight.flatten() if block_mask is not None: new_block_weight = block_weight_flatten.clone() new_block_weight[block_mask] = depalett_block_weight new_block_weight = new_block_weight.reshape(block_weight.shape) else: if config.axis == 1 or config.cluster_dim == 1: new_block_weight = depalett_block_weight.reshape(block_weight.shape) else: # need to reshape back for cluster_dim > 1 and axis = 0 new_block_weight = depalett_block_weight.reshape( block_weight.shape[1], block_weight.shape[0] ).transpose(0, 1) if config.axis == 0: new_weight[block_idx : block_idx + config.block_size, :] = new_block_weight else: new_weight[:, block_idx : block_idx + config.block_size] = new_block_weight @classmethod @_torch.no_grad() def _cluster_weights_worker( cls, rank: int, work_q: _Union[_mp.Queue, _queue.Queue], results_q: _Union[_mp.Queue, _queue.Queue], ): while True: try: ( layer_name, weight_name, weight, config, ) = work_q.get_nowait() except _queue.Empty: break _logger.info(f"Starting to process layer {layer_name}") new_weight = _torch.zeros_like(weight, dtype=weight.dtype) _logger.info( f"Number of blocks in {layer_name}.{weight_name}: {weight.shape[config.axis] // config.block_size}" ) lut_quant_scale = [] lut_quant_zp = [] for block_idx in range(0, weight.shape[config.axis], config.block_size): block_weight, block_importance, block_mask = cls._get_block_to_cluster( weight, config, block_idx ) if block_mask is not None: if config.cluster_dim == 1: centroids, clusters = cls._cluster_weights_with_masking( block_weight, block_importance, block_mask, config ) else: # Masking not supported for cluster_dim > 1 centroids, clusters = None, None _logger.info( f"Skipping palettizing layer: {layer_name} with " f"cluster_dim: {config.cluster_dim} and mask, because " f"vector palettization with masking is not supported." 
) new_weight = weight.clone() else: if config.cluster_dim == 1: centroids, clusters = cls._cluster_weights_1d( block_weight, block_importance, config ) else: centroids, clusters = cls._cluster_weights_2d( block_weight, block_importance, config, rank ) if centroids is not None and clusters is not None: # quantize LUT if config.lut_dtype is not None: centroids, scale, zp = cls._quantize_centroids(config.lut_dtype, centroids) lut_quant_scale.append(scale) if zp: lut_quant_zp.append(zp) depalett_block_weight = centroids[clusters].to(weight.dtype) cls._update_clustered_block_weight( block_weight, block_mask, depalett_block_weight, new_weight, config, block_idx, ) # Combine quantization scales / zp for all LUTs into single tensor scale, zp = None, None if config.lut_dtype is not None and len(lut_quant_scale) > 0: scale = _torch.stack(lut_quant_scale, dim=config.axis) if len(lut_quant_zp) > 0: zp = _torch.stack(lut_quant_zp, dim=config.axis) _logger.info(f"Finished processing {weight_name} in layer {layer_name} successfully") results_q.put((layer_name, weight_name, new_weight, scale, zp)) _logger.info("Process done, work queue is empty") @classmethod def _quantize_centroids(cls, dtype: _torch.dtype, centroids: _torch.Tensor): centroids = centroids.numpy() ret = _compute_qparams( weight=centroids, nbits=get_n_bits_from_dtype(dtype), quantization_mode="LINEAR_SYMMETRIC", dtype=centroids.dtype, block_sizes=[0] * centroids.ndim, signed=get_sign_from_dtype(dtype), ) if ret is None: _logger.warning(f"Unable to quantize centroids {centroids}") return quant_centroids, scale, zp = ret dequant_centroids = _quantize_op.decompress( quant_centroids, scale, zp, ) # Convert back to torch tensors dequant_centroids = _torch.from_numpy(dequant_centroids) scale = _torch.from_numpy(scale) if zp is not None: zp = _torch.from_numpy(zp) return dequant_centroids, scale, zp @classmethod def _get_weights_to_cluster( cls, model: _torch.nn.Module, work_q: _Union[_mp.Queue, _queue.Queue], config: _Union[_Dict[str, _Dict[str, KMeansConfig]], KMeansConfig] = KMeansConfig(), ) -> _Tuple[_Dict[str, KMeansModule], _Dict[str, _Any]]: if not isinstance(config, dict): layers_to_cluster = get_atomic_layers( model, layer_types=list(KMeansSupportedModulesRegistry.get_supported_modules()), name_prefix="", ) config_dict = {} for layer_name, layer in layers_to_cluster.items(): layer_config = {} for param_name in KMeansSupportedModulesRegistry.get_kmeans_module( layer ).parameter_names: layer_config[param_name] = config config_dict[layer_name] = layer_config else: layers_to_cluster = { layer_name: model.get_submodule(layer_name) for layer_name, _ in config.items() } config_dict = config k_means_module_map = dict() param_dict = {} for layer_name, layer in layers_to_cluster.items(): layer_config = config_dict[layer_name] k_means_module_cls = KMeansSupportedModulesRegistry.get_kmeans_module(layer) k_means_module: KMeansModule = k_means_module_cls(layer, layer_config) k_means_module_map[layer_name] = k_means_module for param_name, param in k_means_module.get_parameters().items(): param_config = k_means_module.get_param_config(param_name, param) work_q.put((layer_name, param_name, param, param_config)) param_dict[f"{layer_name}${param_name}"] = (param, param_config) return k_means_module_map, param_dict @classmethod def _prepare_worker_processes( cls, num_workers: int ) -> _Tuple[ _Union[_mp.Queue, _queue.Queue], _Union[_mp.Queue, _queue.Queue], _Optional[_List[_mp.Process]], ]: raise NotImplementedError("This method is not implemented by base 
class.") @classmethod def _run_worker_processes( cls, work_q: _Union[_mp.Queue, _queue.Queue], results_q: _Union[_mp.Queue, _queue.Queue], worker_processes: _Optional[_List[_mp.Process]], ): raise NotImplementedError("This method is not implemented by base class.") @classmethod def _join_worker_processes(cls, worker_processes: _Optional[_List[_mp.Process]]): raise NotImplementedError("This method is not implemented by base class.") @classmethod @_torch.no_grad() def cluster_weights( cls, model: _torch.nn.Module, config: _Union[_Dict[str, _Dict[str, KMeansConfig]], KMeansConfig] = KMeansConfig(), num_workers: int = 1, ) -> _torch.nn.Module: work_q, results_q, worker_processes = cls._prepare_worker_processes(num_workers) k_means_module_map, param_dict = cls._get_weights_to_cluster( model=model, work_q=work_q, config=config, ) num_params = len(param_dict) remaining_params = param_dict def _worker_loop() -> None: cls._run_worker_processes(work_q, results_q, worker_processes) num_params_left = len(remaining_params) num_errors = 0 last_chance = False while remaining_params: try: layer_name, param_name, new_value, scale, zp = results_q.get(timeout=10) except _queue.Empty: if worker_processes is not None: # This if path is for ParallelKMeans # Check if workers are still running, in which case they may still be chewing on data and we # need to wait. Also identify if any worker died (maybe it has been killed for OOM) and count # it as an error for proc in list(worker_processes): if not proc.is_alive(): proc.join() if proc.exitcode != 0: _logger.error( f"Process {proc} exited with exit code {proc.exitcode}" ) num_errors += 1 alive_processes = sum(proc.is_alive() for proc in worker_processes) if not alive_processes: if last_chance: _logger.info( f"All processes are done, but queue is empty, which is unexpected. Expecting to " f"receive {num_params_left} more param(s). Will end now." ) break else: last_chance = True continue _logger.info( f"Result queue is empty, but {alive_processes} process(es) is / are still alive, " f"continuing..." ) continue else: # This else path is for SequentialKMeans if not last_chance: last_chance = True continue else: raise ValueError( f"Queue is empty, which is unexpected. Expecting to receive {num_params_left} more " f"param(s)." 
) else: _logger.info(f"Progress: {100 * (1.0 - (num_params_left / num_params)):.2f} %") k_means_module = k_means_module_map[layer_name] k_means_module._parameter_metadata[param_name]["lut_quantization_scale"] = scale k_means_module._parameter_metadata[param_name]["lut_quantization_zp"] = zp k_means_module.update_parameters(param_name, new_value) remaining_params.pop(f"{layer_name}${param_name}") # Even though it might not have succeeded num_params_left -= 1 _logger.info("joining worker processes") cls._join_worker_processes(worker_processes) _worker_loop() if remaining_params: _logger.error( f"The {len(remaining_params)} following params of following layers were not successfully palettized and" f" a new palettization will be attempted using a single worker: {', '.join(sorted(remaining_params))}" ) work_q, results_q, worker_processes = cls._prepare_worker_processes( num_workers=1 ) # Running the remaining params with 1 worker as that is more stable for current_param, param_tuple in remaining_params.items(): layer_name, param_name = current_param.split("$") work_q.put((layer_name, param_name, param_tuple[0], param_tuple[1])) _worker_loop() if remaining_params: raise RuntimeError( f"Even after rerunning all failed layers with a single worker, {len(remaining_params)} are " f"still missing: {', '.join(sorted(remaining_params))}" ) else: _logger.info( "After rerunning all failed layers with a single worker, all palettizations succeeded" ) _register_metadata_version(model) return model class ParallelKMeans(KMeans): @classmethod def _prepare_worker_processes( cls, num_workers: int, ) -> _Tuple[ _Union[_mp.Queue, _queue.Queue], _Union[_mp.Queue, _queue.Queue], _Optional[_List[_mp.Process]], ]: ctx = _mp.get_context("spawn") manager = ctx.Manager() work_q = manager.Queue() results_q = manager.Queue() worker_processes = [ ctx.Process( target=cls._cluster_weights_worker, args=(rank, work_q, results_q), name=f"Process-{rank}", daemon=True, ) for rank in range(num_workers) ] return work_q, results_q, worker_processes @classmethod def _run_worker_processes( cls, work_q: _Union[_mp.Queue, _queue.Queue], results_q: _Union[_mp.Queue, _queue.Queue], worker_processes: _Optional[_List[_mp.Process]], ): for worker_process in worker_processes: worker_process.start() _logger.info(f"Started {worker_process.name} for clustering weights.") @classmethod def _join_worker_processes(cls, worker_processes: _Optional[_List[_mp.Process]]): for worker_process in worker_processes: worker_process.join() _logger.info(f"Finished {worker_process.name}.") class SequentialKMeans(KMeans): @classmethod def _prepare_worker_processes( cls, num_workers: int ) -> _Tuple[ _Union[_mp.Queue, _queue.Queue], _Union[_mp.Queue, _queue.Queue], _Optional[_List[_mp.Process]], ]: work_q = _queue.Queue() results_q = _queue.Queue() return work_q, results_q, None @classmethod def _run_worker_processes( cls, work_q: _Union[_mp.Queue, _queue.Queue], results_q: _Union[_mp.Queue, _queue.Queue], worker_processes: _Optional[_List[_mp.Process]], ): cls._cluster_weights_worker(0, work_q, results_q) @classmethod def _join_worker_processes(cls, worker_processes: _Optional[_List[_mp.Process]]): return ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/_utils/math_utils.py0000644000000000000000000000050414672066616024041 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import torch as _torch def rmse_error(a, b): return _torch.sqrt(_torch.mean(_torch.square(a - b))) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/_utils/metadata_utils.py0000644000000000000000000001367314672066616024703 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from enum import Enum from typing import Dict as _Dict from typing import List as _List from typing import Optional as _Optional import torch as _torch from attr import define as _define from attr import field as _field from attrs import validators as _validators from coremltools.optimize.torch._utils.python_utils import DictableDataClass as _DictableDataClass STATE_DICT_METADATA_BUFFER_PREFIX = "_COREML_" BUFFER_NAME_SEPARATOR = "/" METADATA_VERSION_BUFFER = ( STATE_DICT_METADATA_BUFFER_PREFIX + BUFFER_NAME_SEPARATOR + "metadata_version" ) METADATA_VERSION = _torch.tensor(1) class CompressionType(Enum): pruning = 1 palettization = 2 quantization = 3 def __str__(self): return self.name @_define class CompressionMetadata(_DictableDataClass): """ Class to encapsulate and register (store as buffer in state_dict) compression metadata per parameter within a module. Args: param_name (:obj:`str`): Name of parameter corresponding to which metadata is stored. quantization_n_bits (:obj:`int`): The dtype to use for quantizing the weights. quantization_scale (:py:class:`torch.Tensor`): Quantization parameters used for scaling weights. zero_point (:py:class:`torch.Tensor`): Quantization parameters used for translating weights in affine or unsigned symmetric quantization. lut (:py:class:`torch.Tensor`): Look up table for palettized weights. palettization_scale (:py:class:`torch.Tensor`): Per channel scales used to normalize weights before being palettized. compression_type (:obj:`list` of :py:class:`CompressionType`): List of compression types applied to the parameter in the order in which they were applied. 
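vector_axis (:obj:`int`): Axis along which vector palettization was applied, when the look up table stores vector centroids (``cluster_dim > 1``).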
""" param_name: str = _field(validator=_validators.optional(_validators.instance_of(str))) quantization_n_bits: _Optional[int] = _field( default=None, validator=_validators.optional(_validators.instance_of(int)) ) quantization_scale: _Optional[_torch.Tensor] = _field( default=None, validator=_validators.optional(_validators.instance_of(_torch.Tensor)), ) zero_point: _Optional[_torch.Tensor] = _field( default=None, validator=_validators.optional(_validators.instance_of(_torch.Tensor)), ) lut: _Optional[_torch.Tensor] = _field( default=None, validator=_validators.optional(_validators.instance_of(_torch.Tensor)), ) palettization_scale: _Optional[_torch.Tensor] = _field( default=None, validator=_validators.optional(_validators.instance_of(_torch.Tensor)), ) vector_axis: _Optional[int] = _field( default=None, validator=_validators.optional([_validators.instance_of(int)]), ) compression_type: _Optional[_List[str]] = _field( default=None, converter=lambda lst: [CompressionType[item].value for item in lst] if lst else None, validator=_validators.optional( _validators.deep_iterable( member_validator=_validators.instance_of(int), iterable_validator=_validators.instance_of(list), ) ), ) def register(self, module: _torch.nn.Module, override_compression_type: bool = False): """ Register compression metadata as buffers in module's state_dict, In case of joint compression, compression_type metadata is appended to module's existing compression type, if any. If ``override_compression_type`` flag is set, module's existing compression type metadata is overridden. """ for metadata, value in self.as_dict().items(): if metadata == "param_name" or value is None: continue buffer_name = self._get_metadata_buffer_name(metadata) # Handle chaining of compression types if metadata == "compression_type" and not override_compression_type: try: current_value = module.get_buffer(buffer_name) value = current_value.tolist() + value except AttributeError: # Previous value doesn't exist pass # Wrap value as a tensor to register as a buffer in module state_dict if not _torch.is_tensor(value): value = _torch.tensor(value) module.register_buffer(buffer_name, value) def _get_metadata_buffer_name(self, metadata_key: str) -> str: return BUFFER_NAME_SEPARATOR.join( [STATE_DICT_METADATA_BUFFER_PREFIX, self.param_name, metadata_key] ) @classmethod def from_state_dict(cls, prefixed_dict) -> _Dict[str, "CompressionMetadata"]: """ Initialize per parameter CompressionMetadata from state_dict """ param_to_metadata_dict = dict() for key, value in prefixed_dict.items(): if key.startswith(STATE_DICT_METADATA_BUFFER_PREFIX) and key != METADATA_VERSION_BUFFER: prefix, param_name, metadata = key.split(BUFFER_NAME_SEPARATOR) if param_name not in param_to_metadata_dict: param_to_metadata_dict[param_name] = {"param_name": param_name} # For compression type, convert tensor to list of strings if metadata == "compression_type": value = [str(CompressionType(x)) for x in value.tolist()] param_to_metadata_dict[param_name][metadata] = value result = { pname: cls.from_dict(metadata) for pname, metadata in param_to_metadata_dict.items() } return result def register_metadata_version(model: _torch.nn.Module): """ Register metadata version for the model """ model.register_buffer(METADATA_VERSION_BUFFER, METADATA_VERSION) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/_utils/python_utils.py0000644000000000000000000000732114672066616024435 0ustar00rootroot# Copyright (c) 
2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import logging as _logging from collections import OrderedDict as _OrderedDict from typing import IO as _IO from typing import Any as _Any from typing import Dict as _Dict from typing import Type as _Type from typing import Union as _Union import cattrs as _cattrs import torch as _torch import yaml as _yaml from attr import asdict as _asdict _logger = _logging.getLogger(__name__) def get_str(val: _Any): if isinstance(val, float): return f"{val:.5f}" return str(val) class RegistryMixin: REGISTRY = None @classmethod def register(cls, name: str): if cls.REGISTRY is None: cls.REGISTRY = _OrderedDict() def inner_wrapper(wrapped_obj): if name in cls.REGISTRY: _logger.warning( f"Name: {name} is already registered with object: {cls.REGISTRY[name].__name__} " f"in registry: {cls.__name__}" f"Over-writing the name with new class: {wrapped_obj.__name__}" ) cls.REGISTRY[name] = wrapped_obj return wrapped_obj return inner_wrapper @classmethod def _get_object(cls, name: str): if name in cls.REGISTRY: return cls.REGISTRY[name] raise NotImplementedError( f"No object is registered with name: {name} in registry {cls.__name__}." ) class ClassRegistryMixin(RegistryMixin): @classmethod def get_class(cls, name: str): return cls._get_object(name) class FunctionRegistryMixin(RegistryMixin): @classmethod def get_function(cls, name: str): return cls._get_object(name) class DictableDataClass: """ Utility class that provides convertors to and from Python dict """ @classmethod def from_dict(cls, data_dict: _Dict[str, _Any]) -> "DictableDataClass": """ Create class from a dictionary of string keys and values. Args: data_dict (:obj:`dict` of :obj:`str` and values): A nested dictionary of strings and values. """ # Explicitly raise exception for unrecognized keys cls._validate_dict(data_dict) converter = _cattrs.Converter(forbid_extra_keys=True) converter.register_structure_hook(_torch.Tensor, lambda obj, type: obj) return converter.structure_attrs_fromdict(data_dict, cls) @classmethod def from_yaml(cls, yml: _Union[_IO, str]) -> "DictableDataClass": """ Create class from a yaml stream. Args: yml: An :py:class:`IO` stream containing yaml or a :obj:`str` path to the yaml file. """ if isinstance(yml, str): with open(yml, "r") as file: dict_from_yml = _yaml.safe_load(file) else: dict_from_yml = _yaml.safe_load(yml) if dict_from_yml is None: dict_from_yml = {} assert isinstance(dict_from_yml, dict), ( "Invalid yaml received. yaml stream should return a dict " f"on parsing. Received type: {type(dict_from_yml)}." ) return cls.from_dict(dict_from_yml) def as_dict(self) -> _Dict[str, _Any]: """ Returns the config as a dictionary. """ return _asdict(self) @classmethod def _validate_dict(cls: _Type, config_dict: _Dict[str, _Any]): for key, _ in config_dict.items(): if not hasattr(cls, key): raise ValueError(f"Found unrecognized key {key} in config_dict: {config_dict}.") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/_utils/registry.py0000644000000000000000000000774314672066616023554 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
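# Illustrative usage sketch (hypothetical classes, not part of the original
# source), mirroring how registries elsewhere in this package subclass
# BaseRegistry:
#
#     from coremltools.optimize.torch._utils.registry import BaseRegistry
#
#     class ToyTransformRegistry(BaseRegistry):
#         def __init_subclass__(cls, *args, **kwargs):
#             # Auto-register an instance of every subclass.
#             ToyTransformRegistry.instantiate(cls, *args, **kwargs)
#
#     class ToyTransform(ToyTransformRegistry):
#         pass
#
#     ToyTransformRegistry.get_registry()           # {"ToyTransform": <ToyTransform ...>}
#     list(ToyTransformRegistry.get_registry_values())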
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from abc import ABC as _ABC class BaseRegistry(_ABC): """ Base class for registries that register all subclasses automatically for ease-of-use. """ # Maps from child class registry name to child class registry registry_map = dict() def __init_subclass__(cls, *args, **kwargs): # Adds mapping from child class registry name to empty child class registry BaseRegistry.registry_map[cls.__name__] = dict() @classmethod def instantiate(cls, subcls, *args, **kwargs): """ Instantiates a subclass entry in the registry of the provided class. The registry is stored as a dictionary that maps from the subclass name to a freshly created instance of the subclass. Args: cls: The registry class, which is a subclass of BaseRegistry. subcls: The subclass to be registered in the registry class. args: The arguments to be used to create an instance of the subclass. kwargs: The keyword arguments to be used to create an instance of the subclass. """ subcls_instance = subcls(*args, **kwargs) cls.register(subcls_instance) @classmethod def instantiate_key(cls, subcls_key, subcls, *args, **kwargs): """ Instantiates a subclass entry in the registry of the provided class. The registry is stored as a dictionary that maps from the subclass key to a freshly created instance of the subclass. Args: cls: The registry class, which is a subclass of BaseRegistry. subcls_key: The subclass key to be used for the registry entry. subcls: The subclass to be registered in the registry class. args: The arguments to be used to create an instance of the subclass. kwargs: The keyword arguments to be used to create an instance of the subclass. """ subcls_instance = subcls(*args, **kwargs) cls.register_key(subcls_key, subcls_instance) @classmethod def register(cls, subcls): """ Registers subclass instance in registry of provided class. Uses the subclass name as the key for the registry entry. Args: cls: The registry class, which is a subclass of BaseRegistry. subcls: The subclass instance to register in the registry class. """ registry = cls.get_registry() # Syntax is needed because cannot look up __name__ from class instance registry[subcls.__class__.__name__] = subcls @classmethod def register_key(cls, subcls_key, subcls): """ Registers subclass instance in registry of provided class. Uses the subclass key as the key for the registry entry. Args: cls: The registry class, which is a subclass of BaseRegistry. subcls_key: The subclass key to be used for the registry entry. subcls: The subclass instance to register in the registry class. """ registry = cls.get_registry() registry[subcls_key] = subcls @classmethod def get_registry(cls): """ Looks up the registry corresponding to the provided registry class and returns it. Args: cls: The registry class, which is a subclass of BaseRegistry. """ return BaseRegistry.registry_map[cls.__name__] @classmethod def get_registry_values(cls): """ Looks up the registry corresponding to the provided registry class and returns its values. This is useful for List/Set style registries with keys generated automatically by this class. Args: cls: The registry class, which is a subclass of BaseRegistry. 
""" return BaseRegistry.registry_map[cls.__name__].values() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/_utils/report_utils.py0000644000000000000000000000744014672066616024431 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import logging as _logging from typing import Tuple, Type import torch from coremltools.optimize.torch._utils.math_utils import rmse_error from coremltools.optimize.torch._utils.metadata_utils import CompressionMetadata, CompressionType from coremltools.optimize.torch.base_model_optimizer import _Report from coremltools.optimize.torch.pruning._utils import ( block2_sparsity, structured_sparsity, unstructured_sparsity, ) _logger = _logging.getLogger(__name__) def _normalize_report(report: _Report) -> _Report: """ Normalizes the report by making sure all parameter reports have the same number """ all_keys = set() for _, param_report in report.items(): for key in param_report: all_keys.add(key) for _, param_report in report.items(): for key in all_keys: if key not in param_report: param_report[key] = -1 return report def compute_post_training_report( uncompressed_model: torch.nn.Module, compressed_model: torch.nn.Module, supported_modules: Tuple[Type[torch.nn.Module]], ) -> _Report: """ Computes rmse between compressed and uncompressed parameters """ report = _Report() for name, module in compressed_model.named_modules(): if not isinstance(module, supported_modules): continue compression_metadata = CompressionMetadata.from_state_dict(module.state_dict()) for param_name in compression_metadata: module_summary = dict() param_key = f"{name}.{param_name}" if name else param_name with torch.no_grad(): compression_types = [ CompressionType(x) for x in compression_metadata[param_name].compression_type ] uncompressed_module = uncompressed_model.get_submodule(name) compressed_param = module.get_parameter(param_name) uncompressed_param = uncompressed_module.get_parameter(param_name) module_summary["error"] = rmse_error(compressed_param, uncompressed_param).item() module_summary["#params"] = int(torch.numel(compressed_param)) if CompressionType.pruning in compression_types: sparse_summary = { "structured_weight_sparsity": structured_sparsity(compressed_param), "unstructured_weight_sparsity": unstructured_sparsity(compressed_param), } if compressed_param.size(0) % 2 == 0: sparse_summary["block2_weight_sparsity"] = block2_sparsity(compressed_param) else: sparse_summary["block2_weight_sparsity"] = -1 # Not applicable module_summary.update(sparse_summary) if CompressionType.quantization in compression_types: quantization_n_bits = compression_metadata[param_name].quantization_n_bits # FIXME: add sign of dtype here module_summary["dtype"] = f"dtype=int{quantization_n_bits}" if CompressionType.palettization in compression_types: lut_shape = compression_metadata[param_name].lut.shape n_clusters = lut_shape[-2] cluster_dim = lut_shape[-1] module_summary[ "palettization_mode" ] = f"num_clusters={n_clusters}, cluster_dim={cluster_dim}" report[param_key] = module_summary report = _normalize_report(report) return report ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 
coremltools-8.0/coremltools/optimize/torch/_utils/state_dict_utils.py0000644000000000000000000000426414672066616025242 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import Any, Dict, Mapping, NamedTuple import torch class AddMetadataStateDictHook: """ Create a hook that will add the given keys/values in the state dict metadata of the module it is registered on Args: extra_metadata: the extra state dict to be added to the state dict allow_overwrite: If True, do not raise if any of the keys are already in the state dict and would be overwritten by the new state """ def __init__(self, extra_metadata: Mapping[str, Any], allow_overwrite: bool = False): self.extra_metadata = extra_metadata self.allow_overwrite = allow_overwrite def __call__( self, module: torch.nn.Module, destination: Dict[str, torch.Tensor], prefix: str, local_metadata: Dict[str, Any], ) -> Dict[str, torch.Tensor]: for key, value in self.extra_metadata.items(): if key in local_metadata and not self.allow_overwrite: raise ValueError( f"Metadata key '{key}' would be overwritten as it already exists in the local_metadata dict: {local_metadata[key]}" ) local_metadata[key] = value return destination class LoadStateDictPostHook: """ Create a hook that acts on the module after its state_dict has been loaded. """ def __call__(self, module: torch.nn.Module, incompatible_keys: NamedTuple) -> None: pass def _verify_state_dict(state_dict, expected_keys): missing_keys = [] unexpected_keys = [] for key in state_dict: if key not in expected_keys: unexpected_keys.append(key) if len(unexpected_keys) > 0: raise ValueError(f"Found unexpected keys {unexpected_keys} in state_dict: {state_dict}") for key in expected_keys: if key not in state_dict: missing_keys.append(key) if len(missing_keys) > 0: raise ValueError(f"Missing keys {missing_keys} from state_dict: {state_dict}") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/_utils/torch_utils.py0000644000000000000000000001510214672066616024227 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
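# Illustrative usage sketch (hypothetical model, not part of the original
# source):
#
#     import torch
#     from coremltools.optimize.torch._utils.torch_utils import (
#         get_atomic_layers,
#         get_n_bits_from_dtype,
#         get_parent_child_name,
#     )
#
#     model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU())
#     get_atomic_layers(model, layer_types=[torch.nn.Conv2d])  # {"0": Conv2d(3, 8, ...)}
#     get_n_bits_from_dtype("uint4")                           # 4
#     get_parent_child_name("encoder.block.conv")              # ("encoder.block", "conv")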
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import logging as _logging import operator as _operator import re as _re from contextlib import contextmanager from distutils.version import StrictVersion as _StrictVersion from typing import Any as _Any from typing import Dict as _Dict from typing import List as _List from typing import Tuple as _Tuple from typing import Type as _Type from typing import Union as _Union import numpy as _np import torch as _torch import torch.nn as _nn _logger = _logging.getLogger(__name__) def list_or_str_to_tensor(alist: _Union[_List[int], str, _torch.Tensor]) -> _torch.Tensor: if isinstance(alist, _torch.Tensor): return alist elif isinstance(alist, str): # Safety check since we are calling eval range_str_regex = r"^(range)\(\d+(\,?\s*\d+){0,2}\)$" assert _re.match(range_str_regex, alist), ( f"{alist} is invalid.", "Please provide a string such as 'range(...)'", ) try: alist = eval(alist) except Exception: _logger.error( f"Invalid range str {alist}.", "Please refer to the documentation for correct usage", ) return _torch.tensor( _np.ones( len(alist), ) * alist, dtype=_torch.float32, requires_grad=False, ) def _get_dtype_info(dtype: _torch.dtype): if dtype.is_floating_point: info_fn = _torch.finfo else: info_fn = _torch.iinfo return info_fn(dtype) def get_n_bits_from_dtype(dtype: _Union[str, _torch.dtype]) -> int: if type(dtype) is _torch.dtype: dtype_info = _get_dtype_info(dtype) return dtype_info.bits elif type(dtype) is str: return int(_re.search(r"\d+", dtype).group()) else: raise TypeError( "dtype must either be a string or an instance of torch.dtype," f" not {type(dtype)}" ) def get_sign_from_dtype(dtype: _Union[str, _torch.dtype]) -> int: if type(dtype) is _torch.dtype: dtype_info = _get_dtype_info(dtype) return dtype_info.min < 0 elif type(dtype) is str: return not dtype.startswith("u") else: raise TypeError( "dtype must either be a string or an instance of torch.dtype," f" not {type(dtype)}" ) def maybe_convert_str_to_dtype(dtype: _Union[str, _torch.dtype]) -> _torch.dtype: _str_to_dtype_map = { "quint8": _torch.quint8, "qint8": _torch.qint8, "float32": _torch.float32, "int8": _torch.int8, "uint8": _torch.uint8, # Torch doesn't support int4 or int3 # but we can represent it as int8 "int4": _torch.int8, "uint4": _torch.uint8, "qint4": _torch.qint8, "quint4": _torch.quint8, "uint3": _torch.uint8, "int3": _torch.int8, "fp8_e4m3": _torch.float8_e4m3fn, "fp8_e5m2": _torch.float8_e5m2, } if isinstance(dtype, str): dtype = dtype.lower() if dtype in _str_to_dtype_map: return _str_to_dtype_map[dtype] else: raise ValueError(f"Received unsupported dtype: {dtype}") elif isinstance(dtype, _torch.dtype): return dtype else: raise ValueError(f"Received unrecognized type for dtype: {type(dtype)}") def maybe_convert_str_to_mod_type(mod_type: str): """ Convert str to module type """ if not isinstance(mod_type, str): return mod_type if _re.fullmatch(r"operator\.[a-z]+", mod_type) and hasattr(_operator, mod_type.split(".")[-1]): return getattr(_operator, mod_type.split(".")[-1]) elif _re.fullmatch(r"torch\.[a-z]+", mod_type) and hasattr(_torch, mod_type.split(".")[-1]): return getattr(_torch, mod_type.split(".")[-1]) elif hasattr(_torch.nn, mod_type): return getattr(_torch.nn, mod_type) elif hasattr(_torch.nn.functional, mod_type): return getattr(_torch.nn.functional, mod_type) return mod_type @contextmanager def get_eval_model(model: _nn.Module): train_flag = 
model.training try: yield model.eval() finally: model.train(mode=train_flag) def get_parent_child_name(name: str) -> _Tuple[str, str]: """ Returns name of parent and child modules from a full module name. """ split = name.rsplit(".", 1) if len(split) == 1: return "", split[0] else: return split[0], split[1] def get_fully_qualified_name(model: _torch.nn.Module, module: _torch.nn.Module) -> str: """ Returns fully qualified name for a module if it exists in the model. The fully qualified name can be used to fetch the module using ``model.get_submodule``. """ for mod_name, mod in model.named_modules(remove_duplicate=True): if mod == module: return mod_name raise ValueError(f"Module: {module} is not a submodule of {model}.") def get_atomic_layers( module: _nn.Module, layer_types: _Union[_List[str], _List[_Type]], name_prefix: str = "", ) -> _Dict[str, _nn.Module]: """ Returns a dictionary of layer_name: layer for every layer in the module which matches the types specified in layers_to_find. """ if isinstance(module, tuple(layer_types)): return {name_prefix: module} result = {} for name, child in module.named_children(): result.update( get_atomic_layers( child, layer_types=layer_types, name_prefix=name_prefix + "." + name if name_prefix != "" else name, ) ) return result def clone_tensor_object(obj: _Any): """ Clone a nested list, tuple or dict of tensors. """ if isinstance(obj, _torch.Tensor): return obj.clone() elif isinstance(obj, tuple): return tuple(clone_tensor_object(item) for item in obj) elif isinstance(obj, list): return [clone_tensor_object(item) for item in obj] elif isinstance(obj, dict): return {key: clone_tensor_object(val) for key, val in obj.items()} else: raise ValueError(f"Cannot clone unrecognized object type: {obj}.") def get_torch_version(version): """ returns torch version given a version string. Works for versions like "2.1.1", "2.1.1+cpu", "2.1.1+rc" etc and would return 2.1.1 for these cases """ version_regex = r"\d+\.\d+\.\d+" version = _re.search(version_regex, str(version)).group(0) return _StrictVersion(version) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/_utils/transforms.py0000644000000000000000000011321014672066616024065 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
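# Illustrative sketch of what the transforms below accomplish (hypothetical
# model, not part of the original source). Given a traced module such as
#
#     class ToyNet(torch.nn.Module):
#         def __init__(self):
#             super().__init__()
#             self.w = torch.nn.Parameter(torch.randn(8, 4))
#
#         def forward(self, x):
#             return torch.nn.functional.linear(x, self.w)
#
# the `call_function` node targeting torch.nn.functional.linear is matched by
# FuncToModuleLinear and replaced with an equivalent
# `torch.nn.Linear(4, 8, bias=False)` submodule that reuses the traced weight
# parameter.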
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import operator as _operator from abc import ABC as _ABC from abc import abstractmethod as _abstractmethod from enum import Enum as _Enum from typing import Any as _Any from typing import Dict as _Dict from typing import List as _List from typing import Tuple as _Tuple import torch as _torch from coremltools.optimize.torch._utils.registry import BaseRegistry as _BaseRegistry from coremltools._deps import _HAS_TORCH_VISION, MSG_TORCH_VISION_NOT_FOUND if _HAS_TORCH_VISION: import torchvision as _torchvision _CONV_FUNCS = ( _torch.nn.functional.conv1d, _torch.nn.functional.conv2d, _torch.nn.functional.conv3d, ) _DECONV_FUNCS = ( _torch.nn.functional.conv_transpose1d, _torch.nn.functional.conv_transpose2d, _torch.nn.functional.conv_transpose3d, ) _MAXPOOL_FUNCS = ( _torch.nn.functional.max_pool1d, _torch.nn.functional.max_pool2d, _torch.nn.functional.max_pool3d, ) _AVGPOOL_FUNCS = ( _torch.nn.functional.avg_pool1d, _torch.nn.functional.avg_pool2d, _torch.nn.functional.avg_pool3d, ) _ADAPTIVEAVGPOOL_FUNCS = ( _torch.nn.functional.adaptive_avg_pool1d, _torch.nn.functional.adaptive_avg_pool2d, _torch.nn.functional.adaptive_avg_pool3d, ) _LINEAR_CONV_DECONV_FUNCS = (_torch.nn.functional.linear,) + _CONV_FUNCS + _DECONV_FUNCS def get_kernel_size(weights: _torch.nn.parameter.Parameter) -> _Tuple[int]: """ Returns the kernel size of the input parameters given weights tensor. Assumes the first and second dimension correspond to channel dimensions. Assumes the remaining dimensions are spatial dimensions, which correspond to the kernel size. """ num_spatial_dims = len(weights.shape) - 2 assert num_spatial_dims >= 1, "Number of spatial dimensions should be at least 1" return tuple(weights.shape[2:]) def fetch_argument( args: _List[_Any], kwargs: _Dict[str, _Any], default: _Any, position: int, keyword: str = None, ) -> _Any: """ Given a list of arguments, a dictionary of keyword arguments, a default value for the parameter if it cannot be found, the expected position of the parameter and the keyword corresponding to the parameter, this function determines the value of the parameter and returns it. """ ret = default if keyword in kwargs: ret = kwargs[keyword] elif len(args) > position: ret = args[position] return ret def load_arg(a, env): """ Loads argument a from environment dictionary env and returns the result. """ return _torch.fx.graph.map_arg(a, lambda n: env[n.name]) def fetch_attr(model: _torch.nn.Module, target: str) -> _Any: """ Returns attribute within model that corresponds to the input target string. 
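For example (illustrative), ``fetch_attr(model, "conv.weight")`` returns ``model.conv.weight`` and raises an error if any attribute along the path does not exist.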
""" target_atoms = target.split(".") attr_itr = model for i, atom in enumerate(target_atoms): if not hasattr(attr_itr, atom): raise RuntimeError( f"Node referenced nonexistent target {'.'.join(target_atoms[:i])}" ) attr_itr = getattr(attr_itr, atom) return attr_itr def fetch_func_params( model: _torch.nn.Module, weight_node: _torch.fx.Node, bias_node: _torch.fx.Node ) -> _Tuple[_Any]: """ Given a model, weight node and bias node, this function fetches the attributes corresponding to these nodes and returns them as a 2-tuple (weight, bias) """ weight = None if weight_node is not None: assert weight_node.op == "get_attr", "unsupported op for weight node" weight = fetch_attr(model, weight_node.target) bias = None if bias_node is not None: assert bias_node.op == "get_attr", "unsupported op for bias node" bias = fetch_attr(model, bias_node.target) return (weight, bias) class ReplacementType(_Enum): kMODULE = 0 kFUNCTION = 1 class Transform(_ABC): """ Abstract base class for all transformations. """ @_abstractmethod def __str__(self) -> str: """ Returns the name of the transform. """ pass @_abstractmethod def match_pattern( self, node: _torch.fx.Node, modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> bool: """ Determines whether the input node matches the requirements for the transform to take place and returns the result. """ pass @_abstractmethod def get_replacement_type(self) -> ReplacementType: """ Returns the replacement type enumeration for this transformation. """ pass @_abstractmethod def get_replacement( self, node: _torch.fx.Node, output_shape: _Tuple[int], modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> _Any: """ Computes the transformed node as a result of the transformation given input node and expected output shape of this node. """ pass # TODO: Add transforms for # adaptive_max_pool1d, adaptive_max_pool2d, adaptive_max_pool3d class TransformRegistry(_BaseRegistry): """ Registry that contains general transforms for rewriting PyTorch network to canonical representation. """ def __init_subclass__(cls, *args, **kwargs): TransformRegistry.instantiate(cls, *args, **kwargs) def requires_grad(weights: _Any) -> bool: """ If the given weights is a torch parameter instance and has requires_grad == True, returns True. Otherwise returns False. """ return isinstance(weights, _torch.nn.parameter.Parameter) and weights.requires_grad class FuncToModuleLinear(Transform, TransformRegistry): """ Subclass of class Transform that rewrites Linear functions to Linear modules. 
""" def __str__(self) -> str: return "FuncToModuleLinear" def match_pattern( self, node: _torch.fx.Node, modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> bool: # Checks if we are calling a function # The target attribute is the function that call_function calls return node.op == "call_function" and node.target == _torch.nn.functional.linear def get_replacement_type(self) -> ReplacementType: return ReplacementType.kMODULE def get_replacement( self, model: _torch.nn.Module, node: _torch.fx.Node, output_shape: _Tuple[int], modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> _torch.nn.Module: weight_node = node.args[1] bias_node = fetch_argument(node.args, node.kwargs, None, 2, "bias") (weight, bias) = fetch_func_params(model, weight_node, bias_node) out_features = weight.shape[0] in_features = weight.shape[1] repl = _torch.nn.Linear(in_features, out_features, bias is not None) repl.weight = _torch.nn.parameter.Parameter( weight, requires_grad=requires_grad(weight) ) if bias is not None: repl.bias = _torch.nn.parameter.Parameter( bias, requires_grad=requires_grad(bias) ) return repl class FuncToModuleConv(Transform, TransformRegistry): """ Subclass of class Transform that rewrites Conv functions to Conv modules. """ def __str__(self) -> str: return "FuncToModuleConv" def match_pattern( self, node: _torch.fx.Node, modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> bool: # Checks if we are calling a function # The target attribute is the function that call_function calls return node.op == "call_function" and node.target in _CONV_FUNCS def get_replacement_type(self) -> ReplacementType: return ReplacementType.kMODULE def get_replacement( self, model: _torch.nn.Module, node: _torch.fx.Node, output_shape: _Tuple[int], modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> _torch.nn.Module: weight_node = node.args[1] bias_node = fetch_argument(node.args, node.kwargs, None, 2, "bias") (weight, bias) = fetch_func_params(model, weight_node, bias_node) kernel_size = get_kernel_size(weight) stride = fetch_argument(node.args, node.kwargs, 1, 3, "stride") padding = fetch_argument(node.args, node.kwargs, 0, 4, "padding") dilation = fetch_argument(node.args, node.kwargs, 1, 5, "dilation") groups = fetch_argument(node.args, node.kwargs, 1, 6, "groups") out_channels = weight.shape[0] in_channels = weight.shape[1] * groups args = [ in_channels, out_channels, kernel_size, stride, padding, dilation, groups, bias is not None, ] repl = None if node.target == _torch.nn.functional.conv1d: repl = _torch.nn.Conv1d(*args) elif node.target == _torch.nn.functional.conv2d: repl = _torch.nn.Conv2d(*args) else: assert node.target == _torch.nn.functional.conv3d, "unsupported node target" repl = _torch.nn.Conv3d(*args) repl.weight = _torch.nn.parameter.Parameter( weight, requires_grad=requires_grad(weight) ) if bias is not None: repl.bias = _torch.nn.parameter.Parameter( bias, requires_grad=requires_grad(bias) ) return repl class FuncToModuleDeconv(Transform, TransformRegistry): """ Subclass of class Transform that rewrites Deconv functions to Deconv modules. 
""" def __str__(self) -> str: return "FuncToModuleDeconv" def match_pattern( self, node: _torch.fx.Node, modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> bool: # Checks if we are calling a function # The target attribute is the function that call_function calls return node.op == "call_function" and node.target in _DECONV_FUNCS def get_replacement_type(self) -> ReplacementType: return ReplacementType.kMODULE def get_replacement( self, model: _torch.nn.Module, node: _torch.fx.Node, output_shape: _Tuple[int], modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> _torch.nn.Module: weight_node = node.args[1] bias_node = fetch_argument(node.args, node.kwargs, None, 2, "bias") (weight, bias) = fetch_func_params(model, weight_node, bias_node) kernel_size = get_kernel_size(weight) stride = fetch_argument(node.args, node.kwargs, 1, 3, "stride") padding = fetch_argument(node.args, node.kwargs, 0, 4, "padding") output_padding = fetch_argument(node.args, node.kwargs, 0, 5, "output_padding") groups = fetch_argument(node.args, node.kwargs, 1, 6, "groups") dilation = fetch_argument(node.args, node.kwargs, 1, 7, "dilation") out_channels = weight.shape[1] in_channels = weight.shape[0] * groups args = [ in_channels, out_channels, kernel_size, stride, padding, output_padding, groups, bias is not None, dilation, ] repl = None if node.target == _torch.nn.functional.conv_transpose1d: repl = _torch.nn.ConvTranspose1d(*args) elif node.target == _torch.nn.functional.conv_transpose2d: repl = _torch.nn.ConvTranspose2d(*args) else: assert ( node.target == _torch.nn.functional.conv_transpose3d ), "unsupported node target" repl = _torch.nn.ConvTranspose3d(*args) repl.weight = _torch.nn.parameter.Parameter( weight, requires_grad=requires_grad(weight) ) if bias is not None: repl.bias = _torch.nn.parameter.Parameter( bias, requires_grad=requires_grad(bias) ) return repl class FuncToModuleMaxPool(Transform, TransformRegistry): """ Subclass of class Transform that rewrites MaxPool functions to MaxPool modules. 
""" def __str__(self) -> str: return "FuncToModuleMaxPool" def match_pattern( self, node: _torch.fx.Node, modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> bool: # Checks if we are calling a function # The target attribute is the function that call_function calls return node.op == "call_function" and node.target in _MAXPOOL_FUNCS def get_replacement_type(self) -> ReplacementType: return ReplacementType.kMODULE def get_replacement( self, model: _torch.nn.Module, node: _torch.fx.Node, output_shape: _Tuple[int], modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> _torch.nn.Module: kernel_size = fetch_argument(node.args, node.kwargs, None, 1, "kernel_size") stride = fetch_argument(node.args, node.kwargs, None, 2, "stride") padding = fetch_argument(node.args, node.kwargs, 0, 3, "padding") dilation = fetch_argument(node.args, node.kwargs, 1, 4, "dilation") ceil_mode = fetch_argument(node.args, node.kwargs, False, 5, "ceil_mode") return_indices = fetch_argument( node.args, node.kwargs, False, 6, "return_indices" ) args = [kernel_size, stride, padding, dilation, return_indices, ceil_mode] repl = None if node.target == _torch.nn.functional.max_pool1d: repl = _torch.nn.MaxPool1d(*args) elif node.target == _torch.nn.functional.max_pool2d: repl = _torch.nn.MaxPool2d(*args) else: assert ( node.target == _torch.nn.functional.max_pool3d ), "unsupported node target" repl = _torch.nn.MaxPool3d(*args) return repl class FuncToModuleAvgPool(Transform, TransformRegistry): """ Subclass of class Transform that rewrites AvgPool functions to AvgPool modules. """ def __str__(self) -> str: return "FuncToModuleAvgPool" def match_pattern( self, node: _torch.fx.Node, modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> bool: # Checks if we are calling a function # The target attribute is the function that call_function calls return node.op == "call_function" and node.target in _AVGPOOL_FUNCS def get_replacement_type(self) -> ReplacementType: return ReplacementType.kMODULE def get_replacement( self, model: _torch.nn.Module, node: _torch.fx.Node, output_shape: _Tuple[int], modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> _torch.nn.Module: kernel_size = fetch_argument(node.args, node.kwargs, None, 1, "kernel_size") stride = fetch_argument(node.args, node.kwargs, None, 2, "stride") padding = fetch_argument(node.args, node.kwargs, 0, 3, "padding") ceil_mode = fetch_argument(node.args, node.kwargs, False, 4, "ceil_mode") count_include_pad = fetch_argument( node.args, node.kwargs, True, 5, "count_include_pad" ) args = [kernel_size, stride, padding, ceil_mode, count_include_pad] repl = None if node.target == _torch.nn.functional.avg_pool1d: repl = _torch.nn.AvgPool1d(*args) elif node.target == _torch.nn.functional.avg_pool2d: divisor_override = fetch_argument( node.args, node.kwargs, None, 6, "divisor_override" ) repl = _torch.nn.AvgPool2d(*args, divisor_override) else: assert ( node.target == _torch.nn.functional.avg_pool3d ), "unsupported node target" divisor_override = fetch_argument( node.args, node.kwargs, None, 6, "divisor_override" ) repl = _torch.nn.AvgPool3d(*args, divisor_override) return repl class FuncToModuleAdaptiveAvgPool(Transform, TransformRegistry): """ Subclass of class Transform that rewrites AdaptiveAvgPool functions to AdaptiveAvgPool modules. 
""" def __str__(self) -> str: return "FuncToModuleAdaptiveAvgPool" def match_pattern( self, node: _torch.fx.Node, modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> bool: # Checks if we are calling a function # The target attribute is the function that call_function calls return node.op == "call_function" and node.target in _ADAPTIVEAVGPOOL_FUNCS def get_replacement_type(self) -> ReplacementType: return ReplacementType.kMODULE def get_replacement( self, model: _torch.nn.Module, node: _torch.fx.Node, output_shape: _Tuple[int], modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> _torch.nn.Module: output_size = node.args[1] repl = None if node.target == _torch.nn.functional.adaptive_avg_pool1d: repl = _torch.nn.AdaptiveAvgPool1d(output_size) elif node.target == _torch.nn.functional.adaptive_avg_pool2d: repl = _torch.nn.AdaptiveAvgPool2d(output_size) else: assert ( node.target == _torch.nn.functional.adaptive_avg_pool3d ), "unsupported node target" repl = _torch.nn.AdaptiveAvgPool3d(output_size) return repl class FuncToModuleDropout(Transform, TransformRegistry): """ Subclass of class Transform that rewrites Dropout functions to Dropout modules. """ def __str__(self) -> str: return "FuncToModuleDropout" def match_pattern( self, node: _torch.fx.Node, modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> bool: # Checks if we are calling a function # The target attribute is the function that call_function calls return node.op == "call_function" and ( node.target == _torch.nn.functional.dropout or node.target == _torch.nn.functional.dropout1d or node.target == _torch.nn.functional.dropout2d or node.target == _torch.nn.functional.dropout3d ) def get_replacement_type(self) -> ReplacementType: return ReplacementType.kMODULE def get_replacement( self, model: _torch.nn.Module, node: _torch.fx.Node, output_shape: _Tuple[int], modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> _torch.nn.Module: p = fetch_argument(node.args, node.kwargs, 0.5, 1, "p") inplace = fetch_argument(node.args, node.kwargs, False, 3, "inplace") args = [p, inplace] repl = None if node.target == _torch.nn.functional.dropout: repl = _torch.nn.Dropout(*args) elif node.target == _torch.nn.functional.dropout1d: repl = _torch.nn.Dropout1d(*args) elif node.target == _torch.nn.functional.dropout2d: repl = _torch.nn.Dropout2d(*args) else: assert node.target == _torch.nn.functional.dropout3d repl = _torch.nn.Dropout3d(*args) return repl class FuncToModuleFlatten(Transform, TransformRegistry): """ Subclass of class Transform that rewrites Flatten functions to Flatten modules. 
""" def __str__(self) -> str: return "FuncToModuleFlatten" def match_pattern( self, node: _torch.fx.Node, modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> bool: # Checks if we are calling a function # The target attribute is the function that call_function calls return node.op == "call_function" and node.target == _torch.flatten def get_replacement_type(self) -> ReplacementType: return ReplacementType.kMODULE def get_replacement( self, model: _torch.nn.Module, node: _torch.fx.Node, output_shape: _Tuple[int], modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> _torch.nn.Module: start_dim = fetch_argument(node.args, node.kwargs, 0, 1, "start_dim") end_dim = fetch_argument(node.args, node.kwargs, -1, 2, "end_dim") repl = _torch.nn.Flatten(start_dim, end_dim) return repl class MethodToModuleFlatten(Transform, TransformRegistry): """ Subclass of class Transform that rewrites Flatten methods to Flatten modules. """ def __str__(self) -> str: return "MethodToModuleFlatten" def match_pattern( self, node: _torch.fx.Node, modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> bool: # Checks if we are calling a method if node.op == "call_method": self_obj, *args = load_arg(node.args, env) return isinstance(self_obj, _torch.Tensor) and node.target == "flatten" else: return False def get_replacement_type(self) -> ReplacementType: return ReplacementType.kMODULE def get_replacement( self, model: _torch.nn.Module, node: _torch.fx.Node, output_shape: _Tuple[int], modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> _torch.nn.Module: start_dim = fetch_argument(node.args, node.kwargs, 0, 1, "start_dim") end_dim = fetch_argument(node.args, node.kwargs, -1, 2, "end_dim") repl = _torch.nn.Flatten(start_dim, end_dim) return repl class FuncToModuleBatchNorm(Transform, TransformRegistry): """ Subclass of class Transform that rewrites BatchNorm functions to BatchNorm modules. 
""" def __str__(self) -> str: return "FuncToModuleBatchNorm" def match_pattern( self, node: _torch.fx.Node, modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> bool: # Checks if we are calling a function # The target attribute is the function that call_function calls return ( node.op == "call_function" and node.target == _torch.nn.functional.batch_norm ) def get_replacement_type(self) -> ReplacementType: return ReplacementType.kMODULE def get_replacement( self, model: _torch.nn.Module, node: _torch.fx.Node, output_shape: _Tuple[int], modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> _torch.nn.Module: running_mean_node = node.args[1] running_var_node = node.args[2] weight_node = fetch_argument(node.args, node.kwargs, None, 3, "weight") bias_node = fetch_argument(node.args, node.kwargs, None, 4, "bias") # Does not look like this is required by the module training = fetch_argument(node.args, node.kwargs, False, 5, "training") momentum = fetch_argument(node.args, node.kwargs, 0.1, 6, "momentum") eps = fetch_argument(node.args, node.kwargs, 1e-05, 7, "eps") running_mean = None assert ( running_mean_node.op == "get_attr" ), "unsupported op for running mean node" running_mean = fetch_attr(model, running_mean_node.target) running_var = None assert running_var_node.op == "get_attr", "unsupported op for running var node" running_var = fetch_attr(model, running_var_node.target) (weight, bias) = fetch_func_params(model, weight_node, bias_node) num_spatial_dims = len(output_shape) - 2 num_features = running_mean.shape[0] affine = weight is not None or bias is not None # The function batch_norm always tracks the running stats track_running_stats = True args = [num_features, eps, momentum, affine, track_running_stats] repl = None if num_spatial_dims == 0 or num_spatial_dims == 1: repl = _torch.nn.BatchNorm1d(*args) elif num_spatial_dims == 2: repl = _torch.nn.BatchNorm2d(*args) elif num_spatial_dims == 3: repl = _torch.nn.BatchNorm3d(*args) else: raise ValueError("Unsupported number of spatial dimensions for batch norm") # Note that running_mean and running_var are not trainable parameters if running_mean is not None: if isinstance(running_mean, _torch.nn.parameter.Parameter): repl.running_mean = _torch.nn.parameter.Parameter( running_mean, requires_grad=False ) else: repl.register_buffer("running_mean", running_mean) if running_var is not None: if isinstance(running_var, _torch.nn.parameter.Parameter): repl.running_var = _torch.nn.parameter.Parameter( running_var, requires_grad=False ) else: repl.register_buffer("running_var", running_var) if weight is not None: repl.weight = _torch.nn.parameter.Parameter( weight, requires_grad=requires_grad(weight) ) if bias is not None: repl.bias = _torch.nn.parameter.Parameter( bias, requires_grad=requires_grad(bias) ) return repl class FuncToModuleLayerNorm(Transform, TransformRegistry): """ Subclass of class Transform that rewrites LayerNorm functions to LayerNorm modules. 
""" def __str__(self) -> str: return "FuncToModuleLayerNorm" def match_pattern( self, node: _torch.fx.Node, modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> bool: # Checks if we are calling a function # The target attribute is the function that call_function calls return ( node.op == "call_function" and node.target == _torch.nn.functional.layer_norm ) def get_replacement_type(self) -> ReplacementType: return ReplacementType.kMODULE def get_replacement( self, model: _torch.nn.Module, node: _torch.fx.Node, output_shape: _Tuple[int], modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> _torch.nn.Module: normalized_shape = node.args[1] weight_node = fetch_argument(node.args, node.kwargs, None, 2, "weight") bias_node = fetch_argument(node.args, node.kwargs, None, 3, "bias") eps = fetch_argument(node.args, node.kwargs, 1e-05, 4, "eps") (weight, bias) = fetch_func_params(model, weight_node, bias_node) elementwise_affine = weight is not None args = [tuple(normalized_shape), eps, elementwise_affine] repl = _torch.nn.LayerNorm(*args) if elementwise_affine: repl.weight = _torch.nn.parameter.Parameter( weight, requires_grad=requires_grad(weight) ) if bias is not None: repl.bias = _torch.nn.parameter.Parameter( bias, requires_grad=requires_grad(bias) ) return repl class FuncToModuleRelu(Transform, TransformRegistry): """ Subclass of class Transform that rewrites Relu functions to Relu modules. """ def __str__(self) -> str: return "FuncToModuleRelu" def match_pattern( self, node: _torch.fx.Node, modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> bool: # Checks if we are calling a function # The target attribute is the function that call_function calls return node.op == "call_function" and node.target == _torch.nn.functional.relu def get_replacement_type(self) -> ReplacementType: return ReplacementType.kMODULE def get_replacement( self, model: _torch.nn.Module, node: _torch.fx.Node, output_shape: _Tuple[int], modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> _torch.nn.Module: # inplace == False by default repl = _torch.nn.ReLU() return repl class FuncToModuleSigmoid(Transform, TransformRegistry): """ Subclass of class Transform that rewrites Sigmoid functions to Sigmoid modules. """ def __str__(self) -> str: return "FuncToModuleSigmoid" def match_pattern( self, node: _torch.fx.Node, modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> bool: # Checks if we are calling a function # The target attribute is the function that call_function calls return node.op == "call_function" and node.target == _torch.sigmoid def get_replacement_type(self) -> ReplacementType: return ReplacementType.kMODULE def get_replacement( self, model: _torch.nn.Module, node: _torch.fx.Node, output_shape: _Tuple[int], modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> _torch.nn.Module: repl = _torch.nn.Sigmoid() return repl class FuncToModuleSoftmax(Transform, TransformRegistry): """ Subclass of class Transform that rewrites Softmax functions to Softmax modules. 
""" def __str__(self) -> str: return "FuncToModuleSoftmax" def match_pattern( self, node: _torch.fx.Node, modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> bool: # Checks if we are calling a function # The target attribute is the function that call_function calls return ( node.op == "call_function" and node.target == _torch.nn.functional.softmax ) def get_replacement_type(self) -> ReplacementType: return ReplacementType.kMODULE def get_replacement( self, model: _torch.nn.Module, node: _torch.fx.Node, output_shape: _Tuple[int], modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> _torch.nn.Module: dim = fetch_argument(node.args, node.kwargs, None, 1, "dim") dtype = fetch_argument(node.args, node.kwargs, None, 3, "dtype") assert dtype is None repl = _torch.nn.Softmax(dim) return repl class MethodToFuncPermute(Transform, TransformRegistry): """ Subclass of class Transform that rewrites Permute methods to Permute functions. """ def __str__(self) -> str: return "MethodToFuncPermute" def match_pattern( self, node: _torch.fx.Node, modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> bool: # Checks if we are calling a method if node.op == "call_method": self_obj, *args = load_arg(node.args, env) tensor_check = isinstance(self_obj, _torch.Tensor) if not _HAS_TORCH_VISION: raise ImportError(MSG_TORCH_VISION_NOT_FOUND) torchvision_check = self_obj == _torchvision.ops.misc.Permute return (tensor_check or torchvision_check) and node.target == "permute" else: return False def get_replacement_type(self) -> ReplacementType: return ReplacementType.kFUNCTION def get_replacement( self, model: _torch.nn.Module, node: _torch.fx.Node, output_shape: _Tuple[int], modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> _Any: input = node.args[0] # dims must be a tuple of int dims = None if isinstance(node.args[1], int): dims = tuple(node.args[1:]) else: dims = tuple(node.args[1]) target = _torch.permute fun_args = (input, dims) fun_kwargs = dict() return target, fun_args, fun_kwargs class MethodToFuncReshape(Transform, TransformRegistry): """ Subclass of class Transform that rewrites Reshape/View methods to Reshape functions. """ def __str__(self) -> str: return "MethodToFuncReshape" def match_pattern( self, node: _torch.fx.Node, modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> bool: # Checks if we are calling a method if node.op == "call_method": self_obj, *args = load_arg(node.args, env) return isinstance(self_obj, _torch.Tensor) and ( node.target == "reshape" or node.target == "view" ) else: return False def get_replacement_type(self) -> ReplacementType: return ReplacementType.kFUNCTION def get_replacement( self, model: _torch.nn.Module, node: _torch.fx.Node, output_shape: _Tuple[int], modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> _Any: input = node.args[0] # shape must be a tuple of int shape = None any_nodes = False for item in node.args[1:]: if isinstance(item, _torch.fx.Node): any_nodes = True all_ints = True for item in node.args[1:]: if not isinstance(item, int): all_ints = False break if any_nodes: _, *args = load_arg(node.args, env) shape = tuple(args) elif all_ints: shape = tuple(node.args[1:]) elif isinstance(node.args[1], _torch.Size) or isinstance(node.args[1], tuple): shape = tuple(node.args[1]) else: raise Exception( "Invalid view/reshape. Expected input arguments to be singleton Size, all ints or tuple of ints." 
) target = _torch.reshape fun_args = (input, shape) fun_kwargs = dict() return target, fun_args, fun_kwargs class MethodToFuncTranspose(Transform, TransformRegistry): """ Subclass of class Transform that rewrites Transpose methods to Transpose functions. """ def __str__(self) -> str: return "MethodToFuncTranspose" def match_pattern( self, node: _torch.fx.Node, modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> bool: # Checks if we are calling a method if node.op == "call_method": self_obj, *args = load_arg(node.args, env) return isinstance(self_obj, _torch.Tensor) and node.target == "transpose" else: return False def get_replacement_type(self) -> ReplacementType: return ReplacementType.kFUNCTION def get_replacement( self, model: _torch.nn.Module, node: _torch.fx.Node, output_shape: _Tuple[int], modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> _Any: input = node.args[0] dim0 = node.args[1] dim1 = node.args[2] target = _torch.transpose fun_args = (input, dim0, dim1) fun_kwargs = dict() return target, fun_args, fun_kwargs class MethodToFuncChunk(Transform, TransformRegistry): """ Subclass of class Transform that rewrites Chunk methods to Chunk functions. """ def __str__(self) -> str: return "MethodToFuncChunk" def match_pattern( self, node: _torch.fx.Node, modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> bool: # Checks if we are calling a method if node.op == "call_method": self_obj, *args = load_arg(node.args, env) return isinstance(self_obj, _torch.Tensor) and node.target == "chunk" else: return False def get_replacement_type(self) -> ReplacementType: return ReplacementType.kFUNCTION def get_replacement( self, model: _torch.nn.Module, node: _torch.fx.Node, output_shape: _Tuple[int], modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> _Any: input = node.args[0] chunks = node.args[1] dim = fetch_argument(node.args, node.kwargs, 0, 2, "dim") target = _torch.chunk fun_args = (input, chunks, dim) fun_kwargs = dict() return target, fun_args, fun_kwargs class FuncToFuncOperator(Transform, TransformRegistry): """ Subclass of class Transform that rewrites pointwise operator functions to equivalent pointwise PyTorch operator functions. Purpose of this transform is to replace operators with the equivalent PyTorch operators. This protects against backward hook failures due to inplace operations such as +=, -=, *= and /=. 
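    Each Python-level operator recorded in the traced graph is mapped onto the
    equivalent out-of-place ``torch`` function (see ``repl_map`` just below). A
    minimal check of that correspondence:

    .. code-block:: python

        import operator
        import torch

        x, y = torch.randn(3), torch.rand(3) + 0.1   # keep the divisor away from zero

        assert torch.equal(operator.add(x, y), torch.add(x, y))
        assert torch.equal(operator.sub(x, y), torch.sub(x, y))
        assert torch.equal(operator.mul(x, y), torch.mul(x, y))
        assert torch.equal(operator.truediv(x, y), torch.div(x, y))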
""" repl_map = { _operator.add: _torch.add, _operator.sub: _torch.sub, _operator.mul: _torch.mul, _operator.truediv: _torch.div, } def __str__(self) -> str: return "FuncToFuncOperator" def match_pattern( self, node: _torch.fx.Node, modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> bool: # Checks if we are calling a function # The target attribute is the function that call_function calls return node.op == "call_function" and node.target in FuncToFuncOperator.repl_map def get_replacement_type(self) -> ReplacementType: return ReplacementType.kFUNCTION def get_replacement( self, model: _torch.nn.Module, node: _torch.fx.Node, output_shape: _Tuple[int], modules: _Dict[str, _torch.nn.Module], env: _Dict[str, _Any], ) -> _Any: target = FuncToFuncOperator.repl_map[node.target] fun_args = node.args fun_kwargs = node.kwargs return target, fun_args, fun_kwargs ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/_utils/validation_utils.py0000644000000000000000000001640014672066616025244 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy as _copy import logging as _logging from typing import Dict as _Dict from typing import List as _List from typing import Optional as _Optional import torch as _torch import torch.nn as _nn from coremltools.optimize.torch.optimization_config import ( ModuleOptimizationConfig as _ModuleOptimizationConfig, ) from coremltools.optimize.torch.optimization_config import ( PalettizationGranularity as _PalettizationGranularity, ) from coremltools.optimize.torch.optimization_config import ( QuantizationGranularity as _QuantizationGranularity, ) _logger = _logging.getLogger(__name__) class ConfigValidator: def __init__( self, param_name: str, param: _torch.Tensor, module: _nn.Module, config: _Optional[_ModuleOptimizationConfig], module_level_advanced_options: _Optional[_Dict] = None, ): self.param_name = param_name self.param = param self.module = module self.config = _copy.deepcopy(config) self.module_level_advanced_options = module_level_advanced_options def validate(self, checks_to_run: _List[str]) -> bool: for check_name in checks_to_run: check_method = getattr(self, f"sanitize_{check_name}", None) assert check_method, f"Check {check_method} not found" result = check_method() if not result: return result return True def sanitize_quantization_block_size(self): """ Validates and updates block_size attribute in quantization config for specified parameter. If compression should be skipped for param, returns False. Else, returns True and updates config inplace. """ if self.config.granularity != _QuantizationGranularity.per_block: return True if len(self.config.block_size) > self.param.ndim: _logger.warning( f"{self.param_name}: Length of block_size tuple {len(self.config.block_size)} " f"should not exceed the number of dimensions in the parameter {self.param.ndim}" ) return False # Verify that for non input or output channel axis, block size is either zero or equal to axis length for idx, bs in enumerate(self.config.block_size): if idx > 1: if bs != 0 and bs != self.param.shape[idx]: _logger.warning( f"{self.param_name}: Unsupported block_size={self.config.block_size}. " "Blocking is currently only supported along input OR output channel axis." 
) return False # Determine whether it is an N-D block or a integer block size if len(self.config.block_size) >= 2: bs_output = self.config.block_size[0] bs_input = self.config.block_size[1] else: bs_output = None bs_input = self.config.block_size[0] should_block_output = ( bs_output > 0 and bs_output < self.param.shape[0] if bs_output else False ) should_block_input = bs_input > 0 and bs_input < self.param.shape[1] if should_block_input and not should_block_output: # By default we will always have per-channel on output-channel axis bs_output = 1 should_block_output = True if not should_block_input and not should_block_output: _logger.warning( f"{self.param_name}: Valid block_size={self.config.block_size} not specified for any axis. " "Use per_channel or per_tensor granularity if blocking is not required." ) return False # Check if the output-channel block size is divisible by the axis length if should_block_output and self.param.shape[0] % bs_output != 0: _logger.warning( f"{self.param_name}: block_size={bs_output} is not divisible by axis length={self.param.shape[0]}" ) return False # Check if the input-channel block size is divisible by the axis length if should_block_input and self.param.shape[1] % bs_input != 0: _logger.warning( f"{self.param_name}: block_size={bs_input} is not divisible by axis length={self.param.shape[0]}" ) return False self.config.block_size = (bs_output, bs_input) return True def sanitize_palettization_group_size(self): """ Validates and updates block_size attribute in palettization config for specified parameter. If compression should be skipped for param, returns False. Else, returns True and updates config inplace. """ if self.config.granularity != _PalettizationGranularity.per_grouped_channel: return True # If block size is not divisible by axis length skip palettizing this param if ( self.module_level_advanced_options and self.module_level_advanced_options["cluster_permute"] ): axis_length = self.param.permute( self.module_level_advanced_options["cluster_permute"] ).shape[self.config.channel_axis] else: axis_length = self.param.shape[self.config.channel_axis] if axis_length % self.config.group_size != 0: _logger.warning( f"{self.param_name}: axis_length={axis_length} is not divisible by group_size={self.config.group_size}" ) return False return True def sanitize_palettization_cluster_dim(self): """ Validates and updates cluster_dim attribute in palettization config for specified parameter. If compression should be skipped for param, returns False. Else, returns True and updates config inplace. 
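        A small standalone sketch of the divisibility rule enforced here, using an
        illustrative ``(8, 6)`` weight with ``channel_axis == 0``:

        .. code-block:: python

            import torch

            weight = torch.randn(8, 6)
            channel_axis, cluster_dim = 0, 4

            dim_size = weight.shape[0] if channel_axis == 0 else weight.shape[1]
            # cluster_dim must evenly divide the channel-axis length; otherwise the
            # parameter is skipped and the validator returns False.
            assert dim_size % cluster_dim == 0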
""" if self.config.cluster_dim is None: self.config.cluster_dim = 1 return True if self.config.cluster_dim > 1: if self.config.channel_axis == 0: dim_size = self.param.shape[0] else: dim_size = self.param.shape[1] if dim_size % self.config.cluster_dim != 0: _logger.warning( f"{self.param_name}: The number of channels in channel axis dimension: {self.config.channel_axis}," f" {dim_size} is not divisible by cluster_dim={self.config.cluster_dim}" ) return False return True def validate_param_config( param_name: str, param: _torch.Tensor, module: _nn.Module, config: _Optional[_ModuleOptimizationConfig], checks_to_run: _List[str], module_level_advanced_options: _Optional[_Dict] = None, ): validator = ConfigValidator(param_name, param, module, config, module_level_advanced_options) is_valid_config = validator.validate(checks_to_run) if not is_valid_config: # Skip compression for this param if config is invalid _logger.info(f"Skipping compression for {param_name}") return None return validator.config ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/_utils/version_utils.py0000644000000000000000000000077414672066616024606 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import torch as _torch from packaging import version def version_ge(module, target_version): return version.parse(module.__version__) >= version.parse(target_version) def get_torch_version(): return _torch.__version__ def is_torch_2(): return version_ge(_torch, "2.0.0") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/base_model_optimizer.py0000644000000000000000000001223014672066616024564 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy as _copy import logging as _logging from abc import ABC as _ABC from abc import abstractmethod as _abstractmethod from collections import UserDict as _UserDict from typing import Iterable as _Iterable from typing import Optional as _Optional from typing import Tuple as _Tuple from typing import Type as _Type import torch as _torch from coremltools.optimize.torch._utils.python_utils import get_str as _get_str from coremltools.optimize.torch.optimization_config import OptimizationConfig as _OptimizationConfig _logger = _logging.getLogger(__name__) class _Report(_UserDict): """ A dictionary with pretty printing. 
""" def __repr__(self): if len(self.data) < 1: return "" column_names = list(self.data.values())[0].keys() column_names = ["name"] + list(column_names) print_list = [column_names] print_list += [ [f"{key}"] + [_get_str(val[cn]) for cn in column_names[1:]] for key, val in self.data.items() ] col_size = [max(map(len, col)) for col in zip(*print_list)] ret_str = [ " | ".join( f"{' ' * (col_size[idx] - len(val))}{val}" for idx, val in enumerate(print_list[0]) ) ] ret_str += [" | ".join(f"{'-' * cs}" for cs in col_size)] for pl in print_list[1:]: ret_str.append( " | ".join(f"{' ' * (col_size[idx] - len(val))}{val}" for idx, val in enumerate(pl)) ) return "\n".join(ret_str) class BaseModelOptimizer(_ABC): """ An abstract base class for implementing optimizers. """ _supported_modules: _Tuple[_Type[_torch.nn.Module]] def __init__(self, model: _torch.nn.Module, config: _Optional[_OptimizationConfig] = None): self._model = model self._config = config @_abstractmethod def report(self) -> _Report: raise NotImplementedError() @property def supported_modules(self) -> _Tuple[_Type[_torch.nn.Module]]: return self._supported_modules def _get_model_for_compression(self, inplace: bool): return self._model if inplace else _copy.deepcopy(self._model) class BaseTrainingTimeModelOptimizer(BaseModelOptimizer): """ An abstract base class for implementing optimization algorithms which are integrated in model training pipelines. These optimizers simulate model compression and learn compression parameters during model training. """ def __init__(self, model: _torch.nn.Module, config: _Optional[_OptimizationConfig] = None): super().__init__(model, config) self._step_count = 0 @_abstractmethod def prepare(self, *args, **kwargs) -> _torch.nn.Module: raise NotImplementedError() @_abstractmethod def step(self): raise NotImplementedError() @_abstractmethod def finalize( self, model: _Optional[_torch.nn.Module] = None, inplace: bool = False ) -> _torch.nn.Module: raise NotImplementedError() class BasePostTrainingModelOptimizer(BaseModelOptimizer): """ An abstract base class for implementing optimization algorithms which perform zero-shot compression, after a model has been trained. These optimizers do no need any data to perform compression. """ def __init__(self, model: _torch.nn.Module, config: _Optional[_OptimizationConfig] = None): super().__init__(model, config) self._uncompressed_model = None def compress(self, *args, inplace: bool = False, **kwargs) -> _torch.nn.Module: # if inplace is True: # self._uncompressed_model -> deep copy of model passed by user # self._model -> model passed by user # if inplace is False: # self._uncompressed_model -> model passed by user # self._model -> deep copy of model passed by user self._uncompressed_model = self._get_model_for_compression(inplace=not inplace) self._model = self._get_model_for_compression(inplace=inplace) return self._model class BaseDataCalibratedModelOptimizer(BaseModelOptimizer): """ An abstract base class for optimization algorithms which use calibration data to compress models. 
""" def __init__(self, model: _torch.nn.Module, config: _Optional[_OptimizationConfig] = None): super().__init__(model, config) self._uncompressed_model = None def compress( self, dataloader: _Iterable, *args, inplace: bool = False, **kwargs ) -> _torch.nn.Module: # if inplace is True: # self._uncompressed_model -> deep copy of model passed by user # self._model -> model passed by user # if inplace is False: # self._uncompressed_model -> model passed by user # self._model -> deep copy of model passed by user self._uncompressed_model = self._get_model_for_compression(inplace=not inplace) self._model = self._get_model_for_compression(inplace=inplace) return self._model ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1726511965.265547 coremltools-8.0/coremltools/optimize/torch/layerwise_compression/0000755000000000000000000000000014672075535024445 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/layerwise_compression/__init__.py0000644000000000000000000000557314672066616026570 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause """ .. _coremltools_optimize_torch_layerwise_compression: _`LayerwiseCompressor` ================================== .. autoclass:: coremltools.optimize.torch.layerwise_compression.LayerwiseCompressorConfig :members: from_dict, as_dict, from_yaml, get_layers .. autoclass:: coremltools.optimize.torch.layerwise_compression.LayerwiseCompressor :members: compress Algorithms ========== :obj:`coremltools.optimize.torch.layerwise_compression.algorithms` submodule contains classes that implement the algorithms to be used with :py:class:`LayerwiseCompressor`, which can be used to compress LLM-based models GPTQ ---- .. autoclass:: coremltools.optimize.torch.layerwise_compression.algorithms.ModuleGPTQConfig :show-inheritance: .. autoclass:: coremltools.optimize.torch.layerwise_compression.algorithms.GPTQ :show-inheritance: SparseGPT --------- .. autoclass:: coremltools.optimize.torch.layerwise_compression.algorithms.ModuleSparseGPTConfig :show-inheritance: .. autoclass:: coremltools.optimize.torch.layerwise_compression.algorithms.SparseGPT :show-inheritance: Base class for layerwise compression algorithms config ------------------------------------------------------ .. autoclass:: coremltools.optimize.torch.layerwise_compression.LayerwiseCompressionAlgorithmConfig :show-inheritance: :no-members: Base class for layerwise compression algorithms ----------------------------------------------- .. autoclass:: coremltools.optimize.torch.layerwise_compression.LayerwiseCompressionAlgorithm :show-inheritance: :members: add_batch, cleanup, compress Input Cacher ============ :obj:`coremltools.optimize.torch.layerwise_compression.input_cacher` submodule contains classes which provide a way of capturing the model's inputs up till the first module set up to be compressed. FirstLayerInputCacher --------------------- .. autoclass:: coremltools.optimize.torch.layerwise_compression.FirstLayerInputCacher :show-inheritance: :members: cache DefaultInputCacher ------------------ .. autoclass:: coremltools.optimize.torch.layerwise_compression.DefaultInputCacher :show-inheritance: :members: cache GPTFirstLayerInputCacher ------------------------ .. 
autoclass:: coremltools.optimize.torch.layerwise_compression.GPTFirstLayerInputCacher :show-inheritance: :members: cache """ from .algorithms import ( GPTQ, LayerwiseCompressionAlgorithm, LayerwiseCompressionAlgorithmConfig, ModuleGPTQConfig, ModuleSparseGPTConfig, SparseGPT, ) from .input_cacher import DefaultInputCacher, FirstLayerInputCacher, GPTFirstLayerInputCacher from .layerwise_compressor import LayerwiseCompressor, LayerwiseCompressorConfig ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/layerwise_compression/_quant.py0000644000000000000000000001646214672066616026317 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause # Original implementation from https://github.com/IST-DASLab/sparsegpt # Copyright 2023 IST Austria Distributed Algorithms and Systems Lab. All Rights Reserved. import torch as _torch _normal_float_palette = { # The 4 bit numbers are copied from QLoRA paper: https://arxiv.org/abs/2305.14314 4: _torch.tensor( [ -1.0, -0.6961928009986877, -0.5250730514526367, -0.39491748809814453, -0.28444138169288635, -0.18477343022823334, -0.09105003625154495, 0.0, 0.07958029955625534, 0.16093020141124725, 0.24611230194568634, 0.33791524171829224, 0.44070982933044434, 0.5626170039176941, 0.7229568362236023, 1.0, ] ), # The 3 bit numbers are obtained from bitsandbytes: https://github.com/TimDettmers/bitsandbytes/blob/18e827d666fa2b70a12d539ccedc17aa51b2c97c/bitsandbytes/functional.py#L236 3: _torch.tensor([-1.0, -0.4786292, -0.21714179, 0.0, 0.1609302, 0.33791524, 0.562617, 1.0]), } def quantize( x: _torch.Tensor, scale: _torch.Tensor, zero: _torch.Tensor, max_q: _torch.Tensor, enable_normal_float: bool, ): """ Quantize ``x`` by rounding and clamping the value using specified quantization parameters. """ n_bits = _torch.log2(max_q + 1).item() if enable_normal_float: if n_bits not in _normal_float_palette: raise ValueError(f"Normal float format is not supported for {n_bits}.") nf_palette = _normal_float_palette[n_bits] nf_palette = nf_palette.to(x.device) distances = _torch.cdist((x / scale).view(-1, 1), nf_palette.unsqueeze(0).T) indices = _torch.min(distances, dim=1).indices return scale * nf_palette[indices].view(x.shape) else: q = _torch.clamp(_torch.round(x / scale) + zero, 0, max_q) return scale * (q - zero) class Quantizer(_torch.nn.Module): """ A module for quantizing tensors by scaling, shifting, rounding and clamping them such that the values are represented in ``n_bits`` precision. """ def __init__( self, n_bits: int, per_channel: bool = True, symmetric: bool = False, enable_normal_float: bool = False, mse: bool = False, norm: float = 2.4, grid: int = 100, max_shrink: float = 0.8, group_rows: int = 1, ): super().__init__() self._per_channel = per_channel self._symmetric = symmetric self._enable_normal_float = enable_normal_float self._mse = mse self._norm = norm self._grid = grid self._max_shrink = max_shrink self._group_rows = group_rows self.register_buffer("max_q", _torch.tensor(2**n_bits - 1)) self.register_buffer("scale", _torch.zeros(1)) self.register_buffer("zero", _torch.zeros(1)) def find_params(self, x, weight=False): """ Compute quantization parameters. 
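        A condensed, standalone sketch of the affine (asymmetric) per-channel branch
        implemented below, for an illustrative 8-bit weight of shape ``(2, 3)``:

        .. code-block:: python

            import torch

            w = torch.tensor([[-1.0, 0.0, 3.0], [-2.0, 1.0, 2.0]])
            max_q = 2**8 - 1                                   # 8-bit quantization

            zeros = torch.zeros(w.shape[0])
            x_min = torch.minimum(w.min(dim=1).values, zeros)  # per-row minimum, clamped to <= 0
            x_max = torch.maximum(w.max(dim=1).values, zeros)  # per-row maximum, clamped to >= 0

            scale = (x_max - x_min) / max_q
            zero_point = torch.round(-x_min / scale)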
""" device = x.device self.max_q = self.max_q.to(device) shape = x.shape if self._per_channel: if weight: x = x.flatten(1) if self._group_rows > 1: x = x.reshape((x.shape[0] // self._group_rows, -1)) else: if len(shape) == 4: x = x.permute([1, 0, 2, 3]) x = x.flatten(1) if len(shape) == 3: x = x.reshape((-1, shape[-1])).t() if len(shape) == 2: x = x.t() else: x = x.flatten().unsqueeze(0) tmp = _torch.zeros(x.shape[0], device=device) x_min = _torch.minimum(x.min(1)[0], tmp) x_max = _torch.maximum(x.max(1)[0], tmp) if self._symmetric: x_max = _torch.maximum(_torch.abs(x_min), x_max) tmp = x_min < 0 if _torch.any(tmp): x_min[tmp] = -x_max[tmp] tmp = (x_min == 0) & (x_max == 0) x_min[tmp] = -1 x_max[tmp] = +1 if self._enable_normal_float: self.scale = _torch.maximum(x_max, abs(x_min)) else: self.scale = (x_max - x_min) / self.max_q if self._symmetric: self.zero_point = _torch.full_like(self.scale, (self.max_q + 1) / 2) else: self.zero_point = _torch.round(-x_min / self.scale) if self._mse: best = _torch.full([x.shape[0]], float("inf"), device=device) for i in range(int(self._max_shrink * self._grid)): p = 1 - i / self._grid x_min1 = p * x_min x_max1 = p * x_max scale1 = (x_max1 - x_min1) / self.max_q zero_point1 = ( _torch.round(-x_min1 / scale1) if not self._symmetric else self.zero_point ) q = quantize( x, scale1.unsqueeze(1), zero_point1.unsqueeze(1), self.max_q, self._enable_normal_float, ) q -= x q.abs_() q.pow_(self._norm) err = _torch.sum(q, 1) tmp = err < best if _torch.any(tmp): best[tmp] = err[tmp] self.scale[tmp] = scale1[tmp] self.zero_point[tmp] = zero_point1[tmp] if not self._per_channel: if weight: tmp = shape[0] else: tmp = shape[1] if len(shape) != 3 else shape[2] self.scale = self.scale.repeat(tmp) self.zero_point = self.zero_point.repeat(tmp) if weight: if self._group_rows > 1: self.scale = self.scale.unsqueeze(1).repeat(1, self._group_rows) self.zero_point = self.zero_point.unsqueeze(1).repeat(1, self._group_rows) shape = [-1] + [1] * (len(shape) - 1) self.scale = self.scale.reshape(shape) self.zero_point = self.zero_point.reshape(shape) return if len(shape) == 4: self.scale = self.scale.reshape((1, -1, 1, 1)) self.zero_point = self.zero_point.reshape((1, -1, 1, 1)) if len(shape) == 3: self.scale = self.scale.reshape((1, 1, -1)) self.zero_point = self.zero_point.reshape((1, 1, -1)) if len(shape) == 2: self.scale = self.scale.unsqueeze(0) self.zero_point = self.zero_point.unsqueeze(0) def quantize(self, x): """ Quantize ``x`` using pre-computed quantization parameters. """ if self.ready(): return quantize(x, self.scale, self.zero_point, self.max_q, self._enable_normal_float) return x def enabled(self): """ Returns ``True`` if quantization is enabled. """ return self.max_q > 0 def ready(self): """ Returns ``True`` if quantization parameters have been computed. """ return _torch.all(self.scale != 0) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/layerwise_compression/algorithms.py0000644000000000000000000007417514672066616027206 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause # Original implementation from https://github.com/IST-DASLab/sparsegpt # Copyright 2023 IST Austria Distributed Algorithms and Systems Lab. All Rights Reserved. 
import copy as _copy import logging as _logging import math as _math import time as _time from abc import ABC as _ABC from abc import abstractmethod as _abstractmethod from typing import Optional as _Optional from typing import Tuple as _Tuple from typing import Union as _Union import cattrs as _cattrs import torch as _torch import torch.nn as _nn from attr import define as _define from attr import field as _field from attrs import validators as _validators from coremltools.optimize.torch._utils.metadata_utils import ( CompressionMetadata as _CompressionMetadata, ) from coremltools.optimize.torch._utils.python_utils import ClassRegistryMixin as _ClassRegistryMixin from coremltools.optimize.torch._utils.torch_utils import ( get_n_bits_from_dtype as _get_n_bits_from_dtype, ) from coremltools.optimize.torch._utils.torch_utils import ( maybe_convert_str_to_dtype as _maybe_convert_str_to_dtype, ) from coremltools.optimize.torch.layerwise_compression._quant import Quantizer as _Quantizer from coremltools.optimize.torch.layerwise_compression._quant import _normal_float_palette from coremltools.optimize.torch.layerwise_compression._quant import quantize as _quantize from coremltools.optimize.torch.optimization_config import ( ModuleOptimizationConfig as _ModuleOptimizationConfig, ) from coremltools.optimize.torch.optimization_config import QuantizationGranularity from coremltools.optimize.torch.quantization.quantization_config import ( QuantizationScheme as _QuantizationScheme, ) _logger = _logging.getLogger(__name__) class LayerwiseCompressionAlgorithmConfig(_ABC, _ClassRegistryMixin, _ModuleOptimizationConfig): """ A template class and registry for configuration classes to be used with :py:class:`LayerwiseCompressionAlgorithm`. """ pass @LayerwiseCompressionAlgorithmConfig.register("gptq") @_define class ModuleGPTQConfig(LayerwiseCompressionAlgorithmConfig): """ Configuration class for specifying global and module-level compression options for the `Generative Pre-Trained Transformer Quantization (GPTQ) `_ algorithm. Args: weight_dtype (:py:class:`torch.dtype`): The dtype to use for quantizing the weights. The number of bits used for quantization is inferred from the dtype. When dtype is set to :py:class:`torch.float32`, the weights corresponding to that layer are not quantized. Defaults to :py:class:`torch.uint8`, which corresponds to 8-bit quantization. granularity (:py:class:`QuantizationGranularity`): Specifies the granularity at which quantization parameters will be computed. Can be one of ``per_channel``, ``per_tensor`` or ``per_block``. When using ``per_block``, ``block_size`` argument must be specified. Defaults to ``per_channel``. quantization_scheme (:py:class:`~.coremltools.optimize.torch.quantization.quantization_config.QuantizationScheme`): Type of quantization configuration to use. When this parameter is set to ``QuantizationScheme.symmetric``, all weights are quantized with zero point as zero. When it is set to ``QuantizationScheme.affine``, zero point can be set anywhere in the range of values allowed for the quantized weight. Defaults to ``QuantizationScheme.symmetric``. block_size (:obj:`int`): When ``block_size`` is specified, ``block_size`` number of values will share the same quantization parameters of scale, as well as the same zero point when applicable, across the input channel axis. Defaults to ``None``. enable_normal_float (:obj:`bool`): When ``True``, normal float format is used for quantization. 
It's only supported when ``weight_dtype`` is equal to ``int3`` and ``int4``. Defaults to ``False``. hessian_dampening (:obj:`float`): Dampening factor added to the diagonal of the Hessian used by GPTQ algorithm. Defaults to ``0.01``. use_activation_order_heuristic (:obj:`bool`): When ``True``, columns of weight are sorted in descending order of values of Hessian diagonal elements. Defaults to ``True``. processing_group_size (:obj:`int`): The weights are updated in blocks of size ``processing_group_size``. Defaults to ``128``. .. note: Blocking is currently limited to the input channel axis for GPTQ. For a linear layer of shape `(C_o x C_i)`, and ``block_size`` `B`, the quantization scales will have shape `(C_o x C_i/B)`. For a 2D conv layer of shape `(C_o x C_i x H x W)`, the quantization scales will have shape `(C_o x C_i/B x 1 x 1)`. """ weight_dtype: _Union[str, _torch.dtype] = _field( default="uint8", ) granularity: QuantizationGranularity = _field( default="per_channel", converter=QuantizationGranularity, validator=_validators.in_(QuantizationGranularity), ) quantization_scheme: _QuantizationScheme = _field( default="symmetric", converter=_QuantizationScheme, validator=_validators.in_(_QuantizationScheme), ) block_size: _Optional[int] = _field( default=None, validator=_validators.optional(_validators.instance_of(int)) ) enable_normal_float: bool = _field(default=False, validator=_validators.instance_of(bool)) hessian_dampening: float = _field(default=0.01, validator=_validators.instance_of(float)) use_activation_order_heuristic: bool = _field( default=False, validator=_validators.instance_of(bool) ) processing_group_size: int = _field(default=128, validator=_validators.instance_of(int)) algorithm: str = _field(default="gptq", validator=_validators.in_("gptq")) def __attrs_post_init__(self): self.weight_n_bits = _get_n_bits_from_dtype(self.weight_dtype) self.weight_dtype = _maybe_convert_str_to_dtype(self.weight_dtype) if self.weight_dtype not in [_torch.uint8, _torch.float32]: raise ValueError( f"weight_dtype must be one of (torch.uint8, torch.float32) not {self.weight_dtype}" ) @classmethod def from_dict(cls, config_dict): converter = _cattrs.Converter(forbid_extra_keys=True) converter.register_structure_hook( _Union[str, _torch.dtype], lambda obj, type: obj, ) return converter.structure_attrs_fromdict(config_dict, cls) @LayerwiseCompressionAlgorithmConfig.register("sparse_gpt") @_define class ModuleSparseGPTConfig(LayerwiseCompressionAlgorithmConfig): """ Configuration class for specifying global and module-level compression options for the `Sparse Generative Pre-Trained Transformer (SparseGPT) `_ algorithm. Args: target_sparsity (:obj:`float`): Fraction of weight elements to set to ``0``. Defaults to ``0.5``. n_m_ratio (:obj:`tuple` of :obj:`int`): A tuple of two integers which specify how ``n:m`` pruning should be applied. In ``n:m`` pruning, out of every ``m`` elements, ``n`` with lowest magnitude are set to zero. When ``n_m_ratio`` is not ``None``, the value of ``target_sparsity`` is ignored and the actual target sparsity is determined by the ``n:m`` ratio. weight_dtype (:py:class:`torch.dtype`): The dtype to use for quantizing the weights. The number of bits used for quantization is inferred from the dtype. When dtype is set to :py:class:`torch.float32`, the weights corresponding to that layer are not quantized. Defaults to :py:class:`torch.float32`, which corresponds to no quantization. 
quantization_granularity (:py:class:`QuantizationGranularity`): Specifies the granularity at which quantization parameters will be computed. Can be one of ``per_channel``, ``per_tensor`` or ``per_block``. When using ``per_block``, ``block_size`` argument must be specified. Defaults to ``per_channel``. quantization_scheme (:py:class:`~.coremltools.optimize.torch.quantization.quantization_config.QuantizationScheme`): Type of quantization configuration to use. When this parameter is set to ``QuantizationScheme.symmetric``, all weights are quantized with zero point as zero. When it is set to ``QuantizationScheme.affine``, zero point can be set anywhere in the range of values allowed for the quantized weight. Defaults to ``QuantizationScheme.symmetric``. enable_normal_float (:obj:`bool`): When ``True``, normal float format is used for quantization. It's only supported for ``weight_dtype`` is equal to ``int3`` and ``int4``. hessian_dampening (:obj:`float`): Dampening factor added to the diagonal of the Hessian used by GPTQ algorithm. Defaults to ``0.01``. processing_group_size (:obj:`int`): The weights are updated in blocks of size processing_group_size. Defaults to ``128``. """ target_sparsity: float = _field(default=0.5, validator=_validators.instance_of(float)) n_m_ratio: _Optional[_Tuple[int, int]] = _field( default=None, validator=_validators.optional( _validators.deep_iterable( member_validator=_validators.instance_of(int), iterable_validator=_validators.instance_of((tuple, list)), ) ), ) weight_dtype: _Union[str, _torch.dtype] = _field( default="uint8", ) quantization_granularity: QuantizationGranularity = _field( default="per_channel", converter=QuantizationGranularity, validator=_validators.in_(QuantizationGranularity), ) quantization_scheme: _QuantizationScheme = _field( default="symmetric", converter=_QuantizationScheme, validator=_validators.in_(_QuantizationScheme), ) enable_normal_float: bool = _field(default=False, validator=_validators.instance_of(bool)) hessian_dampening: float = _field(default=0.01, validator=_validators.instance_of(float)) processing_group_size: int = _field(default=128, validator=_validators.instance_of(int)) algorithm: str = _field(default="sparse_gpt", validator=_validators.in_("sparse_gpt")) def __attrs_post_init__(self): self.weight_n_bits = _get_n_bits_from_dtype(self.weight_dtype) self.weight_dtype = _maybe_convert_str_to_dtype(self.weight_dtype) if self.weight_dtype not in [_torch.uint8, _torch.float32]: raise ValueError( f"weight_dtype must be one of (torch.uint8, torch.float32) not {self.weight_dtype}" ) @classmethod def from_dict(cls, config_dict): converter = _cattrs.Converter(forbid_extra_keys=True) converter.register_structure_hook( _Union[str, _torch.dtype], lambda obj, type: obj, ) return converter.structure_attrs_fromdict(config_dict, cls) class LayerwiseCompressionAlgorithm(_ClassRegistryMixin): """ A template class for implementing layerwise compression algorithms to be used with :py:class:`LayerwiseCompressor`. """ @_abstractmethod def add_batch(self, inp: _torch.Tensor, out: _torch.Tensor) -> None: """ Perform computation on a batch of data to acquire statistics before compression. """ raise NotImplementedError("Method not implemented in base class.") @_abstractmethod def cleanup(self) -> None: """ Reset the state of the compression algorithm object and free GPU memory. """ raise NotImplementedError("Method not implemented in base class.") @_abstractmethod def compress(self) -> None: """ Compress the weights of the layer. 
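        Concrete algorithms are driven in a fixed order, typically by
        :py:class:`LayerwiseCompressor`. A schematic outline (``algo`` and
        ``calibration_batches`` are placeholder names, not part of this API)::

            # for inp, out in calibration_batches:
            #     algo.add_batch(inp, out)   # accumulate statistics (e.g. a Hessian)
            # algo.compress()                # update the layer's weights in place
            # algo.cleanup()                 # release cached activations and GPU memory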
""" raise NotImplementedError("Method not implemented in base class.") class OBSCompressionAlgorithm(LayerwiseCompressionAlgorithm): """ A compression algorithm which uses the Hessian of the reconstruction loss to compress a weight matrix of a given layer. Based on the optimal brain surgeon paradigm described in `Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning `_. """ def __init__(self, layer: _nn.Module, config: LayerwiseCompressionAlgorithmConfig): self._layer = layer self._device = self._layer.weight.device self._nsamples = 0 self._config = config weight = self._layer.weight.data if isinstance(self._layer, _nn.Conv2d): weight = weight.flatten(1) self._dim = weight.dim() self._rows = weight.shape[0] self._columns = weight.shape[1] self._hessian = _torch.zeros((self._columns, self._columns), device=self._device) @_abstractmethod def _init_parameters(self, config: LayerwiseCompressionAlgorithmConfig): """ Initialize parameters of the algorithm from config. """ raise NotImplementedError("Method not implemented in base class.") def add_batch(self, inp: _torch.Tensor, out: _torch.Tensor): self._compute_hessian(inp, out) def _compute_hessian(self, inp: _torch.Tensor, out: _torch.Tensor): """ Compute Hessian of the L2 loss between the original output of the layer and the output computed using compressed weights. """ self._inp1 = inp self._out1 = out if len(inp.shape) == 2: inp = inp.unsqueeze(0) tmp = inp.shape[0] if isinstance(self._layer, _nn.Linear): if len(inp.shape) == 3: inp = inp.reshape((-1, inp.shape[-1])) inp = inp.t() if isinstance(self._layer, _nn.Conv2d): unfold = _nn.Unfold( self._layer.kernel_size, dilation=self._layer.dilation, padding=self._layer.padding, stride=self._layer.stride, ) inp = unfold(inp) inp = inp.permute([1, 0, 2]) inp = inp.flatten(1) self._hessian *= self._nsamples / (self._nsamples + tmp) self._nsamples += tmp inp = _math.sqrt(2 / self._nsamples) * inp.float() self._hessian += inp.matmul(inp.t()) @_abstractmethod def _compress_impl(self): """ Implementation of the compression algorithm """ raise NotImplementedError("Method not implemented in base class.") def compress(self): self._compress_impl() # NOTE: Currently algorithm assumes weight parameter is available for all layers # and the only parameter that gets updated metadata = self._get_compression_metadata("weight", self._layer.weight) metadata.register(self._layer) def cleanup(self): self._inp1 = None self._out1 = None self._nsamples = 0 _torch.cuda.empty_cache() self._hessian = None @_abstractmethod def _get_compression_metadata(self, param_name, param): raise NotImplementedError("Method not implemented in base class.") def _store_quantization_params(self, quantizer: _Quantizer): if quantizer is not None: scale = quantizer.scale scale_store = _torch.empty_like(scale, device=_torch.device("cpu")).copy_(scale) self._scale.append(scale_store) if not self._enable_normal_float: zero_point = quantizer.zero_point zero_point_store = _torch.empty_like(zero_point, device=_torch.device("cpu")).copy_( zero_point ) self._zero_point.append(zero_point_store) @LayerwiseCompressionAlgorithm.register("gptq") class GPTQ(OBSCompressionAlgorithm): """ A post-training compression algorithm based on the paper `GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers `_. Args: layer (:obj:`torch.nn.Module`): Module to be compressed. config (:py:class:`ModuleGPTQConfig`): Config specifying hyperparameters for the GPTQ algorithm. 
""" def __init__(self, layer: _nn.Module, config: ModuleGPTQConfig): super().__init__(layer, config) self._init_parameters(config) def _init_parameters(self, config: ModuleGPTQConfig): # Defaults to blocking along input channel axis self._block_size = config.block_size if self._block_size is not None and self._columns % self._block_size != 0: raise ValueError( f"Block size must completely divide the axis along which blocking is done: {self._columns} % {self._block_size} != 0" ) self._weight_n_bits = config.weight_n_bits self._processing_group_size = config.processing_group_size self._enable_normal_float = config.enable_normal_float self._hessian_dampening = config.hessian_dampening self._use_activation_order_heuristic = config.use_activation_order_heuristic # static grouping leads to all quantization parameters being pre-computed, # rather than dynamically during algorithm execution. This is necessary when # activation_order_heuristic is turned on to make sure the model is still exportable self._enable_static_blocking = config.use_activation_order_heuristic self._quantizer = None if self._weight_n_bits < 16: per_channel = config.granularity in [ QuantizationGranularity.per_channel, QuantizationGranularity.per_block, ] self._quantizer = _Quantizer( n_bits=self._weight_n_bits, per_channel=per_channel, symmetric=config.quantization_scheme == _QuantizationScheme.symmetric, enable_normal_float=config.enable_normal_float, ) self._scale = [] self._zero_point = [] def _compress_impl(self): weight = self._layer.weight.data.clone() if isinstance(self._layer, _nn.Conv2d): if self._block_size is not None: self._block_size = self._block_size * weight.shape[2] * weight.shape[3] weight = weight.flatten(1) weight = weight.float() tick = _time.time() if not self._quantizer.ready(): self._quantizer.find_params(weight, weight=True) if self._block_size is None: self._store_quantization_params(self._quantizer) hessian = self._hessian del self._hessian dead = _torch.diag(hessian) == 0 hessian[dead, dead] = 1 weight[:, dead] = 0 blocks = [] if self._enable_static_blocking and self._block_size is not None: for i in range(0, self._columns, self._block_size): quantizer = _copy.deepcopy(self._quantizer) quantizer.find_params(weight[:, i : (i + self._block_size)], weight=True) blocks.append(quantizer) self._store_quantization_params(quantizer) perm = None if self._use_activation_order_heuristic: perm = _torch.argsort(_torch.diag(hessian), descending=True) weight = weight[:, perm] hessian = hessian[perm][:, perm] losses = _torch.zeros_like(weight) quant_weight = _torch.zeros_like(weight) damp = self._hessian_dampening * _torch.mean(_torch.diag(hessian)) diag = _torch.arange(self._columns, device=self._device) hessian[diag, diag] += damp hessian = _torch.linalg.cholesky(hessian) hessian = _torch.cholesky_inverse(hessian) hessian = _torch.linalg.cholesky(hessian, upper=True) hessian_inverse = hessian for i1 in range(0, self._columns, self._processing_group_size): i2 = min(i1 + self._processing_group_size, self._columns) count = i2 - i1 weight_block = weight[:, i1:i2].clone() quant_weight_block = _torch.zeros_like(weight_block) error_block = _torch.zeros_like(weight_block) losses_block = _torch.zeros_like(weight_block) hessian_inverse_block = hessian_inverse[i1:i2, i1:i2] for i in range(count): w = weight_block[:, i] d = hessian_inverse_block[i, i] if self._block_size is not None: if self._enable_static_blocking: idx = perm[i1 + i] self._quantizer = blocks[idx // self._block_size] else: if (i1 + i) % self._block_size == 
0: self._quantizer.find_params( weight[:, (i1 + i) : (i1 + i + self._block_size)], weight=True, ) self._store_quantization_params(self._quantizer) q = _quantize( w.unsqueeze(1), self._quantizer.scale, self._quantizer.zero_point, self._quantizer.max_q, self._enable_normal_float, ).flatten() quant_weight_block[:, i] = q losses_block[:, i] = (w - q) ** 2 / d**2 err1 = (w - q) / d weight_block[:, i:] -= err1.unsqueeze(1).matmul( hessian_inverse_block[i, i:].unsqueeze(0) ) error_block[:, i] = err1 quant_weight[:, i1:i2] = quant_weight_block losses[:, i1:i2] = losses_block / 2 weight[:, i2:] -= error_block.matmul(hessian_inverse[i1:i2, i2:]) if _torch.cuda.is_available(): _torch.cuda.synchronize() _logger.info( "time %.2f, weight quantization error %.2f" % (_time.time() - tick, _torch.sum(losses).item()) ) if self._use_activation_order_heuristic: inverse_perm = _torch.argsort(perm) quant_weight = quant_weight[:, inverse_perm] self._layer.weight.data = quant_weight.reshape(self._layer.weight.shape).to( self._layer.weight.data.dtype ) _logger.debug( "quantization error in output activations = %.2f" % (_torch.sum((self._layer(self._inp1) - self._out1) ** 2)) ) def _get_compression_metadata(self, param_name, param): metadata = _CompressionMetadata(param_name) scale = _torch.cat(self._scale, dim=1) if self._enable_normal_float: metadata.compression_type = ["palettization"] metadata.lut = _normal_float_palette[self._weight_n_bits].unsqueeze(-1) for _ in range(param.dim()): metadata.lut = metadata.lut.unsqueeze(0) metadata.palettization_scale = scale else: metadata.compression_type = ["quantization"] metadata.quantization_n_bits = self._weight_n_bits metadata.quantization_scale = scale metadata.zero_point = _torch.cat(self._zero_point, dim=1) return metadata @LayerwiseCompressionAlgorithm.register("sparse_gpt") class SparseGPT(OBSCompressionAlgorithm): """ A post-training compression algorithm based on the paper `SparseGPT: Massive Language Models Can be Accurately Pruned in One-Shot `_ Args: layer (:obj:`torch.nn.Module`): Module to be compressed. config (:py:class:`ModuleSparseGPTConfig`): Config specifying hyper-parameters for the SparseGPT algorithm. 
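    Example:
        An illustrative sketch of a 2:4 structured-sparsity setup (names and values
        are assumptions; in practice this class is instantiated by
        :py:class:`LayerwiseCompressor` rather than used directly):

        .. code-block:: python

            import torch.nn as nn

            from coremltools.optimize.torch.layerwise_compression.algorithms import (
                ModuleSparseGPTConfig,
                SparseGPT,
            )

            layer = nn.Linear(1024, 1024)
            config = ModuleSparseGPTConfig.from_dict(
                {
                    "algorithm": "sparse_gpt",
                    "n_m_ratio": (2, 4),  # out of every 4 elements, prune 2
                    "weight_dtype": "uint8",
                }
            )
            sparse_gpt = SparseGPT(layer, config)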
""" def __init__(self, layer: _nn.Module, config: ModuleSparseGPTConfig): super().__init__(layer, config) self._init_parameters(config) def _init_parameters(self, config: ModuleSparseGPTConfig): self._target_sparsity = config.target_sparsity self._weight_n_bits = config.weight_n_bits self._n_m_ratio = config.n_m_ratio self._processing_group_size = config.processing_group_size self._enable_normal_float = config.enable_normal_float self._hessian_dampening = config.hessian_dampening self._quantizer = None if self._weight_n_bits < 16: per_channel = config.quantization_granularity in [ QuantizationGranularity.per_channel, QuantizationGranularity.per_block, ] self._quantizer = _Quantizer( n_bits=self._weight_n_bits, per_channel=per_channel, symmetric=config.quantization_scheme == _QuantizationScheme.symmetric, enable_normal_float=config.enable_normal_float, ) self._scale = [] self._zero_point = [] if self._n_m_ratio is not None: self._prune_n, self._prune_m = self._n_m_ratio else: self._prune_n, self._prune_m = 0, 0 def _compress_impl(self): weight = self._layer.weight.data.clone() if isinstance(self._layer, _nn.Conv2d): weight = weight.flatten(1) weight = weight.float() if self._quantizer is not None and not self._quantizer.ready(): self._quantizer.find_params(weight, weight=True) self._store_quantization_params(self._quantizer) tick = _time.time() hessian = self._hessian del self._hessian dead = _torch.diag(hessian) == 0 hessian[dead, dead] = 1 weight[:, dead] = 0 losses = _torch.zeros(self._rows, device=self._device) damp = self._hessian_dampening * _torch.mean(_torch.diag(hessian)) diag = _torch.arange(self._columns, device=self._device) hessian[diag, diag] += damp hessian = _torch.linalg.cholesky(hessian) hessian = _torch.cholesky_inverse(hessian) hessian = _torch.linalg.cholesky(hessian, upper=True) hessian_inverse = hessian mask = None for i1 in range(0, self._columns, self._processing_group_size): i2 = min(i1 + self._processing_group_size, self._columns) count = i2 - i1 weight_block = weight[:, i1:i2].clone() quant_weight_block = _torch.zeros_like(weight_block) error_block = _torch.zeros_like(weight_block) losses_block = _torch.zeros_like(weight_block) hessian_inverse_block = hessian_inverse[i1:i2, i1:i2] if self._prune_n == 0: if mask is not None: mask1 = mask[:, i1:i2] else: tmp = ( weight_block**2 / (_torch.diag(hessian_inverse_block).reshape((1, -1))) ** 2 ) thresh = _torch.sort(tmp.flatten())[0][int(tmp.numel() * self._target_sparsity)] mask1 = tmp <= thresh else: mask1 = _torch.zeros_like(weight_block) == 1 for i in range(count): w = weight_block[:, i] d = hessian_inverse_block[i, i] if self._prune_n != 0 and i % self._prune_m == 0: tmp = ( weight_block[:, i : (i + self._prune_m)] ** 2 / ( _torch.diag(hessian_inverse_block)[i : (i + self._prune_m)].reshape( (1, -1) ) ) ** 2 ) mask1.scatter_( 1, i + _torch.topk(tmp, self._prune_n, dim=1, largest=False)[1], True, ) q = w.clone() q[mask1[:, i]] = 0 if self._quantizer is not None: q = _quantize( q.unsqueeze(1), self._quantizer.scale, self._quantizer.zero_point, self._quantizer.max_q, self._enable_normal_float, ).flatten() quant_weight_block[:, i] = q losses_block[:, i] = (w - q) ** 2 / d**2 err1 = (w - q) / d weight_block[:, i:] -= err1.unsqueeze(1).matmul( hessian_inverse_block[i, i:].unsqueeze(0) ) error_block[:, i] = err1 weight[:, i1:i2] = quant_weight_block losses += _torch.sum(losses_block, 1) / 2 weight[:, i2:] -= error_block.matmul(hessian_inverse[i1:i2, i2:]) if _torch.cuda.is_available(): _torch.cuda.synchronize() 
_logger.info( "time %.2f, weight quantization error %.2f" % (_time.time() - tick, _torch.sum(losses).item()) ) self._layer.weight.data = weight.reshape(self._layer.weight.shape).to( self._layer.weight.data.dtype ) _logger.debug( "quantization error in output activations = %.2f" % (_torch.sum((self._layer(self._inp1) - self._out1) ** 2)) ) def _get_compression_metadata(self, param_name, param): metadata = _CompressionMetadata(param_name) compression_type = ["pruning"] if not self._quantizer: metadata.compression_type = compression_type return metadata scale = _torch.cat(self._scale, dim=1) if self._enable_normal_float: compression_type.append("palettization") metadata.lut = _normal_float_palette[self._weight_n_bits].unsqueeze(-1) for _ in range(param.dim()): metadata.lut = metadata.lut.unsqueeze(0) metadata.palettization_scale = scale else: compression_type.append("quantization") metadata.quantization_n_bits = self._weight_n_bits metadata.quantization_scale = scale metadata.zero_point = _torch.cat(self._zero_point, dim=1) metadata.compression_type = compression_type return metadata ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/layerwise_compression/input_cacher.py0000644000000000000000000001467014672066616027473 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause # Original implementation from https://github.com/IST-DASLab/sparsegpt # Copyright 2023 IST Austria Distributed Algorithms and Systems Lab. All Rights Reserved. import logging as _logging import re as _re from abc import ABC as _ABC from abc import abstractmethod as _abstractmethod from typing import Dict as _Dict from typing import Iterable as _Iterable from typing import List as _List from typing import Tuple as _Tuple from typing import Union as _Union import torch as _torch import torch.nn as _nn from coremltools.optimize.torch._utils.python_utils import ClassRegistryMixin as _ClassRegistryMixin _logger = _logging.getLogger(__name__) class StopExecution(ValueError): pass class FirstLayerInputCacher(_ABC, _ClassRegistryMixin): """ A template class for getting the inputs to feed to the first layer of the model which is set up for compression. """ def __init__(self, model: _nn.Module, layers: str): self._model = model self._layers = layers @_abstractmethod def cache( self, dataloader: _Iterable, nsamples: int, device: str ) -> _Tuple[_List[_torch.Tensor], _Dict[str, _torch.Tensor]]: """ Cache inputs and keyword arguments to be fed to first layer of the model which is set up for compression. Args: dataloader (:py:class:`Iterable`): An iterable where each element is an input to the model to be compressed. nsamples (:obj:`int`): Number of samples to cache. device (:obj:`str`): Device string for device to run compression on. """ raise NotImplementedError("Method not implemented in base class.") @FirstLayerInputCacher.register("gpt") class GPTFirstLayerInputCacher(FirstLayerInputCacher): """ An implementation of :py:class:`FirstLayerInputCacher` for GPT style models. Computes inputs to feed to the first layer of the model which is set up for compression. Args: model (:obj:`torch.nn.Module`): Module to be compressed. layers (:obj:`str`): Regex string for the decoder layers of the model. 
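    Example:
        An illustrative configuration (the regex is an assumption for a
        HuggingFace-style decoder stack; adapt it to the module names of your model):

        .. code-block:: python

            config = LayerwiseCompressorConfig.from_dict(
                {
                    "global_config": {"algorithm": "gptq", "weight_dtype": "uint8"},
                    "input_cacher": "gpt",
                    "layers": [r"model\.layers\.\d+"],
                    "calibration_nsamples": 32,
                }
            )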
""" def __init__( self, model: _nn.Module, layers: _Union[str, _List], ): super().__init__(model, layers) self._pre_layers = [] self._first_layer = None for layer_name, layer in model.named_modules(remove_duplicate=True): if self._first_layer_match(layer_name, layer): self._pre_layers.append(layer) self._first_layer = layer # break the first time there's a match break elif len(list(layer.children())) == 0: self._pre_layers.append(layer) if self._first_layer is None: _logger.warning( "Could not find first decoder layer based on", f"decoder layer path {layers} regex", ) def _first_layer_match(self, layer_name: str, layer: _torch.nn.Module) -> bool: if isinstance(self._layers, str): return _re.fullmatch(self._layers, layer_name) elif isinstance(self._layers, list): if isinstance(self._layers[0], str): return _re.fullmatch(self._layers[0], layer_name) else: return layer == self._layers[0] def _feed_data(self, dataloader: _Iterable, nsamples: int, device: str): """ Feed data to the model so that the inputs to the first layer can be cached. """ num_sampled = 0 for batch in dataloader: try: self._model(batch.to(device)) except StopExecution: pass num_sampled += 1 if num_sampled >= nsamples: break @staticmethod def _get_input_cacher_pre_hook(inputs, kwarg_inputs): """ Returns forward_pre_hook for caching inputs and keyword arguments to the first decoder layer of a GPT model. """ def input_cacher_pre_hook(module, args, kwargs): inputs.append(args) for key, val in kwargs.items(): kwarg_inputs[key] = val raise StopExecution() return input_cacher_pre_hook def cache( self, dataloader: _Iterable, nsamples: int, device: str ) -> _Tuple[_List[_torch.Tensor], _Dict[str, _torch.Tensor]]: """ Cache inputs and keyword arguments to be fed to the first decoder layer of a GPT style model. Args: dataloader (:py:class:`Iterable`): An iterable where each element is an input to the model to be compressed. nsamples (:obj:`int`): Number of samples to cache. device (:obj:`str`): Device string for device to run compression on. """ for layer in self._pre_layers: layer.to(device) inputs, kwarg_inputs = [], {} input_cacher_handle = self._first_layer.register_forward_pre_hook( self._get_input_cacher_pre_hook(inputs, kwarg_inputs), with_kwargs=True ) self._feed_data(dataloader, nsamples, device) input_cacher_handle.remove() for layer in self._pre_layers: layer.cpu() for key, val in kwarg_inputs.items(): if isinstance(val, _torch.Tensor): kwarg_inputs[key] = val.to(device) return inputs, kwarg_inputs @FirstLayerInputCacher.register("default") class DefaultInputCacher(FirstLayerInputCacher): def cache( self, dataloader: _Iterable, nsamples: int, device: str ) -> _Tuple[_List[_torch.Tensor], _Dict[str, _torch.Tensor]]: """ Cache inputs and keyword arguments to be fed to first layer of the model which is set up for compression. Args: dataloader (:py:class:`Iterable`): An iterable where each element is an input to the model to be compressed. nsamples (:obj:`int`): Number of samples to cache. device (:obj:`str`): Device string for device to run compression on. """ inputs = [] sampled = 0 for batch in dataloader: inputs.append(batch.to(device)) sampled += 1 if sampled == nsamples: break return inputs, {} ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/layerwise_compression/layerwise_compressor.py0000644000000000000000000004271514672066616031310 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause # Original implementation from https://github.com/IST-DASLab/sparsegpt # Copyright 2023 IST Austria Distributed Algorithms and Systems Lab. All Rights Reserved. import logging as _logging import re as _re from collections import OrderedDict as _OrderedDict from contextlib import contextmanager as _contextmanager from typing import Any as _Any from typing import Callable as _Callable from typing import Dict as _Dict from typing import Iterable as _Iterable from typing import List as _List from typing import NewType as _NewType from typing import Optional as _Optional from typing import Tuple as _Tuple from typing import Union as _Union import cattrs as _cattrs import torch as _torch import torch.nn as _nn from attr import define as _define from attr import field as _field from attrs import validators as _validators from coremltools.optimize.torch._utils.metadata_utils import ( register_metadata_version as _register_metadata_version, ) from coremltools.optimize.torch._utils.report_utils import ( compute_post_training_report as _compute_post_training_report, ) from coremltools.optimize.torch._utils.torch_utils import get_atomic_layers as _get_atomic_layers from coremltools.optimize.torch._utils.torch_utils import get_eval_model as _get_eval_model from coremltools.optimize.torch.base_model_optimizer import ( BaseDataCalibratedModelOptimizer as _BaseDataCalibratedModelOptimizer, ) from coremltools.optimize.torch.base_model_optimizer import _Report from coremltools.optimize.torch.layerwise_compression.algorithms import ( LayerwiseCompressionAlgorithm as _LayerwiseCompressionAlgorithm, ) from coremltools.optimize.torch.layerwise_compression.algorithms import ( LayerwiseCompressionAlgorithmConfig as _LayerwiseCompressionAlgorithmConfig, ) from coremltools.optimize.torch.layerwise_compression.input_cacher import ( FirstLayerInputCacher as _FirstLayerInputCacher, ) from coremltools.optimize.torch.optimization_config import OptimizationConfig as _OptimizationConfig _logger = _logging.getLogger(__name__) _ModuleTypeConfigType = _NewType( "ModuleTypeConfigType", _Dict[_Union[_Callable, str], _Optional[_LayerwiseCompressionAlgorithmConfig]], ) _SUPPORTED_MODULES = [_torch.nn.Conv2d, _torch.nn.Linear] @_define class LayerwiseCompressorConfig(_OptimizationConfig): """ Configuration class for specifying how different submodules of a model are compressed by :py:class:`LayerwiseCompressor`. Note that only sequential models are supported. Args: layers (:obj:`list` of :py:class:`torch.nn.Module` or :obj:`str`): List of layers to be compressed. When items in the list are :obj:`str`, the string can be a regex or the exact name of the module. The layers listed should be immediate child modules of the parent container :py:class:`torch.nn.Sequential` model, and they should be contiguous. That is, the output of layer ``n`` should be the input to layer ``n+1``. global_config (:py:class:`ModuleGPTQConfig` or :py:class:`ModuleSparseGPTConfig`): Config to be applied globally to all supported modules. Missing values are chosen from the default config. module_type_configs (:obj:`dict` of :obj:`str` to :py:class:`ModuleGPTQConfig` or :py:class:`ModuleSparseGPTConfig`): Module type configs applied to a specific module class, such as :py:class:`torch.nn.Linear`. The keys can be either strings or module classes. 
module_name_configs (:obj:`dict` of :obj:`str` to :py:class:`ModuleGPTQConfig` or :py:class:`ModuleSparseGPTConfig`): Module-level configs applied to specific modules. The name of the module must either be a regex or a fully qualified name that can be used to fetch it from the top level module using the ``module.get_submodule(target)`` method. input_cacher (:obj:`str` or :py:class:`FirstLayerInputCacher`): Cacher object that caches inputs which are then fed to the first layer set up for compression. calibration_nsamples (:obj:`int`): Number of samples to be used for calibration. """ layers: _Optional[_Union[_List[_Union[_nn.Module, str]], _nn.ModuleList]] = _field( default=None, validator=_validators.optional( _validators.deep_iterable( member_validator=_validators.instance_of((_nn.Module, str)), iterable_validator=_validators.instance_of((list, _nn.ModuleList)), ) ), ) global_config: _Optional[_LayerwiseCompressionAlgorithmConfig] = _field( default=None, validator=_validators.optional( _validators.instance_of(_LayerwiseCompressionAlgorithmConfig) ), ) module_type_configs: _ModuleTypeConfigType = _field( factory=_OrderedDict, validator=_validators.deep_mapping( key_validator=_validators.instance_of((str, _Callable)), value_validator=_validators.optional( _validators.instance_of(_LayerwiseCompressionAlgorithmConfig) ), mapping_validator=_validators.instance_of(dict), ), ) module_name_configs: _Dict[str, _Optional[_LayerwiseCompressionAlgorithmConfig]] = _field( factory=_OrderedDict, validator=_validators.deep_mapping( key_validator=_validators.instance_of(str), value_validator=_validators.optional( _validators.instance_of(_LayerwiseCompressionAlgorithmConfig) ), mapping_validator=_validators.instance_of(dict), ), ) input_cacher: str = _field(default="default", converter=_FirstLayerInputCacher.get_class) calibration_nsamples: int = _field(default=128, validator=_validators.instance_of(int)) @classmethod def from_dict(cls, config_dict: _Dict[str, _Any]) -> "LayerwiseCompressorConfig": super().from_dict(config_dict) converter = _cattrs.Converter(forbid_extra_keys=True) converter.register_structure_hook( _Optional[_Union[_List[_Union[_nn.Module, str]], _nn.ModuleList]], lambda obj, type: obj, ) converter.register_structure_hook( _LayerwiseCompressionAlgorithmConfig, lambda obj, type: _LayerwiseCompressionAlgorithmConfig.get_class( obj["algorithm"] ).from_dict(obj), ) converter.register_structure_hook( _ModuleTypeConfigType, lambda module_type_config, type: { key: _LayerwiseCompressionAlgorithmConfig.get_class(val["algorithm"]).from_dict(val) if val is not None else None for key, val in module_type_config.items() }, ) return converter.structure_attrs_fromdict(config_dict, cls) def get_layers(self, model: _nn.Module): if self.layers is None: for module_name, module in model.named_children(): yield module_name, module else: yielded = set() for module_name, module in model.named_modules(remove_duplicate=True): for layer in self.layers: if isinstance(layer, str) and _re.fullmatch(layer, module_name): if module_name not in yielded: yielded.add(module_name) yield module_name, module elif module == layer: if module_name not in yielded: yielded.add(module_name) yield module_name, module @_contextmanager def _set_torch_flags(): # TODO: Copied from original implementation; determine if this is necessary cuda_matmul_tf32 = _torch.backends.cuda.matmul.allow_tf32 cudnn_allow_tf32 = _torch.backends.cudnn.allow_tf32 try: _torch.backends.cuda.matmul.allow_tf32 = False _torch.backends.cudnn.allow_tf32 = False 
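        # Assumption carried over from the reference GPTQ/SparseGPT code: TF32
        # matmuls trade precision for speed, which can perturb Hessian accumulation
        # and the Cholesky factorizations, so full float32 is enforced while
        # compression runs and the original flags are restored afterwards.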
yield finally: _torch.backends.cuda.matmul.allow_tf32 = cuda_matmul_tf32 _torch.backends.cudnn.allow_tf32 = cudnn_allow_tf32 class LayerwiseCompressor(_BaseDataCalibratedModelOptimizer): """ A post-training compression algorithm which compresses a sequential model layer by layer by minimizing the quantization error while quantizing the weights. The implementation supports two variations of this algorithm: 1) `Generative Pre-Trained Transformer Quantization (GPTQ) `_ 2) `Sparse Generative Pre-Trained Transformer (SparseGPT) `_ At a high level, it compresses weights of a model layer by layer by minimizing the L2 norm of the difference between the original activations and activations obtained from compressing the weights of a layer. The activations are computed using a few samples of training data. Only sequential models are supported, where the output of one layer feeds into the input of the next layer. For HuggingFace models, disable the ``use_cache`` config. This is used to speed up decoding, but to generalize forward pass for :py:class:`LayerwiseCompressor` algorithms across all model types, the behavior must be disabled. Example: .. code-block:: python import torch.nn as nn from coremltools.optimize.torch.layerwise_compression import ( LayerwiseCompressor, LayerwiseCompressorConfig, ) model = nn.Sequential( OrderedDict( { "conv": nn.Conv2d(1, 20, (3, 3)), "relu1": nn.ReLU(), "conv2": nn.Conv2d(20, 20, (3, 3)), "relu2": nn.ReLU(), } ) ) dataloder = load_calibration_data() # initialize the quantizer config = LayerwiseCompressorConfig.from_dict( { "global_config": { "algorithm": "gptq", "weight_dtype": "int4", }, "input_cacher": "default", "calibration_nsamples": 16, } ) compressor = LayerwiseCompressor(model, config) compressed_model = compressor.compress(dataloader) Args: model (:obj:`torch.nn.Module`): Module to be compressed. config (:py:class:`LayerwiseCompressorConfig`): Config that specifies how different submodules in the model will be compressed. """ _supported_modules: _Tuple = tuple(_SUPPORTED_MODULES) def __init__(self, model: _nn.Module, config: LayerwiseCompressorConfig): super().__init__(model, config) self._input_cacher = self._config.input_cacher( self._model, self._config.layers, ) @staticmethod def _forward_layer(layer, inputs, kwarg_inputs, outputs) -> _List: """ Perform forward pass on layer and store outputs. """ for j, inp in enumerate(inputs): if isinstance(inp, _torch.Tensor): inp = (inp,) outputs[j] = layer(*inp, **kwarg_inputs) return outputs def _get_cached_inputs( self, dataloader: _Iterable, device: str ) -> _Tuple[_List[_torch.Tensor], _Dict[str, _torch.Tensor]]: """ Cache the inputs and keyword arguments up till the first layer set up for compression """ inputs, kwarg_inputs = self._input_cacher.cache( dataloader=dataloader, nsamples=self._config.calibration_nsamples, device=device, ) return inputs, kwarg_inputs def _get_layers_to_compress(self) -> _Dict[str, _nn.Module]: """ Returns a list of layers to be compressed """ return self._config.get_layers(self._model) def _init_and_config_layer( self, atomic_layer_name, atomic_layer ) -> _Optional[_LayerwiseCompressionAlgorithm]: """ Initializes and configures the compression algorithm for a given atomic layer. 
Returns the initialized and configured compression algorithm object """ layer_config = self._config.get_module_config(atomic_layer_name, atomic_layer) if layer_config is not None: algo_class = _LayerwiseCompressionAlgorithm.get_class(layer_config.algorithm) try: return algo_class(atomic_layer, layer_config) except ValueError as error: _logger.info(f"Skipping compression for {atomic_layer_name}. Reason={error}") return None def _register_activation_processing_hook( self, atomic_layer, compressor_obj ) -> _torch.utils.hooks.RemovableHandle: """ Registers a forward hook on the layer for performing computation using the inputs to acquire statistics. Returns the handle for the forward hook """ def activation_processing_hook(_, inp, out): compressor_obj.add_batch(inp[0].data, out.data) return atomic_layer.register_forward_hook(activation_processing_hook) @_torch.no_grad() def _compress_impl(self, dataloader: _Iterable, device: str) -> _nn.Module: """ Compresses a model layerwise using the following steps: 1) Compute inputs to the first layer which is set up for compression using input cacher 2) For each layer, find submodules which are supported for compression and install compression hooks. 3) Run forward pass through each layer, compute activation statistics and use them to compress weights. 4) Compute updated outputs using compressed weights to propagate quantization error to the next layer and set them up as inputs to next layer. """ inputs, kwarg_inputs = self._get_cached_inputs(dataloader, device) outputs = [None for _ in inputs] # compress the layers one by one for layer_idx, (parent_layer_name, layer) in enumerate(self._get_layers_to_compress()): layer.to(device) atomic_layers_dict = _get_atomic_layers( layer, layer_types=self._supported_modules, name_prefix=parent_layer_name, ) # dict mapping layer_name -> compression algorithm object compression_algo_objects_dict = dict() # dict mapping layer_name -> forward hook handle layer_hooks = [] for atomic_layer_name, atomic_layer in atomic_layers_dict.items(): obj = self._init_and_config_layer(atomic_layer_name, atomic_layer) if obj is not None: compression_algo_objects_dict[atomic_layer_name] = obj layer_hooks.append(self._register_activation_processing_hook(atomic_layer, obj)) # Compute statistics on the activations using the activation processing hooks outputs = self._forward_layer( layer, inputs, kwarg_inputs, outputs, ) # Remove the activation processing hooks for h in layer_hooks: h.remove() # compress the layers _logger.info(f"Layer {layer_idx}") for ( atomic_layer_name, compressor_algo, ) in compression_algo_objects_dict.items(): _logger.info(f"Compressing {atomic_layer_name}") compressor_algo.compress() compressor_algo.cleanup() del compression_algo_objects_dict # feed the previous layer's outputs to this layer outputs = self._forward_layer( layer, inputs, kwarg_inputs, outputs, ) # free memory layer.cpu() del layer _torch.cuda.empty_cache() # interchange inputs and outputs inputs, outputs = outputs, inputs _register_metadata_version(self._model) return self._model def compress(self, dataloader: _Iterable, device: str, inplace: bool = False) -> _nn.Module: """ Compresses model using samples from ``dataloader``. Args: dataloader (:py:class:`Iterable`): An iterable where each element is an input to the model to be compressed. device (:obj:`str`): Device string for device to run compression on. 
inplace (:obj:`bool`): If ``True``, model transformations are carried out in-place and the original module is mutated, otherwise a copy of the model is mutated and returned. Defaults to ``False``. """ self._model = super().compress(dataloader=dataloader, inplace=inplace) with _get_eval_model(self._model): with _set_torch_flags(): return self._compress_impl(dataloader, device) def report(self) -> _Report: return _compute_post_training_report( self._uncompressed_model, self._model, supported_modules=self._supported_modules, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/optimization_config.py0000644000000000000000000001642614672066616024456 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import re as _re from collections import OrderedDict as _OrderedDict from enum import Enum as _Enum from typing import Any as _Any from typing import Callable as _Callable from typing import Dict as _Dict from typing import List as _List from typing import Optional as _Optional from typing import Union as _Union import torch as _torch from attr import Factory as _Factory from attr import define as _define from attrs import field as _field from coremltools.optimize.torch._utils.python_utils import DictableDataClass as _DictableDataClass class QuantizationGranularity(_Enum): """ Enum to denote granularity at which different compression schemes are applied. See specific algorithm for more details. """ per_tensor = "per_tensor" per_channel = "per_channel" per_block = "per_block" class PalettizationGranularity(_Enum): """ Enum to denote granularity at which different compression schemes are applied. See specific algorithm for more details. """ per_tensor = "per_tensor" per_grouped_channel = "per_grouped_channel" class ModuleOptimizationConfig(_DictableDataClass): pass @_define class OptimizationConfig(_DictableDataClass): global_config: _Optional[ModuleOptimizationConfig] = None module_type_configs: _Dict[ _Union[_Callable, str], _Optional[ModuleOptimizationConfig] ] = _Factory(_OrderedDict) module_name_configs: _Dict[str, _Optional[ModuleOptimizationConfig]] = _Factory(_OrderedDict) def set_global( self, global_config: _Optional[ModuleOptimizationConfig] ) -> "OptimizationConfig": """ Set the global config. """ self.global_config = global_config return self def set_module_type( self, object_type: _Union[_Callable, str], opt_config: _Optional[ModuleOptimizationConfig] ) -> "OptimizationConfig": """ Set the module level optimization config for a given module type. If the module level optimization config for an existing module type was already set, the new config will override the old one. """ self.module_type_configs[object_type] = opt_config return self def set_module_name( self, module_name: str, opt_config: _Optional[ModuleOptimizationConfig] ) -> "OptimizationConfig": """ Set the module level optimization config for a given module instance. If the module level optimization config for an existing module was already set, the new config will override the old one. 
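        A short illustrative sketch of how the three setters interact
        (``global_cfg``, ``linear_cfg``, and ``fc1_cfg`` are hypothetical
        :py:class:`ModuleOptimizationConfig` instances); name-based configs take
        precedence over type-based configs, which take precedence over the global
        config when ``get_module_config`` resolves a module:

        .. code-block:: python

            import torch

            config = OptimizationConfig()
            config.set_global(global_cfg)
            config.set_module_type(torch.nn.Linear, linear_cfg)
            config.set_module_name("decoder.mlp.fc1", fc1_cfg)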
""" self.module_name_configs[module_name] = opt_config return self def get_module_config( self, name: str, module: _torch.nn.Module ) -> _Optional[ModuleOptimizationConfig]: for mod_name in self.module_name_configs: if _re.fullmatch(mod_name, name): return self.module_name_configs[mod_name] if type(module) in self.module_type_configs: return self.module_type_configs[type(module)] elif module.__class__.__name__ in self.module_type_configs: return self.module_type_configs[module.__class__.__name__] else: return self.global_config @classmethod def from_dict(cls, config_dict: _Dict[str, _Any]) -> _Optional["OptimizationConfig"]: """ Create class from a dictionary of string keys and values. Args: config_dict (:obj:`dict` of :obj:`str` and values): A nested dictionary of strings and values. """ # passing forbid_extra_keys=True doesn't prevent silent failure when keys are mis-spelled cls._validate_dict(config_dict) return def _validate_same_params(self, param_names: _List[str]): """ This method validates that all the parameters in param_names have the same value across all the module level configs. """ expected_values = None if self.global_config is not None: expected_values = { param_name: getattr(self.global_config, param_name) for param_name in param_names } for name, config in self.module_type_configs.items(): if config is not None: expected_values = self._validate_expected_value( expected_values, name, config, param_names ) for name, config in self.module_name_configs.items(): if config is not None: expected_values = self._validate_expected_value( expected_values, name, config, param_names ) @staticmethod def _validate_expected_value( expected_values: _Dict[str, _Any], name: str, config: ModuleOptimizationConfig, param_names: _List[str], ): if expected_values is None: expected_values = { param_name: getattr(config, param_name) for param_name in param_names } for param_name, expected_val in expected_values.items(): val = getattr(config, param_name) if val != expected_val: raise ValueError( f"Value of parameter {param_name} cannot " f"be different between different module level configs." f"Expected value: {expected_val}, received: {val} " f"for config {name}." ) return expected_values def _structure_from_dict_hook_factory(conversion_cls: _Any) -> _Callable: def _structure_from_dict_hook( module_type_dict: _Dict[_Union[_Callable, str], _Any], type: _Any ): return_dict = _OrderedDict() for key, value in module_type_dict.items(): if value is None: return_dict[key] = None else: if isinstance(value, dict): return_dict[key] = conversion_cls.from_dict(value) else: assert isinstance(value, conversion_cls), ( "value in module type dict should be either a dict or " "a module config object." ) return_dict[key] = value return return_dict return _structure_from_dict_hook def _validate_module_type_keys_factory(supported_modules): supported_module_names = [cls.__name__ for cls in supported_modules] def validate_module_type_key(instance, attribute, value): if isinstance(value, str): assert value in supported_module_names, ( f"keys for module_type_configs must be one of " f"{supported_module_names}. Received: {value}." ) else: assert value in supported_modules, ( f"keys for module_type_configs must be one of " f"{supported_modules}. Received: {value}." 
) return validate_module_type_key def _deprecated_field(message="This field is deprecated"): def validator(inst, attr, val): if val is not None: raise DeprecationWarning(message) return _field(default=None, validator=validator, on_setattr=validator) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2695472 coremltools-8.0/coremltools/optimize/torch/palettization/0000755000000000000000000000000014672075535022707 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/palettization/__init__.py0000644000000000000000000000435314672066616025025 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause """ .. _coremltools_optimize_torch_palettization: .. include:: palettization_desc.rst :end-line: 7 _`DKMPalettizer` ================ Top level APIs -------------- .. autoclass:: coremltools.optimize.torch.palettization.ModuleDKMPalettizerConfig :members: from_dict, as_dict, from_yaml .. autoclass:: coremltools.optimize.torch.palettization.DKMPalettizerConfig :members: set_global, set_module_type, set_module_name, from_dict, as_dict, from_yaml .. autoclass:: coremltools.optimize.torch.palettization.DKMPalettizer :members: prepare, step, report, finalize _`Palettization layers for DKM` ------------------------------- .. autoclass:: coremltools.optimize.torch.palettization.FakePalettize :no-members: _`SensitiveKMeans` ================== .. autoclass:: coremltools.optimize.torch.palettization.ModuleSKMPalettizerConfig :members: from_dict, as_dict, from_yaml .. autoclass:: coremltools.optimize.torch.palettization.SKMPalettizerConfig :members: set_global, set_module_type, set_module_name, from_dict, as_dict, from_yaml .. autoclass:: coremltools.optimize.torch.palettization.SKMPalettizer :members: compute_sensitivity, compress _`PostTrainingPalettization` ============================ .. autoclass:: coremltools.optimize.torch.palettization.ModulePostTrainingPalettizerConfig :members: from_dict, as_dict, from_yaml .. autoclass:: coremltools.optimize.torch.palettization.PostTrainingPalettizerConfig :members: set_global, set_module_type, set_module_name, from_dict, as_dict, from_yaml .. autoclass:: coremltools.optimize.torch.palettization.PostTrainingPalettizer :members: compress """ from .fake_palettize import FakePalettize from .palettization_config import DKMPalettizerConfig, ModuleDKMPalettizerConfig from .palettizer import DKMPalettizer from .post_training_palettization import ( ModulePostTrainingPalettizerConfig, PostTrainingPalettizer, PostTrainingPalettizerConfig, ) from .sensitive_k_means import ModuleSKMPalettizerConfig, SKMPalettizer, SKMPalettizerConfig ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/palettization/_custom_conversion.py0000644000000000000000000003726114672066616027210 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import torch as _torch import torch.nn as _nn import torch.nn.qat as _nnqat from coremltools.optimize.torch._utils.metadata_utils import ( CompressionMetadata as _CompressionMetadata, ) from ._supported_modules import Conv1d, Embedding, LayerNorm, MultiheadAttention class PalettizationCustomConversionBase(_nn.Module): """ PalettizationCustomConversionBase is the base class for palettized model conversion. It implements the get_finalized_weights method which returns the palettized weights from ``LUT`` and ``indices`` post-palettization. """ def __init__(self): super().__init__() @classmethod def do_attribute_assertions(cls, observed_module: _nn.Module): assert hasattr( observed_module, "qconfig" ), f"Module {type(observed_module)} has no attribute qconfig" assert hasattr(observed_module, "activation_post_process"), ( f"Module {type(observed_module)} has no " f"attribute activation_post_process" ) assert hasattr(observed_module, "weight_fake_quant"), ( f"Module {type(observed_module)} has no attribute " f"weight_fake_quant" ) @classmethod def get_finalized_weights(cls, observed_module: _nn.Module): if observed_module.weight_fake_quant.partitions: return observed_module.weight_fake_quant.forward(observed_module.weight.detach()) return observed_module.weight @classmethod def add_metadata(cls, observed_module: _nn.Module, return_module: _nn.Module): for dir_key in dir(observed_module): if "_fake_quant" in dir_key: if not isinstance(getattr(observed_module, dir_key).centroids[0], _torch.Tensor): break param_name = dir_key.replace("_fake_quant", "") compression_metadata = _CompressionMetadata(param_name) compression_metadata.compression_type = ["palettization"] lut = _torch.stack(getattr(observed_module, dir_key).centroids, dim=0) for i in range(observed_module.weight.dim() + 2 - lut.dim()): lut = lut.unsqueeze(-3) compression_metadata.lut = lut if getattr(observed_module, dir_key).enable_per_channel_scale: per_channel_scaling_factor = getattr( observed_module, dir_key ).per_channel_scaling_factor for _ in range(observed_module.weight.dim() - per_channel_scaling_factor.dim()): per_channel_scaling_factor = per_channel_scaling_factor.unsqueeze(-1) compression_metadata.palettization_scale = per_channel_scaling_factor compression_metadata.register(return_module) @classmethod def from_observed(cls, observed_module: _nn.Module): """ The classes that base-class this class will have to implement the ``from_observed`` method to tell the convert method what type of a module to return through Pytorch's conversion. """ raise NotImplementedError() class LinearPalettizationConversion(PalettizationCustomConversionBase): """ Conversion class for Linear. 
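    This class is not normally called directly: it is registered in
    ``PALETTIZATION_CONVERT_DICT`` (defined at the bottom of this file) so that the
    convert step can swap an observed ``nnqat.Linear`` back to a plain ``nn.Linear``
    carrying palettization metadata. An illustrative direct call
    (``observed_linear`` is a hypothetical observed module):

    .. code-block:: python

        plain_linear = LinearPalettizationConversion.from_observed(observed_linear)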
""" def __init__(self): super().__init__() @classmethod def from_observed(cls, observed_module: _nn.Module): cls.do_attribute_assertions(observed_module) finalized_weights = cls.get_finalized_weights(observed_module) return_module = _nn.Linear( in_features=observed_module.in_features, out_features=observed_module.out_features, bias=observed_module.bias is not None, device=observed_module.device if hasattr(observed_module, "device") else None, dtype=observed_module.dtype if hasattr(observed_module, "dtype") else None, ) return_module.weight = _nn.Parameter(finalized_weights) cls.add_metadata(observed_module, return_module) if observed_module.bias is not None: return_module.bias = _nn.Parameter(observed_module.bias.detach()) return_module.activation_post_process = observed_module.activation_post_process return return_module class Conv1dPalettizationConversion(PalettizationCustomConversionBase): """ Conversion class for Conv2d. """ def __init__(self): super().__init__() @classmethod def from_observed(cls, observed_module: _nn.Module): cls.do_attribute_assertions(observed_module) finalized_weights = cls.get_finalized_weights(observed_module) return_module = _nn.Conv1d( in_channels=observed_module.in_channels, out_channels=observed_module.out_channels, kernel_size=observed_module.kernel_size, stride=observed_module.stride, padding=observed_module.padding, dilation=observed_module.dilation, groups=observed_module.groups, bias=observed_module.bias is not None, padding_mode=observed_module.padding_mode, device=observed_module.device if hasattr(observed_module, "device") else None, dtype=observed_module.dtype if hasattr(observed_module, "dtype") else None, ) return_module.weight = _nn.Parameter(finalized_weights) cls.add_metadata(observed_module, return_module) if observed_module.bias is not None: return_module.bias = _nn.Parameter(observed_module.bias.detach()) return_module.activation_post_process = observed_module.activation_post_process return return_module class Conv2dPalettizationConversion(PalettizationCustomConversionBase): """ Conversion class for Conv2d. """ def __init__(self): super().__init__() @classmethod def from_observed(cls, observed_module: _nn.Module): cls.do_attribute_assertions(observed_module) finalized_weights = cls.get_finalized_weights(observed_module) return_module = _nn.Conv2d( in_channels=observed_module.in_channels, out_channels=observed_module.out_channels, kernel_size=observed_module.kernel_size, stride=observed_module.stride, padding=observed_module.padding, dilation=observed_module.dilation, groups=observed_module.groups, bias=observed_module.bias is not None, padding_mode=observed_module.padding_mode, device=observed_module.device if hasattr(observed_module, "device") else None, dtype=observed_module.dtype if hasattr(observed_module, "dtype") else None, ) return_module.weight = _nn.Parameter(finalized_weights) cls.add_metadata(observed_module, return_module) if observed_module.bias is not None: return_module.bias = _nn.Parameter(observed_module.bias.detach()) return_module.activation_post_process = observed_module.activation_post_process return return_module class Conv3dPalettizationConversion(PalettizationCustomConversionBase): """ Conversion class for Conv3d. 
""" def __init__(self): super().__init__() @classmethod def from_observed(cls, observed_module: _nn.Module): cls.do_attribute_assertions(observed_module) finalized_weights = cls.get_finalized_weights(observed_module) return_module = _nn.Conv3d( in_channels=observed_module.in_channels, out_channels=observed_module.out_channels, kernel_size=observed_module.kernel_size, stride=observed_module.stride, padding=observed_module.padding, dilation=observed_module.dilation, groups=observed_module.groups, bias=observed_module.bias is not None, padding_mode=observed_module.padding_mode, device=observed_module.device if hasattr(observed_module, "device") else None, dtype=observed_module.dtype if hasattr(observed_module, "dtype") else None, ) return_module.weight = _nn.Parameter(finalized_weights) cls.add_metadata(observed_module, return_module) if observed_module.bias is not None: return_module.bias = _nn.Parameter(observed_module.bias.detach()) return_module.activation_post_process = observed_module.activation_post_process return return_module class LayerNormPalettizationConversion(PalettizationCustomConversionBase): """ Conversion class for LayerNorm. """ def __init__(self): super().__init__() @classmethod def from_observed(cls, observed_module: _nn.Module): cls.do_attribute_assertions(observed_module) finalized_weights = cls.get_finalized_weights(observed_module) return_module = _nn.LayerNorm( normalized_shape=observed_module.normalized_shape, eps=observed_module.eps, elementwise_affine=observed_module.elementwise_affine, device=observed_module.device if hasattr(observed_module, "device") else None, dtype=observed_module.dtype if hasattr(observed_module, "dtype") else None, ) if observed_module.elementwise_affine: return_module.weight = _nn.Parameter(finalized_weights) if observed_module.bias: return_module.bias = _nn.Parameter(observed_module.bias.detach()) cls.add_metadata(observed_module, return_module) return_module.activation_post_process = observed_module.activation_post_process return return_module class MultiheadAttentionPalettizationConversion(PalettizationCustomConversionBase): """ Conversion class for MultiheadAttention. 
""" def __init__(self): super().__init__() @classmethod def do_attribute_assertions(cls, observed_module: _nn.Module): assert hasattr( observed_module, "qconfig" ), f"Module {type(observed_module)} has no attribute qconfig" assert hasattr(observed_module, "activation_post_process"), ( f"Module {type(observed_module)} has no " f"attribute activation_post_process" ) assert hasattr(observed_module.out_proj, "weight_fake_quant"), ( f"Module {type(observed_module.out_proj)} has no attribute " f"q_proj_weight_fake_quant" ) if not observed_module._qkv_same_embed_dim: assert hasattr(observed_module, "q_proj_weight_fake_quant"), ( f"Module {type(observed_module)} has no attribute " f"q_proj_weight_fake_quant" ) assert hasattr(observed_module, "k_proj_weight_fake_quant"), ( f"Module {type(observed_module)} has no attribute " f"k_proj_weight_fake_quant" ) assert hasattr(observed_module, "v_proj_weight_fake_quant"), ( f"Module {type(observed_module)} has no attribute " f"v_proj_weight_fake_quant" ) else: assert hasattr(observed_module, "in_proj_weight_fake_quant"), ( f"Module {type(observed_module)} has no attribute " f"in_proj_weight_fake_quant" ) @classmethod def from_observed(cls, observed_module: _nn.Module): cls.do_attribute_assertions(observed_module) add_bias_kv = observed_module.bias_k is not None and observed_module.bias_v is not None bias = ( observed_module.out_proj.bias is not None and observed_module.in_proj_bias is not None ) return_module = _nn.MultiheadAttention( embed_dim=observed_module.embed_dim, num_heads=observed_module.num_heads, dropout=observed_module.dropout, bias=bias, add_bias_kv=add_bias_kv, add_zero_attn=observed_module.add_zero_attn, kdim=observed_module.kdim, vdim=observed_module.vdim, batch_first=observed_module.batch_first, device=observed_module.device if hasattr(observed_module, "device") else None, dtype=observed_module.dtype if hasattr(observed_module, "dtype") else None, ) if not observed_module._qkv_same_embed_dim: return_module.q_proj_weight = _nn.Parameter( observed_module.q_proj_weight_fake_quant.forward( observed_module.q_proj_weight.detach() ) ) return_module.k_proj_weight = _nn.Parameter( observed_module.k_proj_weight_fake_quant.forward( observed_module.k_proj_weight.detach() ) ) return_module.v_proj_weight = _nn.Parameter( observed_module.v_proj_weight_fake_quant.forward( observed_module.v_proj_weight.detach() ) ) else: return_module.in_proj_weight = _nn.Parameter( observed_module.in_proj_weight_fake_quant.forward( observed_module.in_proj_weight.detach() ) ) return_module.out_proj.weight = _nn.Parameter( observed_module.out_proj.weight_fake_quant.forward( observed_module.out_proj.weight.detach() ) ) if bias: return_module.out_proj.bias = _nn.Parameter(observed_module.out_proj.bias.detach()) return_module.in_proj_bias = _nn.Parameter(observed_module.in_proj_bias.detach()) if add_bias_kv: return_module.bias_k = _nn.Parameter(observed_module.bias_k.detach()) return_module.bias_v = _nn.Parameter(observed_module.bias_v.detach()) cls.add_metadata(observed_module, return_module) return_module.activation_post_process = observed_module.activation_post_process return return_module class EmbeddingPalettizationConversion(PalettizationCustomConversionBase): """ Conversion class for Embedding. 
""" def __init__(self): super().__init__() @classmethod def from_observed(cls, observed_module: _nn.Module): cls.do_attribute_assertions(observed_module) finalized_weights = cls.get_finalized_weights(observed_module) return_module = _nn.Embedding( num_embeddings=observed_module.num_embeddings, embedding_dim=observed_module.embedding_dim, padding_idx=observed_module.padding_idx, max_norm=observed_module.max_norm, norm_type=observed_module.norm_type, scale_grad_by_freq=observed_module.scale_grad_by_freq, sparse=observed_module.sparse, _weight=None, device=observed_module.device if hasattr(observed_module, "device") else None, dtype=observed_module.dtype if hasattr(observed_module, "dtype") else None, ) return_module.weight = _nn.Parameter(finalized_weights) cls.add_metadata(observed_module, return_module) return_module.activation_post_process = observed_module.activation_post_process return return_module # Dictionary to map nnqat modules to Custom Conversion class. Each of these Custom Conversion classes # implement a ``from_observed`` method which is used to create original modules from qat modules. PALETTIZATION_CONVERT_DICT = { "observed_to_quantized_custom_module_class": { _nnqat.Linear: LinearPalettizationConversion, _nnqat.Conv2d: Conv2dPalettizationConversion, _nnqat.Conv3d: Conv3dPalettizationConversion, Conv1d: Conv1dPalettizationConversion, LayerNorm: LayerNormPalettizationConversion, Embedding: EmbeddingPalettizationConversion, MultiheadAttention: MultiheadAttentionPalettizationConversion, } } ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/palettization/_efficient_kmeans.py0000644000000000000000000002205614672066616026717 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import Optional as _Optional from typing import Union as _Union import numpy as _np import torch as _torch import torch.distributed as _dist class _EfficientKMeans: """ An implementation of k-means which runs entirely on GPU. """ def __init__( self, n_clusters: int, init: _Union[str, _torch.Tensor], n_init: int = 0, labels=None, max_iter: int = 100, tol: float = 0.0001, error_bnd: float = 0.0, ): self.n_clusters = n_clusters self.n_init = n_init self.max_iter = max_iter self.tol = tol self.labels_ = labels self.inertia_ = None self.cluster_centers_ = init self.error_bnd = error_bnd assert self.max_iter > 0 assert self.n_clusters > 0 @staticmethod def _get_cluster_avg( n_clusters: int, indices: _torch.Tensor, vals: _torch.Tensor, sample_weight: _Optional[_torch.Tensor] = None, ) -> _torch.Tensor: agg_vals = ( vals.float() * sample_weight.float() if sample_weight is not None else vals.float() ) v_sum = ( _torch.zeros([n_clusters] + list(vals[0].size())) .to(vals.device) .index_add_(0, indices, agg_vals) ) weight = ( _torch.ones(len(vals), dtype=_torch.int).to(vals.device) if sample_weight is None else sample_weight.squeeze(1).to(vals.device) ) v_numel = ( _torch.zeros(n_clusters, dtype=weight.dtype) .to(vals.device) .index_add_(0, indices, weight) ) v_numel[v_numel == 0] = 1 v_avg = v_sum / v_numel.reshape(-1, 1) return v_avg.to(vals.dtype) @staticmethod def x_c_dist(params: _torch.Tensor, clusters: _torch.Tensor) -> _torch.Tensor: """ Calculate the distance between weights and clusters. 
""" clusters = clusters.contiguous() if _torch.finfo(params.dtype).bits > _torch.finfo(clusters.dtype).bits: return _torch.cdist(params.to(clusters.dtype), clusters).square() else: return _torch.cdist(params, clusters.to(params.dtype)).square() def _kmeans_pp( self, parameters: _torch.Tensor, sample_weight: _Optional[_torch.Tensor] = None ) -> "_EfficientKMeans": assert len(parameters) >= self.n_clusters self.inertia_ = int(1e9) for n in range(self.n_init): centroids = _torch.zeros( (self.n_clusters, parameters.size(-1)), device=parameters.device, dtype=parameters.dtype, ) for i in range(self.n_clusters): if i == 0: centroids[i] = parameters[_torch.randint(0, len(parameters), [1])] d_ij_curr = _torch.cdist(centroids[:i], parameters) else: d_ij_prev = _torch.cdist(centroids[i - 1 : i], parameters) d_ij_prev[d_ij_prev == 0] = -int(1e9) d_ij_curr = _torch.cat((d_ij_curr, d_ij_prev), dim=0) c_to_x = _torch.min(d_ij_curr, dim=0) centroids[i] = parameters[c_to_x[0].argmax()] for i in range(self.max_iter): min_error, labels = _torch.cdist(parameters, centroids).min(dim=-1) # if W is None: centroids.zero_() agg_params = parameters * sample_weight if sample_weight is not None else parameters weights = sample_weight.squeeze(1) if sample_weight is not None else None centroids.scatter_add_( 0, labels.view(-1, 1).expand([-1, parameters.size(-1)]), agg_params, ) n_centroids = _torch.bincount( labels, weights=weights, minlength=self.n_clusters ).view(-1, 1) centroids /= n_centroids cur_inertia = min_error.square().sum() if cur_inertia < self.inertia_: exit = self.inertia_ <= cur_inertia * (1 + self.tol) self.inertia_ = cur_inertia self.labels_ = labels self.cluster_centers_ = centroids if exit: break return self def fit( self, X: _torch.Tensor, sample_weight: _Optional[_torch.Tensor] = None ) -> "_EfficientKMeans": """ Compute k-means clustering. 
""" N = len(X) assert N >= self.n_clusters, f"too many clusters {self.n_clusters} for {N} samples" if isinstance(self.cluster_centers_, str): if "kmeans++" in self.cluster_centers_: if _dist.is_available() and _dist.is_initialized(): rank = _dist.get_rank() else: rank = 0 if "cpu" in self.cluster_centers_: import sklearn.cluster if "minibatch" in self.cluster_centers_: clustering_method = sklearn.cluster.MiniBatchKMeans else: clustering_method = sklearn.cluster.KMeans kmeans = clustering_method( n_init=self.n_init, n_clusters=self.n_clusters, max_iter=self.max_iter, random_state=rank + 1, tol=self.tol, ).fit(X.float().cpu().numpy(), sample_weight=sample_weight) self.inertia_ = _torch.Tensor([kmeans.inertia_]).to(X.device) self.labels_ = _torch.from_numpy(kmeans.labels_).int().to(X.device) self.cluster_centers_ = None else: self._kmeans_pp(X.float(), sample_weight=sample_weight) self.cluster_centers_ = _EfficientKMeans._get_cluster_avg( self.n_clusters, self.labels_, X, sample_weight=sample_weight ) elif self.cluster_centers_ == "opt1d": from coremltools._deps import _kmeans1d self.labels_, self.cluster_centers_ = _kmeans1d.cluster( X, self.n_clusters, weights=sample_weight ) self.n_clusters = len(self.cluster_centers_) self.cluster_centers_ = ( _torch.Tensor(self.cluster_centers_) .to(device=X.device, dtype=X.dtype) .view(-1, 1) ) self.labels_ = _torch.Tensor(self.labels_).int().to(X.device) min_error, _ = _EfficientKMeans.x_c_dist(X, self.cluster_centers_).min(dim=-1) self.inertia_ = min_error.sum() else: self.inertia_ = None for i in range(self.max_iter): self.cluster_centers_ = _EfficientKMeans._get_cluster_avg( self.n_clusters, self.labels_, X, sample_weight=sample_weight ) # remove empty clusters perhaps due to pruning nan_centers = self.cluster_centers_.isnan() if nan_centers.any(): self._kmeans_pp(X, sample_weight=sample_weight) continue x_c_dist = _EfficientKMeans.x_c_dist(X, self.cluster_centers_) min_error, self.labels_ = x_c_dist.min(dim=-1) cur_inertia = min_error.sum() if self.error_bnd and _torch.sqrt(cur_inertia / N) < self.error_bnd: unique, counts = _torch.unique(self.labels_, return_counts=True) idx = unique[counts.argmin()] reduce_cluster_centers_ = self.cluster_centers_.clone() reduce_cluster_centers_[idx] = _np.nan reduce_cluster_centers_ = reduce_cluster_centers_[ ~_torch.isnan(reduce_cluster_centers_) ].view(-1, 1) reduce_min_error, reduce_labels_ = _EfficientKMeans.x_c_dist( X, reduce_cluster_centers_ ).min(dim=-1) reduce_inertia = reduce_cluster_centers_.sum() rmse_error = _torch.sqrt(reduce_inertia / N) if rmse_error < self.error_bnd: self.cluster_centers_ = reduce_cluster_centers_ self.labels_ = reduce_labels_ self.n_clusters = len(self.cluster_centers_) continue if self.inertia_ is None or abs(self.inertia_ - cur_inertia) > self.tol: self.inertia_ = cur_inertia else: self.inertia_ = cur_inertia break return self ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/palettization/_fake_palettizer_tensor_hook.py0000644000000000000000000002322414672066616031206 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import gc from typing import Callable as _Callable from typing import Tuple as _Tuple import torch as _torch import torch.distributed as _dist from ._utils import get_shard_list as _get_shard_list MAX_RECURSION_DEPTH = 10 class _FakePalettizerTensorHook: """ _FakePalettizerTensorHook is the custom hook that implements many of the tensor packing and unpacking techniques illustrated in the paper `eDKM: An Efficient and Accurate Train-time Weight Clustering for Large Language Models `_ """ SOFTMAX_BACKWARD = "SoftmaxBackward" CLAMP_BACKWARD = "ClampBackward" DIST_BACKWARD = "EuclideanDistBackward" TRANS_BACKWARD = "TransposeBackward" STACK_BACKWARD = "StackBackward" INDEX_BACKWARD = "IndexBackward" DIV_BACKWARD = "DivBackward" SLICE_BACKWARD = "SliceBackward" VIEW_BACKWARD = "ViewBackward" EXPAND_BACKWARD = "ExpandBackward" RESHAPE_BACKWARD = "ReshapeAliasBackward" TOCOPY_BACKWARD = "ToCopyBackward" gc_trigger = None last_report = {} def __init__( self, zero_threshold, device, min_size=0, max_mem=1.0, use_unique=False, use_shard=False, ): self.min_size = max(min_size, 64) self.max_mem = max_mem self.tensor_dict = {} self.tensor_counter = {} self.total_requested = 0 self.total_allocated = 0 self.use_unique = use_unique self.use_shard = use_shard self.pack_counter = -1 self.device = device self.zero_threshold = zero_threshold t = _torch.cuda.get_device_properties(device).total_memory a = _torch.cuda.memory_allocated(device) self.use_cpu = (a / t) > abs(self.max_mem) and hasattr(_torch.autograd, "graph") if self.use_cpu: if self.__class__.gc_trigger is None: self.__class__.gc_trigger = True if self.__class__.gc_trigger: gc.collect() def _copy_to_device(self, x) -> _torch.Tensor: if self.use_cpu: packed = _torch.empty(x.size(), dtype=x.dtype, layout=x.layout, pin_memory=True) packed.copy_(x, non_blocking=True) return packed return x def _unique_tensor(self, x) -> _Tuple[_torch.Tensor, _torch.Tensor, _torch.Tensor]: if x.size(1) <= 1 or x.size(0) <= 1024: return x y, y_i = x.float().unique(return_inverse=True, dim=0) y_base = 0 y = y.to(x.dtype) y = self._copy_to_device(y) max_y_size = y.size(0) if max_y_size >= _torch.iinfo(_torch.int16).max: y_base = max_y_size // 2 y_i -= y_base max_y_size = y_base + 1 y_i = _lower_int(y_i, 0, max_y_size) y_i = self._copy_to_device(y_i) return y, y_i, y_base def _compress_tensor(self, x, dtype) -> list: if x.numel() <= self.min_size: return x if x.dim() > 1: x = x.flatten(end_dim=-2) world_size = _dist.get_world_size() rank = _dist.get_rank() if len(x) < world_size or not self.use_shard: x = x.to(dtype) if self.use_unique: x = self._unique_tensor(x) return x shard_list = _get_shard_list(len(x)) tensor_list = [None] * world_size shard = x[shard_list[rank] : shard_list[rank + 1]].to(dtype) if self.use_unique: tensor_list[rank] = self._unique_tensor(shard) else: tensor_list[rank] = self._copy_to_device(shard) for i in range(world_size): shard = x[shard_list[i] : shard_list[i + 1]] if i != rank: tensor_list[i] = {"size": shard.size(), "dtype": dtype} return tensor_list def pack(self, x) -> _Tuple[str, _Callable, _torch.device, _torch.Tensor]: """ Function that will be called every time an operation saves a tensor for backward. 
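The hook pair is meant to be registered with ``torch.autograd.graph.saved_tensors_hooks`` around the differentiable palettization pass, as ``FakePalettize.forward`` does. A minimal sketch (assuming ``weights`` lives on a CUDA device and ``compute`` stands in for the clustering forward pass):

.. code-block:: python

    import torch

    hook = _FakePalettizerTensorHook(zero_threshold=1e-7, device=weights.device, max_mem=0.9)
    with torch.autograd.graph.saved_tensors_hooks(hook.pack, hook.unpack):
        out = compute(weights)  # tensors saved for backward are routed through pack()/unpack()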
""" key = None op = lambda z: z.view(size) if x.numel() <= self.min_size: return x x_clone = x.clone() if self.max_mem <= 0 else None device = x.device size = x.size() if x.dtype.is_floating_point: grad_fn_list = [] full_grad_fn_list = [] c_grad_fn = x.grad_fn while len(grad_fn_list) < 2: if c_grad_fn: str_grad_fn = str(type(c_grad_fn)) full_grad_fn_list.append(str_grad_fn) if ( self.__class__.RESHAPE_BACKWARD in str_grad_fn or self.__class__.TOCOPY_BACKWARD in str_grad_fn or self.__class__.EXPAND_BACKWARD in str_grad_fn ): pass else: grad_fn_list.append(str_grad_fn) c_grad_fn = c_grad_fn.next_functions[0][0] if c_grad_fn.next_functions else None else: break if key is None: for _ in range(len(grad_fn_list), 2): grad_fn_list.append("None") if ( self.__class__.SOFTMAX_BACKWARD in grad_fn_list[0] and self.__class__.DIV_BACKWARD in grad_fn_list[1] ): key = "softmax" + f".{self.pack_counter}" elif ( self.__class__.CLAMP_BACKWARD in grad_fn_list[0] and self.__class__.SOFTMAX_BACKWARD in grad_fn_list[1] ): key = "softmax" + f".{self.pack_counter}" op = lambda z: z.view(size).clamp(min=self.zero_threshold) elif self.__class__.DIST_BACKWARD in grad_fn_list[0]: self.pack_counter += 1 key = "x_c_dist" + f".{self.pack_counter}" elif ( self.__class__.VIEW_BACKWARD in grad_fn_list[0] and self.__class__.DIST_BACKWARD in grad_fn_list[1] ): key = "x_c_dist" + f".{self.pack_counter}" elif ( ( self.__class__.VIEW_BACKWARD in grad_fn_list[0] and self.__class__.STACK_BACKWARD in grad_fn_list[1] ) or ( self.__class__.STACK_BACKWARD in grad_fn_list[0] and self.__class__.INDEX_BACKWARD in grad_fn_list[1] ) or ( self.__class__.STACK_BACKWARD in grad_fn_list[0] and self.__class__.SLICE_BACKWARD in grad_fn_list[1] ) ): key = "X.b" + f".{-1}" elif ( self.__class__.TRANS_BACKWARD in grad_fn_list[0] and self.__class__.STACK_BACKWARD in grad_fn_list[1] ): key = "X.b" + f".{-1}" if key in self.tensor_dict: size = x.mT.size() op = lambda z: z.reshape(size).mT else: key = None if key is None: key = self._compress_tensor(x, x.dtype) elif key not in self.tensor_dict: w = self._compress_tensor(x, x.dtype) self.tensor_dict[key] = w else: key = self._compress_tensor(x, _torch.uint8) op = lambda z: z.to(device, _torch.int32) return key, op, device, x_clone def unpack(self, x) -> _torch.Tensor: """ Function that will be called to return a value to compute a new tensor, which is the one actually used during the backward pass. 
""" if isinstance(x, tuple): key, op, device, y = x look_up = isinstance(key, str) if look_up: v = self.tensor_dict[key] else: v = key v = _decompress_tensor(v, device) if look_up: self.tensor_dict[key] = v x = op(v) return x def _lower_int(x, x_min=None, x_max=None) -> _torch.Tensor: if x_min is None: x_min, x_max = x.min(), x.max() for t in [_torch.uint8, _torch.int8, _torch.int16, _torch.int32]: if _torch.iinfo(t).bits >= _torch.iinfo(x.dtype).bits: break if _torch.iinfo(t).min <= x_min and x_max <= _torch.iinfo(t).max: x = x.to(t) break return x def _deunique_tensor(x, device) -> _torch.Tensor: y, y_i, y_base = x y = y.to(device, non_blocking=True) y_i = y_i.to(_torch.int32) if y_base > 0: y_i += y_base return y[y_i] def _decompress_tensor(x, device) -> _torch.Tensor: if not isinstance(x, list): if isinstance(x, tuple): x = _deunique_tensor(x, device=device) return x distributed_world_size = _dist.get_world_size() distributed_rank = _dist.get_rank() for i in range(distributed_world_size): if isinstance(x[i], dict): x[i] = _torch.empty(**x[i], device=device) else: if isinstance(x[i], tuple): x[i] = _deunique_tensor(x[i], device=device) else: x[i] = x[i].to(device, non_blocking=True) _dist.all_gather(x[:distributed_world_size], x[distributed_rank]) return _torch.concat(x, dim=0) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/palettization/_partitioner.py0000644000000000000000000003232614672066616025766 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import math as _math from typing import Optional as _Optional from typing import Tuple as _Tuple import torch as _torch import torch.distributed as _dist from ._efficient_kmeans import _EfficientKMeans from ._utils import get_shard_list as _get_shard_list from ._utils import vectorize as _vectorize # NF Cluster sizes for which partitioning has been verified. NF_CLUSTER_SIZES = [8, 9, 16, 17] class _Partitioner: """ Internal class that manages partitioning. The ``FakePalettize`` class base classes the ``_Partitioner`` class and all the partitioning logic is controlled by this class. """ def __init__( self, n_bits: int, enforce_zero: bool, prune_threshold: float, cluster_dim: int, cluster_permute: _Tuple, group_size: _Optional[int], palett_tau: float, kmeans_init: str, percentage_palett_enable: float, kmeans_opt1d_threshold: int, kmeans_batch_threshold: int, kmeans_n_init: int, kmeans_error_bnd: float, ): self.centroids = [kmeans_init] self.n_clusters = 2 ** int(n_bits) self.labels = [None] self.enforce_zero = [enforce_zero] self.enable_partition = [] self.proj_factor = None self.partitions = [] self.cum_inertia = [] self.cluster_dim = cluster_dim self.cluster_permute = cluster_permute self.prune_threshold = prune_threshold self.palett_tau = palett_tau # rename to palett_tau self.group_size = group_size self.percentage_palett_enable = percentage_palett_enable self.kmeans_opt1d_threshold = kmeans_opt1d_threshold self.kmeans_batch_threshold = kmeans_batch_threshold self.kmeans_n_init = kmeans_n_init self.kmeans_error_bnd = kmeans_error_bnd def create_partitions(self, weights) -> None: """ Method to create partitions in the weights. These partitions can be used to run channel level palettization. 
""" with _torch.no_grad(): num_channels = len(weights) usr_num_channels_per_partition = ( int(self.group_size) if self.group_size else num_channels ) self.partitions = [ list(range(i, min(num_channels, i + usr_num_channels_per_partition))) for i in range(0, num_channels, usr_num_channels_per_partition) ] num_partitions = len(self.partitions) self.centroids = self.centroids * num_partitions self.labels = self.labels * num_partitions self.enforce_zero = self.enforce_zero * num_partitions self.cum_inertia = [1e9] * num_partitions self.partition_numel = _torch.tensor( [_torch.numel(weights[p]) for p in self.partitions] ) self.enable_partition = [True] * max( 1, int(self.percentage_palett_enable * num_partitions) ) self.enable_partition += [False] * (num_partitions - len(self.enable_partition)) numel_per_partition = max(self.partition_numel) assert numel_per_partition assert ( numel_per_partition >= self.n_clusters * self.cluster_dim ), f"The number of clusters ({self.n_clusters}) and/or the cluster dim ({self.cluster_dim}) is TOO big" def get_partition_kmeans( self, X, partition, n_clusters, labels, enforce_zero, max_iter, init, n_init=10, ) -> _EfficientKMeans: """ Method to get kmeans for a particular partition. """ cY, pad = _vectorize(X[partition], self.cluster_dim) kmeans = _EfficientKMeans( n_clusters=n_clusters, init=init, labels=labels, n_init=n_init, max_iter=max_iter, error_bnd=self.kmeans_error_bnd, ).fit(cY) if enforce_zero: # fix zero zero_point = _torch.zeros_like(kmeans.cluster_centers_[0]).unsqueeze(0) zero_idx = _torch.argmin( _torch.cdist(kmeans.cluster_centers_.float(), zero_point.float()) ) # always put zero in the first temp = kmeans.cluster_centers_[0] kmeans.cluster_centers_[zero_idx] = temp kmeans.cluster_centers_[0] = zero_point return kmeans def init_partitions(self, parameters) -> None: """ Method to initialize the partitions and set the k-means. Called during first iteration of palettization in the forward method of ``FakePalettize``. 
""" if isinstance(self.centroids[0], _torch.Tensor): return with _torch.no_grad(): num_partitions = len(self.partitions) numel_per_partition = max(self.partition_numel) if "nf" in self.centroids[0]: if self.n_clusters in NF_CLUSTER_SIZES and self.cluster_dim == 1: nf_fit = "fit" in self.centroids[0] for i, partition in enumerate(self.partitions): bit = int(_math.log2(self.n_clusters)) sparse = bool(_math.log2(self.n_clusters) - bit) self.centroids[i] = ( _generate_natural_float(bit=bit, sparse=sparse) .to(parameters.device) .to(parameters.dtype) .view(-1, 1) ) if nf_fit: best_err = _torch.finfo(_torch.float).max best_lambd = 1 best_retry = 0 best_thold = 10 up_down_hill = 0 lambd_list = [[1 + x / 100, 1 - x / 100] for x in range(99)] lambd_list = [1] + [v for sublist in lambd_list for v in sublist] cur_X = parameters[self.partitions[i]].view(-1, 1) for cur_lambd in lambd_list: if up_down_hill > best_thold and cur_lambd < 1: continue if up_down_hill < -best_thold and cur_lambd > 1: continue cur_lut = _torch.stack( [x.sign() * x.abs() ** (cur_lambd) for x in self.centroids[i]] ) x_c_dist = _torch.cdist(cur_X, cur_lut.to(cur_X.dtype)).square() cur_err = x_c_dist.min(-1).values.float().sum() if best_err > cur_err: best_retry = 0 best_err = cur_err best_lambd = cur_lambd if best_lambd > 1: up_down_hill += 1 else: up_down_hill -= 1 elif best_retry > best_thold: break else: best_retry += 1 self.centroids[i] = _torch.stack( [x.sign() * x.abs() ** (best_lambd) for x in self.centroids[i]] ) return self.centroids = ["auto"] * num_partitions for i in range(num_partitions): if self.centroids[i] == "auto": # if auto then pick either init method self.centroids[i] = ( "opt1d" if ( numel_per_partition <= self.n_clusters or numel_per_partition <= self.kmeans_opt1d_threshold ) and self.cluster_dim == 1 else "kmeans++" ) if _dist.is_available() and _dist.is_initialized(): distributed_world_size = _dist.get_world_size() else: distributed_world_size = 1 if max(num_partitions, distributed_world_size) < self.kmeans_batch_threshold: for i, partition in enumerate(self.partitions): kmeans = self.get_partition_kmeans( parameters, partition, self.n_clusters, self.labels[i], self.enforce_zero[i], max_iter=100, init=self.centroids[i], n_init=max(1, self.kmeans_n_init // distributed_world_size), ) bcast_rank = _get_best_rank(kmeans.inertia_, _torch.argmin) if bcast_rank: _dist.broadcast(kmeans.cluster_centers_, bcast_rank) self.centroids[i] = kmeans.cluster_centers_ self.labels[i] = None else: shard_list = _get_shard_list(num_partitions) centroids_list = [None] * distributed_world_size for i in range(distributed_world_size): begin, end = shard_list[i], shard_list[i + 1] current_rank = ( _dist.get_rank() if _dist.is_available() and _dist.is_initialized() else 0 ) if i == current_rank and begin < end: for p in range(begin, end): kmeans = self.get_partition_kmeans( parameters, self.partitions[p], self.n_clusters, self.labels[p], self.enforce_zero[p], max_iter=100, init=self.centroids[p], n_init=self.kmeans_n_init, ) self.centroids[p] = kmeans.cluster_centers_ centroids_list[i] = _torch.stack(self.centroids[begin:end]) else: centroids_list[i] = _torch.full( [end - begin, self.n_clusters, self.cluster_dim], float("nan"), dtype=parameters.dtype, device=parameters.device, ) if _dist.is_available() and _dist.is_initialized(): _dist.all_gather(centroids_list, centroids_list[_dist.get_rank()]) centroids_list = [v for sublist in centroids_list for v in sublist] assert len(centroids_list) == num_partitions for p in 
range(num_partitions): self.labels[p] = None self.centroids[p] = centroids_list[p] def _load_from_state_dict_( self, state_dict, prefix, local_metadata, strict, missing_keys, unexpected_keys, error_msgs, ): self.cluster_permute = state_dict.pop(prefix + "permute") self.partitions = state_dict.pop(prefix + "partitions") self.centroids = state_dict.pop(prefix + "centroids") self.labels = state_dict.pop(prefix + "labels") self.proj_factor = state_dict.pop(prefix + "proj_factor") def _save_to_state_dict_(self, destination, prefix, keep_vars): destination[prefix + "centroids"] = self.centroids destination[prefix + "labels"] = self.labels destination[prefix + "permute"] = self.cluster_permute destination[prefix + "partitions"] = self.partitions def _get_best_rank(metric, func=_torch.argmin) -> int: """ Get best rank of a particular metric according to a specified function. """ if _dist.is_available() and _dist.is_initialized(): distributed_world_size = _dist.get_world_size() if distributed_world_size > 1: tensor_list = [_torch.zeros_like(metric) for _ in range(distributed_world_size)] _dist.all_gather(tensor_list, metric) bcast_rank = func(_torch.Tensor(tensor_list)) return bcast_rank return None def _generate_natural_float(bit=4, sparse=False, offset=0.9677083) -> _torch.Tensor: """ Function to generate NF4 values. """ from scipy.stats import norm space = (2**bit) // 2 # one more positive value, this is an asymmetric type v1 = norm.ppf(_torch.linspace(offset, 0.5, space + 1)[:-1]).tolist() if sparse: v3 = [-x for x in v1] else: v3 = (-norm.ppf(_torch.linspace(offset, 0.5, space)[:-1])).tolist() v = [0] + v3 + list(reversed(v1)) values = _torch.Tensor(v) values /= values.max() return values ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/palettization/_supported_modules.py0000644000000000000000000002617114672066616027204 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import torch as _torch import torch.nn as _nn import torch.nn.functional as _F from .palettization_config import SUPPORTED_PYTORCH_QAT_MODULES def _get_palettization_qat_mappings(): """ _get_palettization_qat_mappings creates qat_module_mappings supported by coremltools.optimize.torch for palettization. We support three modules already in DEFAULT_QAT_MODULE_MAPPINGS, namely, nn.Linear, nn.Conv2d and nn.Conv3d. Additionally, we have added support for preparation of nn.Conv1d, nn.LayerNorm, nn.MultiheadAttention and nn.Embedding modules. 
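One way to consume the returned mapping (an illustrative sketch; ``model`` is assumed to already carry ``qconfig`` attributes on the modules to be palettized, which the palettizer normally sets up internally):

.. code-block:: python

    import torch

    mappings = _get_palettization_qat_mappings()
    model.train()  # prepare_qat expects a model in training mode
    prepared = torch.quantization.prepare_qat(model, mapping=mappings, inplace=False)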
""" qat_module_mappings = ( _torch.quantization.quantization_mappings.get_default_qat_module_mappings() ) for k in list(qat_module_mappings.keys()): if k not in SUPPORTED_PYTORCH_QAT_MODULES: del qat_module_mappings[k] qat_module_mappings[Conv1d._FLOAT_MODULE] = Conv1d qat_module_mappings[LayerNorm._FLOAT_MODULE] = LayerNorm qat_module_mappings[MultiheadAttention._FLOAT_MODULE] = MultiheadAttention qat_module_mappings[Embedding._FLOAT_MODULE] = Embedding return qat_module_mappings class Conv1d(_nn.Conv1d): _FLOAT_MODULE = _nn.Conv1d def forward(self, input): qweight = self.weight_fake_quant(self.weight) if self.padding_mode != "zeros": return _F.conv1d( _F.pad(input, self._reversed_padding_repeated_twice, mode=self.padding_mode), qweight, self.bias, self.stride, (0,), self.dilation, self.groups, ) return _F.conv1d( input, qweight, self.bias, self.stride, self.padding, self.dilation, self.groups ) @classmethod def from_float(cls, mod): r"""Create a qat module from a float module or qparams_dict Args: `mod` a float module, either produced by torch.quantization utilities or directly from user """ assert type(mod) == cls._FLOAT_MODULE, ( "qat." + cls.__name__ + ".from_float only works for " + cls._FLOAT_MODULE.__name__ ) assert hasattr(mod, "qconfig"), "Input float module must have qconfig defined" assert mod.qconfig, "Input float module must have a valid qconfig" qconfig = mod.qconfig qat = cls( mod.in_channels, mod.out_channels, mod.kernel_size, stride=mod.stride, padding=mod.padding, dilation=mod.dilation, groups=mod.groups, bias=mod.bias is not None, padding_mode=mod.padding_mode, ) qat.qconfig = qconfig qat.weight_fake_quant = qconfig.weight() wnorm = None for k, hook in mod._forward_pre_hooks.items(): if "WeightNorm" in str(hook): wnorm = hook if wnorm: qat = _nn.utils.weight_norm(qat, name=wnorm.name, dim=wnorm.dim) for name, param in mod.named_parameters(recurse=False): setattr(qat, name, param) if wnorm: _nn.utils.remove_weight_norm(mod) return qat class LayerNorm(_nn.LayerNorm): _FLOAT_MODULE = _nn.LayerNorm def forward(self, input): return _F.layer_norm( input, self.normalized_shape, self.weight_fake_quant(self.weight) if self.elementwise_affine else self.weight, self.bias, self.eps, ) @classmethod def from_float(cls, mod): r"""Create a qat module from a float module or qparams_dict Args: `mod` a float module, either produced by torch.quantization utilities or directly from user """ assert type(mod) == cls._FLOAT_MODULE, ( "qat." 
+ cls.__name__ + ".from_float only works for " + cls._FLOAT_MODULE.__name__ ) assert hasattr(mod, "qconfig"), "Input float module must have qconfig defined" assert mod.qconfig, "Input float module must have a valid qconfig" assert isinstance( mod.weight, _nn.Parameter ), "CANNOT be prepared for palettization: weight is NOT learnable" qconfig = mod.qconfig qat = cls(mod.normalized_shape, eps=mod.eps, elementwise_affine=mod.elementwise_affine) qat.qconfig = qconfig if qat.elementwise_affine: qat.weight_fake_quant = qconfig.weight() for name, param in mod.named_parameters(recurse=False): setattr(qat, name, param) assert qat.elementwise_affine == (qat.weight is not None) return qat class Embedding(_nn.Embedding): _FLOAT_MODULE = _nn.Embedding def forward(self, input): qweight = self.weight_fake_quant(self.weight) return _F.embedding( input, qweight, self.padding_idx, self.max_norm, self.norm_type, self.scale_grad_by_freq, self.sparse, ) @classmethod def from_float(cls, mod): r"""Create a qat module from a float module or qparams_dict Args: `mod` a float module, either produced by torch.quantization utilities or directly from user """ assert type(mod) == cls._FLOAT_MODULE, ( "qat." + cls.__name__ + ".from_float only works for " + cls._FLOAT_MODULE.__name__ ) assert hasattr(mod, "qconfig"), "Input float module must have qconfig defined" assert mod.qconfig, "Input float module must have a valid qconfig" assert isinstance( mod.weight, _nn.Parameter ), "CANNOT be prepared for palettization: weight is NOT learnable" qconfig = mod.qconfig qat = cls( mod.num_embeddings, mod.embedding_dim, mod.padding_idx, max_norm=mod.max_norm, norm_type=mod.norm_type, scale_grad_by_freq=mod.scale_grad_by_freq, sparse=mod.sparse, _weight=None, ) qat.qconfig = qconfig qat.weight_fake_quant = qconfig.weight() for name, param in mod.named_parameters(recurse=False): setattr(qat, name, param) return qat class MultiheadAttention(_nn.MultiheadAttention): _FLOAT_MODULE = _nn.MultiheadAttention def forward(self, query, key, value, key_padding_mask=None, need_weights=True, attn_mask=None): is_batched = query.dim() == 3 if self.batch_first and is_batched: # Ensure that that the "is" property is maintained if key is value: if query is key: query = key = value = query.transpose(1, 0) else: query, key = (x.transpose(1, 0) for x in (query, key)) value = key else: query, key, value = (x.transpose(1, 0) for x in (query, key, value)) if not self._qkv_same_embed_dim: attn_output, attn_output_weights = _F.multi_head_attention_forward( query, key, value, self.embed_dim, self.num_heads, self.in_proj_weight, self.in_proj_bias, self.bias_k, self.bias_v, self.add_zero_attn, self.dropout, self.out_proj.weight_fake_quant(self.out_proj.weight), self.out_proj.bias, training=self.training, key_padding_mask=key_padding_mask, need_weights=need_weights, attn_mask=attn_mask, use_separate_proj_weight=True, q_proj_weight=self.q_proj_weight_fake_quant(self.q_proj_weight), k_proj_weight=self.k_proj_weight_fake_quant(self.k_proj_weight), v_proj_weight=self.v_proj_weight_fake_quant(self.v_proj_weight), ) else: attn_output, attn_output_weights = _F.multi_head_attention_forward( query, key, value, self.embed_dim, self.num_heads, self.in_proj_weight_fake_quant(self.in_proj_weight), self.in_proj_bias, self.bias_k, self.bias_v, self.add_zero_attn, self.dropout, self.out_proj.weight_fake_quant(self.out_proj.weight), self.out_proj.bias, training=self.training, key_padding_mask=key_padding_mask, need_weights=need_weights, attn_mask=attn_mask, ) if self.batch_first 
and is_batched: return attn_output.transpose(1, 0), attn_output_weights else: return attn_output, attn_output_weights @classmethod def from_float(cls, mod): r"""Create a palettization module from a float module or qparams_dict Args: `mod` a float module, either produced by torch.quantization utilities or directly from user """ assert type(mod) == cls._FLOAT_MODULE, ( "qat." + cls.__name__ + ".from_float only works for " + cls._FLOAT_MODULE.__name__ ) assert hasattr(mod, "qconfig"), "Input float module must have qconfig defined" assert mod.qconfig, "Input float module must have a valid qconfig" qconfig = mod.qconfig qat = cls( mod.embed_dim, mod.num_heads, mod.dropout, batch_first=mod.batch_first, bias=hasattr(mod, "in_proj_bias"), add_bias_kv=mod.bias_k is not None, add_zero_attn=mod.add_zero_attn, kdim=mod.kdim, vdim=mod.vdim, ) qat.qconfig = qconfig if not qat._qkv_same_embed_dim: qat.q_proj_weight_fake_quant = qconfig.weight() qat.k_proj_weight_fake_quant = qconfig.weight() qat.v_proj_weight_fake_quant = qconfig.weight() else: qat.in_proj_weight_fake_quant = qconfig.weight() qat.out_proj.weight_fake_quant = qconfig.weight() for name, param in mod.named_parameters(recurse=False): setattr(qat, name, param) for name, param in mod.out_proj.named_parameters(recurse=False): setattr(qat.out_proj, name, param) return qat def get_palettizable_parameters(module): """ Return a list of parameters of the module which can be palettized """ if isinstance(module, _nn.MultiheadAttention): if not module._qkv_same_embed_dim: return [ (module.out_proj.weight, "out_proj.weight"), (module.q_proj_weight, "q_proj_weight"), (module.k_proj_weight, "k_proj_weight"), (module.v_proj_weight, "v_proj_weight"), ] else: return [ (module.in_proj_weight, "in_proj_weight"), (module.out_proj.weight, "out_proj.weight"), ] return [(module.weight, "weight")] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/palettization/_utils.py0000644000000000000000000000403414672066616024561 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import Tuple as _Tuple import torch as _torch import torch.distributed as _dist def vectorize(current_tensor, cluster_dim) -> _Tuple[_torch.Tensor, _torch.Tensor]: """ Function to vectorize a tensor till the point where its numel is divisible by cluster_dim. The remaining parameters are returned as a pad. """ num_misalignment = _torch.numel(current_tensor) % cluster_dim if cluster_dim > 1: current_tensor = current_tensor.transpose(0, -1) pad = None if num_misalignment: current_tensor = current_tensor.flatten() pad = current_tensor[-num_misalignment:] current_tensor = current_tensor[:-num_misalignment] return current_tensor.reshape(-1, cluster_dim), pad def devectorize(current_tensor, pad, target_size, cluster_dim) -> _torch.Tensor: """ Function to devectorize by tracing back the vectorize operation in the method above. """ if pad is not None: current_tensor = _torch.cat([current_tensor.flatten(), pad]) if cluster_dim > 1: current_tensor = current_tensor.reshape(_torch.Size(tuple(target_size)[::-1])) current_tensor = current_tensor.transpose(0, -1) return current_tensor return current_tensor.reshape(target_size) def get_shard_list(length) -> list: """ Function to generate shard_list for different partitions. 
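For example, with a distributed world size of 4, ``get_shard_list(10)`` returns ``[0, 2, 4, 6, 10]``: rank ``i`` owns the slice ``[shard_list[i], shard_list[i + 1])``, so the last rank absorbs the remainder. When no process group is initialized, the world size is 1 and the result is ``[0, length]``.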
""" distributed_world_size = ( _dist.get_world_size() if _dist.is_available() and _dist.is_initialized() else 1 ) shard_size = max(1, length // distributed_world_size) shard_list = list(range(0, length, shard_size)) if len(shard_list) > distributed_world_size: shard_list = shard_list[:distributed_world_size] + [length] else: shard_list += [length] * (distributed_world_size + 1 - len(shard_list)) return shard_list ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/palettization/fake_palettize.py0000644000000000000000000007665714672066616026275 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import contextlib import logging as _logging from distutils.version import StrictVersion as _StrictVersion from typing import Optional as _Optional from typing import Tuple as _Tuple from typing import Union as _Union import torch as _torch import torch.distributed as _dist import torch.nn.functional as _F from torch.ao.quantization.observer import ObserverBase as _ObserverBase from torch.quantization import FakeQuantize as _FakeQuantize from coremltools.optimize.torch._utils.torch_utils import get_torch_version as _get_torch_version from ._efficient_kmeans import _EfficientKMeans from ._fake_palettizer_tensor_hook import _FakePalettizerTensorHook from ._partitioner import _Partitioner from ._utils import devectorize as _devectorize from ._utils import get_shard_list as _get_shard_list from ._utils import vectorize as _vectorize from .palettization_config import DEFAULT_PALETTIZATION_ADVANCED_OPTIONS # This is the maximum torch version currently supported for supporting the # FakePalettizerTensorHook as the backward graph tracing that the pack/unpack method # does accepts certain names for functions which have been changed after this # torch version MAX_TORCH_VERSION_FOR_PALETT_MAX_MEM = "2.0.1" _logger = _logging.getLogger(__name__) class FakePalettize(_FakeQuantize, _Partitioner): """ A class that implements palettization algorithm described in `DKM: Differentiable K-Means Clustering Layer for Neural Network Compression `_. It clusters the weights using a differentiable version of ``k-means``, allowing the look-up-table (LUT) and indices of palettized weights to be learnt using a gradient-based optimization algorithm such as SGD. Extends :py:class:`torch.quantization.FakeQuantize` to add support for palettization. Example: .. code-block:: python from collections import OrderedDict import torch import torch.nn as nn import coremltools.optimize.torch.palettization as palett model = nn.Sequential( OrderedDict( [ ("linear1", nn.Linear(4, 5)), ("sigmoid1", nn.Sigmoid()), ("linear2", nn.Linear(5, 4)), ("sigmoid2", nn.Sigmoid), ] ) ) fq_activation = nn.Identity fq_weight = palett.FakePalettize.with_args( observer=torch.quantization.MovingAveragePerChannelMinMaxObserver.with_args( quant_min=-128, quant_max=127, dtype=torch.qint8 ), n_bits=2, cluster_dim=1, module_parameter_shape=torch.Size([5, 4]), ) model.linear2.qconfig = torch.quantization.QConfig( activation=fq_activation, weight=fq_weight ) palettized_model = palett.prepare_palettizer(model) train_model(palettized_model) palettized_converted_model = palett.finalize(palettized_model) Args: observer (:obj:`torch.ao.quantization.observer.ObserverBase`): Observer for quantizing the ``LUT``. 
n_bits (:obj:`int`): Number of palettization bits. There would be :math:`2^{n\_bits}` unique weights in the ``LUT``. cluster_dim (:obj:`int`): Dimensionality of centroids to use for clustering. enable_per_channel_scale (:obj:`bool`): When set to ``True``, per channel scaling is used along the channel dimension. group_size (:obj:`int`): Each group of ``group_size`` number of channels are palettized using different look up tables. quant_min (:obj:`int`): The minimum allowable quantized value. quant_max (:obj:`int`): The maximum allowable quantized value. lut_dtype (:obj:`str`): String that decides whether to quantize the ``LUT`` or not. The following are the ``str`` LUT quantization combinations: (``u8``, ``uint8``), (``i8``, ``int8``), and (``f16``, ``float16``). advanced_options (:obj:`dict`): Advanced options to configure the palettization algorithm. observer_kwargs (optional): Arguments for the observer module. .. note:: Allowed keys for ``advanced_options`` are the parameters listed as ``optional`` in :py:class:`ModuleDKMPalettizerConfig`, besides the ones already covered by other parameters in this class. """ fake_palett_enabled: _torch.Tensor def __init__( self, observer: _ObserverBase, n_bits: int, cluster_dim: int, enable_per_channel_scale: bool = False, group_size: _Optional[int] = None, quant_min: int = -128, quant_max: int = 127, lut_dtype: str = "f32", advanced_options: dict = {}, **observer_kwargs, ): cluster_permute = advanced_options.get( "cluster_permute", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["cluster_permute"] ) palett_max_mem = advanced_options.get( "palett_max_mem", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_max_mem"] ) if palett_max_mem < 1: _CURRENT_TORCH_VERSION = _get_torch_version(_torch.__version__) if _CURRENT_TORCH_VERSION > _StrictVersion(MAX_TORCH_VERSION_FOR_PALETT_MAX_MEM): _logger.error( f"palett_max_mem<1 is only supported till a max torch version " f"of:{MAX_TORCH_VERSION_FOR_PALETT_MAX_MEM} " ) palett_shard = advanced_options.get( "palett_shard", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_shard"] ) palett_unique = advanced_options.get( "palett_unique", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_unique"] ) palett_min_tsize = advanced_options.get( "palett_min_tsize", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_min_tsize"], ) kmeans_max_iter = advanced_options.get( "kmeans_max_iter", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["kmeans_max_iter"] ) prune_threshold = advanced_options.get( "prune_threshold", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["prune_threshold"] ) kmeans_init = advanced_options.get( "kmeans_init", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["kmeans_init"] ) kmeans_opt1d_threshold = advanced_options.get( "kmeans_opt1d_threshold", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["kmeans_opt1d_threshold"], ) enforce_zero = advanced_options.get( "enforce_zero", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["enforce_zero"] ) palett_mode = advanced_options.get( "palett_mode", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_mode"] ) palett_cluster_tol = advanced_options.get( "palett_cluster_tol", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_cluster_tol"], ) palett_tau = advanced_options.get( "palett_tau", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_tau"] ) palett_epsilon = advanced_options.get( "palett_epsilon", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_epsilon"] ) palett_lambda = advanced_options.get( "palett_lambda", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_lambda"] ) add_extra_centroid = advanced_options.get( "add_extra_centroid", 
DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["add_extra_centroid"], ) per_channel_scaling_factor_scheme = advanced_options.get( "per_channel_scaling_factor_scheme", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["per_channel_scaling_factor_scheme"], ) percentage_palett_enable = advanced_options.get( "percentage_palett_enable", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["percentage_palett_enable"], ) kmeans_batch_threshold = advanced_options.get( "kmeans_batch_threshold", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["kmeans_batch_threshold"], ) kmeans_n_init = advanced_options.get( "kmeans_n_init", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["kmeans_n_init"] ) zero_threshold = advanced_options.get( "zero_threshold", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["zero_threshold"] ) palett_batch_mode = advanced_options.get( "palett_batch_mode", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_batch_mode"], ) palett_dist = advanced_options.get( "palett_dist", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_dist"], ) kmeans_error_bnd = advanced_options.get( "kmeans_error_bnd", DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["kmeans_error_bnd"], ) self._target_module_level_sparsity = 0.0 _FakeQuantize.__init__(self, observer, quant_min, quant_max, **observer_kwargs) _Partitioner.__init__( self, n_bits, enforce_zero, prune_threshold, cluster_dim, cluster_permute, group_size, palett_tau, kmeans_init, percentage_palett_enable, kmeans_opt1d_threshold, kmeans_batch_threshold, kmeans_n_init, kmeans_error_bnd, ) self.cluster_permute = cluster_permute self.enable_per_channel_scale = enable_per_channel_scale self.per_channel_scaling_factor_scheme = per_channel_scaling_factor_scheme self.per_channel_scaling_factor = None self.partitions = [] self.group_size = group_size self.lut_dtype = lut_dtype self.add_extra_centroid = add_extra_centroid self.need_to_quantize = self.lut_dtype in ["i8", "u8", "f16"] self.autograd_graph = hasattr(_torch.autograd, "graph") and palett_max_mem < 1.0 self.palett_max_mem = palett_max_mem self.palett_min_tsize = palett_min_tsize self.palett_unique = palett_unique self.palett_shard = palett_shard self.palett_dist = palett_dist and _dist.is_available() and _dist.is_initialized() self.zero_threshold = zero_threshold self.prune_threshold = prune_threshold self.palett_batch_mode = palett_batch_mode self.palett_cluster_tol = palett_cluster_tol self.kmeans_max_iter = kmeans_max_iter self.palett_mode = palett_mode self.palett_tau = palett_tau self.palett_epsilon = palett_epsilon self.palett_lambda = palett_lambda self.n_bits = n_bits self.cluster_dim = cluster_dim self.kmeans_init = kmeans_init self.register_buffer("fake_palett_enabled", _torch.tensor([0], dtype=_torch.uint8)) self.disable_fake_quant() self.disable_observer() def enable_fake_palett(self, enabled: bool = True) -> None: self.fake_palett_enabled[0] = 1 if enabled else 0 def disable_fake_palett(self): self.enable_fake_palett(False) def diff_palettize(self, X) -> _torch.Tensor: cX, pad = list( zip( *[ _vectorize(X[partition], self.cluster_dim) for i, partition in enumerate(self.partitions) ] ) ) if self.training: with _torch.no_grad(): if self.palett_tau > 0: new_centroid_list = [] new_cur_n_clusters = self.n_clusters for i, partition in enumerate(self.partitions): if not self.enable_partition[i]: continue cur_clusters, cur_inverse, cur_counts = _torch.unique( self.centroids[i].float(), dim=0, return_inverse=True, return_counts=True, ) cur_n_clusters = len(cur_clusters) new_cur_n_clusters = min(new_cur_n_clusters, cur_n_clusters) if cur_n_clusters < self.n_clusters * (1 
- self.palett_cluster_tol): for j, count in enumerate(cur_counts): if count > 1: new_centroid = 0.5 * ( cur_clusters[j] + cur_clusters[(j + 1) % cur_n_clusters] ) self.centroids[i][cur_inverse.tolist().index(j)] = new_centroid new_centroid_list.append(new_centroid) batch_partitions = [] seq_partitions = [] disabled_partitions = [] most_common_numel = None for i, numel in enumerate(self.partition_numel): if self.enable_partition[i]: if most_common_numel is None: most_common_numel = self.partition_numel[self.enable_partition].mode()[0] if numel == most_common_numel: batch_partitions.append(i) else: seq_partitions.append(i) elif isinstance(self.centroids[i], _torch.Tensor): disabled_partitions.append(i) if len(batch_partitions) == 1 or not self.palett_batch_mode: seq_partitions += batch_partitions batch_partitions = [] if batch_partitions: X, mean_inertia = self.diff_palettize_batch(X, cX, pad, batch_partitions) if seq_partitions: X, mean_inertia = self.diff_palettize_seq(X, cX, pad, seq_partitions) if disabled_partitions: X = self.palettize(X, cX, pad, disabled_partitions) else: X = self.palettize(X, cX, pad, partitions=range(len(self.partitions))) return X def diff_palettize_seq( self, X, cX, pad, partitions ) -> _Tuple[_torch.Tensor, _Union[_torch.Tensor, int]]: cur_inertia = [] for p in partitions: partition = self.partitions[p] centroids = self.centroids[p].clone() if _torch.is_grad_enabled(): assert not centroids.requires_grad cX_p = cX[p] cX_pt = cX_p.T last_inertia = None keep_sparsity = self.prune_threshold == 0 and self.enforce_zero[p] for j in range(self.kmeans_max_iter): x_c_dist = _EfficientKMeans.x_c_dist(cX_p, centroids) if keep_sparsity: # need to be keep pruning exact, no additional weight to be pruned by being assigned to the zero # centroid. 
the zero centroid is always centroids[0] if _torch.is_nonzero(centroids[0]): centroids[0] = _torch.zeros_like(centroids[0]).unsqueeze(0) cX_nonzero_indices = cX_p.nonzero(as_tuple=True)[0] x_c_dist[cX_nonzero_indices, :1] = 1 / self.zero_threshold if self.prune_threshold > 0: x_c_dist[:, :1] -= self.prune_threshold if "dkm" in self.palett_mode: attention = _F.softmax(-x_c_dist / self.palett_tau, dim=-1).clamp( min=self.zero_threshold ) elif "topk" in self.palett_mode: values, indices = _torch.topk(x_c_dist, k=2, dim=-1, largest=False) attention_topk = _F.softmax(-values / self.palett_tau, dim=-1) attention = _torch.zeros_like(x_c_dist) attention[:, indices] = attention_topk elif "hard" in self.palett_mode: col_idx = x_c_dist.min(dim=-1).indices row_idx = _torch.arange(start=0, end=len(col_idx), dtype=_torch.int32).to( cX_p.device ) attention = _torch.sparse_coo_tensor( _torch.vstack([row_idx, col_idx]), _torch.ones_like(row_idx).to(cX_p.device), x_c_dist.size(), dtype=x_c_dist.dtype, requires_grad=True, ).to_dense() elif "gsm" in self.palett_mode: attention = _F.gumbel_softmax(-x_c_dist / self.palett_tau, dim=-1) else: raise ValueError(f"palett_mode: {self.palett_mode} currently not supported.") # attention_sum can overflow with fp16 attention_sum = attention.sum(dim=0).view(-1, 1) assert not (attention_sum == 0).any() # matmul can overflow with fp16 centroids = _torch.matmul(cX_pt, attention).T / attention_sum if self.need_to_quantize: centroids = super().forward(centroids) if _torch.is_grad_enabled(): assert centroids.requires_grad if self.enforce_zero[p]: # fix zero zero_point = _torch.zeros_like(centroids[0]).unsqueeze(0) centroids[0] = zero_point min_error, _ = x_c_dist.min(dim=-1) cur_inertia.append(min_error.sum()) if last_inertia and abs(last_inertia - cur_inertia[-1]) <= self.palett_epsilon: break last_inertia = cur_inertia[-1] X[partition] = _devectorize( _torch.matmul(attention, centroids), pad[p], X[partition].size(), self.cluster_dim, ).to(X.dtype) self.labels[p] = None self.centroids[p] = centroids.detach().to(X.dtype) self.cum_inertia[p] += float(cur_inertia[-1].detach()) return X, (_torch.stack(cur_inertia).mean() if cur_inertia else -1) def diff_palettize_batch(self, X, cX, pad, partitions) -> _Tuple[_torch.Tensor, _torch.Tensor]: num_partitions = len(partitions) centroids = _torch.stack([self.centroids[i] for i in partitions]).clone() cX = _torch.stack([cX[i] for i in partitions]) cXt = cX.mT last_inertia = None for j in range(self.kmeans_max_iter): if self.palett_dist: x_c_dist = dist_batch_cdist_square.apply(cX, centroids) else: x_c_dist = _EfficientKMeans.x_c_dist(cX, centroids) attention = _F.softmax(-x_c_dist / self.palett_tau, -1).clamp(min=self.zero_threshold) # attention_sum can overflow with fp16 if _torch.is_grad_enabled(): assert attention.requires_grad attention_sum = attention.sum(dim=1).view(num_partitions, -1, 1) centroids = _torch.matmul(cXt, attention).mT / attention_sum if self.need_to_quantize: centroids = super().forward(centroids) if _torch.is_grad_enabled(): assert centroids.requires_grad if self.enforce_zero[0]: zero_point = _torch.zeros_like(centroids[0][0]).unsqueeze(0) for k in range(centroids.size(0)): centroids[k][0] = zero_point if self.kmeans_max_iter <= 1 and self.percentage_palett_enable >= 1: cur_inertia = _torch.zeros([num_partitions], device=X.device, dtype=X.dtype) break else: min_error, _ = x_c_dist.min(dim=-1) cur_inertia = min_error.sum(dim=1) avg_inertia = cur_inertia.mean() if last_inertia and abs(last_inertia - avg_inertia) <= 
self.palett_epsilon: break last_inertia = avg_inertia tX = _torch.matmul(attention, centroids) for i, p in enumerate(partitions): partition = self.partitions[p] X[partition] = _devectorize(tX[i], pad[p], X[partition].size(), self.cluster_dim).to( X.dtype ) self.labels[p] = None self.centroids[p] = centroids[i].detach().to(X.dtype) self.cum_inertia[p] += float(cur_inertia[i].detach()) return X, cur_inertia def palettize(self, X, cX, pad, partitions) -> _torch.Tensor: """ This method is run during inference time by the forward method of the ``FakePalettize`` class. It calculates the weight from the ``LUT`` and ``indices`` across all partitions and returns them. """ batch_partitions = [] seq_partitions = [] most_common_numel = self.partition_numel[partitions].mode()[0] for p in partitions: if self.partition_numel[p] == most_common_numel and self.labels[p] is None: batch_partitions.append(p) else: seq_partitions.append(p) if len(batch_partitions) == 1 or not self.palett_batch_mode: seq_partitions += batch_partitions batch_partitions = [] if seq_partitions: X = self.palettize_seq(X, cX, pad, seq_partitions) if batch_partitions: X = self.palettize_batch(X, cX, pad, batch_partitions) return X def palettize_seq(self, X, cX, pad, partitions) -> _torch.Tensor: for p in partitions: partition = self.partitions[p] labels = self.labels[p] centroids = self.centroids[p] if labels is None: cX_p = cX[p] x_c_dist = _EfficientKMeans.x_c_dist(cX_p, centroids) if self.prune_threshold > 0: x_c_dist[:, :1] -= self.prune_threshold min_error, labels = x_c_dist.min(dim=-1) self.labels[p] = labels.to(_torch.int).cpu() if X is not None: X[partition] = _devectorize( centroids[self.labels[p]], pad[p], X[partition].size(), self.cluster_dim, ).to(X.dtype) return X def palettize_batch(self, X, cX, pad, partitions) -> _torch.Tensor: # intentionally use cat instead of stack to make the backward graph distinguishable from diff_palettize_batch cX = _torch.cat([cX[i] for i in partitions]).view(len(partitions), -1, self.cluster_dim) centroids = _torch.stack([self.centroids[i] for i in partitions]) x_c_dist = _EfficientKMeans.x_c_dist(cX, centroids) if self.prune_threshold > 0: x_c_dist[:, :, :1] -= self.prune_threshold min_error, labels = x_c_dist.min(dim=-1) for i, p in enumerate(partitions): partition = self.partitions[p] centroids = self.centroids[p] self.labels[p] = labels[i].to(_torch.int).cpu() X[partition] = _devectorize( centroids[self.labels[p]], pad[p], X[partition].size(), self.cluster_dim ).to(X.dtype) return X def forward(self, weights: _torch.Tensor) -> _torch.Tensor: if self.cluster_permute and len(self.cluster_permute) == len(weights.size()): weights = weights.permute(self.cluster_permute) if self.enable_per_channel_scale: if not isinstance(self.per_channel_scaling_factor, _torch.Tensor): self.per_channel_scaling_factor = _torch.zeros((weights.flatten(1).shape[0], 1)) with _torch.no_grad(): if not self.per_channel_scaling_factor[0][0]: permuted_weights_proj = weights.flatten(1) if self.per_channel_scaling_factor_scheme == "min_max": self.per_channel_scaling_factor = 0.5 * ( permuted_weights_proj.max(1)[0].view(-1, 1) - permuted_weights_proj.min(1)[0].view(-1, 1) ) elif self.per_channel_scaling_factor_scheme == "abs": self.per_channel_scaling_factor = ( permuted_weights_proj.abs().max(1)[0].view(-1, 1) ) else: raise ValueError( f"Unsupported per_channel_scaling_factor_scheme:{self.per_channel_scaling_factor_scheme}" ) weights = (weights.flatten(1) / self.per_channel_scaling_factor).view( weights.size() ) # scale 
the weights using projection factors if self.fake_palett_enabled[0] == 1: if not self.partitions: self.create_partitions(weights.detach()) tensor_hook = None if self.training and self.palett_max_mem < 1.0: tensor_hook = _FakePalettizerTensorHook( zero_threshold=self.zero_threshold, device=weights.device, min_size=self.palett_min_tsize, max_mem=self.palett_max_mem, use_unique=self.palett_unique and self.cluster_dim == 1 and weights.dtype in [_torch.bfloat16, _torch.float16], use_shard=self.palett_shard, ) with _torch.autograd.graph.saved_tensors_hooks( tensor_hook.pack, tensor_hook.unpack ) if tensor_hook else contextlib.nullcontext(): cloned_weights = weights.clone() self.init_partitions(cloned_weights.detach()) palettized_weights = self.diff_palettize(cloned_weights) else: palettized_weights = super().forward(weights) if self.enable_per_channel_scale: palettized_weights = ( palettized_weights.flatten(1) * self.per_channel_scaling_factor ).view(palettized_weights.size()) if self.cluster_permute: palettized_weights = palettized_weights.permute( _torch.argsort(_torch.Tensor(self.cluster_permute)).tolist() ) if self.lut_dtype == "f16": palettized_weights = palettized_weights.to(_torch.float16).to(weights.dtype) elif self.lut_dtype == "b16": palettized_weights = palettized_weights.to(_torch.bfloat16).to(weights.dtype) return palettized_weights def _load_from_state_dict( self, state_dict, prefix, local_metadata, strict, missing_keys, unexpected_keys, error_msgs, ): self.lut_dtype = local_metadata["lut_dtype"] self.fake_palett_enabled = _torch.empty( state_dict[prefix + "fake_palett_enabled"].size(), device=self.centroids.device, ) _Partitioner._load_from_state_dict_( self, state_dict, prefix + "palett.", local_metadata, strict, missing_keys, unexpected_keys, error_msgs, ) if self.need_to_quantize: # We will go through FakeQuantize._load_from_state_dict and then nn.Module._load_from_state_dict super()._load_from_state_dict( state_dict, prefix, local_metadata, strict, missing_keys, unexpected_keys, error_msgs, ) else: # Jump FakeQuantize and go to nn.Module directly super(_FakeQuantize, self)._load_from_state_dict( state_dict, prefix, local_metadata, strict, missing_keys, unexpected_keys, error_msgs, ) def _save_to_state_dict(self, destination, prefix, keep_vars): if self.need_to_quantize: # Use normal inheritance, go through FakeQuantize._save_to_state_dict super()._save_to_state_dict(destination, prefix, keep_vars) self.centroids = super().forward(self.centroids) else: # Skip FakeQuantize._save_to_state_dict and go directly to nn.Module._save_to_state_dict super(_FakeQuantize, self)._save_to_state_dict(destination, prefix, keep_vars) # State dicts can only contain tensors (for DDP), so store infos in the metatadata dict (in particular str) destination._metadata[prefix[:-1]]["lut_dtype"] = self.lut_dtype destination[prefix + "per_channel_scaling_factor"] = self.per_channel_scaling_factor _Partitioner._save_to_state_dict_(self, destination, prefix + "palett.", keep_vars) def __repr__(self): rep = super().__repr__() rep += f"lut_dtype: {self.lut_dtype}, " rep += f"n_bits: {self.n_bits}, " rep += f"cluster_dim: {self.cluster_dim}, " rep += f"palett_tau: {self.palett_tau}, " rep += f"palett_mode: {self.palett_mode}" return rep class dist_batch_cdist_square(_torch.autograd.Function): def forward_2d(X, C): _C = C.reshape(-1) _X = X.repeat(1, C.size(0)) _T = _X - _C _T = _T.square() T = _T.view(X.size(0), C.size(0), C.size(1)).sum(dim=-1) return T def forward_3d(X, C): T = [None] * X.size(0) for i in 
range(X.size(0)): T[i] = dist_batch_cdist_square.forward_2d(X[i], C[i]) return _torch.stack(T) def backward_2d(X, C, grad_output): _C = C.reshape(-1) _X = X.repeat(1, C.size(0)) _T = _X - _C _T = _T.view(-1, C.size(0), C.size(1)) _T = _T * grad_output.unsqueeze(-1).expand( grad_output.size(0), grad_output.size(1), C.size(1) ) grad_X = _T.sum(dim=1) grad_C = _T.sum(dim=0) return 2 * grad_X, -2 * grad_C def backward_3d(X, C, grad_output): grad_X = [None] * X.size(0) grad_C = [None] * X.size(0) for i in range(X.size(0)): grad_X[i], grad_C[i] = dist_batch_cdist_square.backward_2d(X[i], C[i], grad_output[i]) return _torch.stack(grad_X), _torch.stack(grad_C) @staticmethod def forward(ctx, X, C): shard_list = _get_shard_list(X.size(0)) T = [None] * _dist.world_size for i in range(_dist.world_size): cur_X = X[shard_list[i] : shard_list[i + 1]] cur_C = C[shard_list[i] : shard_list[i + 1]] if i == _dist.rank: T[i] = _torch.cdist(cur_X, cur_C).square() else: T[i] = _torch.zeros( [cur_X.size(0), cur_X.size(1), cur_C.size(1)], device=X.device, dtype=X.dtype, ) _dist.all_gather(T, T[_dist.rank]) T = _torch.cat(T) M = _torch.Tensor([]) ctx.save_for_backward(X, C, M) return T @staticmethod def backward(ctx, grad_output): X, C, _ = ctx.saved_tensors # gradient is data-dependent, so it CANNOT be sharded if X.dim() == 3: grad_X, grad_C = dist_batch_cdist_square.backward_3d(X, C, grad_output) else: grad_X, grad_C = dist_batch_cdist_square.backward_2d(X, C, grad_output) return grad_X, grad_C ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/palettization/palettization_config.py0000644000000000000000000006260014672066616027501 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from collections import OrderedDict as _OrderedDict from typing import Any as _Any from typing import Callable as _Callable from typing import Dict as _Dict from typing import List as _List from typing import NewType as _NewType from typing import Optional as _Optional from typing import Union as _Union import cattrs as _cattrs import torch as _torch import torch.nn as _nn from attr import define as _define from attr import field as _field from attrs import validators as _validators from coremltools.optimize.torch._utils.torch_utils import ( maybe_convert_str_to_dtype as _maybe_convert_str_to_dtype, ) from coremltools.optimize.torch.optimization_config import ( ModuleOptimizationConfig as _ModuleOptimizationConfig, ) from coremltools.optimize.torch.optimization_config import OptimizationConfig as _OptimizationConfig from coremltools.optimize.torch.optimization_config import ( PalettizationGranularity, _deprecated_field, _validate_module_type_keys_factory, ) # Default advanced options for palettization DEFAULT_PALETTIZATION_ADVANCED_OPTIONS = { "cluster_permute": None, "palett_max_mem": 1.0, "kmeans_max_iter": 3, "prune_threshold": 1e-7, "kmeans_init": "auto", "kmeans_opt1d_threshold": 1024, "enforce_zero": False, "palett_mode": "dkm", "palett_tau": 0.0001, "palett_epsilon": 0.0001, "palett_lambda": 0.0, "add_extra_centroid": False, "palett_cluster_tol": 0.0, "palett_min_tsize": 64 * 1024, "palett_unique": False, "palett_shard": False, "palett_batch_mode": False, "palett_dist": False, "per_channel_scaling_factor_scheme": "min_max", "percentage_palett_enable": 1.0, "kmeans_batch_threshold": 4, "kmeans_n_init": 100, "zero_threshold": 1e-7, "kmeans_error_bnd": 0.0, "channel_axis": 0, } DEFAULT_PALETTIZATION_OPTIONS = { "quant_min": -128, "quant_max": 127, "dtype": _torch.qint8, "lut_dtype": "f32", "weight_threshold": 2048, "milestone": 0, "quantize_activations": False, "enable_per_channel_scale": False, "granularity": "per_tensor", "group_size": None, } _default_palettization_scheme = { **DEFAULT_PALETTIZATION_OPTIONS, **DEFAULT_PALETTIZATION_ADVANCED_OPTIONS, } # Default scheme for palettization DEFAULT_PALETTIZATION_SCHEME = { _nn.Linear: {"n_bits": 4, "cluster_dim": 1, **_default_palettization_scheme}, _nn.Conv1d: {"n_bits": 2, "cluster_dim": 1, **_default_palettization_scheme}, _nn.Conv2d: {"n_bits": 2, "cluster_dim": 1, **_default_palettization_scheme}, _nn.Conv3d: {"n_bits": 2, "cluster_dim": 1, **_default_palettization_scheme}, _nn.LayerNorm: {"n_bits": 2, "cluster_dim": 1, **_default_palettization_scheme}, _nn.MultiheadAttention: {"n_bits": 2, "cluster_dim": 1, **_default_palettization_scheme}, _nn.Embedding: {"n_bits": 2, "cluster_dim": 1, **_default_palettization_scheme}, } # Pytorch modules from torch.ao.quantization.quantization_mappings.DEFAULT_QAT_MODULE_MAPPINGS that are supported # for palettization SUPPORTED_PYTORCH_QAT_MODULES = (_nn.Linear, _nn.Conv2d, _nn.Conv3d) @_define class ModuleDKMPalettizerConfig(_ModuleOptimizationConfig): r""" Configuration class for specifying global and module-level options for the palettization algorithm implemented in :py:class:`DKMPalettizer`. The parameters specified in this config control the DKM algorithm, described in `DKM: Differentiable K-Means Clustering Layer for Neural Network Compression `_. 
For most use cases, the only parameters you need to specify are ``n_bits``, ``weight_threshold``, and ``milestone``. .. note:: Most of the parameters in this class are meant for advanced use cases and for further fine-tuning the DKM algorithm. The default values usually work for a majority of tasks. .. note:: Change the following parameters only when you use activation quantization in conjunction with DKM weight palettization: ``quant_min``, ``quant_max``, ``dtype``, and ``quantize_activations``. Args: n_bits (:obj:`int`): Number of clusters. The number of clusters used is :math:`2^{n\_bits}`. Defaults to ``4`` for linear layers and ``2`` for all other layers. weight_threshold (:obj:`int`): A module is only palettized if the number of elements in its weight matrix exceeds ``weight_threshold``. If there are multiple weights in a module, such as :py:class:`torch.nn.MultiheadAttention`, all of them must have more elements than the ``weight_threshold`` for the module to be palettized. Defaults to ``2048``. granularity (:py:class:`PalettizationGranularity`): Granularity for palettization. One of ``per_tensor`` or ``per_grouped_channel``. Defaults to ``per_tensor``. group_size (:obj:`int`): Specify the number of channels in a group. Only effective when granularity is ``per_grouped_channel``. channel_axis (:obj:`int`): Specify the channel axis to form a group of channels. Only effective when granularity is ``per_grouped_channel``. Defaults to output channel axis. For now, only output channel axis is supported by DKM. enable_per_channel_scale (:obj:`bool`): When set to ``True``, per-channel scaling is used along the channel dimension. milestone (:obj:`int`): Step or epoch at which palettization begins. Defaults to ``0``. cluster_dim (:obj:`int`, ``optional``): The dimension of each cluster. quant_min (:obj:`int`, ``optional``): The minimum value for each element in the weight clusters if they are quantized. Defaults to ``-128``. quant_max (:obj:`int`, ``optional``): The maximum value for each element in the weight clusters if they are quantized. Defaults to ``127`` dtype (:py:class:`torch.dtype`, ``optional``): The ``dtype`` to use for quantizing the activations. Only applies when ``quantize_activations`` is ``True``. Defaults to :py:class:`torch.qint8`. lut_dtype (:obj:`str`, ``optional``): ``dtype`` to use for quantizing the clusters. Allowed options are ``'i8'``, ``'u8'``, ``'f16'``, ``'bf16'``, ``'f32'``. Defaults to ``'f32'``, so by default, the clusters aren't quantized. quantize_activations (:obj:`bool`, ``optional``): When ``True``, the activations are quantized. Defaults to ``False``. cluster_permute (:obj:`tuple`, ``optional``): Permutation order to apply to weight partitions. Defaults to ``None``. palett_max_mem (:obj:`float`, ``optional``): Proportion of available GPU memory that should be used for palettization. Defaults to ``1.0``. kmeans_max_iter (:obj:`int`, ``optional``): Maximum number of differentiable ``k-means`` iterations. Defaults to ``3``. prune_threshold (:obj:`float`, ``optional``): Hardshrinks weights between [``-prune_threshold``, ``prune_threshold``] to zero. Useful for joint pruning and palettization. Defaults to ``1e-7``. kmeans_init (:obj:`str`, ``optional``): ``k-means`` algorithm to use. Allowed options are ``opt1d``, ``cpu.kmeans++`` and ``kmeans++``. Defaults to ``auto``. kmeans_opt1d_threshold (:obj:`int`, ``optional``): Channel threshold to decide if ``opt1d kmeans`` should be used. Defaults to ``1024``. 
enforce_zero (:obj:`bool`, ``optional``): If ``True``, enforces the LUT centroid which is closest to the origin to be fixed to zero. Defaults to ``False``. palett_mode (:obj:`str`, ``optional``): Criteria to calculate attention during ``k-means``. Allowed options are ``gsm``, ``dkm`` and ``hard``. Defaults to ``dkm``. palett_tau (:obj:`float`, ``optional``): Temperature factor for softmax used in DKM algorithm. Defaults to ``0.0001``. palett_epsilon (:obj:`float`, ``optional``): Distance threshold for clusters between ``k-means`` iterations. Defaults to ``0.0001``. palett_lambda (:obj:`float`, ``optional``): Reduces effects of outliers during centroid calculation. Defaults to ``0.0``. add_extra_centroid (:obj:`bool`, ``optional``): If ``True``, adds an extra centroid to the LUT. Defaults to ``False``. palett_cluster_tol (:obj:`float`, ``optional``): Tolerance for non-unique centroids in the LUT. The higher the number, the more tolerance for non-unique centroids. Defaults to ``0.0``. palett_min_tsize (:obj:`int`, ``optional``): Weight threshold beyond which to use custom packing and unpacking hook for autograd. Defaults to ``64*1024``. palett_unique (:obj:`bool`, ``optional``): If ``True``, reduces the attention map by leveraging the fact that FP16 only has ``2^16`` unique values. Useful for Large Models like LLMs where attention maps can be huge. Defaults to ``False``. For more details, read `eDKM: An Efficient and Accurate Train-time Weight Clustering for Large Language Models `_ . palett_shard (:obj:`bool`, ``optional``): If ``True``, the index list is sharded across GPUs. Defaults to ``False``. For more details, read `eDKM: An Efficient and Accurate Train-time Weight Clustering for Large Language Models `_ . palett_batch_mode (:obj:`bool`, ``optional``): If ``True``, performs batch DKM across different partitions created for different blocks. Defaults to ``False``. More details can be found `eDKM: An Efficient and Accurate Train-time Weight Clustering for Large Language Models `_ . palett_dist (:obj:`bool`, ``optional``): If ``True``, performs distributed distance calculation in batch_mode if distributed torch is available. Defaults to ``False``. per_channel_scaling_factor_scheme (:obj:`str`, ``optional``): Criteria to calculate the ``per_channel_scaling_factor``. Allowed options are ``min_max`` and ``abs``. Defaults to ``min_max``. percentage_palett_enable (:obj:`float`, ``optional``): Percentage partitions to enable for DKM. Defaults to ``1.0``. kmeans_batch_threshold (:obj:`int`, ``optional``): Threshold to decide what the ``num_partitions`` value should be to go through with the sharded centroids list. ``num_partitions`` is calculated by dividing the channel size by the ``group_size`` provided. If ``num_partitions``` matches ``kmeans_batch_threshold``, the algorithm resorts to performing distributed k-means for lower partition numbers, given that ``num_partition`` number of GPUs are available. Defaults to ``4``. kmeans_n_init (:obj:`int`, ``optional``): Number of time the k-means algorithm will be run with different centroid seeds. The final results will be the best output of ``kmeans_n_init`` consecutive runs in terms of inertia. zero_threshold (:obj:`int`, ``optional``): Zero threshold to be used to decide the minimum value of clamp for softmax. Defaults to ``1e-7``. kmeans_error_bnd (:obj:`float`, ``optional``): Distance threshold to decide at what distance between parameters and clusters to stop the ``k-means`` operation. Defaults to ``0.0``. 
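A minimal, illustrative configuration using only the commonly set options (the values below are examples, not defaults) could look like:

.. code-block:: python

    from coremltools.optimize.torch.palettization import (
        DKMPalettizerConfig,
        ModuleDKMPalettizerConfig,
    )

    # 4-bit palettization for modules whose weights have more than 1024
    # elements, with palettization starting at step 100 of training.
    config = DKMPalettizerConfig(
        global_config=ModuleDKMPalettizerConfig(
            n_bits=4,
            weight_threshold=1024,
            milestone=100,
        )
    )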
This class supports two different configurations to structure the palettization: 1. **Per-tensor palettization**: This is the default configuration where the whole tensor shares a single lookup table. The ``granularity`` is set to ``per_tensor`` and ``group_size`` is ``None``. 2. **Per-grouped-channel palettization**: In this configuration, ``group_size`` number of channels along ``channel_axis`` share the same lookup table. For example, for a weight matrix of shape ``(16, 25)``, if we provide ``group_size = 8``, the shape of the lookup table would be ``(2, 2^n_bits)``. .. note:: Grouping is currently only supported along the output channel axis. """ n_bits: _Optional[int] = _field( default=None, validator=_validators.optional(_validators.instance_of(int)) ) weight_threshold: int = _field( default=DEFAULT_PALETTIZATION_OPTIONS["weight_threshold"], validator=_validators.instance_of(int), ) granularity: PalettizationGranularity = _field( default=DEFAULT_PALETTIZATION_OPTIONS["granularity"], converter=PalettizationGranularity, validator=_validators.in_(PalettizationGranularity), ) group_size: _Optional[int] = _field( default=DEFAULT_PALETTIZATION_OPTIONS["group_size"], validator=_validators.optional(_validators.instance_of(int)), ) channel_axis: int = _field( default=0, validator=_validators.optional([_validators.instance_of(int), _validators.in_([0])]), ) enable_per_channel_scale: bool = _field( default=DEFAULT_PALETTIZATION_OPTIONS["enable_per_channel_scale"], validator=_validators.instance_of(bool), ) milestone: int = _field( default=DEFAULT_PALETTIZATION_OPTIONS["milestone"], validator=_validators.instance_of(int), ) cluster_dim: _Optional[int] = _field( default=None, validator=_validators.optional(_validators.instance_of(int)) ) quant_min: int = _field( default=DEFAULT_PALETTIZATION_OPTIONS["quant_min"], validator=_validators.instance_of(int), ) quant_max: int = _field( default=DEFAULT_PALETTIZATION_OPTIONS["quant_max"], validator=_validators.instance_of(int), ) dtype: _torch.dtype = _field( default=DEFAULT_PALETTIZATION_OPTIONS["dtype"], converter=_maybe_convert_str_to_dtype, validator=[ _validators.instance_of(_torch.dtype), _validators.in_([_torch.qint8, _torch.quint8, _torch.float32]), ], ) lut_dtype: str = _field( default=DEFAULT_PALETTIZATION_OPTIONS["lut_dtype"], validator=_validators.instance_of(str), ) quantize_activations: bool = _field( default=DEFAULT_PALETTIZATION_OPTIONS["quantize_activations"], validator=_validators.instance_of(bool), ) cluster_permute: _Optional[tuple] = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["cluster_permute"], validator=_validators.optional(_validators.instance_of(tuple)), ) palett_max_mem: float = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_max_mem"], validator=_validators.instance_of(float), ) kmeans_max_iter: int = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["kmeans_max_iter"], validator=_validators.instance_of(int), ) prune_threshold: float = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["prune_threshold"], validator=_validators.instance_of(float), ) kmeans_init: str = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["kmeans_init"], validator=_validators.instance_of(str), ) kmeans_opt1d_threshold: int = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["kmeans_opt1d_threshold"], validator=_validators.instance_of(int), ) enforce_zero: bool = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["enforce_zero"], validator=_validators.instance_of(bool), ) palett_mode: str = _field( 
default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_mode"], validator=_validators.instance_of(str), ) palett_tau: float = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_tau"], validator=_validators.instance_of(float), ) palett_epsilon: float = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_epsilon"], validator=_validators.instance_of(float), ) palett_lambda: float = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_lambda"], validator=_validators.instance_of(float), ) add_extra_centroid: bool = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["add_extra_centroid"], validator=_validators.instance_of(bool), ) palett_cluster_tol: float = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_cluster_tol"], validator=_validators.instance_of(float), ) palett_min_tsize: int = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_min_tsize"], validator=_validators.instance_of(int), ) palett_unique: bool = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_unique"], validator=_validators.instance_of(bool), ) palett_shard: bool = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_shard"], validator=_validators.instance_of(bool), ) palett_batch_mode: bool = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_batch_mode"], validator=_validators.instance_of(bool), ) palett_dist: bool = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["palett_dist"], validator=_validators.instance_of(bool), ) per_channel_scaling_factor_scheme: str = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["per_channel_scaling_factor_scheme"], validator=_validators.and_( _validators.instance_of(str), _validators.in_(["min_max", "abs"]) ), ) percentage_palett_enable: float = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["percentage_palett_enable"], validator=_validators.instance_of(float), ) kmeans_batch_threshold: int = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["kmeans_batch_threshold"], validator=_validators.instance_of(int), ) kmeans_n_init: int = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["kmeans_n_init"], validator=_validators.instance_of(int), ) zero_threshold: float = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["zero_threshold"], validator=_validators.instance_of(float), ) kmeans_error_bnd: float = _field( default=DEFAULT_PALETTIZATION_ADVANCED_OPTIONS["kmeans_error_bnd"], validator=_validators.instance_of(float), ) partition_size: int = _deprecated_field( message=( "partition_size is being deprecated and will be removed in " "future versions. Please use group_size parameter instead." ) ) cluster_dtype: str = _deprecated_field( message=( "cluster_dtype is being deprecated and will be removed in " "future versions. Please use lut_dtype parameter instead." ) ) @group_size.validator def per_grouped_channel_granularity(self, attribute, value): if self.granularity == PalettizationGranularity.per_grouped_channel: assert ( value is not None ), "group_size has to be specified along with per_grouped_channel granularity." assert value > 0, "group_size should be greater than zero" else: assert value is None, "group_size can't be specified along with per_tensor granularity." 
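# Illustrative example: with ``granularity="per_grouped_channel"``, ``group_size``
# output channels share a single lookup table. For a weight of shape (16, 25) with
# ``group_size=8``, the 16 output channels form 16 // 8 = 2 groups, so the LUT has
# shape (2, 2**n_bits), as described in the class docstring above.
#
#   per_group_config = ModuleDKMPalettizerConfig(
#       n_bits=4,
#       granularity="per_grouped_channel",
#       group_size=8,
#   )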
_default_module_type_configs = _OrderedDict( { key: ModuleDKMPalettizerConfig.from_dict(val) for key, val in DEFAULT_PALETTIZATION_SCHEME.items() } ) _GlobalConfigType = _NewType( "GlobalConfigType", _Union[ _Optional[ModuleDKMPalettizerConfig], _List[_Optional[ModuleDKMPalettizerConfig]], ], ) _ModuleTypeConfigType = _NewType( "ModuleTypeConfigType", _Dict[_Union[_Callable, str], _GlobalConfigType] ) _ModuleNameConfigType = _NewType( "ModuleNameConfigType", _Dict[str, _Optional[ModuleDKMPalettizerConfig]] ) def _validate_dkm_config_type(instance, attribute, value): if value is not None: if isinstance(value, list): return _validators.deep_iterable( member_validator=_validators.optional( _validators.instance_of(ModuleDKMPalettizerConfig) ), iterable_validator=_validators.instance_of(list), )(instance, attribute, value) else: return _validators.optional(_validators.instance_of(ModuleDKMPalettizerConfig))( instance, attribute, value ) @_define class DKMPalettizerConfig(_OptimizationConfig): """ Configuration for specifying how different submodules of a model are palettized by :py:class:`DKMPalettizer`. The ``module_type_configs`` parameter can accept a list of :py:class:`ModuleDKMPalettizerConfig` as values for a given module type. The list can specify different parameters for different ``weight_threshold`` values. This is useful if you want to apply different configs to layers of the same type with weights of different sizes. For example, to use ``4`` -bit palettization for weights with more than ``1000`` elements and ``2`` -bit palettization for weights with more than ``300`` but less than ``1000`` elements, create a config as follows: .. code-block:: python custom_config = { nn.Conv2d: [ {"n_bits": 4, "cluster_dim": 4, "weight_threshold": 1000}, {"n_bits": 2, "cluster_dim": 2, "weight_threshold": 300}, ] } config = DKMPalettizerConfig.from_dict({"module_type_configs": custom_config}) Args: global_config (:py:class:`ModuleDKMPalettizerConfig`): Config to be applied globally to all supported modules. Missing values are chosen from the default config. module_type_configs (:obj:`dict` of :obj:`str` to :py:class:`ModuleDKMPalettizerConfig`): Module type level configs applied to a specific module class, such as :py:class:`torch.nn.Linear`. The keys can be either strings or module classes. When ``module_type_config`` is set to ``None`` for a module type, it is not palettized. module_name_configs (:obj:`dict` of :obj:`str` to :py:class:`ModuleDKMPalettizerConfig`): Module-level configs applied to specific modules. The name of the module must be a fully qualified name that can be used to fetch it from the top-level module using the ``module.get_submodule(target)`` method. When ``module_name_config`` is set to ``None`` for a module, it is not palettized. 
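Configs can also be scoped to individual modules via ``module_name_configs``; the module name ``"conv2"`` below is an illustrative placeholder:

.. code-block:: python

    config = DKMPalettizerConfig.from_dict(
        {
            "module_name_configs": {
                "conv2": {"n_bits": 2, "weight_threshold": 1000},
            }
        }
    )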
""" global_config: _GlobalConfigType = _field(default=None, validator=_validate_dkm_config_type) module_type_configs: _ModuleTypeConfigType = _field( factory=_OrderedDict, validator=_validators.deep_mapping( key_validator=_validators.and_( _validators.instance_of((str, _Callable)), _validate_module_type_keys_factory(list(DEFAULT_PALETTIZATION_SCHEME.keys())), ), value_validator=_validate_dkm_config_type, mapping_validator=_validators.instance_of(dict), ), ) module_name_configs: _ModuleNameConfigType = _field( factory=_OrderedDict, validator=_validators.deep_mapping( key_validator=_validators.instance_of(str), value_validator=_validate_dkm_config_type, mapping_validator=_validators.instance_of(dict), ), ) def __attrs_post_init__(self): if ( self.global_config is None and len(self.module_type_configs) == 0 and len(self.module_name_configs) == 0 ): self.module_type_configs = _default_module_type_configs self._sort_configs_by_weight_threshold(self.global_config) for ctype, config in self.module_type_configs.items(): self.set_module_type(ctype, self._sort_configs_by_weight_threshold(config)) for name, config in self.module_name_configs.items(): self.set_module_name(name, self._sort_configs_by_weight_threshold(config)) @classmethod def from_dict(cls, config_dict: _Dict[str, _Any]) -> "DKMPalettizerConfig": super().from_dict(config_dict) converter = _cattrs.Converter(forbid_extra_keys=True) converter.register_structure_hook(_ModuleTypeConfigType, _structure_from_dict_hook) converter.register_structure_hook(_ModuleNameConfigType, _structure_from_dict_hook) converter.register_structure_hook(_GlobalConfigType, _structure_dkm_config_hook) return converter.structure_attrs_fromdict(config_dict, cls) @staticmethod def _sort_configs_by_weight_threshold(config: _GlobalConfigType): if isinstance(config, list): return sorted(config, key=lambda x: x.weight_threshold) return config def _structure_dkm_config_hook( config_dict: _Union[_List[_Dict[str, _Any]], _Dict[str, _Any]], type: _Any ): if isinstance(config_dict, list): return [ModuleDKMPalettizerConfig.from_dict(cd) for cd in config_dict] return ModuleDKMPalettizerConfig.from_dict(config_dict) def _structure_from_dict_hook(module_type_dict: _Dict[_Union[_Callable, str], _Any], type: _Any): return_dict = _OrderedDict() for key, value in module_type_dict.items(): if value is None: return_dict[key] = None else: return_dict[key] = _structure_dkm_config_hook(value, type) return return_dict ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/palettization/palettizer.py0000644000000000000000000004122514672066616025450 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import logging as _logging from typing import Dict as _Dict from typing import Optional as _Optional import torch as _torch import torch.nn as _nn from torch.ao.quantization import FakeQuantize as _FakeQuantize from coremltools.optimize.torch._typing import ParamsDict as _ParamsDict from coremltools.optimize.torch._utils.math_utils import rmse_error as _rmse_error from coremltools.optimize.torch._utils.metadata_utils import ( register_metadata_version as _register_metadata_version, ) from coremltools.optimize.torch._utils.torch_utils import get_eval_model as _get_eval_model from coremltools.optimize.torch._utils.validation_utils import ( validate_param_config as _validate_param_config, ) from coremltools.optimize.torch.base_model_optimizer import ( BaseTrainingTimeModelOptimizer as _BaseTrainingTimeModelOptimizer, ) from coremltools.optimize.torch.base_model_optimizer import _Report from coremltools.optimize.torch.palettization._custom_conversion import ( PALETTIZATION_CONVERT_DICT as _PALETTIZATION_CONVERT_DICT, ) from coremltools.optimize.torch.palettization._supported_modules import ( _get_palettization_qat_mappings, ) from coremltools.optimize.torch.palettization._supported_modules import ( get_palettizable_parameters as _get_palettizable_parameters, ) from coremltools.optimize.torch.palettization.fake_palettize import FakePalettize as _FakePalettize from coremltools.optimize.torch.palettization.palettization_config import ( DEFAULT_PALETTIZATION_ADVANCED_OPTIONS as _DEFAULT_PALETTIZATION_ADVANCED_OPTIONS, ) from coremltools.optimize.torch.palettization.palettization_config import ( DEFAULT_PALETTIZATION_SCHEME as _DEFAULT_PALETTIZATION_SCHEME, ) from coremltools.optimize.torch.palettization.palettization_config import ( DKMPalettizerConfig as _DKMPalettizerConfig, ) from coremltools.optimize.torch.palettization.palettization_config import ( ModuleDKMPalettizerConfig as _ModuleDKMPalettizerConfig, ) _logger = _logging.getLogger(__name__) class Palettizer(_BaseTrainingTimeModelOptimizer): pass class DKMPalettizer(Palettizer): """ A palettization algorithm based on `"DKM: Differentiable K-Means Clustering Layer for Neural Network Compression" `_. It clusters the weights using a differentiable version of ``k-means``, allowing the lookup table (LUT) and indices of palettized weights to be learnt using a gradient-based optimization algorithm such as SGD. Example: .. code-block:: python import torch from coremltools.optimize.torch.palettization import ( DKMPalettizer, DKMPalettizerConfig, ModuleDKMPalettizerConfig, ) # code that defines the pytorch model, loss and optimizer. model, loss_fn, optimizer = create_model_loss_and_optimizer() # initialize the palettizer config = DKMPalettizerConfig(global_config=ModuleDKMPalettizerConfig(n_bits=4)) palettizer = DKMPalettizer(model, config) # prepare the model to insert FakePalettize layers for palettization model = palettizer.prepare(inplace=True) # use palettizer in your PyTorch training loop for inputs, labels in data: output = model(inputs) loss = loss_fn(output, labels) loss.backward() optimizer.step() palettizer.step() # fold LUT and indices into weights model = palettizer.finalize(inplace=True) Args: model (:py:class:`torch.nn.Module`): Model on which the palettizer will act. 
config (:py:class:`DKMPalettizerConfig`): Config which specifies how different submodules in the model will be configured for palettization. Default config is used when passed as ``None``. """ def __init__(self, model: _nn.Module, config: _Optional[_DKMPalettizerConfig] = None): config = _DKMPalettizerConfig() if config is None else config super().__init__(model, config) self._milestones = {} self._supported_modules = _get_palettization_qat_mappings() def _palettize_supported_modules(self): """ Method to palettize supported modules. """ for name, submodule in self._model.named_modules(remove_duplicate=True): config = self._config.get_module_config(name, submodule) if type(submodule) in self._supported_modules: if config is not None: submod_configs = config if isinstance(config, list) else [config] for submod_config in submod_configs: if all( param.numel() > submod_config.weight_threshold for param, _ in _get_palettizable_parameters(submodule) ): module_level_advanced_options = self._get_module_level_advanced_options( submodule, submod_config ) n_bits = ( submod_config.n_bits if submod_config.n_bits is not None else _DEFAULT_PALETTIZATION_SCHEME[type(submodule)]["n_bits"] ) cluster_dim = ( submod_config.cluster_dim if submod_config.cluster_dim is not None else _DEFAULT_PALETTIZATION_SCHEME[type(submodule)]["cluster_dim"] ) enable_per_channel_scale = ( submod_config.enable_per_channel_scale if submod_config.enable_per_channel_scale is not None else _DEFAULT_PALETTIZATION_SCHEME[type(submodule)][ "enable_per_channel_scale" ] ) updated_config = None for param, param_name in _get_palettizable_parameters(submodule): updated_config = _validate_param_config( name + "." + param_name, param, submodule, submod_config, [ "palettization_cluster_dim", "palettization_group_size", ], module_level_advanced_options, ) if not updated_config: break if not updated_config: continue self._palettize_module( submodule, n_bits, cluster_dim, enable_per_channel_scale, updated_config.group_size, updated_config.quant_min, updated_config.quant_max, updated_config.lut_dtype, updated_config.dtype, updated_config.quantize_activations, module_level_advanced_options, ) self._milestones[name] = updated_config.milestone @staticmethod def _palettize_module( module: _nn.Module, n_bits: int, cluster_dim: int, enable_per_channel_scale: bool, group_size: _Optional[int], quant_min: int, quant_max: int, lut_dtype: str, dtype: _torch.dtype, quantize_activations: bool, advanced_options: _Dict, ): """ Method to palettize a module. """ fq_activation = _nn.Identity fq_weight = _FakePalettize.with_args( observer=_torch.quantization.MovingAveragePerChannelMinMaxObserver.with_args( quant_min=quant_min, quant_max=quant_max, dtype=dtype ), n_bits=n_bits, cluster_dim=cluster_dim, enable_per_channel_scale=enable_per_channel_scale, group_size=group_size, quant_min=quant_min, quant_max=quant_max, lut_dtype=lut_dtype, advanced_options=advanced_options, ) if quantize_activations: fq_activation = _FakeQuantize.with_args( observer=_torch.quantization.MovingAveragePerChannelMinMaxObserver.with_args( quant_min=quant_min, quant_max=quant_max, dtype=dtype ), quant_min=quant_min, quant_max=quant_max, ) module.qconfig = _torch.quantization.QConfig(activation=fq_activation, weight=fq_weight) @staticmethod def _get_module_level_advanced_options( module: _nn.Module, module_level_config: _ModuleDKMPalettizerConfig ) -> _ParamsDict: """ Returns advanced_options for a module. 
First checks whether the user specified something for those options in the palettization_config. If not, uses the options from the DEFAULT_PALETTIZATION_SCHEME of that module type. Returns false otherwise. """ module_level_advanced_options = {} for key in _DEFAULT_PALETTIZATION_ADVANCED_OPTIONS.keys(): if key == "cluster_permute" and module_level_config.lut_dtype == "oc_last": cluster_permute = list(range(module.weight.dim())) cluster_permute = cluster_permute[1:] + cluster_permute[:1] module_level_advanced_options[key] = cluster_permute else: module_level_advanced_options[key] = getattr(module_level_config, key) return module_level_advanced_options def prepare(self, inplace: bool = False) -> _nn.Module: """ Prepares a model for palettization aware training by inserting :py:class:`FakePalettize` layers in appropriate places as specified by the config. Args: inplace (:obj:`bool`): If ``True``, model transformations are carried out in-place and the original module is mutated, otherwise a copy of the model is mutated and returned. """ self._model = self._get_model_for_compression(inplace) self._model.train() self._palettize_supported_modules() qat_mappings = _get_palettization_qat_mappings() self._model = _torch.quantization.prepare_qat( self._model, mapping=qat_mappings, inplace=True ) return self._model def finalize(self, model: _Optional[_nn.Module] = None, inplace: bool = False) -> _nn.Module: """ Removes :py:class:`FakePalettize` layers from a model and creates new model weights from the ``LUT`` and ``indices`` buffers. This function is called to prepare a palettized model for export using `coremltools `_. Args: model (:obj:`nn.Module`): model to finalize. inplace (:obj:`bool`): If ``True``, model transformations are carried out in-place and the original module is mutated; otherwise, a copy of the model is mutated and returned. """ if model is None: model = self._model model.eval() finalized_model = _torch.quantization.convert( model, convert_custom_config_dict=_PALETTIZATION_CONVERT_DICT, inplace=inplace ) if model is None: self._model = finalized_model _register_metadata_version(finalized_model) return finalized_model def step(self): """ Step through the palettizer. When the number of times ``step`` is called is equal to ``milestone``, palettization is enabled. 
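An illustrative training loop calling ``step`` once per epoch (``num_epochs`` and ``train_one_epoch`` are user-defined placeholders); palettization of a module begins once the number of ``step`` calls reaches its ``milestone``:

.. code-block:: python

    for epoch in range(num_epochs):
        train_one_epoch(model)
        palettizer.step()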
""" for name, module in self._model.named_modules(): if name in self._milestones: if self._step_count == self._milestones[name]: self._enable_fake_palett_impl(module, True) self._init_prune_threshold_and_module_wise_target_sparsity(module) if self._step_count > self._milestones[name]: self._update_prune_threshold(module) self._step_count += 1 @staticmethod def _init_prune_threshold_and_module_wise_target_sparsity(module: _torch.nn.Module): if hasattr(module, "weight_fake_quant") and hasattr(module, "weight_mask"): non_zero_weights = module.weight_mask.count_nonzero().item() total_weights = _torch.numel(module.weight_mask) target_module_level_sparsity = 1 - non_zero_weights / total_weights inverse_mask = (module.weight_mask + 1) % 2 n_bits = module.weight_fake_quant.n_bits cluster_dim = module.weight_fake_quant.cluster_dim add_extra_centroid = module.weight_fake_quant.add_extra_centroid n_clusters = 2 ** int(n_bits) + int(add_extra_centroid) prune_threshold_init = _torch.abs(inverse_mask * module.weight_orig).max() / ( total_weights / cluster_dim / n_clusters ) module.weight_fake_quant.prune_threshold = prune_threshold_init module.weight_fake_quant._target_module_level_sparsity = target_module_level_sparsity @staticmethod def _update_prune_threshold(module: _torch.nn.Module): if hasattr(module, "weight_fake_quant") and hasattr(module, "weight_mask"): weight_detached = module.weight.detach() qweight = module.weight_fake_quant.palettize(weight_detached) sparsity = 1 - qweight.count_nonzero() / qweight.numel() prune_ratio = float(module.weight_fake_quant._target_module_level_sparsity) / ( sparsity + 1e-7 ) if prune_ratio > 0 and abs(prune_ratio - 1) > 0.01: prune_multiplier = max(min(prune_ratio, 1.25), 0.9) module.weight_fake_quant.prune_threshold *= prune_multiplier def enable_fake_palett(self, flag: bool): _logging.info( f"[{type(self).__name__}] " + ("enable" if flag else "disable") + " fake_palett" ) for name, module in self._model.named_modules(): self._enable_fake_palett_impl(module, flag) @staticmethod def _enable_fake_palett_impl(module: _torch.nn.Module, flag: bool): def enable_fn(mod): if hasattr(mod, "enable_fake_palett"): mod.enable_fake_palett(flag) module.apply(enable_fn) def report(self) -> _Report: """ Returns a dictionary with important statistics related to current state of palettization. Each key in the dictionary corresponds to a module name, and the value is a dictionary containing the statistics, such as number of clusters and cluster dimension, number of parameters, and so on. 
""" report = _Report() with _get_eval_model(self._model) as model: with _torch.no_grad(): for name, module in model.named_modules(): module_summary = dict() if hasattr(module, "weight_fake_quant"): module_summary["device"] = module.weight.device qweight = module.weight_fake_quant.forward(module.weight.detach()) lut_dtype = module.weight_fake_quant.lut_dtype cluster_permute = module.weight_fake_quant.cluster_permute module_summary["error"] = _rmse_error( module.weight.detach(), qweight ).item() n_clusters = module.weight_fake_quant.n_clusters module_summary["#params"] = int(_torch.numel(qweight)) cluster_dim = module.weight_fake_quant.cluster_dim module_summary["#dtype"] = ( f":num_clusters: {n_clusters} <{lut_dtype, cluster_permute}> " f"dim={cluster_dim}" ) report[name] = module_summary return report ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/palettization/post_training_palettization.py0000644000000000000000000003377014672066616031122 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import logging as _logging from collections import OrderedDict as _OrderedDict from typing import Any as _Any from typing import Callable as _Callable from typing import Dict as _Dict from typing import NewType as _NewType from typing import Optional as _Optional from typing import Tuple as _Tuple from typing import Union as _Union import cattrs as _cattrs import torch as _torch from attr import define as _define from attr import field as _field from attrs import validators as _validators from coremltools.optimize.torch._utils.k_means import KMeansConfig as _KMeansConfig from coremltools.optimize.torch._utils.k_means import ( KMeansSupportedModulesRegistry as _KMeansSupportedModulesRegistry, ) from coremltools.optimize.torch._utils.k_means import ParallelKMeans as _ParallelKMeans from coremltools.optimize.torch._utils.k_means import SequentialKMeans as _SequentialKMeans from coremltools.optimize.torch._utils.report_utils import ( compute_post_training_report as _compute_post_training_report, ) from coremltools.optimize.torch._utils.torch_utils import ( maybe_convert_str_to_dtype as _maybe_convert_str_to_dtype, ) from coremltools.optimize.torch._utils.torch_utils import ( maybe_convert_str_to_mod_type as _maybe_convert_str_to_mod_type, ) from coremltools.optimize.torch._utils.validation_utils import ( validate_param_config as _validate_param_config, ) from coremltools.optimize.torch.base_model_optimizer import ( BasePostTrainingModelOptimizer as _BasePostTrainingModelOptimizer, ) from coremltools.optimize.torch.base_model_optimizer import _Report from coremltools.optimize.torch.optimization_config import ( ModuleOptimizationConfig as _ModuleOptimizationConfig, ) from coremltools.optimize.torch.optimization_config import OptimizationConfig as _OptimizationConfig from coremltools.optimize.torch.optimization_config import ( PalettizationGranularity, _structure_from_dict_hook_factory, ) _logger = _logging.getLogger(__name__) @_define class ModulePostTrainingPalettizerConfig(_ModuleOptimizationConfig): """ Configuration class for specifying global and module-level palettization options for :py:class:`PostTrainingPalettizerConfig` algorithm. Args: n_bits (:obj:`int`): Number of bits to use for palettizing the weights. Defaults to ``4``. 
lut_dtype (:py:class:`torch.dtype`): The dtype to use for representing each element in lookup tables. When value is ``None``, no quantization is performed. Supported values are :py:class:`torch.int8` and :py:class:`torch.uint8`. Defaults to ``None``. granularity (:py:class:`PalettizationGranularity`) – Granularity for palettization. One of ``per_tensor`` or ``per_grouped_channel``. Defaults to ``per_tensor``. group_size (:obj:`int`): Specify the number of channels in a group. Only effective when granularity is ``per_grouped_channel``. channel_axis (:obj:`int`): Specify the channel axis to form a group of channels. Only effective when granularity is ``per_grouped_channel``. Defaults to output channel axis. cluster_dim (:obj:`int`): The dimension of centroids for each lookup table. The centroid is a scalar by default. When ``cluster_dim > 1``, it indicates 2-D clustering, and each ``cluster_dim`` length of weight vectors along the output channel are palettized using the same 2-D centroid. The length of each entry in the lookup tables is equal to ``cluster_dim``. enable_per_channel_scale (:obj:`bool`): When set to ``True``, weights are normalized along the output channels using per-channel scales before being palettized. This is not supported with ``cluster_dim > 1``. This class supports two different configurations to structure the palettization: 1. **Per-tensor palettization**: This is the default configuration where the whole tensor shares a single lookup table. The ``granularity`` is set to ``per_tensor``, and ``group_size`` is ``None``. 2. **Per-grouped-channel palettization**: In this configuration, the number of channels ``group_size`` along ``channel_axis`` share the same lookup table. For example, for a weight matrix of shape ``(16, 25)``, if we provide ``group_size = 8``, the shape of the lookup table would be ``(2, 2^n_bits)``. .. note:: Grouping is currently only supported along either the input or output channel axis. """ n_bits: _Optional[int] = _field( default=4, validator=_validators.optional(_validators.instance_of(int)) ) lut_dtype: _torch.dtype = _field( default=None, converter=lambda val: _maybe_convert_str_to_dtype(val) if val else val, validator=_validators.optional( [ _validators.instance_of(_torch.dtype), _validators.in_([_torch.int8, _torch.uint8]), ] ), ) granularity: PalettizationGranularity = _field( default="per_tensor", converter=PalettizationGranularity, validator=_validators.in_(PalettizationGranularity), ) group_size: _Optional[int] = _field( default=None, validator=_validators.optional(_validators.instance_of(int)) ) channel_axis: int = _field( default=0, validator=_validators.optional([_validators.instance_of(int), _validators.in_([0, 1])]), ) cluster_dim: _Optional[int] = _field( default=None, validator=_validators.optional(_validators.instance_of(int)) ) enable_per_channel_scale: _Optional[bool] = _field( default=False, validator=_validators.optional(_validators.instance_of(bool)) ) @group_size.validator def per_grouped_channel_granularity(self, attribute, value): if self.granularity == PalettizationGranularity.per_grouped_channel: assert ( value is not None ), "group_size has to be specified along with per_grouped_channel granularity." assert value > 0, "group_size should be greater than zero" else: assert value is None, "group_size can't be specified along with per_tensor granularity." 
@cluster_dim.validator def no_per_channel_scale(self, attribute, value): if value and value > 1: assert ( self.enable_per_channel_scale == False ), f"Enabling per_channel_scale is not supported for cluster_dim={value} larger than 1" _ModuleTypeConfigType = _NewType( "ModuleTypeConfigType", _Dict[_Union[_Callable, str], _Optional[ModulePostTrainingPalettizerConfig]], ) @_define class PostTrainingPalettizerConfig(_OptimizationConfig): """ Configuration class for specifying how different submodules of a model should be post-training palettized by :py:class:`PostTrainingPalettizer`. Args: global_config (:py:class:`ModulePostTrainingPalettizerConfig`): Config to be applied globally to all supported modules. module_type_configs (:obj:`dict` of :obj:`str` to :py:class:`ModulePostTrainingPalettizerConfig`): Module type configs applied to a specific module class, such as :py:class:`torch.nn.Linear`. The keys can be either strings or module classes. module_name_configs (:obj:`dict` of :obj:`str` to :py:class:`ModulePostTrainingPalettizerConfig`): Module name configs applied to specific modules. This can be a dictionary with module names pointing to their corresponding :py:class:`ModulePostTrainingPalettizerConfig`. """ global_config: _Optional[ModulePostTrainingPalettizerConfig] = _field( default=None, validator=_validators.optional(_validators.instance_of(ModulePostTrainingPalettizerConfig)), ) module_type_configs: _ModuleTypeConfigType = _field( factory=_OrderedDict, validator=_validators.deep_mapping( key_validator=_validators.instance_of((str, _Callable)), value_validator=_validators.optional( _validators.instance_of(ModulePostTrainingPalettizerConfig) ), mapping_validator=_validators.instance_of(dict), ), ) module_name_configs: _Dict[str, _Optional[ModulePostTrainingPalettizerConfig]] = _field( factory=_OrderedDict, validator=_validators.deep_mapping( key_validator=_validators.instance_of(str), value_validator=_validators.optional( _validators.instance_of(ModulePostTrainingPalettizerConfig) ), mapping_validator=_validators.instance_of(dict), ), ) def __attrs_post_init__(self): if ( self.global_config is None and len(self.module_type_configs) == 0 and len(self.module_name_configs) == 0 ): self.global_config = ModulePostTrainingPalettizerConfig() self.module_type_configs = { _maybe_convert_str_to_mod_type(key): val for key, val in self.module_type_configs.items() } @classmethod def from_dict(cls, config_dict: _Dict[str, _Any]) -> "PostTrainingPalettizerConfig": super().from_dict(config_dict) converter = _cattrs.Converter(forbid_extra_keys=True) converter.register_structure_hook( _ModuleTypeConfigType, _structure_from_dict_hook_factory(ModulePostTrainingPalettizerConfig), ) return converter.structure_attrs_fromdict(config_dict, cls) class PostTrainingPalettizer(_BasePostTrainingModelOptimizer): """ Perform post-training palettization on a torch model. Post palettization, all the weights in supported layers point to elements in a lookup table after performing a k-means operation. Example: .. 
code-block:: python import torch.nn as nn from coremltools.optimize.torch.palettization import ( PostTrainingPalettizerConfig, PostTrainingPalettizer, ) model = nn.Sequential( OrderedDict( { "conv": nn.Conv2d(1, 20, (3, 3)), "relu1": nn.ReLU(), "conv2": nn.Conv2d(20, 20, (3, 3)), "relu2": nn.ReLU(), } ) ) # initialize the palettizer config = PostTrainingPalettizerConfig.from_dict( { "global_config": { "n_bits": 4, }, } ) ptpalettizer = PostTrainingPalettizer(model, config) palettized_model = ptpalettizer.compress() Args: model (:obj:`torch.nn.Module`): Module to be compressed. config (:py:class:`PostTrainingPalettizerConfig`): Config that specifies how different submodules in the model will be palettized. """ _supported_modules: _Tuple = _KMeansSupportedModulesRegistry.get_supported_modules() def __init__(self, model: _torch.nn.Module, config: PostTrainingPalettizerConfig = None): config = PostTrainingPalettizerConfig() if config is None else config super().__init__(model, config) def compress(self, num_kmeans_workers: int = 1, inplace: bool = False) -> _torch.nn.Module: """ The compress method performs a `k-means` operation on all supported modules. Args: num_kmeans_workers (:obj:`int`): Number of worker processes used for performing post-training palettization. Defaults to ``1``. inplace (:obj:`bool`): If ``True``, model transformations are carried out in-place and the original module is mutated, otherwise a copy of the model is mutated and returned. Defaults to ``False``. """ self._model = super().compress(inplace=inplace) kmeans_config_dict = dict() for name, submodule in self._model.named_modules(): submod_config = self._config.get_module_config(name, submodule) if submod_config is None: continue k_means_module_cls = _KMeansSupportedModulesRegistry.get_kmeans_module(submodule) if k_means_module_cls is None: continue for param_name in k_means_module_cls.parameter_names: # Validate configuration for parameter param = submodule.get_parameter(param_name) updated_config = _validate_param_config( name + "." + param_name, param, submodule, submod_config, ["palettization_group_size", "palettization_cluster_dim"], ) if not updated_config: continue if name not in kmeans_config_dict: kmeans_config_dict[name] = {} kmeans_config_dict[name][param_name] = _KMeansConfig( n_bits=updated_config.n_bits, axis=updated_config.channel_axis, lut_dtype=updated_config.lut_dtype, block_size=updated_config.group_size, cluster_dim=updated_config.cluster_dim, enable_per_channel_scale=updated_config.enable_per_channel_scale, ) if num_kmeans_workers > 1: return _ParallelKMeans.cluster_weights( self._model, kmeans_config_dict, num_workers=num_kmeans_workers ) else: return _SequentialKMeans.cluster_weights(self._model, kmeans_config_dict) def report(self) -> _Report: return _compute_post_training_report( self._uncompressed_model, self._model, supported_modules=self._supported_modules, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/palettization/sensitive_k_means.py0000644000000000000000000010074714672066616027000 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy as _copy import logging as _logging import tempfile as _tempfile from collections import OrderedDict as _OrderedDict from contextlib import contextmanager as _contextmanager from typing import Any as _Any from typing import Callable as _Callable from typing import Dict as _Dict from typing import Iterable as _Iterable from typing import List as _List from typing import NewType as _NewType from typing import Optional as _Optional from typing import Tuple as _Tuple from typing import Union as _Union import cattrs as _cattrs import torch as _torch import torch.multiprocessing as _mp from attr import define as _define from attr import field as _field from attrs import validators as _validators from torch.distributed.fsdp import FullStateDictConfig as _FullStateDictConfig from torch.distributed.fsdp import FullyShardedDataParallel as _FSDP from torch.distributed.fsdp import ShardingStrategy as _ShardingStrategy from torch.distributed.fsdp import StateDictType as _StateDictType from coremltools.optimize.torch._utils.dist_utils import ddp_setup as _ddp_setup from coremltools.optimize.torch._utils.dist_utils import is_leader as _is_leader from coremltools.optimize.torch._utils.fsdp_utils import FSDPAutoWrapPolicy as _FSDPAutoWrapPolicy from coremltools.optimize.torch._utils.k_means import KMeansConfig as _KMeansConfig from coremltools.optimize.torch._utils.k_means import ( KMeansSupportedModulesRegistry as _KMeansSupportedModulesRegistry, ) from coremltools.optimize.torch._utils.k_means import ParallelKMeans as _ParallelKMeans from coremltools.optimize.torch._utils.k_means import SequentialKMeans as _SequentialKMeans from coremltools.optimize.torch._utils.report_utils import ( compute_post_training_report as _compute_post_training_report, ) from coremltools.optimize.torch._utils.torch_utils import ( maybe_convert_str_to_dtype as _maybe_convert_str_to_dtype, ) from coremltools.optimize.torch._utils.torch_utils import ( maybe_convert_str_to_mod_type as _maybe_convert_str_to_mod_type, ) from coremltools.optimize.torch._utils.validation_utils import ( validate_param_config as _validate_param_config, ) from coremltools.optimize.torch.base_model_optimizer import ( BaseDataCalibratedModelOptimizer as _BaseDataCalibratedModelOptimizer, ) from coremltools.optimize.torch.base_model_optimizer import _Report from coremltools.optimize.torch.optimization_config import ( ModuleOptimizationConfig as _ModuleOptimizationConfig, ) from coremltools.optimize.torch.optimization_config import OptimizationConfig as _OptimizationConfig from coremltools.optimize.torch.optimization_config import ( PalettizationGranularity, _structure_from_dict_hook_factory, ) _logger = _logging.getLogger(__name__) @_define class ModuleSKMPalettizerConfig(_ModuleOptimizationConfig): """ Configuration class for specifying global and module-level palettization options for :py:class:`SKMPalettizer` algorithm. Args: n_bits (:obj:`int`): Number of bits to use for palettizing the weights. Defaults to ``4``. lut_dtype (:py:class:`torch.dtype`): The dtype to use for representing each element in lookup tables. When value is ``None``, no quantization is performed. Supported values are :py:class:`torch.int8` and :py:class:`torch.uint8`. Defaults to ``None``. granularity (:py:class:`PalettizationGranularity`) – Granularity for palettization. 
One of ``per_tensor`` or ``per_grouped_channel``. Defaults to ``per_tensor``. group_size (:obj:`int`): Specify the number of channels in a group. Only effective when granularity is ``per_grouped_channel``. channel_axis (:obj:`int`): Specify the channel axis to form a group of channels. Only effective when granularity is ``per_grouped_channel``. Defaults to output channel axis. cluster_dim (:obj:`int`): The dimension of centroids for each lookup table. The centroid is a scalar by default. When ``cluster_dim > 1``, it indicates 2-D clustering, and each ``cluster_dim`` length of weight vectors along the output channel are palettized using the same 2-D centroid. The length of each entry in the lookup tables is equal to ``cluster_dim``. enable_per_channel_scale (:obj:`bool`): When set to ``True``, weights are normalized along the output channels using per-channel scales before being palettized. This is not supported with ``cluster_dim > 1``. This class supports two different configurations to structure the palettization: 1. **Per-tensor palettization**: This is the default configuration where the whole tensor shares a single lookup table. The ``granularity`` is set to ``per_tensor``, and ``group_size`` is ``None``. 2. **Per-grouped-channel palettization**: In this configuration, the number of channels ``group_size`` along ``channel_axis`` share the same lookup table. For example, for a weight matrix of shape ``(16, 25)``, if we provide ``group_size = 8``, the shape of the lookup table would be ``(2, 2^n_bits)``. .. note:: Grouping is currently only supported along either the input or output channel axis. """ n_bits: int = _field(default=4, validator=_validators.instance_of(int)) lut_dtype: _torch.dtype = _field( default=None, converter=lambda val: _maybe_convert_str_to_dtype(val) if val else val, validator=_validators.optional( [ _validators.instance_of(_torch.dtype), _validators.in_([_torch.int8, _torch.uint8]), ] ), ) granularity: PalettizationGranularity = _field( default="per_tensor", converter=PalettizationGranularity, validator=_validators.in_(PalettizationGranularity), ) group_size: _Optional[int] = _field( default=None, validator=_validators.optional(_validators.instance_of(int)) ) channel_axis: int = _field( default=0, validator=_validators.optional([_validators.instance_of(int), _validators.in_([0, 1])]), ) cluster_dim: _Optional[int] = _field( default=None, validator=_validators.optional(_validators.instance_of(int)) ) enable_per_channel_scale: bool = _field( default=False, validator=_validators.optional(_validators.instance_of(bool)) ) @group_size.validator def per_grouped_channel_granularity(self, attribute, value): if self.granularity == PalettizationGranularity.per_grouped_channel: assert ( value is not None ), "group_size has to be specified along with per_grouped_channel granularity." assert value > 0, "group_size should be greater than zero" else: assert value is None, "group_size can't be specified along with per_tensor granularity." @cluster_dim.validator def no_per_channel_scale(self, attribute, value): if value and value > 1: assert ( self.enable_per_channel_scale == False ), f"Enabling per_channel_scale is not supported for cluster_dim={value} larger than 1" _ModuleTypeConfigType = _NewType( "ModuleTypeConfigType", _Dict[_Union[_Callable, str], _Optional[ModuleSKMPalettizerConfig]], ) @_define class SKMPalettizerConfig(_OptimizationConfig): """ Configuration class for specifying how different submodules of a model are palettized by :py:class:`SKMPalettizer`. 
Args: global_config (:py:class:`ModuleSKMPalettizerConfig`): Config to be applied globally to all supported modules. Missing values are chosen from the default config. module_type_configs (:obj:`dict` of :obj:`str` to :py:class:`ModuleSKMPalettizerConfig`): Module type configs applied to a specific module class, such as :py:class:`torch.nn.Linear`. The keys can be either strings or module classes. module_name_configs (:obj:`dict` of :obj:`str` to :py:class:`ModuleSKMPalettizerConfig`): Module-level configs applied to specific modules. The name of the module must either be a regex or a fully qualified name that can be used to fetch it from the top level module using the ``module.get_submodule(target)`` method. calibration_nsamples (:obj:`int`): Number of samples to be used for calibration. """ global_config: _Optional[ModuleSKMPalettizerConfig] = _field( default=None, validator=_validators.optional(_validators.instance_of(ModuleSKMPalettizerConfig)), ) module_type_configs: _ModuleTypeConfigType = _field( factory=_OrderedDict, validator=_validators.deep_mapping( key_validator=_validators.instance_of((str, _Callable)), value_validator=_validators.optional( _validators.instance_of(ModuleSKMPalettizerConfig) ), mapping_validator=_validators.instance_of(dict), ), ) module_name_configs: _Dict[str, _Optional[ModuleSKMPalettizerConfig]] = _field( factory=_OrderedDict, validator=_validators.deep_mapping( key_validator=_validators.instance_of(str), value_validator=_validators.optional( _validators.instance_of(ModuleSKMPalettizerConfig) ), mapping_validator=_validators.instance_of(dict), ), ) calibration_nsamples: int = _field(default=128, validator=_validators.instance_of(int)) def __attrs_post_init__(self): if ( self.global_config is None and len(self.module_type_configs) == 0 and len(self.module_name_configs) == 0 ): self.global_config = ModuleSKMPalettizerConfig() self.module_type_configs = { _maybe_convert_str_to_mod_type(key): val for key, val in self.module_type_configs.items() } @classmethod def from_dict(cls, config_dict: _Dict[str, _Any]) -> "SKMPalettizerConfig": super().from_dict(config_dict) converter = _cattrs.Converter(forbid_extra_keys=True) converter.register_structure_hook( _ModuleTypeConfigType, _structure_from_dict_hook_factory(ModuleSKMPalettizerConfig), ) return converter.structure_attrs_fromdict(config_dict, cls) class SKMPalettizer(_BaseDataCalibratedModelOptimizer): """ Perform post-training palettization of weights by running a weighted k-means on the model weights. The weight values used for weighing different elements of a model's weight matrix are computed using the Fisher information matrix, which is an approximation of the Hessian. These weight values indicate how sensitive a given weight element is: the more sensitive an element, the larger the impact perturbing or palettizing it has on the model’s loss function. This means that weighted k-means moves the clusters closer to the sensitive weight values, allowing them to be represented more exactly. This leads to a lower degradation in model performance after palettization. The Fisher information matrix is computed using a few samples of calibration data. This algorithm implements `SqueezeLLM: Dense-and-Sparse Quantization `_. Example: .. 
code-block:: python import torch.nn as nn from collections import OrderedDict from coremltools.optimize.torch.palettization import ( SKMPalettizer, SKMPalettizerConfig, ) model = nn.Sequential( OrderedDict( { "conv": nn.Conv2d(1, 20, (3, 3)), "relu1": nn.ReLU(), "conv2": nn.Conv2d(20, 20, (3, 3)), "relu2": nn.ReLU(), } ) ) dataloader = load_calibration_data() # define callable for loss function def loss_fn(model, data): inp, target = data out = model(inp) return nn.functional.mse_loss(out, target) # initialize the palettizer config = SKMPalettizerConfig.from_dict( { "global_config": { "n_bits": 4, }, "calibration_nsamples": 16, } ) compressor = SKMPalettizer(model, config) compressed_model = compressor.compress(dataloader=dataloader, loss_fn=loss_fn) Args: model (:obj:`torch.nn.Module`): Module to be compressed. config (:py:class:`SKMPalettizerConfig`): Config that specifies how different submodules in the model will be compressed. """ _supported_modules: _Tuple = _KMeansSupportedModulesRegistry.get_supported_modules() _SENSITIVITY_CLIP_THR: float = 1e-12 def __init__(self, model: _torch.nn.Module, config: _Optional[SKMPalettizerConfig] = None): config = SKMPalettizerConfig() if config is None else config super().__init__(model, config) self._tempdir = _tempfile.TemporaryDirectory() self._sensitivity_path = self._tempdir.name + "/sensitivity.pt" self._model_checkpoint_path = self._tempdir.name + "/model.pt" def _compute_sensitivity_impl_single_worker( self, dataset: _List, loss_fn: _Callable, sensitivity_path: _Optional[str] ): """ Computes sensitivity for the model weights using a single process. """ if _torch.cuda.is_available(): self._model.cuda() self._model.zero_grad() with self._register_grad_square_hooks(self._model): for didx, data in enumerate(dataset): _logger.info(f"Computing sensitivity using sample {didx}") loss = loss_fn(self._model, data) loss.backward() sensitivity_dict = dict() for name, param in self._model.named_parameters(remove_duplicate=True): if param.requires_grad: sensitivity_dict[name] = -param.grad.cpu() _torch.save(sensitivity_dict, self._get_sensitivity_path(sensitivity_path)) def _compute_sensitivity_impl_multiple_workers( self, rank: int, num_workers: int, dataset: _List, loss_fn: _Callable, sensitivity_path: _Optional[str] = None, fsdp_auto_wrap_policy: _Optional[_FSDPAutoWrapPolicy] = None, ): """ Computes sensitivity for the model weights using multiple processes. This mode is useful for large models for which computing gradients on a single process is infeasible because the model does not fit on a single GPU. The model is sharded on multiple GPUs using :py:class:`FullyShardedDataParallel`, which enables distributed computation of gradients. If ``sensitivity_path`` is passed as ``None``, the sensitivity matrices are stored temporarily and deleted after model compression. Otherwise, they are saved at the location specified by ``sensitivity_path``. Args: rank (:obj:`int`): Rank of the worker process on which this function is executed num_workers (:obj:`int`): Number of workers used for computing sensitivity dataset (:py:class:`Iterable`): An iterable where each element is an input to the model to be compressed. Used for computing gradients of model weights. loss_fn (:obj:`Callable`): A callable which takes the model and data as input and performs a forward pass on the model and computes the training loss sensitivity_path (:obj:`str` or ``None``): An optional path for saving the sensitivity of weights. Defaults to ``None``.
fsdp_auto_wrap_policy (:py:class:`_FSDPAutoWrapPolicy` or ``None``): Policy to apply :py:class:`FullyShardedDataParallel` to submodules of ``model``. Defaults to ``None``. """ _ddp_setup(rank, num_workers) auto_wrap_policy = ( fsdp_auto_wrap_policy.get_policy() if fsdp_auto_wrap_policy is not None else None ) model = _FSDP( module=self._model, auto_wrap_policy=auto_wrap_policy, sharding_strategy=_ShardingStrategy.FULL_SHARD, use_orig_params=False, device_id=_torch.cuda.current_device(), sync_module_states=True, ) # We want to compute squares of gradients of the un-sharded parameters # to use later for k-means. However, parameters are sharded and gradients # are also computed in the sharded state. And there is no efficient way # to un-shard them, hence we use an optimizer to add the sharded gradients # to the parameters, which can later be un-sharded when we save the state dict. optim = _torch.optim.SGD( [param for param in model.parameters() if param.requires_grad], lr=1.0 ) optim.zero_grad() with self._register_grad_square_hooks(model): for didx, data in enumerate(dataset): if _is_leader(): _logger.info(f"Computing sensitivity using sample {didx}") loss = loss_fn(model, data) loss.backward() # we set the parameters to zero so that when we call optim.step, # the parameter values are equal to the negative of the square of the gradient with _torch.no_grad(): for param in model.parameters(): param.data.zero_() optim.step() cfg = _FullStateDictConfig(offload_to_cpu=True, rank0_only=True) with _FSDP.state_dict_type(model, _StateDictType.FULL_STATE_DICT, cfg): sensitivity_dict = model.state_dict() if _is_leader(): _torch.save(sensitivity_dict, self._get_sensitivity_path(sensitivity_path)) def _get_dataset(self, rank: int, num_workers: int, dataloader: _Iterable) -> _List[_Any]: """ Create a subset of dataloader for worker with given rank. """ dataset = [] num_samples = self._config.calibration_nsamples // num_workers sampled = 0 for idx, data in enumerate(dataloader): if idx % num_workers == rank: dataset.append(_copy.deepcopy(data)) sampled += 1 if sampled == num_samples: break return dataset @staticmethod @_contextmanager def _register_grad_square_hooks(model: _torch.nn.Module): """ Context manager for registering gradient squaring hooks within the context and unregistering them on exit. """ hook_handles = [] for param in model.parameters(): if param.requires_grad: hook_handles.append(param.register_hook(lambda grad: _torch.square(grad))) try: yield model finally: for handle in hook_handles: handle.remove() def _get_sensitivity_path(self, sensitivity_path: _Optional[str]) -> str: """ Return sensitivity_path if it's not None else a temporary path """ return sensitivity_path if sensitivity_path is not None else self._sensitivity_path def compute_sensitivity( self, dataloader: _Iterable, loss_fn: _Callable, sensitivity_path: _Optional[str] = None, num_sensitivity_workers: int = 1, fsdp_auto_wrap_policy: _Optional[_FSDPAutoWrapPolicy] = None, ) -> _Dict[str, _Any]: """ Compute sensitivities of model's weights. A weight element's sensitivity indicates how much effect perturbing it has on the model's loss function. The sensitivities are computed as Fisher information of the model's weights. If ``sensitivity_path`` is passed as a non-``None`` value, the sensitivity matrices are saved at the location specified by ``sensitivity_path``. When computing sensitivity of large models, it is recommended to use ``num_sensitivity_workers`` equal to the number of GPUs available.
This is because computing gradients using a single process may be infeasible for a large model as it may not fit on a single GPU. When ``num_sensitivity_workers > 1``, the model is sharded on multiple GPUs using :py:class:`FullyShardedDataParallel`, which enables distributed computation of gradients. Args: dataloader (:py:class:`Iterable`): An iterable where each element is an input to the model to be compressed. Used for computing gradients of model weights. loss_fn (:obj:`Callable`): A callable which takes the model and data as input and performs a forward pass on the model and computes the training loss sensitivity_path (:obj:`str` or ``None``): An optional path for saving the sensitivity of weights. Defaults to ``None``. num_sensitivity_workers (:obj:`int`): Number of worker processes used for computing sensitivity. Defaults to ``1``. fsdp_auto_wrap_policy (:py:class:`_FSDPAutoWrapPolicy` or ``None``): Policy which specifies how different submodules of ``model`` are wrapped with individual :py:class:`FullyShardedDataParallel` wrappers. This argument is only used when ``num_sensitivity_workers > 1`` and it is only necessary when the model cannot fit on a single GPU. Please refer to documentation of :py:class:`_FSDPAutoWrapPolicy` for more details. Defaults to ``None``. """ if num_sensitivity_workers > 1 and not _torch.cuda.is_available(): _logger.warning( "num_sensitivity_workers > 1 is only supported on GPUs with CUDA. Setting " "num_sensitivity_workers to 1, since a CUDA compatible PyTorch installation " "couldn't be found." ) num_sensitivity_workers = 1 # We save the model's state dict so that we can restore it later # We need to do this because _compute_sensitivity_impl_multiple_workers # sets the parameters' value to squares of their gradients and # _compute_sensitivity_impl_single_worker can modify layers such as batch norm # during forward pass _torch.save(self._model.state_dict(), self._model_checkpoint_path) if num_sensitivity_workers == 1: self._compute_sensitivity_impl_single_worker( self._get_dataset(0, 1, dataloader), loss_fn, sensitivity_path, ) else: if fsdp_auto_wrap_policy is None: _logger.warning( "num_sensitivity_workers > 1 and fsdp_auto_wrap_policy is None. For a large model, this might " "lead to OOM issues on GPUs. Consider setting fsdp_auto_wrap_policy to indicate how different " "submodules of the model should be wrapped with FSDP wrappers to prevent all gather for all " "parameters on all GPUs." ) ctx = _mp.get_context("spawn") worker_processes = [ ctx.Process( target=self._compute_sensitivity_impl_multiple_workers, args=( rank, num_sensitivity_workers, self._get_dataset(rank, num_sensitivity_workers, dataloader), loss_fn, sensitivity_path, fsdp_auto_wrap_policy, ), name=f"Process-{rank}", ) for rank in range(num_sensitivity_workers) ] for worker_process in worker_processes: worker_process.start() _logger.info(f"Started {worker_process.name} for computing sensitivity.") for worker_process in worker_processes: worker_process.join() _logger.info(f"Finished {worker_process.name}.") # restore the original state of the model self._model.cpu() old_state_dict = _torch.load(self._model_checkpoint_path) self._model.load_state_dict(old_state_dict) return self._process_sensitivity(sensitivity_path) def _process_sensitivity(self, sensitivity_path: _Optional[str] = None) -> _Dict[str, _Any]: """ Post process the sensitivity values to normalize them.
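The raw values (negated squared gradients) are negated back, scaled so that the maximum value for each parameter tensor is ``1.0``, and clipped from below at ``_SENSITIVITY_CLIP_THR`` so that zero or near-zero sensitivities do not destabilize k-means.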
""" raw_sensitivity_dict = _torch.load(self._get_sensitivity_path(sensitivity_path)) sensitivity_dict = dict() for key, val in raw_sensitivity_dict.items(): # Since optimizer sets param value as: p <= p - learning_rate * (grad**2), # we need to negate the values to get grad**2 val = 100 * -val if len(val.nonzero()) == 0: val[val == 0] = 1.0 # normalize sensitivity between 0 and 1 val = val / _torch.max(val) # Clipping very small or zero sensitivity values stabilizes k-means, # they can lead to divergence otherwise val[val == 0] = _torch.min(val[val != 0]) val[val < self._SENSITIVITY_CLIP_THR] = self._SENSITIVITY_CLIP_THR sensitivity_dict[key] = val # If user wants to save sensitivity values at the specified path # we save them in the processed state if sensitivity_path is not None: _torch.save(sensitivity_dict, sensitivity_path) return sensitivity_dict def _compute_outlier_mask(self, sensitivity: _torch.Tensor, outliers: float) -> _torch.Tensor: """ Compute outlier masks using the sensitivity values. """ sensitivity_flat = sensitivity.flatten() numel = sensitivity_flat.numel() num_outliers = int(numel * (outliers / 100.0)) mask = _torch.ones_like(sensitivity_flat, dtype=_torch.bool) mask[_torch.argsort(sensitivity_flat, descending=True)[:num_outliers]] = False return mask.reshape(sensitivity.shape) def _get_submodules_to_compress(self) -> _Iterable[_Tuple[str, _torch.nn.Module]]: """ Return an iterator over the names and submodules to be compressed. """ for name, submodule in self._model.named_modules(): yield name, submodule def compress( self, dataloader: _Optional[_Iterable] = None, loss_fn: _Optional[_Callable] = None, sensitivity_path: _Optional[str] = None, num_kmeans_workers: int = 1, num_sensitivity_workers: int = 1, inplace: bool = False, fsdp_auto_wrap_policy: _Optional[_FSDPAutoWrapPolicy] = None, ) -> _torch.nn.Module: """ Compresses a model's weights using Fisher information sensitivity based weighted k-means palettization. Args: dataloader (:py:class:`Iterable`): An iterable where each element is an input to the model to be compressed. Used for computing gradients of model weights. This argument is not needed if ``sensitivity_path`` is specified and will be ignored. It is required then ``sensitivity_path`` is ``None``. Defaults to ``None``. loss_fn (:obj:`Callable`): A callable which takes the model and data as input and performs a forward pass on the model and computes the training loss. This argument is not needed if ``sensitivity_path`` is specified and will be ignored. It is required when ``sensitivity_path`` is ``None``. Defaults to ``None``. sensitivity_path (:obj:`str` or ``None``): An optional path from which the sensitivity values are loaded. If ``sensitivity_path`` is not ``None``, sensitivity values are loaded from the path specified, otherwise, sensitivity values are computed using the ``dataloader`` and ``loss_fn``. The sensitivity values stored at ``sensitivity_path`` should be a dictionary from strings indicating fully qualified parameter names to tensors with the same shape as the parameters, with each element of the tensor indicating how important that element is. This is usally the output of the :py:meth:`compute_sensitivity` method. Defaults to ``None``. num_kmeans_workers (:obj:`int`): Number of worker processes to use for performing k-means. It is recommended to use more than one worker process to parallelize the clustering, especially when multiple CPUs are available. Defaults to ``1``. 
num_sensitivity_workers (:obj:`int`): Number of worker processes to use for computing sensitivity. For large models, it is recommended to set this value to the number of GPUs available. Defaults to ``1``. inplace (:obj:`bool`): If ``True``, model transformations are carried out in-place and the original module is mutated, otherwise a copy of the model is mutated and returned. Defaults to ``False``. fsdp_auto_wrap_policy (:py:class:`_FSDPAutoWrapPolicy` or ``None``): Policy which specifies how different submodules of ``model`` are wrapped with individual :py:class:`FullyShardedDataParallel` wrappers. This argument is only used when ``num_sensitivity_workers > 1`` and it is only necessary when the model cannot fit on a single GPU. Please refer to documentation of :py:class:`_FSDPAutoWrapPolicy` for more details. Defaults to ``None``. """ self._model = super().compress(dataloader=dataloader, inplace=inplace) if sensitivity_path is None: sensitivity_dict = self.compute_sensitivity( dataloader, loss_fn, sensitivity_path, num_sensitivity_workers, fsdp_auto_wrap_policy=fsdp_auto_wrap_policy, ) else: _logger.info(f"Loading sensitivity values from {sensitivity_path}.") sensitivity_dict = _torch.load(sensitivity_path) kmeans_config_dict = dict() for name, submodule in self._get_submodules_to_compress(): submod_config = self._config.get_module_config(name, submodule) if submod_config is None: continue k_means_module_cls = _KMeansSupportedModulesRegistry.get_kmeans_module(submodule) if k_means_module_cls is None: continue for param_name in k_means_module_cls.parameter_names: # Validate configuration for parameter param = submodule.get_parameter(param_name) updated_config = _validate_param_config( name + "." + param_name, param, submodule, submod_config, ["palettization_group_size", "palettization_cluster_dim"], ) if not updated_config: continue sensitivity_key = f"{name}.{param_name}" if len(name) > 0 else param_name sensitivity = sensitivity_dict[sensitivity_key] if name not in kmeans_config_dict: kmeans_config_dict[name] = {} kmeans_config_dict[name][param_name] = _KMeansConfig( n_bits=updated_config.n_bits, axis=updated_config.channel_axis, lut_dtype=updated_config.lut_dtype, block_size=updated_config.group_size, importance=sensitivity, cluster_dim=updated_config.cluster_dim, enable_per_channel_scale=updated_config.enable_per_channel_scale, ) if num_kmeans_workers > 1: return _ParallelKMeans.cluster_weights( self._model, kmeans_config_dict, num_workers=num_kmeans_workers ) else: return _SequentialKMeans.cluster_weights(self._model, kmeans_config_dict) def report(self) -> _Report: return _compute_post_training_report( self._uncompressed_model, self._model, supported_modules=self._supported_modules, ) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2695472 coremltools-8.0/coremltools/optimize/torch/pruning/0000755000000000000000000000000014672075535021502 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/pruning/__init__.py0000644000000000000000000000335714672066616023613 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause """ .. _coremltools_optimize_torch_pruning: .. include:: pruning_desc.rst _`MagnitudePruner` ================== ..
autoclass:: coremltools.optimize.torch.pruning.ModuleMagnitudePrunerConfig :members: from_dict, as_dict, from_yaml .. autoclass:: coremltools.optimize.torch.pruning.MagnitudePrunerConfig :members: set_global, set_module_type, set_module_name, from_dict, as_dict, from_yaml .. autoclass:: coremltools.optimize.torch.pruning.MagnitudePruner :members: prepare, step, report, finalize Pruning scheduler ================= :obj:`coremltools.optimize.torch.pruning.pruning_scheduler` submodule contains classes that implement pruning schedules, which can be used for changing the sparsity of pruning masks applied by various types of pruning algorithms to prune neural network parameters. Base class ---------- .. autoclass:: coremltools.optimize.torch.pruning.pruning_scheduler.PruningScheduler :show-inheritance: :no-members: PolynomialDecayScheduler ------------------------ .. autoclass:: coremltools.optimize.torch.pruning.pruning_scheduler.PolynomialDecayScheduler :show-inheritance: :members: compute_sparsity ConstantSparsityScheduler ------------------------- .. autoclass:: coremltools.optimize.torch.pruning.pruning_scheduler.ConstantSparsityScheduler :show-inheritance: :members: compute_sparsity """ from .magnitude_pruner import MagnitudePruner, MagnitudePrunerConfig, ModuleMagnitudePrunerConfig from .pruning_scheduler import ConstantSparsityScheduler, PolynomialDecayScheduler ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/pruning/_base_pruner.py0000644000000000000000000001404514672066616024524 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy as _copy import logging as _logging from typing import Optional as _Optional from typing import Tuple as _Tuple import torch as _torch from coremltools.optimize.torch._utils.metadata_utils import ( register_metadata_version as _register_metadata_version, ) from coremltools.optimize.torch._utils.torch_utils import get_eval_model as _get_eval_model from coremltools.optimize.torch.base_model_optimizer import ( BaseTrainingTimeModelOptimizer as _BaseTrainingTimeModelOptimizer, ) from coremltools.optimize.torch.base_model_optimizer import _Report from coremltools.optimize.torch.optimization_config import OptimizationConfig as _OptimizationConfig from coremltools.optimize.torch.pruning import _utils from coremltools.optimize.torch.pruning._base_pruning_method import BaseDynamicPruningMethod _logger = _logging.getLogger(__name__) class BasePruner(_BaseTrainingTimeModelOptimizer): pass class BasePrunerWithPruningMethod(BasePruner): """ Base class for all pruners which use a PruningMethod (implemented in ``_base_pruning_method.py``) for pruning module parameters. """ _supported_modules: _Tuple def __init__(self, model: _torch.nn.Module, config: _OptimizationConfig): super().__init__(model, config) self._pruner_info = {} @property def _is_prepared(self) -> bool: return len(self._pruner_info) > 0 def prepare(self, inplace: bool = False) -> _torch.nn.Module: """ Prepares the model for pruning. Args: inplace (:obj:`bool`): If ``True``, model transformations are carried out in-place and the original module is mutated, otherwise a copy of the model is mutated and returned. """ return self._get_model_for_compression(inplace=inplace) def step(self): """ Steps through the pruning schedule once.
At every call to :meth:`.step`, an internal step counter is incremented by one. """ raise NotImplementedError() def finalize( self, model: _Optional[_torch.nn.Module] = None, inplace: bool = False ) -> _torch.nn.Module: """ Prepares the model for export. Removes pruning forward pre-hooks attached to submodules and commits pruning changes to pruned module parameters by multiplying the pruning masks with the parameter matrix. Args: model (:obj:`nn.Module`): model to finalize inplace (:obj:`bool`): If ``True``, model transformations are carried out in-place and the original module is mutated, otherwise a copy of the model is mutated and returned. """ if model is None: model = self._model finalized_model = model if inplace else _copy.deepcopy(model) # Add compression metadata _register_metadata_version(finalized_model) for name, pruner_info in self._pruner_info.items(): submodule = finalized_model.get_submodule(name) _utils.register_compression_metadata(submodule, pruner_info, self._supported_modules) # Remove pruning hooks for name, submodule in finalized_model.named_modules(remove_duplicate=True): if hasattr(submodule, "pruning_method"): submodule.pruning_method.remove(submodule) # If the module has been joint pruned + palettized, then palettizer finalize() # can remove pruning_method attribute but not the forward pre hook. So we explicitly remove it. elif name in self._pruner_info and _utils.is_palettized_module( self._pruner_info[name].module ): for k, hook in submodule._forward_pre_hooks.items(): if isinstance(hook, BaseDynamicPruningMethod): del submodule._forward_pre_hooks[k] if model is None: self._model = finalized_model return finalized_model def report(self) -> _Report: """ Returns a dictionary with important statistics related to current state of pruning. Each key in the dictionary corresponds to a module name and the value is a dictionary containing the statistics such as ``unstructured_weight_sparsity``, number of parameters, etc. Also contains a ``global`` key containing the same statistics aggregated over all the modules set up for pruning. """ report = _Report() with _get_eval_model(self._model): with _torch.no_grad(): # add submodule level sparsity summary total_num_params = 0 for name, pruner_info in self._pruner_info.items(): submodule = pruner_info.module if hasattr(submodule, "pruning_method"): submod_config = pruner_info.config num_params = getattr(submodule, submod_config.param_name).detach().numel() summary = {"#params": int(num_params)} summary.update(submodule.pruning_method.get_sparsity_summary(submodule)) total_num_params += num_params report[name] = summary # get global sparsity summary global_summaries = {"#params": total_num_params} for sparsity_type in ["structured", "unstructured", "block2"]: layer_numel = [val["#params"] for _, val in report.items()] layer_sparsities = [ val[f"{sparsity_type}_weight_sparsity"] for _, val in report.items() ] global_summaries[ f"{sparsity_type}_weight_sparsity" ] = _utils.get_global_sparsity_summaries(layer_sparsities, layer_numel) report["global"] = global_summaries return report _allowed_granularity_values = ["per_scalar", "per_kernel", "per_channel", "per_layer"] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/pruning/_base_pruning_method.py0000644000000000000000000003135314672066616026234 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import logging as _logging import types as _types from typing import Any as _Any from typing import Dict as _Dict from typing import NamedTuple as _NamedTuple from typing import Optional as _Optional from typing import cast as _cast import numpy as _np import torch as _torch import torch.nn.utils.prune as _prune import torch.utils.hooks as _hooks from coremltools.optimize.torch._typing import ParamsDict as _ParamsDict from coremltools.optimize.torch._utils.state_dict_utils import ( LoadStateDictPostHook as _LoadStateDictPostHook, ) from coremltools.optimize.torch.pruning._utils import block2_sparsity as _block2_sparsity from coremltools.optimize.torch.pruning._utils import structured_sparsity as _structured_sparsity from coremltools.optimize.torch.pruning._utils import ( unstructured_sparsity as _unstructured_sparsity, ) _logger = _logging.getLogger(__name__) class BaseDynamicPruningMethod(_prune.BasePruningMethod): """ Extension of PyTorch's native pruning infra for seamless model export and progressive sparsity schedules This class works by registering itself as a forward pre-hook into each prune-able `nn.Module` to apply the pruning mask """ _tensor_name: str scheduled: bool def update_mask(self, module: _torch.nn.Module, scheduled_value: float) -> None: raise NotImplementedError() def bind_module(self, module: _torch.nn.Module) -> None: module.pruning_method = self # type: ignore orig_get_state = getattr(module, "__getstate__", None) # Override state method of module instance to exclude the non-leaf tensor # which is neither a parameter nor a buffer # See: https://discuss.pytorch.org/t/using-nn-utils-prune-causes-torch-tensor-deepcopy-to-fail/107470 def __getstate__(self: _torch.nn.Module) -> _Dict[str, _Any]: if orig_get_state is not None: state: _Dict[str, _Any] = orig_get_state() else: state = dict(self.__dict__) if hasattr(self, "pruning_method"): pruner = _cast(BaseDynamicPruningMethod, self.pruning_method) if pruner._tensor_name in state: state[pruner._tensor_name] = None return state module.__getstate__ = _types.MethodType(__getstate__, module) # type: ignore[assignment] @classmethod def from_module_and_params( cls, module: _torch.nn.Module, param_name: str = "weight", **params: _ParamsDict ) -> "BaseDynamicPruningMethod": """ Factory method of this class that is tied to a particular nn.Module """ pruning_method: BaseDynamicPruningMethod pruning_method = super(BaseDynamicPruningMethod, cls).apply( module, name=param_name, **params ) pruning_method.bind_module(module) return pruning_method def _remove_impl(self, module: _torch.nn.Module, fuse_pruning_mask: bool) -> None: assert self._tensor_name is not None # Restore the (pruned) tensor under its original name orig = module._parameters[self._tensor_name + "_orig"] assert orig is not None if fuse_pruning_mask: pruned_orig = None if self.scheduled: current_mask = module._buffers[self._tensor_name + "_mask"] assert current_mask is not None current_amount = self.infer_sparsity_amount_from_external_mask( current_mask ) # may have been loaded from ckpt and current_amount != self.amount: # self.amount may be # out-of-sync with the ckpt if hasattr(self, "amount") and not _np.isclose( current_amount, self.amount, rtol=1 / orig.numel() ): _logger.warning( f"Pruning method {self.__class__}'s sparsity schedule state ({self.amount}) is inconsistent " f"with pruning mask's current 
state ({current_amount}). This is probably harmless " f"if you are exporting a pruned model" ) # We have detected an inconsistent state so we correct for this by updating the # pruning method's schedule. This correction will ensure the following `self._apply_mask_impl` # call to use the correct self.amount self.update_mask(module, current_amount) pruned_orig = current_mask.to(orig.dtype) * orig if pruned_orig is None: pruned_orig = self._apply_mask_impl(module) orig.data = pruned_orig.data setattr(module, self._tensor_name, orig) del module._parameters[self._tensor_name + "_orig"] del module._buffers[self._tensor_name + "_mask"] def remove(self, module: _torch.nn.Module, fuse_pruning_mask: bool = True) -> _torch.nn.Module: """Removes pruning masks and forward_pre_hooks from the module If `fuse_pruning_mask` is True, then weights are fused with the pruning mask before re-registering the weights under the original name """ name = self._tensor_name for k, hook in module._forward_pre_hooks.items(): if isinstance(hook, BaseDynamicPruningMethod) and hook._tensor_name == name: self._remove_impl(module, fuse_pruning_mask) del module._forward_pre_hooks[k] if hasattr(module, "pruning_method"): delattr(module, "pruning_method") return module raise ValueError( f"Parameter '{name}' of module {module} has to be pruned " f"before pruning can be removed." ) def _apply_mask_impl(self, module: _torch.nn.Module) -> _torch.Tensor: # Identical to prune.BasePruningMethod.apply_mask as the default method for fusing weights and masks # Exposed to allow overriding by complex pruning algorithms assert self._tensor_name is not None, "Module {} has to be pruned".format(module) mask = getattr(module, self._tensor_name + "_mask") orig = getattr(module, self._tensor_name + "_orig") pruned_tensor: _torch.Tensor = mask.to(dtype=orig.dtype) * orig return pruned_tensor def apply_mask(self, module: _torch.nn.Module) -> _torch.Tensor: return self._apply_mask_impl(module) def infer_sparsity_amount_from_external_mask(self, external_mask: _torch.Tensor) -> float: """ Infer the sparsity amount from a given binary mask based on the granularity configuration of the pruning method """ if hasattr(self, "granularity"): # rank 2: torch.Linear, rank 3: torch.Conv1d, rank 4: torch.Conv2d, rank 5: torch.Conv3d rank = len(external_mask.shape) if self.granularity == "per_scalar" or rank == 2: return external_mask.eq(0).float().mean().item() elif rank in [3, 4, 5]: if self.granularity == "per_kernel": start_dim = 2 elif self.granularity == "per_channel": start_dim = 1 else: raise ValueError( f"Can not infer sparsity amount for granularity: {self.granularity}" ) return external_mask.flatten(start_dim).eq(0).all(-1).float().mean().item() else: raise ValueError(f"weights tensor rank must be in [2, 3, 4, 5], got {rank}") def get_sparsity_summary(self, module: _torch.nn.Module) -> _Dict[str, _torch.tensor]: """ Returns summary of the current state of pruning of module, indexed with name. 
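The summary contains ``structured_weight_sparsity``, ``unstructured_weight_sparsity`` and ``block2_weight_sparsity`` entries for the pruned parameter; ``block2_weight_sparsity`` is reported as ``-1`` when the number of output channels is odd, since blocks of two channels cannot be formed.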
""" assert self._tensor_name is not None, "Module {} has not been pruned".format(module) weight: _torch.Tensor = getattr(module, self._tensor_name).detach() if hasattr(module, "weight_fake_quant") and hasattr(module.weight_fake_quant, "palettize"): weight = module.weight_fake_quant.palettize(weight) summary = { "structured_weight_sparsity": _structured_sparsity(weight), "unstructured_weight_sparsity": _unstructured_sparsity(weight), } if weight.size(0) % 2 == 0: summary["block2_weight_sparsity"] = _block2_sparsity(weight) else: summary["block2_weight_sparsity"] = -1 # Not applicable return summary class _SyncScheduledValueLoadStateDictPostHook(_LoadStateDictPostHook): def __init__(self, scheduled_value_name: str): super().__init__() self._scheduled_value_name = scheduled_value_name def __call__(self, module: _torch.nn.Module, incompatible_keys: _NamedTuple) -> None: if hasattr(module, "pruning_method"): pruning_method: ScheduledBaseDynamicPruningMethod = module.pruning_method assert hasattr(pruning_method, "_tensor_name"), ( f"state_dict cannot be loaded. Attribute _tensor_name " f"missing from pruning forward hook installed on the " f"module: {module}" ) assert hasattr(pruning_method, self._scheduled_value_name), ( f"state_dict cannot be loaded. Attribute {self._scheduled_value_name} " f"missing from pruning forward hook installed on the module {module}" ) scheduled_value_buffer_name = ( f"{pruning_method._tensor_name}_{self._scheduled_value_name}" ) assert hasattr(module, scheduled_value_buffer_name), ( f"state_dict cannot be loaded. Buffer {scheduled_value_buffer_name} " f"missing from module: {module}" ) scheduled_value = getattr(module, scheduled_value_buffer_name) # set pruning method amount to be the same as the value from state dict if isinstance(scheduled_value, _torch.Tensor): scheduled_value = scheduled_value.data.item() setattr(pruning_method, self._scheduled_value_name, scheduled_value) class ScheduledBaseDynamicPruningMethod(BaseDynamicPruningMethod): """ An extension of BaseDynamicPruningMethod for scheduled pruners where the pruning amount is changed externally over the course of the training. 
""" def __init__(self, scheduled_value: _Any, scheduled_value_name: str, **kwargs: _ParamsDict): super().__init__() self.scheduled_value_name = scheduled_value_name setattr(self, scheduled_value_name, scheduled_value) self.sync_scheduled_value_post_hook_handle: _Optional[_hooks.RemovableHandle] = None def bind_module(self, module: _torch.nn.Module) -> None: super().bind_module(module) param_tensor = getattr(module, self._tensor_name + "_orig") scheduled_value = getattr(self, self.scheduled_value_name) scheduled_value_tensor = _torch.tensor(scheduled_value, device=param_tensor.device) module.register_buffer( self._tensor_name + "_" + self.scheduled_value_name, scheduled_value_tensor, ) self.sync_scheduled_value_post_hook_handle = module.register_load_state_dict_post_hook( _SyncScheduledValueLoadStateDictPostHook(self.scheduled_value_name) ) def update_mask(self, module: _torch.nn.Module, scheduled_value: float) -> None: assert self._tensor_name is not None assert self.scheduled # Get the original non-pruned parameter tensor orig = getattr(module, self._tensor_name + "_orig") assert ( orig is not None ), "Must have called apply() to initialize pruning before calling update_mask()" # Update scheduled value setattr(self, self.scheduled_value_name, scheduled_value) # keep scheduled value buffer in sync scheduled_value_tensor: _torch.Tensor = getattr( module, self._tensor_name + "_" + self.scheduled_value_name ) scheduled_value_tensor.fill_(scheduled_value) # Update the mask with the new amount module.register_buffer( self._tensor_name + "_mask", self.compute_mask(orig, default_mask=None), ) def _remove_impl(self, module: _torch.nn.Module, fuse_pruning_mask: bool) -> None: super()._remove_impl(module, fuse_pruning_mask) del module._buffers[self._tensor_name + "_" + self.scheduled_value_name] if self.sync_scheduled_value_post_hook_handle is not None: self.sync_scheduled_value_post_hook_handle.remove() self.sync_scheduled_value_post_hook_handle = None ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/pruning/_utils.py0000644000000000000000000002216414672066616023360 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import logging as _logging from typing import List as _List from typing import Optional as _Optional from typing import Tuple as _Tuple from typing import cast as _cast import torch as _torch from coremltools.optimize.torch._utils.metadata_utils import ( CompressionMetadata as _CompressionMetadata, ) logger = _logging.getLogger(__name__) def lerp(v0, v1, t): return v0 + (v1 - v0) * t def spline(v0, v1, t, power): one_m_t = 1.0 - t x = one_m_t**power return lerp(v1, v0, x) def magnitude_ranked_mask( weights: _torch.Tensor, sparsity_fraction: float, block_size: int, granularity: str ) -> _torch.Tensor: """ Compute a binary mask for pruning based on magnitude-based ranking If granularity is `per_scalar`, L1 norm is used. 
L2 is used otherwise """ shape = weights.shape rank = len(shape) # rank 1: flattened global unstructured weights, rank 2: torch.Linear, rank 3: torch.Conv1d, # rank 4: torch.Conv2d, rank 5: torch.Conv3d assert rank in [1, 2, 3, 4, 5], f"weights tensor rank must be in [1, 2, 3, 4, 5], got {rank}" if granularity == "per_scalar" or rank == 2: magnitude_map = weights.abs() nb_weight_components = weights.numel() elif rank in [3, 4, 5]: if granularity == "per_kernel": start_dim = 2 nb_weight_components = shape[0] * shape[1] elif granularity == "per_channel": start_dim = 1 nb_weight_components = shape[0] else: raise ValueError(f"Unsupported granularity for magnitude_ranked_mask: {granularity}") # Compute L2 norm per weight slice (as defined by the granularity) magnitude_map = _torch.norm(weights.flatten(start_dim), dim=-1) for _ in range(rank - start_dim): magnitude_map = magnitude_map.unsqueeze(-1) if block_size > 1: ch_shape = shape[0] if ch_shape % block_size != 0: # Since the number of channels isn't divisible by block size, # we shall pad the channels so that it is divisible pad_shape = list(magnitude_map.shape) pad_shape[0] = block_size - ch_shape % block_size magnitude_map = _torch.cat( [magnitude_map, _torch.zeros(pad_shape, device=magnitude_map.device)], dim=0 ) ch_shape = magnitude_map.shape[0] assert ch_shape % block_size == 0 # Reshape to expose the "block" sub-axis s = list(magnitude_map.shape) # block exposed shape s.insert(1, block_size) s[0] = int(s[0] / block_size) f = [-1] * len(s) # expand factors to recover orig shape f[1] = block_size magnitude_map = ( magnitude_map.view(s) .pow(2) .sum(1, keepdim=True) .sqrt() .expand(f) .contiguous() .view(magnitude_map.shape) ) # Reshape to original shape in case of padding magnitude_map = magnitude_map[: shape[0]] nb_nonzero = _torch.ceil( _torch.as_tensor(nb_weight_components, dtype=_torch.float32) * (1 - sparsity_fraction) ).int() # handle special case when sparsity_fraction = 1.0 if nb_nonzero == 0: thr = 1.0 + magnitude_map.flatten().max() else: thr = ( magnitude_map.flatten().sort()[0].flip(0)[nb_nonzero - 1] ) # produces same mask for 1.0 and 0.0 sparsity mask = _torch.greater_equal(magnitude_map, thr) return mask def n_m_mask(weights: _torch.Tensor, nm: _Tuple[int, int], dim: _Optional[int] = 1): """ Create a n:m sparsity mask. """ shape = weights.shape permuted_shape = shape rank = len(shape) num_zeros, block_size = nm mask_value = 0.0 assert num_zeros < block_size, ( f"n (number of zeros) = {num_zeros} must be " f"less than m (block size) = {block_size}" ) assert dim in [0, 1], ( f"n:m mask is supported along dimensions (0, 1), " f"corresponding to input and output channels. 
Received " f"dim = {dim}" ) # rank 2: torch.Linear, rank 3: torch.Conv1d, # rank 4: torch.Conv2d, rank 5: torch.Conv3d assert rank in [2, 3, 4, 5], f"weights tensor rank must be in [2, 3, 4, 5], got {rank}" # num_non_zeros = block_size - num_zeros # if n:m is required along C_o, flip C_i and C_o if dim == 0: weights = _torch.permute(weights, [1, 0] + list(range(2, rank))) # transform to A x C_i # For Conv1D: C_o x C_i x H ==> H x C_o x C_i ==> H*C_o x C_i # For Conv2D: C_o x C_i x H x W ==> H x W x C_o x C_i ==> H*W*C_o x C_i # For Conv3D: C_o x C_i x H x W x D ==> H x W x D x C_o x C_i ==> H*W*D*C_o x C_i if rank > 2: permute_array = list(range(2, rank)) + [0, 1] weights = _torch.permute(weights, permute_array) permuted_shape = weights.shape weights = _torch.reshape(weights, (-1, weights.shape[-1])) abs_weights = weights.abs() padding_size = block_size - abs_weights.shape[-1] % block_size abs_weights_pad = _torch.nn.functional.pad(abs_weights, (0, padding_size), mode="constant") num_blocks = abs_weights_pad.numel() // block_size weights_blocks = abs_weights_pad.view(num_blocks, block_size) indices = _torch.argsort(weights_blocks, dim=1)[:, :num_zeros] sparsity_mask = _torch.ones([num_blocks, block_size], device=weights.device) sparsity_mask.scatter_(dim=1, index=indices, value=mask_value) sparsity_mask = sparsity_mask.view(abs_weights_pad.shape) sparsity_mask = sparsity_mask[:, : abs_weights.shape[-1]] # revert changes to mask shape to achieve same size as original weight if rank > 2: sparsity_mask = _torch.reshape(sparsity_mask, permuted_shape) permute_array = [rank - 2, rank - 1] + list(range(0, rank - 2)) sparsity_mask = _torch.permute(sparsity_mask, permute_array) if dim == 0: sparsity_mask = _torch.permute(sparsity_mask, [1, 0] + list(range(2, rank))) return sparsity_mask def block2_sparsity(weight: _torch.Tensor) -> _torch.Tensor: n = weight.size(0) assert n % 2 == 0 return weight.flatten(1).view(n // 2, 2, -1).sum(1).eq(0.0).float().mean().item() def structured_sparsity(weight: _torch.Tensor) -> _torch.Tensor: return weight.flatten(1).sum(1).eq(0.0).float().mean().item() def unstructured_sparsity(weight: _torch.Tensor) -> _torch.Tensor: return weight.eq(0.0).float().mean().item() def get_global_sparsity_summaries( layer_sparsities: _List[_torch.Tensor], layer_numel: _List[int] ) -> float: assert len(layer_sparsities) == len(layer_numel) weighted_sum, denom = 0.0, 0.0 for sparsity, numel in zip(layer_sparsities, layer_numel): if sparsity >= 0.0: denom += numel weighted_sum += numel * _cast(float, sparsity) if _torch.all(_torch.tensor(layer_sparsities) < 0): # to indicate the sparsity type is not applicable return -1 assert denom > 0.0 return weighted_sum / denom def validate_allowed_granularity_values(instance, attribute, value): if value is None: return allowed_values = ["per_scalar", "per_kernel", "per_channel", "per_layer"] if value not in allowed_values: raise ValueError( f"Allowed values for granularity are: {', '.join(allowed_values)}. 
" f"Received: {value}" ) def is_quantized_module(module): """ Check if a module has been quantized by inserting torch.ao.quantization.FakeQuantize layers """ return hasattr(module, "weight_fake_quant") and not hasattr( module.weight_fake_quant, "fake_palett_enabled" ) def is_palettized_module(module): """ Check if a module has been palettized by inserting coremltools.optimize.torch.palettization.FakePalettize layers """ return hasattr(module, "weight_fake_quant") and hasattr( module.weight_fake_quant, "fake_palett_enabled" ) def get_joint_pruned_quantized_submodule(module, supported_modules): """ Given a quantized module, find the submodule that supports pruning """ if isinstance(module, supported_modules): return module for submodule in module.children(): if isinstance(submodule, supported_modules): return submodule return None def register_compression_metadata(submodule, pruner_info, supported_modules): config = pruner_info.config compression_type = ["pruning"] # Identify joint compression cases if is_quantized_module(pruner_info.module): compression_type += ["quantization"] submodule = get_joint_pruned_quantized_submodule(submodule, supported_modules) elif is_palettized_module(pruner_info.module): compression_type += ["palettization"] param_name = config.param_name metadata = _CompressionMetadata(param_name) metadata.compression_type = compression_type metadata.register(submodule, override_compression_type=(len(compression_type) > 1)) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/pruning/magnitude_pruner.py0000644000000000000000000005735114672066616025437 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy as _copy import logging as _logging from collections import OrderedDict as _OrderedDict from typing import Any as _Any from typing import Callable as _Callable from typing import Dict as _Dict from typing import NewType as _NewType from typing import Optional as _Optional from typing import Tuple as _Tuple from typing import Union as _Union import attrs as _attrs import cattrs as _cattrs import torch as _torch from attr import define as _define from attr import field as _field from attrs import validators as _validators from coremltools.optimize.torch._typing import ParamsDict as _ParamsDict from coremltools.optimize.torch.optimization_config import ( ModuleOptimizationConfig as _ModuleOptimizationConfig, ) from coremltools.optimize.torch.optimization_config import OptimizationConfig as _OptimizationConfig from coremltools.optimize.torch.optimization_config import ( _structure_from_dict_hook_factory, _validate_module_type_keys_factory, ) from coremltools.optimize.torch.pruning._base_pruner import ( BasePrunerWithPruningMethod as _BasePrunerWithPruningMethod, ) from coremltools.optimize.torch.pruning._base_pruner import _allowed_granularity_values from coremltools.optimize.torch.pruning._base_pruning_method import ( ScheduledBaseDynamicPruningMethod as _ScheduledBaseDynamicPruningMethod, ) from coremltools.optimize.torch.pruning._utils import ( magnitude_ranked_mask as _magnitude_ranked_mask, ) from coremltools.optimize.torch.pruning._utils import n_m_mask as _n_m_mask from coremltools.optimize.torch.pruning.pruning_scheduler import ( ConstantSparsityScheduler as _ConstantSparsityScheduler, ) from 
coremltools.optimize.torch.pruning.pruning_scheduler import ( PruningScheduler as _PruningScheduler, ) from coremltools.optimize.torch.pruning.pruning_scheduler import _PruningSchedulerType _logger = _logging.getLogger(__name__) _SUPPORTED_MODULES = (_torch.nn.Linear, _torch.nn.Conv1d, _torch.nn.Conv2d, _torch.nn.Conv3d) @_define class ModuleMagnitudePrunerConfig(_ModuleOptimizationConfig): """ Configuration class for specifying global and module level pruning options for magnitude pruning algorithm implemented in :py:class:`MagnitudePruner`. This class supports four different modes of sparsity: 1. **Unstructured sparsity**: This is the default sparsity mode used by :py:class:`MagnitudePruner`. It is activated when ``block_size = 1``, ``n_m_ratio = None`` and ``granularity = per_scalar``. In this mode, the ``n`` weights with the lowest absolute values are set to 0, where ``n = floor(size_of_weight_tensor * target_sparsity)``. For example, given the following: * ``weight = [0.3, -0.2, -0.01, 0.05]`` * ``target_sparsity = 0.75`` The pruned weight would be ``[0.3, 0, 0, 0]`` 2. **Block structured sparsity**: This mode is activated when ``block_size > 1`` and ``n_m_ratio = None``. In this mode, the weight matrix is first reshaped to a rank 2 matrix by folding all dimensions ``>= 1`` into a single dimension. Then, blocks of size ``block_size`` along the ``0-th`` dimension, which have the lowest ``L2`` norm, are set to 0. The number of blocks which are zeroed out is determined by the ``target_sparsity`` parameter. The blocks are chosen in a non-overlapping fashion. For example: .. code-block:: python # Given a 4 x 2 weight with the following value, and block_size = 2. [ [1, 3], [-6, -7], [0, 3], [-9, 2], ] # L2 norm is computed along the 0-th dimension for blocks of size 2: [ [6.08, 7.62], [9.00, 3.61], ] # Then the smallest values are picked to prune. So if target_sparsity = 0.5, # then the blocks that will be pruned will be with ones with L2 norm values # of 6.08 and 3.61. And hence, the elements in the first and third # block are pruned. The final pruned tensor is: [ [0, 3], [0, -7], [0, 0], [-9, 0], ] 3. **n:m structured sparsity**: This mode is activated when ``n_m_ratio != None``. Similar to block structured sparsity, in this mode, the weight matrix is reshaped to a rank 2 matrix. Then, out of non-overlapping blocks of size ``m`` along the ``0-th`` or ``1-st`` dimension, the ``n`` elements with the smallest absolute value are set to 0. The dimension along which the blocks are chosen is controlled by the ``dim`` parameter and it defaults to ``1``. For linear layers, ``dim = 1`` and ratios where ``m`` is a factor of 16 (e.g. ``3:4``, ``7:8`` etc.) are recommended to get latency gains for models executing specifically on the CPU. For example: .. code-block:: python # Given a 4 x 4 weight of [ [3, 4, 7, 6], [1, 8, -3, -8], [-2, -3, -4, 0], [5, 4, -3, -2], ] # For n_m_ratio = (1, 2) with dim = 1 (default), the resulting pruned weight is [ [0, 4, 7, 0], [0, 8, 0, -8], [0, -3, -4, 0], [5, 0, -3, 0], ] 4. **General structured sparsity**: This mode is activated when ``granularity`` is set to one of ``per_channel`` or ``per_kernel``. It only applies to weights of ``rank >= 3``. For example, a rank 4 weight matrix of shape ``[C_o x C_i x H x W]`` can be thought of as ``C_o`` matrices of shape ``[C_i x H X W]`` or ``C_o*C_i`` matrices of size ``[H x W]``. 
``per_channel`` granularity sets some of the ``[C_i x H X W]`` matrices to 0 whereas ``per_kernel`` granularity sets some of the ``[H x W]`` matrices to 0. When granularity is ``per_channel``, the weight matrix is reshaped to a rank 2 matrix, where all dimensions ``>= 1`` are folded into a single dimension. Then ``L2`` norm is computed for all rows and the weights corresponding to ``n`` smallest ``L2`` norm rows are set to 0 to achieve ``target_sparsity``. For example: .. code-block:: python # Given a 2 x 2 x 1 x 2 weight, granularity = per_channel, [ [ [[2, -1]], [[-3, 2]], ], [ [[5, -2]], [[-1, -3]], ], ] # It is first reshaped to shape 2 x 4, i.e.: [ [2, -1, -3, 2], [5, -2, -1, -3], ] # Then L2 norm is computed for each row of the matrix: [4.2426, 6.2450] # Finally, to achieve target sparsity = 0.5, since the first element is # smaller, the corresponding row is set to 0, resulting in the pruned weight: [ [ [[0, 0]], [[0, 0]], ], [ [[5, -2]], [[-1, -3]], ], ] When granularity is ``per_kernel``, the weight matrix is reshaped to a rank 3 matrix, where all dimensions ``>= 2`` are folded into a single dimension. Then ``L2`` norm is computed for all vectors along the last dimension, ``dim = 2`` and the weights corresponding to the ``n`` smallest ``L2`` norm vectors are set to 0 to achieve ``target_sparsity``. For the same example as before, setting granularity ``per_kernel`` will achieve: .. code-block:: python # The original 2 x 2 x 1 x 2 weight matrix is reshaped into shape 2 x 2 x 2, i.e.: [ [[2, -1], [-3, 2]], [[5, -2], [-1, -3]], ] # Then L2 norm is computed for each of the 4 vectors of size 2, [2, -1], [-3, 2], etc.: [ [2.2361, 3.6056], [5.3852, 3.1623], ] # Finally, to achieve target sparsity = 0.5, since the first and last elements are # smallest, the corresponding row in the weights is set to 0, # resulting in the pruned weight: [ [ [[0, 0]], [[-3, 2]], ], [ [[5, -2]], [[0, 0]], ], ] Args: scheduler (:py:class:`PruningScheduler`): A pruning scheduler which specifies how the sparsity should be changed over the course of the training. Defaults to constant sparsity scheduler which sets the sparsity to ``target_sparsity`` at step ``0``. initial_sparsity (:obj:`float`): Desired fraction of zeroes at the beginning of the training process. Defaults to ``0.0``. target_sparsity (:obj:`float`): Desired fraction of zeroes at the end of the training process. Defaults to ``0.5``. granularity (:obj:`str`): Specifies the granularity at which the pruning mask will be computed. Can be one of ``per_channel``, ``per_kernel`` or ``per_scalar``. Defaults to ``per_scalar``. block_size (:obj:`int`): Block size for inducing block sparsity within the mask. This is applied on the output channel dimension of the parameter (the ``0`` -th dimension). Having larger block size may be beneficial for latency compared to smaller block sizes, for models running on certain compute units such as the neural engine. ``block_size`` must be greater than ``1`` to enable block sparsity, and must be at most half the number of output channels. When the number of output channels is not divisible by the block size, the weight matrix is padded with zeros to compute the pruning mask and then un-padded to the original size. Defaults to ``1``. n_m_ratio (:obj:`tuple` of :obj:`int`): A tuple of two integers which specify how ``n:m`` pruning should be applied. In ``n:m`` pruning, out of every ``m`` elements, ``n`` with lowest magnitude are set to zero. 
When ``n_m_ratio`` is not ``None``, ``block_size``, ``granularity``, and ``initial_sparsity`` should be ``1``, ``per_scalar``, and ``0.0`` respectively. The value of ``target_sparsity`` is ignored and the actual target sparsity is determined by the ``n:m`` ratio. For more information, see `Learning N:M Fine-Grained Structured Sparse Neural Networks From Scratch `_. Defaults to ``None``, which means ``n:m`` sparsity is not used. dim (:obj:`int`): Dimension along which blocks of ``m`` elements are chosen when applying ``n:m`` sparsity. This parameter is only used when ``n_m_ratio`` is not ``None``. Defaults to ``1``. param_name (:obj:`str`): The name of the parameter to be pruned. Defaults to ``weight``. """ scheduler: _PruningSchedulerType = _field( default=_ConstantSparsityScheduler(begin_step=0), validator=_validators.instance_of(_PruningScheduler), ) initial_sparsity: float = _field(default=0.0, validator=_validators.instance_of(float)) target_sparsity: float = _field(default=0.5, validator=_validators.instance_of(float)) granularity: str = _field( default="per_scalar", validator=[_validators.instance_of(str), _validators.in_(_allowed_granularity_values)], ) block_size: int = _field(default=1, validator=_validators.instance_of(int)) n_m_ratio: _Optional[_Tuple[int, int]] = _field( default=None, validator=_attrs.validators.optional( _validators.deep_iterable( member_validator=_validators.instance_of(int), iterable_validator=_validators.instance_of((tuple, list)), ) ), ) dim: int = _field(default=1, validator=_validators.instance_of(int)) param_name: str = _field(default="weight", validator=_validators.instance_of(str)) def __attrs_post_init__(self): if self.n_m_ratio is not None: assert ( len(self.n_m_ratio) == 2 ), f"n_m_ratio must be a tuple of 2 integers, received: {self.n_m_ratio}" n, m = self.n_m_ratio assert m > 0, f"Received n_m_ratio (n, m): {self.n_m_ratio}. m must be greater than 0." assert n <= m, ( f"Received n_m_ratio (n, m): {self.n_m_ratio}. The number of zero in a block (n) " f"must be less than or equal to the block size (m)." ) if self.block_size is not None and self.block_size > 1: raise ValueError( f"Received block_size = {self.block_size} and n_m_ratio = {self.n_m_ratio}. " f"These two modes are mutually exclusive. When n_m_ratio != None, " f"the only allowed value of block_size is 1. " f"n_m_ratio should be equal to None for block_size > 1." ) if self.granularity is not None and self.granularity != "per_scalar": raise ValueError( f"Received granularity = {self.granularity} and n_m_ratio = {self.n_m_ratio}. " f"When n_m_ratio != None, the only allowed value of granularity is " f"per_scalar." ) _ModuleTypeConfigType = _NewType( "ModuleTypeConfigType", _Dict[_Union[_Callable, str], _Optional[ModuleMagnitudePrunerConfig]], ) @_define class MagnitudePrunerConfig(_OptimizationConfig): """ Configuration class for specifying how different submodules in a model are pruned by :py:class:`MagnitudePruner`. Args: global_config (:py:class:`ModuleMagnitudePrunerConfig`): Config to be applied globally to all supported modules. Missing values are chosen from the default config. module_type_configs (:obj:`dict` of :obj:`str` to :py:class:`ModuleMagnitudePrunerConfig`): Module type level configs applied to a specific module class, such as :py:class:`torch.nn.Linear`. The keys can be either strings or module classes. If ``module_type_config`` is set to ``None`` for a module type, it wouldn't get pruned. 
module_name_configs (:obj:`dict` of :obj:`str` to :py:class:`ModuleMagnitudePrunerConfig`): Module level configs applied to specific modules. The name of the module must be a fully qualified name that can be used to fetch it from the top level module using the ``module.get_submodule(target)`` method. If ``module_name_config`` is set to ``None`` for a module, it wouldn't get pruned. """ global_config: _Optional[ModuleMagnitudePrunerConfig] = _field( default=None, validator=_validators.optional(_validators.instance_of(ModuleMagnitudePrunerConfig)), ) module_type_configs: _ModuleTypeConfigType = _field( factory=_OrderedDict, validator=_validators.deep_mapping( key_validator=_validators.and_( _validators.instance_of((str, _Callable)), _validate_module_type_keys_factory(_SUPPORTED_MODULES), ), value_validator=_validators.optional( _validators.instance_of(ModuleMagnitudePrunerConfig) ), mapping_validator=_validators.instance_of(dict), ), ) module_name_configs: _Dict[str, _Optional[ModuleMagnitudePrunerConfig]] = _field( factory=_OrderedDict, validator=_validators.deep_mapping( key_validator=_validators.instance_of(str), value_validator=_validators.optional( _validators.instance_of(ModuleMagnitudePrunerConfig) ), mapping_validator=_validators.instance_of(dict), ), ) def __attrs_post_init__(self): if ( self.global_config is None and len(self.module_type_configs) == 0 and len(self.module_name_configs) == 0 ): self.global_config = ModuleMagnitudePrunerConfig() @classmethod def from_dict(cls, config_dict: _Dict[str, _Any]) -> "MagnitudePrunerConfig": super().from_dict(config_dict) converter = _cattrs.Converter(forbid_extra_keys=True) converter.register_structure_hook( _ModuleTypeConfigType, _structure_from_dict_hook_factory(ModuleMagnitudePrunerConfig), ) return converter.structure_attrs_fromdict(config_dict, cls) class _MagnitudePruningMethod(_ScheduledBaseDynamicPruningMethod): """ Magnitude-based static mask pruning method as described in `To prune, or not to prune: exploring the efficacy of pruning for model compression `_ """ _tensor_name: str scheduled: bool = True amount: float def __init__( self, amount: float, block_size: int, granularity: str, n_m_ratio: _Optional[_Tuple[int, int]] = None, dim: _Optional[int] = None, **kwargs: _ParamsDict, ): super().__init__(scheduled_value=amount, scheduled_value_name="amount") self.block_size = block_size self.granularity = granularity self.n_m_ratio = n_m_ratio self.dim = dim def compute_mask(self, t: _torch.Tensor, default_mask: _torch.Tensor) -> _torch.Tensor: if self.n_m_ratio is not None: _, block_size = self.n_m_ratio num_zeros = int(self.amount * block_size) if num_zeros == 0: # when number of zeros is < 0, we increase sparsity gradually return _magnitude_ranked_mask(t, self.amount, 1, self.granularity).float() else: return _n_m_mask(t, (num_zeros, block_size), self.dim).float() else: return _magnitude_ranked_mask(t, self.amount, self.block_size, self.granularity).float() @_define class _MagnitudePrunerInfo: config: ModuleMagnitudePrunerConfig module: _torch.nn.Module sparsity_level: float class MagnitudePruner(_BasePrunerWithPruningMethod): """ A pruning algorithm based on `To prune, or not to prune: exploring the efficacy of pruning for model compression `_. It extends the idea in the paper to different kinds of structured sparsity modes, in addition to unstructured sparsity. 
In order to achieve the desired sparsity, this algorithm sorts a module's weight matrix by the magnitude of its elements, and sets all elements less than a threshold to zero. Four different modes of sparsity are supported, encompassing both structured and unstructured sparsity. For details on how to select these different sparsity modes, please see :py:class:`ModuleMagnitudePrunerConfig`. Example: .. code-block:: python import torch from collections import OrderedDict from coremltools.optimize.torch.pruning import MagnitudePruner, MagnitudePrunerConfig # define model and loss function model = torch.nn.Sequential( OrderedDict( [ ("conv1", torch.nn.Conv2d(3, 32, 3, padding="same")), ("conv2", torch.nn.Conv2d(32, 32, 3, padding="same")), ] ) ) loss_fn = define_loss() # define the loss function # initialize pruner and configure it # we only prune the first conv layer config = MagnitudePrunerConfig.from_dict( { "module_name_configs": { "conv1": { "scheduler": {"update_steps": [3, 5, 7]}, "target_sparsity": 0.75, "granularity": "per_channel", }, } } ) pruner = MagnitudePruner(model, config) # insert pruning layers in the model model = pruner.prepare() for inputs, labels in data: output = model(inputs) loss = loss_fn(output, labels) loss.backward() optimizer.step() pruner.step() # commit pruning masks to model parameters pruner.finalize(inplace=True) Args: model (:py:class:`torch.nn.Module`): Model on which the pruner will act. config (:py:class:`MagnitudePrunerConfig`): Config which specifies how different submodules in the model will be configured for pruning. Default config is used when passed as ``None``. """ _supported_modules: _Tuple = _SUPPORTED_MODULES def __init__(self, model: _torch.nn.Module, config: _Optional[MagnitudePrunerConfig] = None): config = MagnitudePrunerConfig() if config is None else config super().__init__(model, config) def prepare(self, inplace: bool = False) -> _torch.nn.Module: if self._is_prepared: _logger.warning( "Model has already been prepared for pruning. This API call " "will be a no-op." ) return self._model self._model = super().prepare(inplace=inplace) for name, submodule in self._model.named_modules(remove_duplicate=True): submod_config = self._config.get_module_config(name, submodule) if isinstance(submodule, self._supported_modules) and submod_config is not None: submod_config = _copy.deepcopy(submod_config) if submod_config.n_m_ratio is not None: num_zeros, block_size = submod_config.n_m_ratio # Add target sparsity to make scheduler work submod_config.target_sparsity = float(num_zeros) / float(block_size) _MagnitudePruningMethod.from_module_and_params( submodule, param_name=submod_config.param_name, amount=submod_config.initial_sparsity, block_size=submod_config.block_size, granularity=submod_config.granularity, n_m_ratio=submod_config.n_m_ratio, dim=submod_config.dim, ) self._pruner_info[name] = _MagnitudePrunerInfo( config=submod_config, module=submodule, sparsity_level=submod_config.initial_sparsity, ) return self._model def step(self): if not self._is_prepared: _logger.warning( "Model has not been prepared for pruning. This API call " "will be a no-op. prepare method must be called before " "a call to the step method."
) return self._step_count += 1 for name, pruner_info in self._pruner_info.items(): if hasattr(pruner_info.module, "pruning_method"): sparsity_level = pruner_info.config.scheduler.compute_sparsity( self._step_count, prev_sparsity=pruner_info.sparsity_level, config=pruner_info.config, ) if sparsity_level != pruner_info.sparsity_level: pruner_info.module.pruning_method.update_mask( pruner_info.module, sparsity_level ) pruner_info.sparsity_level = sparsity_level ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/pruning/pruning_scheduler.py0000644000000000000000000001272314672066616025601 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from abc import ABC as _ABC from abc import abstractmethod as _abstractmethod from typing import Union as _Union import attr as _attr import torch as _torch from attr import define as _define from attr import field as _field from attrs import validators as _validators from coremltools.optimize.torch._utils.torch_utils import ( list_or_str_to_tensor as _list_or_str_to_tensor, ) from coremltools.optimize.torch.optimization_config import ( ModuleOptimizationConfig as _ModuleOptimizationConfig, ) from coremltools.optimize.torch.pruning._utils import spline as _spline @_define class PruningScheduler(_ABC): """ An abstraction for implementing schedules to be used for changing the sparsity of pruning masks applied by various types of pruning algorithms to module parameters over the course of the training. """ @_abstractmethod def compute_sparsity( self, step_count: int, prev_sparsity: float, config: _ModuleOptimizationConfig ) -> float: """ Compute the sparsity at the next step given the previous sparsity and the module optimization config. Args: step_count (:obj:`int`): Current step count. prev_sparsity (:obj:`float`): Sparsity at previous step. config (:py:class:`ModuleOptimizationConfig`): Optimization config for the module which contains information such as target sparsity and initial sparsity. """ raise NotImplementedError() @_define class PolynomialDecayScheduler(PruningScheduler): r""" A pruning scheduler inspired by the paper `"To prune or not to prune" `_. It sets the sparsity at step :math:`t` using the formula: .. math:: sparsity_t = target\_sparsity + (initial\_sparsity - target\_sparsity) * (1 - \frac{update\_index}{total\_number\_of\_updates}) ^ {power} If :math:`t` is in :math:`update\_steps`, else it keeps the sparsity at its previous value. Here, :math:`update\_index` is the index of :math:`t` in the :math:`update\_steps` array and :math:`total\_number\_of\_updates` is the length of :math:`update\_steps` array. Args: update_steps (:obj:`list` of :obj:`int` or :obj:`str`): The indices of optimization steps at which pruning should be performed. This can be passed in as a string representing the range, such as ``range(start_index, end_index, step_size)``. power (:obj:`int`, optional): Exponent to be used in the sparsity function. Defaults to ``3``. 
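For illustration, a minimal sketch of how the schedule behaves. The ``update_steps`` values and the import paths below are assumptions made for the example rather than requirements:

.. code-block:: python

    from coremltools.optimize.torch.pruning import ModuleMagnitudePrunerConfig
    from coremltools.optimize.torch.pruning.pruning_scheduler import (
        PolynomialDecayScheduler,
    )

    scheduler = PolynomialDecayScheduler(update_steps=[2, 4, 6], power=3)
    config = ModuleMagnitudePrunerConfig(initial_sparsity=0.0, target_sparsity=0.75)

    sparsity = 0.0
    for step in range(8):
        sparsity = scheduler.compute_sparsity(
            step, prev_sparsity=sparsity, config=config
        )
        # sparsity keeps its previous value except at steps 2, 4 and 6,
        # where it moves along the polynomial curve towards 0.75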
""" update_steps: _torch.tensor = _field( converter=_list_or_str_to_tensor, eq=_attr.cmp_using(eq=_torch.equal) ) power: int = _field(default=3, validator=_validators.instance_of(int)) @update_steps.validator def _check_update_steps(self, attribute, value): assert ( len(value.size()) == 1 ), f"update_steps: {value} must be a 1-D tensor or list of ints." for elem in value: if elem.int() != elem: raise ValueError(f"Each element of update_steps {value} must be an integer.") assert ( elem >= 0 ), f"All elements of update_steps must be non-negative. Received: {value}." def compute_sparsity( self, step_count: int, prev_sparsity: float, config: _ModuleOptimizationConfig ) -> float: cur_step_update_steps_mask = step_count == self.update_steps if _torch.any(cur_step_update_steps_mask): update_number = _torch.nonzero(cur_step_update_steps_mask, as_tuple=True)[0].item() update_step_shape = self.update_steps.shape[0] if update_step_shape == 1: t = 1.0 else: t = update_number / (update_step_shape - 1) initial_sparsity = ( config.initial_sparsity if hasattr(config, "initial_sparsity") else 0.0 ) assert hasattr(config, "target_sparsity"), ( f"Attribute target_sparsity not found in config {config}. " f"{self.__class__} only works with configs " f"which have this attribute." ) return _spline(initial_sparsity, config.target_sparsity, t, self.power) return prev_sparsity @_define class ConstantSparsityScheduler(PruningScheduler): """ A pruning schedule with constant sparsity throughout training. Sparsity is set to zero initially and to ``target_sparsity`` at step ``begin_step``. Args: begin_step (:obj:`int`): step at which to begin pruning. """ begin_step: int = _field(validator=_validators.instance_of(int)) def compute_sparsity( self, step_count: int, prev_sparsity: float, config: _ModuleOptimizationConfig ) -> float: if step_count >= self.begin_step: assert hasattr(config, "target_sparsity"), ( f"Attribute target_sparsity not found in config {config}. " f"{self.__class__} only works with configs " f"which have this attribute." ) return config.target_sparsity return prev_sparsity _PruningSchedulerType = _Union[PolynomialDecayScheduler, ConstantSparsityScheduler] ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2735472 coremltools-8.0/coremltools/optimize/torch/quantization/0000755000000000000000000000000014672075535022546 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/quantization/__init__.py0000644000000000000000000000370714672066616024666 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause """ .. _coremltools_optimize_torch_qat: Quantization refers to techniques for performing neural network computations in lower precision than floating point. Quantization can reduce a model’s size and also improve a model’s inference latency and memory bandwidth requirement, because many hardware platforms offer high-performance implementations of quantized operations. _`LinearQuantizer` ================== .. autoclass:: coremltools.optimize.torch.quantization.ModuleLinearQuantizerConfig :members: from_dict, as_dict, from_yaml .. autoclass:: coremltools.optimize.torch.quantization.LinearQuantizerConfig :members: set_global, set_module_type, set_module_name, from_dict, as_dict, from_yaml .. 
autoclass:: coremltools.optimize.torch.quantization.LinearQuantizer :members: prepare, step, report, finalize .. autoclass:: coremltools.optimize.torch.quantization.ObserverType .. autoclass:: coremltools.optimize.torch.quantization.QuantizationScheme _`PostTrainingQuantization` ============================ .. autoclass:: coremltools.optimize.torch.quantization.ModulePostTrainingQuantizerConfig :members: from_dict, as_dict, from_yaml .. autoclass:: coremltools.optimize.torch.quantization.PostTrainingQuantizerConfig :members: set_global, set_module_type, set_module_name, from_dict, as_dict, from_yaml .. autoclass:: coremltools.optimize.torch.quantization.PostTrainingQuantizer :members: compress """ from .quantization_config import ( LinearQuantizerConfig, ModuleLinearQuantizerConfig, ObserverType, QuantizationScheme, ) from .quantizer import LinearQuantizer from .post_training_quantization import ( ModulePostTrainingQuantizerConfig, PostTrainingQuantizer, PostTrainingQuantizerConfig, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/quantization/_annotation_config.py0000644000000000000000000000776014672066616026770 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import Optional as _Optional import torch as _torch import torch.ao.quantization as _aoquant from attr import define as _define from torch.ao.quantization.quantizer.quantizer import ( QuantizationSpec as _TorchQuantizationSpec, ) from coremltools.optimize.torch.quantization.quantization_config import ( ModuleLinearQuantizerConfig as _ModuleLinearQuantizerConfig, ) from coremltools.optimize.torch.quantization.quantization_config import ObserverType as _ObserverType from coremltools.optimize.torch.quantization.quantization_config import ( QuantizationScheme as _QuantizationScheme, ) @_define class AnnotationConfig: """ Module/Operator level configuration class for :py:class:`CoreMLQuantizer`. For each module/operator, defines the dtype, quantization scheme and observer type for input(s), output and weights (if any). """ input_activation: _Optional[_TorchQuantizationSpec] = None output_activation: _Optional[_TorchQuantizationSpec] = None weight: _Optional[_TorchQuantizationSpec] = None @staticmethod def _normalize_dtype(dtype: _torch.dtype) -> _torch.dtype: """ PyTorch export quantizer only supports uint8 and int8 data types, so we map the quantized dtypes to the corresponding supported dtype. 
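For illustration, a sketch of the mapping this helper performs; any dtype without a quantized counterpart is returned unchanged:

.. code-block:: python

    import torch

    AnnotationConfig._normalize_dtype(torch.quint8)   # -> torch.uint8
    AnnotationConfig._normalize_dtype(torch.qint8)    # -> torch.int8
    AnnotationConfig._normalize_dtype(torch.float32)  # -> torch.float32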
""" dtype_map = { _torch.quint8: _torch.uint8, _torch.qint8: _torch.int8, } return dtype_map.get(dtype, dtype) @classmethod def from_quantization_config( cls, quantization_config: _Optional[_ModuleLinearQuantizerConfig], ) -> _Optional["AnnotationConfig"]: """ Creates a :py:class:`AnnotationConfig` from ``ModuleLinearQuantizerConfig`` """ if ( quantization_config is None or quantization_config.weight_dtype == _torch.float32 ): return None # Activation QSpec if quantization_config.activation_dtype == _torch.float32: output_activation_qspec = None else: activation_qscheme = _QuantizationScheme.get_qscheme( quantization_config.quantization_scheme, is_per_channel=False, ) activation_dtype = cls._normalize_dtype( quantization_config.activation_dtype ) output_activation_qspec = _TorchQuantizationSpec( observer_or_fake_quant_ctr=_aoquant.FakeQuantize.with_args( observer=_ObserverType.get_observer( quantization_config.activation_observer, is_per_channel=False, ), dtype=activation_dtype, qscheme=activation_qscheme, ), dtype=activation_dtype, qscheme=activation_qscheme, ) # Weight QSpec weight_qscheme = _QuantizationScheme.get_qscheme( quantization_config.quantization_scheme, is_per_channel=quantization_config.weight_per_channel, ) weight_dtype = cls._normalize_dtype(quantization_config.weight_dtype) weight_qspec = _TorchQuantizationSpec( observer_or_fake_quant_ctr=_aoquant.FakeQuantize.with_args( observer=_ObserverType.get_observer( quantization_config.weight_observer, is_per_channel=quantization_config.weight_per_channel, ), dtype=weight_dtype, qscheme=weight_qscheme, ), dtype=weight_dtype, qscheme=weight_qscheme, ) return AnnotationConfig( input_activation=output_activation_qspec, output_activation=output_activation_qspec, weight=weight_qspec, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/quantization/_backend_config.py0000644000000000000000000007445614672066616026213 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import operator as _operator from typing import Any as _Any from typing import List as _List from typing import Set as _Set import torch as _torch import torch.ao.nn.qat as _nnq import torch.ao.nn.quantized.reference as _nnr import torch.nn as _nn import torch.nn.functional as _F import torch.nn.intrinsic as _nni import torch.nn.intrinsic.qat as _nniq from torch.ao.quantization.backend_config import BackendConfig as _BackendConfig from torch.ao.quantization.backend_config import BackendPatternConfig as _BackendPatternConfig from torch.ao.quantization.backend_config import DTypeWithConstraints as _DTypeWithConstraints import coremltools.optimize.torch.quantization.modules.conv_transpose as _qconv_transpose import coremltools.optimize.torch.quantization.modules.conv_transpose_fused as _qconv_transpose_fused import coremltools.optimize.torch.quantization.modules.fused_modules as _fused import coremltools.optimize.torch.quantization.modules.qat_modules as _qat import coremltools.optimize.torch.quantization.modules.quantized_modules as _quantized from coremltools.optimize.torch._utils.version_utils import is_torch_2 as _is_torch_2 from coremltools.optimize.torch.quantization._backend_config_utils import ( activation_configs as _activation_configs, ) from coremltools.optimize.torch.quantization._backend_config_utils import ( binary_op_act_configs as _binary_op_act_configs, ) from coremltools.optimize.torch.quantization._backend_config_utils import ( binary_op_configs as _binary_op_configs, ) from coremltools.optimize.torch.quantization._backend_config_utils import bn_relu as _bn_relu from coremltools.optimize.torch.quantization._backend_config_utils import ( share_observer_configs as _share_observer_configs, ) from coremltools.optimize.torch.quantization._backend_config_utils import ( weighted_act_configs as _weighted_act_configs, ) from coremltools.optimize.torch.quantization._backend_config_utils import ( weighted_bn_act_configs as _weighted_bn_act_configs, ) from coremltools.optimize.torch.quantization._backend_config_utils import ( weighted_bn_configs as _weighted_bn_configs, ) from coremltools.optimize.torch.quantization._backend_config_utils import ( weighted_bn_relu_configs as _weighted_bn_relu_configs, ) from coremltools.optimize.torch.quantization._backend_config_utils import ( weighted_configs as _weighted_configs, ) from coremltools.optimize.torch.quantization._backend_config_utils import ( weighted_relu_configs as _weighted_relu_configs, ) # module based activations _mod_activations = ( _nn.PReLU, _nn.RReLU, _nn.ReLU6, _nn.LeakyReLU, _nn.Sigmoid, _nn.LogSigmoid, _nn.Hardsigmoid, _nn.SiLU, _nn.ELU, _nn.CELU, _nn.SELU, _nn.GLU, _nn.Mish, _nn.GELU, _nn.Tanh, _nn.Hardtanh, _nn.Softmax, _nn.LogSoftmax, _nn.Hardswish, ) # functional activations _func_activations = ( _F.prelu, _F.rrelu_, _F.rrelu, _F.relu6, _F.leaky_relu, _F.leaky_relu_, _F.logsigmoid, _F.silu, _F.elu, _F.elu_, _F.celu, _F.celu_, _F.selu, _F.selu_, _F.glu, _F.mish, _F.gelu, _F.hardtanh, _F.hardtanh_, _F.log_softmax, _F.hardswish, ) # ReLU activations _relu_activations = ( _nn.ReLU, _F.relu, ) # layers which have a fixed output range and hence use fixed qparams _fixed_qparams_modules = { _torch.nn.Hardsigmoid, _torch.nn.functional.hardsigmoid, "hardsigmoid", "hardsigmoid_", _torch.nn.Sigmoid, _torch.sigmoid, "sigmoid", "sigmoid_", _torch.nn.Softmax, _torch.nn.Tanh, 
_torch.tanh, "tanh", "tanh_", } class _BackendConfigRegistry: """ A registry of quantization patterns. """ backend_config: _BackendConfig = _BackendConfig() supported_modules: _Set[_Any] = set() @classmethod def register(cls): def inner_wrapper(wrapped_fn): backend_pattern_configs: _List[_BackendPatternConfig] = wrapped_fn() for config in backend_pattern_configs: if not isinstance(config.pattern, tuple): cls.supported_modules.add(config.pattern) cls.backend_config.set_backend_pattern_configs(backend_pattern_configs) return wrapped_fn return inner_wrapper @_BackendConfigRegistry.register() def _conv1d_act() -> _List[_BackendPatternConfig]: """ float: Conv1d -> Act qat: FakeQuant -> qat.ConvAct1d -> FakeQuant """ configs = _weighted_relu_configs( mod=_nn.Conv1d, func_mod=_F.conv1d, fused_mod=_nni.ConvReLU1d, qat_mod=_nniq.ConvReLU1d, ref_quant_mod=_nnr.Conv1d, ) for act in _mod_activations: configs.extend( _weighted_act_configs( mod=_nn.Conv1d, func_mod=_F.conv1d, act=act, fused_mod=_fused.ConvAct1d, qat_mod=_qat.ConvAct1d, ref_quant_mod=_quantized.QuantizedConvAct1d, ) ) return configs @_BackendConfigRegistry.register() def _conv2d_act() -> _List[_BackendPatternConfig]: """ float: Conv2d -> Act qat: FakeQuant -> qat.ConvAct2d -> FakeQuant """ configs = _weighted_relu_configs( mod=_nn.Conv2d, func_mod=_F.conv2d, fused_mod=_nni.ConvReLU2d, qat_mod=_nniq.ConvReLU2d, ref_quant_mod=_nnr.Conv2d, ) for act in _mod_activations: configs.extend( _weighted_act_configs( mod=_nn.Conv2d, func_mod=_F.conv2d, act=act, fused_mod=_fused.ConvAct2d, qat_mod=_qat.ConvAct2d, ref_quant_mod=_quantized.QuantizedConvAct2d, ) ) return configs @_BackendConfigRegistry.register() def _conv3d_act() -> _List[_BackendPatternConfig]: """ float: Conv3d -> Act qat: FakeQuant -> qat.ConvAct3d -> FakeQuant """ configs = _weighted_relu_configs( mod=_nn.Conv3d, func_mod=_F.conv3d, fused_mod=_nni.ConvReLU3d, qat_mod=_nniq.ConvReLU3d, ref_quant_mod=_nnr.Conv3d, ) for act in _mod_activations: configs.extend( _weighted_act_configs( mod=_nn.Conv3d, func_mod=_F.conv3d, act=act, fused_mod=_fused.ConvAct3d, qat_mod=_qat.ConvAct3d, ref_quant_mod=_quantized.QuantizedConvAct3d, ) ) return configs @_BackendConfigRegistry.register() def _conv_transpose1d_act() -> _List[_BackendPatternConfig]: """ float: ConvTranspose1d -> Act qat: FakeQuant -> qat.ConvTransposeAct1d -> FakeQuant """ configs = [] for act in _mod_activations + _relu_activations: configs.extend( _weighted_act_configs( mod=_nn.ConvTranspose1d, func_mod=_F.conv_transpose1d, act=act, fused_mod=_fused.ConvTransposeAct1d, qat_mod=_qat.ConvTransposeAct1d, ref_quant_mod=_quantized.QuantizedConvTransposeAct1d, ) ) return configs @_BackendConfigRegistry.register() def _conv_transpose2d_act() -> _List[_BackendPatternConfig]: """ float: ConvTranspose2d -> Act qat: FakeQuant -> qat.ConvTransposeAct2d -> FakeQuant """ configs = [] for act in _mod_activations + _relu_activations: configs.extend( _weighted_act_configs( mod=_nn.ConvTranspose2d, func_mod=_F.conv_transpose2d, act=act, fused_mod=_fused.ConvTransposeAct2d, qat_mod=_qat.ConvTransposeAct2d, ref_quant_mod=_quantized.QuantizedConvTransposeAct2d, ) ) return configs @_BackendConfigRegistry.register() def _conv_transpose3d_act() -> _List[_BackendPatternConfig]: """ float: ConvTranspose3d -> Act qat: FakeQuant -> qat.ConvTransposeAct3d -> FakeQuant """ configs = [] for act in _mod_activations + _relu_activations: configs.extend( _weighted_act_configs( mod=_nn.ConvTranspose3d, func_mod=_F.conv_transpose3d, act=act, 
fused_mod=_fused.ConvTransposeAct3d, qat_mod=_qat.ConvTransposeAct3d, ref_quant_mod=_quantized.QuantizedConvTransposeAct3d, ) ) return configs @_BackendConfigRegistry.register() def _linear_act() -> _List[_BackendPatternConfig]: """ float: Linear -> Act qat: FakeQuant -> qat.LinearAct -> FakeQuant """ configs = _weighted_relu_configs( mod=_nn.Linear, func_mod=_F.linear, fused_mod=_nni.LinearReLU, qat_mod=_nniq.LinearReLU, ref_quant_mod=_nnr.Linear, ) for act in _mod_activations: configs.extend( _weighted_act_configs( mod=_nn.Linear, func_mod=_F.linear, act=act, fused_mod=_fused.LinearAct, qat_mod=_qat.LinearAct, ref_quant_mod=_quantized.QuantizedLinearAct, ) ) return configs @_BackendConfigRegistry.register() def _conv1d_bn() -> _List[_BackendPatternConfig]: """ float: Conv1d -> BatchNorm1d qat: FakeQuant -> qat.ConvBn1d -> FakeQuant """ return _weighted_bn_configs( mod=_nn.Conv1d, bn_mod=_nn.BatchNorm1d, fused_mod=_nni.ConvBn1d, qat_mod=_nniq.ConvBn1d, ref_quant_mod=_nnr.Conv1d, ) @_BackendConfigRegistry.register() def _conv2d_bn() -> _List[_BackendPatternConfig]: """ float: Conv2d -> BatchNorm2d qat: FakeQuant -> qat.ConvBn2d -> FakeQuant """ return _weighted_bn_configs( mod=_nn.Conv2d, bn_mod=_nn.BatchNorm2d, fused_mod=_nni.ConvBn2d, qat_mod=_nniq.ConvBn2d, ref_quant_mod=_nnr.Conv2d, ) @_BackendConfigRegistry.register() def _conv3d_bn() -> _List[_BackendPatternConfig]: """ float: Conv3d -> BatchNorm3d qat: FakeQuant -> qat.ConvBn3d -> FakeQuant """ return _weighted_bn_configs( mod=_nn.Conv3d, bn_mod=_nn.BatchNorm3d, fused_mod=_nni.ConvBn3d, qat_mod=_nniq.ConvBn3d, ref_quant_mod=_nnr.Conv3d, ) @_BackendConfigRegistry.register() def _conv_transpose1d_bn() -> _List[_BackendPatternConfig]: """ float: ConvTranspose1d -> BatchNorm1d qat: FakeQuant -> qat.ConvTransposeBn1d -> FakeQuant """ return _weighted_bn_configs( mod=_nn.ConvTranspose1d, bn_mod=_nn.BatchNorm1d, fused_mod=_fused.ConvTransposeBn1d, qat_mod=_qconv_transpose_fused.ConvTransposeBn1d, ref_quant_mod=_nnr.ConvTranspose1d, ) @_BackendConfigRegistry.register() def _conv_transpose2d_bn() -> _List[_BackendPatternConfig]: """ float: ConvTranspose2d -> BatchNorm2d qat: FakeQuant -> qat.ConvTransposeBn2d -> FakeQuant """ return _weighted_bn_configs( mod=_nn.ConvTranspose2d, bn_mod=_nn.BatchNorm2d, fused_mod=_fused.ConvTransposeBn2d, qat_mod=_qconv_transpose_fused.ConvTransposeBn2d, ref_quant_mod=_nnr.ConvTranspose2d, ) @_BackendConfigRegistry.register() def _conv_transpose3d_bn() -> _List[_BackendPatternConfig]: """ float: ConvTranspose3d -> BatchNorm3d qat: FakeQuant -> qat.ConvTransposeBn3d -> FakeQuant """ return _weighted_bn_configs( mod=_nn.ConvTranspose3d, bn_mod=_nn.BatchNorm3d, fused_mod=_fused.ConvTransposeBn3d, qat_mod=_qconv_transpose_fused.ConvTransposeBn3d, ref_quant_mod=_nnr.ConvTranspose3d, ) @_BackendConfigRegistry.register() def _linear_bn() -> _List[_BackendPatternConfig]: """ float: Linear -> BatchNorm1d qat: FakeQuant -> qat.LinearBn1d -> FakeQuant """ return _weighted_bn_configs( mod=_nn.Linear, bn_mod=_nn.BatchNorm1d, fused_mod=_nni.LinearBn1d, qat_mod=_nniq.LinearBn1d, ref_quant_mod=_nnr.Linear, ) @_BackendConfigRegistry.register() def _conv1d_bn_act() -> _List[_BackendPatternConfig]: """ float: Conv1d -> BatchNorm1d -> Act qat: FakeQuant -> qat.ConvBnAct1d -> FakeQuant """ configs = _weighted_bn_relu_configs( mod=_nn.Conv1d, bn_mod=_nn.BatchNorm1d, fused_mod=_nni.ConvBnReLU1d, qat_mod=_nniq.ConvBnReLU1d, ref_quant_mod=_nnr.Conv1d, ) for act in _mod_activations: configs.extend( _weighted_bn_act_configs( 
mod=_nn.Conv1d, act=act, bn_mod=_nn.BatchNorm1d, root_mod=_nni.ConvBn1d, fused_mod=_fused.ConvBnAct1d, qat_mod=_qat.ConvBnAct1d, ref_quant_mod=_quantized.QuantizedConvAct1d, ) ) return configs @_BackendConfigRegistry.register() def _conv2d_bn_act() -> _List[_BackendPatternConfig]: """ float: Conv2d -> BatchNorm2d -> Act qat: FakeQuant -> qat.ConvBnAct2d -> FakeQuant """ configs = _weighted_bn_relu_configs( mod=_nn.Conv2d, bn_mod=_nn.BatchNorm2d, fused_mod=_nni.ConvBnReLU2d, qat_mod=_nniq.ConvBnReLU2d, ref_quant_mod=_nnr.Conv2d, ) for act in _mod_activations: configs.extend( _weighted_bn_act_configs( mod=_nn.Conv2d, act=act, bn_mod=_nn.BatchNorm2d, root_mod=_nni.ConvBn2d, fused_mod=_fused.ConvBnAct2d, qat_mod=_qat.ConvBnAct2d, ref_quant_mod=_quantized.QuantizedConvAct2d, ) ) return configs @_BackendConfigRegistry.register() def _conv3d_bn_act() -> _List[_BackendPatternConfig]: """ float: Conv3d -> BatchNorm3d -> Act qat: FakeQuant -> qat.ConvBnAct3d -> FakeQuant """ configs = _weighted_bn_relu_configs( mod=_nn.Conv3d, bn_mod=_nn.BatchNorm3d, fused_mod=_nni.ConvBnReLU3d, qat_mod=_nniq.ConvBnReLU3d, ref_quant_mod=_nnr.Conv3d, ) for act in _mod_activations: configs.extend( _weighted_bn_act_configs( mod=_nn.Conv3d, act=act, bn_mod=_nn.BatchNorm3d, root_mod=_nni.ConvBn3d, fused_mod=_fused.ConvBnAct3d, qat_mod=_qat.ConvBnAct3d, ref_quant_mod=_quantized.QuantizedConvAct3d, ) ) return configs @_BackendConfigRegistry.register() def _conv_transpose1d_bn_act() -> _List[_BackendPatternConfig]: """ float: ConvTranspose1d -> BatchNorm1d -> Act qat: FakeQuant -> qat.ConvTransposeBnAct1d -> FakeQuant """ configs = [] for act in _mod_activations + _relu_activations: configs.extend( _weighted_bn_act_configs( mod=_nn.ConvTranspose1d, act=act, bn_mod=_nn.BatchNorm1d, root_mod=_fused.ConvTransposeBn1d, fused_mod=_fused.ConvTransposeBnAct1d, qat_mod=_qat.ConvTransposeBnAct1d, ref_quant_mod=_quantized.QuantizedConvTransposeAct1d, ) ) return configs @_BackendConfigRegistry.register() def _conv_transpose2d_bn_act() -> _List[_BackendPatternConfig]: """ float: ConvTranspose2d -> BatchNorm2d -> Act qat: FakeQuant -> qat.ConvTransposeBnAct2d -> FakeQuant """ configs = [] for act in _mod_activations + _relu_activations: configs.extend( _weighted_bn_act_configs( mod=_nn.ConvTranspose2d, act=act, bn_mod=_nn.BatchNorm2d, root_mod=_fused.ConvTransposeBn2d, fused_mod=_fused.ConvTransposeBnAct2d, qat_mod=_qat.ConvTransposeBnAct2d, ref_quant_mod=_quantized.QuantizedConvTransposeAct2d, ) ) return configs @_BackendConfigRegistry.register() def _conv_transpose3d_bn_act() -> _List[_BackendPatternConfig]: """ float: ConvTranspose3d -> BatchNorm3d -> Act qat: FakeQuant -> qat.ConvTransposeBnAct3d -> FakeQuant """ configs = [] for act in _mod_activations + _relu_activations: configs.extend( _weighted_bn_act_configs( mod=_nn.ConvTranspose3d, act=act, bn_mod=_nn.BatchNorm3d, root_mod=_fused.ConvTransposeBn3d, fused_mod=_fused.ConvTransposeBnAct3d, qat_mod=_qat.ConvTransposeBnAct3d, ref_quant_mod=_quantized.QuantizedConvTransposeAct3d, ) ) return configs @_BackendConfigRegistry.register() def _conv1d() -> _List[_BackendPatternConfig]: """ float: Conv1d qat: FakeQuant -> qat.Conv1d -> FakeQuant """ return _weighted_configs( mod=_nn.Conv1d, func_mod=_F.conv1d, qat_mod=_nnq.Conv1d, ref_quant_mod=_nnr.Conv1d, ) @_BackendConfigRegistry.register() def _conv2d() -> _List[_BackendPatternConfig]: """ float: Conv2d qat: FakeQuant -> qat.Conv2d -> FakeQuant """ return _weighted_configs( mod=_nn.Conv2d, func_mod=_F.conv2d, qat_mod=_nnq.Conv2d, 
ref_quant_mod=_nnr.Conv2d, ) @_BackendConfigRegistry.register() def _conv3d() -> _List[_BackendPatternConfig]: """ float: Conv3d qat: FakeQuant -> qat.Conv3d -> FakeQuant """ return _weighted_configs( mod=_nn.Conv3d, func_mod=_F.conv3d, qat_mod=_nnq.Conv3d, ref_quant_mod=_nnr.Conv3d, ) @_BackendConfigRegistry.register() def _conv_transpose1d() -> _List[_BackendPatternConfig]: """ float: ConvTranspose1d qat: FakeQuant -> qat.ConvTranspose1d -> FakeQuant """ return _weighted_configs( mod=_nn.ConvTranspose1d, func_mod=_F.conv_transpose1d, qat_mod=_qconv_transpose.ConvTranspose1d, ref_quant_mod=_nnr.ConvTranspose1d, ) @_BackendConfigRegistry.register() def _conv_transpose2d() -> _List[_BackendPatternConfig]: """ float: ConvTranspose2d qat: FakeQuant -> qat.ConvTranspose2d -> FakeQuant """ return _weighted_configs( mod=_nn.ConvTranspose2d, func_mod=_F.conv_transpose2d, qat_mod=_qconv_transpose.ConvTranspose2d, ref_quant_mod=_nnr.ConvTranspose2d, ) @_BackendConfigRegistry.register() def _conv_transpose3d() -> _List[_BackendPatternConfig]: """ float: ConvTranspose3d qat: FakeQuant -> qat.ConvTranspose3d -> FakeQuant """ return _weighted_configs( mod=_nn.ConvTranspose3d, func_mod=_F.conv_transpose3d, qat_mod=_qconv_transpose.ConvTranspose3d, ref_quant_mod=_nnr.ConvTranspose3d, ) @_BackendConfigRegistry.register() def _linear() -> _List[_BackendPatternConfig]: """ float: Linear qat: FakeQuant -> qat.Linear -> FakeQuant """ return _weighted_configs( mod=_nn.Linear, func_mod=_F.linear, qat_mod=_nnq.Linear, ref_quant_mod=_nnr.Linear, ) @_BackendConfigRegistry.register() def _embedding() -> _List[_BackendPatternConfig]: """ float: Embedding qat: qat.Embedding """ return _weighted_configs( mod=_nn.Embedding, func_mod=None, qat_mod=_nnq.Embedding, ref_quant_mod=_nnr.Embedding, input_output_observed=False, ) @_BackendConfigRegistry.register() def _embedding_bag() -> _List[_BackendPatternConfig]: """ float: EmbeddingBag qat: qat.EmbeddingBag """ return _weighted_configs( mod=_nn.EmbeddingBag, func_mod=None, qat_mod=_nnq.EmbeddingBag, ref_quant_mod=_nnr.EmbeddingBag, input_output_observed=False, ) # n-ary ops @_BackendConfigRegistry.register() def _identity() -> _List[_BackendPatternConfig]: return _share_observer_configs(ops=[_nn.Identity]) @_BackendConfigRegistry.register() def _add_act() -> _List[_BackendPatternConfig]: """ float: input_1 -> add -> Act -> output input_2 -> qat: FakeQuant -> add -> Act -> FakeQuant FakeQuant -> """ acts = _mod_activations + _func_activations + (_nn.ReLU, _F.relu, _torch.relu) return _binary_op_act_configs(ops=[_operator.add, _torch.add], acts=list(acts)) @_BackendConfigRegistry.register() def _mul_act() -> _List[_BackendPatternConfig]: """ float: input_1 -> mul -> Act -> output input_2 -> qat: FakeQuant -> mul -> Act -> FakeQuant FakeQuant -> """ acts = _mod_activations + _func_activations + (_nn.ReLU, _F.relu, _torch.relu) return _binary_op_act_configs(ops=[_operator.mul, _torch.mul], acts=list(acts)) @_BackendConfigRegistry.register() def _matmul_act() -> _List[_BackendPatternConfig]: """ float: input_1 -> matmul -> Act -> output input_2 -> qat: FakeQuant -> matmul -> Act -> FakeQuant FakeQuant -> """ acts = _mod_activations + _func_activations + (_nn.ReLU, _F.relu, _torch.relu) return _binary_op_act_configs(ops=[_torch.matmul], acts=list(acts)) @_BackendConfigRegistry.register() def _einsum_act() -> _List[_BackendPatternConfig]: """ float: input_1 -> einsum -> Act -> output input_2 -> qat: FakeQuant -> einsum -> Act -> FakeQuant FakeQuant -> """ acts = _mod_activations 
+ _func_activations + (_nn.ReLU, _F.relu, _torch.relu) return _binary_op_act_configs(ops=[_torch.einsum], acts=list(acts)) @_BackendConfigRegistry.register() def _add() -> _List[_BackendPatternConfig]: """ float: input_1 -> add -> output input_2 -> qat: FakeQuant -> add -> FakeQuant FakeQuant -> """ return _binary_op_configs(ops=[_operator.add, _torch.add]) @_BackendConfigRegistry.register() def _mul() -> _List[_BackendPatternConfig]: """ float: input_1 -> mul -> output input_2 -> qat: FakeQuant -> mul -> FakeQuant FakeQuant -> """ return _binary_op_configs(ops=[_operator.mul, _torch.mul]) @_BackendConfigRegistry.register() def _matmul() -> _List[_BackendPatternConfig]: """ float: input_1 -> matmul -> output input_2 -> qat: FakeQuant -> matmul -> FakeQuant FakeQuant -> """ return _binary_op_configs(ops=[_torch.matmul]) @_BackendConfigRegistry.register() def _einsum() -> _List[_BackendPatternConfig]: """ float: input_1 -> einsum -> output input_2 -> qat: FakeQuant -> einsum -> FakeQuant FakeQuant -> """ return _binary_op_configs(ops=[_torch.einsum]) @_BackendConfigRegistry.register() def _cat() -> _List[_BackendPatternConfig]: """ float: input_1 -> cat -> output input_2 -> qat: FakeQuant -> cat -> FakeQuant FakeQuant -> The number of inputs is not restricted to 2. All FakeQuant(s) share the same scale and zero point """ return _share_observer_configs(ops=[_torch.cat]) # pooling layers @_BackendConfigRegistry.register() def _max_pool1d() -> _List[_BackendPatternConfig]: """ float: MaxPool1d qat: FakeQuant -> MaxPool1d -> FakeQuant FakeQuant(s) share the same scale and zero point """ return _share_observer_configs(ops=[_nn.MaxPool1d, _F.max_pool1d]) @_BackendConfigRegistry.register() def _max_pool2d() -> _List[_BackendPatternConfig]: """ float: MaxPool2d qat: FakeQuant -> MaxPool2d -> FakeQuant FakeQuant(s) share the same scale and zero point """ return _share_observer_configs(ops=[_nn.MaxPool2d, _F.max_pool2d]) @_BackendConfigRegistry.register() def _max_pool3d() -> _List[_BackendPatternConfig]: """ float: MaxPool3d qat: FakeQuant -> MaxPool3d -> FakeQuant FakeQuant(s) share the same scale and zero point """ return _share_observer_configs(ops=[_nn.MaxPool3d, _F.max_pool3d]) @_BackendConfigRegistry.register() def _adaptive_avg_pool1d() -> _List[_BackendPatternConfig]: """ float: AdaptiveAvgPool1d qat: FakeQuant -> AdaptiveAvgPool1d -> FakeQuant FakeQuant(s) share the same scale and zero point """ return _share_observer_configs( ops=[_nn.AdaptiveAvgPool1d, _F.adaptive_avg_pool1d, _torch.adaptive_avg_pool1d] ) @_BackendConfigRegistry.register() def _adaptive_avg_pool2d() -> _List[_BackendPatternConfig]: """ float: AdaptiveAvgPool2d qat: FakeQuant -> AdaptiveAvgPool2d -> FakeQuant FakeQuant(s) share the same scale and zero point """ return _share_observer_configs(ops=[_nn.AdaptiveAvgPool2d, _F.adaptive_avg_pool2d]) @_BackendConfigRegistry.register() def _adaptive_avg_pool3d() -> _List[_BackendPatternConfig]: """ float: AdaptiveAvgPool3d qat: FakeQuant -> AdaptiveAvgPool3d -> FakeQuant FakeQuant(s) share the same scale and zero point """ return _share_observer_configs(ops=[_nn.AdaptiveAvgPool3d, _F.adaptive_avg_pool3d]) @_BackendConfigRegistry.register() def _avg_pool1d() -> _List[_BackendPatternConfig]: """ float: AvgPool1d qat: FakeQuant -> AvgPool1d -> FakeQuant FakeQuant(s) share the same scale and zero point """ return _share_observer_configs( ops=[_nn.AvgPool1d, _F.avg_pool1d, _torch.avg_pool1d, _torch.mean] ) @_BackendConfigRegistry.register() def _avg_pool2d() -> 
_List[_BackendPatternConfig]: """ float: AvgPool2d qat: FakeQuant -> AvgPool2d -> FakeQuant FakeQuant(s) share the same scale and zero point """ return _share_observer_configs(ops=[_nn.AvgPool2d, _F.avg_pool2d, _torch._C._nn.avg_pool2d]) @_BackendConfigRegistry.register() def _avg_pool3d() -> _List[_BackendPatternConfig]: """ float: AvgPool3d qat: FakeQuant -> AvgPool3d -> FakeQuant FakeQuant(s) share the same scale and zero point """ return _share_observer_configs(ops=[_nn.AvgPool3d, _F.avg_pool3d, _torch._C._nn.avg_pool3d]) # memory movement ops @_BackendConfigRegistry.register() def _flatten() -> _List[_BackendPatternConfig]: """ float: Flatten qat: FakeQuant -> Flatten -> FakeQuant FakeQuant(s) share the same scale and zero point """ return _share_observer_configs(ops=[_nn.Flatten, _torch.flatten]) # norm layers @_BackendConfigRegistry.register() def _bn() -> _List[_BackendPatternConfig]: """ float: BatchNorm qat: FakeQuant -> BatchNorm -> FakeQuant """ return _activation_configs(ops=[_nn.BatchNorm1d, _nn.BatchNorm2d, _nn.BatchNorm3d]) @_BackendConfigRegistry.register() def _bn2d_relu() -> _List[_BackendPatternConfig]: """ float: BatchNorm2d -> ReLU qat: FakeQuant -> BNReLU2d -> FakeQuant """ return _bn_relu(mod=_nn.BatchNorm2d, fused_mod=_nni.BNReLU2d) @_BackendConfigRegistry.register() def _bn3d_relu() -> _List[_BackendPatternConfig]: """ float: BatchNorm3d -> ReLU qat: FakeQuant -> BNReLU3d -> FakeQuant """ return _bn_relu(mod=_nn.BatchNorm3d, fused_mod=_nni.BNReLU3d) # activations @_BackendConfigRegistry.register() def _softmax() -> _List[_BackendPatternConfig]: """ float: Softmax qat: FakeQuant -> Softmax -> FakeQuant FakeQuant at the output has fixed qparams. """ constraints = ( _DTypeWithConstraints( dtype=_torch.quint8, quant_min_lower_bound=0, quant_max_upper_bound=255, scale_exact_match=1.0 / 256.0, zero_point_exact_match=0, ) if _is_torch_2() else None ) return _activation_configs(ops=[_nn.Softmax], constraints=constraints) @_BackendConfigRegistry.register() def _sigmoid() -> _List[_BackendPatternConfig]: """ float: Sigmoid qat: FakeQuant -> Sigmoid -> FakeQuant FakeQuant at the output has fixed qparams. """ constraints = ( _DTypeWithConstraints( dtype=_torch.quint8, quant_min_lower_bound=0, quant_max_upper_bound=255, scale_exact_match=1.0 / 256.0, zero_point_exact_match=0, ) if _is_torch_2() else None ) return _activation_configs(ops=[_nn.Sigmoid, _F.sigmoid], constraints=constraints) @_BackendConfigRegistry.register() def _hardsigmoid() -> _List[_BackendPatternConfig]: """ float: Hardsigmoid qat: FakeQuant -> Hardsigmoid -> FakeQuant FakeQuant at the output has fixed qparams. """ constraints = ( _DTypeWithConstraints( dtype=_torch.quint8, quant_min_lower_bound=0, quant_max_upper_bound=255, scale_exact_match=1.0 / 256.0, zero_point_exact_match=0, ) if _is_torch_2() else None ) return _activation_configs(ops=[_nn.Hardsigmoid, _F.hardsigmoid], constraints=constraints) @_BackendConfigRegistry.register() def _tanh() -> _List[_BackendPatternConfig]: """ float: Tanh qat: FakeQuant -> Tanh -> FakeQuant FakeQuant at the output has fixed qparams.
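With ``scale = 2 / 256`` and ``zero_point = 128`` on ``quint8``, the dequantized grid is ``(q - 128) * 2 / 256`` for ``q`` in ``[0, 255]``, which spans roughly ``[-1, 1)`` and therefore matches the known output range of tanh; this is why the output quantization parameters can be fixed rather than observed.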
""" constraints = ( _DTypeWithConstraints( dtype=_torch.quint8, quant_min_lower_bound=0, quant_max_upper_bound=255, scale_exact_match=2.0 / 256.0, zero_point_exact_match=128, ) if _is_torch_2() else None ) return _activation_configs(ops=[_nn.Tanh, _F.tanh], constraints=constraints) @_BackendConfigRegistry.register() def _activations() -> _List[_BackendPatternConfig]: """ float: Act qat: FakeQuant -> Act -> FakeQuant """ ops = [op for op in _mod_activations if op not in _fixed_qparams_modules] ops += [ _nn.ReLU, _F.relu, _F.relu_, ] + list(_func_activations) return _activation_configs(ops=ops) def get_backend_config() -> _BackendConfig: """ Returns backend config encoding information about how quantization layers are inserted in a module. """ return _BackendConfigRegistry.backend_config def get_supported_modules() -> _List[_Any]: """ Returns a tuple of modules which are supported for quantization aware training. """ return tuple(_BackendConfigRegistry.supported_modules) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/quantization/_backend_config_utils.py0000644000000000000000000003440114672066616027415 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from copy import deepcopy as _deepcopy from typing import Any as _Any from typing import Callable as _Callable from typing import List as _List from typing import Optional as _Optional from typing import Tuple as _Tuple from typing import Type as _Type import torch as _torch import torch.nn as _nn import torch.nn.functional as _F from torch.ao.quantization.backend_config import BackendPatternConfig as _BackendPatternConfig from torch.ao.quantization.backend_config import DTypeConfig as _DTypeConfig from torch.ao.quantization.backend_config import DTypeWithConstraints as _DTypeWithConstraints from torch.ao.quantization.backend_config import ObservationType as _ObservationType from coremltools.optimize.torch._utils.version_utils import is_torch_2 as _is_torch_2 act_quant_dtype_configs = [ # int input and output _DTypeConfig( input_dtype=_torch.quint8, output_dtype=_torch.quint8, ), # int input, float output _DTypeConfig( input_dtype=_torch.quint8, output_dtype=_torch.float, ), # float input, int output _DTypeConfig( input_dtype=_torch.float, output_dtype=_torch.quint8, ), ] weighted_dtype_configs = [ # weight int, act float, weight dtype signed _DTypeConfig( input_dtype=_torch.float, output_dtype=_torch.float, weight_dtype=_torch.qint8, bias_dtype=_torch.float, ), # weight int, act float, weight dtype unsigned _DTypeConfig( input_dtype=_torch.float, output_dtype=_torch.float, weight_dtype=_torch.quint8, bias_dtype=_torch.float, ), # weight int, act int, weight dtype signed _DTypeConfig( input_dtype=_torch.quint8, output_dtype=_torch.quint8, weight_dtype=_torch.qint8, bias_dtype=_torch.float, ), # weight int, act int, weight dtype unsigned _DTypeConfig( input_dtype=_torch.quint8, output_dtype=_torch.quint8, weight_dtype=_torch.quint8, bias_dtype=_torch.float, ), ] def get_fuser_method(constructor): """ Creates fuser method from class constructor of fused modules. 
""" if _is_torch_2(): def fuser_method(is_qat, m1, m2): if isinstance(m1, tuple): m0, m1 = m1 return constructor(m1, m0, m2) return constructor(m1, m2) else: def fuser_method(is_qat, m1, m2): if isinstance(m2, tuple): m2, m3 = m2 return constructor(m3, m2, m1) return constructor(m2, m1) return fuser_method def get_fusion_pattern(pattern: _Tuple[_Any, _Any]) -> _Tuple[_Any, _Any]: """ Swaps fusion pattern if torch version is >= 2.0. """ if _is_torch_2(): return pattern[1], pattern[0] else: return pattern def fused_mod_config( mod: _Type[_nn.Module], fused_mod: _Type[_nn.Module], qat_mod: _Type[_nn.Module], ref_quant_mod: _Type[_nn.Module], input_output_observed: _Optional[bool] = None, ) -> _BackendPatternConfig: """ Returns backend pattern config for fused modules. """ config = ( _BackendPatternConfig(fused_mod) .set_observation_type(_ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT) .set_dtype_configs(weighted_dtype_configs) .set_root_module(mod) .set_qat_module(qat_mod) .set_reference_quantized_module(ref_quant_mod) ) if input_output_observed is not None: if _is_torch_2(): config.set_observation_type(_ObservationType.INPUT_OUTPUT_NOT_OBSERVED) else: config._input_output_observed = False return config def qat_mod_config( mod: _Type[_nn.Module], qat_mod: _Type[_nn.Module], ref_quant_mod: _Type[_nn.Module] ) -> _BackendPatternConfig: """ Returns backend pattern config for QAT modules. """ return ( _BackendPatternConfig(qat_mod) .set_observation_type(_ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT) .set_dtype_configs(weighted_dtype_configs) .set_root_module(mod) .set_reference_quantized_module(ref_quant_mod) ) def weighted_configs( mod: _Type[_nn.Module], func_mod: _Optional[_Callable], qat_mod: _Type[_nn.Module], ref_quant_mod: _Type[_nn.Module], input_output_observed: _Optional[bool] = None, ) -> _List[_BackendPatternConfig]: """ Returns backend pattern configs for modules which have a weight associated with them, such as convolution, linear, embedding, etc. """ configs = [ # conv/linear module fused_mod_config( mod=mod, fused_mod=mod, qat_mod=qat_mod, ref_quant_mod=ref_quant_mod, input_output_observed=input_output_observed, ), # qat conv/linear qat_mod_config(mod=mod, qat_mod=qat_mod, ref_quant_mod=ref_quant_mod), ] if func_mod is not None: configs += [ # functional _BackendPatternConfig(func_mod) .set_observation_type(_ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT) .set_dtype_configs(weighted_dtype_configs) ._set_input_type_to_index({"weight": 1, "bias": 2}), ] return configs def weighted_relu_configs( mod: _Type[_nn.Module], func_mod: _Callable, fused_mod: _Type[_nn.Module], ref_quant_mod: _Type[_nn.Module], qat_mod: _Type[_nn.Module], ) -> _List[_BackendPatternConfig]: """ Returns backend pattern configs for the following sequence of ops: input -> mod -> relu -> output where mod is a module with a weight associated with it, such as convolution and linear. 
""" return [ # conv/linear module + relu func/module *[ _BackendPatternConfig(get_fusion_pattern((act, mod))) .set_dtype_configs(weighted_dtype_configs) .set_fuser_method(get_fuser_method(fused_mod)) .set_fused_module(fused_mod) for act in [_nn.ReLU, _F.relu] ], # conv/linear func + relu func/module *[ _BackendPatternConfig(get_fusion_pattern((act, func_mod))) .set_observation_type(_ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT) .set_dtype_configs(weighted_dtype_configs) for act in [_nn.ReLU, _F.relu] ], # conv/linear + relu fused fused_mod_config( mod=mod, fused_mod=fused_mod, qat_mod=qat_mod, ref_quant_mod=ref_quant_mod ), # qat conv/linear + relu qat_mod_config(mod=mod, qat_mod=qat_mod, ref_quant_mod=ref_quant_mod), ] def weighted_act_configs( mod: _Type[_nn.Module], func_mod: _Callable, act: _Type[_nn.Module], fused_mod: _Type[_nn.Module], qat_mod: _Type[_nn.Module], ref_quant_mod: _Type[_nn.Module], ) -> _List[_BackendPatternConfig]: """ Returns backend pattern configs for the following sequence of ops: input -> mod -> activation -> output where mod is a module with a weight associated with it, such as convolution and linear. """ return [ # conv/linear module + act module _BackendPatternConfig(get_fusion_pattern((act, mod))) .set_dtype_configs(weighted_dtype_configs) .set_fuser_method(get_fuser_method(fused_mod)) .set_fused_module(fused_mod), # conv/linear func + act module _BackendPatternConfig(get_fusion_pattern((act, func_mod))) .set_observation_type(_ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT) .set_dtype_configs(weighted_dtype_configs), # conv/linear + act fused fused_mod_config( mod=fused_mod, fused_mod=fused_mod, qat_mod=qat_mod, ref_quant_mod=ref_quant_mod, ), # qat conv/linear + act qat_mod_config(mod=fused_mod, qat_mod=qat_mod, ref_quant_mod=ref_quant_mod), ] def weighted_bn_configs( mod: _Type[_nn.Module], bn_mod: _Type[_nn.Module], fused_mod: _Type[_nn.Module], ref_quant_mod: _Type[_nn.Module], qat_mod: _Type[_nn.Module], ) -> _List[_BackendPatternConfig]: """ Returns backend pattern configs for the following sequence of ops: input -> mod -> batch_norm -> output where mod is a module with a weight associated with it, such as convolution and linear. """ return [ # conv module + bn module _BackendPatternConfig(get_fusion_pattern((bn_mod, mod))) .set_dtype_configs(weighted_dtype_configs) .set_fuser_method(get_fuser_method(fused_mod)) .set_fused_module(fused_mod), # conv + bn fused fused_mod_config( mod=mod, fused_mod=fused_mod, qat_mod=qat_mod, ref_quant_mod=ref_quant_mod ), # qat conv + bn qat_mod_config(mod=mod, qat_mod=qat_mod, ref_quant_mod=ref_quant_mod), ] def weighted_bn_relu_configs( mod: _Type[_nn.Module], bn_mod: _Type[_nn.Module], fused_mod: _Type[_nn.Module], ref_quant_mod: _Type[_nn.Module], qat_mod: _Type[_nn.Module], ) -> _List[_BackendPatternConfig]: """ Returns backend pattern configs for the following sequence of ops: input -> mod -> batch_norm -> relu -> output where mod is a module with a weight associated with it, such as convolution and linear. 
""" return [ # conv module + bn module + relu func/module *[ _BackendPatternConfig(get_fusion_pattern((act, (bn_mod, mod)))) .set_dtype_configs(weighted_dtype_configs) .set_fuser_method(get_fuser_method(fused_mod)) .set_fused_module(fused_mod) for act in [_nn.ReLU, _F.relu] ], # conv + bn + relu fused fused_mod_config( mod=mod, fused_mod=fused_mod, qat_mod=qat_mod, ref_quant_mod=ref_quant_mod ), # qat conv + bn + relu qat_mod_config(mod=mod, qat_mod=qat_mod, ref_quant_mod=ref_quant_mod), ] def weighted_bn_act_configs( mod: _Type[_nn.Module], act: _Type[_nn.Module], bn_mod: _Type[_nn.Module], root_mod: _Type[_nn.Module], fused_mod: _Type[_nn.Module], ref_quant_mod: _Type[_nn.Module], qat_mod: _Type[_nn.Module], ) -> _List[_BackendPatternConfig]: """ Returns backend pattern configs for the following sequence of ops: input -> mod -> batch_norm -> activation -> output where mod is a module with a weight associated with it, such as convolution and linear. """ return [ # conv module + bn module + act module _BackendPatternConfig(get_fusion_pattern((act, (bn_mod, mod)))) .set_dtype_configs(weighted_dtype_configs) .set_fuser_method(get_fuser_method(fused_mod)) .set_fused_module(fused_mod), # conv + bn + act fused fused_mod_config( mod=root_mod, fused_mod=fused_mod, qat_mod=qat_mod, ref_quant_mod=ref_quant_mod, ), # qat conv + bn + act qat_mod_config(mod=root_mod, qat_mod=qat_mod, ref_quant_mod=ref_quant_mod), ] def binary_op_configs(ops: _List[_Any]) -> _List[_BackendPatternConfig]: """ Returns backend pattern configs for the following sequence of ops: input_1 --> operator --> output input_2 --> where operator is a binary operator such as add, multiply or matmul. """ return [ _BackendPatternConfig(op) .set_dtype_configs(act_quant_dtype_configs) .set_observation_type(_ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT) for op in ops ] def binary_op_act_configs(ops: _List[_Any], acts: _List[_Any]) -> _List[_BackendPatternConfig]: """ Returns backend pattern configs for the following sequence of ops: input_1 --> operator --> act --> output input_2 --> where operator is a binary operator such as add or multiply. """ configs = [] for op in ops: configs.extend( [ _BackendPatternConfig(get_fusion_pattern((act, op))) .set_dtype_configs(act_quant_dtype_configs) .set_observation_type(_ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT) for act in acts ] ) return configs def share_observer_configs(ops: _List[_Any]) -> _List[_BackendPatternConfig]: """ Returns backend pattern configs for ops which do not change the scale or zero-point of the input tensor and thus can share the same qparams. """ return [ _BackendPatternConfig(op) .set_observation_type(_ObservationType.OUTPUT_SHARE_OBSERVER_WITH_INPUT) .set_dtype_configs(act_quant_dtype_configs) for op in ops ] def activation_configs( ops: _List[_Any], constraints: _Optional[_DTypeWithConstraints] = None ) -> _List[_BackendPatternConfig]: """ Returns backend pattern configs for default ops like activations which do not have an associated weight but can alter the scale and zero point of the input tensor. 
""" dtype_configs = [] for act_dtype in act_quant_dtype_configs: new_act_dtype = _deepcopy(act_dtype) if act_dtype.output_dtype == _torch.quint8 and constraints is not None: new_act_dtype.output_dtype_with_constraints = constraints dtype_configs.append(new_act_dtype) return [ _BackendPatternConfig(op) .set_observation_type(_ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT) .set_dtype_configs(dtype_configs) for op in ops ] def bn_relu( mod: _Type[_nn.Module], fused_mod: _Type[_nn.Module], ) -> _List[_BackendPatternConfig]: """ Returns backend pattern configs for the following sequence of ops: input -> batch_norm -> relu -> output """ return [ # bn module + relu func/module *[ _BackendPatternConfig(get_fusion_pattern((act, mod))) .set_dtype_configs(weighted_dtype_configs) .set_fuser_method(get_fuser_method(fused_mod)) .set_fused_module(fused_mod) for act in [_nn.ReLU, _F.relu] ] ] + activation_configs( ops=[fused_mod] ) # fused bn + relu ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/quantization/_configure.py0000644000000000000000000005270714672066616025253 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from collections import defaultdict as _defaultdict from typing import Any as _Any from typing import Optional as _Optional from typing import Tuple as _Tuple import torch as _torch import torch.ao.quantization as _aoquant import torch.fx as _fx import torch.nn as _nn import torch.nn.intrinsic as _nni import torch.nn.intrinsic.qat as _nniqat from torch.ao.quantization.backend_config import BackendConfig as _BackendConfig from torch.ao.quantization.fx.custom_config import PrepareCustomConfig as _PrepareCustomConfig from torch.quantization.quantize_fx import prepare_qat_fx as _prepare_qat_fx import coremltools.optimize.torch.quantization.modules.qat_modules as _qat from coremltools.optimize.torch._utils.graph_utils import count_model_params as _count_model_params from coremltools.optimize.torch._utils.torch_utils import ( get_parent_child_name as _get_parent_child_name, ) from coremltools.optimize.torch.quantization._backend_config import _fixed_qparams_modules from coremltools.optimize.torch.quantization._utils import CombinationOpType as _CombinationOpType from coremltools.optimize.torch.quantization._utils import combine_op_type as _combine_op_type from coremltools.optimize.torch.quantization._utils import find_module as _find_module from coremltools.optimize.torch.quantization._utils import find_target as _find_target from coremltools.optimize.torch.quantization._utils import ( get_share_qparams_ops as _get_share_qparams_ops, ) from coremltools.optimize.torch.quantization._utils import ( group_activation_quantization_modules_by_id as _group_activation_quantization_modules_by_id, ) from coremltools.optimize.torch.quantization._utils import ( is_activation_post_process as _is_activation_post_process, ) from coremltools.optimize.torch.quantization._utils import is_quantized as _is_quantized from coremltools.optimize.torch.quantization.modules.observers import NoopObserver as _NoopObserver from coremltools.optimize.torch.quantization.quantization_config import ( QuantizationScheme as _QuantizationScheme, ) from coremltools.optimize.torch.quantization.quantization_config import ( _default_quantization_options, ) # 
layers which only scale the output and hence can use zero point = 0 if needed _scale_only_layers = { _torch.nn.Dropout, _torch.nn.Dropout1d, _torch.nn.Dropout2d, _torch.nn.Dropout3d, } # layers which are always quantized with affine config because they have zero point = 0 _always_affine_layers = { _torch.nn.ReLU, _torch.nn.functional.relu, _torch.nn.functional.relu_, _torch.nn.ReLU6, _nni.ConvReLU1d, _nniqat.ConvReLU1d, _nni.ConvReLU2d, _nniqat.ConvReLU2d, _nni.ConvReLU3d, _nniqat.ConvBnReLU3d, _nni.ConvBnReLU1d, _nniqat.ConvBnReLU1d, _nni.ConvBnReLU2d, _nniqat.ConvBnReLU2d, _nni.ConvBnReLU3d, _nniqat.ConvBnReLU3d, _nni.LinearReLU, _nniqat.LinearReLU, _nni.BNReLU3d, _nni.BNReLU3d, } # fused quantized layers _fused_quantized_layers = { _qat.ConvAct1d, _qat.ConvBnAct1d, _qat.ConvAct2d, _qat.ConvBnAct2d, _qat.ConvAct3d, _qat.ConvBnAct3d, _qat.ConvTransposeAct1d, _qat.ConvTransposeBnAct1d, _qat.ConvTransposeAct2d, _qat.ConvTransposeBnAct2d, _qat.ConvTransposeAct3d, _qat.ConvTransposeBnAct3d, _qat.LinearAct, } _common_observer_param_names = [ "dtype", "qscheme", "reduce_range", "quant_min", "quant_max", "eps", ] _observer_type_to_param_names = { _aoquant.MinMaxObserver: list(_common_observer_param_names), _aoquant.PerChannelMinMaxObserver: list(_common_observer_param_names) + ["ch_axis"], _aoquant.MovingAverageMinMaxObserver: list(_common_observer_param_names) + ["averaging_constant"], _aoquant.MovingAveragePerChannelMinMaxObserver: list(_common_observer_param_names) + ["averaging_constant", "ch_axis"], _aoquant.HistogramObserver: [ "bins", "upsample_rate", "dtype", "qscheme", "reduce_range", "eps", ], _aoquant.PlaceholderObserver: [ "dtype", "quant_min", "quant_max", "custom_op_name", ], _aoquant.NoopObserver: ["dtype", "custom_op_name"], _NoopObserver: ["dtype", "custom_op_name"], _aoquant.FixedQParamsObserver: [ "scale", "zero_point", "dtype", "qscheme", "quant_min", "quant_max", ], } class QATConfigurationHandler: """ Prepares the model for QAT by inserting weight and activation quantizers as specified in qconfig_mapping. Implements additional graph passes on a prepared module returned by prepare_qat_fx. """ def __init__( self, prepare_custom_config: _PrepareCustomConfig, qconfig_mapping: _aoquant.QConfigMapping, backend_config: _BackendConfig, quantization_scheme: _QuantizationScheme, ): self._quantization_scheme = quantization_scheme self._qconfig_mapping = qconfig_mapping self._prepare_custom_config = prepare_custom_config self._backend_config = backend_config self._share_qparams_ops = _get_share_qparams_ops(self._backend_config) self._device = None self._act_quant_groups = dict() self._modules_to_replace = _defaultdict(list) self._new_act_post_process = dict() def prepare(self, model: _nn.Module, example_inputs: _Tuple[_Any, ...]): """ Performs graph passes on model to configure activation and weight quantization layers. 
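For illustration, a minimal sketch of how this handler is typically driven; the ``prepare_custom_config``, ``qconfig_mapping``, ``backend_config``, ``model`` and ``example_input`` objects are assumed to be constructed elsewhere:

.. code-block:: python

    handler = QATConfigurationHandler(
        prepare_custom_config=prepare_custom_config,
        qconfig_mapping=qconfig_mapping,
        backend_config=backend_config,
        quantization_scheme=_QuantizationScheme.symmetric,
    )
    prepared = handler.prepare(model, example_inputs=(example_input,))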
""" model = _prepare_qat_fx( model, prepare_custom_config=self._prepare_custom_config, qconfig_mapping=self._qconfig_mapping, example_inputs=example_inputs, backend_config=self._backend_config, ) self._setup_fake_quant_module_device(model, example_inputs) self._act_quant_groups = _group_activation_quantization_modules_by_id(model) if self._quantization_scheme == _QuantizationScheme.symmetric: self._mark_always_affine_layers_for_replacement(model) self._mark_always_affine_combination_ops_for_replacement(model) self._mark_fixed_qparams_modules_for_replacement(model) self._replace_weight_fake_quant_for_embedding_layers(model) model = self._replace_activation_quantizers(model) model = self._remove_activation_quantizer_after_dropout(model) return model def _setup_fake_quant_module_device( self, model: _fx.GraphModule, example_inputs: _Tuple[_Any, ...] ): """ Set device for all fake quantize modules by inferring from model and/or data """ # Record the device of the model count_params = _count_model_params(model) if count_params > 0: self._device = next(model.parameters()).device elif len(example_inputs) > 0: self._device = example_inputs[0].device else: self._device = _torch.device("cpu") for name, module in model.named_modules(remove_duplicate=True): if ( hasattr(module, "weight_fake_quant") and module.weight_fake_quant is not None and hasattr(module, "set_device") ): module.weight_fake_quant.set_device(self._device) elif not name.endswith(".weight_fake_quant") and hasattr(module, "set_device"): module.set_device(self._device) def _get_affine_act_post_process_mod_from_symmetric(self, module: _aoquant.FakeQuantizeBase): """ Returns activation post process module which is same as module but with affine qscheme instead of symmetric. """ activation_post_process = module.activation_post_process observer_type = type(activation_post_process) if observer_type not in _observer_type_to_param_names: raise ValueError(f"Found unrecognized observer type {type(activation_post_process)}.") observer_param_names = _observer_type_to_param_names[observer_type] kwargs = {k: getattr(activation_post_process, k) for k in observer_param_names} if "qscheme" in kwargs: kwargs["qscheme"] = _torch.per_tensor_affine if module.ch_axis != -1: new_act_post_process = _aoquant.FakeQuantize( observer=observer_type, ch_axis=module.ch_axis, **kwargs ) else: new_act_post_process = _aoquant.FakeQuantize(observer=observer_type, **kwargs) return new_act_post_process def _replace_activation_quantizers(self, model: _fx.GraphModule) -> _fx.GraphModule: """ Replaces all nodes marked for replacement with new nodes. """ replaced = set() for node, new_act_post_process in self._new_act_post_process.items(): if node not in replaced: model.delete_submodule(node.target) model.add_submodule(node.target, new_act_post_process) replaced.add(node) # replace pointers to all modules which share this activation quantizer for child_node in self._modules_to_replace[node]: if child_node not in replaced: parent, child = _get_parent_child_name(child_node.target) parent_mod = model.get_submodule(parent) setattr(parent_mod, child, new_act_post_process) replaced.add(child_node) model.recompile() return model def _mark_act_post_process_for_replacement( self, node: _fx.Node, model: _fx.GraphModule, new_act_post_process: _Optional[_aoquant.FakeQuantizeBase] = None, ): """ Marks an activation post process layer (activation quantizer) for replacement. 
""" shared_qparam_nodes = [] if len(node.users) == 1: next_node = list(node.users.keys())[0] next_module = _find_module(model, next_node) if _is_activation_post_process(next_module) and _is_quantized(next_module): module_to_replace_id = id(model.get_submodule(next_node.target)) # Some mods share the activation quantizer being replaced here, # so we collect all those mods here so that those can be pointed to # the new replaced module for child_node in self._act_quant_groups[module_to_replace_id]: consumer_node = child_node.args[0] if consumer_node.op == "call_module": child_mod = _find_module(model, consumer_node) if type(child_mod) in self._share_qparams_ops: shared_qparam_nodes.append(child_node) self._modules_to_replace[child_node] = [] elif consumer_node.op == "call_function": if consumer_node.target in self._share_qparams_ops: shared_qparam_nodes.append(child_node) self._modules_to_replace[child_node] = [] self._modules_to_replace[next_node] = shared_qparam_nodes if new_act_post_process is None: new_act_post_process = self._get_affine_act_post_process_mod_from_symmetric( next_module ) self._new_act_post_process[next_node] = new_act_post_process @staticmethod def _remove_activation_quantizer_after_dropout(model: _fx.GraphModule): """ During evaluation, dropout is a no-op. During conversion, conv_1 -> activation_q_1 -> dropout -> activation_q_2 -> conv_2 becomes conv_1 -> quant_1 -> dequant_1 -> quant_2 -> dequant_2 -> conv_2 where quant_1,dequant_1 have different qparams from quant_2/dequant_2 because dropout scales the output by 1/(1-p). This leads to inefficiency during inference. Since during inference, conv_2 sees quantized activations coming from conv_1, removing activation_q_2 doesn't lead to increased quantization error. Hence, this pass removes activation_q_2. """ nodes_to_remove = set() for node in model.graph.nodes: if node.op == "call_module": layer = _find_module(model, node) if isinstance(layer, tuple(_scale_only_layers)): prev_module = _find_module(model, node.prev) next_module = _find_module(model, node.next) if _is_activation_post_process(next_module) and _is_activation_post_process( prev_module ): nodes_to_remove.add(node.next) for node in nodes_to_remove: node.replace_all_uses_with(node.prev) model.delete_submodule(node.target) model.graph.erase_node(node) model.recompile() return model def _mark_always_affine_layers_for_replacement(self, model: _fx.GraphModule): """ Some layers like ReLU can be quantized with affine qscheme even when we want to use symmetric quantization (zero point = 0). This is because these layers always have a non-negative output. And thus, an affine activation post process layer attached after layers like these will always observe zero point as 0. This can possibly help us reduce quantization error because of the larger number of quantization levels available. (Symmetric quantization will force the output of these layers to use [0, 127] as the output range, but with affine quantization, we can use [0, 255]). prepare_qat_fx requires all modules being fused together to have the same QConfig. Thus, if we have a Conv followed by a ReLU and we want to set ReLU to have affine qscheme, we would have to set Conv to use affine qscheme as well. But this isn't correct because a stand alone Conv layer somewhere else in the network will also use affine qscheme which is undesirable we want to fix zero point to 0. Hence, we add this pass which replaces all occurrences of activation post process after ``always_affine_layers`` with an affine version. 
""" # Note: For all these ops, whether or not we can use affine qscheme for them depends only on # the op itself or one preceding op. # Note: graph.nodes traverses the nodes in topological order for node in model.graph.nodes: if node.op == "call_module": layer = _find_target(model, node.target) if type(layer) in _always_affine_layers: self._mark_act_post_process_for_replacement(node, model) elif isinstance(layer, tuple(_fused_quantized_layers)): if type(layer.act) in _always_affine_layers: self._mark_act_post_process_for_replacement(node, model) # layers which only scale the output can also use affine qcheme elif isinstance(layer, tuple(_scale_only_layers)): arg_mod = _find_module(model, node.args[0]) if ( _is_activation_post_process(arg_mod) and node.args[0] in self._modules_to_replace ): self._mark_act_post_process_for_replacement(node, model) elif node.op == "call_function": combine_op_type = _combine_op_type(node) if combine_op_type is not None: if combine_op_type == _CombinationOpType.AddReLU: self._mark_act_post_process_for_replacement(node, model) elif node.target in _always_affine_layers: self._mark_act_post_process_for_replacement(node, model) def _mark_always_affine_combination_ops_for_replacement(self, model: _fx.GraphModule): """ This method follows the same reasoning as described in ``_mark_always_affine_layers_for_replacement``, but instead of replacing activation quantizers for stand-alone ops, it replaces them for ops which consume more than 1 tensor as input. For add or cat, if the qscheme of all tensors being combined together is affine, it also uses affine qscheme, otherwise, it uses symmetric qscheme. """ for node in model.graph.nodes: if node.op == "call_function": combine_op_type = _combine_op_type(node) if combine_op_type is not None and combine_op_type != _CombinationOpType.AddReLU: args = node.args if combine_op_type == _CombinationOpType.Concat: args = node.args[0] arg_act_qschemes = [] for arg in args: arg_mod = _find_module(model, arg) if arg_mod is not None: if ( type(arg_mod) in _always_affine_layers or arg in self._modules_to_replace ): arg_act_qschemes.append(_QuantizationScheme.affine) elif hasattr(arg_mod, "qscheme"): if arg_mod.qscheme == _torch.per_tensor_affine: arg_act_qschemes.append(_QuantizationScheme.affine) else: arg_act_qschemes.append(_QuantizationScheme.symmetric) else: arg_act_qschemes.append(_QuantizationScheme.symmetric) else: arg_act_qschemes.append(_QuantizationScheme.symmetric) if all(x == _QuantizationScheme.affine for x in arg_act_qschemes): # We have already marked cat op for replacement, when one of the # tensors it combines was marked for replacement. So we don't need to # add it here again. if combine_op_type != _CombinationOpType.Concat: self._mark_act_post_process_for_replacement(node, model) else: # If any of the tensor being cat-ed together need to use # [-128, 127] range, we can't use affine quantization in # symmetric mode for them, so we remove them from modules marked for replacement. if combine_op_type == _CombinationOpType.Concat: for arg in args: if arg in self._modules_to_replace: self._modules_to_replace.pop(arg) if arg in self._new_act_post_process: self._new_act_post_process.pop(arg) def _mark_fixed_qparams_modules_for_replacement(self, model: _fx.GraphModule): """ If a fixed qparams activation is fused, with conv/linear, we need to make sure its qconfig is inherited by the fused op's activation quantizer. Before this step, all fused layers will have symmetric/affine activation quantizer. 
""" for node in model.graph.nodes: if node.op == "call_module": layer = _find_target(model, node.target) if isinstance(layer, tuple(_fused_quantized_layers)): # If output of this layer is being cat with another layer, we don't want # to enforce that layer to use the same activation quantizer, so we ignore it if _torch.cat in [ child_node.target for child_node in self._act_quant_groups[id(layer)] ]: continue elif type(layer.act) in _fixed_qparams_modules: act_post_process = self._qconfig_mapping.object_type_qconfigs[ type(layer.act) ].activation() self._mark_act_post_process_for_replacement(node, model, act_post_process) def _replace_weight_fake_quant_for_embedding_layers(self, model: _fx.GraphModule): """ Changes qscheme of embedding layers from float qparams to integer qparams. """ for node in model.graph.nodes: if node.op == "call_module": layer = _find_target(model, node.target) if ( isinstance(layer, _torch.nn.Embedding) and hasattr(layer, "weight_fake_quant") and isinstance(layer.weight_fake_quant, _aoquant.FakeQuantize) ): weight_dtype = layer.weight_fake_quant.dtype delattr(layer, "weight_fake_quant") observer_cls = type(layer.qconfig.weight().activation_post_process) layer.weight_fake_quant = _aoquant.FakeQuantize( observer=observer_cls, dtype=weight_dtype, qscheme=_QuantizationScheme.get_qscheme( self._quantization_scheme, is_per_channel=True ), ch_axis=_default_quantization_options["weight_ch_axis"], ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/quantization/_coreml_quantizer.py0000644000000000000000000004631614672066616026654 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import operator as _operator from typing import Callable as _Callable from typing import List as _List from typing import Optional as _Optional import torch as _torch from torch.ao.quantization.quantizer.quantizer import Quantizer as _TorchQuantizer from torch.ao.quantization.quantizer.xnnpack_quantizer import _get_module_name_filter from torch.fx import Node as _Node import coremltools.optimize.torch.quantization._coreml_quantizer_utils as _annotation_utils from coremltools.optimize.torch._utils.python_utils import FunctionRegistryMixin as _FunctionRegistryMixin from coremltools.optimize.torch.quantization._annotation_config import ( AnnotationConfig as _AnnotationConfig, ) from coremltools.optimize.torch.quantization.quantization_config import ( LinearQuantizerConfig as _LinearQuantizerConfig, ) from coremltools.optimize.torch.quantization.quantization_config import ( ModuleLinearQuantizerConfig as _ModuleLinearQuantizerConfig, ) class _AnnotationPatternRegistry(_FunctionRegistryMixin): """ A registry of quantization annotation rules. 
""" @classmethod def get_annotators(cls): return cls.REGISTRY @_AnnotationPatternRegistry.register("conv_act") def _annotate_conv_act( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input -> conv -> activation -> output """ return _annotation_utils.annotate_conv_bn_act_helper( model, quantization_config, filter_fn, use_bn=False ) @_AnnotationPatternRegistry.register("conv_bn_act") def _annotate_conv_bn_act( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input -> conv -> batch_norm -> activation -> output """ return _annotation_utils.annotate_conv_bn_act_helper( model, quantization_config, filter_fn, use_bn=True ) @_AnnotationPatternRegistry.register("conv_bn") def _annotate_conv_bn( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input -> conv -> batch_norm -> output """ annotated_partitions = [] conv_dims = [1, 2, 3] for conv_dim in conv_dims: pattern_gm = _annotation_utils.get_conv_bn_pattern( conv_dim, act_fn=None, act_in_place=False ) annotated_partitions.extend( _annotation_utils.annotate_weighted_mod_pattern( model, pattern_gm, quantization_config, filter_fn ) ) return annotated_partitions @_AnnotationPatternRegistry.register("conv") def _annotate_conv( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input -> conv -> output """ annotated_partitions = [] for conv_dim in [1, 2, 3]: pattern_gm = _annotation_utils.get_conv_pattern(conv_dim=conv_dim, act_fn=None) annotated_partitions.extend( _annotation_utils.annotate_weighted_mod_pattern( model, pattern_gm, quantization_config, filter_fn ) ) return annotated_partitions @_AnnotationPatternRegistry.register("linear_act") def _annotate_linear_act( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input -> linear -> activation -> output """ return _annotation_utils.annotate_linear_bn_act_helper( model, quantization_config, filter_fn, use_bn=False ) @_AnnotationPatternRegistry.register("linear_bn_act") def _annotate_linear_bn_act( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input -> linear -> batch_norm -> activation -> output """ return _annotation_utils.annotate_linear_bn_act_helper( model, quantization_config, filter_fn, use_bn=True ) @_AnnotationPatternRegistry.register("linear_bn") def _annotate_linear_bn( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input -> linear -> batch_norm -> output """ pattern_gm = _annotation_utils.get_linear_bn_pattern( act_fn=None, act_in_place=False ) return _annotation_utils.annotate_weighted_mod_pattern( model, pattern_gm, quantization_config, filter_fn ) @_AnnotationPatternRegistry.register("linear") def _annotate_linear( model: 
_torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input -> linear -> output """ pattern_gm = _annotation_utils.get_linear_pattern(act_fn=None) return _annotation_utils.annotate_weighted_mod_pattern( model, pattern_gm, quantization_config, filter_fn ) @_AnnotationPatternRegistry.register("add_act") def _annotate_add_act( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input_1 --- \ --> add -> activation -> output / input_2 --- """ ops = [_operator.add, _torch.add, _operator.iadd] return _annotation_utils.annotate_binary_op_helper( model, ops, quantization_config, filter_fn ) @_AnnotationPatternRegistry.register("add") def _annotate_add( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input_1 --- \ --> add -> output / input_2 --- """ annotated_partitions = [] ops = [_operator.add, _torch.add, _operator.iadd] for binary_op in ops: pattern_gm = _annotation_utils.get_binary_op_act_pattern(binary_op, None) annotated_partitions.extend( _annotation_utils.annotate_binary_op_act_pattern( model, pattern_gm, quantization_config, filter_fn ) ) return annotated_partitions @_AnnotationPatternRegistry.register("mul_act") def _annotate_mul_act( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input_1 --- \ --> mul -> activation -> output / input_2 --- """ ops = [_operator.mul, _torch.mul, _operator.imul] return _annotation_utils.annotate_binary_op_helper( model, ops, quantization_config, filter_fn ) @_AnnotationPatternRegistry.register("mul") def _annotate_mul( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input_1 --- \ --> mul -> output / input_2 --- """ annotated_partitions = [] ops = [_operator.mul, _torch.mul, _operator.imul] for binary_op in ops: pattern_gm = _annotation_utils.get_binary_op_act_pattern(binary_op, None) annotated_partitions.extend( _annotation_utils.annotate_binary_op_act_pattern( model, pattern_gm, quantization_config, filter_fn ) ) return annotated_partitions @_AnnotationPatternRegistry.register("matmul_act") def _annotate_matmul_act( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input_1 --- \ --> matmul -> activation -> output / input_2 --- """ return _annotation_utils.annotate_binary_op_helper( model, [_torch.matmul], quantization_config, filter_fn ) @_AnnotationPatternRegistry.register("matmul") def _annotate_matmul( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input_1 --- \ --> matmul -> output / input_2 --- """ pattern_gm = _annotation_utils.get_binary_op_act_pattern(_torch.matmul, None) return _annotation_utils.annotate_binary_op_act_pattern( model, pattern_gm, quantization_config, filter_fn ) 
@_AnnotationPatternRegistry.register("max_pool1d") def _annotate_max_pool1d( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input -> max_pool1d -> output """ return _annotation_utils.annotate_unary_shared_observer_ops( model, [_torch.nn.MaxPool1d, _torch.nn.functional.max_pool1d, _torch.max_pool1d], quantization_config, filter_fn, ) @_AnnotationPatternRegistry.register("max_pool2d") def _annotate_max_pool2d( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input -> max_pool2d -> output """ return _annotation_utils.annotate_unary_shared_observer_ops( model, [_torch.nn.MaxPool2d, _torch.nn.functional.max_pool2d, _torch.max_pool2d], quantization_config, filter_fn, ) @_AnnotationPatternRegistry.register("max_pool3d") def _annotate_max_pool3d( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input -> max_pool3d -> output """ return _annotation_utils.annotate_unary_shared_observer_ops( model, [_torch.nn.MaxPool3d, _torch.nn.functional.max_pool3d, _torch.max_pool3d], quantization_config, filter_fn, ) @_AnnotationPatternRegistry.register("adaptive_avg_pool1d") def _annotate_adaptive_avg_pool1d( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input -> adaptive_avg_pool1d -> output """ return _annotation_utils.annotate_unary_shared_observer_ops( model, [ _torch.nn.AdaptiveAvgPool1d, _torch.nn.functional.adaptive_avg_pool1d, _torch.adaptive_avg_pool1d, ], quantization_config, filter_fn, ) @_AnnotationPatternRegistry.register("adaptive_avg_pool2d") def _annotate_adaptive_avg_pool2d( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input -> adaptive_avg_pool2d -> output """ return _annotation_utils.annotate_unary_shared_observer_ops( model, [_torch.nn.AdaptiveAvgPool2d, _torch.nn.functional.adaptive_avg_pool2d], quantization_config, filter_fn, ) @_AnnotationPatternRegistry.register("adaptive_avg_pool3d") def _annotate_adaptive_avg_pool3d( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input -> adaptive_avg_pool3d -> output """ return _annotation_utils.annotate_unary_shared_observer_ops( model, [_torch.nn.AdaptiveAvgPool3d, _torch.nn.functional.adaptive_avg_pool3d], quantization_config, filter_fn, ) @_AnnotationPatternRegistry.register("avg_pool1d") def _annotate_avg_pool1d( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input -> avg_pool1d -> output """ return _annotation_utils.annotate_unary_shared_observer_ops( model, [ _torch.nn.AvgPool1d, _torch.nn.functional.avg_pool1d, _torch.avg_pool1d, _torch.mean, ], quantization_config, filter_fn, ) @_AnnotationPatternRegistry.register("avg_pool2d") def _annotate_avg_pool2d( model: _torch.fx.GraphModule, 
quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input -> avg_pool2d -> output """ return _annotation_utils.annotate_unary_shared_observer_ops( model, [ _torch.nn.AvgPool2d, _torch.nn.functional.avg_pool2d, _torch._C._nn.avg_pool2d, ], quantization_config, filter_fn, ) @_AnnotationPatternRegistry.register("avg_pool3d") def _annotate_avg_pool3d( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input -> avg_pool3d -> output """ return _annotation_utils.annotate_unary_shared_observer_ops( model, [ _torch.nn.AvgPool3d, _torch.nn.functional.avg_pool3d, _torch._C._nn.avg_pool3d, ], quantization_config, filter_fn, ) @_AnnotationPatternRegistry.register("flatten") def _annotate_flatten( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates input -> flatten -> output """ return _annotation_utils.annotate_unary_shared_observer_ops( model, [ _torch.nn.Flatten, _torch.flatten, ], quantization_config, filter_fn, ) class CoreMLQuantizer(_TorchQuantizer): """ Annotates all recognized patterns using ``config``. Extends py:class:`torch.ao.quantization.quantizer.quantizer.Quantizer` to add support for quantization patterns supported by Core ML. Use it in conjunction with PyTorch 2.0 ``prepare_pt2e`` and ``prepare_qat_pt2e`` APIs for post training weight and activation quantization using calibration data and for quantization aware training (QAT). Example: .. code-block:: python import torch.nn as nn from torch._export import capture_pre_autograd_graph from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_qat_pt2e from coremltools.optimize.torch.quantization._coreml_quantizer import CoreMLQuantizer model = nn.Sequential( OrderedDict( { "conv": nn.Conv2d(1, 20, (3, 3)), "relu1": nn.ReLU(), "conv2": nn.Conv2d(20, 20, (3, 3)), "relu2": nn.ReLU(), } ) ) loss_fn = define_loss() # initialize the annotator with quantization config config = LinearQuantizerConfig.from_dict( { "global_config": { "quantization_scheme": "symmetric", } } ) quantizer = CoreMLQuantizer(config) example_inputs = torch.randn(1, 1, 28, 28) # create export graph exported_model = capture_pre_autograd_graph(model, (example_inputs,)) # prepare the model to insert FakeQuantize layers for QAT prepared_model = prepare_qat_pt2e(exported_model, quantizer) # use prepared model in your PyTorch training loop for inputs, labels in data: output = prepared_model(inputs) loss = loss_fn(output, labels) loss.backward() optimizer.step() # turn observers/quantizers on/off depending on iteration number # convert operations to their quanitzed counterparts using parameters learnt via QAT model = convert_pt2e(prepared_model) """ def __init__(self, config: _Optional[_LinearQuantizerConfig]): self._config = config def _annotate_all_patterns( self, model: _torch.fx.GraphModule, quantization_config: _Optional[_ModuleLinearQuantizerConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ): annotators = _AnnotationPatternRegistry.get_annotators() for _, annotator in annotators.items(): annotation_config = _AnnotationConfig.from_quantization_config( quantization_config ) annotator(model, annotation_config, filter_fn) def annotate(self, model: _torch.fx.GraphModule) -> 
_torch.fx.GraphModule: # First annotate all modules/operations which have name based configs module_name_list = list(self._config.module_name_configs.keys()) for module_name, config in self._config.module_name_configs.items(): self._annotate_all_patterns( model, config, _get_module_name_filter(module_name) ) # Next annotate all modules/operations which have type based configs tp_list = list(self._config.module_type_configs.keys()) for module_type, config in self._config.module_type_configs.items(): self._annotate_all_patterns( model, config, _annotation_utils.get_object_type_filter(module_type) ) # Annotate all other modules/operations self._annotate_all_patterns( model, self._config.global_config, _annotation_utils.get_not_object_type_or_name_filter( tp_list, module_name_list ), ) return model def validate(self, model: _torch.fx.GraphModule) -> None: pass ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/quantization/_coreml_quantizer_utils.py0000644000000000000000000006646214672066616030100 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools as _itertools from typing import Callable as _Callable from typing import List as _List from typing import Optional as _Optional from typing import Tuple as _Tuple import torch as _torch import torch.nn.functional as _F _IS_TORCH_OLDER_THAN_2_3 = tuple(map(int, _torch.__version__.split(".")[:2])) < (2, 3) _IS_TORCH_OLDER_THAN_2_4 = tuple(map(int, _torch.__version__.split(".")[:2])) < (2, 4) if _IS_TORCH_OLDER_THAN_2_4: from torch.ao.quantization.pt2e.utils import get_aten_graph_module else: from torch.ao.quantization.pt2e.utils import _get_aten_graph_module_for_pattern from torch.ao.quantization.quantizer.quantizer import ( FixedQParamsQuantizationSpec as _FixedQParamsQuantizationSpec, ) from torch.ao.quantization.quantizer.quantizer import ( QuantizationAnnotation as _QuantizationAnnotation, ) from torch.ao.quantization.quantizer.quantizer import QuantizationSpec as _TorchQuantizationSpec from torch.ao.quantization.quantizer.quantizer import ( QuantizationSpecBase as _TorchQuantizationSpecBase, ) from torch.ao.quantization.quantizer.quantizer import ( SharedQuantizationSpec as _SharedQuantizationSpec, ) from torch.ao.quantization.quantizer.xnnpack_quantizer import _get_module_name_filter from torch.ao.quantization.quantizer.xnnpack_quantizer_utils import ( _is_annotated, _mark_nodes_as_annotated, ) from torch.fx import Node as _Node from torch.fx.passes.utils.matcher_with_name_node_map_utils import ( SubgraphMatcherWithNameNodeMap as _SubgraphMatcherWithNameNodeMap, ) from torch.fx.passes.utils.source_matcher_utils import ( get_source_partitions as _get_source_partitions, ) from coremltools.optimize.torch.quantization._annotation_config import ( AnnotationConfig as _AnnotationConfig, ) # All activations recognized for conv-act/conv-bn-act patterns _supported_activations = ( _F.relu, _F.relu6, _F.leaky_relu, _F.silu, _F.elu, _F.celu, _F.selu, _F.mish, _F.hardtanh, _F.hardswish, _F.hardsigmoid, ) # These activation functions don't have an inplace argument _supported_activations_no_inplace = (_F.gelu, _F.sigmoid, _F.logsigmoid, _F.tanh) # Map of dimension to convolution function _conv_fn_map = {1: _F.conv1d, 2: _F.conv2d, 3: _F.conv3d} def _get_aten_graph_module( pattern: 
_torch.nn.Module, example_inputs: _Tuple[_torch.Tensor], is_cuda: bool = False ): if _IS_TORCH_OLDER_THAN_2_3: return get_aten_graph_module(pattern.forward, example_inputs, is_cuda) elif _IS_TORCH_OLDER_THAN_2_4: return get_aten_graph_module(pattern, example_inputs, is_cuda) else: return _get_aten_graph_module_for_pattern(pattern, example_inputs, is_cuda) def _adjust_activation_qspec( node: _torch.fx.Node, qspec: _Optional[_TorchQuantizationSpecBase] ) -> _Optional[_TorchQuantizationSpecBase]: """ Adjust quantization spec for ops which can use fixed qparams or ops for which we can use affine quantization mode during symmetric quantization because their output is always positive. """ if qspec is None: return qspec tanh_qspec = _FixedQParamsQuantizationSpec( dtype=_torch.uint8, scale=2.0 / 256.0, zero_point=128, quant_min=0, quant_max=255, qscheme=_torch.per_tensor_symmetric, ) sigmoid_qspec = _FixedQParamsQuantizationSpec( dtype=_torch.uint8, scale=1.0 / 256.0, zero_point=0, quant_min=0, quant_max=255, qscheme=_torch.per_tensor_affine, ) fixed_q_params_ops = { _torch.ops.aten.tanh.default: tanh_qspec, _torch.ops.aten.tanh_.default: tanh_qspec, _torch.ops.aten.sigmoid.default: sigmoid_qspec, _torch.ops.aten.sigmoid_.default: sigmoid_qspec, _torch.ops.aten.hardsigmoid.default: sigmoid_qspec, _torch.ops.aten.hardsigmoid_.default: sigmoid_qspec, } always_affine_ops = ( _torch.ops.aten.relu.default, _torch.ops.aten.relu_.default, _torch.ops.aten.relu6.default, _torch.ops.aten.relu6_.default, ) # ReLU6 activation maps to _torch.ops.aten.hardtanh.default with # min_val = 0 and max_val = 6 is_always_affine_op = node.target in always_affine_ops or ( node.target in [_torch.ops.aten.hardtanh.default, _torch.ops.aten.hardtanh_.default] and node.args[1] == 0 # min_val, corresponding to ReLU6 and node.args[2] == 6 # max_val, corresponding to ReLU6 ) if node.target in fixed_q_params_ops: return _TorchQuantizationSpec( observer_or_fake_quant_ctr=qspec.observer_or_fake_quant_ctr, dtype=qspec.dtype, qscheme=fixed_q_params_ops[node.target].qscheme, ) # FIXME: Because of a bug in PyTorch in function _create_obs_or_fq_from_qspec # in module torch/ao/quantization/fx/prepare.py which creates a # FixedQParamsFakeQuantize partial, instead of an instance, we cannot # actually create FixedQParamsQuantizationSpec if is_always_affine_op: return _TorchQuantizationSpec( observer_or_fake_quant_ctr=qspec.observer_or_fake_quant_ctr, dtype=qspec.dtype, qscheme=_torch.per_tensor_affine, ) return qspec def get_object_type_filter(tp: _Callable): """ Returns a filter which returns True if a node in the final exported graph was created because of an object of type ``tp`` """ def object_type_filter(n: _Node) -> bool: # example: { # 'add_10': ('add', ) # } nn_module_stack = n.meta.get("nn_module_stack", {}) types = [t for _, t in nn_module_stack.values()] source_fn_stack = n.meta.get("source_fn_stack", {}) types.extend([t for _, t in source_fn_stack]) return tp in types return object_type_filter def get_not_object_type_or_name_filter( tp_list: _List[_Callable], module_name_list: _List[str] ) -> _Callable[[_Node], bool]: """ Returns a filter which returns True if a node in the final exported graph was not created using any modules with names in ``module_name_list`` or objects with type in ``tp_list``. 
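    For example (a sketch; the module name is a hypothetical placeholder):

    .. code-block:: python

        filter_fn = get_not_object_type_or_name_filter(
            tp_list=[_torch.nn.Linear], module_name_list=["backbone.conv1"]
        )
        # filter_fn(node) is False for nodes traced from any nn.Linear or from
        # the submodule named "backbone.conv1", and True otherwise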
""" object_type_filters = [get_object_type_filter(tp) for tp in tp_list] module_name_list_filters = [_get_module_name_filter(m) for m in module_name_list] def not_object_type_or_name_filter(n: _Node) -> bool: return not any(f(n) for f in object_type_filters + module_name_list_filters) return not_object_type_or_name_filter def _get_weighted_mod_pattern( mod_fn: _Callable, example_inputs: _Tuple[_torch.Tensor, ...], act_fn: _Optional[_Callable] = None, act_in_place: bool = False, ) -> _torch.nn.Module: """ Returns an aten graph corresponding to a sequence of these ops: input -> weighted_mod -> activation -> output A weighted mod is a module which has a weight and bias, such as a convolution module or a linear module. No activation is used if ``act_fn`` is ``None``. ``act_fn`` is an activation function from _supported_activations or _supported_activations_no_inplace """ class Pattern(_torch.nn.Module): def forward(self, input, weight, bias): mod_out = mod_fn(input, weight, bias) output = mod_out node_dict = { "input": input, "mod": mod_out, "weight": weight, "bias": bias, } if act_fn is not None: # Only add output if activation function is applied to model output output = act_fn(output, inplace=True) if act_in_place else act_fn(output) node_dict["output"] = output return output, node_dict return _get_aten_graph_module(Pattern(), example_inputs) def _get_weighted_mod_bn_pattern( mod_fn: _Callable, example_inputs: _Tuple[_torch.Tensor, ...], act_fn: _Optional[_Callable] = None, act_in_place: bool = False, ) -> _torch.nn.Module: """ Returns an aten graph corresponding to a sequence of these ops: input -> weighted_mod -> batch_norm -> activation -> output A weighted mod is a module which has a weight and bias, such as a convolution module or a linear module. No activation is used if ``act_fn`` is ``None``. ``act_fn`` is an activation function from _supported_activations or _supported_activations_no_inplace """ class Pattern(_torch.nn.Module): def forward(self, input, weight, bias, bn_weight, bn_bias, bn_run_mean, bn_run_var): mod_out = mod_fn(input, weight, bias) output = _F.batch_norm( mod_out, bn_run_mean, bn_run_var, bn_weight, bn_bias, training=True ) if act_fn is not None: output = act_fn(output, inplace=True) if act_in_place else act_fn(output) return output, { "input": input, "mod": mod_out, "weight": weight, "bias": bias, "output": output, } return _get_aten_graph_module(Pattern(), example_inputs) def get_binary_op_act_pattern( binary_op: _Callable, act_fn: _Optional[_Callable] = None, act_in_place: bool = False, ) -> _torch.nn.Module: """ Returns an aten graph corresponding to a sequence of these ops: input_1 --- \ --> binary_op -> activation -> output / input_2 --- A binary op is any operation which consumes two inputs to create one output, such as addition or multiplication. No activation is used if ``act_fn`` is ``None``. 
``act_fn`` is an activation function from _supported_activations or _supported_activations_no_inplace """ class Pattern(_torch.nn.Module): def forward(self, input_1, input_2): binary_op_out = binary_op(input_1, input_2) node_dict = { "binary_op": binary_op_out, } output = binary_op_out if act_fn is not None: output = act_fn(output, inplace=True) if act_in_place else act_fn(output) node_dict["output"] = output return output, node_dict example_inputs = (_torch.randn(1), _torch.randn(1)) return _get_aten_graph_module(Pattern(), example_inputs) def get_conv_pattern( conv_dim: int, act_fn: _Optional[_Callable] = None, act_in_place: bool = False ) -> _torch.nn.Module: """ Returns an aten graph corresponding to a sequence of these ops: input -> conv -> activation -> output No activation is used if ``act_fn`` is ``None``. ``act_fn`` is an activation function from _supported_activations or _supported_activations_no_inplace """ assert ( conv_dim in _conv_fn_map ), f"Dimension {conv_dim} is not supported for Convolution layers." example_inputs = ( _torch.randn(1, 1, *[3] * conv_dim), # input _torch.randn(1, 1, *[1] * conv_dim), # conv weight _torch.randn(1), # conv bias ) return _get_weighted_mod_pattern( _conv_fn_map[conv_dim], example_inputs, act_fn, act_in_place ) def get_conv_bn_pattern( conv_dim: int, act_fn: _Optional[_Callable] = None, act_in_place: bool = False ) -> _torch.nn.Module: """ Returns an aten graph corresponding to a sequence of these ops: input -> conv -> batch_norm -> activation -> output No activation is used if ``act_fn`` is ``None``. ``act_fn`` is an activation function from _supported_activations or _supported_activations_no_inplace """ assert ( conv_dim in _conv_fn_map ), f"Dimension {conv_dim} is not supported for Convolution layers." example_inputs = ( _torch.randn(1, 1, *[3] * conv_dim), # input _torch.randn(1, 1, *[1] * conv_dim), # conv weight _torch.randn(1), # conv bias _torch.randn(1), # bn_weight _torch.randn(1), # bn_bias _torch.randn(1), # bn_run_mean _torch.randn(1), # bn_run_var ) return _get_weighted_mod_bn_pattern( _conv_fn_map[conv_dim], example_inputs, act_fn, act_in_place ) def get_linear_pattern( act_fn: _Optional[_Callable] = None, act_in_place: bool = False ) -> _torch.nn.Module: """ Returns an aten graph corresponding to a sequence of these ops: input -> linear -> activation -> output No activation is used if ``act_fn`` is ``None``. ``act_fn`` is an activation function from _supported_activations or _supported_activations_no_inplace """ example_inputs = ( _torch.randn(1, 1), # input _torch.randn(3, 1), # linear weight _torch.randn(3), # linear bias ) return _get_weighted_mod_pattern(_F.linear, example_inputs, act_fn, act_in_place) def get_linear_bn_pattern( act_fn: _Optional[_Callable] = None, act_in_place: bool = False ) -> _torch.nn.Module: """ Returns an aten graph corresponding to a sequence of these ops: input -> linear -> batch_norm -> activation -> output No activation is used if ``act_fn`` is ``None``. 
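    For example (a sketch producing the linear -> batch_norm -> hardswish pattern):

    .. code-block:: python

        pattern_gm = get_linear_bn_pattern(act_fn=_F.hardswish, act_in_place=False)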
``act_fn`` is an activation function from _supported_activations or _supported_activations_no_inplace """ example_inputs = ( _torch.randn(2, 1), # input _torch.randn(3, 1), # linear weight _torch.randn(3), # linear bias _torch.randn(3), # bn_weight _torch.randn(3), # bn_bias _torch.randn(3), # bn_run_mean _torch.randn(3), # bn_run_var ) return _get_weighted_mod_bn_pattern(_F.linear, example_inputs, act_fn, act_in_place) def annotate_weighted_mod_pattern( model: _torch.fx.GraphModule, pattern_gm: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]], ) -> _Optional[_List[_List[_Node]]]: """ Annotates all nodes in ``model``, which match the pattern specified by ``pattern_gm`` using ``quantization_config``. ``pattern_gm`` captures patterns of the following type: input -> weighted_mod -> batch_norm -> activation -> output batch_norm and activation may or may not be applied in the pattern. Only annotates those patterns in which all nodes return True when ``filter_fn`` is applied to them. """ model.graph.eliminate_dead_code() model.recompile() matcher = _SubgraphMatcherWithNameNodeMap(pattern_gm, ignore_literals=True) matches = matcher.match(model.graph) annotated_partitions = [] for match in matches: name_node_map = match.name_node_map input_node = name_node_map["input"] mod_node = name_node_map["mod"] weight_node = name_node_map["weight"] bias_node = name_node_map["bias"] if "output" in name_node_map: # In this case, an activation is applied to the weighted module output output_node = name_node_map["output"] # If the output is same as mod_node, it means we have an inplace activation, # so we need to correct the mod_node if mod_node == output_node: mod_node = mod_node.args[0] else: output_node = None # Validate mod args if mod_node.args[0] is not input_node: raise ValueError( f"Weighted module arg did not contain input node {input_node}" ) if mod_node.args[1] is not weight_node: raise ValueError( f"Weighted module arg did not contain weight node {weight_node}" ) if len(mod_node.args) > 2 and mod_node.args[2] is not bias_node: raise ValueError( f"Weighted module arg did not contain bias node {bias_node}" ) # Skip if the partition is already annotated or is filtered out by the user partition = [mod_node, weight_node] if bias_node is not None: partition.append(bias_node) if _is_annotated(partition): continue if filter_fn and any(not filter_fn(n) for n in partition): continue # Annotate conv inputs and pattern output input_qspec_map = dict() if not _is_annotated([input_node]): input_qspec_map[input_node] = ( quantization_config.input_activation if quantization_config else None ) else: input_qspec_map[input_node] = input_node.meta[ "quantization_annotation" ].output_qspec input_qspec_map[weight_node] = ( quantization_config.weight if quantization_config else None ) output_qspec = ( quantization_config.output_activation if quantization_config else None ) if bias_node is not None: input_qspec_map[bias_node] = None if output_node is None: mod_node.meta["quantization_annotation"] = _QuantizationAnnotation( input_qspec_map=input_qspec_map, output_qspec=output_qspec, _annotated=True, ) else: mod_node.meta["quantization_annotation"] = _QuantizationAnnotation( input_qspec_map=input_qspec_map, _annotated=True, ) if not _is_annotated([output_node]): output_qspec = _adjust_activation_qspec( node=output_node, qspec=output_qspec ) output_node.meta["quantization_annotation"] = _QuantizationAnnotation( output_qspec=output_qspec, _annotated=True, ) 
_mark_nodes_as_annotated(partition) annotated_partitions.append(partition) return annotated_partitions def annotate_binary_op_act_pattern( model: _torch.fx.GraphModule, pattern_gm: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates all nodes in ``model``, which match the pattern specified by ``pattern_gm`` using ``quantization_config``. ``pattern_gm`` captures patterns of the following type: input_1 --- \ --> binary_op -> activation -> output / input_2 --- activation may or may not be applied in the pattern. Only annotates those patterns in which all nodes return True when ``filter_fn`` is applied to them. """ model.graph.eliminate_dead_code() model.recompile() matcher = _SubgraphMatcherWithNameNodeMap(pattern_gm, ignore_literals=True) matches = matcher.match(model.graph) annotated_partitions = [] for match in matches: name_node_map = match.name_node_map binary_op_node: _Node = name_node_map["binary_op"] if "output" in name_node_map: output_node = name_node_map["output"] # In this case, an activation is applied to the weighted module output output_node = name_node_map["output"] # If the output is same as binary_op_node, it means we have an inplace activation, # so we need to correct the binary_op_node if binary_op_node == output_node: binary_op_node = binary_op_node.args[0] partition = [output_node, binary_op_node] else: output_node = None partition = [binary_op_node] if output_node is not None and len(binary_op_node.users) > 1: raise ValueError("Binary op with activation has more than one users.") if _is_annotated(partition): continue if filter_fn and any(not filter_fn(n) for n in partition): continue input_act_qspec = ( quantization_config.input_activation if quantization_config else None ) output_act_qspec = ( quantization_config.output_activation if quantization_config else None ) input_qspec_map = {} input_act0 = binary_op_node.args[0] if isinstance(input_act0, _Node): input_qspec_map[input_act0] = input_act_qspec input_act1 = binary_op_node.args[1] if isinstance(input_act1, _Node): input_qspec_map[input_act1] = input_act_qspec if output_node is None: binary_op_node.meta["quantization_annotation"] = _QuantizationAnnotation( input_qspec_map=input_qspec_map, output_qspec=output_act_qspec, _annotated=True, ) else: binary_op_node.meta["quantization_annotation"] = _QuantizationAnnotation( input_qspec_map=input_qspec_map, _annotated=True, ) output_act_qspec = _adjust_activation_qspec( node=output_node, qspec=output_act_qspec ) output_node.meta["quantization_annotation"] = _QuantizationAnnotation( output_qspec=output_act_qspec, _annotated=True, ) _mark_nodes_as_annotated(partition) annotated_partitions.append(partition) return annotated_partitions def annotate_unary_shared_observer_ops( model: _torch.fx.GraphModule, ops: _List[_Callable], quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ Annotates all nodes in ``model``, which correspond to unary ops specified in ``ops``. input --> op --> output input and output nodes share the same quantization parameters. 
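    For example (a sketch mirroring how the max_pool2d annotator in
    _coreml_quantizer.py uses this helper; ``model`` and ``quantization_config``
    are placeholders):

    .. code-block:: python

        annotated_partitions = annotate_unary_shared_observer_ops(
            model,
            [_torch.nn.MaxPool2d, _torch.nn.functional.max_pool2d],
            quantization_config,
            filter_fn=None,
        )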
""" partitions = _get_source_partitions(model.graph, ops, filter_fn) annotated_partitions = [] for _, op_partitions in partitions.items(): for partition in op_partitions: output_node = partition.output_nodes[0] op_node = partition.nodes[0] if _is_annotated([output_node, op_node]): continue input_node = op_node.args[0] input_act_qspec = ( quantization_config.input_activation if quantization_config else None ) output_act_qspec = ( quantization_config.output_activation if quantization_config else None ) if ( "quantization_annotation" not in input_node.meta or not input_node.meta["quantization_annotation"]._annotated or input_node.meta["quantization_annotation"].output_qspec is None or input_act_qspec is None or output_act_qspec is None ): continue # input and output of op will share quantization parameter with input of op act_qspec = _SharedQuantizationSpec(input_node) op_node.meta["quantization_annotation"] = _QuantizationAnnotation( input_qspec_map={ input_node: act_qspec, }, _annotated=True, ) output_node.meta["quantization_annotation"] = _QuantizationAnnotation( output_qspec=act_qspec, _annotated=True, ) annotated_partitions.append(partition.nodes) return annotated_partitions def annotate_conv_bn_act_helper( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, use_bn: bool = False, ) -> _Optional[_List[_List[_Node]]]: """ A helper function for annotating all patterns involving convolution operations, i.e., input -> conv -> batch_norm -> activation -> output conv can be either 1D, 2D or 3D batch_norm and activation may or may not be applied. """ annotated_partitions = [] pattern_map = { True: get_conv_bn_pattern, False: get_conv_pattern, } conv_dims = [1, 2, 3] combinations = _itertools.product(conv_dims, _supported_activations, [True, False]) for conv_dim, act_fn, act_in_place in combinations: pattern_gm = pattern_map[use_bn](conv_dim, act_fn, act_in_place) annotated_partitions.extend( annotate_weighted_mod_pattern( model, pattern_gm, quantization_config, filter_fn ) ) combinations = _itertools.product(conv_dims, _supported_activations_no_inplace) for conv_dim, act_fn in combinations: pattern_gm = pattern_map[use_bn](conv_dim, act_fn, act_in_place=False) annotated_partitions.extend( annotate_weighted_mod_pattern( model, pattern_gm, quantization_config, filter_fn ) ) return annotated_partitions def annotate_linear_bn_act_helper( model: _torch.fx.GraphModule, quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, use_bn: bool = False, ) -> _Optional[_List[_List[_Node]]]: """ A helper function for annotating all patterns involving linear operations, i.e., input -> linear -> batch_norm -> activation -> output batch_norm and activation may or may not be applied. 
""" annotated_partitions = [] pattern_map = { True: get_linear_bn_pattern, False: get_linear_pattern, } combinations = _itertools.product(_supported_activations, [True, False]) for act_fn, act_in_place in combinations: pattern_gm = pattern_map[use_bn](act_fn, act_in_place) annotated_partitions.extend( annotate_weighted_mod_pattern( model, pattern_gm, quantization_config, filter_fn ) ) for act_fn in _supported_activations_no_inplace: pattern_gm = pattern_map[use_bn](act_fn, act_in_place=False) annotated_partitions.extend( annotate_weighted_mod_pattern( model, pattern_gm, quantization_config, filter_fn ) ) return annotated_partitions def annotate_binary_op_helper( model: _torch.fx.GraphModule, binary_ops: _List[_Callable], quantization_config: _Optional[_AnnotationConfig], filter_fn: _Optional[_Callable[[_Node], bool]] = None, ) -> _Optional[_List[_List[_Node]]]: """ A helper function for annotating all patterns involving binary operations, i.e., using ``quantization_config``. input_1 --- \ --> binary_op -> activation -> output / input_2 --- activation may or may not be applied in the pattern. """ annotated_partitions = [] combinations = _itertools.product(binary_ops, _supported_activations, [True, False]) for binary_op, act_fn, act_in_place in combinations: pattern_gm = get_binary_op_act_pattern(binary_op, act_fn, act_in_place) annotated_partitions.extend( annotate_binary_op_act_pattern( model, pattern_gm, quantization_config, filter_fn ) ) combinations = _itertools.product(binary_ops, _supported_activations_no_inplace) for binary_op, act_fn in combinations: pattern_gm = get_binary_op_act_pattern(binary_op, act_fn, act_in_place=False) annotated_partitions.extend( annotate_binary_op_act_pattern( model, pattern_gm, quantization_config, filter_fn ) ) return annotated_partitions ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/quantization/_qconfig_mapping.py0000644000000000000000000003416614672066616026432 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import logging as _logging from typing import Any as _Any from typing import Callable as _Callable from typing import Optional as _Optional from typing import Tuple as _Tuple from typing import Type as _Type import torch as _torch import torch.ao.quantization as _aoquant import torch.nn as _nn from coremltools.optimize.torch.quantization._backend_config import _fixed_qparams_modules from coremltools.optimize.torch.quantization._backend_config import ( get_supported_modules as _get_supported_modules, ) from coremltools.optimize.torch.quantization._utils import get_quant_range as _get_quant_range from coremltools.optimize.torch.quantization.modules.observers import NoopObserver as _NoopObserver from coremltools.optimize.torch.quantization.quantization_config import ( LinearQuantizerConfig as _LinearQuantizerConfig, ) from coremltools.optimize.torch.quantization.quantization_config import ( ModuleLinearQuantizerConfig as _ModuleLinearQuantizerConfig, ) from coremltools.optimize.torch.quantization.quantization_config import ( ObserverType as _ObserverType, ) from coremltools.optimize.torch.quantization.quantization_config import ( QuantizationScheme as _QuantizationScheme, ) from coremltools.optimize.torch.quantization.quantization_config import ( _default_quantization_options, ) _logger = _logging.getLogger(__name__) class _QConfigMappingBuilder: """ Builds py:class:`QConfigMapping` from :py:class:`LinearQuantizerConfig`. """ _observer_cls: _Type = _ObserverType @classmethod def _get_fake_quantize_class( cls, quantization_config: _Optional[_ModuleLinearQuantizerConfig] = None ) -> _Type[_aoquant.FakeQuantizeBase]: return _aoquant.FakeQuantize @classmethod def _create_fake_quantize_partial_from_kwargs( cls, is_weight: bool, observer_cls: _aoquant.ObserverBase, dtype: _torch.dtype, qscheme: _torch.qscheme, weight_per_channel: bool = False, quant_min: _Optional[int] = None, quant_max: _Optional[int] = None, ch_axis: _Optional[int] = None, quantization_config: _Optional[_ModuleLinearQuantizerConfig] = None, ) -> _Callable: fq_kwargs = dict() fq_kwargs["observer"] = observer_cls fq_kwargs["dtype"] = dtype fq_kwargs["qscheme"] = qscheme if is_weight and weight_per_channel: fq_kwargs["ch_axis"] = ( ch_axis if ch_axis else _default_quantization_options["weight_ch_axis"] ) if quant_min is not None: fq_kwargs["quant_min"] = quant_min if quant_max is not None: fq_kwargs["quant_max"] = quant_max fq_class = cls._get_fake_quantize_class(quantization_config) return fq_class.with_args(**fq_kwargs) @classmethod def _get_default_qconfig_from_quantization_scheme( cls, quantization_scheme: _QuantizationScheme, quantization_config: _ModuleLinearQuantizerConfig, ) -> _aoquant.QConfig: """ Returns default QConfig for a given quantization scheme """ act_observer = cls._observer_cls.get_observer( _default_quantization_options["observer"], is_per_channel=False ) act_dtype = _default_quantization_options["activation_dtype"] act_qscheme = _QuantizationScheme.get_qscheme(quantization_scheme, is_per_channel=False) weight_observer = cls._observer_cls.get_observer( _default_quantization_options["observer"], is_per_channel=_default_quantization_options["weight_per_channel"], ) weight_dtype = _default_quantization_options["weight_dtype"] weight_qscheme = _QuantizationScheme.get_qscheme( quantization_scheme, 
is_per_channel=_default_quantization_options["weight_per_channel"], ) return _aoquant.QConfig( activation=cls._create_fake_quantize_partial_from_kwargs( False, act_observer, act_dtype, act_qscheme ), weight=cls._create_fake_quantize_partial_from_kwargs( True, weight_observer, weight_dtype, weight_qscheme, weight_per_channel=_default_quantization_options["weight_per_channel"], ), ) @classmethod def _adjust_qconfig_for_module_type( cls, mod_type: _Any, qconfig: _aoquant.QConfig, quantization_config: _ModuleLinearQuantizerConfig, ) -> _aoquant.QConfig: """ Enforces Embedding layers to use float qparams, because that's preferred by prepare_qat_fx. Overwrites ch_axis for ConvTranspose layers if qscheme is not per_tensor """ if mod_type in [ _torch.nn.Embedding, _torch.nn.ConvTranspose1d, _torch.nn.ConvTranspose2d, _torch.nn.ConvTranspose3d, ]: weight = qconfig.weight() weight_dtype = weight.dtype if weight_dtype == _torch.float: return qconfig weight_per_channel = _default_quantization_options["weight_per_channel"] weight_observer = type(weight.activation_post_process) if mod_type == _torch.nn.Embedding: ch_axis = None weight_qscheme = _torch.per_channel_affine_float_qparams # we do not want to quantize inputs to Embedding layer because they are integers activation_config = _NoopObserver.with_args(dtype=_torch.float) else: if hasattr(weight, "qscheme") and weight.qscheme not in [ _torch.per_tensor_affine, _torch.per_tensor_symmetric, ]: ch_axis = 1 weight_qscheme = weight.qscheme # preserve activation config for ConvTranspose ops activation_config = qconfig.activation weight_per_channel = ( weight_per_channel if quantization_config is None else quantization_config.weight_per_channel ) else: return qconfig return _aoquant.QConfig( activation=activation_config, weight=cls._create_fake_quantize_partial_from_kwargs( True, weight_observer, weight_dtype, weight_qscheme, weight_per_channel=weight_per_channel, quant_min=weight.quant_min, quant_max=weight.quant_max, ch_axis=ch_axis, ), ) return qconfig @staticmethod def _get_module_names_for_setting_qconfig(model: _nn.Module, mod_name: str) -> _Tuple[str, ...]: """ When layers are fused and we want to skip quantization for a convolution or linear layer, we need to set the qconfig for the layer being fused as None as well. 
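        For example (a sketch; "features.0" is a hypothetical submodule name for a
        Conv2d layer):

        .. code-block:: python

            mod_names = _QConfigMappingBuilder._get_module_names_for_setting_qconfig(
                model, "features.0"
            )
            # -> ("features.0", "features.0.conv"), so both names can be mapped
            #    to a qconfig of None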
""" try: submod = model.get_submodule(mod_name) except AttributeError: return (mod_name,) if isinstance(submod, _torch.nn.Conv2d) or isinstance(submod, _torch.nn.ConvTranspose2d): return mod_name, f"{mod_name}.conv" elif isinstance(submod, _torch.nn.Linear): return mod_name, f"{mod_name}.linear" return (mod_name,) @classmethod def _create_qconfig_from_quantization_config( cls, quantization_config: _ModuleLinearQuantizerConfig, ) -> _Optional[_aoquant.QConfig]: """ Creates a :py:class:`QConfig` from ``quantization_config`` """ if quantization_config.activation_dtype == _torch.float32: activation_qconfig = _NoopObserver.with_args( dtype=_torch.float, ) else: act_observer = cls._observer_cls.get_observer( quantization_config.activation_observer, is_per_channel=False ) act_dtype = quantization_config.activation_dtype act_qscheme = _QuantizationScheme.get_qscheme( quantization_config.quantization_scheme, is_per_channel=False ) activation_qconfig = cls._create_fake_quantize_partial_from_kwargs( False, act_observer, act_dtype, act_qscheme ) if quantization_config.weight_dtype == _torch.float32: weight_qconfig = _NoopObserver.with_args( dtype=_torch.float, ) else: quant_min, quant_max = ( _get_quant_range( n_bits=quantization_config.weight_n_bits, dtype=quantization_config.weight_dtype, ) if quantization_config.weight_n_bits < 8 else (None, None) ) weight_observer = cls._observer_cls.get_observer( quantization_config.weight_observer, is_per_channel=quantization_config.weight_per_channel, ) weight_dtype = quantization_config.weight_dtype weight_qscheme = _QuantizationScheme.get_qscheme( quantization_config.quantization_scheme, is_per_channel=quantization_config.weight_per_channel, ) weight_qconfig = cls._create_fake_quantize_partial_from_kwargs( True, weight_observer, weight_dtype, weight_qscheme, weight_per_channel=quantization_config.weight_per_channel, quant_min=quant_min, quant_max=quant_max, ) return _aoquant.QConfig(activation=activation_qconfig, weight=weight_qconfig) def _get_supported_modules(self): supported_modules = list(set(_get_supported_modules()) - set(_fixed_qparams_modules)) # Add _FakeQuantize, NoopObserver to ensure all fused ops have same qconfig supported_modules.append(_aoquant.FakeQuantize) supported_modules.append(_NoopObserver) return supported_modules def get_default_qconfig_mapping( self, quantization_scheme: _QuantizationScheme, quantization_config: _Optional[_ModuleLinearQuantizerConfig] = None, qconfig: _Optional[_aoquant.QConfig] = None, ) -> _aoquant.QConfigMapping: """ Returns default QConfigMapping for a given quantization scheme. If a qconfig is passed, it is used as the default qconfig instead. 
""" supported_modules = self._get_supported_modules() qconfig_mapping = _aoquant.QConfigMapping() default_qconfig_mapping = _aoquant.get_default_qat_qconfig_mapping() # copy qconfig mapping for fixed qparams for key in default_qconfig_mapping.object_type_qconfigs: if key in _fixed_qparams_modules: qconfig_mapping.set_object_type( key, default_qconfig_mapping.object_type_qconfigs[key] ) qconfig = ( self._get_default_qconfig_from_quantization_scheme( quantization_scheme, quantization_config ) if qconfig is None else qconfig ) qconfig_mapping.set_global(qconfig) for mod_type in supported_modules: qconfig_mapping.set_object_type( mod_type, self._adjust_qconfig_for_module_type(mod_type, qconfig, quantization_config), ) return qconfig_mapping def get_qconfig_mapping_from_quantization_config( self, model: _nn.Module, quantization_config: _LinearQuantizerConfig, quantization_scheme: _QuantizationScheme, ) -> _aoquant.QConfigMapping: """ Builds py:class:`QConfigMapping` from :py:class:`LinearQuantizerConfig`. """ qconfig = ( self._create_qconfig_from_quantization_config(quantization_config.global_config) if quantization_config.global_config is not None else None ) qconfig_mapping = self.get_default_qconfig_mapping( quantization_scheme=quantization_scheme, quantization_config=quantization_config.global_config, qconfig=qconfig, ) for mod_type, config in quantization_config.module_type_configs.items(): qconfig = ( self._create_qconfig_from_quantization_config(config) if config is not None else config ) qconfig = ( self._adjust_qconfig_for_module_type(mod_type, qconfig, config) if qconfig is not None else qconfig ) qconfig_mapping = qconfig_mapping.set_object_type(mod_type, qconfig) for mod_name, config in quantization_config.module_name_configs.items(): qconfig = ( self._create_qconfig_from_quantization_config(config) if config is not None else config ) try: submod = model.get_submodule(mod_name) qconfig = ( self._adjust_qconfig_for_module_type(type(submod), qconfig, config) if qconfig is not None else qconfig ) except AttributeError: _logger.warning( f"Could not find a submodule with name {mod_name}. " f"If the name corresponded to something other than a module, " f"this message can be ignored. Otherwise, it's possible " f"the module name was not correctly specified in the config." ) mod_names = self._get_module_names_for_setting_qconfig(model, mod_name) for mn in mod_names: qconfig_mapping = qconfig_mapping.set_module_name(mn, qconfig) return qconfig_mapping ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/quantization/_utils.py0000644000000000000000000001714514672066616024427 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import math import operator as _operator from collections import defaultdict from enum import Enum as _Enum from typing import Dict as _Dict from typing import List as _List from typing import Optional as _Optional from typing import Tuple as _Tuple import torch as _torch import torch.ao.quantization as _aoquant import torch.fx as _fx from torch.ao.nn.quantized.reference.modules.utils import _quantize_and_dequantize_weight_decomposed from torch.ao.quantization.backend_config import BackendConfig as _BackendConfig from torch.ao.quantization.backend_config import ObservationType as _ObservationType from coremltools.optimize.torch._utils.metadata_utils import ( CompressionMetadata as _CompressionMetadata, ) from coremltools.optimize.torch._utils.version_utils import is_torch_2 as _is_torch_2 def is_per_channel_quant(qscheme: _torch.qscheme) -> bool: """ Returns True if provided qscheme is for per-channel quantization. Otherwise returns False. """ return qscheme in [_torch.per_channel_symmetric, _torch.per_channel_affine] def is_symmetric_quant(qscheme: _torch.qscheme) -> bool: """ Returns True if provided qscheme is for symmetric quantization. Otherwise returns False. """ return qscheme in [_torch.per_tensor_symmetric, _torch.per_channel_symmetric] def is_pytorch_defined_observer(observer: _aoquant.ObserverBase): """ Returns True if provided observer instance is defined by PyTorch. Otherwise returns False. """ checklist = ( _aoquant.MinMaxObserver, _aoquant.PerChannelMinMaxObserver, _aoquant.MovingAverageMinMaxObserver, _aoquant.MovingAveragePerChannelMinMaxObserver, _aoquant.HistogramObserver, _aoquant.PlaceholderObserver, _aoquant.NoopObserver, _aoquant.FixedQParamsObserver, ) return isinstance(observer, checklist) class CombinationOpType(_Enum): Add = "add" Mul = "mul" Concat = "concat" AddReLU = "add_relu" def find_target(model, target_name): """ Finds the module in model which is referenced by the target_name. target_name is in the form of `mod_a.mod_b.mod_c` """ current_obj = model for attr in target_name.split("."): current_obj = getattr(current_obj, attr) return current_obj def find_module(model: _torch.nn.Module, node: _fx.Node): """ Finds module corresponding to the node. 
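For example, for a hypothetical FX node with ``node.op == "call_module"`` and ``node.target == "encoder.conv"``, this returns ``model.encoder.conv``; for any other kind of node it returns ``None``.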
""" if hasattr(node, "op") and node.op == "call_module": return find_target(model, node.target) return None def is_add(node: _fx.Node): """ Returns True if node is an add op """ if node.op == "call_function": return node.target == _operator.add or node.target == _torch.add return False def is_mul(node: _fx.Node): """ Returns True if node is a mul op """ if node.op == "call_function": return node.target == _operator.mul or node.target == _torch.mul return False def is_concat(node: _fx.Node): """ Returns True if node is a concat op """ if node.op == "call_function": return node.target == _torch.cat return False def is_relu(node: _fx.Node) -> bool: """ Returns True if node is a relu op """ if node.op == "call_function": return node.target == _torch.nn.functional.relu return False def is_add_relu(node: _fx.Node) -> bool: """ Returns True if node is a add-relu op """ return is_relu(node) and len(node.args) == 1 and is_add(node.args[0]) def combine_op_type(node: _fx.Node) -> _Optional[CombinationOpType]: """ Returns type of combination op at this node -> add, mul, add-relu or concat """ if is_add(node): return CombinationOpType.Add elif is_mul(node): return CombinationOpType.Mul elif is_add_relu(node): return CombinationOpType.AddReLU elif is_concat(node): return CombinationOpType.Concat return None def is_activation_post_process(module: _torch.nn.Module) -> bool: """ Returns true if a module is an activation post process module. """ return isinstance(module, _aoquant.FakeQuantizeBase) def is_quantized(module: _aoquant.FakeQuantizeBase): """ Returns true if activation post process module uses integer dtypes. """ if hasattr(module, "activation_post_process"): return module.activation_post_process.dtype in [_torch.qint8, _torch.quint8] return False def group_activation_quantization_modules_by_id( model: _fx.GraphModule, ) -> _Dict[int, _List[_fx.Node]]: """ Groups activation post process layers by their ids. This is useful because multiple activation post process modules in a traced graph may point to the same module. """ groups = defaultdict(list) for node in model.graph.nodes: if node.op == "call_module": module = find_target(model, node.target) if is_activation_post_process(module) and is_quantized(module): groups[id(module)].append(node) return groups def get_share_qparams_ops(backend_config: _BackendConfig): """ Returns list of ops which share qparams with input. """ configs = ( backend_config._pattern_complex_format_to_config if _is_torch_2() else backend_config.configs ) return [ op for op in configs if configs[op].observation_type == _ObservationType.OUTPUT_SHARE_OBSERVER_WITH_INPUT ] def get_quant_range(n_bits: int, dtype: _torch.dtype) -> _Tuple[int, int]: """ Returns quant_max and quant_min values for a given quantization n_bits. 
""" max_q = 2**n_bits if dtype in [_torch.quint8, _torch.uint8]: quant_min = 0 quant_max = max_q - 1 else: quant_min = -max_q / 2 quant_max = max_q / 2 - 1 return int(quant_min), int(quant_max) def get_n_bits_from_range(quant_min: int, quant_max: int) -> int: """ Returns quantization n_bits for given quantization range """ n_bits = int(math.log2(quant_max + 1)) if quant_min < 0: n_bits += 1 return n_bits def register_compression_metadata(submodule): metadata = _CompressionMetadata("weight") metadata.compression_type = ["quantization"] metadata.quantization_n_bits = get_n_bits_from_range( submodule.weight_quant_min, submodule.weight_quant_max ) metadata.quantization_scale = ( submodule.weight_scale.detach().clone().unsqueeze(-1) if submodule.weight_axis == 0 else submodule.weight_scale.detach().clone().unsqueeze(-1).transpose(1, 0) ) metadata.zero_point = ( submodule.weight_zero_point.detach().clone().unsqueeze(-1) if submodule.weight_axis == 0 else submodule.weight_zero_point.detach().clone().unsqueeze(-1).transpose(1, 0) ) metadata.register(submodule) def pre_apply_weight_quant(model: _torch.nn.Module): for module in model.modules(): if isinstance( module, _torch.ao.nn.quantized.reference.modules.utils.ReferenceQuantizedModule, ): weight = _quantize_and_dequantize_weight_decomposed( module.weight, module.weight_qscheme, module.weight_dtype, module.weight_scale, module.weight_zero_point, module.weight_axis_int, module.weight_quant_min, module.weight_quant_max, ) module.weight.detach().copy_(weight) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2735472 coremltools-8.0/coremltools/optimize/torch/quantization/modules/0000755000000000000000000000000014672075535024216 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/quantization/modules/__init__.py0000644000000000000000000000033314672066616026326 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/quantization/modules/conv_transpose.py0000644000000000000000000003153514672066616027642 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause # Original implementation from https://github.com/pytorch/pytorch/blob/main/torch/ao/nn/qat/modules/conv.py # Copyright (c) 2016 Facebook, Inc (Adam Paszke) from typing import List as _List from typing import Optional as _Optional from typing import Tuple as _Tuple from typing import TypeVar as _TypeVar import torch import torch.nn as nn import torch.nn.functional as F from torch import Tensor as _Tensor from torch.ao.nn.intrinsic import _FusedModule from torch.nn.common_types import _size_1_t, _size_2_t, _size_3_t from torch.nn.modules.utils import _pair, _single, _triple torch.manual_seed(0) __all__ = ["ConvTranspose1d", "ConvTranspose2d", "ConvTranspose3d"] MOD = _TypeVar("MOD", bound=nn.modules.conv._ConvTransposeNd) class _ConvTransposeNd(torch.ao.nn.qat.modules.conv._ConvNd): _FLOAT_MODULE = MOD def __init__( self, in_channels: int, out_channels: int, kernel_size: _Tuple[int, ...], stride: _Tuple[int, ...], padding: _Tuple[int, ...], dilation: _Tuple[int, ...], transposed: bool, output_padding: _Tuple[int, ...], groups: int, bias: bool, padding_mode: str, qconfig=None, device=None, dtype=None, ) -> None: factory_kwargs = {"device": device, "dtype": dtype} super().__init__( in_channels, out_channels, kernel_size, stride, padding, dilation, transposed, output_padding, groups, bias, padding_mode, qconfig, **factory_kwargs ) assert qconfig, "qconfig must be provided for QAT module" self.qconfig = qconfig self.weight_fake_quant = qconfig.weight(factory_kwargs=factory_kwargs) @staticmethod def from_float(cls, mod): r"""Create a qat module from a float module Args: `mod`: a float module, either produced by torch.ao.quantization utilities or directly from user """ assert type(mod) == cls._FLOAT_MODULE, ( "qat." 
+ cls.__name__ + ".from_float only works for " + cls._FLOAT_MODULE.__name__ # type: ignore[attr-defined] ) assert hasattr(mod, "qconfig"), "Input float module must have qconfig defined" assert mod.qconfig, "Input float module must have a valid qconfig" if issubclass(type(mod), _FusedModule): mod = mod[0] # type: ignore[index] qconfig = mod.qconfig qat_conv = cls( mod.in_channels, mod.out_channels, mod.kernel_size, stride=mod.stride, padding=mod.padding, dilation=mod.dilation, output_padding=mod.output_padding, groups=mod.groups, bias=mod.bias is not None, padding_mode=mod.padding_mode, qconfig=qconfig, ) qat_conv.weight = mod.weight qat_conv.bias = mod.bias return qat_conv def to_float(self): """This works for both single qat conv, and the qat conv - relu modules to convert the qat module to a floating point module """ cls = type(self) conv = cls._FLOAT_CONV_MODULE( # type: ignore[attr-defined, operator] in_channels=self.in_channels, out_channels=self.out_channels, kernel_size=self.kernel_size, # type: ignore[arg-type] stride=self.stride, # type: ignore[arg-type] padding=self.padding, # type: ignore[arg-type] dilation=self.dilation, # type: ignore[arg-type] output_padding=self.output_padding, groups=self.groups, bias=self.bias is not None, padding_mode=self.padding_mode, ) conv.weight = torch.nn.Parameter(self.weight.detach()) if self.bias is not None: conv.bias = torch.nn.Parameter(self.bias.detach()) # conv relu if issubclass(cls, _FusedModule): modules = [conv] assert hasattr(cls, "_FLOAT_RELU_MODULE") relu = cls._FLOAT_RELU_MODULE() # type: ignore[attr-defined] modules.append(relu) fused = cls._FLOAT_MODULE(*modules) # type: ignore[arg-type, attr-defined, operator] fused.train(self.training) return fused else: return conv class ConvTranspose1d(_ConvTransposeNd, nn.ConvTranspose1d): r""" A ConvTranspose1d module attached with FakeQuantize modules for weight, used for quantization aware training. We adopt the same interface as`torch.nn.ConvTranspose1d`, please see https://pytorch.org/docs/stable/generated/torch.nn.ConvTranspose1d.html for documentation. Similar to :class:`~torch.nn.ConvTranspose1d`, with FakeQuantize modules initialized to default. Attributes: weight_fake_quant: fake quant module for weight """ _FLOAT_MODULE = nn.ConvTranspose1d _FLOAT_CONV_MODULE = nn.ConvTranspose1d def __init__( self, in_channels: int, out_channels: int, kernel_size: _size_1_t, stride: _size_1_t = 1, padding: _size_1_t = 0, output_padding: _size_1_t = 0, groups: int = 1, bias: bool = True, dilation: _size_1_t = 1, padding_mode: str = "zeros", qconfig=None, device=None, dtype=None, ) -> None: kernel_size_ = _single(kernel_size) stride_ = _single(stride) padding_ = _single(padding) dilation_ = _single(dilation) output_padding_ = _single(output_padding) super().__init__( in_channels, out_channels, kernel_size_, stride=stride_, padding=padding_, dilation=dilation_, transposed=True, output_padding=output_padding_, groups=groups, bias=bias, padding_mode=padding_mode, qconfig=qconfig, device=device, dtype=dtype, ) def forward(self, input: _Tensor, output_size: _Optional[_List[int]] = None) -> _Tensor: if self.padding_mode != "zeros": raise ValueError("Only `zeros` padding mode is supported for ConvTranspose1d") assert isinstance(self.padding, _Tuple) # One cannot replace _List by _Tuple or Sequence in "_output_padding" because # TorchScript does not support `Sequence[T]` or `_Tuple[T, ...]`. 
num_spatial_dims = 1 output_padding = self._output_padding( input, output_size, self.stride, self.padding, self.kernel_size, # type: ignore[arg-type] num_spatial_dims, self.dilation, ) # type: ignore[arg-type] return F.conv_transpose1d( input, self.weight_fake_quant(self.weight), self.bias, self.stride, self.padding, output_padding, self.groups, self.dilation, ) @classmethod def from_float(cls, mod): return super().from_float(cls, mod) class ConvTranspose2d(_ConvTransposeNd, nn.ConvTranspose2d): r""" A ConvTranspose2d module attached with FakeQuantize modules for weight, used for quantization aware training. We adopt the same interface as `torch.nn.ConvTranspose2d`, please see https://pytorch.org/docs/stable/generated/torch.nn.ConvTranspose2d.html for documentation. Similar to `torch.nn.ConvTranspose2d`, with FakeQuantize modules initialized to default. Attributes: weight_fake_quant: fake quant module for weight """ _FLOAT_MODULE = nn.ConvTranspose2d _FLOAT_CONV_MODULE = nn.ConvTranspose2d def __init__( self, in_channels: int, out_channels: int, kernel_size: _size_2_t, stride: _size_2_t = 1, padding: _size_2_t = 0, output_padding: _size_2_t = 0, groups: int = 1, bias: bool = True, dilation: _size_2_t = 1, padding_mode: str = "zeros", qconfig=None, device=None, dtype=None, ) -> None: kernel_size_ = _pair(kernel_size) stride_ = _pair(stride) padding_ = _pair(padding) dilation_ = _pair(dilation) output_padding_ = _pair(output_padding) super().__init__( in_channels, out_channels, kernel_size_, stride=stride_, padding=padding_, dilation=dilation_, transposed=True, output_padding=output_padding_, groups=groups, bias=bias, padding_mode=padding_mode, qconfig=qconfig, device=device, dtype=dtype, ) def forward(self, input: _Tensor, output_size: _Optional[_List[int]] = None) -> _Tensor: if self.padding_mode != "zeros": raise ValueError("Only `zeros` padding mode is supported for ConvTranspose1d") assert isinstance(self.padding, _Tuple) # One cannot replace _List by _Tuple or Sequence in "_output_padding" because # TorchScript does not support `Sequence[T]` or `_Tuple[T, ...]`. num_spatial_dims = 2 output_padding = self._output_padding( input, output_size, self.stride, self.padding, self.kernel_size, # type: ignore[arg-type] num_spatial_dims, self.dilation, ) # type: ignore[arg-type] return F.conv_transpose2d( input, self.weight_fake_quant(self.weight), self.bias, self.stride, self.padding, output_padding, self.groups, self.dilation, ) @classmethod def from_float(cls, mod): return super().from_float(cls, mod) class ConvTranspose3d(_ConvTransposeNd, nn.ConvTranspose3d): r""" A ConvTranspose3d module attached with FakeQuantize modules for weight, used for quantization aware training. We adopt the same interface as `torch.nn.ConvTranspose3d`, please see https://pytorch.org/docs/stable/generated/torch.nn.ConvTranspose3d.html for documentation. Similar to `torch.nn.ConvTranspose3d`, with FakeQuantize modules initialized to default. 
Attributes: weight_fake_quant: fake quant module for weight """ _FLOAT_MODULE = nn.ConvTranspose3d _FLOAT_CONV_MODULE = nn.ConvTranspose3d def __init__( self, in_channels: int, out_channels: int, kernel_size: _size_3_t, stride: _size_3_t = 1, padding: _size_3_t = 0, output_padding: _size_3_t = 0, groups: int = 1, bias: bool = True, dilation: _size_3_t = 1, padding_mode: str = "zeros", qconfig=None, device=None, dtype=None, ) -> None: kernel_size_ = _triple(kernel_size) stride_ = _triple(stride) padding_ = _triple(padding) dilation_ = _triple(dilation) output_padding_ = _triple(output_padding) super().__init__( in_channels, out_channels, kernel_size_, stride=stride_, padding=padding_, dilation=dilation_, transposed=True, output_padding=output_padding_, groups=groups, bias=bias, padding_mode=padding_mode, qconfig=qconfig, device=device, dtype=dtype, ) def forward(self, input: _Tensor, output_size: _Optional[_List[int]] = None) -> _Tensor: if self.padding_mode != "zeros": raise ValueError("Only `zeros` padding mode is supported for ConvTranspose1d") assert isinstance(self.padding, _Tuple) # One cannot replace _List by _Tuple or Sequence in "_output_padding" because # TorchScript does not support `Sequence[T]` or `_Tuple[T, ...]`. num_spatial_dims = 3 output_padding = self._output_padding( input, output_size, self.stride, self.padding, self.kernel_size, # type: ignore[arg-type] num_spatial_dims, self.dilation, ) # type: ignore[arg-type] return F.conv_transpose3d( input, self.weight_fake_quant(self.weight), self.bias, self.stride, self.padding, output_padding, self.groups, self.dilation, ) @classmethod def from_float(cls, mod): return super().from_float(cls, mod) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/quantization/modules/conv_transpose_fused.py0000644000000000000000000003737614672066616031041 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause # Original implementation from https://github.com/pytorch/pytorch/blob/main/torch/ao/nn/intrinsic/qat/modules/conv_fused.py # Copyright (c) 2016 Facebook, Inc (Adam Paszke) from typing import List as _List from typing import Optional as _Optional from typing import Tuple as _Tuple from typing import TypeVar as _TypeVar import torch import torch.ao.nn.intrinsic as nni import torch.nn as nn import torch.nn.functional as F from torch import Tensor from torch.nn.common_types import _size_1_t, _size_2_t, _size_3_t from torch.nn.modules.utils import _pair, _single, _triple from torch.nn.utils import fuse_conv_bn_weights from coremltools.optimize.torch.quantization.modules.fused_modules import ( ConvTransposeBn1d as iConvTransposeBn1d, ) from coremltools.optimize.torch.quantization.modules.fused_modules import ( ConvTransposeBn2d as iConvTransposeBn2d, ) from coremltools.optimize.torch.quantization.modules.fused_modules import ( ConvTransposeBn3d as iConvTransposeBn3d, ) _BN_CLASS_MAP = { 1: nn.BatchNorm1d, 2: nn.BatchNorm2d, 3: nn.BatchNorm3d, } __all__ = ["ConvTransposeBn1d", "ConvTransposeBn2d", "ConvTransposeBn3d"] MOD = _TypeVar("MOD", bound=nn.modules.conv._ConvTransposeNd) class _ConvTransposeBnNd(nni.qat.modules.conv_fused._ConvBnNd): _FLOAT_MODULE = MOD def __init__( self, # ConvNd args in_channels: int, out_channels: int, kernel_size: _Tuple[int, ...], stride: _Tuple[int, ...], padding: _Tuple[int, ...], dilation: _Tuple[int, ...], transposed: bool, output_padding: _Tuple[int, ...], groups: int, bias: bool, padding_mode: str, # BatchNormNd args # num_features: out_channels eps=1e-05, momentum=0.1, # affine: True # track_running_stats: True # Args for this module freeze_bn: bool = False, qconfig=None, dim: int = 2, ): nni.qat.modules.conv_fused._ConvBnNd.__init__( self, in_channels, out_channels, kernel_size, stride, padding, dilation, transposed, output_padding, groups, bias, padding_mode, eps, momentum, freeze_bn, qconfig, dim, ) def forward(self, input, output_size: _Optional[_List[int]] = None): assert isinstance(self.padding, _Tuple) return self._forward(input, output_size) def _forward(self, input, output_size): # if self._enable_slow_path_for_better_numerical_stability: # return self._forward_slow(input) return self._forward_approximate(input, output_size) def _forward_approximate(self, input, output_size): """ Taken from nni.qat.modules.conv_fused._ConvBnNd Changes made for weight_shape and bias_shape Approximated method to fuse conv and bn. It requires only one forward pass. 
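The convolution is run on the scale-factor-adjusted, fake-quantized weight with a zero bias, and the original convolution output is then recovered (before applying the BatchNorm) as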
conv_orig = conv / scale_factor where scale_factor = bn.weight / running_std """ assert self.bn.running_var is not None running_std = torch.sqrt(self.bn.running_var + self.bn.eps) scale_factor = self.bn.weight / running_std weight_shape = [1] * len(self.weight.shape) weight_shape[1] = -1 bias_shape = [1] * len(self.weight.shape) bias_shape[1] = -1 scaled_weight = self.weight_fake_quant(self.weight * scale_factor.reshape(weight_shape)) # using zero bias here since the bias for original conv # will be added later if self.bias is not None: zero_bias = torch.zeros_like(self.bias, dtype=input.dtype) else: zero_bias = torch.zeros( self.out_channels, device=scaled_weight.device, dtype=input.dtype ) conv = self._conv_forward(input, scaled_weight, zero_bias, output_size) conv_orig = conv / scale_factor.reshape(bias_shape) if self.bias is not None: conv_orig = conv_orig + self.bias.reshape(bias_shape) conv = self.bn(conv_orig) return conv @classmethod def from_float(cls, mod): r"""Create a qat module from a float module or qparams_dict Args: `mod` a float module, either produced by torch.ao.quantization utilities or directly from user """ # The ignore is because _FLOAT_MODULE is a TypeVar here where the bound # has no __name__ (code is fine though) assert type(mod) == cls._FLOAT_MODULE, ( "qat." + cls.__name__ + ".from_float only works for " + cls._FLOAT_MODULE.__name__ ) # type: ignore[attr-defined] assert hasattr(mod, "qconfig"), "Input float module must have qconfig defined" assert mod.qconfig, "Input float module must have a valid qconfig" qconfig = mod.qconfig conv, bn = mod[0], mod[1] qat_convbn = cls( in_channels=conv.in_channels, out_channels=conv.out_channels, kernel_size=conv.kernel_size, stride=conv.stride, padding=conv.padding, dilation=conv.dilation, output_padding=conv.output_padding, groups=conv.groups, bias=conv.bias is not None, padding_mode=conv.padding_mode, eps=bn.eps, momentum=bn.momentum, freeze_bn=False, qconfig=qconfig, ) qat_convbn.weight = conv.weight qat_convbn.bias = conv.bias qat_convbn.bn.weight = bn.weight qat_convbn.bn.bias = bn.bias qat_convbn.bn.running_mean = bn.running_mean qat_convbn.bn.running_var = bn.running_var # mypy error: Cannot determine type of 'num_batches_tracked' qat_convbn.bn.num_batches_tracked = bn.num_batches_tracked # type: ignore[has-type] return qat_convbn def to_float(self): """ transpose applied to weights taken from torch.ao.nn.intrinsic.qat.modules.conv_fused._ConvBnNd """ cls = type(self) conv = cls._FLOAT_CONV_MODULE( # type: ignore[attr-defined] self.in_channels, self.out_channels, self.kernel_size, self.stride, self.padding, self.output_padding, self.groups, self.bias is not None, self.dilation, self.padding_mode, ) conv.weight = torch.nn.Parameter(self.weight.detach()) if self.bias is not None: conv.bias = torch.nn.Parameter(self.bias.detach()) if cls._FLOAT_BN_MODULE: # type: ignore[attr-defined] # fuse bn into conv assert self.bn.running_var is not None and self.bn.running_mean is not None conv.weight.data = conv.weight.data.transpose(1, 0) conv.weight, conv.bias = fuse_conv_bn_weights( conv.weight, conv.bias, self.bn.running_mean, self.bn.running_var, self.bn.eps, self.bn.weight, self.bn.bias, ) conv.weight.data = conv.weight.data.transpose(1, 0) if cls._FLOAT_RELU_MODULE: # type: ignore[attr-defined] modules = [] modules.append(conv) relu = cls._FLOAT_RELU_MODULE() # type: ignore[attr-defined] modules.append(relu) conv_relu = cls._FUSED_FLOAT_MODULE(*modules) # type: ignore[attr-defined] conv_relu.train(self.training) return 
conv_relu else: conv.train(self.training) return conv class ConvTransposeBn1d(_ConvTransposeBnNd, nn.ConvTranspose1d): r""" A ConvTransposeBn1d module is a module fused from ConvTranspose1d and BatchNorm1d, attached with FakeQuantize modules for weight, used in quantization aware training. We combined the interface of :class:`torch.nn.ConvTranspose1d` and :class:`torch.nn.BatchNorm1d`. Similar to :class:`torch.nn.ConvTranspose1d`, with FakeQuantize modules initialized to default. Attributes: freeze_bn: weight_fake_quant: fake quant module for weight """ _FLOAT_BN_MODULE = nn.BatchNorm1d _FLOAT_RELU_MODULE: None = None _FLOAT_MODULE = iConvTransposeBn1d _FLOAT_CONV_MODULE = nn.ConvTranspose1d def __init__( self, # ConvTranspose1d args in_channels: int, out_channels: int, kernel_size: _size_1_t, stride: _size_1_t = 1, padding: _size_1_t = 0, output_padding: _size_1_t = 0, dilation: _size_1_t = 1, groups: int = 1, bias: bool = False, padding_mode: str = "zeros", # BatchNorm1d args # num_features: out_channels eps=1e-05, momentum=0.1, # affine: True # track_running_stats: True # Args for this module freeze_bn=False, qconfig=None, ): kernel_size = _single(kernel_size) stride = _single(stride) padding = _single(padding) dilation = _single(dilation) output_padding_ = _single(output_padding) _ConvTransposeBnNd.__init__( self, in_channels, out_channels, kernel_size, stride, padding, dilation, True, output_padding_, groups, bias, padding_mode, eps, momentum, freeze_bn, qconfig, dim=1, ) def _conv_forward( self, input: Tensor, weight: Tensor, bias: _Optional[Tensor], output_size: _Optional[_List[int]] = None, ) -> Tensor: num_spatial_dims = 1 output_padding = self._output_padding( input, output_size, self.stride, self.padding, self.kernel_size, # type: ignore[arg-type] num_spatial_dims, self.dilation, ) # type: ignore[arg-type] return F.conv_transpose1d( input, weight, bias, self.stride, self.padding, output_padding, self.groups, self.dilation, ) class ConvTransposeBn2d(_ConvTransposeBnNd, nn.ConvTranspose2d): r""" A ConvTransposeBn2d module is a module fused from ConvTranspose2d and BatchNorm2d, attached with FakeQuantize modules for weight, used in quantization aware training. We combined the interface of :class:`torch.nn.ConvTranspose2d` and :class:`torch.nn.BatchNorm2d`. Similar to :class:`torch.nn.ConvTranspose2d`, with FakeQuantize modules initialized to default. 
Attributes: freeze_bn: weight_fake_quant: fake quant module for weight """ _FLOAT_BN_MODULE = nn.BatchNorm2d _FLOAT_RELU_MODULE: None = None _FLOAT_MODULE = iConvTransposeBn2d _FLOAT_CONV_MODULE = nn.ConvTranspose2d def __init__( self, # ConvTranspose2d args in_channels: int, out_channels: int, kernel_size: _size_2_t, stride: _size_2_t = 1, padding: _size_2_t = 0, output_padding: _size_2_t = 0, dilation: _size_2_t = 1, groups: int = 1, bias: bool = False, padding_mode: str = "zeros", # BatchNorm2d args # num_features: out_channels eps=1e-05, momentum=0.1, # affine: True # track_running_stats: True # Args for this module freeze_bn=False, qconfig=None, ): kernel_size = _pair(kernel_size) stride = _pair(stride) padding = _pair(padding) dilation = _pair(dilation) output_padding_ = _pair(output_padding) _ConvTransposeBnNd.__init__( self, in_channels, out_channels, kernel_size, stride, padding, dilation, True, output_padding_, groups, bias, padding_mode, eps, momentum, freeze_bn, qconfig, dim=2, ) def _conv_forward( self, input: Tensor, weight: Tensor, bias: _Optional[Tensor], output_size: _Optional[_List[int]] = None, ) -> Tensor: num_spatial_dims = 2 output_padding = self._output_padding( input, output_size, self.stride, self.padding, self.kernel_size, # type: ignore[arg-type] num_spatial_dims, self.dilation, ) # type: ignore[arg-type] return F.conv_transpose2d( input, weight, bias, self.stride, self.padding, output_padding, self.groups, self.dilation, ) class ConvTransposeBn3d(_ConvTransposeBnNd, nn.ConvTranspose3d): r""" A ConvTransposeBn3d module is a module fused from ConvTranspose2d and BatchNorm3d, attached with FakeQuantize modules for weight, used in quantization aware training. We combined the interface of :class:`torch.nn.ConvTranspose3d` and :class:`torch.nn.BatchNorm3d`. Similar to :class:`torch.nn.ConvTranspose3d`, with FakeQuantize modules initialized to default. 
Attributes: freeze_bn: weight_fake_quant: fake quant module for weight """ _FLOAT_BN_MODULE = nn.BatchNorm3d _FLOAT_RELU_MODULE: None = None _FLOAT_MODULE = iConvTransposeBn3d _FLOAT_CONV_MODULE = nn.ConvTranspose3d def __init__( self, # ConvTranspose3d args in_channels: int, out_channels: int, kernel_size: _size_3_t, stride: _size_3_t = 1, padding: _size_3_t = 0, output_padding: _size_3_t = 0, dilation: _size_3_t = 1, groups: int = 1, bias: bool = False, padding_mode: str = "zeros", # BatchNorm3d args # num_features: out_channels eps=1e-05, momentum=0.1, # affine: True # track_running_stats: True # Args for this module freeze_bn=False, qconfig=None, ): kernel_size = _triple(kernel_size) stride = _triple(stride) padding = _triple(padding) dilation = _triple(dilation) output_padding_ = _triple(output_padding) _ConvTransposeBnNd.__init__( self, in_channels, out_channels, kernel_size, stride, padding, dilation, True, output_padding_, groups, bias, padding_mode, eps, momentum, freeze_bn, qconfig, dim=3, ) def _conv_forward( self, input: Tensor, weight: Tensor, bias: _Optional[Tensor], output_size: _Optional[_List[int]] = None, ) -> Tensor: num_spatial_dims = 3 output_padding = self._output_padding( input, output_size, self.stride, self.padding, self.kernel_size, # type: ignore[arg-type] num_spatial_dims, self.dilation, ) # type: ignore[arg-type] return F.conv_transpose3d( input, weight, bias, self.stride, self.padding, output_padding, self.groups, self.dilation, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/quantization/modules/fused_modules.py0000644000000000000000000000456214672066616027435 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from collections import OrderedDict as _OrderedDict from typing import Union as _Union import torch as _torch import torch.nn as _nn import torch.nn.intrinsic as _nni class _ConvBn(_torch.nn.Sequential): def __init__(self, conv: _nn.Module, bn: _nn.Module): super().__init__(_OrderedDict([("conv", conv), ("bn", bn)])) @property def weight(self): return self.conv.weight class _ConvAct(_torch.nn.Sequential): def __init__(self, conv: _nn.Module, act: _nn.Module): super().__init__(_OrderedDict([("conv", conv), ("act", act)])) @property def weight(self): return self.conv.weight class ConvTransposeBn1d(_ConvBn): pass class ConvTransposeBn2d(_ConvBn): pass class ConvTransposeBn3d(_ConvBn): pass class _ConvBnAct(_torch.nn.Sequential): intr_mod: _Union[ _nni.ConvBn1d, _nni.ConvBn2d, _nni.ConvBn3d, ConvTransposeBn1d, ConvTransposeBn2d, ConvTransposeBn3d, ] def __init__(self, conv: _nn.Module, bn: _nn.Module, act: _nn.Module): super().__init__(_OrderedDict([("conv", self.intr_mod(conv, bn)), ("act", act)])) @property def weight(self): return self.conv.weight class ConvAct1d(_ConvAct): pass class ConvAct2d(_ConvAct): pass class ConvAct3d(_ConvAct): pass class ConvTransposeAct1d(_ConvAct): pass class ConvTransposeAct2d(_ConvAct): pass class ConvTransposeAct3d(_ConvAct): pass class ConvBnAct1d(_ConvBnAct): intr_mod = _nni.ConvBn1d pass class ConvBnAct2d(_ConvBnAct): intr_mod = _nni.ConvBn2d pass class ConvBnAct3d(_ConvBnAct): intr_mod = _nni.ConvBn3d pass class ConvTransposeBnAct1d(_ConvBnAct): intr_mod = ConvTransposeBn1d pass class ConvTransposeBnAct2d(_ConvBnAct): intr_mod = ConvTransposeBn2d pass class ConvTransposeBnAct3d(_ConvBnAct): intr_mod = ConvTransposeBn3d pass class LinearAct(_torch.nn.Sequential): def __init__(self, linear: _nn.Linear, act: _nn.Module): super().__init__(_OrderedDict([("linear", linear), ("act", act)])) @property def weight(self): return self.linear.weight ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/quantization/modules/observers.py0000644000000000000000000000142714672066616026606 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import Any as _Any from typing import Dict as _Dict import torch as _torch import torch.ao.quantization as _aoquant class NoopObserver(_aoquant.NoopObserver): """ Extends aoquant.NoopObserver to add support for accepting factory_kwargs which are passed to it during qconfig.weight() creation in QAT Conv/Linear modules. """ def __init__( self, dtype: _torch.dtype = _torch.float16, custom_op_name: str = "", factory_kwargs: _Dict[str, _Any] = None, ): super().__init__(dtype, custom_op_name) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/quantization/modules/qat_modules.py0000644000000000000000000001610314672066616027106 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from collections import OrderedDict as _OrderedDict from typing import Type as _Type from typing import Union as _Union import torch as _torch import torch.ao.nn.intrinsic as _nni import torch.ao.nn.qat as _nnqat import torch.ao.quantization as _aoquant import torch.nn as _nn import torch.nn.intrinsic.qat as _nniqat import coremltools.optimize.torch.quantization.modules.conv_transpose as _qconv_transpose import coremltools.optimize.torch.quantization.modules.conv_transpose_fused as _qconv_transpose_fused import coremltools.optimize.torch.quantization.modules.fused_modules as _fuse class _ConvAct(_torch.nn.Sequential): root_mod: _Type[_nn.Module] qat_mod: _Union[ _nnqat.Conv1d, _nnqat.Conv2d, _nnqat.Conv3d, _qconv_transpose.ConvTranspose1d, _qconv_transpose.ConvTranspose2d, _qconv_transpose.ConvTranspose3d, ] fused_mod: _Union[ _fuse.ConvAct1d, _fuse.ConvAct2d, _fuse.ConvAct3d, _fuse.ConvTransposeAct1d, _fuse.ConvTransposeAct2d, _fuse.ConvTransposeAct3d, ] def __init__(self, conv: _nn.Module, act: _nn.Module, qconfig: _aoquant.QConfig): super().__init__(_OrderedDict([("conv", conv), ("act", act)])) self.qconfig = qconfig def forward(self, x: _torch.Tensor) -> _torch.Tensor: return self.act(self.conv(x)) @property def weight(self): return self.conv.weight @property def weight_fake_quant(self): return self.conv.weight_fake_quant @classmethod def from_float(cls, mod: _nn.Module): if isinstance(mod.conv, cls.qat_mod): conv = mod.conv else: assert isinstance(mod.conv, cls.root_mod), ( f"Failed to convert module for QAT. " f"Expected module type {cls.root_mod}, " f"received type {type(mod.conv)}." ) conv = cls.qat_mod.from_float(mod.conv) conv.activation_post_process = None return cls(conv, mod.act, mod.qconfig) def to_float(self) -> _nn.Module: return self.fused_mod( conv=self.conv.to_float(), act=self.act, ) class _ConvBnAct(_ConvAct): intr_mod: _Type[_nn.Module] qat_mod: _Union[ _nniqat.ConvBn1d, _nniqat.ConvBn2d, _nniqat.ConvBn3d, _qconv_transpose_fused.ConvTransposeBn1d, _qconv_transpose_fused.ConvTransposeBn2d, _qconv_transpose_fused.ConvTransposeBn3d, ] fused_mod: _Union[ _fuse.ConvAct1d, _fuse.ConvAct2d, _fuse.ConvAct3d, _fuse.ConvTransposeAct1d, _fuse.ConvTransposeAct2d, _fuse.ConvTransposeAct3d, ] @classmethod def from_float(cls, mod: _nn.Module): if isinstance(mod.conv, cls.intr_mod): conv = cls.qat_mod.from_float(mod.conv) else: conv = mod.conv assert isinstance(conv, cls.qat_mod), ( f"Failed to convert module for QAT. " f"Expected module type {cls.qat_mod}, " f"received type {type(conv)}." 
) conv.activation_post_process = None return cls(conv, mod.act, mod.qconfig) class ConvAct1d(_ConvAct): root_mod = _nn.Conv1d qat_mod = _nnqat.Conv1d fused_mod = _fuse.ConvAct1d pass class ConvAct2d(_ConvAct): root_mod = _nn.Conv2d qat_mod = _nnqat.Conv2d fused_mod = _fuse.ConvAct2d pass class ConvAct3d(_ConvAct): root_mod = _nn.Conv3d qat_mod = _nnqat.Conv3d fused_mod = _fuse.ConvAct3d pass class ConvTransposeAct1d(_ConvAct): root_mod = _nn.ConvTranspose1d qat_mod = _qconv_transpose.ConvTranspose1d fused_mod = _fuse.ConvTransposeAct1d pass class ConvTransposeAct2d(_ConvAct): root_mod = _nn.ConvTranspose2d qat_mod = _qconv_transpose.ConvTranspose2d fused_mod = _fuse.ConvTransposeAct2d pass class ConvTransposeAct3d(_ConvAct): root_mod = _nn.ConvTranspose3d qat_mod = _qconv_transpose.ConvTranspose3d fused_mod = _fuse.ConvTransposeAct3d pass class ConvBnAct1d(_ConvBnAct): intr_mod = _nni.ConvBn1d qat_mod = _nniqat.ConvBn1d fused_mod = _fuse.ConvAct1d pass class ConvBnAct2d(_ConvBnAct): intr_mod = _nni.ConvBn2d qat_mod = _nniqat.ConvBn2d fused_mod = _fuse.ConvAct2d pass class ConvBnAct3d(_ConvBnAct): intr_mod = _nni.ConvBn3d qat_mod = _nniqat.ConvBn3d fused_mod = _fuse.ConvAct3d pass class ConvTransposeBnAct1d(_ConvBnAct): intr_mod = _fuse.ConvTransposeBn1d qat_mod = _qconv_transpose_fused.ConvTransposeBn1d fused_mod = _fuse.ConvTransposeAct1d pass class ConvTransposeBnAct2d(_ConvBnAct): intr_mod = _fuse.ConvTransposeBn2d qat_mod = _qconv_transpose_fused.ConvTransposeBn2d fused_mod = _fuse.ConvTransposeAct2d pass class ConvTransposeBnAct3d(_ConvBnAct): intr_mod = _fuse.ConvTransposeBn3d qat_mod = _qconv_transpose_fused.ConvTransposeBn3d fused_mod = _fuse.ConvTransposeAct3d pass class LinearAct(_torch.nn.Sequential): def __init__(self, linear: _nnqat.Linear, act: _nn.Module, qconfig: _aoquant.QConfig): super().__init__(_OrderedDict([("linear", linear), ("act", act)])) self.qconfig = qconfig def forward(self, x: _torch.Tensor) -> _torch.Tensor: return self.act(self.linear(x)) @property def weight(self): return self.linear.weight @property def weight_fake_quant(self): return self.linear.weight_fake_quant @classmethod def from_float(cls, mod: _fuse.LinearAct): if isinstance(mod.linear, _nnqat.Linear): linear = mod.linear else: assert isinstance(mod.linear, _nn.Linear), ( f"Failed to convert module for QAT. " f"Expected module type {_nn.Linear}, " f"received type {type(mod.linear)}." ) linear = _nnqat.Linear.from_float(mod.linear) linear.activation_post_process = None return cls(linear, mod.act, mod.qconfig) def to_float(self) -> _fuse.LinearAct: return _fuse.LinearAct( linear=self.linear.to_float(), act=self.act, ) def update_bn_stats(mod): """ update_bn_stats methods updates BatchNorm statistics This method was originally defined in torch.nn.intrinsic.qat.modules.conv_fused. However, it limited the type of quantized fused modules for which BatchNorm statistics can be updated """ if hasattr(mod, "update_bn_stats"): mod.update_bn_stats() def freeze_bn_stats(mod): """ freeze_bn_stats method freezes BatchNorm statistics This method was originally defined in torch.nn.intrinsic.qat.modules.conv_fused. 
However, it limited the type of quantized fused modules for which BatchNorm statistics can be frozen """ if hasattr(mod, "freeze_bn_stats"): mod.freeze_bn_stats() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/quantization/modules/quantized_modules.py0000644000000000000000000000342314672066616030326 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from collections import OrderedDict as _OrderedDict from typing import Type as _Type import torch.ao.nn.quantized.reference as _reference import torch.nn as _nn class _QuantizedConvAct(_nn.Sequential): ref_quant_mod: _Type[_nn.Module] def __init__(self, conv: _nn.Module, act: _nn.Module): super().__init__(_OrderedDict([("conv", conv), ("act", act)])) @classmethod def from_float(cls, float_conv_act, weight_qparams): conv = cls.ref_quant_mod.from_float(float_conv_act.conv, weight_qparams) return cls(conv, float_conv_act.act) class QuantizedConvAct1d(_QuantizedConvAct): ref_quant_mod = _reference.Conv1d pass class QuantizedConvAct2d(_QuantizedConvAct): ref_quant_mod = _reference.Conv2d pass class QuantizedConvAct3d(_QuantizedConvAct): ref_quant_mod = _reference.Conv3d pass class QuantizedConvTransposeAct1d(_QuantizedConvAct): ref_quant_mod = _reference.ConvTranspose1d pass class QuantizedConvTransposeAct2d(_QuantizedConvAct): ref_quant_mod = _reference.ConvTranspose2d pass class QuantizedConvTransposeAct3d(_QuantizedConvAct): ref_quant_mod = _reference.ConvTranspose3d pass class QuantizedLinearAct(_nn.Sequential): def __init__(self, linear: _reference.Linear, act: _nn.Module): super().__init__(_OrderedDict([("linear", linear), ("act", act)])) @classmethod def from_float(cls, float_linear_act, weight_qparams): linear = _reference.Linear.from_float(float_linear_act.linear, weight_qparams) return cls(linear, float_linear_act.act) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/quantization/post_training_quantization.py0000644000000000000000000005505014672066616030613 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import logging as _logging from collections import OrderedDict as _OrderedDict from typing import Any as _Any from typing import Callable as _Callable from typing import Dict as _Dict from typing import List as _List from typing import NewType as _NewType from typing import Optional as _Optional from typing import Tuple as _Tuple from typing import Type as _Type from typing import Union as _Union import cattrs as _cattrs import numpy as _np import torch as _torch import torch.nn as _nn from attr import define as _define from attr import field as _field from attrs import validators as _validators from coremltools.converters.mil.mil.ops.defs.iOS18 import constexpr_blockwise_shift_scale from coremltools.optimize.coreml._utils import compute_qparams as _ct_compute_qparams from coremltools.optimize.torch._utils.metadata_utils import ( CompressionMetadata as _CompressionMetadata, ) from coremltools.optimize.torch._utils.report_utils import ( compute_post_training_report as _compute_post_training_report, ) from coremltools.optimize.torch._utils.torch_utils import get_atomic_layers as _get_atomic_layers from coremltools.optimize.torch._utils.torch_utils import ( get_n_bits_from_dtype as _get_n_bits_from_dtype, ) from coremltools.optimize.torch._utils.torch_utils import ( maybe_convert_str_to_dtype as _maybe_convert_str_to_dtype, ) from coremltools.optimize.torch._utils.torch_utils import ( maybe_convert_str_to_mod_type as _maybe_convert_str_to_mod_type, ) from coremltools.optimize.torch._utils.validation_utils import ( validate_param_config as _validate_param_config, ) from coremltools.optimize.torch.base_model_optimizer import ( BasePostTrainingModelOptimizer as _BasePostTrainingModelOptimizer, ) from coremltools.optimize.torch.base_model_optimizer import _Report from coremltools.optimize.torch.optimization_config import ( ModuleOptimizationConfig as _ModuleOptimizationConfig, ) from coremltools.optimize.torch.optimization_config import OptimizationConfig as _OptimizationConfig from coremltools.optimize.torch.optimization_config import ( QuantizationGranularity, _structure_from_dict_hook_factory, ) from coremltools.optimize.torch.quantization import QuantizationScheme as _QuantizationScheme _default_ptq_options = { "weight_dtype": "int8", "granularity": "per_channel", "quantization_scheme": _QuantizationScheme.symmetric, "block_size": None, } _logger = _logging.getLogger(__name__) @_define class ModulePostTrainingQuantizerConfig(_ModuleOptimizationConfig): """ Configuration class for specifying global and module-level quantizer options for :py:class:`PostTrainingQuantizer` algorithm. Args: weight_dtype (:py:class:`torch.dtype`): The dtype to use for quantizing the weights. The number of bits used for quantization is inferred from the dtype. When dtype is set to :py:class:`torch.float32`, the weights corresponding to that layer are not quantized. Defaults to :py:class:`torch.int8`, which corresponds to 8-bit quantization. granularity (:py:class:`QuantizationGranularity`): Specifies the granularity at which quantization parameters will be computed. Can be one of ``per_channel``, ``per_tensor`` or ``per_block``. When using ``per_block``, ``block_size`` argument must be specified. Defaults to ``per_channel``. 
quantization_scheme (:py:class:`~.coremltools.optimize.torch.quantization.quantization_config.QuantizationScheme`): Type of quantization configuration to use. When this parameter is set to ``QuantizationScheme.symmetric``, all weights are quantized with zero point as zero. When it is set to ``QuantizationScheme.affine``, zero point can be set anywhere in the range of values allowed for the quantized weight. Defaults to ``QuantizationScheme.symmetric``. block_size (:obj:`tuple` of :obj:`int` or :obj:`int`): When ``block_size`` is specified, ``block_size`` number of values will share the same quantization parameters of scale, as well as the same zero point when applicable, across the input channel axis. A tuple of integers can be provided for arbitrarily sized blockwise quantization. See more details on different possible configurations below. Defaults to ``None``. This class supports three different configurations to structure the quantization: 1. **Per-channel quantization**: This is the default configuration where ``granularity`` is ``per_channel`` and ``block_size`` is ``None``. In this configuration, quantization parameters are computed for each output channel. 2. **Per-tensor quantization**: In this configuration, quantization parameters are computed for the tensor as a whole. That is, all values in the tensor will share a single scale and, if applicable, a single zero point. The ``granularity`` argument is set to ``per_tensor``. 3. **Per-block quantization**: This configuration is used to structure the tensor for blockwise quantization. The ``granularity`` is set to ``per_block``, and the ``block_size`` argument has to be specified. The ``block_size`` argument can either be of type ``int`` or ``tuple``: * int: In this configuration, each row along the output channel axis will have its own quantization parameters, similar to the ``per_channel`` configuration. Additionally, ``block_size`` number of values will share the same quantization parameters along the input channel axis. For example, for a weight matrix of shape ``(10, 10)``, if we provide ``block_size = 2``, the shape of the quantization parameters would be ``(10, 5)``. * tuple: For a more advanced configuration, users can provide an arbitrary n-dimensional block to share the quantization parameters. This is specified in the form of a tuple, where each value corresponds to the block size for the respective axis of the weight matrix. The length of the provided tuple should be at most the number of dimensions of the weight matrix. .. note:: When performing 4-bit quantization, ``weight_dtype`` is set to :py:class:`torch.int8` for ``int4`` or :py:class:`torch.uint8` for ``uint4``. This is because PyTorch currently doesn't provide support for 4-bit data types. However, the quantization range is set according to 4-bit quantization and based on whether the ``weight_dtype`` is signed or unsigned. 
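Example (an illustrative sketch; the 4-bit dtype and the block size of 32 are arbitrary example values):

.. code-block:: python

    config = ModulePostTrainingQuantizerConfig.from_dict(
        {
            "weight_dtype": "int4",
            "granularity": "per_block",
            "block_size": 32,
        }
    )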
""" weight_dtype: _Union[str, _torch.dtype] = _field( default=_default_ptq_options["weight_dtype"], converter=_maybe_convert_str_to_dtype, validator=[ _validators.instance_of(_torch.dtype), _validators.in_([_torch.int8, _torch.uint8, _torch.float32]), ], ) granularity: QuantizationGranularity = _field( default=_default_ptq_options["granularity"], converter=QuantizationGranularity, validator=_validators.in_(QuantizationGranularity), ) quantization_scheme: _QuantizationScheme = _field( default=_default_ptq_options["quantization_scheme"], converter=_QuantizationScheme, validator=_validators.in_(_QuantizationScheme), ) block_size: _Optional[_Union[int, _Tuple[int]]] = _field( default=_default_ptq_options["block_size"], converter=lambda val: (val,) if type(val) is int else val, validator=_validators.optional( _validators.deep_iterable( member_validator=_validators.instance_of(int), iterable_validator=_validators.instance_of(tuple), ) ), ) def __attrs_post_init__(self): self.weight_n_bits = _get_n_bits_from_dtype(self.weight_dtype) @block_size.validator def per_block_granularity(self, attribute, value): if self.granularity == QuantizationGranularity.per_block: assert ( value is not None ), "block_size has to be specified along with per_block granularity." else: assert ( value is None ), "block_size can't be specified along with per_tensor or per_channel granularity." @classmethod def from_dict(cls, config_dict): converter = _cattrs.Converter(forbid_extra_keys=True) converter.register_structure_hook( _Union[str, _torch.dtype], lambda obj, type: obj, ) return converter.structure_attrs_fromdict(config_dict, cls) _ModuleTypeConfigType = _NewType( "ModuleTypeConfigType", _Dict[_Union[_Callable, str], _Optional[ModulePostTrainingQuantizerConfig]], ) @_define class PostTrainingQuantizerConfig(_OptimizationConfig): """ Configuration class for specifying how different submodules of a model should be post-training quantized by :py:class:`PostTrainingQuantizer`. Args: global_config (:py:class:`ModulePostTrainingQuantizerConfig`): Config to be applied globally to all supported modules. module_type_configs (:obj:`dict` of :obj:`str` to :py:class:`ModulePostTrainingQuantizerConfig`): Module type configs applied to a specific module class, such as :py:class:`torch.nn.Linear`. The keys can be either strings or module classes. module_name_configs (:obj:`dict` of :obj:`str` to :py:class:`ModulePostTrainingQuantizerConfig`): Module name configs applied to specific modules. This can be a dictionary with module names pointing to their corresponding :py:class:`ModulePostTrainingQuantizerConfig`. 
""" global_config: _Optional[ModulePostTrainingQuantizerConfig] = _field( default=None, validator=_validators.optional(_validators.instance_of(ModulePostTrainingQuantizerConfig)), ) module_type_configs: _ModuleTypeConfigType = _field( factory=_OrderedDict, validator=_validators.deep_mapping( key_validator=_validators.instance_of((str, _Callable)), value_validator=_validators.optional( _validators.instance_of(ModulePostTrainingQuantizerConfig) ), mapping_validator=_validators.instance_of(dict), ), ) module_name_configs: _Dict[str, _Optional[ModulePostTrainingQuantizerConfig]] = _field( factory=_OrderedDict, validator=_validators.deep_mapping( key_validator=_validators.instance_of(str), value_validator=_validators.optional( _validators.instance_of(ModulePostTrainingQuantizerConfig) ), mapping_validator=_validators.instance_of(dict), ), ) def __attrs_post_init__(self): if ( self.global_config is None and len(self.module_type_configs) == 0 and len(self.module_name_configs) == 0 ): self.global_config = ModulePostTrainingQuantizerConfig() self.module_type_configs = { _maybe_convert_str_to_mod_type(key): val for key, val in self.module_type_configs.items() } self._validate_same_params(["quantization_scheme"]) @classmethod def from_dict(cls, config_dict: _Dict[str, _Any]) -> "PostTrainingQuantizerConfig": super().from_dict(config_dict) converter = _cattrs.Converter(forbid_extra_keys=True) converter.register_structure_hook( _Union[str, _torch.dtype], lambda obj, type: obj, ) converter.register_structure_hook( _ModuleTypeConfigType, _structure_from_dict_hook_factory(ModulePostTrainingQuantizerConfig), ) return converter.structure_attrs_fromdict(config_dict, cls) class PostTrainingQuantizer(_BasePostTrainingModelOptimizer): """ Perform post-training quantization on a torch model. After quantization, weights of all submodules selected for quantization contain full precision values obtained by quantizing and dequantizing the original weights, which captures the error induced by quantization. .. note:: After quantization, the weight values stored will still remain in full precision, so the PyTorch model size will not be reduced. To see the reduction in model size, please convert the model using ``coremltools.convert(...)``, which will produce a model intermediate language (MIL) model containing the compressed weights. Example: .. code-block:: python import torch.nn as nn from coremltools.optimize.torch.quantization import ( PostTrainingQuantizerConfig, PostTrainingQuantizer, ) model = nn.Sequential( OrderedDict( { "conv": nn.Conv2d(1, 20, (3, 3)), "relu1": nn.ReLU(), "conv2": nn.Conv2d(20, 20, (3, 3)), "relu2": nn.ReLU(), } ) ) # initialize the quantizer config = PostTrainingQuantizerConfig.from_dict( { "global_config": { "weight_dtype": "int8", }, } ) ptq = PostTrainingQuantizer(model, config) quantized_model = ptq.compress() Args: model (:obj:`torch.nn.Module`): Module to be compressed. config (:py:class:`PostTrainingQuantizerConfig`): Config that specifies how different submodules in the model will be quantized. 
""" _supported_modules: _Tuple[_Type[_torch.nn.Module]] = ( _nn.Conv1d, _nn.Conv2d, _nn.Conv3d, _nn.ConvTranspose1d, _nn.ConvTranspose2d, _nn.ConvTranspose3d, _nn.Linear, _nn.MultiheadAttention, ) def _get_quantization_mode( self, weight_dtype: _torch.dtype, quantization_scheme: _QuantizationScheme ): """ Returns quantization mode as string """ if quantization_scheme not in [ _QuantizationScheme.affine, _QuantizationScheme.symmetric, ]: raise ValueError( f" Linear quantization scheme must be one of (affine, " f"symmetric) not {quantization_scheme}" ) quantization_mode = ( "LINEAR_SYMMETRIC" if quantization_scheme == _QuantizationScheme.symmetric else "LINEAR" ) return quantization_mode def __init__(self, model: _torch.nn.Module, config: PostTrainingQuantizerConfig = None): config = PostTrainingQuantizerConfig() if config is None else config super().__init__(model, config) def _compute_quantization_params( self, weight: _np.ndarray, nbits: int, dtype: _np.dtype, block_sizes: _List[int], quantization_mode: _Optional[str] = None, signed: bool = True, ) -> _Optional[_Tuple[_np.ndarray, _np.ndarray, _Optional[_np.ndarray]]]: """ Compute quantization parameters """ ret = _ct_compute_qparams( weight=weight, nbits=nbits, quantization_mode=quantization_mode, dtype=dtype, block_sizes=block_sizes, signed=signed, # Always used signed dtype range ) return ret def _dequantize_weight( self, quantized_weight: _np.ndarray, scale: _np.ndarray, zero_point: _Optional[_np.ndarray], quantization_mode: _Optional[str] = None, ): """ De-quantize weights """ dequantized_weight = constexpr_blockwise_shift_scale.decompress( quantized_weight, scale, zero_point ) return dequantized_weight @_torch.no_grad() def _quantize_weight( self, submod_name: str, submodule: _torch.nn.Module, submod_config: ModulePostTrainingQuantizerConfig, param_name: str, ) -> _Optional[_Tuple[_torch.Tensor, _torch.Tensor, _Optional[_torch.Tensor]]]: """ Helper function to perform the quantization on a PyTorch submodule's parameter Args: submod_name (:obj:`str`): Name of the submodule submodule (:obj:`torch.nn.Module`) Submodule which is being quantized submod_config (:py:class:`ModulePostTrainingQuantizerConfig`): Config for the submodule param_name (:obj:`str`): Name of the parameter within the submodule to quantize .. note:: This function extracts the numpy array out of the torch weight value and uses that for performing the quantization """ torch_weight = submodule.get_parameter(param_name) weight = torch_weight.numpy() block_sizes = [0] * weight.ndim assert len(block_sizes) >= 2, "Weight matrix has to be at least 2D or greater" if submod_config.granularity == QuantizationGranularity.per_channel: blocking_axis = ( 1 if isinstance( submodule, ( _nn.ConvTranspose1d, _nn.ConvTranspose2d, _nn.ConvTranspose3d, ), ) else 0 ) block_sizes[blocking_axis] = 1 elif submod_config.granularity == QuantizationGranularity.per_block: updated_config = _validate_param_config( submod_name + "." 
+ param_name, torch_weight, submodule, submod_config, ["quantization_block_size"], ) if not updated_config: _logger.warning(f"Unable to quantize layer {submod_name} - skipping it.") return block_size_config = list(updated_config.block_size) if isinstance( submodule, ( _nn.ConvTranspose1d, _nn.ConvTranspose2d, _nn.ConvTranspose3d, ), ): block_sizes[: len(block_size_config)] = block_size_config[::-1] else: block_sizes[: len(block_size_config)] = block_size_config quantization_mode = self._get_quantization_mode( submod_config.weight_dtype, submod_config.quantization_scheme ) ret = self._compute_quantization_params( weight=weight, nbits=submod_config.weight_n_bits, quantization_mode=quantization_mode, dtype=weight.dtype, block_sizes=block_sizes, signed=True, ) # Always used signed dtype range if ret is None: _logger.warning(f"Unable to quantize layer {submod_name} - skipping it.") return quant_weight, scale, zp = ret dequant_weight = self._dequantize_weight(quant_weight, scale, zp, quantization_mode) # Convert back to torch tensors dequant_weight = _torch.from_numpy(dequant_weight) scale = _torch.from_numpy(scale) if zp is not None: zp = _torch.from_numpy(zp) # Replace the parameter's value submodule.get_parameter(param_name).data.copy_(dequant_weight) # Register compression metadata metadata = self._get_compression_metadata(param_name, submod_config, scale, zp) metadata.register(submodule) def _get_compression_metadata(self, param_name, submod_config, scale, zero_point): metadata = _CompressionMetadata(param_name) metadata.compression_type = ["quantization"] metadata.quantization_n_bits = submod_config.weight_n_bits metadata.quantization_scale = scale if submod_config.quantization_scheme == _QuantizationScheme.affine: assert zero_point is not None metadata.zero_point = zero_point return metadata def compress(self, inplace: bool = False) -> _torch.nn.Module: """ Compress the supported layers in the module by quantizing each weight value of the layer. Args: inplace (:obj:`bool`): If ``True``, model transformations are carried out in-place and the original module is mutated, otherwise a copy of the model is mutated and returned. Defaults to ``False``. 
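        Example:
            A minimal sketch of compressing and then converting the model to
            obtain the size reduction mentioned in the note above. The input
            shape and deployment target are illustrative assumptions, and
            ``ptq`` / ``model`` follow the class-level example.

            .. code-block:: python

                import coremltools as ct
                import torch

                quantized_model = ptq.compress()

                # Trace and convert; the compressed weights are materialized
                # in the resulting Core ML model.
                traced = torch.jit.trace(quantized_model, torch.rand(1, 1, 28, 28))
                mlmodel = ct.convert(
                    traced,
                    inputs=[ct.TensorType(shape=(1, 1, 28, 28))],
                    minimum_deployment_target=ct.target.iOS17,
                )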
""" self._model = super().compress(inplace=inplace) for submod_name, submodule in _get_atomic_layers( self._model, layer_types=list(self._supported_modules) ).items(): submod_config = self._config.get_module_config(submod_name, submodule) if submod_config is None: continue # TODO: Replace this with supported modules abstraction # --- Conv, ConvTranspose & Linear layers --- if isinstance(submodule, self._supported_modules) and not isinstance( submodule, _nn.MultiheadAttention ): assert hasattr( submodule, "weight" ), f"No parameter named weight in submodule {submod_name}" self._quantize_weight(submod_name, submodule, submod_config, "weight") # --- MultiheadAttention layer --- elif isinstance(submodule, _nn.MultiheadAttention): param_names = [ "in_proj_weight", "q_proj_weight", "k_proj_weight", "v_proj_weight", ] for param_name in param_names: if not hasattr(submodule, param_name): continue if getattr(submodule, param_name) is None: continue self._quantize_weight(submod_name, submodule, submod_config, param_name) if hasattr(submodule, "out_proj") and submodule.out_proj.weight is not None: self._quantize_weight( f"{submod_name}.out_proj", submodule.out_proj, submod_config, "weight", ) return self._model def report(self) -> _Report: return _compute_post_training_report( self._uncompressed_model, self._model, supported_modules=self._supported_modules, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/quantization/quantization_config.py0000644000000000000000000004353114672066616027201 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import logging as _logging from collections import OrderedDict as _OrderedDict from enum import Enum as _Enum from enum import unique as _unique from typing import Any as _Any from typing import Callable as _Callable from typing import Dict as _Dict from typing import List as _List from typing import NewType as _NewType from typing import Optional as _Optional from typing import Union as _Union import cattrs as _cattrs import torch as _torch import torch.ao.quantization as _aoquant from attr import define as _define from attr import field as _field from attrs import validators as _validators from coremltools.optimize.torch._utils.torch_utils import ( get_n_bits_from_dtype as _get_n_bits_from_dtype, ) from coremltools.optimize.torch._utils.torch_utils import ( maybe_convert_str_to_dtype as _maybe_convert_str_to_dtype, ) from coremltools.optimize.torch._utils.torch_utils import ( maybe_convert_str_to_mod_type as _maybe_convert_str_to_mod_type, ) from coremltools.optimize.torch.optimization_config import ( ModuleOptimizationConfig as _ModuleOptimizationConfig, ) from coremltools.optimize.torch.optimization_config import OptimizationConfig as _OptimizationConfig from coremltools.optimize.torch.optimization_config import _structure_from_dict_hook_factory _logger = _logging.getLogger(__name__) @_unique class ObserverType(_Enum): """ An enum indicating the type of observer. Allowed options are moving_average_min_max, min_max, ema_min_max, ema_percentile, mse, ema_mse, lsq and lsq_plus. 
""" moving_average_min_max = "moving_average_min_max" min_max = "min_max" @staticmethod def get_observer(observer_type: "ObserverType", is_per_channel: bool) -> _Any: _str_to_observer_map = { "moving_average_min_max": _aoquant.MovingAverageMinMaxObserver, "min_max": _aoquant.MinMaxObserver, "moving_average_min_max_per_channel": _aoquant.MovingAveragePerChannelMinMaxObserver, "min_max_per_channel": _aoquant.PerChannelMinMaxObserver, } observer_name = observer_type.value if is_per_channel: observer_name = f"{observer_name}_per_channel" return _str_to_observer_map[observer_name] @_unique class QuantizationScheme(_Enum): """ An enum indicating the type of quantization to be performed. Allowed options are symmetric and affine. """ symmetric = "symmetric" affine = "affine" @staticmethod def get_qscheme( quantizaton_scheme: "QuantizationScheme", is_per_channel: bool ) -> _torch.qscheme: _str_to_qscheme_map = { "symmetric": _torch.per_tensor_symmetric, "affine": _torch.per_tensor_affine, "symmetric_per_channel": _torch.per_channel_symmetric, "affine_per_channel": _torch.per_channel_affine, } quantization_scheme_name = quantizaton_scheme.value if is_per_channel: quantization_scheme_name = f"{quantization_scheme_name}_per_channel" return _str_to_qscheme_map[quantization_scheme_name] _default_quantization_options = { "weight_dtype": _torch.qint8, "weight_per_channel": True, "weight_ch_axis": 0, "activation_dtype": _torch.quint8, "observer": ObserverType.moving_average_min_max, "quantization_scheme": QuantizationScheme.symmetric, } # Backends only support 4 and 8 bit quantization _SUPPORTED_N_BITS = [4, 8, 32] @_define class ModuleLinearQuantizerConfig(_ModuleOptimizationConfig): """ Configuration class for specifying global and module-level quantization options for linear quantization algorithm implemented in :py:class:`LinearQuantizer`. Linear quantization algorithm simulates the effects of quantization during training, by quantizing and dequantizing the weights and/or activations during the model's forward pass. The forward and backward pass computations are conducted in ``float`` dtype, however, these ``float`` values follow the constraints imposed by ``int8`` and ``quint8`` dtypes. For more details, please refer to `Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference `_. For most applications, the only parameters that need to be set are ``quantization_scheme`` and ``milestones``. By default, ``quantization_scheme`` is set to :py:class:`QuantizationScheme.symmetric`, which means all weights are quantized with zero point as zero, and activations are quantized with zero point as zero for non-negative activations and 128 for all other activations. The weights are quantized using :py:class:`torch.qint8` and activations are quantized using :py:class:`torch.quint8`. Linear quantization algorithm inserts ``observers`` for each weight/activation tensor. These observers collect statistics of these tensors' values, for example, the minimum and maximum values they can take. These statistics are then used to compute the scale and zero point, which are in turn used for quantizing the weights/activations. By default, ``moving_average_min_max`` observer is used. For more details, please check `MinMaxObserver `_. The ``milestones`` parameter controls the flow of the quantization algorithm. The example below illustrates its usage in more detail: .. 
code-block:: python model = define_model() config = LinearQuantizerConfig( global_config=ModuleLinearQuantizerConfig( quantization_scheme="symmetric", milestones=[0, 100, 300, 200], ) ) quantizer = LinearQuantizer(model, config) # prepare the model to insert FakeQuantize layers for QAT model = quantizer.prepare() # use quantizer in your PyTorch training loop for inputs, labels in data: output = model(inputs) loss = loss_fn(output, labels) loss.backward() optimizer.step() quantizer.step() # In this example, from step 0 onwards, observers will collect statistics # of the values of weights/activations. However, between steps 0 and 100, # effects of quantization will not be simulated. At step 100, quantization # simulation will begin and at step 300, observer statistics collection will # stop. A batch norm layer computes mean and variance of input batch for normalizing # it during training, and collects running estimates of its computed mean and variance, # which are then used for normalization during evaluation. At step 200, batch norm # statistics collection is frozen, and the batch norm layers switch to evaluation # mode, thus more closely simulating the inference numerics during training time. Args: weight_dtype (:py:class:`torch.dtype`): The dtype to use for quantizing the weights. The number of bits used for quantization is inferred from the dtype. When dtype is set to :py:class:`torch.float32`, the weights corresponding to that layer are not quantized. Defaults to :py:class:`torch.int8` which corresponds to 8-bit quantization. weight_observer (:py:class:`ObserverType`): Type of observer to use for quantizing weights. Defaults to ``moving_average_min_max``. weight_per_channel (:obj:`bool`): When ``True``, weights are quantized per channel; otherwise, per tensor. activation_dtype (:py:class:`torch.dtype`): The dtype to use for quantizing the activations. When dtype is set to :py:class:`torch.float32`, the activations corresponding to that layer are not quantized. Defaults to :py:class:`torch.quint8`. activation_observer (:py:class:`ObserverType`): Type of observer to use for quantizing activations. Defaults to ``moving_average_min_max``. quantization_scheme: (:py:class:`QuantizationScheme`): Type of quantization configuration to use. When this parameter is set to :py:class:`QuantizationScheme.symmetric`, all weights are quantized with zero point as zero, and activations are quantized with zero point as zero for non-negative activations and 128 for all other activations. When it is set to :py:class:`QuantizationScheme.affine`, zero point can be set anywhere in the range of values allowed for the quantized weight/activation. Defaults to :py:class:`QuantizationScheme.symmetric`. milestones (:obj:`list` of :obj:`int`): A list of four integers indicating milestones to use during quantization. The first milestone corresponds to enabling observers, the second to enabling fake quantization simulation, the third to disabling observers, and the last to freezing batch norm statistics. Defaults to ``None``, which means the ``step`` method of :py:class:`LinearQuantizer` will be a no-op and all observers and quantization simulation will be turned on from the first step, batch norm layers always operate in training mode, and mean and variance statistics collection is not frozen. 
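    Example:
        A short sketch of a weight-only configuration (activations left in
        ``float32``); the values shown are illustrative, not defaults.

        .. code-block:: python

            import torch

            weight_only_config = ModuleLinearQuantizerConfig(
                weight_dtype=torch.qint8,
                weight_per_channel=True,
                activation_dtype=torch.float32,
                quantization_scheme="symmetric",
                milestones=[0, 100, 300, 200],
            )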
""" weight_dtype: _Union[str, _torch.dtype] = _field( default=_default_quantization_options["weight_dtype"], ) weight_observer: ObserverType = _field( default=_default_quantization_options["observer"], converter=ObserverType, validator=_validators.in_(ObserverType), ) weight_per_channel: bool = _field( default=_default_quantization_options["weight_per_channel"], validator=_validators.instance_of(bool), ) activation_dtype: _torch.dtype = _field( default=_default_quantization_options["activation_dtype"], converter=_maybe_convert_str_to_dtype, validator=[ _validators.instance_of(_torch.dtype), _validators.in_([_torch.quint8, _torch.float32]), ], ) activation_observer: ObserverType = _field( default=_default_quantization_options["observer"], converter=ObserverType, validator=_validators.in_(ObserverType), ) quantization_scheme: QuantizationScheme = _field( default=_default_quantization_options["quantization_scheme"], converter=QuantizationScheme, validator=_validators.in_(QuantizationScheme), ) milestones: _Optional[_List[int]] = _field( default=None, validator=_validators.optional( _validators.deep_iterable( member_validator=_validators.instance_of(int), iterable_validator=_validators.instance_of(list), ) ), ) def __attrs_post_init__(self): self.weight_n_bits = _get_n_bits_from_dtype(self.weight_dtype) self.weight_dtype = _maybe_convert_str_to_dtype(self.weight_dtype) if self.weight_dtype not in [_torch.qint8, _torch.quint8, _torch.float32]: raise ValueError( f"weight_dtype must be one of (_torch.qint8, _torch.quint8, _torch.float32) not {self.weight_dtype}" ) @milestones.validator def _check_milestones(self, attribute, value): if value is not None: assert len(value) == 4, ( f"Received milestones = {value}. " f"Milestones should be of length 4. " f"Refer to docs for more information." ) @classmethod def from_dict(cls, config_dict): converter = _cattrs.Converter(forbid_extra_keys=True) converter.register_structure_hook( _Union[str, _torch.dtype], lambda obj, type: obj, ) return converter.structure_attrs_fromdict(config_dict, cls) _ModuleTypeConfigType = _NewType( "ModuleTypeConfigType", _Dict[_Union[_Callable, str], _Optional[ModuleLinearQuantizerConfig]], ) @_define class LinearQuantizerConfig(_OptimizationConfig): """ Configuration class for specifying how different submodules of a model are quantized by :py:class:`LinearQuantizer`. In order to disable quantizing a layer or an operation, ``module_type_config`` or ``module_name_config`` corresponding to that operation can be set to ``None``. For example: .. 
code-block:: python # The following config will enable weight only quantization for all layers: config = LinearQuantizerConfig.from_dict( { "global_config": { "activation_dtype": "float32", } } ) # The following config will disable quantization for all linear layers and # set quantization mode to weight only quantization for convolution layers: config = LinearQuantizerConfig.from_dict( { "module_type_configs": { "Linear": None, "Conv2d": { "activation_dtype": "float32", }, } } ) # The following config will disable quantization for layers named conv1 and conv2: config = LinearQuantizerConfig.from_dict( { "module_name_configs": { "conv1": None, "conv2": None, } } ) # If model has some methods and attributes which are not used in the forward # pass, but are needed to be preserved after quantization is added, they can # be preserved on the quantized model by passing them in preserved_attributes # parameter model = MyModel() model.key_1 = value_1 model.key_2 = value_2 config = LinearQuantizerConfig.from_dict({"preserved_attributes": ["key_1", "key_2"]}) Args: global_config (:py:class:`ModuleLinearQuantizerConfig`): Config to be applied globally to all supported modules. Missing values are chosen from the default config. module_type_configs (:obj:`dict` of :obj:`str` to :py:class:`ModuleLinearQuantizerConfig`): Module type level configs applied to a specific module class, such as :py:class:`torch.nn.Linear`. The keys can be either strings or module classes. module_name_configs (:obj:`dict` of :obj:`str` to :py:class:`ModuleLinearQuantizerConfig`): Module-level configs applied to specific modules. The name of the module must be a fully qualified name that can be used to fetch it from the top level module using the ``module.get_submodule(target)`` method. non_traceable_module_names (:obj:`list` of :obj:`str`): Names of modules which cannot be traced using ``torch.fx``. preserved_attributes (:obj:`list` of :obj:`str`): Names of attributes of the model which should be preserved on the prepared and finalized models, even if they are not used in the model's forward pass. .. note:: The ``quantization_scheme`` parameter must be the same across all configs. 
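    Example:
        A sketch combining the options above. Every sub-config uses the same
        ``quantization_scheme``, as required by the note; the module and
        attribute names are hypothetical.

        .. code-block:: python

            config = LinearQuantizerConfig.from_dict(
                {
                    "global_config": {
                        "quantization_scheme": "symmetric",
                        "milestones": [0, 100, 400, 400],
                    },
                    "module_name_configs": {
                        # weight-only quantization for this layer
                        "conv1": {
                            "quantization_scheme": "symmetric",
                            "activation_dtype": "float32",
                        },
                    },
                    "non_traceable_module_names": ["custom_postprocess"],
                }
            )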
""" global_config: _Optional[ModuleLinearQuantizerConfig] = _field( default=None, validator=_validators.optional(_validators.instance_of(ModuleLinearQuantizerConfig)), ) module_type_configs: _ModuleTypeConfigType = _field( factory=_OrderedDict, validator=_validators.deep_mapping( key_validator=_validators.instance_of((str, _Callable)), value_validator=_validators.optional( _validators.instance_of(ModuleLinearQuantizerConfig) ), mapping_validator=_validators.instance_of(dict), ), ) module_name_configs: _Dict[str, _Optional[ModuleLinearQuantizerConfig]] = _field( factory=_OrderedDict, validator=_validators.deep_mapping( key_validator=_validators.instance_of(str), value_validator=_validators.optional( _validators.instance_of(ModuleLinearQuantizerConfig) ), mapping_validator=_validators.instance_of(dict), ), ) non_traceable_module_names: _List[str] = _field( default=list(), validator=_validators.deep_iterable( member_validator=_validators.instance_of(str), ), ) preserved_attributes: _List[str] = _field( factory=list, validator=_validators.deep_iterable( member_validator=_validators.instance_of(str), ), ) def __attrs_post_init__(self): if ( self.global_config is None and len(self.module_type_configs) == 0 and len(self.module_name_configs) == 0 ): self.global_config = ModuleLinearQuantizerConfig() self.module_type_configs = { _maybe_convert_str_to_mod_type(key): val for key, val in self.module_type_configs.items() } self._validate_same_params(["quantization_scheme"]) @classmethod def from_dict(cls, config_dict: _Dict[str, _Any]) -> "LinearQuantizerConfig": super().from_dict(config_dict) converter = _cattrs.Converter(forbid_extra_keys=True) converter.register_structure_hook( _Union[str, _torch.dtype], lambda obj, type: obj, ) converter.register_structure_hook( _ModuleTypeConfigType, _structure_from_dict_hook_factory(ModuleLinearQuantizerConfig), ) return converter.structure_attrs_fromdict(config_dict, cls) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/optimize/torch/quantization/quantizer.py0000644000000000000000000003735614672066616025160 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy as _copy import logging as _logging from typing import Any as _Any from typing import Optional as _Optional from typing import Tuple as _Tuple from typing import Type as _Type import torch as _torch import torch.ao.quantization as _aoquant from torch.ao.quantization.fx.custom_config import ConvertCustomConfig as _ConvertCustomConfig from torch.ao.quantization.fx.custom_config import PrepareCustomConfig as _PrepareCustomConfig from torch.ao.quantization.quantize_fx import convert_to_reference_fx as _convert_to_reference_fx import coremltools.optimize.torch.quantization.modules.qat_modules as _qat from coremltools.optimize.torch._utils.math_utils import rmse_error as _rmse_error from coremltools.optimize.torch._utils.metadata_utils import ( register_metadata_version as _register_metadata_version, ) from coremltools.optimize.torch._utils.torch_utils import get_eval_model as _get_eval_model from coremltools.optimize.torch.base_model_optimizer import ( BaseTrainingTimeModelOptimizer as _BaseTrainingTimeModelOptimizer, ) from coremltools.optimize.torch.base_model_optimizer import _Report from coremltools.optimize.torch.quantization._backend_config import ( get_backend_config as _get_backend_config, ) from coremltools.optimize.torch.quantization._backend_config import ( get_supported_modules as _get_supported_modules, ) from coremltools.optimize.torch.quantization._configure import ( QATConfigurationHandler as _QATConfigurationHandler, ) from coremltools.optimize.torch.quantization._qconfig_mapping import _QConfigMappingBuilder from coremltools.optimize.torch.quantization._utils import ( is_per_channel_quant as _is_per_channel_quant, ) from coremltools.optimize.torch.quantization._utils import is_symmetric_quant as _is_symmetric_quant from coremltools.optimize.torch.quantization._utils import ( pre_apply_weight_quant as _pre_apply_weight_quant, ) from coremltools.optimize.torch.quantization._utils import ( register_compression_metadata as _register_compression_metadata, ) from coremltools.optimize.torch.quantization.quantization_config import ( LinearQuantizerConfig as _LinearQuantizerConfig, ) from coremltools.optimize.torch.quantization.quantization_config import ( ModuleLinearQuantizerConfig as _ModuleLinearQuantizerConfig, ) _logger = _logging.getLogger(__name__) class Quantizer(_BaseTrainingTimeModelOptimizer): pass class LinearQuantizer(Quantizer): """ Perform quantization aware training (QAT) of models. This algorithm simulates the effects of quantization during training, by quantizing and dequantizing the weights and/or activations during the model's forward pass. The forward and backward pass computations are conducted in ``float`` dtype, however, these ``float`` values follow the constraints imposed by ``int8`` and ``quint8`` dtypes. Thus, this algorithm adjusts the model's weights while closely simulating the numerics which get executed during quantized inference, allowing model's weights to adjust to quantization constraints. For more details, please refer to `Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference `_. Example: .. 
code-block:: python import torch.nn as nn from coremltools.optimize.torch.quantization import ( LinearQuantizer, LinearQuantizerConfig, ) model = nn.Sequential( OrderedDict( { "conv": nn.Conv2d(1, 20, (3, 3)), "relu1": nn.ReLU(), "conv2": nn.Conv2d(20, 20, (3, 3)), "relu2": nn.ReLU(), } ) ) loss_fn = define_loss() # initialize the quantizer config = LinearQuantizerConfig.from_dict( { "global_config": { "quantization_scheme": "symmetric", "milestones": [0, 100, 400, 400], } } ) quantizer = LinearQuantizer(model, config) # prepare the model to insert FakeQuantize layers for QAT model = quantizer.prepare() # use quantizer in your PyTorch training loop for inputs, labels in data: output = model(inputs) loss = loss_fn(output, labels) loss.backward() optimizer.step() quantizer.step() # convert operations to their quantized counterparts using parameters learned via QAT model = quantizer.finalize(inplace=True) Args: model (:obj:`torch.nn.Module`): Module to be trained. config (:py:class:`_LinearQuantizerConfig`): Config that specifies how different submodules in the model will be quantized. Default config is used when passed as ``None``. """ _supported_modules: _Tuple = tuple(_get_supported_modules()) _qconfig_mapping_builder_cls: _Type = _QConfigMappingBuilder _qat_configuration_handler_cls: _Type = _QATConfigurationHandler def __init__(self, model: _torch.nn.Module, config: _Optional[_LinearQuantizerConfig] = None): config = _LinearQuantizerConfig() if config is None else config super().__init__(model, config) global_config = self._construct_global_config() self._is_prepared = False self._quantization_scheme = global_config.quantization_scheme self._milestones = global_config.milestones qmapping_builder = self._qconfig_mapping_builder_cls() self._qconfig_mapping = qmapping_builder.get_qconfig_mapping_from_quantization_config( model=self._model, quantization_config=self._config, quantization_scheme=self._quantization_scheme, ) def _construct_global_config(self) -> _ModuleLinearQuantizerConfig: if self._config.global_config is not None: return self._config.global_config for _, config in self._config.module_type_configs.items(): if config is not None: return config for _, config in self._config.module_name_configs.items(): if config is not None: return config return _ModuleLinearQuantizerConfig() def prepare(self, example_inputs: _Tuple[_Any, ...], inplace: bool = False) -> _torch.nn.Module: """ Prepares the model for quantization aware training by inserting :py:class:`torch.ao.quantization.FakeQuantize` layers in the model in appropriate places. Args: example_inputs (:obj:`Tuple[Any, ...]`): Example inputs for forward function of the model, tuple of positional args (keyword args can be passed as positional args as well) inplace (:obj:`bool`): If ``True``, model transformations are carried out in-place and the original module is mutated, otherwise a copy of the model is mutated and returned. .. note:: This method uses `prepare_qat_fx method `_ to insert quantization layers and the returned model is a :py:class:`torch.fx.GraphModule`. Some models, like those with dynamic control flow, may not be trace-able into a :py:class:`torch.fx.GraphModule`. Please follow directions in `Limitations of Symbolic Tracing `_ to update your model first before using :py:class:`LinearQuantizer` algorithm. """ if self._is_prepared: _logger.warning( "Model has already been prepared for QAT. This API call " "will be a no-op." 
) return self._model model = self._get_model_for_compression(inplace=inplace) model.train() prepare_custom_config = _PrepareCustomConfig().set_non_traceable_module_names( self._config.non_traceable_module_names ) prepare_custom_config = prepare_custom_config.set_preserved_attributes( self._config.preserved_attributes ) qat_handler = self._qat_configuration_handler_cls( prepare_custom_config=prepare_custom_config, qconfig_mapping=self._qconfig_mapping, backend_config=_get_backend_config(), quantization_scheme=self._quantization_scheme, ) prepared_model = qat_handler.prepare(model, example_inputs) if self._milestones is not None: prepared_model.apply(_aoquant.disable_observer) prepared_model.apply(_aoquant.disable_fake_quant) self._model = prepared_model self._is_prepared = True return prepared_model def step(self): """ Steps through the milestones defined for this quantizer. The first milestone corresponds to enabling observers, the second to enabling fake quantization simulation, the third to disabling observers, and the last to freezing batch norm statistics. .. note:: If milestones argument is set as ``None``, this method is a no-op. .. note:: In order to not use a particular milestone, its value can be set as ``-1``. """ if not self._is_prepared: _logger.warning( "Model has not been prepared for QAT. This API call " "will be a no-op. prepare method must be called before " "a call to the step method." ) return if self._milestones is None: return else: if self._step_count == self._milestones[0]: self._model.apply(_aoquant.enable_observer) if self._step_count == self._milestones[1]: self._model.apply(_aoquant.enable_fake_quant) if self._step_count == self._milestones[2]: self._model.apply(_aoquant.disable_observer) if self._step_count == self._milestones[3]: self._model.apply(_qat.freeze_bn_stats) self._step_count += 1 def finalize( self, model: _Optional[_torch.nn.Module] = None, inplace: bool = False ) -> _torch.nn.Module: """ Prepares the model for export. Args: model (:py:class:`_torch.nn.Module`): Model to be finalized. inplace (:obj:`bool`): If ``True``, model transformations are carried out in-place and the original module is mutated; otherwise, a copy of the model is mutated and returned. .. note:: Once the model is finalized with ``in_place = True``, it may not be runnable on the GPU. """ if not self._is_prepared: _logger.warning( "Model has not been prepared for QAT. This API call " "will be a no-op. prepare method must be called before " "a call to the finalize method." ) return self._model if model is None: model = self._model if not inplace: model = _copy.deepcopy(model) model.eval() convert_custom_config = _ConvertCustomConfig().set_preserved_attributes( self._config.preserved_attributes ) finalized_model = _convert_to_reference_fx( model, convert_custom_config=convert_custom_config, qconfig_mapping=self._qconfig_mapping, backend_config=_get_backend_config(), ) # PyTorch fx QAT does not properly handle the clipping of < 8-bit weights during # finalization so have to apply the utility method below after finalization to clip # the de-quantized weights. 
_pre_apply_weight_quant(finalized_model) _register_metadata_version(finalized_model) for name, submodule in finalized_model.named_modules(remove_duplicate=True): if hasattr(submodule, "weight_scale"): _register_compression_metadata(submodule) if model is None: self._model = finalized_model return finalized_model def report(self) -> _Report: """ Returns a dictionary with important statistics related to current state of quantization. Each key in the dictionary corresponds to a module name, and the value is a dictionary containing the statistics such as scale, zero point, number of parameters, and so on. Note that error will be nan and #params will be -1 for activations. """ report = _Report() with _get_eval_model(self._model) as model: with _torch.no_grad(): for name, module in model.named_modules(remove_duplicate=True): if ( hasattr(module, "weight_fake_quant") and module.weight_fake_quant is not None ): module_summary = dict() module_summary["type"] = "weight" module_summary["device"] = module.weight.device qscheme = module.weight_fake_quant.qscheme module_summary["qscheme"] = ( "symmetric" if _is_symmetric_quant(qscheme) else "affine" ) module_summary["per_channel"] = _is_per_channel_quant(qscheme) qweight = module.weight_fake_quant.forward(module.weight.detach()) module_summary["dtype"] = module.weight_fake_quant.dtype module_summary["qmin"] = module.weight_fake_quant.quant_min module_summary["qmax"] = module.weight_fake_quant.quant_max module_summary["error"] = _rmse_error( module.weight.detach(), qweight ).item() module_summary["#params"] = int(_torch.numel(qweight)) report[name] = module_summary elif ( not name.endswith(".weight_fake_quant") and isinstance(module, _aoquant.FakeQuantize) and hasattr(module, "activation_post_process") and module.activation_post_process is not None ): module_summary = dict() module_summary["type"] = "activation" scale, zp = module.activation_post_process.calculate_qparams() module_summary["device"] = scale.device qscheme = module.qscheme module_summary["qscheme"] = ( "symmetric" if _is_symmetric_quant(qscheme) else "affine" ) module_summary["per_channel"] = _is_per_channel_quant(qscheme) module_summary["dtype"] = module.dtype module_summary["qmin"] = module.quant_min module_summary["qmax"] = module.quant_max module_summary["error"] = float("nan") module_summary["#params"] = -1 report[name] = module_summary return report ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2775474 coremltools-8.0/coremltools/proto/0000755000000000000000000000000014672075535016204 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/ArrayFeatureExtractor_pb2.py0000644000000000000000000000262614672066616023615 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! 
# source: ArrayFeatureExtractor.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x1b\x41rrayFeatureExtractor.proto\x12\x14\x43oreML.Specification\"-\n\x15\x41rrayFeatureExtractor\x12\x14\n\x0c\x65xtractIndex\x18\x01 \x03(\x04\x42\x02H\x03\x62\x06proto3') _ARRAYFEATUREEXTRACTOR = DESCRIPTOR.message_types_by_name['ArrayFeatureExtractor'] ArrayFeatureExtractor = _reflection.GeneratedProtocolMessageType('ArrayFeatureExtractor', (_message.Message,), { 'DESCRIPTOR' : _ARRAYFEATUREEXTRACTOR, '__module__' : 'ArrayFeatureExtractor_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ArrayFeatureExtractor) }) _sym_db.RegisterMessage(ArrayFeatureExtractor) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _ARRAYFEATUREEXTRACTOR._serialized_start=53 _ARRAYFEATUREEXTRACTOR._serialized_end=98 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/AudioFeaturePrint_pb2.py0000644000000000000000000000463614672066616022724 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: AudioFeaturePrint.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x17\x41udioFeaturePrint.proto\x12!CoreML.Specification.CoreMLModels\"\x9d\x02\n\x11\x41udioFeaturePrint\x12K\n\x05sound\x18\x14 \x01(\x0b\x32:.CoreML.Specification.CoreMLModels.AudioFeaturePrint.SoundH\x00\x1a\xa1\x01\n\x05Sound\x12X\n\x07version\x18\x01 \x01(\x0e\x32G.CoreML.Specification.CoreMLModels.AudioFeaturePrint.Sound.SoundVersion\">\n\x0cSoundVersion\x12\x19\n\x15SOUND_VERSION_INVALID\x10\x00\x12\x13\n\x0fSOUND_VERSION_1\x10\x01\x42\x17\n\x15\x41udioFeaturePrintTypeB\x02H\x03\x62\x06proto3') _AUDIOFEATUREPRINT = DESCRIPTOR.message_types_by_name['AudioFeaturePrint'] _AUDIOFEATUREPRINT_SOUND = _AUDIOFEATUREPRINT.nested_types_by_name['Sound'] _AUDIOFEATUREPRINT_SOUND_SOUNDVERSION = _AUDIOFEATUREPRINT_SOUND.enum_types_by_name['SoundVersion'] AudioFeaturePrint = _reflection.GeneratedProtocolMessageType('AudioFeaturePrint', (_message.Message,), { 'Sound' : _reflection.GeneratedProtocolMessageType('Sound', (_message.Message,), { 'DESCRIPTOR' : _AUDIOFEATUREPRINT_SOUND, '__module__' : 'AudioFeaturePrint_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CoreMLModels.AudioFeaturePrint.Sound) }) , 'DESCRIPTOR' : _AUDIOFEATUREPRINT, '__module__' : 'AudioFeaturePrint_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CoreMLModels.AudioFeaturePrint) }) _sym_db.RegisterMessage(AudioFeaturePrint) _sym_db.RegisterMessage(AudioFeaturePrint.Sound) if _descriptor._USE_C_DESCRIPTORS == False: 
DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _AUDIOFEATUREPRINT._serialized_start=63 _AUDIOFEATUREPRINT._serialized_end=348 _AUDIOFEATUREPRINT_SOUND._serialized_start=162 _AUDIOFEATUREPRINT_SOUND._serialized_end=323 _AUDIOFEATUREPRINT_SOUND_SOUNDVERSION._serialized_start=261 _AUDIOFEATUREPRINT_SOUND_SOUNDVERSION._serialized_end=323 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/BayesianProbitRegressor_pb2.py0000644000000000000000000001047514672066616024137 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: BayesianProbitRegressor.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x1d\x42\x61yesianProbitRegressor.proto\x12\x14\x43oreML.Specification\"\xa0\x06\n\x17\x42\x61yesianProbitRegressor\x12\x18\n\x10numberOfFeatures\x18\x01 \x01(\r\x12\x44\n\x04\x62ias\x18\x02 \x01(\x0b\x32\x36.CoreML.Specification.BayesianProbitRegressor.Gaussian\x12M\n\x08\x66\x65\x61tures\x18\x03 \x03(\x0b\x32;.CoreML.Specification.BayesianProbitRegressor.FeatureWeight\x12\"\n\x1aregressionInputFeatureName\x18\n \x01(\t\x12 \n\x18optimismInputFeatureName\x18\x0b \x01(\t\x12%\n\x1dsamplingScaleInputFeatureName\x18\x0c \x01(\t\x12*\n\"samplingTruncationInputFeatureName\x18\r \x01(\t\x12\x1d\n\x15meanOutputFeatureName\x18\x14 \x01(\t\x12!\n\x19varianceOutputFeatureName\x18\x15 \x01(\t\x12/\n\'pessimisticProbabilityOutputFeatureName\x18\x16 \x01(\t\x12+\n#sampledProbabilityOutputFeatureName\x18\x17 \x01(\t\x1a+\n\x08Gaussian\x12\x0c\n\x04mean\x18\x01 \x01(\x01\x12\x11\n\tprecision\x18\x02 \x01(\x01\x1ay\n\x12\x46\x65\x61tureValueWeight\x12\x14\n\x0c\x66\x65\x61tureValue\x18\x01 \x01(\r\x12M\n\rfeatureWeight\x18\x02 \x01(\x0b\x32\x36.CoreML.Specification.BayesianProbitRegressor.Gaussian\x1au\n\rFeatureWeight\x12\x11\n\tfeatureId\x18\x01 \x01(\r\x12Q\n\x07weights\x18\x02 \x03(\x0b\x32@.CoreML.Specification.BayesianProbitRegressor.FeatureValueWeightB\x02H\x03\x62\x06proto3') _BAYESIANPROBITREGRESSOR = DESCRIPTOR.message_types_by_name['BayesianProbitRegressor'] _BAYESIANPROBITREGRESSOR_GAUSSIAN = _BAYESIANPROBITREGRESSOR.nested_types_by_name['Gaussian'] _BAYESIANPROBITREGRESSOR_FEATUREVALUEWEIGHT = _BAYESIANPROBITREGRESSOR.nested_types_by_name['FeatureValueWeight'] _BAYESIANPROBITREGRESSOR_FEATUREWEIGHT = _BAYESIANPROBITREGRESSOR.nested_types_by_name['FeatureWeight'] BayesianProbitRegressor = _reflection.GeneratedProtocolMessageType('BayesianProbitRegressor', (_message.Message,), { 'Gaussian' : _reflection.GeneratedProtocolMessageType('Gaussian', (_message.Message,), { 'DESCRIPTOR' : _BAYESIANPROBITREGRESSOR_GAUSSIAN, '__module__' : 'BayesianProbitRegressor_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.BayesianProbitRegressor.Gaussian) }) , 'FeatureValueWeight' : _reflection.GeneratedProtocolMessageType('FeatureValueWeight', (_message.Message,), { 'DESCRIPTOR' : _BAYESIANPROBITREGRESSOR_FEATUREVALUEWEIGHT, '__module__' : 'BayesianProbitRegressor_pb2' # 
@@protoc_insertion_point(class_scope:CoreML.Specification.BayesianProbitRegressor.FeatureValueWeight) }) , 'FeatureWeight' : _reflection.GeneratedProtocolMessageType('FeatureWeight', (_message.Message,), { 'DESCRIPTOR' : _BAYESIANPROBITREGRESSOR_FEATUREWEIGHT, '__module__' : 'BayesianProbitRegressor_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.BayesianProbitRegressor.FeatureWeight) }) , 'DESCRIPTOR' : _BAYESIANPROBITREGRESSOR, '__module__' : 'BayesianProbitRegressor_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.BayesianProbitRegressor) }) _sym_db.RegisterMessage(BayesianProbitRegressor) _sym_db.RegisterMessage(BayesianProbitRegressor.Gaussian) _sym_db.RegisterMessage(BayesianProbitRegressor.FeatureValueWeight) _sym_db.RegisterMessage(BayesianProbitRegressor.FeatureWeight) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _BAYESIANPROBITREGRESSOR._serialized_start=56 _BAYESIANPROBITREGRESSOR._serialized_end=856 _BAYESIANPROBITREGRESSOR_GAUSSIAN._serialized_start=571 _BAYESIANPROBITREGRESSOR_GAUSSIAN._serialized_end=614 _BAYESIANPROBITREGRESSOR_FEATUREVALUEWEIGHT._serialized_start=616 _BAYESIANPROBITREGRESSOR_FEATUREVALUEWEIGHT._serialized_end=737 _BAYESIANPROBITREGRESSOR_FEATUREWEIGHT._serialized_start=739 _BAYESIANPROBITREGRESSOR_FEATUREWEIGHT._serialized_end=856 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/CategoricalMapping_pb2.py0000644000000000000000000000364714672066616023064 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: CategoricalMapping.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() from . 
import DataStructures_pb2 as DataStructures__pb2 try: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes_pb2 from .DataStructures_pb2 import * DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x18\x43\x61tegoricalMapping.proto\x12\x14\x43oreML.Specification\x1a\x14\x44\x61taStructures.proto\"\xe7\x01\n\x12\x43\x61tegoricalMapping\x12\x42\n\x10stringToInt64Map\x18\x01 \x01(\x0b\x32&.CoreML.Specification.StringToInt64MapH\x00\x12\x42\n\x10int64ToStringMap\x18\x02 \x01(\x0b\x32&.CoreML.Specification.Int64ToStringMapH\x00\x12\x12\n\x08strValue\x18\x65 \x01(\tH\x01\x12\x14\n\nint64Value\x18\x66 \x01(\x03H\x01\x42\r\n\x0bMappingTypeB\x10\n\x0eValueOnUnknownB\x02H\x03P\x00\x62\x06proto3') _CATEGORICALMAPPING = DESCRIPTOR.message_types_by_name['CategoricalMapping'] CategoricalMapping = _reflection.GeneratedProtocolMessageType('CategoricalMapping', (_message.Message,), { 'DESCRIPTOR' : _CATEGORICALMAPPING, '__module__' : 'CategoricalMapping_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CategoricalMapping) }) _sym_db.RegisterMessage(CategoricalMapping) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _CATEGORICALMAPPING._serialized_start=73 _CATEGORICALMAPPING._serialized_end=304 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/ClassConfidenceThresholding_pb2.py0000644000000000000000000000344214672066616024722 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: ClassConfidenceThresholding.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() from . 
import DataStructures_pb2 as DataStructures__pb2 try: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes_pb2 from .DataStructures_pb2 import * DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n!ClassConfidenceThresholding.proto\x12\x14\x43oreML.Specification\x1a\x14\x44\x61taStructures.proto\"h\n\x1b\x43lassConfidenceThresholding\x12I\n\x15precisionRecallCurves\x18\x64 \x03(\x0b\x32*.CoreML.Specification.PrecisionRecallCurveB\x02H\x03P\x00\x62\x06proto3') _CLASSCONFIDENCETHRESHOLDING = DESCRIPTOR.message_types_by_name['ClassConfidenceThresholding'] ClassConfidenceThresholding = _reflection.GeneratedProtocolMessageType('ClassConfidenceThresholding', (_message.Message,), { 'DESCRIPTOR' : _CLASSCONFIDENCETHRESHOLDING, '__module__' : 'ClassConfidenceThresholding_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ClassConfidenceThresholding) }) _sym_db.RegisterMessage(ClassConfidenceThresholding) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _CLASSCONFIDENCETHRESHOLDING._serialized_start=81 _CLASSCONFIDENCETHRESHOLDING._serialized_end=185 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/CustomModel_pb2.py0000644000000000000000000000614614672066616021563 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: CustomModel.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x11\x43ustomModel.proto\x12\x14\x43oreML.Specification\"\x8d\x03\n\x0b\x43ustomModel\x12\x11\n\tclassName\x18\n \x01(\t\x12\x45\n\nparameters\x18\x1e \x03(\x0b\x32\x31.CoreML.Specification.CustomModel.ParametersEntry\x12\x13\n\x0b\x64\x65scription\x18( \x01(\t\x1a\xa2\x01\n\x15\x43ustomModelParamValue\x12\x15\n\x0b\x64oubleValue\x18\n \x01(\x01H\x00\x12\x15\n\x0bstringValue\x18\x14 \x01(\tH\x00\x12\x12\n\x08intValue\x18\x1e \x01(\x05H\x00\x12\x13\n\tlongValue\x18( \x01(\x03H\x00\x12\x13\n\tboolValue\x18\x32 \x01(\x08H\x00\x12\x14\n\nbytesValue\x18< \x01(\x0cH\x00\x42\x07\n\x05value\x1aj\n\x0fParametersEntry\x12\x0b\n\x03key\x18\x01 \x01(\t\x12\x46\n\x05value\x18\x02 \x01(\x0b\x32\x37.CoreML.Specification.CustomModel.CustomModelParamValue:\x02\x38\x01\x42\x02H\x03\x62\x06proto3') _CUSTOMMODEL = DESCRIPTOR.message_types_by_name['CustomModel'] _CUSTOMMODEL_CUSTOMMODELPARAMVALUE = _CUSTOMMODEL.nested_types_by_name['CustomModelParamValue'] _CUSTOMMODEL_PARAMETERSENTRY = _CUSTOMMODEL.nested_types_by_name['ParametersEntry'] CustomModel = _reflection.GeneratedProtocolMessageType('CustomModel', (_message.Message,), { 'CustomModelParamValue' : _reflection.GeneratedProtocolMessageType('CustomModelParamValue', (_message.Message,), { 'DESCRIPTOR' : _CUSTOMMODEL_CUSTOMMODELPARAMVALUE, '__module__' : 'CustomModel_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CustomModel.CustomModelParamValue) }) , 'ParametersEntry' : _reflection.GeneratedProtocolMessageType('ParametersEntry', 
(_message.Message,), { 'DESCRIPTOR' : _CUSTOMMODEL_PARAMETERSENTRY, '__module__' : 'CustomModel_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CustomModel.ParametersEntry) }) , 'DESCRIPTOR' : _CUSTOMMODEL, '__module__' : 'CustomModel_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CustomModel) }) _sym_db.RegisterMessage(CustomModel) _sym_db.RegisterMessage(CustomModel.CustomModelParamValue) _sym_db.RegisterMessage(CustomModel.ParametersEntry) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _CUSTOMMODEL_PARAMETERSENTRY._options = None _CUSTOMMODEL_PARAMETERSENTRY._serialized_options = b'8\001' _CUSTOMMODEL._serialized_start=44 _CUSTOMMODEL._serialized_end=441 _CUSTOMMODEL_CUSTOMMODELPARAMVALUE._serialized_start=171 _CUSTOMMODEL_CUSTOMMODELPARAMVALUE._serialized_end=333 _CUSTOMMODEL_PARAMETERSENTRY._serialized_start=335 _CUSTOMMODEL_PARAMETERSENTRY._serialized_end=441 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/DataStructures_pb2.py0000644000000000000000000002424214672066616022302 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: DataStructures.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() from . import FeatureTypes_pb2 as FeatureTypes__pb2 from .FeatureTypes_pb2 import * DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x14\x44\x61taStructures.proto\x12\x14\x43oreML.Specification\x1a\x12\x46\x65\x61tureTypes.proto\"|\n\x10StringToInt64Map\x12<\n\x03map\x18\x01 \x03(\x0b\x32/.CoreML.Specification.StringToInt64Map.MapEntry\x1a*\n\x08MapEntry\x12\x0b\n\x03key\x18\x01 \x01(\t\x12\r\n\x05value\x18\x02 \x01(\x03:\x02\x38\x01\"|\n\x10Int64ToStringMap\x12<\n\x03map\x18\x01 \x03(\x0b\x32/.CoreML.Specification.Int64ToStringMap.MapEntry\x1a*\n\x08MapEntry\x12\x0b\n\x03key\x18\x01 \x01(\x03\x12\r\n\x05value\x18\x02 \x01(\t:\x02\x38\x01\"~\n\x11StringToDoubleMap\x12=\n\x03map\x18\x01 \x03(\x0b\x32\x30.CoreML.Specification.StringToDoubleMap.MapEntry\x1a*\n\x08MapEntry\x12\x0b\n\x03key\x18\x01 \x01(\t\x12\r\n\x05value\x18\x02 \x01(\x01:\x02\x38\x01\"|\n\x10Int64ToDoubleMap\x12<\n\x03map\x18\x01 \x03(\x0b\x32/.CoreML.Specification.Int64ToDoubleMap.MapEntry\x1a*\n\x08MapEntry\x12\x0b\n\x03key\x18\x01 \x01(\x03\x12\r\n\x05value\x18\x02 \x01(\x01:\x02\x38\x01\"\x1e\n\x0cStringVector\x12\x0e\n\x06vector\x18\x01 \x03(\t\"\x1d\n\x0bInt64Vector\x12\x0e\n\x06vector\x18\x01 \x03(\x03\"\x1d\n\x0b\x46loatVector\x12\x0e\n\x06vector\x18\x01 \x03(\x02\"\x1e\n\x0c\x44oubleVector\x12\x0e\n\x06vector\x18\x01 \x03(\x01\"0\n\nInt64Range\x12\x10\n\x08minValue\x18\x01 \x01(\x03\x12\x10\n\x08maxValue\x18\x02 \x01(\x03\"\x1a\n\x08Int64Set\x12\x0e\n\x06values\x18\x01 \x03(\x03\"1\n\x0b\x44oubleRange\x12\x10\n\x08minValue\x18\x01 \x01(\x01\x12\x10\n\x08maxValue\x18\x02 \x01(\x01\"\x9c\x02\n\x14PrecisionRecallCurve\x12:\n\x0fprecisionValues\x18\x01 \x01(\x0b\x32!.CoreML.Specification.FloatVector\x12H\n\x1dprecisionConfidenceThresholds\x18\x02 
\x01(\x0b\x32!.CoreML.Specification.FloatVector\x12\x37\n\x0crecallValues\x18\x03 \x01(\x0b\x32!.CoreML.Specification.FloatVector\x12\x45\n\x1arecallConfidenceThresholds\x18\x04 \x01(\x0b\x32!.CoreML.Specification.FloatVectorB\x02H\x03P\x00\x62\x06proto3') _STRINGTOINT64MAP = DESCRIPTOR.message_types_by_name['StringToInt64Map'] _STRINGTOINT64MAP_MAPENTRY = _STRINGTOINT64MAP.nested_types_by_name['MapEntry'] _INT64TOSTRINGMAP = DESCRIPTOR.message_types_by_name['Int64ToStringMap'] _INT64TOSTRINGMAP_MAPENTRY = _INT64TOSTRINGMAP.nested_types_by_name['MapEntry'] _STRINGTODOUBLEMAP = DESCRIPTOR.message_types_by_name['StringToDoubleMap'] _STRINGTODOUBLEMAP_MAPENTRY = _STRINGTODOUBLEMAP.nested_types_by_name['MapEntry'] _INT64TODOUBLEMAP = DESCRIPTOR.message_types_by_name['Int64ToDoubleMap'] _INT64TODOUBLEMAP_MAPENTRY = _INT64TODOUBLEMAP.nested_types_by_name['MapEntry'] _STRINGVECTOR = DESCRIPTOR.message_types_by_name['StringVector'] _INT64VECTOR = DESCRIPTOR.message_types_by_name['Int64Vector'] _FLOATVECTOR = DESCRIPTOR.message_types_by_name['FloatVector'] _DOUBLEVECTOR = DESCRIPTOR.message_types_by_name['DoubleVector'] _INT64RANGE = DESCRIPTOR.message_types_by_name['Int64Range'] _INT64SET = DESCRIPTOR.message_types_by_name['Int64Set'] _DOUBLERANGE = DESCRIPTOR.message_types_by_name['DoubleRange'] _PRECISIONRECALLCURVE = DESCRIPTOR.message_types_by_name['PrecisionRecallCurve'] StringToInt64Map = _reflection.GeneratedProtocolMessageType('StringToInt64Map', (_message.Message,), { 'MapEntry' : _reflection.GeneratedProtocolMessageType('MapEntry', (_message.Message,), { 'DESCRIPTOR' : _STRINGTOINT64MAP_MAPENTRY, '__module__' : 'DataStructures_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.StringToInt64Map.MapEntry) }) , 'DESCRIPTOR' : _STRINGTOINT64MAP, '__module__' : 'DataStructures_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.StringToInt64Map) }) _sym_db.RegisterMessage(StringToInt64Map) _sym_db.RegisterMessage(StringToInt64Map.MapEntry) Int64ToStringMap = _reflection.GeneratedProtocolMessageType('Int64ToStringMap', (_message.Message,), { 'MapEntry' : _reflection.GeneratedProtocolMessageType('MapEntry', (_message.Message,), { 'DESCRIPTOR' : _INT64TOSTRINGMAP_MAPENTRY, '__module__' : 'DataStructures_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Int64ToStringMap.MapEntry) }) , 'DESCRIPTOR' : _INT64TOSTRINGMAP, '__module__' : 'DataStructures_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Int64ToStringMap) }) _sym_db.RegisterMessage(Int64ToStringMap) _sym_db.RegisterMessage(Int64ToStringMap.MapEntry) StringToDoubleMap = _reflection.GeneratedProtocolMessageType('StringToDoubleMap', (_message.Message,), { 'MapEntry' : _reflection.GeneratedProtocolMessageType('MapEntry', (_message.Message,), { 'DESCRIPTOR' : _STRINGTODOUBLEMAP_MAPENTRY, '__module__' : 'DataStructures_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.StringToDoubleMap.MapEntry) }) , 'DESCRIPTOR' : _STRINGTODOUBLEMAP, '__module__' : 'DataStructures_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.StringToDoubleMap) }) _sym_db.RegisterMessage(StringToDoubleMap) _sym_db.RegisterMessage(StringToDoubleMap.MapEntry) Int64ToDoubleMap = _reflection.GeneratedProtocolMessageType('Int64ToDoubleMap', (_message.Message,), { 'MapEntry' : _reflection.GeneratedProtocolMessageType('MapEntry', (_message.Message,), { 'DESCRIPTOR' : _INT64TODOUBLEMAP_MAPENTRY, '__module__' : 'DataStructures_pb2' # 
@@protoc_insertion_point(class_scope:CoreML.Specification.Int64ToDoubleMap.MapEntry) }) , 'DESCRIPTOR' : _INT64TODOUBLEMAP, '__module__' : 'DataStructures_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Int64ToDoubleMap) }) _sym_db.RegisterMessage(Int64ToDoubleMap) _sym_db.RegisterMessage(Int64ToDoubleMap.MapEntry) StringVector = _reflection.GeneratedProtocolMessageType('StringVector', (_message.Message,), { 'DESCRIPTOR' : _STRINGVECTOR, '__module__' : 'DataStructures_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.StringVector) }) _sym_db.RegisterMessage(StringVector) Int64Vector = _reflection.GeneratedProtocolMessageType('Int64Vector', (_message.Message,), { 'DESCRIPTOR' : _INT64VECTOR, '__module__' : 'DataStructures_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Int64Vector) }) _sym_db.RegisterMessage(Int64Vector) FloatVector = _reflection.GeneratedProtocolMessageType('FloatVector', (_message.Message,), { 'DESCRIPTOR' : _FLOATVECTOR, '__module__' : 'DataStructures_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.FloatVector) }) _sym_db.RegisterMessage(FloatVector) DoubleVector = _reflection.GeneratedProtocolMessageType('DoubleVector', (_message.Message,), { 'DESCRIPTOR' : _DOUBLEVECTOR, '__module__' : 'DataStructures_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.DoubleVector) }) _sym_db.RegisterMessage(DoubleVector) Int64Range = _reflection.GeneratedProtocolMessageType('Int64Range', (_message.Message,), { 'DESCRIPTOR' : _INT64RANGE, '__module__' : 'DataStructures_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Int64Range) }) _sym_db.RegisterMessage(Int64Range) Int64Set = _reflection.GeneratedProtocolMessageType('Int64Set', (_message.Message,), { 'DESCRIPTOR' : _INT64SET, '__module__' : 'DataStructures_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Int64Set) }) _sym_db.RegisterMessage(Int64Set) DoubleRange = _reflection.GeneratedProtocolMessageType('DoubleRange', (_message.Message,), { 'DESCRIPTOR' : _DOUBLERANGE, '__module__' : 'DataStructures_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.DoubleRange) }) _sym_db.RegisterMessage(DoubleRange) PrecisionRecallCurve = _reflection.GeneratedProtocolMessageType('PrecisionRecallCurve', (_message.Message,), { 'DESCRIPTOR' : _PRECISIONRECALLCURVE, '__module__' : 'DataStructures_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.PrecisionRecallCurve) }) _sym_db.RegisterMessage(PrecisionRecallCurve) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _STRINGTOINT64MAP_MAPENTRY._options = None _STRINGTOINT64MAP_MAPENTRY._serialized_options = b'8\001' _INT64TOSTRINGMAP_MAPENTRY._options = None _INT64TOSTRINGMAP_MAPENTRY._serialized_options = b'8\001' _STRINGTODOUBLEMAP_MAPENTRY._options = None _STRINGTODOUBLEMAP_MAPENTRY._serialized_options = b'8\001' _INT64TODOUBLEMAP_MAPENTRY._options = None _INT64TODOUBLEMAP_MAPENTRY._serialized_options = b'8\001' _STRINGTOINT64MAP._serialized_start=66 _STRINGTOINT64MAP._serialized_end=190 _STRINGTOINT64MAP_MAPENTRY._serialized_start=148 _STRINGTOINT64MAP_MAPENTRY._serialized_end=190 _INT64TOSTRINGMAP._serialized_start=192 _INT64TOSTRINGMAP._serialized_end=316 _INT64TOSTRINGMAP_MAPENTRY._serialized_start=274 _INT64TOSTRINGMAP_MAPENTRY._serialized_end=316 _STRINGTODOUBLEMAP._serialized_start=318 _STRINGTODOUBLEMAP._serialized_end=444 _STRINGTODOUBLEMAP_MAPENTRY._serialized_start=402 
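# ---- Illustrative usage (editor's note; not part of the generated DataStructures_pb2 module) ----
# A minimal sketch of building and round-tripping the map/vector/range messages registered
# above. It assumes coremltools 8.0 is installed so the generated module can be imported;
# the variable names and sample values are arbitrary.
from coremltools.proto import DataStructures_pb2

class_labels = DataStructures_pb2.StringToInt64Map()
class_labels.map["cat"] = 0                      # "map" is the map<string, int64> field
class_labels.map["dog"] = 1
probability_range = DataStructures_pb2.DoubleRange(minValue=0.0, maxValue=1.0)
payload = class_labels.SerializeToString()       # protobuf wire-format round trip
assert DataStructures_pb2.StringToInt64Map.FromString(payload).map["dog"] == 1
assert probability_range.maxValue == 1.0
# --------------------------------------------------------------------------------------------------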
_STRINGTODOUBLEMAP_MAPENTRY._serialized_end=444 _INT64TODOUBLEMAP._serialized_start=446 _INT64TODOUBLEMAP._serialized_end=570 _INT64TODOUBLEMAP_MAPENTRY._serialized_start=528 _INT64TODOUBLEMAP_MAPENTRY._serialized_end=570 _STRINGVECTOR._serialized_start=572 _STRINGVECTOR._serialized_end=602 _INT64VECTOR._serialized_start=604 _INT64VECTOR._serialized_end=633 _FLOATVECTOR._serialized_start=635 _FLOATVECTOR._serialized_end=664 _DOUBLEVECTOR._serialized_start=666 _DOUBLEVECTOR._serialized_end=696 _INT64RANGE._serialized_start=698 _INT64RANGE._serialized_end=746 _INT64SET._serialized_start=748 _INT64SET._serialized_end=774 _DOUBLERANGE._serialized_start=776 _DOUBLERANGE._serialized_end=825 _PRECISIONRECALLCURVE._serialized_start=828 _PRECISIONRECALLCURVE._serialized_end=1112 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/DictVectorizer_pb2.py0000644000000000000000000000334214672066616022263 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: DictVectorizer.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() from . import DataStructures_pb2 as DataStructures__pb2 try: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes_pb2 from .DataStructures_pb2 import * DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x14\x44ictVectorizer.proto\x12\x14\x43oreML.Specification\x1a\x14\x44\x61taStructures.proto\"\x8f\x01\n\x0e\x44ictVectorizer\x12;\n\rstringToIndex\x18\x01 \x01(\x0b\x32\".CoreML.Specification.StringVectorH\x00\x12\x39\n\x0cint64ToIndex\x18\x02 \x01(\x0b\x32!.CoreML.Specification.Int64VectorH\x00\x42\x05\n\x03MapB\x02H\x03P\x00\x62\x06proto3') _DICTVECTORIZER = DESCRIPTOR.message_types_by_name['DictVectorizer'] DictVectorizer = _reflection.GeneratedProtocolMessageType('DictVectorizer', (_message.Message,), { 'DESCRIPTOR' : _DICTVECTORIZER, '__module__' : 'DictVectorizer_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.DictVectorizer) }) _sym_db.RegisterMessage(DictVectorizer) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _DICTVECTORIZER._serialized_start=69 _DICTVECTORIZER._serialized_end=212 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/FeatureTypes_pb2.py0000644000000000000000000003111114672066616021736 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! 
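# ---- Illustrative usage (editor's note; not part of the generated FeatureTypes_pb2 module) ----
# A small sketch of the feature-type messages generated in this module: an image input, a
# multi-array input, and the top-level FeatureType oneof that wraps them. Assumes
# coremltools 8.0 is installed; the names and sizes are arbitrary.
from coremltools.proto import FeatureTypes_pb2

image_type = FeatureTypes_pb2.ImageFeatureType(
    width=224, height=224,
    colorSpace=FeatureTypes_pb2.ImageFeatureType.RGB)
array_type = FeatureTypes_pb2.ArrayFeatureType(
    shape=[1, 3, 224, 224],
    dataType=FeatureTypes_pb2.ArrayFeatureType.FLOAT32)
feature = FeatureTypes_pb2.FeatureType(imageType=image_type)
assert feature.WhichOneof("Type") == "imageType"   # the "Type" oneof records which case is set
# -------------------------------------------------------------------------------------------------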
# source: FeatureTypes.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x12\x46\x65\x61tureTypes.proto\x12\x14\x43oreML.Specification\"\x12\n\x10Int64FeatureType\"\x13\n\x11\x44oubleFeatureType\"\x13\n\x11StringFeatureType\"3\n\tSizeRange\x12\x12\n\nlowerBound\x18\x01 \x01(\x04\x12\x12\n\nupperBound\x18\x02 \x01(\x03\"\x95\x05\n\x10ImageFeatureType\x12\r\n\x05width\x18\x01 \x01(\x03\x12\x0e\n\x06height\x18\x02 \x01(\x03\x12V\n\x0f\x65numeratedSizes\x18\x15 \x01(\x0b\x32;.CoreML.Specification.ImageFeatureType.EnumeratedImageSizesH\x00\x12O\n\x0eimageSizeRange\x18\x1f \x01(\x0b\x32\x35.CoreML.Specification.ImageFeatureType.ImageSizeRangeH\x00\x12\x45\n\ncolorSpace\x18\x03 \x01(\x0e\x32\x31.CoreML.Specification.ImageFeatureType.ColorSpace\x1a*\n\tImageSize\x12\r\n\x05width\x18\x01 \x01(\x04\x12\x0e\n\x06height\x18\x02 \x01(\x04\x1aW\n\x14\x45numeratedImageSizes\x12?\n\x05sizes\x18\x01 \x03(\x0b\x32\x30.CoreML.Specification.ImageFeatureType.ImageSize\x1a{\n\x0eImageSizeRange\x12\x33\n\nwidthRange\x18\x01 \x01(\x0b\x32\x1f.CoreML.Specification.SizeRange\x12\x34\n\x0bheightRange\x18\x02 \x01(\x0b\x32\x1f.CoreML.Specification.SizeRange\"]\n\nColorSpace\x12\x17\n\x13INVALID_COLOR_SPACE\x10\x00\x12\r\n\tGRAYSCALE\x10\n\x12\x07\n\x03RGB\x10\x14\x12\x07\n\x03\x42GR\x10\x1e\x12\x15\n\x11GRAYSCALE_FLOAT16\x10(B\x11\n\x0fSizeFlexibility\"\x9d\x05\n\x10\x41rrayFeatureType\x12\r\n\x05shape\x18\x01 \x03(\x03\x12\x46\n\x08\x64\x61taType\x18\x02 \x01(\x0e\x32\x34.CoreML.Specification.ArrayFeatureType.ArrayDataType\x12S\n\x10\x65numeratedShapes\x18\x15 \x01(\x0b\x32\x37.CoreML.Specification.ArrayFeatureType.EnumeratedShapesH\x00\x12G\n\nshapeRange\x18\x1f \x01(\x0b\x32\x31.CoreML.Specification.ArrayFeatureType.ShapeRangeH\x00\x12\x19\n\x0fintDefaultValue\x18) \x01(\x05H\x01\x12\x1b\n\x11\x66loatDefaultValue\x18\x33 \x01(\x02H\x01\x12\x1c\n\x12\x64oubleDefaultValue\x18= \x01(\x01H\x01\x1a\x16\n\x05Shape\x12\r\n\x05shape\x18\x01 \x03(\x03\x1aP\n\x10\x45numeratedShapes\x12<\n\x06shapes\x18\x01 \x03(\x0b\x32,.CoreML.Specification.ArrayFeatureType.Shape\x1a\x41\n\nShapeRange\x12\x33\n\nsizeRanges\x18\x01 \x03(\x0b\x32\x1f.CoreML.Specification.SizeRange\"e\n\rArrayDataType\x12\x1b\n\x17INVALID_ARRAY_DATA_TYPE\x10\x00\x12\r\n\x07\x46LOAT32\x10\xa0\x80\x04\x12\x0c\n\x06\x44OUBLE\x10\xc0\x80\x04\x12\x0b\n\x05INT32\x10\xa0\x80\x08\x12\r\n\x07\x46LOAT16\x10\x90\x80\x04\x42\x12\n\x10ShapeFlexibilityB\x16\n\x14\x64\x65\x66\x61ultOptionalValue\"\xa4\x01\n\x15\x44ictionaryFeatureType\x12>\n\x0cint64KeyType\x18\x01 \x01(\x0b\x32&.CoreML.Specification.Int64FeatureTypeH\x00\x12@\n\rstringKeyType\x18\x02 \x01(\x0b\x32\'.CoreML.Specification.StringFeatureTypeH\x00\x42\t\n\x07KeyType\"\xcd\x01\n\x13SequenceFeatureType\x12;\n\tint64Type\x18\x01 \x01(\x0b\x32&.CoreML.Specification.Int64FeatureTypeH\x00\x12=\n\nstringType\x18\x03 \x01(\x0b\x32\'.CoreML.Specification.StringFeatureTypeH\x00\x12\x32\n\tsizeRange\x18\x65 \x01(\x0b\x32\x1f.CoreML.Specification.SizeRangeB\x06\n\x04Type\"W\n\x10StateFeatureType\x12;\n\tarrayType\x18\x01 
\x01(\x0b\x32&.CoreML.Specification.ArrayFeatureTypeH\x00\x42\x06\n\x04Type\"\xab\x04\n\x0b\x46\x65\x61tureType\x12;\n\tint64Type\x18\x01 \x01(\x0b\x32&.CoreML.Specification.Int64FeatureTypeH\x00\x12=\n\ndoubleType\x18\x02 \x01(\x0b\x32\'.CoreML.Specification.DoubleFeatureTypeH\x00\x12=\n\nstringType\x18\x03 \x01(\x0b\x32\'.CoreML.Specification.StringFeatureTypeH\x00\x12;\n\timageType\x18\x04 \x01(\x0b\x32&.CoreML.Specification.ImageFeatureTypeH\x00\x12@\n\x0emultiArrayType\x18\x05 \x01(\x0b\x32&.CoreML.Specification.ArrayFeatureTypeH\x00\x12\x45\n\x0e\x64ictionaryType\x18\x06 \x01(\x0b\x32+.CoreML.Specification.DictionaryFeatureTypeH\x00\x12\x41\n\x0csequenceType\x18\x07 \x01(\x0b\x32).CoreML.Specification.SequenceFeatureTypeH\x00\x12;\n\tstateType\x18\x08 \x01(\x0b\x32&.CoreML.Specification.StateFeatureTypeH\x00\x12\x13\n\nisOptional\x18\xe8\x07 \x01(\x08\x42\x06\n\x04TypeB\x02H\x03\x62\x06proto3') _INT64FEATURETYPE = DESCRIPTOR.message_types_by_name['Int64FeatureType'] _DOUBLEFEATURETYPE = DESCRIPTOR.message_types_by_name['DoubleFeatureType'] _STRINGFEATURETYPE = DESCRIPTOR.message_types_by_name['StringFeatureType'] _SIZERANGE = DESCRIPTOR.message_types_by_name['SizeRange'] _IMAGEFEATURETYPE = DESCRIPTOR.message_types_by_name['ImageFeatureType'] _IMAGEFEATURETYPE_IMAGESIZE = _IMAGEFEATURETYPE.nested_types_by_name['ImageSize'] _IMAGEFEATURETYPE_ENUMERATEDIMAGESIZES = _IMAGEFEATURETYPE.nested_types_by_name['EnumeratedImageSizes'] _IMAGEFEATURETYPE_IMAGESIZERANGE = _IMAGEFEATURETYPE.nested_types_by_name['ImageSizeRange'] _ARRAYFEATURETYPE = DESCRIPTOR.message_types_by_name['ArrayFeatureType'] _ARRAYFEATURETYPE_SHAPE = _ARRAYFEATURETYPE.nested_types_by_name['Shape'] _ARRAYFEATURETYPE_ENUMERATEDSHAPES = _ARRAYFEATURETYPE.nested_types_by_name['EnumeratedShapes'] _ARRAYFEATURETYPE_SHAPERANGE = _ARRAYFEATURETYPE.nested_types_by_name['ShapeRange'] _DICTIONARYFEATURETYPE = DESCRIPTOR.message_types_by_name['DictionaryFeatureType'] _SEQUENCEFEATURETYPE = DESCRIPTOR.message_types_by_name['SequenceFeatureType'] _STATEFEATURETYPE = DESCRIPTOR.message_types_by_name['StateFeatureType'] _FEATURETYPE = DESCRIPTOR.message_types_by_name['FeatureType'] _IMAGEFEATURETYPE_COLORSPACE = _IMAGEFEATURETYPE.enum_types_by_name['ColorSpace'] _ARRAYFEATURETYPE_ARRAYDATATYPE = _ARRAYFEATURETYPE.enum_types_by_name['ArrayDataType'] Int64FeatureType = _reflection.GeneratedProtocolMessageType('Int64FeatureType', (_message.Message,), { 'DESCRIPTOR' : _INT64FEATURETYPE, '__module__' : 'FeatureTypes_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Int64FeatureType) }) _sym_db.RegisterMessage(Int64FeatureType) DoubleFeatureType = _reflection.GeneratedProtocolMessageType('DoubleFeatureType', (_message.Message,), { 'DESCRIPTOR' : _DOUBLEFEATURETYPE, '__module__' : 'FeatureTypes_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.DoubleFeatureType) }) _sym_db.RegisterMessage(DoubleFeatureType) StringFeatureType = _reflection.GeneratedProtocolMessageType('StringFeatureType', (_message.Message,), { 'DESCRIPTOR' : _STRINGFEATURETYPE, '__module__' : 'FeatureTypes_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.StringFeatureType) }) _sym_db.RegisterMessage(StringFeatureType) SizeRange = _reflection.GeneratedProtocolMessageType('SizeRange', (_message.Message,), { 'DESCRIPTOR' : _SIZERANGE, '__module__' : 'FeatureTypes_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SizeRange) }) _sym_db.RegisterMessage(SizeRange) ImageFeatureType = 
_reflection.GeneratedProtocolMessageType('ImageFeatureType', (_message.Message,), { 'ImageSize' : _reflection.GeneratedProtocolMessageType('ImageSize', (_message.Message,), { 'DESCRIPTOR' : _IMAGEFEATURETYPE_IMAGESIZE, '__module__' : 'FeatureTypes_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ImageFeatureType.ImageSize) }) , 'EnumeratedImageSizes' : _reflection.GeneratedProtocolMessageType('EnumeratedImageSizes', (_message.Message,), { 'DESCRIPTOR' : _IMAGEFEATURETYPE_ENUMERATEDIMAGESIZES, '__module__' : 'FeatureTypes_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ImageFeatureType.EnumeratedImageSizes) }) , 'ImageSizeRange' : _reflection.GeneratedProtocolMessageType('ImageSizeRange', (_message.Message,), { 'DESCRIPTOR' : _IMAGEFEATURETYPE_IMAGESIZERANGE, '__module__' : 'FeatureTypes_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ImageFeatureType.ImageSizeRange) }) , 'DESCRIPTOR' : _IMAGEFEATURETYPE, '__module__' : 'FeatureTypes_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ImageFeatureType) }) _sym_db.RegisterMessage(ImageFeatureType) _sym_db.RegisterMessage(ImageFeatureType.ImageSize) _sym_db.RegisterMessage(ImageFeatureType.EnumeratedImageSizes) _sym_db.RegisterMessage(ImageFeatureType.ImageSizeRange) ArrayFeatureType = _reflection.GeneratedProtocolMessageType('ArrayFeatureType', (_message.Message,), { 'Shape' : _reflection.GeneratedProtocolMessageType('Shape', (_message.Message,), { 'DESCRIPTOR' : _ARRAYFEATURETYPE_SHAPE, '__module__' : 'FeatureTypes_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ArrayFeatureType.Shape) }) , 'EnumeratedShapes' : _reflection.GeneratedProtocolMessageType('EnumeratedShapes', (_message.Message,), { 'DESCRIPTOR' : _ARRAYFEATURETYPE_ENUMERATEDSHAPES, '__module__' : 'FeatureTypes_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ArrayFeatureType.EnumeratedShapes) }) , 'ShapeRange' : _reflection.GeneratedProtocolMessageType('ShapeRange', (_message.Message,), { 'DESCRIPTOR' : _ARRAYFEATURETYPE_SHAPERANGE, '__module__' : 'FeatureTypes_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ArrayFeatureType.ShapeRange) }) , 'DESCRIPTOR' : _ARRAYFEATURETYPE, '__module__' : 'FeatureTypes_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ArrayFeatureType) }) _sym_db.RegisterMessage(ArrayFeatureType) _sym_db.RegisterMessage(ArrayFeatureType.Shape) _sym_db.RegisterMessage(ArrayFeatureType.EnumeratedShapes) _sym_db.RegisterMessage(ArrayFeatureType.ShapeRange) DictionaryFeatureType = _reflection.GeneratedProtocolMessageType('DictionaryFeatureType', (_message.Message,), { 'DESCRIPTOR' : _DICTIONARYFEATURETYPE, '__module__' : 'FeatureTypes_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.DictionaryFeatureType) }) _sym_db.RegisterMessage(DictionaryFeatureType) SequenceFeatureType = _reflection.GeneratedProtocolMessageType('SequenceFeatureType', (_message.Message,), { 'DESCRIPTOR' : _SEQUENCEFEATURETYPE, '__module__' : 'FeatureTypes_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SequenceFeatureType) }) _sym_db.RegisterMessage(SequenceFeatureType) StateFeatureType = _reflection.GeneratedProtocolMessageType('StateFeatureType', (_message.Message,), { 'DESCRIPTOR' : _STATEFEATURETYPE, '__module__' : 'FeatureTypes_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.StateFeatureType) }) _sym_db.RegisterMessage(StateFeatureType) FeatureType = _reflection.GeneratedProtocolMessageType('FeatureType', 
(_message.Message,), { 'DESCRIPTOR' : _FEATURETYPE, '__module__' : 'FeatureTypes_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.FeatureType) }) _sym_db.RegisterMessage(FeatureType) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _INT64FEATURETYPE._serialized_start=44 _INT64FEATURETYPE._serialized_end=62 _DOUBLEFEATURETYPE._serialized_start=64 _DOUBLEFEATURETYPE._serialized_end=83 _STRINGFEATURETYPE._serialized_start=85 _STRINGFEATURETYPE._serialized_end=104 _SIZERANGE._serialized_start=106 _SIZERANGE._serialized_end=157 _IMAGEFEATURETYPE._serialized_start=160 _IMAGEFEATURETYPE._serialized_end=821 _IMAGEFEATURETYPE_IMAGESIZE._serialized_start=451 _IMAGEFEATURETYPE_IMAGESIZE._serialized_end=493 _IMAGEFEATURETYPE_ENUMERATEDIMAGESIZES._serialized_start=495 _IMAGEFEATURETYPE_ENUMERATEDIMAGESIZES._serialized_end=582 _IMAGEFEATURETYPE_IMAGESIZERANGE._serialized_start=584 _IMAGEFEATURETYPE_IMAGESIZERANGE._serialized_end=707 _IMAGEFEATURETYPE_COLORSPACE._serialized_start=709 _IMAGEFEATURETYPE_COLORSPACE._serialized_end=802 _ARRAYFEATURETYPE._serialized_start=824 _ARRAYFEATURETYPE._serialized_end=1493 _ARRAYFEATURETYPE_SHAPE._serialized_start=1175 _ARRAYFEATURETYPE_SHAPE._serialized_end=1197 _ARRAYFEATURETYPE_ENUMERATEDSHAPES._serialized_start=1199 _ARRAYFEATURETYPE_ENUMERATEDSHAPES._serialized_end=1279 _ARRAYFEATURETYPE_SHAPERANGE._serialized_start=1281 _ARRAYFEATURETYPE_SHAPERANGE._serialized_end=1346 _ARRAYFEATURETYPE_ARRAYDATATYPE._serialized_start=1348 _ARRAYFEATURETYPE_ARRAYDATATYPE._serialized_end=1449 _DICTIONARYFEATURETYPE._serialized_start=1496 _DICTIONARYFEATURETYPE._serialized_end=1660 _SEQUENCEFEATURETYPE._serialized_start=1663 _SEQUENCEFEATURETYPE._serialized_end=1868 _STATEFEATURETYPE._serialized_start=1870 _STATEFEATURETYPE._serialized_end=1957 _FEATURETYPE._serialized_start=1960 _FEATURETYPE._serialized_end=2515 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/FeatureVectorizer_pb2.py0000644000000000000000000000407414672066616022776 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! 
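# ---- Illustrative usage (editor's note; not part of the generated FeatureVectorizer_pb2 module) ----
# A sketch of filling in the FeatureVectorizer message generated in this module, which
# concatenates named input columns into one vector. Assumes coremltools 8.0 is installed;
# the column names and dimensions are arbitrary.
from coremltools.proto import FeatureVectorizer_pb2

vectorizer = FeatureVectorizer_pb2.FeatureVectorizer()
column = vectorizer.inputList.add()              # repeated InputColumn field
column.inputColumn = "age"
column.inputDimensions = 1
vectorizer.inputList.add(inputColumn="scores", inputDimensions=10)
# -----------------------------------------------------------------------------------------------------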
# source: FeatureVectorizer.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x17\x46\x65\x61tureVectorizer.proto\x12\x14\x43oreML.Specification\"\x98\x01\n\x11\x46\x65\x61tureVectorizer\x12\x46\n\tinputList\x18\x01 \x03(\x0b\x32\x33.CoreML.Specification.FeatureVectorizer.InputColumn\x1a;\n\x0bInputColumn\x12\x13\n\x0binputColumn\x18\x01 \x01(\t\x12\x17\n\x0finputDimensions\x18\x02 \x01(\x04\x42\x02H\x03\x62\x06proto3') _FEATUREVECTORIZER = DESCRIPTOR.message_types_by_name['FeatureVectorizer'] _FEATUREVECTORIZER_INPUTCOLUMN = _FEATUREVECTORIZER.nested_types_by_name['InputColumn'] FeatureVectorizer = _reflection.GeneratedProtocolMessageType('FeatureVectorizer', (_message.Message,), { 'InputColumn' : _reflection.GeneratedProtocolMessageType('InputColumn', (_message.Message,), { 'DESCRIPTOR' : _FEATUREVECTORIZER_INPUTCOLUMN, '__module__' : 'FeatureVectorizer_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.FeatureVectorizer.InputColumn) }) , 'DESCRIPTOR' : _FEATUREVECTORIZER, '__module__' : 'FeatureVectorizer_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.FeatureVectorizer) }) _sym_db.RegisterMessage(FeatureVectorizer) _sym_db.RegisterMessage(FeatureVectorizer.InputColumn) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _FEATUREVECTORIZER._serialized_start=50 _FEATUREVECTORIZER._serialized_end=202 _FEATUREVECTORIZER_INPUTCOLUMN._serialized_start=143 _FEATUREVECTORIZER_INPUTCOLUMN._serialized_end=202 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/GLMClassifier_pb2.py0000644000000000000000000000626714672066616021760 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: GLMClassifier.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() from . 
import DataStructures_pb2 as DataStructures__pb2 try: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes_pb2 from .DataStructures_pb2 import * DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x13GLMClassifier.proto\x12\x14\x43oreML.Specification\x1a\x14\x44\x61taStructures.proto\"\x9c\x04\n\rGLMClassifier\x12@\n\x07weights\x18\x01 \x03(\x0b\x32/.CoreML.Specification.GLMClassifier.DoubleArray\x12\x0e\n\x06offset\x18\x02 \x03(\x01\x12\\\n\x17postEvaluationTransform\x18\x03 \x01(\x0e\x32;.CoreML.Specification.GLMClassifier.PostEvaluationTransform\x12H\n\rclassEncoding\x18\x04 \x01(\x0e\x32\x31.CoreML.Specification.GLMClassifier.ClassEncoding\x12?\n\x11stringClassLabels\x18\x64 \x01(\x0b\x32\".CoreML.Specification.StringVectorH\x00\x12=\n\x10int64ClassLabels\x18\x65 \x01(\x0b\x32!.CoreML.Specification.Int64VectorH\x00\x1a\x1c\n\x0b\x44oubleArray\x12\r\n\x05value\x18\x01 \x03(\x01\"0\n\x17PostEvaluationTransform\x12\t\n\x05Logit\x10\x00\x12\n\n\x06Probit\x10\x01\"2\n\rClassEncoding\x12\x12\n\x0eReferenceClass\x10\x00\x12\r\n\tOneVsRest\x10\x01\x42\r\n\x0b\x43lassLabelsB\x02H\x03P\x00\x62\x06proto3') _GLMCLASSIFIER = DESCRIPTOR.message_types_by_name['GLMClassifier'] _GLMCLASSIFIER_DOUBLEARRAY = _GLMCLASSIFIER.nested_types_by_name['DoubleArray'] _GLMCLASSIFIER_POSTEVALUATIONTRANSFORM = _GLMCLASSIFIER.enum_types_by_name['PostEvaluationTransform'] _GLMCLASSIFIER_CLASSENCODING = _GLMCLASSIFIER.enum_types_by_name['ClassEncoding'] GLMClassifier = _reflection.GeneratedProtocolMessageType('GLMClassifier', (_message.Message,), { 'DoubleArray' : _reflection.GeneratedProtocolMessageType('DoubleArray', (_message.Message,), { 'DESCRIPTOR' : _GLMCLASSIFIER_DOUBLEARRAY, '__module__' : 'GLMClassifier_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.GLMClassifier.DoubleArray) }) , 'DESCRIPTOR' : _GLMCLASSIFIER, '__module__' : 'GLMClassifier_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.GLMClassifier) }) _sym_db.RegisterMessage(GLMClassifier) _sym_db.RegisterMessage(GLMClassifier.DoubleArray) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _GLMCLASSIFIER._serialized_start=68 _GLMCLASSIFIER._serialized_end=608 _GLMCLASSIFIER_DOUBLEARRAY._serialized_start=463 _GLMCLASSIFIER_DOUBLEARRAY._serialized_end=491 _GLMCLASSIFIER_POSTEVALUATIONTRANSFORM._serialized_start=493 _GLMCLASSIFIER_POSTEVALUATIONTRANSFORM._serialized_end=541 _GLMCLASSIFIER_CLASSENCODING._serialized_start=543 _GLMCLASSIFIER_CLASSENCODING._serialized_end=593 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/GLMRegressor_pb2.py0000644000000000000000000000456214672066616021643 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! 
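# ---- Illustrative usage (editor's note; not part of the generated GLMRegressor_pb2 module) ----
# A sketch of populating the GLMRegressor message generated in this module: one DoubleArray
# of weights per target, a matching offset, and a post-evaluation transform. Assumes
# coremltools 8.0 is installed; the coefficients are arbitrary.
from coremltools.proto import GLMRegressor_pb2

glm = GLMRegressor_pb2.GLMRegressor()
weights = glm.weights.add()                      # one DoubleArray per regression target
weights.value.extend([0.5, -1.25, 3.0])
glm.offset.append(0.1)
glm.postEvaluationTransform = GLMRegressor_pb2.GLMRegressor.NoTransform
# -------------------------------------------------------------------------------------------------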
# source: GLMRegressor.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x12GLMRegressor.proto\x12\x14\x43oreML.Specification\"\x9d\x02\n\x0cGLMRegressor\x12?\n\x07weights\x18\x01 \x03(\x0b\x32..CoreML.Specification.GLMRegressor.DoubleArray\x12\x0e\n\x06offset\x18\x02 \x03(\x01\x12[\n\x17postEvaluationTransform\x18\x03 \x01(\x0e\x32:.CoreML.Specification.GLMRegressor.PostEvaluationTransform\x1a\x1c\n\x0b\x44oubleArray\x12\r\n\x05value\x18\x01 \x03(\x01\"A\n\x17PostEvaluationTransform\x12\x0f\n\x0bNoTransform\x10\x00\x12\t\n\x05Logit\x10\x01\x12\n\n\x06Probit\x10\x02\x42\x02H\x03\x62\x06proto3') _GLMREGRESSOR = DESCRIPTOR.message_types_by_name['GLMRegressor'] _GLMREGRESSOR_DOUBLEARRAY = _GLMREGRESSOR.nested_types_by_name['DoubleArray'] _GLMREGRESSOR_POSTEVALUATIONTRANSFORM = _GLMREGRESSOR.enum_types_by_name['PostEvaluationTransform'] GLMRegressor = _reflection.GeneratedProtocolMessageType('GLMRegressor', (_message.Message,), { 'DoubleArray' : _reflection.GeneratedProtocolMessageType('DoubleArray', (_message.Message,), { 'DESCRIPTOR' : _GLMREGRESSOR_DOUBLEARRAY, '__module__' : 'GLMRegressor_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.GLMRegressor.DoubleArray) }) , 'DESCRIPTOR' : _GLMREGRESSOR, '__module__' : 'GLMRegressor_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.GLMRegressor) }) _sym_db.RegisterMessage(GLMRegressor) _sym_db.RegisterMessage(GLMRegressor.DoubleArray) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _GLMREGRESSOR._serialized_start=45 _GLMREGRESSOR._serialized_end=330 _GLMREGRESSOR_DOUBLEARRAY._serialized_start=235 _GLMREGRESSOR_DOUBLEARRAY._serialized_end=263 _GLMREGRESSOR_POSTEVALUATIONTRANSFORM._serialized_start=265 _GLMREGRESSOR_POSTEVALUATIONTRANSFORM._serialized_end=330 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/Gazetteer_pb2.py0000644000000000000000000000334514672066616021260 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: Gazetteer.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() from . 
import DataStructures_pb2 as DataStructures__pb2 try: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes_pb2 from .DataStructures_pb2 import * DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x0fGazetteer.proto\x12!CoreML.Specification.CoreMLModels\x1a\x14\x44\x61taStructures.proto\"\x9c\x01\n\tGazetteer\x12\x10\n\x08revision\x18\x01 \x01(\r\x12\x10\n\x08language\x18\n \x01(\t\x12\x1a\n\x12modelParameterData\x18\x64 \x01(\x0c\x12@\n\x11stringClassLabels\x18\xc8\x01 \x01(\x0b\x32\".CoreML.Specification.StringVectorH\x00\x42\r\n\x0b\x43lassLabelsB\x02H\x03P\x00\x62\x06proto3') _GAZETTEER = DESCRIPTOR.message_types_by_name['Gazetteer'] Gazetteer = _reflection.GeneratedProtocolMessageType('Gazetteer', (_message.Message,), { 'DESCRIPTOR' : _GAZETTEER, '__module__' : 'Gazetteer_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CoreMLModels.Gazetteer) }) _sym_db.RegisterMessage(Gazetteer) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _GAZETTEER._serialized_start=77 _GAZETTEER._serialized_end=233 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/Identity_pb2.py0000644000000000000000000000226614672066616021120 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: Identity.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x0eIdentity.proto\x12\x14\x43oreML.Specification\"\n\n\x08IdentityB\x02H\x03\x62\x06proto3') _IDENTITY = DESCRIPTOR.message_types_by_name['Identity'] Identity = _reflection.GeneratedProtocolMessageType('Identity', (_message.Message,), { 'DESCRIPTOR' : _IDENTITY, '__module__' : 'Identity_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Identity) }) _sym_db.RegisterMessage(Identity) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _IDENTITY._serialized_start=40 _IDENTITY._serialized_end=50 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/Imputer_pb2.py0000644000000000000000000000426314672066616020753 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: Imputer.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() from . 
import DataStructures_pb2 as DataStructures__pb2 try: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes_pb2 from .DataStructures_pb2 import * DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\rImputer.proto\x12\x14\x43oreML.Specification\x1a\x14\x44\x61taStructures.proto\"\xf3\x03\n\x07Imputer\x12\x1c\n\x12imputedDoubleValue\x18\x01 \x01(\x01H\x00\x12\x1b\n\x11imputedInt64Value\x18\x02 \x01(\x03H\x00\x12\x1c\n\x12imputedStringValue\x18\x03 \x01(\tH\x00\x12@\n\x12imputedDoubleArray\x18\x04 \x01(\x0b\x32\".CoreML.Specification.DoubleVectorH\x00\x12>\n\x11imputedInt64Array\x18\x05 \x01(\x0b\x32!.CoreML.Specification.Int64VectorH\x00\x12J\n\x17imputedStringDictionary\x18\x06 \x01(\x0b\x32\'.CoreML.Specification.StringToDoubleMapH\x00\x12H\n\x16imputedInt64Dictionary\x18\x07 \x01(\x0b\x32&.CoreML.Specification.Int64ToDoubleMapH\x00\x12\x1c\n\x12replaceDoubleValue\x18\x0b \x01(\x01H\x01\x12\x1b\n\x11replaceInt64Value\x18\x0c \x01(\x03H\x01\x12\x1c\n\x12replaceStringValue\x18\r \x01(\tH\x01\x42\x0e\n\x0cImputedValueB\x0e\n\x0cReplaceValueB\x02H\x03P\x00\x62\x06proto3') _IMPUTER = DESCRIPTOR.message_types_by_name['Imputer'] Imputer = _reflection.GeneratedProtocolMessageType('Imputer', (_message.Message,), { 'DESCRIPTOR' : _IMPUTER, '__module__' : 'Imputer_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Imputer) }) _sym_db.RegisterMessage(Imputer) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _IMPUTER._serialized_start=62 _IMPUTER._serialized_end=561 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/ItemSimilarityRecommender_pb2.py0000644000000000000000000000752514672066616024460 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: ItemSimilarityRecommender.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() from . 
import DataStructures_pb2 as DataStructures__pb2 try: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes_pb2 from .DataStructures_pb2 import * DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x1fItemSimilarityRecommender.proto\x12\x14\x43oreML.Specification\x1a\x14\x44\x61taStructures.proto\"\xb2\x05\n\x19ItemSimilarityRecommender\x12Z\n\x14itemItemSimilarities\x18\x01 \x03(\x0b\x32<.CoreML.Specification.ItemSimilarityRecommender.SimilarItems\x12\x39\n\ritemStringIds\x18\x02 \x01(\x0b\x32\".CoreML.Specification.StringVector\x12\x37\n\x0citemInt64Ids\x18\x03 \x01(\x0b\x32!.CoreML.Specification.Int64Vector\x12\x1c\n\x14itemInputFeatureName\x18\n \x01(\t\x12*\n\"numRecommendationsInputFeatureName\x18\x0b \x01(\t\x12\'\n\x1fitemRestrictionInputFeatureName\x18\x0c \x01(\t\x12%\n\x1ditemExclusionInputFeatureName\x18\r \x01(\t\x12,\n$recommendedItemListOutputFeatureName\x18\x14 \x01(\t\x12-\n%recommendedItemScoreOutputFeatureName\x18\x15 \x01(\t\x1a\x38\n\rConnectedItem\x12\x0e\n\x06itemId\x18\x01 \x01(\x04\x12\x17\n\x0fsimilarityScore\x18\x02 \x01(\x01\x1a\x93\x01\n\x0cSimilarItems\x12\x0e\n\x06itemId\x18\x01 \x01(\x04\x12V\n\x0fsimilarItemList\x18\x02 \x03(\x0b\x32=.CoreML.Specification.ItemSimilarityRecommender.ConnectedItem\x12\x1b\n\x13itemScoreAdjustment\x18\x03 \x01(\x01\x42\x02H\x03P\x00\x62\x06proto3') _ITEMSIMILARITYRECOMMENDER = DESCRIPTOR.message_types_by_name['ItemSimilarityRecommender'] _ITEMSIMILARITYRECOMMENDER_CONNECTEDITEM = _ITEMSIMILARITYRECOMMENDER.nested_types_by_name['ConnectedItem'] _ITEMSIMILARITYRECOMMENDER_SIMILARITEMS = _ITEMSIMILARITYRECOMMENDER.nested_types_by_name['SimilarItems'] ItemSimilarityRecommender = _reflection.GeneratedProtocolMessageType('ItemSimilarityRecommender', (_message.Message,), { 'ConnectedItem' : _reflection.GeneratedProtocolMessageType('ConnectedItem', (_message.Message,), { 'DESCRIPTOR' : _ITEMSIMILARITYRECOMMENDER_CONNECTEDITEM, '__module__' : 'ItemSimilarityRecommender_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ItemSimilarityRecommender.ConnectedItem) }) , 'SimilarItems' : _reflection.GeneratedProtocolMessageType('SimilarItems', (_message.Message,), { 'DESCRIPTOR' : _ITEMSIMILARITYRECOMMENDER_SIMILARITEMS, '__module__' : 'ItemSimilarityRecommender_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ItemSimilarityRecommender.SimilarItems) }) , 'DESCRIPTOR' : _ITEMSIMILARITYRECOMMENDER, '__module__' : 'ItemSimilarityRecommender_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ItemSimilarityRecommender) }) _sym_db.RegisterMessage(ItemSimilarityRecommender) _sym_db.RegisterMessage(ItemSimilarityRecommender.ConnectedItem) _sym_db.RegisterMessage(ItemSimilarityRecommender.SimilarItems) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _ITEMSIMILARITYRECOMMENDER._serialized_start=80 _ITEMSIMILARITYRECOMMENDER._serialized_end=770 _ITEMSIMILARITYRECOMMENDER_CONNECTEDITEM._serialized_start=564 _ITEMSIMILARITYRECOMMENDER_CONNECTEDITEM._serialized_end=620 _ITEMSIMILARITYRECOMMENDER_SIMILARITEMS._serialized_start=623 _ITEMSIMILARITYRECOMMENDER_SIMILARITEMS._serialized_end=770 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/LinkedModel_pb2.py0000644000000000000000000000456014672066616021515 0ustar00rootroot# -*- coding: utf-8 -*- 
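# ---- Illustrative usage (editor's note; not part of the generated LinkedModel_pb2 module) ----
# A sketch of the ItemSimilarityRecommender message generated just above: each SimilarItems
# entry links one item id to its most similar items and their scores. Assumes coremltools 8.0
# is installed; the ids and scores are arbitrary.
from coremltools.proto import ItemSimilarityRecommender_pb2

recommender = ItemSimilarityRecommender_pb2.ItemSimilarityRecommender()
similar = recommender.itemItemSimilarities.add(itemId=0, itemScoreAdjustment=0.0)
similar.similarItemList.add(itemId=1, similarityScore=0.9)
similar.similarItemList.add(itemId=2, similarityScore=0.4)
recommender.itemStringIds.vector.extend(["item_a", "item_b", "item_c"])
# -------------------------------------------------------------------------------------------------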
# Generated by the protocol buffer compiler. DO NOT EDIT! # source: LinkedModel.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() from . import Parameters_pb2 as Parameters__pb2 try: DataStructures__pb2 = Parameters__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = Parameters__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = Parameters__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = Parameters__pb2.FeatureTypes_pb2 from .Parameters_pb2 import * DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x11LinkedModel.proto\x12\x14\x43oreML.Specification\x1a\x10Parameters.proto\"[\n\x0bLinkedModel\x12@\n\x0flinkedModelFile\x18\x01 \x01(\x0b\x32%.CoreML.Specification.LinkedModelFileH\x00\x42\n\n\x08LinkType\"\x9b\x01\n\x0fLinkedModelFile\x12\x42\n\x13linkedModelFileName\x18\x01 \x01(\x0b\x32%.CoreML.Specification.StringParameter\x12\x44\n\x15linkedModelSearchPath\x18\x02 \x01(\x0b\x32%.CoreML.Specification.StringParameterB\x02H\x03P\x00\x62\x06proto3') _LINKEDMODEL = DESCRIPTOR.message_types_by_name['LinkedModel'] _LINKEDMODELFILE = DESCRIPTOR.message_types_by_name['LinkedModelFile'] LinkedModel = _reflection.GeneratedProtocolMessageType('LinkedModel', (_message.Message,), { 'DESCRIPTOR' : _LINKEDMODEL, '__module__' : 'LinkedModel_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LinkedModel) }) _sym_db.RegisterMessage(LinkedModel) LinkedModelFile = _reflection.GeneratedProtocolMessageType('LinkedModelFile', (_message.Message,), { 'DESCRIPTOR' : _LINKEDMODELFILE, '__module__' : 'LinkedModel_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LinkedModelFile) }) _sym_db.RegisterMessage(LinkedModelFile) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _LINKEDMODEL._serialized_start=61 _LINKEDMODEL._serialized_end=152 _LINKEDMODELFILE._serialized_start=155 _LINKEDMODELFILE._serialized_end=310 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/MIL_pb2.py0000644000000000000000000007100614672066616017746 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! 
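# ---- Illustrative usage (editor's note; not part of the generated MIL_pb2 module) ----
# A small sketch of the MIL program messages generated in this module: a float32 tensor
# type with one constant dimension, an immediate tensor value, and an empty Program with a
# "main" function entry. Assumes coremltools 8.0 is installed; the opset string and values
# shown here are placeholders, not a complete, loadable MIL program.
from coremltools.proto import MIL_pb2

tensor_type = MIL_pb2.TensorType(dataType=MIL_pb2.FLOAT32, rank=1)
tensor_type.dimensions.add().constant.size = 3            # one constant dimension of size 3
value = MIL_pb2.Value(type=MIL_pb2.ValueType(tensorType=tensor_type))
value.immediateValue.tensor.floats.values.extend([1.0, 2.0, 3.0])
program = MIL_pb2.Program(version=1)
program.functions["main"].opset = "CoreML8"               # map<string, Function> entry
# -------------------------------------------------------------------------------------------------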
# source: MIL.proto """Generated protocol buffer code.""" from google.protobuf.internal import enum_type_wrapper from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\tMIL.proto\x12\x1c\x43oreML.Specification.MILSpec\"\xf3\x02\n\x07Program\x12\x0f\n\x07version\x18\x01 \x01(\x03\x12G\n\tfunctions\x18\x02 \x03(\x0b\x32\x34.CoreML.Specification.MILSpec.Program.FunctionsEntry\x12\x11\n\tdocString\x18\x03 \x01(\t\x12I\n\nattributes\x18\x04 \x03(\x0b\x32\x35.CoreML.Specification.MILSpec.Program.AttributesEntry\x1aX\n\x0e\x46unctionsEntry\x12\x0b\n\x03key\x18\x01 \x01(\t\x12\x35\n\x05value\x18\x02 \x01(\x0b\x32&.CoreML.Specification.MILSpec.Function:\x02\x38\x01\x1aV\n\x0f\x41ttributesEntry\x12\x0b\n\x03key\x18\x01 \x01(\t\x12\x32\n\x05value\x18\x02 \x01(\x0b\x32#.CoreML.Specification.MILSpec.Value:\x02\x38\x01\"\xbe\x03\n\x08\x46unction\x12<\n\x06inputs\x18\x01 \x03(\x0b\x32,.CoreML.Specification.MILSpec.NamedValueType\x12\r\n\x05opset\x18\x02 \x01(\t\x12_\n\x15\x62lock_specializations\x18\x03 \x03(\x0b\x32@.CoreML.Specification.MILSpec.Function.BlockSpecializationsEntry\x12J\n\nattributes\x18\x04 \x03(\x0b\x32\x36.CoreML.Specification.MILSpec.Function.AttributesEntry\x1a`\n\x19\x42lockSpecializationsEntry\x12\x0b\n\x03key\x18\x01 \x01(\t\x12\x32\n\x05value\x18\x02 \x01(\x0b\x32#.CoreML.Specification.MILSpec.Block:\x02\x38\x01\x1aV\n\x0f\x41ttributesEntry\x12\x0b\n\x03key\x18\x01 \x01(\t\x12\x32\n\x05value\x18\x02 \x01(\x0b\x32#.CoreML.Specification.MILSpec.Value:\x02\x38\x01\"\xb4\x02\n\x05\x42lock\x12<\n\x06inputs\x18\x01 \x03(\x0b\x32,.CoreML.Specification.MILSpec.NamedValueType\x12\x0f\n\x07outputs\x18\x02 \x03(\t\x12;\n\noperations\x18\x03 \x03(\x0b\x32\'.CoreML.Specification.MILSpec.Operation\x12G\n\nattributes\x18\x04 \x03(\x0b\x32\x33.CoreML.Specification.MILSpec.Block.AttributesEntry\x1aV\n\x0f\x41ttributesEntry\x12\x0b\n\x03key\x18\x01 \x01(\t\x12\x32\n\x05value\x18\x02 \x01(\x0b\x32#.CoreML.Specification.MILSpec.Value:\x02\x38\x01\"\xa9\x01\n\x08\x41rgument\x12\x41\n\targuments\x18\x01 \x03(\x0b\x32..CoreML.Specification.MILSpec.Argument.Binding\x1aZ\n\x07\x42inding\x12\x0e\n\x04name\x18\x01 \x01(\tH\x00\x12\x34\n\x05value\x18\x02 \x01(\x0b\x32#.CoreML.Specification.MILSpec.ValueH\x00\x42\t\n\x07\x62inding\"\xce\x03\n\tOperation\x12\x0c\n\x04type\x18\x01 \x01(\t\x12\x43\n\x06inputs\x18\x02 \x03(\x0b\x32\x33.CoreML.Specification.MILSpec.Operation.InputsEntry\x12=\n\x07outputs\x18\x03 \x03(\x0b\x32,.CoreML.Specification.MILSpec.NamedValueType\x12\x33\n\x06\x62locks\x18\x04 \x03(\x0b\x32#.CoreML.Specification.MILSpec.Block\x12K\n\nattributes\x18\x05 \x03(\x0b\x32\x37.CoreML.Specification.MILSpec.Operation.AttributesEntry\x1aU\n\x0bInputsEntry\x12\x0b\n\x03key\x18\x01 \x01(\t\x12\x35\n\x05value\x18\x02 \x01(\x0b\x32&.CoreML.Specification.MILSpec.Argument:\x02\x38\x01\x1aV\n\x0f\x41ttributesEntry\x12\x0b\n\x03key\x18\x01 \x01(\t\x12\x32\n\x05value\x18\x02 \x01(\x0b\x32#.CoreML.Specification.MILSpec.Value:\x02\x38\x01\"U\n\x0eNamedValueType\x12\x0c\n\x04name\x18\x01 \x01(\t\x12\x35\n\x04type\x18\x02 \x01(\x0b\x32\'.CoreML.Specification.MILSpec.ValueType\"\xd3\x02\n\tValueType\x12>\n\ntensorType\x18\x01 
\x01(\x0b\x32(.CoreML.Specification.MILSpec.TensorTypeH\x00\x12:\n\x08listType\x18\x02 \x01(\x0b\x32&.CoreML.Specification.MILSpec.ListTypeH\x00\x12<\n\ttupleType\x18\x03 \x01(\x0b\x32\'.CoreML.Specification.MILSpec.TupleTypeH\x00\x12\x46\n\x0e\x64ictionaryType\x18\x04 \x01(\x0b\x32,.CoreML.Specification.MILSpec.DictionaryTypeH\x00\x12<\n\tstateType\x18\x05 \x01(\x0b\x32\'.CoreML.Specification.MILSpec.StateTypeH\x00\x42\x06\n\x04type\"\xb7\x02\n\nTensorType\x12\x38\n\x08\x64\x61taType\x18\x01 \x01(\x0e\x32&.CoreML.Specification.MILSpec.DataType\x12\x0c\n\x04rank\x18\x02 \x01(\x03\x12;\n\ndimensions\x18\x03 \x03(\x0b\x32\'.CoreML.Specification.MILSpec.Dimension\x12L\n\nattributes\x18\x04 \x03(\x0b\x32\x38.CoreML.Specification.MILSpec.TensorType.AttributesEntry\x1aV\n\x0f\x41ttributesEntry\x12\x0b\n\x03key\x18\x01 \x01(\t\x12\x32\n\x05value\x18\x02 \x01(\x0b\x32#.CoreML.Specification.MILSpec.Value:\x02\x38\x01\"C\n\tTupleType\x12\x36\n\x05types\x18\x01 \x03(\x0b\x32\'.CoreML.Specification.MILSpec.ValueType\"z\n\x08ListType\x12\x35\n\x04type\x18\x01 \x01(\x0b\x32\'.CoreML.Specification.MILSpec.ValueType\x12\x37\n\x06length\x18\x02 \x01(\x0b\x32\'.CoreML.Specification.MILSpec.Dimension\"\x86\x01\n\x0e\x44ictionaryType\x12\x38\n\x07keyType\x18\x01 \x01(\x0b\x32\'.CoreML.Specification.MILSpec.ValueType\x12:\n\tvalueType\x18\x02 \x01(\x0b\x32\'.CoreML.Specification.MILSpec.ValueType\"I\n\tStateType\x12<\n\x0bwrappedType\x18\x01 \x01(\x0b\x32\'.CoreML.Specification.MILSpec.ValueType\"\xfd\x01\n\tDimension\x12M\n\x08\x63onstant\x18\x01 \x01(\x0b\x32\x39.CoreML.Specification.MILSpec.Dimension.ConstantDimensionH\x00\x12K\n\x07unknown\x18\x02 \x01(\x0b\x32\x38.CoreML.Specification.MILSpec.Dimension.UnknownDimensionH\x00\x1a!\n\x11\x43onstantDimension\x12\x0c\n\x04size\x18\x01 \x01(\x04\x1a$\n\x10UnknownDimension\x12\x10\n\x08variadic\x18\x01 \x01(\x08\x42\x0b\n\tdimension\"\xb9\x04\n\x05Value\x12\x11\n\tdocString\x18\x01 \x01(\t\x12\x35\n\x04type\x18\x02 \x01(\x0b\x32\'.CoreML.Specification.MILSpec.ValueType\x12L\n\x0eimmediateValue\x18\x03 \x01(\x0b\x32\x32.CoreML.Specification.MILSpec.Value.ImmediateValueH\x00\x12J\n\rblobFileValue\x18\x05 \x01(\x0b\x32\x31.CoreML.Specification.MILSpec.Value.BlobFileValueH\x00\x1a\x8f\x02\n\x0eImmediateValue\x12;\n\x06tensor\x18\x01 \x01(\x0b\x32).CoreML.Specification.MILSpec.TensorValueH\x00\x12\x39\n\x05tuple\x18\x02 \x01(\x0b\x32(.CoreML.Specification.MILSpec.TupleValueH\x00\x12\x37\n\x04list\x18\x03 \x01(\x0b\x32\'.CoreML.Specification.MILSpec.ListValueH\x00\x12\x43\n\ndictionary\x18\x04 \x01(\x0b\x32-.CoreML.Specification.MILSpec.DictionaryValueH\x00\x42\x07\n\x05value\x1a\x31\n\rBlobFileValue\x12\x10\n\x08\x66ileName\x18\x01 \x01(\t\x12\x0e\n\x06offset\x18\x02 \x01(\x04\x42\x07\n\x05value\"\xac\x06\n\x0bTensorValue\x12J\n\x06\x66loats\x18\x01 \x01(\x0b\x32\x38.CoreML.Specification.MILSpec.TensorValue.RepeatedFloatsH\x00\x12\x46\n\x04ints\x18\x02 \x01(\x0b\x32\x36.CoreML.Specification.MILSpec.TensorValue.RepeatedIntsH\x00\x12H\n\x05\x62ools\x18\x03 \x01(\x0b\x32\x37.CoreML.Specification.MILSpec.TensorValue.RepeatedBoolsH\x00\x12L\n\x07strings\x18\x04 \x01(\x0b\x32\x39.CoreML.Specification.MILSpec.TensorValue.RepeatedStringsH\x00\x12N\n\x08longInts\x18\x05 \x01(\x0b\x32:.CoreML.Specification.MILSpec.TensorValue.RepeatedLongIntsH\x00\x12L\n\x07\x64oubles\x18\x06 \x01(\x0b\x32\x39.CoreML.Specification.MILSpec.TensorValue.RepeatedDoublesH\x00\x12H\n\x05\x62ytes\x18\x07 
\x01(\x0b\x32\x37.CoreML.Specification.MILSpec.TensorValue.RepeatedBytesH\x00\x1a$\n\x0eRepeatedFloats\x12\x12\n\x06values\x18\x01 \x03(\x02\x42\x02\x10\x01\x1a%\n\x0fRepeatedDoubles\x12\x12\n\x06values\x18\x01 \x03(\x01\x42\x02\x10\x01\x1a\"\n\x0cRepeatedInts\x12\x12\n\x06values\x18\x01 \x03(\x05\x42\x02\x10\x01\x1a&\n\x10RepeatedLongInts\x12\x12\n\x06values\x18\x01 \x03(\x03\x42\x02\x10\x01\x1a#\n\rRepeatedBools\x12\x12\n\x06values\x18\x01 \x03(\x08\x42\x02\x10\x01\x1a!\n\x0fRepeatedStrings\x12\x0e\n\x06values\x18\x01 \x03(\t\x1a\x1f\n\rRepeatedBytes\x12\x0e\n\x06values\x18\x01 \x01(\x0c\x42\x07\n\x05value\"A\n\nTupleValue\x12\x33\n\x06values\x18\x01 \x03(\x0b\x32#.CoreML.Specification.MILSpec.Value\"@\n\tListValue\x12\x33\n\x06values\x18\x01 \x03(\x0b\x32#.CoreML.Specification.MILSpec.Value\"\xd3\x01\n\x0f\x44ictionaryValue\x12J\n\x06values\x18\x01 \x03(\x0b\x32:.CoreML.Specification.MILSpec.DictionaryValue.KeyValuePair\x1at\n\x0cKeyValuePair\x12\x30\n\x03key\x18\x01 \x01(\x0b\x32#.CoreML.Specification.MILSpec.Value\x12\x32\n\x05value\x18\x02 \x01(\x0b\x32#.CoreML.Specification.MILSpec.Value*\xa3\x02\n\x08\x44\x61taType\x12\x0f\n\x0bUNUSED_TYPE\x10\x00\x12\x08\n\x04\x42OOL\x10\x01\x12\n\n\x06STRING\x10\x02\x12\x10\n\x0c\x46LOAT8E4M3FN\x10(\x12\x0e\n\nFLOAT8E5M2\x10)\x12\x0b\n\x07\x46LOAT16\x10\n\x12\x0b\n\x07\x46LOAT32\x10\x0b\x12\x0b\n\x07\x46LOAT64\x10\x0c\x12\x0c\n\x08\x42\x46LOAT16\x10\r\x12\x08\n\x04INT8\x10\x15\x12\t\n\x05INT16\x10\x16\x12\t\n\x05INT32\x10\x17\x12\t\n\x05INT64\x10\x18\x12\x08\n\x04INT4\x10\x19\x12\t\n\x05UINT8\x10\x1f\x12\n\n\x06UINT16\x10 \x12\n\n\x06UINT32\x10!\x12\n\n\x06UINT64\x10\"\x12\t\n\x05UINT4\x10#\x12\t\n\x05UINT2\x10$\x12\t\n\x05UINT1\x10%\x12\t\n\x05UINT6\x10&\x12\t\n\x05UINT3\x10\'B\x02H\x03\x62\x06proto3') _DATATYPE = DESCRIPTOR.enum_types_by_name['DataType'] DataType = enum_type_wrapper.EnumTypeWrapper(_DATATYPE) UNUSED_TYPE = 0 BOOL = 1 STRING = 2 FLOAT8E4M3FN = 40 FLOAT8E5M2 = 41 FLOAT16 = 10 FLOAT32 = 11 FLOAT64 = 12 BFLOAT16 = 13 INT8 = 21 INT16 = 22 INT32 = 23 INT64 = 24 INT4 = 25 UINT8 = 31 UINT16 = 32 UINT32 = 33 UINT64 = 34 UINT4 = 35 UINT2 = 36 UINT1 = 37 UINT6 = 38 UINT3 = 39 _PROGRAM = DESCRIPTOR.message_types_by_name['Program'] _PROGRAM_FUNCTIONSENTRY = _PROGRAM.nested_types_by_name['FunctionsEntry'] _PROGRAM_ATTRIBUTESENTRY = _PROGRAM.nested_types_by_name['AttributesEntry'] _FUNCTION = DESCRIPTOR.message_types_by_name['Function'] _FUNCTION_BLOCKSPECIALIZATIONSENTRY = _FUNCTION.nested_types_by_name['BlockSpecializationsEntry'] _FUNCTION_ATTRIBUTESENTRY = _FUNCTION.nested_types_by_name['AttributesEntry'] _BLOCK = DESCRIPTOR.message_types_by_name['Block'] _BLOCK_ATTRIBUTESENTRY = _BLOCK.nested_types_by_name['AttributesEntry'] _ARGUMENT = DESCRIPTOR.message_types_by_name['Argument'] _ARGUMENT_BINDING = _ARGUMENT.nested_types_by_name['Binding'] _OPERATION = DESCRIPTOR.message_types_by_name['Operation'] _OPERATION_INPUTSENTRY = _OPERATION.nested_types_by_name['InputsEntry'] _OPERATION_ATTRIBUTESENTRY = _OPERATION.nested_types_by_name['AttributesEntry'] _NAMEDVALUETYPE = DESCRIPTOR.message_types_by_name['NamedValueType'] _VALUETYPE = DESCRIPTOR.message_types_by_name['ValueType'] _TENSORTYPE = DESCRIPTOR.message_types_by_name['TensorType'] _TENSORTYPE_ATTRIBUTESENTRY = _TENSORTYPE.nested_types_by_name['AttributesEntry'] _TUPLETYPE = DESCRIPTOR.message_types_by_name['TupleType'] _LISTTYPE = DESCRIPTOR.message_types_by_name['ListType'] _DICTIONARYTYPE = DESCRIPTOR.message_types_by_name['DictionaryType'] _STATETYPE = 
DESCRIPTOR.message_types_by_name['StateType'] _DIMENSION = DESCRIPTOR.message_types_by_name['Dimension'] _DIMENSION_CONSTANTDIMENSION = _DIMENSION.nested_types_by_name['ConstantDimension'] _DIMENSION_UNKNOWNDIMENSION = _DIMENSION.nested_types_by_name['UnknownDimension'] _VALUE = DESCRIPTOR.message_types_by_name['Value'] _VALUE_IMMEDIATEVALUE = _VALUE.nested_types_by_name['ImmediateValue'] _VALUE_BLOBFILEVALUE = _VALUE.nested_types_by_name['BlobFileValue'] _TENSORVALUE = DESCRIPTOR.message_types_by_name['TensorValue'] _TENSORVALUE_REPEATEDFLOATS = _TENSORVALUE.nested_types_by_name['RepeatedFloats'] _TENSORVALUE_REPEATEDDOUBLES = _TENSORVALUE.nested_types_by_name['RepeatedDoubles'] _TENSORVALUE_REPEATEDINTS = _TENSORVALUE.nested_types_by_name['RepeatedInts'] _TENSORVALUE_REPEATEDLONGINTS = _TENSORVALUE.nested_types_by_name['RepeatedLongInts'] _TENSORVALUE_REPEATEDBOOLS = _TENSORVALUE.nested_types_by_name['RepeatedBools'] _TENSORVALUE_REPEATEDSTRINGS = _TENSORVALUE.nested_types_by_name['RepeatedStrings'] _TENSORVALUE_REPEATEDBYTES = _TENSORVALUE.nested_types_by_name['RepeatedBytes'] _TUPLEVALUE = DESCRIPTOR.message_types_by_name['TupleValue'] _LISTVALUE = DESCRIPTOR.message_types_by_name['ListValue'] _DICTIONARYVALUE = DESCRIPTOR.message_types_by_name['DictionaryValue'] _DICTIONARYVALUE_KEYVALUEPAIR = _DICTIONARYVALUE.nested_types_by_name['KeyValuePair'] Program = _reflection.GeneratedProtocolMessageType('Program', (_message.Message,), { 'FunctionsEntry' : _reflection.GeneratedProtocolMessageType('FunctionsEntry', (_message.Message,), { 'DESCRIPTOR' : _PROGRAM_FUNCTIONSENTRY, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.Program.FunctionsEntry) }) , 'AttributesEntry' : _reflection.GeneratedProtocolMessageType('AttributesEntry', (_message.Message,), { 'DESCRIPTOR' : _PROGRAM_ATTRIBUTESENTRY, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.Program.AttributesEntry) }) , 'DESCRIPTOR' : _PROGRAM, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.Program) }) _sym_db.RegisterMessage(Program) _sym_db.RegisterMessage(Program.FunctionsEntry) _sym_db.RegisterMessage(Program.AttributesEntry) Function = _reflection.GeneratedProtocolMessageType('Function', (_message.Message,), { 'BlockSpecializationsEntry' : _reflection.GeneratedProtocolMessageType('BlockSpecializationsEntry', (_message.Message,), { 'DESCRIPTOR' : _FUNCTION_BLOCKSPECIALIZATIONSENTRY, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.Function.BlockSpecializationsEntry) }) , 'AttributesEntry' : _reflection.GeneratedProtocolMessageType('AttributesEntry', (_message.Message,), { 'DESCRIPTOR' : _FUNCTION_ATTRIBUTESENTRY, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.Function.AttributesEntry) }) , 'DESCRIPTOR' : _FUNCTION, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.Function) }) _sym_db.RegisterMessage(Function) _sym_db.RegisterMessage(Function.BlockSpecializationsEntry) _sym_db.RegisterMessage(Function.AttributesEntry) Block = _reflection.GeneratedProtocolMessageType('Block', (_message.Message,), { 'AttributesEntry' : _reflection.GeneratedProtocolMessageType('AttributesEntry', (_message.Message,), { 'DESCRIPTOR' : _BLOCK_ATTRIBUTESENTRY, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.Block.AttributesEntry) }) , 'DESCRIPTOR' : 
_BLOCK, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.Block) }) _sym_db.RegisterMessage(Block) _sym_db.RegisterMessage(Block.AttributesEntry) Argument = _reflection.GeneratedProtocolMessageType('Argument', (_message.Message,), { 'Binding' : _reflection.GeneratedProtocolMessageType('Binding', (_message.Message,), { 'DESCRIPTOR' : _ARGUMENT_BINDING, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.Argument.Binding) }) , 'DESCRIPTOR' : _ARGUMENT, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.Argument) }) _sym_db.RegisterMessage(Argument) _sym_db.RegisterMessage(Argument.Binding) Operation = _reflection.GeneratedProtocolMessageType('Operation', (_message.Message,), { 'InputsEntry' : _reflection.GeneratedProtocolMessageType('InputsEntry', (_message.Message,), { 'DESCRIPTOR' : _OPERATION_INPUTSENTRY, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.Operation.InputsEntry) }) , 'AttributesEntry' : _reflection.GeneratedProtocolMessageType('AttributesEntry', (_message.Message,), { 'DESCRIPTOR' : _OPERATION_ATTRIBUTESENTRY, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.Operation.AttributesEntry) }) , 'DESCRIPTOR' : _OPERATION, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.Operation) }) _sym_db.RegisterMessage(Operation) _sym_db.RegisterMessage(Operation.InputsEntry) _sym_db.RegisterMessage(Operation.AttributesEntry) NamedValueType = _reflection.GeneratedProtocolMessageType('NamedValueType', (_message.Message,), { 'DESCRIPTOR' : _NAMEDVALUETYPE, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.NamedValueType) }) _sym_db.RegisterMessage(NamedValueType) ValueType = _reflection.GeneratedProtocolMessageType('ValueType', (_message.Message,), { 'DESCRIPTOR' : _VALUETYPE, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.ValueType) }) _sym_db.RegisterMessage(ValueType) TensorType = _reflection.GeneratedProtocolMessageType('TensorType', (_message.Message,), { 'AttributesEntry' : _reflection.GeneratedProtocolMessageType('AttributesEntry', (_message.Message,), { 'DESCRIPTOR' : _TENSORTYPE_ATTRIBUTESENTRY, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.TensorType.AttributesEntry) }) , 'DESCRIPTOR' : _TENSORTYPE, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.TensorType) }) _sym_db.RegisterMessage(TensorType) _sym_db.RegisterMessage(TensorType.AttributesEntry) TupleType = _reflection.GeneratedProtocolMessageType('TupleType', (_message.Message,), { 'DESCRIPTOR' : _TUPLETYPE, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.TupleType) }) _sym_db.RegisterMessage(TupleType) ListType = _reflection.GeneratedProtocolMessageType('ListType', (_message.Message,), { 'DESCRIPTOR' : _LISTTYPE, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.ListType) }) _sym_db.RegisterMessage(ListType) DictionaryType = _reflection.GeneratedProtocolMessageType('DictionaryType', (_message.Message,), { 'DESCRIPTOR' : _DICTIONARYTYPE, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.DictionaryType) }) _sym_db.RegisterMessage(DictionaryType) StateType = 
_reflection.GeneratedProtocolMessageType('StateType', (_message.Message,), { 'DESCRIPTOR' : _STATETYPE, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.StateType) }) _sym_db.RegisterMessage(StateType) Dimension = _reflection.GeneratedProtocolMessageType('Dimension', (_message.Message,), { 'ConstantDimension' : _reflection.GeneratedProtocolMessageType('ConstantDimension', (_message.Message,), { 'DESCRIPTOR' : _DIMENSION_CONSTANTDIMENSION, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.Dimension.ConstantDimension) }) , 'UnknownDimension' : _reflection.GeneratedProtocolMessageType('UnknownDimension', (_message.Message,), { 'DESCRIPTOR' : _DIMENSION_UNKNOWNDIMENSION, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.Dimension.UnknownDimension) }) , 'DESCRIPTOR' : _DIMENSION, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.Dimension) }) _sym_db.RegisterMessage(Dimension) _sym_db.RegisterMessage(Dimension.ConstantDimension) _sym_db.RegisterMessage(Dimension.UnknownDimension) Value = _reflection.GeneratedProtocolMessageType('Value', (_message.Message,), { 'ImmediateValue' : _reflection.GeneratedProtocolMessageType('ImmediateValue', (_message.Message,), { 'DESCRIPTOR' : _VALUE_IMMEDIATEVALUE, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.Value.ImmediateValue) }) , 'BlobFileValue' : _reflection.GeneratedProtocolMessageType('BlobFileValue', (_message.Message,), { 'DESCRIPTOR' : _VALUE_BLOBFILEVALUE, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.Value.BlobFileValue) }) , 'DESCRIPTOR' : _VALUE, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.Value) }) _sym_db.RegisterMessage(Value) _sym_db.RegisterMessage(Value.ImmediateValue) _sym_db.RegisterMessage(Value.BlobFileValue) TensorValue = _reflection.GeneratedProtocolMessageType('TensorValue', (_message.Message,), { 'RepeatedFloats' : _reflection.GeneratedProtocolMessageType('RepeatedFloats', (_message.Message,), { 'DESCRIPTOR' : _TENSORVALUE_REPEATEDFLOATS, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.TensorValue.RepeatedFloats) }) , 'RepeatedDoubles' : _reflection.GeneratedProtocolMessageType('RepeatedDoubles', (_message.Message,), { 'DESCRIPTOR' : _TENSORVALUE_REPEATEDDOUBLES, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.TensorValue.RepeatedDoubles) }) , 'RepeatedInts' : _reflection.GeneratedProtocolMessageType('RepeatedInts', (_message.Message,), { 'DESCRIPTOR' : _TENSORVALUE_REPEATEDINTS, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.TensorValue.RepeatedInts) }) , 'RepeatedLongInts' : _reflection.GeneratedProtocolMessageType('RepeatedLongInts', (_message.Message,), { 'DESCRIPTOR' : _TENSORVALUE_REPEATEDLONGINTS, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.TensorValue.RepeatedLongInts) }) , 'RepeatedBools' : _reflection.GeneratedProtocolMessageType('RepeatedBools', (_message.Message,), { 'DESCRIPTOR' : _TENSORVALUE_REPEATEDBOOLS, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.TensorValue.RepeatedBools) }) , 'RepeatedStrings' : _reflection.GeneratedProtocolMessageType('RepeatedStrings', (_message.Message,), 
{ 'DESCRIPTOR' : _TENSORVALUE_REPEATEDSTRINGS, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.TensorValue.RepeatedStrings) }) , 'RepeatedBytes' : _reflection.GeneratedProtocolMessageType('RepeatedBytes', (_message.Message,), { 'DESCRIPTOR' : _TENSORVALUE_REPEATEDBYTES, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.TensorValue.RepeatedBytes) }) , 'DESCRIPTOR' : _TENSORVALUE, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.TensorValue) }) _sym_db.RegisterMessage(TensorValue) _sym_db.RegisterMessage(TensorValue.RepeatedFloats) _sym_db.RegisterMessage(TensorValue.RepeatedDoubles) _sym_db.RegisterMessage(TensorValue.RepeatedInts) _sym_db.RegisterMessage(TensorValue.RepeatedLongInts) _sym_db.RegisterMessage(TensorValue.RepeatedBools) _sym_db.RegisterMessage(TensorValue.RepeatedStrings) _sym_db.RegisterMessage(TensorValue.RepeatedBytes) TupleValue = _reflection.GeneratedProtocolMessageType('TupleValue', (_message.Message,), { 'DESCRIPTOR' : _TUPLEVALUE, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.TupleValue) }) _sym_db.RegisterMessage(TupleValue) ListValue = _reflection.GeneratedProtocolMessageType('ListValue', (_message.Message,), { 'DESCRIPTOR' : _LISTVALUE, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.ListValue) }) _sym_db.RegisterMessage(ListValue) DictionaryValue = _reflection.GeneratedProtocolMessageType('DictionaryValue', (_message.Message,), { 'KeyValuePair' : _reflection.GeneratedProtocolMessageType('KeyValuePair', (_message.Message,), { 'DESCRIPTOR' : _DICTIONARYVALUE_KEYVALUEPAIR, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.DictionaryValue.KeyValuePair) }) , 'DESCRIPTOR' : _DICTIONARYVALUE, '__module__' : 'MIL_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MILSpec.DictionaryValue) }) _sym_db.RegisterMessage(DictionaryValue) _sym_db.RegisterMessage(DictionaryValue.KeyValuePair) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _PROGRAM_FUNCTIONSENTRY._options = None _PROGRAM_FUNCTIONSENTRY._serialized_options = b'8\001' _PROGRAM_ATTRIBUTESENTRY._options = None _PROGRAM_ATTRIBUTESENTRY._serialized_options = b'8\001' _FUNCTION_BLOCKSPECIALIZATIONSENTRY._options = None _FUNCTION_BLOCKSPECIALIZATIONSENTRY._serialized_options = b'8\001' _FUNCTION_ATTRIBUTESENTRY._options = None _FUNCTION_ATTRIBUTESENTRY._serialized_options = b'8\001' _BLOCK_ATTRIBUTESENTRY._options = None _BLOCK_ATTRIBUTESENTRY._serialized_options = b'8\001' _OPERATION_INPUTSENTRY._options = None _OPERATION_INPUTSENTRY._serialized_options = b'8\001' _OPERATION_ATTRIBUTESENTRY._options = None _OPERATION_ATTRIBUTESENTRY._serialized_options = b'8\001' _TENSORTYPE_ATTRIBUTESENTRY._options = None _TENSORTYPE_ATTRIBUTESENTRY._serialized_options = b'8\001' _TENSORVALUE_REPEATEDFLOATS.fields_by_name['values']._options = None _TENSORVALUE_REPEATEDFLOATS.fields_by_name['values']._serialized_options = b'\020\001' _TENSORVALUE_REPEATEDDOUBLES.fields_by_name['values']._options = None _TENSORVALUE_REPEATEDDOUBLES.fields_by_name['values']._serialized_options = b'\020\001' _TENSORVALUE_REPEATEDINTS.fields_by_name['values']._options = None _TENSORVALUE_REPEATEDINTS.fields_by_name['values']._serialized_options = b'\020\001' 
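# The assignments in this block take effect only when the pure-Python descriptor
# implementation is in use: they restore per-field serialized options (b'8\001'
# marks map-entry messages, b'\020\001' marks packed repeated fields) and, below,
# record the byte offsets of each message within the serialized MIL.proto descriptor.
# A minimal usage sketch of the MILSpec messages registered above, assuming
# coremltools is importable; exact field names are defined by MIL.proto and are
# shown here only for illustration:
#     from coremltools.proto import MIL_pb2
#     prog = MIL_pb2.Program(version=1)
#     val = MIL_pb2.Value()
#     val.immediateValue.tensor.floats.values.extend([1.0, 2.0])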
_TENSORVALUE_REPEATEDLONGINTS.fields_by_name['values']._options = None _TENSORVALUE_REPEATEDLONGINTS.fields_by_name['values']._serialized_options = b'\020\001' _TENSORVALUE_REPEATEDBOOLS.fields_by_name['values']._options = None _TENSORVALUE_REPEATEDBOOLS.fields_by_name['values']._serialized_options = b'\020\001' _DATATYPE._serialized_start=4953 _DATATYPE._serialized_end=5244 _PROGRAM._serialized_start=44 _PROGRAM._serialized_end=415 _PROGRAM_FUNCTIONSENTRY._serialized_start=239 _PROGRAM_FUNCTIONSENTRY._serialized_end=327 _PROGRAM_ATTRIBUTESENTRY._serialized_start=329 _PROGRAM_ATTRIBUTESENTRY._serialized_end=415 _FUNCTION._serialized_start=418 _FUNCTION._serialized_end=864 _FUNCTION_BLOCKSPECIALIZATIONSENTRY._serialized_start=680 _FUNCTION_BLOCKSPECIALIZATIONSENTRY._serialized_end=776 _FUNCTION_ATTRIBUTESENTRY._serialized_start=329 _FUNCTION_ATTRIBUTESENTRY._serialized_end=415 _BLOCK._serialized_start=867 _BLOCK._serialized_end=1175 _BLOCK_ATTRIBUTESENTRY._serialized_start=329 _BLOCK_ATTRIBUTESENTRY._serialized_end=415 _ARGUMENT._serialized_start=1178 _ARGUMENT._serialized_end=1347 _ARGUMENT_BINDING._serialized_start=1257 _ARGUMENT_BINDING._serialized_end=1347 _OPERATION._serialized_start=1350 _OPERATION._serialized_end=1812 _OPERATION_INPUTSENTRY._serialized_start=1639 _OPERATION_INPUTSENTRY._serialized_end=1724 _OPERATION_ATTRIBUTESENTRY._serialized_start=329 _OPERATION_ATTRIBUTESENTRY._serialized_end=415 _NAMEDVALUETYPE._serialized_start=1814 _NAMEDVALUETYPE._serialized_end=1899 _VALUETYPE._serialized_start=1902 _VALUETYPE._serialized_end=2241 _TENSORTYPE._serialized_start=2244 _TENSORTYPE._serialized_end=2555 _TENSORTYPE_ATTRIBUTESENTRY._serialized_start=329 _TENSORTYPE_ATTRIBUTESENTRY._serialized_end=415 _TUPLETYPE._serialized_start=2557 _TUPLETYPE._serialized_end=2624 _LISTTYPE._serialized_start=2626 _LISTTYPE._serialized_end=2748 _DICTIONARYTYPE._serialized_start=2751 _DICTIONARYTYPE._serialized_end=2885 _STATETYPE._serialized_start=2887 _STATETYPE._serialized_end=2960 _DIMENSION._serialized_start=2963 _DIMENSION._serialized_end=3216 _DIMENSION_CONSTANTDIMENSION._serialized_start=3132 _DIMENSION_CONSTANTDIMENSION._serialized_end=3165 _DIMENSION_UNKNOWNDIMENSION._serialized_start=3167 _DIMENSION_UNKNOWNDIMENSION._serialized_end=3203 _VALUE._serialized_start=3219 _VALUE._serialized_end=3788 _VALUE_IMMEDIATEVALUE._serialized_start=3457 _VALUE_IMMEDIATEVALUE._serialized_end=3728 _VALUE_BLOBFILEVALUE._serialized_start=3730 _VALUE_BLOBFILEVALUE._serialized_end=3779 _TENSORVALUE._serialized_start=3791 _TENSORVALUE._serialized_end=4603 _TENSORVALUE_REPEATEDFLOATS._serialized_start=4338 _TENSORVALUE_REPEATEDFLOATS._serialized_end=4374 _TENSORVALUE_REPEATEDDOUBLES._serialized_start=4376 _TENSORVALUE_REPEATEDDOUBLES._serialized_end=4413 _TENSORVALUE_REPEATEDINTS._serialized_start=4415 _TENSORVALUE_REPEATEDINTS._serialized_end=4449 _TENSORVALUE_REPEATEDLONGINTS._serialized_start=4451 _TENSORVALUE_REPEATEDLONGINTS._serialized_end=4489 _TENSORVALUE_REPEATEDBOOLS._serialized_start=4491 _TENSORVALUE_REPEATEDBOOLS._serialized_end=4526 _TENSORVALUE_REPEATEDSTRINGS._serialized_start=4528 _TENSORVALUE_REPEATEDSTRINGS._serialized_end=4561 _TENSORVALUE_REPEATEDBYTES._serialized_start=4563 _TENSORVALUE_REPEATEDBYTES._serialized_end=4594 _TUPLEVALUE._serialized_start=4605 _TUPLEVALUE._serialized_end=4670 _LISTVALUE._serialized_start=4672 _LISTVALUE._serialized_end=4736 _DICTIONARYVALUE._serialized_start=4739 _DICTIONARYVALUE._serialized_end=4950 _DICTIONARYVALUE_KEYVALUEPAIR._serialized_start=4834 
_DICTIONARYVALUE_KEYVALUEPAIR._serialized_end=4950 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/Model_pb2.py0000644000000000000000000005234014672066616020365 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: Model.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() from . import VisionFeaturePrint_pb2 as VisionFeaturePrint__pb2 from . import AudioFeaturePrint_pb2 as AudioFeaturePrint__pb2 from . import TextClassifier_pb2 as TextClassifier__pb2 try: DataStructures__pb2 = TextClassifier__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = TextClassifier__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = TextClassifier__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = TextClassifier__pb2.FeatureTypes_pb2 from . import WordTagger_pb2 as WordTagger__pb2 try: DataStructures__pb2 = WordTagger__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = WordTagger__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = WordTagger__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = WordTagger__pb2.FeatureTypes_pb2 from . import Gazetteer_pb2 as Gazetteer__pb2 try: DataStructures__pb2 = Gazetteer__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = Gazetteer__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = Gazetteer__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = Gazetteer__pb2.FeatureTypes_pb2 from . import WordEmbedding_pb2 as WordEmbedding__pb2 try: DataStructures__pb2 = WordEmbedding__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = WordEmbedding__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = WordEmbedding__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = WordEmbedding__pb2.FeatureTypes_pb2 from . import ArrayFeatureExtractor_pb2 as ArrayFeatureExtractor__pb2 from . import BayesianProbitRegressor_pb2 as BayesianProbitRegressor__pb2 from . import CategoricalMapping_pb2 as CategoricalMapping__pb2 try: DataStructures__pb2 = CategoricalMapping__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = CategoricalMapping__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = CategoricalMapping__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = CategoricalMapping__pb2.FeatureTypes_pb2 from . import CustomModel_pb2 as CustomModel__pb2 from . import DictVectorizer_pb2 as DictVectorizer__pb2 try: DataStructures__pb2 = DictVectorizer__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = DictVectorizer__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = DictVectorizer__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = DictVectorizer__pb2.FeatureTypes_pb2 from . import FeatureTypes_pb2 as FeatureTypes__pb2 from . import FeatureVectorizer_pb2 as FeatureVectorizer__pb2 from . import GLMRegressor_pb2 as GLMRegressor__pb2 from . 
import GLMClassifier_pb2 as GLMClassifier__pb2 try: DataStructures__pb2 = GLMClassifier__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = GLMClassifier__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = GLMClassifier__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = GLMClassifier__pb2.FeatureTypes_pb2 from . import NearestNeighbors_pb2 as NearestNeighbors__pb2 try: DataStructures__pb2 = NearestNeighbors__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = NearestNeighbors__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = NearestNeighbors__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = NearestNeighbors__pb2.FeatureTypes_pb2 try: Parameters__pb2 = NearestNeighbors__pb2.Parameters__pb2 except AttributeError: Parameters__pb2 = NearestNeighbors__pb2.Parameters_pb2 try: DataStructures__pb2 = NearestNeighbors__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = NearestNeighbors__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = NearestNeighbors__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = NearestNeighbors__pb2.FeatureTypes_pb2 from . import Identity_pb2 as Identity__pb2 from . import Imputer_pb2 as Imputer__pb2 try: DataStructures__pb2 = Imputer__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = Imputer__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = Imputer__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = Imputer__pb2.FeatureTypes_pb2 from . import MIL_pb2 as MIL__pb2 from . import NeuralNetwork_pb2 as NeuralNetwork__pb2 try: DataStructures__pb2 = NeuralNetwork__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = NeuralNetwork__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = NeuralNetwork__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = NeuralNetwork__pb2.FeatureTypes_pb2 try: Parameters__pb2 = NeuralNetwork__pb2.Parameters__pb2 except AttributeError: Parameters__pb2 = NeuralNetwork__pb2.Parameters_pb2 try: DataStructures__pb2 = NeuralNetwork__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = NeuralNetwork__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = NeuralNetwork__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = NeuralNetwork__pb2.FeatureTypes_pb2 from . import Normalizer_pb2 as Normalizer__pb2 from . import OneHotEncoder_pb2 as OneHotEncoder__pb2 try: DataStructures__pb2 = OneHotEncoder__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = OneHotEncoder__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = OneHotEncoder__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = OneHotEncoder__pb2.FeatureTypes_pb2 from . import Scaler_pb2 as Scaler__pb2 from . import NonMaximumSuppression_pb2 as NonMaximumSuppression__pb2 try: DataStructures__pb2 = NonMaximumSuppression__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = NonMaximumSuppression__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = NonMaximumSuppression__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = NonMaximumSuppression__pb2.FeatureTypes_pb2 from . import SVM_pb2 as SVM__pb2 try: DataStructures__pb2 = SVM__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = SVM__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = SVM__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = SVM__pb2.FeatureTypes_pb2 from . 
import TreeEnsemble_pb2 as TreeEnsemble__pb2 try: DataStructures__pb2 = TreeEnsemble__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = TreeEnsemble__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = TreeEnsemble__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = TreeEnsemble__pb2.FeatureTypes_pb2 from . import Parameters_pb2 as Parameters__pb2 try: DataStructures__pb2 = Parameters__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = Parameters__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = Parameters__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = Parameters__pb2.FeatureTypes_pb2 from . import ItemSimilarityRecommender_pb2 as ItemSimilarityRecommender__pb2 try: DataStructures__pb2 = ItemSimilarityRecommender__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = ItemSimilarityRecommender__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = ItemSimilarityRecommender__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = ItemSimilarityRecommender__pb2.FeatureTypes_pb2 from . import SoundAnalysisPreprocessing_pb2 as SoundAnalysisPreprocessing__pb2 from . import LinkedModel_pb2 as LinkedModel__pb2 try: Parameters__pb2 = LinkedModel__pb2.Parameters__pb2 except AttributeError: Parameters__pb2 = LinkedModel__pb2.Parameters_pb2 try: DataStructures__pb2 = LinkedModel__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = LinkedModel__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = LinkedModel__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = LinkedModel__pb2.FeatureTypes_pb2 from . import ClassConfidenceThresholding_pb2 as ClassConfidenceThresholding__pb2 try: DataStructures__pb2 = ClassConfidenceThresholding__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = ClassConfidenceThresholding__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = ClassConfidenceThresholding__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = ClassConfidenceThresholding__pb2.FeatureTypes_pb2 from .VisionFeaturePrint_pb2 import * from .AudioFeaturePrint_pb2 import * from .TextClassifier_pb2 import * from .WordTagger_pb2 import * from .Gazetteer_pb2 import * from .WordEmbedding_pb2 import * from .ArrayFeatureExtractor_pb2 import * from .BayesianProbitRegressor_pb2 import * from .CategoricalMapping_pb2 import * from .CustomModel_pb2 import * from .DictVectorizer_pb2 import * from .FeatureTypes_pb2 import * from .FeatureVectorizer_pb2 import * from .GLMRegressor_pb2 import * from .GLMClassifier_pb2 import * from .NearestNeighbors_pb2 import * from .Identity_pb2 import * from .Imputer_pb2 import * from .MIL_pb2 import * from .NeuralNetwork_pb2 import * from .Normalizer_pb2 import * from .OneHotEncoder_pb2 import * from .Scaler_pb2 import * from .NonMaximumSuppression_pb2 import * from .SVM_pb2 import * from .TreeEnsemble_pb2 import * from .Parameters_pb2 import * from .ItemSimilarityRecommender_pb2 import * from .SoundAnalysisPreprocessing_pb2 import * from .LinkedModel_pb2 import * from .ClassConfidenceThresholding_pb2 import * DESCRIPTOR = 
_descriptor_pool.Default().AddSerializedFile(b'\n\x0bModel.proto\x12\x14\x43oreML.Specification\x1a\x18VisionFeaturePrint.proto\x1a\x17\x41udioFeaturePrint.proto\x1a\x14TextClassifier.proto\x1a\x10WordTagger.proto\x1a\x0fGazetteer.proto\x1a\x13WordEmbedding.proto\x1a\x1b\x41rrayFeatureExtractor.proto\x1a\x1d\x42\x61yesianProbitRegressor.proto\x1a\x18\x43\x61tegoricalMapping.proto\x1a\x11\x43ustomModel.proto\x1a\x14\x44ictVectorizer.proto\x1a\x12\x46\x65\x61tureTypes.proto\x1a\x17\x46\x65\x61tureVectorizer.proto\x1a\x12GLMRegressor.proto\x1a\x13GLMClassifier.proto\x1a\x16NearestNeighbors.proto\x1a\x0eIdentity.proto\x1a\rImputer.proto\x1a\tMIL.proto\x1a\x13NeuralNetwork.proto\x1a\x10Normalizer.proto\x1a\x13OneHotEncoder.proto\x1a\x0cScaler.proto\x1a\x1bNonMaximumSuppression.proto\x1a\tSVM.proto\x1a\x12TreeEnsemble.proto\x1a\x10Parameters.proto\x1a\x1fItemSimilarityRecommender.proto\x1a SoundAnalysisPreprocessing.proto\x1a\x11LinkedModel.proto\x1a!ClassConfidenceThresholding.proto\"F\n\x08Pipeline\x12+\n\x06models\x18\x01 \x03(\x0b\x32\x1b.CoreML.Specification.Model\x12\r\n\x05names\x18\x02 \x03(\t\"F\n\x12PipelineClassifier\x12\x30\n\x08pipeline\x18\x01 \x01(\x0b\x32\x1e.CoreML.Specification.Pipeline\"E\n\x11PipelineRegressor\x12\x30\n\x08pipeline\x18\x01 \x01(\x0b\x32\x1e.CoreML.Specification.Pipeline\"m\n\x12\x46\x65\x61tureDescription\x12\x0c\n\x04name\x18\x01 \x01(\t\x12\x18\n\x10shortDescription\x18\x02 \x01(\t\x12/\n\x04type\x18\x03 \x01(\x0b\x32!.CoreML.Specification.FeatureType\"\xd6\x01\n\x08Metadata\x12\x18\n\x10shortDescription\x18\x01 \x01(\t\x12\x15\n\rversionString\x18\x02 \x01(\t\x12\x0e\n\x06\x61uthor\x18\x03 \x01(\t\x12\x0f\n\x07license\x18\x04 \x01(\t\x12\x44\n\x0buserDefined\x18\x64 \x03(\x0b\x32/.CoreML.Specification.Metadata.UserDefinedEntry\x1a\x32\n\x10UserDefinedEntry\x12\x0b\n\x03key\x18\x01 \x01(\t\x12\r\n\x05value\x18\x02 \x01(\t:\x02\x38\x01\"\x91\x02\n\x13\x46unctionDescription\x12\x0c\n\x04name\x18\x01 \x01(\t\x12\x37\n\x05input\x18\x02 \x03(\x0b\x32(.CoreML.Specification.FeatureDescription\x12\x38\n\x06output\x18\x03 \x03(\x0b\x32(.CoreML.Specification.FeatureDescription\x12\x37\n\x05state\x18\x06 \x03(\x0b\x32(.CoreML.Specification.FeatureDescription\x12\x1c\n\x14predictedFeatureName\x18\x04 \x01(\t\x12\"\n\x1apredictedProbabilitiesName\x18\x05 \x01(\t\"\xce\x03\n\x10ModelDescription\x12<\n\tfunctions\x18\x14 \x03(\x0b\x32).CoreML.Specification.FunctionDescription\x12\x1b\n\x13\x64\x65\x66\x61ultFunctionName\x18\x15 \x01(\t\x12\x30\n\x08metadata\x18\x64 \x01(\x0b\x32\x1e.CoreML.Specification.Metadata\x12\x37\n\x05input\x18\x01 \x03(\x0b\x32(.CoreML.Specification.FeatureDescription\x12\x38\n\x06output\x18\n \x03(\x0b\x32(.CoreML.Specification.FeatureDescription\x12\x37\n\x05state\x18\r \x03(\x0b\x32(.CoreML.Specification.FeatureDescription\x12\x1c\n\x14predictedFeatureName\x18\x0b \x01(\t\x12\"\n\x1apredictedProbabilitiesName\x18\x0c \x01(\t\x12?\n\rtrainingInput\x18\x32 \x03(\x0b\x32(.CoreML.Specification.FeatureDescription\"4\n\x0fSerializedModel\x12\x12\n\nidentifier\x18\x01 \x01(\t\x12\r\n\x05model\x18\x02 \x01(\x0c\"\xf1\x15\n\x05Model\x12\x1c\n\x14specificationVersion\x18\x01 \x01(\x05\x12;\n\x0b\x64\x65scription\x18\x02 \x01(\x0b\x32&.CoreML.Specification.ModelDescription\x12\x13\n\x0bisUpdatable\x18\n \x01(\x08\x12G\n\x12pipelineClassifier\x18\xc8\x01 \x01(\x0b\x32(.CoreML.Specification.PipelineClassifierH\x00\x12\x45\n\x11pipelineRegressor\x18\xc9\x01 \x01(\x0b\x32\'.CoreML.Specification.PipelineRegressorH\x00\x12\x33\n\x08pipeline\x18\xca\x01 
\x01(\x0b\x32\x1e.CoreML.Specification.PipelineH\x00\x12;\n\x0cglmRegressor\x18\xac\x02 \x01(\x0b\x32\".CoreML.Specification.GLMRegressorH\x00\x12O\n\x16supportVectorRegressor\x18\xad\x02 \x01(\x0b\x32,.CoreML.Specification.SupportVectorRegressorH\x00\x12M\n\x15treeEnsembleRegressor\x18\xae\x02 \x01(\x0b\x32+.CoreML.Specification.TreeEnsembleRegressorH\x00\x12O\n\x16neuralNetworkRegressor\x18\xaf\x02 \x01(\x0b\x32,.CoreML.Specification.NeuralNetworkRegressorH\x00\x12Q\n\x17\x62\x61yesianProbitRegressor\x18\xb0\x02 \x01(\x0b\x32-.CoreML.Specification.BayesianProbitRegressorH\x00\x12=\n\rglmClassifier\x18\x90\x03 \x01(\x0b\x32#.CoreML.Specification.GLMClassifierH\x00\x12Q\n\x17supportVectorClassifier\x18\x91\x03 \x01(\x0b\x32-.CoreML.Specification.SupportVectorClassifierH\x00\x12O\n\x16treeEnsembleClassifier\x18\x92\x03 \x01(\x0b\x32,.CoreML.Specification.TreeEnsembleClassifierH\x00\x12Q\n\x17neuralNetworkClassifier\x18\x93\x03 \x01(\x0b\x32-.CoreML.Specification.NeuralNetworkClassifierH\x00\x12Y\n\x1bkNearestNeighborsClassifier\x18\x94\x03 \x01(\x0b\x32\x31.CoreML.Specification.KNearestNeighborsClassifierH\x00\x12=\n\rneuralNetwork\x18\xf4\x03 \x01(\x0b\x32#.CoreML.Specification.NeuralNetworkH\x00\x12U\n\x19itemSimilarityRecommender\x18\xf5\x03 \x01(\x0b\x32/.CoreML.Specification.ItemSimilarityRecommenderH\x00\x12;\n\tmlProgram\x18\xf6\x03 \x01(\x0b\x32%.CoreML.Specification.MILSpec.ProgramH\x00\x12\x39\n\x0b\x63ustomModel\x18\xab\x04 \x01(\x0b\x32!.CoreML.Specification.CustomModelH\x00\x12\x39\n\x0blinkedModel\x18\xac\x04 \x01(\x0b\x32!.CoreML.Specification.LinkedModelH\x00\x12Y\n\x1b\x63lassConfidenceThresholding\x18\xb0\x04 \x01(\x0b\x32\x31.CoreML.Specification.ClassConfidenceThresholdingH\x00\x12=\n\roneHotEncoder\x18\xd8\x04 \x01(\x0b\x32#.CoreML.Specification.OneHotEncoderH\x00\x12\x31\n\x07imputer\x18\xd9\x04 \x01(\x0b\x32\x1d.CoreML.Specification.ImputerH\x00\x12\x45\n\x11\x66\x65\x61tureVectorizer\x18\xda\x04 \x01(\x0b\x32\'.CoreML.Specification.FeatureVectorizerH\x00\x12?\n\x0e\x64ictVectorizer\x18\xdb\x04 \x01(\x0b\x32$.CoreML.Specification.DictVectorizerH\x00\x12/\n\x06scaler\x18\xdc\x04 \x01(\x0b\x32\x1c.CoreML.Specification.ScalerH\x00\x12G\n\x12\x63\x61tegoricalMapping\x18\xde\x04 \x01(\x0b\x32(.CoreML.Specification.CategoricalMappingH\x00\x12\x37\n\nnormalizer\x18\xdf\x04 \x01(\x0b\x32 .CoreML.Specification.NormalizerH\x00\x12M\n\x15\x61rrayFeatureExtractor\x18\xe1\x04 \x01(\x0b\x32+.CoreML.Specification.ArrayFeatureExtractorH\x00\x12M\n\x15nonMaximumSuppression\x18\xe2\x04 \x01(\x0b\x32+.CoreML.Specification.NonMaximumSuppressionH\x00\x12\x33\n\x08identity\x18\x84\x07 \x01(\x0b\x32\x1e.CoreML.Specification.IdentityH\x00\x12L\n\x0etextClassifier\x18\xd0\x0f \x01(\x0b\x32\x31.CoreML.Specification.CoreMLModels.TextClassifierH\x00\x12\x44\n\nwordTagger\x18\xd1\x0f \x01(\x0b\x32-.CoreML.Specification.CoreMLModels.WordTaggerH\x00\x12T\n\x12visionFeaturePrint\x18\xd2\x0f \x01(\x0b\x32\x35.CoreML.Specification.CoreMLModels.VisionFeaturePrintH\x00\x12\x64\n\x1asoundAnalysisPreprocessing\x18\xd3\x0f \x01(\x0b\x32=.CoreML.Specification.CoreMLModels.SoundAnalysisPreprocessingH\x00\x12\x42\n\tgazetteer\x18\xd4\x0f \x01(\x0b\x32,.CoreML.Specification.CoreMLModels.GazetteerH\x00\x12J\n\rwordEmbedding\x18\xd5\x0f \x01(\x0b\x32\x30.CoreML.Specification.CoreMLModels.WordEmbeddingH\x00\x12R\n\x11\x61udioFeaturePrint\x18\xd6\x0f \x01(\x0b\x32\x34.CoreML.Specification.CoreMLModels.AudioFeaturePrintH\x00\x12\x41\n\x0fserializedModel\x18\xb8\x17 
\x01(\x0b\x32%.CoreML.Specification.SerializedModelH\x00\x42\x06\n\x04TypeB\x02H\x03P\x00P\x01P\x02P\x03P\x04P\x05P\x06P\x07P\x08P\tP\nP\x0bP\x0cP\rP\x0eP\x0fP\x10P\x11P\x12P\x13P\x14P\x15P\x16P\x17P\x18P\x19P\x1aP\x1bP\x1cP\x1dP\x1e\x62\x06proto3') _PIPELINE = DESCRIPTOR.message_types_by_name['Pipeline'] _PIPELINECLASSIFIER = DESCRIPTOR.message_types_by_name['PipelineClassifier'] _PIPELINEREGRESSOR = DESCRIPTOR.message_types_by_name['PipelineRegressor'] _FEATUREDESCRIPTION = DESCRIPTOR.message_types_by_name['FeatureDescription'] _METADATA = DESCRIPTOR.message_types_by_name['Metadata'] _METADATA_USERDEFINEDENTRY = _METADATA.nested_types_by_name['UserDefinedEntry'] _FUNCTIONDESCRIPTION = DESCRIPTOR.message_types_by_name['FunctionDescription'] _MODELDESCRIPTION = DESCRIPTOR.message_types_by_name['ModelDescription'] _SERIALIZEDMODEL = DESCRIPTOR.message_types_by_name['SerializedModel'] _MODEL = DESCRIPTOR.message_types_by_name['Model'] Pipeline = _reflection.GeneratedProtocolMessageType('Pipeline', (_message.Message,), { 'DESCRIPTOR' : _PIPELINE, '__module__' : 'Model_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Pipeline) }) _sym_db.RegisterMessage(Pipeline) PipelineClassifier = _reflection.GeneratedProtocolMessageType('PipelineClassifier', (_message.Message,), { 'DESCRIPTOR' : _PIPELINECLASSIFIER, '__module__' : 'Model_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.PipelineClassifier) }) _sym_db.RegisterMessage(PipelineClassifier) PipelineRegressor = _reflection.GeneratedProtocolMessageType('PipelineRegressor', (_message.Message,), { 'DESCRIPTOR' : _PIPELINEREGRESSOR, '__module__' : 'Model_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.PipelineRegressor) }) _sym_db.RegisterMessage(PipelineRegressor) FeatureDescription = _reflection.GeneratedProtocolMessageType('FeatureDescription', (_message.Message,), { 'DESCRIPTOR' : _FEATUREDESCRIPTION, '__module__' : 'Model_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.FeatureDescription) }) _sym_db.RegisterMessage(FeatureDescription) Metadata = _reflection.GeneratedProtocolMessageType('Metadata', (_message.Message,), { 'UserDefinedEntry' : _reflection.GeneratedProtocolMessageType('UserDefinedEntry', (_message.Message,), { 'DESCRIPTOR' : _METADATA_USERDEFINEDENTRY, '__module__' : 'Model_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Metadata.UserDefinedEntry) }) , 'DESCRIPTOR' : _METADATA, '__module__' : 'Model_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Metadata) }) _sym_db.RegisterMessage(Metadata) _sym_db.RegisterMessage(Metadata.UserDefinedEntry) FunctionDescription = _reflection.GeneratedProtocolMessageType('FunctionDescription', (_message.Message,), { 'DESCRIPTOR' : _FUNCTIONDESCRIPTION, '__module__' : 'Model_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.FunctionDescription) }) _sym_db.RegisterMessage(FunctionDescription) ModelDescription = _reflection.GeneratedProtocolMessageType('ModelDescription', (_message.Message,), { 'DESCRIPTOR' : _MODELDESCRIPTION, '__module__' : 'Model_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ModelDescription) }) _sym_db.RegisterMessage(ModelDescription) SerializedModel = _reflection.GeneratedProtocolMessageType('SerializedModel', (_message.Message,), { 'DESCRIPTOR' : _SERIALIZEDMODEL, '__module__' : 'Model_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SerializedModel) }) _sym_db.RegisterMessage(SerializedModel) Model = 
_reflection.GeneratedProtocolMessageType('Model', (_message.Message,), { 'DESCRIPTOR' : _MODEL, '__module__' : 'Model_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Model) }) _sym_db.RegisterMessage(Model) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _METADATA_USERDEFINEDENTRY._options = None _METADATA_USERDEFINEDENTRY._serialized_options = b'8\001' _PIPELINE._serialized_start=718 _PIPELINE._serialized_end=788 _PIPELINECLASSIFIER._serialized_start=790 _PIPELINECLASSIFIER._serialized_end=860 _PIPELINEREGRESSOR._serialized_start=862 _PIPELINEREGRESSOR._serialized_end=931 _FEATUREDESCRIPTION._serialized_start=933 _FEATUREDESCRIPTION._serialized_end=1042 _METADATA._serialized_start=1045 _METADATA._serialized_end=1259 _METADATA_USERDEFINEDENTRY._serialized_start=1209 _METADATA_USERDEFINEDENTRY._serialized_end=1259 _FUNCTIONDESCRIPTION._serialized_start=1262 _FUNCTIONDESCRIPTION._serialized_end=1535 _MODELDESCRIPTION._serialized_start=1538 _MODELDESCRIPTION._serialized_end=2000 _SERIALIZEDMODEL._serialized_start=2002 _SERIALIZEDMODEL._serialized_end=2054 _MODEL._serialized_start=2057 _MODEL._serialized_end=4858 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/NamedParameters_pb2.py0000644000000000000000000003420714672066616022377 0ustar00rootroot# Generated by the protocol buffer compiler. DO NOT EDIT! # source: NamedParameters.proto import sys _b=sys.version_info[0]<3 and (lambda x:x) or (lambda x:x.encode('latin1')) from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pb2 from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() DESCRIPTOR = _descriptor.FileDescriptor( name='NamedParameters.proto', package='CoreML.Specification', syntax='proto3', serialized_pb=_b('\n\x15NamedParameters.proto\x12\x14\x43oreML.Specification\"0\n\nInt32Range\x12\x10\n\x08minValue\x18\x01 \x01(\x05\x12\x10\n\x08maxValue\x18\x02 \x01(\x05\"\x1a\n\x08Int32Set\x12\x0e\n\x06values\x18\x01 \x03(\x05\"0\n\nFloatRange\x12\x10\n\x08minValue\x18\x01 \x01(\x02\x12\x10\n\x08maxValue\x18\x02 \x01(\x02\"\x99\x01\n\x0eInt32Parameter\x12\x14\n\x0c\x64\x65\x66\x61ultValue\x18\x01 \x01(\x05\x12\x31\n\x05range\x18\n \x01(\x0b\x32 .CoreML.Specification.Int32RangeH\x00\x12-\n\x03set\x18\x0b \x01(\x0b\x32\x1e.CoreML.Specification.Int32SetH\x00\x42\x0f\n\rAllowedValues\"j\n\x0e\x46loatParameter\x12\x14\n\x0c\x64\x65\x66\x61ultValue\x18\x01 \x01(\x02\x12\x31\n\x05range\x18\n \x01(\x0b\x32 .CoreML.Specification.FloatRangeH\x00\x42\x0f\n\rAllowedValues\"\x93\x01\n\tParameter\x12>\n\x0eint32Parameter\x18\x01 \x01(\x0b\x32$.CoreML.Specification.Int32ParameterH\x00\x12>\n\x0e\x66loatParameter\x18\x02 \x01(\x0b\x32$.CoreML.Specification.FloatParameterH\x00\x42\x06\n\x04Type\"l\n\x0eNamedParameter\x12\x0c\n\x04name\x18\x01 \x01(\t\x12\x18\n\x10shortDescription\x18\x02 \x01(\t\x12\x32\n\tparameter\x18\x03 \x01(\x0b\x32\x1f.CoreML.Specification.ParameterB\x02H\x03\x62\x06proto3') ) _INT32RANGE = _descriptor.Descriptor( name='Int32Range', full_name='CoreML.Specification.Int32Range', filename=None, file=DESCRIPTOR, containing_type=None, fields=[ _descriptor.FieldDescriptor( name='minValue', 
full_name='CoreML.Specification.Int32Range.minValue', index=0, number=1, type=5, cpp_type=1, label=1, has_default_value=False, default_value=0, message_type=None, enum_type=None, containing_type=None, is_extension=False, extension_scope=None, options=None), _descriptor.FieldDescriptor( name='maxValue', full_name='CoreML.Specification.Int32Range.maxValue', index=1, number=2, type=5, cpp_type=1, label=1, has_default_value=False, default_value=0, message_type=None, enum_type=None, containing_type=None, is_extension=False, extension_scope=None, options=None), ], extensions=[ ], nested_types=[], enum_types=[ ], options=None, is_extendable=False, syntax='proto3', extension_ranges=[], oneofs=[ ], serialized_start=47, serialized_end=95, ) _INT32SET = _descriptor.Descriptor( name='Int32Set', full_name='CoreML.Specification.Int32Set', filename=None, file=DESCRIPTOR, containing_type=None, fields=[ _descriptor.FieldDescriptor( name='values', full_name='CoreML.Specification.Int32Set.values', index=0, number=1, type=5, cpp_type=1, label=3, has_default_value=False, default_value=[], message_type=None, enum_type=None, containing_type=None, is_extension=False, extension_scope=None, options=None), ], extensions=[ ], nested_types=[], enum_types=[ ], options=None, is_extendable=False, syntax='proto3', extension_ranges=[], oneofs=[ ], serialized_start=97, serialized_end=123, ) _FLOATRANGE = _descriptor.Descriptor( name='FloatRange', full_name='CoreML.Specification.FloatRange', filename=None, file=DESCRIPTOR, containing_type=None, fields=[ _descriptor.FieldDescriptor( name='minValue', full_name='CoreML.Specification.FloatRange.minValue', index=0, number=1, type=2, cpp_type=6, label=1, has_default_value=False, default_value=float(0), message_type=None, enum_type=None, containing_type=None, is_extension=False, extension_scope=None, options=None), _descriptor.FieldDescriptor( name='maxValue', full_name='CoreML.Specification.FloatRange.maxValue', index=1, number=2, type=2, cpp_type=6, label=1, has_default_value=False, default_value=float(0), message_type=None, enum_type=None, containing_type=None, is_extension=False, extension_scope=None, options=None), ], extensions=[ ], nested_types=[], enum_types=[ ], options=None, is_extendable=False, syntax='proto3', extension_ranges=[], oneofs=[ ], serialized_start=125, serialized_end=173, ) _INT32PARAMETER = _descriptor.Descriptor( name='Int32Parameter', full_name='CoreML.Specification.Int32Parameter', filename=None, file=DESCRIPTOR, containing_type=None, fields=[ _descriptor.FieldDescriptor( name='defaultValue', full_name='CoreML.Specification.Int32Parameter.defaultValue', index=0, number=1, type=5, cpp_type=1, label=1, has_default_value=False, default_value=0, message_type=None, enum_type=None, containing_type=None, is_extension=False, extension_scope=None, options=None), _descriptor.FieldDescriptor( name='range', full_name='CoreML.Specification.Int32Parameter.range', index=1, number=10, type=11, cpp_type=10, label=1, has_default_value=False, default_value=None, message_type=None, enum_type=None, containing_type=None, is_extension=False, extension_scope=None, options=None), _descriptor.FieldDescriptor( name='set', full_name='CoreML.Specification.Int32Parameter.set', index=2, number=11, type=11, cpp_type=10, label=1, has_default_value=False, default_value=None, message_type=None, enum_type=None, containing_type=None, is_extension=False, extension_scope=None, options=None), ], extensions=[ ], nested_types=[], enum_types=[ ], options=None, is_extendable=False, 
syntax='proto3', extension_ranges=[], oneofs=[ _descriptor.OneofDescriptor( name='AllowedValues', full_name='CoreML.Specification.Int32Parameter.AllowedValues', index=0, containing_type=None, fields=[]), ], serialized_start=176, serialized_end=329, ) _FLOATPARAMETER = _descriptor.Descriptor( name='FloatParameter', full_name='CoreML.Specification.FloatParameter', filename=None, file=DESCRIPTOR, containing_type=None, fields=[ _descriptor.FieldDescriptor( name='defaultValue', full_name='CoreML.Specification.FloatParameter.defaultValue', index=0, number=1, type=2, cpp_type=6, label=1, has_default_value=False, default_value=float(0), message_type=None, enum_type=None, containing_type=None, is_extension=False, extension_scope=None, options=None), _descriptor.FieldDescriptor( name='range', full_name='CoreML.Specification.FloatParameter.range', index=1, number=10, type=11, cpp_type=10, label=1, has_default_value=False, default_value=None, message_type=None, enum_type=None, containing_type=None, is_extension=False, extension_scope=None, options=None), ], extensions=[ ], nested_types=[], enum_types=[ ], options=None, is_extendable=False, syntax='proto3', extension_ranges=[], oneofs=[ _descriptor.OneofDescriptor( name='AllowedValues', full_name='CoreML.Specification.FloatParameter.AllowedValues', index=0, containing_type=None, fields=[]), ], serialized_start=331, serialized_end=437, ) _PARAMETER = _descriptor.Descriptor( name='Parameter', full_name='CoreML.Specification.Parameter', filename=None, file=DESCRIPTOR, containing_type=None, fields=[ _descriptor.FieldDescriptor( name='int32Parameter', full_name='CoreML.Specification.Parameter.int32Parameter', index=0, number=1, type=11, cpp_type=10, label=1, has_default_value=False, default_value=None, message_type=None, enum_type=None, containing_type=None, is_extension=False, extension_scope=None, options=None), _descriptor.FieldDescriptor( name='floatParameter', full_name='CoreML.Specification.Parameter.floatParameter', index=1, number=2, type=11, cpp_type=10, label=1, has_default_value=False, default_value=None, message_type=None, enum_type=None, containing_type=None, is_extension=False, extension_scope=None, options=None), ], extensions=[ ], nested_types=[], enum_types=[ ], options=None, is_extendable=False, syntax='proto3', extension_ranges=[], oneofs=[ _descriptor.OneofDescriptor( name='Type', full_name='CoreML.Specification.Parameter.Type', index=0, containing_type=None, fields=[]), ], serialized_start=440, serialized_end=587, ) _NAMEDPARAMETER = _descriptor.Descriptor( name='NamedParameter', full_name='CoreML.Specification.NamedParameter', filename=None, file=DESCRIPTOR, containing_type=None, fields=[ _descriptor.FieldDescriptor( name='name', full_name='CoreML.Specification.NamedParameter.name', index=0, number=1, type=9, cpp_type=9, label=1, has_default_value=False, default_value=_b("").decode('utf-8'), message_type=None, enum_type=None, containing_type=None, is_extension=False, extension_scope=None, options=None), _descriptor.FieldDescriptor( name='shortDescription', full_name='CoreML.Specification.NamedParameter.shortDescription', index=1, number=2, type=9, cpp_type=9, label=1, has_default_value=False, default_value=_b("").decode('utf-8'), message_type=None, enum_type=None, containing_type=None, is_extension=False, extension_scope=None, options=None), _descriptor.FieldDescriptor( name='parameter', full_name='CoreML.Specification.NamedParameter.parameter', index=2, number=3, type=11, cpp_type=10, label=1, has_default_value=False, 
default_value=None, message_type=None, enum_type=None, containing_type=None, is_extension=False, extension_scope=None, options=None), ], extensions=[ ], nested_types=[], enum_types=[ ], options=None, is_extendable=False, syntax='proto3', extension_ranges=[], oneofs=[ ], serialized_start=589, serialized_end=697, ) _INT32PARAMETER.fields_by_name['range'].message_type = _INT32RANGE _INT32PARAMETER.fields_by_name['set'].message_type = _INT32SET _INT32PARAMETER.oneofs_by_name['AllowedValues'].fields.append( _INT32PARAMETER.fields_by_name['range']) _INT32PARAMETER.fields_by_name['range'].containing_oneof = _INT32PARAMETER.oneofs_by_name['AllowedValues'] _INT32PARAMETER.oneofs_by_name['AllowedValues'].fields.append( _INT32PARAMETER.fields_by_name['set']) _INT32PARAMETER.fields_by_name['set'].containing_oneof = _INT32PARAMETER.oneofs_by_name['AllowedValues'] _FLOATPARAMETER.fields_by_name['range'].message_type = _FLOATRANGE _FLOATPARAMETER.oneofs_by_name['AllowedValues'].fields.append( _FLOATPARAMETER.fields_by_name['range']) _FLOATPARAMETER.fields_by_name['range'].containing_oneof = _FLOATPARAMETER.oneofs_by_name['AllowedValues'] _PARAMETER.fields_by_name['int32Parameter'].message_type = _INT32PARAMETER _PARAMETER.fields_by_name['floatParameter'].message_type = _FLOATPARAMETER _PARAMETER.oneofs_by_name['Type'].fields.append( _PARAMETER.fields_by_name['int32Parameter']) _PARAMETER.fields_by_name['int32Parameter'].containing_oneof = _PARAMETER.oneofs_by_name['Type'] _PARAMETER.oneofs_by_name['Type'].fields.append( _PARAMETER.fields_by_name['floatParameter']) _PARAMETER.fields_by_name['floatParameter'].containing_oneof = _PARAMETER.oneofs_by_name['Type'] _NAMEDPARAMETER.fields_by_name['parameter'].message_type = _PARAMETER DESCRIPTOR.message_types_by_name['Int32Range'] = _INT32RANGE DESCRIPTOR.message_types_by_name['Int32Set'] = _INT32SET DESCRIPTOR.message_types_by_name['FloatRange'] = _FLOATRANGE DESCRIPTOR.message_types_by_name['Int32Parameter'] = _INT32PARAMETER DESCRIPTOR.message_types_by_name['FloatParameter'] = _FLOATPARAMETER DESCRIPTOR.message_types_by_name['Parameter'] = _PARAMETER DESCRIPTOR.message_types_by_name['NamedParameter'] = _NAMEDPARAMETER _sym_db.RegisterFileDescriptor(DESCRIPTOR) Int32Range = _reflection.GeneratedProtocolMessageType('Int32Range', (_message.Message,), dict( DESCRIPTOR = _INT32RANGE, __module__ = 'NamedParameters_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Int32Range) )) _sym_db.RegisterMessage(Int32Range) Int32Set = _reflection.GeneratedProtocolMessageType('Int32Set', (_message.Message,), dict( DESCRIPTOR = _INT32SET, __module__ = 'NamedParameters_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Int32Set) )) _sym_db.RegisterMessage(Int32Set) FloatRange = _reflection.GeneratedProtocolMessageType('FloatRange', (_message.Message,), dict( DESCRIPTOR = _FLOATRANGE, __module__ = 'NamedParameters_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.FloatRange) )) _sym_db.RegisterMessage(FloatRange) Int32Parameter = _reflection.GeneratedProtocolMessageType('Int32Parameter', (_message.Message,), dict( DESCRIPTOR = _INT32PARAMETER, __module__ = 'NamedParameters_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Int32Parameter) )) _sym_db.RegisterMessage(Int32Parameter) FloatParameter = _reflection.GeneratedProtocolMessageType('FloatParameter', (_message.Message,), dict( DESCRIPTOR = _FLOATPARAMETER, __module__ = 'NamedParameters_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.FloatParameter) 
)) _sym_db.RegisterMessage(FloatParameter) Parameter = _reflection.GeneratedProtocolMessageType('Parameter', (_message.Message,), dict( DESCRIPTOR = _PARAMETER, __module__ = 'NamedParameters_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Parameter) )) _sym_db.RegisterMessage(Parameter) NamedParameter = _reflection.GeneratedProtocolMessageType('NamedParameter', (_message.Message,), dict( DESCRIPTOR = _NAMEDPARAMETER, __module__ = 'NamedParameters_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.NamedParameter) )) _sym_db.RegisterMessage(NamedParameter) DESCRIPTOR.has_options = True DESCRIPTOR._options = _descriptor._ParseOptions(descriptor_pb2.FileOptions(), _b('H\003')) # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/NearestNeighbors_pb2.py0000644000000000000000000001460314672066616022567 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: NearestNeighbors.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() from . import DataStructures_pb2 as DataStructures__pb2 try: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes_pb2 from . import Parameters_pb2 as Parameters__pb2 try: DataStructures__pb2 = Parameters__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = Parameters__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = Parameters__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = Parameters__pb2.FeatureTypes_pb2 from .DataStructures_pb2 import * from .Parameters_pb2 import * DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x16NearestNeighbors.proto\x12\x14\x43oreML.Specification\x1a\x14\x44\x61taStructures.proto\x1a\x10Parameters.proto\"\xb6\x04\n\x1bKNearestNeighborsClassifier\x12J\n\x15nearestNeighborsIndex\x18\x01 \x01(\x0b\x32+.CoreML.Specification.NearestNeighborsIndex\x12?\n\x11numberOfNeighbors\x18\x03 \x01(\x0b\x32$.CoreML.Specification.Int64Parameter\x12?\n\x11stringClassLabels\x18\x64 \x01(\x0b\x32\".CoreML.Specification.StringVectorH\x00\x12=\n\x10int64ClassLabels\x18\x65 \x01(\x0b\x32!.CoreML.Specification.Int64VectorH\x00\x12\x1c\n\x12\x64\x65\x66\x61ultStringLabel\x18n \x01(\tH\x01\x12\x1b\n\x11\x64\x65\x66\x61ultInt64Label\x18o \x01(\x03H\x01\x12\x43\n\x10uniformWeighting\x18\xc8\x01 \x01(\x0b\x32&.CoreML.Specification.UniformWeightingH\x02\x12S\n\x18inverseDistanceWeighting\x18\xd2\x01 \x01(\x0b\x32..CoreML.Specification.InverseDistanceWeightingH\x02\x42\r\n\x0b\x43lassLabelsB\x13\n\x11\x44\x65\x66\x61ultClassLabelB\x11\n\x0fWeightingScheme\"\xe2\x02\n\x15NearestNeighborsIndex\x12\x1a\n\x12numberOfDimensions\x18\x01 \x01(\x05\x12\x37\n\x0c\x66loatSamples\x18\x02 \x03(\x0b\x32!.CoreML.Specification.FloatVector\x12\x38\n\x0blinearIndex\x18\x64 \x01(\x0b\x32!.CoreML.Specification.LinearIndexH\x00\x12\x44\n\x11singleKdTreeIndex\x18n \x01(\x0b\x32\'.CoreML.Specification.SingleKdTreeIndexH\x00\x12S\n\x18squaredEuclideanDistance\x18\xc8\x01 
\x01(\x0b\x32..CoreML.Specification.SquaredEuclideanDistanceH\x01\x42\x0b\n\tIndexTypeB\x12\n\x10\x44istanceFunction\"\x12\n\x10UniformWeighting\"\x1a\n\x18InverseDistanceWeighting\"\r\n\x0bLinearIndex\"%\n\x11SingleKdTreeIndex\x12\x10\n\x08leafSize\x18\x01 \x01(\x05\"\x1a\n\x18SquaredEuclideanDistanceB\x02H\x03P\x00P\x01\x62\x06proto3') _KNEARESTNEIGHBORSCLASSIFIER = DESCRIPTOR.message_types_by_name['KNearestNeighborsClassifier'] _NEARESTNEIGHBORSINDEX = DESCRIPTOR.message_types_by_name['NearestNeighborsIndex'] _UNIFORMWEIGHTING = DESCRIPTOR.message_types_by_name['UniformWeighting'] _INVERSEDISTANCEWEIGHTING = DESCRIPTOR.message_types_by_name['InverseDistanceWeighting'] _LINEARINDEX = DESCRIPTOR.message_types_by_name['LinearIndex'] _SINGLEKDTREEINDEX = DESCRIPTOR.message_types_by_name['SingleKdTreeIndex'] _SQUAREDEUCLIDEANDISTANCE = DESCRIPTOR.message_types_by_name['SquaredEuclideanDistance'] KNearestNeighborsClassifier = _reflection.GeneratedProtocolMessageType('KNearestNeighborsClassifier', (_message.Message,), { 'DESCRIPTOR' : _KNEARESTNEIGHBORSCLASSIFIER, '__module__' : 'NearestNeighbors_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.KNearestNeighborsClassifier) }) _sym_db.RegisterMessage(KNearestNeighborsClassifier) NearestNeighborsIndex = _reflection.GeneratedProtocolMessageType('NearestNeighborsIndex', (_message.Message,), { 'DESCRIPTOR' : _NEARESTNEIGHBORSINDEX, '__module__' : 'NearestNeighbors_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.NearestNeighborsIndex) }) _sym_db.RegisterMessage(NearestNeighborsIndex) UniformWeighting = _reflection.GeneratedProtocolMessageType('UniformWeighting', (_message.Message,), { 'DESCRIPTOR' : _UNIFORMWEIGHTING, '__module__' : 'NearestNeighbors_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.UniformWeighting) }) _sym_db.RegisterMessage(UniformWeighting) InverseDistanceWeighting = _reflection.GeneratedProtocolMessageType('InverseDistanceWeighting', (_message.Message,), { 'DESCRIPTOR' : _INVERSEDISTANCEWEIGHTING, '__module__' : 'NearestNeighbors_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.InverseDistanceWeighting) }) _sym_db.RegisterMessage(InverseDistanceWeighting) LinearIndex = _reflection.GeneratedProtocolMessageType('LinearIndex', (_message.Message,), { 'DESCRIPTOR' : _LINEARINDEX, '__module__' : 'NearestNeighbors_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LinearIndex) }) _sym_db.RegisterMessage(LinearIndex) SingleKdTreeIndex = _reflection.GeneratedProtocolMessageType('SingleKdTreeIndex', (_message.Message,), { 'DESCRIPTOR' : _SINGLEKDTREEINDEX, '__module__' : 'NearestNeighbors_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SingleKdTreeIndex) }) _sym_db.RegisterMessage(SingleKdTreeIndex) SquaredEuclideanDistance = _reflection.GeneratedProtocolMessageType('SquaredEuclideanDistance', (_message.Message,), { 'DESCRIPTOR' : _SQUAREDEUCLIDEANDISTANCE, '__module__' : 'NearestNeighbors_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SquaredEuclideanDistance) }) _sym_db.RegisterMessage(SquaredEuclideanDistance) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _KNEARESTNEIGHBORSCLASSIFIER._serialized_start=89 _KNEARESTNEIGHBORSCLASSIFIER._serialized_end=655 _NEARESTNEIGHBORSINDEX._serialized_start=658 _NEARESTNEIGHBORSINDEX._serialized_end=1012 _UNIFORMWEIGHTING._serialized_start=1014 _UNIFORMWEIGHTING._serialized_end=1032 
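# These offsets locate each message inside the serialized NearestNeighbors.proto
# descriptor and are only applied when the pure-Python descriptor implementation
# is in use. A minimal sketch of populating the k-NN classifier spec with the
# messages registered above, assuming coremltools is importable (field names as
# declared in NearestNeighbors.proto):
#     from coremltools.proto import NearestNeighbors_pb2
#     knn = NearestNeighbors_pb2.KNearestNeighborsClassifier()
#     knn.numberOfNeighbors.defaultValue = 5
#     knn.nearestNeighborsIndex.numberOfDimensions = 4
#     knn.uniformWeighting.SetInParent()   # select the uniform weighting scheme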
_INVERSEDISTANCEWEIGHTING._serialized_start=1034 _INVERSEDISTANCEWEIGHTING._serialized_end=1060 _LINEARINDEX._serialized_start=1062 _LINEARINDEX._serialized_end=1075 _SINGLEKDTREEINDEX._serialized_start=1077 _SINGLEKDTREEINDEX._serialized_end=1114 _SQUAREDEUCLIDEANDISTANCE._serialized_start=1116 _SQUAREDEUCLIDEANDISTANCE._serialized_end=1142 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/NeuralNetwork_pb2.py0000644000000000000000000047653614672066616022146 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: NeuralNetwork.proto """Generated protocol buffer code.""" from google.protobuf.internal import enum_type_wrapper from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() from . import DataStructures_pb2 as DataStructures__pb2 try: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes_pb2 from . import Parameters_pb2 as Parameters__pb2 try: DataStructures__pb2 = Parameters__pb2.DataStructures__pb2 except AttributeError: DataStructures__pb2 = Parameters__pb2.DataStructures_pb2 try: FeatureTypes__pb2 = Parameters__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = Parameters__pb2.FeatureTypes_pb2 from .DataStructures_pb2 import * from .Parameters_pb2 import * DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x13NeuralNetwork.proto\x12\x14\x43oreML.Specification\x1a\x14\x44\x61taStructures.proto\x1a\x10Parameters.proto\"\x88\x03\n\rNeuralNetwork\x12\x38\n\x06layers\x18\x01 \x03(\x0b\x32(.CoreML.Specification.NeuralNetworkLayer\x12G\n\rpreprocessing\x18\x02 \x03(\x0b\x32\x30.CoreML.Specification.NeuralNetworkPreprocessing\x12Y\n\x16\x61rrayInputShapeMapping\x18\x05 \x01(\x0e\x32\x39.CoreML.Specification.NeuralNetworkMultiArrayShapeMapping\x12T\n\x16imageInputShapeMapping\x18\x06 \x01(\x0e\x32\x34.CoreML.Specification.NeuralNetworkImageShapeMapping\x12\x43\n\x0cupdateParams\x18\n \x01(\x0b\x32-.CoreML.Specification.NetworkUpdateParameters\"x\n\x18NeuralNetworkImageScaler\x12\x14\n\x0c\x63hannelScale\x18\n \x01(\x02\x12\x10\n\x08\x62lueBias\x18\x14 \x01(\x02\x12\x11\n\tgreenBias\x18\x15 \x01(\x02\x12\x0f\n\x07redBias\x18\x16 \x01(\x02\x12\x10\n\x08grayBias\x18\x1e \x01(\x02\"+\n\x16NeuralNetworkMeanImage\x12\x11\n\tmeanImage\x18\x01 \x03(\x02\"\xc6\x01\n\x1aNeuralNetworkPreprocessing\x12\x13\n\x0b\x66\x65\x61tureName\x18\x01 \x01(\t\x12@\n\x06scaler\x18\n \x01(\x0b\x32..CoreML.Specification.NeuralNetworkImageScalerH\x00\x12\x41\n\tmeanImage\x18\x0b \x01(\x0b\x32,.CoreML.Specification.NeuralNetworkMeanImageH\x00\x42\x0e\n\x0cpreprocessor\"\x10\n\x0e\x41\x63tivationReLU\"$\n\x13\x41\x63tivationLeakyReLU\x12\r\n\x05\x61lpha\x18\x01 \x01(\x02\"\x10\n\x0e\x41\x63tivationTanh\"3\n\x14\x41\x63tivationScaledTanh\x12\r\n\x05\x61lpha\x18\x01 \x01(\x02\x12\x0c\n\x04\x62\x65ta\x18\x02 \x01(\x02\"\x13\n\x11\x41\x63tivationSigmoid\"/\n\x10\x41\x63tivationLinear\x12\r\n\x05\x61lpha\x18\x01 \x01(\x02\x12\x0c\n\x04\x62\x65ta\x18\x02 \x01(\x02\"4\n\x15\x41\x63tivationSigmoidHard\x12\r\n\x05\x61lpha\x18\x01 
\x01(\x02\x12\x0c\n\x04\x62\x65ta\x18\x02 \x01(\x02\"D\n\x0f\x41\x63tivationPReLU\x12\x31\n\x05\x61lpha\x18\x01 \x01(\x0b\x32\".CoreML.Specification.WeightParams\"\x1e\n\rActivationELU\x12\r\n\x05\x61lpha\x18\x01 \x01(\x02\"*\n\x19\x41\x63tivationThresholdedReLU\x12\r\n\x05\x61lpha\x18\x01 \x01(\x02\"\x14\n\x12\x41\x63tivationSoftsign\"\x14\n\x12\x41\x63tivationSoftplus\"\x83\x01\n\x1c\x41\x63tivationParametricSoftplus\x12\x31\n\x05\x61lpha\x18\x01 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x30\n\x04\x62\x65ta\x18\x02 \x01(\x0b\x32\".CoreML.Specification.WeightParams\"\xd4\x06\n\x10\x41\x63tivationParams\x12\x38\n\x06linear\x18\x05 \x01(\x0b\x32&.CoreML.Specification.ActivationLinearH\x00\x12\x34\n\x04ReLU\x18\n \x01(\x0b\x32$.CoreML.Specification.ActivationReLUH\x00\x12>\n\tleakyReLU\x18\x0f \x01(\x0b\x32).CoreML.Specification.ActivationLeakyReLUH\x00\x12J\n\x0fthresholdedReLU\x18\x14 \x01(\x0b\x32/.CoreML.Specification.ActivationThresholdedReLUH\x00\x12\x36\n\x05PReLU\x18\x19 \x01(\x0b\x32%.CoreML.Specification.ActivationPReLUH\x00\x12\x34\n\x04tanh\x18\x1e \x01(\x0b\x32$.CoreML.Specification.ActivationTanhH\x00\x12@\n\nscaledTanh\x18\x1f \x01(\x0b\x32*.CoreML.Specification.ActivationScaledTanhH\x00\x12:\n\x07sigmoid\x18( \x01(\x0b\x32\'.CoreML.Specification.ActivationSigmoidH\x00\x12\x42\n\x0bsigmoidHard\x18) \x01(\x0b\x32+.CoreML.Specification.ActivationSigmoidHardH\x00\x12\x32\n\x03\x45LU\x18\x32 \x01(\x0b\x32#.CoreML.Specification.ActivationELUH\x00\x12<\n\x08softsign\x18< \x01(\x0b\x32(.CoreML.Specification.ActivationSoftsignH\x00\x12<\n\x08softplus\x18\x46 \x01(\x0b\x32(.CoreML.Specification.ActivationSoftplusH\x00\x12P\n\x12parametricSoftplus\x18G \x01(\x0b\x32\x32.CoreML.Specification.ActivationParametricSoftplusH\x00\x42\x12\n\x10NonlinearityType\"(\n\x06Tensor\x12\x0c\n\x04rank\x18\x01 \x01(\r\x12\x10\n\x08\x64imValue\x18\x02 \x03(\x03\"\xeaU\n\x12NeuralNetworkLayer\x12\x0c\n\x04name\x18\x01 \x01(\t\x12\r\n\x05input\x18\x02 \x03(\t\x12\x0e\n\x06output\x18\x03 \x03(\t\x12\x31\n\x0binputTensor\x18\x04 \x03(\x0b\x32\x1c.CoreML.Specification.Tensor\x12\x32\n\x0coutputTensor\x18\x05 \x03(\x0b\x32\x1c.CoreML.Specification.Tensor\x12\x13\n\x0bisUpdatable\x18\n \x01(\x08\x12\x43\n\x0b\x63onvolution\x18\x64 \x01(\x0b\x32,.CoreML.Specification.ConvolutionLayerParamsH\x00\x12;\n\x07pooling\x18x \x01(\x0b\x32(.CoreML.Specification.PoolingLayerParamsH\x00\x12=\n\nactivation\x18\x82\x01 \x01(\x0b\x32&.CoreML.Specification.ActivationParamsH\x00\x12\x46\n\x0cinnerProduct\x18\x8c\x01 \x01(\x0b\x32-.CoreML.Specification.InnerProductLayerParamsH\x00\x12@\n\tembedding\x18\x96\x01 \x01(\x0b\x32*.CoreML.Specification.EmbeddingLayerParamsH\x00\x12@\n\tbatchnorm\x18\xa0\x01 \x01(\x0b\x32*.CoreML.Specification.BatchnormLayerParamsH\x00\x12\x46\n\x03mvn\x18\xa5\x01 \x01(\x0b\x32\x36.CoreML.Specification.MeanVarianceNormalizeLayerParamsH\x00\x12\x44\n\x0bl2normalize\x18\xaa\x01 \x01(\x0b\x32,.CoreML.Specification.L2NormalizeLayerParamsH\x00\x12<\n\x07softmax\x18\xaf\x01 \x01(\x0b\x32(.CoreML.Specification.SoftmaxLayerParamsH\x00\x12\x34\n\x03lrn\x18\xb4\x01 \x01(\x0b\x32$.CoreML.Specification.LRNLayerParamsH\x00\x12\x36\n\x04\x63rop\x18\xbe\x01 \x01(\x0b\x32%.CoreML.Specification.CropLayerParamsH\x00\x12<\n\x07padding\x18\xc8\x01 \x01(\x0b\x32(.CoreML.Specification.PaddingLayerParamsH\x00\x12>\n\x08upsample\x18\xd2\x01 \x01(\x0b\x32).CoreML.Specification.UpsampleLayerParamsH\x00\x12J\n\x0eresizeBilinear\x18\xd3\x01 
\x01(\x0b\x32/.CoreML.Specification.ResizeBilinearLayerParamsH\x00\x12\x42\n\ncropResize\x18\xd4\x01 \x01(\x0b\x32+.CoreML.Specification.CropResizeLayerParamsH\x00\x12@\n\x05unary\x18\xdc\x01 \x01(\x0b\x32..CoreML.Specification.UnaryFunctionLayerParamsH\x00\x12\x34\n\x03\x61\x64\x64\x18\xe6\x01 \x01(\x0b\x32$.CoreML.Specification.AddLayerParamsH\x00\x12>\n\x08multiply\x18\xe7\x01 \x01(\x0b\x32).CoreML.Specification.MultiplyLayerParamsH\x00\x12<\n\x07\x61verage\x18\xf0\x01 \x01(\x0b\x32(.CoreML.Specification.AverageLayerParamsH\x00\x12\x38\n\x05scale\x18\xf5\x01 \x01(\x0b\x32&.CoreML.Specification.ScaleLayerParamsH\x00\x12\x36\n\x04\x62ias\x18\xfa\x01 \x01(\x0b\x32%.CoreML.Specification.BiasLayerParamsH\x00\x12\x34\n\x03max\x18\x84\x02 \x01(\x0b\x32$.CoreML.Specification.MaxLayerParamsH\x00\x12\x34\n\x03min\x18\x85\x02 \x01(\x0b\x32$.CoreML.Specification.MinLayerParamsH\x00\x12;\n\x03\x64ot\x18\x8e\x02 \x01(\x0b\x32+.CoreML.Specification.DotProductLayerParamsH\x00\x12:\n\x06reduce\x18\x98\x02 \x01(\x0b\x32\'.CoreML.Specification.ReduceLayerParamsH\x00\x12\x46\n\x0cloadConstant\x18\xa2\x02 \x01(\x0b\x32-.CoreML.Specification.LoadConstantLayerParamsH\x00\x12<\n\x07reshape\x18\xac\x02 \x01(\x0b\x32(.CoreML.Specification.ReshapeLayerParamsH\x00\x12<\n\x07\x66latten\x18\xad\x02 \x01(\x0b\x32(.CoreML.Specification.FlattenLayerParamsH\x00\x12<\n\x07permute\x18\xb6\x02 \x01(\x0b\x32(.CoreML.Specification.PermuteLayerParamsH\x00\x12:\n\x06\x63oncat\x18\xc0\x02 \x01(\x0b\x32\'.CoreML.Specification.ConcatLayerParamsH\x00\x12\x38\n\x05split\x18\xca\x02 \x01(\x0b\x32&.CoreML.Specification.SplitLayerParamsH\x00\x12J\n\x0esequenceRepeat\x18\xd4\x02 \x01(\x0b\x32/.CoreML.Specification.SequenceRepeatLayerParamsH\x00\x12J\n\x0ereorganizeData\x18\xd9\x02 \x01(\x0b\x32/.CoreML.Specification.ReorganizeDataLayerParamsH\x00\x12\x38\n\x05slice\x18\xde\x02 \x01(\x0b\x32&.CoreML.Specification.SliceLayerParamsH\x00\x12L\n\x0fsimpleRecurrent\x18\x90\x03 \x01(\x0b\x32\x30.CoreML.Specification.SimpleRecurrentLayerParamsH\x00\x12\x34\n\x03gru\x18\x9a\x03 \x01(\x0b\x32$.CoreML.Specification.GRULayerParamsH\x00\x12R\n\x12uniDirectionalLSTM\x18\xa4\x03 \x01(\x0b\x32\x33.CoreML.Specification.UniDirectionalLSTMLayerParamsH\x00\x12P\n\x11\x62iDirectionalLSTM\x18\xae\x03 \x01(\x0b\x32\x32.CoreML.Specification.BiDirectionalLSTMLayerParamsH\x00\x12:\n\x06\x63ustom\x18\xf4\x03 \x01(\x0b\x32\'.CoreML.Specification.CustomLayerParamsH\x00\x12\x36\n\x04\x63opy\x18\xd8\x04 \x01(\x0b\x32%.CoreML.Specification.CopyLayerParamsH\x00\x12:\n\x06\x62ranch\x18\xdd\x04 \x01(\x0b\x32\'.CoreML.Specification.BranchLayerParamsH\x00\x12\x36\n\x04loop\x18\xe7\x04 \x01(\x0b\x32%.CoreML.Specification.LoopLayerParamsH\x00\x12@\n\tloopBreak\x18\xec\x04 \x01(\x0b\x32*.CoreML.Specification.LoopBreakLayerParamsH\x00\x12\x46\n\x0cloopContinue\x18\xf1\x04 \x01(\x0b\x32-.CoreML.Specification.LoopContinueLayerParamsH\x00\x12\x44\n\x0brangeStatic\x18\xfb\x04 \x01(\x0b\x32,.CoreML.Specification.RangeStaticLayerParamsH\x00\x12\x46\n\x0crangeDynamic\x18\x80\x05 \x01(\x0b\x32-.CoreML.Specification.RangeDynamicLayerParamsH\x00\x12\x36\n\x04\x63lip\x18\x94\x05 \x01(\x0b\x32%.CoreML.Specification.ClipLayerParamsH\x00\x12\x36\n\x04\x63\x65il\x18\x99\x05 \x01(\x0b\x32%.CoreML.Specification.CeilLayerParamsH\x00\x12\x38\n\x05\x66loor\x18\x9e\x05 \x01(\x0b\x32&.CoreML.Specification.FloorLayerParamsH\x00\x12\x36\n\x04sign\x18\xa8\x05 \x01(\x0b\x32%.CoreML.Specification.SignLayerParamsH\x00\x12\x38\n\x05round\x18\xad\x05 
\x01(\x0b\x32&.CoreML.Specification.RoundLayerParamsH\x00\x12\x36\n\x04\x65xp2\x18\xbc\x05 \x01(\x0b\x32%.CoreML.Specification.Exp2LayerParamsH\x00\x12\x34\n\x03sin\x18\xc6\x05 \x01(\x0b\x32$.CoreML.Specification.SinLayerParamsH\x00\x12\x34\n\x03\x63os\x18\xcb\x05 \x01(\x0b\x32$.CoreML.Specification.CosLayerParamsH\x00\x12\x34\n\x03tan\x18\xd0\x05 \x01(\x0b\x32$.CoreML.Specification.TanLayerParamsH\x00\x12\x36\n\x04\x61sin\x18\xda\x05 \x01(\x0b\x32%.CoreML.Specification.AsinLayerParamsH\x00\x12\x36\n\x04\x61\x63os\x18\xdf\x05 \x01(\x0b\x32%.CoreML.Specification.AcosLayerParamsH\x00\x12\x36\n\x04\x61tan\x18\xe4\x05 \x01(\x0b\x32%.CoreML.Specification.AtanLayerParamsH\x00\x12\x36\n\x04sinh\x18\xee\x05 \x01(\x0b\x32%.CoreML.Specification.SinhLayerParamsH\x00\x12\x36\n\x04\x63osh\x18\xf3\x05 \x01(\x0b\x32%.CoreML.Specification.CoshLayerParamsH\x00\x12\x36\n\x04tanh\x18\xf8\x05 \x01(\x0b\x32%.CoreML.Specification.TanhLayerParamsH\x00\x12\x38\n\x05\x61sinh\x18\x82\x06 \x01(\x0b\x32&.CoreML.Specification.AsinhLayerParamsH\x00\x12\x38\n\x05\x61\x63osh\x18\x87\x06 \x01(\x0b\x32&.CoreML.Specification.AcoshLayerParamsH\x00\x12\x38\n\x05\x61tanh\x18\x8c\x06 \x01(\x0b\x32&.CoreML.Specification.AtanhLayerParamsH\x00\x12\x34\n\x03\x65rf\x18\x96\x06 \x01(\x0b\x32$.CoreML.Specification.ErfLayerParamsH\x00\x12\x36\n\x04gelu\x18\x9b\x06 \x01(\x0b\x32%.CoreML.Specification.GeluLayerParamsH\x00\x12\x38\n\x05\x65qual\x18\xaf\x06 \x01(\x0b\x32&.CoreML.Specification.EqualLayerParamsH\x00\x12>\n\x08notEqual\x18\xb4\x06 \x01(\x0b\x32).CoreML.Specification.NotEqualLayerParamsH\x00\x12>\n\x08lessThan\x18\xb9\x06 \x01(\x0b\x32).CoreML.Specification.LessThanLayerParamsH\x00\x12@\n\tlessEqual\x18\xbb\x06 \x01(\x0b\x32*.CoreML.Specification.LessEqualLayerParamsH\x00\x12\x44\n\x0bgreaterThan\x18\xbe\x06 \x01(\x0b\x32,.CoreML.Specification.GreaterThanLayerParamsH\x00\x12\x46\n\x0cgreaterEqual\x18\xc0\x06 \x01(\x0b\x32-.CoreML.Specification.GreaterEqualLayerParamsH\x00\x12@\n\tlogicalOr\x18\xc8\x06 \x01(\x0b\x32*.CoreML.Specification.LogicalOrLayerParamsH\x00\x12\x42\n\nlogicalXor\x18\xcd\x06 \x01(\x0b\x32+.CoreML.Specification.LogicalXorLayerParamsH\x00\x12\x42\n\nlogicalNot\x18\xd2\x06 \x01(\x0b\x32+.CoreML.Specification.LogicalNotLayerParamsH\x00\x12\x42\n\nlogicalAnd\x18\xd7\x06 \x01(\x0b\x32+.CoreML.Specification.LogicalAndLayerParamsH\x00\x12N\n\x10modBroadcastable\x18\xe1\x06 \x01(\x0b\x32\x31.CoreML.Specification.ModBroadcastableLayerParamsH\x00\x12N\n\x10minBroadcastable\x18\xe6\x06 \x01(\x0b\x32\x31.CoreML.Specification.MinBroadcastableLayerParamsH\x00\x12N\n\x10maxBroadcastable\x18\xeb\x06 \x01(\x0b\x32\x31.CoreML.Specification.MaxBroadcastableLayerParamsH\x00\x12N\n\x10\x61\x64\x64\x42roadcastable\x18\xf0\x06 \x01(\x0b\x32\x31.CoreML.Specification.AddBroadcastableLayerParamsH\x00\x12N\n\x10powBroadcastable\x18\xf5\x06 \x01(\x0b\x32\x31.CoreML.Specification.PowBroadcastableLayerParamsH\x00\x12T\n\x13\x64ivideBroadcastable\x18\xfa\x06 \x01(\x0b\x32\x34.CoreML.Specification.DivideBroadcastableLayerParamsH\x00\x12X\n\x15\x66loorDivBroadcastable\x18\xff\x06 \x01(\x0b\x32\x36.CoreML.Specification.FloorDivBroadcastableLayerParamsH\x00\x12X\n\x15multiplyBroadcastable\x18\x84\x07 \x01(\x0b\x32\x36.CoreML.Specification.MultiplyBroadcastableLayerParamsH\x00\x12X\n\x15subtractBroadcastable\x18\x89\x07 \x01(\x0b\x32\x36.CoreML.Specification.SubtractBroadcastableLayerParamsH\x00\x12\x36\n\x04tile\x18\x98\x07 \x01(\x0b\x32%.CoreML.Specification.TileLayerParamsH\x00\x12\x38\n\x05stack\x18\x9d\x07 
\x01(\x0b\x32&.CoreML.Specification.StackLayerParamsH\x00\x12:\n\x06gather\x18\xa2\x07 \x01(\x0b\x32\'.CoreML.Specification.GatherLayerParamsH\x00\x12<\n\x07scatter\x18\xa7\x07 \x01(\x0b\x32(.CoreML.Specification.ScatterLayerParamsH\x00\x12>\n\x08gatherND\x18\xac\x07 \x01(\x0b\x32).CoreML.Specification.GatherNDLayerParamsH\x00\x12@\n\tscatterND\x18\xb1\x07 \x01(\x0b\x32*.CoreML.Specification.ScatterNDLayerParamsH\x00\x12@\n\tsoftmaxND\x18\xb6\x07 \x01(\x0b\x32*.CoreML.Specification.SoftmaxNDLayerParamsH\x00\x12L\n\x0fgatherAlongAxis\x18\xb8\x07 \x01(\x0b\x32\x30.CoreML.Specification.GatherAlongAxisLayerParamsH\x00\x12N\n\x10scatterAlongAxis\x18\xba\x07 \x01(\x0b\x32\x31.CoreML.Specification.ScatterAlongAxisLayerParamsH\x00\x12<\n\x07reverse\x18\xc0\x07 \x01(\x0b\x32(.CoreML.Specification.ReverseLayerParamsH\x00\x12\x42\n\nreverseSeq\x18\xc5\x07 \x01(\x0b\x32+.CoreML.Specification.ReverseSeqLayerParamsH\x00\x12<\n\x07splitND\x18\xcf\x07 \x01(\x0b\x32(.CoreML.Specification.SplitNDLayerParamsH\x00\x12>\n\x08\x63oncatND\x18\xd4\x07 \x01(\x0b\x32).CoreML.Specification.ConcatNDLayerParamsH\x00\x12@\n\ttranspose\x18\xd9\x07 \x01(\x0b\x32*.CoreML.Specification.TransposeLayerParamsH\x00\x12\x44\n\x0bsliceStatic\x18\xe3\x07 \x01(\x0b\x32,.CoreML.Specification.SliceStaticLayerParamsH\x00\x12\x46\n\x0csliceDynamic\x18\xe8\x07 \x01(\x0b\x32-.CoreML.Specification.SliceDynamicLayerParamsH\x00\x12J\n\x0eslidingWindows\x18\xed\x07 \x01(\x0b\x32/.CoreML.Specification.SlidingWindowsLayerParamsH\x00\x12\x36\n\x04topK\x18\xf7\x07 \x01(\x0b\x32%.CoreML.Specification.TopKLayerParamsH\x00\x12:\n\x06\x61rgMin\x18\xfc\x07 \x01(\x0b\x32\'.CoreML.Specification.ArgMinLayerParamsH\x00\x12:\n\x06\x61rgMax\x18\x81\x08 \x01(\x0b\x32\'.CoreML.Specification.ArgMaxLayerParamsH\x00\x12\x44\n\x0b\x65mbeddingND\x18\x90\x08 \x01(\x0b\x32,.CoreML.Specification.EmbeddingNDLayerParamsH\x00\x12H\n\rbatchedMatmul\x18\x95\x08 \x01(\x0b\x32..CoreML.Specification.BatchedMatMulLayerParamsH\x00\x12>\n\x08getShape\x18\xa9\x08 \x01(\x0b\x32).CoreML.Specification.GetShapeLayerParamsH\x00\x12J\n\x0eloadConstantND\x18\xae\x08 \x01(\x0b\x32/.CoreML.Specification.LoadConstantNDLayerParamsH\x00\x12>\n\x08\x66illLike\x18\xb8\x08 \x01(\x0b\x32).CoreML.Specification.FillLikeLayerParamsH\x00\x12\x42\n\nfillStatic\x18\xbd\x08 \x01(\x0b\x32+.CoreML.Specification.FillStaticLayerParamsH\x00\x12\x44\n\x0b\x66illDynamic\x18\xc2\x08 \x01(\x0b\x32,.CoreML.Specification.FillDynamicLayerParamsH\x00\x12L\n\x0f\x62roadcastToLike\x18\xcc\x08 \x01(\x0b\x32\x30.CoreML.Specification.BroadcastToLikeLayerParamsH\x00\x12P\n\x11\x62roadcastToStatic\x18\xd1\x08 \x01(\x0b\x32\x32.CoreML.Specification.BroadcastToStaticLayerParamsH\x00\x12R\n\x12\x62roadcastToDynamic\x18\xd6\x08 \x01(\x0b\x32\x33.CoreML.Specification.BroadcastToDynamicLayerParamsH\x00\x12<\n\x07squeeze\x18\xe0\x08 \x01(\x0b\x32(.CoreML.Specification.SqueezeLayerParamsH\x00\x12\x42\n\nexpandDims\x18\xe5\x08 \x01(\x0b\x32+.CoreML.Specification.ExpandDimsLayerParamsH\x00\x12\x44\n\x0b\x66lattenTo2D\x18\xea\x08 \x01(\x0b\x32,.CoreML.Specification.FlattenTo2DLayerParamsH\x00\x12\x44\n\x0breshapeLike\x18\xef\x08 \x01(\x0b\x32,.CoreML.Specification.ReshapeLikeLayerParamsH\x00\x12H\n\rreshapeStatic\x18\xf4\x08 \x01(\x0b\x32..CoreML.Specification.ReshapeStaticLayerParamsH\x00\x12J\n\x0ereshapeDynamic\x18\xf9\x08 \x01(\x0b\x32/.CoreML.Specification.ReshapeDynamicLayerParamsH\x00\x12X\n\x15rankPreservingReshape\x18\xfe\x08 
\x01(\x0b\x32\x36.CoreML.Specification.RankPreservingReshapeLayerParamsH\x00\x12H\n\x0b\x63onstantPad\x18\x83\t \x01(\x0b\x32\x30.CoreML.Specification.ConstantPaddingLayerParamsH\x00\x12N\n\x10randomNormalLike\x18\x92\t \x01(\x0b\x32\x31.CoreML.Specification.RandomNormalLikeLayerParamsH\x00\x12R\n\x12randomNormalStatic\x18\x97\t \x01(\x0b\x32\x33.CoreML.Specification.RandomNormalStaticLayerParamsH\x00\x12T\n\x13randomNormalDynamic\x18\x9c\t \x01(\x0b\x32\x34.CoreML.Specification.RandomNormalDynamicLayerParamsH\x00\x12P\n\x11randomUniformLike\x18\xa6\t \x01(\x0b\x32\x32.CoreML.Specification.RandomUniformLikeLayerParamsH\x00\x12T\n\x13randomUniformStatic\x18\xab\t \x01(\x0b\x32\x34.CoreML.Specification.RandomUniformStaticLayerParamsH\x00\x12V\n\x14randomUniformDynamic\x18\xb0\t \x01(\x0b\x32\x35.CoreML.Specification.RandomUniformDynamicLayerParamsH\x00\x12T\n\x13randomBernoulliLike\x18\xba\t \x01(\x0b\x32\x34.CoreML.Specification.RandomBernoulliLikeLayerParamsH\x00\x12X\n\x15randomBernoulliStatic\x18\xbf\t \x01(\x0b\x32\x36.CoreML.Specification.RandomBernoulliStaticLayerParamsH\x00\x12Z\n\x16randomBernoulliDynamic\x18\xc4\t \x01(\x0b\x32\x37.CoreML.Specification.RandomBernoulliDynamicLayerParamsH\x00\x12\\\n\x17\x63\x61tegoricalDistribution\x18\xce\t \x01(\x0b\x32\x38.CoreML.Specification.CategoricalDistributionLayerParamsH\x00\x12>\n\x08reduceL1\x18\xe2\t \x01(\x0b\x32).CoreML.Specification.ReduceL1LayerParamsH\x00\x12>\n\x08reduceL2\x18\xe7\t \x01(\x0b\x32).CoreML.Specification.ReduceL2LayerParamsH\x00\x12@\n\treduceMax\x18\xec\t \x01(\x0b\x32*.CoreML.Specification.ReduceMaxLayerParamsH\x00\x12@\n\treduceMin\x18\xf1\t \x01(\x0b\x32*.CoreML.Specification.ReduceMinLayerParamsH\x00\x12@\n\treduceSum\x18\xf6\t \x01(\x0b\x32*.CoreML.Specification.ReduceSumLayerParamsH\x00\x12\x42\n\nreduceProd\x18\xfb\t \x01(\x0b\x32+.CoreML.Specification.ReduceProdLayerParamsH\x00\x12\x42\n\nreduceMean\x18\x80\n \x01(\x0b\x32+.CoreML.Specification.ReduceMeanLayerParamsH\x00\x12\x46\n\x0creduceLogSum\x18\x85\n \x01(\x0b\x32-.CoreML.Specification.ReduceLogSumLayerParamsH\x00\x12L\n\x0freduceSumSquare\x18\x8a\n \x01(\x0b\x32\x30.CoreML.Specification.ReduceSumSquareLayerParamsH\x00\x12L\n\x0freduceLogSumExp\x18\x8f\n \x01(\x0b\x32\x30.CoreML.Specification.ReduceLogSumExpLayerParamsH\x00\x12\x46\n\x0cwhereNonZero\x18\xa1\n \x01(\x0b\x32-.CoreML.Specification.WhereNonZeroLayerParamsH\x00\x12J\n\x0ematrixBandPart\x18\xa3\n \x01(\x0b\x32/.CoreML.Specification.MatrixBandPartLayerParamsH\x00\x12L\n\x0flowerTriangular\x18\xa8\n \x01(\x0b\x32\x30.CoreML.Specification.LowerTriangularLayerParamsH\x00\x12L\n\x0fupperTriangular\x18\xad\n \x01(\x0b\x32\x30.CoreML.Specification.UpperTriangularLayerParamsH\x00\x12R\n\x12whereBroadcastable\x18\xb2\n \x01(\x0b\x32\x33.CoreML.Specification.WhereBroadcastableLayerParamsH\x00\x12R\n\x12layerNormalization\x18\xc6\n \x01(\x0b\x32\x33.CoreML.Specification.LayerNormalizationLayerParamsH\x00\x12X\n\x15NonMaximumSuppression\x18\xf8\n \x01(\x0b\x32\x36.CoreML.Specification.NonMaximumSuppressionLayerParamsH\x00\x12:\n\x06oneHot\x18\xaa\x0b \x01(\x0b\x32\'.CoreML.Specification.OneHotLayerParamsH\x00\x12:\n\x06\x63umSum\x18\xaf\x0b \x01(\x0b\x32\'.CoreML.Specification.CumSumLayerParamsH\x00\x12\x44\n\x0b\x63lampedReLU\x18\xb4\x0b \x01(\x0b\x32,.CoreML.Specification.ClampedReLULayerParamsH\x00\x12<\n\x07\x61rgSort\x18\xb5\x0b \x01(\x0b\x32(.CoreML.Specification.ArgSortLayerParamsH\x00\x12@\n\tpooling3d\x18\xb9\x0b 
\x01(\x0b\x32*.CoreML.Specification.Pooling3DLayerParamsH\x00\x12L\n\x0fglobalPooling3d\x18\xba\x0b \x01(\x0b\x32\x30.CoreML.Specification.GlobalPooling3DLayerParamsH\x00\x12\x44\n\x0bsliceBySize\x18\xbe\x0b \x01(\x0b\x32,.CoreML.Specification.SliceBySizeLayerParamsH\x00\x12H\n\rconvolution3d\x18\xbf\x0b \x01(\x0b\x32..CoreML.Specification.Convolution3DLayerParamsH\x00\x42\x07\n\x05layer\"\x83\x01\n\x11\x42ranchLayerParams\x12\x35\n\x08ifBranch\x18\x01 \x01(\x0b\x32#.CoreML.Specification.NeuralNetwork\x12\x37\n\nelseBranch\x18\x02 \x01(\x0b\x32#.CoreML.Specification.NeuralNetwork\"\xbb\x01\n\x0fLoopLayerParams\x12\x19\n\x11maxLoopIterations\x18\x01 \x01(\x04\x12\x14\n\x0c\x63onditionVar\x18\x02 \x01(\t\x12=\n\x10\x63onditionNetwork\x18\x03 \x01(\x0b\x32#.CoreML.Specification.NeuralNetwork\x12\x38\n\x0b\x62odyNetwork\x18\x04 \x01(\x0b\x32#.CoreML.Specification.NeuralNetwork\"\x16\n\x14LoopBreakLayerParams\"\x19\n\x17LoopContinueLayerParams\"\x11\n\x0f\x43opyLayerParams\"\'\n\x16GreaterThanLayerParams\x12\r\n\x05\x61lpha\x18\x02 \x01(\x02\"(\n\x17GreaterEqualLayerParams\x12\r\n\x05\x61lpha\x18\x02 \x01(\x02\"$\n\x13LessThanLayerParams\x12\r\n\x05\x61lpha\x18\x02 \x01(\x02\"%\n\x14LessEqualLayerParams\x12\r\n\x05\x61lpha\x18\x02 \x01(\x02\"!\n\x10\x45qualLayerParams\x12\r\n\x05\x61lpha\x18\x01 \x01(\x02\"$\n\x13NotEqualLayerParams\x12\r\n\x05\x61lpha\x18\x01 \x01(\x02\"\x17\n\x15LogicalAndLayerParams\"\x16\n\x14LogicalOrLayerParams\"\x17\n\x15LogicalXorLayerParams\"\x17\n\x15LogicalNotLayerParams\"\x8e\x01\n\rBorderAmounts\x12\x44\n\rborderAmounts\x18\n \x03(\x0b\x32-.CoreML.Specification.BorderAmounts.EdgeSizes\x1a\x37\n\tEdgeSizes\x12\x15\n\rstartEdgeSize\x18\x01 \x01(\x04\x12\x13\n\x0b\x65ndEdgeSize\x18\x02 \x01(\x04\"K\n\x0cValidPadding\x12;\n\x0epaddingAmounts\x18\x01 \x01(\x0b\x32#.CoreML.Specification.BorderAmounts\"\x96\x01\n\x0bSamePadding\x12H\n\rasymmetryMode\x18\x01 \x01(\x0e\x32\x31.CoreML.Specification.SamePadding.SamePaddingMode\"=\n\x0fSamePaddingMode\x12\x16\n\x12\x42OTTOM_RIGHT_HEAVY\x10\x00\x12\x12\n\x0eTOP_LEFT_HEAVY\x10\x01\"\xbd\x01\n\x0cSamplingMode\x12\x41\n\x0esamplingMethod\x18\x01 \x01(\x0e\x32).CoreML.Specification.SamplingMode.Method\"j\n\x06Method\x12\x1f\n\x1bSTRICT_ALIGN_ENDPOINTS_MODE\x10\x00\x12\x18\n\x14\x41LIGN_ENDPOINTS_MODE\x10\x01\x12\x11\n\rUPSAMPLE_MODE\x10\x02\x12\x12\n\x0eROI_ALIGN_MODE\x10\x03\"\xd8\x01\n\x12\x42oxCoordinatesMode\x12\x45\n\x07\x62oxMode\x18\x01 \x01(\x0e\x32\x34.CoreML.Specification.BoxCoordinatesMode.Coordinates\"{\n\x0b\x43oordinates\x12\x18\n\x14\x43ORNERS_HEIGHT_FIRST\x10\x00\x12\x17\n\x13\x43ORNERS_WIDTH_FIRST\x10\x01\x12\x1c\n\x18\x43\x45NTER_SIZE_HEIGHT_FIRST\x10\x02\x12\x1b\n\x17\x43\x45NTER_SIZE_WIDTH_FIRST\x10\x03\"\xb5\x01\n\x0cWeightParams\x12\x12\n\nfloatValue\x18\x01 \x03(\x02\x12\x14\n\x0c\x66loat16Value\x18\x02 \x01(\x0c\x12\x10\n\x08rawValue\x18\x1e \x01(\x0c\x12\x14\n\x0cint8RawValue\x18\x1f \x01(\x0c\x12>\n\x0cquantization\x18( \x01(\x0b\x32(.CoreML.Specification.QuantizationParams\x12\x13\n\x0bisUpdatable\x18\x32 \x01(\x08\"\xe4\x01\n\x12QuantizationParams\x12\x14\n\x0cnumberOfBits\x18\x01 \x01(\x04\x12L\n\x12linearQuantization\x18\x65 \x01(\x0b\x32..CoreML.Specification.LinearQuantizationParamsH\x00\x12V\n\x17lookupTableQuantization\x18\x66 \x01(\x0b\x32\x33.CoreML.Specification.LookUpTableQuantizationParamsH\x00\x42\x12\n\x10QuantizationType\"7\n\x18LinearQuantizationParams\x12\r\n\x05scale\x18\x01 \x03(\x02\x12\x0c\n\x04\x62ias\x18\x02 
\x03(\x02\"3\n\x1dLookUpTableQuantizationParams\x12\x12\n\nfloatValue\x18\x01 \x03(\x02\"\xbd\x03\n\x16\x43onvolutionLayerParams\x12\x16\n\x0eoutputChannels\x18\x01 \x01(\x04\x12\x16\n\x0ekernelChannels\x18\x02 \x01(\x04\x12\x0f\n\x07nGroups\x18\n \x01(\x04\x12\x12\n\nkernelSize\x18\x14 \x03(\x04\x12\x0e\n\x06stride\x18\x1e \x03(\x04\x12\x16\n\x0e\x64ilationFactor\x18( \x03(\x04\x12\x33\n\x05valid\x18\x32 \x01(\x0b\x32\".CoreML.Specification.ValidPaddingH\x00\x12\x31\n\x04same\x18\x33 \x01(\x0b\x32!.CoreML.Specification.SamePaddingH\x00\x12\x17\n\x0fisDeconvolution\x18< \x01(\x08\x12\x0f\n\x07hasBias\x18\x46 \x01(\x08\x12\x33\n\x07weights\x18Z \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x30\n\x04\x62ias\x18[ \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x13\n\x0boutputShape\x18\x64 \x03(\x04\x42\x18\n\x16\x43onvolutionPaddingType\"\xec\x05\n\x18\x43onvolution3DLayerParams\x12\x16\n\x0eoutputChannels\x18\x01 \x01(\x05\x12\x15\n\rinputChannels\x18\x02 \x01(\x05\x12\x0f\n\x07nGroups\x18\n \x01(\x05\x12\x13\n\x0bkernelDepth\x18\x14 \x01(\x05\x12\x14\n\x0ckernelHeight\x18\x15 \x01(\x05\x12\x13\n\x0bkernelWidth\x18\x16 \x01(\x05\x12\x13\n\x0bstrideDepth\x18\x1f \x01(\x05\x12\x14\n\x0cstrideHeight\x18 \x01(\x05\x12\x13\n\x0bstrideWidth\x18! \x01(\x05\x12\x15\n\rdilationDepth\x18( \x01(\x05\x12\x16\n\x0e\x64ilationHeight\x18) \x01(\x05\x12\x15\n\rdilationWidth\x18* \x01(\x05\x12\x0f\n\x07hasBias\x18\x32 \x01(\x08\x12\x33\n\x07weights\x18< \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x30\n\x04\x62ias\x18= \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12O\n\x0bpaddingType\x18\x46 \x01(\x0e\x32:.CoreML.Specification.Convolution3DLayerParams.PaddingType\x12\x1a\n\x12\x63ustomPaddingFront\x18P \x01(\x05\x12\x19\n\x11\x63ustomPaddingBack\x18Q \x01(\x05\x12\x18\n\x10\x63ustomPaddingTop\x18R \x01(\x05\x12\x1b\n\x13\x63ustomPaddingBottom\x18S \x01(\x05\x12\x19\n\x11\x63ustomPaddingLeft\x18T \x01(\x05\x12\x1a\n\x12\x63ustomPaddingRight\x18U \x01(\x05\x12\x17\n\x0fisDeconvolution\x18V \x01(\x08\x12\x13\n\x0boutputShape\x18W \x03(\x04\".\n\x0bPaddingType\x12\n\n\x06\x43USTOM\x10\x00\x12\t\n\x05VALID\x10\x01\x12\x08\n\x04SAME\x10\x02\"\xdd\x01\n\x17InnerProductLayerParams\x12\x15\n\rinputChannels\x18\x01 \x01(\x04\x12\x16\n\x0eoutputChannels\x18\x02 \x01(\x04\x12\x0f\n\x07hasBias\x18\n \x01(\x08\x12\x33\n\x07weights\x18\x14 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x30\n\x04\x62ias\x18\x15 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x1b\n\x13int8DynamicQuantize\x18\x16 \x01(\x08\"\xb8\x01\n\x14\x45mbeddingLayerParams\x12\x10\n\x08inputDim\x18\x01 \x01(\x04\x12\x16\n\x0eoutputChannels\x18\x02 \x01(\x04\x12\x0f\n\x07hasBias\x18\n \x01(\x08\x12\x33\n\x07weights\x18\x14 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x30\n\x04\x62ias\x18\x15 \x01(\x0b\x32\".CoreML.Specification.WeightParams\"\xba\x01\n\x16\x45mbeddingNDLayerParams\x12\x11\n\tvocabSize\x18\x01 \x01(\x04\x12\x15\n\rembeddingSize\x18\x02 \x01(\x04\x12\x0f\n\x07hasBias\x18\x03 \x01(\x08\x12\x33\n\x07weights\x18\x14 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x30\n\x04\x62ias\x18\x15 \x01(\x0b\x32\".CoreML.Specification.WeightParams\"\xbd\x02\n\x14\x42\x61tchnormLayerParams\x12\x10\n\x08\x63hannels\x18\x01 \x01(\x04\x12\x16\n\x0e\x63omputeMeanVar\x18\x05 \x01(\x08\x12\x1d\n\x15instanceNormalization\x18\x06 \x01(\x08\x12\x0f\n\x07\x65psilon\x18\n \x01(\x02\x12\x31\n\x05gamma\x18\x0f \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x30\n\x04\x62\x65ta\x18\x10 
\x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x30\n\x04mean\x18\x11 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x34\n\x08variance\x18\x12 \x01(\x0b\x32\".CoreML.Specification.WeightParams\"\xe8\x03\n\x12PoolingLayerParams\x12\x42\n\x04type\x18\x01 \x01(\x0e\x32\x34.CoreML.Specification.PoolingLayerParams.PoolingType\x12\x12\n\nkernelSize\x18\n \x03(\x04\x12\x0e\n\x06stride\x18\x14 \x03(\x04\x12\x33\n\x05valid\x18\x1e \x01(\x0b\x32\".CoreML.Specification.ValidPaddingH\x00\x12\x31\n\x04same\x18\x1f \x01(\x0b\x32!.CoreML.Specification.SamePaddingH\x00\x12Y\n\x10includeLastPixel\x18 \x01(\x0b\x32=.CoreML.Specification.PoolingLayerParams.ValidCompletePaddingH\x00\x12\x1d\n\x15\x61vgPoolExcludePadding\x18\x32 \x01(\x08\x12\x15\n\rglobalPooling\x18< \x01(\x08\x1a.\n\x14ValidCompletePadding\x12\x16\n\x0epaddingAmounts\x18\n \x03(\x04\"+\n\x0bPoolingType\x12\x07\n\x03MAX\x10\x00\x12\x0b\n\x07\x41VERAGE\x10\x01\x12\x06\n\x02L2\x10\x02\x42\x14\n\x12PoolingPaddingType\"\xd6\x04\n\x14Pooling3DLayerParams\x12\x46\n\x04type\x18\x01 \x01(\x0e\x32\x38.CoreML.Specification.Pooling3DLayerParams.PoolingType3D\x12\x13\n\x0bkernelDepth\x18\x02 \x01(\x05\x12\x14\n\x0ckernelHeight\x18\x03 \x01(\x05\x12\x13\n\x0bkernelWidth\x18\x04 \x01(\x05\x12\x13\n\x0bstrideDepth\x18\x05 \x01(\x05\x12\x14\n\x0cstrideHeight\x18\x06 \x01(\x05\x12\x13\n\x0bstrideWidth\x18\x07 \x01(\x05\x12T\n\x0bpaddingType\x18\x0f \x01(\x0e\x32?.CoreML.Specification.Pooling3DLayerParams.Pooling3DPaddingType\x12\x1a\n\x12\x63ustomPaddingFront\x18\x08 \x01(\x05\x12\x19\n\x11\x63ustomPaddingBack\x18\t \x01(\x05\x12\x18\n\x10\x63ustomPaddingTop\x18\n \x01(\x05\x12\x1b\n\x13\x63ustomPaddingBottom\x18\x0b \x01(\x05\x12\x19\n\x11\x63ustomPaddingLeft\x18\x0c \x01(\x05\x12\x1a\n\x12\x63ustomPaddingRight\x18\r \x01(\x05\x12\x1b\n\x13\x63ountExcludePadding\x18\x0e \x01(\x08\"%\n\rPoolingType3D\x12\x07\n\x03MAX\x10\x00\x12\x0b\n\x07\x41VERAGE\x10\x01\"7\n\x14Pooling3DPaddingType\x12\n\n\x06\x43USTOM\x10\x00\x12\t\n\x05VALID\x10\x01\x12\x08\n\x04SAME\x10\x02\"\x9d\x01\n\x1aGlobalPooling3DLayerParams\x12R\n\x04type\x18\x01 \x01(\x0e\x32\x44.CoreML.Specification.GlobalPooling3DLayerParams.GlobalPoolingType3D\"+\n\x13GlobalPoolingType3D\x12\x07\n\x03MAX\x10\x00\x12\x0b\n\x07\x41VERAGE\x10\x01\"\xa1\x03\n\x12PaddingLayerParams\x12L\n\x08\x63onstant\x18\x01 \x01(\x0b\x32\x38.CoreML.Specification.PaddingLayerParams.PaddingConstantH\x00\x12P\n\nreflection\x18\x02 \x01(\x0b\x32:.CoreML.Specification.PaddingLayerParams.PaddingReflectionH\x00\x12R\n\x0breplication\x18\x03 \x01(\x0b\x32;.CoreML.Specification.PaddingLayerParams.PaddingReplicationH\x00\x12;\n\x0epaddingAmounts\x18\n \x01(\x0b\x32#.CoreML.Specification.BorderAmounts\x1a \n\x0fPaddingConstant\x12\r\n\x05value\x18\x01 \x01(\x02\x1a\x13\n\x11PaddingReflection\x1a\x14\n\x12PaddingReplicationB\r\n\x0bPaddingType\"+\n\x11\x43oncatLayerParams\x12\x16\n\x0esequenceConcat\x18\x64 \x01(\x08\"K\n\x0eLRNLayerParams\x12\r\n\x05\x61lpha\x18\x01 \x01(\x02\x12\x0c\n\x04\x62\x65ta\x18\x02 \x01(\x02\x12\x11\n\tlocalSize\x18\x03 \x01(\x04\x12\t\n\x01k\x18\x04 \x01(\x02\"\x14\n\x12SoftmaxLayerParams\"$\n\x10SplitLayerParams\x12\x10\n\x08nOutputs\x18\x01 \x01(\x04\"\x1f\n\x0e\x41\x64\x64LayerParams\x12\r\n\x05\x61lpha\x18\x01 \x01(\x02\"$\n\x13MultiplyLayerParams\x12\r\n\x05\x61lpha\x18\x01 \x01(\x02\"\x84\x02\n\x18UnaryFunctionLayerParams\x12\x46\n\x04type\x18\x01 \x01(\x0e\x32\x38.CoreML.Specification.UnaryFunctionLayerParams.Operation\x12\r\n\x05\x61lpha\x18\x02 \x01(\x02\x12\x0f\n\x07\x65psilon\x18\x03 
\x01(\x02\x12\r\n\x05shift\x18\x04 \x01(\x02\x12\r\n\x05scale\x18\x05 \x01(\x02\"b\n\tOperation\x12\x08\n\x04SQRT\x10\x00\x12\t\n\x05RSQRT\x10\x01\x12\x0b\n\x07INVERSE\x10\x02\x12\t\n\x05POWER\x10\x03\x12\x07\n\x03\x45XP\x10\x04\x12\x07\n\x03LOG\x10\x05\x12\x07\n\x03\x41\x42S\x10\x06\x12\r\n\tTHRESHOLD\x10\x07\"\xf1\x02\n\x13UpsampleLayerParams\x12\x15\n\rscalingFactor\x18\x01 \x03(\x04\x12\x1f\n\x17\x66ractionalScalingFactor\x18\x07 \x03(\x02\x12I\n\x04mode\x18\x05 \x01(\x0e\x32;.CoreML.Specification.UpsampleLayerParams.InterpolationMode\x12X\n\x12linearUpsampleMode\x18\x06 \x01(\x0e\x32<.CoreML.Specification.UpsampleLayerParams.LinearUpsampleMode\")\n\x11InterpolationMode\x12\x06\n\x02NN\x10\x00\x12\x0c\n\x08\x42ILINEAR\x10\x01\"R\n\x12LinearUpsampleMode\x12\x0b\n\x07\x44\x45\x46\x41ULT\x10\x00\x12\x16\n\x12\x41LIGN_CORNERS_TRUE\x10\x01\x12\x17\n\x13\x41LIGN_CORNERS_FALSE\x10\x02\"a\n\x19ResizeBilinearLayerParams\x12\x12\n\ntargetSize\x18\x01 \x03(\x04\x12\x30\n\x04mode\x18\x02 \x01(\x0b\x32\".CoreML.Specification.SamplingMode\"\xd4\x01\n\x15\x43ropResizeLayerParams\x12\x12\n\ntargetSize\x18\x01 \x03(\x04\x12\x1d\n\x15normalizedCoordinates\x18\x02 \x01(\x08\x12\x30\n\x04mode\x18\x03 \x01(\x0b\x32\".CoreML.Specification.SamplingMode\x12@\n\x0e\x62oxIndicesMode\x18\x04 \x01(\x0b\x32(.CoreML.Specification.BoxCoordinatesMode\x12\x14\n\x0cspatialScale\x18\x05 \x01(\x02\"R\n\x0f\x42iasLayerParams\x12\r\n\x05shape\x18\x01 \x03(\x04\x12\x30\n\x04\x62ias\x18\x02 \x01(\x0b\x32\".CoreML.Specification.WeightParams\"\xaf\x01\n\x10ScaleLayerParams\x12\x12\n\nshapeScale\x18\x01 \x03(\x04\x12\x31\n\x05scale\x18\x02 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x0f\n\x07hasBias\x18\x03 \x01(\x08\x12\x11\n\tshapeBias\x18\x04 \x03(\x04\x12\x30\n\x04\x62ias\x18\x05 \x01(\x0b\x32\".CoreML.Specification.WeightParams\"Z\n\x17LoadConstantLayerParams\x12\r\n\x05shape\x18\x01 \x03(\x04\x12\x30\n\x04\x64\x61ta\x18\x02 \x01(\x0b\x32\".CoreML.Specification.WeightParams\")\n\x16L2NormalizeLayerParams\x12\x0f\n\x07\x65psilon\x18\x01 \x01(\x02\"\x8e\x01\n\x12\x46lattenLayerParams\x12\x43\n\x04mode\x18\x01 \x01(\x0e\x32\x35.CoreML.Specification.FlattenLayerParams.FlattenOrder\"3\n\x0c\x46lattenOrder\x12\x11\n\rCHANNEL_FIRST\x10\x00\x12\x10\n\x0c\x43HANNEL_LAST\x10\x01\"\xa3\x01\n\x12ReshapeLayerParams\x12\x13\n\x0btargetShape\x18\x01 \x03(\x03\x12\x43\n\x04mode\x18\x02 \x01(\x0e\x32\x35.CoreML.Specification.ReshapeLayerParams.ReshapeOrder\"3\n\x0cReshapeOrder\x12\x11\n\rCHANNEL_FIRST\x10\x00\x12\x10\n\x0c\x43HANNEL_LAST\x10\x01\"\"\n\x12PermuteLayerParams\x12\x0c\n\x04\x61xis\x18\x01 \x03(\x04\"\xd1\x01\n\x19ReorganizeDataLayerParams\x12P\n\x04mode\x18\x01 \x01(\x0e\x32\x42.CoreML.Specification.ReorganizeDataLayerParams.ReorganizationType\x12\x11\n\tblockSize\x18\x02 \x01(\x04\"O\n\x12ReorganizationType\x12\x12\n\x0eSPACE_TO_DEPTH\x10\x00\x12\x12\n\x0e\x44\x45PTH_TO_SPACE\x10\x01\x12\x11\n\rPIXEL_SHUFFLE\x10\x02\"\xc8\x01\n\x10SliceLayerParams\x12\x12\n\nstartIndex\x18\x01 \x01(\x03\x12\x10\n\x08\x65ndIndex\x18\x02 \x01(\x03\x12\x0e\n\x06stride\x18\x03 \x01(\x04\x12>\n\x04\x61xis\x18\x04 \x01(\x0e\x32\x30.CoreML.Specification.SliceLayerParams.SliceAxis\">\n\tSliceAxis\x12\x10\n\x0c\x43HANNEL_AXIS\x10\x00\x12\x0f\n\x0bHEIGHT_AXIS\x10\x01\x12\x0e\n\nWIDTH_AXIS\x10\x02\"\xd9\x02\n\x11ReduceLayerParams\x12\x45\n\x04mode\x18\x01 \x01(\x0e\x32\x37.CoreML.Specification.ReduceLayerParams.ReduceOperation\x12\x0f\n\x07\x65psilon\x18\x02 \x01(\x02\x12@\n\x04\x61xis\x18\x03 
\x01(\x0e\x32\x32.CoreML.Specification.ReduceLayerParams.ReduceAxis\"v\n\x0fReduceOperation\x12\x07\n\x03SUM\x10\x00\x12\x07\n\x03\x41VG\x10\x01\x12\x08\n\x04PROD\x10\x02\x12\n\n\x06LOGSUM\x10\x03\x12\r\n\tSUMSQUARE\x10\x04\x12\x06\n\x02L1\x10\x05\x12\x06\n\x02L2\x10\x06\x12\x07\n\x03MAX\x10\x07\x12\x07\n\x03MIN\x10\x08\x12\n\n\x06\x41RGMAX\x10\t\"2\n\nReduceAxis\x12\x07\n\x03\x43HW\x10\x00\x12\x06\n\x02HW\x10\x01\x12\x05\n\x01\x43\x10\x02\x12\x05\n\x01H\x10\x03\x12\x05\n\x01W\x10\x04\"[\n\x0f\x43ropLayerParams\x12\x38\n\x0b\x63ropAmounts\x18\x01 \x01(\x0b\x32#.CoreML.Specification.BorderAmounts\x12\x0e\n\x06offset\x18\x05 \x03(\x04\"\x14\n\x12\x41verageLayerParams\"\x10\n\x0eMaxLayerParams\"\x10\n\x0eMinLayerParams\"1\n\x15\x44otProductLayerParams\x12\x18\n\x10\x63osineSimilarity\x18\x01 \x01(\x08\"f\n MeanVarianceNormalizeLayerParams\x12\x16\n\x0e\x61\x63rossChannels\x18\x01 \x01(\x08\x12\x19\n\x11normalizeVariance\x18\x02 \x01(\x08\x12\x0f\n\x07\x65psilon\x18\x03 \x01(\x02\"1\n\x19SequenceRepeatLayerParams\x12\x14\n\x0cnRepetitions\x18\x01 \x01(\x04\"\xff\x02\n\x1aSimpleRecurrentLayerParams\x12\x17\n\x0finputVectorSize\x18\x01 \x01(\x04\x12\x18\n\x10outputVectorSize\x18\x02 \x01(\x04\x12:\n\nactivation\x18\n \x01(\x0b\x32&.CoreML.Specification.ActivationParams\x12\x16\n\x0esequenceOutput\x18\x0f \x01(\x08\x12\x15\n\rhasBiasVector\x18\x14 \x01(\x08\x12\x38\n\x0cweightMatrix\x18\x1e \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12;\n\x0frecursionMatrix\x18\x1f \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x36\n\nbiasVector\x18 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x14\n\x0creverseInput\x18\x64 \x01(\x08\"\xaa\x06\n\x0eGRULayerParams\x12\x17\n\x0finputVectorSize\x18\x01 \x01(\x04\x12\x18\n\x10outputVectorSize\x18\x02 \x01(\x04\x12;\n\x0b\x61\x63tivations\x18\n \x03(\x0b\x32&.CoreML.Specification.ActivationParams\x12\x16\n\x0esequenceOutput\x18\x0f \x01(\x08\x12\x16\n\x0ehasBiasVectors\x18\x14 \x01(\x08\x12\x42\n\x16updateGateWeightMatrix\x18\x1e \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x41\n\x15resetGateWeightMatrix\x18\x1f \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x42\n\x16outputGateWeightMatrix\x18 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x45\n\x19updateGateRecursionMatrix\x18\x32 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x44\n\x18resetGateRecursionMatrix\x18\x33 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x45\n\x19outputGateRecursionMatrix\x18\x34 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12@\n\x14updateGateBiasVector\x18\x46 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12?\n\x13resetGateBiasVector\x18G \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12@\n\x14outputGateBiasVector\x18H \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x14\n\x0creverseInput\x18\x64 \x01(\x08\"\xaa\x01\n\nLSTMParams\x12\x16\n\x0esequenceOutput\x18\n \x01(\x08\x12\x16\n\x0ehasBiasVectors\x18\x14 \x01(\x08\x12\x12\n\nforgetBias\x18\x1e \x01(\x08\x12\x1a\n\x12hasPeepholeVectors\x18( \x01(\x08\x12!\n\x19\x63oupledInputAndForgetGate\x18\x32 \x01(\x08\x12\x19\n\x11\x63\x65llClipThreshold\x18< \x01(\x02\"\x94\x08\n\x10LSTMWeightParams\x12\x41\n\x15inputGateWeightMatrix\x18\x01 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x42\n\x16\x66orgetGateWeightMatrix\x18\x02 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x42\n\x16\x62lockInputWeightMatrix\x18\x03 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x42\n\x16outputGateWeightMatrix\x18\x04 
\x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x44\n\x18inputGateRecursionMatrix\x18\x14 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x45\n\x19\x66orgetGateRecursionMatrix\x18\x15 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x45\n\x19\x62lockInputRecursionMatrix\x18\x16 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x45\n\x19outputGateRecursionMatrix\x18\x17 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12?\n\x13inputGateBiasVector\x18( \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12@\n\x14\x66orgetGateBiasVector\x18) \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12@\n\x14\x62lockInputBiasVector\x18* \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12@\n\x14outputGateBiasVector\x18+ \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x43\n\x17inputGatePeepholeVector\x18< \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x44\n\x18\x66orgetGatePeepholeVector\x18= \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x44\n\x18outputGatePeepholeVector\x18> \x01(\x0b\x32\".CoreML.Specification.WeightParams\"\x95\x02\n\x1dUniDirectionalLSTMLayerParams\x12\x17\n\x0finputVectorSize\x18\x01 \x01(\x04\x12\x18\n\x10outputVectorSize\x18\x02 \x01(\x04\x12;\n\x0b\x61\x63tivations\x18\n \x03(\x0b\x32&.CoreML.Specification.ActivationParams\x12\x30\n\x06params\x18\x0f \x01(\x0b\x32 .CoreML.Specification.LSTMParams\x12<\n\x0cweightParams\x18\x14 \x01(\x0b\x32&.CoreML.Specification.LSTMWeightParams\x12\x14\n\x0creverseInput\x18\x64 \x01(\x08\"\xd2\x02\n\x1c\x42iDirectionalLSTMLayerParams\x12\x17\n\x0finputVectorSize\x18\x01 \x01(\x04\x12\x18\n\x10outputVectorSize\x18\x02 \x01(\x04\x12\x46\n\x16\x61\x63tivationsForwardLSTM\x18\n \x03(\x0b\x32&.CoreML.Specification.ActivationParams\x12G\n\x17\x61\x63tivationsBackwardLSTM\x18\x0b \x03(\x0b\x32&.CoreML.Specification.ActivationParams\x12\x30\n\x06params\x18\x0f \x01(\x0b\x32 .CoreML.Specification.LSTMParams\x12<\n\x0cweightParams\x18\x14 \x03(\x0b\x32&.CoreML.Specification.LSTMWeightParams\"\xbe\x03\n\x11\x43ustomLayerParams\x12\x11\n\tclassName\x18\n \x01(\t\x12\x33\n\x07weights\x18\x14 \x03(\x0b\x32\".CoreML.Specification.WeightParams\x12K\n\nparameters\x18\x1e \x03(\x0b\x32\x37.CoreML.Specification.CustomLayerParams.ParametersEntry\x12\x13\n\x0b\x64\x65scription\x18( \x01(\t\x1a\x8c\x01\n\x15\x43ustomLayerParamValue\x12\x15\n\x0b\x64oubleValue\x18\n \x01(\x01H\x00\x12\x15\n\x0bstringValue\x18\x14 \x01(\tH\x00\x12\x12\n\x08intValue\x18\x1e \x01(\x05H\x00\x12\x13\n\tlongValue\x18( \x01(\x03H\x00\x12\x13\n\tboolValue\x18\x32 \x01(\x08H\x00\x42\x07\n\x05value\x1ap\n\x0fParametersEntry\x12\x0b\n\x03key\x18\x01 \x01(\t\x12L\n\x05value\x18\x02 \x01(\x0b\x32=.CoreML.Specification.CustomLayerParams.CustomLayerParamValue:\x02\x38\x01\"$\n\x14TransposeLayerParams\x12\x0c\n\x04\x61xes\x18\x01 \x03(\x04\"\xa0\x02\n\x18\x42\x61tchedMatMulLayerParams\x12\x12\n\ntransposeA\x18\x01 \x01(\x08\x12\x12\n\ntransposeB\x18\x02 \x01(\x08\x12\"\n\x1aweightMatrixFirstDimension\x18\x05 \x01(\x04\x12#\n\x1bweightMatrixSecondDimension\x18\x06 \x01(\x04\x12\x0f\n\x07hasBias\x18\x07 \x01(\x08\x12\x33\n\x07weights\x18\x08 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x30\n\x04\x62ias\x18\t \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x1b\n\x13int8DynamicQuantize\x18\n \x01(\x08\"7\n\x13\x43oncatNDLayerParams\x12\x0c\n\x04\x61xis\x18\x01 \x01(\x03\x12\x12\n\ninterleave\x18\x02 \x01(\x08\"$\n\x14SoftmaxNDLayerParams\x12\x0c\n\x04\x61xis\x18\x01 \x01(\x03\"(\n\x12ReverseLayerParams\x12\x12\n\nreverseDim\x18\x01 
\x03(\x08\"@\n\x15ReverseSeqLayerParams\x12\x11\n\tbatchAxis\x18\x01 \x01(\x03\x12\x14\n\x0csequenceAxis\x18\x02 \x01(\x03\"\\\n\x19LoadConstantNDLayerParams\x12\r\n\x05shape\x18\x01 \x03(\x04\x12\x30\n\x04\x64\x61ta\x18\x02 \x01(\x0b\x32\".CoreML.Specification.WeightParams\"$\n\x13\x46illLikeLayerParams\x12\r\n\x05value\x18\x01 \x01(\x02\";\n\x15\x46illStaticLayerParams\x12\r\n\x05value\x18\x01 \x01(\x02\x12\x13\n\x0btargetShape\x18\x02 \x03(\x04\"\'\n\x16\x46illDynamicLayerParams\x12\r\n\x05value\x18\x01 \x01(\x02\"\x1f\n\x1dWhereBroadcastableLayerParams\"\x10\n\x0eSinLayerParams\"\x10\n\x0e\x43osLayerParams\"\x10\n\x0eTanLayerParams\"\x11\n\x0f\x41sinLayerParams\"\x11\n\x0f\x41\x63osLayerParams\"\x11\n\x0f\x41tanLayerParams\"\x11\n\x0fSinhLayerParams\"\x11\n\x0f\x43oshLayerParams\"\x11\n\x0fTanhLayerParams\"\x12\n\x10\x41sinhLayerParams\"\x12\n\x10\x41\x63oshLayerParams\"\x12\n\x10\x41tanhLayerParams\"\x1d\n\x1bPowBroadcastableLayerParams\"\x11\n\x0f\x45xp2LayerParams\"\x19\n\x17WhereNonZeroLayerParams\"?\n\x19MatrixBandPartLayerParams\x12\x10\n\x08numLower\x18\x01 \x01(\x03\x12\x10\n\x08numUpper\x18\x02 \x01(\x03\"\'\n\x1aUpperTriangularLayerParams\x12\t\n\x01k\x18\x01 \x01(\x03\"\'\n\x1aLowerTriangularLayerParams\x12\t\n\x01k\x18\x01 \x01(\x03\"\x1c\n\x1a\x42roadcastToLikeLayerParams\"3\n\x1c\x42roadcastToStaticLayerParams\x12\x13\n\x0btargetShape\x18\x01 \x03(\x04\"\x1f\n\x1d\x42roadcastToDynamicLayerParams\"\x1d\n\x1b\x41\x64\x64\x42roadcastableLayerParams\"\x1d\n\x1bMaxBroadcastableLayerParams\"\x1d\n\x1bMinBroadcastableLayerParams\"\x1d\n\x1bModBroadcastableLayerParams\"\"\n FloorDivBroadcastableLayerParams\"\"\n SubtractBroadcastableLayerParams\"\"\n MultiplyBroadcastableLayerParams\" \n\x1e\x44ivideBroadcastableLayerParams\"!\n\x11GatherLayerParams\x12\x0c\n\x04\x61xis\x18\x01 \x01(\x03\"S\n\x12ScatterLayerParams\x12\x0c\n\x04\x61xis\x18\x01 \x01(\x03\x12/\n\x04mode\x18\x02 \x01(\x0e\x32!.CoreML.Specification.ScatterMode\"\x15\n\x13GatherNDLayerParams\"G\n\x14ScatterNDLayerParams\x12/\n\x04mode\x18\x01 \x01(\x0e\x32!.CoreML.Specification.ScatterMode\"*\n\x1aGatherAlongAxisLayerParams\x12\x0c\n\x04\x61xis\x18\x01 \x01(\x03\"\\\n\x1bScatterAlongAxisLayerParams\x12\x0c\n\x04\x61xis\x18\x01 \x01(\x03\x12/\n\x04mode\x18\x02 \x01(\x0e\x32!.CoreML.Specification.ScatterMode\" \n\x10StackLayerParams\x12\x0c\n\x04\x61xis\x18\x01 \x01(\x03\"7\n RankPreservingReshapeLayerParams\x12\x13\n\x0btargetShape\x18\x01 \x03(\x03\"a\n\x1a\x43onstantPaddingLayerParams\x12\r\n\x05value\x18\x01 \x01(\x02\x12\x12\n\npadAmounts\x18\x02 \x03(\x04\x12 \n\x18padToGivenOutputSizeMode\x18\x03 \x01(\x08\"I\n\x1bRandomNormalLikeLayerParams\x12\x0c\n\x04seed\x18\x01 \x01(\x03\x12\x0c\n\x04mean\x18\x02 \x01(\x02\x12\x0e\n\x06stdDev\x18\x03 \x01(\x02\"`\n\x1dRandomNormalStaticLayerParams\x12\x0c\n\x04seed\x18\x01 \x01(\x03\x12\x0c\n\x04mean\x18\x02 \x01(\x02\x12\x0e\n\x06stdDev\x18\x03 \x01(\x02\x12\x13\n\x0boutputShape\x18\x04 \x03(\x04\"L\n\x1eRandomNormalDynamicLayerParams\x12\x0c\n\x04seed\x18\x01 \x01(\x03\x12\x0c\n\x04mean\x18\x02 \x01(\x02\x12\x0e\n\x06stdDev\x18\x03 \x01(\x02\"L\n\x1cRandomUniformLikeLayerParams\x12\x0c\n\x04seed\x18\x01 \x01(\x03\x12\x0e\n\x06minVal\x18\x02 \x01(\x02\x12\x0e\n\x06maxVal\x18\x03 \x01(\x02\"c\n\x1eRandomUniformStaticLayerParams\x12\x0c\n\x04seed\x18\x01 \x01(\x03\x12\x0e\n\x06minVal\x18\x02 \x01(\x02\x12\x0e\n\x06maxVal\x18\x03 \x01(\x02\x12\x13\n\x0boutputShape\x18\x04 \x03(\x04\"O\n\x1fRandomUniformDynamicLayerParams\x12\x0c\n\x04seed\x18\x01 
\x01(\x03\x12\x0e\n\x06minVal\x18\x02 \x01(\x02\x12\x0e\n\x06maxVal\x18\x03 \x01(\x02\"<\n\x1eRandomBernoulliLikeLayerParams\x12\x0c\n\x04seed\x18\x01 \x01(\x03\x12\x0c\n\x04prob\x18\x02 \x01(\x02\"S\n RandomBernoulliStaticLayerParams\x12\x0c\n\x04seed\x18\x01 \x01(\x03\x12\x0c\n\x04prob\x18\x02 \x01(\x02\x12\x13\n\x0boutputShape\x18\x03 \x03(\x04\"?\n!RandomBernoulliDynamicLayerParams\x12\x0c\n\x04seed\x18\x01 \x01(\x03\x12\x0c\n\x04prob\x18\x02 \x01(\x02\"z\n\"CategoricalDistributionLayerParams\x12\x0c\n\x04seed\x18\x01 \x01(\x03\x12\x12\n\nnumSamples\x18\x02 \x01(\x03\x12\x10\n\x08isLogits\x18\x03 \x01(\x08\x12\x0b\n\x03\x65ps\x18\x04 \x01(\x02\x12\x13\n\x0btemperature\x18\x05 \x01(\x02\"H\n\x13ReduceL1LayerParams\x12\x0c\n\x04\x61xes\x18\x01 \x03(\x03\x12\x10\n\x08keepDims\x18\x02 \x01(\x08\x12\x11\n\treduceAll\x18\x03 \x01(\x08\"H\n\x13ReduceL2LayerParams\x12\x0c\n\x04\x61xes\x18\x01 \x03(\x03\x12\x10\n\x08keepDims\x18\x02 \x01(\x08\x12\x11\n\treduceAll\x18\x03 \x01(\x08\"I\n\x14ReduceMaxLayerParams\x12\x0c\n\x04\x61xes\x18\x01 \x03(\x03\x12\x10\n\x08keepDims\x18\x02 \x01(\x08\x12\x11\n\treduceAll\x18\x03 \x01(\x08\"I\n\x14ReduceMinLayerParams\x12\x0c\n\x04\x61xes\x18\x01 \x03(\x03\x12\x10\n\x08keepDims\x18\x02 \x01(\x08\x12\x11\n\treduceAll\x18\x03 \x01(\x08\"I\n\x14ReduceSumLayerParams\x12\x0c\n\x04\x61xes\x18\x01 \x03(\x03\x12\x10\n\x08keepDims\x18\x02 \x01(\x08\x12\x11\n\treduceAll\x18\x03 \x01(\x08\"J\n\x15ReduceProdLayerParams\x12\x0c\n\x04\x61xes\x18\x01 \x03(\x03\x12\x10\n\x08keepDims\x18\x02 \x01(\x08\x12\x11\n\treduceAll\x18\x03 \x01(\x08\"J\n\x15ReduceMeanLayerParams\x12\x0c\n\x04\x61xes\x18\x01 \x03(\x03\x12\x10\n\x08keepDims\x18\x02 \x01(\x08\x12\x11\n\treduceAll\x18\x03 \x01(\x08\"L\n\x17ReduceLogSumLayerParams\x12\x0c\n\x04\x61xes\x18\x01 \x03(\x03\x12\x10\n\x08keepDims\x18\x02 \x01(\x08\x12\x11\n\treduceAll\x18\x03 \x01(\x08\"O\n\x1aReduceSumSquareLayerParams\x12\x0c\n\x04\x61xes\x18\x01 \x03(\x03\x12\x10\n\x08keepDims\x18\x02 \x01(\x08\x12\x11\n\treduceAll\x18\x03 \x01(\x08\"O\n\x1aReduceLogSumExpLayerParams\x12\x0c\n\x04\x61xes\x18\x01 \x03(\x03\x12\x10\n\x08keepDims\x18\x02 \x01(\x08\x12\x11\n\treduceAll\x18\x03 \x01(\x08\"%\n\x15\x45xpandDimsLayerParams\x12\x0c\n\x04\x61xes\x18\x01 \x03(\x03\"&\n\x16\x46lattenTo2DLayerParams\x12\x0c\n\x04\x61xis\x18\x01 \x01(\x03\"/\n\x18ReshapeStaticLayerParams\x12\x13\n\x0btargetShape\x18\x01 \x03(\x03\"\x18\n\x16ReshapeLikeLayerParams\"\x1b\n\x19ReshapeDynamicLayerParams\"6\n\x12SqueezeLayerParams\x12\x0c\n\x04\x61xes\x18\x01 \x03(\x03\x12\x12\n\nsqueezeAll\x18\x02 \x01(\x08\">\n\x0fTopKLayerParams\x12\x0c\n\x04\x61xis\x18\x01 \x01(\x03\x12\t\n\x01K\x18\x02 \x01(\x04\x12\x12\n\nuseBottomK\x18\x03 \x01(\x08\"4\n\x11\x41rgMaxLayerParams\x12\x0c\n\x04\x61xis\x18\x01 \x01(\x03\x12\x11\n\tremoveDim\x18\x02 \x01(\x08\"4\n\x11\x41rgMinLayerParams\x12\x0c\n\x04\x61xis\x18\x01 \x01(\x03\x12\x11\n\tremoveDim\x18\x02 \x01(\x08\"I\n\x12SplitNDLayerParams\x12\x0c\n\x04\x61xis\x18\x01 \x01(\x03\x12\x11\n\tnumSplits\x18\x02 \x01(\x04\x12\x12\n\nsplitSizes\x18\x03 \x03(\x04\"\x11\n\x0f\x43\x65ilLayerParams\"\x12\n\x10RoundLayerParams\"\x12\n\x10\x46loorLayerParams\"\x11\n\x0fSignLayerParams\"1\n\x0f\x43lipLayerParams\x12\x0e\n\x06minVal\x18\x01 \x01(\x02\x12\x0e\n\x06maxVal\x18\x02 \x01(\x02\"\x87\x01\n\x16SliceStaticLayerParams\x12\x10\n\x08\x62\x65ginIds\x18\x01 \x03(\x03\x12\x12\n\nbeginMasks\x18\x02 \x03(\x08\x12\x0e\n\x06\x65ndIds\x18\x03 \x03(\x03\x12\x10\n\x08\x65ndMasks\x18\x04 \x03(\x08\x12\x0f\n\x07strides\x18\x05 
\x03(\x03\x12\x14\n\x0csqueezeMasks\x18\x06 \x03(\x08\"v\n\x17SliceDynamicLayerParams\x12\x12\n\nbeginMasks\x18\x02 \x03(\x08\x12\x0e\n\x06\x65ndIds\x18\x03 \x03(\x03\x12\x10\n\x08\x65ndMasks\x18\x04 \x03(\x08\x12\x0f\n\x07strides\x18\x05 \x03(\x03\x12\x14\n\x0csqueezeMasks\x18\x06 \x03(\x08\"\x1f\n\x0fTileLayerParams\x12\x0c\n\x04reps\x18\x01 \x03(\x04\"\x15\n\x13GetShapeLayerParams\"\x10\n\x0e\x45rfLayerParams\"\x99\x01\n\x0fGeluLayerParams\x12<\n\x04mode\x18\x01 \x01(\x0e\x32..CoreML.Specification.GeluLayerParams.GeluMode\"H\n\x08GeluMode\x12\t\n\x05\x45XACT\x10\x00\x12\x16\n\x12TANH_APPROXIMATION\x10\x01\x12\x19\n\x15SIGMOID_APPROXIMATION\x10\x02\"U\n\x16RangeStaticLayerParams\x12\x10\n\x08\x65ndValue\x18\x01 \x01(\x02\x12\x12\n\nstartValue\x18\x02 \x01(\x02\x12\x15\n\rstepSizeValue\x18\x03 \x01(\x02\"D\n\x17RangeDynamicLayerParams\x12\x12\n\nstartValue\x18\x02 \x01(\x02\x12\x15\n\rstepSizeValue\x18\x03 \x01(\x02\"K\n\x19SlidingWindowsLayerParams\x12\x0c\n\x04\x61xis\x18\x01 \x01(\x03\x12\x12\n\nwindowSize\x18\x02 \x01(\x04\x12\x0c\n\x04step\x18\x03 \x01(\x04\"\xaa\x01\n\x1dLayerNormalizationLayerParams\x12\x17\n\x0fnormalizedShape\x18\x01 \x03(\x03\x12\x0b\n\x03\x65ps\x18\x02 \x01(\x02\x12\x31\n\x05gamma\x18\x03 \x01(\x0b\x32\".CoreML.Specification.WeightParams\x12\x30\n\x04\x62\x65ta\x18\x04 \x01(\x0b\x32\".CoreML.Specification.WeightParams\"\x7f\n NonMaximumSuppressionLayerParams\x12\x14\n\x0ciouThreshold\x18\x01 \x01(\x02\x12\x16\n\x0escoreThreshold\x18\x02 \x01(\x02\x12\x10\n\x08maxBoxes\x18\x03 \x01(\x04\x12\x1b\n\x13perClassSuppression\x18\x04 \x01(\x08\"5\n\x16\x43lampedReLULayerParams\x12\r\n\x05\x61lpha\x18\x01 \x01(\x02\x12\x0c\n\x04\x62\x65ta\x18\x02 \x01(\x02\"6\n\x12\x41rgSortLayerParams\x12\x0c\n\x04\x61xis\x18\x01 \x01(\x03\x12\x12\n\ndescending\x18\x02 \x01(\x08\"4\n\x16SliceBySizeLayerParams\x12\x0c\n\x04size\x18\x02 \x01(\x03\x12\x0c\n\x04\x61xis\x18\x03 \x01(\x03\"\xc5\x04\n\x17NeuralNetworkClassifier\x12\x38\n\x06layers\x18\x01 \x03(\x0b\x32(.CoreML.Specification.NeuralNetworkLayer\x12G\n\rpreprocessing\x18\x02 \x03(\x0b\x32\x30.CoreML.Specification.NeuralNetworkPreprocessing\x12Y\n\x16\x61rrayInputShapeMapping\x18\x05 \x01(\x0e\x32\x39.CoreML.Specification.NeuralNetworkMultiArrayShapeMapping\x12T\n\x16imageInputShapeMapping\x18\x06 \x01(\x0e\x32\x34.CoreML.Specification.NeuralNetworkImageShapeMapping\x12\x43\n\x0cupdateParams\x18\n \x01(\x0b\x32-.CoreML.Specification.NetworkUpdateParameters\x12?\n\x11stringClassLabels\x18\x64 \x01(\x0b\x32\".CoreML.Specification.StringVectorH\x00\x12=\n\x10int64ClassLabels\x18\x65 \x01(\x0b\x32!.CoreML.Specification.Int64VectorH\x00\x12\"\n\x19labelProbabilityLayerName\x18\xc8\x01 \x01(\tB\r\n\x0b\x43lassLabels\"^\n\x11OneHotLayerParams\x12\x18\n\x10oneHotVectorSize\x18\x01 \x01(\x04\x12\x0c\n\x04\x61xis\x18\x02 \x01(\x03\x12\x0f\n\x07onValue\x18\x03 \x01(\x02\x12\x10\n\x08offValue\x18\x04 \x01(\x02\"K\n\x11\x43umSumLayerParams\x12\x0c\n\x04\x61xis\x18\x01 \x01(\x03\x12\x17\n\x0f\x65xcludeFinalSum\x18\x02 \x01(\x08\x12\x0f\n\x07reverse\x18\x03 \x01(\x08\"\x91\x03\n\x16NeuralNetworkRegressor\x12\x38\n\x06layers\x18\x01 \x03(\x0b\x32(.CoreML.Specification.NeuralNetworkLayer\x12G\n\rpreprocessing\x18\x02 \x03(\x0b\x32\x30.CoreML.Specification.NeuralNetworkPreprocessing\x12Y\n\x16\x61rrayInputShapeMapping\x18\x05 \x01(\x0e\x32\x39.CoreML.Specification.NeuralNetworkMultiArrayShapeMapping\x12T\n\x16imageInputShapeMapping\x18\x06 \x01(\x0e\x32\x34.CoreML.Specification.NeuralNetworkImageShapeMapping\x12\x43\n\x0cupdateParams\x18\n 
\x01(\x0b\x32-.CoreML.Specification.NetworkUpdateParameters\"\xa2\x02\n\x17NetworkUpdateParameters\x12\x33\n\nlossLayers\x18\x01 \x03(\x0b\x32\x1f.CoreML.Specification.LossLayer\x12\x32\n\toptimizer\x18\x02 \x01(\x0b\x32\x1f.CoreML.Specification.Optimizer\x12\x34\n\x06\x65pochs\x18\x03 \x01(\x0b\x32$.CoreML.Specification.Int64Parameter\x12\x34\n\x07shuffle\x18\n \x01(\x0b\x32#.CoreML.Specification.BoolParameter\x12\x32\n\x04seed\x18\x14 \x01(\x0b\x32$.CoreML.Specification.Int64Parameter\"\xe4\x01\n\tLossLayer\x12\x0c\n\x04name\x18\x01 \x01(\t\x12\x62\n categoricalCrossEntropyLossLayer\x18\n \x01(\x0b\x32\x36.CoreML.Specification.CategoricalCrossEntropyLossLayerH\x00\x12T\n\x19meanSquaredErrorLossLayer\x18\x0b \x01(\x0b\x32/.CoreML.Specification.MeanSquaredErrorLossLayerH\x00\x42\x0f\n\rLossLayerType\"A\n CategoricalCrossEntropyLossLayer\x12\r\n\x05input\x18\x01 \x01(\t\x12\x0e\n\x06target\x18\x02 \x01(\t\":\n\x19MeanSquaredErrorLossLayer\x12\r\n\x05input\x18\x01 \x01(\t\x12\x0e\n\x06target\x18\x02 \x01(\t\"\x96\x01\n\tOptimizer\x12:\n\x0csgdOptimizer\x18\n \x01(\x0b\x32\".CoreML.Specification.SGDOptimizerH\x00\x12<\n\radamOptimizer\x18\x0b \x01(\x0b\x32#.CoreML.Specification.AdamOptimizerH\x00\x42\x0f\n\rOptimizerType\"\xc1\x01\n\x0cSGDOptimizer\x12;\n\x0clearningRate\x18\x01 \x01(\x0b\x32%.CoreML.Specification.DoubleParameter\x12;\n\rminiBatchSize\x18\x02 \x01(\x0b\x32$.CoreML.Specification.Int64Parameter\x12\x37\n\x08momentum\x18\x03 \x01(\x0b\x32%.CoreML.Specification.DoubleParameter\"\xa9\x02\n\rAdamOptimizer\x12;\n\x0clearningRate\x18\x01 \x01(\x0b\x32%.CoreML.Specification.DoubleParameter\x12;\n\rminiBatchSize\x18\x02 \x01(\x0b\x32$.CoreML.Specification.Int64Parameter\x12\x34\n\x05\x62\x65ta1\x18\x03 \x01(\x0b\x32%.CoreML.Specification.DoubleParameter\x12\x34\n\x05\x62\x65ta2\x18\x04 \x01(\x0b\x32%.CoreML.Specification.DoubleParameter\x12\x32\n\x03\x65ps\x18\x05 \x01(\x0b\x32%.CoreML.Specification.DoubleParameter*W\n#NeuralNetworkMultiArrayShapeMapping\x12\x17\n\x13RANK5_ARRAY_MAPPING\x10\x00\x12\x17\n\x13\x45XACT_ARRAY_MAPPING\x10\x01*R\n\x1eNeuralNetworkImageShapeMapping\x12\x17\n\x13RANK5_IMAGE_MAPPING\x10\x00\x12\x17\n\x13RANK4_IMAGE_MAPPING\x10\x01*\x87\x01\n\x0bScatterMode\x12\x12\n\x0eSCATTER_UPDATE\x10\x00\x12\x0f\n\x0bSCATTER_ADD\x10\x01\x12\x0f\n\x0bSCATTER_SUB\x10\x02\x12\x0f\n\x0bSCATTER_MUL\x10\x03\x12\x0f\n\x0bSCATTER_DIV\x10\x04\x12\x0f\n\x0bSCATTER_MAX\x10\x05\x12\x0f\n\x0bSCATTER_MIN\x10\x06\x42\x02H\x03P\x00P\x01\x62\x06proto3') _NEURALNETWORKMULTIARRAYSHAPEMAPPING = DESCRIPTOR.enum_types_by_name['NeuralNetworkMultiArrayShapeMapping'] NeuralNetworkMultiArrayShapeMapping = enum_type_wrapper.EnumTypeWrapper(_NEURALNETWORKMULTIARRAYSHAPEMAPPING) _NEURALNETWORKIMAGESHAPEMAPPING = DESCRIPTOR.enum_types_by_name['NeuralNetworkImageShapeMapping'] NeuralNetworkImageShapeMapping = enum_type_wrapper.EnumTypeWrapper(_NEURALNETWORKIMAGESHAPEMAPPING) _SCATTERMODE = DESCRIPTOR.enum_types_by_name['ScatterMode'] ScatterMode = enum_type_wrapper.EnumTypeWrapper(_SCATTERMODE) RANK5_ARRAY_MAPPING = 0 EXACT_ARRAY_MAPPING = 1 RANK5_IMAGE_MAPPING = 0 RANK4_IMAGE_MAPPING = 1 SCATTER_UPDATE = 0 SCATTER_ADD = 1 SCATTER_SUB = 2 SCATTER_MUL = 3 SCATTER_DIV = 4 SCATTER_MAX = 5 SCATTER_MIN = 6 _NEURALNETWORK = DESCRIPTOR.message_types_by_name['NeuralNetwork'] _NEURALNETWORKIMAGESCALER = DESCRIPTOR.message_types_by_name['NeuralNetworkImageScaler'] _NEURALNETWORKMEANIMAGE = DESCRIPTOR.message_types_by_name['NeuralNetworkMeanImage'] _NEURALNETWORKPREPROCESSING = 
DESCRIPTOR.message_types_by_name['NeuralNetworkPreprocessing'] _ACTIVATIONRELU = DESCRIPTOR.message_types_by_name['ActivationReLU'] _ACTIVATIONLEAKYRELU = DESCRIPTOR.message_types_by_name['ActivationLeakyReLU'] _ACTIVATIONTANH = DESCRIPTOR.message_types_by_name['ActivationTanh'] _ACTIVATIONSCALEDTANH = DESCRIPTOR.message_types_by_name['ActivationScaledTanh'] _ACTIVATIONSIGMOID = DESCRIPTOR.message_types_by_name['ActivationSigmoid'] _ACTIVATIONLINEAR = DESCRIPTOR.message_types_by_name['ActivationLinear'] _ACTIVATIONSIGMOIDHARD = DESCRIPTOR.message_types_by_name['ActivationSigmoidHard'] _ACTIVATIONPRELU = DESCRIPTOR.message_types_by_name['ActivationPReLU'] _ACTIVATIONELU = DESCRIPTOR.message_types_by_name['ActivationELU'] _ACTIVATIONTHRESHOLDEDRELU = DESCRIPTOR.message_types_by_name['ActivationThresholdedReLU'] _ACTIVATIONSOFTSIGN = DESCRIPTOR.message_types_by_name['ActivationSoftsign'] _ACTIVATIONSOFTPLUS = DESCRIPTOR.message_types_by_name['ActivationSoftplus'] _ACTIVATIONPARAMETRICSOFTPLUS = DESCRIPTOR.message_types_by_name['ActivationParametricSoftplus'] _ACTIVATIONPARAMS = DESCRIPTOR.message_types_by_name['ActivationParams'] _TENSOR = DESCRIPTOR.message_types_by_name['Tensor'] _NEURALNETWORKLAYER = DESCRIPTOR.message_types_by_name['NeuralNetworkLayer'] _BRANCHLAYERPARAMS = DESCRIPTOR.message_types_by_name['BranchLayerParams'] _LOOPLAYERPARAMS = DESCRIPTOR.message_types_by_name['LoopLayerParams'] _LOOPBREAKLAYERPARAMS = DESCRIPTOR.message_types_by_name['LoopBreakLayerParams'] _LOOPCONTINUELAYERPARAMS = DESCRIPTOR.message_types_by_name['LoopContinueLayerParams'] _COPYLAYERPARAMS = DESCRIPTOR.message_types_by_name['CopyLayerParams'] _GREATERTHANLAYERPARAMS = DESCRIPTOR.message_types_by_name['GreaterThanLayerParams'] _GREATEREQUALLAYERPARAMS = DESCRIPTOR.message_types_by_name['GreaterEqualLayerParams'] _LESSTHANLAYERPARAMS = DESCRIPTOR.message_types_by_name['LessThanLayerParams'] _LESSEQUALLAYERPARAMS = DESCRIPTOR.message_types_by_name['LessEqualLayerParams'] _EQUALLAYERPARAMS = DESCRIPTOR.message_types_by_name['EqualLayerParams'] _NOTEQUALLAYERPARAMS = DESCRIPTOR.message_types_by_name['NotEqualLayerParams'] _LOGICALANDLAYERPARAMS = DESCRIPTOR.message_types_by_name['LogicalAndLayerParams'] _LOGICALORLAYERPARAMS = DESCRIPTOR.message_types_by_name['LogicalOrLayerParams'] _LOGICALXORLAYERPARAMS = DESCRIPTOR.message_types_by_name['LogicalXorLayerParams'] _LOGICALNOTLAYERPARAMS = DESCRIPTOR.message_types_by_name['LogicalNotLayerParams'] _BORDERAMOUNTS = DESCRIPTOR.message_types_by_name['BorderAmounts'] _BORDERAMOUNTS_EDGESIZES = _BORDERAMOUNTS.nested_types_by_name['EdgeSizes'] _VALIDPADDING = DESCRIPTOR.message_types_by_name['ValidPadding'] _SAMEPADDING = DESCRIPTOR.message_types_by_name['SamePadding'] _SAMPLINGMODE = DESCRIPTOR.message_types_by_name['SamplingMode'] _BOXCOORDINATESMODE = DESCRIPTOR.message_types_by_name['BoxCoordinatesMode'] _WEIGHTPARAMS = DESCRIPTOR.message_types_by_name['WeightParams'] _QUANTIZATIONPARAMS = DESCRIPTOR.message_types_by_name['QuantizationParams'] _LINEARQUANTIZATIONPARAMS = DESCRIPTOR.message_types_by_name['LinearQuantizationParams'] _LOOKUPTABLEQUANTIZATIONPARAMS = DESCRIPTOR.message_types_by_name['LookUpTableQuantizationParams'] _CONVOLUTIONLAYERPARAMS = DESCRIPTOR.message_types_by_name['ConvolutionLayerParams'] _CONVOLUTION3DLAYERPARAMS = DESCRIPTOR.message_types_by_name['Convolution3DLayerParams'] _INNERPRODUCTLAYERPARAMS = DESCRIPTOR.message_types_by_name['InnerProductLayerParams'] _EMBEDDINGLAYERPARAMS = 
DESCRIPTOR.message_types_by_name['EmbeddingLayerParams'] _EMBEDDINGNDLAYERPARAMS = DESCRIPTOR.message_types_by_name['EmbeddingNDLayerParams'] _BATCHNORMLAYERPARAMS = DESCRIPTOR.message_types_by_name['BatchnormLayerParams'] _POOLINGLAYERPARAMS = DESCRIPTOR.message_types_by_name['PoolingLayerParams'] _POOLINGLAYERPARAMS_VALIDCOMPLETEPADDING = _POOLINGLAYERPARAMS.nested_types_by_name['ValidCompletePadding'] _POOLING3DLAYERPARAMS = DESCRIPTOR.message_types_by_name['Pooling3DLayerParams'] _GLOBALPOOLING3DLAYERPARAMS = DESCRIPTOR.message_types_by_name['GlobalPooling3DLayerParams'] _PADDINGLAYERPARAMS = DESCRIPTOR.message_types_by_name['PaddingLayerParams'] _PADDINGLAYERPARAMS_PADDINGCONSTANT = _PADDINGLAYERPARAMS.nested_types_by_name['PaddingConstant'] _PADDINGLAYERPARAMS_PADDINGREFLECTION = _PADDINGLAYERPARAMS.nested_types_by_name['PaddingReflection'] _PADDINGLAYERPARAMS_PADDINGREPLICATION = _PADDINGLAYERPARAMS.nested_types_by_name['PaddingReplication'] _CONCATLAYERPARAMS = DESCRIPTOR.message_types_by_name['ConcatLayerParams'] _LRNLAYERPARAMS = DESCRIPTOR.message_types_by_name['LRNLayerParams'] _SOFTMAXLAYERPARAMS = DESCRIPTOR.message_types_by_name['SoftmaxLayerParams'] _SPLITLAYERPARAMS = DESCRIPTOR.message_types_by_name['SplitLayerParams'] _ADDLAYERPARAMS = DESCRIPTOR.message_types_by_name['AddLayerParams'] _MULTIPLYLAYERPARAMS = DESCRIPTOR.message_types_by_name['MultiplyLayerParams'] _UNARYFUNCTIONLAYERPARAMS = DESCRIPTOR.message_types_by_name['UnaryFunctionLayerParams'] _UPSAMPLELAYERPARAMS = DESCRIPTOR.message_types_by_name['UpsampleLayerParams'] _RESIZEBILINEARLAYERPARAMS = DESCRIPTOR.message_types_by_name['ResizeBilinearLayerParams'] _CROPRESIZELAYERPARAMS = DESCRIPTOR.message_types_by_name['CropResizeLayerParams'] _BIASLAYERPARAMS = DESCRIPTOR.message_types_by_name['BiasLayerParams'] _SCALELAYERPARAMS = DESCRIPTOR.message_types_by_name['ScaleLayerParams'] _LOADCONSTANTLAYERPARAMS = DESCRIPTOR.message_types_by_name['LoadConstantLayerParams'] _L2NORMALIZELAYERPARAMS = DESCRIPTOR.message_types_by_name['L2NormalizeLayerParams'] _FLATTENLAYERPARAMS = DESCRIPTOR.message_types_by_name['FlattenLayerParams'] _RESHAPELAYERPARAMS = DESCRIPTOR.message_types_by_name['ReshapeLayerParams'] _PERMUTELAYERPARAMS = DESCRIPTOR.message_types_by_name['PermuteLayerParams'] _REORGANIZEDATALAYERPARAMS = DESCRIPTOR.message_types_by_name['ReorganizeDataLayerParams'] _SLICELAYERPARAMS = DESCRIPTOR.message_types_by_name['SliceLayerParams'] _REDUCELAYERPARAMS = DESCRIPTOR.message_types_by_name['ReduceLayerParams'] _CROPLAYERPARAMS = DESCRIPTOR.message_types_by_name['CropLayerParams'] _AVERAGELAYERPARAMS = DESCRIPTOR.message_types_by_name['AverageLayerParams'] _MAXLAYERPARAMS = DESCRIPTOR.message_types_by_name['MaxLayerParams'] _MINLAYERPARAMS = DESCRIPTOR.message_types_by_name['MinLayerParams'] _DOTPRODUCTLAYERPARAMS = DESCRIPTOR.message_types_by_name['DotProductLayerParams'] _MEANVARIANCENORMALIZELAYERPARAMS = DESCRIPTOR.message_types_by_name['MeanVarianceNormalizeLayerParams'] _SEQUENCEREPEATLAYERPARAMS = DESCRIPTOR.message_types_by_name['SequenceRepeatLayerParams'] _SIMPLERECURRENTLAYERPARAMS = DESCRIPTOR.message_types_by_name['SimpleRecurrentLayerParams'] _GRULAYERPARAMS = DESCRIPTOR.message_types_by_name['GRULayerParams'] _LSTMPARAMS = DESCRIPTOR.message_types_by_name['LSTMParams'] _LSTMWEIGHTPARAMS = DESCRIPTOR.message_types_by_name['LSTMWeightParams'] _UNIDIRECTIONALLSTMLAYERPARAMS = DESCRIPTOR.message_types_by_name['UniDirectionalLSTMLayerParams'] _BIDIRECTIONALLSTMLAYERPARAMS = 
DESCRIPTOR.message_types_by_name['BiDirectionalLSTMLayerParams'] _CUSTOMLAYERPARAMS = DESCRIPTOR.message_types_by_name['CustomLayerParams'] _CUSTOMLAYERPARAMS_CUSTOMLAYERPARAMVALUE = _CUSTOMLAYERPARAMS.nested_types_by_name['CustomLayerParamValue'] _CUSTOMLAYERPARAMS_PARAMETERSENTRY = _CUSTOMLAYERPARAMS.nested_types_by_name['ParametersEntry'] _TRANSPOSELAYERPARAMS = DESCRIPTOR.message_types_by_name['TransposeLayerParams'] _BATCHEDMATMULLAYERPARAMS = DESCRIPTOR.message_types_by_name['BatchedMatMulLayerParams'] _CONCATNDLAYERPARAMS = DESCRIPTOR.message_types_by_name['ConcatNDLayerParams'] _SOFTMAXNDLAYERPARAMS = DESCRIPTOR.message_types_by_name['SoftmaxNDLayerParams'] _REVERSELAYERPARAMS = DESCRIPTOR.message_types_by_name['ReverseLayerParams'] _REVERSESEQLAYERPARAMS = DESCRIPTOR.message_types_by_name['ReverseSeqLayerParams'] _LOADCONSTANTNDLAYERPARAMS = DESCRIPTOR.message_types_by_name['LoadConstantNDLayerParams'] _FILLLIKELAYERPARAMS = DESCRIPTOR.message_types_by_name['FillLikeLayerParams'] _FILLSTATICLAYERPARAMS = DESCRIPTOR.message_types_by_name['FillStaticLayerParams'] _FILLDYNAMICLAYERPARAMS = DESCRIPTOR.message_types_by_name['FillDynamicLayerParams'] _WHEREBROADCASTABLELAYERPARAMS = DESCRIPTOR.message_types_by_name['WhereBroadcastableLayerParams'] _SINLAYERPARAMS = DESCRIPTOR.message_types_by_name['SinLayerParams'] _COSLAYERPARAMS = DESCRIPTOR.message_types_by_name['CosLayerParams'] _TANLAYERPARAMS = DESCRIPTOR.message_types_by_name['TanLayerParams'] _ASINLAYERPARAMS = DESCRIPTOR.message_types_by_name['AsinLayerParams'] _ACOSLAYERPARAMS = DESCRIPTOR.message_types_by_name['AcosLayerParams'] _ATANLAYERPARAMS = DESCRIPTOR.message_types_by_name['AtanLayerParams'] _SINHLAYERPARAMS = DESCRIPTOR.message_types_by_name['SinhLayerParams'] _COSHLAYERPARAMS = DESCRIPTOR.message_types_by_name['CoshLayerParams'] _TANHLAYERPARAMS = DESCRIPTOR.message_types_by_name['TanhLayerParams'] _ASINHLAYERPARAMS = DESCRIPTOR.message_types_by_name['AsinhLayerParams'] _ACOSHLAYERPARAMS = DESCRIPTOR.message_types_by_name['AcoshLayerParams'] _ATANHLAYERPARAMS = DESCRIPTOR.message_types_by_name['AtanhLayerParams'] _POWBROADCASTABLELAYERPARAMS = DESCRIPTOR.message_types_by_name['PowBroadcastableLayerParams'] _EXP2LAYERPARAMS = DESCRIPTOR.message_types_by_name['Exp2LayerParams'] _WHERENONZEROLAYERPARAMS = DESCRIPTOR.message_types_by_name['WhereNonZeroLayerParams'] _MATRIXBANDPARTLAYERPARAMS = DESCRIPTOR.message_types_by_name['MatrixBandPartLayerParams'] _UPPERTRIANGULARLAYERPARAMS = DESCRIPTOR.message_types_by_name['UpperTriangularLayerParams'] _LOWERTRIANGULARLAYERPARAMS = DESCRIPTOR.message_types_by_name['LowerTriangularLayerParams'] _BROADCASTTOLIKELAYERPARAMS = DESCRIPTOR.message_types_by_name['BroadcastToLikeLayerParams'] _BROADCASTTOSTATICLAYERPARAMS = DESCRIPTOR.message_types_by_name['BroadcastToStaticLayerParams'] _BROADCASTTODYNAMICLAYERPARAMS = DESCRIPTOR.message_types_by_name['BroadcastToDynamicLayerParams'] _ADDBROADCASTABLELAYERPARAMS = DESCRIPTOR.message_types_by_name['AddBroadcastableLayerParams'] _MAXBROADCASTABLELAYERPARAMS = DESCRIPTOR.message_types_by_name['MaxBroadcastableLayerParams'] _MINBROADCASTABLELAYERPARAMS = DESCRIPTOR.message_types_by_name['MinBroadcastableLayerParams'] _MODBROADCASTABLELAYERPARAMS = DESCRIPTOR.message_types_by_name['ModBroadcastableLayerParams'] _FLOORDIVBROADCASTABLELAYERPARAMS = DESCRIPTOR.message_types_by_name['FloorDivBroadcastableLayerParams'] _SUBTRACTBROADCASTABLELAYERPARAMS = DESCRIPTOR.message_types_by_name['SubtractBroadcastableLayerParams'] 
_MULTIPLYBROADCASTABLELAYERPARAMS = DESCRIPTOR.message_types_by_name['MultiplyBroadcastableLayerParams'] _DIVIDEBROADCASTABLELAYERPARAMS = DESCRIPTOR.message_types_by_name['DivideBroadcastableLayerParams'] _GATHERLAYERPARAMS = DESCRIPTOR.message_types_by_name['GatherLayerParams'] _SCATTERLAYERPARAMS = DESCRIPTOR.message_types_by_name['ScatterLayerParams'] _GATHERNDLAYERPARAMS = DESCRIPTOR.message_types_by_name['GatherNDLayerParams'] _SCATTERNDLAYERPARAMS = DESCRIPTOR.message_types_by_name['ScatterNDLayerParams'] _GATHERALONGAXISLAYERPARAMS = DESCRIPTOR.message_types_by_name['GatherAlongAxisLayerParams'] _SCATTERALONGAXISLAYERPARAMS = DESCRIPTOR.message_types_by_name['ScatterAlongAxisLayerParams'] _STACKLAYERPARAMS = DESCRIPTOR.message_types_by_name['StackLayerParams'] _RANKPRESERVINGRESHAPELAYERPARAMS = DESCRIPTOR.message_types_by_name['RankPreservingReshapeLayerParams'] _CONSTANTPADDINGLAYERPARAMS = DESCRIPTOR.message_types_by_name['ConstantPaddingLayerParams'] _RANDOMNORMALLIKELAYERPARAMS = DESCRIPTOR.message_types_by_name['RandomNormalLikeLayerParams'] _RANDOMNORMALSTATICLAYERPARAMS = DESCRIPTOR.message_types_by_name['RandomNormalStaticLayerParams'] _RANDOMNORMALDYNAMICLAYERPARAMS = DESCRIPTOR.message_types_by_name['RandomNormalDynamicLayerParams'] _RANDOMUNIFORMLIKELAYERPARAMS = DESCRIPTOR.message_types_by_name['RandomUniformLikeLayerParams'] _RANDOMUNIFORMSTATICLAYERPARAMS = DESCRIPTOR.message_types_by_name['RandomUniformStaticLayerParams'] _RANDOMUNIFORMDYNAMICLAYERPARAMS = DESCRIPTOR.message_types_by_name['RandomUniformDynamicLayerParams'] _RANDOMBERNOULLILIKELAYERPARAMS = DESCRIPTOR.message_types_by_name['RandomBernoulliLikeLayerParams'] _RANDOMBERNOULLISTATICLAYERPARAMS = DESCRIPTOR.message_types_by_name['RandomBernoulliStaticLayerParams'] _RANDOMBERNOULLIDYNAMICLAYERPARAMS = DESCRIPTOR.message_types_by_name['RandomBernoulliDynamicLayerParams'] _CATEGORICALDISTRIBUTIONLAYERPARAMS = DESCRIPTOR.message_types_by_name['CategoricalDistributionLayerParams'] _REDUCEL1LAYERPARAMS = DESCRIPTOR.message_types_by_name['ReduceL1LayerParams'] _REDUCEL2LAYERPARAMS = DESCRIPTOR.message_types_by_name['ReduceL2LayerParams'] _REDUCEMAXLAYERPARAMS = DESCRIPTOR.message_types_by_name['ReduceMaxLayerParams'] _REDUCEMINLAYERPARAMS = DESCRIPTOR.message_types_by_name['ReduceMinLayerParams'] _REDUCESUMLAYERPARAMS = DESCRIPTOR.message_types_by_name['ReduceSumLayerParams'] _REDUCEPRODLAYERPARAMS = DESCRIPTOR.message_types_by_name['ReduceProdLayerParams'] _REDUCEMEANLAYERPARAMS = DESCRIPTOR.message_types_by_name['ReduceMeanLayerParams'] _REDUCELOGSUMLAYERPARAMS = DESCRIPTOR.message_types_by_name['ReduceLogSumLayerParams'] _REDUCESUMSQUARELAYERPARAMS = DESCRIPTOR.message_types_by_name['ReduceSumSquareLayerParams'] _REDUCELOGSUMEXPLAYERPARAMS = DESCRIPTOR.message_types_by_name['ReduceLogSumExpLayerParams'] _EXPANDDIMSLAYERPARAMS = DESCRIPTOR.message_types_by_name['ExpandDimsLayerParams'] _FLATTENTO2DLAYERPARAMS = DESCRIPTOR.message_types_by_name['FlattenTo2DLayerParams'] _RESHAPESTATICLAYERPARAMS = DESCRIPTOR.message_types_by_name['ReshapeStaticLayerParams'] _RESHAPELIKELAYERPARAMS = DESCRIPTOR.message_types_by_name['ReshapeLikeLayerParams'] _RESHAPEDYNAMICLAYERPARAMS = DESCRIPTOR.message_types_by_name['ReshapeDynamicLayerParams'] _SQUEEZELAYERPARAMS = DESCRIPTOR.message_types_by_name['SqueezeLayerParams'] _TOPKLAYERPARAMS = DESCRIPTOR.message_types_by_name['TopKLayerParams'] _ARGMAXLAYERPARAMS = DESCRIPTOR.message_types_by_name['ArgMaxLayerParams'] _ARGMINLAYERPARAMS = 
DESCRIPTOR.message_types_by_name['ArgMinLayerParams'] _SPLITNDLAYERPARAMS = DESCRIPTOR.message_types_by_name['SplitNDLayerParams'] _CEILLAYERPARAMS = DESCRIPTOR.message_types_by_name['CeilLayerParams'] _ROUNDLAYERPARAMS = DESCRIPTOR.message_types_by_name['RoundLayerParams'] _FLOORLAYERPARAMS = DESCRIPTOR.message_types_by_name['FloorLayerParams'] _SIGNLAYERPARAMS = DESCRIPTOR.message_types_by_name['SignLayerParams'] _CLIPLAYERPARAMS = DESCRIPTOR.message_types_by_name['ClipLayerParams'] _SLICESTATICLAYERPARAMS = DESCRIPTOR.message_types_by_name['SliceStaticLayerParams'] _SLICEDYNAMICLAYERPARAMS = DESCRIPTOR.message_types_by_name['SliceDynamicLayerParams'] _TILELAYERPARAMS = DESCRIPTOR.message_types_by_name['TileLayerParams'] _GETSHAPELAYERPARAMS = DESCRIPTOR.message_types_by_name['GetShapeLayerParams'] _ERFLAYERPARAMS = DESCRIPTOR.message_types_by_name['ErfLayerParams'] _GELULAYERPARAMS = DESCRIPTOR.message_types_by_name['GeluLayerParams'] _RANGESTATICLAYERPARAMS = DESCRIPTOR.message_types_by_name['RangeStaticLayerParams'] _RANGEDYNAMICLAYERPARAMS = DESCRIPTOR.message_types_by_name['RangeDynamicLayerParams'] _SLIDINGWINDOWSLAYERPARAMS = DESCRIPTOR.message_types_by_name['SlidingWindowsLayerParams'] _LAYERNORMALIZATIONLAYERPARAMS = DESCRIPTOR.message_types_by_name['LayerNormalizationLayerParams'] _NONMAXIMUMSUPPRESSIONLAYERPARAMS = DESCRIPTOR.message_types_by_name['NonMaximumSuppressionLayerParams'] _CLAMPEDRELULAYERPARAMS = DESCRIPTOR.message_types_by_name['ClampedReLULayerParams'] _ARGSORTLAYERPARAMS = DESCRIPTOR.message_types_by_name['ArgSortLayerParams'] _SLICEBYSIZELAYERPARAMS = DESCRIPTOR.message_types_by_name['SliceBySizeLayerParams'] _NEURALNETWORKCLASSIFIER = DESCRIPTOR.message_types_by_name['NeuralNetworkClassifier'] _ONEHOTLAYERPARAMS = DESCRIPTOR.message_types_by_name['OneHotLayerParams'] _CUMSUMLAYERPARAMS = DESCRIPTOR.message_types_by_name['CumSumLayerParams'] _NEURALNETWORKREGRESSOR = DESCRIPTOR.message_types_by_name['NeuralNetworkRegressor'] _NETWORKUPDATEPARAMETERS = DESCRIPTOR.message_types_by_name['NetworkUpdateParameters'] _LOSSLAYER = DESCRIPTOR.message_types_by_name['LossLayer'] _CATEGORICALCROSSENTROPYLOSSLAYER = DESCRIPTOR.message_types_by_name['CategoricalCrossEntropyLossLayer'] _MEANSQUAREDERRORLOSSLAYER = DESCRIPTOR.message_types_by_name['MeanSquaredErrorLossLayer'] _OPTIMIZER = DESCRIPTOR.message_types_by_name['Optimizer'] _SGDOPTIMIZER = DESCRIPTOR.message_types_by_name['SGDOptimizer'] _ADAMOPTIMIZER = DESCRIPTOR.message_types_by_name['AdamOptimizer'] _SAMEPADDING_SAMEPADDINGMODE = _SAMEPADDING.enum_types_by_name['SamePaddingMode'] _SAMPLINGMODE_METHOD = _SAMPLINGMODE.enum_types_by_name['Method'] _BOXCOORDINATESMODE_COORDINATES = _BOXCOORDINATESMODE.enum_types_by_name['Coordinates'] _CONVOLUTION3DLAYERPARAMS_PADDINGTYPE = _CONVOLUTION3DLAYERPARAMS.enum_types_by_name['PaddingType'] _POOLINGLAYERPARAMS_POOLINGTYPE = _POOLINGLAYERPARAMS.enum_types_by_name['PoolingType'] _POOLING3DLAYERPARAMS_POOLINGTYPE3D = _POOLING3DLAYERPARAMS.enum_types_by_name['PoolingType3D'] _POOLING3DLAYERPARAMS_POOLING3DPADDINGTYPE = _POOLING3DLAYERPARAMS.enum_types_by_name['Pooling3DPaddingType'] _GLOBALPOOLING3DLAYERPARAMS_GLOBALPOOLINGTYPE3D = _GLOBALPOOLING3DLAYERPARAMS.enum_types_by_name['GlobalPoolingType3D'] _UNARYFUNCTIONLAYERPARAMS_OPERATION = _UNARYFUNCTIONLAYERPARAMS.enum_types_by_name['Operation'] _UPSAMPLELAYERPARAMS_INTERPOLATIONMODE = _UPSAMPLELAYERPARAMS.enum_types_by_name['InterpolationMode'] _UPSAMPLELAYERPARAMS_LINEARUPSAMPLEMODE = 
_UPSAMPLELAYERPARAMS.enum_types_by_name['LinearUpsampleMode'] _FLATTENLAYERPARAMS_FLATTENORDER = _FLATTENLAYERPARAMS.enum_types_by_name['FlattenOrder'] _RESHAPELAYERPARAMS_RESHAPEORDER = _RESHAPELAYERPARAMS.enum_types_by_name['ReshapeOrder'] _REORGANIZEDATALAYERPARAMS_REORGANIZATIONTYPE = _REORGANIZEDATALAYERPARAMS.enum_types_by_name['ReorganizationType'] _SLICELAYERPARAMS_SLICEAXIS = _SLICELAYERPARAMS.enum_types_by_name['SliceAxis'] _REDUCELAYERPARAMS_REDUCEOPERATION = _REDUCELAYERPARAMS.enum_types_by_name['ReduceOperation'] _REDUCELAYERPARAMS_REDUCEAXIS = _REDUCELAYERPARAMS.enum_types_by_name['ReduceAxis'] _GELULAYERPARAMS_GELUMODE = _GELULAYERPARAMS.enum_types_by_name['GeluMode'] NeuralNetwork = _reflection.GeneratedProtocolMessageType('NeuralNetwork', (_message.Message,), { 'DESCRIPTOR' : _NEURALNETWORK, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.NeuralNetwork) }) _sym_db.RegisterMessage(NeuralNetwork) NeuralNetworkImageScaler = _reflection.GeneratedProtocolMessageType('NeuralNetworkImageScaler', (_message.Message,), { 'DESCRIPTOR' : _NEURALNETWORKIMAGESCALER, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.NeuralNetworkImageScaler) }) _sym_db.RegisterMessage(NeuralNetworkImageScaler) NeuralNetworkMeanImage = _reflection.GeneratedProtocolMessageType('NeuralNetworkMeanImage', (_message.Message,), { 'DESCRIPTOR' : _NEURALNETWORKMEANIMAGE, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.NeuralNetworkMeanImage) }) _sym_db.RegisterMessage(NeuralNetworkMeanImage) NeuralNetworkPreprocessing = _reflection.GeneratedProtocolMessageType('NeuralNetworkPreprocessing', (_message.Message,), { 'DESCRIPTOR' : _NEURALNETWORKPREPROCESSING, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.NeuralNetworkPreprocessing) }) _sym_db.RegisterMessage(NeuralNetworkPreprocessing) ActivationReLU = _reflection.GeneratedProtocolMessageType('ActivationReLU', (_message.Message,), { 'DESCRIPTOR' : _ACTIVATIONRELU, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ActivationReLU) }) _sym_db.RegisterMessage(ActivationReLU) ActivationLeakyReLU = _reflection.GeneratedProtocolMessageType('ActivationLeakyReLU', (_message.Message,), { 'DESCRIPTOR' : _ACTIVATIONLEAKYRELU, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ActivationLeakyReLU) }) _sym_db.RegisterMessage(ActivationLeakyReLU) ActivationTanh = _reflection.GeneratedProtocolMessageType('ActivationTanh', (_message.Message,), { 'DESCRIPTOR' : _ACTIVATIONTANH, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ActivationTanh) }) _sym_db.RegisterMessage(ActivationTanh) ActivationScaledTanh = _reflection.GeneratedProtocolMessageType('ActivationScaledTanh', (_message.Message,), { 'DESCRIPTOR' : _ACTIVATIONSCALEDTANH, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ActivationScaledTanh) }) _sym_db.RegisterMessage(ActivationScaledTanh) ActivationSigmoid = _reflection.GeneratedProtocolMessageType('ActivationSigmoid', (_message.Message,), { 'DESCRIPTOR' : _ACTIVATIONSIGMOID, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ActivationSigmoid) }) _sym_db.RegisterMessage(ActivationSigmoid) ActivationLinear = _reflection.GeneratedProtocolMessageType('ActivationLinear', 
(_message.Message,), { 'DESCRIPTOR' : _ACTIVATIONLINEAR, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ActivationLinear) }) _sym_db.RegisterMessage(ActivationLinear) ActivationSigmoidHard = _reflection.GeneratedProtocolMessageType('ActivationSigmoidHard', (_message.Message,), { 'DESCRIPTOR' : _ACTIVATIONSIGMOIDHARD, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ActivationSigmoidHard) }) _sym_db.RegisterMessage(ActivationSigmoidHard) ActivationPReLU = _reflection.GeneratedProtocolMessageType('ActivationPReLU', (_message.Message,), { 'DESCRIPTOR' : _ACTIVATIONPRELU, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ActivationPReLU) }) _sym_db.RegisterMessage(ActivationPReLU) ActivationELU = _reflection.GeneratedProtocolMessageType('ActivationELU', (_message.Message,), { 'DESCRIPTOR' : _ACTIVATIONELU, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ActivationELU) }) _sym_db.RegisterMessage(ActivationELU) ActivationThresholdedReLU = _reflection.GeneratedProtocolMessageType('ActivationThresholdedReLU', (_message.Message,), { 'DESCRIPTOR' : _ACTIVATIONTHRESHOLDEDRELU, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ActivationThresholdedReLU) }) _sym_db.RegisterMessage(ActivationThresholdedReLU) ActivationSoftsign = _reflection.GeneratedProtocolMessageType('ActivationSoftsign', (_message.Message,), { 'DESCRIPTOR' : _ACTIVATIONSOFTSIGN, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ActivationSoftsign) }) _sym_db.RegisterMessage(ActivationSoftsign) ActivationSoftplus = _reflection.GeneratedProtocolMessageType('ActivationSoftplus', (_message.Message,), { 'DESCRIPTOR' : _ACTIVATIONSOFTPLUS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ActivationSoftplus) }) _sym_db.RegisterMessage(ActivationSoftplus) ActivationParametricSoftplus = _reflection.GeneratedProtocolMessageType('ActivationParametricSoftplus', (_message.Message,), { 'DESCRIPTOR' : _ACTIVATIONPARAMETRICSOFTPLUS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ActivationParametricSoftplus) }) _sym_db.RegisterMessage(ActivationParametricSoftplus) ActivationParams = _reflection.GeneratedProtocolMessageType('ActivationParams', (_message.Message,), { 'DESCRIPTOR' : _ACTIVATIONPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ActivationParams) }) _sym_db.RegisterMessage(ActivationParams) Tensor = _reflection.GeneratedProtocolMessageType('Tensor', (_message.Message,), { 'DESCRIPTOR' : _TENSOR, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Tensor) }) _sym_db.RegisterMessage(Tensor) NeuralNetworkLayer = _reflection.GeneratedProtocolMessageType('NeuralNetworkLayer', (_message.Message,), { 'DESCRIPTOR' : _NEURALNETWORKLAYER, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.NeuralNetworkLayer) }) _sym_db.RegisterMessage(NeuralNetworkLayer) BranchLayerParams = _reflection.GeneratedProtocolMessageType('BranchLayerParams', (_message.Message,), { 'DESCRIPTOR' : _BRANCHLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.BranchLayerParams) }) _sym_db.RegisterMessage(BranchLayerParams) 
LoopLayerParams = _reflection.GeneratedProtocolMessageType('LoopLayerParams', (_message.Message,), { 'DESCRIPTOR' : _LOOPLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LoopLayerParams) }) _sym_db.RegisterMessage(LoopLayerParams) LoopBreakLayerParams = _reflection.GeneratedProtocolMessageType('LoopBreakLayerParams', (_message.Message,), { 'DESCRIPTOR' : _LOOPBREAKLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LoopBreakLayerParams) }) _sym_db.RegisterMessage(LoopBreakLayerParams) LoopContinueLayerParams = _reflection.GeneratedProtocolMessageType('LoopContinueLayerParams', (_message.Message,), { 'DESCRIPTOR' : _LOOPCONTINUELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LoopContinueLayerParams) }) _sym_db.RegisterMessage(LoopContinueLayerParams) CopyLayerParams = _reflection.GeneratedProtocolMessageType('CopyLayerParams', (_message.Message,), { 'DESCRIPTOR' : _COPYLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CopyLayerParams) }) _sym_db.RegisterMessage(CopyLayerParams) GreaterThanLayerParams = _reflection.GeneratedProtocolMessageType('GreaterThanLayerParams', (_message.Message,), { 'DESCRIPTOR' : _GREATERTHANLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.GreaterThanLayerParams) }) _sym_db.RegisterMessage(GreaterThanLayerParams) GreaterEqualLayerParams = _reflection.GeneratedProtocolMessageType('GreaterEqualLayerParams', (_message.Message,), { 'DESCRIPTOR' : _GREATEREQUALLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.GreaterEqualLayerParams) }) _sym_db.RegisterMessage(GreaterEqualLayerParams) LessThanLayerParams = _reflection.GeneratedProtocolMessageType('LessThanLayerParams', (_message.Message,), { 'DESCRIPTOR' : _LESSTHANLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LessThanLayerParams) }) _sym_db.RegisterMessage(LessThanLayerParams) LessEqualLayerParams = _reflection.GeneratedProtocolMessageType('LessEqualLayerParams', (_message.Message,), { 'DESCRIPTOR' : _LESSEQUALLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LessEqualLayerParams) }) _sym_db.RegisterMessage(LessEqualLayerParams) EqualLayerParams = _reflection.GeneratedProtocolMessageType('EqualLayerParams', (_message.Message,), { 'DESCRIPTOR' : _EQUALLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.EqualLayerParams) }) _sym_db.RegisterMessage(EqualLayerParams) NotEqualLayerParams = _reflection.GeneratedProtocolMessageType('NotEqualLayerParams', (_message.Message,), { 'DESCRIPTOR' : _NOTEQUALLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.NotEqualLayerParams) }) _sym_db.RegisterMessage(NotEqualLayerParams) LogicalAndLayerParams = _reflection.GeneratedProtocolMessageType('LogicalAndLayerParams', (_message.Message,), { 'DESCRIPTOR' : _LOGICALANDLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LogicalAndLayerParams) }) _sym_db.RegisterMessage(LogicalAndLayerParams) LogicalOrLayerParams = _reflection.GeneratedProtocolMessageType('LogicalOrLayerParams', (_message.Message,), { 'DESCRIPTOR' 
: _LOGICALORLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LogicalOrLayerParams) }) _sym_db.RegisterMessage(LogicalOrLayerParams) LogicalXorLayerParams = _reflection.GeneratedProtocolMessageType('LogicalXorLayerParams', (_message.Message,), { 'DESCRIPTOR' : _LOGICALXORLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LogicalXorLayerParams) }) _sym_db.RegisterMessage(LogicalXorLayerParams) LogicalNotLayerParams = _reflection.GeneratedProtocolMessageType('LogicalNotLayerParams', (_message.Message,), { 'DESCRIPTOR' : _LOGICALNOTLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LogicalNotLayerParams) }) _sym_db.RegisterMessage(LogicalNotLayerParams) BorderAmounts = _reflection.GeneratedProtocolMessageType('BorderAmounts', (_message.Message,), { 'EdgeSizes' : _reflection.GeneratedProtocolMessageType('EdgeSizes', (_message.Message,), { 'DESCRIPTOR' : _BORDERAMOUNTS_EDGESIZES, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.BorderAmounts.EdgeSizes) }) , 'DESCRIPTOR' : _BORDERAMOUNTS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.BorderAmounts) }) _sym_db.RegisterMessage(BorderAmounts) _sym_db.RegisterMessage(BorderAmounts.EdgeSizes) ValidPadding = _reflection.GeneratedProtocolMessageType('ValidPadding', (_message.Message,), { 'DESCRIPTOR' : _VALIDPADDING, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ValidPadding) }) _sym_db.RegisterMessage(ValidPadding) SamePadding = _reflection.GeneratedProtocolMessageType('SamePadding', (_message.Message,), { 'DESCRIPTOR' : _SAMEPADDING, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SamePadding) }) _sym_db.RegisterMessage(SamePadding) SamplingMode = _reflection.GeneratedProtocolMessageType('SamplingMode', (_message.Message,), { 'DESCRIPTOR' : _SAMPLINGMODE, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SamplingMode) }) _sym_db.RegisterMessage(SamplingMode) BoxCoordinatesMode = _reflection.GeneratedProtocolMessageType('BoxCoordinatesMode', (_message.Message,), { 'DESCRIPTOR' : _BOXCOORDINATESMODE, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.BoxCoordinatesMode) }) _sym_db.RegisterMessage(BoxCoordinatesMode) WeightParams = _reflection.GeneratedProtocolMessageType('WeightParams', (_message.Message,), { 'DESCRIPTOR' : _WEIGHTPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.WeightParams) }) _sym_db.RegisterMessage(WeightParams) QuantizationParams = _reflection.GeneratedProtocolMessageType('QuantizationParams', (_message.Message,), { 'DESCRIPTOR' : _QUANTIZATIONPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.QuantizationParams) }) _sym_db.RegisterMessage(QuantizationParams) LinearQuantizationParams = _reflection.GeneratedProtocolMessageType('LinearQuantizationParams', (_message.Message,), { 'DESCRIPTOR' : _LINEARQUANTIZATIONPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LinearQuantizationParams) }) _sym_db.RegisterMessage(LinearQuantizationParams) LookUpTableQuantizationParams = 
_reflection.GeneratedProtocolMessageType('LookUpTableQuantizationParams', (_message.Message,), { 'DESCRIPTOR' : _LOOKUPTABLEQUANTIZATIONPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LookUpTableQuantizationParams) }) _sym_db.RegisterMessage(LookUpTableQuantizationParams) ConvolutionLayerParams = _reflection.GeneratedProtocolMessageType('ConvolutionLayerParams', (_message.Message,), { 'DESCRIPTOR' : _CONVOLUTIONLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ConvolutionLayerParams) }) _sym_db.RegisterMessage(ConvolutionLayerParams) Convolution3DLayerParams = _reflection.GeneratedProtocolMessageType('Convolution3DLayerParams', (_message.Message,), { 'DESCRIPTOR' : _CONVOLUTION3DLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Convolution3DLayerParams) }) _sym_db.RegisterMessage(Convolution3DLayerParams) InnerProductLayerParams = _reflection.GeneratedProtocolMessageType('InnerProductLayerParams', (_message.Message,), { 'DESCRIPTOR' : _INNERPRODUCTLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.InnerProductLayerParams) }) _sym_db.RegisterMessage(InnerProductLayerParams) EmbeddingLayerParams = _reflection.GeneratedProtocolMessageType('EmbeddingLayerParams', (_message.Message,), { 'DESCRIPTOR' : _EMBEDDINGLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.EmbeddingLayerParams) }) _sym_db.RegisterMessage(EmbeddingLayerParams) EmbeddingNDLayerParams = _reflection.GeneratedProtocolMessageType('EmbeddingNDLayerParams', (_message.Message,), { 'DESCRIPTOR' : _EMBEDDINGNDLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.EmbeddingNDLayerParams) }) _sym_db.RegisterMessage(EmbeddingNDLayerParams) BatchnormLayerParams = _reflection.GeneratedProtocolMessageType('BatchnormLayerParams', (_message.Message,), { 'DESCRIPTOR' : _BATCHNORMLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.BatchnormLayerParams) }) _sym_db.RegisterMessage(BatchnormLayerParams) PoolingLayerParams = _reflection.GeneratedProtocolMessageType('PoolingLayerParams', (_message.Message,), { 'ValidCompletePadding' : _reflection.GeneratedProtocolMessageType('ValidCompletePadding', (_message.Message,), { 'DESCRIPTOR' : _POOLINGLAYERPARAMS_VALIDCOMPLETEPADDING, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.PoolingLayerParams.ValidCompletePadding) }) , 'DESCRIPTOR' : _POOLINGLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.PoolingLayerParams) }) _sym_db.RegisterMessage(PoolingLayerParams) _sym_db.RegisterMessage(PoolingLayerParams.ValidCompletePadding) Pooling3DLayerParams = _reflection.GeneratedProtocolMessageType('Pooling3DLayerParams', (_message.Message,), { 'DESCRIPTOR' : _POOLING3DLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Pooling3DLayerParams) }) _sym_db.RegisterMessage(Pooling3DLayerParams) GlobalPooling3DLayerParams = _reflection.GeneratedProtocolMessageType('GlobalPooling3DLayerParams', (_message.Message,), { 'DESCRIPTOR' : _GLOBALPOOLING3DLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # 
@@protoc_insertion_point(class_scope:CoreML.Specification.GlobalPooling3DLayerParams) }) _sym_db.RegisterMessage(GlobalPooling3DLayerParams) PaddingLayerParams = _reflection.GeneratedProtocolMessageType('PaddingLayerParams', (_message.Message,), { 'PaddingConstant' : _reflection.GeneratedProtocolMessageType('PaddingConstant', (_message.Message,), { 'DESCRIPTOR' : _PADDINGLAYERPARAMS_PADDINGCONSTANT, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.PaddingLayerParams.PaddingConstant) }) , 'PaddingReflection' : _reflection.GeneratedProtocolMessageType('PaddingReflection', (_message.Message,), { 'DESCRIPTOR' : _PADDINGLAYERPARAMS_PADDINGREFLECTION, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.PaddingLayerParams.PaddingReflection) }) , 'PaddingReplication' : _reflection.GeneratedProtocolMessageType('PaddingReplication', (_message.Message,), { 'DESCRIPTOR' : _PADDINGLAYERPARAMS_PADDINGREPLICATION, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.PaddingLayerParams.PaddingReplication) }) , 'DESCRIPTOR' : _PADDINGLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.PaddingLayerParams) }) _sym_db.RegisterMessage(PaddingLayerParams) _sym_db.RegisterMessage(PaddingLayerParams.PaddingConstant) _sym_db.RegisterMessage(PaddingLayerParams.PaddingReflection) _sym_db.RegisterMessage(PaddingLayerParams.PaddingReplication) ConcatLayerParams = _reflection.GeneratedProtocolMessageType('ConcatLayerParams', (_message.Message,), { 'DESCRIPTOR' : _CONCATLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ConcatLayerParams) }) _sym_db.RegisterMessage(ConcatLayerParams) LRNLayerParams = _reflection.GeneratedProtocolMessageType('LRNLayerParams', (_message.Message,), { 'DESCRIPTOR' : _LRNLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LRNLayerParams) }) _sym_db.RegisterMessage(LRNLayerParams) SoftmaxLayerParams = _reflection.GeneratedProtocolMessageType('SoftmaxLayerParams', (_message.Message,), { 'DESCRIPTOR' : _SOFTMAXLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SoftmaxLayerParams) }) _sym_db.RegisterMessage(SoftmaxLayerParams) SplitLayerParams = _reflection.GeneratedProtocolMessageType('SplitLayerParams', (_message.Message,), { 'DESCRIPTOR' : _SPLITLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SplitLayerParams) }) _sym_db.RegisterMessage(SplitLayerParams) AddLayerParams = _reflection.GeneratedProtocolMessageType('AddLayerParams', (_message.Message,), { 'DESCRIPTOR' : _ADDLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.AddLayerParams) }) _sym_db.RegisterMessage(AddLayerParams) MultiplyLayerParams = _reflection.GeneratedProtocolMessageType('MultiplyLayerParams', (_message.Message,), { 'DESCRIPTOR' : _MULTIPLYLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MultiplyLayerParams) }) _sym_db.RegisterMessage(MultiplyLayerParams) UnaryFunctionLayerParams = _reflection.GeneratedProtocolMessageType('UnaryFunctionLayerParams', (_message.Message,), { 'DESCRIPTOR' : _UNARYFUNCTIONLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # 
@@protoc_insertion_point(class_scope:CoreML.Specification.UnaryFunctionLayerParams) }) _sym_db.RegisterMessage(UnaryFunctionLayerParams) UpsampleLayerParams = _reflection.GeneratedProtocolMessageType('UpsampleLayerParams', (_message.Message,), { 'DESCRIPTOR' : _UPSAMPLELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.UpsampleLayerParams) }) _sym_db.RegisterMessage(UpsampleLayerParams) ResizeBilinearLayerParams = _reflection.GeneratedProtocolMessageType('ResizeBilinearLayerParams', (_message.Message,), { 'DESCRIPTOR' : _RESIZEBILINEARLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ResizeBilinearLayerParams) }) _sym_db.RegisterMessage(ResizeBilinearLayerParams) CropResizeLayerParams = _reflection.GeneratedProtocolMessageType('CropResizeLayerParams', (_message.Message,), { 'DESCRIPTOR' : _CROPRESIZELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CropResizeLayerParams) }) _sym_db.RegisterMessage(CropResizeLayerParams) BiasLayerParams = _reflection.GeneratedProtocolMessageType('BiasLayerParams', (_message.Message,), { 'DESCRIPTOR' : _BIASLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.BiasLayerParams) }) _sym_db.RegisterMessage(BiasLayerParams) ScaleLayerParams = _reflection.GeneratedProtocolMessageType('ScaleLayerParams', (_message.Message,), { 'DESCRIPTOR' : _SCALELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ScaleLayerParams) }) _sym_db.RegisterMessage(ScaleLayerParams) LoadConstantLayerParams = _reflection.GeneratedProtocolMessageType('LoadConstantLayerParams', (_message.Message,), { 'DESCRIPTOR' : _LOADCONSTANTLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LoadConstantLayerParams) }) _sym_db.RegisterMessage(LoadConstantLayerParams) L2NormalizeLayerParams = _reflection.GeneratedProtocolMessageType('L2NormalizeLayerParams', (_message.Message,), { 'DESCRIPTOR' : _L2NORMALIZELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.L2NormalizeLayerParams) }) _sym_db.RegisterMessage(L2NormalizeLayerParams) FlattenLayerParams = _reflection.GeneratedProtocolMessageType('FlattenLayerParams', (_message.Message,), { 'DESCRIPTOR' : _FLATTENLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.FlattenLayerParams) }) _sym_db.RegisterMessage(FlattenLayerParams) ReshapeLayerParams = _reflection.GeneratedProtocolMessageType('ReshapeLayerParams', (_message.Message,), { 'DESCRIPTOR' : _RESHAPELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ReshapeLayerParams) }) _sym_db.RegisterMessage(ReshapeLayerParams) PermuteLayerParams = _reflection.GeneratedProtocolMessageType('PermuteLayerParams', (_message.Message,), { 'DESCRIPTOR' : _PERMUTELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.PermuteLayerParams) }) _sym_db.RegisterMessage(PermuteLayerParams) ReorganizeDataLayerParams = _reflection.GeneratedProtocolMessageType('ReorganizeDataLayerParams', (_message.Message,), { 'DESCRIPTOR' : _REORGANIZEDATALAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ReorganizeDataLayerParams) }) 
_sym_db.RegisterMessage(ReorganizeDataLayerParams) SliceLayerParams = _reflection.GeneratedProtocolMessageType('SliceLayerParams', (_message.Message,), { 'DESCRIPTOR' : _SLICELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SliceLayerParams) }) _sym_db.RegisterMessage(SliceLayerParams) ReduceLayerParams = _reflection.GeneratedProtocolMessageType('ReduceLayerParams', (_message.Message,), { 'DESCRIPTOR' : _REDUCELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ReduceLayerParams) }) _sym_db.RegisterMessage(ReduceLayerParams) CropLayerParams = _reflection.GeneratedProtocolMessageType('CropLayerParams', (_message.Message,), { 'DESCRIPTOR' : _CROPLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CropLayerParams) }) _sym_db.RegisterMessage(CropLayerParams) AverageLayerParams = _reflection.GeneratedProtocolMessageType('AverageLayerParams', (_message.Message,), { 'DESCRIPTOR' : _AVERAGELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.AverageLayerParams) }) _sym_db.RegisterMessage(AverageLayerParams) MaxLayerParams = _reflection.GeneratedProtocolMessageType('MaxLayerParams', (_message.Message,), { 'DESCRIPTOR' : _MAXLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MaxLayerParams) }) _sym_db.RegisterMessage(MaxLayerParams) MinLayerParams = _reflection.GeneratedProtocolMessageType('MinLayerParams', (_message.Message,), { 'DESCRIPTOR' : _MINLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MinLayerParams) }) _sym_db.RegisterMessage(MinLayerParams) DotProductLayerParams = _reflection.GeneratedProtocolMessageType('DotProductLayerParams', (_message.Message,), { 'DESCRIPTOR' : _DOTPRODUCTLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.DotProductLayerParams) }) _sym_db.RegisterMessage(DotProductLayerParams) MeanVarianceNormalizeLayerParams = _reflection.GeneratedProtocolMessageType('MeanVarianceNormalizeLayerParams', (_message.Message,), { 'DESCRIPTOR' : _MEANVARIANCENORMALIZELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MeanVarianceNormalizeLayerParams) }) _sym_db.RegisterMessage(MeanVarianceNormalizeLayerParams) SequenceRepeatLayerParams = _reflection.GeneratedProtocolMessageType('SequenceRepeatLayerParams', (_message.Message,), { 'DESCRIPTOR' : _SEQUENCEREPEATLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SequenceRepeatLayerParams) }) _sym_db.RegisterMessage(SequenceRepeatLayerParams) SimpleRecurrentLayerParams = _reflection.GeneratedProtocolMessageType('SimpleRecurrentLayerParams', (_message.Message,), { 'DESCRIPTOR' : _SIMPLERECURRENTLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SimpleRecurrentLayerParams) }) _sym_db.RegisterMessage(SimpleRecurrentLayerParams) GRULayerParams = _reflection.GeneratedProtocolMessageType('GRULayerParams', (_message.Message,), { 'DESCRIPTOR' : _GRULAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.GRULayerParams) }) _sym_db.RegisterMessage(GRULayerParams) LSTMParams = _reflection.GeneratedProtocolMessageType('LSTMParams', 
(_message.Message,), { 'DESCRIPTOR' : _LSTMPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LSTMParams) }) _sym_db.RegisterMessage(LSTMParams) LSTMWeightParams = _reflection.GeneratedProtocolMessageType('LSTMWeightParams', (_message.Message,), { 'DESCRIPTOR' : _LSTMWEIGHTPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LSTMWeightParams) }) _sym_db.RegisterMessage(LSTMWeightParams) UniDirectionalLSTMLayerParams = _reflection.GeneratedProtocolMessageType('UniDirectionalLSTMLayerParams', (_message.Message,), { 'DESCRIPTOR' : _UNIDIRECTIONALLSTMLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.UniDirectionalLSTMLayerParams) }) _sym_db.RegisterMessage(UniDirectionalLSTMLayerParams) BiDirectionalLSTMLayerParams = _reflection.GeneratedProtocolMessageType('BiDirectionalLSTMLayerParams', (_message.Message,), { 'DESCRIPTOR' : _BIDIRECTIONALLSTMLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.BiDirectionalLSTMLayerParams) }) _sym_db.RegisterMessage(BiDirectionalLSTMLayerParams) CustomLayerParams = _reflection.GeneratedProtocolMessageType('CustomLayerParams', (_message.Message,), { 'CustomLayerParamValue' : _reflection.GeneratedProtocolMessageType('CustomLayerParamValue', (_message.Message,), { 'DESCRIPTOR' : _CUSTOMLAYERPARAMS_CUSTOMLAYERPARAMVALUE, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CustomLayerParams.CustomLayerParamValue) }) , 'ParametersEntry' : _reflection.GeneratedProtocolMessageType('ParametersEntry', (_message.Message,), { 'DESCRIPTOR' : _CUSTOMLAYERPARAMS_PARAMETERSENTRY, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CustomLayerParams.ParametersEntry) }) , 'DESCRIPTOR' : _CUSTOMLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CustomLayerParams) }) _sym_db.RegisterMessage(CustomLayerParams) _sym_db.RegisterMessage(CustomLayerParams.CustomLayerParamValue) _sym_db.RegisterMessage(CustomLayerParams.ParametersEntry) TransposeLayerParams = _reflection.GeneratedProtocolMessageType('TransposeLayerParams', (_message.Message,), { 'DESCRIPTOR' : _TRANSPOSELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.TransposeLayerParams) }) _sym_db.RegisterMessage(TransposeLayerParams) BatchedMatMulLayerParams = _reflection.GeneratedProtocolMessageType('BatchedMatMulLayerParams', (_message.Message,), { 'DESCRIPTOR' : _BATCHEDMATMULLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.BatchedMatMulLayerParams) }) _sym_db.RegisterMessage(BatchedMatMulLayerParams) ConcatNDLayerParams = _reflection.GeneratedProtocolMessageType('ConcatNDLayerParams', (_message.Message,), { 'DESCRIPTOR' : _CONCATNDLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ConcatNDLayerParams) }) _sym_db.RegisterMessage(ConcatNDLayerParams) SoftmaxNDLayerParams = _reflection.GeneratedProtocolMessageType('SoftmaxNDLayerParams', (_message.Message,), { 'DESCRIPTOR' : _SOFTMAXNDLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SoftmaxNDLayerParams) }) _sym_db.RegisterMessage(SoftmaxNDLayerParams) ReverseLayerParams = 
_reflection.GeneratedProtocolMessageType('ReverseLayerParams', (_message.Message,), { 'DESCRIPTOR' : _REVERSELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ReverseLayerParams) }) _sym_db.RegisterMessage(ReverseLayerParams) ReverseSeqLayerParams = _reflection.GeneratedProtocolMessageType('ReverseSeqLayerParams', (_message.Message,), { 'DESCRIPTOR' : _REVERSESEQLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ReverseSeqLayerParams) }) _sym_db.RegisterMessage(ReverseSeqLayerParams) LoadConstantNDLayerParams = _reflection.GeneratedProtocolMessageType('LoadConstantNDLayerParams', (_message.Message,), { 'DESCRIPTOR' : _LOADCONSTANTNDLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LoadConstantNDLayerParams) }) _sym_db.RegisterMessage(LoadConstantNDLayerParams) FillLikeLayerParams = _reflection.GeneratedProtocolMessageType('FillLikeLayerParams', (_message.Message,), { 'DESCRIPTOR' : _FILLLIKELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.FillLikeLayerParams) }) _sym_db.RegisterMessage(FillLikeLayerParams) FillStaticLayerParams = _reflection.GeneratedProtocolMessageType('FillStaticLayerParams', (_message.Message,), { 'DESCRIPTOR' : _FILLSTATICLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.FillStaticLayerParams) }) _sym_db.RegisterMessage(FillStaticLayerParams) FillDynamicLayerParams = _reflection.GeneratedProtocolMessageType('FillDynamicLayerParams', (_message.Message,), { 'DESCRIPTOR' : _FILLDYNAMICLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.FillDynamicLayerParams) }) _sym_db.RegisterMessage(FillDynamicLayerParams) WhereBroadcastableLayerParams = _reflection.GeneratedProtocolMessageType('WhereBroadcastableLayerParams', (_message.Message,), { 'DESCRIPTOR' : _WHEREBROADCASTABLELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.WhereBroadcastableLayerParams) }) _sym_db.RegisterMessage(WhereBroadcastableLayerParams) SinLayerParams = _reflection.GeneratedProtocolMessageType('SinLayerParams', (_message.Message,), { 'DESCRIPTOR' : _SINLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SinLayerParams) }) _sym_db.RegisterMessage(SinLayerParams) CosLayerParams = _reflection.GeneratedProtocolMessageType('CosLayerParams', (_message.Message,), { 'DESCRIPTOR' : _COSLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CosLayerParams) }) _sym_db.RegisterMessage(CosLayerParams) TanLayerParams = _reflection.GeneratedProtocolMessageType('TanLayerParams', (_message.Message,), { 'DESCRIPTOR' : _TANLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.TanLayerParams) }) _sym_db.RegisterMessage(TanLayerParams) AsinLayerParams = _reflection.GeneratedProtocolMessageType('AsinLayerParams', (_message.Message,), { 'DESCRIPTOR' : _ASINLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.AsinLayerParams) }) _sym_db.RegisterMessage(AsinLayerParams) AcosLayerParams = _reflection.GeneratedProtocolMessageType('AcosLayerParams', (_message.Message,), { 'DESCRIPTOR' : _ACOSLAYERPARAMS, '__module__' : 
'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.AcosLayerParams) }) _sym_db.RegisterMessage(AcosLayerParams) AtanLayerParams = _reflection.GeneratedProtocolMessageType('AtanLayerParams', (_message.Message,), { 'DESCRIPTOR' : _ATANLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.AtanLayerParams) }) _sym_db.RegisterMessage(AtanLayerParams) SinhLayerParams = _reflection.GeneratedProtocolMessageType('SinhLayerParams', (_message.Message,), { 'DESCRIPTOR' : _SINHLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SinhLayerParams) }) _sym_db.RegisterMessage(SinhLayerParams) CoshLayerParams = _reflection.GeneratedProtocolMessageType('CoshLayerParams', (_message.Message,), { 'DESCRIPTOR' : _COSHLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CoshLayerParams) }) _sym_db.RegisterMessage(CoshLayerParams) TanhLayerParams = _reflection.GeneratedProtocolMessageType('TanhLayerParams', (_message.Message,), { 'DESCRIPTOR' : _TANHLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.TanhLayerParams) }) _sym_db.RegisterMessage(TanhLayerParams) AsinhLayerParams = _reflection.GeneratedProtocolMessageType('AsinhLayerParams', (_message.Message,), { 'DESCRIPTOR' : _ASINHLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.AsinhLayerParams) }) _sym_db.RegisterMessage(AsinhLayerParams) AcoshLayerParams = _reflection.GeneratedProtocolMessageType('AcoshLayerParams', (_message.Message,), { 'DESCRIPTOR' : _ACOSHLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.AcoshLayerParams) }) _sym_db.RegisterMessage(AcoshLayerParams) AtanhLayerParams = _reflection.GeneratedProtocolMessageType('AtanhLayerParams', (_message.Message,), { 'DESCRIPTOR' : _ATANHLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.AtanhLayerParams) }) _sym_db.RegisterMessage(AtanhLayerParams) PowBroadcastableLayerParams = _reflection.GeneratedProtocolMessageType('PowBroadcastableLayerParams', (_message.Message,), { 'DESCRIPTOR' : _POWBROADCASTABLELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.PowBroadcastableLayerParams) }) _sym_db.RegisterMessage(PowBroadcastableLayerParams) Exp2LayerParams = _reflection.GeneratedProtocolMessageType('Exp2LayerParams', (_message.Message,), { 'DESCRIPTOR' : _EXP2LAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Exp2LayerParams) }) _sym_db.RegisterMessage(Exp2LayerParams) WhereNonZeroLayerParams = _reflection.GeneratedProtocolMessageType('WhereNonZeroLayerParams', (_message.Message,), { 'DESCRIPTOR' : _WHERENONZEROLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.WhereNonZeroLayerParams) }) _sym_db.RegisterMessage(WhereNonZeroLayerParams) MatrixBandPartLayerParams = _reflection.GeneratedProtocolMessageType('MatrixBandPartLayerParams', (_message.Message,), { 'DESCRIPTOR' : _MATRIXBANDPARTLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MatrixBandPartLayerParams) }) _sym_db.RegisterMessage(MatrixBandPartLayerParams) UpperTriangularLayerParams = 
_reflection.GeneratedProtocolMessageType('UpperTriangularLayerParams', (_message.Message,), { 'DESCRIPTOR' : _UPPERTRIANGULARLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.UpperTriangularLayerParams) }) _sym_db.RegisterMessage(UpperTriangularLayerParams) LowerTriangularLayerParams = _reflection.GeneratedProtocolMessageType('LowerTriangularLayerParams', (_message.Message,), { 'DESCRIPTOR' : _LOWERTRIANGULARLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LowerTriangularLayerParams) }) _sym_db.RegisterMessage(LowerTriangularLayerParams) BroadcastToLikeLayerParams = _reflection.GeneratedProtocolMessageType('BroadcastToLikeLayerParams', (_message.Message,), { 'DESCRIPTOR' : _BROADCASTTOLIKELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.BroadcastToLikeLayerParams) }) _sym_db.RegisterMessage(BroadcastToLikeLayerParams) BroadcastToStaticLayerParams = _reflection.GeneratedProtocolMessageType('BroadcastToStaticLayerParams', (_message.Message,), { 'DESCRIPTOR' : _BROADCASTTOSTATICLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.BroadcastToStaticLayerParams) }) _sym_db.RegisterMessage(BroadcastToStaticLayerParams) BroadcastToDynamicLayerParams = _reflection.GeneratedProtocolMessageType('BroadcastToDynamicLayerParams', (_message.Message,), { 'DESCRIPTOR' : _BROADCASTTODYNAMICLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.BroadcastToDynamicLayerParams) }) _sym_db.RegisterMessage(BroadcastToDynamicLayerParams) AddBroadcastableLayerParams = _reflection.GeneratedProtocolMessageType('AddBroadcastableLayerParams', (_message.Message,), { 'DESCRIPTOR' : _ADDBROADCASTABLELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.AddBroadcastableLayerParams) }) _sym_db.RegisterMessage(AddBroadcastableLayerParams) MaxBroadcastableLayerParams = _reflection.GeneratedProtocolMessageType('MaxBroadcastableLayerParams', (_message.Message,), { 'DESCRIPTOR' : _MAXBROADCASTABLELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MaxBroadcastableLayerParams) }) _sym_db.RegisterMessage(MaxBroadcastableLayerParams) MinBroadcastableLayerParams = _reflection.GeneratedProtocolMessageType('MinBroadcastableLayerParams', (_message.Message,), { 'DESCRIPTOR' : _MINBROADCASTABLELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MinBroadcastableLayerParams) }) _sym_db.RegisterMessage(MinBroadcastableLayerParams) ModBroadcastableLayerParams = _reflection.GeneratedProtocolMessageType('ModBroadcastableLayerParams', (_message.Message,), { 'DESCRIPTOR' : _MODBROADCASTABLELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ModBroadcastableLayerParams) }) _sym_db.RegisterMessage(ModBroadcastableLayerParams) FloorDivBroadcastableLayerParams = _reflection.GeneratedProtocolMessageType('FloorDivBroadcastableLayerParams', (_message.Message,), { 'DESCRIPTOR' : _FLOORDIVBROADCASTABLELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.FloorDivBroadcastableLayerParams) }) _sym_db.RegisterMessage(FloorDivBroadcastableLayerParams) SubtractBroadcastableLayerParams = 
_reflection.GeneratedProtocolMessageType('SubtractBroadcastableLayerParams', (_message.Message,), { 'DESCRIPTOR' : _SUBTRACTBROADCASTABLELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SubtractBroadcastableLayerParams) }) _sym_db.RegisterMessage(SubtractBroadcastableLayerParams) MultiplyBroadcastableLayerParams = _reflection.GeneratedProtocolMessageType('MultiplyBroadcastableLayerParams', (_message.Message,), { 'DESCRIPTOR' : _MULTIPLYBROADCASTABLELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MultiplyBroadcastableLayerParams) }) _sym_db.RegisterMessage(MultiplyBroadcastableLayerParams) DivideBroadcastableLayerParams = _reflection.GeneratedProtocolMessageType('DivideBroadcastableLayerParams', (_message.Message,), { 'DESCRIPTOR' : _DIVIDEBROADCASTABLELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.DivideBroadcastableLayerParams) }) _sym_db.RegisterMessage(DivideBroadcastableLayerParams) GatherLayerParams = _reflection.GeneratedProtocolMessageType('GatherLayerParams', (_message.Message,), { 'DESCRIPTOR' : _GATHERLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.GatherLayerParams) }) _sym_db.RegisterMessage(GatherLayerParams) ScatterLayerParams = _reflection.GeneratedProtocolMessageType('ScatterLayerParams', (_message.Message,), { 'DESCRIPTOR' : _SCATTERLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ScatterLayerParams) }) _sym_db.RegisterMessage(ScatterLayerParams) GatherNDLayerParams = _reflection.GeneratedProtocolMessageType('GatherNDLayerParams', (_message.Message,), { 'DESCRIPTOR' : _GATHERNDLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.GatherNDLayerParams) }) _sym_db.RegisterMessage(GatherNDLayerParams) ScatterNDLayerParams = _reflection.GeneratedProtocolMessageType('ScatterNDLayerParams', (_message.Message,), { 'DESCRIPTOR' : _SCATTERNDLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ScatterNDLayerParams) }) _sym_db.RegisterMessage(ScatterNDLayerParams) GatherAlongAxisLayerParams = _reflection.GeneratedProtocolMessageType('GatherAlongAxisLayerParams', (_message.Message,), { 'DESCRIPTOR' : _GATHERALONGAXISLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.GatherAlongAxisLayerParams) }) _sym_db.RegisterMessage(GatherAlongAxisLayerParams) ScatterAlongAxisLayerParams = _reflection.GeneratedProtocolMessageType('ScatterAlongAxisLayerParams', (_message.Message,), { 'DESCRIPTOR' : _SCATTERALONGAXISLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ScatterAlongAxisLayerParams) }) _sym_db.RegisterMessage(ScatterAlongAxisLayerParams) StackLayerParams = _reflection.GeneratedProtocolMessageType('StackLayerParams', (_message.Message,), { 'DESCRIPTOR' : _STACKLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.StackLayerParams) }) _sym_db.RegisterMessage(StackLayerParams) RankPreservingReshapeLayerParams = _reflection.GeneratedProtocolMessageType('RankPreservingReshapeLayerParams', (_message.Message,), { 'DESCRIPTOR' : _RANKPRESERVINGRESHAPELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # 
@@protoc_insertion_point(class_scope:CoreML.Specification.RankPreservingReshapeLayerParams) }) _sym_db.RegisterMessage(RankPreservingReshapeLayerParams) ConstantPaddingLayerParams = _reflection.GeneratedProtocolMessageType('ConstantPaddingLayerParams', (_message.Message,), { 'DESCRIPTOR' : _CONSTANTPADDINGLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ConstantPaddingLayerParams) }) _sym_db.RegisterMessage(ConstantPaddingLayerParams) RandomNormalLikeLayerParams = _reflection.GeneratedProtocolMessageType('RandomNormalLikeLayerParams', (_message.Message,), { 'DESCRIPTOR' : _RANDOMNORMALLIKELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.RandomNormalLikeLayerParams) }) _sym_db.RegisterMessage(RandomNormalLikeLayerParams) RandomNormalStaticLayerParams = _reflection.GeneratedProtocolMessageType('RandomNormalStaticLayerParams', (_message.Message,), { 'DESCRIPTOR' : _RANDOMNORMALSTATICLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.RandomNormalStaticLayerParams) }) _sym_db.RegisterMessage(RandomNormalStaticLayerParams) RandomNormalDynamicLayerParams = _reflection.GeneratedProtocolMessageType('RandomNormalDynamicLayerParams', (_message.Message,), { 'DESCRIPTOR' : _RANDOMNORMALDYNAMICLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.RandomNormalDynamicLayerParams) }) _sym_db.RegisterMessage(RandomNormalDynamicLayerParams) RandomUniformLikeLayerParams = _reflection.GeneratedProtocolMessageType('RandomUniformLikeLayerParams', (_message.Message,), { 'DESCRIPTOR' : _RANDOMUNIFORMLIKELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.RandomUniformLikeLayerParams) }) _sym_db.RegisterMessage(RandomUniformLikeLayerParams) RandomUniformStaticLayerParams = _reflection.GeneratedProtocolMessageType('RandomUniformStaticLayerParams', (_message.Message,), { 'DESCRIPTOR' : _RANDOMUNIFORMSTATICLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.RandomUniformStaticLayerParams) }) _sym_db.RegisterMessage(RandomUniformStaticLayerParams) RandomUniformDynamicLayerParams = _reflection.GeneratedProtocolMessageType('RandomUniformDynamicLayerParams', (_message.Message,), { 'DESCRIPTOR' : _RANDOMUNIFORMDYNAMICLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.RandomUniformDynamicLayerParams) }) _sym_db.RegisterMessage(RandomUniformDynamicLayerParams) RandomBernoulliLikeLayerParams = _reflection.GeneratedProtocolMessageType('RandomBernoulliLikeLayerParams', (_message.Message,), { 'DESCRIPTOR' : _RANDOMBERNOULLILIKELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.RandomBernoulliLikeLayerParams) }) _sym_db.RegisterMessage(RandomBernoulliLikeLayerParams) RandomBernoulliStaticLayerParams = _reflection.GeneratedProtocolMessageType('RandomBernoulliStaticLayerParams', (_message.Message,), { 'DESCRIPTOR' : _RANDOMBERNOULLISTATICLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.RandomBernoulliStaticLayerParams) }) _sym_db.RegisterMessage(RandomBernoulliStaticLayerParams) RandomBernoulliDynamicLayerParams = _reflection.GeneratedProtocolMessageType('RandomBernoulliDynamicLayerParams', (_message.Message,), { 'DESCRIPTOR' : 
_RANDOMBERNOULLIDYNAMICLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.RandomBernoulliDynamicLayerParams) }) _sym_db.RegisterMessage(RandomBernoulliDynamicLayerParams) CategoricalDistributionLayerParams = _reflection.GeneratedProtocolMessageType('CategoricalDistributionLayerParams', (_message.Message,), { 'DESCRIPTOR' : _CATEGORICALDISTRIBUTIONLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CategoricalDistributionLayerParams) }) _sym_db.RegisterMessage(CategoricalDistributionLayerParams) ReduceL1LayerParams = _reflection.GeneratedProtocolMessageType('ReduceL1LayerParams', (_message.Message,), { 'DESCRIPTOR' : _REDUCEL1LAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ReduceL1LayerParams) }) _sym_db.RegisterMessage(ReduceL1LayerParams) ReduceL2LayerParams = _reflection.GeneratedProtocolMessageType('ReduceL2LayerParams', (_message.Message,), { 'DESCRIPTOR' : _REDUCEL2LAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ReduceL2LayerParams) }) _sym_db.RegisterMessage(ReduceL2LayerParams) ReduceMaxLayerParams = _reflection.GeneratedProtocolMessageType('ReduceMaxLayerParams', (_message.Message,), { 'DESCRIPTOR' : _REDUCEMAXLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ReduceMaxLayerParams) }) _sym_db.RegisterMessage(ReduceMaxLayerParams) ReduceMinLayerParams = _reflection.GeneratedProtocolMessageType('ReduceMinLayerParams', (_message.Message,), { 'DESCRIPTOR' : _REDUCEMINLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ReduceMinLayerParams) }) _sym_db.RegisterMessage(ReduceMinLayerParams) ReduceSumLayerParams = _reflection.GeneratedProtocolMessageType('ReduceSumLayerParams', (_message.Message,), { 'DESCRIPTOR' : _REDUCESUMLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ReduceSumLayerParams) }) _sym_db.RegisterMessage(ReduceSumLayerParams) ReduceProdLayerParams = _reflection.GeneratedProtocolMessageType('ReduceProdLayerParams', (_message.Message,), { 'DESCRIPTOR' : _REDUCEPRODLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ReduceProdLayerParams) }) _sym_db.RegisterMessage(ReduceProdLayerParams) ReduceMeanLayerParams = _reflection.GeneratedProtocolMessageType('ReduceMeanLayerParams', (_message.Message,), { 'DESCRIPTOR' : _REDUCEMEANLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ReduceMeanLayerParams) }) _sym_db.RegisterMessage(ReduceMeanLayerParams) ReduceLogSumLayerParams = _reflection.GeneratedProtocolMessageType('ReduceLogSumLayerParams', (_message.Message,), { 'DESCRIPTOR' : _REDUCELOGSUMLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ReduceLogSumLayerParams) }) _sym_db.RegisterMessage(ReduceLogSumLayerParams) ReduceSumSquareLayerParams = _reflection.GeneratedProtocolMessageType('ReduceSumSquareLayerParams', (_message.Message,), { 'DESCRIPTOR' : _REDUCESUMSQUARELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ReduceSumSquareLayerParams) }) _sym_db.RegisterMessage(ReduceSumSquareLayerParams) ReduceLogSumExpLayerParams = 
_reflection.GeneratedProtocolMessageType('ReduceLogSumExpLayerParams', (_message.Message,), { 'DESCRIPTOR' : _REDUCELOGSUMEXPLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ReduceLogSumExpLayerParams) }) _sym_db.RegisterMessage(ReduceLogSumExpLayerParams) ExpandDimsLayerParams = _reflection.GeneratedProtocolMessageType('ExpandDimsLayerParams', (_message.Message,), { 'DESCRIPTOR' : _EXPANDDIMSLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ExpandDimsLayerParams) }) _sym_db.RegisterMessage(ExpandDimsLayerParams) FlattenTo2DLayerParams = _reflection.GeneratedProtocolMessageType('FlattenTo2DLayerParams', (_message.Message,), { 'DESCRIPTOR' : _FLATTENTO2DLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.FlattenTo2DLayerParams) }) _sym_db.RegisterMessage(FlattenTo2DLayerParams) ReshapeStaticLayerParams = _reflection.GeneratedProtocolMessageType('ReshapeStaticLayerParams', (_message.Message,), { 'DESCRIPTOR' : _RESHAPESTATICLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ReshapeStaticLayerParams) }) _sym_db.RegisterMessage(ReshapeStaticLayerParams) ReshapeLikeLayerParams = _reflection.GeneratedProtocolMessageType('ReshapeLikeLayerParams', (_message.Message,), { 'DESCRIPTOR' : _RESHAPELIKELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ReshapeLikeLayerParams) }) _sym_db.RegisterMessage(ReshapeLikeLayerParams) ReshapeDynamicLayerParams = _reflection.GeneratedProtocolMessageType('ReshapeDynamicLayerParams', (_message.Message,), { 'DESCRIPTOR' : _RESHAPEDYNAMICLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ReshapeDynamicLayerParams) }) _sym_db.RegisterMessage(ReshapeDynamicLayerParams) SqueezeLayerParams = _reflection.GeneratedProtocolMessageType('SqueezeLayerParams', (_message.Message,), { 'DESCRIPTOR' : _SQUEEZELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SqueezeLayerParams) }) _sym_db.RegisterMessage(SqueezeLayerParams) TopKLayerParams = _reflection.GeneratedProtocolMessageType('TopKLayerParams', (_message.Message,), { 'DESCRIPTOR' : _TOPKLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.TopKLayerParams) }) _sym_db.RegisterMessage(TopKLayerParams) ArgMaxLayerParams = _reflection.GeneratedProtocolMessageType('ArgMaxLayerParams', (_message.Message,), { 'DESCRIPTOR' : _ARGMAXLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ArgMaxLayerParams) }) _sym_db.RegisterMessage(ArgMaxLayerParams) ArgMinLayerParams = _reflection.GeneratedProtocolMessageType('ArgMinLayerParams', (_message.Message,), { 'DESCRIPTOR' : _ARGMINLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ArgMinLayerParams) }) _sym_db.RegisterMessage(ArgMinLayerParams) SplitNDLayerParams = _reflection.GeneratedProtocolMessageType('SplitNDLayerParams', (_message.Message,), { 'DESCRIPTOR' : _SPLITNDLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SplitNDLayerParams) }) _sym_db.RegisterMessage(SplitNDLayerParams) CeilLayerParams = _reflection.GeneratedProtocolMessageType('CeilLayerParams', 
(_message.Message,), { 'DESCRIPTOR' : _CEILLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CeilLayerParams) }) _sym_db.RegisterMessage(CeilLayerParams) RoundLayerParams = _reflection.GeneratedProtocolMessageType('RoundLayerParams', (_message.Message,), { 'DESCRIPTOR' : _ROUNDLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.RoundLayerParams) }) _sym_db.RegisterMessage(RoundLayerParams) FloorLayerParams = _reflection.GeneratedProtocolMessageType('FloorLayerParams', (_message.Message,), { 'DESCRIPTOR' : _FLOORLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.FloorLayerParams) }) _sym_db.RegisterMessage(FloorLayerParams) SignLayerParams = _reflection.GeneratedProtocolMessageType('SignLayerParams', (_message.Message,), { 'DESCRIPTOR' : _SIGNLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SignLayerParams) }) _sym_db.RegisterMessage(SignLayerParams) ClipLayerParams = _reflection.GeneratedProtocolMessageType('ClipLayerParams', (_message.Message,), { 'DESCRIPTOR' : _CLIPLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ClipLayerParams) }) _sym_db.RegisterMessage(ClipLayerParams) SliceStaticLayerParams = _reflection.GeneratedProtocolMessageType('SliceStaticLayerParams', (_message.Message,), { 'DESCRIPTOR' : _SLICESTATICLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SliceStaticLayerParams) }) _sym_db.RegisterMessage(SliceStaticLayerParams) SliceDynamicLayerParams = _reflection.GeneratedProtocolMessageType('SliceDynamicLayerParams', (_message.Message,), { 'DESCRIPTOR' : _SLICEDYNAMICLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SliceDynamicLayerParams) }) _sym_db.RegisterMessage(SliceDynamicLayerParams) TileLayerParams = _reflection.GeneratedProtocolMessageType('TileLayerParams', (_message.Message,), { 'DESCRIPTOR' : _TILELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.TileLayerParams) }) _sym_db.RegisterMessage(TileLayerParams) GetShapeLayerParams = _reflection.GeneratedProtocolMessageType('GetShapeLayerParams', (_message.Message,), { 'DESCRIPTOR' : _GETSHAPELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.GetShapeLayerParams) }) _sym_db.RegisterMessage(GetShapeLayerParams) ErfLayerParams = _reflection.GeneratedProtocolMessageType('ErfLayerParams', (_message.Message,), { 'DESCRIPTOR' : _ERFLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ErfLayerParams) }) _sym_db.RegisterMessage(ErfLayerParams) GeluLayerParams = _reflection.GeneratedProtocolMessageType('GeluLayerParams', (_message.Message,), { 'DESCRIPTOR' : _GELULAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.GeluLayerParams) }) _sym_db.RegisterMessage(GeluLayerParams) RangeStaticLayerParams = _reflection.GeneratedProtocolMessageType('RangeStaticLayerParams', (_message.Message,), { 'DESCRIPTOR' : _RANGESTATICLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.RangeStaticLayerParams) }) _sym_db.RegisterMessage(RangeStaticLayerParams) 
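# --- Editorial sketch (not part of the protoc output) ---------------------------
# Every block above follows the same generated pattern: _reflection.GeneratedProtocolMessageType
# builds a concrete Message subclass from the descriptor named under 'DESCRIPTOR', and
# _sym_db.RegisterMessage makes that class discoverable by its fully qualified proto name.
# A minimal, hypothetical helper (never called by this module) showing the lookup side:
def _example_lookup_registered_message():
    # GetSymbol resolves a full proto name to the generated class registered above.
    cls = _sym_db.GetSymbol("CoreML.Specification.GeluLayerParams")
    return cls()  # an empty GeluLayerParams message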
RangeDynamicLayerParams = _reflection.GeneratedProtocolMessageType('RangeDynamicLayerParams', (_message.Message,), { 'DESCRIPTOR' : _RANGEDYNAMICLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.RangeDynamicLayerParams) }) _sym_db.RegisterMessage(RangeDynamicLayerParams) SlidingWindowsLayerParams = _reflection.GeneratedProtocolMessageType('SlidingWindowsLayerParams', (_message.Message,), { 'DESCRIPTOR' : _SLIDINGWINDOWSLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SlidingWindowsLayerParams) }) _sym_db.RegisterMessage(SlidingWindowsLayerParams) LayerNormalizationLayerParams = _reflection.GeneratedProtocolMessageType('LayerNormalizationLayerParams', (_message.Message,), { 'DESCRIPTOR' : _LAYERNORMALIZATIONLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LayerNormalizationLayerParams) }) _sym_db.RegisterMessage(LayerNormalizationLayerParams) NonMaximumSuppressionLayerParams = _reflection.GeneratedProtocolMessageType('NonMaximumSuppressionLayerParams', (_message.Message,), { 'DESCRIPTOR' : _NONMAXIMUMSUPPRESSIONLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.NonMaximumSuppressionLayerParams) }) _sym_db.RegisterMessage(NonMaximumSuppressionLayerParams) ClampedReLULayerParams = _reflection.GeneratedProtocolMessageType('ClampedReLULayerParams', (_message.Message,), { 'DESCRIPTOR' : _CLAMPEDRELULAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ClampedReLULayerParams) }) _sym_db.RegisterMessage(ClampedReLULayerParams) ArgSortLayerParams = _reflection.GeneratedProtocolMessageType('ArgSortLayerParams', (_message.Message,), { 'DESCRIPTOR' : _ARGSORTLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.ArgSortLayerParams) }) _sym_db.RegisterMessage(ArgSortLayerParams) SliceBySizeLayerParams = _reflection.GeneratedProtocolMessageType('SliceBySizeLayerParams', (_message.Message,), { 'DESCRIPTOR' : _SLICEBYSIZELAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SliceBySizeLayerParams) }) _sym_db.RegisterMessage(SliceBySizeLayerParams) NeuralNetworkClassifier = _reflection.GeneratedProtocolMessageType('NeuralNetworkClassifier', (_message.Message,), { 'DESCRIPTOR' : _NEURALNETWORKCLASSIFIER, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.NeuralNetworkClassifier) }) _sym_db.RegisterMessage(NeuralNetworkClassifier) OneHotLayerParams = _reflection.GeneratedProtocolMessageType('OneHotLayerParams', (_message.Message,), { 'DESCRIPTOR' : _ONEHOTLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.OneHotLayerParams) }) _sym_db.RegisterMessage(OneHotLayerParams) CumSumLayerParams = _reflection.GeneratedProtocolMessageType('CumSumLayerParams', (_message.Message,), { 'DESCRIPTOR' : _CUMSUMLAYERPARAMS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CumSumLayerParams) }) _sym_db.RegisterMessage(CumSumLayerParams) NeuralNetworkRegressor = _reflection.GeneratedProtocolMessageType('NeuralNetworkRegressor', (_message.Message,), { 'DESCRIPTOR' : _NEURALNETWORKREGRESSOR, '__module__' : 'NeuralNetwork_pb2' # 
@@protoc_insertion_point(class_scope:CoreML.Specification.NeuralNetworkRegressor) }) _sym_db.RegisterMessage(NeuralNetworkRegressor) NetworkUpdateParameters = _reflection.GeneratedProtocolMessageType('NetworkUpdateParameters', (_message.Message,), { 'DESCRIPTOR' : _NETWORKUPDATEPARAMETERS, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.NetworkUpdateParameters) }) _sym_db.RegisterMessage(NetworkUpdateParameters) LossLayer = _reflection.GeneratedProtocolMessageType('LossLayer', (_message.Message,), { 'DESCRIPTOR' : _LOSSLAYER, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LossLayer) }) _sym_db.RegisterMessage(LossLayer) CategoricalCrossEntropyLossLayer = _reflection.GeneratedProtocolMessageType('CategoricalCrossEntropyLossLayer', (_message.Message,), { 'DESCRIPTOR' : _CATEGORICALCROSSENTROPYLOSSLAYER, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CategoricalCrossEntropyLossLayer) }) _sym_db.RegisterMessage(CategoricalCrossEntropyLossLayer) MeanSquaredErrorLossLayer = _reflection.GeneratedProtocolMessageType('MeanSquaredErrorLossLayer', (_message.Message,), { 'DESCRIPTOR' : _MEANSQUAREDERRORLOSSLAYER, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.MeanSquaredErrorLossLayer) }) _sym_db.RegisterMessage(MeanSquaredErrorLossLayer) Optimizer = _reflection.GeneratedProtocolMessageType('Optimizer', (_message.Message,), { 'DESCRIPTOR' : _OPTIMIZER, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Optimizer) }) _sym_db.RegisterMessage(Optimizer) SGDOptimizer = _reflection.GeneratedProtocolMessageType('SGDOptimizer', (_message.Message,), { 'DESCRIPTOR' : _SGDOPTIMIZER, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SGDOptimizer) }) _sym_db.RegisterMessage(SGDOptimizer) AdamOptimizer = _reflection.GeneratedProtocolMessageType('AdamOptimizer', (_message.Message,), { 'DESCRIPTOR' : _ADAMOPTIMIZER, '__module__' : 'NeuralNetwork_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.AdamOptimizer) }) _sym_db.RegisterMessage(AdamOptimizer) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _CUSTOMLAYERPARAMS_PARAMETERSENTRY._options = None _CUSTOMLAYERPARAMS_PARAMETERSENTRY._serialized_options = b'8\001' _NEURALNETWORKMULTIARRAYSHAPEMAPPING._serialized_start=33746 _NEURALNETWORKMULTIARRAYSHAPEMAPPING._serialized_end=33833 _NEURALNETWORKIMAGESHAPEMAPPING._serialized_start=33835 _NEURALNETWORKIMAGESHAPEMAPPING._serialized_end=33917 _SCATTERMODE._serialized_start=33920 _SCATTERMODE._serialized_end=34055 _NEURALNETWORK._serialized_start=86 _NEURALNETWORK._serialized_end=478 _NEURALNETWORKIMAGESCALER._serialized_start=480 _NEURALNETWORKIMAGESCALER._serialized_end=600 _NEURALNETWORKMEANIMAGE._serialized_start=602 _NEURALNETWORKMEANIMAGE._serialized_end=645 _NEURALNETWORKPREPROCESSING._serialized_start=648 _NEURALNETWORKPREPROCESSING._serialized_end=846 _ACTIVATIONRELU._serialized_start=848 _ACTIVATIONRELU._serialized_end=864 _ACTIVATIONLEAKYRELU._serialized_start=866 _ACTIVATIONLEAKYRELU._serialized_end=902 _ACTIVATIONTANH._serialized_start=904 _ACTIVATIONTANH._serialized_end=920 _ACTIVATIONSCALEDTANH._serialized_start=922 _ACTIVATIONSCALEDTANH._serialized_end=973 _ACTIVATIONSIGMOID._serialized_start=975 _ACTIVATIONSIGMOID._serialized_end=994 
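# Editorial note: the assignments in this "if _descriptor._USE_C_DESCRIPTORS == False:" block
# (the _options settings and the _serialized_start/_serialized_end pairs) record byte offsets of
# each message and enum within the serialized file descriptor added above; they are used by the
# pure-Python descriptor implementation and are skipped when the C++ implementation is active.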
_ACTIVATIONLINEAR._serialized_start=996 _ACTIVATIONLINEAR._serialized_end=1043 _ACTIVATIONSIGMOIDHARD._serialized_start=1045 _ACTIVATIONSIGMOIDHARD._serialized_end=1097 _ACTIVATIONPRELU._serialized_start=1099 _ACTIVATIONPRELU._serialized_end=1167 _ACTIVATIONELU._serialized_start=1169 _ACTIVATIONELU._serialized_end=1199 _ACTIVATIONTHRESHOLDEDRELU._serialized_start=1201 _ACTIVATIONTHRESHOLDEDRELU._serialized_end=1243 _ACTIVATIONSOFTSIGN._serialized_start=1245 _ACTIVATIONSOFTSIGN._serialized_end=1265 _ACTIVATIONSOFTPLUS._serialized_start=1267 _ACTIVATIONSOFTPLUS._serialized_end=1287 _ACTIVATIONPARAMETRICSOFTPLUS._serialized_start=1290 _ACTIVATIONPARAMETRICSOFTPLUS._serialized_end=1421 _ACTIVATIONPARAMS._serialized_start=1424 _ACTIVATIONPARAMS._serialized_end=2276 _TENSOR._serialized_start=2278 _TENSOR._serialized_end=2318 _NEURALNETWORKLAYER._serialized_start=2321 _NEURALNETWORKLAYER._serialized_end=13307 _BRANCHLAYERPARAMS._serialized_start=13310 _BRANCHLAYERPARAMS._serialized_end=13441 _LOOPLAYERPARAMS._serialized_start=13444 _LOOPLAYERPARAMS._serialized_end=13631 _LOOPBREAKLAYERPARAMS._serialized_start=13633 _LOOPBREAKLAYERPARAMS._serialized_end=13655 _LOOPCONTINUELAYERPARAMS._serialized_start=13657 _LOOPCONTINUELAYERPARAMS._serialized_end=13682 _COPYLAYERPARAMS._serialized_start=13684 _COPYLAYERPARAMS._serialized_end=13701 _GREATERTHANLAYERPARAMS._serialized_start=13703 _GREATERTHANLAYERPARAMS._serialized_end=13742 _GREATEREQUALLAYERPARAMS._serialized_start=13744 _GREATEREQUALLAYERPARAMS._serialized_end=13784 _LESSTHANLAYERPARAMS._serialized_start=13786 _LESSTHANLAYERPARAMS._serialized_end=13822 _LESSEQUALLAYERPARAMS._serialized_start=13824 _LESSEQUALLAYERPARAMS._serialized_end=13861 _EQUALLAYERPARAMS._serialized_start=13863 _EQUALLAYERPARAMS._serialized_end=13896 _NOTEQUALLAYERPARAMS._serialized_start=13898 _NOTEQUALLAYERPARAMS._serialized_end=13934 _LOGICALANDLAYERPARAMS._serialized_start=13936 _LOGICALANDLAYERPARAMS._serialized_end=13959 _LOGICALORLAYERPARAMS._serialized_start=13961 _LOGICALORLAYERPARAMS._serialized_end=13983 _LOGICALXORLAYERPARAMS._serialized_start=13985 _LOGICALXORLAYERPARAMS._serialized_end=14008 _LOGICALNOTLAYERPARAMS._serialized_start=14010 _LOGICALNOTLAYERPARAMS._serialized_end=14033 _BORDERAMOUNTS._serialized_start=14036 _BORDERAMOUNTS._serialized_end=14178 _BORDERAMOUNTS_EDGESIZES._serialized_start=14123 _BORDERAMOUNTS_EDGESIZES._serialized_end=14178 _VALIDPADDING._serialized_start=14180 _VALIDPADDING._serialized_end=14255 _SAMEPADDING._serialized_start=14258 _SAMEPADDING._serialized_end=14408 _SAMEPADDING_SAMEPADDINGMODE._serialized_start=14347 _SAMEPADDING_SAMEPADDINGMODE._serialized_end=14408 _SAMPLINGMODE._serialized_start=14411 _SAMPLINGMODE._serialized_end=14600 _SAMPLINGMODE_METHOD._serialized_start=14494 _SAMPLINGMODE_METHOD._serialized_end=14600 _BOXCOORDINATESMODE._serialized_start=14603 _BOXCOORDINATESMODE._serialized_end=14819 _BOXCOORDINATESMODE_COORDINATES._serialized_start=14696 _BOXCOORDINATESMODE_COORDINATES._serialized_end=14819 _WEIGHTPARAMS._serialized_start=14822 _WEIGHTPARAMS._serialized_end=15003 _QUANTIZATIONPARAMS._serialized_start=15006 _QUANTIZATIONPARAMS._serialized_end=15234 _LINEARQUANTIZATIONPARAMS._serialized_start=15236 _LINEARQUANTIZATIONPARAMS._serialized_end=15291 _LOOKUPTABLEQUANTIZATIONPARAMS._serialized_start=15293 _LOOKUPTABLEQUANTIZATIONPARAMS._serialized_end=15344 _CONVOLUTIONLAYERPARAMS._serialized_start=15347 _CONVOLUTIONLAYERPARAMS._serialized_end=15792 _CONVOLUTION3DLAYERPARAMS._serialized_start=15795 
_CONVOLUTION3DLAYERPARAMS._serialized_end=16543 _CONVOLUTION3DLAYERPARAMS_PADDINGTYPE._serialized_start=16497 _CONVOLUTION3DLAYERPARAMS_PADDINGTYPE._serialized_end=16543 _INNERPRODUCTLAYERPARAMS._serialized_start=16546 _INNERPRODUCTLAYERPARAMS._serialized_end=16767 _EMBEDDINGLAYERPARAMS._serialized_start=16770 _EMBEDDINGLAYERPARAMS._serialized_end=16954 _EMBEDDINGNDLAYERPARAMS._serialized_start=16957 _EMBEDDINGNDLAYERPARAMS._serialized_end=17143 _BATCHNORMLAYERPARAMS._serialized_start=17146 _BATCHNORMLAYERPARAMS._serialized_end=17463 _POOLINGLAYERPARAMS._serialized_start=17466 _POOLINGLAYERPARAMS._serialized_end=17954 _POOLINGLAYERPARAMS_VALIDCOMPLETEPADDING._serialized_start=17841 _POOLINGLAYERPARAMS_VALIDCOMPLETEPADDING._serialized_end=17887 _POOLINGLAYERPARAMS_POOLINGTYPE._serialized_start=17889 _POOLINGLAYERPARAMS_POOLINGTYPE._serialized_end=17932 _POOLING3DLAYERPARAMS._serialized_start=17957 _POOLING3DLAYERPARAMS._serialized_end=18555 _POOLING3DLAYERPARAMS_POOLINGTYPE3D._serialized_start=18461 _POOLING3DLAYERPARAMS_POOLINGTYPE3D._serialized_end=18498 _POOLING3DLAYERPARAMS_POOLING3DPADDINGTYPE._serialized_start=18500 _POOLING3DLAYERPARAMS_POOLING3DPADDINGTYPE._serialized_end=18555 _GLOBALPOOLING3DLAYERPARAMS._serialized_start=18558 _GLOBALPOOLING3DLAYERPARAMS._serialized_end=18715 _GLOBALPOOLING3DLAYERPARAMS_GLOBALPOOLINGTYPE3D._serialized_start=18672 _GLOBALPOOLING3DLAYERPARAMS_GLOBALPOOLINGTYPE3D._serialized_end=18715 _PADDINGLAYERPARAMS._serialized_start=18718 _PADDINGLAYERPARAMS._serialized_end=19135 _PADDINGLAYERPARAMS_PADDINGCONSTANT._serialized_start=19045 _PADDINGLAYERPARAMS_PADDINGCONSTANT._serialized_end=19077 _PADDINGLAYERPARAMS_PADDINGREFLECTION._serialized_start=19079 _PADDINGLAYERPARAMS_PADDINGREFLECTION._serialized_end=19098 _PADDINGLAYERPARAMS_PADDINGREPLICATION._serialized_start=19100 _PADDINGLAYERPARAMS_PADDINGREPLICATION._serialized_end=19120 _CONCATLAYERPARAMS._serialized_start=19137 _CONCATLAYERPARAMS._serialized_end=19180 _LRNLAYERPARAMS._serialized_start=19182 _LRNLAYERPARAMS._serialized_end=19257 _SOFTMAXLAYERPARAMS._serialized_start=19259 _SOFTMAXLAYERPARAMS._serialized_end=19279 _SPLITLAYERPARAMS._serialized_start=19281 _SPLITLAYERPARAMS._serialized_end=19317 _ADDLAYERPARAMS._serialized_start=19319 _ADDLAYERPARAMS._serialized_end=19350 _MULTIPLYLAYERPARAMS._serialized_start=19352 _MULTIPLYLAYERPARAMS._serialized_end=19388 _UNARYFUNCTIONLAYERPARAMS._serialized_start=19391 _UNARYFUNCTIONLAYERPARAMS._serialized_end=19651 _UNARYFUNCTIONLAYERPARAMS_OPERATION._serialized_start=19553 _UNARYFUNCTIONLAYERPARAMS_OPERATION._serialized_end=19651 _UPSAMPLELAYERPARAMS._serialized_start=19654 _UPSAMPLELAYERPARAMS._serialized_end=20023 _UPSAMPLELAYERPARAMS_INTERPOLATIONMODE._serialized_start=19898 _UPSAMPLELAYERPARAMS_INTERPOLATIONMODE._serialized_end=19939 _UPSAMPLELAYERPARAMS_LINEARUPSAMPLEMODE._serialized_start=19941 _UPSAMPLELAYERPARAMS_LINEARUPSAMPLEMODE._serialized_end=20023 _RESIZEBILINEARLAYERPARAMS._serialized_start=20025 _RESIZEBILINEARLAYERPARAMS._serialized_end=20122 _CROPRESIZELAYERPARAMS._serialized_start=20125 _CROPRESIZELAYERPARAMS._serialized_end=20337 _BIASLAYERPARAMS._serialized_start=20339 _BIASLAYERPARAMS._serialized_end=20421 _SCALELAYERPARAMS._serialized_start=20424 _SCALELAYERPARAMS._serialized_end=20599 _LOADCONSTANTLAYERPARAMS._serialized_start=20601 _LOADCONSTANTLAYERPARAMS._serialized_end=20691 _L2NORMALIZELAYERPARAMS._serialized_start=20693 _L2NORMALIZELAYERPARAMS._serialized_end=20734 _FLATTENLAYERPARAMS._serialized_start=20737 
_FLATTENLAYERPARAMS._serialized_end=20879 _FLATTENLAYERPARAMS_FLATTENORDER._serialized_start=20828 _FLATTENLAYERPARAMS_FLATTENORDER._serialized_end=20879 _RESHAPELAYERPARAMS._serialized_start=20882 _RESHAPELAYERPARAMS._serialized_end=21045 _RESHAPELAYERPARAMS_RESHAPEORDER._serialized_start=20994 _RESHAPELAYERPARAMS_RESHAPEORDER._serialized_end=21045 _PERMUTELAYERPARAMS._serialized_start=21047 _PERMUTELAYERPARAMS._serialized_end=21081 _REORGANIZEDATALAYERPARAMS._serialized_start=21084 _REORGANIZEDATALAYERPARAMS._serialized_end=21293 _REORGANIZEDATALAYERPARAMS_REORGANIZATIONTYPE._serialized_start=21214 _REORGANIZEDATALAYERPARAMS_REORGANIZATIONTYPE._serialized_end=21293 _SLICELAYERPARAMS._serialized_start=21296 _SLICELAYERPARAMS._serialized_end=21496 _SLICELAYERPARAMS_SLICEAXIS._serialized_start=21434 _SLICELAYERPARAMS_SLICEAXIS._serialized_end=21496 _REDUCELAYERPARAMS._serialized_start=21499 _REDUCELAYERPARAMS._serialized_end=21844 _REDUCELAYERPARAMS_REDUCEOPERATION._serialized_start=21674 _REDUCELAYERPARAMS_REDUCEOPERATION._serialized_end=21792 _REDUCELAYERPARAMS_REDUCEAXIS._serialized_start=21794 _REDUCELAYERPARAMS_REDUCEAXIS._serialized_end=21844 _CROPLAYERPARAMS._serialized_start=21846 _CROPLAYERPARAMS._serialized_end=21937 _AVERAGELAYERPARAMS._serialized_start=21939 _AVERAGELAYERPARAMS._serialized_end=21959 _MAXLAYERPARAMS._serialized_start=21961 _MAXLAYERPARAMS._serialized_end=21977 _MINLAYERPARAMS._serialized_start=21979 _MINLAYERPARAMS._serialized_end=21995 _DOTPRODUCTLAYERPARAMS._serialized_start=21997 _DOTPRODUCTLAYERPARAMS._serialized_end=22046 _MEANVARIANCENORMALIZELAYERPARAMS._serialized_start=22048 _MEANVARIANCENORMALIZELAYERPARAMS._serialized_end=22150 _SEQUENCEREPEATLAYERPARAMS._serialized_start=22152 _SEQUENCEREPEATLAYERPARAMS._serialized_end=22201 _SIMPLERECURRENTLAYERPARAMS._serialized_start=22204 _SIMPLERECURRENTLAYERPARAMS._serialized_end=22587 _GRULAYERPARAMS._serialized_start=22590 _GRULAYERPARAMS._serialized_end=23400 _LSTMPARAMS._serialized_start=23403 _LSTMPARAMS._serialized_end=23573 _LSTMWEIGHTPARAMS._serialized_start=23576 _LSTMWEIGHTPARAMS._serialized_end=24620 _UNIDIRECTIONALLSTMLAYERPARAMS._serialized_start=24623 _UNIDIRECTIONALLSTMLAYERPARAMS._serialized_end=24900 _BIDIRECTIONALLSTMLAYERPARAMS._serialized_start=24903 _BIDIRECTIONALLSTMLAYERPARAMS._serialized_end=25241 _CUSTOMLAYERPARAMS._serialized_start=25244 _CUSTOMLAYERPARAMS._serialized_end=25690 _CUSTOMLAYERPARAMS_CUSTOMLAYERPARAMVALUE._serialized_start=25436 _CUSTOMLAYERPARAMS_CUSTOMLAYERPARAMVALUE._serialized_end=25576 _CUSTOMLAYERPARAMS_PARAMETERSENTRY._serialized_start=25578 _CUSTOMLAYERPARAMS_PARAMETERSENTRY._serialized_end=25690 _TRANSPOSELAYERPARAMS._serialized_start=25692 _TRANSPOSELAYERPARAMS._serialized_end=25728 _BATCHEDMATMULLAYERPARAMS._serialized_start=25731 _BATCHEDMATMULLAYERPARAMS._serialized_end=26019 _CONCATNDLAYERPARAMS._serialized_start=26021 _CONCATNDLAYERPARAMS._serialized_end=26076 _SOFTMAXNDLAYERPARAMS._serialized_start=26078 _SOFTMAXNDLAYERPARAMS._serialized_end=26114 _REVERSELAYERPARAMS._serialized_start=26116 _REVERSELAYERPARAMS._serialized_end=26156 _REVERSESEQLAYERPARAMS._serialized_start=26158 _REVERSESEQLAYERPARAMS._serialized_end=26222 _LOADCONSTANTNDLAYERPARAMS._serialized_start=26224 _LOADCONSTANTNDLAYERPARAMS._serialized_end=26316 _FILLLIKELAYERPARAMS._serialized_start=26318 _FILLLIKELAYERPARAMS._serialized_end=26354 _FILLSTATICLAYERPARAMS._serialized_start=26356 _FILLSTATICLAYERPARAMS._serialized_end=26415 _FILLDYNAMICLAYERPARAMS._serialized_start=26417 
_FILLDYNAMICLAYERPARAMS._serialized_end=26456 _WHEREBROADCASTABLELAYERPARAMS._serialized_start=26458 _WHEREBROADCASTABLELAYERPARAMS._serialized_end=26489 _SINLAYERPARAMS._serialized_start=26491 _SINLAYERPARAMS._serialized_end=26507 _COSLAYERPARAMS._serialized_start=26509 _COSLAYERPARAMS._serialized_end=26525 _TANLAYERPARAMS._serialized_start=26527 _TANLAYERPARAMS._serialized_end=26543 _ASINLAYERPARAMS._serialized_start=26545 _ASINLAYERPARAMS._serialized_end=26562 _ACOSLAYERPARAMS._serialized_start=26564 _ACOSLAYERPARAMS._serialized_end=26581 _ATANLAYERPARAMS._serialized_start=26583 _ATANLAYERPARAMS._serialized_end=26600 _SINHLAYERPARAMS._serialized_start=26602 _SINHLAYERPARAMS._serialized_end=26619 _COSHLAYERPARAMS._serialized_start=26621 _COSHLAYERPARAMS._serialized_end=26638 _TANHLAYERPARAMS._serialized_start=26640 _TANHLAYERPARAMS._serialized_end=26657 _ASINHLAYERPARAMS._serialized_start=26659 _ASINHLAYERPARAMS._serialized_end=26677 _ACOSHLAYERPARAMS._serialized_start=26679 _ACOSHLAYERPARAMS._serialized_end=26697 _ATANHLAYERPARAMS._serialized_start=26699 _ATANHLAYERPARAMS._serialized_end=26717 _POWBROADCASTABLELAYERPARAMS._serialized_start=26719 _POWBROADCASTABLELAYERPARAMS._serialized_end=26748 _EXP2LAYERPARAMS._serialized_start=26750 _EXP2LAYERPARAMS._serialized_end=26767 _WHERENONZEROLAYERPARAMS._serialized_start=26769 _WHERENONZEROLAYERPARAMS._serialized_end=26794 _MATRIXBANDPARTLAYERPARAMS._serialized_start=26796 _MATRIXBANDPARTLAYERPARAMS._serialized_end=26859 _UPPERTRIANGULARLAYERPARAMS._serialized_start=26861 _UPPERTRIANGULARLAYERPARAMS._serialized_end=26900 _LOWERTRIANGULARLAYERPARAMS._serialized_start=26902 _LOWERTRIANGULARLAYERPARAMS._serialized_end=26941 _BROADCASTTOLIKELAYERPARAMS._serialized_start=26943 _BROADCASTTOLIKELAYERPARAMS._serialized_end=26971 _BROADCASTTOSTATICLAYERPARAMS._serialized_start=26973 _BROADCASTTOSTATICLAYERPARAMS._serialized_end=27024 _BROADCASTTODYNAMICLAYERPARAMS._serialized_start=27026 _BROADCASTTODYNAMICLAYERPARAMS._serialized_end=27057 _ADDBROADCASTABLELAYERPARAMS._serialized_start=27059 _ADDBROADCASTABLELAYERPARAMS._serialized_end=27088 _MAXBROADCASTABLELAYERPARAMS._serialized_start=27090 _MAXBROADCASTABLELAYERPARAMS._serialized_end=27119 _MINBROADCASTABLELAYERPARAMS._serialized_start=27121 _MINBROADCASTABLELAYERPARAMS._serialized_end=27150 _MODBROADCASTABLELAYERPARAMS._serialized_start=27152 _MODBROADCASTABLELAYERPARAMS._serialized_end=27181 _FLOORDIVBROADCASTABLELAYERPARAMS._serialized_start=27183 _FLOORDIVBROADCASTABLELAYERPARAMS._serialized_end=27217 _SUBTRACTBROADCASTABLELAYERPARAMS._serialized_start=27219 _SUBTRACTBROADCASTABLELAYERPARAMS._serialized_end=27253 _MULTIPLYBROADCASTABLELAYERPARAMS._serialized_start=27255 _MULTIPLYBROADCASTABLELAYERPARAMS._serialized_end=27289 _DIVIDEBROADCASTABLELAYERPARAMS._serialized_start=27291 _DIVIDEBROADCASTABLELAYERPARAMS._serialized_end=27323 _GATHERLAYERPARAMS._serialized_start=27325 _GATHERLAYERPARAMS._serialized_end=27358 _SCATTERLAYERPARAMS._serialized_start=27360 _SCATTERLAYERPARAMS._serialized_end=27443 _GATHERNDLAYERPARAMS._serialized_start=27445 _GATHERNDLAYERPARAMS._serialized_end=27466 _SCATTERNDLAYERPARAMS._serialized_start=27468 _SCATTERNDLAYERPARAMS._serialized_end=27539 _GATHERALONGAXISLAYERPARAMS._serialized_start=27541 _GATHERALONGAXISLAYERPARAMS._serialized_end=27583 _SCATTERALONGAXISLAYERPARAMS._serialized_start=27585 _SCATTERALONGAXISLAYERPARAMS._serialized_end=27677 _STACKLAYERPARAMS._serialized_start=27679 _STACKLAYERPARAMS._serialized_end=27711 
_RANKPRESERVINGRESHAPELAYERPARAMS._serialized_start=27713 _RANKPRESERVINGRESHAPELAYERPARAMS._serialized_end=27768 _CONSTANTPADDINGLAYERPARAMS._serialized_start=27770 _CONSTANTPADDINGLAYERPARAMS._serialized_end=27867 _RANDOMNORMALLIKELAYERPARAMS._serialized_start=27869 _RANDOMNORMALLIKELAYERPARAMS._serialized_end=27942 _RANDOMNORMALSTATICLAYERPARAMS._serialized_start=27944 _RANDOMNORMALSTATICLAYERPARAMS._serialized_end=28040 _RANDOMNORMALDYNAMICLAYERPARAMS._serialized_start=28042 _RANDOMNORMALDYNAMICLAYERPARAMS._serialized_end=28118 _RANDOMUNIFORMLIKELAYERPARAMS._serialized_start=28120 _RANDOMUNIFORMLIKELAYERPARAMS._serialized_end=28196 _RANDOMUNIFORMSTATICLAYERPARAMS._serialized_start=28198 _RANDOMUNIFORMSTATICLAYERPARAMS._serialized_end=28297 _RANDOMUNIFORMDYNAMICLAYERPARAMS._serialized_start=28299 _RANDOMUNIFORMDYNAMICLAYERPARAMS._serialized_end=28378 _RANDOMBERNOULLILIKELAYERPARAMS._serialized_start=28380 _RANDOMBERNOULLILIKELAYERPARAMS._serialized_end=28440 _RANDOMBERNOULLISTATICLAYERPARAMS._serialized_start=28442 _RANDOMBERNOULLISTATICLAYERPARAMS._serialized_end=28525 _RANDOMBERNOULLIDYNAMICLAYERPARAMS._serialized_start=28527 _RANDOMBERNOULLIDYNAMICLAYERPARAMS._serialized_end=28590 _CATEGORICALDISTRIBUTIONLAYERPARAMS._serialized_start=28592 _CATEGORICALDISTRIBUTIONLAYERPARAMS._serialized_end=28714 _REDUCEL1LAYERPARAMS._serialized_start=28716 _REDUCEL1LAYERPARAMS._serialized_end=28788 _REDUCEL2LAYERPARAMS._serialized_start=28790 _REDUCEL2LAYERPARAMS._serialized_end=28862 _REDUCEMAXLAYERPARAMS._serialized_start=28864 _REDUCEMAXLAYERPARAMS._serialized_end=28937 _REDUCEMINLAYERPARAMS._serialized_start=28939 _REDUCEMINLAYERPARAMS._serialized_end=29012 _REDUCESUMLAYERPARAMS._serialized_start=29014 _REDUCESUMLAYERPARAMS._serialized_end=29087 _REDUCEPRODLAYERPARAMS._serialized_start=29089 _REDUCEPRODLAYERPARAMS._serialized_end=29163 _REDUCEMEANLAYERPARAMS._serialized_start=29165 _REDUCEMEANLAYERPARAMS._serialized_end=29239 _REDUCELOGSUMLAYERPARAMS._serialized_start=29241 _REDUCELOGSUMLAYERPARAMS._serialized_end=29317 _REDUCESUMSQUARELAYERPARAMS._serialized_start=29319 _REDUCESUMSQUARELAYERPARAMS._serialized_end=29398 _REDUCELOGSUMEXPLAYERPARAMS._serialized_start=29400 _REDUCELOGSUMEXPLAYERPARAMS._serialized_end=29479 _EXPANDDIMSLAYERPARAMS._serialized_start=29481 _EXPANDDIMSLAYERPARAMS._serialized_end=29518 _FLATTENTO2DLAYERPARAMS._serialized_start=29520 _FLATTENTO2DLAYERPARAMS._serialized_end=29558 _RESHAPESTATICLAYERPARAMS._serialized_start=29560 _RESHAPESTATICLAYERPARAMS._serialized_end=29607 _RESHAPELIKELAYERPARAMS._serialized_start=29609 _RESHAPELIKELAYERPARAMS._serialized_end=29633 _RESHAPEDYNAMICLAYERPARAMS._serialized_start=29635 _RESHAPEDYNAMICLAYERPARAMS._serialized_end=29662 _SQUEEZELAYERPARAMS._serialized_start=29664 _SQUEEZELAYERPARAMS._serialized_end=29718 _TOPKLAYERPARAMS._serialized_start=29720 _TOPKLAYERPARAMS._serialized_end=29782 _ARGMAXLAYERPARAMS._serialized_start=29784 _ARGMAXLAYERPARAMS._serialized_end=29836 _ARGMINLAYERPARAMS._serialized_start=29838 _ARGMINLAYERPARAMS._serialized_end=29890 _SPLITNDLAYERPARAMS._serialized_start=29892 _SPLITNDLAYERPARAMS._serialized_end=29965 _CEILLAYERPARAMS._serialized_start=29967 _CEILLAYERPARAMS._serialized_end=29984 _ROUNDLAYERPARAMS._serialized_start=29986 _ROUNDLAYERPARAMS._serialized_end=30004 _FLOORLAYERPARAMS._serialized_start=30006 _FLOORLAYERPARAMS._serialized_end=30024 _SIGNLAYERPARAMS._serialized_start=30026 _SIGNLAYERPARAMS._serialized_end=30043 _CLIPLAYERPARAMS._serialized_start=30045 
_CLIPLAYERPARAMS._serialized_end=30094 _SLICESTATICLAYERPARAMS._serialized_start=30097 _SLICESTATICLAYERPARAMS._serialized_end=30232 _SLICEDYNAMICLAYERPARAMS._serialized_start=30234 _SLICEDYNAMICLAYERPARAMS._serialized_end=30352 _TILELAYERPARAMS._serialized_start=30354 _TILELAYERPARAMS._serialized_end=30385 _GETSHAPELAYERPARAMS._serialized_start=30387 _GETSHAPELAYERPARAMS._serialized_end=30408 _ERFLAYERPARAMS._serialized_start=30410 _ERFLAYERPARAMS._serialized_end=30426 _GELULAYERPARAMS._serialized_start=30429 _GELULAYERPARAMS._serialized_end=30582 _GELULAYERPARAMS_GELUMODE._serialized_start=30510 _GELULAYERPARAMS_GELUMODE._serialized_end=30582 _RANGESTATICLAYERPARAMS._serialized_start=30584 _RANGESTATICLAYERPARAMS._serialized_end=30669 _RANGEDYNAMICLAYERPARAMS._serialized_start=30671 _RANGEDYNAMICLAYERPARAMS._serialized_end=30739 _SLIDINGWINDOWSLAYERPARAMS._serialized_start=30741 _SLIDINGWINDOWSLAYERPARAMS._serialized_end=30816 _LAYERNORMALIZATIONLAYERPARAMS._serialized_start=30819 _LAYERNORMALIZATIONLAYERPARAMS._serialized_end=30989 _NONMAXIMUMSUPPRESSIONLAYERPARAMS._serialized_start=30991 _NONMAXIMUMSUPPRESSIONLAYERPARAMS._serialized_end=31118 _CLAMPEDRELULAYERPARAMS._serialized_start=31120 _CLAMPEDRELULAYERPARAMS._serialized_end=31173 _ARGSORTLAYERPARAMS._serialized_start=31175 _ARGSORTLAYERPARAMS._serialized_end=31229 _SLICEBYSIZELAYERPARAMS._serialized_start=31231 _SLICEBYSIZELAYERPARAMS._serialized_end=31283 _NEURALNETWORKCLASSIFIER._serialized_start=31286 _NEURALNETWORKCLASSIFIER._serialized_end=31867 _ONEHOTLAYERPARAMS._serialized_start=31869 _ONEHOTLAYERPARAMS._serialized_end=31963 _CUMSUMLAYERPARAMS._serialized_start=31965 _CUMSUMLAYERPARAMS._serialized_end=32040 _NEURALNETWORKREGRESSOR._serialized_start=32043 _NEURALNETWORKREGRESSOR._serialized_end=32444 _NETWORKUPDATEPARAMETERS._serialized_start=32447 _NETWORKUPDATEPARAMETERS._serialized_end=32737 _LOSSLAYER._serialized_start=32740 _LOSSLAYER._serialized_end=32968 _CATEGORICALCROSSENTROPYLOSSLAYER._serialized_start=32970 _CATEGORICALCROSSENTROPYLOSSLAYER._serialized_end=33035 _MEANSQUAREDERRORLOSSLAYER._serialized_start=33037 _MEANSQUAREDERRORLOSSLAYER._serialized_end=33095 _OPTIMIZER._serialized_start=33098 _OPTIMIZER._serialized_end=33248 _SGDOPTIMIZER._serialized_start=33251 _SGDOPTIMIZER._serialized_end=33444 _ADAMOPTIMIZER._serialized_start=33447 _ADAMOPTIMIZER._serialized_end=33744 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/NonMaximumSuppression_pb2.py0000644000000000000000000000576214672066616023676 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: NonMaximumSuppression.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() from . 
import DataStructures_pb2 as DataStructures__pb2 try: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes_pb2 from .DataStructures_pb2 import * DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x1bNonMaximumSuppression.proto\x12\x14\x43oreML.Specification\x1a\x14\x44\x61taStructures.proto\"\xc0\x04\n\x15NonMaximumSuppression\x12\x46\n\x07pickTop\x18\x01 \x01(\x0b\x32\x33.CoreML.Specification.NonMaximumSuppression.PickTopH\x00\x12?\n\x11stringClassLabels\x18\x64 \x01(\x0b\x32\".CoreML.Specification.StringVectorH\x01\x12=\n\x10int64ClassLabels\x18\x65 \x01(\x0b\x32!.CoreML.Specification.Int64VectorH\x01\x12\x14\n\x0ciouThreshold\x18n \x01(\x01\x12\x1b\n\x13\x63onfidenceThreshold\x18o \x01(\x01\x12#\n\x1a\x63onfidenceInputFeatureName\x18\xc8\x01 \x01(\t\x12$\n\x1b\x63oordinatesInputFeatureName\x18\xc9\x01 \x01(\t\x12%\n\x1ciouThresholdInputFeatureName\x18\xca\x01 \x01(\t\x12,\n#confidenceThresholdInputFeatureName\x18\xcb\x01 \x01(\t\x12$\n\x1b\x63onfidenceOutputFeatureName\x18\xd2\x01 \x01(\t\x12%\n\x1c\x63oordinatesOutputFeatureName\x18\xd3\x01 \x01(\t\x1a\x1b\n\x07PickTop\x12\x10\n\x08perClass\x18\x01 \x01(\x08\x42\x13\n\x11SuppressionMethodB\r\n\x0b\x43lassLabelsB\x02H\x03P\x00\x62\x06proto3') _NONMAXIMUMSUPPRESSION = DESCRIPTOR.message_types_by_name['NonMaximumSuppression'] _NONMAXIMUMSUPPRESSION_PICKTOP = _NONMAXIMUMSUPPRESSION.nested_types_by_name['PickTop'] NonMaximumSuppression = _reflection.GeneratedProtocolMessageType('NonMaximumSuppression', (_message.Message,), { 'PickTop' : _reflection.GeneratedProtocolMessageType('PickTop', (_message.Message,), { 'DESCRIPTOR' : _NONMAXIMUMSUPPRESSION_PICKTOP, '__module__' : 'NonMaximumSuppression_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.NonMaximumSuppression.PickTop) }) , 'DESCRIPTOR' : _NONMAXIMUMSUPPRESSION, '__module__' : 'NonMaximumSuppression_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.NonMaximumSuppression) }) _sym_db.RegisterMessage(NonMaximumSuppression) _sym_db.RegisterMessage(NonMaximumSuppression.PickTop) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _NONMAXIMUMSUPPRESSION._serialized_start=76 _NONMAXIMUMSUPPRESSION._serialized_end=652 _NONMAXIMUMSUPPRESSION_PICKTOP._serialized_start=589 _NONMAXIMUMSUPPRESSION_PICKTOP._serialized_end=616 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/Normalizer_pb2.py0000644000000000000000000000303114672066616021440 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! 
# source: Normalizer.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x10Normalizer.proto\x12\x14\x43oreML.Specification\"o\n\nNormalizer\x12;\n\x08normType\x18\x01 \x01(\x0e\x32).CoreML.Specification.Normalizer.NormType\"$\n\x08NormType\x12\x08\n\x04LMax\x10\x00\x12\x06\n\x02L1\x10\x01\x12\x06\n\x02L2\x10\x02\x42\x02H\x03\x62\x06proto3') _NORMALIZER = DESCRIPTOR.message_types_by_name['Normalizer'] _NORMALIZER_NORMTYPE = _NORMALIZER.enum_types_by_name['NormType'] Normalizer = _reflection.GeneratedProtocolMessageType('Normalizer', (_message.Message,), { 'DESCRIPTOR' : _NORMALIZER, '__module__' : 'Normalizer_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Normalizer) }) _sym_db.RegisterMessage(Normalizer) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _NORMALIZER._serialized_start=42 _NORMALIZER._serialized_end=153 _NORMALIZER_NORMTYPE._serialized_start=117 _NORMALIZER_NORMTYPE._serialized_end=153 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/OneHotEncoder_pb2.py0000644000000000000000000000420214672066616022013 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: OneHotEncoder.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() from . 
import DataStructures_pb2 as DataStructures__pb2 try: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes_pb2 from .DataStructures_pb2 import * DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x13OneHotEncoder.proto\x12\x14\x43oreML.Specification\x1a\x14\x44\x61taStructures.proto\"\xb5\x02\n\rOneHotEncoder\x12>\n\x10stringCategories\x18\x01 \x01(\x0b\x32\".CoreML.Specification.StringVectorH\x00\x12<\n\x0fint64Categories\x18\x02 \x01(\x0b\x32!.CoreML.Specification.Int64VectorH\x00\x12\x14\n\x0coutputSparse\x18\n \x01(\x08\x12H\n\rhandleUnknown\x18\x0b \x01(\x0e\x32\x31.CoreML.Specification.OneHotEncoder.HandleUnknown\"6\n\rHandleUnknown\x12\x12\n\x0e\x45rrorOnUnknown\x10\x00\x12\x11\n\rIgnoreUnknown\x10\x01\x42\x0e\n\x0c\x43\x61tegoryTypeB\x02H\x03P\x00\x62\x06proto3') _ONEHOTENCODER = DESCRIPTOR.message_types_by_name['OneHotEncoder'] _ONEHOTENCODER_HANDLEUNKNOWN = _ONEHOTENCODER.enum_types_by_name['HandleUnknown'] OneHotEncoder = _reflection.GeneratedProtocolMessageType('OneHotEncoder', (_message.Message,), { 'DESCRIPTOR' : _ONEHOTENCODER, '__module__' : 'OneHotEncoder_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.OneHotEncoder) }) _sym_db.RegisterMessage(OneHotEncoder) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _ONEHOTENCODER._serialized_start=68 _ONEHOTENCODER._serialized_end=377 _ONEHOTENCODER_HANDLEUNKNOWN._serialized_start=307 _ONEHOTENCODER_HANDLEUNKNOWN._serialized_end=361 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/Parameters_pb2.py0000644000000000000000000000657714672066616021443 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: Parameters.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() from . 
import DataStructures_pb2 as DataStructures__pb2 try: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes_pb2 from .DataStructures_pb2 import * DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x10Parameters.proto\x12\x14\x43oreML.Specification\x1a\x14\x44\x61taStructures.proto\"\x99\x01\n\x0eInt64Parameter\x12\x14\n\x0c\x64\x65\x66\x61ultValue\x18\x01 \x01(\x03\x12\x31\n\x05range\x18\n \x01(\x0b\x32 .CoreML.Specification.Int64RangeH\x00\x12-\n\x03set\x18\x0b \x01(\x0b\x32\x1e.CoreML.Specification.Int64SetH\x00\x42\x0f\n\rAllowedValues\"l\n\x0f\x44oubleParameter\x12\x14\n\x0c\x64\x65\x66\x61ultValue\x18\x01 \x01(\x01\x12\x32\n\x05range\x18\n \x01(\x0b\x32!.CoreML.Specification.DoubleRangeH\x00\x42\x0f\n\rAllowedValues\"\'\n\x0fStringParameter\x12\x14\n\x0c\x64\x65\x66\x61ultValue\x18\x01 \x01(\t\"%\n\rBoolParameter\x12\x14\n\x0c\x64\x65\x66\x61ultValue\x18\x01 \x01(\x08\x42\x02H\x03P\x00\x62\x06proto3') _INT64PARAMETER = DESCRIPTOR.message_types_by_name['Int64Parameter'] _DOUBLEPARAMETER = DESCRIPTOR.message_types_by_name['DoubleParameter'] _STRINGPARAMETER = DESCRIPTOR.message_types_by_name['StringParameter'] _BOOLPARAMETER = DESCRIPTOR.message_types_by_name['BoolParameter'] Int64Parameter = _reflection.GeneratedProtocolMessageType('Int64Parameter', (_message.Message,), { 'DESCRIPTOR' : _INT64PARAMETER, '__module__' : 'Parameters_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Int64Parameter) }) _sym_db.RegisterMessage(Int64Parameter) DoubleParameter = _reflection.GeneratedProtocolMessageType('DoubleParameter', (_message.Message,), { 'DESCRIPTOR' : _DOUBLEPARAMETER, '__module__' : 'Parameters_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.DoubleParameter) }) _sym_db.RegisterMessage(DoubleParameter) StringParameter = _reflection.GeneratedProtocolMessageType('StringParameter', (_message.Message,), { 'DESCRIPTOR' : _STRINGPARAMETER, '__module__' : 'Parameters_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.StringParameter) }) _sym_db.RegisterMessage(StringParameter) BoolParameter = _reflection.GeneratedProtocolMessageType('BoolParameter', (_message.Message,), { 'DESCRIPTOR' : _BOOLPARAMETER, '__module__' : 'Parameters_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.BoolParameter) }) _sym_db.RegisterMessage(BoolParameter) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _INT64PARAMETER._serialized_start=65 _INT64PARAMETER._serialized_end=218 _DOUBLEPARAMETER._serialized_start=220 _DOUBLEPARAMETER._serialized_end=328 _STRINGPARAMETER._serialized_start=330 _STRINGPARAMETER._serialized_end=369 _BOOLPARAMETER._serialized_start=371 _BOOLPARAMETER._serialized_end=408 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/SVM_pb2.py0000644000000000000000000002163214672066616017772 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! 
# source: SVM.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() from . import DataStructures_pb2 as DataStructures__pb2 try: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes_pb2 from .DataStructures_pb2 import * DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\tSVM.proto\x12\x14\x43oreML.Specification\x1a\x14\x44\x61taStructures.proto\"\x0e\n\x0cLinearKernel\"\x1a\n\tRBFKernel\x12\r\n\x05gamma\x18\x01 \x01(\x01\"6\n\nPolyKernel\x12\x0e\n\x06\x64\x65gree\x18\x01 \x01(\x05\x12\t\n\x01\x63\x18\x02 \x01(\x01\x12\r\n\x05gamma\x18\x03 \x01(\x01\")\n\rSigmoidKernel\x12\r\n\x05gamma\x18\x01 \x01(\x01\x12\t\n\x01\x63\x18\x02 \x01(\x01\"\xfa\x01\n\x06Kernel\x12:\n\x0clinearKernel\x18\x01 \x01(\x0b\x32\".CoreML.Specification.LinearKernelH\x00\x12\x34\n\trbfKernel\x18\x02 \x01(\x0b\x32\x1f.CoreML.Specification.RBFKernelH\x00\x12\x36\n\npolyKernel\x18\x03 \x01(\x0b\x32 .CoreML.Specification.PolyKernelH\x00\x12<\n\rsigmoidKernel\x18\x04 \x01(\x0b\x32#.CoreML.Specification.SigmoidKernelH\x00\x42\x08\n\x06kernel\"*\n\nSparseNode\x12\r\n\x05index\x18\x01 \x01(\x05\x12\r\n\x05value\x18\x02 \x01(\x01\"?\n\x0cSparseVector\x12/\n\x05nodes\x18\x01 \x03(\x0b\x32 .CoreML.Specification.SparseNode\"K\n\x14SparseSupportVectors\x12\x33\n\x07vectors\x18\x01 \x03(\x0b\x32\".CoreML.Specification.SparseVector\"\x1d\n\x0b\x44\x65nseVector\x12\x0e\n\x06values\x18\x01 \x03(\x01\"I\n\x13\x44\x65nseSupportVectors\x12\x32\n\x07vectors\x18\x01 \x03(\x0b\x32!.CoreML.Specification.DenseVector\"\x1d\n\x0c\x43oefficients\x12\r\n\x05\x61lpha\x18\x01 \x03(\x01\"\xb5\x02\n\x16SupportVectorRegressor\x12,\n\x06kernel\x18\x01 \x01(\x0b\x32\x1c.CoreML.Specification.Kernel\x12J\n\x14sparseSupportVectors\x18\x02 \x01(\x0b\x32*.CoreML.Specification.SparseSupportVectorsH\x00\x12H\n\x13\x64\x65nseSupportVectors\x18\x03 \x01(\x0b\x32).CoreML.Specification.DenseSupportVectorsH\x00\x12\x38\n\x0c\x63oefficients\x18\x04 \x01(\x0b\x32\".CoreML.Specification.Coefficients\x12\x0b\n\x03rho\x18\x05 \x01(\x01\x42\x10\n\x0esupportVectors\"\x8b\x04\n\x17SupportVectorClassifier\x12,\n\x06kernel\x18\x01 \x01(\x0b\x32\x1c.CoreML.Specification.Kernel\x12&\n\x1enumberOfSupportVectorsPerClass\x18\x02 \x03(\x05\x12J\n\x14sparseSupportVectors\x18\x03 \x01(\x0b\x32*.CoreML.Specification.SparseSupportVectorsH\x00\x12H\n\x13\x64\x65nseSupportVectors\x18\x04 \x01(\x0b\x32).CoreML.Specification.DenseSupportVectorsH\x00\x12\x38\n\x0c\x63oefficients\x18\x05 \x03(\x0b\x32\".CoreML.Specification.Coefficients\x12\x0b\n\x03rho\x18\x06 \x03(\x01\x12\r\n\x05probA\x18\x07 \x03(\x01\x12\r\n\x05probB\x18\x08 \x03(\x01\x12?\n\x11stringClassLabels\x18\x64 \x01(\x0b\x32\".CoreML.Specification.StringVectorH\x01\x12=\n\x10int64ClassLabels\x18\x65 \x01(\x0b\x32!.CoreML.Specification.Int64VectorH\x01\x42\x10\n\x0esupportVectorsB\r\n\x0b\x43lassLabelsB\x02H\x03P\x00\x62\x06proto3') _LINEARKERNEL = DESCRIPTOR.message_types_by_name['LinearKernel'] _RBFKERNEL = DESCRIPTOR.message_types_by_name['RBFKernel'] _POLYKERNEL = DESCRIPTOR.message_types_by_name['PolyKernel'] _SIGMOIDKERNEL = DESCRIPTOR.message_types_by_name['SigmoidKernel'] _KERNEL = 
DESCRIPTOR.message_types_by_name['Kernel'] _SPARSENODE = DESCRIPTOR.message_types_by_name['SparseNode'] _SPARSEVECTOR = DESCRIPTOR.message_types_by_name['SparseVector'] _SPARSESUPPORTVECTORS = DESCRIPTOR.message_types_by_name['SparseSupportVectors'] _DENSEVECTOR = DESCRIPTOR.message_types_by_name['DenseVector'] _DENSESUPPORTVECTORS = DESCRIPTOR.message_types_by_name['DenseSupportVectors'] _COEFFICIENTS = DESCRIPTOR.message_types_by_name['Coefficients'] _SUPPORTVECTORREGRESSOR = DESCRIPTOR.message_types_by_name['SupportVectorRegressor'] _SUPPORTVECTORCLASSIFIER = DESCRIPTOR.message_types_by_name['SupportVectorClassifier'] LinearKernel = _reflection.GeneratedProtocolMessageType('LinearKernel', (_message.Message,), { 'DESCRIPTOR' : _LINEARKERNEL, '__module__' : 'SVM_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.LinearKernel) }) _sym_db.RegisterMessage(LinearKernel) RBFKernel = _reflection.GeneratedProtocolMessageType('RBFKernel', (_message.Message,), { 'DESCRIPTOR' : _RBFKERNEL, '__module__' : 'SVM_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.RBFKernel) }) _sym_db.RegisterMessage(RBFKernel) PolyKernel = _reflection.GeneratedProtocolMessageType('PolyKernel', (_message.Message,), { 'DESCRIPTOR' : _POLYKERNEL, '__module__' : 'SVM_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.PolyKernel) }) _sym_db.RegisterMessage(PolyKernel) SigmoidKernel = _reflection.GeneratedProtocolMessageType('SigmoidKernel', (_message.Message,), { 'DESCRIPTOR' : _SIGMOIDKERNEL, '__module__' : 'SVM_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SigmoidKernel) }) _sym_db.RegisterMessage(SigmoidKernel) Kernel = _reflection.GeneratedProtocolMessageType('Kernel', (_message.Message,), { 'DESCRIPTOR' : _KERNEL, '__module__' : 'SVM_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Kernel) }) _sym_db.RegisterMessage(Kernel) SparseNode = _reflection.GeneratedProtocolMessageType('SparseNode', (_message.Message,), { 'DESCRIPTOR' : _SPARSENODE, '__module__' : 'SVM_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SparseNode) }) _sym_db.RegisterMessage(SparseNode) SparseVector = _reflection.GeneratedProtocolMessageType('SparseVector', (_message.Message,), { 'DESCRIPTOR' : _SPARSEVECTOR, '__module__' : 'SVM_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SparseVector) }) _sym_db.RegisterMessage(SparseVector) SparseSupportVectors = _reflection.GeneratedProtocolMessageType('SparseSupportVectors', (_message.Message,), { 'DESCRIPTOR' : _SPARSESUPPORTVECTORS, '__module__' : 'SVM_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SparseSupportVectors) }) _sym_db.RegisterMessage(SparseSupportVectors) DenseVector = _reflection.GeneratedProtocolMessageType('DenseVector', (_message.Message,), { 'DESCRIPTOR' : _DENSEVECTOR, '__module__' : 'SVM_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.DenseVector) }) _sym_db.RegisterMessage(DenseVector) DenseSupportVectors = _reflection.GeneratedProtocolMessageType('DenseSupportVectors', (_message.Message,), { 'DESCRIPTOR' : _DENSESUPPORTVECTORS, '__module__' : 'SVM_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.DenseSupportVectors) }) _sym_db.RegisterMessage(DenseSupportVectors) Coefficients = _reflection.GeneratedProtocolMessageType('Coefficients', (_message.Message,), { 'DESCRIPTOR' : _COEFFICIENTS, '__module__' : 'SVM_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Coefficients) }) _sym_db.RegisterMessage(Coefficients) 
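# --- Editorial sketch (not part of the protoc output) ---------------------------
# The classes created above are ordinary protobuf messages; for example, Kernel wraps a
# oneof over the kernel variants, so assigning to one arm selects it. A hypothetical
# helper (never called by this module), shown only as a usage sketch:
def _example_build_rbf_kernel():
    kernel = Kernel()
    kernel.rbfKernel.gamma = 0.5  # writing to a oneof arm selects 'rbfKernel'
    assert kernel.WhichOneof("kernel") == "rbfKernel"
    return kernel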
SupportVectorRegressor = _reflection.GeneratedProtocolMessageType('SupportVectorRegressor', (_message.Message,), { 'DESCRIPTOR' : _SUPPORTVECTORREGRESSOR, '__module__' : 'SVM_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SupportVectorRegressor) }) _sym_db.RegisterMessage(SupportVectorRegressor) SupportVectorClassifier = _reflection.GeneratedProtocolMessageType('SupportVectorClassifier', (_message.Message,), { 'DESCRIPTOR' : _SUPPORTVECTORCLASSIFIER, '__module__' : 'SVM_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.SupportVectorClassifier) }) _sym_db.RegisterMessage(SupportVectorClassifier) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _LINEARKERNEL._serialized_start=57 _LINEARKERNEL._serialized_end=71 _RBFKERNEL._serialized_start=73 _RBFKERNEL._serialized_end=99 _POLYKERNEL._serialized_start=101 _POLYKERNEL._serialized_end=155 _SIGMOIDKERNEL._serialized_start=157 _SIGMOIDKERNEL._serialized_end=198 _KERNEL._serialized_start=201 _KERNEL._serialized_end=451 _SPARSENODE._serialized_start=453 _SPARSENODE._serialized_end=495 _SPARSEVECTOR._serialized_start=497 _SPARSEVECTOR._serialized_end=560 _SPARSESUPPORTVECTORS._serialized_start=562 _SPARSESUPPORTVECTORS._serialized_end=637 _DENSEVECTOR._serialized_start=639 _DENSEVECTOR._serialized_end=668 _DENSESUPPORTVECTORS._serialized_start=670 _DENSESUPPORTVECTORS._serialized_end=743 _COEFFICIENTS._serialized_start=745 _COEFFICIENTS._serialized_end=774 _SUPPORTVECTORREGRESSOR._serialized_start=777 _SUPPORTVECTORREGRESSOR._serialized_end=1086 _SUPPORTVECTORCLASSIFIER._serialized_start=1089 _SUPPORTVECTORCLASSIFIER._serialized_end=1612 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/Scaler_pb2.py0000644000000000000000000000235614672066616020540 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: Scaler.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x0cScaler.proto\x12\x14\x43oreML.Specification\"0\n\x06Scaler\x12\x12\n\nshiftValue\x18\x01 \x03(\x01\x12\x12\n\nscaleValue\x18\x02 \x03(\x01\x42\x02H\x03\x62\x06proto3') _SCALER = DESCRIPTOR.message_types_by_name['Scaler'] Scaler = _reflection.GeneratedProtocolMessageType('Scaler', (_message.Message,), { 'DESCRIPTOR' : _SCALER, '__module__' : 'Scaler_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.Scaler) }) _sym_db.RegisterMessage(Scaler) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _SCALER._serialized_start=38 _SCALER._serialized_end=86 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/SoundAnalysisPreprocessing_pb2.py0000644000000000000000000000426314672066616024666 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! 
# source: SoundAnalysisPreprocessing.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n SoundAnalysisPreprocessing.proto\x12!CoreML.Specification.CoreMLModels\"\xa0\x01\n\x1aSoundAnalysisPreprocessing\x12V\n\x06vggish\x18\x14 \x01(\x0b\x32\x44.CoreML.Specification.CoreMLModels.SoundAnalysisPreprocessing.VggishH\x00\x1a\x08\n\x06VggishB \n\x1eSoundAnalysisPreprocessingTypeB\x02H\x03\x62\x06proto3') _SOUNDANALYSISPREPROCESSING = DESCRIPTOR.message_types_by_name['SoundAnalysisPreprocessing'] _SOUNDANALYSISPREPROCESSING_VGGISH = _SOUNDANALYSISPREPROCESSING.nested_types_by_name['Vggish'] SoundAnalysisPreprocessing = _reflection.GeneratedProtocolMessageType('SoundAnalysisPreprocessing', (_message.Message,), { 'Vggish' : _reflection.GeneratedProtocolMessageType('Vggish', (_message.Message,), { 'DESCRIPTOR' : _SOUNDANALYSISPREPROCESSING_VGGISH, '__module__' : 'SoundAnalysisPreprocessing_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CoreMLModels.SoundAnalysisPreprocessing.Vggish) }) , 'DESCRIPTOR' : _SOUNDANALYSISPREPROCESSING, '__module__' : 'SoundAnalysisPreprocessing_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CoreMLModels.SoundAnalysisPreprocessing) }) _sym_db.RegisterMessage(SoundAnalysisPreprocessing) _sym_db.RegisterMessage(SoundAnalysisPreprocessing.Vggish) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _SOUNDANALYSISPREPROCESSING._serialized_start=72 _SOUNDANALYSISPREPROCESSING._serialized_end=232 _SOUNDANALYSISPREPROCESSING_VGGISH._serialized_start=190 _SOUNDANALYSISPREPROCESSING_VGGISH._serialized_end=198 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/TextClassifier_pb2.py0000644000000000000000000000345014672066616022254 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: TextClassifier.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() from . 
import DataStructures_pb2 as DataStructures__pb2 try: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes_pb2 from .DataStructures_pb2 import * DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x14TextClassifier.proto\x12!CoreML.Specification.CoreMLModels\x1a\x14\x44\x61taStructures.proto\"\xa1\x01\n\x0eTextClassifier\x12\x10\n\x08revision\x18\x01 \x01(\r\x12\x10\n\x08language\x18\n \x01(\t\x12\x1a\n\x12modelParameterData\x18\x64 \x01(\x0c\x12@\n\x11stringClassLabels\x18\xc8\x01 \x01(\x0b\x32\".CoreML.Specification.StringVectorH\x00\x42\r\n\x0b\x43lassLabelsB\x02H\x03P\x00\x62\x06proto3') _TEXTCLASSIFIER = DESCRIPTOR.message_types_by_name['TextClassifier'] TextClassifier = _reflection.GeneratedProtocolMessageType('TextClassifier', (_message.Message,), { 'DESCRIPTOR' : _TEXTCLASSIFIER, '__module__' : 'TextClassifier_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CoreMLModels.TextClassifier) }) _sym_db.RegisterMessage(TextClassifier) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _TEXTCLASSIFIER._serialized_start=82 _TEXTCLASSIFIER._serialized_end=243 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/TreeEnsemble_pb2.py0000644000000000000000000001544314672066616021702 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: TreeEnsemble.proto """Generated protocol buffer code.""" from google.protobuf.internal import enum_type_wrapper from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() from . 
import DataStructures_pb2 as DataStructures__pb2 try: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes_pb2 from .DataStructures_pb2 import * DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x12TreeEnsemble.proto\x12\x14\x43oreML.Specification\x1a\x14\x44\x61taStructures.proto\"\xc4\x06\n\x16TreeEnsembleParameters\x12\x44\n\x05nodes\x18\x01 \x03(\x0b\x32\x35.CoreML.Specification.TreeEnsembleParameters.TreeNode\x12\x1f\n\x17numPredictionDimensions\x18\x02 \x01(\x04\x12\x1b\n\x13\x62\x61sePredictionValue\x18\x03 \x03(\x01\x1a\xa5\x05\n\x08TreeNode\x12\x0e\n\x06treeId\x18\x01 \x01(\x04\x12\x0e\n\x06nodeId\x18\x02 \x01(\x04\x12\\\n\x0cnodeBehavior\x18\x03 \x01(\x0e\x32\x46.CoreML.Specification.TreeEnsembleParameters.TreeNode.TreeNodeBehavior\x12\x1a\n\x12\x62ranchFeatureIndex\x18\n \x01(\x04\x12\x1a\n\x12\x62ranchFeatureValue\x18\x0b \x01(\x01\x12\x17\n\x0ftrueChildNodeId\x18\x0c \x01(\x04\x12\x18\n\x10\x66\x61lseChildNodeId\x18\r \x01(\x04\x12#\n\x1bmissingValueTracksTrueChild\x18\x0e \x01(\x08\x12\\\n\x0e\x65valuationInfo\x18\x14 \x03(\x0b\x32\x44.CoreML.Specification.TreeEnsembleParameters.TreeNode.EvaluationInfo\x12\x17\n\x0frelativeHitRate\x18\x1e \x01(\x01\x1a\x42\n\x0e\x45valuationInfo\x12\x17\n\x0f\x65valuationIndex\x18\x01 \x01(\x04\x12\x17\n\x0f\x65valuationValue\x18\x02 \x01(\x01\"\xcf\x01\n\x10TreeNodeBehavior\x12\x1e\n\x1a\x42ranchOnValueLessThanEqual\x10\x00\x12\x19\n\x15\x42ranchOnValueLessThan\x10\x01\x12!\n\x1d\x42ranchOnValueGreaterThanEqual\x10\x02\x12\x1c\n\x18\x42ranchOnValueGreaterThan\x10\x03\x12\x16\n\x12\x42ranchOnValueEqual\x10\x04\x12\x19\n\x15\x42ranchOnValueNotEqual\x10\x05\x12\x0c\n\x08LeafNode\x10\x06\"\xc7\x02\n\x16TreeEnsembleClassifier\x12\x42\n\x0ctreeEnsemble\x18\x01 \x01(\x0b\x32,.CoreML.Specification.TreeEnsembleParameters\x12Z\n\x17postEvaluationTransform\x18\x02 \x01(\x0e\x32\x39.CoreML.Specification.TreeEnsemblePostEvaluationTransform\x12?\n\x11stringClassLabels\x18\x64 \x01(\x0b\x32\".CoreML.Specification.StringVectorH\x00\x12=\n\x10int64ClassLabels\x18\x65 \x01(\x0b\x32!.CoreML.Specification.Int64VectorH\x00\x42\r\n\x0b\x43lassLabels\"\xb7\x01\n\x15TreeEnsembleRegressor\x12\x42\n\x0ctreeEnsemble\x18\x01 \x01(\x0b\x32,.CoreML.Specification.TreeEnsembleParameters\x12Z\n\x17postEvaluationTransform\x18\x02 \x01(\x0e\x32\x39.CoreML.Specification.TreeEnsemblePostEvaluationTransform*\x9d\x01\n#TreeEnsemblePostEvaluationTransform\x12\x0f\n\x0bNoTransform\x10\x00\x12\x1a\n\x16\x43lassification_SoftMax\x10\x01\x12\x17\n\x13Regression_Logistic\x10\x02\x12\x30\n,Classification_SoftMaxWithZeroClassReference\x10\x03\x42\x02H\x03P\x00\x62\x06proto3') _TREEENSEMBLEPOSTEVALUATIONTRANSFORM = DESCRIPTOR.enum_types_by_name['TreeEnsemblePostEvaluationTransform'] TreeEnsemblePostEvaluationTransform = enum_type_wrapper.EnumTypeWrapper(_TREEENSEMBLEPOSTEVALUATIONTRANSFORM) NoTransform = 0 Classification_SoftMax = 1 Regression_Logistic = 2 Classification_SoftMaxWithZeroClassReference = 3 _TREEENSEMBLEPARAMETERS = DESCRIPTOR.message_types_by_name['TreeEnsembleParameters'] _TREEENSEMBLEPARAMETERS_TREENODE = _TREEENSEMBLEPARAMETERS.nested_types_by_name['TreeNode'] _TREEENSEMBLEPARAMETERS_TREENODE_EVALUATIONINFO = _TREEENSEMBLEPARAMETERS_TREENODE.nested_types_by_name['EvaluationInfo'] _TREEENSEMBLECLASSIFIER = DESCRIPTOR.message_types_by_name['TreeEnsembleClassifier'] _TREEENSEMBLEREGRESSOR = DESCRIPTOR.message_types_by_name['TreeEnsembleRegressor'] 
_TREEENSEMBLEPARAMETERS_TREENODE_TREENODEBEHAVIOR = _TREEENSEMBLEPARAMETERS_TREENODE.enum_types_by_name['TreeNodeBehavior'] TreeEnsembleParameters = _reflection.GeneratedProtocolMessageType('TreeEnsembleParameters', (_message.Message,), { 'TreeNode' : _reflection.GeneratedProtocolMessageType('TreeNode', (_message.Message,), { 'EvaluationInfo' : _reflection.GeneratedProtocolMessageType('EvaluationInfo', (_message.Message,), { 'DESCRIPTOR' : _TREEENSEMBLEPARAMETERS_TREENODE_EVALUATIONINFO, '__module__' : 'TreeEnsemble_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.TreeEnsembleParameters.TreeNode.EvaluationInfo) }) , 'DESCRIPTOR' : _TREEENSEMBLEPARAMETERS_TREENODE, '__module__' : 'TreeEnsemble_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.TreeEnsembleParameters.TreeNode) }) , 'DESCRIPTOR' : _TREEENSEMBLEPARAMETERS, '__module__' : 'TreeEnsemble_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.TreeEnsembleParameters) }) _sym_db.RegisterMessage(TreeEnsembleParameters) _sym_db.RegisterMessage(TreeEnsembleParameters.TreeNode) _sym_db.RegisterMessage(TreeEnsembleParameters.TreeNode.EvaluationInfo) TreeEnsembleClassifier = _reflection.GeneratedProtocolMessageType('TreeEnsembleClassifier', (_message.Message,), { 'DESCRIPTOR' : _TREEENSEMBLECLASSIFIER, '__module__' : 'TreeEnsemble_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.TreeEnsembleClassifier) }) _sym_db.RegisterMessage(TreeEnsembleClassifier) TreeEnsembleRegressor = _reflection.GeneratedProtocolMessageType('TreeEnsembleRegressor', (_message.Message,), { 'DESCRIPTOR' : _TREEENSEMBLEREGRESSOR, '__module__' : 'TreeEnsemble_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.TreeEnsembleRegressor) }) _sym_db.RegisterMessage(TreeEnsembleRegressor) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _TREEENSEMBLEPOSTEVALUATIONTRANSFORM._serialized_start=1422 _TREEENSEMBLEPOSTEVALUATIONTRANSFORM._serialized_end=1579 _TREEENSEMBLEPARAMETERS._serialized_start=67 _TREEENSEMBLEPARAMETERS._serialized_end=903 _TREEENSEMBLEPARAMETERS_TREENODE._serialized_start=226 _TREEENSEMBLEPARAMETERS_TREENODE._serialized_end=903 _TREEENSEMBLEPARAMETERS_TREENODE_EVALUATIONINFO._serialized_start=627 _TREEENSEMBLEPARAMETERS_TREENODE_EVALUATIONINFO._serialized_end=693 _TREEENSEMBLEPARAMETERS_TREENODE_TREENODEBEHAVIOR._serialized_start=696 _TREEENSEMBLEPARAMETERS_TREENODE_TREENODEBEHAVIOR._serialized_end=903 _TREEENSEMBLECLASSIFIER._serialized_start=906 _TREEENSEMBLECLASSIFIER._serialized_end=1233 _TREEENSEMBLEREGRESSOR._serialized_start=1236 _TREEENSEMBLEREGRESSOR._serialized_end=1419 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/VisionFeaturePrint_pb2.py0000644000000000000000000000714614672066616023131 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! 
# source: VisionFeaturePrint.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x18VisionFeaturePrint.proto\x12!CoreML.Specification.CoreMLModels\"\xc9\x04\n\x12VisionFeaturePrint\x12L\n\x05scene\x18\x14 \x01(\x0b\x32;.CoreML.Specification.CoreMLModels.VisionFeaturePrint.SceneH\x00\x12P\n\x07objects\x18\x15 \x01(\x0b\x32=.CoreML.Specification.CoreMLModels.VisionFeaturePrint.ObjectsH\x00\x1a\xb7\x01\n\x05Scene\x12Y\n\x07version\x18\x01 \x01(\x0e\x32H.CoreML.Specification.CoreMLModels.VisionFeaturePrint.Scene.SceneVersion\"S\n\x0cSceneVersion\x12\x19\n\x15SCENE_VERSION_INVALID\x10\x00\x12\x13\n\x0fSCENE_VERSION_1\x10\x01\x12\x13\n\x0fSCENE_VERSION_2\x10\x02\x1a\xbe\x01\n\x07Objects\x12]\n\x07version\x18\x01 \x01(\x0e\x32L.CoreML.Specification.CoreMLModels.VisionFeaturePrint.Objects.ObjectsVersion\x12\x0e\n\x06output\x18\x64 \x03(\t\"D\n\x0eObjectsVersion\x12\x1b\n\x17OBJECTS_VERSION_INVALID\x10\x00\x12\x15\n\x11OBJECTS_VERSION_1\x10\x01\x42\x18\n\x16VisionFeaturePrintTypeB\x02H\x03\x62\x06proto3') _VISIONFEATUREPRINT = DESCRIPTOR.message_types_by_name['VisionFeaturePrint'] _VISIONFEATUREPRINT_SCENE = _VISIONFEATUREPRINT.nested_types_by_name['Scene'] _VISIONFEATUREPRINT_OBJECTS = _VISIONFEATUREPRINT.nested_types_by_name['Objects'] _VISIONFEATUREPRINT_SCENE_SCENEVERSION = _VISIONFEATUREPRINT_SCENE.enum_types_by_name['SceneVersion'] _VISIONFEATUREPRINT_OBJECTS_OBJECTSVERSION = _VISIONFEATUREPRINT_OBJECTS.enum_types_by_name['ObjectsVersion'] VisionFeaturePrint = _reflection.GeneratedProtocolMessageType('VisionFeaturePrint', (_message.Message,), { 'Scene' : _reflection.GeneratedProtocolMessageType('Scene', (_message.Message,), { 'DESCRIPTOR' : _VISIONFEATUREPRINT_SCENE, '__module__' : 'VisionFeaturePrint_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CoreMLModels.VisionFeaturePrint.Scene) }) , 'Objects' : _reflection.GeneratedProtocolMessageType('Objects', (_message.Message,), { 'DESCRIPTOR' : _VISIONFEATUREPRINT_OBJECTS, '__module__' : 'VisionFeaturePrint_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CoreMLModels.VisionFeaturePrint.Objects) }) , 'DESCRIPTOR' : _VISIONFEATUREPRINT, '__module__' : 'VisionFeaturePrint_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CoreMLModels.VisionFeaturePrint) }) _sym_db.RegisterMessage(VisionFeaturePrint) _sym_db.RegisterMessage(VisionFeaturePrint.Scene) _sym_db.RegisterMessage(VisionFeaturePrint.Objects) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _VISIONFEATUREPRINT._serialized_start=64 _VISIONFEATUREPRINT._serialized_end=649 _VISIONFEATUREPRINT_SCENE._serialized_start=247 _VISIONFEATUREPRINT_SCENE._serialized_end=430 _VISIONFEATUREPRINT_SCENE_SCENEVERSION._serialized_start=347 _VISIONFEATUREPRINT_SCENE_SCENEVERSION._serialized_end=430 _VISIONFEATUREPRINT_OBJECTS._serialized_start=433 _VISIONFEATUREPRINT_OBJECTS._serialized_end=623 _VISIONFEATUREPRINT_OBJECTS_OBJECTSVERSION._serialized_start=555 _VISIONFEATUREPRINT_OBJECTS_OBJECTSVERSION._serialized_end=623 # @@protoc_insertion_point(module_scope) 
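# ---------------------------------------------------------------------------
# Illustrative usage sketch (not part of the generated sources above; assumes
# the protobuf runtime and the coremltools.proto package are importable).
# It shows how a generated *_pb2 message such as VisionFeaturePrint is built,
# serialized, and parsed back; the field and enum names are the ones defined
# by the descriptor in VisionFeaturePrint_pb2.py above.
# ---------------------------------------------------------------------------
from coremltools.proto import VisionFeaturePrint_pb2

fp = VisionFeaturePrint_pb2.VisionFeaturePrint()
# Setting a nested field on "scene" also selects the "VisionFeaturePrintType" oneof.
fp.scene.version = VisionFeaturePrint_pb2.VisionFeaturePrint.Scene.SCENE_VERSION_2
payload = fp.SerializeToString()  # serialize the message to bytes

roundtrip = VisionFeaturePrint_pb2.VisionFeaturePrint()
roundtrip.ParseFromString(payload)  # parse the bytes back into a message
assert roundtrip.WhichOneof("VisionFeaturePrintType") == "scene"
assert roundtrip.scene.version == VisionFeaturePrint_pb2.VisionFeaturePrint.Scene.SCENE_VERSION_2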
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/WordEmbedding_pb2.py0000644000000000000000000000323414672066616022035 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: WordEmbedding.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() from . import DataStructures_pb2 as DataStructures__pb2 try: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes_pb2 from .DataStructures_pb2 import * DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x13WordEmbedding.proto\x12!CoreML.Specification.CoreMLModels\x1a\x14\x44\x61taStructures.proto\"O\n\rWordEmbedding\x12\x10\n\x08revision\x18\x01 \x01(\r\x12\x10\n\x08language\x18\n \x01(\t\x12\x1a\n\x12modelParameterData\x18\x64 \x01(\x0c\x42\x02H\x03P\x00\x62\x06proto3') _WORDEMBEDDING = DESCRIPTOR.message_types_by_name['WordEmbedding'] WordEmbedding = _reflection.GeneratedProtocolMessageType('WordEmbedding', (_message.Message,), { 'DESCRIPTOR' : _WORDEMBEDDING, '__module__' : 'WordEmbedding_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CoreMLModels.WordEmbedding) }) _sym_db.RegisterMessage(WordEmbedding) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _WORDEMBEDDING._serialized_start=80 _WORDEMBEDDING._serialized_end=159 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/WordTagger_pb2.py0000644000000000000000000000370214672066616021370 0ustar00rootroot# -*- coding: utf-8 -*- # Generated by the protocol buffer compiler. DO NOT EDIT! # source: WordTagger.proto """Generated protocol buffer code.""" from google.protobuf import descriptor as _descriptor from google.protobuf import descriptor_pool as _descriptor_pool from google.protobuf import message as _message from google.protobuf import reflection as _reflection from google.protobuf import symbol_database as _symbol_database # @@protoc_insertion_point(imports) _sym_db = _symbol_database.Default() from . 
import DataStructures_pb2 as DataStructures__pb2 try: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes__pb2 except AttributeError: FeatureTypes__pb2 = DataStructures__pb2.FeatureTypes_pb2 from .DataStructures_pb2 import * DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x10WordTagger.proto\x12!CoreML.Specification.CoreMLModels\x1a\x14\x44\x61taStructures.proto\"\xa4\x02\n\nWordTagger\x12\x10\n\x08revision\x18\x01 \x01(\r\x12\x10\n\x08language\x18\n \x01(\t\x12\x1f\n\x17tokensOutputFeatureName\x18\x14 \x01(\t\x12\"\n\x1atokenTagsOutputFeatureName\x18\x15 \x01(\t\x12\'\n\x1ftokenLocationsOutputFeatureName\x18\x16 \x01(\t\x12%\n\x1dtokenLengthsOutputFeatureName\x18\x17 \x01(\t\x12\x1a\n\x12modelParameterData\x18\x64 \x01(\x0c\x12\x39\n\nstringTags\x18\xc8\x01 \x01(\x0b\x32\".CoreML.Specification.StringVectorH\x00\x42\x06\n\x04TagsB\x02H\x03P\x00\x62\x06proto3') _WORDTAGGER = DESCRIPTOR.message_types_by_name['WordTagger'] WordTagger = _reflection.GeneratedProtocolMessageType('WordTagger', (_message.Message,), { 'DESCRIPTOR' : _WORDTAGGER, '__module__' : 'WordTagger_pb2' # @@protoc_insertion_point(class_scope:CoreML.Specification.CoreMLModels.WordTagger) }) _sym_db.RegisterMessage(WordTagger) if _descriptor._USE_C_DESCRIPTORS == False: DESCRIPTOR._options = None DESCRIPTOR._serialized_options = b'H\003' _WORDTAGGER._serialized_start=78 _WORDTAGGER._serialized_end=370 # @@protoc_insertion_point(module_scope) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/proto/__init__.py0000644000000000000000000000005414672066616020314 0ustar00rootroot### Module for proto generated Python code. ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2775474 coremltools-8.0/coremltools/test/0000755000000000000000000000000014672075535016020 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/__init__.py0000644000000000000000000000033214672066616020127 0ustar00rootroot# Copyright (c) 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2775474 coremltools-8.0/coremltools/test/api/0000755000000000000000000000000014672075535016571 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/api/__init__.py0000644000000000000000000000034114672066616020700 0ustar00rootroot# Copyright (c) 2017 - 2020, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/api/test_api_examples.py0000644000000000000000000005245414672066616022663 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy import os import tempfile from collections import Counter import numpy as np import pytest import coremltools as ct from coremltools._deps import _HAS_TORCH from coremltools.converters.mil import Builder as mb from coremltools.converters.mil import mil from coremltools.converters.mil.mil import Function, get_new_symbol, types from coremltools.converters.mil.testing_utils import get_op_types_in_program if _HAS_TORCH: import torch class TestMILExamples: @staticmethod def test_tutorial(): @mb.program( input_specs=[mb.TensorSpec(shape=(1, 100, 100, 3))] ) def prog(x): x = mb.relu(x=x, name="relu") x = mb.transpose(x=x, perm=[0, 3, 1, 2], name="transpose") x = mb.reduce_mean(x=x, axes=[2, 3], keep_dims=False, name="reduce") x = mb.log(x=x, name="log") y = mb.add(x=1, y=2) return x # Convert and verify mlmodel = ct.convert(prog) # running predict() is only supported on macOS if ct.utils._is_macos(): prediction = mlmodel.predict( {"x": np.random.rand(1, 100, 100, 3).astype(np.float32)} ) assert len(prediction) == 1 @pytest.mark.skipif(ct.utils._macos_version() < (10, 15), reason='Model produces specification 4.') class TestInputs: @staticmethod @pytest.mark.skipif(not ct.utils._is_macos(), reason="Platform is not Mac OS") @pytest.mark.parametrize( "convert_to", ["mlprogram", "neuralnetwork"], ) def test_unsanitized_input_name_during_prediction(convert_to): ''' input name : "x/0" becomes "x_0" due to name sanitization applied during conversion ''' prog = mil.Program() func_inputs = {"x/0": mb.placeholder(shape=[2, 3]), "y": mb.placeholder(shape=[2, 3])} with Function(func_inputs) as ssa_fun: x, y = ssa_fun.inputs["x/0"], ssa_fun.inputs["y"] x = mb.relu(x=x, name="relu") z = mb.add(x=x, y=y, name="out") ssa_fun.set_outputs([z]) prog.add_function("main", ssa_fun) mlmodel = ct.convert(prog, convert_to=convert_to) with pytest.raises(KeyError) as error_info: mlmodel.predict( {"x/0": np.random.rand(2, 3).astype(np.float32), "y": np.random.rand(2, 3).astype(np.float32)} ) error_str = str(error_info.value) assert "does not match any of the model input" in error_str @staticmethod def _test_variant_input_type_prediction(to_tensor, convert_to): prog = mil.Program() func_inputs = {"x": mb.placeholder(shape=[2, 3]), "y": mb.placeholder(shape=[2, 3])} with Function(func_inputs) as ssa_fun: x, y = ssa_fun.inputs["x"], ssa_fun.inputs["y"] x = mb.relu(x=x, name="relu") z = mb.add(x=x, y=y, name="out") ssa_fun.set_outputs([z]) prog.add_function("main", ssa_fun) mlmodel = ct.convert(prog, convert_to=convert_to) x_numpy = np.random.rand(2, 3) y_numpy = np.random.rand(2, 3) out_by_numpy = mlmodel.predict( {"x": x_numpy, "y": y_numpy} ) out_by_tensor = mlmodel.predict( {"x": to_tensor(x_numpy), "y": to_tensor(y_numpy)} ) np.allclose(out_by_numpy["out"], out_by_tensor["out"]) @staticmethod @pytest.mark.skipif(not ct.utils._is_macos(), reason="test needs predictions") @pytest.mark.parametrize( "convert_to", ["mlprogram", "neuralnetwork"], ) def test_list_predict_input(convert_to): TestInputs._test_variant_input_type_prediction(lambda x: x.tolist(), convert_to) @staticmethod def test_rank0_inputs_mil(): with pytest.raises(ValueError, match=r"Rank-0"): @mb.program( input_specs=[ mb.TensorSpec(shape=()), ] ) def prog(x): return x ############################################################################### # Note: all tests are examples of conversion 
to the Core ML format # Each test case is expected to be runnable and self-complete. ############################################################################### class TestMLProgramConverterExamples: @staticmethod @pytest.mark.skipif( ct.utils._macos_version() < (15, 0), reason="Tests are for deployment target iOS18/macos15" ) def test_build_stateful_model(): @mb.program( input_specs=[ mb.TensorSpec((1,), dtype=types.fp16), mb.StateTensorSpec((1,), dtype=types.fp16), ], ) def prog(x, accumulator_state): # Read state accumulator_value = mb.read_state(input=accumulator_state) # Update value y = mb.add(x=x, y=accumulator_value, name="y") # Write state mb.coreml_update_state(state=accumulator_state, value=y) return y mlmodel = ct.convert(prog, minimum_deployment_target=ct.target.iOS18) # try to run prediction on the stateful model state = mlmodel.make_state() assert mlmodel.predict({"x": np.array([1.0])}, state=state)["y"] == 1 assert mlmodel.predict({"x": np.array([1.0])}, state=state)["y"] == 2 @staticmethod def test_model_save(tmpdir): save_path_dir = str(tmpdir) @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20))]) def prog(x): x = mb.square(x=x) return x # save neuralnetwork model without extension and check that it is saved with # mlmodel extension mlmodel = ct.convert(prog, convert_to="neuralnetwork") mlmodel_path = os.path.join(save_path_dir, "model_nn") mlmodel.save(mlmodel_path) assert os.path.exists(mlmodel_path + ".mlmodel") # save neuralnetwork model with mlpackage extension mlmodel_path = os.path.join(save_path_dir, "model_nn2.mlpackage") mlmodel.save(mlmodel_path) assert os.path.exists(mlmodel_path) # save mlprogram model without extension and check that it is saved with # mlpackage extension mlmodel = ct.convert(prog, convert_to="mlprogram") mlmodel_path = os.path.join(save_path_dir, "model_mlprogram") mlmodel.save(mlmodel_path) assert os.path.exists(mlmodel_path + ".mlpackage") # check error if mlprogram is saved with mlmodel extension mlmodel_path = os.path.join(save_path_dir, "model_mlprogram.mlmodel") expected_pattern = "For an ML Program\, extension must be \.mlpackage \(not \.mlmodel\)\. Please see .* to see the difference between neuralnetwork and mlprogram model types\." 
with pytest.raises(Exception, match=expected_pattern): mlmodel.save(mlmodel_path) @staticmethod @pytest.mark.skipif(not ct.utils._is_macos(), reason="Platform is not Mac OS") def test_deepcopy_error_with_symbols_in_prog(): prog = mil.Program() func_inputs = {"x": mb.placeholder(shape=[get_new_symbol(), 3]), "y": mb.placeholder(shape=[2, 3])} with Function(func_inputs) as ssa_fun: x, y = ssa_fun.inputs["x"], ssa_fun.inputs["y"] x = mb.relu(x=x) z = mb.add(x=x, y=y) ssa_fun.set_outputs([z]) prog.add_function("main", ssa_fun) mlmodel = ct.convert(prog, convert_to="mlprogram", compute_precision=ct.precision.FLOAT32) prog2 = mlmodel._get_mil_internal() # this will invoke a deepcopy on the prog @pytest.mark.skipif(not ct.utils._is_macos(), reason="Platform is not Mac OS") @pytest.mark.parametrize("skip_model_load", [True, False]) def test_model_load_skip_flag(self, skip_model_load): @mb.program(input_specs=[mb.TensorSpec(shape=(3,)), ]) def prog(x): return mb.relu(x=x, name='relu') if ct.utils._macos_version() < (12, 0) and not skip_model_load: # converting to mlprogram, on macOS < 12 # should raise a runtime error when skip_model_load is False with pytest.warns(RuntimeWarning): model = ct.convert(prog, convert_to="mlprogram", skip_model_load=skip_model_load) else: model = ct.convert(prog, convert_to="mlprogram", skip_model_load=skip_model_load) assert model is not None if skip_model_load: assert model.__proxy__ is None model_dir = tempfile.TemporaryDirectory() filename = os.path.join(model_dir.name, "test.mlpackage") model.save(filename) assert os.path.exists(filename) @pytest.mark.skipif(ct.utils._macos_version() < (12, 0), reason='Model produces specification 6.') class TestMLProgramFP16Transform: @staticmethod def test_compute_precision_api(): @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20))]) def prog(x): x = mb.square(x=x) return x mlmodel = ct.convert(copy.deepcopy(prog), compute_precision=ct.precision.FLOAT16, convert_to='mlprogram') mil_prog = mlmodel._get_mil_internal() np.testing.assert_array_equal(["cast", "square", "cast"], get_op_types_in_program(mil_prog)) mlmodel = ct.convert(copy.deepcopy(prog), compute_precision=ct.precision.FLOAT32, convert_to='mlprogram') mil_prog = mlmodel._get_mil_internal() np.testing.assert_array_equal(["square"], get_op_types_in_program(mil_prog)) mlmodel = ct.convert( copy.deepcopy(prog), compute_precision=ct.transform.FP16ComputePrecision( op_selector=lambda op: op.op_type != "square" ), convert_to="mlprogram", ) mil_prog = mlmodel._get_mil_internal() np.testing.assert_array_equal(["square"], get_op_types_in_program(mil_prog)) with pytest.raises(ValueError) as e: mlmodel = ct.convert(copy.deepcopy(prog), compute_precision='fp64', convert_to='mlprogram') expected_error = "'compute_precision' must be either coremltools.precision.FLOAT32 or " \ "coremltools.precision.FLOAT16 or of type coremltools.transform.FP16ComputePrecision()" assert expected_error == str(e.value) expected_pattern = "compute_precision .* supported .* mlprogram .* None .* target=='neuralnetwork'.*minimum_deployment_target.*" with pytest.raises(ValueError, match=expected_pattern) as e: mlmodel = ct.convert( copy.deepcopy(prog), convert_to="neuralnetwork", compute_precision="fp16" ) @staticmethod def test_invalid_argument_nn_backend(): ''' Since the compute_precision argument is only applicable when converting to "mlprogram", check that an error is correctly raised when conversion is targeted at the neuralnetwork backend ''' @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20))]) 
def prog(x): x = mb.square(x=x) return x expected_err_str = "compute_precision is only supported for mlprogram target and must be None if target.*" with pytest.raises(ValueError, match=expected_err_str): mlmodel = ct.convert( prog, convert_to="neuralnetwork", compute_precision=ct.precision.FLOAT16 ) with pytest.raises(ValueError, match=expected_err_str): mlmodel = ct.convert( prog, convert_to="neuralnetwork", compute_precision=ct.precision.FLOAT32 ) @pytest.mark.skipif(not _HAS_TORCH, reason="PyTorch not found") class TestGraphPassManagement: @staticmethod def _get_test_model(): class TestModel(torch.nn.Module): def __init__(self): super().__init__() self.conv1 = torch.nn.Conv2d(1, 8, 5, padding="same") self.bn1 = torch.nn.BatchNorm2d(8) self.linear1 = torch.nn.Linear(28 * 28 * 8, 5) self.alpha = 0.7 def forward(self, x): x = self.conv1(x) x = self.bn1(x) x = self.linear1(torch.flatten(x)) x = torch.maximum(self.alpha * x, x) return x return TestModel().eval() def test_default_pipeline(self): model = self._get_test_model() example_input = torch.rand(1, 1, 28, 28) traced_model = torch.jit.trace(model, example_input) model_converted = ct.convert( traced_model, inputs=[ct.TensorType(shape=example_input.shape)], convert_to="mlprogram", pass_pipeline=ct.PassPipeline(), ) assert get_op_types_in_program(model_converted._get_mil_internal()) == [ "cast", "conv", "reshape", "linear", "leaky_relu", "cast", ] def test_skip_pass(self): model = self._get_test_model() example_input = torch.rand(1, 1, 28, 28) traced_model = torch.jit.trace(model, example_input) model_converted = ct.convert( traced_model, inputs=[ct.TensorType(shape=example_input.shape)], convert_to="mlprogram" ) assert get_op_types_in_program(model_converted._get_mil_internal()) == [ "cast", "conv", "reshape", "linear", "leaky_relu", "cast", ] pipeline = ct.PassPipeline() pipeline.remove_passes(passes_names=["common::fuse_conv_batchnorm"]) model_converted_with_skipped_passes = ct.convert( traced_model, inputs=[ct.TensorType(shape=example_input.shape)], convert_to="mlprogram", pass_pipeline=pipeline, ) assert get_op_types_in_program(model_converted_with_skipped_passes._get_mil_internal()) == [ "cast", "conv", "batch_norm", "reshape", "linear", "leaky_relu", "cast", ] def test_skip_two_passes(self): model = self._get_test_model() example_input = torch.rand(1, 1, 28, 28) traced_model = torch.jit.trace(model, example_input) pipeline = ct.PassPipeline() pipeline.remove_passes( passes_names=["common::fuse_conv_batchnorm", "common::fuse_leaky_relu"] ) model_converted_with_skipped_passes = ct.convert( traced_model, inputs=[ct.TensorType(shape=example_input.shape)], convert_to="mlprogram", pass_pipeline=pipeline, ) assert get_op_types_in_program(model_converted_with_skipped_passes._get_mil_internal()) == [ "cast", "conv", "batch_norm", "reshape", "linear", "mul", "maximum", "cast", ] def test_skip_passes_in_different_pipelines(self): """ Some passes exist in different pipelines. For example, const_elimination is in both main and backend pipelines. If the user want to skip the const_elimination pass, we want to make sure both pipelines skip that pass. 
""" model = self._get_test_model() example_input = torch.rand(1, 1, 28, 28) traced_model = torch.jit.trace(model, example_input) pipeline = ct.PassPipeline() pipeline.remove_passes(passes_names=["common::const_elimination"]) model_converted = ct.convert( traced_model, inputs=[ct.TensorType(shape=example_input.shape)], convert_to="mlprogram", pass_pipeline=pipeline, ) op_types = get_op_types_in_program(model_converted._mil_program, skip_const_ops=False) expected_counts = { "const": 26, "cast": 7, "conv": 1, "matmul": 1, "add": 1, "shape": 1, "slice_by_index": 2, "concat": 1, "reshape": 1, "leaky_relu": 1, } assert Counter(op_types) == expected_counts def test_empty_pipeline(self): model = self._get_test_model() example_input = torch.rand(1, 1, 28, 28) traced_model = torch.jit.trace(model, example_input) pipeline = ct.PassPipeline.EMPTY model_converted = ct.convert( traced_model, inputs=[ct.TensorType(shape=example_input.shape)], convert_to="mlprogram", pass_pipeline=pipeline, # TODO (rdar://131396853) Re-enable model load skip_model_load=True, ) assert get_op_types_in_program(model_converted._get_mil_internal()) == [ "conv", "batch_norm", "shape", "slice_by_index", "slice_by_index", "concat", "cast", "reshape", "linear", "mul", "maximum", ] def test_pass_option_skip_ops_by_type(self): model = self._get_test_model() example_input = torch.rand(1, 1, 28, 28) traced_model = torch.jit.trace(model, example_input) pipeline = ct.PassPipeline() pipeline.set_options("common::add_fp16_cast", {"skip_ops_by_type": "conv,linear"}) model_converted = ct.convert( traced_model, inputs=[ct.TensorType(shape=example_input.shape)], convert_to="mlprogram", pass_pipeline=pipeline, # TODO (rdar://131396853) Re-enable model load skip_model_load=True, ) # The fp16 cast is skipped for conv and linear as we specified them in the pass options. assert get_op_types_in_program(model_converted._get_mil_internal()) == [ "conv", "cast", "reshape", "cast", "linear", "cast", "leaky_relu", "cast", ] def test_pass_option_skip_const_by_size(self): model = self._get_test_model() example_input = torch.rand(1, 1, 28, 28) traced_model = torch.jit.trace(model, example_input) model_converted_without_pipeline = ct.convert( traced_model, inputs=[ct.TensorType(shape=example_input.shape)], convert_to="mlprogram", ) pipeline = ct.PassPipeline() pipeline.set_options("common::const_elimination", {"skip_const_by_size": "1e8"}) model_converted = ct.convert( traced_model, inputs=[ct.TensorType(shape=example_input.shape)], convert_to="mlprogram", pass_pipeline=pipeline, ) # When the threshold is set to 1e8, no var is skipped in const elimination. assert get_op_types_in_program( model_converted._get_mil_internal(), skip_const_ops=False ).count("const") == get_op_types_in_program( model_converted_without_pipeline._get_mil_internal(), skip_const_ops=False ).count( "const" ) pipeline.set_options( "common::const_elimination", {"skip_const_by_size": "-1"} ) model_converted = ct.convert( traced_model, inputs=[ct.TensorType(shape=example_input.shape)], convert_to="mlprogram", pass_pipeline=pipeline, ) # When the threshold -1, almost all vars (except scalars) are skipped in const elimination. 
assert ( get_op_types_in_program( model_converted._get_mil_internal(), skip_const_ops=False ).count("const") == 25 ) def test_pass_unsupported_option(self): model = self._get_test_model() example_input = torch.rand(1, 1, 28, 28) traced_model = torch.jit.trace(model, example_input) pipeline = ct.PassPipeline() pipeline.set_options("common::fuse_conv_batchnorm", {"skip_ops_by_type": "conv,linear"}) with pytest.raises( NotImplementedError, match="The graph pass `fuse_conv_batchnorm` doesn't support option `skip_ops_by_type`.", ): ct.convert( traced_model, inputs=[ct.TensorType(shape=example_input.shape)], convert_to="mlprogram", pass_pipeline=pipeline, ) def test_pass_option_invalid_val(self): model = self._get_test_model() example_input = torch.rand(1, 1, 28, 28) traced_model = torch.jit.trace(model, example_input) pipeline = ct.PassPipeline() pipeline.set_options("common::const_elimination", {"skip_const_by_size": "dummy"}) with pytest.raises( ValueError, match="Expected to get float threshold, but got `dummy` which cannot be converted to float", ): ct.convert( traced_model, inputs=[ct.TensorType(shape=example_input.shape)], convert_to="mlprogram", pass_pipeline=pipeline, ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/api/test_api_visibilities.py0000644000000000000000000002001414672066616023527 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import coremltools as ct def _get_visible_items(d): return [x for x in dir(d) if not x.startswith("_")] def _check_visible_modules(actual, expected): assert set(actual) == set(expected), "API mis-matched. 
Got %s, expected %s" % ( actual, expected, ) EXPECTED_MODULES = [ "ClassifierConfig", "ComputeUnit", "EnumeratedShapes", "ImageType", "RangeDim", "SPECIFICATION_VERSION", "Shape", "TensorType", "colorlayout", "compression_utils", "convert", "converters", "libcoremlpython", "models", "PassPipeline", "proto", "precision", "target", "utils", "version", "test", "transform", "libmodelpackage", "libmilstoragepython", "optimize", "StateType", "ReshapeFrequency", "SpecializationStrategy", ] class TestApiVisibilities: """Test public coremltools API visibilities.""" def test_top_level(self): if not ct.utils._is_macos(): EXPECTED_MODULES.remove("libcoremlpython") _check_visible_modules(_get_visible_items(ct), EXPECTED_MODULES) def test_utils(self): expected = [ "compile_model", "convert_double_to_float_multiarray_type", "evaluate_classifier", "evaluate_classifier_with_probabilities", "evaluate_regressor", "evaluate_transformer", "make_pipeline", "materialize_dynamic_shape_mlmodel", "load_spec", "rename_feature", "save_spec", "save_multifunction", "MultiFunctionDescriptor", "randomize_weights", "bisect_model", ] _check_visible_modules(_get_visible_items(ct.utils), expected) def test_models(self): expected = [ "array_feature_extractor", "CompiledMLModel", "MLModel", "datatypes", "feature_vectorizer", "ml_program", "model", "nearest_neighbors", "neural_network", "pipeline", "tree_ensemble", "utils", ] _check_visible_modules(_get_visible_items(ct.models), expected) def test_models_mlmodel(self): expected = [ "author", "get_compiled_model_path", "get_spec", "input_description", "license", "output_description", "predict", "save", "short_description", "user_defined_metadata", "version", "weights_dir", "make_state", ] _check_visible_modules(_get_visible_items(ct.models.MLModel), expected) def test_models_neural_network(self): expected = [ "AdamParams", "NeuralNetworkBuilder", "SgdParams", "builder", "flexible_shape_utils", "optimization_utils", "printer", "quantization_utils", "spec_inspection_utils", "update_optimizer_utils", "utils", ] _check_visible_modules(_get_visible_items(ct.models.neural_network), expected) def test_models_neural_network_utils(self): expected = ["NeuralNetworkBuilder", "make_image_input", "make_nn_classifier"] _check_visible_modules( _get_visible_items(ct.models.neural_network.utils), expected ) def test_models_tree_ensemble(self): expected = [ "TreeEnsembleBase", "TreeEnsembleClassifier", "TreeEnsembleRegressor", "set_classifier_interface_params", "set_regressor_interface_params", ] _check_visible_modules(_get_visible_items(ct.models.tree_ensemble), expected) def test_models_pipeline(self): expected = [ "Pipeline", "PipelineClassifier", "PipelineRegressor", "set_classifier_interface_params", "set_regressor_interface_params", "set_training_features", "set_transform_interface_params", ] _check_visible_modules(_get_visible_items(ct.models.pipeline), expected) def test_converters(self): expected = [ "ClassifierConfig", "ColorLayout", "EnumeratedShapes", "ImageType", "RangeDim", "Shape", "TensorType", "convert", "libsvm", "mil", "sklearn", "xgboost", "StateType", ] _check_visible_modules(_get_visible_items(ct.converters), expected) def test_optimize(self): expected = [ "coreml", "torch", ] _check_visible_modules(_get_visible_items(ct.optimize), expected) def test_optimize_coreml(self): expected = [ "OpLinearQuantizerConfig", "OpMagnitudePrunerConfig", "OpPalettizerConfig", "OptimizationConfig", "OpThresholdPrunerConfig", "experimental", "linear_quantize_weights", "palettize_weights", 
"prune_weights", "decompress_weights", "get_weights_metadata", "CoreMLWeightMetaData", "CoreMLOpMetaData", ] _check_visible_modules(_get_visible_items(ct.optimize.coreml), expected) def test_converters_libsvm(self): _check_visible_modules(_get_visible_items(ct.converters.libsvm), ["convert"]) def test_converters_sklearn(self): _check_visible_modules(_get_visible_items(ct.converters.sklearn), ["convert"]) def test_converters_xgboost(self): _check_visible_modules(_get_visible_items(ct.converters.xgboost), ["convert"]) def test_models_neural_network_quantization_utils(self): expected = [ "AdvancedQuantizedLayerSelector", "MatrixMultiplyLayerSelector", "ModelMetrics", "NoiseMetrics", "OutputMetric", "QuantizedLayerSelector", "TopKMetrics", "activate_int8_int8_matrix_multiplications", "compare_models", "quantize_weights", ] _check_visible_modules( _get_visible_items(ct.models.neural_network.quantization_utils), expected ) def test_compression_utils(self): expected = [ "affine_quantize_weights", "palettize_weights", "sparsify_weights", "decompress_weights", ] _check_visible_modules( _get_visible_items(ct.compression_utils), expected ) def test_models_neural_network_flexible_shape_utils(self): expected = [ "NeuralNetworkImageSize", "NeuralNetworkImageSizeRange", "NeuralNetworkMultiArrayShape", "NeuralNetworkMultiArrayShapeRange", "Shape", "ShapeRange", "Size", "add_enumerated_image_sizes", "add_enumerated_multiarray_shapes", "add_multiarray_ndshape_enumeration", "set_multiarray_ndshape_range", "update_image_size_range", "update_multiarray_shape_range", ] _check_visible_modules( _get_visible_items(ct.models.neural_network.flexible_shape_utils), expected ) def test_models_neural_network_update_optimizer_utils(self): expected = ["AdamParams", "Batch", "RangeParam", "SgdParams"] _check_visible_modules( _get_visible_items(ct.models.neural_network.update_optimizer_utils), expected, ) def test_models_neural_network_optimization_utils(self): _check_visible_modules( _get_visible_items(ct.models.neural_network.optimization_utils), [], ) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2775474 coremltools-8.0/coremltools/test/blob/0000755000000000000000000000000014672075535016736 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/blob/__init__.py0000644000000000000000000000034114672066616021045 0ustar00rootroot# Copyright (c) 2017 - 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/blob/test_weights.py0000644000000000000000000003105714672066616022027 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os import shutil import tempfile import unittest import numpy as np import pytest import coremltools as ct from coremltools import _SPECIFICATION_VERSION_IOS_18 from coremltools.converters.mil import mil from coremltools.converters.mil.converter import mil_convert as _mil_convert from coremltools.converters.mil.mil.builder import Builder as mb from coremltools.libmilstoragepython import _BlobStorageReader as BlobReader from coremltools.libmilstoragepython import _BlobStorageWriter as BlobWriter class TestWeightBlob: @classmethod def setup_class(cls): cls.working_dir = tempfile.mkdtemp() @classmethod def teardown_class(cls): if os.path.exists(cls.working_dir): shutil.rmtree(cls.working_dir) def test_weight_blob_int4(self): writer = BlobWriter(self.working_dir + "/net.wt") # All values in input_arr should be within range of int4, although they are stored in int8. input_arr1 = np.array([-8, -2, 0, 2, 7], dtype=np.int8) offset1 = writer.write_int4_data(input_arr1) input_arr2 = np.array([3, -8, 5, 7, -6], dtype=np.int8) offset2 = writer.write_int4_data(input_arr2) writer = None reader = BlobReader(self.working_dir + "/net.wt") output_arr1 = reader.read_int4_data(offset1) output_arr2 = reader.read_int4_data(offset2) np.testing.assert_equal(input_arr1, output_arr1) np.testing.assert_equal(input_arr2, output_arr2) def test_weight_blob_int4_invalid(self): writer = BlobWriter(self.working_dir + "/net.wt") input_arr = np.array([-80, -2, 0, 2, 7], dtype=np.float32) with pytest.raises( ValueError, match="Value -80 is outside allowed subbyte datatype range \[-8, 7\]." ): writer.write_int4_data(input_arr) @pytest.mark.parametrize("nbits", (1, 2, 3, 4, 6)) def test_weight_blob_unsigned_sub_byte(self, nbits): writer = BlobWriter(self.working_dir + "/net.wt") # All values in input_arr are within range of uint{nbits}, but stored in uint8. 
input_arr1 = np.random.randint(0, 2**nbits, (5,), dtype=np.uint8) write_method = getattr(writer, f"write_uint{nbits}_data") offset1 = write_method(input_arr1) input_arr2 = np.random.randint(0, 2**nbits, (5,), dtype=np.uint8) offset2 = write_method(input_arr2) writer = None reader = BlobReader(self.working_dir + "/net.wt") read_method = getattr(reader, f"read_uint{nbits}_data") output_arr1 = read_method(offset1) output_arr2 = read_method(offset2) np.testing.assert_equal(input_arr1, output_arr1) np.testing.assert_equal(input_arr2, output_arr2) @pytest.mark.parametrize("nbits", (1, 2, 3, 4, 6)) def test_weight_blob_unsigned_sub_byte_invalid(self, nbits): writer = BlobWriter(self.working_dir + "/net.wt") input_arr = np.array([1, 80, 2, 0, 2]) with pytest.raises( ValueError, match=f"Value 80 is outside allowed subbyte datatype range \[0, {2 ** nbits - 1}\].", ): getattr(writer, f"write_uint{nbits}_data")(input_arr) def test_weight_blob_int8(self): writer = BlobWriter(self.working_dir + "/net.wt") input_arr = np.array([-5, -2, 0, 2, 5], dtype=np.int8) offset = writer.write_int8_data(input_arr) writer = None reader = BlobReader(self.working_dir + "/net.wt") output_arr = reader.read_int8_data(offset) np.testing.assert_equal(input_arr, output_arr) def test_weight_blob_uint8(self): writer = BlobWriter(self.working_dir + "/net.wt") input_arr = np.array([1, 2, 3, 4, 5], dtype=np.uint8) offset = writer.write_uint8_data(input_arr) writer = None reader = BlobReader(self.working_dir + "/net.wt") output_arr = reader.read_uint8_data(offset) np.testing.assert_almost_equal(input_arr, output_arr) def test_weight_blob_int16(self): writer = BlobWriter(self.working_dir + "/net.wt") input_arr = np.array([-5, -2, 0, 2, 5], dtype=np.int16) offset = writer.write_int16_data(input_arr) writer = None reader = BlobReader(self.working_dir + "/net.wt") output_arr = reader.read_int16_data(offset) np.testing.assert_equal(input_arr, output_arr) def test_weight_blob_int32(self): writer = BlobWriter(self.working_dir + "/net.wt") input_arr = np.array([-5, -2, 0, 2, 5], dtype=np.int32) offset = writer.write_int32_data(input_arr) writer = None reader = BlobReader(self.working_dir + "/net.wt") output_arr = reader.read_int32_data(offset) np.testing.assert_equal(input_arr, output_arr) def test_weight_blob_uint16(self): writer = BlobWriter(self.working_dir + "/net.wt") input_arr = np.array([1, 2, 3, 4, 5], dtype=np.uint16) offset = writer.write_uint16_data(input_arr) writer = None reader = BlobReader(self.working_dir + "/net.wt") output_arr = reader.read_uint16_data(offset) np.testing.assert_almost_equal(input_arr, output_arr) def test_weight_blob_uint32(self): writer = BlobWriter(self.working_dir + "/net.wt") input_arr = np.array([1, 2, 3, 4, 5], dtype=np.uint32) offset = writer.write_uint32_data(input_arr) writer = None reader = BlobReader(self.working_dir + "/net.wt") output_arr = reader.read_uint32_data(offset) np.testing.assert_almost_equal(input_arr, output_arr) def test_weight_blob_fp16(self): writer = BlobWriter(self.working_dir + "/net.wt") input_arr = np.array([2.3, 4.6, 7.9], dtype=np.float16) input_arr_to_bytes_uint16 = np.frombuffer(input_arr.tobytes(), np.uint16) offset = writer.write_fp16_data(input_arr_to_bytes_uint16) writer = None reader = BlobReader(self.working_dir + "/net.wt") output_arr_uint16 = reader.read_fp16_data(offset) output_arr = np.frombuffer(output_arr_uint16.tobytes(), np.float16) np.testing.assert_almost_equal(input_arr, output_arr) def test_weight_blob_fp32(self): writer = BlobWriter(self.working_dir 
+ "/net.wt") input_arr = np.array([1.0, 2.4, 3.9, -4.8, 5.2], dtype=np.float32) offset = writer.write_float_data(input_arr) writer = None reader = BlobReader(self.working_dir + "/net.wt") output_arr = reader.read_float_data(offset) np.testing.assert_almost_equal(input_arr, output_arr) @pytest.mark.skipif(ct.utils._macos_version() < (15, 0), reason="Multi-function only supported on macOS 15+") class TestWeightIDSharing: @staticmethod def test_single_function(): @mb.program( input_specs=[mb.TensorSpec((500,))], opset_version=ct.target.iOS16, ) def prog(x): val = np.random.rand( 500, ) const_1 = mb.const(val=val, name="const_1") const_2 = mb.const(val=val, name="const_2") const_3 = mb.const(val=val, name="const_3") # const 1 and 2 share the same weight id, so they should be serialized # as the same blob value const_1.op.weight_id = 0 const_2.op.weight_id = 0 x = mb.add(x=x, y=const_1) x = mb.add(x=x, y=const_2) x = mb.add(x=x, y=const_3) return x # skip all passes to avoid running the const_deduplicate pass prog.skip_all_passes = True mlmodel = ct.convert( prog, convert_to="mlprogram", compute_precision=ct.precision.FLOAT32, minimum_deployment_target=ct.target.iOS16, ) # In the above model, const_1 and const_2 are going to share the same blob file value. package_path = tempfile.mkdtemp(suffix=".mlpackage") mlmodel.save(package_path) with tempfile.TemporaryDirectory() as serialize_dir: os.system(f"coremlcompiler compile {package_path} {serialize_dir}") model_name_with_extension = os.path.basename(package_path) model_name_wo_extension, _ = os.path.splitext(model_name_with_extension) mil_file = open( os.path.join(serialize_dir, f"{model_name_wo_extension}.mlmodelc", "model.mil") ) mil_txt = mil_file.read() assert ( 'tensor const_1 = const()[name = tensor("const_1"), val = tensor(BLOBFILE(path = tensor("@model_path/weights/weight.bin"), offset = tensor(64)))];' in mil_txt ) assert ( 'tensor const_2 = const()[name = tensor("const_2"), val = tensor(BLOBFILE(path = tensor("@model_path/weights/weight.bin"), offset = tensor(64)))];' in mil_txt ) assert ( 'tensor const_3 = const()[name = tensor("const_3"), val = tensor(BLOBFILE(path = tensor("@model_path/weights/weight.bin"), offset = tensor(2176)))];' in mil_txt ) assert "add(x = x, y = const_1)" in mil_txt assert "add(x = add_0, y = const_2)" in mil_txt shutil.rmtree(package_path) @staticmethod def test_multi_functions(): val = np.random.rand( 500, ) @mb.function( input_specs=[mb.TensorSpec((500,))], opset_version=ct.target.iOS16, ) def func(x): const_1 = mb.const(val=val, name="const_1") const_1.op.weight_id = 0 return mb.add(x=x, y=const_1) @mb.function( input_specs=[mb.TensorSpec((500,))], opset_version=ct.target.iOS16, ) def func_1(x): const_2 = mb.const(val=val, name="const_2") const_3 = mb.const(val=val, name="const_3") # const_3 shared the same blob file value with const_1 in another function const_3.op.weight_id = 0 x = mb.add(x=x, y=const_2) return mb.add(x=x, y=const_3) prog = mil.Program() prog.add_function("main", func) prog.add_function("func_1", func_1) # skip all passes to avoid running the const_deduplicate pass prog.skip_all_passes = True mlmodel = _mil_convert( prog, convert_to="mlprogram", convert_from="milinternal", specification_version=_SPECIFICATION_VERSION_IOS_18, compute_units=ct.ComputeUnit.CPU_ONLY, export_multi_functions=True, skip_model_load=True, ) # In the above model, const_1 and const_3 are going to share the same blob file value. 
package_path = tempfile.mkdtemp(suffix=".mlpackage") mlmodel.save(package_path) with tempfile.TemporaryDirectory() as serialize_dir: os.system(f"coremlcompiler compile {package_path} {serialize_dir}") model_name_with_extension = os.path.basename(package_path) model_name_wo_extension, _ = os.path.splitext(model_name_with_extension) mil_file = open( os.path.join(serialize_dir, f"{model_name_wo_extension}.mlmodelc", "model.mil") ) mil_txt = mil_file.read() assert ( 'tensor const_3 = const()[name = string("const_3"), val = tensor(BLOBFILE(path = string("@model_path/weights/weight.bin"), offset = uint64(64)))];' in mil_txt ) assert ( 'tensor const_2 = const()[name = string("const_2"), val = tensor(BLOBFILE(path = string("@model_path/weights/weight.bin"), offset = uint64(2176)))];' in mil_txt ) assert ( 'tensor const_1 = const()[name = string("const_1"), val = tensor(BLOBFILE(path = string("@model_path/weights/weight.bin"), offset = uint64(64)))];' in mil_txt ) assert "add(x = x, y = const_2)" in mil_txt assert "add(x = add_1, y = const_3)" in mil_txt assert "add(x = x, y = const_1)" in mil_txt shutil.rmtree(package_path) if __name__ == "__main__": unittest.main() ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2775474 coremltools-8.0/coremltools/test/ml_program/0000755000000000000000000000000014672075535020157 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/ml_program/__init__.py0000644000000000000000000000032614672066616022271 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/ml_program/test_compression.py0000644000000000000000000001767314672066616024147 0ustar00rootroot# Copyright (c) 2022, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import Optional import numpy as np import torch import coremltools as ct from coremltools.converters.mil.testing_utils import get_op_types_in_program from coremltools.models.ml_program.compression_utils import ( affine_quantize_weights, decompress_weights, palettize_weights, sparsify_weights, ) from coremltools.optimize.coreml._config import OpCompressorConfig def get_test_model_and_data( multi_layer: bool = False, quantize_config: Optional[OpCompressorConfig] = None, use_linear: bool = False, ): """ Prepare test model and data. :param multi_layer: If set, the test model will have multiple `nn.Conv2d` layers. :param quantize_config: If set, the weights in the test model will be nbits quantization-friendly, which means it will be first quantized according to the config, and then dequantized, so the numerical error introduced during the quantization test will be minimum. :param use_linear: If set, use linear instead of conv in the model. 
""" if quantize_config is not None and multi_layer: raise AssertionError("Multi-layer model doesn't support pre_quantize_nbits.") inputs = [ct.TensorType(name="data", shape=(1, 64, 10, 10))] if use_linear: inputs = [ct.TensorType(name="data", shape=(1, 64))] torch_input_values = [torch.rand(*i.shape.to_list()) for i in inputs] coreml_input_values = { i.name: val.detach().numpy() for i, val in zip(inputs, torch_input_values) } if multi_layer: class ConvModel(torch.nn.Module): def __init__(self): super(ConvModel, self).__init__() self.conv_1 = torch.nn.Conv2d(in_channels=64, out_channels=32, kernel_size=2) self.conv_2 = torch.nn.Conv2d(in_channels=32, out_channels=64, kernel_size=2) def forward(self, x): conv_1 = self.conv_1(x) conv_2 = self.conv_2(conv_1) return conv_2 class LinearModel(torch.nn.Module): def __init__(self): super(LinearModel, self).__init__() self.linear_1 = torch.nn.Linear(in_features=64, out_features=32, bias=False) self.linear_2 = torch.nn.Linear(in_features=32, out_features=16, bias=False) def forward(self, x): linear_1 = self.linear_1(x) return self.linear_2(linear_1) model = LinearModel().eval() if use_linear else ConvModel().eval() else: model = torch.nn.Conv2d(in_channels=64, out_channels=32, kernel_size=2) if use_linear: model = torch.nn.Linear(in_features=64, out_features=32, bias=False) if quantize_config is not None: # Manually change weight to make it quantization friendly. nbits_range_max = 2 ** (quantize_config.nbits - 1) - 1 mode_to_range = { "LINEAR": (-nbits_range_max - 1, nbits_range_max), "LINEAR_SYMMETRIC": (-nbits_range_max, nbits_range_max), } q_val_min, q_val_max = mode_to_range[quantize_config.mode] original_shape = model.weight.detach().numpy().shape fake_scale = 2.0 quantize_friendly_weight = ( np.random.randint(low=q_val_min, high=q_val_max + 1, size=original_shape) * fake_scale ) with torch.no_grad(): model.weight = torch.nn.Parameter( torch.from_numpy(quantize_friendly_weight).float() ) model = model.eval() return model, inputs, torch_input_values, coreml_input_values class TestCompressionUtils: """ Since ct.compression_utils is deprecated, this test is only checking the API is still working. 
""" @staticmethod def test_op_selector(): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram") mlmodel_no_quantized = affine_quantize_weights(mlmodel, mode="linear", op_selector=lambda const_op: const_op.val.val.size > 1e7) expected_ops = ['cast', 'conv', 'cast'] assert get_op_types_in_program(mlmodel_no_quantized._mil_program) == expected_ops @staticmethod def test_affine_quantize_weights_smoke(): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram") mlmodel_quantized = affine_quantize_weights(mlmodel, mode="linear_symmetric", dtype=np.int8) # validate parameters expected_ops = ['constexpr_affine_dequantize', 'cast', 'conv', 'cast'] assert get_op_types_in_program(mlmodel_quantized._mil_program) == expected_ops @staticmethod def test_palettize_weights_smoke(): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram") mlmodel_palettized = palettize_weights(mlmodel, nbits=4, mode="uniform") # validate parameters expected_ops = ['constexpr_lut_to_dense', 'cast', 'conv', 'cast'] assert get_op_types_in_program(mlmodel_palettized._mil_program) == expected_ops @staticmethod def test_sparsify_weights_threshold_smoke(): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data() with torch.no_grad(): model.weight[0][0][0][0] = 101 torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram") mlmodel_sparsified = sparsify_weights(mlmodel, mode="threshold_based", threshold=0.01) # validate parameters expected_ops = ['constexpr_sparse_to_dense', 'cast', 'conv', 'cast'] assert get_op_types_in_program(mlmodel_sparsified._mil_program) == expected_ops @staticmethod def test_sparsify_weights_percentile_smoke(): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data() with torch.no_grad(): model.weight[0][0][0][0] = 101 torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram") mlmodel_sparsified = sparsify_weights(mlmodel, mode="percentile_based", target_percentile=0.8) # validate parameters expected_ops = ['constexpr_sparse_to_dense', 'cast', 'conv', 'cast'] assert get_op_types_in_program(mlmodel_sparsified._mil_program) == expected_ops @staticmethod def test_weight_decompression_smoke(): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data(multi_layer=True) torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram") # we first compress the model mlmodel = palettize_weights(mlmodel, mode="kmeans", nbits=4, op_selector=lambda const_op: const_op.name == "conv_1_weight_to_fp16") mlmodel = affine_quantize_weights(mlmodel, mode="linear", op_selector=lambda const_op: const_op.name == "conv_2_weight_to_fp16") expected_ops = ['constexpr_lut_to_dense', 'cast', 'conv', 'constexpr_affine_dequantize', 'conv', 'cast'] assert get_op_types_in_program(mlmodel._mil_program) == expected_ops # decompress the model decompressed_model = decompress_weights(mlmodel) assert 
get_op_types_in_program(decompressed_model._mil_program) == ['cast', 'conv', 'conv', 'cast'] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/ml_program/test_utils.py0000644000000000000000000021155714672066616022743 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy import itertools import os import platform import shutil import tempfile from typing import Dict, Tuple import numpy as np import pytest import torch import coremltools as ct from coremltools import _SPECIFICATION_VERSION_IOS_18, proto from coremltools.converters.mil import mil from coremltools.converters.mil.converter import mil_convert as _mil_convert from coremltools.converters.mil.mil import Program from coremltools.converters.mil.mil.builder import Builder as mb from coremltools.converters.mil.mil.passes.pass_pipeline import ( PASS_REGISTRY, PassPipeline, PassPipelineManager, ) from coremltools.converters.mil.testing_utils import assert_spec_input_type, assert_spec_output_type, DTYPE_TO_FEATURE_TYPE_MAP, get_op_types_in_program from coremltools.models.utils import bisect_model, MultiFunctionDescriptor, load_spec, save_multifunction, load_spec import coremltools.optimize as cto class TestMILConvertCall: @staticmethod def test_pass_pipeline(): X_SHAPE = (2, 3, 16, 16) WEIGHT_SHAPE = (5, X_SHAPE[1], 3, 3) BIAS_SHAPE = (WEIGHT_SHAPE[0], 1, 1) WEIGHT = np.random.rand(*WEIGHT_SHAPE) BIAS = np.random.rand(*BIAS_SHAPE) @mb.program(input_specs=[mb.TensorSpec(shape=X_SHAPE)]) def prog(x): y = mb.conv(x=x, weight=WEIGHT) z = mb.add(x=y, y=BIAS) return z prog1 = copy.deepcopy(prog) prog2 = copy.deepcopy(prog) common_kwargs = { "convert_to": "mlprogram", "convert_from": "milinternal", "compute_units": ct.ComputeUnit.CPU_ONLY, "skip_model_load": True, } mlmodel1 = _mil_convert(prog1, **common_kwargs) mlmodel2 = _mil_convert(prog2, pass_pipeline=PassPipeline.EMPTY, **common_kwargs) assert get_op_types_in_program(mlmodel1._mil_program) == ["conv"] assert get_op_types_in_program(mlmodel2._mil_program) == ["conv", "add"] @pytest.mark.skipif(ct.utils._macos_version() < (15, 0), reason="Multi-function only supported on macOS 15+") class TestMultiFunctionDescriptor: @staticmethod def _convert_multifunction_prog(prog): mlmodel = _mil_convert( prog, convert_to="mlprogram", convert_from="milinternal", specification_version=_SPECIFICATION_VERSION_IOS_18, compute_units=ct.ComputeUnit.CPU_ONLY, export_multi_functions=True, skip_model_load=True, ) package_path = tempfile.mkdtemp(suffix=".mlpackage") mlmodel.save(package_path) return package_path @staticmethod def _get_singlefunction_mlpackage(opset_version=ct.target.iOS16): @mb.program( input_specs=[mb.TensorSpec((3,))], opset_version=opset_version, ) def prog(x): return mb.relu(x=x) mlmodel = ct.convert( prog, minimum_deployment_target=opset_version, ) package_path = tempfile.mkdtemp(suffix=".mlpackage") mlmodel.save(package_path) return package_path def _get_multifunction_mlpackage_1(self): @mb.function( input_specs=[mb.TensorSpec((3,))], opset_version=ct.target.iOS18, ) def func(x): return mb.relu(x=x) @mb.function( input_specs=[mb.TensorSpec((3,))], opset_version=ct.target.iOS18, ) def func_1(x): return mb.sin(x=x) @mb.function( input_specs=[mb.TensorSpec((3,))], opset_version=ct.target.iOS18, ) def func_2(x): return mb.cos(x=x) 
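# The three single-op functions defined above (relu / sin / cos) are assembled next into
# one multi-function pymil Program: each is registered under its own name with
# prog.add_function, a default_function_name is chosen, and the program is exported as an
# iOS18 multi-function .mlpackage through _convert_multifunction_prog.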
prog = mil.Program() prog.add_function("relu", func) prog.add_function("sin", func_1) prog.add_function("cos", func_2) prog.default_function_name = "relu" return self._convert_multifunction_prog(prog) def _get_multifunction_mlpackage_2(self): @mb.function( input_specs=[mb.TensorSpec((3,))], opset_version=ct.target.iOS18, ) def func(x): return mb.relu(x=x) @mb.function( input_specs=[mb.TensorSpec((3,))], opset_version=ct.target.iOS18, ) def func_1(x): return mb.sin(x=x) prog = mil.Program() prog.add_function("relu", func) prog.add_function("sin", func_1) prog.default_function_name = "sin" return self._convert_multifunction_prog(prog) def _get_multifunction_mlpackage_3(self): @mb.function( input_specs=[mb.TensorSpec((3,))], opset_version=ct.target.iOS18, ) def func(x): return mb.relu(x=x) prog = mil.Program() prog.add_function("relu", func) prog.default_function_name = "relu" return self._convert_multifunction_prog(prog) def test_initialization(self): # Test empty initialization desc = MultiFunctionDescriptor() assert desc._functions() == {} # Initialize with a single function model model = self._get_singlefunction_mlpackage() desc = MultiFunctionDescriptor(model) assert desc._functions() == {"main": (model, "main")} shutil.rmtree(model) # Initialize with a multifunction model with only a single function model = self._get_multifunction_mlpackage_3() desc = MultiFunctionDescriptor(model) assert desc._functions() == {"relu": (model, "relu")} shutil.rmtree(model) # Initialize with a multifunction model with several functions model = self._get_multifunction_mlpackage_1() desc = MultiFunctionDescriptor(model) assert desc._functions() == { "relu": (model, "relu"), "sin": (model, "sin"), "cos": (model, "cos"), } shutil.rmtree(model) # Initialize with invalid path with pytest.raises(ValueError, match="invalid model_path invalid_path with error"): desc = MultiFunctionDescriptor("invalid_path") def test_add_function(self): # Add function from a single function model desc = MultiFunctionDescriptor() model = self._get_singlefunction_mlpackage() desc.add_function(model, "main", "main_1") assert desc._functions() == {"main_1": (model, "main")} desc.add_function(model, "main", "main_2") assert desc._functions() == {"main_1": (model, "main"), "main_2": (model, "main")} with pytest.raises(ValueError, match="src_function_name invalid not found in"): desc.add_function(model, "invalid", "main_3") with pytest.raises(ValueError, match="function main_1 already exist"): desc.add_function(model, "main", "main_1") shutil.rmtree(model) # Add function from multifunction model desc = MultiFunctionDescriptor() model = self._get_multifunction_mlpackage_1() desc.add_function(model, "relu", "main_1") assert desc._functions() == {"main_1": (model, "relu")} desc.add_function(model, "sin", "main_2") assert desc._functions() == {"main_1": (model, "relu"), "main_2": (model, "sin")} shutil.rmtree(model) # Initialize a desc with a model and add functions to it model = self._get_multifunction_mlpackage_1() desc = MultiFunctionDescriptor(model) assert desc._functions() == { "relu": (model, "relu"), "sin": (model, "sin"), "cos": (model, "cos"), } model_2 = self._get_multifunction_mlpackage_2() desc.add_function(model_2, "sin", "new_sin") assert desc._functions() == { "relu": (model, "relu"), "sin": (model, "sin"), "cos": (model, "cos"), "new_sin": (model_2, "sin"), } with pytest.raises(ValueError, match="function relu already exist"): desc.add_function(model, "relu", "relu") shutil.rmtree(model) shutil.rmtree(model_2) def 
test_add_model(self): # Add model from a single function model desc = MultiFunctionDescriptor() model = self._get_singlefunction_mlpackage() desc.add_model(model) assert desc._functions() == {"main": (model, "main")} shutil.rmtree(model) # Add a multifunction model with only a single function desc = MultiFunctionDescriptor() model = self._get_multifunction_mlpackage_3() desc.add_model(model) assert desc._functions() == {"relu": (model, "relu")} shutil.rmtree(model) # Add a multifunction model with several functions desc = MultiFunctionDescriptor() model = self._get_multifunction_mlpackage_1() desc.add_model(model) assert desc._functions() == { "relu": (model, "relu"), "sin": (model, "sin"), "cos": (model, "cos"), } shutil.rmtree(model) # Add a model to a desc with functions model = self._get_singlefunction_mlpackage() desc = MultiFunctionDescriptor(model) assert desc._functions() == {"main": (model, "main")} model_2 = self._get_multifunction_mlpackage_1() desc.add_model(model_2) assert desc._functions() == { "relu": (model_2, "relu"), "sin": (model_2, "sin"), "cos": (model_2, "cos"), "main": (model, "main"), } shutil.rmtree(model) shutil.rmtree(model_2) # Error handling when adding model with duplicated function name model = self._get_multifunction_mlpackage_2() with pytest.raises(ValueError, match="function relu already exist"): desc.add_model(model) shutil.rmtree(model) def test_remove_function(self): model = self._get_multifunction_mlpackage_1() desc = MultiFunctionDescriptor(model) assert desc._functions() == { "relu": (model, "relu"), "sin": (model, "sin"), "cos": (model, "cos"), } desc.remove_function("relu") assert desc._functions() == { "sin": (model, "sin"), "cos": (model, "cos"), } with pytest.raises(ValueError, match="function_name relu not found"): desc.remove_function("relu") desc.remove_function("sin") assert desc._functions() == { "cos": (model, "cos"), } desc.remove_function("cos") assert desc._functions() == {} with pytest.raises(ValueError, match="function_name relu not found"): desc.remove_function("relu") shutil.rmtree(model) def test_convert_single_function_into_multifunction_model(self): """ Convert a single function model into a multifunction model format, but only consists of one function. """ model = self._get_singlefunction_mlpackage() desc = MultiFunctionDescriptor() desc.add_function(model, "main", "main_1") desc.default_function_name = "main_1" package_path = tempfile.mkdtemp(suffix=".mlpackage") save_multifunction(desc, package_path) shutil.rmtree(model) # verify the model spec spec = load_spec(package_path) model_desc = spec.description assert len(model_desc.functions) == 1 assert model_desc.functions[0].name == "main_1" assert model_desc.defaultFunctionName == "main_1" # verify the model can be load / run new_model = ct.models.MLModel(package_path, function_name="main_1") new_model.predict( { "x": np.random.rand( 3, ) } ) shutil.rmtree(package_path) def test_merge_two_models_into_multifunction_model(self): """ Merge two single function models into one multifunction model. 
""" model_1 = self._get_singlefunction_mlpackage() model_2 = self._get_singlefunction_mlpackage() desc = MultiFunctionDescriptor() desc.add_function(model_1, "main", "main_1") desc.add_function(model_2, "main", "main_2") desc.default_function_name = "main_2" package_path = tempfile.mkdtemp(suffix=".mlpackage") save_multifunction(desc, package_path) shutil.rmtree(model_1) shutil.rmtree(model_2) # verify the model spec spec = load_spec(package_path) model_desc = spec.description assert len(model_desc.functions) == 2 assert model_desc.functions[0].name == "main_1" assert model_desc.functions[1].name == "main_2" assert model_desc.defaultFunctionName == "main_2" # verify the model can be load / run new_model = ct.models.MLModel(package_path, function_name="main_1") new_model.predict( { "x": np.random.rand( 3, ) } ) new_model = ct.models.MLModel(package_path, function_name="main_2") new_model.predict( { "x": np.random.rand( 3, ) } ) shutil.rmtree(package_path) def test_copy_a_single_model_twice_into_multifunction_model(self): """ Copy the function in a single function model twice to make a multifunction model. """ model = self._get_singlefunction_mlpackage() desc = MultiFunctionDescriptor() desc.add_function(model, "main", "main_1") desc.add_function(model, "main", "main_2") desc.default_function_name = "main_2" package_path = tempfile.mkdtemp(suffix=".mlpackage") save_multifunction(desc, package_path) shutil.rmtree(model) # verify the model spec spec = load_spec(package_path) model_desc = spec.description assert len(model_desc.functions) == 2 assert model_desc.functions[0].name == "main_1" assert model_desc.functions[1].name == "main_2" assert model_desc.defaultFunctionName == "main_2" # verify the model can be load / run new_model = ct.models.MLModel(package_path, function_name="main_1") new_model.predict( { "x": np.random.rand( 3, ) } ) new_model = ct.models.MLModel(package_path, function_name="main_2") new_model.predict( { "x": np.random.rand( 3, ) } ) shutil.rmtree(package_path) def test_combine_multifunctin_models(self): """ Combine two multifunction models into one multifunction model. 
""" model_1 = self._get_multifunction_mlpackage_1() desc = MultiFunctionDescriptor(model_1) model_2 = self._get_multifunction_mlpackage_2() desc.add_function(model_2, "relu", "main_1") desc.add_function(model_2, "sin", "main_2") desc.default_function_name = "main_2" package_path = tempfile.mkdtemp(suffix=".mlpackage") save_multifunction(desc, package_path) shutil.rmtree(model_1) shutil.rmtree(model_2) # verify the model spec spec = load_spec(package_path) model_desc = spec.description assert len(model_desc.functions) == 5 assert model_desc.functions[0].name == "relu" assert model_desc.functions[1].name == "sin" assert model_desc.functions[2].name == "cos" assert model_desc.functions[3].name == "main_1" assert model_desc.functions[4].name == "main_2" assert model_desc.defaultFunctionName == "main_2" # verify the model can be load / run new_model = ct.models.MLModel(package_path, function_name="relu") new_model.predict( { "x": np.random.rand( 3, ) } ) new_model = ct.models.MLModel(package_path, function_name="sin") new_model.predict( { "x": np.random.rand( 3, ) } ) new_model = ct.models.MLModel(package_path, function_name="cos") new_model.predict( { "x": np.random.rand( 3, ) } ) new_model = ct.models.MLModel(package_path, function_name="main_1") new_model.predict( { "x": np.random.rand( 3, ) } ) new_model = ct.models.MLModel(package_path, function_name="main_2") new_model.predict( { "x": np.random.rand( 3, ) } ) shutil.rmtree(package_path) def test_invalid_default_function_name(self): # invalid type model = self._get_multifunction_mlpackage_1() desc = MultiFunctionDescriptor(model) with pytest.raises(ValueError, match="default_function_name must be type of str. Got 1."): desc.default_function_name = 1 # default function name not found in the program desc.default_function_name = "invalid" package_path = tempfile.mkdtemp(suffix=".mlpackage") with pytest.raises( ValueError, match="default_function_name invalid not found in the program." ): save_multifunction(desc, package_path) # default function name not set desc = MultiFunctionDescriptor(model) with pytest.raises( ValueError, match="default_function_name must be set for the MultiFunctionDescriptor instance before calling save_multifunction.", ): save_multifunction(desc, package_path) # cleanup def test_spec_version_save_multifunction(self): """ When save models to the multifunction format, the spec version are promoted to iOS18. 
""" model_1 = self._get_singlefunction_mlpackage(opset_version=ct.target.iOS15) model_2 = self._get_singlefunction_mlpackage(opset_version=ct.target.iOS16) desc = MultiFunctionDescriptor(model_1) desc.add_function(model_2, "main", "main_2") desc.default_function_name = "main_2" package_path = tempfile.mkdtemp(suffix=".mlpackage") save_multifunction(desc, package_path) shutil.rmtree(model_1) shutil.rmtree(model_2) # verify the spec version of the multifunctino model is iOS18 spec = load_spec(package_path) assert spec.specificationVersion == _SPECIFICATION_VERSION_IOS_18 shutil.rmtree(package_path) @staticmethod def _multifunction_model_from_single_function(model_path: str) -> str: desc = MultiFunctionDescriptor() desc.add_function(model_path, "main", "main_1") desc.add_function(model_path, "main", "main_2") desc.default_function_name = "main_1" multifunction_path = tempfile.mkdtemp(suffix=".mlpackage") save_multifunction(desc, multifunction_path) return multifunction_path @staticmethod def _multifunction_model_from_multifunction_model(model_path: str) -> str: desc = MultiFunctionDescriptor() desc.add_function(model_path, "main_1", "main_3") desc.add_function(model_path, "main_2", "main_4") desc.default_function_name = "main_3" multifunction_path = tempfile.mkdtemp(suffix=".mlpackage") save_multifunction(desc, multifunction_path) return multifunction_path def test_classifier_description(self): """ If the source model is a classifier, the resulting multifunction model should inherit the classifier description as well. """ def check_classifier_spec(model_path: str) -> None: spec = load_spec(model_path) model_desc = spec.description assert len(model_desc.functions) == 2 for idx in [0, 1]: assert model_desc.functions[idx].predictedFeatureName == "class_label" assert model_desc.functions[idx].predictedProbabilitiesName == "class_label_probs" assert model_desc.functions[idx].output[0].name == "class_label" assert model_desc.functions[idx].output[1].name == "class_label_probs" # source model with classifier config torch_model = torch.nn.ReLU().eval() traced_model = torch.jit.trace( torch_model, torch.rand( 3, ), ) variable_name = "var_2" class_label_name = "class_label" classifier_config = ct.ClassifierConfig( class_labels=["a", "b", "c"], predicted_feature_name=class_label_name, predicted_probabilities_output=variable_name, ) mlmodel = ct.convert( traced_model, inputs=[ct.TensorType(shape=(3,))], classifier_config=classifier_config, minimum_deployment_target=ct.target.iOS16, ) package_path = tempfile.mkdtemp(suffix=".mlpackage") mlmodel.save(package_path) # multifunction model should have the same classifier description model_path = self._multifunction_model_from_single_function(package_path) check_classifier_spec(model_path) # construct another multifunction model with an existing multifunction model, # the classifier description should still be the same. model_path_2 = self._multifunction_model_from_multifunction_model(model_path) check_classifier_spec(model_path_2) # cleanup shutil.rmtree(package_path) shutil.rmtree(model_path) shutil.rmtree(model_path_2) def test_input_output_description(self): """ When using save_multifunction to produce a model, we should respect the original model description in the original model. 
""" def check_i_o_spec(model_path: str) -> None: spec = load_spec(model_path) model_desc = spec.description assert len(model_desc.functions) == 2 for idx in [0, 1]: assert ( model_desc.functions[idx].input[0].type.imageType.colorSpace == proto.FeatureTypes_pb2.ImageFeatureType.BGR ) assert ( model_desc.functions[idx].output[0].type.imageType.colorSpace == proto.FeatureTypes_pb2.ImageFeatureType.RGB ) # source model with i/o with ImageType class Model(torch.nn.Module): def forward(self, x): return x + 5.0 example_input = torch.randint(0, 100, (1, 3, 10, 20), dtype=torch.float32) model = torch.jit.trace(Model().eval(), example_input) mlmodel = ct.convert( model, inputs=[ct.ImageType(shape=(1, 3, 10, 20), color_layout=ct.colorlayout.BGR)], outputs=[ct.ImageType(color_layout=ct.colorlayout.RGB)], minimum_deployment_target=ct.target.iOS16, ) package_path = tempfile.mkdtemp(suffix=".mlpackage") mlmodel.save(package_path) # multifunction model should have the same i/o description model_path = self._multifunction_model_from_single_function(package_path) check_i_o_spec(model_path) # construct another multifunction model with an existing multifunction model, # the i/o description should still be the same model_path_2 = self._multifunction_model_from_multifunction_model(model_path) check_i_o_spec(model_path_2) # cleanup shutil.rmtree(package_path) shutil.rmtree(model_path) shutil.rmtree(model_path_2) @pytest.mark.skipif(ct.utils._macos_version() < (15, 0), reason="Multi-function only supported on macOS 15+") class TestMultiFunctionModelEnd2End: @staticmethod def _get_test_model(): class TestModel(torch.nn.Module): def __init__(self): super().__init__() self.conv1 = torch.nn.Conv2d(1, 8, 5, padding="same", bias=False) self.bn1 = torch.nn.BatchNorm2d(8) self.linear1 = torch.nn.Linear(28 * 28 * 8, 5, bias=False) def forward(self, x): x = self.conv1(x) x = self.bn1(x) x = self.linear1(torch.flatten(x)) return x model = TestModel().eval() example_input = torch.rand(1, 1, 28, 28) return torch.jit.trace(model, example_input) @staticmethod def _get_test_model_2(): """ Base model have the same weights, while the weights in submodule are different. """ class SubModel(torch.nn.Module): def __init__(self): super().__init__() self.linear1 = torch.nn.Linear(28 * 28 * 8, 5, bias=False) def forward(self, x): return self.linear1(x) class TestModel(torch.nn.Module): def __init__(self): super().__init__() self.conv1 = torch.nn.Conv2d(1, 8, 5, padding="same", bias=False) self.bn1 = torch.nn.BatchNorm2d(8) self.linear1 = None def forward(self, x): x = self.conv1(x) x = self.bn1(x) x = self.linear1(torch.flatten(x)) return x example_input = torch.rand(1, 1, 28, 28) model = TestModel().eval() submodule_1 = SubModel().eval() model.linear1 = submodule_1 trace_1 = torch.jit.trace(model, example_input) submodule_2 = SubModel().eval() model.linear1 = submodule_2 trace_2 = torch.jit.trace(model, example_input) return trace_1, trace_2 def test_two_models(self): """ model_1: base + function_1 model_2: base + function_2 After merging model_1 with model_2, the base weights should be shared. 
""" traced_model_1, traced_model_2 = self._get_test_model_2() input = np.random.rand(1, 1, 28, 28) mlmodel_1 = ct.convert( traced_model_1, inputs=[ct.TensorType(name="x", shape=(1, 1, 28, 28))], outputs=[ct.TensorType(name="out")], convert_to="mlprogram", minimum_deployment_target=ct.target.iOS17, ) mlmodel_2 = ct.convert( traced_model_2, inputs=[ct.TensorType(name="x", shape=(1, 1, 28, 28))], outputs=[ct.TensorType(name="out")], convert_to="mlprogram", minimum_deployment_target=ct.target.iOS17, ) gt_output_1 = mlmodel_1.predict({"x": input})["out"] gt_output_2 = mlmodel_2.predict({"x": input})["out"] package_path_1 = tempfile.mkdtemp(suffix=".mlpackage") mlmodel_1.save(package_path_1) package_path_2 = tempfile.mkdtemp(suffix=".mlpackage") mlmodel_2.save(package_path_2) # save multifuntion model desc = MultiFunctionDescriptor() desc.add_function(package_path_1, "main", "main_1") desc.add_function(package_path_2, "main", "main_2") desc.default_function_name = "main_1" saved_package_path = tempfile.mkdtemp(suffix=".mlpackage") save_multifunction(desc, saved_package_path) shutil.rmtree(package_path_1) shutil.rmtree(package_path_2) # verify the model spec spec = load_spec(saved_package_path) model_desc = spec.description assert len(model_desc.functions) == 2 assert model_desc.functions[0].name == "main_1" assert model_desc.functions[1].name == "main_2" assert model_desc.defaultFunctionName == "main_1" # verify the model can be load / run # rdar://126898335 ([multifunction][bug] CoreML "maybe" is not handling the fallback for the compute units) if platform.machine() == "arm64": multifunction_mlmodel_1 = ct.models.MLModel(saved_package_path, function_name="main_1") output = multifunction_mlmodel_1.predict({"x": input})["out"] np.testing.assert_allclose(gt_output_1, output) multifunction_mlmodel_2 = ct.models.MLModel(saved_package_path, function_name="main_2") output = multifunction_mlmodel_2.predict({"x": input})["out"] np.testing.assert_allclose(gt_output_2, output) # make sure the weights are deduplicated with tempfile.TemporaryDirectory() as serialize_dir: os.system(f"coremlcompiler compile {saved_package_path} {serialize_dir}") model_name_with_extension = os.path.basename(saved_package_path) model_name_wo_extension, _ = os.path.splitext(model_name_with_extension) mil_file = open( os.path.join(serialize_dir, f"{model_name_wo_extension}.mlmodelc", "model.mil") ) mil_txt = mil_file.read() assert ( mil_txt.count( 'const()[name = string("const_0_to_fp16"), val = tensor(BLOBFILE(path = string("@model_path/weights/weight.bin"), offset = uint64(64)))];' ) == 2 ) assert ( mil_txt.count( 'tensor linear1_linear1_weight_to_fp16 = const()[name = string("linear1_linear1_weight_to_fp16"), val = tensor(BLOBFILE(path = string("@model_path/weights/weight.bin"), offset = uint64(576)))];' ) == 1 ) assert ( mil_txt.count( 'tensor linear1_linear1_weight_to_fp16 = const()[name = string("linear1_linear1_weight_to_fp16"), val = tensor(BLOBFILE(path = string("@model_path/weights/weight.bin"), offset = uint64(63360)))];' ) == 1 ) shutil.rmtree(saved_package_path) def test_single_model(self): """ Convert a single model into a multi-functions model with only one function. 
""" traced_model = self._get_test_model() input = np.random.rand(1, 1, 28, 28) mlmodel = ct.convert( traced_model, inputs=[ct.TensorType(name="x", shape=(1, 1, 28, 28))], outputs=[ct.TensorType(name="out")], convert_to="mlprogram", minimum_deployment_target=ct.target.iOS16, ) gt_output = mlmodel.predict({"x": input})["out"] package_path = tempfile.mkdtemp(suffix=".mlpackage") mlmodel.save(package_path) # save multifuntion model desc = MultiFunctionDescriptor() desc.add_function(package_path, "main", "main_1") desc.default_function_name = "main_1" saved_package_path = tempfile.mkdtemp(suffix=".mlpackage") save_multifunction(desc, saved_package_path) shutil.rmtree(package_path) # verify the model spec spec = load_spec(saved_package_path) model_desc = spec.description assert len(model_desc.functions) == 1 assert model_desc.functions[0].name == "main_1" assert model_desc.defaultFunctionName == "main_1" # verify the model can be load / run # rdar://126898335 ([multifunction][bug] CoreML "maybe" is not handling the fallback for the compute units) if platform.machine() == "arm64": multifunction_mlmodel = ct.models.MLModel(saved_package_path, function_name="main_1") output = multifunction_mlmodel.predict({"x": input})["out"] np.testing.assert_allclose(gt_output, output) shutil.rmtree(saved_package_path) def test_10_duplicated_model(self): """ Copy a single model 10 times and create a multi-functions model with 10 functions. """ traced_model = self._get_test_model() input = np.random.rand(1, 1, 28, 28) NUM_MODEL = 10 saved_paths = [] for i in range(NUM_MODEL): mlmodel = ct.convert( traced_model, inputs=[ct.TensorType(name="x", shape=(1, 1, 28, 28))], outputs=[ct.TensorType(name="out")], convert_to="mlprogram", minimum_deployment_target=ct.target.iOS17, ) gt_output = mlmodel.predict({"x": input})["out"] saved_paths.append(tempfile.mkdtemp(suffix=".mlpackage")) mlmodel.save(saved_paths[-1]) # save the multifunction model desc = MultiFunctionDescriptor() for i in range(NUM_MODEL): desc.add_function(saved_paths[i], "main", f"main_{i}") desc.default_function_name = "main_5" saved_package_path = tempfile.mkdtemp(suffix=".mlpackage") save_multifunction(desc, saved_package_path) for val in saved_paths: shutil.rmtree(val) # verify the model spec spec = load_spec(saved_package_path) model_desc = spec.description assert len(model_desc.functions) == NUM_MODEL for i in range(NUM_MODEL): assert model_desc.functions[i].name == f"main_{i}" assert model_desc.defaultFunctionName == "main_5" # verify the model can be load / run # rdar://126898335 ([multifunction][bug] CoreML "maybe" is not handling the fallback for the compute units) if platform.machine() == "arm64": for i in range(NUM_MODEL): multifunction_mlmodel = ct.models.MLModel( saved_package_path, function_name=f"main_{i}" ) output = multifunction_mlmodel.predict({"x": input})["out"] np.testing.assert_allclose(gt_output, output) # make sure the weights are deduplicated with tempfile.TemporaryDirectory() as serialize_dir: os.system(f"coremlcompiler compile {saved_package_path} {serialize_dir}") model_name_with_extension = os.path.basename(saved_package_path) model_name_wo_extension, _ = os.path.splitext(model_name_with_extension) mil_file = open( os.path.join(serialize_dir, f"{model_name_wo_extension}.mlmodelc", "model.mil") ) mil_txt = mil_file.read() assert ( mil_txt.count( 'const()[name = string("const_0_to_fp16"), val = tensor(BLOBFILE(path = string("@model_path/weights/weight.bin"), offset = uint64(64)))];' ) == 10 ) assert ( mil_txt.count( 'tensor 
linear1_weight_to_fp16 = const()[name = string("linear1_weight_to_fp16"), val = tensor(BLOBFILE(path = string("@model_path/weights/weight.bin"), offset = uint64(576)))];' ) == 10 ) shutil.rmtree(saved_package_path) class TestMaterializeSymbolicShapeMLModel: FEATURE_DIM = 1024 NUM_HEADS = 4 MULTI_HEAD_OUT_FEATURE_DIM = 128 MULTI_HEAD_IN_FEATURE_DIM = int(FEATURE_DIM / NUM_HEADS) OUT_FEATURE_DIM = int(NUM_HEADS * MULTI_HEAD_OUT_FEATURE_DIM) @staticmethod def initialte_weight_and_bias(in_features, out_features) -> Tuple[torch.Tensor]: stdv = 1.0 / np.sqrt(in_features) weight = torch.empty((out_features, in_features), dtype=torch.float16).uniform_(-stdv, stdv) bias = torch.empty(out_features, dtype=torch.float16).uniform_(-stdv, stdv) return weight, bias @staticmethod def create_multihead_torch_model() -> torch.nn.Module: class Model(torch.nn.Module): def __init__(self) -> None: super().__init__() self.fc = torch.nn.Linear( TestMaterializeSymbolicShapeMLModel.MULTI_HEAD_IN_FEATURE_DIM, TestMaterializeSymbolicShapeMLModel.MULTI_HEAD_OUT_FEATURE_DIM, ) self.relu = torch.nn.ReLU() def forward(self, x) -> torch.Tensor: multi_head_x_shape = ( x.shape[0], x.shape[1], TestMaterializeSymbolicShapeMLModel.NUM_HEADS, TestMaterializeSymbolicShapeMLModel.MULTI_HEAD_IN_FEATURE_DIM, ) x_multi_head = torch.reshape(x, multi_head_x_shape) x_batched_multi_head = torch.permute(x_multi_head, (0, 2, 1, 3)) y_linear = self.fc(x_batched_multi_head) y_activated = self.relu(y_linear) y_multi_head = torch.permute(y_activated, (0, 2, 1, 3)) y_shape = ( x.shape[0], x.shape[1], TestMaterializeSymbolicShapeMLModel.OUT_FEATURE_DIM, ) y = torch.reshape(y_multi_head, y_shape) return y torch_model = Model() torch_model.eval() return torch_model @staticmethod def create_stateful_multihead_torch_model() -> torch.nn.Module: class Model(torch.nn.Module): def __init__(self) -> None: super().__init__() self.fc = torch.nn.Linear( TestMaterializeSymbolicShapeMLModel.MULTI_HEAD_IN_FEATURE_DIM, TestMaterializeSymbolicShapeMLModel.MULTI_HEAD_OUT_FEATURE_DIM, ) self.relu = torch.nn.ReLU() self.register_buffer( "cache", torch.zeros( TestMaterializeSymbolicShapeMLModel.OUT_FEATURE_DIM, dtype=torch.float32 ), ) def forward(self, x) -> torch.Tensor: multi_head_x_shape = ( x.shape[0], x.shape[1], TestMaterializeSymbolicShapeMLModel.NUM_HEADS, TestMaterializeSymbolicShapeMLModel.MULTI_HEAD_IN_FEATURE_DIM, ) x_multi_head = torch.reshape(x, multi_head_x_shape) x_batched_multi_head = torch.permute(x_multi_head, (0, 2, 1, 3)) y_linear = self.fc(x_batched_multi_head) y_activated = self.relu(y_linear) y_multi_head = torch.permute(y_activated, (0, 2, 1, 3)) y_shape = ( x.shape[0], x.shape[1], TestMaterializeSymbolicShapeMLModel.OUT_FEATURE_DIM, ) y = torch.reshape(y_multi_head, y_shape) z = y + self.cache z_mean = torch.mean(z, dim=(0, 1)) self.cache *= 0.8 self.cache += 0.2 * z_mean return z torch_model = Model() torch_model.eval() return torch_model @staticmethod def create_intermediate_state_torch_model(leading_sizes, weight, bias) -> torch.nn.Module: class Model(torch.nn.Module): def __init__(self, leading_sizes, weight, bias) -> None: super().__init__() self.fc = torch.nn.Linear( TestMaterializeSymbolicShapeMLModel.MULTI_HEAD_IN_FEATURE_DIM, TestMaterializeSymbolicShapeMLModel.MULTI_HEAD_OUT_FEATURE_DIM, ) with torch.no_grad(): self.fc.weight.copy_(weight) self.fc.bias.copy_(bias) self.relu = torch.nn.ReLU() self.register_buffer( "cache", torch.zeros( (*leading_sizes, TestMaterializeSymbolicShapeMLModel.FEATURE_DIM), dtype=torch.float32, ), ) def 
forward(self, x) -> torch.Tensor: self.cache *= 0.2 self.cache += 0.8 * x x = self.cache multi_head_x_shape = ( x.shape[0], x.shape[1], TestMaterializeSymbolicShapeMLModel.NUM_HEADS, TestMaterializeSymbolicShapeMLModel.MULTI_HEAD_IN_FEATURE_DIM, ) x_multi_head = torch.reshape(x, multi_head_x_shape) x_batched_multi_head = torch.permute(x_multi_head, (0, 2, 1, 3)) y_linear = self.fc(x_batched_multi_head) y_activated = self.relu(y_linear) y_multi_head = torch.permute(y_activated, (0, 2, 1, 3)) y_shape = ( x.shape[0], x.shape[1], TestMaterializeSymbolicShapeMLModel.OUT_FEATURE_DIM, ) y = torch.reshape(y_multi_head, y_shape) return y torch_model = Model(leading_sizes, weight, bias) torch_model.eval() return torch_model @staticmethod def read_mil_text(mlpackage_path: str) -> str: with tempfile.TemporaryDirectory() as serialize_dir: os.system(f"coremlcompiler compile {mlpackage_path} {serialize_dir}") model_name_with_extension = os.path.basename(mlpackage_path) model_name_wo_extension, _ = os.path.splitext(model_name_with_extension) mil_file = open( os.path.join(serialize_dir, f"{model_name_wo_extension}.mlmodelc", "model.mil") ) mil_text = mil_file.read() return mil_text @pytest.mark.parametrize( "symbolic_shape, override_main_function, reload_mlmodel", itertools.product( ( ct.EnumeratedShapes( shapes=[[1, 3, FEATURE_DIM], [2, 5, FEATURE_DIM], [4, 7, FEATURE_DIM]], default=[1, 3, FEATURE_DIM], ), (ct.RangeDim(1, 4, 1), ct.RangeDim(3, 7, 3), FEATURE_DIM), ), (True, False), (True, False), ), ) def test_multihead(self, symbolic_shape, override_main_function, reload_mlmodel): new_function_name = "main" if override_main_function else "materialization_2_5" def export_symbolic_shape_mlmodel(torch_model: torch.nn.Module) -> ct.models.MLModel: example_input = torch.rand((1, 3, self.FEATURE_DIM)) traced_model = torch.jit.trace(torch_model, (example_input,)) ct_inputs = [ct.TensorType(name="x", shape=symbolic_shape, dtype=np.float16)] ct_outputs = [ct.TensorType(name="y")] symbolic_shape_mlmodel = ct.convert( traced_model, inputs=ct_inputs, outputs=ct_outputs, minimum_deployment_target=ct.target.iOS17, skip_model_load=True, ) return symbolic_shape_mlmodel def validate_mil_text(multifunction_mlpackage_path: str) -> None: mil_text = self.read_mil_text(multifunction_mlpackage_path) if override_main_function: assert 1 == mil_text.count( '(BLOBFILE(path = tensor("@model_path/weights/weight.bin"), offset = tensor(64)))' ) assert 1 == mil_text.count( '(BLOBFILE(path = tensor("@model_path/weights/weight.bin"), offset = tensor(65664)))' ) else: assert 2 == mil_text.count( '(BLOBFILE(path = string("@model_path/weights/weight.bin"), offset = uint64(64)))' ) assert 2 == mil_text.count( '(BLOBFILE(path = string("@model_path/weights/weight.bin"), offset = uint64(65664)))' ) def validate_inference( torch_model: torch.nn.Module, symbolic_shape_mlmodel: ct.models.MLModel, multifunction_mlpackage_path: str, ) -> None: size_to_function_name = {(2, 5): new_function_name} for size, function_name in size_to_function_name.items(): mlmodel_materialized = ct.models.MLModel( multifunction_mlpackage_path, function_name=None if override_main_function else function_name, ) x = torch.rand(*size, self.FEATURE_DIM) output_torch = torch_model(x).detach().numpy() output_symbolic = symbolic_shape_mlmodel.predict({"x": x.numpy()})["y"] output_materialized = mlmodel_materialized.predict({"x": x.numpy()})["y"] np.testing.assert_allclose(output_symbolic, output_torch, atol=5e-3, rtol=5e-3) np.testing.assert_allclose(output_materialized, 
output_torch, atol=5e-3, rtol=5e-3) torch_model = self.create_multihead_torch_model() symbolic_shape_mlmodel = export_symbolic_shape_mlmodel(torch_model) symbolic_mlpackage_path = tempfile.mkdtemp(suffix=".mlpackage") symbolic_shape_mlmodel.save(symbolic_mlpackage_path) if reload_mlmodel: symbolic_shape_mlmodel = ct.models.MLModel( symbolic_mlpackage_path, skip_model_load=True ) multifunction_mlpackage_path = tempfile.mkdtemp(suffix=".mlpackage") ct.utils.materialize_dynamic_shape_mlmodel( symbolic_shape_mlmodel, {new_function_name: {"x": (2, 5, self.FEATURE_DIM)}}, multifunction_mlpackage_path, ) if override_main_function: assert ( ct.models.MLModel(multifunction_mlpackage_path)._spec.specificationVersion == ct.target.iOS17 ) else: assert ( ct.models.MLModel(multifunction_mlpackage_path)._spec.specificationVersion == ct.target.iOS18 ) # coremlcompiler had bug compiling the model < macOS 14 if ct.utils._macos_version() >= (15, 0): validate_mil_text(multifunction_mlpackage_path) if platform.machine() == "arm64" and ( override_main_function or ct.utils._macos_version() >= (15, 0) ): # Intel machines fails to run the model. # rdar://132919101 ([Bug] Intel machines fails on running several multifunction unittest) symbolic_shape_mlmodel = ct.models.MLModel(symbolic_mlpackage_path) validate_inference(torch_model, symbolic_shape_mlmodel, multifunction_mlpackage_path) shutil.rmtree(symbolic_mlpackage_path) shutil.rmtree(multifunction_mlpackage_path) @pytest.mark.skipif( ct.utils._macos_version() < (15, 0), reason="State only supported on macOS 15+" ) @pytest.mark.parametrize( "symbolic_shape, override_main_function, reload_mlmodel", itertools.product( ( ct.EnumeratedShapes( shapes=[[3, 1, FEATURE_DIM], [5, 2, FEATURE_DIM], [7, 4, FEATURE_DIM]], default=[3, 1, FEATURE_DIM], ), (ct.RangeDim(3, 7, 3), ct.RangeDim(1, 4, 1), FEATURE_DIM), ), (True, False), (True, False), ), ) def test_stateful_multihead(self, symbolic_shape, override_main_function, reload_mlmodel): new_function_name_1 = "main" if override_main_function else "materialization_5_2" new_function_name_2 = "materialization_7_4" def export_symbolic_shape_mlmodel(torch_model: torch.nn.Module) -> ct.models.MLModel: example_input = torch.rand((3, 1, self.FEATURE_DIM)) traced_model = torch.jit.trace(torch_model, (example_input,)) ct_inputs = [ct.TensorType(name="x", shape=symbolic_shape, dtype=np.float16)] ct_states = [ ct.StateType( wrapped_type=ct.TensorType(shape=(self.OUT_FEATURE_DIM,), dtype=np.float16), name="cache", ) ] ct_outputs = [ct.TensorType(name="y")] symbolic_shape_mlmodel = ct.convert( traced_model, inputs=ct_inputs, states=ct_states, outputs=ct_outputs, minimum_deployment_target=ct.target.iOS18, skip_model_load=True, ) return symbolic_shape_mlmodel def validate_mil_text(multifunction_mlpackage_path: str) -> None: mil_text = self.read_mil_text(multifunction_mlpackage_path) expected_counts = 2 if override_main_function else 3 assert expected_counts == mil_text.count( '(BLOBFILE(path = string("@model_path/weights/weight.bin"), offset = uint64(64)))' ) assert expected_counts == mil_text.count( '(BLOBFILE(path = string("@model_path/weights/weight.bin"), offset = uint64(65664)))' ) def validate_inference( torch_model: torch.nn.Module, symbolic_shape_mlmodel: ct.models.MLModel, multifunction_mlpackage_path: str, ) -> None: size_to_function_name = {(5, 2): new_function_name_1, (7, 4): new_function_name_2} for size, function_name in size_to_function_name.items(): mlmodel_materialized = ct.models.MLModel( multifunction_mlpackage_path, 
function_name=function_name ) torch_model.cache.fill_(0.0) cache_main = symbolic_shape_mlmodel.make_state() cache_materialized = mlmodel_materialized.make_state() for _ in range(10): x = torch.rand(*size, self.FEATURE_DIM) output_torch = torch_model(x).detach().numpy() output_symbolic = symbolic_shape_mlmodel.predict( {"x": x.numpy()}, state=cache_main )["y"] output_materialized = mlmodel_materialized.predict( {"x": x.numpy()}, state=cache_materialized )["y"] np.testing.assert_allclose(output_symbolic, output_torch, atol=5e-3, rtol=5e-3) np.testing.assert_allclose( output_materialized, output_torch, atol=5e-3, rtol=5e-3 ) torch_model = self.create_stateful_multihead_torch_model() symbolic_shape_mlmodel = export_symbolic_shape_mlmodel(torch_model) symbolic_mlpackage_path = tempfile.mkdtemp(suffix=".mlpackage") symbolic_shape_mlmodel.save(symbolic_mlpackage_path) if reload_mlmodel: symbolic_shape_mlmodel = ct.models.MLModel( symbolic_mlpackage_path, skip_model_load=True ) multifunction_mlpackage_path = tempfile.mkdtemp(suffix=".mlpackage") ct.utils.materialize_dynamic_shape_mlmodel( symbolic_shape_mlmodel, { new_function_name_1: {"x": (5, 2, self.FEATURE_DIM)}, new_function_name_2: {"x": (7, 4, self.FEATURE_DIM)}, }, multifunction_mlpackage_path, ) validate_mil_text(multifunction_mlpackage_path) if platform.machine() == "arm64" and ( override_main_function or ct.utils._macos_version() >= (15, 0) ): # Intel machines fails to run the model. # rdar://132919101 ([Bug] Intel machines fails on running several multifunction unittest) symbolic_shape_mlmodel = ct.models.MLModel(symbolic_mlpackage_path) validate_inference(torch_model, symbolic_shape_mlmodel, multifunction_mlpackage_path) shutil.rmtree(symbolic_mlpackage_path) shutil.rmtree(multifunction_mlpackage_path) @pytest.mark.skipif( ct.utils._macos_version() < (15, 0), reason="State only supported on macOS 15+" ) def test_advanced_intermediate_state(self): WEIGHT, BIAS = self.initialte_weight_and_bias( self.MULTI_HEAD_IN_FEATURE_DIM, self.MULTI_HEAD_OUT_FEATURE_DIM ) def export_symbolic_shape_program() -> Program: leading_sizes = (1, 3) torch_model = self.create_intermediate_state_torch_model(leading_sizes, WEIGHT, BIAS) x_shape = (*leading_sizes, self.FEATURE_DIM) x = torch.rand(x_shape) traced_model = torch.jit.trace(torch_model, x) x_dynamic_shape = (ct.RangeDim(1, 1024), ct.RangeDim(1, 1024), self.FEATURE_DIM) ct_inputs = [ct.TensorType(name="x", shape=x_dynamic_shape, dtype=np.float16)] ct_states = [ ct.StateType( wrapped_type=ct.TensorType(shape=x_dynamic_shape, dtype=np.float16), name="cache", ) ] symbolic_shape_prog = ct.convert( traced_model, inputs=ct_inputs, states=ct_states, minimum_deployment_target=ct.target.iOS18, convert_to="milinternal", ) return symbolic_shape_prog def export_fixed_shape_mlmodel(leading_sizes) -> ct.models.MLModel: torch_model = self.create_intermediate_state_torch_model(leading_sizes, WEIGHT, BIAS) x_shape = (*leading_sizes, self.FEATURE_DIM) x = torch.rand(x_shape) traced_model = torch.jit.trace(torch_model, x) ct_inputs = [ct.TensorType(name="x", shape=x_shape, dtype=np.float16)] ct_states = [ ct.StateType( wrapped_type=ct.TensorType(shape=x_shape, dtype=np.float16), name="cache" ) ] fixed_shape_mlmodel = ct.convert( traced_model, inputs=ct_inputs, states=ct_states, minimum_deployment_target=ct.target.iOS18, compute_units=ct.ComputeUnit.CPU_ONLY, ) return fixed_shape_mlmodel def materialize_dynamic_shape_program( dynamic_shape_prog: Program, function_name_to_materialization_map: Dict[str, Dict[str, 
Tuple[int]]], destination_path: str, ) -> None: # Materialize symbolic shapes, then run all optimization passes pass_pipeline = ct.PassPipeline.DEFAULT # If dynamic shape prog is obtained from `ct.convert(convert_to="milinternal")`, # then names are not sanitized. What is worse, mil_backend::sanitize_name_strings # does not work for multifunction pymil program. As a result, # we explicitly add mil_backend::sanitize_name_strings before materialization # TODO (rdar://131726375) Have mil_backend::sanitize_name_strings work on multifunction pass_pipeline.insert_pass(0, "mil_backend::sanitize_name_strings") pass_pipeline.insert_pass(1, "common::materialize_symbolic_shape_program") pass_pipeline.set_options( "common::materialize_symbolic_shape_program", { "function_name_to_materialization_map": function_name_to_materialization_map, }, ) PassPipelineManager.apply_pipeline(dynamic_shape_prog, pass_pipeline) # Weights are duplicated in each materialized new function # By default, graph pass const_deduplication will not deduplicate across functions, # so we need to call it explicitly here const_deduplication_pass = PASS_REGISTRY["common::const_deduplication"] const_deduplication_pass._deduplicate_const_across_functions(dynamic_shape_prog) # Source function may no longer be needed, # e.g. if it has intermediate symbolic-shape state dynamic_shape_prog.functions.pop("main") dynamic_shape_prog.default_function_name = list( function_name_to_materialization_map.keys() )[0] dynamic_shape_prog.skip_all_passes = True materialized_mlmodel = _mil_convert( dynamic_shape_prog, convert_from="milinternal", convert_to="mlprogram", specification_version=ct.target.iOS18, compute_units=ct.ComputeUnit.CPU_ONLY, export_multi_functions=True, skip_model_load=True, ) materialized_mlmodel.save(destination_path) def validate_mil_text(multifunction_mlpackage_path: str) -> None: mil_text = self.read_mil_text(multifunction_mlpackage_path) assert 2 == mil_text.count( '(BLOBFILE(path = string("@model_path/weights/weight.bin"), offset = uint64(64)))' ) assert 2 == mil_text.count( '(BLOBFILE(path = string("@model_path/weights/weight.bin"), offset = uint64(65664)))' ) def validate_inference(multifunction_mlpackage_path: str) -> None: size_to_function_name = {(2, 5): "materialization_2_5", (4, 7): "materialization_4_7"} for leading_sizes, function_name in size_to_function_name.items(): torch_model = self.create_intermediate_state_torch_model( leading_sizes, WEIGHT, BIAS ) mlmodel_unifunction = export_fixed_shape_mlmodel(leading_sizes) mlmodel_multifunction = ct.models.MLModel( multifunction_mlpackage_path, function_name=function_name, compute_units=ct.ComputeUnit.CPU_ONLY, ) torch_model.cache.fill_(0.0) cache_unifunction = mlmodel_unifunction.make_state() cache_multifunction = mlmodel_multifunction.make_state() for _ in range(3): x = torch.rand(size=(*leading_sizes, self.FEATURE_DIM), dtype=torch.float16) output_torch = torch_model(x).detach().numpy() output_unifunction = list( mlmodel_unifunction.predict( {"x": x.numpy()}, state=cache_unifunction ).values() )[0] output_multifunction = list( mlmodel_multifunction.predict( {"x": x.numpy()}, state=cache_multifunction ).values() )[0] np.testing.assert_allclose( output_unifunction, output_torch, atol=5e-3, rtol=5e-3 ) np.testing.assert_allclose( output_multifunction, output_torch, atol=5e-3, rtol=5e-3 ) symbolic_shape_prog = export_symbolic_shape_program() multifunction_mlpackage_path = tempfile.mkdtemp(suffix=".mlpackage") materialize_dynamic_shape_program( symbolic_shape_prog, { 
"materialization_2_5": { "x": (2, 5, self.FEATURE_DIM), "cache": (2, 5, self.FEATURE_DIM), }, "materialization_4_7": { "x": (4, 7, self.FEATURE_DIM), "cache": (4, 7, self.FEATURE_DIM), }, }, multifunction_mlpackage_path, ) validate_mil_text(multifunction_mlpackage_path) validate_inference(multifunction_mlpackage_path) shutil.rmtree(multifunction_mlpackage_path) class TestBisectModel: @staticmethod def check_spec_op_type(model_path, expected_ops): spec = load_spec(model_path) mil = spec.mlProgram for function in mil.functions.values(): for block in function.block_specializations.values(): ops = list(block.operations) for i, op_type in enumerate(expected_ops): assert ops[i].type == op_type @staticmethod def get_test_model_path(minimum_deployment_target=ct.target.iOS16, return_as_mlmodel=False): # pytorch model and tracing class Model(torch.nn.Module): def __init__(self): super().__init__() self.linear1 = torch.nn.Linear(6000, 6000) self.relu = torch.nn.ReLU() self.linear2 = torch.nn.Linear(6000, 6000) def forward(self, x): x = self.linear1(x) x = self.relu(x) x = self.linear2(x) x = torch.sin(x) return x example_input = torch.rand(1, 6000) model = Model().eval() traced_model = torch.jit.trace(model, example_input) # convert to mlpackage mlmodel = ct.convert( traced_model, inputs=[ct.TensorType(shape=(1, 6000), name="input")], minimum_deployment_target=minimum_deployment_target, ) # return as mlmodel if return_as_mlmodel: return mlmodel # save on disk and return the model path package_path = tempfile.mkdtemp(suffix=".mlpackage") mlmodel.save(package_path) return package_path def test_invalid_mlpackage(self): traced_model = TestMultiFunctionModelEnd2End._get_test_model() input = np.random.rand(1, 1, 28, 28) mlmodel = ct.convert( traced_model, inputs=[ct.TensorType(name="x", shape=(1, 1, 28, 28))], outputs=[ct.TensorType(name="out")], convert_to="mlprogram", minimum_deployment_target=ct.target.iOS16, ) package_path = tempfile.mkdtemp(suffix=".mlpackage") mlmodel.save(package_path) # function name other than "main" will error out desc = MultiFunctionDescriptor() desc.add_function(package_path, "main", "main_1") desc.default_function_name = "main_1" saved_package_path = tempfile.mkdtemp(suffix=".mlpackage") save_multifunction(desc, saved_package_path) with tempfile.TemporaryDirectory() as output_dir: with pytest.raises(ValueError, match="only support model with a single"): bisect_model( saved_package_path, output_dir=output_dir, ) shutil.rmtree(saved_package_path) # multi-function model is not supported desc = MultiFunctionDescriptor() desc.add_function(package_path, "main", "main") desc.add_function(package_path, "main", "main_1") desc.default_function_name = "main" saved_package_path = tempfile.mkdtemp(suffix=".mlpackage") save_multifunction(desc, saved_package_path) with pytest.raises(ValueError, match="only support model with a single"): bisect_model( saved_package_path, output_dir=output_dir, ) shutil.rmtree(saved_package_path) shutil.rmtree(package_path) @pytest.mark.parametrize( "mlmodel_as_input", [True, False], ) def test_pipeline(self, mlmodel_as_input): model = self.get_test_model_path(return_as_mlmodel=mlmodel_as_input) output_dir = str(tempfile.TemporaryDirectory()) # The API will bisect the model into two chunks, and produces a pipeline model bisect_model( model, output_dir, merge_chunks_to_pipeline=True, ) # check the file name is correct if mlmodel_as_input: name = "" else: mlpackage_name = os.path.basename(model) name, _ = os.path.splitext(mlpackage_name) name += "_" 
pipeline_path = os.path.join(output_dir, f"{name}chunked_pipeline.mlpackage") assert os.path.isdir(pipeline_path) # check the Core ML model is a pipeline model spec = load_spec(pipeline_path) assert spec.WhichOneof("Type") == "pipeline" # cleanup if not mlmodel_as_input: shutil.rmtree(model) shutil.rmtree(output_dir) def test_compressed_model(self): # use coremltools.optimizee to palettize a Core ML model model = self.get_test_model_path(return_as_mlmodel=True) op_config = cto.coreml.OpPalettizerConfig(mode="kmeans", nbits=8) config = cto.coreml.OptimizationConfig(global_config=op_config) model = cto.coreml.palettize_weights(model, config) # test that the bisect API works output_dir = str(tempfile.TemporaryDirectory()) bisect_model( model, output_dir, ) # test the models contain correct ops name = "" chunk1_path = os.path.join(output_dir, f"{name}chunk1.mlpackage") chunk2_path = os.path.join(output_dir, f"{name}chunk2.mlpackage") assert os.path.isdir(chunk1_path) assert os.path.isdir(chunk2_path) self.check_spec_op_type( chunk1_path, [ "constexpr_lut_to_dense", "const", "linear", "const", "cast", ] ) self.check_spec_op_type( chunk2_path, [ "const", "cast", "relu", "constexpr_lut_to_dense", "const", "linear", "sin", ] ) # cleanup shutil.rmtree(output_dir) @pytest.mark.parametrize( "mlmodel_as_input", [True, False], ) def test_basic(self, mlmodel_as_input): def check_spec_version(model_path, expected_spec_version): spec = load_spec(model_path) assert spec.specificationVersion == expected_spec_version def check_output_dtype(model_path, expected_output_dtype): spec = load_spec(model_path) assert_spec_output_type(spec, DTYPE_TO_FEATURE_TYPE_MAP[expected_output_dtype]) def check_input_dtype(model_path, expected_input_dtype): spec = load_spec(model_path) assert_spec_input_type(spec, DTYPE_TO_FEATURE_TYPE_MAP[expected_input_dtype]) model = self.get_test_model_path(ct.target.iOS17, return_as_mlmodel=mlmodel_as_input) output_dir = str(tempfile.TemporaryDirectory()) # By bisecting the model into half, there will be two new mlpackages, with suffix `_chunk1.mlpackage` and `_chunk2.mlpackage` # in the target `output_dir`. bisect_model( model, output_dir, ) # check the API doesn't delete the original mlpackage if not mlmodel_as_input: assert os.path.isdir(model) # check the file names are correct if mlmodel_as_input: name = "" else: mlpackage_name = os.path.basename(model) name, _ = os.path.splitext(mlpackage_name) name += "_" chunk1_path = os.path.join(output_dir, f"{name}chunk1.mlpackage") chunk2_path = os.path.join(output_dir, f"{name}chunk2.mlpackage") assert os.path.isdir(chunk1_path) assert os.path.isdir(chunk2_path) # check the model op type self.check_spec_op_type( chunk1_path, [ "const", "const", "linear", "const", "cast", ] ) self.check_spec_op_type( chunk2_path, [ "const", "cast", "relu", "const", "const", "linear", "sin", ] ) # check the spec has the correct version check_spec_version(chunk1_path, ct.target.iOS17) check_spec_version(chunk2_path, ct.target.iOS17) # the i/o dtype of the two chunk models should be: # 1. fp16 -> fp32 # 2. 
fp32 -> fp16 check_input_dtype(chunk1_path, "fp16") check_output_dtype(chunk1_path, "fp32") check_input_dtype(chunk2_path, "fp32") check_output_dtype(chunk2_path, "fp16") # cleanup if not mlmodel_as_input: shutil.rmtree(model) shutil.rmtree(output_dir) def test_api_example(self): """ Test the API example in https://apple.github.io/coremltools/docs-guides/source/mlmodel-utilities.html """ model_path = self.get_test_model_path() output_dir = str(tempfile.TemporaryDirectory()) # The following code will produce two chunks models: # `./output/my_model_chunk1.mlpackage` and `./output/my_model_chunk2.mlpackage` ct.models.utils.bisect_model( model_path, output_dir, ) # The following code will produce a single pipeline model `./output/my_model_chunked_pipeline.mlpackage` ct.models.utils.bisect_model( model_path, output_dir, merge_chunks_to_pipeline=True, ) # You can also pass the MLModel object directly mlmodel = ct.models.MLModel(model_path) ct.models.utils.bisect_model( mlmodel, output_dir, ) # clean up shutil.rmtree(output_dir) shutil.rmtree(model_path) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2775474 coremltools-8.0/coremltools/test/modelpackage/0000755000000000000000000000000014672075535020434 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/modelpackage/__init__.py0000644000000000000000000000034114672066616022543 0ustar00rootroot# Copyright (c) 2017 - 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/modelpackage/test_mlmodel.py0000644000000000000000000000413114672066616023475 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os import shutil import numpy as np import torch import coremltools as ct from coremltools._deps import _IS_MACOS from coremltools.models.model import MLModel from coremltools.models.utils import _macos_version def test_mlmodel_demo(tmpdir): NUM_TOKENS = 3 EMBEDDING_SIZE = 5 class TestModule(torch.nn.Module): def __init__(self): super(TestModule, self).__init__() self.embedding = torch.nn.Embedding(NUM_TOKENS, EMBEDDING_SIZE) def forward(self, x): return self.embedding(x) model = TestModule() model.eval() example_input = torch.randint(high=NUM_TOKENS, size=(2,), dtype=torch.int64) traced_model = torch.jit.trace(model, example_input) mlmodel = ct.convert( traced_model, source='pytorch', convert_to='mlprogram', inputs=[ ct.TensorType( name="input", shape=example_input.shape, dtype=example_input.numpy().dtype, ) ], compute_precision=ct.precision.FLOAT32, compute_units=ct.ComputeUnit.CPU_ONLY ) assert isinstance(mlmodel, MLModel) # mlpackage_path is a model package mlpackage_path = os.path.join(str(tmpdir), 'mymodel.mlpackage') mlmodel.save(mlpackage_path) # Read back the saved bundle and compile mlmodel2 = MLModel(mlpackage_path) if not _IS_MACOS or _macos_version() < (12, 0): # Can not get predictions unless on macOS 12 or higher. 
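# On macOS versions older than 12 (or on non-macOS hosts) the compiled model cannot be
# loaded for prediction, so the test just removes the saved package and returns early.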
shutil.rmtree(mlpackage_path) return result = mlmodel2.predict( {"input": example_input.cpu().detach().numpy().astype(np.float32)}, ) # Verify outputs expected = model(example_input) name = list(result.keys())[0] np.testing.assert_allclose(result[name], expected.cpu().detach().numpy()) # Cleanup package shutil.rmtree(mlpackage_path) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/modelpackage/test_modelpackage.py0000644000000000000000000007232614672066616024473 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import json import os import platform import shutil import tempfile import numpy as np import pytest import coremltools import coremltools as ct from coremltools import ComputeUnit, utils from coremltools._deps import _HAS_EXECUTORCH, _HAS_TORCH from coremltools.converters.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.builder import Builder as mb from coremltools.libmodelpackage import ModelPackage from coremltools.models import _METADATA_VERSION, CompiledMLModel, MLModel from coremltools.models.utils import _MLPACKAGE_AUTHOR_NAME, _WEIGHTS_DIR_NAME from coremltools.proto import Model_pb2 if _HAS_TORCH: import torch if _HAS_EXECUTORCH: import executorch.exir def _remove_path(path): if os.path.isdir(path): shutil.rmtree(path) else: os.remove(path) class TestMLModel: def setup_class(self): spec = Model_pb2.Model() spec.specificationVersion = coremltools.SPECIFICATION_VERSION features = ["feature_1", "feature_2"] output = "output" for f in features: input_ = spec.description.input.add() input_.name = f input_.type.doubleType.MergeFromString(b"") output_ = spec.description.output.add() output_.name = output output_.type.doubleType.MergeFromString(b"") lr = spec.glmRegressor lr.offset.append(0.1) weights = lr.weights.add() coefs = [1.0, 2.0] for i in coefs: weights.value.append(i) spec.description.predictedFeatureName = "output" self.spec = spec def test_model_creation(self): model = MLModel(self.spec) assert model is not None package = tempfile.TemporaryDirectory(suffix=".mlpackage") package.cleanup() utils.save_spec(self.spec, package.name) model = MLModel(package.name) assert model is not None # cleanup _remove_path(package.name) def test_model_api(self): model = MLModel(self.spec) assert model is not None model.author = "Test author" assert model.author == "Test author" assert model.get_spec().description.metadata.author == "Test author" model.license = "Test license" assert model.license == "Test license" assert model.get_spec().description.metadata.license == "Test license" model.short_description = "Test model" assert model.short_description == "Test model" assert model.get_spec().description.metadata.shortDescription == "Test model" model.version = "1.3" assert model.version == "1.3" assert model.get_spec().description.metadata.versionString == "1.3" model.input_description["feature_1"] = "This is feature 1" assert model.input_description["feature_1"] == "This is feature 1" model.output_description["output"] = "This is output" assert model.output_description["output"] == "This is output" package = tempfile.TemporaryDirectory(suffix=".mlpackage") package.cleanup() model.save(package.name) loaded_model = MLModel(package.name) assert 
loaded_model.author == "Test author" assert loaded_model.license == "Test license" assert loaded_model.short_description == "Test model" assert loaded_model.input_description["feature_1"] == "This is feature 1" assert loaded_model.output_description["output"] == "This is output" # cleanup _remove_path(package.name) @pytest.mark.skipif(not _HAS_TORCH, reason="requires torch") def test_save_from_mlpackage(self): class Model(torch.nn.Module): def forward(self, x): return x example_input = torch.rand(1, 3, 50, 50) traced_model = torch.jit.trace(Model().eval(), example_input) model = coremltools.convert( traced_model, inputs=[coremltools.TensorType(shape=example_input.shape)], convert_to="mlprogram", ) author = "Bobby Joe!" model.author = author save_dir = tempfile.TemporaryDirectory(suffix=".mlpackage").name model.save(save_dir) loaded_model = MLModel(save_dir) assert loaded_model.author == author _remove_path(save_dir) def test_predict_api(self): model = MLModel(self.spec) package = tempfile.TemporaryDirectory(suffix=".mlpackage") package.cleanup() model.save(package.name) if utils._macos_version() >= (12, 0): for compute_units in coremltools.ComputeUnit: if (compute_units == coremltools.ComputeUnit.CPU_AND_NE and utils._macos_version() < (13, 0)): continue loaded_model = MLModel(package.name, compute_units=compute_units) preds = loaded_model.predict({"feature_1": 1.0, "feature_2": 1.0}) assert preds is not None assert len(preds.keys()) == 1 assert preds["output"] == 3.1 assert loaded_model.compute_unit == compute_units else: # just check if we can load it loaded_model = MLModel(package.name) # cleanup _remove_path(package.name) @pytest.mark.skipif(utils._macos_version() < (12, 0), reason="prediction available only on macOS12+") def test_batch_predict(self): model = MLModel(self.spec) x = [ {"feature_1": 1.0, "feature_2": 1.0}, {"feature_1": 2.0, "feature_2": 2.0} ] y = model.predict(x) assert len(y) == 2 assert y[0]["output"] == 3.1 assert len(y[0].keys()) == 1 assert y[1]["output"] == 6.1 assert len(y[1].keys()) == 1 def test_rename_input(self): utils.rename_feature(self.spec, "feature_1", "renamed_feature", rename_inputs=True) model = MLModel(self.spec) package = tempfile.TemporaryDirectory(suffix=".mlpackage") package.cleanup() model.save(package.name) loaded_model = MLModel(package.name) if utils._macos_version() >= (12, 0): preds = loaded_model.predict({"renamed_feature": 1.0, "feature_2": 1.0}) assert preds is not None assert preds["output"] == 3.1 # reset the spec for next run utils.rename_feature(self.spec, "renamed_feature", "feature_1", rename_inputs=True) # cleanup _remove_path(package.name) def test_rename_input_bad(self): utils.rename_feature(self.spec, "blah", "bad_name", rename_inputs=True) model = MLModel(self.spec) package = tempfile.TemporaryDirectory(suffix=".mlpackage") package.cleanup() model.save(package.name) loaded_model = MLModel(package.name) if utils._macos_version() >= (12, 0): preds = loaded_model.predict({"feature_1": 1.0, "feature_2": 1.0}) assert preds is not None assert preds["output"] == 3.1 # cleanup _remove_path(package.name) def test_save(self): model = MLModel(self.spec) # Verify "save" can be called twice and the saved # model can be loaded successfully each time for _ in range(0, 2): package = tempfile.TemporaryDirectory(suffix=".mlpackage") package.cleanup() model.save(package.name) loaded_model = MLModel(package.name) if utils._macos_version() >= (12, 0): preds = loaded_model.predict({"feature_1": 1.0, "feature_2": 1.0}) assert preds is not None 
assert preds["output"] == 3.1 _remove_path(package.name) def test_save_in_place(self): model = MLModel(self.spec) # Verify "save" can be called twice and the saved # model can be loaded successfully each time # the mlpackage remains in place after the first save package = tempfile.TemporaryDirectory(suffix=".mlpackage") package.cleanup() for _ in range(2): model.save(package.name) loaded_model = MLModel(package.name) if utils._macos_version() >= (12, 0): preds = loaded_model.predict({"feature_1": 1.0, "feature_2": 1.0}) assert preds is not None assert preds["output"] == 3.1 _remove_path(package.name) @pytest.mark.skipif(not _HAS_EXECUTORCH, reason="requires ExecuTorch") def test_save_EXIR_debug_handle(self): """ If we update EXIR debug handle serialization, we should update this test as well """ INPUT_SHAPE = (2, 10) LINEAR_SHAPE = (INPUT_SHAPE[-1], 20) class TestModule(torch.nn.Module): def __init__(self): super().__init__() self.linear = torch.nn.Linear(*LINEAR_SHAPE) def forward(self, x): return self.linear(x) def _compare_loaded_debug_handle_mapping_with_original(package): debug_handle_mapping_json_path = os.path.join( package, "executorch_debug_handle_mapping.json" ) assert os.path.exists(debug_handle_mapping_json_path) with open(debug_handle_mapping_json_path, "r") as f: loaded_debug_handle_mapping = json.load(f) assert loaded_debug_handle_mapping == debug_handle_mapping def _compare_prediction_with_torch(coreml_model, torch_model): x = torch.rand(INPUT_SHAPE) coreml_x = {list(coreml_model.input_description)[0]: x.numpy()} coreml_preds = coreml_model.predict(coreml_x) assert coreml_preds is not None coreml_y = list(coreml_preds.values())[0] torch_y = torch_model(x).detach().numpy() np.testing.assert_allclose(coreml_y, torch_y, rtol=1e-3, atol=1e-3) torch_model = TestModule() torch_model.eval() example_input = (torch.rand(*INPUT_SHAPE, dtype=torch.float16).to(torch.float32),) exir_program_aten = torch.export.export(torch_model, example_input) exir_program_edge = executorch.exir.to_edge(exir_program_aten).exported_program() coreml_model = coremltools.convert(exir_program_edge) debug_handle_mapping = { "version" : coreml_model.user_defined_metadata[_METADATA_VERSION], "mapping" : { str(k): v for k, v in coreml_model._mil_program.construct_debug_handle_to_ops_mapping().items() }, } with tempfile.TemporaryDirectory(suffix=".mlpackage") as package0: coreml_model.save(package0) loaded_model0 = MLModel(package0) if utils._macos_version() >= (12, 0): _compare_prediction_with_torch(loaded_model0, torch_model) _compare_loaded_debug_handle_mapping_with_original(package0) with tempfile.TemporaryDirectory(suffix=".mlpackage") as package1: loaded_model0.save(package1) loaded_model1 = MLModel(package1) if utils._macos_version() >= (12, 0): _compare_prediction_with_torch(loaded_model1, torch_model) # Although debug handle info will be lost in loaded model due to we do not # deserialize executorch_debug_handle_mapping.json, package1 will still have # executorch_debug_handle_mapping.json, which is copied from package0 _compare_loaded_debug_handle_mapping_with_original(package1) @pytest.mark.skipif(not _HAS_TORCH, reason="requires torch") def test_mil_as_package(self): num_tokens = 3 embedding_size = 5 class TestModule(torch.nn.Module): def __init__(self): super().__init__() self.embedding = torch.nn.Embedding(num_tokens, embedding_size) def forward(self, x): return self.embedding(x) model = TestModule() model.eval() example_input = torch.randint(high=num_tokens, size=(2,), dtype=torch.int64) 
traced_model = torch.jit.trace(model, example_input) temp_package_dir = tempfile.TemporaryDirectory(suffix=".mlpackage") for converted_package_path in [None, temp_package_dir.name]: mlmodel = coremltools.convert( traced_model, package_dir=converted_package_path, source='pytorch', convert_to='mlprogram', compute_precision=coremltools.precision.FLOAT32, inputs=[ coremltools.TensorType( name="input", shape=example_input.shape, dtype=example_input.numpy().dtype, ) ], ) assert isinstance(mlmodel, MLModel) package_path = tempfile.mkdtemp(suffix=".mlpackage") mlmodel.save(package_path) assert ModelPackage.isValid(package_path) assert os.path.exists(ModelPackage(package_path).getRootModel().path()) # Read back the saved bundle and compile mlmodel2 = MLModel(package_path, compute_units=ComputeUnit.CPU_ONLY) if utils._macos_version() >= (12, 0): result = mlmodel2.predict( {"input": example_input.cpu().detach().numpy().astype(np.float32)} ) # Verify outputs expected = model(example_input) name = list(result.keys())[0] np.testing.assert_allclose(result[name], expected.cpu().detach().numpy()) # Cleanup package shutil.rmtree(package_path) tmp_package_path = mlmodel.package_path assert os.path.exists(tmp_package_path) del mlmodel if converted_package_path is not None: # Verify we leave the provided package dir alone assert os.path.exists(tmp_package_path) temp_package_dir.cleanup() def test_model_save_no_extension(self): import torch num_tokens = 3 embedding_size = 5 class TestModule(torch.nn.Module): def __init__(self): super().__init__() self.embedding = torch.nn.Embedding(num_tokens, embedding_size) def forward(self, x): return self.embedding(x) model = TestModule() model.eval() example_input = torch.randint(high=num_tokens, size=(2,), dtype=torch.int64) traced_model = torch.jit.trace(model, example_input) mlmodel = coremltools.convert( traced_model, package_dir=None, source='pytorch', convert_to='mlprogram', inputs=[ coremltools.TensorType( name="input", shape=example_input.shape, dtype=example_input.numpy().dtype, ) ], ) assert isinstance(mlmodel, MLModel) package = tempfile.TemporaryDirectory() package.cleanup() package_path = package.name mlmodel.save(package_path) assert not os.path.exists(package_path) package_path = package_path + ".mlpackage" assert os.path.exists(package_path) shutil.rmtree(package_path) @pytest.mark.skipif(utils._macos_version() < (15, 0), reason="optimization hints available only on macOS15+") @pytest.mark.parametrize("reshapeFrequency, specializationStrategy", itertools.product( (ct.ReshapeFrequency.Frequent, ct.ReshapeFrequency.Infrequent, None), (ct.SpecializationStrategy.FastPrediction, ct.SpecializationStrategy.Default, None), )) def test_optimization_hints(self, reshapeFrequency, specializationStrategy): optimization_hints={} if reshapeFrequency is not None: optimization_hints['reshapeFrequency'] = reshapeFrequency if specializationStrategy is not None: optimization_hints['specializationStrategy'] = specializationStrategy if len(optimization_hints) == 0: optimization_hints = None m = MLModel(self.spec, optimization_hints=optimization_hints) assert isinstance(m, MLModel) assert(m.optimization_hints == optimization_hints) @pytest.mark.skipif(utils._macos_version() < (15, 0), reason="optimization hints available only on macOS15+") def test_optimization_hint_error_cases(self): with pytest.raises(TypeError, match='"optimization_hint_input" must be a dictionary'): MLModel(self.spec, optimization_hints=12) with pytest.raises(ValueError, match='Unrecognized key in 
optimization_hint dictionary: bad key'): MLModel(self.spec, optimization_hints={'bad key': ct.ReshapeFrequency.Frequent}) with pytest.raises(TypeError, match='"specializationStrategy" value of "optimization_hint_input" dictionary must be of type coremltools.SpecializationStrategy'): MLModel(self.spec, optimization_hints={"specializationStrategy": 12}) with pytest.raises(TypeError, match='"reshapeFrequency" value of "optimization_hint_input" dictionary must be of type coremltools.ReshapeFrequency'): MLModel(self.spec, optimization_hints={"reshapeFrequency": 12}) with pytest.raises(TypeError, match='"reshapeFrequency" value of "optimization_hint_input" dictionary must be of type coremltools.ReshapeFrequency'): # SpecializationStrategy value for ReshapeFrequency key MLModel(self.spec, optimization_hints={"reshapeFrequency": ct.SpecializationStrategy.Default}) class TestCompiledMLModel: @pytest.mark.skipif(ct.utils._macos_version() < (15, 0), reason="State only supported on macOS 15+") def test_state(self): """ Test prediction from a stateful model """ @mb.program( input_specs=[ mb.StateTensorSpec((1,), dtype=types.fp16), ], opset_version=ct.target.iOS18, ) def increment(x): # Read y = mb.read_state(input=x) # Update y = mb.add(x=y, y=np.array([1.0]).astype("float16")) # Write y = mb.coreml_update_state(state=x, value=y) # Return return y mlmodel = ct.convert( increment, convert_to="mlprogram", minimum_deployment_target=ct.target.iOS18, ) def extract_value(y): return list(y.values())[0][0] compiled_model = CompiledMLModel(mlmodel.get_compiled_model_path()) # Using first state state1 = compiled_model.make_state() for i in range(1, 5): y = compiled_model.predict({}, state=state1) assert extract_value(y) == i # rdar://126957030 ([State][Bug][Intel] Stateful model prediction is wrong on Intel laptop) if platform.machine() != "arm64": return # Use a new state state2 = compiled_model.make_state() for i in range(1, 5): y = compiled_model.predict({}, state=state2) assert extract_value(y) == i # Go back to using the first state for i in range(5, 10): y = compiled_model.predict({}, state=state1) assert extract_value(y) == i class TestSpecAndMLModelAPIs: def setup_class(self): # define an mlprogram, which has weights @mb.program(input_specs=[mb.TensorSpec(shape=(4, 500))]) def linear_prog(input): W = mb.const(val=np.random.rand(100, 500), name="const_W") out = mb.linear(x=input, weight=W, name="output") return out # define another mlprogram, which does not have weights @mb.program(input_specs=[mb.TensorSpec(shape=(4, 5, 2))]) def relu_prog(input): out = mb.relu(x=input, name="output") return out # convert and save model on disk self.mlmodel = coremltools.convert(linear_prog, convert_to="mlprogram") self.mlpackage_path = tempfile.mkdtemp(suffix=utils._MLPACKAGE_EXTENSION) self.mlmodel.save(self.mlpackage_path) self.mlmodel_no_weights = coremltools.convert(relu_prog, convert_to="mlprogram") def teardown_class(self): _remove_path(self.mlpackage_path) self.mlmodel = None self.mlmodel_no_weights = None def _test_mlmodel_correctness(self, mlmodel): """ :param mlmodel: coremltools.models.MLModel Test the following: - calling .predict on mlmodel works correctly - calling .save on mlmodel works correctly """ # construct input dictionary spec = mlmodel.get_spec() inputs = spec.description.input input_dict = {} for input in inputs: input_dict[input.name] = np.random.rand(*tuple(input.type.multiArrayType.shape)) # check prediction preds = mlmodel.predict(input_dict) assert preds is not None # save, load and predict 
again to check that the saving and loading worked correctly with tempfile.TemporaryDirectory(suffix=utils._MLPACKAGE_EXTENSION) as temp_path: mlmodel.save(temp_path) mlmodel_reloaded = MLModel(temp_path) preds = mlmodel_reloaded.predict(input_dict) assert preds is not None @pytest.mark.skipif(utils._macos_version() < (12, 0), reason="prediction on mlprogram model " "available only on macOS12+") def test_mlmodel_to_spec_to_mlmodel(self): """ convert mlmodel to spec, and then back to mlmodel and verify that it works """ spec = self.mlmodel.get_spec() # reload the model from the spec and verify it weights_dir = self.mlmodel.weights_dir mlmodel_from_spec = MLModel(spec, weights_dir=weights_dir) self._test_mlmodel_correctness(mlmodel_from_spec) # check that the original model still works self._test_mlmodel_correctness(self.mlmodel) # check that an error is raised when MLModel is initialized without the weights with pytest.raises(Exception, match="MLModel of type mlProgram cannot be loaded just from the model " "spec object. It also needs the path to the weights file. " "Please provide that as well, using the 'weights_dir' argument."): MLModel(spec) @pytest.mark.skipif(utils._macos_version() < (12, 0), reason="prediction on mlprogram model " "available only on macOS12+") def test_path_to_mlmodel_to_spec_to_mlmodel(self): """ load an mlmodel from disk, convert it to spec, and then convert the spec back to mlmodel """ mlmodel_from_disk = MLModel(self.mlpackage_path) spec = mlmodel_from_disk.get_spec() mlmodel_from_spec = MLModel(spec, weights_dir=mlmodel_from_disk.weights_dir) self._test_mlmodel_correctness(mlmodel_from_spec) @pytest.mark.skipif(utils._macos_version() < (12, 0), reason="prediction on mlprogram model " "available only on macOS12+") def test_path_to_spec_to_mlmodel(self): """ load a spec from disk, then convert it to mlmodel, and check that it works """ spec = utils.load_spec(self.mlpackage_path) weights_dir = self.mlpackage_path + "/Data/" + _MLPACKAGE_AUTHOR_NAME + "/weights" mlmodel = MLModel(spec, weights_dir=weights_dir) self._test_mlmodel_correctness(mlmodel) @pytest.mark.skipif(utils._macos_version() < (12, 0), reason="prediction on mlprogram model " "available only on macOS12+") def test_save_spec_api_mlprogram_without_weights_dir(self): """ save an mlpackage using the save_spec API. It should error out because no weights dir. """ spec = self.mlmodel.get_spec() with tempfile.TemporaryDirectory(suffix=utils._MLPACKAGE_EXTENSION) as model_path: # this should raise error: with pytest.raises(Exception, match="spec of type mlProgram cannot be saved without" " the weights file. Please provide the path to " "the weights file as well, using the 'weights_dir' argument."): utils.save_spec(spec, model_path) @pytest.mark.skipif( utils._macos_version() < (12, 0), reason="prediction on mlprogram model " "available only on macOS12+", ) def test_save_spec_api(self): """ save an mlpackage using the save_spec API. Reload the model from disk and verify it works """ spec = self.mlmodel.get_spec() with tempfile.TemporaryDirectory( suffix=utils._MLPACKAGE_EXTENSION ) as model_path: utils.save_spec(spec, model_path, weights_dir=self.mlmodel.weights_dir) model = MLModel(model_path) self._test_mlmodel_correctness(model) @pytest.mark.skipif(utils._macos_version() < (12, 0), reason="prediction on mlprogram model " "available only on macOS12+") def test_save_spec_api_model_with_no_weights(self): """ save an mlprogram model with no weights, using the save SPI and an empty weights directory. 
Reload the model from disk and verify it works """ spec = self.mlmodel_no_weights.get_spec() with tempfile.TemporaryDirectory(suffix=utils._MLPACKAGE_EXTENSION) as model_path: with tempfile.TemporaryDirectory() as empty_weight_dir: utils.save_spec(spec, model_path, weights_dir=empty_weight_dir) model = MLModel(model_path) self._test_mlmodel_correctness(model) @pytest.mark.skipif(utils._macos_version() < (12, 0), reason="prediction on mlprogram model " "available only on macOS12+") def test_mlmodel_to_spec_to_mlmodel_with_no_weights_model(self): """ convert mlmodel to spec, and then back to mlmodel and verify that it works """ spec = self.mlmodel_no_weights.get_spec() # if no weights_dir is passed, error will be raised with pytest.raises(Exception, match="MLModel of type mlProgram cannot be loaded just from the model " "spec object. It also needs the path to the weights file. " "Please provide that as well, using the 'weights_dir' argument."): MLModel(spec) # weights_dir will still exist, even though the model has no weights, # with a weights file that only has header and no data weights_dir = self.mlmodel_no_weights.weights_dir assert weights_dir is not None mlmodel_from_spec = MLModel(spec, weights_dir=weights_dir) self._test_mlmodel_correctness(mlmodel_from_spec) # load mlmodel from spec using an empty weights_dir with tempfile.TemporaryDirectory() as empty_weight_dir: mlmodel_from_spec = MLModel(spec, weights_dir=weights_dir) self._test_mlmodel_correctness(mlmodel_from_spec) def test_weights_path_correctness(self): """ test that after reloading an mlmodel from the spec, the weights path is updated """ spec = self.mlmodel.get_spec() original_weight_dir_path = self.mlmodel.weights_dir assert os.path.exists(original_weight_dir_path) # load mlmodel from spec: this will create a new mlpackage in a temp location # and copy over the weights mlmodel_reloaded = MLModel(spec, weights_dir=original_weight_dir_path) assert os.path.exists(mlmodel_reloaded.weights_dir) assert mlmodel_reloaded.weights_dir != original_weight_dir_path assert mlmodel_reloaded.weights_dir == mlmodel_reloaded.package_path + "/Data/" \ + _MLPACKAGE_AUTHOR_NAME + "/weights" def test_weights_dir_discovery_method(self): """ Test "coremltools.libmodelpackage.ModelPackage.findItemByNameAuthor" function """ mlpackage = ModelPackage(self.mlpackage_path) model_package_item_info = mlpackage.findItemByNameAuthor(_WEIGHTS_DIR_NAME, _MLPACKAGE_AUTHOR_NAME) weights_dir_path = model_package_item_info.path() assert weights_dir_path == self.mlpackage_path + "/Data/" + _MLPACKAGE_AUTHOR_NAME + "/weights" # verify that findItemByNameAuthor returns None, when item not found model_package_item_info = mlpackage.findItemByNameAuthor(_WEIGHTS_DIR_NAME, "inexistent_author_name") assert model_package_item_info is None ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2815473 coremltools-8.0/coremltools/test/neural_network/0000755000000000000000000000000014672075535021057 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/neural_network/__init__.py0000644000000000000000000000032714672066616023172 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/neural_network/test_compiled_model.py0000644000000000000000000001240014672066616025441 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools from shutil import copytree, rmtree from tempfile import TemporaryDirectory import pytest from coremltools import ComputeUnit, ReshapeFrequency, SpecializationStrategy, utils from coremltools.models import CompiledMLModel, MLModel from coremltools.models.utils import compile_model, load_spec, save_spec from coremltools.proto import Model_pb2 class TestCompiledModel: @classmethod def setup(self): spec = Model_pb2.Model() spec.specificationVersion = 1 input_ = spec.description.input.add() input_.name = 'x' input_.type.doubleType.MergeFromString(b"") output_ = spec.description.output.add() output_.name = 'y' output_.type.doubleType.MergeFromString(b"") lr = spec.glmRegressor lr.offset.append(0.1) weights = lr.weights.add() weights.value.append(2.0) spec.description.predictedFeatureName = 'y' self.spec = spec self.compiled_model_path = compile_model(self.spec) def teardown_class(self): rmtree(self.compiled_model_path) def _test_compile_model_path(self, compiled_model_path, compute_units=ComputeUnit.ALL): try: # Load compiled model model = CompiledMLModel(compiled_model_path, compute_units) # Single prediction y = model.predict({'x': 2}) assert y['y'] == 4.1 # Batch predictions y = model.predict([{'x': 2}, {'x': 4}]) assert y == [{'y': 4.1}, {'y': 8.1}] finally: rmtree(compiled_model_path) def test_mlmodel_file_input(self): with TemporaryDirectory() as save_dir: file_path = save_dir + '/m.mlmodel' MLModel(self.spec).save(file_path) with pytest.raises(TypeError, match=", first load the model, "): compiled_model_path = compile_model(file_path) def test_spec_input(self): compiled_model_path = compile_model(self.spec) self._test_compile_model_path(compiled_model_path) def test_mlmodel_input(self): ml_model = MLModel(self.spec) with pytest.raises(TypeError, match=" model has already been compiled."): compiled_model_path = compile_model(ml_model) def test_from_existing_mlmodel(self): ml_model = MLModel(self.spec) compiled_model_path = ml_model.get_compiled_model_path() with TemporaryDirectory() as temp_dir: dst_path = temp_dir + "/foo.mlmodelc" copytree(compiled_model_path, dst_path) del ml_model self._test_compile_model_path(dst_path) def test_non_default_compute_units(self): non_default_compute_units = (ComputeUnit.CPU_AND_GPU, ComputeUnit.CPU_AND_NE, ComputeUnit.CPU_ONLY) for cur_compute_unit in non_default_compute_units: compiled_model_path = compile_model(self.spec) self._test_compile_model_path(compiled_model_path, compute_units=cur_compute_unit) def test_destination_path_parameter(self): # Check correct usage with TemporaryDirectory() as temp_dir: dst_path = temp_dir + "/foo.mlmodelc" compiled_model_path = compile_model(self.spec, dst_path) self._test_compile_model_path(compiled_model_path) # Check bad input with TemporaryDirectory() as temp_dir: dst_path = temp_dir + "/foo.badFileExtension" with pytest.raises(Exception, match=" file extension."): compiled_model_path = 
compile_model(self.spec, dst_path) def test_save_load_spec(self): with TemporaryDirectory() as save_dir: file_path = save_dir + '/spec.mlmodel' save_spec(self.spec, file_path) my_spec = load_spec(file_path) compiled_model_path = compile_model(my_spec) self._test_compile_model_path(compiled_model_path) @pytest.mark.skipif(utils._macos_version() < (15, 0), reason="optimization hints available only on macOS15+") @pytest.mark.parametrize("reshapeFrequency, specializationStrategy", itertools.product( (ReshapeFrequency.Frequent, ReshapeFrequency.Infrequent, None), (SpecializationStrategy.FastPrediction, SpecializationStrategy.Default, None), )) def test_optimization_hints(self, reshapeFrequency, specializationStrategy): optimization_hints={} if reshapeFrequency is not None: optimization_hints['reshapeFrequency'] = reshapeFrequency if specializationStrategy is not None: optimization_hints["specializationStrategy"] = specializationStrategy if len(optimization_hints) == 0: optimization_hints = None m = CompiledMLModel(self.compiled_model_path, optimization_hints=optimization_hints) assert isinstance(m, CompiledMLModel) assert(m.optimization_hints == optimization_hints) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/neural_network/test_custom_neural_nets.py0000644000000000000000000000637014672066616026407 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os import tempfile import unittest import numpy as np import coremltools import coremltools.models.datatypes as datatypes from coremltools.models import neural_network as neural_network from coremltools.models.utils import _is_macos, _macos_version class SimpleTest(unittest.TestCase): def test_fixed_seq_len(self): """ Input has a fixed sequence length. 
(this happens when model is trained using padded sequences, inspiration: https://forums.developer.apple.com/thread/80407) (Seq,Batch,C,H,W) embedding: input shape (15,1,1,1,1) --> output shape (15,1,32,1,1) permute : input shape (15,1,32,1,1) --> output shape (1,1,32,1,15) flatten : input shape (1,1,32,1,15) --> output shape (1,1,32 * 15,1,1) dense : input shape (1,1,480,1,1) --> output shape (1,1,2,1,1) """ coreml_preds = [] input_dim = (1, 1, 1) output_dim = ( 1, 1, 1, ) # some random dimensions here: we are going to remove this information later input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", datatypes.Array(*output_dim))] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) # ADD Layers builder.add_embedding( "embed", W=np.random.rand(10, 32), b=None, input_dim=10, output_channels=32, has_bias=0, input_name="data", output_name="embed", ) builder.add_permute( "permute", dim=[3, 1, 2, 0], input_name="embed", output_name="permute" ) builder.add_flatten( "flatten", mode=0, input_name="permute", output_name="flatten" ) builder.add_inner_product( "dense", W=np.random.rand(480, 2), b=None, input_channels=480, output_channels=2, has_bias=0, input_name="flatten", output_name="output", ) # Remove output shape by deleting and adding an output del builder.spec.description.output[-1] output = builder.spec.description.output.add() output.name = "output" output.type.multiArrayType.dataType = coremltools.proto.FeatureTypes_pb2.ArrayFeatureType.ArrayDataType.Value( "DOUBLE" ) # save the model model_dir = tempfile.TemporaryDirectory() model_path = os.path.join(model_dir.name, "test_layer.mlmodel") coremltools.utils.save_spec(builder.spec, model_path) # prepare input and get predictions coreml_model = coremltools.models.MLModel(model_path) X = np.random.randint(low=0, high=10, size=15) X = np.reshape(X, (15, 1, 1, 1, 1)).astype(np.float32) coreml_input = {"data": X} if _is_macos() and _macos_version() >= (10, 13): coreml_preds = coreml_model.predict(coreml_input)["output"] self.assertEqual(len(coreml_preds.flatten()), 2) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/neural_network/test_model.py0000644000000000000000000005375314672066616023605 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os import tempfile import unittest import numpy as np import PIL.Image import coremltools from coremltools import ComputeUnit from coremltools._deps import _HAS_TORCH from coremltools.converters.mil import Builder as mb from coremltools.models import MLModel, datatypes from coremltools.models.neural_network import NeuralNetworkBuilder from coremltools.models.neural_network.utils import (make_image_input, make_nn_classifier) from coremltools.models.utils import ( _convert_neural_network_spec_weights_to_fp16, _is_macos, _macos_version, convert_double_to_float_multiarray_type, rename_feature, save_spec) from coremltools.proto import Model_pb2 if _HAS_TORCH: import torch as _torch class MLModelTest(unittest.TestCase): @classmethod def setUpClass(self): spec = Model_pb2.Model() spec.specificationVersion = coremltools.SPECIFICATION_VERSION features = ["feature_1", "feature_2"] output = "output" for f in features: input_ = spec.description.input.add() input_.name = f input_.type.doubleType.MergeFromString(b"") output_ = spec.description.output.add() output_.name = output output_.type.doubleType.MergeFromString(b"") lr = spec.glmRegressor lr.offset.append(0.1) weights = lr.weights.add() coefs = [1.0, 2.0] for i in coefs: weights.value.append(i) spec.description.predictedFeatureName = "output" self.spec = spec def test_model_creation(self): model = MLModel(self.spec) self.assertIsNotNone(model) filename = tempfile.mktemp(suffix=".mlmodel") save_spec(self.spec, filename) model = MLModel(filename) self.assertIsNotNone(model) def test_model_save_no_extension(self): model = MLModel(self.spec) self.assertIsNotNone(model) filename = tempfile.mktemp(suffix="") save_spec(self.spec, filename) # appends .mlmodel extension when it is not provided self.assertFalse(os.path.exists(filename)) filename = filename + ".mlmodel" self.assertTrue(os.path.exists(filename)) model = MLModel(filename) self.assertIsNotNone(model) os.remove(filename) def test_model_api(self): model = MLModel(self.spec) self.assertIsNotNone(model) model.author = "Test author" self.assertEqual(model.author, "Test author") self.assertEqual(model.get_spec().description.metadata.author, "Test author") model.license = "Test license" self.assertEqual(model.license, "Test license") self.assertEqual(model.get_spec().description.metadata.license, "Test license") model.short_description = "Test model" self.assertEqual(model.short_description, "Test model") self.assertEqual( model.get_spec().description.metadata.shortDescription, "Test model" ) model.version = "1.3" self.assertEqual(model.version, "1.3") self.assertEqual(model.get_spec().description.metadata.versionString, "1.3") model.input_description["feature_1"] = "This is feature 1" self.assertEqual(model.input_description["feature_1"], "This is feature 1") model.output_description["output"] = "This is output" self.assertEqual(model.output_description["output"], "This is output") filename = tempfile.mktemp(suffix=".mlmodel") model.save(filename) loaded_model = MLModel(filename) self.assertEqual(model.author, "Test author") self.assertEqual(model.license, "Test license") # self.assertEqual(model.short_description, 'Test model') self.assertEqual(model.input_description["feature_1"], "This is feature 1") self.assertEqual(model.output_description["output"], "This is output") @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 
13), "Only supported on macOS 10.13+" ) def test_predict_api(self): model = MLModel(self.spec) preds = model.predict({"feature_1": 1.0, "feature_2": 1.0}) self.assertIsNotNone(preds) self.assertEqual(preds["output"], 3.1) @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) def test_rename_input(self): rename_feature(self.spec, "feature_1", "renamed_feature", rename_inputs=True) model = MLModel(self.spec) preds = model.predict({"renamed_feature": 1.0, "feature_2": 1.0}) self.assertIsNotNone(preds) self.assertEqual(preds["output"], 3.1) # reset the spec for next run rename_feature(self.spec, "renamed_feature", "feature_1", rename_inputs=True) @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) def test_rename_input_bad(self): rename_feature(self.spec, "blah", "bad_name", rename_inputs=True) model = MLModel(self.spec) preds = model.predict({"feature_1": 1.0, "feature_2": 1.0}) self.assertIsNotNone(preds) self.assertEqual(preds["output"], 3.1) @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) def test_rename_output(self): rename_feature( self.spec, "output", "renamed_output", rename_inputs=False, rename_outputs=True, ) model = MLModel(self.spec) preds = model.predict({"feature_1": 1.0, "feature_2": 1.0}) self.assertIsNotNone(preds) self.assertEqual(preds["renamed_output"], 3.1) rename_feature( self.spec, "renamed_output", "output", rename_inputs=False, rename_outputs=True, ) @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) def test_rename_output_bad(self): rename_feature( self.spec, "blah", "bad_name", rename_inputs=False, rename_outputs=True ) model = MLModel(self.spec) preds = model.predict({"feature_1": 1.0, "feature_2": 1.0}) self.assertIsNotNone(preds) self.assertEqual(preds["output"], 3.1) @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) def test_future_version(self): self.spec.specificationVersion = 10000 filename = tempfile.mktemp(suffix=".mlmodel") save_spec(self.spec, filename, auto_set_specification_version=False) model = MLModel(filename) # this model should exist, but throw an exception when we try to use # predict because the engine doesn't support this model version self.assertIsNotNone(model) with self.assertRaises(Exception): try: model.predict({}) except Exception as e: assert "Core ML model specification version" in str(e) raise self.spec.specificationVersion = 1 @unittest.skipUnless( _is_macos() and _macos_version() < (10, 13), "Only supported on macOS 10.13-" ) def test_MLModel_warning(self): self.spec.specificationVersion = 3 import warnings with warnings.catch_warnings(record=True) as w: # Cause all warnings to always be triggered. 
warnings.simplefilter("always") model = MLModel(self.spec) assert len(w) == 1 assert issubclass(w[-1].category, RuntimeWarning) assert "not able to run predict()" in str(w[-1].message) self.spec.specificationVersion = 1 model = MLModel(self.spec) def test_convert_nn_spec_to_half_precision(self): # simple network with quantization layer input_features = [("data", datatypes.Array(3))] output_features = [("out", datatypes.Array(3))] builder = NeuralNetworkBuilder(input_features, output_features) weights = np.random.uniform(-0.5, 0.5, (3, 3)) builder.add_inner_product( name="inner_product", W=weights, b=None, input_channels=3, output_channels=3, has_bias=False, input_name="data", output_name="out", ) model = MLModel(builder.spec) spec = _convert_neural_network_spec_weights_to_fp16(model.get_spec()) self.assertIsNotNone(spec) # simple network without quantization layer input_features = [("data", datatypes.Array(3))] output_features = [("out", datatypes.Array(3))] builder = NeuralNetworkBuilder(input_features, output_features) builder.add_lrn( name="lrn", input_name="data", output_name="out", alpha=2, beta=3, local_size=1, k=8, ) model = MLModel(builder.spec) spec = _convert_neural_network_spec_weights_to_fp16(model.get_spec()) self.assertIsNotNone(spec) @unittest.skip def test_downgrade_specification_version(self): # manually set a invalid specification version self.spec.specificationVersion = -1 model = MLModel(self.spec) assert model.get_spec().specificationVersion == 1 # manually set a high specification version self.spec.specificationVersion = 4 filename = tempfile.mktemp(suffix=".mlmodel") save_spec(self.spec, filename, auto_set_specification_version=True) model = MLModel(filename) assert model.get_spec().specificationVersion == 1 # simple neural network with only spec 1 layer input_features = [("data", datatypes.Array(3))] output_features = [("out", datatypes.Array(3))] builder = NeuralNetworkBuilder(input_features, output_features) builder.add_activation("relu", "RELU", "data", "out") # set a high specification version builder.spec.specificationVersion = 3 model = MLModel(builder.spec) filename = tempfile.mktemp(suffix=".mlmodel") model.save(filename) # load the model back model = MLModel(filename) assert model.get_spec().specificationVersion == 1 # test save without automatic set specification version self.spec.specificationVersion = 3 filename = tempfile.mktemp(suffix=".mlmodel") save_spec(self.spec, filename, auto_set_specification_version=False) model = MLModel(filename) # the specification version should be original assert model.get_spec().specificationVersion == 3 def test_multiarray_type_convert_to_float(self): input_features = [("data", datatypes.Array(2))] output_features = [("out", datatypes.Array(2))] builder = NeuralNetworkBuilder(input_features, output_features) builder.add_ceil("ceil", "data", "out") spec = builder.spec self.assertEqual( spec.description.input[0].type.multiArrayType.dataType, Model_pb2.ArrayFeatureType.DOUBLE, ) self.assertEqual( spec.description.output[0].type.multiArrayType.dataType, Model_pb2.ArrayFeatureType.DOUBLE, ) convert_double_to_float_multiarray_type(spec) self.assertEqual( spec.description.input[0].type.multiArrayType.dataType, Model_pb2.ArrayFeatureType.FLOAT32, ) self.assertEqual( spec.description.output[0].type.multiArrayType.dataType, Model_pb2.ArrayFeatureType.FLOAT32, ) @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) def test_multiarray_to_image_input_util(self): H, W, C = 1, 1, 3 
input_features = [("data", datatypes.Array(C, H, W))] output_features = [("out", datatypes.Array(C, H, W))] builder = NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_activation("linear", "LINEAR", "data", "out") spec = builder.spec mlmodel = MLModel(spec, compute_units=ComputeUnit.CPU_ONLY) mlmodel = make_image_input( mlmodel, "data", red_bias=-5, green_bias=-6, blue_bias=-2.5, scale=10.0, image_format="NCHW", ) x = np.array([4, 2, 5], dtype=np.uint8) x = np.reshape(x, (H, W, C)) pil_img = PIL.Image.fromarray(x) y = mlmodel.predict({"data": pil_img})["out"] self.assertEqual(y.shape, (C, H, W)) np.testing.assert_almost_equal(y.flatten(), [35.0, 14.0, 47.5]) @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) def test_multiarray_to_image_input_util_transpose_elimination(self): H, W, C = 1, 1, 3 input_features = [("data", datatypes.Array(H, W, C))] output_features = [("out", datatypes.Array(H, W, C))] builder = NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_transpose("transpose", [2, 0, 1], "data", "transpose") builder.add_activation("linear", "LINEAR", "transpose", "out") spec = builder.spec mlmodel = MLModel(spec, compute_units=ComputeUnit.CPU_ONLY) mlmodel = make_image_input( mlmodel, "data", red_bias=-5, green_bias=-6, blue_bias=-2.5, scale=10.0, image_format="NHWC", ) x = np.array([4, 2, 5], dtype=np.uint8) x = np.reshape(x, (H, W, C)) pil_img = PIL.Image.fromarray(x) y = mlmodel.predict({"data": pil_img})["out"] self.assertEqual(y.shape, (H, W, C)) np.testing.assert_almost_equal(y.flatten(), [35.0, 14.0, 47.5]) @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) def test_multiarray_to_image_input_util_HWC_format(self): H, W, C = 1, 1, 3 input_features = [("data", datatypes.Array(H, W, C))] output_features = [("out", datatypes.Array(H, W, C))] builder = NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_activation("linear", "LINEAR", "data", "out") spec = builder.spec mlmodel = MLModel(spec, compute_units=ComputeUnit.CPU_ONLY) mlmodel = make_image_input( mlmodel, "data", red_bias=-5, green_bias=-6, blue_bias=-2.5, scale=10.0, image_format="NHWC", ) x = np.array([4, 2, 5], dtype=np.uint8) x = np.reshape(x, (H, W, C)) pil_img = PIL.Image.fromarray(x) y = mlmodel.predict({"data": pil_img})["out"] self.assertEqual(y.shape, (H, W, C)) np.testing.assert_almost_equal(y.flatten(), [35.0, 14.0, 47.5]) @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) def test_nn_classifier_util(self): input_features = [("data", datatypes.Array(3))] output_features = [("out", datatypes.Array(3))] builder = NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_activation("linear", "LINEAR", "data", "out") spec = builder.spec mlmodel = MLModel(spec, compute_units=ComputeUnit.CPU_ONLY) mlmodel = make_nn_classifier( mlmodel, class_labels=["a", "b", "c"], predicted_feature_name="out_confidence", predicted_probabilities_output="out", ) out_dict = mlmodel.predict({"data": np.array([4.0, 5.5, 6.0])}) self.assertEqual(out_dict["out_confidence"], "c") self.assertEqual( mlmodel.get_spec().WhichOneof("Type"), "neuralNetworkClassifier" ) @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) def 
test_nn_classifier_util_file(self): input_features = [("data", datatypes.Array(3))] output_features = [("out", datatypes.Array(3))] builder = NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_activation("linear", "LINEAR", "data", "out") spec = builder.spec mlmodel = MLModel(spec, compute_units=ComputeUnit.CPU_ONLY) class_labels = ["a", "b", "c"] with tempfile.NamedTemporaryFile(mode="w", suffix=".txt") as f: f.write("\n".join(class_labels)) f.flush() mlmodel = make_nn_classifier( mlmodel, class_labels=f.name, predicted_feature_name="out_confidence", predicted_probabilities_output="out", ) out_dict = mlmodel.predict({"data": np.array([4.0, 5.5, 6.0])}) self.assertEqual(out_dict["out_confidence"], "c") self.assertEqual( mlmodel.get_spec().WhichOneof("Type"), "neuralNetworkClassifier" ) @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) def test_rename_output_nn_classifier(self): input_features = [("data", datatypes.Array(3))] output_features = [("out", datatypes.Array(3))] builder = NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_activation("linear", "LINEAR", "data", "out") spec = builder.spec mlmodel = MLModel(spec) class_labels = ["a", "b", "c"] mlmodel = make_nn_classifier(mlmodel, class_labels=["a", "b", "c"]) # rename output spec = mlmodel.get_spec() rename_feature(spec, "out", "new_out_name") mlmodel = MLModel(spec, compute_units=ComputeUnit.CPU_ONLY) out_dict = mlmodel.predict({"data": np.array([4.0, 5.5, 6.0])}) self.assertEqual(out_dict["classLabel"], "c") self.assertTrue("new_out_name" in out_dict) self.assertTrue(isinstance(out_dict["new_out_name"], dict)) @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) def test_rename_image_input(self): input_features = [("data", datatypes.Array(3, 1, 1))] output_features = [("out", datatypes.Array(3, 1, 1))] builder = NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_activation("linear", "LINEAR", "data", "out") spec = builder.spec # make an image input mlmodel = make_image_input(MLModel(spec), "data", image_format="NCHW", scale=2.0) # rename the input spec = mlmodel.get_spec() rename_feature(spec, "data", "new_input_name") mlmodel = MLModel(spec, compute_units=ComputeUnit.CPU_ONLY) # test x = np.array([4, 5, 6], dtype=np.uint8).reshape(1, 1, 3) pil_img = PIL.Image.fromarray(x) out = mlmodel.predict({"new_input_name": pil_img})['out'] np.testing.assert_equal(out, np.array([8.0, 10.0, 12.0]).reshape(3, 1, 1)) @unittest.skipUnless( _is_macos() and _macos_version() >= (12, 0), "Only supported on macOS 12+" ) def test_rename_feature_mlprogram(self): @mb.program(input_specs=[mb.TensorSpec(shape=(3,))]) def linear_prog(input): W = np.ones((10, 3), dtype=np.float32) out = mb.linear(x=input, weight=W, name="output") return out model = coremltools.convert( linear_prog, convert_to='mlprogram' ) spec = model.get_spec() input_name = spec.description.input[0].name output_name = spec.description.output[0].name # rename input rename_feature(spec, input_name, "new_input_name") self.assertEqual(spec.description.input[0].name, "new_input_name") model = coremltools.models.MLModel(spec, weights_dir=model.weights_dir) out = model.predict({"new_input_name": np.array([1.0, 2.0, 3.0])})[output_name] self.assertEqual(out.shape, (10,)) self.assertEqual(out[0], 6.0) # rename output rename_feature(spec, 
output_name, "new_output_name") self.assertEqual(spec.description.output[0].name, "new_output_name") model = coremltools.models.MLModel(spec, weights_dir=model.weights_dir) out = model.predict({"new_input_name": np.array([1.0, 2.0, 3.0])})["new_output_name"] self.assertEqual(out.shape, (10,)) self.assertEqual(out[1], 6.0) @unittest.skipUnless( _is_macos() and _macos_version() >= (12, 0) and _HAS_TORCH, "Only supported on macOS 12+" ) def test_rename_feature_classifier_mlprogram(self): torch_model = _torch.nn.ReLU().eval() model = coremltools.convert( _torch.jit.trace(torch_model, _torch.rand(3, )), inputs=[coremltools.TensorType(shape=(3,))], classifier_config=coremltools.ClassifierConfig(['a', 'b', 'c']), convert_to='mlprogram' ) spec = model.get_spec() input_name = spec.description.input[0].name rename_feature(spec, 'classLabel', 'highestProbClass') model = coremltools.models.MLModel(spec, weights_dir=model.weights_dir) output_class = model.predict({input_name: np.array([1.0, 2.0, 3.0])})['highestProbClass'] self.assertEqual(output_class, 'c') if __name__ == "__main__": unittest.main() # suite = unittest.TestSuite() # suite.addTest(MLModelTest('test_multiarray_type_convert_to_float')) # unittest.TextTestRunner().run(suite) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/neural_network/test_neural_networks.py0000644000000000000000000000401514672066616025712 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import unittest import coremltools from coremltools.models.utils import (_get_custom_layer_names, _replace_custom_layer_name) from coremltools.proto import Model_pb2 class CustomLayerUtilsTest(unittest.TestCase): @classmethod def setUpClass(self): spec = Model_pb2.Model() spec.specificationVersion = coremltools.SPECIFICATION_VERSION features = ["feature_1", "feature_2"] output = "output" for f in features: input_ = spec.description.input.add() input_.name = f input_.type.doubleType.MergeFromString(b"") output_ = spec.description.output.add() output_.name = output output_.type.doubleType.MergeFromString(b"") layer = spec.neuralNetwork.layers.add() layer.name = "custom1" layer.input.append("input") layer.output.append("temp1") layer.custom.className = "name1" layer2 = spec.neuralNetwork.layers.add() layer2.name = "custom2" layer2.input.append("temp1") layer2.output.append("temp2") layer2.custom.className = "name2" layer3 = spec.neuralNetwork.layers.add() layer3.name = "custom3" layer3.input.append("temp2") layer3.output.append("output") layer3.custom.className = "name1" self.spec = spec def test_get_custom_names(self): names = _get_custom_layer_names(self.spec) self.assertEqual(names, {"name1", "name2"}) def test_change_custom_name(self): _replace_custom_layer_name(self.spec, "name1", "notname1") names = _get_custom_layer_names(self.spec) self.assertEqual(names, {"notname1", "name2"}) # set it back for future tests _replace_custom_layer_name(self.spec, "notname1", "name1") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/neural_network/test_nn_builder.py0000644000000000000000000005675714672066616024635 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np import pytest import unittest import coremltools from coremltools import ComputeUnit from coremltools.converters.mil.mil.types.type_mapping import np_val_to_py_type from coremltools.models import MLModel, datatypes from coremltools.models.neural_network import NeuralNetworkBuilder from coremltools.models.neural_network.quantization_utils import ( _convert_array_to_nbit_quantized_bytes, quantize_weights) from coremltools.models.utils import _is_macos, _macos_version MIN_MACOS_VERSION_REQUIRED = (10, 13) LAYERS_10_14_MACOS_VERSION = (10, 14) LAYERS_10_15_MACOS_VERSION = (10, 15) @unittest.skipIf( not _is_macos() or _macos_version() < LAYERS_10_15_MACOS_VERSION, "Only supported on macOS 10.15+", ) class ControlFlowCorrectnessTest(unittest.TestCase): @classmethod def setup_class(cls): pass def runTest(): pass def _test_model(self, model, input_dict, output_ref, delta=1e-2): preds = model.predict(input_dict) for name in output_ref: ref_val = output_ref[name] val = preds[name] self.assertTrue(np.allclose(val, ref_val, rtol=delta)) def test_simple_branch(self): """ Test a simple if-else branch network """ input_features = [("data", datatypes.Array(3)), ("cond", datatypes.Array(1))] output_features = [("output", None)] builder_top = NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) layer = builder_top.add_branch("branch_layer", "cond") builder_ifbranch = NeuralNetworkBuilder( input_features=None, output_features=None, spec=None, nn_spec=layer.branch.ifBranch, ) builder_ifbranch.add_elementwise( "mult_layer", input_names=["data"], output_name="output", mode="MULTIPLY", alpha=10, ) builder_elsebranch = NeuralNetworkBuilder( input_features=None, output_features=None, spec=None, nn_spec=layer.branch.elseBranch, ) builder_elsebranch.add_elementwise( "add_layer", input_names=["data"], output_name="output", mode="ADD", alpha=10, ) coremltools.models.utils.save_spec( builder_top.spec, "/tmp/simple_branch.mlmodel" ) mlmodel = MLModel(builder_top.spec) # True branch case input_dict = { "data": np.array(range(1, 4), dtype="float"), "cond": np.array([1], dtype="float"), } output_ref = {"output": input_dict["data"] * 10} self._test_model(mlmodel, input_dict, output_ref) # False branch case input_dict["cond"] = np.array([0], dtype="float") output_ref["output"] = input_dict["data"] + 10 self._test_model(mlmodel, input_dict, output_ref) def test_simple_loop_fixed_iterations(self): input_features = [("data", datatypes.Array(1))] output_features = [("output", None)] builder_top = NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder_top.add_copy("copy_1", input_name="data", output_name="output") loop_layer = builder_top.add_loop("loop_layer") loop_layer.loop.maxLoopIterations = 5 builder_body = NeuralNetworkBuilder( input_features=None, output_features=None, spec=None, nn_spec=loop_layer.loop.bodyNetwork, ) builder_body.add_elementwise( "add", input_names=["output"], output_name="x", mode="ADD", alpha=2 ) builder_body.add_copy("copy_2", input_name="x", output_name="output") coremltools.models.utils.save_spec( builder_top.spec, "/tmp/simple_loop_fixed_iterations.mlmodel" ) mlmodel = MLModel(builder_top.spec) # True branch case input_dict = {"data": np.array([0], dtype="float")} output_ref = {"output": np.array([10], dtype="float")} self._test_model(mlmodel, 
input_dict, output_ref) @unittest.skipUnless( _is_macos() and _macos_version() >= LAYERS_10_14_MACOS_VERSION, "Only supported on macOS 10.14+", ) class BasicNumericCorrectnessTest_1014NewLayers(unittest.TestCase): def build_quant_conv_layer( self, W=None, quantization_type="linear", nbits=8, quant_scale=None, quant_bias=None, quant_lut=None, output_channels=2, ): input_features = [("data", datatypes.Array(1, 2, 2))] output_features = [("out", datatypes.Array(2, 1, 1))] builder = NeuralNetworkBuilder(input_features, output_features) builder.add_convolution( name="conv", kernel_channels=1, output_channels=output_channels, height=2, width=2, stride_height=1, stride_width=1, border_mode="valid", groups=1, W=W, b=None, has_bias=False, input_name="data", output_name="out", quantization_type=quantization_type, nbits=nbits, quant_scale=quant_scale, quant_bias=quant_bias, quant_lut=quant_lut, ) return MLModel(builder.spec, compute_units=ComputeUnit.CPU_ONLY) def test_linear_quant_convolution_8bit(self): W = np.ones((2, 2, 1, 2), dtype=np.uint8) W[:, :, :, 1] = 2 mlmodel = self.build_quant_conv_layer( W=W.flatten().tobytes(), quantization_type="linear", nbits=8, quant_scale=[4.0], quant_bias=[-2.0], ) data = np.ones((1, 2, 2)) data_dict = {"data": data} out = mlmodel.predict(data_dict)["out"] expected_out = np.reshape(np.array([8, 24]), (2, 1, 1)) self.assertTrue(np.allclose(out, expected_out)) def test_linear_quant_convolution_8bit_vector_scalebias(self): W = np.ones((2, 2, 1, 2), dtype=np.uint8) W[:, :, :, 1] = 2 mlmodel = self.build_quant_conv_layer( W=W.flatten().tobytes(), quantization_type="linear", nbits=8, quant_scale=[4.0, 5.0], quant_bias=[-2.0, 1.0], ) data = np.ones((1, 2, 2)) data_dict = {"data": data} out = mlmodel.predict(data_dict)["out"] expected_out = np.reshape(np.array([8, 44]), (2, 1, 1)) self.assertTrue(np.allclose(out, expected_out)) def test_linear_quant_convolution_8bit_float_scale_and_bias(self): W = np.array(([[[[1, 248], [248, 248]]]]), dtype=np.uint8) mlmodel = self.build_quant_conv_layer( W=W.flatten().tobytes(), quantization_type="linear", nbits=8, quant_scale=[15], quant_bias=[-3913], output_channels=1, ) data = np.ones((1, 2, 2)) data_dict = {"data": data} out = mlmodel.predict(data_dict)["out"] # Output should be equal to: (scale*(1+248+248+248)+(4*bias)) expected_out = np.reshape(np.array([-4477]), (1, 1, 1, 1, 1)) self.assertTrue(np.allclose(out, expected_out)) def test_lut_quant_convolution_2bit(self): W = np.zeros((2, 2, 1, 2), dtype=np.uint8) W[:, :, :, 0] = 0 W[:, :, :, 1] = 2 W = _convert_array_to_nbit_quantized_bytes(W.flatten(), 2).tobytes() mlmodel = self.build_quant_conv_layer( W=W, quantization_type="lut", nbits=2, quant_lut=[10.0, 11.0, -3.0, -1.0] ) data = np.ones((1, 2, 2)) data_dict = {"data": data} out = mlmodel.predict(data_dict)["out"] expected_out = np.reshape(np.array([40, -12]), (2, 1, 1)) self.assertTrue(np.allclose(out, expected_out)) def test_linear_quant_inner_product_3bit(self): pytest.xfail("rdar://101370330 ([CI] nnv1 model compression tests are failing after roots is updated)") W = np.reshape(np.arange(6), (2, 3)).astype(np.uint8) input_features = [("data", datatypes.Array(3))] output_features = [("probs", None)] builder = NeuralNetworkBuilder(input_features, output_features) builder.add_inner_product( name="ip1", W=_convert_array_to_nbit_quantized_bytes(W.flatten(), 3).tobytes(), b=None, input_channels=3, output_channels=2, has_bias=False, input_name="data", output_name="probs", quantization_type="linear", nbits=3, quant_scale=[11.0, 
2.0], quant_bias=[-2.0, 10.0], ) mlmodel = MLModel(builder.spec) data = np.array([1.0, 3.0, 5.0]) data_dict = {"data": data} probs = mlmodel.predict(data_dict)["probs"] expected_out = np.array([125, 170]) self.assertTrue(np.allclose(probs.flatten(), expected_out.flatten())) def test_lut_quant_inner_product_1bit(self): pytest.xfail("rdar://101370330 ([CI] nnv1 model compression tests are failing after roots is updated)") W = np.zeros((2, 3), dtype=np.uint8) W[0, :] = [0, 1, 1] W[1, :] = [1, 0, 0] input_features = [("data", datatypes.Array(3))] output_features = [("probs", None)] builder = NeuralNetworkBuilder(input_features, output_features) builder.add_inner_product( name="ip1", W=_convert_array_to_nbit_quantized_bytes(W.flatten(), 1).tobytes(), b=None, input_channels=3, output_channels=2, has_bias=False, input_name="data", output_name="probs", quantization_type="lut", nbits=1, quant_lut=[5.0, -3.0], ) mlmodel = MLModel(builder.spec) data = np.array([1.0, 3.0, 5.0]) data_dict = {"data": data} probs = mlmodel.predict(data_dict)["probs"] expected_out = np.array([-19, 37]) self.assertTrue(np.allclose(probs.flatten(), expected_out.flatten())) @unittest.skipUnless( _is_macos() and _macos_version() >= LAYERS_10_15_MACOS_VERSION, "Only supported on macOS 10.15+", ) class BasicNumericCorrectnessTest_1015NewLayers(unittest.TestCase): def test_linear_quant_batchedmatmul_5bit(self): W = np.zeros((2, 3), dtype=np.uint8) W[0, :] = [31, 20, 11] W[1, :] = [1, 0, 8] quant_scale = np.reshape(np.array([10.0, 2.0, 3.0]), (1, 3)) quant_bias = np.reshape(np.array([-2.0, -10.0, 6.0]), (1, 3)) W_unquantized = np.broadcast_to(quant_scale, (2, 3)) * W + np.broadcast_to( quant_bias, (2, 3) ) bias = np.array([1.0, 2.0, 3.0]) input_features = [("data", datatypes.Array(2, 2))] output_features = [("out", None)] builder = NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_batched_mat_mul( name="batched_matmul", input_names=["data"], output_name="out", weight_matrix_rows=2, weight_matrix_columns=3, W=_convert_array_to_nbit_quantized_bytes(W.flatten(), 5).tobytes(), bias=bias, is_quantized_weight=True, quantization_type="linear", nbits=5, quant_scale=quant_scale.flatten(), quant_bias=quant_bias.flatten(), ) mlmodel = MLModel(builder.spec, compute_units=ComputeUnit.CPU_ONLY) data = np.zeros((2, 2), dtype=np.float32) data[0, :] = [5, 6] data[1, :] = [10, 12] data_dict = {"data": data} out = mlmodel.predict(data_dict)["out"] expected_out = np.matmul(data, W_unquantized) + bias self.assertTrue(out.shape == expected_out.shape) self.assertTrue(np.allclose(out.flatten(), expected_out.flatten())) def test_linear_quant_batchedmatmul_8bit(self): np.random.seed(1988) W = np.random.rand(32, 32) * 2.0 - 1 bias = np.random.rand(32) input_features = [("data", datatypes.Array(2, 32))] output_features = [("out", None)] builder = NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_batched_mat_mul( name="batched_matmul", input_names=["data"], output_name="out", weight_matrix_rows=32, weight_matrix_columns=32, W=W, bias=bias, ) mlmodel = MLModel(builder.spec, compute_units=ComputeUnit.CPU_ONLY) q_mlmodel = quantize_weights(mlmodel, 8) q_spec = q_mlmodel.get_spec() q_layer = q_spec.neuralNetwork.layers[0].batchedMatmul self.assertTrue(len(q_layer.weights.floatValue) == 0) self.assertTrue(len(q_layer.weights.rawValue) > 0) data = np.random.rand(2, 32) data_dict = {"data": data} out = q_mlmodel.predict(data_dict)["out"] expected_out = 
np.matmul(data, W) + bias self.assertTrue(out.shape == expected_out.shape) self.assertTrue(np.allclose(out.flatten(), expected_out.flatten(), atol=0.1)) def test_lut_quant_embedding_nd_2bit(self): embed_size = 2 vocab_size = 3 W = np.zeros((embed_size, vocab_size), dtype=np.uint8) W[:, 0] = [1, 0] W[:, 1] = [0, 1] W[:, 2] = [3, 2] bias = np.array([1.0, 2.0]) quant_lut = np.array([34.0, 12.0, -6.0, 6.0]) input_features = [("data", datatypes.Array(4, 1))] output_features = [("out", None)] builder = NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_embedding_nd( name="embedding_nd", input_name="data", output_name="out", vocab_size=vocab_size, embedding_size=embed_size, W=_convert_array_to_nbit_quantized_bytes(W.flatten(), 2).tobytes(), b=bias, is_quantized_weight=True, quantization_type="lut", nbits=2, quant_lut=quant_lut, ) mlmodel = MLModel(builder.spec, compute_units=ComputeUnit.CPU_ONLY) data = np.reshape(np.array([2.0, 2.0, 1.0, 0.0]), (4, 1)) data_dict = {"data": data} out = mlmodel.predict(data_dict)["out"] expected_out = np.zeros((4, embed_size), dtype=np.float32) expected_out[0, :] = [quant_lut[W[0, 2]], quant_lut[W[1, 2]]] + bias expected_out[1, :] = [quant_lut[W[0, 2]], quant_lut[W[1, 2]]] + bias expected_out[2, :] = [quant_lut[W[0, 1]], quant_lut[W[1, 1]]] + bias expected_out[3, :] = [quant_lut[W[0, 0]], quant_lut[W[1, 0]]] + bias self.assertTrue(out.shape == expected_out.shape) self.assertTrue(np.allclose(out.flatten(), expected_out.flatten())) def test_linear_quant_embedding_7bit(self): embed_size = 2 vocab_size = 3 W = np.zeros((embed_size, vocab_size), dtype=np.uint8) W[:, 0] = [100, 127] W[:, 1] = [20, 40] W[:, 2] = [90, 1] quant_scale = np.reshape(np.array([10.0, 2.0]), (2, 1)) quant_bias = np.reshape(np.array([-2.0, -10.0]), (2, 1)) W_unquantized = np.broadcast_to(quant_scale, (2, 3)) * W + np.broadcast_to( quant_bias, (2, 3) ) bias = np.reshape(np.array([1.0, 2.0]), (2, 1)) W_unquantized = W_unquantized + np.broadcast_to(bias, (2, 3)) input_features = [("data", datatypes.Array(4, 1, 1, 1))] output_features = [("out", None)] builder = NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_embedding( name="embed", W=_convert_array_to_nbit_quantized_bytes(W.flatten(), 7).tobytes(), b=bias, input_dim=vocab_size, output_channels=embed_size, has_bias=True, input_name="data", output_name="out", is_quantized_weight=True, quantization_type="linear", nbits=7, quant_scale=np_val_to_py_type(quant_scale), quant_bias=np_val_to_py_type(quant_bias), ) mlmodel = MLModel(builder.spec, compute_units=ComputeUnit.CPU_ONLY) data = np.reshape(np.array([2.0, 2.0, 1.0, 0.0]), (4, 1, 1, 1)) data_dict = {"data": data} out = mlmodel.predict(data_dict)["out"] self.assertTrue(out.shape == (4, embed_size, 1, 1)) expected_out = np.zeros((4, embed_size), dtype=np.float32) expected_out[0, :] = W_unquantized[:, 2].flatten() expected_out[1, :] = W_unquantized[:, 2].flatten() expected_out[2, :] = W_unquantized[:, 1].flatten() expected_out[3, :] = W_unquantized[:, 0].flatten() self.assertTrue(np.allclose(out.flatten(), expected_out.flatten())) @unittest.skipIf( not _is_macos() or _macos_version() < (10, 13), "Only supported on macOS 10.13+" ) class BasicNumericCorrectnessTest(unittest.TestCase): def _build_nn_with_one_ip_layer(self): input_features = [("data", datatypes.Array(3))] output_features = [("out", None)] builder = NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) 
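        # This helper builds a throwaway network with a single 3x3 inner-product
        # layer; the tests that use it only edit and inspect the generated spec
        # (set_input, set_output, preprocessing parameters) and never call predict().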
w = np.random.uniform(-0.5, 0.5, (3, 3)) builder.add_inner_product( name="ip1", W=w, b=None, input_channels=3, output_channels=3, has_bias=False, input_name="input", output_name="hidden", ) return builder def test_undefined_shape_single_output(self): W = np.ones((3, 3)) input_features = [("data", datatypes.Array(3))] output_features = [("probs", None)] builder = NeuralNetworkBuilder(input_features, output_features) builder.add_inner_product( name="ip1", W=W, b=None, input_channels=3, output_channels=3, has_bias=False, input_name="data", output_name="probs", ) mlmodel = MLModel(builder.spec) data = np.ones((3,)) data_dict = {"data": data} probs = mlmodel.predict(data_dict)["probs"] self.assertTrue(np.allclose(probs, np.ones(3) * 3)) def test_set_input(self): builder = self._build_nn_with_one_ip_layer() builder.set_input(input_names=["data_renamed"], input_dims=[(2,)]) self.assertEqual( builder.spec.description.input[0].type.multiArrayType.shape[0], 2 ) self.assertEqual(builder.spec.description.input[0].name, "data_renamed") def test_set_input_fail(self): builder = self._build_nn_with_one_ip_layer() # fails since input_names and input_dims do not have same size with self.assertRaises(ValueError): builder.set_input(input_names=["data_1", "data_2"], input_dims=[(3,)]) def test_set_output(self): builder = self._build_nn_with_one_ip_layer() builder.set_output(output_names=["out_renamed"], output_dims=[(2,)]) self.assertEqual( builder.spec.description.output[0].type.multiArrayType.shape[0], 2 ) self.assertEqual(builder.spec.description.output[0].name, "out_renamed") def test_set_output_fail(self): builder = self._build_nn_with_one_ip_layer() # fails since output_names and output_dims do not have same size with self.assertRaises(ValueError): builder.set_output(output_names=["out_1", "out_2"], output_dims=[(3,)]) def test_invalid_image_preprocessing_params(self): builder = self._build_nn_with_one_ip_layer() image_input_names = ["input1", "input2"] with self.assertRaises(ValueError): image_scale = {"invalid": 1.0 / 255.0} builder.set_pre_processing_parameters( image_input_names=image_input_names, image_scale=image_scale ) with self.assertRaises(ValueError): red_bias = {"invalid": -1} builder.set_pre_processing_parameters( image_input_names=image_input_names, red_bias=red_bias ) with self.assertRaises(ValueError): blue_bias = {"invalid": -1} builder.set_pre_processing_parameters( image_input_names=image_input_names, blue_bias=blue_bias ) with self.assertRaises(ValueError): green_bias = {"invalid": -1} builder.set_pre_processing_parameters( image_input_names=image_input_names, green_bias=green_bias ) with self.assertRaises(ValueError): gray_bias = {"invalid": -1} builder.set_pre_processing_parameters( image_input_names=image_input_names, gray_bias=gray_bias ) with self.assertRaises(ValueError): is_bgr = {"invalid": False} builder.set_pre_processing_parameters( image_input_names=image_input_names, is_bgr=is_bgr ) @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) class UseFloatArraytypeTest(unittest.TestCase): """Test that the boolean flag `use_float_arraytype` correctly changes the datatype of the network's inputs and outputs and produces a spec that the `MLModel` class can call `predict` with. 
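    When the flag is set, every multiArrayType input and output in the generated
    spec is expected to report the FLOAT32 dataType instead of the default DOUBLE.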
""" def _test_use_float_array_helper(self, use_float_arraytype): input_features = [("data", datatypes.Array(3))] output_features = [("probs", None)] builder = NeuralNetworkBuilder( input_features=input_features, output_features=output_features, use_float_arraytype=use_float_arraytype, ) weights = np.ones((3, 3)) builder.add_inner_product( name="ip1", W=weights, b=None, input_channels=3, output_channels=3, has_bias=False, input_name="data", output_name="probs", ) spec = builder.spec array_feature_type = ( coremltools.proto.FeatureTypes_pb2.ArrayFeatureType.FLOAT32 if use_float_arraytype else coremltools.proto.FeatureTypes_pb2.ArrayFeatureType.DOUBLE ) for input in spec.description.input: self.assertEqual(input.type.multiArrayType.dataType, array_feature_type) for output in spec.description.input: self.assertEqual(output.type.multiArrayType.dataType, array_feature_type) # Assert that the generated spec is functional mlmodel = MLModel(spec) data = np.ones((3,)) data_dict = {"data": data} try: predictions = mlmodel.predict(data_dict) except Exception as e: self.fail(e) self.assertTrue(np.allclose(predictions["probs"], np.ones(3) * 3)) def test_true_use_float_array(self): # Instruct the builder to use the Float32 datatype for inputs and outputs self._test_use_float_array_helper(True) def test_false_use_float_array(self): # Instruct the builder to use its default Double datatype for inputs and outputs self._test_use_float_array_helper(False) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/neural_network/test_numpy_nn_layers.py0000644000000000000000000104664014672066616025725 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import math import os import platform import random import tempfile import unittest import numpy as np import pytest from coremltools._deps import _HAS_TF_2, MSG_TF2_NOT_FOUND if _HAS_TF_2: import tensorflow as tf import torch import coremltools import coremltools.models.datatypes as datatypes from coremltools import ComputeUnit from coremltools.converters.mil.mil.ops.defs._utils import aggregated_pad from coremltools.models import (_MLMODEL_FULL_PRECISION, _MLMODEL_HALF_PRECISION, neural_network) from coremltools.models.neural_network import flexible_shape_utils from coremltools.models.utils import _MODEL_FILE_NAME, _is_macos, _macos_version np.random.seed(10) MIN_MACOS_VERSION_REQUIRED = (10, 13) LAYERS_10_15_MACOS_VERSION = (10, 15) LAYERS_11_0_MACOS_VERSION = (11, 0) def _get_unary_model_spec(x, mode, alpha=1.0): input_dim = x.shape input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", datatypes.Array(*input_dim))] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) builder.add_unary( name="unary", input_name="data", output_name="output", mode=mode, alpha=alpha ) return builder.spec class CorrectnessTest(unittest.TestCase): def runTest(self): pass def _compare_shapes(self, np_preds, coreml_preds): return np.squeeze(np_preds).shape == np.squeeze(coreml_preds).shape def _test_shape_equality(self, np_preds, coreml_preds): np.testing.assert_array_equal( np.squeeze(coreml_preds).shape, np.squeeze(np_preds).shape ) def _test_nd_shape_equality(self, np_preds, coreml_preds, shape=()): if shape: np.testing.assert_array_equal(coreml_preds.shape, 
shape) else: # check if shape has 0 valued dimension if np.prod(np_preds.shape) == 0 and np.prod(coreml_preds.shape) == 0: return np.testing.assert_array_equal(coreml_preds.shape, np_preds.shape) def _compare_predictions(self, np_preds, coreml_preds, delta=0.01): np_preds = np_preds.flatten() coreml_preds = coreml_preds.flatten() max_arr = np.maximum(np.maximum(np_preds, coreml_preds), 1.0) all_deltas = np.abs(np_preds / max_arr - coreml_preds / max_arr) max_delta = np.amax(all_deltas) if max_delta > delta: return False return True def _test_predictions( self, np_preds, coreml_preds, delta=0.01, test_metric="rel_error", SNR=30, PSNR=40, ): np_preds = np_preds.flatten() coreml_preds = coreml_preds.flatten() if test_metric == "rel_error": max_arr = np.maximum(np.abs(np_preds), 1.0) all_deltas = np.abs(np_preds / max_arr - coreml_preds / max_arr) max_delta = np.amax(all_deltas, initial=0) self.assertLessEqual( max_delta, delta, "Expected %s to be within %s of %s" % (coreml_preds, delta, np_preds), ) elif test_metric == "SNR": noise = np_preds - coreml_preds noise_var = np.sum(noise ** 2) / len(noise) + 1e-7 signal_energy = np.sum(np_preds ** 2) / len(np_preds) max_signal_energy = np.amax(np_preds ** 2) snr = 10 * np.log10(signal_energy / noise_var) psnr = 10 * np.log10(max_signal_energy / noise_var) self.assertGreaterEqual(snr, SNR) self.assertGreaterEqual(psnr, PSNR) else: raise ValueError("Test metric not supported") @staticmethod def _compare_moments(model, inputs, expected, use_cpu_only=True, num_moments=10): """ This utility function is used for validate random distributions layers. It validates the first 10 moments of prediction and expected values. """ def get_moment(data, k): return np.mean(np.power(data - np.mean(data), k)) if isinstance(model, str): model = coremltools.models.MLModel(model) if use_cpu_only: compute_unit=ComputeUnit.CPU_ONLY else: compute_unit=ComputeUnit.ALL model = coremltools.models.MLModel(model, compute_units=compute_unit) prediction = model.predict(inputs) for output_name in expected: np_preds = expected[output_name] coreml_preds = prediction[output_name] np_moments = [get_moment(np_preds.flatten(), k) for k in range(num_moments)] coreml_moments = [ get_moment(coreml_preds.flatten(), k) for k in range(num_moments) ] np.testing.assert_almost_equal(np_moments, coreml_moments, decimal=2) # override expected values to allow element-wise compares for output_name in expected: expected[output_name] = prediction[output_name] def _test_model( self, model, input, expected, model_precision=_MLMODEL_FULL_PRECISION, useCPUOnly=False, output_name_shape_dict={}, validate_shapes_only=False, test_metric="rel_error", delta=0.01, SNR=30, ): if useCPUOnly: compute_unit = ComputeUnit.CPU_ONLY else: compute_unit = ComputeUnit.ALL # if we're given a path to a model if isinstance(model, str): model = coremltools.models.MLModel(model, compute_units=compute_unit) # If we're passed in a specification, save out the model and then load it back. 
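        # A bare protobuf spec has no predict() of its own, so it is written to a
        # temporary .mlmodel file and reloaded as an MLModel bound to the requested
        # compute unit.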
elif isinstance(model, coremltools.proto.Model_pb2.Model): tmp_model_file = tempfile.NamedTemporaryFile(suffix=_MODEL_FILE_NAME) coremltools.utils.save_spec(model, tmp_model_file.name) model = coremltools.models.MLModel( tmp_model_file.name, compute_units=compute_unit ) # If we want to test the half precision case if model_precision == _MLMODEL_HALF_PRECISION: model = coremltools.utils._convert_neural_network_weights_to_fp16(model) prediction = model.predict(input) for output_name in expected: if self.__class__.__name__ == "SimpleTest": self._test_shape_equality( expected[output_name], prediction[output_name] ) else: if output_name in output_name_shape_dict: output_shape = output_name_shape_dict[output_name] else: output_shape = [] if len(output_shape) == 0 and len(expected[output_name].shape) == 0: output_shape = (1,) self._test_nd_shape_equality( expected[output_name], prediction[output_name], output_shape ) if not validate_shapes_only: self._test_predictions( expected[output_name], prediction[output_name], delta=delta, test_metric=test_metric, SNR=SNR, ) @unittest.skipIf( not _is_macos() or _macos_version() < MIN_MACOS_VERSION_REQUIRED, "macOS 10.13+ is required. Skipping tests.", ) class SimpleTest(CorrectnessTest): def test_tiny_upsample_linear_mode(self): input_dim = (1, 1, 3) # (C,H,W) input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) builder.add_upsample( name="upsample", scaling_factor_h=2, scaling_factor_w=3, input_name="data", output_name="output", mode="BILINEAR", ) input = {"data": np.reshape(np.array([1.0, 2.0, 3.0]), (1, 1, 3))} expected = { "output": np.array( [ [1, 1.333, 1.666, 2, 2.333, 2.666, 3, 3, 3], [1, 1.333, 1.6666, 2, 2.33333, 2.6666, 3, 3, 3], ] ) } self._test_model(builder.spec, input, expected) self.assertEqual(len(input_dim), builder._get_rank("output")) def test_LRN(self): input_dim = (1, 3, 3) input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", datatypes.Array(*input_dim))] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) builder.add_lrn( name="lrn", input_name="data", output_name="output", alpha=2, beta=3, local_size=1, k=8, ) input = {"data": np.ones((1, 3, 3))} expected = {"output": 1e-3 * np.ones((1, 3, 3))} self._test_model(builder.spec, input, expected) self.assertEqual(len(input_dim), builder._get_rank("output")) def test_MVN(self): input_dim = (2, 2, 2) input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", datatypes.Array(*input_dim))] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) builder.add_mvn( name="mvn", input_name="data", output_name="output", across_channels=False, normalize_variance=False, ) input = {"data": np.reshape(np.arange(8, dtype=np.float32), (2, 2, 2))} expected = { "output": np.reshape( np.arange(8) - np.array([1.5, 1.5, 1.5, 1.5, 5.5, 5.5, 5.5, 5.5]), (2, 2, 2), ) } self._test_model(builder.spec, input, expected) def test_L2_normalize(self): input_dim = (1, 2, 2) input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", datatypes.Array(*input_dim))] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) builder.add_l2_normalize(name="mvn", input_name="data", output_name="output") input = {"data": np.reshape(np.arange(4, dtype=np.float32), (1, 2, 2))} expected = { "output": np.reshape(np.arange(4, dtype=np.float32), (1, 2, 2)) / 
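            # The flattened input is [0, 1, 2, 3], whose L2 norm is
            # sqrt(0 + 1 + 4 + 9) = sqrt(14); dividing by it gives the value
            # the add_l2_normalize layer is expected to produce.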
np.sqrt(14) } self._test_model(builder.spec, input, expected) def test_unary_sqrt(self): x = np.reshape(np.arange(1, 5, dtype=np.float32), (1, 2, 2)) input = {"data": x} expected = {"output": np.sqrt(x)} spec = _get_unary_model_spec(x, "sqrt") self._test_model(spec, input, expected) def test_unary_rsqrt(self): x = np.reshape(np.arange(1, 5, dtype=np.float32), (1, 2, 2)) input = {"data": x} expected = {"output": 1 / np.sqrt(x)} spec = _get_unary_model_spec(x, "rsqrt") self._test_model(spec, input, expected) def test_unary_inverse(self): x = np.reshape(np.arange(1, 5, dtype=np.float32), (1, 2, 2)) input = {"data": x} expected = {"output": 1 / x} spec = _get_unary_model_spec(x, "inverse") self._test_model(spec, input, expected) def test_unary_power(self): x = np.reshape(np.arange(1, 5, dtype=np.float32), (1, 2, 2)) input = {"data": x} expected = {"output": x ** 3} spec = _get_unary_model_spec(x, "power", 3) self._test_model(spec, input, expected) def test_unary_exp(self): x = np.reshape(np.arange(1, 5, dtype=np.float32), (1, 2, 2)) input = {"data": x} expected = {"output": np.exp(x)} spec = _get_unary_model_spec(x, "exp") self._test_model(spec, input, expected) def test_unary_log(self): x = np.reshape(np.arange(1, 5, dtype=np.float32), (1, 2, 2)) input = {"data": x} expected = {"output": np.log(x)} spec = _get_unary_model_spec(x, "log") self._test_model(spec, input, expected) def test_unary_abs(self): x = np.reshape(np.arange(1, 5, dtype=np.float32), (1, 2, 2)) input = {"data": x} expected = {"output": np.abs(x)} spec = _get_unary_model_spec(x, "abs") self._test_model(spec, input, expected) def test_unary_threshold(self): x = np.reshape(np.arange(1, 5, dtype=np.float32), (1, 2, 2)) input = {"data": x} expected = {"output": np.maximum(x, 2)} spec = _get_unary_model_spec(x, "threshold", 2) self._test_model(spec, input, expected) def test_split(self): input_dim = (9, 2, 2) x = np.random.rand(*input_dim) input_features = [("data", datatypes.Array(*input_dim))] output_names = [] output_features = [] for i in range(3): out = "out_" + str(i) output_names.append(out) output_features.append((out, None)) builder = neural_network.NeuralNetworkBuilder(input_features, output_features) builder.add_split(name="split", input_name="data", output_names=output_names) input = {"data": x} expected = {"out_0": x[0:3, :, :], "out_1": x[3:6, :, :], "out_2": x[6:9, :, :]} self._test_model(builder.spec, input, expected) for output_ in output_names: self.assertEqual(len(input_dim), builder._get_rank(output_)) def test_scale_constant(self): input_dim = (1, 2, 2) input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) builder.add_scale( name="scale", W=5, b=45, has_bias=True, input_name="data", output_name="output", ) x = np.reshape(np.arange(4, dtype=np.float32), (1, 2, 2)) input = {"data": x} expected = {"output": 5 * x + 45} self._test_model(builder.spec, input, expected) def test_scale_matrix(self): input_dim = (1, 2, 2) input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) W = np.reshape(np.arange(5, 9), (1, 2, 2)) builder.add_scale( name="scale", W=W, b=None, has_bias=False, input_name="data", output_name="output", shape_scale=[1, 2, 2], ) x = np.reshape(np.arange(4, dtype=np.float32), (1, 2, 2)) input = {"data": x} expected = {"output": W * x} self._test_model(builder.spec, input, 
expected) def test_bias_constant(self): input_dim = (1, 2, 2) input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) builder.add_bias(name="bias", b=45, input_name="data", output_name="output") x = np.reshape(np.arange(4, dtype=np.float32), (1, 2, 2)) input = {"data": x} expected = {"output": x + 45} self._test_model(builder.spec, input, expected) def test_bias_matrix(self): input_dim = (1, 2, 2) input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) b = np.reshape(np.arange(5, 9), (1, 2, 2)) builder.add_bias( name="bias", b=b, input_name="data", output_name="output", shape_bias=[1, 2, 2], ) x = np.reshape(np.arange(4, dtype=np.float32), (1, 2, 2)) input = {"data": x} expected = {"output": x + b} self._test_model(builder.spec, input, expected) def test_load_constant(self, model_precision=_MLMODEL_FULL_PRECISION): input_dim = (1, 2, 2) input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) b = np.reshape(np.arange(5, 9), (1, 2, 2)) builder.add_load_constant( name="load_constant", output_name="bias", constant_value=b, shape=[1, 2, 2] ) builder.add_elementwise( name="add", input_names=["data", "bias"], output_name="output", mode="ADD" ) x = np.reshape(np.arange(4, dtype=np.float32), (1, 2, 2)) input = {"data": x} expected = {"output": x + b} self._test_model(builder.spec, input, expected, model_precision) self.assertEqual(len(input_dim), builder._get_rank("output")) def test_load_constant_half_precision(self): self.test_load_constant(model_precision=_MLMODEL_HALF_PRECISION) def test_min(self): input_dim = (1, 2, 2) input_features = [ ("data_0", datatypes.Array(*input_dim)), ("data_1", datatypes.Array(*input_dim)), ] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) builder.add_elementwise( name="min", input_names=["data_0", "data_1"], output_name="output", mode="MIN", ) x1 = np.reshape(np.arange(4, dtype=np.float32), (1, 2, 2)) x2 = np.reshape(np.arange(2, 6, dtype=np.float32), (1, 2, 2)) input = {"data_0": x1, "data_1": x2} expected = {"output": np.minimum(x1, x2)} self._test_model(builder.spec, input, expected) self.assertEqual(len(input_dim), builder._get_rank("output")) def test_conv_same_padding(self): input_dim = (10, 15, 15) input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) W = np.random.rand(3, 3, 10, 20) builder.add_convolution( name="conv", kernel_channels=10, output_channels=20, height=3, width=3, stride_height=2, stride_width=2, border_mode="same", groups=1, W=W, b=None, has_bias=False, input_name="data", output_name="output", same_padding_asymmetry_mode="TOP_LEFT_HEAVY", ) x = np.random.rand(*input_dim) input = {"data": x} expected = {"output": np.random.rand(20, 8, 8)} self._test_model(builder.spec, input, expected, validate_shapes_only=True) self.assertEqual(len(input_dim), builder._get_rank("output")) def test_deconv_valid_padding(self): input_dim = (10, 15, 15) input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) W = 
np.random.rand(3, 3, 10, 20) builder.add_convolution( name="deconv", kernel_channels=10, output_channels=20, height=3, width=3, stride_height=2, stride_width=2, border_mode="valid", groups=1, W=W, b=None, has_bias=False, is_deconv=True, input_name="data", output_name="output", padding_top=2, padding_bottom=3, padding_left=2, padding_right=3, ) x = np.random.rand(*input_dim) input = {"data": x} expected = {"output": np.random.rand(20, 26, 26)} self._test_model(builder.spec, input, expected, validate_shapes_only=True) def test_deconv_non_unit_groups(self): input_dim = (16, 15, 15) input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) W = np.random.rand(3, 3, 16, 5) builder.add_convolution( name="deconv", kernel_channels=16, output_channels=20, height=3, width=3, stride_height=2, stride_width=2, border_mode="valid", groups=4, W=W, b=None, has_bias=False, is_deconv=True, input_name="data", output_name="output", padding_top=2, padding_bottom=3, padding_left=2, padding_right=3, ) x = np.random.rand(*input_dim) input = {"data": x} expected = {"output": np.random.rand(20, 26, 26)} self._test_model(builder.spec, input, expected, validate_shapes_only=True) def test_linear_activation(self): input_dim = (10, 15, 15) input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) builder.add_activation( name="activation", non_linearity="LINEAR", input_name="data", output_name="output", params=[34.0, 67.0], ) x = np.random.rand(*input_dim) input = {"data": x} expected = {"output": 34.0 * x + 67.0} self._test_model(builder.spec, input, expected) def test_padding_constant(self): input_dim = (1, 2, 3) input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) builder.add_padding( name="pad", left=1, right=0, top=2, bottom=0, value=-1, input_name="data", output_name="output", ) x = np.reshape(np.array([[1, 2, 3], [4, 5, 6]]), (1, 2, 3)).astype(np.float32) input = {"data": x} y = np.reshape( np.array( [[-1, -1, -1, -1], [-1, -1, -1, -1], [-1, 1, 2, 3], [-1, 4, 5, 6]] ), (1, 4, 4), ).astype(np.float32) expected = {"output": y} self._test_model(builder.spec, input, expected) def test_padding_replication(self): input_dim = (1, 2, 3) input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) builder.add_padding( name="pad", left=1, top=2, input_name="data", output_name="output", padding_type="replication", ) x = np.reshape(np.array([[1, 2, 3], [4, 5, 6]]), (1, 2, 3)).astype(np.float32) input = {"data": x} y = np.reshape( np.array([[1, 1, 2, 3], [1, 1, 2, 3], [1, 1, 2, 3], [4, 4, 5, 6]]), (1, 4, 4), ).astype(np.float32) expected = {"output": y} self._test_model(builder.spec, input, expected) def test_reshape_target_shape_3(self): input_dim = (1, 2, 5) # (C,H,W) target_dim = (10, 1, 1) input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) builder.add_reshape( name="reshape", input_name="data", output_name="output", target_shape=target_dim, mode=0, ) x = np.random.rand(*input_dim) input = {"data": x} expected = {"output": np.reshape(x, (10, 1, 1))} 
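        # mode=0 requests the channel-first reshape order, which for this (C, H, W)
        # blob lines up with numpy's default C-order reshape used as the reference.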
self._test_model(builder.spec, input, expected) self.assertEqual(len(target_dim), builder._get_rank("output")) def test_reshape_target_shape_4(self): input_dim = (1, 2, 5) # (C,H,W) target_dim = (1, 10, 1, 1) input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) builder.add_reshape( name="reshape", input_name="data", output_name="output", target_shape=target_dim, mode=0, ) x = np.random.rand(*input_dim) input = {"data": x} expected = {"output": np.reshape(x, (1, 10, 1, 1))} self._test_model(builder.spec, input, expected) self.assertEqual(len(target_dim), builder._get_rank("output")) def test_bias_matrix_cpu(self): input_dim = (1, 2, 2) input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) b = np.reshape(np.arange(5, 9), (1, 2, 2)) builder.add_bias( name="bias", b=b, input_name="data", output_name="output", shape_bias=[1, 2, 2], ) x = np.reshape(np.arange(4, dtype=np.float32), (1, 2, 2)) input = {"data": x} expected = {"output": x + b} self._test_model(builder.spec, input, expected, useCPUOnly=True) def test_linear_activation_cpu(self): input_dim = (10, 15, 15) input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) builder.add_activation( name="activation", non_linearity="LINEAR", input_name="data", output_name="output", params=[34.0, 67.0], ) x = np.random.rand(*input_dim) input = {"data": x} expected = {"output": 34.0 * x + 67.0} self._test_model(builder.spec, input, expected, useCPUOnly=True) @unittest.skipIf( not _is_macos() or _macos_version() < LAYERS_10_15_MACOS_VERSION, "macOS 10.15+ required. 
Skipping tests.", ) class NewLayersSimpleTest(CorrectnessTest): def test_shape_flexibility_range(self): input_features = [("data", datatypes.Array(*(3, 4)))] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_sin(name="sin", input_name="data", output_name="output") spec = builder.spec flexible_shape_utils.set_multiarray_ndshape_range( spec, feature_name="data", lower_bounds=[1, 1], upper_bounds=[-1, 5] ) shapes = [(3, 4), (1, 5), (60, 5), (22, 4), (5, 3)] for s in shapes: x = np.random.rand(*s) expected = {"output": np.sin(x)} self._test_model(spec, {"data": x}, expected, useCPUOnly=True) def test_shape_flexibility_enumeration(self, rank=4): default_shape = tuple(np.random.randint(1, 15, size=rank)) input_features = [("data", datatypes.Array(*default_shape))] builder = neural_network.NeuralNetworkBuilder( input_features=input_features, output_features=[("output", None)], disable_rank5_shape_mapping=True, ) builder.add_sin(name="sin", input_name="data", output_name="output") spec = builder.spec shapes = [ tuple(np.random.randint(1, 15, size=rank)), tuple(np.random.randint(1, 15, size=rank)), ] flexible_shape_utils.add_multiarray_ndshape_enumeration( spec, feature_name="data", enumerated_shapes=shapes ) shapes.append(default_shape) for s in shapes: x = np.random.rand(*s) expected = {"output": np.sin(x)} self._test_model(spec, {"data": x}, expected, useCPUOnly=True) def test_shape_flexibility_enumeration_rank3(self): self.test_shape_flexibility_enumeration(rank=3) def test_shape_flexibility_enumeration_rank2(self): self.test_shape_flexibility_enumeration(rank=2) def test_transpose_cpu(self): for rank in range(1, 6): axes = np.random.permutation(rank) axes = [ axis - rank if np.random.choice([True, False]) else axis for axis in axes ] input_shape = np.random.randint(low=2, high=6, size=rank) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_transpose( name="TransposeND", axes=axes, input_name="data", output_name="output" ) x = np.random.rand(*input_shape) input = {"data": x} expected = {"output": np.transpose(x, axes)} self._test_model(builder.spec, input, expected, useCPUOnly=True) def test_dynamic_weight_conv(self): input_dim = (1, 3, 16, 16) # weight layout: (output_channels, kernel_channels, height, width) weight_dim = (4, 3, 3, 3) output_dim = (1, 4, 14, 14) kernel_channels = input_dim[0] output_channels, kernel_channels, height, width = weight_dim input_features = [ ("input", datatypes.Array(*input_dim)), ("weight", datatypes.Array(*weight_dim)), ] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_convolution( name="two_input_conv_layer", kernel_channels=kernel_channels, output_channels=output_channels, height=height, width=width, stride_height=1, stride_width=1, border_mode="valid", groups=1, W=None, b=None, has_bias=False, input_name=["input", "weight"], output_name="output", ) # Assigning everything to ones should cover the execution path # and engine failures, but is not a complete check on numerics. 
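        # With every weight and input element equal to one, each valid-convolution
        # output value is simply the receptive-field size:
        # height * width * kernel_channels = 3 * 3 * 3 = 27, the constant expected below.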
input_val = np.ones(input_dim) weight_val = np.ones(weight_dim) expected = np.ones(output_dim) * 27 feed_dict = {"input": input_val, "weight": weight_val} expected = {"output": expected} self._test_model(builder.spec, feed_dict, expected, useCPUOnly=True) self._test_model(builder.spec, feed_dict, expected, useCPUOnly=False) def test_batched_mat_mul_cpu(self, cpu_only=True): a_shapes = [ (10,), (4, 10), (10,), (10,), (2, 3), (1, 3, 4), (1, 3, 1, 2, 3), (2, 3, 1, 3, 4), ] b_shapes = [ (10,), (10,), (10, 3), (2, 10, 3), (3, 4), (3, 2, 4, 5), (1, 4, 3, 2), (2, 1, 2, 4, 5), ] out_shapes = [ (1, 1), (4, 1), (1, 3), (2, 1, 3), (2, 4), (3, 2, 3, 5), (1, 3, 4, 2, 2), (2, 3, 2, 3, 5), ] for a_shape, b_shape, outShape in zip(a_shapes, b_shapes, out_shapes): input_shapes = [a_shape, b_shape] input_features = [ ("A", datatypes.Array(*input_shapes[0])), ("B", datatypes.Array(*input_shapes[1])), ] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_batched_mat_mul( name="batched_mat_mul", input_names=["A", "B"], output_name="output", transpose_a=False, transpose_b=False, ) a = np.random.rand(*input_shapes[0]) b = np.random.rand(*input_shapes[1]) input_ = {"A": a, "B": b} expected = {"output": np.array(np.matmul(a, b))} shape_dict = {"output": outShape} self._test_model( builder.spec, input_, expected, useCPUOnly=cpu_only, output_name_shape_dict=shape_dict, ) self.assertEqual(len(outShape), builder._get_rank("output")) def test_batched_mat_mul_gpu(self): self.test_batched_mat_mul_cpu(cpu_only=False) def test_batched_mat_mul_with_transposes_cpu(self, cpu_only=True): for transpose_a, transpose_b in itertools.product([True, False], [True, False]): a_shape = (3, 4) b_shape = (4, 5) a_shape = a_shape[::-1] if transpose_a else a_shape b_shape = b_shape[::-1] if transpose_b else b_shape input_shapes = [a_shape, b_shape] input_features = [ ("A", datatypes.Array(*input_shapes[0])), ("B", datatypes.Array(*input_shapes[1])), ] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_batched_mat_mul( name="BatchedMatMul", input_names=["A", "B"], output_name="output", transpose_a=transpose_a, transpose_b=transpose_b, ) a = np.random.rand(*input_shapes[0]) b = np.random.rand(*input_shapes[1]) inputs = {"A": a, "B": b} a = a.T if transpose_a else a b = b.T if transpose_b else b expected = {"output": np.matmul(a, b)} self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) def test_batched_mat_mul_with_transposes_gpu(self): self.test_batched_mat_mul_with_transposes_cpu(cpu_only=False) def test_batched_mat_mul_single_input_cpu( self, model_precision=_MLMODEL_FULL_PRECISION, cpu_only=True ): X1 = 11 X2 = 23 W = np.random.rand(X1, X2) bias = np.random.rand(X2) input_shapes = [ (X1,), (5, X1), (2, 3, X1), (4, 1, X1), (12, 5, 8, X1), (2, 3, 1, 5, X1), ] for input_shape in input_shapes: x = np.random.rand(*input_shape) np_out = np.matmul(x, W) + bias expected = {"output": np_out} input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_batched_mat_mul( name="batched_mat_mul", input_names=["data"], output_name="output", weight_matrix_rows=X1, weight_matrix_columns=X2, W=W, bias=bias, ) inputs = {"data": x} self._test_model( builder.spec, 
inputs, expected, model_precision=model_precision, useCPUOnly=cpu_only, ) def test_batched_mat_mul_single_input_half_precision_cpu(self): self.test_batched_mat_mul_single_input_cpu( model_precision=_MLMODEL_HALF_PRECISION, cpu_only=True ) def test_batched_mat_mul_single_input_gpu(self): self.test_batched_mat_mul_single_input_cpu( model_precision=_MLMODEL_FULL_PRECISION, cpu_only=False ) def test_embedding_nd_cpu( self, model_precision=_MLMODEL_FULL_PRECISION, use_cpu_only=True ): vocab_size = 10 embedding_size = 19 W = np.random.rand(embedding_size, vocab_size) input_shapes = [(5, 1), (2, 3, 1), (4, 1, 1), (12, 5, 8, 1), (2, 3, 1, 5, 1)] for input_shape in input_shapes: x = np.random.randint(vocab_size, size=input_shape) np_out = np.take(np.transpose(W), np.squeeze(x, axis=-1), axis=0) expected = {"output": np_out} input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_embedding_nd( name="embedding_nd", input_name="data", output_name="output", vocab_size=vocab_size, embedding_size=embedding_size, W=W, ) input = {"data": x.astype(np.float32)} self._test_model( builder.spec, input, expected, model_precision=model_precision, useCPUOnly=use_cpu_only, ) def test_embedding_nd_half_precision_cpu(self): self.test_embedding_nd_cpu( model_precision=_MLMODEL_HALF_PRECISION, use_cpu_only=True ) def test_embedding_nd_GPU(self): self.test_embedding_nd_cpu( model_precision=_MLMODEL_FULL_PRECISION, use_cpu_only=False ) def test_embedding_nd_half_precision_GPU(self): self.test_embedding_nd_cpu( model_precision=_MLMODEL_HALF_PRECISION, use_cpu_only=False ) def test_softmax_nan_bug_cpu(self): input_shape = [2, 2] input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] for axis in [0, 1]: builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_softmax_nd( name="softmax_nd", input_name="data", output_name="output", axis=axis ) x = np.array([[0.5, 0.5], [1e8, 1e8]]) input = {"data": x} y = np.exp(x - np.max(x, axis=axis, keepdims=True)) y = y / np.sum(y, axis=axis, keepdims=True) expected = {"output": y} self._test_model(builder.spec, input, expected, useCPUOnly=True) def test_softmax_nd_cpu(self, cpu_only=True): for rank in range(1, 6): for axis in range(-rank, rank): input_shape = np.random.randint(low=2, high=5, size=rank) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_softmax_nd( name="softmax_nd", input_name="data", output_name="output", axis=axis, ) x = np.random.rand(*input_shape) input = {"data": x} y = np.exp(x - np.max(x, axis=axis, keepdims=True)) y = y / np.sum(y, axis=axis, keepdims=True) expected = {"output": y} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_softmax_nd_gpu(self): self.test_softmax_nd_cpu(cpu_only=False) def test_concat_nd_cpu(self, cpu_only=True): for rank in range(1, 6): for axis in range(-rank, rank): n_inputs = np.random.choice(range(2, 5)) output_shape = np.random.randint(low=2, high=5, size=rank) output_shape[axis] = 0 input_shapes = [] input_features = [] input_names = [] for _ in range(n_inputs): input_shapes.append(np.copy(output_shape)) input_shapes[-1][axis] = np.random.choice(range(2, 8)) 
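                    # Each input copies the output shape on every axis except `axis`,
                    # where its randomly chosen size is accumulated into the final
                    # concatenated output shape.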
output_shape[axis] += input_shapes[-1][axis] for i, input_dim in enumerate(input_shapes): input_name = "input_%s" % str(i) input_names.append(input_name) input_features.append((input_name, datatypes.Array(*input_dim))) output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_concat_nd( name="concat_nd", input_names=input_names, output_name="output", axis=axis, ) input_tensors = [] for input_dim in input_shapes: input_tensors.append(np.random.rand(*input_dim)) input = dict(zip(input_names, input_tensors)) expected = {"output": np.concatenate(input_tensors, axis)} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_concat_nd_gpu(self): self.test_concat_nd_cpu(cpu_only=False) def test_fill_like_cpu(self, cpu_only=True): for rank in range(1, 6): target_shape = np.random.randint(low=2, high=6, size=rank) value = float(np.random.rand()) input_features = [("tensor", datatypes.Array(*target_shape))] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_fill_like( name="fill_like", input_name="tensor", output_name="output", value=value ) tensor = np.random.rand(*target_shape) input = {"tensor": tensor} expected = {"output": np.zeros(target_shape) + value} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_fill_like_gpu(self): self.test_fill_like_cpu(cpu_only=False) def test_fill_static_cpu(self, cpu_only=True): for rank in range(1, 6): shape = np.random.randint(low=2, high=8, size=rank) input_features = [("data", datatypes.Array(*shape))] value = float(np.random.rand()) builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_fill_static( name="fill_static", output_name="tmp", output_shape=list(shape), value=value, ) builder.add_elementwise("add_layer", ["data", "tmp"], "output", mode="ADD") data = np.random.rand(*shape) input = {"data": data} expected = {"output": data + value} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) self.assertEqual(len(shape), builder._get_rank("output")) def test_fill_static_gpu(self): self.test_fill_static_cpu(cpu_only=False) def test_fill_dynamic_cpu(self, cpu_only=True): for rank in range(1, 6): input_shape = np.random.randint(low=2, high=8, size=rank) value = float(np.random.rand()) input_features = [("shape", datatypes.Array(len(input_shape)))] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_fill_dynamic( name="fill_dynamic", input_name="shape", output_name="output", value=value, ) input = {"shape": np.array(input_shape, dtype="float")} expected = {"output": np.zeros(input_shape) + value} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) self.assertEqual(builder._get_rank("output"), -1) def test_fill_dynamic_gpu(self): self.test_fill_dynamic_cpu(cpu_only=False) def test_broadcast_to_like_cpu(self, cpu_only=True): for rank in range(1, 6): input_shape = np.random.randint(low=2, high=8, size=rank) mask = [np.random.choice([True, False, False]) for _ in range(rank)] input_shape = np.where(mask, 1, input_shape) target_rank = np.random.randint(low=rank, high=6) target_shape = [ np.random.randint(low=2, high=8) if (-i > rank or input_shape[i] == 1) else input_shape[i] for i in range(-1, -target_rank - 1, -1) ][::-1] input_features = [ ("data", 
datatypes.Array(*input_shape)), ("tensor", datatypes.Array(*target_shape)), ] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_broadcast_to_like( name="broadcast_to_like", input_names=["data", "tensor"], output_name="output", ) data = np.random.rand(*input_shape) tensor = np.random.rand(*target_shape) inputs = {"data": data, "tensor": tensor} expected = {"output": np.broadcast_to(data, target_shape)} self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) def test_broadcast_to_like_gpu(self): self.test_broadcast_to_like_cpu(cpu_only=False) def test_broadcast_to_static_cpu(self, cpu_only=True): for rank in range(1, 6): input_shape = np.random.randint(low=2, high=8, size=rank) mask = [np.random.choice([True, False, False]) for _ in range(rank)] input_shape = np.where(mask, 1, input_shape) target_rank = np.random.randint(low=rank, high=6) target_shape = [ np.random.randint(low=2, high=8) if (-i > rank or input_shape[i] == 1) else input_shape[i] for i in range(-1, -target_rank - 1, -1) ][::-1] input_features = [("data", datatypes.Array(*input_shape))] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_broadcast_to_static( name="broadcast_to_static", input_name="data", output_name="output", output_shape=list(target_shape), ) data = np.random.rand(*input_shape) input = {"data": data} expected = {"output": np.broadcast_to(data, target_shape)} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) self.assertEqual(target_rank, builder._get_rank("output")) def test_broadcast_to_static_gpu(self): self.test_broadcast_to_static_cpu(cpu_only=False) def test_broadcast_to_dynamic_cpu(self, cpu_only=True): for rank in range(1, 6): input_shape = np.random.randint(low=2, high=8, size=rank) mask = [np.random.choice([True, False, False]) for _ in range(rank)] input_shape = np.where(mask, 1, input_shape) target_rank = np.random.randint(low=rank, high=6) target_shape = [ np.random.randint(low=2, high=8) if (-i > rank or input_shape[i] == 1) else input_shape[i] for i in range(-1, -target_rank - 1, -1) ][::-1] input_features = [ ("data", datatypes.Array(*input_shape)), ("shape", datatypes.Array(len(target_shape))), ] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_broadcast_to_dynamic( name="broadcast_to_dynamic", input_names=["data", "shape"], output_name="output", ) data = np.random.rand(*input_shape) inputs = {"data": data, "shape": np.array(target_shape, dtype="float")} expected = {"output": np.broadcast_to(data, target_shape)} self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) self.assertEqual(builder._get_rank("output"), -1) def test_broadcast_to_dynamic_gpu(self): self.test_broadcast_to_dynamic_cpu(cpu_only=False) # Test Rank being set to unknown when one of the input rank is unknown # For max rank case def test_unknown_rank(self, cpu_only=True): for rank in range(1, 6): input_shape = np.random.randint(low=2, high=8, size=rank) mask = [np.random.choice([True, False, False]) for _ in range(rank)] input_shape = np.where(mask, 1, input_shape) target_rank = np.random.randint(low=rank, high=6) target_shape = [ np.random.randint(low=2, high=8) if (-i > rank or input_shape[i] == 1) else input_shape[i] for i in range(-1, -target_rank - 1, -1) ][::-1] input_features = [ ("x", datatypes.Array(*input_shape)), ("shape", 
datatypes.Array(len(target_shape))), ] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_broadcast_to_dynamic( name="broadcast_to_dynamic", input_names=["x", "shape"], output_name="y" ) condition = np.random.randint(0, 2, input_shape).astype(np.float32) builder.add_load_constant_nd( name="load_constant_condition", output_name="condition", constant_value=condition, shape=input_shape, ) builder.add_where_broadcastable( name="where", input_names=["condition", "x", "y"], output_name="output" ) self.assertEqual(builder._get_rank("output"), -1) def test_trigonometry_cpu(self, cpu_only=True): ops = [ "sin", "cos", "tan", "asin", "acos", "atan", "sinh", "cosh", "tanh", "asinh", "acosh", "atanh", ] for op in ops: for rank in range(1, 6): shape = np.random.randint(low=2, high=8, size=rank) input_features = [("data", datatypes.Array(*shape))] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) x = np.random.rand(*shape) if op == "sin": builder.add_sin(name=op, input_name="data", output_name="output") expected = {"output": np.sin(x)} elif op == "cos": builder.add_cos(name=op, input_name="data", output_name="output") expected = {"output": np.cos(x)} elif op == "tan": builder.add_tan(name=op, input_name="data", output_name="output") expected = {"output": np.tan(x)} elif op == "asin": builder.add_asin(name=op, input_name="data", output_name="output") expected = {"output": np.arcsin(x)} elif op == "acos": builder.add_acos(name=op, input_name="data", output_name="output") expected = {"output": np.arccos(x)} elif op == "atan": builder.add_atan(name=op, input_name="data", output_name="output") expected = {"output": np.arctan(x)} elif op == "sinh": builder.add_sinh(name=op, input_name="data", output_name="output") expected = {"output": np.sinh(x)} elif op == "cosh": builder.add_cosh(name=op, input_name="data", output_name="output") expected = {"output": np.cosh(x)} elif op == "tanh": builder.add_tanh(name=op, input_name="data", output_name="output") expected = {"output": np.tanh(x)} elif op == "asinh": builder.add_asinh(name=op, input_name="data", output_name="output") expected = {"output": np.arcsinh(x)} elif op == "acosh": x = np.random.choice([10, np.e, 1], tuple(shape)).astype(np.float32) builder.add_acosh(name=op, input_name="data", output_name="output") expected = {"output": np.arccosh(x)} elif op == "atanh": builder.add_atanh(name=op, input_name="data", output_name="output") expected = {"output": np.arctanh(x)} self._test_model( builder.spec, {"data": x}, expected, useCPUOnly=cpu_only ) def test_trigonometry_gpu(self): self.test_trigonometry_cpu(cpu_only=False) def test_exp2_cpu(self, cpu_only=True): for rank in range(1, 6): shape = np.random.randint(low=2, high=8, size=rank) input_features = [("data", datatypes.Array(*shape))] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_exp2(name="exp2", input_name="data", output_name="output") x = np.random.rand(*shape) input = {"data": x} expected = {"output": np.exp2(x)} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_exp2_gpu(self): self.test_exp2_cpu(cpu_only=False) def test_elementwise_binary_cpu(self, cpu_only=True): input_names = ["A", "B"] test_cases = [ "greater", "less", "equal", "not_equal", "greater_equal", "less_equal", "logical_and", "logical_or", "logical_xor", "add", "subtract", "multiply", 
"divide", "power", "maximum", "minimum", "floor_divide", "mod", ] for test_case in test_cases: for _ in range(10): rank_a = np.random.randint(low=1, high=6) rank_b = np.random.randint(low=1, high=6) rank_out = max(rank_a, rank_b) shape_a = np.random.randint(low=2, high=8, size=rank_a) shape_b = np.random.randint(low=2, high=8, size=rank_b) for i in range(-1, -rank_out - 1, -1): dims = [] if -i <= rank_a: dims.append(shape_a[i]) if -i <= rank_b: dims.append(shape_b[i]) dim = np.random.choice(dims) if -i <= rank_a: shape_a[i] = np.random.choice([1, dim]) if -i <= rank_b: shape_b[i] = np.random.choice([1, dim]) input_shapes = [shape_a, shape_b] input_features = [ ("A", datatypes.Array(*input_shapes[0])), ("B", datatypes.Array(*input_shapes[1])), ] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) func = getattr(np, test_case) if test_case == "greater": builder.add_greater_than( test_case, input_names=input_names, output_name="output" ) elif test_case == "less": builder.add_less_than( test_case, input_names=input_names, output_name="output" ) elif test_case == "equal": builder.add_equal( test_case, input_names=input_names, output_name="output" ) elif test_case == "not_equal": builder.add_not_equal( test_case, input_names=input_names, output_name="output" ) elif test_case == "greater_equal": builder.add_greater_than( test_case, input_names=input_names, output_name="output", use_greater_than_equal=True, ) elif test_case == "less_equal": builder.add_less_than( test_case, input_names=input_names, output_name="output", use_less_than_equal=True, ) elif test_case == "logical_and": builder.add_logical( test_case, input_names=input_names, output_name="output", mode="AND", ) elif test_case == "logical_or": builder.add_logical( test_case, input_names=input_names, output_name="output", mode="OR", ) elif test_case == "logical_xor": builder.add_logical( test_case, input_names=input_names, output_name="output", mode="XOR", ) elif test_case == "add": builder.add_add_broadcastable( test_case, input_names=input_names, output_name="output" ) elif test_case == "subtract": builder.add_subtract_broadcastable( test_case, input_names=input_names, output_name="output" ) elif test_case == "multiply": builder.add_multiply_broadcastable( test_case, input_names=input_names, output_name="output" ) elif test_case == "divide": builder.add_divide_broadcastable( test_case, input_names=input_names, output_name="output" ) elif test_case == "power": builder.add_pow_broadcastable( test_case, input_names=input_names, output_name="output" ) elif test_case == "maximum": builder.add_max_broadcastable( test_case, input_names=input_names, output_name="output" ) elif test_case == "minimum": builder.add_min_broadcastable( test_case, input_names=input_names, output_name="output" ) elif test_case == "floor_divide": builder.add_floor_div_broadcastable( test_case, input_names=input_names, output_name="output" ) elif test_case == "mod": builder.add_mod_broadcastable( test_case, input_names=input_names, output_name="output" ) a = np.random.rand(*input_shapes[0]) b = np.random.rand(*input_shapes[1]) input = {"A": a, "B": b} expected = {"output": func(a, b)} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_elementwise_binary_gpu(self): self.test_elementwise_binary_cpu(cpu_only=False) def test_elementwise_boolean_unary_cpu(self, cpu_only=True): input_names = ["input"] shapes = [ (1, 2, 3, 1), (3, 1, 2, 1, 2), (1, 2, 1, 3), (2, 3), (2, 1, 1), (2, 3, 4), 
(2, 4), (1,), (1,), ] test_cases = [ "greater", "less", "equal", "not_equal", "greater_equal", "less_equal", ] for test_case in test_cases: for shape in shapes: input_features = [("input", datatypes.Array(*shape))] b = np.random.rand() builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) func = getattr(np, test_case) if test_case == "greater": builder.add_greater_than( test_case, input_names=input_names, output_name="output", alpha=b, ) elif test_case == "less": builder.add_less_than( test_case, input_names=input_names, output_name="output", alpha=b, ) elif test_case == "equal": builder.add_equal( test_case, input_names=input_names, output_name="output", alpha=b, ) elif test_case == "not_equal": builder.add_not_equal( test_case, input_names=input_names, output_name="output", alpha=b, ) elif test_case == "greater_equal": builder.add_greater_than( test_case, input_names=input_names, output_name="output", use_greater_than_equal=True, alpha=b, ) elif test_case == "less_equal": builder.add_less_than( test_case, input_names=input_names, output_name="output", use_less_than_equal=True, alpha=b, ) a = np.random.rand(*shape) input = {"input": a} expected = {"output": func(a, b)} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_elementwise_boolean_unary_gpu(self): self.test_elementwise_boolean_unary_cpu(cpu_only=False) def test_logical_not_cpu(self, cpu_only=True): input_names = ["input"] shapes = [ (1, 2, 3, 1), (3, 1, 2, 1, 2), (1, 2, 1, 3), (2, 3), (2, 1, 1), (2, 3, 4), (2, 4), (1,), (1,), ] for shape in shapes: input_features = [("input", datatypes.Array(*shape))] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_logical( "logical_not", input_names=input_names, output_name="output", mode="NOT" ) a = np.random.rand(*shape) input = {"input": a} expected = {"output": np.logical_not(a)} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_logical_not_gpu(self): self.test_logical_not_cpu(cpu_only=False) def test_stack_cpu(self, cpu_only=True): for input_rank in range(1, 5): for axis in range(-input_rank - 1, input_rank + 1): n_inputs = np.random.choice(range(2, 5)) input_shape = np.random.randint(low=2, high=5, size=input_rank) input_features = [] input_names = [] for i in range(n_inputs): input_name = "input_%s" % str(i) input_names.append(input_name) input_features.append((input_name, datatypes.Array(*input_shape))) output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_stack( name="stack", input_names=input_names, output_name="output", axis=axis, ) input_tensors = [] for _ in range(n_inputs): input_tensors.append(np.random.rand(*input_shape)) input = dict(zip(input_names, input_tensors)) expected = {"output": np.stack(input_tensors, axis)} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) self.assertEqual(input_rank + 1, builder._get_rank("output")) def test_stack_gpu(self): self.test_stack_cpu(cpu_only=False) def test_ceil_cpu(self, cpu_only=True): for rank in range(1, 6): shape = np.random.randint(low=2, high=8, size=rank) input_features = [("data", datatypes.Array(*shape))] output_features = [("output", datatypes.Array(*shape))] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_ceil(name="ceil", 
input_name="data", output_name="output") x = np.random.rand(*shape) inputs = {"data": x} expected = {"output": np.ceil(x)} self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) self.assertEqual(rank, builder._get_rank("output")) def test_ceil_gpu(self): self.test_ceil_cpu(cpu_only=False) def test_floor_cpu(self, cpu_only=True): for rank in range(1, 6): shape = np.random.randint(low=2, high=8, size=rank) input_features = [("data", datatypes.Array(*shape))] output_features = [("output", datatypes.Array(*shape))] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_floor(name="floor", input_name="data", output_name="output") x = np.random.rand(*shape) inputs = {"data": x} expected = {"output": np.floor(x)} self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) def test_round_cpu(self, cpu_only=True): for rank in range(1, 6): shape = np.random.randint(low=2, high=8, size=rank) input_features = [("data", datatypes.Array(*shape))] output_features = [("output", datatypes.Array(*shape))] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_round(name="round", input_name="data", output_name="output") x = np.float32( np.random.rand(*shape) * np.random.randint(low=-100, high=101) ) inputs = {"data": x} expected = {"output": np.around(x)} self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) def test_round_gpu(self): if platform.machine() == "arm64": pytest.xfail("rdar://98010495 (Some old nnv1 test are failing on M1 machine when running on ANE)") self.test_round_cpu(cpu_only=False) def test_sign_cpu(self, cpu_only=True): for rank in range(1, 6): shape = np.random.randint(low=2, high=8, size=rank) input_features = [("data", datatypes.Array(*shape))] output_features = [("output", datatypes.Array(*shape))] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_sign(name="sign", input_name="data", output_name="output") x = np.random.choice( [-np.random.rand(1)[0], 0.0, np.random.rand(1)[0]], tuple(shape) ).astype(np.float32) inputs = {"data": x} expected = {"output": np.sign(x)} self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) def test_sign_gpu(self): self.test_sign_cpu(cpu_only=False) def test_clip_cpu(self, cpu_only=True): for rank in range(1, 6): shape = np.random.randint(low=2, high=6, size=rank) input_features = [("data", datatypes.Array(*shape))] output_features = [("output", datatypes.Array(*shape))] x = np.random.rand(*shape) min_value = np.percentile(x, 25) max_value = np.percentile(x, 75) input = {"data": x} builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_clip( name="clip", input_name="data", output_name="output", min_value=min_value, max_value=max_value, ) expected = {"output": np.clip(x, min_value, max_value)} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_clip_gpu(self): self.test_clip_cpu(cpu_only=False) def test_split_nd_cpu(self, cpu_only=True): for rank in range(1, 6): for axis in range(-rank, rank): n_outputs = np.random.choice(range(2, 4)) input_shape = np.random.randint(low=2, high=5, size=rank) input_shape[axis] = 0 output_shapes = [] output_features = [] output_names = [] almost_equal = random.choice([True, False]) remainder = np.random.choice(range(1, n_outputs)) if almost_equal else 0 value 
= np.random.choice(range(2, 5)) for k in range(n_outputs): output_shapes.append(np.copy(input_shape)) output_shapes[-1][axis] = value + 1 if k < remainder else value input_shape[axis] += output_shapes[-1][axis] for i in range(n_outputs): output_name = "output_%s" % str(i) output_names.append(output_name) output_features.append((output_name, None)) input_features = [("data", datatypes.Array(*input_shape))] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_split_nd( name="split_nd", input_name="data", output_names=output_names, axis=axis, num_splits=n_outputs, ) x = np.random.rand(*input_shape) input = {"data": x} expected = dict( zip( output_names, np.array_split(x, n_outputs, axis=axis) if almost_equal else np.split(x, n_outputs, axis=axis), ) ) # Explicitly trying to compare against both versions of numpy split self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) for output_ in output_names: self.assertEqual(rank, builder._get_rank(output_)) def test_split_nd_gpu(self): if platform.machine() == "arm64": pytest.xfail("rdar://98010495 (Some old nnv1 test are failing on M1 machine when running on ANE)") self.test_split_nd_cpu(cpu_only=False) def test_split_nd_with_split_sizes_cpu(self, cpu_only=True): for rank in range(1, 6): for axis in range(-rank, rank): n_outputs = np.random.choice(range(2, 4)) input_shape = np.random.randint(low=2, high=5, size=rank) input_shape[axis] = 0 output_shapes, output_features, output_names = [], [], [] sections, split_sizes = [], [] for _ in range(n_outputs): output_shapes.append(np.copy(input_shape)) output_shapes[-1][axis] = np.random.choice(range(2, 5)) input_shape[axis] += output_shapes[-1][axis] sections.append(input_shape[axis]) split_sizes.append(output_shapes[-1][axis]) sections.pop() for i in range(n_outputs): output_name = "output_%s" % str(i) output_names.append(output_name) output_features.append((output_name, None)) input_features = [("data", datatypes.Array(*input_shape))] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_split_nd( name="split_nd", input_name="data", output_names=output_names, axis=axis, split_sizes=split_sizes, ) x = np.random.rand(*input_shape) input = {"data": x} expected = dict(zip(output_names, np.split(x, sections, axis=axis))) self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) for output_ in output_names: self.assertEqual(rank, builder._get_rank(output_)) def test_split_nd_with_split_sizes_gpu(self): self.test_split_nd_with_split_sizes_cpu(cpu_only=False) def test_slice_static_cpu(self, cpu_only=True): for rank in range(1, 6): for _ in range(200): input_shape = np.array([5 for _ in range(rank)]) objs, strides, begin_masks, end_ids, end_masks, begin_ids = ( [], [], [], [], [], [], ) for dim in range(rank): stride = random.choice([-3, -1, 1, 2]) begin_mask = random.choice([True, False]) end_mask = random.choice([True, False]) length = 0 while length <= 0: begin_id = np.random.randint( low=-input_shape[dim], high=input_shape[dim] ) end_id = np.random.randint( low=-input_shape[dim], high=input_shape[dim] ) obj = slice( None if begin_mask else begin_id, None if end_mask else end_id, stride, ) length = np.arange(input_shape[dim])[(obj,)].shape[0] objs.append(obj), strides.append(stride), begin_masks.append( begin_mask ) end_masks.append(end_mask), begin_ids.append( begin_id ), end_ids.append(end_id) input_features = [("data", 
datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_slice_static( "slice_static", "data", "output", begin_ids=begin_ids, end_ids=end_ids, strides=strides, begin_masks=begin_masks, end_masks=end_masks, ) x = np.random.rand(*input_shape) inputs = {"data": x} expected = {"output": x[tuple(objs)]} self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) self.assertEqual(rank, builder._get_rank("output")) def test_slice_static_gpu(self): self.test_slice_static_cpu(cpu_only=False) def test_slice_dynamic_cpu(self, cpu_only=True): pytest.xfail( "rdar://111134257 ([Bug][Regression] nnv1 slice_by_index unittests are failing)" ) for rank in range(1, 6): input_shape = np.array([5 for _ in range(rank)]) objs, strides, begin_masks, end_ids, end_masks, begin_ids = ( [], [], [], [], [], [], ) squeeze_masks = [] squeeze_axes = [] for dim in range(rank): stride = random.choice([-3, -1, 1, 2]) begin_mask = random.choice([True, False]) end_mask = random.choice([True, False]) if len(squeeze_axes) + 1 < rank: squeeze_mask = random.choice([True, False]) else: squeeze_mask = False if squeeze_mask: squeeze_axes.append(dim) length = 0 while length <= 0: begin_id = np.random.randint( low=-input_shape[dim], high=input_shape[dim] ) end_id = np.random.randint( low=-input_shape[dim], high=input_shape[dim] ) obj = slice( None if begin_mask else begin_id, None if end_mask else end_id, stride, ) length = np.arange(input_shape[dim])[(obj,)].shape[0] objs.append(obj), strides.append(stride), begin_masks.append(begin_mask) end_masks.append(end_mask), begin_ids.append(begin_id), end_ids.append( end_id ) squeeze_masks.append(squeeze_mask) # test different number of inputs, from 2 inputs up to 7 inputs # when num_inputs == 2, begin_ids are inputs, rest are read from parameters # when num_inputs == 7, all read from inputs, none are read from parameters for num_inputs in [2, 3, 4, 5, 6]: x = np.random.rand(*input_shape) input_features = [("data", datatypes.Array(*input_shape))] input_names = ["data"] inputs = dict() inputs["data"] = x if num_inputs == 2: input_features = [ ("data", datatypes.Array(*input_shape)), ("begin_ids", datatypes.Array(len(begin_ids))), ] input_names = ["data", "begin_ids"] inputs["begin_ids"] = np.array(begin_ids, dtype=np.int32) elif num_inputs == 3: input_features = [ ("data", datatypes.Array(*input_shape)), ("begin_ids", datatypes.Array(len(begin_ids))), ("end_ids", datatypes.Array(len(end_ids))), ] input_names = ["data", "begin_ids", "end_ids"] inputs["begin_ids"] = np.array(begin_ids, dtype=np.int32) inputs["end_ids"] = np.array(end_ids, dtype=np.int32) elif num_inputs == 4: input_features = [ ("data", datatypes.Array(*input_shape)), ("begin_ids", datatypes.Array(len(begin_ids))), ("end_ids", datatypes.Array(len(end_ids))), ("strides", datatypes.Array(len(strides))), ] input_names = ["data", "begin_ids", "end_ids", "strides"] inputs["begin_ids"] = np.array(begin_ids, dtype=np.int32) inputs["end_ids"] = np.array(end_ids, dtype=np.int32) inputs["strides"] = np.array(strides, dtype=np.int32) elif num_inputs == 5: input_features = [ ("data", datatypes.Array(*input_shape)), ("begin_ids", datatypes.Array(len(begin_ids))), ("end_ids", datatypes.Array(len(end_ids))), ("strides", datatypes.Array(len(strides))), ("begin_masks", datatypes.Array(len(begin_masks))), ] input_names = [ "data", "begin_ids", "end_ids", "strides", "begin_masks", ] 
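# Hedged note on this test's layout: each optional argument of the dynamic slice layer can be
# supplied either as a network input or as a layer parameter. For a given num_inputs case,
# whatever is listed in input_names is fed below as an int32 array of length == rank, and the
# remaining arguments (here end_masks and squeeze_masks) are instead passed directly to
# add_slice_dynamic further down. Although a num_inputs == 7 branch is handled, the loop above
# only exercises 2 through 6.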
inputs["begin_ids"] = np.array(begin_ids, dtype=np.int32) inputs["end_ids"] = np.array(end_ids, dtype=np.int32) inputs["strides"] = np.array(strides, dtype=np.int32) inputs["begin_masks"] = np.array(begin_masks, dtype=np.int32) elif num_inputs == 6: input_features = [ ("data", datatypes.Array(*input_shape)), ("begin_ids", datatypes.Array(len(begin_ids))), ("end_ids", datatypes.Array(len(end_ids))), ("strides", datatypes.Array(len(strides))), ("begin_masks", datatypes.Array(len(begin_masks))), ("end_masks", datatypes.Array(len(end_masks))), ] input_names = [ "data", "begin_ids", "end_ids", "strides", "begin_masks", "end_masks", ] inputs["begin_ids"] = np.array(begin_ids, dtype=np.int32) inputs["end_ids"] = np.array(end_ids, dtype=np.int32) inputs["strides"] = np.array(strides, dtype=np.int32) inputs["begin_masks"] = np.array(begin_masks, dtype=np.int32) inputs["end_masks"] = np.array(end_masks, dtype=np.int32) elif num_inputs == 7: input_features = [ ("data", datatypes.Array(*input_shape)), ("begin_ids", datatypes.Array(len(begin_ids))), ("end_ids", datatypes.Array(len(end_ids))), ("strides", datatypes.Array(len(strides))), ("begin_masks", datatypes.Array(len(begin_masks))), ("end_masks", datatypes.Array(len(end_masks))), ("squeeze_masks", datatypes.Array(len(squeeze_masks))), ] input_names = [ "data", "begin_ids", "end_ids", "strides", "begin_masks", "end_masks", "squeeze_masks", ] inputs["begin_ids"] = np.array(begin_ids, dtype=np.int32) inputs["end_ids"] = np.array(end_ids, dtype=np.int32) inputs["strides"] = np.array(strides, dtype=np.int32) inputs["begin_masks"] = np.array(begin_masks, dtype=np.int32) inputs["end_masks"] = np.array(end_masks, dtype=np.int32) inputs["squeeze_masks"] = np.array(squeeze_masks, dtype=np.int32) builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) if num_inputs == 2: builder.add_slice_dynamic( "slice_dynamic", input_names, "output", end_ids=end_ids, strides=strides, begin_masks=begin_masks, end_masks=end_masks, squeeze_masks=squeeze_masks, ) elif num_inputs == 3: builder.add_slice_dynamic( "slice_dynamic", input_names, "output", strides=strides, begin_masks=begin_masks, end_masks=end_masks, squeeze_masks=squeeze_masks, ) elif num_inputs == 4: builder.add_slice_dynamic( "slice_dynamic", input_names, "output", begin_masks=begin_masks, end_masks=end_masks, squeeze_masks=squeeze_masks, ) elif num_inputs == 5: builder.add_slice_dynamic( "slice_dynamic", input_names, "output", end_masks=end_masks, squeeze_masks=squeeze_masks, ) elif num_inputs == 6: builder.add_slice_dynamic( "slice_dynamic", input_names, "output", squeeze_masks=squeeze_masks, ) elif num_inputs == 7: builder.add_slice_dynamic("slice_dynamic", input_names, "output") expected_x = x[tuple(objs)] squeeze_slices = [] for squeeze in squeeze_masks: if squeeze: squeeze_slices.append(slice(None, 1, None)) else: squeeze_slices.append(slice(None, None, None)) expected_x = np.squeeze( expected_x[tuple(squeeze_slices)], axis=tuple(squeeze_axes) ) expected = {"output": expected_x} self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) self.assertEqual(rank, builder._get_rank("output")) def test_slice_dynamic_gpu(self): pytest.xfail( "rdar://111134257 ([Bug][Regression] nnv1 slice_by_index unittests are failing)" ) self.test_slice_dynamic_cpu(cpu_only=False) def test_tile_cpu(self, cpu_only=True): for rank in range(1, 6): input_shape = np.random.randint(low=2, high=5, size=rank) for rep_rank in range(1, rank + 1): reps = 
list(np.random.randint(low=1, high=9, size=rep_rank)) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_tile("Tile", "data", "output", reps) x = np.random.rand(*input_shape) input = {"data": x} expected = {"output": np.tile(x, reps)} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_tile_gpu(self): self.test_tile_cpu(cpu_only=False) def test_dynamic_tile_cpu(self, cpu_only=True): for rank in range(1, 6): input_shape = np.random.randint(low=2, high=5, size=rank) for rep_rank in range(1, rank + 1): reps = np.random.randint(low=1, high=9, size=rep_rank) input_features = [ ("data", datatypes.Array(*input_shape)), ("reps", datatypes.Array(*reps.shape)), ] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_tile("Tile", ["data", "reps"], "output") x = np.random.rand(*input_shape) input = {"data": x, "reps": reps.astype(np.float32)} expected = {"output": np.tile(x, list(reps))} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_sliding_windows_cpu(self, cpu_only=True): def numpy_sliding_windows(a, np_axis, np_size, np_step): n = (a.shape[np_axis] - np_size) // np_step + 1 shape = list(a.shape) shape[np_axis] = n if np_axis < 0: np_axis += len(shape) shape.insert(np_axis + 1, np_size) strides = list(a.strides) effstride = strides[np_axis] * np_step strides.insert(np_axis, effstride) return np.lib.stride_tricks.as_strided(a, shape, strides) for rank in range(1, 5): for axis in range(-rank, rank): input_shape = np.random.randint(low=2, high=5, size=rank) output_shape = list(input_shape) window_size = np.random.randint(low=1, high=input_shape[axis]) length = 0 while length <= 0: step = np.random.randint(low=1, high=input_shape[axis]) length = (input_shape[axis] - window_size) // step + 1 output_shape[axis] = length pos_axis = axis if axis >= 0 else axis + rank output_shape.insert(pos_axis + 1, window_size) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_sliding_windows( "sliding_windows", input_name="data", output_name="output", axis=axis, window_size=window_size, step=step, ) x = np.random.rand(*input_shape) input = {"data": x} expected = {"output": numpy_sliding_windows(x, axis, window_size, step)} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) self.assertEqual(rank + 1, builder._get_rank("output")) def test_sliding_windows_gpu(self): self.test_sliding_windows_cpu(cpu_only=False) def test_range_static_cpu(self, cpu_only=True): params = [ (-10.4, 23, 12.2), (0, 1000, 1), (50.5, 90.5, 1.5), (5, 8, 2), (5, 8, 98), (5, 8, 1.5), (10, 5, -0.6), (24, -65, -2), ] for param in params: start, end, step = param input_features = [("multiplicative_input", datatypes.Array(1))] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_range_static( "range_static", "output_range", end=end, start=start, step=step ) builder.add_multiply_broadcastable( name="multiply_broadcastable", input_names=["multiplicative_input", "output_range"], output_name="output", ) # save the model model_dir = tempfile.TemporaryDirectory() 
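# Why the extra multiply above: add_range_static produces "output_range" with no runtime input,
# so it is multiplied by a dummy "multiplicative_input" of ones to give the model a feedable
# input. The spec is also written to a temporary .mlmodel below, which exercises serialization.
# Illustrative expectation for the first parameter set (plain numpy semantics, assumption):
#   np.arange(-10.4, 23, 12.2) -> array([-10.4, 1.8, 14.0])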
model_path = os.path.join(model_dir.name, "test_layer.mlmodel") coremltools.utils.save_spec(builder.spec, model_path) inputs = dict() inputs["multiplicative_input"] = np.ones((1,), dtype=np.float64) expected = {"output": np.arange(start, end, step)} self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) self.assertEqual(1, builder._get_rank("output")) def test_range_static_gpu(self): self.test_range_static_cpu(cpu_only=False) def test_range_dynamic_cpu(self, cpu_only=True): params = [ (-10.4, 23, 12.2), (0, 1000, 1), (50.5, 90.5, 1.5), (5, 8, 2), (5, 8, 98), (5, 8, 1.5), (10, 5, -0.6), (24, -65, -2), ] # input size == 1: end is input, start and step are read from parameters # input size == 2: end, start are inputs, step is read from parameters # input size == 3: start, end, step are all inputs, none of the parameters are used. for num_inputs in [1, 2, 3]: for param in params: inputs = dict() start, end, step = param if num_inputs == 1: input_features = [("end", datatypes.Array(1))] elif num_inputs == 2: input_features = [ ("end", datatypes.Array(1)), ("start", datatypes.Array(1)), ] elif num_inputs == 3: input_features = [ ("end", datatypes.Array(1)), ("start", datatypes.Array(1)), ("step", datatypes.Array(1)), ] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) if num_inputs == 1: inputs["end"] = end * np.ones((1,), dtype=np.float64) builder.add_range_dynamic( "range_dynamic", output_name="output", input_names=["end"], start=start, step=step, ) elif num_inputs == 2: inputs["end"] = end * np.ones((1,), dtype=np.float64) inputs["start"] = start * np.ones((1,), dtype=np.float64) builder.add_range_dynamic( "range_dynamic", output_name="output", input_names=["end", "start"], step=step, ) elif num_inputs == 3: inputs["end"] = end * np.ones((1,), dtype=np.float64) inputs["start"] = start * np.ones((1,), dtype=np.float64) inputs["step"] = step * np.ones((1,), dtype=np.float64) builder.add_range_dynamic( "range_dynamic", output_name="output", input_names=["end", "start", "step"], ) expected = {"output": np.arange(start, end, step)} self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) self.assertEqual(1, builder._get_rank("output")) def test_range_dynamic_gpu(self): self.test_range_dynamic_cpu(cpu_only=False) def test_linear_activation_different_ranks_cpu(self, cpu_only=True): for input_dim in [(10, 15), (10, 15, 2, 3), (10, 2, 4, 15, 1), (6,)]: input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", datatypes.Array(*input_dim))] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_activation( name="activation", non_linearity="LINEAR", input_name="data", output_name="output", params=[34.0, 67.0], ) x = np.random.rand(*input_dim) input = {"data": x} expected = {"output": 34.0 * x + 67.0} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_linear_activation_different_ranks_gpu(self): self.test_linear_activation_different_ranks_cpu(cpu_only=False) def test_topk_cpu(self, cpu_only=True): test_input_shapes = [(9,), (8, 6), (9, 8, 10), (5, 9, 7, 9), (12, 8, 6, 6, 7)] K = [3, 5] axes = [[0], [0, 1], [1, 2], [0, 3, 1], [1, 3, 4]] for ii, input_shape in enumerate(test_input_shapes): for k in K: for n_inputs in [1, 2]: for bottom_k_flag in [False, True]: for axis in axes[ii]: for negative_axis in [False, True]: if negative_axis: axis = axis - len(input_shape) input_features = [ 
("data", datatypes.Array(*input_shape)) ] output_features = [("values", None), ("indices", None)] input_names = ["data"] output_names = ["values", "indices"] if n_inputs == 2: input_names.append("k_in") input_features.append(("k_in", datatypes.Array(1))) builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True, ) if n_inputs == 2: builder.add_topk( "topk", input_names, output_names, axis=axis, use_bottom_k=bottom_k_flag, ) else: builder.add_topk( "topk", input_names, output_names, k=k, axis=axis, use_bottom_k=bottom_k_flag, ) data = np.random.randint( low=0, high=int(np.prod(input_shape)), size=input_shape, ) data = data.astype(np.float32) input = {"data": data} if n_inputs == 2: input["k_in"] = k * np.ones([1], dtype=np.float32) # numpy reference values if bottom_k_flag: ref_indices = np.argsort(data, axis=axis) else: ref_indices = np.argsort(-data, axis=axis) slc = [slice(None)] * len(input_shape) slc[axis] = slice(0, k) ref_indices = ref_indices[tuple(slc)] ref_values = np.take_along_axis( data, ref_indices, axis=axis ) expected = { "values": ref_values, "indices": ref_indices, } self._test_model( builder.spec, input, expected, useCPUOnly=cpu_only ) def test_topk_gpu(self): self.test_topk_cpu(cpu_only=False) @unittest.skipIf(not _HAS_TF_2, MSG_TF2_NOT_FOUND) def test_const_pad_cpu(self, cpu_only=True): def get_reference(data, pads, value): res = tf.pad(data, pads, mode='CONSTANT', constant_values=value) return res.numpy() value = 34.0 shapes = [(3,), (4, 5), (2, 4, 5), (12, 6, 3, 5, 7), (1, 24, 2, 4, 8)] ctr = 0 for shape in shapes: rank = len(shape) for force_zeros_in_end in [0, 2, 6]: for max_pad_value in range(1, 6): for n_inputs in [1, 2]: pads = np.random.randint( low=0, high=max_pad_value, size=(rank, 2) ) if force_zeros_in_end > 2 * rank: continue # pads = np.reshape(np.array([1,1,1,0,0,1]), (rank, 2)) if force_zeros_in_end != 0: pads[-force_zeros_in_end:] = 0 data = np.random.rand(*shape) reference = get_reference(data, pads, value) ctr += 1 input_features = [("data", datatypes.Array(*shape))] output_features = [("output", None)] input_names = ["data"] if n_inputs == 2: input_names.append("pads") input_features.append(("pads", datatypes.Array(2 * rank,))) builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True, ) if n_inputs == 2: builder.add_constant_pad( "pad", input_names, "output", value=value ) else: builder.add_constant_pad( "pad", input_names, "output", value=value, pad_amounts=pads.flatten(), ) input = {"data": data} if n_inputs == 2: input["pads"] = pads.flatten().astype(np.float32) expected = {"output": reference} self._test_model( builder.spec, input, expected, useCPUOnly=cpu_only ) def test_const_pad_gpu(self): if platform.machine() == "arm64": pytest.xfail("rdar://98010495 (Some old nnv1 test are failing on M1 machine when running on ANE)") self.test_const_pad_cpu(cpu_only=False) @unittest.skipIf(not _HAS_TF_2, MSG_TF2_NOT_FOUND) def test_const_pad_mode2_cpu(self, cpu_only=True): def get_reference(data, output_shape, value, left_pad=False): pads = np.zeros((len(output_shape), 2)) if left_pad: pads[:, 0] = np.array(output_shape) - np.array(data.shape) else: pads[:, 1] = np.array(output_shape) - np.array(data.shape) res = tf.pad(data, pads, mode="CONSTANT", constant_values=value) return res.numpy() value = 34.0 shapes = [(3,), (4, 5), (2, 4, 5), (12, 6, 3, 5, 7), (1, 24, 2, 4, 8)] out_shapes = [(5,), (4, 8), (2, 4, 10), (20, 6, 7, 10, 7), (5, 24, 10, 4, 
10)] ctr = 0 for ii, shape in enumerate(shapes): rank = len(shape) for left_pad in [True, False]: for n_inputs in [1, 2]: data = np.random.rand(*shape) reference = get_reference(data, out_shapes[ii], value, left_pad) pads = np.zeros((rank, 2)) tmp = np.zeros((rank)) for i in range(rank): if out_shapes[ii][i] == shape[i]: tmp[i] = 0 else: tmp[i] = out_shapes[ii][i] if left_pad: pads[:, 0] = tmp else: pads[:, 1] = tmp ctr += 1 input_features = [("data", datatypes.Array(*shape))] output_features = [("output", None)] input_names = ["data"] if n_inputs == 2: input_names.append("pads") input_features.append(("pads", datatypes.Array(2 * rank,))) builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True, ) if n_inputs == 2: builder.add_constant_pad( "pad", input_names, "output", value=value, pad_to_given_output_size_mode=True, ) else: builder.add_constant_pad( "pad", input_names, "output", value=value, pad_amounts=pads.flatten(), pad_to_given_output_size_mode=True, ) input = {"data": data} if n_inputs == 2: input["pads"] = pads.flatten().astype(np.float32) expected = {"output": reference} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_const_pad_mode2_gpu(self): if platform.machine() == "arm64": pytest.xfail("rdar://98010495 (Some old nnv1 test are failing on M1 machine when running on ANE)") self.test_const_pad_mode2_cpu(cpu_only=False) def test_nms_cpu(self, cpu_only=True): def _compute_iou_matrix(boxes): # input is (N,4), in order [center_w, center_h, width, height] self.assertEqual(len(boxes.shape), 2) self.assertEqual(boxes.shape[1], 4) boxes = boxes.astype(np.float32) center_w, center_h, width, height = np.split( boxes, 4, axis=1 ) # outs are all (N,1) top = center_h + 0.5 * height bottom = center_h - 0.5 * height left = center_w - 0.5 * width right = center_w + 0.5 * width area = width * height hB = np.minimum(top, np.transpose(top)) wB = np.minimum(right, np.transpose(right)) hA = np.maximum(bottom, np.transpose(bottom)) wA = np.maximum(left, np.transpose(left)) intersection_area = np.maximum(0, hB - hA) * np.maximum(0, wB - wA) union_area = area + np.transpose(area) - intersection_area iou = intersection_area / union_area return iou @unittest.skipIf(not _HAS_TF_2, MSG_TF2_NOT_FOUND) def _nms_TF( boxes, scores, iou_threshold, score_threshold, per_class_suppression, M ): # boxes is (B,N,4), in order [center_w, center_h, width, height] # scores is (B,N,C) # output shapes: (B,M,4), (B,M,C), (B,M), (B,) """ this is implementation of CoreML's NMS layer """ B, N, C = scores.shape iou_threshold = iou_threshold.astype(np.float32) score_threshold = score_threshold.astype(np.float32) # convert box ids to TF style center_w, center_h, width, height = np.split( boxes, 4, axis=-1 ) # outs are all (B,N,1) y1 = center_h - 0.5 * height y2 = center_h + 0.5 * height x1 = center_w - 0.5 * width x2 = center_w + 0.5 * width boxes_tf = np.concatenate((y1, x1, y2, x2), axis=-1) # (B,N,4) out1 = np.zeros((B, M, 4)) out2 = np.zeros((B, M, C)) out3 = -1 * np.ones((B, M)) out4 = np.zeros((B,)) for b in range(B): box_coord_matrix = boxes_tf[b, :, :] # (N,4) score_vector = np.max(scores[b, :, :], axis=-1) # (N,) if not per_class_suppression: # this is the simple case as TF directly supports it ids_g = tf.image.non_max_suppression( box_coord_matrix, score_vector, max_output_size=M, iou_threshold=iou_threshold, score_threshold=score_threshold, ) ids = ids_g.numpy() else: # this is slightly complicated as TF does not directly support it 
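# Sketch of the per-class suppression reference implemented below: sort all boxes by descending
# score, run tf.image.non_max_suppression separately on the boxes belonging to each class label,
# then map the surviving per-class indices back through the sort order so that the merged ids
# index into the original (unsorted) boxes.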
class_ids = np.argmax(scores[b, :, :], axis=-1) # (N,) sorted_score_ids = np.argsort(-score_vector) box_coord_matrix2 = np.take( box_coord_matrix, sorted_score_ids, axis=0 ) score_vector2 = np.take(score_vector, sorted_score_ids) class_ids = np.take(class_ids, sorted_score_ids) classes_seen = dict() ids_intermediate = np.array([], dtype=np.int32) for n in range(N): if class_ids[n] in classes_seen: continue c = class_ids[n] classes_seen[c] = True current_class_ids = np.where(class_ids == c)[0] if len(current_class_ids) > 0: feed_in1 = np.take( box_coord_matrix2, current_class_ids, axis=0 ) feed_in2 = np.take(score_vector2, current_class_ids) cur_ids_g = tf.image.non_max_suppression( feed_in1, feed_in2, max_output_size=M, iou_threshold=iou_threshold, score_threshold=score_threshold, ) cur_ids = cur_ids_g.numpy() from_sort_ids = np.take(current_class_ids, cur_ids) ids_intermediate = np.append( ids_intermediate, from_sort_ids ) ids_intermediate.sort() ids = np.take(sorted_score_ids, ids_intermediate) xx = len(ids) if xx == 0: ids = np.array([np.argmax(score_vector)]) xx = 1 if xx > M: ids = ids[:M] xx = len(ids) out1[b, :xx, :] = np.take(boxes[b, :, :], ids, axis=0) out2[b, :xx, :] = np.take(scores[b, :, :], ids, axis=0) out3[b, :xx] = ids out4[b] = xx return out1, out2, out3, out4 iou_threshold_percentile = [0, 30, 80, 100] score_threshold_percentile_arr = [0, 40, 100] N_M_pairs_to_test = [[100, 48], [100, 112]] # N : boxes in, M: max boxes out number_of_test = 0 for N_M in N_M_pairs_to_test: for B in [1]: # [1, 5] TODO Re-enable when rdar://60280745 is fixed for C in [1, 7]: N, M = N_M boxes = np.random.rand(B, N, 4) scores = np.random.rand(B, N, C) iou_matrix = _compute_iou_matrix(boxes[0, :, :]) # (N,N) iou_matrix = iou_matrix[ ~np.eye(iou_matrix.shape[0], dtype=bool) ].reshape(iou_matrix.shape[0], -1) for per_class_suppression in [False, True]: for iou_thresh in iou_threshold_percentile: for score_thresh in score_threshold_percentile_arr: for is_dynamic in [False, True]: if score_thresh == 0: score_threshold = np.min(scores) - 1 elif score_thresh == 100: score_threshold = np.max(scores) + 1 else: score_threshold = ( np.percentile(scores, score_thresh) + 0.01 ) if iou_thresh == 0: iou_threshold = np.maximum( np.min(iou_matrix) - 0.01, 0.0 ) else: iou_threshold = ( np.percentile(iou_matrix, iou_thresh) + 0.01 ) iou_threshold = np.maximum(iou_threshold, 1e-8) number_of_test += 1 tf_boxes, tf_scores, tf_ids, tf_num_boxes = _nms_TF( boxes, scores, iou_threshold, score_threshold, per_class_suppression, M, ) expected = dict() expected["selected_boxes"] = tf_boxes expected["selected_scores"] = tf_scores expected["selected_box_ids"] = tf_ids expected["number_of_boxes"] = tf_num_boxes # define CoreML model input_features = [ ("boxes", datatypes.Array(B, N, 4)), ("scores", datatypes.Array(B, N, C)), ] output_features = [ ("selected_boxes", None), ("selected_scores", None), ("selected_box_ids", None), ("number_of_boxes", None), ] input_names = ["boxes", "scores"] if is_dynamic: input_names.extend( [ "iou_threshold", "score_threshold", "max_boxes", ] ) input_features.append( ("iou_threshold", datatypes.Array(1,)) ) input_features.append( ("score_threshold", datatypes.Array(1,)) ) input_features.append( ("max_boxes", datatypes.Array(1,)) ) builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True, ) input_dict = dict() input_dict["boxes"] = boxes input_dict["scores"] = scores if is_dynamic: builder.add_nms( "nms", input_names, [ "selected_boxes", 
"selected_scores", "selected_box_ids", "number_of_boxes", ], per_class_suppression=per_class_suppression, ) input_dict[ "iou_threshold" ] = iou_threshold * np.ones([1], dtype=np.float32) input_dict["score_threshold"] = ( score_threshold * np.ones([1], dtype=np.float32) ) input_dict["max_boxes"] = M * np.ones( [1], dtype=np.float32 ) else: builder.add_nms( "nms", input_names, [ "selected_boxes", "selected_scores", "selected_box_ids", "number_of_boxes", ], iou_threshold=iou_threshold, score_threshold=score_threshold, max_boxes=M, per_class_suppression=per_class_suppression, ) self._test_model( builder.spec, input_dict, expected, useCPUOnly=cpu_only, ) def test_nms_gpu(self): self.test_nms_cpu(cpu_only=False) def test_rank_preserving_reshape(self): input_shapes = [(20, 10), (20, 10, 5), (10, 3, 5)] target_shapes = [(5, -1), (0, 2, 25), (25, 0, -1)] output_shapes = [(5, 40), (20, 2, 25), (25, 3, 2)] for i in range(len(input_shapes)): input_features = [("data", datatypes.Array(*input_shapes[i]))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_rank_preserving_reshape( name="rank_preserving_reshape", input_name="data", output_name="output", output_shape=target_shapes[i], ) x = np.random.rand(*input_shapes[i]) input = {"data": x} expected = {"output": np.reshape(x, output_shapes[i])} self._test_model(builder.spec, input, expected, useCPUOnly=True) self.assertEqual(len(output_shapes[i]), builder._get_rank("output")) def test_expand_dims(self): input_shapes = [(10, 5), (10, 5), (10, 5), (10, 5), (10,)] axes = [(0, 1), (0, 2), (2, 0), (-2, -1), (1, 0, -2)] output_shapes = [ (1, 1, 10, 5), (1, 10, 1, 5), (1, 10, 1, 5), (10, 5, 1, 1), (1, 1, 1, 10), ] for i in range(len(input_shapes)): input_features = [("data", datatypes.Array(*input_shapes[i]))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_expand_dims( name="expand_dims", input_name="data", output_name="output", axes=axes[i], ) x = np.random.rand(*input_shapes[i]) input = {"data": x} expected = {"output": np.reshape(x, output_shapes[i])} self._test_model(builder.spec, input, expected, useCPUOnly=True) self.assertEqual(len(output_shapes[i]), builder._get_rank("output")) def test_squeeze(self): input_shapes = [ (1, 1, 10, 5), (1, 10, 1, 5), (10, 5, 1, 1), (10, 5, 1, 1), (1,), (10, 5, 1, 1), (3, 1, 7), ] axes = [(0, 1), (0, 2), (-2, -1), (-1, -2), (0,), (3, -2), (1,)] output_shapes = [(10, 5), (10, 5), (10, 5), (10, 5), (1,), (10, 5), (3, 7)] for i in range(len(input_shapes)): input_features = [("data", datatypes.Array(*input_shapes[i]))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_squeeze( name="squeeze_layer", input_name="data", output_name="output", axes=list(axes[i]), ) x = np.random.rand(*input_shapes[i]) input = {"data": x} expected = {"output": np.reshape(x, output_shapes[i])} self._test_model(builder.spec, input, expected, useCPUOnly=True) self.assertEqual(len(output_shapes[i]), builder._get_rank("output")) def test_squeeze_all(self): input_shapes = [ (1, 1, 10, 5), (1, 10, 1, 5), (10, 5, 1, 1), (10, 5, 1, 1), (1,), (10, 5, 1, 1), (3, 1, 7), (3,), (5, 6), ] for input_shape in input_shapes: input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = 
neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_squeeze( name="squeeze_layer", input_name="data", output_name="output", squeeze_all=True, ) x = np.random.rand(*input_shape) input = {"data": x} reference = np.squeeze(x) if not reference.shape: reference = np.reshape(reference, (1,)) expected = {"output": reference} self._test_model(builder.spec, input, expected, useCPUOnly=True) self.assertEqual(-1, builder._get_rank("output")) def test_argmax_argmin(self): test_input_shapes = [(9,), (8, 6), (9, 8, 10), (5, 9, 7, 9), (12, 8, 6, 6, 7)] # (1+2+3+4+5) * 2^3 = 120 test cases for input_shape in test_input_shapes: for negative_axis in [False, True]: for mode in ["argmax", "argmin"]: for keep_dims in [True, False]: for axis in np.arange(len(input_shape)): if negative_axis: axis_val = axis - len(input_shape) else: axis_val = axis input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True, ) x = np.random.rand(*input_shape) if mode == "argmax": builder.add_argmax( "argmax", "data", "output", axis=axis_val, keepdims=keep_dims, ) np_out = np.argmax(x, axis=axis_val) else: builder.add_argmin( "argmin", "data", "output", axis=axis_val, keepdims=keep_dims, ) np_out = np.argmin(x, axis=axis_val) if keep_dims: np_out = np.expand_dims(np_out, axis=axis_val) elif len(input_shape) == 1: np_out = np.expand_dims(np_out, axis=axis_val) input = {"data": x} expected = {"output": np_out} test_case = "test_argmax_argmin_input_shape_{}_axis_{}_keep_dims_{}_numpy_out_shape_{}".format( x.shape, axis_val, keep_dims, np_out.shape ) self._test_model( builder.spec, input, expected, useCPUOnly=True ) if len(np_out.shape) != 0: self.assertEqual( len(np_out.shape), builder._get_rank("output") ) def test_get_shape(self): dims = [1, 2, 3, 4, 5] for rank in range(1, len(dims) + 1): input_shape = dims[:rank] input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_get_shape( name="get_shape_layer", input_name="data", output_name="output" ) feed = {"data": np.random.rand(*input_shape)} expected = {"output": np.array(input_shape)} self._test_model(builder.spec, feed, expected, useCPUOnly=True) self.assertEqual(1, builder._get_rank("output")) def test_load_constant_nd(self): dims = [2, 3, 4, 5, 6] for rank in range(1, len(dims) + 1): input_shape = dims[:rank] input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_load_constant_nd( "load_const_nd_layer", "tmp", constant_value=np.ones(input_shape), shape=input_shape, ) builder.add_elementwise("add_layer", ["data", "tmp"], "output", mode="ADD") feed = {"data": np.random.rand(*input_shape)} expected = {"output": feed["data"] + 1} self._test_model(builder.spec, feed, expected, useCPUOnly=True) self.assertEqual(rank, builder._get_rank("output")) def test_simple_array_alloc_scatter(self): alloc_shape = [2, 3, 4] value_shape = [1, 3, 4] input_features = [ ("alloc_shape", datatypes.Array(len(alloc_shape))), ("value", datatypes.Array(*value_shape)), ("index", datatypes.Array(1)), ] output_features = [("output", None)] builder = 
neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_fill_dynamic( name="fill_dynamic_layer", input_name="alloc_shape", output_name="array", value=np.float32(0.0), ) # CoreML input order: container (array), indices, slices (value) builder.add_scatter( name="scatter_layer", input_names=["array", "index", "value"], output_name="output", ) value = np.random.rand(*value_shape).astype("float") feed = { "alloc_shape": np.array(alloc_shape, dtype="float"), "value": value, "index": np.array([1], dtype="float"), } ref = np.zeros(alloc_shape) ref[1, :, :] = value expected = {"output": ref} self._test_model(builder.spec, feed, expected, useCPUOnly=True) def test_erf_activation_cpu(self, cpu_only=True): input_features = [("data", datatypes.Array(10, 45))] output_features = [("output", datatypes.Array(10, 45))] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_erf(name="erf", input_name="data", output_name="output") x = np.random.rand(10, 45) input = {"data": x} expected = { "output": np.asarray([math.erf(i) for i in x.flatten().tolist()]).reshape( 10, 45 ) } self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_erf_activation_gpu(self): self.test_erf_activation_cpu(cpu_only=False) def test_gelu_activation(self): for mode in ["EXACT", "TANH_APPROXIMATION", "SIGMOID_APPROXIMATION"]: for rank in range(1, 6): shape = np.random.randint(low=2, high=5, size=rank) input_features = [("data", datatypes.Array(*shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_gelu( name="gelu", input_name="data", output_name="output", mode=mode ) x = np.random.rand(*shape) input = {"data": x} exact = np.asarray( [ 0.5 * i * (1.0 + math.erf(i / math.sqrt(2))) for i in x.flatten().tolist() ] ).reshape(*shape) expected = {"output": exact} self._test_model(builder.spec, input, expected, useCPUOnly=True) def test_lower_triangular_cpu(self, cpu_only=True): for rank in range(2, 6): for k in range(-3, 4): shape = np.random.randint(low=2, high=6, size=rank) input_features = [("data", datatypes.Array(*shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_lower_triangular("tril", "data", "output", k=k) x = np.random.rand(*shape) input = {"data": x} expected = {"output": np.tril(x, k=k)} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_lower_triangular_gpu(self): self.test_lower_triangular_cpu(cpu_only=False) def test_upper_triangular_cpu(self, cpu_only=True): for rank in range(2, 6): for k in range(-3, 4): shape = np.random.randint(low=2, high=6, size=rank) input_features = [("data", datatypes.Array(*shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_upper_triangular("triu", "data", "output", k=k) x = np.random.rand(*shape) input = {"data": x} expected = {"output": np.triu(x, k=k)} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_upper_triangular_gpu(self): self.test_upper_triangular_cpu(cpu_only=False) def test_where_broadcastable_cpu(self, cpu_only=True): for _ in range(150): rank_cond = np.random.randint(low=1, high=6) rank_true = np.random.randint(low=1, 
high=6) rank_false = np.random.randint(low=1, high=6) rank_out = max(rank_cond, rank_true, rank_false) shape_cond = np.random.randint(low=2, high=8, size=rank_cond) shape_true = np.random.randint(low=2, high=8, size=rank_true) shape_false = np.random.randint(low=2, high=8, size=rank_false) for i in range(-1, -rank_out - 1, -1): dims = [] if -i <= rank_cond: dims.append(shape_cond[i]) if -i <= rank_true: dims.append(shape_true[i]) if -i <= rank_false: dims.append(shape_false[i]) dim = np.random.choice(dims) if -i <= rank_cond: shape_cond[i] = np.random.choice([1, dim]) if -i <= rank_true: shape_true[i] = np.random.choice([1, dim]) if -i <= rank_false: shape_false[i] = np.random.choice([1, dim]) input_features = [ ("cond", datatypes.Array(*shape_cond)), ("true", datatypes.Array(*shape_true)), ("false", datatypes.Array(*shape_false)), ] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_where_broadcastable( "if_broadcastable", input_names=["cond", "true", "false"], output_name="output", ) cond = np.random.choice([1.0, 0.0], size=shape_cond) true = np.random.rand(*shape_true) false = np.random.rand(*shape_false) input = {"cond": cond, "true": true, "false": false} expected = {"output": np.where(cond, true, false)} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) self.assertEqual(len(expected["output"].shape), builder._get_rank("output")) def test_where_broadcastable_gpu(self): self.test_where_broadcastable_cpu(cpu_only=False) @pytest.mark.slow def test_random_normal_like_cpu(self, cpu_only=True): mean, stddev, seed = 0.0, 1.0, 42 for rank in range(5, -1, -1): if rank > 0: low_factor = np.random.randint(low=2, high=4) low = int(np.power(1000, 1.0 / rank)) * low_factor high = int(np.power(2000, 1.0 / rank)) * np.random.randint( low=low_factor, high=4 ) shape = np.random.randint(low=low, high=high, size=rank) else: # one extra test to test more moments shape = np.array([10, 10, 10, 10, 10000]) input_features = [("tensor", datatypes.Array(*shape))] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_random_normal_like( name="random_normal_like", input_name="tensor", output_name="output", mean=mean, stddev=stddev, seed=seed, ) inputs = {"tensor": np.random.rand(*shape)} expected = {"output": np.random.normal(mean, stddev, shape)} if rank > 0: CorrectnessTest._compare_moments( builder.spec, inputs, expected, num_moments=2 ) self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) else: # one extra test to test more moments CorrectnessTest._compare_moments( builder.spec, inputs, expected, num_moments=6 ) @pytest.mark.slow def test_random_normal_like_gpu(self): self.test_random_normal_like_cpu(cpu_only=False) def test_random_normal_static_cpu(self, cpu_only=True): mean, stddev, seed = 0.0, 1.0, 42 for rank in range(1, 6): low_factor = np.random.randint(low=2, high=4) low = int(np.power(1000, 1.0 / rank)) * low_factor high = int(np.power(2000, 1.0 / rank)) * np.random.randint( low=low_factor, high=4 ) shape = np.random.randint(low=low, high=high, size=rank) input_features = [("data", datatypes.Array(*shape))] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_random_normal_static( name="random_normal_static", output_name="tmp", output_shape=list(shape), mean=mean, stddev=stddev, seed=seed, ) 
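# The random layer has no runtime input, so its output "tmp" is added to an all-zero "data"
# tensor below; this gives the network an input feature while leaving the values equal to the
# random draw. CoreML and numpy use different RNG streams, so correctness is judged by comparing
# the first two moments (mean, variance) against a fresh numpy draw rather than exact values.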
builder.add_elementwise("add_layer", ["data", "tmp"], "output", mode="ADD") data = np.zeros(shape) inputs = {"data": data} expected = {"output": data + np.random.normal(mean, stddev, shape)} CorrectnessTest._compare_moments( builder.spec, inputs, expected, num_moments=2 ) self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) self.assertEqual(rank, builder._get_rank("output")) def test_random_normal_static_gpu(self): self.test_random_normal_static_cpu(cpu_only=False) def test_random_normal_dynamic_cpu(self, cpu_only=True): mean, stddev, seed = 0.0, 1.0, 42 for rank in range(1, 6): low_factor = np.random.randint(low=2, high=4) low = int(np.power(1000, 1.0 / rank)) * low_factor high = int(np.power(2000, 1.0 / rank)) * np.random.randint( low=low_factor, high=4 ) shape = np.random.randint(low=low, high=high, size=rank) input_features = [("shape", datatypes.Array(len(shape)))] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_random_normal_dynamic( name="random_normal_dynamic", input_names=["shape"], output_name="output", mean=mean, stddev=stddev, seed=seed, ) inputs = {"shape": np.array(shape, np.float32)} expected = {"output": np.random.normal(mean, stddev, shape)} CorrectnessTest._compare_moments( builder.spec, inputs, expected, num_moments=2 ) self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) self.assertEqual(-1, builder._get_rank("output")) def test_random_normal_dynamic_gpu(self): self.test_random_normal_dynamic_cpu(cpu_only=False) def test_random_uniform_like_cpu(self, cpu_only=True): minval, maxval, seed = 0.0, 1.0, 42 for rank in range(1, 6): low_factor = np.random.randint(low=2, high=4) low = int(np.power(1000, 1.0 / rank)) * low_factor high = int(np.power(2000, 1.0 / rank)) * np.random.randint( low=low_factor, high=4 ) shape = np.random.randint(low=low, high=high, size=rank) input_features = [("tensor", datatypes.Array(*shape))] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_random_uniform_like( name="random_uniform_like", input_name="tensor", output_name="output", minval=minval, maxval=maxval, seed=seed, ) tensor = np.random.rand(*shape) inputs = {"tensor": tensor} expected = {"output": np.random.uniform(minval, maxval, shape)} CorrectnessTest._compare_moments(builder.spec, inputs, expected) self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) self.assertEqual(rank, builder._get_rank("output")) def test_random_uniform_like_gpu(self): self.test_random_uniform_like_cpu(cpu_only=False) def test_random_uniform_static_cpu(self, cpu_only=True): minval, maxval, seed = 0.0, 1.0, 42 for rank in range(1, 6): low_factor = np.random.randint(low=2, high=4) low = int(np.power(1000, 1.0 / rank)) * low_factor high = int(np.power(2000, 1.0 / rank)) * np.random.randint( low=low_factor, high=4 ) shape = np.random.randint(low=low, high=high, size=rank) input_features = [("data", datatypes.Array(*shape))] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_random_uniform_static( name="random_uniform_static", output_name="tmp", output_shape=list(shape), minval=minval, maxval=maxval, seed=seed, ) builder.add_elementwise("add_layer", ["data", "tmp"], "output", mode="ADD") data = np.zeros(shape) inputs = {"data": data} expected = {"output": data + np.random.uniform(minval, maxval, shape)} 
CorrectnessTest._compare_moments(builder.spec, inputs, expected) self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) self.assertEqual(rank, builder._get_rank("output")) def test_random_uniform_static_gpu(self): self.test_random_uniform_static_cpu(cpu_only=False) def test_random_uniform_dynamic_cpu(self, cpu_only=True): minval, maxval, seed = 0.0, 1.0, 42 for rank in range(1, 6): low_factor = np.random.randint(low=2, high=4) low = int(np.power(1000, 1.0 / rank)) * low_factor high = int(np.power(2000, 1.0 / rank)) * np.random.randint( low=low_factor, high=4 ) shape = np.random.randint(low=low, high=high, size=rank) input_features = [("shape", datatypes.Array(len(shape)))] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_random_uniform_dynamic( name="random_uniform_dynamic", input_names=["shape"], output_name="output", minval=minval, maxval=maxval, seed=seed, ) inputs = {"shape": np.array(shape, np.float32)} expected = {"output": np.random.uniform(minval, maxval, shape)} CorrectnessTest._compare_moments(builder.spec, inputs, expected) self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) self.assertEqual(-1, builder._get_rank("output")) def test_random_uniform_dynamic_gpu(self): self.test_random_uniform_dynamic_cpu(cpu_only=False) def test_random_bernoulli_like_cpu(self, cpu_only=True): prob, seed = 0.5, 42 for rank in range(1, 6): low_factor = np.random.randint(low=2, high=4) low = int(np.power(1000, 1.0 / rank)) * low_factor high = int(np.power(2000, 1.0 / rank)) * np.random.randint( low=low_factor, high=4 ) shape = np.random.randint(low=low, high=high, size=rank) input_features = [("tensor", datatypes.Array(*shape))] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_random_bernoulli_like( name="random_bernoulli_like", input_name="tensor", output_name="output", prob=prob, seed=seed, ) tensor = np.random.rand(*shape) inputs = {"tensor": tensor} expected = {"output": np.random.binomial(1, prob, shape)} CorrectnessTest._compare_moments(builder.spec, inputs, expected) self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) def test_random_bernoulli_like_gpu(self): self.test_random_bernoulli_like_cpu(cpu_only=False) def test_random_bernoulli_static_cpu(self, cpu_only=True): prob, seed = 0.5, 42 for rank in range(1, 6): low_factor = np.random.randint(low=2, high=4) low = int(np.power(1000, 1.0 / rank)) * low_factor high = int(np.power(2000, 1.0 / rank)) * np.random.randint( low=low_factor, high=4 ) shape = np.random.randint(low=low, high=high, size=rank) input_features = [("data", datatypes.Array(*shape))] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_random_bernoulli_static( name="random_bernoulli_static", output_name="tmp", output_shape=list(shape), prob=prob, seed=seed, ) builder.add_elementwise("add_layer", ["data", "tmp"], "output", mode="ADD") data = np.zeros(shape) inputs = {"data": data} expected = {"output": data + np.random.binomial(1, prob, shape)} CorrectnessTest._compare_moments(builder.spec, inputs, expected) self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) def test_random_bernoulli_static_gpu(self): self.test_random_bernoulli_static_cpu(cpu_only=False) def test_random_bernoulli_dynamic_cpu(self, cpu_only=True): prob, seed = 0.5, 42 for rank in range(1, 6): low_factor = 
np.random.randint(low=2, high=4) low = int(np.power(1000, 1.0 / rank)) * low_factor high = int(np.power(2000, 1.0 / rank)) * np.random.randint( low=low_factor, high=4 ) shape = np.random.randint(low=low, high=high, size=rank) input_features = [("shape", datatypes.Array(len(shape)))] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_random_bernoulli_dynamic( name="random_bernoulli_dynamic", input_names=["shape"], output_name="output", prob=prob, seed=seed, ) inputs = {"shape": np.array(shape, np.float32)} expected = {"output": np.random.binomial(1, prob, shape)} CorrectnessTest._compare_moments(builder.spec, inputs, expected) self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) def test_random_bernoulli_dynamic_gpu(self): self.test_random_bernoulli_dynamic_cpu(cpu_only=False) def test_categorical_distribution_cpu_shapes(self): for rank in range(1, 6): shape = np.random.randint(low=2, high=8, size=rank) num_samples = np.random.randint(low=10, high=1000) input_features = [("data", datatypes.Array(*shape))] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_categorical_distribution( name="categorical_distribution", input_name="data", output_name="output", num_samples=num_samples, ) x = np.random.randint(low=0, high=20, size=shape).astype(np.float32) inputs = {"data": x} shape[-1] = num_samples expected = {"output": np.random.rand(*shape)} self._test_model( builder.spec, inputs, expected, useCPUOnly=True, validate_shapes_only=True, ) @pytest.mark.xfail( reason="rdar://64153463 ([GitLab CI] test_categorical_distribution_cpu_probs failing)" ) def test_categorical_distribution_cpu_logits(self): def softmax(data): e_data = np.exp(data - np.max(data)) return e_data / e_data.sum() num_samples, num_class = 50000, 10 input_name, output_name = "data", "output" shapes = [ (2, num_class), (2, 1, num_class), (1, 2, num_class), (2, 1, 1, num_class), (1, 2, 1, num_class), (1, 1, 2, num_class), (2, 1, 1, 1, num_class), (1, 2, 1, 1, num_class), (1, 1, 2, 1, num_class), (1, 1, 1, 2, num_class), ] for shape in shapes: input_features = [("data", datatypes.Array(*shape))] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_categorical_distribution( name="categorical_distribution", input_name=input_name, output_name=output_name, num_samples=num_samples, is_logits=True, seed=42, ) x = np.random.rand(*shape) inputs = {input_name: x} model = builder.spec if isinstance(model, str): model = coremltools.models.MLModel(model) model = coremltools.models.MLModel(model) prediction = model.predict(inputs) # validate each distribution separately logits = x.reshape(2, num_class) probs = [softmax(logits[0]), softmax(logits[1])] ref0 = np.random.multinomial(num_samples, probs[0]) ref1 = np.random.multinomial(num_samples, probs[1]) pre0 = prediction[output_name].reshape(2, num_samples)[0] pre1 = prediction[output_name].reshape(2, num_samples)[1] expected = {output_name: np.stack((pre0, pre1))} # convert to bincount and validate probabilities pre0 = np.bincount(np.array(pre0).astype(np.int32), minlength=num_class) pre1 = np.bincount(np.array(pre1).astype(np.int32), minlength=num_class) np.testing.assert_allclose( np.true_divide(pre0, num_samples), probs[0], atol=1e-2 ) np.testing.assert_allclose( np.true_divide(pre0, num_samples), np.true_divide(ref0, num_samples), atol=1e-2, ) 
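# Same validation for the second batch row: the empirical class frequencies computed from the
# CoreML samples (via bincount) must match both the softmax probabilities and an independent
# numpy multinomial reference to within atol=1e-2.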
np.testing.assert_allclose( np.true_divide(pre1, num_samples), probs[1], atol=1e-2 ) np.testing.assert_allclose( np.true_divide(pre1, num_samples), np.true_divide(ref1, num_samples), atol=1e-2, ) self._test_model( model, inputs, expected, useCPUOnly=True, output_name_shape_dict={"output": prediction["output"].shape}, ) @pytest.mark.xfail( reason="rdar://64153463 ([GitLab CI] test_categorical_distribution_cpu_probs failing)" ) def test_categorical_distribution_cpu_probs(self): def softmax(data): e_data = np.exp(data - np.max(data)) return e_data / e_data.sum() num_samples, num_class = 50000, 10 input_name, output_name = "data", "output" shapes = [ (2, num_class), (2, 1, num_class), (1, 2, num_class), (2, 1, 1, num_class), (1, 2, 1, num_class), (1, 1, 2, num_class), (2, 1, 1, 1, num_class), (1, 2, 1, 1, num_class), (1, 1, 2, 1, num_class), (1, 1, 1, 2, num_class), ] for shape in shapes: input_features = [("data", datatypes.Array(*shape))] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_categorical_distribution( name="categorical_distribution", input_name=input_name, output_name=output_name, num_samples=num_samples, is_logits=False, seed=42, ) x = np.random.rand(*shape) probs = x.reshape(2, num_class) probs[0], probs[1] = softmax(probs[0]), softmax(probs[1]) inputs = {input_name: np.reshape(probs, shape)} model = builder.spec if isinstance(model, str): model = coremltools.models.MLModel(model) model = coremltools.models.MLModel(model, useCPUOnly=True) prediction = model.predict(inputs, useCPUOnly=True) # validate each distribution separately probs = probs.reshape(2, num_class) ref0 = np.random.multinomial(num_samples, probs[0]) ref1 = np.random.multinomial(num_samples, probs[1]) pre0 = prediction[output_name].reshape(2, num_samples)[0] pre1 = prediction[output_name].reshape(2, num_samples)[1] expected = {output_name: np.stack((pre0, pre1))} # convert to bincount and validate probabilities pre0 = np.bincount(np.array(pre0).astype(np.int32), minlength=num_class) pre1 = np.bincount(np.array(pre1).astype(np.int32), minlength=num_class) np.testing.assert_allclose( np.true_divide(pre0, num_samples), probs[0], atol=1e-2 ) np.testing.assert_allclose( np.true_divide(pre0, num_samples), np.true_divide(ref0, num_samples), atol=1e-2, ) np.testing.assert_allclose( np.true_divide(pre1, num_samples), probs[1], atol=1e-2 ) np.testing.assert_allclose( np.true_divide(pre1, num_samples), np.true_divide(ref1, num_samples), atol=1e-2, ) self._test_model( model, inputs, expected, useCPUOnly=True, output_name_shape_dict={"output": prediction["output"].shape}, ) def test_reverse_cpu(self, cpu_only=True): for rank in range(1, 6): for _ in range(20): input_shape = np.random.randint(low=2, high=8, size=rank) reverse_dim = [np.random.choice([True, False]) for _ in range(rank)] axes = [i for i in range(rank) if reverse_dim[i] == True] input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_reverse("reverse", "data", "output", reverse_dim) x = np.random.rand(*input_shape) input = {"data": x} expected = {"output": np.flip(x, axis=axes)} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_reverse_gpu(self): self.test_reverse_cpu(cpu_only=False) def test_matrix_band_part_cpu(self, cpu_only=True): for rank in range(2, 6): for _ in range(20): num_lower = 
np.random.randint(low=-7, high=8) num_upper = np.random.randint(low=-7, high=8) shape = np.random.randint(low=2, high=6, size=rank) input_features = [("data", datatypes.Array(*shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_matrix_band_part( "matrix_band_part", "data", "output", num_lower=num_lower, num_upper=num_upper, ) x = np.random.rand(*shape) input = {"data": x} rows, cols = shape[-2:] band = np.ones((rows, cols)) for m in range(rows): for n in range(cols): band[m, n] = (num_lower < 0 or (m - n) <= num_lower) and ( num_upper < 0 or (n - m) <= num_upper ) expected = {"output": np.multiply(band, x)} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_matrix_band_part_gpu(self): self.test_matrix_band_part_cpu(cpu_only=False) def test_flatten_to_2d_cpu(self, cpu_only=True): for rank in range(1, 6): for axis in range(-rank, rank + 1): shape = np.random.randint(low=2, high=6, size=rank) input_features = [("data", datatypes.Array(*shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_flatten_to_2d("flatten_to_2d", "data", "output", axis=axis) x = np.random.rand(*shape) np_axis = axis + rank if axis < 0 else axis pl, pr = 1, 1 for i in range(0, np_axis): pl *= shape[i] for i in range(np_axis, len(shape)): pr *= shape[i] new_shape = [pl, pr] ref = x.reshape(new_shape) input = {"data": x} expected = {"output": ref} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) self.assertEqual(2, builder._get_rank("output")) def test_flatten_to_2d_gpu(self): self.test_flatten_to_2d_cpu(cpu_only=False) def test_reshape_like_cpu(self, cpu_only=True): for rank in range(1, 6): for _ in range(20): input_shape = np.random.randint(low=2, high=8, size=rank) n = int(np.prod(input_shape)) divisors = [d for d in range(1, n) if n % d == 0] target_rank = np.random.randint(low=2, high=6) target_shape = [1] for i in range(target_rank - 1): dim_size = np.random.choice(divisors) while n % (np.prod(target_shape) * dim_size) != 0: dim_size = np.random.choice(divisors) target_shape.append(dim_size) target_shape[0] = n // np.prod(target_shape) np.random.shuffle(target_shape) input_features = [ ("data", datatypes.Array(*input_shape)), ("tensor", datatypes.Array(*target_shape)), ] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_reshape_like( name="reshape_like", input_names=["data", "tensor"], output_name="output", ) data = np.random.rand(*input_shape) tensor = np.random.rand(*target_shape) inputs = {"data": data, "tensor": tensor} expected = {"output": np.reshape(data, target_shape)} self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) self.assertEqual(target_rank, builder._get_rank("output")) def test_reshape_like_gpu(self): if platform.machine() == "arm64": pytest.xfail( "rdar://111942798 ([Regression][Bug] Reshape model got stuck while loading in M1 machine for non-cpu compute unit)" ) self.test_reshape_like_cpu(cpu_only=False) def test_reshape_static_cpu(self, cpu_only=True): for rank in range(1, 6): for _ in range(20): input_shape = np.random.randint(low=2, high=8, size=rank) n = int(np.prod(input_shape)) divisors = [d for d in range(1, n) if n % d == 0] target_rank = np.random.randint(low=2, high=6) target_shape = [1] for i in 
range(target_rank - 1): dim_size = np.random.choice(divisors) while n % (np.prod(target_shape) * dim_size) != 0: dim_size = np.random.choice(divisors) target_shape.append(dim_size) target_shape[0] = -1 np.random.shuffle(target_shape) input_features = [("data", datatypes.Array(*input_shape))] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_reshape_static( name="reshape_static", input_name="data", output_name="output", output_shape=target_shape, ) data = np.random.rand(*input_shape) inputs = {"data": data} expected = {"output": np.reshape(data, target_shape)} self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) self.assertEqual(len(target_shape), builder._get_rank("output")) def test_reshape_static_gpu(self): if platform.machine() == "arm64": pytest.xfail( "rdar://111942798 ([Regression][Bug] Reshape model got stuck while loading in M1 machine for non-cpu compute unit)" ) self.test_reshape_static_cpu(cpu_only=False) def test_reshape_dynamic_cpu(self, cpu_only=True): for rank in range(1, 6): for _ in range(20): input_shape = np.random.randint(low=2, high=8, size=rank) n = int(np.prod(input_shape)) divisors = [d for d in range(1, n) if n % d == 0] target_rank = np.random.randint(low=2, high=6) target_shape = [1] for i in range(target_rank - 1): dim_size = np.random.choice(divisors) while n % (np.prod(target_shape) * dim_size) != 0: dim_size = np.random.choice(divisors) target_shape.append(dim_size) target_shape[0] = -1 np.random.shuffle(target_shape) input_features = [ ("data", datatypes.Array(*input_shape)), ("shape", datatypes.Array(len(target_shape))), ] builder = neural_network.NeuralNetworkBuilder( input_features, [("output", None)], disable_rank5_shape_mapping=True ) builder.add_reshape_dynamic( name="reshape_dynamic", input_names=["data", "shape"], output_name="output", ) data = np.random.rand(*input_shape) inputs = {"data": data, "shape": np.array(target_shape, dtype="float")} expected = {"output": np.reshape(data, target_shape)} self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) self.assertEqual(-1, builder._get_rank("output")) def test_reshape_dynamic_gpu(self): self.test_reshape_dynamic_cpu(cpu_only=False) def test_reduce_sum_cpu(self, cpu_only=True): for rank in range(1, 6): axes_list = [ axes for length in range(1, rank + 1) for axes in itertools.combinations(range(rank), length) ] axes_list.append(None) for axes in axes_list: if axes: axes = tuple( [ axis if np.random.choice([True, False]) else axis - rank for axis in axes ] ) reduce_all = False else: reduce_all = True for keep_dims in [True, False]: input_shape = np.random.randint(low=2, high=5, size=rank) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True, ) builder.add_reduce_sum( "reduce", "data", "output", axes, keepdims=keep_dims, reduce_all=reduce_all, ) x = np.random.rand(*input_shape) input = {"data": x} expected = {"output": np.add.reduce(x, axes, keepdims=keep_dims)} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) expected_rank = len(expected["output"].shape) if expected_rank == 0: expected_rank = 1 self.assertEqual(expected_rank, builder._get_rank("output")) def test_reduce_sum_gpu(self): self.test_reduce_sum_cpu(cpu_only=False) def test_reduce_prod_cpu(self, cpu_only=True): for rank in range(1, 6): axes_list = [ axes for 
length in range(1, rank + 1) for axes in itertools.combinations(range(rank), length) ] axes_list.append(None) for axes in axes_list: if axes: axes = tuple( [ axis if np.random.choice([True, False]) else axis - rank for axis in axes ] ) reduce_all = False else: reduce_all = True for keep_dims in [True, False]: input_shape = np.random.randint(low=2, high=5, size=rank) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True, ) builder.add_reduce_prod( "reduce", "data", "output", axes, keepdims=keep_dims, reduce_all=reduce_all, ) x = np.random.rand(*input_shape) input = {"data": x} expected = { "output": np.multiply.reduce(x, axes, keepdims=keep_dims) } self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) expected_rank = len(expected["output"].shape) if expected_rank == 0: expected_rank = 1 self.assertEqual(expected_rank, builder._get_rank("output")) def test_reduce_prod_gpu(self): self.test_reduce_prod_cpu(cpu_only=False) def test_reduce_mean_cpu(self, cpu_only=True): for rank in range(1, 6): axes_list = [ axes for length in range(1, rank + 1) for axes in itertools.combinations(range(rank), length) ] axes_list.append(None) for axes in axes_list: if axes: axes = tuple( [ axis if np.random.choice([True, False]) else axis - rank for axis in axes ] ) reduce_all = False else: reduce_all = True for keep_dims in [True, False]: input_shape = np.random.randint(low=2, high=5, size=rank) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True, ) builder.add_reduce_mean( "reduce", "data", "output", axes, keepdims=keep_dims, reduce_all=reduce_all, ) x = np.random.rand(*input_shape) input = {"data": x} expected = {"output": np.mean(x, axes, keepdims=keep_dims)} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_reduce_mean_gpu(self): self.test_reduce_mean_cpu(cpu_only=False) def test_reduce_max_cpu(self, cpu_only=True): for rank in range(1, 6): axes_list = [ axes for length in range(1, rank + 1) for axes in itertools.combinations(range(rank), length) ] axes_list.append(None) for axes in axes_list: if axes: axes = tuple( [ axis if np.random.choice([True, False]) else axis - rank for axis in axes ] ) reduce_all = False else: reduce_all = True for keep_dims in [True, False]: input_shape = np.random.randint(low=2, high=5, size=rank) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True, ) builder.add_reduce_max( "reduce", "data", "output", axes, keepdims=keep_dims, reduce_all=reduce_all, ) x = np.random.rand(*input_shape) input = {"data": x} expected = { "output": np.maximum.reduce(x, axes, keepdims=keep_dims) } self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_reduce_max_gpu(self): self.test_reduce_max_cpu(cpu_only=False) def test_reduce_min_cpu(self, cpu_only=True): for rank in range(1, 6): axes_list = [ axes for length in range(1, rank + 1) for axes in itertools.combinations(range(rank), length) ] axes_list.append(None) for axes in axes_list: if axes: axes = tuple( [ axis if np.random.choice([True, False]) else axis - rank for axis in axes ] ) reduce_all = False else: reduce_all = True 
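# As in the other reduce tests, each axes selection (a mix of positive and negative indices, or
# None for reduce_all) is exercised with and without keepdims, using np.minimum.reduce as the reference.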
for keep_dims in [True, False]: input_shape = np.random.randint(low=2, high=5, size=rank) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True, ) builder.add_reduce_min( "reduce", "data", "output", axes, keepdims=keep_dims, reduce_all=reduce_all, ) x = np.random.rand(*input_shape) input = {"data": x} expected = { "output": np.minimum.reduce(x, axes, keepdims=keep_dims) } self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_reduce_min_gpu(self): self.test_reduce_min_cpu(cpu_only=False) def test_reduce_l2_cpu(self, cpu_only=True): for rank in range(1, 6): axes_list = [ axes for length in range(1, rank + 1) for axes in itertools.combinations(range(rank), length) ] axes_list.append(None) for axes in axes_list: if axes: axes = tuple( [ axis if np.random.choice([True, False]) else axis - rank for axis in axes ] ) reduce_all = False else: reduce_all = True for keep_dims in [True, False]: input_shape = np.random.randint(low=2, high=5, size=rank) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True, ) builder.add_reduce_l2( "reduce", "data", "output", axes, keepdims=keep_dims, reduce_all=reduce_all, ) x = np.random.rand(*input_shape) input = {"data": x} expected = { "output": np.sqrt( np.sum(np.square(x), axis=axes, keepdims=keep_dims) ) } self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_reduce_l2_gpu(self): self.test_reduce_l2_cpu(cpu_only=False) def test_reduce_l1_cpu(self, cpu_only=True): for rank in range(1, 6): axes_list = [ axes for length in range(1, rank + 1) for axes in itertools.combinations(range(rank), length) ] axes_list.append(None) for axes in axes_list: if axes: axes = tuple( [ axis if np.random.choice([True, False]) else axis - rank for axis in axes ] ) reduce_all = False else: reduce_all = True for keep_dims in [True, False]: input_shape = np.random.randint(low=2, high=5, size=rank) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True, ) builder.add_reduce_l1( "reduce", "data", "output", axes, keepdims=keep_dims, reduce_all=reduce_all, ) x = np.random.rand(*input_shape) input = {"data": x} expected = { "output": np.sum(np.abs(x), axis=axes, keepdims=keep_dims) } self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_reduce_l1_gpu(self): self.test_reduce_l1_cpu(cpu_only=False) def test_reduce_sumsquare_cpu(self, cpu_only=True): for rank in range(1, 6): axes_list = [ axes for length in range(1, rank + 1) for axes in itertools.combinations(range(rank), length) ] axes_list.append(None) for axes in axes_list: if axes: axes = tuple( [ axis if np.random.choice([True, False]) else axis - rank for axis in axes ] ) reduce_all = False else: reduce_all = True for keep_dims in [True, False]: input_shape = np.random.randint(low=2, high=5, size=rank) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True, ) builder.add_reduce_sumsquare( "reduce", "data", "output", axes, keepdims=keep_dims, 
reduce_all=reduce_all, ) x = np.random.rand(*input_shape) input = {"data": x} expected = { "output": np.sum(np.square(x), axis=axes, keepdims=keep_dims) } self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_reduce_sumsquare_gpu(self): self.test_reduce_sumsquare_cpu(cpu_only=False) def test_reduce_logsum_cpu(self, cpu_only=True): for rank in range(1, 6): axes_list = [ axes for length in range(1, rank + 1) for axes in itertools.combinations(range(rank), length) ] axes_list.append(None) for axes in axes_list: if axes: axes = tuple( [ axis if np.random.choice([True, False]) else axis - rank for axis in axes ] ) reduce_all = False else: reduce_all = True for keep_dims in [True, False]: input_shape = np.random.randint(low=2, high=5, size=rank) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True, ) builder.add_reduce_logsum( "reduce", "data", "output", axes, keepdims=keep_dims, reduce_all=reduce_all, ) x = np.random.rand(*input_shape) input = {"data": x} expected = { "output": np.log(np.sum(x, axis=axes, keepdims=keep_dims)) } self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_reduce_logsum_gpu(self): self.test_reduce_logsum_cpu(cpu_only=False) def test_reduce_logsumexp_cpu(self, cpu_only=True): for rank in range(1, 6): axes_list = [ axes for length in range(1, rank + 1) for axes in itertools.combinations(range(rank), length) ] axes_list.append(None) for axes in axes_list: if axes: axes = tuple( [ axis if np.random.choice([True, False]) else axis - rank for axis in axes ] ) reduce_all = False else: reduce_all = True for keep_dims in [True, False]: input_shape = np.random.randint(low=2, high=5, size=rank) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True, ) builder.add_reduce_logsumexp( "reduce", "data", "output", axes, keepdims=keep_dims, reduce_all=reduce_all, ) x = np.random.rand(*input_shape) input = {"data": x} expected = { "output": np.log( np.sum(np.exp(x), axis=axes, keepdims=keep_dims) ) } self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_reduce_logsumexp_gpu(self): self.test_reduce_logsumexp_cpu(cpu_only=False) @unittest.skipIf(not _HAS_TF_2, MSG_TF2_NOT_FOUND) def test_reverse_sequence_cpu(self, cpu_only=True): for rank in range(2, 6): for i in range(20): input_shape = np.random.randint(low=2, high=6, size=rank) seq_axis = np.random.randint(low=-rank, high=rank) batch_axis = np.random.randint(low=-rank, high=rank) pos_batch_axis = batch_axis if batch_axis >= 0 else rank + batch_axis pos_seq_axis = seq_axis if seq_axis >= 0 else rank + seq_axis while pos_batch_axis >= pos_seq_axis: seq_axis = np.random.randint(low=-rank, high=rank) batch_axis = np.random.randint(low=-rank, high=rank) pos_batch_axis = ( batch_axis if batch_axis >= 0 else rank + batch_axis ) pos_seq_axis = seq_axis if seq_axis >= 0 else rank + seq_axis input_features = [ ("data", datatypes.Array(*input_shape)), ("lengths", datatypes.Array(input_shape[batch_axis])), ] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_reverse_sequence( "reverse_sequence", ["data", "lengths"], "output", batch_axis=batch_axis, 
seq_axis=seq_axis, ) data = np.random.rand(*input_shape) lengths = np.random.randint( low=0, high=input_shape[seq_axis], size=input_shape[batch_axis] ) input = {"data": data, "lengths": lengths.astype(np.float32)} tf_op = tf.reverse_sequence( input=data, seq_lengths=lengths, seq_axis=pos_seq_axis, batch_axis=pos_batch_axis, ) expected = {"output": tf_op.numpy()} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_reverse_sequence_gpu(self): self.test_reverse_sequence_cpu(cpu_only=False) def test_where_nonzero_cpu(self, cpu_only=True): for rank in range(1, 6): for i in range(10): shape = np.random.randint(low=2, high=8, size=rank) input_features = [("data", datatypes.Array(*shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_where_nonzero("multi_indices", "data", "output") x = np.random.randint(low=0, high=3, size=shape) input = {"data": x.astype(np.float32)} expected = {"output": np.transpose(np.nonzero(x)).astype(np.float32)} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_where_nonzero_gpu(self): self.test_where_nonzero_cpu(cpu_only=False) def test_gather_cpu(self, cpu_only=True): for rankParams, rankIndices in [ (i, j) for i in range(1, 6) for j in range(1, 6) ]: for axis in range(-rankParams, rankParams): shapeParams = np.random.randint(low=2, high=5, size=rankParams) shapeIndices = np.random.randint(low=2, high=5, size=rankIndices) input_shapes = [shapeParams, shapeIndices] posAxis = axis if axis >= 0 else axis + rankParams output_shape = ( list(shapeParams[:posAxis]) + list(shapeIndices) + list(shapeParams[posAxis + 1 :]) ) if len(output_shape) > 5: continue input_names = ["params", "indices"] input_features = [ ("params", datatypes.Array(*input_shapes[0])), ("indices", datatypes.Array(*input_shapes[1])), ] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_gather( name="gather", input_names=input_names, output_name="output", axis=axis, ) a = np.random.rand(*input_shapes[0]) b = np.random.randint( -shapeParams[axis], shapeParams[axis], size=shapeIndices ) input = {"params": a, "indices": b.astype(np.float32)} expected = {"output": np.take(a, b, axis=axis)} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) self.assertEqual( len(expected["output"].shape), builder._get_rank("output") ) def test_gather_gpu(self): # This test can be stochastically failing, so we set the below seed: np.random.seed(0) pytest.xfail("rdar://124260627 ([CI] Two tests are random failing on CI)") self.test_gather_cpu(cpu_only=False) def test_gather_along_axis_cpu(self, cpu_only=True): for rank in range(1, 6): for axis in range(-rank, rank): for _ in range(5): params_shape = np.random.randint(low=2, high=8, size=rank) indices_shape = np.copy(params_shape) indices_shape[axis] = np.random.randint(low=1, high=8) input_features = [ ("params", datatypes.Array(*params_shape)), ("indices", datatypes.Array(*indices_shape)), ] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True, ) builder.add_gather_along_axis( "gather_along_axis", ["params", "indices"], "output", axis=axis ) a = np.random.rand(*params_shape) b = np.random.randint( -params_shape[axis], params_shape[axis], size=indices_shape ) input = {"params": a, 
"indices": b.astype(np.float32)} expected = {"output": np.take_along_axis(a, b, axis=axis)} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) self.assertEqual( len(expected["output"].shape), builder._get_rank("output") ) def test_gather_along_axis_gpu(self): self.test_gather_along_axis_cpu(cpu_only=False) @unittest.skipIf(not _HAS_TF_2, MSG_TF2_NOT_FOUND) def test_gather_nd_cpu(self, cpu_only=True): for params_rank, indices_rank in [ (i, j) for i in range(1, 6) for j in range(1, 6) ]: params_shape = np.random.randint(low=2, high=8, size=params_rank) indices_shape = np.random.randint(low=2, high=8, size=indices_rank) indices_shape[-1] = np.random.randint(low=1, high=params_rank + 1) for _ in range(5): input_features = [ ("params", datatypes.Array(*params_shape)), ("indices", datatypes.Array(*indices_shape)), ] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) output_shape = list(indices_shape[:-1]) + list( params_shape[indices_shape[-1] :] ) if len(output_shape) > 5: continue builder.add_gather_nd("gather_nd", ["params", "indices"], "output") a = np.random.rand(*params_shape) indices_list = [] for i in range(indices_shape[-1]): indices_list.append( np.random.randint(0, params_shape[i], size=indices_shape[:-1]) ) indices = np.stack(indices_list, axis=-1) input = {"params": a, "indices": indices.astype(np.float32)} tf_op = tf.gather_nd(a, indices) expected = {"output": tf_op.numpy()} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) self.assertEqual(-1, builder._get_rank("output")) def test_gather_nd_gpu(self): self.test_gather_nd_cpu(cpu_only=False) @unittest.skipIf(not _HAS_TF_2, MSG_TF2_NOT_FOUND) def test_scatter_cpu(self, cpu_only=True): for ref_rank, indices_rank in [ (i, j) for i in range(1, 6) for j in range(1, 6) ]: for accumulate_mode in ["UPDATE", "ADD", "SUB", "MUL", "DIV", "MAX", "MIN"]: for _ in range(5): ref_shape = np.random.randint(low=2, high=8, size=ref_rank) indices_shape = np.random.randint(low=2, high=8, size=indices_rank) updates_shape = list(indices_shape) + list(ref_shape[1:]) input_features = [ ("ref", datatypes.Array(*ref_shape)), ("indices", datatypes.Array(*indices_shape)), ("updates", datatypes.Array(*updates_shape)), ] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True, ) if len(updates_shape) > 5: continue builder.add_scatter( "scatter", ["ref", "indices", "updates"], "output", axis=0, mode=accumulate_mode, ) ref = np.random.rand(*ref_shape) updates = np.random.rand(*updates_shape) if accumulate_mode == "DIV": updates += 10.0 indices = np.random.randint(0, ref_shape[0], size=indices_shape) input = { "ref": ref, "indices": indices.astype(np.float32), "updates": updates, } tf_output = tf.Variable(ref) if accumulate_mode == "UPDATE": tf.compat.v1.scatter_update(tf_output, indices, updates) if accumulate_mode == "ADD": tf.compat.v1.scatter_add(tf_output, indices, updates) if accumulate_mode == "SUB": tf.compat.v1.scatter_sub(tf_output, indices, updates) if accumulate_mode == "MUL": tf.compat.v1.scatter_mul(tf_output, indices, updates) if accumulate_mode == "DIV": tf.compat.v1.scatter_div(tf_output, indices, updates) if accumulate_mode == "MAX": tf.compat.v1.scatter_max(tf_output, indices, updates) if accumulate_mode == "MIN": tf.compat.v1.scatter_min(tf_output, indices, updates) expected = {"output": tf_output.numpy()} 
self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_scatter_gpu(self): self.test_scatter_cpu(cpu_only=False) def test_gather_scatter_multiple_axis_cpu(self, cpu_only=True): for params_rank, indices_rank in [ (i, j) for i in range(1, 6) for j in range(1, 6) ]: for axis in range(-params_rank, params_rank): for _ in range(5): params_shape = np.random.randint(low=2, high=8, size=params_rank) indices_shape = np.random.randint(low=2, high=8, size=indices_rank) pos_axis = axis if axis >= 0 else axis + params_rank output_shape = ( list(params_shape[:pos_axis]) + list(indices_shape) + list(params_shape[pos_axis + 1 :]) ) if len(output_shape) > 5: continue input_features = [ ("params", datatypes.Array(*params_shape)), ("indices", datatypes.Array(*indices_shape)), ] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True, ) builder.add_gather( "gather", ["params", "indices"], "updates", axis=axis ) builder.add_scatter( "scatter", ["params", "indices", "updates"], "output", axis=axis, mode="UPDATE", ) a = np.random.rand(*params_shape) b = np.random.randint( -params_shape[axis], params_shape[axis], size=indices_shape ) input = {"params": a, "indices": b.astype(np.float32)} expected = {"output": a} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_gather_scatter_multiple_axis_gpu(self): self.test_gather_scatter_multiple_axis_cpu(cpu_only=False) def test_scatter_along_axis_cpu(self, cpu_only=True): for rank in range(1, 6): for axis in range(-rank, rank): for id in range(5): ref_shape = np.random.randint(low=2, high=8, size=rank) indices_shape = np.copy(ref_shape) indices_shape[axis] = np.random.randint(low=1, high=8) updates_shape = indices_shape input_features = [ ("ref", datatypes.Array(*ref_shape)), ("indices", datatypes.Array(*indices_shape)), ("updates", datatypes.Array(*updates_shape)), ] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True, ) builder.add_scatter_along_axis( "scatter_along_axis", ["ref", "indices", "updates"], "output", axis=axis, mode="UPDATE", ) ref = np.random.rand(*ref_shape) updates = np.random.rand(*updates_shape) indices = np.random.randint( -ref_shape[axis], ref_shape[axis], size=indices_shape ) input = { "ref": ref, "indices": indices.astype(np.float32), "updates": updates, } np_output = np.copy(ref) np.put_along_axis(np_output, indices, updates, axis=axis) expected = {"output": np_output} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_scatter_along_axis_gpu(self): self.test_scatter_along_axis_cpu(cpu_only=False) @unittest.skipIf(not _HAS_TF_2, MSG_TF2_NOT_FOUND) def test_scatter_nd_cpu(self, cpu_only=True): for ref_rank, indices_rank in [ (i, j) for i in range(1, 6) for j in range(2, 6) ]: ref_shape = np.random.randint(low=2, high=8, size=ref_rank) indices_shape = np.random.randint(low=2, high=8, size=indices_rank) indices_shape[-1] = np.random.randint(low=1, high=ref_rank + 1) for accumulate_mode in ["UPDATE", "ADD", "SUB"]: for id in range(20): updates_shape = list(indices_shape[:-1]) + list( ref_shape[indices_shape[-1] :] ) if len(updates_shape) > 5: continue input_features = [ ("ref", datatypes.Array(*ref_shape)), ("indices", datatypes.Array(*indices_shape)), ("updates", datatypes.Array(*updates_shape)), ] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( 
input_features, output_features, disable_rank5_shape_mapping=True, ) builder.add_scatter_nd( "scatter_nd", ["ref", "indices", "updates"], "output", mode=accumulate_mode, ) ref = np.random.rand(*ref_shape) updates = np.random.rand(*updates_shape) indices_list = [] for i in range(indices_shape[-1]): indices_list.append( np.random.randint(0, ref_shape[i], size=indices_shape[:-1]) ) indices = np.stack(indices_list, axis=-1) input = { "ref": ref, "indices": indices.astype(np.float32), "updates": updates, } tf_output = tf.Variable(ref) if accumulate_mode == "UPDATE": tf.compat.v1.scatter_nd_update(tf_output, indices, updates) if accumulate_mode == "ADD": tf.compat.v1.scatter_nd_add(tf_output, indices, updates) if accumulate_mode == "SUB": tf.compat.v1.scatter_nd_sub(tf_output, indices, updates) expected = {"output": tf_output.numpy()} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_scatter_nd_gpu(self): self.test_scatter_nd_cpu(cpu_only=False) def test_layer_normalization_cpu(self, cpu_only=True): def layer_norm_numpy(x, shapes, gamma_, beta_, eps=1e-5): axes = [-i - 1 for i, _ in enumerate(shapes)] num = x - np.mean(x, axis=tuple(axes), keepdims=True) dem = np.sqrt( np.sum(np.square(num), axis=tuple(axes), keepdims=True) / np.prod(shapes) + eps ) return num / dem * gamma_ + beta_ for rank in range(1, 6): input_shape = np.random.randint(low=2, high=6, size=rank) for axis in range(1, len(input_shape) + 1): norm_shapes = input_shape[-axis:] data = np.random.rand(*input_shape) gamma = np.random.rand(*norm_shapes) beta = np.random.rand(*norm_shapes) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_layer_normalization( name="layer_normalization", input_name="data", output_name="output", normalized_shape=norm_shapes, gamma=gamma, beta=beta, ) inputs = {"data": data} ref = layer_norm_numpy(data, norm_shapes, gamma, beta) expected = {"output": ref} self._test_model(builder.spec, inputs, expected, useCPUOnly=cpu_only) def test_layer_normalization_gpu(self): self.test_layer_normalization_cpu(cpu_only=False) def get_size_after_stride(X, params): start = params["start"] end = params["end"] stride = params["stride"] if params["axis"] == "width": axis = 2 if params["axis"] == "height": axis = 1 if params["axis"] == "channel": axis = 0 N = X.shape[axis] if end < 0: end = end + N end = min(end, N) if start > N - 1: L = 0 else: L = np.floor((end - 1 - start) / stride) + 1 if L < 0: L = 0 return L def get_numpy_predictions_slice(X, params): start = params["start"] end = params["end"] stride = params["stride"] if params["axis"] == "width": return X[:, :, start:end:stride] if params["axis"] == "height": return X[:, start:end:stride, :] if params["axis"] == "channel": return X[start:end:stride, :, :] def get_coreml_predictions_slice(X, params): coreml_preds = [] eval = True try: input_dim = X.shape output_dim = ( 1, 1, 1, ) # some random dimensions here: we are going to remove this information later input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", datatypes.Array(*output_dim))] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) builder.add_slice( "slice", "data", "output", start_index=params["start"], end_index=params["end"], stride=params["stride"], axis=params["axis"], ) # Remove output shape by deleting and adding an output del 
builder.spec.description.output[-1] output = builder.spec.description.output.add() output.name = "output" output.type.multiArrayType.dataType = coremltools.proto.FeatureTypes_pb2.ArrayFeatureType.ArrayDataType.Value( "DOUBLE" ) # save the model model_dir = tempfile.TemporaryDirectory() model_path = os.path.join(model_dir.name, "test_layer.mlmodel") coremltools.utils.save_spec(builder.spec, model_path) # prepare input and get predictions coreml_model = coremltools.models.MLModel(model_path) coreml_input = {"data": X} if _is_macos() and _macos_version() >= (10, 13): coreml_preds = coreml_model.predict(coreml_input)["output"] else: coreml_preds = None except RuntimeError as e: print(e) eval = False return coreml_preds, eval def get_numpy_predictions_reduce(X, params): if params["axis"] == "CHW": axis = (0, 1, 2) if params["axis"] == "HW": axis = (1, 2) if params["axis"] == "C": axis = 0 if params["axis"] == "H": axis = 1 if params["axis"] == "W": axis = 2 if params["mode"] == "sum": return np.sum(X, axis) if params["mode"] == "avg": return np.mean(X, axis) if params["mode"] == "prod": return np.prod(X, axis) if params["mode"] == "logsum": return np.sum(np.log(X + 1e-6), axis) if params["mode"] == "sumsquare": return np.sum(X ** 2, axis) if params["mode"] == "L2": return np.sqrt(np.sum(X ** 2, axis)) if params["mode"] == "L1": return np.sum(np.abs(X), axis) if params["mode"] == "max": return np.amax(X, axis) if params["mode"] == "min": return np.amin(X, axis) if params["mode"] == "argmax": return np.argmax(X, axis) def get_coreml_predictions_reduce(X, params): coreml_preds = [] eval = True try: input_dim = X.shape # some random dimensions here: we are going to remove this information later output_dim = (1, 1, 1) input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", datatypes.Array(*output_dim))] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) builder.add_reduce( "reduce", "data", "output", axis=params["axis"], mode=params["mode"] ) # Remove output shape by deleting and adding an output del builder.spec.description.output[-1] output = builder.spec.description.output.add() output.name = "output" output.type.multiArrayType.dataType = coremltools.proto.FeatureTypes_pb2.ArrayFeatureType.ArrayDataType.Value( "DOUBLE" ) # save the model model_dir = tempfile.TemporaryDirectory() model_path = os.path.join(model_dir.name, "test_layer.mlmodel") coremltools.utils.save_spec(builder.spec, model_path) # prepare input and get predictions coreml_model = coremltools.models.MLModel(model_path) coreml_input = {"data": X} if _is_macos() and _macos_version() >= (10, 13): coreml_preds = coreml_model.predict(coreml_input)["output"] else: coreml_preds = None except RuntimeError as e: print(e) eval = False return coreml_preds, eval @pytest.mark.slow class StressTest(CorrectnessTest): def test_slice_layer(self): params_dict = dict( input_shape=[[30, 100, 8], [80, 50, 5], [4, 12, 5], [56, 8, 14]], axis=["channel", "height", "width"], start=[0, 1, 2, 5], end=[5, 100, 56, -1, -2, -4], stride=[1, 2, 3], ) params = list(itertools.product(*params_dict.values())) all_candidates = [dict(zip(params_dict.keys(), x)) for x in params] valid_params = [] for pr in all_candidates: X = np.random.rand(*pr["input_shape"]) if get_size_after_stride(X, pr): valid_params.append(pr) print( "Total params to be tested: ", len(valid_params), "out of candidates: ", len(all_candidates), ) failed_tests_compile = [] failed_tests_shape = [] failed_tests_numerical = [] for i in 
range(len(valid_params)): params = valid_params[i] X = np.random.rand(*params["input_shape"]) np_preds = get_numpy_predictions_slice(X, params) coreml_preds, eval = get_coreml_predictions_slice(X, params) if eval is False: failed_tests_compile.append(params) elif coreml_preds is not None: if not self._compare_shapes(np_preds, coreml_preds): failed_tests_shape.append(params) elif not self._compare_predictions(np_preds, coreml_preds): failed_tests_numerical.append(params) self.assertEqual(failed_tests_compile, []) self.assertEqual(failed_tests_shape, []) self.assertEqual(failed_tests_numerical, []) @pytest.mark.xfail( reason="rdar://132109960 ([Bug] Corner case breaks stress test when random seed changes)" ) def test_reduce_layer(self): params_dict = dict( input_shape=[[3, 10, 8], [8, 5, 5], [4, 12, 10], [7, 1, 14]], mode=[ "sum", "avg", "prod", "sumsquare", "L1", "L2", "max", "min", "argmax", ], axis=["CHW", "HW", "C", "H", "W"], ) params = list(itertools.product(*params_dict.values())) all_candidates = [dict(zip(params_dict.keys(), x)) for x in params] valid_params = [] for pr in all_candidates: if pr["mode"] == "argmax": if pr["axis"] == "CHW" or pr["axis"] == "HW": continue valid_params.append(pr) print( "Total params to be tested: ", len(valid_params), "out of candidates: ", len(all_candidates), ) failed_tests_compile = [] failed_tests_shape = [] failed_tests_numerical = [] for i in range(len(valid_params)): params = valid_params[i] X = np.random.rand(*params["input_shape"]) np_preds = get_numpy_predictions_reduce(X, params) coreml_preds, eval = get_coreml_predictions_reduce(X, params) if eval is False: failed_tests_compile.append(params) elif coreml_preds is not None: if not self._compare_shapes(np_preds, coreml_preds): failed_tests_shape.append(params) elif not self._compare_predictions(np_preds, coreml_preds): failed_tests_numerical.append(params) self.assertEqual(failed_tests_compile, []) self.assertEqual(failed_tests_shape, []) self.assertEqual(failed_tests_numerical, []) @pytest.mark.slow @unittest.skipIf( not _is_macos() or _macos_version() < LAYERS_10_15_MACOS_VERSION, "macOS 10.15+ required. 
Skipping tests.", ) class CoreML3NetworkStressTest(CorrectnessTest): def test_dyn_weight_conv2d_stress(self): options = dict( padding=["valid"], filters=[1, 2, 4], kernel_size=[1, 3, 5], # square kernels strides=[1, 2], dilation_rate=[1], batch_size=[1, 64, 512], ) input_size = 64 input_channels = 64 input_dim = [1, input_channels, input_size, input_size] def conv_spatial_size(image_size, kernel_size, stride, dilation, padding): if padding == "valid": kernel_size_dilated = (kernel_size - 1) * dilation + 1 return (image_size - kernel_size_dilated) // stride + 1 elif padding == "same": return int(math.ceil(image_size * 1.0 / stride)) else: return 0 for x in itertools.product(*options.values()): kwargs = dict(zip(options.keys(), x)) if kwargs["strides"] > 1 and kwargs["dilation_rate"] > 1: continue # weight layout: (output_channels, kernel_channels, height, width) weight_dim = ( kwargs["filters"], input_channels, kwargs["kernel_size"], kwargs["kernel_size"], ) input_dim[0] = kwargs["batch_size"] input_features = [ ("input", datatypes.Array(*input_dim)), ("weight", datatypes.Array(*weight_dim)), ] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_convolution( name="two_input_conv_layer", kernel_channels=input_channels, output_channels=kwargs["filters"], height=kwargs["kernel_size"], width=kwargs["kernel_size"], stride_height=kwargs["strides"], stride_width=kwargs["strides"], border_mode=kwargs["padding"], groups=1, W=None, b=None, has_bias=False, dilation_rate=kwargs["dilation_rate"], input_name=["input", "weight"], output_name="output", ) # Assigning everything to ones should cover the execution path # and engine failures, but is not a complete check on numerics. 
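# With valid padding, no bias, and groups=1, each output element of an all-ones input convolved
# with all-ones weights is just kernel_size * kernel_size * input_channels, which gives the
# constant expected tensor computed below.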
out_spatial_size = conv_spatial_size( input_size, kwargs["kernel_size"], kwargs["strides"], kwargs["dilation_rate"], kwargs["padding"], ) input_val = np.ones(input_dim) weight_val = np.ones(weight_dim) output_dim = ( kwargs["batch_size"], kwargs["filters"], out_spatial_size, out_spatial_size, ) expected = np.ones(output_dim) * ( kwargs["kernel_size"] * kwargs["kernel_size"] * input_channels ) feed_dict = {"input": input_val, "weight": weight_val} expected = {"output": expected} self._test_model(builder.spec, feed_dict, expected) def test_static_weight_conv2d_stress(self): options = dict( padding=["valid"], filters=[1, 2, 5], kernel_size=[1, 3, 4], # square kernels strides=[1, 2], dilation_rate=[1, 2], batch_size=[1, 32, 512], ) input_size = 64 input_channels = 64 input_dim = [1, input_channels, input_size, input_size] def conv_spatial_size(image_size, kernel_size, stride, dilation, padding): if padding == "valid": kernel_size_dilated = (kernel_size - 1) * dilation + 1 return (image_size - kernel_size_dilated) // stride + 1 elif padding == "same": return int(math.ceil(image_size * 1.0 / stride)) else: return 0 for x in itertools.product(*options.values()): kwargs = dict(zip(options.keys(), x)) if kwargs["strides"] > 1 and kwargs["dilation_rate"] > 1: continue # weight layout: (output_channels, kernel_channels, height, width) weight_dim = ( kwargs["filters"], input_channels, kwargs["kernel_size"], kwargs["kernel_size"], ) input_dim[0] = kwargs["batch_size"] input_features = [("input", datatypes.Array(*input_dim))] # ('weight', datatypes.Array(*weight_dim))] output_features = [("output", None)] input_weight = np.ones(weight_dim) builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_convolution( name="two_input_conv_layer", kernel_channels=input_channels, output_channels=kwargs["filters"], height=kwargs["kernel_size"], width=kwargs["kernel_size"], stride_height=kwargs["strides"], stride_width=kwargs["strides"], border_mode=kwargs["padding"], groups=1, W=input_weight, b=None, has_bias=False, dilation_factors=[kwargs["dilation_rate"]] * 2, input_name=["input"], output_name="output", ) # Assigning everything to ones should cover the execution path # and engine failures, but is not a complete check on numerics. 
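# With valid padding the output spatial size is (input - k_dilated) // stride + 1, where
# k_dilated = (kernel_size - 1) * dilation + 1; conv_spatial_size above computes exactly this,
# so the dilation_rate=2 cases are covered by the same closed-form all-ones check.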
out_spatial_size = conv_spatial_size( input_size, kwargs["kernel_size"], kwargs["strides"], kwargs["dilation_rate"], kwargs["padding"], ) input_val = np.ones(input_dim) weight_val = np.ones(weight_dim) output_dim = ( kwargs["batch_size"], kwargs["filters"], out_spatial_size, out_spatial_size, ) expected = np.ones(output_dim) * ( kwargs["kernel_size"] * kwargs["kernel_size"] * input_channels ) feed_dict = {"input": input_val} # , 'weight': weight_val} expected = {"output": expected} self._test_model(builder.spec, feed_dict, expected) def test_power_iteration_cpu(self): convergence_tolerance = 1e-8 number_of_iterations = 200 input_features = [ ("matrix", datatypes.Array(*(2, 2))), ("starting_vector", datatypes.Array(*(2,))), ] output_features = [ ("maximum_eigen_value", datatypes.Array(*(1, 1))), ("eigen_vector", None), ("iteration_count", datatypes.Array(*(1,))), ] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_expand_dims("expand_dims", "starting_vector", "x", axes=[-1]) builder.add_load_constant_nd( "iteration_count", "iteration_count", constant_value=np.zeros((1,)), shape=(1,), ) loop_layer = builder.add_loop("loop", max_iterations=number_of_iterations) loop_body_builder = neural_network.NeuralNetworkBuilder( nn_spec=loop_layer.loop.bodyNetwork ) # output shape: (n,1) loop_body_builder.add_batched_mat_mul( "bmm.1", input_names=["matrix", "x"], output_name="y" ) loop_body_builder.add_reduce_l2( "reduce", input_name="y", output_name="norm", axes=[0] ) loop_body_builder.add_divide_broadcastable( "divide", ["y", "norm"], "y_normalized" ) # find diff: 1- abs(cosine) loop_body_builder.add_batched_mat_mul( "cosine", ["y_normalized", "x"], "cosine_diff", transpose_a=True ) loop_body_builder.add_squeeze( "squeeze_all", "cosine_diff", "cosine_diff_squeeze", squeeze_all=True ) loop_body_builder.add_unary( "abs_cosine", "cosine_diff_squeeze", "abs_cosine_diff", mode="abs" ) loop_body_builder.add_activation( "diff", non_linearity="LINEAR", input_name="abs_cosine_diff", output_name="diff", params=[-1, 1], ) # update iteration count loop_body_builder.add_activation( "iteration_count_add", non_linearity="LINEAR", input_name="iteration_count", output_name="iteration_count_plus_1", params=[1, 1], ) loop_body_builder.add_copy( "iteration_count_copy", "iteration_count_plus_1", "iteration_count" ) # update 'x' loop_body_builder.add_copy("update_x", "y_normalized", "x") # add condition to break from the loop, if convergence criterion is met loop_body_builder.add_less_than( "cond", ["diff"], "cond", alpha=convergence_tolerance ) branch_layer = loop_body_builder.add_branch("branch_layer", "cond") builder_ifbranch = neural_network.NeuralNetworkBuilder( nn_spec=branch_layer.branch.ifBranch ) builder_ifbranch.add_loop_break("break") # now we are out of the loop, compute the eigen value builder.add_batched_mat_mul( "bmm.2", input_names=["matrix", "x"], output_name="x_right" ) builder.add_batched_mat_mul( "bmm.3", input_names=["x", "x_right"], output_name="maximum_eigen_value", transpose_a=True, ) builder.add_squeeze("squeeze", "x", "eigen_vector", squeeze_all=True) # make input sizes flexible spec = builder.spec flexible_shape_utils.add_multiarray_ndshape_enumeration( spec, feature_name="matrix", enumerated_shapes=[(3, 3), (4, 4)] ) flexible_shape_utils.add_multiarray_ndshape_enumeration( spec, feature_name="starting_vector", enumerated_shapes=[(3,), (4,)] ) from numpy import linalg as LA # try on 3x3 matrix A = np.array([[2, -6, 8], 
[-6, 4, 5], [8, 5, 3]], dtype=np.float32) starting_vector = np.random.rand(3) starting_vector = starting_vector / np.sqrt(np.sum(starting_vector ** 2)) e, v = LA.eig(A) idx = np.argmax(abs(e)) input = {"starting_vector": starting_vector, "matrix": A.astype(np.float32)} expected = {"maximum_eigen_value": np.array([[e[idx]]])} self._test_model(spec, input, expected, useCPUOnly=True) # try on 2x2 matrix A = np.array([[4, -5], [-5, 3]], dtype=np.float32) starting_vector = np.random.rand(2) starting_vector = starting_vector / np.sqrt(np.sum(starting_vector ** 2)) e, v = LA.eig(A) idx = np.argmax(abs(e)) input = {"starting_vector": starting_vector, "matrix": A.astype(np.float32)} expected = {"maximum_eigen_value": np.array([[e[idx]]])} self._test_model(spec, input, expected, useCPUOnly=True) @unittest.skipIf( _macos_version() < LAYERS_11_0_MACOS_VERSION, "macOS 11.0+ required. Skipping tests.", ) class IOS14SingleLayerTests(CorrectnessTest): @unittest.skipIf(not _HAS_TF_2, MSG_TF2_NOT_FOUND) def test_onehot_layer_cpu(self, cpu_only=True): ctr = 0 params_dict = dict( input_rank=[1, 2, 3, 4], negative_axis=[True, False], depth=[30], on_value=[30.0], off_value=[-4.0], ) params = list(itertools.product(*params_dict.values())) for param in params: param = dict(zip(params_dict.keys(), param)) input_rank = param["input_rank"] vectorSize = param["depth"] on_value = param["on_value"] off_value = param["off_value"] for axis in range(input_rank + 1): ctr += 1 if param["negative_axis"]: axis_param = axis - (input_rank + 1) else: axis_param = axis input_shape = np.random.randint(1, 10, size=(input_rank,)) input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_one_hot( "one_hot", ["data"], "output", one_hot_vector_size=vectorSize, axis=axis_param, on_value=on_value, off_value=off_value, ) x = np.random.randint(0, vectorSize, size=input_shape) # x[::4] -= vectorSize # [To do] Need to Handle this case. 
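# x holds integer class indices in [0, depth); tf.one_hot below serves as the reference,
# filling the hot entry with on_value and every other entry with off_value.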
# TF seems to have a bug with axis < -1 if axis_param < -1: axis_param += input_rank + 1 tf_op = tf.one_hot( x, axis=axis_param, depth=vectorSize, on_value=on_value, off_value=off_value, ) expected = {"output": tf_op.numpy()} input = {"data": x.astype(np.float32)} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_batched_mat_mul_dynamic_quantization_cpu(self, cpu_only=True): X1 = 11 X2 = 23 W = np.random.rand(X1, X2) * 20 - 10 # uniform between [-10, 10] b = np.random.rand(X2) * 20 - 10 input_shapes = [ (X1,), (5, X1), (2, 3, X1), (4, 1, X1), ] # , (12, 5, 8, X1), (2, 3, 1, 5, X1)] W_max = max(np.abs(np.min(W)), np.abs(np.max(W))) W_normalized = W / W_max # [-1,1] W_quantized_int8 = 127.0 * W_normalized # [-127, 127] W_quantized_int8 = W_quantized_int8.astype(np.int8) quant_scale = W_max / 127.0 for input_shape in input_shapes: x = np.random.rand(*input_shape) * 10 input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] for has_bias in [True, False]: builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_batched_mat_mul( name="batched_mat_mul", input_names=["data"], output_name="output", weight_matrix_rows=X1, weight_matrix_columns=X2, int_8_dynamic_quantize=True, is_quantized_weight=True, quantization_type="linear", nbits=8, W=W_quantized_int8.tobytes(), bias=b if has_bias else None, quant_scale=np.array([quant_scale]), ) inputs = {"data": x} expected = { "output": np.matmul( x, W_quantized_int8.astype(np.float32) * quant_scale ) + (b if has_bias else np.zeros(X2)) } self._test_model( builder.spec, inputs, expected, useCPUOnly=cpu_only, test_metric="SNR", SNR=40, ) def test_batched_mat_mul_dynamic_quantization_gpu(self): self.test_batched_mat_mul_dynamic_quantization_cpu(cpu_only=False) def test_inner_product_dynamic_quantization_cpu(self, cpu_only=True): Xin = 24 Xout = 23 W = np.random.rand(Xout, Xin) b = np.random.rand(Xout) # For rank 4 and 5, the product of the last 3 dimensions must equal Xin input_shapes = [ (Xin,), (5, Xin), (2, 3, Xin), (4, 1, Xin), (5, 2, 3, 4), (5, 6, 2, 3, 4), ] W_max = max(np.abs(np.min(W)), np.abs(np.max(W))) W_normalized = W / W_max # [-1,1] W_quantized_int8 = 127.0 * W_normalized # [-127, 127] W_quantized_int8 = W_quantized_int8.astype(np.int8) quant_scale = W_max / 127.0 for input_shape in input_shapes: rank = len(input_shape) x = np.random.rand(*input_shape) * 5 W_for_numpy = W_quantized_int8.astype(np.float32) * quant_scale for has_bias in [True, False]: b = b if has_bias else np.zeros(Xout) if rank == 1 or rank == 2 or rank == 3: np_out = np.matmul(x, np.transpose(W_for_numpy)) + b expected = {"output": np_out} elif rank == 4: x_shaped = np.reshape(x, (x.shape[0], np.prod(x.shape[1:]))) np_out = np.matmul(x_shaped, np.transpose(W_for_numpy)) + b expected = {"output": np.reshape(np_out, np_out.shape + (1, 1))} elif rank == 5: x_shaped = np.reshape(x, x.shape[0:2] + (np.prod(x.shape[2:]),)) np_out = np.matmul(x_shaped, np.transpose(W_for_numpy)) + b expected = { "output": np.reshape( np_out, x.shape[0:2] + (np_out.shape[-1],) + (1, 1) ) } input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_inner_product( name="ip", W=W_quantized_int8.tobytes(), b=b if has_bias else None, input_channels=Xin, output_channels=Xout, has_bias=has_bias, input_name="data", 
output_name="output", int_8_dynamic_quantize=True, is_quantized_weight=True, quantization_type="linear", nbits=8, quant_scale=np.array([quant_scale]), ) inputs = {"data": x} self._test_model( builder.spec, inputs, expected, useCPUOnly=cpu_only, test_metric="SNR", SNR=40, ) def test_inner_product_dynamic_quantization_gpu(self): self.test_inner_product_dynamic_quantization_cpu(cpu_only=False) def test_onehot_layer_gpu(self): self.test_onehot_layer_cpu(cpu_only=False) @unittest.skipIf(not _HAS_TF_2, MSG_TF2_NOT_FOUND) def test_cumsum_layer_cpu(self, cpu_only=True): ctr = 0 params_dict = dict( rank=[1, 2, 3, 4, 5], exclusive=[False, True], reverse=[False, True], n_inputs=[1, 2], ) params = list(itertools.product(*params_dict.values())) for param in params: param = dict(zip(params_dict.keys(), param)) rank = param["rank"] exclusive = param["exclusive"] reverse = param["reverse"] n_inputs = param["n_inputs"] for axis in range(rank): ctr += 1 if np.random.rand(1) > 0.5: axis_param = axis else: axis_param = axis - rank input_shape = np.random.randint(1, 10, size=(rank,)) input_features = [("data", datatypes.Array(*input_shape))] if n_inputs == 2: input_features.append(("axis", datatypes.Array(1,))) output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) if n_inputs == 1: builder.add_cumsum( "cumsum", ["data"], "output", axis=axis_param, reverse=reverse, exclusive=exclusive, ) else: builder.add_cumsum( "cumsum", ["data", "axis"], "output", reverse=reverse, exclusive=exclusive, ) x = np.random.rand(*input_shape) tf_op = tf.cumsum( x, axis=axis_param, exclusive=exclusive, reverse=reverse ) expected = {"output": tf_op.numpy()} input = {"data": x} if n_inputs == 2: input["axis"] = axis_param * np.ones((1,), dtype=np.float32) self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_cumsum_layer_gpu(self): self.test_cumsum_layer_cpu(cpu_only=False) def test_clamped_relu_cpu(self, cpu_only=True): params_dict = dict(alpha=[0.0, 2.0, -3.0], beta=[7.0, -8.0]) params = list(itertools.product(*params_dict.values())) for param in params: param = dict(zip(params_dict.keys(), param)) alpha = param["alpha"] beta = param["beta"] input_shape = [40] input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_clamped_relu( "clamped_relu", "data", "output", alpha=alpha, beta=beta ) x = np.arange(-20, 20, dtype=np.float32) input = {"data": x} expected = {"output": np.minimum(beta, np.where(x >= 0, x, x * alpha))} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_clamped_relu_gpu(self): self.test_clamped_relu_cpu(cpu_only=False) def _test_pool3d(self, cpu_only): pool_types = ("MAX", "AVERAGE") # Defining shapes as (batch, channel, depth, height, width) shapes = ((1, 1, 1, 2, 2), (1, 1, 3, 3, 3), (3, 4, 10, 17, 90)) # Defining kernels and strides as (depth, height, width) kernels = ((2, 2, 2), (1, 3, 4), (2, 3, 4), (5, 1, 6), (8, 9, 1), (7, 11, 13)) strides = ((1, 1, 1), (1, 2, 3), (2, 3, 2), (4, 1, 2), (3, 4, 1), (7, 11, 13)) # Defining paddings as (left, right, top, bottom, front, back) # This is backwards from how we define shapes, kernels, and strides, # but it better matches pytorch, making the creation of pytorch layers # much easier. 
paddings = ( ("CUSTOM", (0, 0, 0, 0, 0, 0)), ("CUSTOM", (2, 2, 2, 2, 2, 2)), ("CUSTOM", (5, 6, 3, 4, 2, 2)), # VALID and SAME padding must have custom paddings unset or set to zero. ("VALID", (0, 0, 0, 0, 0, 0)), ("SAME", (0, 0, 0, 0, 0, 0)), ) # Structure to collect failures so # we can run all tests, even if one fails. # This should be able to go away when we can parameterize # our tests: Enable parameterized tests in test_numpy_nn_layers.py failures = [] num_successes = 0 num_skipped = 0 for pool_type in pool_types: for shape in shapes: for kernel in kernels: for stride in strides: for padding in paddings: for average_pooling_count_excludes_padding in (False, True): result = self._test_pool3d_single_case( cpu_only, pool_type, shape, kernel, stride, padding, average_pooling_count_excludes_padding, ) if type(result) is str: failures.append(result) elif result: num_successes += 1 else: num_skipped += 1 self.assertEqual( len(failures), 0, "Got %s successes, %s skipped, %s failures: %s" % (num_successes, num_skipped, len(failures), failures), ) def _test_pool3d_single_case( self, cpu_only, pool_type, shape, kernel, stride, padding, average_pooling_count_excludes_padding, ): """ Args: cpu_only: pool_type: shape: kernel: stride: padding: average_pooling_count_excludes_padding: Returns: True if success, False if skipped, Str if error """ test_case = ( "Test case:: pool_type: %s, shape: %s, kernel: %s, stride: %s, padding: %s, average_pooling_count_excludes_padding: %s" % ( pool_type, shape, kernel, stride, padding, average_pooling_count_excludes_padding, ) ) input_features = [("data", datatypes.Array(*shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) padding_mode = padding[0] padding_values = padding[1] builder.add_pooling3d( name="pooling3d", input_name="data", output_name="output", pooling_type=pool_type, kernel_depth=kernel[0], kernel_height=kernel[1], kernel_width=kernel[2], stride_depth=stride[0], stride_height=stride[1], stride_width=stride[2], padding_mode=padding_mode, custom_padding_front=padding_values[4], custom_padding_back=padding_values[5], custom_padding_top=padding_values[2], custom_padding_bottom=padding_values[3], custom_padding_left=padding_values[0], custom_padding_right=padding_values[1], average_pooling_count_excludes_padding=average_pooling_count_excludes_padding, ) # Expected output input = np.random.rand(*shape) torch_input = torch.from_numpy(np.reshape(input, shape)) # Padding if padding_mode == "CUSTOM": torch_padding = torch.nn.ConstantPad3d(padding_values, 0) elif padding_mode == "VALID": torch_padding = torch.nn.ConstantPad3d(0, 0) elif padding_mode == "SAME": padding_list = [] # torch.nn.ConstantPad3d wants (left, right, top, bottom, front, back) # but our shape, kernel, and stride are (depth, height, width). 
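# aggregated_pad returns the total SAME padding per (depth, height, width) axis; reversing it and
# splitting each total into floor/ceil halves produces the per-side values in the order
# ConstantPad3d expects.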
total_paddings = aggregated_pad( pad_type=padding_mode.lower(), kernel_shape=kernel, input_shape=shape[2:], strides=stride, ) total_paddings = list(total_paddings) total_paddings.reverse() for p in total_paddings: before = int(math.floor(float(p) / 2.0)) after = int(math.ceil(float(p) / 2.0)) padding_list.append(before) padding_list.append(after) torch_padding = torch.nn.ConstantPad3d(tuple(padding_list), 0) padding_values = padding_list[:] else: assert False # Validate output shape for i in range(3): try: IOS14SingleLayerTests._validate_pooling_dimension( shape[i + 2], kernel[i], stride[i], padding_values[6 - i - 2], padding_values[6 - i - 1], ) except ValueError: return False # Pooling type # Average pooling if pool_type == "AVERAGE": # torch.nn.AvgPool3d only accepts a single integer for padding, so we normally # create a pooling layer first which allows us to fully specify the # before and after padding in all three dimensions. # # However, when we use a padding layer, torch.nn.AvgPool3d doesn't # know what is padding and what isn't, which means that its # `count_include_pad` parameter has no effect. # # Therefore, we can only test average_pooling_count_excludes_padding=True # when padding is homogeneous. is_padding_homogeneous = all(p == padding_values[0] for p in padding_values) if average_pooling_count_excludes_padding: if not is_padding_homogeneous: return False else: # padding is homogeneous torch_model = torch.nn.AvgPool3d( kernel, stride=stride, padding=padding_values[0], count_include_pad=not average_pooling_count_excludes_padding, ) else: # average_pooling_count_excludes_padding == False torch_pool = torch.nn.AvgPool3d( kernel, stride=stride, count_include_pad=not average_pooling_count_excludes_padding, ) torch_model = torch.nn.Sequential(torch_padding, torch_pool) # Max pooling else: torch_pool = torch.nn.MaxPool3d(kernel, stride=stride) torch_model = torch.nn.Sequential(torch_padding, torch_pool) try: expected = torch_model(torch_input).numpy() self._test_model( builder.spec, {"data": input}, {"output": expected}, useCPUOnly=cpu_only ) return True except AssertionError as e: print(e) return "test_case: %s, error: %s" % (test_case, e) @staticmethod def _validate_pooling_dimension( input_size, kernel_size, stride, start_padding, end_padding ): # https://adeshpande3.github.io/A-Beginner%27s-Guide-To-Understanding-Convolutional-Neural-Networks-Part-2/ output_size = ( input_size + start_padding + end_padding - kernel_size ) / stride + 1 if output_size < 1: raise ValueError( "Dimension with input_size: %s, kernel_size: %s, stride: %s, start_padding: %s, end_padding: %s " "has output size of %s, but must be >= 1" % ( input_size, kernel_size, stride, start_padding, end_padding, output_size, ) ) if input_size < kernel_size: raise ValueError( "Dimension has input_size (%s) less than kernel_size (%s)" % (input_size, kernel_size) ) if (start_padding + end_padding) / 2 >= kernel_size / 2: raise ValueError( "The average of the start (%s) and end (%s) padding must be less than half the kernel size (%s / 2 = %s)" % (start_padding, end_padding, kernel_size, kernel_size / 2) ) def test_pool3d_cpu(self): self._test_pool3d(cpu_only=True) def test_pool3d_gpu(self): if platform.machine() == "arm64": pytest.xfail("rdar://98010495 (Some old nnv1 test are failing on M1 machine when running on ANE)") self._test_pool3d(cpu_only=False) def _test_global_pool3d(self, cpu_only): shapes = ((1, 1, 1, 2, 2), (1, 1, 3, 3, 3), (3, 4, 10, 17, 90)) pool_types = ("MAX", "AVERAGE") for shape in shapes: for 
pool_type in pool_types: test_case = "test_case:: shape: %s, pool_type: %s" % (shape, pool_type) print(test_case) input_features = [("data", datatypes.Array(*shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_global_pooling3d( name="pooling3d", input_name="data", output_name="output", pooling_type=pool_type, ) input = np.random.rand(*shape) # Expected output from Torch torch_input = torch.from_numpy(np.reshape(input, shape)) if pool_type == "AVERAGE": torch_pool = torch.nn.AvgPool3d(shape[-3:]) else: torch_pool = torch.nn.MaxPool3d(shape[-3:]) expected = torch_pool(torch_input).numpy() self._test_model( builder.spec, {"data": input}, {"output": expected}, useCPUOnly=cpu_only, ) def test_global_pool3d_cpu(self): self._test_global_pool3d(cpu_only=True) def test_global_pool3d_gpu(self): self._test_global_pool3d(cpu_only=False) def test_argsort_cpu(self, cpu_only=True): shapes = [(4,), (3, 4), (2, 5, 6), (3, 5, 2, 4), (4, 5, 3, 6, 7)] for shape in shapes: for descending in [False, True]: for axis in range(len(shape)): input_features = [("data", datatypes.Array(*shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True, ) builder.add_argsort( "argsort", "data", "output", axis=axis, descending=descending ) x = np.random.rand(*shape) if descending: expected = {"output": np.argsort(-x, axis)} else: expected = {"output": np.argsort(x, axis)} input = {"data": x} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_argsort_gpu(self): self.test_argsort_cpu(cpu_only=False) def test_upsample_pytorch_cpu(self): self.upsample_pytorch_test_iter(np.arange(1, 4), True) self.upsample_pytorch_test_iter(np.arange(1.0, 3.0, 0.66), True) def test_upsample_pytorch_gpu(self): if platform.machine() == "arm64": pytest.xfail("rdar://98010495 (Some old nnv1 test are failing on M1 machine when running on ANE)") self.upsample_pytorch_test_iter(np.arange(1, 4), False) self.upsample_pytorch_test_iter(np.arange(1.0, 3.0, 0.66), False) def upsample_pytorch_test_iter(self, scale_range, cpu_only): for align_corners in [False, True]: for scale_h in scale_range: for scale_w in scale_range: for input_h in range(2, 6): for input_w in range(2, 6): self.upsample_pytorch_test( input_h, input_w, scale_h, scale_w, align_corners, cpu_only, ) def upsample_pytorch_test(self, h, w, scale_h, scale_w, align_corners, cpu_only): input_dim = (1, 1, h, w) if align_corners: linear_upsample_mode = "ALIGN_CORNERS_TRUE" else: linear_upsample_mode = "ALIGN_CORNERS_FALSE" input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_upsample( name="upsample", scaling_factor_h=scale_h, scaling_factor_w=scale_w, linear_upsample_mode=linear_upsample_mode, input_name="data", output_name="output", mode="BILINEAR", ) input_tensor = np.reshape(np.arange(1.0, 1.0 + (h * w), 1.0), input_dim) input = {"data": input_tensor} # Get result from PyTorch x = torch.from_numpy(np.reshape(input_tensor, (1, 1, h, w))) pytorch_output = torch.nn.functional.interpolate( x, scale_factor=(scale_h, scale_w), mode="bilinear", align_corners=align_corners, recompute_scale_factor=True, ) # Expect PyTorch output matches CoreML output expected = {"output": pytorch_output.numpy()} 
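# The CoreML BILINEAR upsample in ALIGN_CORNERS_TRUE / ALIGN_CORNERS_FALSE mode
# is expected to match torch.nn.functional.interpolate with the corresponding
# align_corners setting; _test_model below checks that numerically, and the
# rank assertion confirms the builder tracks a rank-4 output for the
# (1, 1, h, w) input.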
self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) self.assertEqual(len(input_dim), builder._get_rank("output")) def test_slice_by_size_cpu(self, cpu_only=True): shapes = [(4,), (3, 4), (2, 5, 6), (3, 5, 2, 4), (4, 5, 3, 6, 7)] for shape in shapes: for axis in range(len(shape)): begin = np.random.randint(shape[axis]) begin_input = np.array([begin]).astype(np.float32) size = np.random.randint(shape[axis] - begin) + 1 x = np.random.rand(*shape) slices = [] for i in range(len(shape)): if i != axis: slices.append(slice(None, None, None)) else: slices.append(slice(begin, begin + size, 1)) slices = tuple(slices) expected = {"output": x[slices]} input_features = [ ("data", datatypes.Array(*shape)), ("begin", datatypes.Array(1)), ] output_features = [("output", datatypes.Array(*x[slices].shape))] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_slice_by_size( "slice_by_size", ["data", "begin"], "output", axis=axis, size=size ) input = {"data": x, "begin": begin_input} self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def _test_conv3d(self, cpu_only, full_test): # Input shape defined by us and PyTorch as [batch, channels, depth, height, width] input_shapes = [ [1, 3, 3, 8, 8], [1, 1, 3, 8, 8], [1, 7, 8, 15, 63], [4, 32, 8, 16, 16], ] # Large enough kernels and/or input causes int overflow and seg fault: see rdar://60309763 kernels = [[3, 3, 3], [2, 2, 2]] strides = [[1, 1, 1], [2, 2, 2]] dilations = [[1, 1, 1], [2, 2, 2]] has_biases = [True, False] # Note: PyTorch's `torch.nn.Conv3d` doesn't support these padding modes, just a single # padding value (for all dimensions) or 3 values (for each dimension) padding_modes = ["custom", "valid", "same"] # Padding shape is front, back, top, bottom, left, right paddings = [[0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1]] # Add some additional test cases if `full_test` is True if full_test: input_shapes.extend([[1, 4, 3, 128, 128]]) kernels.extend([[1, 2, 3], [5, 5, 5]]) strides.extend([[1, 2, 3]]) dilations.extend([[1, 2, 3]]) paddings.extend([[2, 0, 2, 0, 2, 0], [0, 1, 2, 3, 4, 5]]) test_case_format_str = ( "Conv3d test case | Input shape: {}, Output channels: {}, Groups: {}, Kernel shape: {}," " Stride: {}, Padding: {}, Padding mode: {}, Dilation: {}, Has bias: {}" ) for in_shape in input_shapes: # Test "normal" and depthwise convolution with corresponding groups and output channels groups_outchannels = [(1, 2), (in_shape[1], 2 * in_shape[1])] for kernel in kernels: for has_bias in has_biases: for stride in strides: for dilation in dilations: for padding_mode in padding_modes: # For all modes besides 'custom', the padding values are ignored if padding_mode == "custom": loop_paddings = paddings else: loop_paddings = [[0, 0, 0, 0, 0, 0]] for padding in loop_paddings: for groups, output_channels in groups_outchannels: # Dilated kernel shape = (K - 1) * D + 1 dilated_kernel = list( map( lambda k, d: (k - 1) * d + 1, kernel, dilation, ) ) # Use paddings if padding_mode is "custom", else compute # them according to # https://stanford.edu/~shervine/teaching/cs-230/cheatsheet-convolutional-neural-networks#filter if padding_mode == "same": pad_d = max( 0, ( stride[0] * math.ceil( in_shape[2] / float(stride[0]) ) - in_shape[2] + dilated_kernel[0] - stride[0] ) / 2.0, ) pad_h = max( 0, ( stride[1] * math.ceil( in_shape[3] / float(stride[1]) ) - in_shape[3] + dilated_kernel[1] - stride[1] ) / 2.0, ) pad_w = max( 0, ( stride[2] * math.ceil( in_shape[4] / 
float(stride[2]) ) - in_shape[4] + dilated_kernel[2] - stride[2] ) / 2.0, ) # Depth padding[0] = int(math.floor(pad_d)) padding[1] = int(math.ceil(pad_d)) # Height padding[2] = int(math.floor(pad_h)) padding[3] = int(math.ceil(pad_h)) # Width padding[4] = int(math.floor(pad_w)) padding[5] = int(math.ceil(pad_w)) elif padding_mode == "valid": # Set to zero for PyTorch padding padding = [0] * 6 elif padding_mode == "custom": # No-op: valid ignores padding and custom uses the # specified padding pass input_features = [ ("data", datatypes.Array(*in_shape)) ] output_features = [("output", None)] input_channels = in_shape[1] # [output_channels, kernel_channels, depth, height, width] weights_shape = [ output_channels, int(input_channels / groups), kernel[0], kernel[1], kernel[2], ] # Init random input input_tensor = np.random.normal(size=in_shape) input_torch = torch.tensor(input_tensor) # Init random weights weights_tensor = np.random.normal( size=weights_shape ) weights_torch = torch.DoubleTensor( weights_tensor ) # Init random bias if applicable if has_bias: bias_tensor = np.random.normal( size=output_channels ) bias_torch = torch.DoubleTensor(bias_tensor) else: bias_tensor = None bias_torch = None builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True, ) builder.add_convolution3d( name="conv3d", input_channels=input_channels, output_channels=output_channels, depth=kernel[0], height=kernel[1], width=kernel[2], W=weights_tensor, b=bias_tensor, has_bias=has_bias, groups=groups, stride_depth=stride[0], stride_height=stride[1], stride_width=stride[2], dilation_depth=dilation[0], dilation_height=dilation[1], dilation_width=dilation[2], padding_mode=padding_mode, padding_front=padding[0], padding_back=padding[1], padding_top=padding[2], padding_bottom=padding[3], padding_left=padding[4], padding_right=padding[5], input_name="data", output_name="output", ) # Get PyTorch output to compare ours to # First pad, since PyTorch Conv3d only supports custom and # same symmetric padding. Padding shape is # (left, right, top, bottom, front, back) padded_input = input_torch if any(p > 0 for p in padding): torch_padding = ( padding[4], padding[5], padding[2], padding[3], padding[0], padding[1], ) pad_layer = torch.nn.ConstantPad3d( torch_padding, 0 ) padded_input = pad_layer(input_torch) # Check if dilated kernel size exceeds padded input size in # any dimension. If it does, it's not a valid convolution if ( dilated_kernel[0] > padded_input.shape[2] or dilated_kernel[1] > padded_input.shape[3] or dilated_kernel[2] > padded_input.shape[4] ): print( "SKIPPING: Dilated kernel exceeds padded input." 
) continue # Using Sequential with a padding layer first produces # incorrect convolution output model = torch.nn.Sequential( torch.nn.Conv3d( input_channels, output_channels, kernel, stride=stride, padding=0, dilation=dilation, groups=groups, bias=False, ) ) with torch.no_grad(): model[0].weight = torch.nn.Parameter( weights_torch ) if has_bias: model[0].bias = torch.nn.Parameter( bias_torch ) torch_expected = model(padded_input) test_case = test_case_format_str.format( in_shape, output_channels, groups, weights_shape, stride, padding, padding_mode, dilation, has_bias, ) try: self._test_model( builder.spec, {"data": input_tensor}, { "output": torch_expected.detach().numpy() }, useCPUOnly=cpu_only, test_metric="SNR", SNR=40, validate_shapes_only=False, ) except AssertionError as e: print(test_case) raise def test_conv3d_cpu_basic(self): self._test_conv3d(cpu_only=True, full_test=False) @pytest.mark.slow def test_conv3d_cpu_slow(self): self._test_conv3d(cpu_only=True, full_test=True) def test_conv3d_gpu_basic(self): self._test_conv3d(cpu_only=False, full_test=False) @pytest.mark.slow def test_conv3d_gpu_slow(self): self._test_conv3d(cpu_only=False, full_test=True) @unittest.skipUnless( _is_macos() and _macos_version() >= LAYERS_11_0_MACOS_VERSION, "Only supported on macOS 10.16+", ) class TestReorganizeDataTests(CorrectnessTest): def _to_rank_4(self, x): from_rank = len(x.shape) if from_rank == 3: return np.reshape(x, [1] + list(x.shape)) elif from_rank == 4: return x elif from_rank == 5: return np.squeeze(x, axis=0) def _from_rank_4(self, x, to_rank): if to_rank == 3: return np.squeeze(x, axis=0) elif to_rank == 4: return x elif to_rank == 5: return np.reshape(x, [1] + list(x.shape)) @unittest.skipIf(not _HAS_TF_2, MSG_TF2_NOT_FOUND) def test_depth_to_space_cpu(self, cpu_only=True): params_dict = { "block_size": [2, 3, 4], "channels_div_bsq": [1, 2, 3, 7], "spatial": [[2, 3], [4, 4], [1, 1]], "batch_size": [None, 1, 2], "seq_length": [None, 1], } params_product = list(itertools.product(*params_dict.values())) for param in params_product: param = dict(zip(params_dict.keys(), param)) # Create input based on params block_size = param["block_size"] bsq = block_size * block_size input_shape = [bsq * param["channels_div_bsq"]] + param["spatial"] if param["batch_size"] is not None: input_shape = [param["batch_size"]] + input_shape if param["seq_length"] is not None: input_shape = [param["seq_length"]] + input_shape rank = len(input_shape) x = np.random.random(input_shape) input = {"data": x} # Set up network input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_reorganize_data( "reorganize_data", "data", "output", mode="DEPTH_TO_SPACE", block_size=block_size, ) # Run tensorflow to calculate expected values # TensorFlow requires rank 4, NHWC order on CPU x_tf = self._to_rank_4(x).transpose(0, 2, 3, 1) out_tf = tf.nn.depth_to_space(x_tf, block_size, data_format="NHWC").numpy() out = self._from_rank_4(out_tf.transpose(0, 3, 1, 2), to_rank=rank) expected = {"output": out} # Run model to calculate CoreML values and compare with expected self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) def test_depth_to_space_gpu(self): self.test_depth_to_space_cpu(cpu_only=False) @unittest.skipIf( _macos_version() < LAYERS_11_0_MACOS_VERSION, "macOS 11.0+ required. 
Skipping tests.", ) def test_pixel_shuffle_cpu(self, cpu_only=True): params_dict = { "block_size": [2, 3, 4], "channels_div_bsq": [1, 2, 3, 7], "spatial": [[2, 3], [4, 4], [1, 1]], "batch_size": [None, 1, 2], "seq_length": [None, 1], } params_product = list(itertools.product(*params_dict.values())) for param in params_product: param = dict(zip(params_dict.keys(), param)) # Create input based on params block_size = param["block_size"] bsq = block_size * block_size input_shape = [bsq * param["channels_div_bsq"]] + param["spatial"] if param["batch_size"] is not None: input_shape = [param["batch_size"]] + input_shape if param["seq_length"] is not None: input_shape = [param["seq_length"]] + input_shape rank = len(input_shape) x = np.random.random(input_shape) input = {"data": x} # Set up network input_features = [("data", datatypes.Array(*input_shape))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_reorganize_data( "reorganize_data", "data", "output", mode="PIXEL_SHUFFLE", block_size=block_size, ) # Run pytorch to calculate expected values x_torch = torch.from_numpy(self._to_rank_4(x)) out_torch = torch.pixel_shuffle(x_torch, upscale_factor=block_size) out = self._from_rank_4(out_torch.numpy(), to_rank=rank) expected = {"output": out} # Run model to calculate CoreML values and compare with expected self._test_model(builder.spec, input, expected, useCPUOnly=cpu_only) @unittest.skipIf( _macos_version() < LAYERS_11_0_MACOS_VERSION, "macOS 10.16+ required. Skipping tests.", ) def test_pixel_shuffle_gpu(self): self.test_pixel_shuffle_cpu(cpu_only=False) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/neural_network/test_quantization.py0000644000000000000000000005556414672066616025235 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause """ Module containing unit tests for verifying various quantizations. """ import itertools import unittest import numpy as np import pytest import coremltools import coremltools.models.datatypes as datatypes from coremltools import ComputeUnit from coremltools.models import (_QUANTIZATION_MODE_LINEAR_QUANTIZATION, neural_network) from coremltools.models.neural_network import quantization_utils from coremltools.models.neural_network.quantization_utils import ( MatrixMultiplyLayerSelector, _quantize_spec_weights, activate_int8_int8_matrix_multiplications) @unittest.skipIf( not coremltools.utils._is_macos() or coremltools.utils._macos_version() < (10, 16), "Missing macOS 10.16+. 
Skipping tests.", ) class DynamicQuantizedInt8Int8MatMul(unittest.TestCase): """ Quantization tests for dynamic Int8 - Int8 matrix multiplications """ def initialize(self): np.random.seed(1988) self.Cout, self.Cin = 16, 32 self.W = np.random.rand(self.Cout, self.Cin) * 20.0 - 10.0 self.b = np.random.rand(self.Cout) * 20.0 - 10.0 self.input_shape = (5, self.Cin) input_features = [("data", datatypes.Array(*self.input_shape))] output_features = [("output", None)] self.builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) self.selector = MatrixMultiplyLayerSelector() def _test_predictions( self, np_preds, coreml_preds, SNR=30, PSNR=40, ): np_preds = np_preds.flatten() coreml_preds = coreml_preds.flatten() noise = np_preds - coreml_preds noise_var = np.sum(noise ** 2) / len(noise) + 1e-7 signal_energy = np.sum(np_preds ** 2) / len(np_preds) max_signal_energy = np.amax(np_preds ** 2) snr = 10 * np.log10(signal_energy / noise_var) psnr = 10 * np.log10(max_signal_energy / noise_var) self.assertGreaterEqual(snr, SNR) self.assertGreaterEqual(psnr, PSNR) def compare(self, specification_modified=True): x = np.random.rand(*self.input_shape) def _get_preds(spec): mlmodel = coremltools.models.MLModel(spec, compute_units=ComputeUnit.CPU_ONLY) return mlmodel.predict({"data": x})["output"] preds = _get_preds(self.builder.spec) self.assertEqual(self.builder.spec.specificationVersion, 4) quantized_spec = activate_int8_int8_matrix_multiplications( self.builder.spec, self.selector ) layer = self.builder.spec.neuralNetwork.layers[0] layer_type = layer.WhichOneof("layer") if layer_type == "innerProduct": matmul_layer = layer.innerProduct elif layer_type == "batchedMatmul": matmul_layer = layer.batchedMatmul wp = matmul_layer.weights if specification_modified: self.assertEqual(self.builder.spec.specificationVersion, 5) quant_preds = _get_preds(quantized_spec) self._test_predictions(preds, quant_preds, SNR=40) self.assertEqual(len(wp.floatValue), 0) else: self.assertEqual(self.builder.spec.specificationVersion, 4) quant_preds = _get_preds(quantized_spec) np.testing.assert_array_almost_equal(preds, quant_preds) self.assertGreater(len(wp.floatValue), 0) def test_single_batched_matmul_no_bias(self): self.initialize() self.builder.add_batched_mat_mul( name="batched_matmul", input_names=["data"], output_name="output", weight_matrix_rows=self.Cin, weight_matrix_columns=self.Cout, W=self.W, ) self.compare() def test_single_batched_matmul_with_bias(self): self.initialize() self.builder.add_batched_mat_mul( name="batched_matmul", input_names=["data"], output_name="output", weight_matrix_rows=self.Cin, weight_matrix_columns=self.Cout, W=self.W, bias=self.b, ) self.compare() def test_single_inner_product_no_bias(self): self.initialize() self.builder.add_inner_product( name="ip", input_name="data", output_name="output", input_channels=self.Cin, output_channels=self.Cout, W=self.W, b=None, has_bias=False, ) self.compare() def test_single_inner_product_with_bias(self): self.initialize() self.builder.add_inner_product( name="ip", input_name="data", output_name="output", input_channels=self.Cin, output_channels=self.Cout, W=self.W, b=self.b, has_bias=True, ) self.compare() def test_inner_product_min_input_channels_valid(self): self.initialize() self.builder.add_inner_product( name="ip", input_name="data", output_name="output", input_channels=self.Cin, output_channels=self.Cout, W=self.W, b=self.b, has_bias=True, ) self.selector.minimum_input_channels = 31 self.compare() 
def test_batched_matmul_min_input_channels_valid(self): self.initialize() self.builder.add_batched_mat_mul( name="batched_matmul", input_names=["data"], output_name="output", weight_matrix_rows=self.Cin, weight_matrix_columns=self.Cout, W=self.W, ) self.selector.minimum_input_channels = 32 self.compare() def test_inner_product_min_input_channels_invalid(self): self.initialize() self.builder.add_inner_product( name="ip", input_name="data", output_name="output", input_channels=self.Cin, output_channels=self.Cout, W=self.W, b=self.b, has_bias=True, ) self.selector.minimum_input_channels = 33 self.compare(specification_modified=False) def test_batched_matmul_min_input_channels_invalid(self): self.initialize() self.builder.add_batched_mat_mul( name="batched_matmul", input_names=["data"], output_name="output", weight_matrix_rows=self.Cin, weight_matrix_columns=self.Cout, W=self.W, ) self.selector.minimum_input_channels = 33 self.compare(specification_modified=False) def test_batched_matmul_max_input_channels_valid(self): self.initialize() self.builder.add_batched_mat_mul( name="batched_matmul", input_names=["data"], output_name="output", weight_matrix_rows=self.Cin, weight_matrix_columns=self.Cout, W=self.W, ) self.selector.maximum_input_channels = 32 self.compare() def test_inner_product_max_input_channels_valid(self): self.initialize() self.builder.add_inner_product( name="ip", input_name="data", output_name="output", input_channels=self.Cin, output_channels=self.Cout, W=self.W, b=self.b, has_bias=True, ) self.selector.maximum_input_channels = 33 self.compare() def test_batched_matmul_max_input_channels_invalid(self): self.initialize() self.builder.add_batched_mat_mul( name="batched_matmul", input_names=["data"], output_name="output", weight_matrix_rows=self.Cin, weight_matrix_columns=self.Cout, W=self.W, ) self.selector.maximum_input_channels = 31 self.compare(specification_modified=False) def test_inner_product_max_input_channels_invalid(self): self.initialize() self.builder.add_inner_product( name="ip", input_name="data", output_name="output", input_channels=self.Cin, output_channels=self.Cout, W=self.W, b=self.b, has_bias=True, ) self.selector.maximum_input_channels = 30 self.compare(specification_modified=False) def test_inner_product_min_output_channels_valid(self): self.initialize() self.builder.add_inner_product( name="ip", input_name="data", output_name="output", input_channels=self.Cin, output_channels=self.Cout, W=self.W, b=self.b, has_bias=True, ) self.selector.minimum_output_channels = 16 self.compare() def test_batched_matmul_min_output_channels_valid(self): self.initialize() self.builder.add_batched_mat_mul( name="batched_matmul", input_names=["data"], output_name="output", weight_matrix_rows=self.Cin, weight_matrix_columns=self.Cout, W=self.W, ) self.selector.minimum_output_channels = 16 self.compare() def test_inner_product_min_output_channels_invalid(self): self.initialize() self.builder.add_inner_product( name="ip", input_name="data", output_name="output", input_channels=self.Cin, output_channels=self.Cout, W=self.W, b=self.b, has_bias=True, ) self.selector.minimum_output_channels = 17 self.compare(specification_modified=False) def test_batched_matmul_min_output_channels_invalid(self): self.initialize() self.builder.add_batched_mat_mul( name="batched_matmul", input_names=["data"], output_name="output", weight_matrix_rows=self.Cin, weight_matrix_columns=self.Cout, W=self.W, ) self.selector.minimum_output_channels = 17 self.compare(specification_modified=False) def 
test_batched_matmul_max_output_channels_valid(self): self.initialize() self.builder.add_batched_mat_mul( name="batched_matmul", input_names=["data"], output_name="output", weight_matrix_rows=self.Cin, weight_matrix_columns=self.Cout, W=self.W, ) self.selector.maximum_output_channels = 17 self.compare() def test_inner_product_max_output_channels_valid(self): self.initialize() self.builder.add_inner_product( name="ip", input_name="data", output_name="output", input_channels=self.Cin, output_channels=self.Cout, W=self.W, b=self.b, has_bias=True, ) self.selector.maximum_output_channels = 16 self.compare() def test_batched_matmul_max_output_channels_invalid(self): self.initialize() self.builder.add_batched_mat_mul( name="batched_matmul", input_names=["data"], output_name="output", weight_matrix_rows=self.Cin, weight_matrix_columns=self.Cout, W=self.W, ) self.selector.maximum_output_channels = 14 self.compare(specification_modified=False) def test_inner_product_max_output_channels_invalid(self): self.initialize() self.builder.add_inner_product( name="ip", input_name="data", output_name="output", input_channels=self.Cin, output_channels=self.Cout, W=self.W, b=self.b, has_bias=True, ) self.selector.maximum_output_channels = 15 self.compare(specification_modified=False) def test_inner_product_min_weight_count_valid(self): self.initialize() self.builder.add_inner_product( name="ip", input_name="data", output_name="output", input_channels=self.Cin, output_channels=self.Cout, W=self.W, b=self.b, has_bias=True, ) self.selector.minimum_weight_count = 512 self.compare() def test_batched_matmul_min_weight_count_invalid(self): self.initialize() self.builder.add_batched_mat_mul( name="batched_matmul", input_names=["data"], output_name="output", weight_matrix_rows=self.Cin, weight_matrix_columns=self.Cout, W=self.W, ) self.selector.minimum_weight_count = 513 self.compare(specification_modified=False) def test_inner_product_layer_names_invalid(self): self.initialize() self.builder.add_inner_product( name="ip", input_name="data", output_name="output", input_channels=self.Cin, output_channels=self.Cout, W=self.W, b=self.b, has_bias=True, ) self.selector.include_layers_with_names = ["ip1", "ip2"] self.compare(specification_modified=False) def test_batched_matmul_layer_names_valid(self): self.initialize() self.builder.add_batched_mat_mul( name="batched_matmul", input_names=["data"], output_name="output", weight_matrix_rows=self.Cin, weight_matrix_columns=self.Cout, W=self.W, ) self.selector.include_layers_with_names = ["bm1", "batched_matmul"] self.compare() def test_batched_matmul_8bit_weight_quantized(self): self.initialize() self.builder.add_batched_mat_mul( name="batched_matmul", input_names=["data"], output_name="output", weight_matrix_rows=self.Cin, weight_matrix_columns=self.Cout, W=self.W, ) _quantize_spec_weights( self.builder.spec, 8, _QUANTIZATION_MODE_LINEAR_QUANTIZATION ) self.compare() def test_batched_matmul_4bit_weight_quantized(self): self.initialize() self.builder.add_batched_mat_mul( name="batched_matmul", input_names=["data"], output_name="output", weight_matrix_rows=self.Cin, weight_matrix_columns=self.Cout, W=self.W, ) _quantize_spec_weights( self.builder.spec, 4, _QUANTIZATION_MODE_LINEAR_QUANTIZATION ) self.compare() def test_batched_matmul_2bit_weight_quantized(self): self.initialize() self.builder.add_batched_mat_mul( name="batched_matmul", input_names=["data"], output_name="output", weight_matrix_rows=self.Cin, weight_matrix_columns=self.Cout, W=self.W, ) _quantize_spec_weights( 
self.builder.spec, 2, _QUANTIZATION_MODE_LINEAR_QUANTIZATION ) self.compare() def test_batched_matmul_1bit_weight_quantized(self): self.initialize() self.builder.add_batched_mat_mul( name="batched_matmul", input_names=["data"], output_name="output", weight_matrix_rows=self.Cin, weight_matrix_columns=self.Cout, W=self.W, ) _quantize_spec_weights( self.builder.spec, 1, _QUANTIZATION_MODE_LINEAR_QUANTIZATION ) self.compare() class TestQuantizeWeightsAPI: @staticmethod @pytest.mark.parametrize( "compute_units", [ComputeUnit.ALL, ComputeUnit.CPU_AND_GPU, ComputeUnit.CPU_ONLY] ) def test_embeddingND_quantize(compute_units): input_features = [("data", datatypes.Array(10, 1))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) builder.add_embedding_nd( name="embedding_nd", input_name="data", output_name="output", vocab_size=300, embedding_size=20, W=np.random.rand(20, 300), ) spec = builder.spec model_fp32 = coremltools.models.MLModel(spec, compute_units=compute_units) assert len(spec.neuralNetwork.layers[0].embeddingND.weights.floatValue) == 6000 # quantize to FP16 model_fp16 = quantization_utils.quantize_weights(model_fp32, nbits=16) assert model_fp16.compute_unit == compute_units spec_fp16 = model_fp16.get_spec() assert len(spec_fp16.neuralNetwork.layers[0].embeddingND.weights.floatValue) == 0 assert len(spec_fp16.neuralNetwork.layers[0].embeddingND.weights.float16Value) == 2 * 6000 # quantize to uint8 model_uint8 = quantization_utils.quantize_weights(model_fp32, nbits=8) assert model_uint8.compute_unit == compute_units spec_uint8 = model_uint8.get_spec() assert len(spec_uint8.neuralNetwork.layers[0].embeddingND.weights.floatValue) == 0 assert len(spec_uint8.neuralNetwork.layers[0].embeddingND.weights.float16Value) == 0 assert len(spec_uint8.neuralNetwork.layers[0].embeddingND.weights.rawValue) == 6000 # quantize to uint5 model_uint5 = quantization_utils.quantize_weights(model_fp32, nbits=5) assert model_uint5.compute_unit == compute_units spec_uint5 = model_uint5.get_spec() assert len(spec_uint5.neuralNetwork.layers[0].embeddingND.weights.floatValue) == 0 assert len(spec_uint5.neuralNetwork.layers[0].embeddingND.weights.float16Value) == 0 assert len(spec_uint5.neuralNetwork.layers[0].embeddingND.weights.rawValue) == 3750 # 3750 = 5*6000/8 @unittest.skipIf(coremltools.utils._macos_version() < (13, 0), 'ComputeUnit.CPU_AND_NE is only available on macOS >= 13.0' ) def test_embeddingND_quantize_CPU_and_NE(self): self.test_embeddingND_quantize(ComputeUnit.CPU_AND_NE) @staticmethod @pytest.mark.parametrize( "compute_units", [ComputeUnit.ALL, ComputeUnit.CPU_AND_GPU, ComputeUnit.CPU_ONLY] ) def test_loadConstantND_quantize(compute_units): input_features = [("data", datatypes.Array(10, 1))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features, disable_rank5_shape_mapping=True ) output_shape = [10, 10] ori_value = np.random.randint(0, 200, output_shape).astype(np.float32) builder.add_load_constant_nd( name="load_constant_nd", output_name="constant", constant_value=ori_value, shape=output_shape) builder.add_broadcast_to_dynamic( name="broadcast_to_dynamic", input_names=["constant", "data"], output_name="output" ) spec = builder.spec model_fp32 = coremltools.models.MLModel(spec, compute_units=compute_units) assert len(spec.neuralNetwork.layers[0].loadConstantND.data.floatValue) == 100 # quantize to FP16 model_fp16 = 
quantization_utils.quantize_weights(model_fp32, nbits=16) assert model_fp16.compute_unit == compute_units spec_fp16 = model_fp16.get_spec() assert len(spec_fp16.neuralNetwork.layers[0].loadConstantND.data.floatValue) == 0 assert len(spec_fp16.neuralNetwork.layers[0].loadConstantND.data.float16Value) == 2 * 100 # quantize to uint8 model_uint8 = quantization_utils.quantize_weights(model_fp32, nbits=8) assert model_uint8.compute_unit == compute_units spec_uint8 = model_uint8.get_spec() assert len(spec_uint8.neuralNetwork.layers[0].loadConstantND.data.floatValue) == 0 assert len(spec_uint8.neuralNetwork.layers[0].loadConstantND.data.float16Value) == 0 assert len(spec_uint8.neuralNetwork.layers[0].loadConstantND.data.rawValue) == 100 # quantize to uint5 model_uint5 = quantization_utils.quantize_weights(model_fp32, nbits=5) assert model_uint5.compute_unit == compute_units spec_uint5 = model_uint5.get_spec() assert len(spec_uint5.neuralNetwork.layers[0].loadConstantND.data.floatValue) == 0 assert len(spec_uint5.neuralNetwork.layers[0].loadConstantND.data.float16Value) == 0 assert len(spec_uint5.neuralNetwork.layers[0].loadConstantND.data.rawValue) == 63 # 63 = ceil(5*100/8) @unittest.skipIf(coremltools.utils._macos_version() < (13, 0), 'ComputeUnit.CPU_AND_NE is only available on macOS >= 13.0' ) def test_loadConstantND_quantize_CPU_and_NE(self): self.test_loadConstantND_quantize(ComputeUnit.CPU_AND_NE) class TestKMeansLookup: @pytest.mark.parametrize("weightShape, dtype", itertools.product( [(20, 20), (120, 120)], [np.float16, np.float32] )) def test_kmeans_lookup(self, weightShape, dtype): nbits = 4 w = np.random.rand(*weightShape).astype(dtype) lookup_table, quantized_weights = quantization_utils._get_kmeans_lookup_table_and_weight(nbits, w) assert(len(lookup_table) == 2 ** nbits) assert(quantized_weights.shape == (np.prod(weightShape),)) assert(len(np.unique(quantized_weights)) <= len(lookup_table)) quantized_weight_values = lookup_table[quantized_weights] max_deltas = np.abs(w.flatten() - quantized_weight_values.flatten()).max() assert max_deltas < 0.1 def test_kmeans1d_exact_value(self): w = np.array( [ [12.0, 11.0, 12.0, 33.0, 32.0, 99.0, 0.0, 34.0, 40.0], [41.0, 34.0, 98.0, 75.1, 89.0, 99.0, 0.0, 10.0, 41.0], ] ) lookup_table, quantized_weights = quantization_utils._get_kmeans_lookup_table_and_weight( 4, w, force_kmeans1d=True ) assert all( lookup_table == np.array( [ 0.0, 10.0, 11.0, 12.0, 32.0, 33.0, 34.0, 40.0, 41.0, 75.1, 89.0, 98.0, 99.0, 0.0, 0.0, 0.0, ] ) ) assert all( quantized_weights == np.array([3, 2, 3, 5, 4, 12, 0, 6, 7, 8, 6, 11, 9, 10, 12, 0, 1, 8]) ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/neural_network/test_simple_nn_inference.py0000644000000000000000000000336414672066616026500 0ustar00rootroot# Copyright (c) 2021, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os import numpy as np import coremltools import coremltools.models.datatypes as datatypes from coremltools import ComputeUnit, utils from coremltools.models import neural_network as neural_network class TestNeuralNetworkPrediction: @staticmethod def test_lrn_model(tmpdir): input_dim = (1, 3, 3) input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", datatypes.Array(*input_dim))] builder = neural_network.NeuralNetworkBuilder(input_features, output_features) builder.add_lrn( name="lrn", input_name="data", output_name="output", alpha=2, beta=3, local_size=1, k=8, ) input = {"data": np.ones((1, 3, 3))} expected = 1e-3 * np.ones((1, 3, 3)) model_path = os.path.join(str(tmpdir), "lrn_model.mlmodel") coremltools.models.utils.save_spec(builder.spec, model_path) try: model = coremltools.models.MLModel(model_path, compute_units=ComputeUnit.CPU_ONLY) if utils._macos_version() >= (10, 13): out = model.predict(input) except RuntimeError as e: print(e) assert str(e) == "Error compiling model: \"The file couldn’t be saved.\"." else: if utils._macos_version() >= (10, 13): assert out['output'].shape == (1, 3, 3) np.testing.assert_allclose(expected, out['output']) print("Core ML output", out) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/neural_network/test_tf_numeric.py0000644000000000000000000004616114672066616024633 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import unittest import numpy as np import pytest import coremltools.models.datatypes as datatypes from coremltools import ComputeUnit from coremltools._deps import _HAS_TF_2, MSG_TF2_NOT_FOUND from coremltools.models import MLModel, neural_network from coremltools.models.utils import _is_macos, _macos_version if _HAS_TF_2: import tensorflow as tf np.random.seed(10) np.set_printoptions(precision=4, suppress=True) @unittest.skipIf(not _HAS_TF_2, MSG_TF2_NOT_FOUND) class CorrectnessTest(unittest.TestCase): def _compare_shapes(self, ref_preds, coreml_preds): if np.squeeze(ref_preds).shape != np.squeeze(coreml_preds).shape: return False else: return True def _compare_predictions_numerical( self, ref_preds, coreml_preds, snr_thresh=15, psnr_thresh=30 ): ref_preds = ref_preds.flatten() coreml_preds = coreml_preds.flatten() noise = coreml_preds - ref_preds noise_var = np.mean(noise ** 2) signal_energy = np.mean(ref_preds ** 2) max_signal_energy = np.amax(ref_preds ** 2) if noise_var > 1e-6 and signal_energy > 1e-6: SNR = 10 * np.log10(signal_energy / noise_var) PSNR = 10 * np.log10(max_signal_energy / noise_var) print("SNR: {}, PSNR: {}".format(SNR, PSNR)) print("noise var: ", np.mean(noise ** 2)) print("max signal energy: ", np.amax(ref_preds ** 2)) print("signal energy: ", np.mean(ref_preds ** 2)) self.assertGreaterEqual(PSNR, psnr_thresh) self.assertGreaterEqual(SNR, snr_thresh) def _test_model( self, input_dict, ref_output_dict, coreml_model, snr_thresh=15, psnr_thresh=30, ): coreml_out_dict = coreml_model.predict(input_dict) for out_ in list(ref_output_dict.keys()): ref_out = ref_output_dict[out_].flatten() coreml_out = coreml_out_dict[out_].flatten() 
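# The flattened outputs must have the same length before the numerical
# comparison; the SNR/PSNR thresholds (15 dB and 30 dB by default) are then
# enforced by _compare_predictions_numerical.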
self.assertEqual(len(coreml_out), len(ref_out)) self._compare_predictions_numerical( ref_out, coreml_out, snr_thresh=snr_thresh, psnr_thresh=psnr_thresh ) @unittest.skipUnless(_is_macos(), "Only supported for MacOS platform.") class StressTest(CorrectnessTest): def test_data_reorganize(self, cpu_only=False): def get_coreml_model_reorganize(X, params): eval = True mlmodel = None try: input_dim = X.shape[2:] input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features ) builder.add_reorganize_data( "reorg", "data", "output", mode=params["mode"], block_size=params["block_size"], ) if cpu_only: compute_unit=ComputeUnit.CPU_ONLY else: compute_unit=ComputeUnit.ALL mlmodel = MLModel(builder.spec, compute_units=compute_unit) except RuntimeError as e: print(e) eval = False return mlmodel, eval def get_tf_predictions_reorganize(X, params): if params["mode"] == "SPACE_TO_DEPTH": y = tf.nn.space_to_depth(X, params["block_size"]) else: y = tf.nn.depth_to_space(X, params["block_size"]) return y.numpy() """ Define Params """ params_dict = dict( C=[1, 2, 8, 16, 15, 27], H=[2, 4, 6, 8, 10, 15, 21, 16], W=[2, 4, 6, 8, 10, 15, 21, 16], block_size=[2, 3, 4, 5], mode=["SPACE_TO_DEPTH", "DEPTH_TO_SPACE"], ) params = [x for x in list(itertools.product(*params_dict.values()))] all_candidates = [dict(zip(params_dict.keys(), x)) for x in params] valid_params = [] for pr in all_candidates: if pr["mode"] == "SPACE_TO_DEPTH": if pr["H"] % pr["block_size"] == 0 and pr["W"] % pr["block_size"] == 0: valid_params.append(pr) else: if pr["C"] % (pr["block_size"] ** 2) == 0: valid_params.append(pr) print( "Total params to be tested: ", len(valid_params), "out of candidates: ", len(all_candidates), ) """ Test """ failed_tests_compile = [] for i in range(len(valid_params)): params = valid_params[i] # print("=========: ", params) # if i % 10 == 0: print("======== Testing {}/{}".format(str(i), str(len(valid_params)))) X = np.random.rand(1, params["C"], params["H"], params["W"]) tf_preds = get_tf_predictions_reorganize( np.transpose(X, [0, 2, 3, 1]), params ) tf_preds = np.transpose(tf_preds, [0, 3, 1, 2]) coreml_model, eval = get_coreml_model_reorganize( np.expand_dims(X, axis=0), params ) if eval is False: failed_tests_compile.append(params) else: input_dict = {"data": np.expand_dims(X, axis=0)} ref_output_dict = {"output": tf_preds[0, :, :, :]} self._test_model(input_dict, ref_output_dict, coreml_model) self.assertEqual(failed_tests_compile, []) def test_data_reorganize_cpu_only(self): self.test_data_reorganize(cpu_only=True) def test_depthwise_conv(self, cpu_only=False): def get_coreml_model_depthwise(X, params, w): eval = True mlmodel = None try: input_dim = X.shape[2:] input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features ) # translate weights : (Kh, Kw, kernel_channels, output_channels) == (Kh, Kw, Cin/g, Cout) == (Kh, Kw, 1, channel_multiplier * Cin) w_e = np.reshape( w, ( params["kernel_size"], params["kernel_size"], params["multiplier"] * params["C"], 1, ), ) w_e = np.transpose(w_e, [0, 1, 3, 2]) if params["padding"] == "SAME": pad_mode = "same" else: pad_mode = "valid" builder.add_convolution( "conv", kernel_channels=1, output_channels=params["multiplier"] * params["C"], height=params["kernel_size"], width=params["kernel_size"], stride_height=params["stride"], stride_width=params["stride"], 
border_mode=pad_mode, groups=params["C"], W=w_e, b=None, has_bias=0, is_deconv=0, output_shape=None, input_name="data", output_name="output", ) if cpu_only: compute_unit=ComputeUnit.CPU_ONLY else: compute_unit=ComputeUnit.ALL mlmodel = MLModel(builder.spec, compute_units=compute_unit) except RuntimeError as e: print(e) eval = False return mlmodel, eval def get_tf_predictions_depthwise(X, params, w): Cin = params["C"] Kh = Kw = params["kernel_size"] channel_multiplier = params["multiplier"] y = tf.nn.depthwise_conv2d( X, w, strides=[1, params["stride"], params["stride"], 1], padding=params["padding"], ) return y.numpy() """ Define Params """ params_dict = dict( C=[1, 4, 7], H=[11, 16], stride=[1, 2, 3], kernel_size=[1, 2, 3, 5], multiplier=[1, 2, 3, 4], padding=["SAME", "VALID"], ) params = [x for x in list(itertools.product(*params_dict.values()))] all_candidates = [dict(zip(params_dict.keys(), x)) for x in params] valid_params = [] for pr in all_candidates: if pr["padding"] == "VALID": if np.floor((pr["H"] - pr["kernel_size"]) / pr["stride"]) + 1 <= 0: continue valid_params.append(pr) print( "Total params to be tested: ", len(valid_params), "out of candidates: ", len(all_candidates), ) """ Test """ failed_tests_compile = [] for i in range(len(valid_params)): params = valid_params[i] # print("=========: ", params) # if i % 10 == 0: print "======== Testing {}/{}".format(str(i), str(len(valid_params))) X = np.random.rand(1, params["C"], params["H"], params["H"]) w = np.random.rand( params["kernel_size"], params["kernel_size"], params["C"], params["multiplier"], ) tf_preds = get_tf_predictions_depthwise( np.transpose(X, [0, 2, 3, 1]), params, w ) tf_preds = np.transpose(tf_preds, [0, 3, 1, 2]) coreml_model, eval = get_coreml_model_depthwise( np.expand_dims(X, axis=0), params, w ) if eval is False: failed_tests_compile.append(params) else: input_dict = {"data": np.expand_dims(X, axis=0)} ref_output_dict = {"output": tf_preds[0, :, :, :]} self._test_model(input_dict, ref_output_dict, coreml_model) self.assertEqual(failed_tests_compile, []) def test_depthwise_conv_cpu_only(self, cpu_only=False): self.test_depthwise_conv(cpu_only=True) @unittest.skipUnless(_macos_version() >= (10, 14), "Only supported on MacOS 10.14+") def test_resize_bilinear(self, cpu_only=False): def get_coreml_model_resize_bilinear(X, params): eval = True mlmodel = None try: input_dim = X.shape[2:] input_features = [("data", datatypes.Array(*input_dim))] output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features ) if params["align_corners"]: mode = "STRICT_ALIGN_ENDPOINTS_MODE" else: mode = "UPSAMPLE_MODE" builder.add_resize_bilinear( "resize", "data", "output", target_height=params["Hnew"], target_width=params["Wnew"], mode=mode, ) if cpu_only: compute_unit=ComputeUnit.CPU_ONLY else: compute_unit=ComputeUnit.ALL mlmodel = MLModel(builder.spec, compute_units=compute_unit) except RuntimeError as e: print(e) eval = False return mlmodel, eval def get_tf_predictions_resize_bilinear(X, params): y = tf.compat.v1.image.resize_bilinear( X, size=[params["Hnew"], params["Wnew"]], align_corners=params["align_corners"], ) return y.numpy() """ Define Params """ params_dict = dict( H=[1, 3, 10], # [1,2,3,10] W=[1, 3, 10], # [1,2,3,10] Hnew=[1, 2, 6], # [1,3,6,12,20] Wnew=[1, 2, 6], # [1,3,6,12,20] align_corners=[False, True], # [False, True] ch=[1, 5], # [1,5] batch=[1, 3], # [1, 3] ) params = [x for x in list(itertools.product(*params_dict.values()))] valid_params = 
[dict(zip(params_dict.keys(), x)) for x in params] print("Total params to be tested: {}".format(len(valid_params))) """ Test """ failed_tests_compile = [] for i in range(len(valid_params)): params = valid_params[i] # #print("=========: ", params) if i % 100 == 0: print( "======================= Testing {}/{}".format( str(i), str(len(valid_params)) ) ) X = np.round( 255 * np.random.rand( params["batch"], params["ch"], params["H"], params["W"] ) ) tf_preds = get_tf_predictions_resize_bilinear( np.transpose(X, [0, 2, 3, 1]), params ) tf_preds = np.transpose(tf_preds, [0, 3, 1, 2]) coreml_model, eval = get_coreml_model_resize_bilinear( np.expand_dims(X, axis=0), params ) if eval is False: failed_tests_compile.append(params) else: input_dict = {"data": np.expand_dims(X, axis=0)} ref_output_dict = {"output": np.expand_dims(tf_preds, axis=0)} self._test_model(input_dict, ref_output_dict, coreml_model) self.assertEqual(failed_tests_compile, []) @unittest.skipUnless(_macos_version() >= (10, 14), "Only supported on MacOS 10.14+") def test_resize_bilinear_cpu_only(self): self.test_resize_bilinear(cpu_only=True) @unittest.skipUnless(_macos_version() >= (10, 14), "Only supported on MacOS 10.14+") def test_crop_resize(self, cpu_only=False): # This test can be stochastically failing, so we set the below seed: np.random.seed(0) if _macos_version()[0] == 12: pytest.xfail("rdar://110274216") def get_coreml_model_crop_resize(params): eval = True mlmodel = None batch, ch, n_roi = params["b_c_n"] H = params["H"] W = params["W"] try: input_features = [("data", datatypes.Array(ch, H, W))] input_features.append(("roi", datatypes.Array(4, 1, 1))) if batch != 1: input_features.append(("box_ind", datatypes.Array(1, 1, 1))) output_features = [("output", None)] builder = neural_network.NeuralNetworkBuilder( input_features, output_features ) if batch != 1: builder.add_elementwise( "concat", ["box_ind", "roi"], "roi_out", "CONCAT" ) input_names = ["data", "roi_out"] else: input_names = ["data", "roi"] builder.add_crop_resize( "resize", input_names, "output", target_height=params["Hnew"], target_width=params["Wnew"], mode="ALIGN_ENDPOINTS_MODE", normalized_roi=True, box_indices_mode="CORNERS_HEIGHT_FIRST", spatial_scale=1.0, ) if cpu_only: compute_unit=ComputeUnit.CPU_ONLY else: compute_unit=ComputeUnit.ALL mlmodel = MLModel(builder.spec, compute_units=compute_unit) except RuntimeError as e: print(e) eval = False return mlmodel, eval def get_tf_predictions_crop_resize(X, boxes, box_ind, params): y = tf.image.crop_and_resize( X, boxes, box_ind, crop_size=[params["Hnew"], params["Wnew"]] ) return y.numpy() """ Define Params """ params_dict = dict( H=[1, 3, 10], # [1,2,3,6,10] W=[1, 3, 10], # [1,2,3,6,10] Hnew=[1, 2, 3, 6], # [1,2,3,6,12,20] Wnew=[1, 2, 3, 6], # [1,2,3,6,12,20] b_c_n=[ (1, 1, 1), (1, 2, 3), (3, 2, 1), (3, 4, 3), ], # [(1,1,1),(1,2,3),(3,2,1),(3,4,3)] ) params = [x for x in list(itertools.product(*params_dict.values()))] valid_params = [dict(zip(params_dict.keys(), x)) for x in params] print("Total params to be tested: {}".format(len(valid_params))) """ Test """ failed_tests_compile = [] for i in range(len(valid_params)): params = valid_params[i] batch, ch, n_roi = params["b_c_n"] X = np.round(255 * np.random.rand(batch, ch, params["H"], params["W"])) roi = np.zeros((n_roi, 4), dtype=np.float32) box_ind = np.zeros((n_roi)) if batch != 1: box_ind = np.random.randint(low=0, high=batch, size=(n_roi)) for ii in range(n_roi): r = np.random.rand(4) w_start = r[0] h_start = r[1] w_end = r[2] * (1 - w_start) + 
w_start h_end = r[3] * (1 - h_start) + h_start roi[ii, :] = [h_start, w_start, h_end, w_end] roi[ii, :] = np.round(100 * roi[ii, :]) / 100 assert roi[ii, 0] <= roi[ii, 2] assert roi[ii, 1] <= roi[ii, 3] tf_preds = get_tf_predictions_crop_resize( np.transpose(X, [0, 2, 3, 1]), roi, box_ind, params ) tf_preds = np.transpose(tf_preds, [0, 3, 1, 2]) coreml_model, eval = get_coreml_model_crop_resize(params) if eval is False: failed_tests_compile.append(params) else: input_dict = {"data": np.expand_dims(X, axis=0)} input_dict["roi"] = np.reshape(roi, (n_roi, 1, 4, 1, 1)) if batch != 1: input_dict["box_ind"] = np.reshape( box_ind.astype(np.float32), (n_roi, 1, 1, 1, 1) ) ref_output_dict = {"output": np.expand_dims(tf_preds, axis=0)} self._test_model(input_dict, ref_output_dict, coreml_model) self.assertEqual(failed_tests_compile, []) @unittest.skipUnless(_macos_version() >= (10, 14), "Only supported on MacOS 10.14+") def test_crop_resize_cpu_only(self): self.test_crop_resize(cpu_only=True) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2815473 coremltools-8.0/coremltools/test/optimize/0000755000000000000000000000000014672075535017660 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/__init__.py0000644000000000000000000000033114672066616021766 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2815473 coremltools-8.0/coremltools/test/optimize/api/0000755000000000000000000000000014672075535020431 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/api/__init__.py0000644000000000000000000000033214672066616022540 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/api/test_optimize_api.py0000644000000000000000000005557314672066616024552 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import tempfile import pytest from coremltools.converters.mil.testing_utils import get_op_types_in_program from coremltools.test.optimize.coreml.test_passes import ( TestCompressionPasses as _TestCompressionPasses, ) from coremltools.test.optimize.coreml.test_passes import ( TestConfigurationFromDictFromYaml as _TestConfigurationFromDictFromYaml, ) get_test_program = _TestCompressionPasses._get_test_program_2 def create_model_and_optimizer(): import torch class Model(torch.nn.Module): def __init__(self): super(Model, self).__init__() self.conv1 = torch.nn.Conv2d(3, 128, (1, 1)) self.conv2 = torch.nn.Conv2d(128, 256, (10, 10)) self.conv3 = torch.nn.Conv2d(256, 26, (10, 10)) self.linear = torch.nn.Linear(206, 12) def forward(self, x): x = self.conv1(x) x = self.conv2(x) x = self.conv3(x) x = self.linear(x) return x model = Model() loss_fn = torch.nn.MSELoss() optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9) return model, loss_fn, optimizer def get_mlmodel(): import coremltools as ct prog = get_test_program() mlmodel = ct.convert(prog, convert_to="mlprogram", compute_precision=ct.precision.FLOAT32) return mlmodel class TestOptimizeCoremlAPIOverview: """ This class is testing the api reference code in https://coremltools.readme.io/v7.0/docs/optimizecoreml-api-overview """ def test_6_bit_palettization_example(self): import coremltools as ct import coremltools.optimize.coreml as cto # load model # (original) mlmodel = ct.models.MLModel(uncompressed_model_path) mlmodel = get_mlmodel() # define op config op_config = cto.OpPalettizerConfig(mode="kmeans", nbits=6) # define optimization config by applying the op config globally to all ops config = cto.OptimizationConfig(global_config=op_config) # palettize weights compressed_mlmodel = cto.palettize_weights(mlmodel, config) # Do some basic checks assert compressed_mlmodel is not None ops = get_op_types_in_program(compressed_mlmodel._mil_program) assert ops.count("constexpr_lut_to_dense") == 6 def test_linear_quantization_config_from_yaml(self): import coremltools.optimize.coreml as cto mlmodel = get_mlmodel() config_dict = { "config_type": "OpLinearQuantizerConfig", "global_config": { "mode": "linear_symmetric", "dtype": "int8", }, } yaml_file = _TestConfigurationFromDictFromYaml.get_yaml(config_dict) # (original) config = cto.OptimizationConfig.from_yaml("linear_config.yaml") config = cto.OptimizationConfig.from_yaml(yaml_file) compressed_mlmodel = cto.linear_quantize_weights(mlmodel, config) # Do some basic checks assert compressed_mlmodel is not None ops = get_op_types_in_program(compressed_mlmodel._mil_program) assert ops.count("constexpr_affine_dequantize") == 6 def test_customize_ops_to_compress(self): import coremltools.optimize.coreml as cto mlmodel = get_mlmodel() global_config = cto.OpPalettizerConfig(nbits=6, mode="kmeans") linear_config = cto.OpPalettizerConfig(nbits=8, mode="kmeans") config = cto.OptimizationConfig( global_config=global_config, op_type_configs={"linear": linear_config}, op_name_configs={"conv1": None, "conv3": None}, ) compressed_mlmodel = cto.palettize_weights(mlmodel, config) # Do some basic checks assert compressed_mlmodel is not None ops = get_op_types_in_program(compressed_mlmodel._mil_program) assert ops.count("constexpr_lut_to_dense") == 4 class TestOptimizeTorchAPIOverview: """ This class is testing the api reference code in 
https://coremltools.readme.io/v7.0/docs/optimizetorch-api-overview """ def get_global_config(self): config_dict = { "global_config": { "scheduler": {"update_steps": [100, 200, 300, 500]}, "target_sparsity": 0.8, } } return _TestConfigurationFromDictFromYaml.get_yaml(config_dict) def get_fine_grain_config(self): config_dict = { "module_type_configs": { "Linear": { "scheduler": { "update_steps": [100, 200, 300, 500], }, "n_m_ratio": [3, 4], }, "Conv2d": { "scheduler": { "update_steps": [100, 200, 300, 500], }, "target_sparsity": 0.5, "block_size": 2, }, }, "module_name_configs": { "module2.conv1": { "scheduler": { "update_steps": [100, 200, 300, 500], }, "target_sparsity": 0.75, }, "module2.linear": None, }, } return _TestConfigurationFromDictFromYaml.get_yaml(config_dict) def test_load_from_yaml(self): def _test_config(config_path): import torch import coremltools as ct from coremltools.optimize.torch.pruning import MagnitudePruner, MagnitudePrunerConfig # Toy example x, label = torch.rand(1, 3, 224, 224), torch.rand(1, 26, 206, 12) data = [(x, label)] model, loss_fn, optimizer = create_model_and_optimizer() # Initialize pruner and configure it # (original) config = MagnitudePrunerConfig.from_yaml("config.yaml") config = MagnitudePrunerConfig.from_yaml(config_path) pruner = MagnitudePruner(model, config) # Insert pruning layers in the model model = pruner.prepare() for inputs, labels in data: output = model(inputs) loss = loss_fn(output, labels) loss.backward() optimizer.step() pruner.step() # Commit pruning masks to model parameters pruner.finalize(inplace=True) # Export example_input = torch.rand(1, 3, 224, 224) traced_model = torch.jit.trace(model, example_input) coreml_model = ct.convert( traced_model, inputs=[ct.TensorType(shape=example_input.shape)], pass_pipeline=ct.PassPipeline.DEFAULT_PRUNING, minimum_deployment_target=ct.target.iOS16, ) assert coreml_model is not None output_file = tempfile.NamedTemporaryFile(suffix=".mlpackage").name coreml_model.save(output_file) _test_config(self.get_global_config()) _test_config(self.get_fine_grain_config()) @pytest.mark.xfail(reason="rdar://132361333 Palettization Test Case Time Out", run=False) def test_programmatic_example_1(self): import torch import coremltools as ct from coremltools.optimize.torch.palettization import ( DKMPalettizer, DKMPalettizerConfig, ModuleDKMPalettizerConfig, ) # Toy example x, label = torch.rand(1, 3, 224, 224), torch.rand(1, 26, 206, 12) data = [(x, label)] # code that defines the pytorch model, and optimizer model, loss_fn, optimizer = create_model_and_optimizer() # Initialize the palettizer config = DKMPalettizerConfig( global_config=ModuleDKMPalettizerConfig(n_bits=4, cluster_dim=2) ) palettizer = DKMPalettizer(model, config) # Prepare the model to insert FakePalettize layers for palettization model = palettizer.prepare(inplace=True) # Use palettizer in the PyTorch training loop for inputs, labels in data: output = model(inputs) loss = loss_fn(output, labels) loss.backward() optimizer.step() palettizer.step() # Fold LUT and indices into weights model = palettizer.finalize(inplace=True) # Export example_input = torch.rand(1, 3, 224, 224) traced_model = torch.jit.trace(model, example_input) coreml_model = ct.convert( traced_model, inputs=[ct.TensorType(shape=example_input.shape)], pass_pipeline=ct.PassPipeline.DEFAULT_PALETTIZATION, minimum_deployment_target=ct.target.iOS18, ) assert coreml_model is not None output_file = tempfile.NamedTemporaryFile(suffix=".mlpackage").name coreml_model.save(output_file) def 
test_programmatic_example_2(self): import torch import coremltools as ct from coremltools.optimize.torch.quantization import ( LinearQuantizer, LinearQuantizerConfig, ModuleLinearQuantizerConfig, ObserverType, QuantizationScheme, ) # Toy example x, label = torch.rand(1, 3, 224, 224), torch.rand(1, 26, 206, 12) data = [(x, label)] model, loss_fn, optimizer = create_model_and_optimizer() # Initialize the quantizer global_config = ModuleLinearQuantizerConfig( quantization_scheme=QuantizationScheme.symmetric ) config = LinearQuantizerConfig().set_global(global_config) # We only want to quantize convolution layers which have a kernel size of 1 or all linear layers. for name, m in model.named_modules(): if isinstance(m, torch.nn.Conv2d): if m.kernel_size == (1, 1): config = config.set_module_name( name, ModuleLinearQuantizerConfig( weight_observer=ObserverType.min_max, weight_per_channel=True ), ) else: config = config.set_module_name(name, None) quantizer = LinearQuantizer(model, config) # Prepare the model to insert FakeQuantize layers for QAT example_input = torch.rand(1, 3, 224, 224) model = quantizer.prepare(example_inputs=example_input, inplace=True) # Use quantizer in your PyTorch training loop for inputs, labels in data: output = model(inputs) loss = loss_fn(output, labels) loss.backward() optimizer.step() quantizer.step() # Convert operations to their quanitzed counterparts using parameters learnt via QAT model = quantizer.finalize(inplace=True) traced_model = torch.jit.trace(model, example_input) coreml_model = ct.convert( traced_model, inputs=[ct.TensorType(shape=example_input.shape)], minimum_deployment_target=ct.target.iOS17, ) assert coreml_model is not None output_file = tempfile.NamedTemporaryFile(suffix=".mlpackage").name coreml_model.save(output_file) def test_quantize_submodule(self): import torch from torchvision.models import mobilenet_v3_small import coremltools as ct from coremltools.optimize.torch.quantization import LinearQuantizer class Model(torch.nn.Module): def __init__(self): super().__init__() self.model1 = mobilenet_v3_small() self.model2 = mobilenet_v3_small() def forward(self, x): return self.model1(x), self.model2(x) model = Model() data = torch.randn(1, 3, 224, 224) example_inputs = (data,) quantizer = LinearQuantizer(model.model1) model.model1 = quantizer.prepare(example_inputs=example_inputs) model(data) model.model1 = quantizer.finalize() model = model.eval() traced_model = torch.jit.trace(model, example_inputs=example_inputs) coreml_model = ct.convert( traced_model, convert_to="mlprogram", inputs=[ct.TensorType(shape=data.shape)], minimum_deployment_target=ct.target.iOS18, skip_model_load=True, ) assert coreml_model is not None quant_ops = coreml_model._mil_program.functions["main"].find_ops( op_type="constexpr_blockwise_shift_scale" ) assert len(quant_ops) > 0 class TestConvertingCompressedSourceModels: """ This class is testing examples in https://coremltools.readme.io/v7.0/docs/converting-compressed-source-models """ def test_smoke_convert_compressed_source_model_pruning(self): import coremltools as ct model_with_sparse_weights = ct.convert( get_test_program(), pass_pipeline=ct.PassPipeline.DEFAULT_PRUNING, minimum_deployment_target=ct.target.iOS17, ) assert model_with_sparse_weights is not None def test_smoke_convert_compressed_source_model_pelettization(self): import coremltools as ct model_with_lut_weights = ct.convert( get_test_program(), pass_pipeline=ct.PassPipeline.DEFAULT_PALETTIZATION, minimum_deployment_target=ct.target.macOS13, ) assert 
model_with_lut_weights is not None class TestPostTrainingPruning: """ This class is testing examples in https://coremltools.readme.io/v7.0/docs/pruning-a-core-ml-model """ def test_threshold_pruner(self): from coremltools.optimize.coreml import ( OpThresholdPrunerConfig, OptimizationConfig, prune_weights, ) model = get_mlmodel() op_config = OpThresholdPrunerConfig( threshold=0.03, minimum_sparsity_percentile=0.55, weight_threshold=1024, ) config = OptimizationConfig(global_config=op_config) model_compressed = prune_weights(model, config=config) assert model_compressed is not None def test_magnitude_pruner(self): from coremltools.optimize.coreml import ( OpMagnitudePrunerConfig, OptimizationConfig, prune_weights, ) model = get_mlmodel() op_config = OpMagnitudePrunerConfig( target_sparsity=0.6, weight_threshold=1024, ) config = OptimizationConfig(global_config=op_config) model_compressed = prune_weights(model, config=config) assert model_compressed is not None class TestTrainingTimePruning: """ This class is testing examples in https://coremltools.readme.io/v7.0/docs/data-dependent-pruning """ def test_magnitude_pruner(self): from collections import OrderedDict import torch import coremltools as ct from coremltools.optimize.torch.pruning import MagnitudePruner, MagnitudePrunerConfig # Toy example x, label = torch.rand(1, 3, 224, 224), torch.rand(1, 32, 224, 224) data = [(x, label)] model = torch.nn.Sequential( OrderedDict( [ ("conv1", torch.nn.Conv2d(3, 32, 3, padding="same")), ("conv2", torch.nn.Conv2d(32, 32, 3, padding="same")), ] ) ) loss_fn = torch.nn.MSELoss() optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9) # initialize pruner and configure it # we will configure the pruner for all conv2d layers config = MagnitudePrunerConfig.from_dict( { "module_type_configs": { "Conv2d": { "scheduler": {"update_steps": [3, 5, 7]}, "target_sparsity": 0.75, "granularity": "per_scalar", }, } } ) pruner = MagnitudePruner(model, config) # insert pruning layers in the model model = pruner.prepare() for inputs, labels in data: output = model(inputs) loss = loss_fn(output, labels) loss.backward() optimizer.step() pruner.step() # commit pruning masks to model parameters pruner.finalize(inplace=True) # trace and convert the model example_input = torch.rand(1, 3, 224, 224) # shape of input for the model traced_model = torch.jit.trace(model, example_input) coreml_model = ct.convert( traced_model, convert_to="mlprogram", inputs=[ct.TensorType(shape=example_input.shape)], pass_pipeline=ct.PassPipeline.DEFAULT_PRUNING, minimum_deployment_target=ct.target.iOS17, ) assert coreml_model is not None output_file = tempfile.NamedTemporaryFile(suffix=".mlpackage").name coreml_model.save(output_file) class TestPostTrainingPalettization: """ This class is testing the examples in https://coremltools.readme.io/v7.0/docs/data-free-palettization """ def test_palettizer(self): from coremltools.optimize.coreml import ( OpPalettizerConfig, OptimizationConfig, palettize_weights, ) model = get_mlmodel() op_config = OpPalettizerConfig(mode="kmeans", nbits=6, weight_threshold=512) config = OptimizationConfig(global_config=op_config) compressed_6_bit_model = palettize_weights(model, config=config) # Some basic checks assert compressed_6_bit_model is not None ops = get_op_types_in_program(compressed_6_bit_model._mil_program) assert ops.count("constexpr_lut_to_dense") == 6 class TestTrainingTimePalettization: """ This class is testing the examples in https://coremltools.readme.io/v7.0/docs/data-dependent-palettization """ def test_palettizer(self): 
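# Overview of the training-time (DKM) palettization flow exercised below: configure a DKMPalettizer, call prepare() to insert FakePalettize layers, run the training loop and call palettizer.step() after each optimizer step, call finalize() to fold the learned LUT and indices back into the weights, then trace and convert with ct.PassPipeline.DEFAULT_PALETTIZATION.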
import torch import torch.nn as nn import coremltools as ct from coremltools.optimize.torch.palettization import DKMPalettizer, DKMPalettizerConfig # Toy example x, label = torch.rand(1, 4), torch.rand(1, 4) data = [(x, label)] model = nn.Sequential(nn.Linear(4, 500), nn.Sigmoid(), nn.Linear(500, 4), nn.Sigmoid()) loss_fn = nn.MSELoss() optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9) # Prepare model for palettization module_config = {nn.Linear: {"n_bits": 2, "weight_threshold": 1000, "milestone": 2}} config = DKMPalettizerConfig.from_dict({"module_type_configs": module_config}) palettizer = DKMPalettizer(model, config) prepared_model = palettizer.prepare() # Fine-tune the model for a few epochs after this. for inputs, labels in data: output = model(inputs) loss = loss_fn(output, labels) loss.backward() optimizer.step() palettizer.step() # prepare for conversion finalized_model = palettizer.finalize() # trace and convert example_input = torch.rand(1, 4) # shape of input for the model traced_model = torch.jit.trace(finalized_model, example_input) coreml_model = ct.convert( traced_model, convert_to="mlprogram", inputs=[ct.TensorType(shape=example_input.shape)], pass_pipeline=ct.PassPipeline.DEFAULT_PALETTIZATION, minimum_deployment_target=ct.target.iOS16, ) assert coreml_model is not None output_file = tempfile.NamedTemporaryFile(suffix=".mlpackage").name coreml_model.save(output_file) class TestPostTrainingQuantization: """ This class is testing the examples in https://coremltools.readme.io/v7.0/docs/data-free-quantization """ def test_quantization(self): import coremltools.optimize.coreml as cto model = get_mlmodel() op_config = cto.OpLinearQuantizerConfig(mode="linear_symmetric", weight_threshold=512) config = cto.OptimizationConfig(global_config=op_config) compressed_8_bit_model = cto.linear_quantize_weights(model, config=config) # Some basic checks assert compressed_8_bit_model is not None ops = get_op_types_in_program(compressed_8_bit_model._mil_program) assert ops.count("constexpr_affine_dequantize") == 6 class TestTrainingTimeQuantization: """ This class is testing the examples in https://coremltools.readme.io/v7.0/docs/data-dependent-quantization """ def test_quantization(self): from collections import OrderedDict import torch import torch.nn as nn import coremltools as ct from coremltools.optimize.torch.quantization import LinearQuantizer, LinearQuantizerConfig # Toy example x, label = torch.rand(1, 1, 20, 20), torch.rand(1, 20, 16, 16) data = [(x, label)] model = nn.Sequential( OrderedDict( { "conv": nn.Conv2d(1, 20, (3, 3)), "relu1": nn.ReLU(), "conv2": nn.Conv2d(20, 20, (3, 3)), "relu2": nn.ReLU(), } ) ) loss_fn = nn.MSELoss() optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9) # Initialize the quantizer config = LinearQuantizerConfig.from_dict( { "global_config": { "quantization_scheme": "symmetric", "milestones": [0, 100, 400, 200], } } ) quantizer = LinearQuantizer(model, config) # Prepare the model to insert FakeQuantize layers for QAT example_input = torch.rand(1, 1, 20, 20) model = quantizer.prepare(example_inputs=example_input, inplace=True) # Use quantizer in your PyTorch training loop for inputs, labels in data: output = model(inputs) loss = loss_fn(output, labels) loss.backward() optimizer.step() quantizer.step() # Convert operations to their quanitzed counterparts using parameters learnt via QAT model = quantizer.finalize(inplace=True) # Convert the PyTorch models to CoreML format traced_model = torch.jit.trace(model, 
example_input) coreml_model = ct.convert( traced_model, convert_to="mlprogram", inputs=[ct.TensorType(shape=example_input.shape)], minimum_deployment_target=ct.target.iOS17, ) assert coreml_model is not None output_file = tempfile.NamedTemporaryFile(suffix=".mlpackage").name coreml_model.save(output_file) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2815473 coremltools-8.0/coremltools/test/optimize/coreml/0000755000000000000000000000000014672075535021141 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/coreml/__init__.py0000644000000000000000000000033114672066616023247 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/coreml/test_passes.py0000644000000000000000000041250414672066616024056 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import os import re import tempfile import cattrs import numpy as np import pytest import torch import yaml import coremltools as ct import coremltools.optimize as cto import coremltools.optimize.coreml._quantization_passes as quantization from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.passes.graph_pass import PassOption from coremltools.converters.mil.mil.passes.pass_registry import PASS_REGISTRY from coremltools.converters.mil.mil.passes.tests.test_passes import CONSTEXPR_FUNCS, CONSTEXPR_OPS from coremltools.converters.mil.testing_utils import ( apply_pass_and_basic_check, compute_snr_and_psnr, get_op_types_in_program, ) from coremltools.optimize.coreml.experimental._post_training_quantization import ( _get_activation_calibration_stats, ) from coremltools.optimize.coreml.experimental._quantization_passes import ( insert_prefix_quantize_dequantize_pair as _insert_prefix_quantize_dequantize_pair, ) class TestCompressionNumerical: """ This unit test is checking the numerical correctness for the compress/decompress methods in the compression graph paths. 
""" @pytest.mark.parametrize( "axis, mode, source_dtype, target_dtype, data_range", itertools.product( [0, 1, 2, 3, -1], ["LINEAR", "LINEAR_SYMMETRIC"], [np.float16, np.float32], [types.uint8, types.int8], [ [-1., 1.], [-3., -1.], [1., 3.], # Test corner case of same values [0., 0.], [1., 1.], [-1., -1.], ] ), ) def test_linear_quantizer_compression(self, axis, mode, source_dtype, target_dtype, data_range): input_shape = (10, 20, 30, 40) low, high = data_range val = np.random.uniform(low, high, input_shape).astype(source_dtype) params = quantization.linear_quantize_weights.compress(val, axis, mode, target_dtype) decompressed_val = quantization.linear_quantize_weights.decompress(params) np.testing.assert_allclose(val, decompressed_val, rtol=1e-02, atol=1e-02) @pytest.mark.parametrize( "nbits, signed, block_size, mode, source_dtype, data_range", itertools.product( [4, 8], [True, False], [0, 1, 2, 8, 32], ["LINEAR", "LINEAR_SYMMETRIC"], [np.float16, np.float32], [ [-1.0, 1.0], [-3.0, -1.0], [1.0, 3.0], [1.0, 1.0], # Test corner case of same values. ], ), ) def test_linear_quantizer_compression_blockwise( self, nbits, signed, block_size, mode, source_dtype, data_range, ): """ This test mainly follows the weights pattern in real life's ML models. However, when compressing weights to a small number of bits (such as 4-bit), the information loss is critical, which makes the numerical test hard. That's why we adjust the atol and rtol based on nbits and block_size values. For more comprehensive numerical tests, see `test_linear_quantizer_compression_blockwise_integer`. """ original_data = np.random.uniform(data_range[0], data_range[1], (32, 64)).astype( source_dtype ) compressed_params = quantization.linear_quantize_weights.blockwise_compress( original_data, nbits, mode, signed, block_sizes=[1, block_size] ) decompressed_val = quantization.linear_quantize_weights.decompress(compressed_params) if nbits > 4 and block_size < 3: # When block size is small and nbits is large, the information loss is limited. atol, rtol = 1e-02, 1e-02 elif nbits <= 2 and block_size >= 2: atol, rtol = 0.5, 0.5 else: atol, rtol = 0.2, 0.2 np.testing.assert_allclose(original_data, decompressed_val, rtol=rtol, atol=atol) @pytest.mark.parametrize( "nbits, signed, block_size, mode", itertools.product( [4, 8], [True, False], [1, 2, 8, 32], ["LINEAR", "LINEAR_SYMMETRIC"], ), ) def test_linear_quantizer_compression_blockwise_integer(self, nbits, signed, block_size, mode): """ We use int input because after rounding the dequantized data the numerical loss is less critical when comparing it to the original data. 
""" input_shape = (32, 64) nbits_range_max = 2 ** (nbits - 1) - 1 nbits_range_min = -nbits_range_max original_data = np.random.randint(nbits_range_min, nbits_range_max, input_shape).astype( np.float32 ) compressed_params = quantization.linear_quantize_weights.blockwise_compress( original_data, nbits, mode, signed, block_sizes=[1, block_size] ) decompressed_val = quantization.linear_quantize_weights.decompress(compressed_params) decompressed_val = np.round(decompressed_val).astype(original_data.dtype) assert np.sum(original_data != decompressed_val) / original_data.size < 0.03 assert np.all(np.abs(original_data - decompressed_val) <= 1) def test_linear_quantizer_compression_blockwise_corner_case(self): """ When the input data is [-2, -10, 6, -3], the np.round(quantized_data / scale) + np.round(zero_point) AND np.round(quantized_data / scale + zero_point) is different ([-1, -8, 7, -2] vs [0, -8, 7, -1]), while we follow PyTorch to use the former. """ original_data = np.array([-2, -10, 6, -3]).astype(np.float32) params = quantization.linear_quantize_weights.blockwise_compress( original_data, nbits=4, block_sizes=[4], mode="LINEAR", signed=True, ) expected_quantized_data = np.array([-1, -8, 7, -2], dtype=np.int8) np.testing.assert_equal(params.data, expected_quantized_data) def test_linear_quantizer_compression_blockwise_invalid_original_data(self): original_data_not_np_array = [1.0, 2.0] with pytest.raises(ValueError, match="Only numpy arrays are supported"): quantization.linear_quantize_weights.blockwise_compress( original_data_not_np_array, nbits=8, block_sizes=[2], mode="LINEAR", signed=True, ) original_data_integer = np.random.randint(0, 10, size=(3, 2)) with pytest.raises(ValueError, match="Only floating numpy arrays are supported."): quantization.linear_quantize_weights.blockwise_compress( original_data_integer, nbits=8, block_sizes=[0, 2], mode="LINEAR", signed=True, ) def test_linear_quantizer_compression_blockwise_invalid_block_size(self, caplog): original_data = np.random.uniform(-1.0, 1.0, (4, 6)) params = quantization.linear_quantize_weights.blockwise_compress( original_data, nbits=8, block_sizes=[1, 2], mode="LINEAR", signed=True, ) assert params.scale.shape == (4, 3) params = quantization.linear_quantize_weights.blockwise_compress( original_data, nbits=8, block_sizes=[1, 6], mode="LINEAR", signed=True, ) assert params.scale.shape == (4, 1) params = quantization.linear_quantize_weights.blockwise_compress( original_data, nbits=8, block_sizes=[2, 6], mode="LINEAR", signed=True, ) assert params.scale.shape == (2, 1) result = quantization.linear_quantize_weights.blockwise_compress( original_data, nbits=8, block_sizes=[1, 8], mode="LINEAR", signed=True, ) assert result is None expected_warning_msg = "Invalid block_sizes" assert any([expected_warning_msg in rec.message for rec in caplog.records]) @pytest.mark.parametrize( "mode, nbits, shape", itertools.product( ["KMEANS", "UNIFORM", "UNIQUE"], [1, 2, 4, 6, 8], [ (1,), (1, 1), (1, 10), (2, 20), (3, 7, 9), (17, 17, 17), ] ), ) def test_palettizer_compression(self, mode, nbits, shape): val_size = np.prod(shape) max_val = 2 ** nbits val = np.arange(max_val).tolist() val = np.array(val * (val_size // max_val + 1))[:val_size].astype(np.float32) params = quantization.palettize_weights.compress(val, mode=mode, nbits=nbits) decompressed_val = quantization.palettize_weights.decompress(params) # For # 1. UNIQUE / KMEANS mode # 2. 
UNIFORM mode with the data range <= tensor size # We can perfectly reconstruct the original value if (mode in ["UNIQUE", "KMEANS"]) or (mode == "UNIFORM" and max_val <= val_size): np.testing.assert_allclose(val, decompressed_val, rtol=1e-02, atol=1e-02) def test_palettizer_compression_channelwise_basic(self): original_data = np.arange(16, dtype=np.float32).reshape((4, 4)) # Group on axis=0. result = quantization.palettize_weights.blockwise_compress( original_data, "UNIQUE", nbits=3, block_sizes=[2, 0] ) expected_lut = np.array( [[0, 1, 2, 3, 4, 5, 6, 7], [8, 9, 10, 11, 12, 13, 14, 15]], dtype=np.float32 ).reshape((2, 1, 8, 1)) np.testing.assert_array_equal(result.lut, expected_lut) expected_indices = np.array( [[0, 1, 2, 3], [4, 5, 6, 7], [0, 1, 2, 3], [4, 5, 6, 7]] ).astype(np.int8) np.testing.assert_array_equal(result.indices, expected_indices) # Group on axis=1. result = quantization.palettize_weights.blockwise_compress( original_data, "UNIQUE", nbits=3, block_sizes=[0, 2] ) expected_lut = np.array( [[0, 1, 4, 5, 8, 9, 12, 13], [2, 3, 6, 7, 10, 11, 14, 15]], dtype=np.float32 ).reshape((1, 2, 8, 1)) np.testing.assert_array_equal(result.lut, expected_lut) expected_indices = np.array( [[0, 1, 0, 1], [2, 3, 2, 3], [4, 5, 4, 5], [6, 7, 6, 7]] ).astype(np.int8) np.testing.assert_array_equal(result.indices, expected_indices) @pytest.mark.parametrize( "nbits, channel_axis, mode, source_dtype, data_range, channel_group_size", itertools.product( [1, 2, 3, 4, 6, 8], [0, 1, 2, -1], ["KMEANS", "UNIFORM"], [np.float16, np.float32], [ [-1.0, 1.0], [-3.0, -1.0], [1.0, 3.0], [1.0, 1.0], ], [0, 1, 2], ), ) def test_palettizer_compression_channelwise_stress( self, nbits, channel_axis, mode, source_dtype, data_range, channel_group_size ): if nbits < 8: # As sub-byte numerical accuracy loss is significant, we construct palettize-friendly data. 
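# Each slice along channel_axis holds exactly 2**nbits distinct values (0 .. 2**nbits - 1), so an nbits-entry lookup table can represent the data exactly and the compress/decompress round trip should be nearly lossless.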
upper_bound = 2**nbits original_data = np.stack( [np.arange(upper_bound).reshape((1, upper_bound)) for _ in range(4)], axis=channel_axis, ) else: original_data = np.random.uniform(data_range[0], data_range[1], (2, 4, 16)).astype( source_dtype ) block_sizes = [0] * len(original_data.shape) block_sizes[channel_axis] = channel_group_size params = quantization.palettize_weights.blockwise_compress( original_data, mode, nbits, block_sizes, ) decompressed_val = quantization.palettize_weights.decompress(params) if nbits < 8 or mode == "KMEANS": np.testing.assert_allclose(original_data, decompressed_val, rtol=3e-4, atol=3e-4) else: np.testing.assert_array_almost_equal(original_data, decompressed_val, decimal=2) @pytest.mark.parametrize( "nbits, channel_axis, channel_group_size", itertools.product( [2, 3, 4, 6], [0, 1, -1], [0, 1, 2], ), ) def test_grouped_channelwise_equivalent_to_blockwise( self, nbits, channel_axis, channel_group_size ): """The grouped channelwise palettization could be expressed as general blockwise.""" original_data = np.random.randint(low=-256, high=256, size=(16, 16, 2, 2)).astype( np.float32 ) params_grouped_channelwise = quantization.palettize_weights.grouped_channelwise_compress( original_data, "UNIFORM", nbits, channel_axis, channel_group_size ) decompressed_grouped_channelwise = quantization.palettize_weights.decompress( params_grouped_channelwise ) block_sizes = [0] * len(original_data.shape) block_sizes[channel_axis] = channel_group_size params_blockwise = quantization.palettize_weights.blockwise_compress( original_data, "UNIFORM", nbits, block_sizes=block_sizes ) decompressed_blockwise = quantization.palettize_weights.decompress(params_blockwise) np.testing.assert_allclose( np.sort(params_grouped_channelwise.lut, axis=None), np.sort(params_blockwise.lut, axis=None), ) np.testing.assert_allclose(decompressed_grouped_channelwise, decompressed_blockwise) @pytest.mark.parametrize( "nbits, mode", itertools.product( [2, 3, 4, 6], ["KMEANS", "UNIFORM"], ), ) def test_tensorwise_equivalent_to_blockwise_zero(self, nbits, mode): """The block_size=0 in palettization is equivalent to legacy tensorwise compression.""" original_data = np.random.randint(low=-256, high=256, size=(16, 16, 2, 2)).astype( np.float32 ) params_old = quantization.palettize_weights.compress(original_data, mode, nbits) decompressed_old = quantization.palettize_weights.decompress(params_old) params_new = quantization.palettize_weights.blockwise_compress( original_data, mode, nbits, block_sizes=[0] * len(original_data.shape) ) decompressed_new = quantization.palettize_weights.decompress(params_new) np.testing.assert_allclose( np.sort(params_old.lut, axis=None), np.sort(params_new.lut, axis=None), atol=5e-5, rtol=1e-6, ) np.testing.assert_allclose(decompressed_old, decompressed_new, atol=5e-5, rtol=1e-6) @pytest.mark.parametrize( "nbits, channel_axis, channel_group_size", itertools.product( [2, 3, 4], [0, 1], [1, 2], ), ) def test_grouped_channelwise_better_than_tensorwise( self, nbits, channel_axis, channel_group_size ): """The noise introduced by per-tensor lut should be more than grouped-channel-wise lut.""" original_data = np.random.randint(low=-512, high=512, size=(32, 32, 2, 2)).astype( np.float32 ) block_sizes_channelwise = [0] * len(original_data.shape) block_sizes_channelwise[channel_axis] = channel_group_size params_grouped_channelwise = quantization.palettize_weights.blockwise_compress( original_data, "UNIFORM", nbits, block_sizes_channelwise, ) block_sizes_per_tensor = [0] * len(original_data.shape) 
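# block_sizes of all zeros selects a single LUT shared by the whole tensor (per-tensor palettization); with the same nbits it provides far fewer centroids per element than the grouped-channel-wise LUTs above, so its reconstruction SNR is expected to be lower.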
params_per_tensor = quantization.palettize_weights.blockwise_compress( original_data, "UNIFORM", nbits, block_sizes_per_tensor, ) decompressed_grouped_channelwise = quantization.palettize_weights.decompress( params_grouped_channelwise ) decompressed_per_tensor = quantization.palettize_weights.decompress(params_per_tensor) snr_grouped_channelwise = compute_snr_and_psnr( original_data, decompressed_grouped_channelwise )[0] snr_per_tensor = compute_snr_and_psnr(original_data, decompressed_per_tensor)[0] assert snr_grouped_channelwise > snr_per_tensor def test_palettizer_compression_blockwise_invalid(self): with pytest.raises(ValueError, match="Only numpy arrays are supported"): quantization.palettize_weights.blockwise_compress(10, "KMEANS", 6, [0]) with pytest.raises(ValueError, match="Invalid nbits."): quantization.palettize_weights.blockwise_compress( np.random.uniform(-1.0, 1.0, (2, 3, 4)), "KMEANS", nbits=5, block_sizes=[0, 0, 1] ) assert ( quantization.palettize_weights.blockwise_compress( np.random.uniform(-1.0, 1.0, (2, 3, 4)), "KMEANS", nbits=3, block_sizes=[3, 0, 0] ) is None ) def test_block_sparsity_pruning_smoke(self): # dim = 0 val = np.array( [ [1, 3, 4], [-6, -7, 2], [0, 3, 4], [-9, 2, -1], ] ).astype(np.float32) expected_val = np.array( [ [1, 3, 0], [-6, -7, 0], [0, 0, 0], [-9, 0, 0], ] ).astype(np.float32) params = quantization.prune_weights.compress_by_magnitude( val, target_sparsity=0.5, block_size=2, dim=0, ) decompressed_val = quantization.prune_weights.decompress(params) np.testing.assert_array_equal(decompressed_val, expected_val) # dim = 1, with padding val = np.array( [ [1, 3, 4, 18, 1], [-6, -7, 2, 2, 9], [0, 3, 4, 8, 9], ] ).astype(np.float32) expected_val = np.array( [ [0, 0, 4, 18, 0], [-6, -7, 0, 0, 9], [0, 0, 0, 0, 9], ] ).astype(np.float32) params = quantization.prune_weights.compress_by_magnitude( val, target_sparsity=0.5, block_size=2, dim=1, ) decompressed_val = quantization.prune_weights.decompress(params) np.testing.assert_array_equal(decompressed_val, expected_val) @pytest.mark.parametrize( "block_size, target_sparsity, shape, dim", itertools.product( [2, 5, 10, 17], [0.0, 0.1, 0.5, 0.75, 1.0], [ (10, 25), ( 10, 5, 8, ), (40, 100, 6, 7), (20, 60, 4, 5, 6), ], [0, 1], ), ) def test_block_sparsity_pruning_stress(self, block_size, target_sparsity, shape, dim): def _is_int(val): return int(val) == val val = np.random.rand(*shape) rank = len(shape) params = quantization.prune_weights.compress_by_magnitude( val, target_sparsity=target_sparsity, block_size=block_size, dim=dim, ) if block_size > shape[dim] / 2: assert params is None return decompressed_val = quantization.prune_weights.decompress(params) assert decompressed_val.shape == val.shape sparsity_percentile = np.sum(decompressed_val == 0) / np.prod(shape) if (shape[dim]) % block_size == 0 and _is_int( np.prod(shape) // block_size * target_sparsity ): assert sparsity_percentile == target_sparsity val_compress = np.copy(val) val_compress[np.where(decompressed_val == 0)] = 0 np.testing.assert_array_equal(decompressed_val, val_compress) def test_n_m_pruning_smoke(self): # dim = 1 val = np.array( [ [1, 3, 4, -3], [-6, -7, 2, 4], [0, 3, 4, 1], [-9, 2, -1, 8], ] ).astype(np.float32) expected_val = np.array( [ [0, 3, 4, 0], [0, -7, 0, 4], [0, 3, 4, 0], [-9, 0, 0, 8], ] ).astype(np.float32) params = quantization.prune_weights.compress_by_nm_sparsity( val, n_m_ratio=(1, 2), dim=1, ) decompressed_val = quantization.prune_weights.decompress(params) np.testing.assert_array_equal(decompressed_val, expected_val) # dim = 
0, with padding val = np.array( [ [1, 3, 4, -3, 2, 4], [-6, -7, 2, 4, 6, 8], [0, 4, 4, 1, -9, -4], [-9, 2, -1, 8, 3, 9], [-1, 5, 0, 8, 9, -3], [-3, 3, 6, 3, 6, -1], [2, 1, -2, 8, 2, -6], ] ).astype(np.float32) expected_val = np.array( [ [0, 0, 0, 0, 0, 0], [-6, -7, 0, 4, 0, 8], [0, 0, 4, 0, -9, 0], [-9, 0, 0, 0, 0, 9], [0, 5, 0, 8, 9, 0], [0, 0, 6, 0, 0, 0], [2, 1, -2, 8, 2, -6], ] ).astype(np.float32) params = quantization.prune_weights.compress_by_nm_sparsity( val, n_m_ratio=(2, 3), dim=0, ) decompressed_val = quantization.prune_weights.decompress(params) np.testing.assert_array_equal(decompressed_val, expected_val) @pytest.mark.parametrize( "n_m_ratio, shape", itertools.product( [ (1, 1), (0, 2), (1, 2), (3, 5), (5, 10), (12, 17), ], [ (1, 2), (3, 3), ( 10, 5, 8, ), (80, 50, 6, 7), (40, 30, 4, 5, 6), ], ), ) def test_n_m_pruning_stress(self, n_m_ratio, shape): n, m = n_m_ratio val = np.random.rand(*shape) rank = len(shape) for dim in [0, 1]: params = quantization.prune_weights.compress_by_nm_sparsity( val, n_m_ratio=n_m_ratio, dim=dim, ) # We skip the compression if m > channel / 2 if m > shape[dim] / 2: assert params is None return decompressed_val = quantization.prune_weights.decompress(params) assert decompressed_val.shape == val.shape sparsity_percentile = np.sum(decompressed_val == 0) / np.prod(shape) if (shape[dim]) % m == 0: assert sparsity_percentile == n / m val_compress = np.copy(val) val_compress[np.where(decompressed_val == 0)] = 0 np.testing.assert_array_equal(decompressed_val, val_compress) class TestCompressionGraphBackwardCompatibility: """ Most of the numerical tests are already covered in coremltools.tests.ml_program.test_compression_utils. This test is checking the basic behavior of the graph pass classes using only global config. This test also covers backward compatibility for the deprecated ct.compression_utils. 
""" @staticmethod def _get_conv_program(): @mb.program( input_specs=[mb.TensorSpec(shape=(1, 30, 10, 10))], opset_version=ct.target.iOS16 ) def prog(x): conv_weight = np.random.rand(90, 30, 2, 2).astype(np.float32) x = mb.conv(x=x, weight=conv_weight) return x return prog @pytest.mark.parametrize( "fake_compression, is_deprecated", itertools.product( [True, False], [True, False], ) ) def test_affine_quantizer(self, fake_compression, is_deprecated): weight_threshold = None if is_deprecated else 0 op_selector=(lambda const: True) if is_deprecated else None op_config = cto.coreml.OpLinearQuantizerConfig(weight_threshold=weight_threshold) config = cto.coreml.OptimizationConfig(global_config=op_config, is_deprecated=is_deprecated, op_selector=op_selector) quantizer = quantization.linear_quantize_weights( config=config, fake_compression=fake_compression ) prog = self._get_conv_program() quantizer.apply(prog) expected_ops = ["constexpr_affine_dequantize", "conv"] if not fake_compression else ["conv"] assert get_op_types_in_program(prog) == expected_ops @pytest.mark.parametrize( "fake_compression, is_deprecated", itertools.product( [True, False], [True, False], ) ) def test_weight_pruner(self, fake_compression, is_deprecated): weight_threshold = None if is_deprecated else 0 op_selector=(lambda const: True) if is_deprecated else None op_config = cto.coreml.OpMagnitudePrunerConfig( weight_threshold=weight_threshold, target_sparsity=0.75, ) config = cto.coreml.OptimizationConfig(global_config=op_config, is_deprecated=is_deprecated, op_selector=op_selector) quantizer = quantization.prune_weights( config=config, fake_compression=fake_compression ) prog = self._get_conv_program() quantizer.apply(prog) expected_ops = ["constexpr_sparse_to_dense", "conv"] if not fake_compression else ["conv"] assert get_op_types_in_program(prog) == expected_ops @pytest.mark.parametrize( "fake_compression, is_deprecated", itertools.product( [True, False], [True, False], ) ) def test_weight_palettization(self, fake_compression, is_deprecated): weight_threshold = None if is_deprecated else 0 op_selector=(lambda const: True) if is_deprecated else None op_config = cto.coreml.OpPalettizerConfig( weight_threshold=weight_threshold, mode="uniform", nbits=4, ) config = cto.coreml.OptimizationConfig(global_config=op_config, is_deprecated=is_deprecated, op_selector=op_selector) quantizer = quantization.palettize_weights( config=config, fake_compression=fake_compression ) prog = self._get_conv_program() quantizer.apply(prog) expected_ops = ["constexpr_lut_to_dense", "conv"] if not fake_compression else ["conv"] assert get_op_types_in_program(prog) == expected_ops class TestCompressionPasses: @staticmethod def _get_test_program(): @mb.program( input_specs=[mb.TensorSpec(shape=(1, 30, 10, 10))], opset_version=ct.target.iOS16 ) def prog(x): # weight conv_weight = np.random.rand(90, 30, 2, 2).astype(np.float32) linear_weight = np.random.rand(70, 81).astype(np.float32) conv_transpose_weight = np.random.rand(30, 4, 21, 10).astype(np.float32) # graph x = mb.conv(x=x, weight=conv_weight, name="conv") x = mb.reshape(x=x, shape=(1, 90, 81), name="reshape_1") x = mb.linear(x=x, weight=linear_weight, name="linear") x = mb.reshape(x=x, shape=(1, 30, 21, 10), name="reshape_2") x = mb.conv_transpose(x=x, weight=conv_transpose_weight, name="conv_transpose") return x return prog @staticmethod def _get_test_program_2(): @mb.program( input_specs=[mb.TensorSpec(shape=(1, 30, 10, 10))], opset_version=ct.target.iOS16 ) def prog(x): # weight conv1_weight = 
np.random.rand(40, 30, 2, 2).astype(np.float32) conv2_weight = np.random.rand(50, 40, 3, 3).astype(np.float32) conv3_weight = np.random.rand(60, 50, 2, 4).astype(np.float32) linear1_weight = np.random.rand(80, 60).astype(np.float32) linear2_weight = np.random.rand(90, 80).astype(np.float32) conv_transpose_weight = np.random.rand(60, 30, 6, 10).astype(np.float32) # graph x = mb.conv(x=x, weight=conv1_weight, name="conv1") x = mb.conv(x=x, weight=conv2_weight, name="conv2") x = mb.conv(x=x, weight=conv3_weight, name="conv3") x = mb.reshape(x=x, shape=(6, 4, 60), name="reshape1") x = mb.linear(x=x, weight=linear1_weight, name="linear1") x = mb.linear(x=x, weight=linear2_weight, name="linear2") x = mb.reshape(x=x, shape=(1, 30, 6, 12), name="reshape2") x = mb.conv_transpose(x=x, weight=conv_transpose_weight, name="conv_transpose") return x return prog @staticmethod def _get_test_program_3(): """An iOS18 program with conv, linear, matmul, and conv_transpose.""" @mb.program( input_specs=[mb.TensorSpec(shape=(1, 30, 10, 10))], opset_version=ct.target.iOS18, ) def prog(x): # weight conv_weight = np.random.rand(90, 30, 2, 2).astype(np.float32) linear_weight = np.random.rand(70, 81).astype(np.float32) matmul_weight = np.random.rand(2, 1, 70, 35).astype(np.float32) conv_transpose_weight = np.random.rand(30, 4, 21, 10).astype(np.float32) # graph x = mb.conv(x=x, weight=conv_weight, name="conv") x = mb.reshape(x=x, shape=(1, 90, 81), name="reshape_1") x = mb.linear(x=x, weight=linear_weight, name="linear") x = mb.matmul(x=x, y=matmul_weight, transpose_y=False, name="matmul") x = mb.reshape(x=x, shape=(1, 30, 21, 10), name="reshape_2") x = mb.conv_transpose(x=x, weight=conv_transpose_weight, name="conv_transpose") return x return prog @staticmethod def _get_test_program_conv(): """An iOS17 program with conv.""" @mb.program( input_specs=[mb.TensorSpec(shape=(1, 30, 10, 10))], opset_version=ct.target.iOS17 ) def prog(x): conv_weight = np.random.rand(90, 30, 2, 2).astype(np.float32) x = mb.cast(x=x, dtype="fp16") x = mb.conv(x=x, weight=conv_weight) x = mb.cast(x=x, dtype="fp32") return x return prog @staticmethod def _get_test_program_conv_relu(): """An iOS17 program with conv and relu.""" @mb.program( input_specs=[mb.TensorSpec(shape=(1, 30, 10, 10))], opset_version=ct.target.iOS17 ) def prog(x): conv_weight = np.random.rand(90, 30, 2, 2).astype(np.float32) x = mb.cast(x=x, dtype="fp16") x = mb.conv(x=x, weight=conv_weight) x = mb.relu(x=x) x = mb.cast(x=x, dtype="fp32") return x return prog @staticmethod def _get_test_program_add(): """An iOS17 program with add.""" @mb.program( input_specs=[mb.TensorSpec(shape=(1, 2, 4, 4)), mb.TensorSpec(shape=(1, 2, 4, 4))], opset_version=ct.target.iOS17, ) def prog(x1, x2): y1 = mb.cast(x=x1, dtype="fp16") y2 = mb.cast(x=x2, dtype="fp16") y = mb.add(x=y1, y=y2) z = mb.cast(x=y, dtype="fp32") return z return prog @staticmethod def _get_test_program_avgpool(): """An iOS17 program with avg_pool""" @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 4, 4))], opset_version=ct.target.iOS17) def prog(x): # graph x = mb.cast(x=x, dtype="fp16") x = mb.avg_pool(x=x, kernel_sizes=[1, 1], strides=[1, 1], pad_type="valid") x = mb.cast(x=x, dtype="fp32") return x return prog @staticmethod def _get_test_program_maxpool(): """An iOS17 program with max_pool""" @mb.program(input_specs=[mb.TensorSpec(shape=(1, 2, 4, 4))], opset_version=ct.target.iOS17) def prog(x): # graph x = mb.cast(x=x, dtype="fp16") x = mb.max_pool(x=x, kernel_sizes=[1, 1], strides=[1, 1], pad_type="valid") x = 
mb.cast(x=x, dtype="fp32") return x return prog @staticmethod def _get_test_mlmodel_conv_relu(): """A mlmodel with conv, relu""" # Prepare torch model. inputs = [ct.TensorType(name="data", shape=(5, 10, 4, 4))] input_data = [torch.rand(*i.shape.to_list()) for i in inputs] m = torch.nn.Sequential( torch.nn.Conv2d(in_channels=10, out_channels=20, kernel_size=4), torch.nn.ReLU(), ) torchmodel = torch.jit.trace(m, input_data) # Convert to mlmodel. mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", compute_units=ct.ComputeUnit.CPU_ONLY, compute_precision=ct.precision.FLOAT16, ) return mlmodel @staticmethod def _get_test_mlmodel_boolean_type(): """A mlmodel with boolean type intermediate tensor""" # Prepare torch model. class Net(torch.nn.Module): def __init__(self): super(Net, self).__init__() self.linear1 = torch.nn.Linear(28 * 28, 100) self.linear2 = torch.nn.Linear(28 * 28, 100) def forward(self, img): # convert + flatten y1 = self.linear1(img) y2 = self.linear2(img) y = torch.logical_and(y1, y2) return y model = Net() inputs = [ct.TensorType(name="data", shape=(1, 28 * 28))] input_data = [torch.rand(*i.shape.to_list()) for i in inputs] torchmodel = torch.jit.trace(model, input_data) # Convert to mlmodel. mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", compute_units=ct.ComputeUnit.CPU_ONLY, compute_precision=ct.precision.FLOAT16, ) return mlmodel @staticmethod def _get_test_mlmodel_conv_concat(): """A mlmodel has a concat with 2 inputs and 1 output all surrounded by conv.""" # Prepare torch model. class Net(torch.nn.Module): def __init__(self): super(Net, self).__init__() self.conv1 = torch.nn.Conv2d(in_channels=10, out_channels=20, kernel_size=4) self.conv2 = torch.nn.Conv2d(in_channels=10, out_channels=20, kernel_size=4) self.conv3 = torch.nn.Conv2d(in_channels=20, out_channels=20, kernel_size=1) def forward(self, img): # convert + flatten y1 = self.conv1(img) y2 = self.conv2(img) y = torch.concat((y1, y2), 0) y3 = self.conv3(y) return y3 model = Net() inputs = [ct.TensorType(name="data_0", shape=(5, 10, 4, 4))] input_data = [torch.rand(*i.shape.to_list()) for i in inputs] torchmodel = torch.jit.trace(model, input_data) # Convert to mlmodel. mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", compute_units=ct.ComputeUnit.CPU_ONLY, compute_precision=ct.precision.FLOAT16, ) return mlmodel class TestOptimizationConfig(TestCompressionPasses): """ Test some basic functionality of the OptimizationConfig. """ @pytest.mark.parametrize( "compressor_class, fake_compression", itertools.product( [ quantization.palettize_weights, quantization.prune_weights, quantization.linear_quantize_weights, ], [True, False], ) ) def test_empty_config(self, compressor_class, fake_compression): """ For an empty config, the compression graph passes should do nothing """ config = cto.coreml.OptimizationConfig() compressor = compressor_class( config=config, fake_compression=fake_compression ) prog = self._get_test_program() compressor.apply(prog) expected_ops = ["conv", "reshape", "linear", "reshape", "conv_transpose"] assert get_op_types_in_program(prog) == expected_ops def test_empty_op_type(self): """ If an op_type config is set to None. The entire class will not be compressed. 
""" config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig(mode="kmeans", nbits=2), op_type_configs={ "conv": None, }, ) compressor = quantization.palettize_weights(config=config) prog = self._get_test_program() compressor.apply(prog) expected_ops = [ "conv", "reshape", "constexpr_lut_to_dense", "linear", "reshape", "constexpr_lut_to_dense", "conv_transpose", ] assert get_op_types_in_program(prog) == expected_ops conv_op = prog.find_ops(op_type="conv")[0] assert conv_op.weight.op.op_type == "const" def test_empty_op_name(self): """ If an op_name config is set to None. The op instance will not be compressed. """ config = cto.coreml.OptimizationConfig( op_type_configs={ "conv": cto.coreml.OpPalettizerConfig(mode="kmeans", nbits=2), }, op_name_configs={ "conv1": None, }, ) compressor = quantization.palettize_weights(config=config) prog = self._get_test_program_2() compressor.apply(prog) expected_ops = [ "conv", "constexpr_lut_to_dense", "conv", "constexpr_lut_to_dense", "conv", "reshape", "linear", "linear", "reshape", "conv_transpose", ] assert get_op_types_in_program(prog) == expected_ops conv_op = prog.find_ops(op_type="conv")[0] assert conv_op.weight.op.op_type == "const" def test_config_hierarchy(self): """ This test is checking the graph pass compresses the program correctly according to the following hierarchical order (high -> low): 1. op name 2. op type 3. global """ prog = self._get_test_program_2() # global config global_config = cto.coreml.OpPalettizerConfig( nbits=8, mode="KMEANS", weight_threshold=100, ) # op type config conv_config = cto.coreml.OpPalettizerConfig( nbits=6, mode="KMEANS", weight_threshold=100, ) linear_config = cto.coreml.OpPalettizerConfig( nbits=4, mode="KMEANS", weight_threshold=100, ) # op name config conv1_config = cto.coreml.OpPalettizerConfig( nbits=2, mode="KMEANS", weight_threshold=100, ) linear2_config = cto.coreml.OpPalettizerConfig( nbits=1, mode="KMEANS", weight_threshold=100, ) config = cto.coreml.OptimizationConfig() config.set_global(global_config) config.set_op_type("conv", conv_config) config.set_op_type("linear", linear_config) config.set_op_name("conv1", conv1_config) config.set_op_name("linear2", linear2_config) compressor = quantization.palettize_weights(config=config) compressor.apply(prog) expected_ops = [ "constexpr_lut_to_dense", "conv", "constexpr_lut_to_dense", "conv", "constexpr_lut_to_dense", "conv", "reshape", "constexpr_lut_to_dense", "linear", "constexpr_lut_to_dense", "linear", "reshape", "constexpr_lut_to_dense", "conv_transpose", ] assert get_op_types_in_program(prog) == expected_ops expected_nbits = [2, 6, 6, 4, 1, 8, 8] lut_ops = prog.find_ops(op_type="constexpr_lut_to_dense") for nbits, op in zip(expected_nbits, lut_ops): assert op.lut.val.shape == (2**nbits,) def test_mixed_compression_algorithms(self): """ This test is checking a program can be ran under different compression method """ prog = self._get_test_program_2() # Run palettization for conv ops conv_config = cto.coreml.OpPalettizerConfig( nbits=1, mode="KMEANS", weight_threshold=100, ) config = cto.coreml.OptimizationConfig() config.set_op_type("conv", conv_config) compressor = quantization.palettize_weights(config=config) compressor.apply(prog) expected_ops = [ "constexpr_lut_to_dense", "conv", "constexpr_lut_to_dense", "conv", "constexpr_lut_to_dense", "conv", "reshape", "linear", "linear", "reshape", "conv_transpose", ] assert get_op_types_in_program(prog) == expected_ops # Run affine quanitzation for conv1 / linear1. 
Note that since conv1 is already compressed # the quantization makes no affect on it op_name_config = cto.coreml.OpLinearQuantizerConfig( mode="LINEAR_SYMMETRIC", dtype=np.int8, weight_threshold=100, ) config = cto.coreml.OptimizationConfig() config.set_op_name("conv1", op_name_config) config.set_op_name("linear1", op_name_config) compressor = quantization.linear_quantize_weights(config=config) compressor.apply(prog) expected_ops = [ "constexpr_lut_to_dense", "conv", "constexpr_lut_to_dense", "conv", "constexpr_lut_to_dense", "conv", "reshape", "constexpr_affine_dequantize", "linear", "linear", "reshape", "conv_transpose", ] assert get_op_types_in_program(prog) == expected_ops # Run sparsification for the whoel program global_config = cto.coreml.OpMagnitudePrunerConfig( target_sparsity=0.85, weight_threshold=100, ) config = cto.coreml.OptimizationConfig(global_config=global_config) compressor = quantization.prune_weights(config=config) compressor.apply(prog) expected_ops = [ "constexpr_lut_to_dense", "conv", "constexpr_lut_to_dense", "conv", "constexpr_lut_to_dense", "conv", "reshape", "constexpr_affine_dequantize", "linear", "constexpr_sparse_to_dense", "linear", "reshape", "constexpr_sparse_to_dense", "conv_transpose", ] assert get_op_types_in_program(prog) == expected_ops @staticmethod def test_const_only_used_as_output_skip_compress(): """ If the const is only fed to the block output, we skip the compression, due to the bug rdar://108274019 ([Bug] constexpr ops cannot be directly fed to block output) """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20, 30))], opset_version=ct.target.iOS16) def prog(x): val = np.random.rand(10, 20, 30).astype(np.float32) const = mb.const(val=val) output = mb.add(x=x, y=1.0) return output, const op_config = cto.coreml.OpPalettizerConfig( nbits=2, mode="kmeans", weight_threshold=0, ) config = cto.coreml.OptimizationConfig(global_config=op_config) compressor = quantization.palettize_weights(config=config) compressor.apply(prog) assert get_op_types_in_program(prog) == ["add"] @staticmethod def test_const_as_output(): """ If the const is fed to the block output and at least one another op, it can still be compressed """ @mb.program(input_specs=[mb.TensorSpec(shape=(10, 20, 30))], opset_version=ct.target.iOS16) def prog(x): val = np.random.rand(10, 20, 30).astype(np.float32) const = mb.const(val=val) output = mb.add(x=x, y=const) return output, const op_config = cto.coreml.OpPalettizerConfig( nbits=2, mode="kmeans", weight_threshold=0, ) config = cto.coreml.OptimizationConfig(global_config=op_config) compressor = quantization.palettize_weights(config=config) compressor.apply(prog) assert get_op_types_in_program(prog) == ["constexpr_lut_to_dense", "add"] @staticmethod def test_set_op_name_for_const(): """ We can set_op_name for const ops """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 10, 30))], opset_version=ct.target.iOS16) def prog(x): add_const_1 = np.random.rand(10, 30).astype(np.float32) add_const_2 = np.random.rand(10, 30).astype(np.float32) const_1 = mb.const(val=add_const_1, name="const_1") const_2 = mb.const(val=add_const_2, name="const_2") x = mb.add(x=x, y=const_1) return mb.add(x=x, y=const_2) compressor = quantization.palettize_weights( config=cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig(nbits=2, mode="KMEANS", weight_threshold=50), op_name_configs={"const_2": cto.coreml.OpPalettizerConfig(nbits=4, mode="KMEANS", weight_threshold=50)} ) ) compressor.apply(prog) expected_ops = [ "constexpr_lut_to_dense", 
"constexpr_lut_to_dense", "add", "add", ] assert get_op_types_in_program(prog) == expected_ops expected_nbits = [2, 4] lut_ops = prog.find_ops(op_type="constexpr_lut_to_dense") for nbits, op in zip(expected_nbits, lut_ops): assert op.lut.val.shape == (2**nbits,) @staticmethod @pytest.mark.parametrize( "constexpr_op", CONSTEXPR_OPS, ) def test_constexpr_const_not_compressed(constexpr_op): """ The const op which is fed into constexpr ops cannot be compressed. """ @mb.program(input_specs=[mb.TensorSpec(shape=(2, 3, 4, 5))]) def prog(x): constexpr = CONSTEXPR_FUNCS[constexpr_op]((2, 3, 4, 5)) return mb.add(x=x, y=constexpr) compressor = quantization.palettize_weights( config=cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig(nbits=2, mode="KMEANS", weight_threshold=0), ) ) compressor.apply(prog) expected_ops = [constexpr_op, "add"] assert get_op_types_in_program(prog) == expected_ops @staticmethod def test_shared_weights(): """ If a const is shared with different downstream ops, we do a further conflict detection. """ def _get_program(): @mb.program( input_specs=[mb.TensorSpec(shape=(1, 10, 30))], opset_version=ct.target.iOS16 ) def prog(x): add_const = np.random.rand(10, 30).astype(np.float32) add_const = mb.const(val=add_const, name="add_const") x = mb.add(x=x, y=add_const, name="add_1") return mb.add(x=x, y=add_const, name="add_2") return prog # [Case 1] No conflict. Global and op_name level config are the same prog = _get_program() compressor = quantization.palettize_weights( config=cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig(nbits=2, mode="KMEANS", weight_threshold=50), op_name_configs={"add_2": cto.coreml.OpPalettizerConfig(nbits=2, mode="KMEANS", weight_threshold=50)} ) ) compressor.apply(prog) expected_ops = [ "constexpr_lut_to_dense", "add", "add", ] assert get_op_types_in_program(prog) == expected_ops # [Case 2] No conflict. op_name level configs are the same prog = _get_program() compressor = quantization.palettize_weights( config=cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig(nbits=4, mode="KMEANS", weight_threshold=50), op_name_configs={ "add_1": cto.coreml.OpPalettizerConfig(nbits=2, mode="KMEANS", weight_threshold=50), "add_2": cto.coreml.OpPalettizerConfig(nbits=2, mode="KMEANS", weight_threshold=50), } ) ) compressor.apply(prog) expected_ops = [ "constexpr_lut_to_dense", "add", "add", ] assert get_op_types_in_program(prog) == expected_ops # [Case 3] Conflict. Global and op_name level config are different prog = _get_program() compressor = quantization.palettize_weights( config=cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig(nbits=2, mode="KMEANS", weight_threshold=50), op_name_configs={"add_2": cto.coreml.OpPalettizerConfig(nbits=4, mode="KMEANS", weight_threshold=50)} ) ) with pytest.raises(ValueError, match="compression config conflict detected between ops"): compressor.apply(prog) # [Case 4] Conflict. 
op_name level configs are different prog = _get_program() compressor = quantization.palettize_weights( config=cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig(nbits=2, mode="KMEANS", weight_threshold=50), op_name_configs={ "add_1": cto.coreml.OpPalettizerConfig(nbits=4, mode="KMEANS", weight_threshold=50), "add_2": cto.coreml.OpPalettizerConfig(nbits=4, mode="KMEANS", weight_threshold=30), }, ) ) with pytest.raises(ValueError, match="compression config conflict detected between ops"): compressor.apply(prog) class TestLinearQuantizer(TestCompressionPasses): @pytest.mark.parametrize( "mode, dtype, weight_threshold, fake_compression", itertools.product( ["LINEAR", "LINEAR_SYMMETRIC"], [np.int8, np.uint8, types.int8, types.uint8], [1000, 7000], [True, False], ), ) def test_global_config_affine_quantizer(self, mode, dtype, weight_threshold, fake_compression): """ Global config would compress all operations with the same config """ op_config = cto.coreml.OpLinearQuantizerConfig( mode=mode, dtype=dtype, weight_threshold=weight_threshold ) config = cto.coreml.OptimizationConfig(global_config=op_config) compressor = quantization.linear_quantize_weights( config=config, fake_compression=fake_compression ) prog = self._get_test_program() compressor.apply(prog) if fake_compression: expected_ops = ["conv", "reshape", "linear", "reshape", "conv_transpose"] elif weight_threshold == 1000: expected_ops = [ "constexpr_affine_dequantize", "conv", "reshape", "constexpr_affine_dequantize", "linear", "reshape", "constexpr_affine_dequantize", "conv_transpose", ] else: assert weight_threshold == 7000 # linear weight size < 7000 expected_ops = [ "constexpr_affine_dequantize", "conv", "reshape", "linear", "reshape", "constexpr_affine_dequantize", "conv_transpose", ] assert get_op_types_in_program(prog) == expected_ops @pytest.mark.parametrize( "mode, dtype, block_size, weight_threshold, fake_compression", itertools.product( ["LINEAR", "LINEAR_SYMMETRIC"], ["int4", "uint4", "int8", "uint8", np.int8, np.uint8], [1], [1000, 7000], [True, False], ), ) def test_global_config_affine_quantizer_blockwise( self, mode, dtype, block_size, weight_threshold, fake_compression ): """ Global config would compress all operations with the same config for blockwise. """ op_config = cto.coreml.OpLinearQuantizerConfig( mode=mode, dtype=dtype, granularity="per_block", block_size=block_size, weight_threshold=weight_threshold, ) config = cto.coreml.OptimizationConfig(global_config=op_config) compressor = quantization.linear_quantize_weights( config=config, fake_compression=fake_compression ) prog = self._get_test_program_3() compressor.apply(prog) if fake_compression: expected_ops = ["conv", "reshape", "linear", "matmul", "reshape", "conv_transpose"] elif weight_threshold == 1000: expected_ops = [ "constexpr_blockwise_shift_scale", "conv", "reshape", "constexpr_blockwise_shift_scale", "linear", "constexpr_blockwise_shift_scale", "matmul", "reshape", "constexpr_blockwise_shift_scale", "conv_transpose", ] else: assert weight_threshold == 7000 # linear and matmul weight size < 7000 expected_ops = [ "constexpr_blockwise_shift_scale", "conv", "reshape", "linear", "matmul", "reshape", "constexpr_blockwise_shift_scale", "conv_transpose", ] assert get_op_types_in_program(prog) == expected_ops def test_op_type_config_linear_quantizer(self): """ set_op_type allow the user to set different config for each op type. 
Also checking that the config can be overwritten """ conv_config_1 = cto.coreml.OpLinearQuantizerConfig( mode="LINEAR_SYMMETRIC", dtype=np.uint8, weight_threshold=2000, ) # conv_config_2 overwrite conv_config_1 conv_config_2 = cto.coreml.OpLinearQuantizerConfig( mode="LINEAR_SYMMETRIC", dtype=np.int8, weight_threshold=2000, ) # The weight_threshold is super large so linear is not going to be compressed linear_config = cto.coreml.OpLinearQuantizerConfig( mode="LINEAR_SYMMETRIC", dtype=np.int8, weight_threshold=1000000, ) conv_transpose_config = cto.coreml.OpLinearQuantizerConfig( mode="LINEAR", dtype=np.uint8, weight_threshold=2000, ) config = cto.coreml.OptimizationConfig() config.set_op_type("conv", conv_config_1) config.set_op_type("conv", conv_config_2) config.set_op_type("linear", linear_config) config.set_op_type("conv_transpose", conv_transpose_config) compressor = quantization.linear_quantize_weights(config=config) prog = self._get_test_program() compressor.apply(prog) expected_ops = [ "constexpr_affine_dequantize", "conv", "reshape", "linear", "reshape", "constexpr_affine_dequantize", "conv_transpose", ] assert get_op_types_in_program(prog) == expected_ops # Test different dtype are applied assert ( prog.find_ops(op_type="constexpr_affine_dequantize")[0].quantized_data.val.dtype == np.int8 ) assert ( prog.find_ops(op_type="constexpr_affine_dequantize")[1].quantized_data.val.dtype == np.uint8 ) def test_op_type_config_linear_quantizer_blockwise(self): """ set_op_type allow the user to set different config for each op type for blockwise. Also checking that the config can be overwritten. """ conv_config_1 = cto.coreml.OpLinearQuantizerConfig( mode="LINEAR_SYMMETRIC", dtype="int8", granularity="per_block", block_size=10, weight_threshold=5000, ) # conv_config_2 overwrite conv_config_1 conv_config_2 = cto.coreml.OpLinearQuantizerConfig( mode="LINEAR_SYMMETRIC", dtype="int4", granularity="per_block", block_size=3, weight_threshold=2000, ) # The weight_threshold is super large so linear is not going to be compressed linear_config = cto.coreml.OpLinearQuantizerConfig( mode="LINEAR_SYMMETRIC", dtype="int4", granularity="per_block", weight_threshold=1000000, ) conv_transpose_config = cto.coreml.OpLinearQuantizerConfig( mode="LINEAR", dtype="int8", granularity="per_block", block_size=10, weight_threshold=2000, ) config = cto.coreml.OptimizationConfig() config.set_op_type("conv", conv_config_1) config.set_op_type("conv", conv_config_2) config.set_op_type("linear", linear_config) config.set_op_type("conv_transpose", conv_transpose_config) compressor = quantization.linear_quantize_weights(config=config) prog = self._get_test_program_3() compressor.apply(prog) expected_ops = [ "constexpr_blockwise_shift_scale", "conv", "reshape", "linear", "matmul", "reshape", "constexpr_blockwise_shift_scale", "conv_transpose", ] assert get_op_types_in_program(prog) == expected_ops assert prog.find_ops(op_type="constexpr_blockwise_shift_scale")[0].offset is None assert prog.find_ops(op_type="constexpr_blockwise_shift_scale")[1].offset is not None def test_op_name_config_linear_quantizer(self): """ set_op_name allow the user to set different config for each op specified by name. 
Also checking that the config can be overwritten """ conv_config_1 = cto.coreml.OpLinearQuantizerConfig( mode="LINEAR_SYMMETRIC", dtype=np.uint8, weight_threshold=2000, ) # conv_config_2 overwrite conv_config_1 conv_config_2 = cto.coreml.OpLinearQuantizerConfig( mode="LINEAR_SYMMETRIC", dtype=np.int8, weight_threshold=2000, ) # The weight_threshold is super large so linear is not going to be compressed linear_config = cto.coreml.OpLinearQuantizerConfig( mode="LINEAR_SYMMETRIC", dtype=np.int8, weight_threshold=1000000, ) conv_transpose_config = cto.coreml.OpLinearQuantizerConfig( mode="LINEAR", dtype=np.uint8, weight_threshold=2000, ) config = cto.coreml.OptimizationConfig() config.set_op_name("conv", conv_config_1) config.set_op_name("conv", conv_config_2) config.set_op_name("linear", linear_config) config.set_op_name("conv_transpose", conv_transpose_config) compressor = quantization.linear_quantize_weights(config=config) prog = self._get_test_program() compressor.apply(prog) expected_ops = [ "constexpr_affine_dequantize", "conv", "reshape", "linear", "reshape", "constexpr_affine_dequantize", "conv_transpose", ] assert get_op_types_in_program(prog) == expected_ops # Test different dtype are applied assert ( prog.find_ops(op_type="constexpr_affine_dequantize")[0].quantized_data.val.dtype == np.int8 ) assert ( prog.find_ops(op_type="constexpr_affine_dequantize")[1].quantized_data.val.dtype == np.uint8 ) def test_op_name_config_linear_quantizer_blockwise(self): """ set_op_name allow the user to set different config for each op specified by name. Also checking that the config can be overwritten """ conv_config_1 = cto.coreml.OpLinearQuantizerConfig( mode="LINEAR_SYMMETRIC", dtype="int8", granularity="per_block", block_size=4, weight_threshold=2000, ) # conv_config_2 overwrite conv_config_1 conv_config_2 = cto.coreml.OpLinearQuantizerConfig( mode="LINEAR_SYMMETRIC", dtype="int8", granularity="per_block", block_size=2, weight_threshold=2000, ) # The weight_threshold is super large so linear is not going to be compressed linear_config = cto.coreml.OpLinearQuantizerConfig( mode="LINEAR_SYMMETRIC", dtype="int4", weight_threshold=1000000, ) conv_transpose_config = cto.coreml.OpLinearQuantizerConfig( mode="LINEAR", dtype="int8", granularity="per_block", block_size=6, weight_threshold=2000, ) config = cto.coreml.OptimizationConfig() config.set_op_name("conv", conv_config_1) config.set_op_name("conv", conv_config_2) config.set_op_name("linear", linear_config) config.set_op_name("conv_transpose", conv_transpose_config) compressor = quantization.linear_quantize_weights(config=config) prog = self._get_test_program_3() compressor.apply(prog) expected_ops = [ "constexpr_blockwise_shift_scale", "conv", "reshape", "linear", "matmul", "reshape", "constexpr_blockwise_shift_scale", "conv_transpose", ] assert get_op_types_in_program(prog) == expected_ops blockwise_ops = prog.find_ops(op_type="constexpr_blockwise_shift_scale") assert blockwise_ops[0].offset is None assert blockwise_ops[1].offset is not None # Conv transpose original weight shape is (30, 4, 21, 10). The output channel axis is 1 and # input channel axis is 0, so the scale's first axis dim is 30 / 6 = 5. assert blockwise_ops[1].scale.shape == (5, 4, 1, 1) def test_auto_pick_channel_axis_quantizer(self): """ Check the right output channel axis is picked for block-wise quantization. 
""" global_config = cto.coreml.OpLinearQuantizerConfig( mode="LINEAR", dtype="int4", granularity="per_block", block_size=2, weight_threshold=2000, ) linear_config = cto.coreml.OpLinearQuantizerConfig( mode="LINEAR_SYMMETRIC", dtype="int4", granularity="per_block", block_size=9, weight_threshold=100, ) config = cto.coreml.OptimizationConfig() config.set_global(global_config) config.set_op_name("linear", linear_config) compressor = quantization.linear_quantize_weights(config=config) prog = self._get_test_program_3() compressor.apply(prog) blockwise_ops = prog.find_ops(op_type="constexpr_blockwise_shift_scale") # For conv, input channel axis is 1, output channel axis is 0. # The original weight shape is [90, 30, 2, 2], the scale's second dim is 30 / 2 = 15. assert blockwise_ops[0].scale.shape == (90, 15, 1, 1) # For linear, input channel axis is 1, output channel axis is 0. # The original weight shape is [70, 81], the scale's second dim is 81 / 9 = 9. assert blockwise_ops[1].scale.shape == (70, 9) # For matmul (transpose_y=False), input channel axis is -2, output channel axis is -1. # The original weight shape is [2, 1, 70, 35], the scale's third dim is 70 / 2 = 35. assert blockwise_ops[2].scale.shape == (1, 1, 35, 35) # For conv_transpose, input channel axis is 0, output channel axis is 1. # The original weight shape is [30, 4, 21, 10], the scale's first dim is 30 / 2 = 15. assert blockwise_ops[3].scale.shape == (15, 4, 1, 1) def test_invalid_config(self): with pytest.raises( ValueError, match="Invalid dtype int2. Only support int8/uint8/int4/uint4", ): cto.coreml.OpLinearQuantizerConfig( mode="LINEAR_SYMMETRIC", dtype="int2", block_size=2, weight_threshold=2000, ) with pytest.raises( ValueError, match="Only mode \('LINEAR_SYMMETRIC', 'LINEAR'\) supported for weight affine quantization. Got mode: \"DUMMY\".", ): cto.coreml.OpLinearQuantizerConfig( mode="DUMMY", dtype="int4", block_size=32, weight_threshold=5000, ) def test_not_divisible_block_size(self, caplog): global_config = cto.coreml.OpLinearQuantizerConfig( mode="LINEAR_SYMMETRIC", granularity="per_block", dtype="int4", block_size=13, weight_threshold=100, ) config = cto.coreml.OptimizationConfig() config.set_global(global_config) compressor = quantization.linear_quantize_weights(config=config) prog = self._get_test_program_3() compressor.apply(prog) warning_msg = "Invalid block_sizes; On 1th axis, the dim size 30 is not divisible by block size 13. Unable to perform structured quantization." 
assert any([re.match(warning_msg, rec.message) for rec in caplog.records]) class TestPruner(TestCompressionPasses): @pytest.mark.parametrize( "mode, threshold, target_sparsity, weight_threshold, fake_compression", itertools.product( ["THRESHOLD_BASED", "PERCENTILE_BASED"], [1e-3, 1.0], [0.2, 0.98], [1000, 7000], [True, False], ), ) def test_global_config_pruner( self, mode, threshold, target_sparsity, weight_threshold, fake_compression ): """ Global config would compress all operations with the same config """ if mode == "THRESHOLD_BASED": op_config = cto.coreml.OpThresholdPrunerConfig( threshold=threshold, weight_threshold=weight_threshold, minimum_sparsity_percentile=0.0, ) else: assert mode == "PERCENTILE_BASED" op_config = cto.coreml.OpMagnitudePrunerConfig( target_sparsity=target_sparsity, weight_threshold=weight_threshold, ) config = cto.coreml.OptimizationConfig(global_config=op_config) compressor = quantization.prune_weights(config=config, fake_compression=fake_compression) prog = self._get_test_program() compressor.apply(prog) if fake_compression: expected_ops = ["conv", "reshape", "linear", "reshape", "conv_transpose"] elif weight_threshold == 1000: expected_ops = [ "constexpr_sparse_to_dense", "conv", "reshape", "constexpr_sparse_to_dense", "linear", "reshape", "constexpr_sparse_to_dense", "conv_transpose", ] else: assert weight_threshold == 7000 # linear weight size < 7000 expected_ops = [ "constexpr_sparse_to_dense", "conv", "reshape", "linear", "reshape", "constexpr_sparse_to_dense", "conv_transpose", ] assert get_op_types_in_program(prog) == expected_ops def test_op_type_config_pruner(self): """ set_op_type allow the user to set different config for each op type. Also checking that the config can be overwritten """ conv_config_1 = cto.coreml.OpMagnitudePrunerConfig( target_sparsity=0.5, weight_threshold=2000, ) # conv_config_2 overwrite conv_config_1 conv_config_2 = cto.coreml.OpMagnitudePrunerConfig( target_sparsity=0.9, weight_threshold=2000, ) linear_config = cto.coreml.OpMagnitudePrunerConfig( target_sparsity=0.2, weight_threshold=2000, ) # The weight_threshold is super large so conv_transpose is not going to be compressed conv_transpose_config = cto.coreml.OpThresholdPrunerConfig( threshold=1.0, weight_threshold=1000000, ) config = cto.coreml.OptimizationConfig() config.set_op_type("conv", conv_config_1) config.set_op_type("conv", conv_config_2) config.set_op_type("linear", linear_config) config.set_op_type("conv_transpose", conv_transpose_config) compressor = quantization.prune_weights(config=config) prog = self._get_test_program() compressor.apply(prog) expected_ops = [ "constexpr_sparse_to_dense", "conv", "reshape", "constexpr_sparse_to_dense", "linear", "reshape", "conv_transpose", ] assert get_op_types_in_program(prog) == expected_ops # Test different sparcsity percentile are applied assert ( prog.find_ops(op_type="constexpr_sparse_to_dense")[0].nonzero_data.val.size == 1080 ) # 1080 * 0.1 assert ( prog.find_ops(op_type="constexpr_sparse_to_dense")[1].nonzero_data.val.size == 4536 ) # 5670 * 0.8 def test_op_name_config_pruner(self): """ set_op_name allow the user to set different config for each op specified by name. 
Also checking that the config can be overwritten """ conv_config_1 = cto.coreml.OpMagnitudePrunerConfig( target_sparsity=0.5, weight_threshold=2000, ) # conv_config_2 overwrite conv_config_1 conv_config_2 = cto.coreml.OpMagnitudePrunerConfig( target_sparsity=0.9, weight_threshold=2000, ) linear_config = cto.coreml.OpMagnitudePrunerConfig( target_sparsity=0.2, weight_threshold=2000, ) # The weight_threshold is super large so conv_transpose is not going to be compressed conv_transpose_config = cto.coreml.OpThresholdPrunerConfig( threshold=1.0, weight_threshold=1000000, ) config = cto.coreml.OptimizationConfig() config.set_op_name("conv", conv_config_1) config.set_op_name("conv", conv_config_2) config.set_op_name("linear", linear_config) config.set_op_name("conv_transpose", conv_transpose_config) compressor = quantization.prune_weights(config=config) prog = self._get_test_program() compressor.apply(prog) expected_ops = [ "constexpr_sparse_to_dense", "conv", "reshape", "constexpr_sparse_to_dense", "linear", "reshape", "conv_transpose", ] assert get_op_types_in_program(prog) == expected_ops # Test different sparcsity percentile are applied assert ( prog.find_ops(op_type="constexpr_sparse_to_dense")[0].nonzero_data.val.size == 1080 ) # 1080 * 0.1 assert ( prog.find_ops(op_type="constexpr_sparse_to_dense")[1].nonzero_data.val.size == 4536 ) # 5670 * 0.8 @pytest.mark.parametrize( "target_sparsity, minimum_sparsity_percentile", itertools.product( [0.1, 0.5, 0.9], [0.0, 0.3, 0.7], ), ) def test_pruner_minimum_sparsity_percentile(self, target_sparsity, minimum_sparsity_percentile): def _get_sparse_weight(shape, target_sparsity): size = np.prod(shape) weight = 3 * np.ones(size) num_of_zeros = int(size * target_sparsity) weight[:num_of_zeros] = 0 return np.reshape(weight, shape).astype(np.float32) def _get_simple_program(): @mb.program( input_specs=[mb.TensorSpec(shape=(1, 30, 10, 10))], opset_version=ct.target.iOS16 ) def prog(x): conv_weight = _get_sparse_weight((90, 30, 3, 3), target_sparsity) x = mb.conv(x=x, weight=conv_weight, name="conv1") return x return prog op_config = cto.coreml.OpThresholdPrunerConfig( threshold=1e-3, minimum_sparsity_percentile=minimum_sparsity_percentile, weight_threshold=200, ) config = cto.coreml.OptimizationConfig(global_config=op_config) compressor = quantization.prune_weights(config=config) prog = _get_simple_program() compressor.apply(prog) if minimum_sparsity_percentile < target_sparsity: expected_ops = ["constexpr_sparse_to_dense", "conv"] else: expected_ops = ["conv"] assert get_op_types_in_program(prog) == expected_ops def test_structural_pruning(self): def _get_test_prog(): @mb.program( input_specs=[mb.TensorSpec(shape=(1, 30, 10, 10))], opset_version=ct.target.iOS16 ) def prog(x): conv_weight_1 = mb.const( val=np.random.rand(90, 30, 2, 2).astype(np.float32), name="w_1" ) conv_bias_1 = mb.const( val=np.random.rand( 90, ).astype(np.float32), name="b_1", ) conv_weight_2 = mb.const( val=np.random.rand(10, 90, 2, 2).astype(np.float32), name="w_2" ) linear_weight = mb.const(val=np.random.rand(128, 64).astype(np.float32), name="l_w") linear_bias = mb.const( val=np.random.rand( 128, ).astype(np.float32), name="l_b", ) add_const = mb.const( val=np.random.rand(10, 128).astype(np.float32), name="add_const" ) x = mb.conv(x=x, weight=conv_weight_1, bias=conv_bias_1, name="conv_1") x = mb.conv(x=x, weight=conv_weight_2, name="conv_2") x = mb.reshape(x=x, shape=(10, 64)) x = mb.linear(x=x, weight=linear_weight, bias=linear_bias, name="linear_1") x = mb.add(x=x, y=add_const, 
name="add_1") return x return prog # (1) Global structural pruning config will only applied to conv / linear weight prog = _get_test_prog() config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpMagnitudePrunerConfig( n_m_ratio=(2, 3), weight_threshold=0, ) ) compressor = quantization.prune_weights(config=config) compressor.apply(prog) expected_ops = [ "constexpr_sparse_to_dense", "constexpr_sparse_to_dense", "constexpr_sparse_to_dense", "conv", "conv", "reshape", "linear", "add", ] assert get_op_types_in_program(prog) == expected_ops conv_ops = prog.find_ops(op_type="conv") assert conv_ops[0].weight.op.op_type == "constexpr_sparse_to_dense" assert conv_ops[1].weight.op.op_type == "constexpr_sparse_to_dense" assert prog.find_ops(op_type="linear")[0].weight.op.op_type == "constexpr_sparse_to_dense" # (2) Even by setting the ops with structural pruning, make sure only weight is sparsified, not bias prog = _get_test_prog() config = cto.coreml.OptimizationConfig( op_type_configs={ "conv": cto.coreml.OpMagnitudePrunerConfig( n_m_ratio=(2, 3), weight_threshold=0, ) }, op_name_configs={ "linear_1": cto.coreml.OpMagnitudePrunerConfig( n_m_ratio=(1, 4), weight_threshold=0, ) }, ) compressor = quantization.prune_weights(config=config) compressor.apply(prog) expected_ops = [ "constexpr_sparse_to_dense", "constexpr_sparse_to_dense", "constexpr_sparse_to_dense", "conv", "conv", "reshape", "linear", "add", ] assert get_op_types_in_program(prog) == expected_ops conv_ops = prog.find_ops(op_type="conv") assert conv_ops[0].weight.op.op_type == "constexpr_sparse_to_dense" assert conv_ops[1].weight.op.op_type == "constexpr_sparse_to_dense" assert prog.find_ops(op_type="linear")[0].weight.op.op_type == "constexpr_sparse_to_dense" # (3) Early error out when setting a non applicable op to structural pruning with set_op_type with pytest.raises( ValueError, match="block sparsity or n:m pruning does not support op type add" ): config = cto.coreml.OptimizationConfig( op_type_configs={ "add": cto.coreml.OpMagnitudePrunerConfig( n_m_ratio=(2, 3), weight_threshold=0, ) }, ) with pytest.raises( ValueError, match="block sparsity or n:m pruning does not support op type add" ): config = cto.coreml.OptimizationConfig() config.set_op_type( "add", cto.coreml.OpMagnitudePrunerConfig( n_m_ratio=(2, 3), weight_threshold=0, ), ) # (4) By using set_op_name, we can still force a const op to use structural pruning prog = _get_test_prog() config = cto.coreml.OptimizationConfig( op_name_configs={ "add_const": cto.coreml.OpMagnitudePrunerConfig( n_m_ratio=(1, 4), weight_threshold=0, ) } ) compressor = quantization.prune_weights(config=config) compressor.apply(prog) expected_ops = [ "constexpr_sparse_to_dense", "conv", "conv", "reshape", "linear", "add", ] assert get_op_types_in_program(prog) == expected_ops assert prog.find_ops(op_type="add")[0].y.op.op_type == "constexpr_sparse_to_dense" class TestPalettizer(TestCompressionPasses): @pytest.mark.parametrize( "nbits, mode, weight_threshold, fake_compression", itertools.product( [2, 6], ["KMEANS", "UNIFORM"], [1000, 7000], [True, False], ), ) def test_global_config_palettizer(self, nbits, mode, weight_threshold, fake_compression): """ Global config would compress all operations with the same config """ op_config = cto.coreml.OpPalettizerConfig( nbits=nbits, mode=mode, weight_threshold=weight_threshold ) config = cto.coreml.OptimizationConfig(global_config=op_config) compressor = quantization.palettize_weights( config=config, fake_compression=fake_compression ) prog = 
self._get_test_program() compressor.apply(prog) if fake_compression: expected_ops = ["conv", "reshape", "linear", "reshape", "conv_transpose"] elif weight_threshold == 1000: expected_ops = [ "constexpr_lut_to_dense", "conv", "reshape", "constexpr_lut_to_dense", "linear", "reshape", "constexpr_lut_to_dense", "conv_transpose", ] else: assert weight_threshold == 7000 # linear weight size < 7000 expected_ops = [ "constexpr_lut_to_dense", "conv", "reshape", "linear", "reshape", "constexpr_lut_to_dense", "conv_transpose", ] assert get_op_types_in_program(prog) == expected_ops def test_op_type_config_palettizer(self): """ set_op_type allow the user to set different config for each op type. Also checking that the config can be overwritten """ conv_config_1 = cto.coreml.OpPalettizerConfig( nbits=8, mode="KMEANS", weight_threshold=2000, ) # conv_config_2 overwrite conv_config_1 conv_config_2 = cto.coreml.OpPalettizerConfig( nbits=2, mode="KMEANS", weight_threshold=2000, ) linear_config = cto.coreml.OpPalettizerConfig( nbits=4, mode="UNIFORM", weight_threshold=2000, ) # The weight_threshold is super large so conv_transpose is not going to be compressed conv_transpose_config = cto.coreml.OpPalettizerConfig( nbits=4, mode="UNIFORM", weight_threshold=1000000, ) config = cto.coreml.OptimizationConfig() config.set_op_type("conv", conv_config_1) config.set_op_type("conv", conv_config_2) config.set_op_type("linear", linear_config) config.set_op_type("conv_transpose", conv_transpose_config) compressor = quantization.palettize_weights(config=config) prog = self._get_test_program() compressor.apply(prog) expected_ops = [ "constexpr_lut_to_dense", "conv", "reshape", "constexpr_lut_to_dense", "linear", "reshape", "conv_transpose", ] assert get_op_types_in_program(prog) == expected_ops # Test different nbits are applied assert prog.find_ops(op_type="constexpr_lut_to_dense")[0].lut.val.shape == (4,) assert prog.find_ops(op_type="constexpr_lut_to_dense")[1].lut.val.shape == (16,) def test_op_name_config_palettizer(self): """ set_op_name allow the user to set different config for each op specified by name. Also checking that the config can be overwritten """ conv_config_1 = cto.coreml.OpPalettizerConfig( nbits=8, mode="KMEANS", weight_threshold=2000, ) # conv_config_2 overwrite conv_config_1 conv_config_2 = cto.coreml.OpPalettizerConfig( nbits=2, mode="KMEANS", weight_threshold=2000, ) linear_config = cto.coreml.OpPalettizerConfig( nbits=4, mode="UNIFORM", weight_threshold=2000, ) # The weight_threshold is super large so conv_transpose is not going to be compressed conv_transpose_config = cto.coreml.OpPalettizerConfig( nbits=4, mode="UNIFORM", weight_threshold=1000000, ) config = cto.coreml.OptimizationConfig() config.set_op_name("conv", conv_config_1) config.set_op_name("conv", conv_config_2) config.set_op_name("linear", linear_config) config.set_op_name("conv_transpose", conv_transpose_config) compressor = quantization.palettize_weights(config=config) prog = self._get_test_program() compressor.apply(prog) expected_ops = [ "constexpr_lut_to_dense", "conv", "reshape", "constexpr_lut_to_dense", "linear", "reshape", "conv_transpose", ] assert get_op_types_in_program(prog) == expected_ops # Test different nbits are applied assert prog.find_ops(op_type="constexpr_lut_to_dense")[0].lut.val.shape == (4,) assert prog.find_ops(op_type="constexpr_lut_to_dense")[1].lut.val.shape == (16,) def test_op_name_config_palettizer_blockwise(self): """ set_op_name allow the user to set different config for each op specified by name. 
Also checking that the config can be overwritten. """ conv_config_1 = cto.coreml.OpPalettizerConfig( mode="uniform", nbits=4, granularity="per_tensor", weight_threshold=500000, ) # The conv_config_2 overwrites conv_config_1. conv_config_2 = cto.coreml.OpPalettizerConfig( mode="kmeans", nbits=8, granularity="per_grouped_channel", group_size=1, channel_axis=1, weight_threshold=2000, ) # The weight_threshold is super large so linear is not going to be compressed. linear_config = cto.coreml.OpPalettizerConfig( mode="kmeans", nbits=4, weight_threshold=1000000, ) conv_transpose_config = cto.coreml.OpPalettizerConfig( mode="uniform", nbits=4, granularity="per_grouped_channel", group_size=1, weight_threshold=2000, ) config = cto.coreml.OptimizationConfig() config.set_op_name("conv", conv_config_1) config.set_op_name("conv", conv_config_2) config.set_op_name("linear", linear_config) config.set_op_name("conv_transpose", conv_transpose_config) prog = self._get_test_program_3() compressor = quantization.palettize_weights(config=config) compressor.apply(prog) expected_ops = [ "constexpr_lut_to_dense", "conv", "reshape", "linear", "matmul", "reshape", "constexpr_lut_to_dense", "conv_transpose", ] assert get_op_types_in_program(prog) == expected_ops assert prog.find_ops(op_type="constexpr_lut_to_dense")[0].vector_axis is None # Makes sure the channel_axis in conv_config_2 is effective. conv_lut = prog.find_ops(op_type="constexpr_lut_to_dense")[0].lut assert conv_lut.shape[0] == 1 assert conv_lut.shape[1] == 30 def test_invalid_granularity(self): with pytest.raises( ValueError, match='"granularity" must be one of .*, but got CompressionGranularity.PER_CHANNEL', ): cto.coreml.OpPalettizerConfig( mode="kmeans", nbits=4, granularity="per_channel", weight_threshold=2000, ) with pytest.raises(TypeError, match="got an unexpected keyword argument 'block_size'"): cto.coreml.OpPalettizerConfig( mode="kmeans", nbits=4, granularity="per_tensor", block_size=2, weight_threshold=2000, ) def test_auto_pick_channel_axis_palettizer(self): """ Check the right output channel axis is picked for granularity='per_grouped_channel'. """ global_config = cto.coreml.OpPalettizerConfig( mode="kmeans", nbits=4, granularity="per_grouped_channel", group_size=1, weight_threshold=2000, ) config = cto.coreml.OptimizationConfig() config.set_global(global_config) compressor = quantization.palettize_weights(config=config) prog = self._get_test_program_3() compressor.apply(prog) # For conv, the output channel-axis is 0. conv_lut = prog.find_ops(op_type="constexpr_lut_to_dense")[0].lut assert conv_lut.shape[0] == 90 assert conv_lut.shape[1] == 1 # For linear, the output channel-axis is 0. linear_lut = prog.find_ops(op_type="constexpr_lut_to_dense")[1].lut assert linear_lut.shape[0] == 70 assert linear_lut.shape[1] == 1 # For matmul with transpose_y=False, the output channel-axis is -1. matmul_lut = prog.find_ops(op_type="constexpr_lut_to_dense")[2].lut assert matmul_lut.shape == (1, 1, 1, 35, 16, 1) # For conv_transpose, the output channel-axis is -2. 
conv_transpose_lut = prog.find_ops(op_type="constexpr_lut_to_dense")[3].lut assert conv_transpose_lut.shape[0] == 1 assert conv_transpose_lut.shape[1] == 4 def test_group_channel_wise(self): global_config = cto.coreml.OpPalettizerConfig( mode="kmeans", nbits=3, granularity="per_grouped_channel", group_size=2, weight_threshold=2000, ) config = cto.coreml.OptimizationConfig() config.set_global(global_config) compressor = quantization.palettize_weights(config=config) prog = self._get_test_program_3() compressor.apply(prog) lut_ops = prog.find_ops(op_type="constexpr_lut_to_dense") # The conv weight dense shape is (90, 30, 2, 2). Auto-picked axis=0. assert lut_ops[0].lut.shape == (45, 1, 1, 1, 8, 1) # The linear weight dense shape is (70, 81). Auto-picked axis=0. assert lut_ops[1].lut.shape == (35, 1, 8, 1) # The matmul y dense shape is (2, 1, 70, 35). Auto-picked axis=-1. # However, the 35 is not divisible by 2, so it will get skipped. assert prog.find_ops(op_type="matmul")[0].y.op.op_type == "const" # The conv_transpose weight dense shape is (30, 4, 21, 10). Auto-picked axis=-2. assert lut_ops[2].lut.shape == (1, 2, 1, 1, 8, 1) def test_tensor_wise(self): """Test granularity='per_block' with block_size=0 equivalent to granularity='per_tensor'.""" global_config_1 = cto.coreml.OpPalettizerConfig( mode="kmeans", nbits=3, granularity="per_tensor", weight_threshold=2000, ) global_config_2 = cto.coreml.OpPalettizerConfig( mode="kmeans", nbits=3, granularity="per_grouped_channel", group_size=0, weight_threshold=2000, ) for global_config in (global_config_1, global_config_2): config = cto.coreml.OptimizationConfig(global_config=global_config) compressor = quantization.palettize_weights(config=config) prog = self._get_test_program_3() compressor.apply(prog) lut_ops = prog.find_ops(op_type="constexpr_lut_to_dense") # The conv weight dense shape is (90, 30, 2, 2). assert lut_ops[0].lut.shape == (1, 1, 1, 1, 8, 1) # The linear weight dense shape is (70, 81). assert lut_ops[1].lut.shape == (1, 1, 8, 1) # The matmul y dense shape is (2, 1, 70, 35). assert lut_ops[2].lut.shape == (1, 1, 1, 1, 8, 1) # The conv_transpose weight dense shape is (30, 4, 21, 10). assert lut_ops[3].lut.shape == (1, 1, 1, 1, 8, 1) def test_not_divisible_channel_group_size(self, caplog): global_config = cto.coreml.OpPalettizerConfig( mode="kmeans", nbits=4, granularity="per_grouped_channel", group_size=3, weight_threshold=2000, ) config = cto.coreml.OptimizationConfig() config.set_global(global_config) compressor = quantization.palettize_weights(config=config) prog = self._get_test_program_3() compressor.apply(prog) # The axis-0 in linear (70), axis-3 in matmul (35), and axis-1 in conv_transpose (4) are not divisible by 3. for axis in (0, 3, 1): warning_msg = ( f"Can't perform palettization: The number of channels at {axis}th axis .* is not " "divisible by channel_group_size" ) assert any([re.match(warning_msg, rec.message) for rec in caplog.records]) # Only the conv get compressed. 
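# (The conv weight's output-channel dim is 90, which is divisible by the
# group_size of 3, so it is the only weight that can be palettized here.)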
lut_ops = prog.find_ops(op_type="constexpr_lut_to_dense") assert len(lut_ops) == 1 assert lut_ops[0].outputs[0].child_ops[0].op_type == "conv" def test_ios16_program_not_support_channel_wise_lut(self): global_config = cto.coreml.OpPalettizerConfig( mode="kmeans", nbits=4, granularity="per_grouped_channel", group_size=3, weight_threshold=2000, ) config = cto.coreml.OptimizationConfig() config.set_global(global_config) compressor = quantization.palettize_weights(config=config) prog = self._get_test_program() with pytest.raises( AssertionError, match=re.escape( "The iOS16 only supports per-tensor lut, but got more than one lut " "on 0th axis. LUT shape: (30, 1, 1, 1, 16, 1)" ), ): compressor.apply(prog) class TestCompressionOperations(TestCompressionPasses): """ This test is checking compression for some common operations. """ COMPRESSORS = [ quantization.palettize_weights( config=cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig( nbits=2, mode="KMEANS", weight_threshold=50 ) ) ), quantization.linear_quantize_weights( config=cto.coreml.OptimizationConfig( global_config=cto.coreml.OpLinearQuantizerConfig( mode="LINEAR_SYMMETRIC", dtype=np.int8, weight_threshold=50 ) ) ), quantization.prune_weights( config=cto.coreml.OptimizationConfig( global_config=cto.coreml.OpMagnitudePrunerConfig( target_sparsity=0.9, weight_threshold=50 ) ) ), ] COMPRESSOR_TO_OP_TYPE = { "palettize_weights": "constexpr_lut_to_dense", "linear_quantize_weights": "constexpr_affine_dequantize", "prune_weights": "constexpr_sparse_to_dense", } @staticmethod @pytest.mark.parametrize( "compressor", COMPRESSORS, ) def test_conv_compress(compressor): @mb.program( input_specs=[mb.TensorSpec(shape=(1, 30, 10, 10))], opset_version=ct.target.iOS16 ) def prog(x): conv_weight = np.random.rand(90, 30, 2, 2).astype(np.float32) return mb.conv(x=x, weight=conv_weight) compressor.apply(prog) op_type = TestCompressionOperations.COMPRESSOR_TO_OP_TYPE[compressor.__class__.__name__] assert get_op_types_in_program(prog) == [op_type, "conv"] @staticmethod @pytest.mark.parametrize( "compressor", COMPRESSORS, ) def test_conv_transpose_compress(compressor): @mb.program( input_specs=[mb.TensorSpec(shape=(1, 30, 10, 10))], opset_version=ct.target.iOS16 ) def prog(x): conv_weight = np.random.rand(90, 30, 2, 2).astype(np.float32) return mb.conv_transpose(x=x, weight=conv_weight) compressor.apply(prog) op_type = TestCompressionOperations.COMPRESSOR_TO_OP_TYPE[compressor.__class__.__name__] assert get_op_types_in_program(prog) == [op_type, "conv_transpose"] @staticmethod @pytest.mark.parametrize( "compressor", COMPRESSORS, ) def test_liear_compress(compressor): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 30, 10))], opset_version=ct.target.iOS16) def prog(x): linear_weight = np.random.rand(40, 10).astype(np.float32) return mb.linear(x=x, weight=linear_weight) compressor.apply(prog) op_type = TestCompressionOperations.COMPRESSOR_TO_OP_TYPE[compressor.__class__.__name__] assert get_op_types_in_program(prog) == [op_type, "linear"] @staticmethod @pytest.mark.parametrize( "compressor", COMPRESSORS, ) def test_matmul_compress(compressor): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 30, 10))], opset_version=ct.target.iOS16) def prog(x): weight1 = np.random.rand(10, 40).astype(np.float32) weight2 = np.random.rand(20, 30).astype(np.float32) x = mb.matmul(x=x, y=weight1) return mb.matmul(x=weight2, y=x) compressor.apply(prog) op_type = TestCompressionOperations.COMPRESSOR_TO_OP_TYPE[compressor.__class__.__name__] assert 
get_op_types_in_program(prog) == [op_type, "matmul", op_type, "matmul"] @staticmethod @pytest.mark.parametrize( "compressor", COMPRESSORS, ) def test_gru_compress(compressor): @mb.program( input_specs=[mb.TensorSpec(shape=(1, 10, 30)), mb.TensorSpec(shape=(10, 40))], opset_version=ct.target.iOS16, ) def prog(x, initial_h): weight_ih = np.random.rand(120, 30).astype(np.float32) weight_hh = np.random.rand(120, 40).astype(np.float32) return mb.gru(x=x, initial_h=initial_h, weight_ih=weight_ih, weight_hh=weight_hh) compressor.apply(prog) op_type = TestCompressionOperations.COMPRESSOR_TO_OP_TYPE[compressor.__class__.__name__] assert get_op_types_in_program(prog) == [op_type, op_type, "gru"] @staticmethod @pytest.mark.parametrize( "compressor", COMPRESSORS, ) def test_lstm_compress(compressor): @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 10, 30)), mb.TensorSpec(shape=(10, 40)), mb.TensorSpec(shape=(10, 40)), ], opset_version=ct.target.iOS16, ) def prog(x, initial_h, initial_c): weight_ih = np.random.rand(160, 30).astype(np.float32) weight_hh = np.random.rand(160, 40).astype(np.float32) return mb.lstm( x=x, initial_h=initial_h, initial_c=initial_c, weight_ih=weight_ih, weight_hh=weight_hh, ) compressor.apply(prog) op_type = TestCompressionOperations.COMPRESSOR_TO_OP_TYPE[compressor.__class__.__name__] assert get_op_types_in_program(prog) == [op_type, op_type, "lstm"] @staticmethod @pytest.mark.parametrize( "compressor", COMPRESSORS, ) def test_rnn_compress(compressor): @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 10, 30)), mb.TensorSpec(shape=(10, 40)), ], opset_version=ct.target.iOS16, ) def prog(x, initial_h): weight_ih = np.random.rand(40, 30).astype(np.float32) weight_hh = np.random.rand(40, 40).astype(np.float32) return mb.rnn(x=x, initial_h=initial_h, weight_ih=weight_ih, weight_hh=weight_hh) compressor.apply(prog) op_type = TestCompressionOperations.COMPRESSOR_TO_OP_TYPE[compressor.__class__.__name__] assert get_op_types_in_program(prog) == [op_type, op_type, "rnn"] @staticmethod @pytest.mark.parametrize( "compressor", COMPRESSORS, ) def test_add_compress(compressor): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 10, 30))], opset_version=ct.target.iOS16) def prog(x): add_const = np.random.rand(10, 30).astype(np.float32) return mb.add(x=x, y=add_const) compressor.apply(prog) op_type = TestCompressionOperations.COMPRESSOR_TO_OP_TYPE[compressor.__class__.__name__] assert get_op_types_in_program(prog) == [op_type, "add"] @staticmethod def test_add_compress_set_op_type(): @mb.program(input_specs=[mb.TensorSpec(shape=(1, 10, 30))], opset_version=ct.target.iOS16) def prog(x): add_const = np.random.rand(10, 30).astype(np.float32) return mb.add(x=x, y=add_const) compressor = quantization.palettize_weights( config=cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig( nbits=2, mode="KMEANS", weight_threshold=50 ), op_type_configs={ "add": cto.coreml.OpPalettizerConfig( nbits=4, mode="KMEANS", weight_threshold=50 ) }, ) ) compressor.apply(prog) assert get_op_types_in_program(prog) == ["constexpr_lut_to_dense", "add"] # also check the compression config comes from set_op_type assert prog.find_ops(op_type="constexpr_lut_to_dense")[0].lut.val.shape == (16,) class TestInvalidConfig: """ This test is checking error handling for invalid configuration. 
""" @staticmethod def test_invalid_config_type(): err_msg = "config must be of type OptimizationConfig" with pytest.raises(ValueError, match=err_msg): compressor = quantization.palettize_weights( config=1, ) with pytest.raises(ValueError, match=err_msg): compressor = quantization.linear_quantize_weights( config="12", ) with pytest.raises(ValueError, match=err_msg): compressor = quantization.prune_weights( config=[12, 3], ) msg = "palettize_weights only accept OpPalettizerConfig type config" with pytest.raises(ValueError, match=msg): compressor = quantization.palettize_weights( config=cto.coreml.OptimizationConfig( global_config=cto.coreml.OpLinearQuantizerConfig(), ) ) with pytest.raises(ValueError, match=msg): compressor = quantization.palettize_weights( config=cto.coreml.OptimizationConfig( op_type_configs={"op": cto.coreml.OpLinearQuantizerConfig()}, ) ) with pytest.raises(ValueError, match=msg): compressor = quantization.palettize_weights( config=cto.coreml.OptimizationConfig( op_name_configs={"name": cto.coreml.OpLinearQuantizerConfig()}, ) ) msg = "linear_quantize_weights only accept OpLinearQuantizerConfig type config" with pytest.raises(ValueError, match=msg): compressor = quantization.linear_quantize_weights( config=cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig(nbits=2), ) ) with pytest.raises(ValueError, match=msg): compressor = quantization.linear_quantize_weights( config=cto.coreml.OptimizationConfig( op_type_configs={"op": cto.coreml.OpPalettizerConfig(nbits=2)}, ) ) with pytest.raises(ValueError, match=msg): compressor = quantization.linear_quantize_weights( config=cto.coreml.OptimizationConfig( op_name_configs={"op": cto.coreml.OpPalettizerConfig(nbits=2)}, ) ) msg = "prune_weights only accept (OpMagnitudePrunerConfig, OpThresholdPrunerConfig) type config" with pytest.raises(ValueError, match=msg): compressor = quantization.prune_weights( config=cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig(nbits=2), ) ) with pytest.raises(ValueError, match=msg): compressor = quantization.prune_weights( config=cto.coreml.OptimizationConfig( op_type_configs={"op": cto.coreml.OpPalettizerConfig(nbits=2)}, ) ) with pytest.raises(ValueError, match=msg): compressor = quantization.prune_weights( config=cto.coreml.OptimizationConfig( op_name_configs={"name": cto.coreml.OpPalettizerConfig(nbits=2)}, ) ) msg = "config must be type of OpCompressorConfig." with pytest.raises(ValueError, match=msg): cto.coreml.OptimizationConfig( global_config="str", ) with pytest.raises(ValueError, match=msg): cto.coreml.OptimizationConfig( op_type_configs={"op": 123}, ) with pytest.raises(ValueError, match=msg): cto.coreml.OptimizationConfig( op_name_configs={"name": []}, ) msg = 'Invalid value of "minimum_sparsity_percentile":' with pytest.raises(ValueError, match=msg): config = cto.coreml.OpThresholdPrunerConfig( threshold=0.8, minimum_sparsity_percentile=1.2, ) with pytest.raises(ValueError, match=msg): config = cto.coreml.OpThresholdPrunerConfig( threshold=0.8, minimum_sparsity_percentile=-9.0, ) msg = '"weight_threshold" must be a non-negative integer.' 
with pytest.raises(ValueError, match=msg): config = cto.coreml.OpThresholdPrunerConfig( threshold=0.8, weight_threshold=-9, ) with pytest.raises(ValueError, match=msg): config = cto.coreml.OpMagnitudePrunerConfig( target_sparsity=1.0, weight_threshold=-8, ) with pytest.raises(ValueError, match=msg): config = cto.coreml.OpLinearQuantizerConfig( weight_threshold=-9, ) with pytest.raises(ValueError, match=msg): config = cto.coreml.OpPalettizerConfig( nbits=2, weight_threshold=-10, ) msg = 'Either "target_sparsity" or "n_m_ratio" need to be set. They cannot be set at the same time.' with pytest.raises(ValueError, match=msg): config = cto.coreml.OpMagnitudePrunerConfig() with pytest.raises(ValueError, match=msg): config = cto.coreml.OpMagnitudePrunerConfig( target_sparsity=0.0, n_m_ratio=(2, 10), ) msg = 'Invalid value of "target_sparsity":' with pytest.raises(ValueError, match=msg): config = cto.coreml.OpMagnitudePrunerConfig( target_sparsity=-0.9, ) with pytest.raises(ValueError, match=msg): config = cto.coreml.OpMagnitudePrunerConfig( target_sparsity=1.1, ) with pytest.raises( ValueError, match='"block_size" and "n_m_ratio" cannot be set at the same time.' ): config = cto.coreml.OpMagnitudePrunerConfig( n_m_ratio=(2, 2), block_size=9, ) msg = '"block_size" must be an integer \> 1' with pytest.raises(ValueError, match=msg): config = cto.coreml.OpMagnitudePrunerConfig( target_sparsity=0.9, block_size=1, ) with pytest.raises(ValueError, match=msg): config = cto.coreml.OpMagnitudePrunerConfig( target_sparsity=0.9, block_size=-9, ) msg = '"n_m_ratio" must be a tuple of two integers \(n, m\). n \<\= m. Got' with pytest.raises(ValueError, match=msg): config = cto.coreml.OpMagnitudePrunerConfig( n_m_ratio=(2, 2, 2), ) with pytest.raises(ValueError, match=msg): config = cto.coreml.OpMagnitudePrunerConfig( n_m_ratio=(6, 1), ) msg = '"dim" must be 1 or 0' with pytest.raises(ValueError, match=msg): config = cto.coreml.OpMagnitudePrunerConfig( n_m_ratio=(1, 1), dim=-1, ) with pytest.raises(ValueError, match=msg): config = cto.coreml.OpMagnitudePrunerConfig( target_sparsity=1.0, block_size=2, dim=2, ) with pytest.raises( ValueError, match='"dim" can only be set along with "block_size" or "n_m_ratio".' ): config = cto.coreml.OpMagnitudePrunerConfig( target_sparsity=1.0, dim=1, ) @staticmethod def test_set_op_type_error_out_for_const(): """ We cannot use set_op_type for const op """ @mb.program(input_specs=[mb.TensorSpec(shape=(1, 10, 30))], opset_version=ct.target.iOS16) def prog(x): add_const = np.random.rand(10, 30).astype(np.float32) return mb.add(x=x, y=add_const, name="add1") compressor = quantization.palettize_weights( config=cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig( nbits=2, mode="KMEANS", weight_threshold=50 ), op_type_configs={ "const": cto.coreml.OpPalettizerConfig( nbits=4, mode="KMEANS", weight_threshold=50 ) }, ) ) with pytest.raises( ValueError, match="const ops cannot be set by the `set_op_type` function. Please use `set_global`", ): compressor.apply(prog) class TestConfigurationFromDictFromYaml: """ Test the from_dict and from_yaml functionality. 
""" @staticmethod def load_to_yaml(config_dict): with tempfile.NamedTemporaryFile("w") as file: yaml.dump(config_dict, file) yaml_dict = yaml.safe_load(open(file.name)) file.close() return yaml_dict @staticmethod def get_yaml(config_dict): with tempfile.NamedTemporaryFile("w", delete=False) as file: yaml.dump(config_dict, file) return file.name def get_opt_config(self, config_dict, from_yaml, yaml_as_string): if from_yaml: yaml_file_name = self.get_yaml(config_dict) if not yaml_as_string: yaml = open(yaml_file_name) else: yaml = yaml_file_name config = quantization.OptimizationConfig.from_yaml(yaml) os.remove(yaml_file_name) else: config = quantization.OptimizationConfig.from_dict(config_dict) return config @staticmethod @pytest.mark.parametrize( "config_cls", [ quantization.OpLinearQuantizerConfig, quantization.OpThresholdPrunerConfig, quantization.OpMagnitudePrunerConfig, quantization.OpPalettizerConfig, ], ) def test_config_load_invalid_key(config_cls): # Invalid key config_dict = {"invalid": 2} with pytest.raises(cattrs.errors.ClassValidationError): config_cls._from_dict(config_dict) @pytest.mark.parametrize( "mode, dtype, granularity, block_size, weight_threshold, use_yaml", itertools.product( ["linear", "linear_symmetric"], ["int4", "uint4", "int8", "uint8", np.int8, np.uint8, types.int8, types.uint8], ["per_tensor", "per_channel", "per_block"], [0, 1, 2, [0, 1]], [1024, None], [True, False], ), ) def test_linear_quantizer_config_load_stress( self, mode, dtype, granularity, block_size, weight_threshold, use_yaml ): config_dict = { "mode": mode, "dtype": dtype, "granularity": granularity, "block_size": block_size, "weight_threshold": weight_threshold, } if use_yaml and isinstance(dtype, str): config_dict = self.load_to_yaml(config_dict) config = quantization.OpLinearQuantizerConfig._from_dict(config_dict) expected_config = quantization.OpLinearQuantizerConfig( mode=mode, dtype=dtype, granularity=granularity, block_size=block_size, weight_threshold=weight_threshold, ) assert config == expected_config @pytest.mark.parametrize( "threshold, minimum_sparsity_percentile, weight_threshold, use_yaml", itertools.product( [0.0, 1.0], [0.0, 1.0], [1024, None], [True, False], ), ) def test_threshold_pruner_config_load_stress( self, threshold, minimum_sparsity_percentile, weight_threshold, use_yaml ): config_dict = { "threshold": threshold, "minimum_sparsity_percentile": minimum_sparsity_percentile, "weight_threshold": weight_threshold, } if use_yaml: config_dict = self.load_to_yaml(config_dict) config = quantization.OpThresholdPrunerConfig._from_dict(config_dict) expected_config = quantization.OpThresholdPrunerConfig( threshold=threshold, minimum_sparsity_percentile=minimum_sparsity_percentile, weight_threshold=weight_threshold, ) assert config == expected_config @pytest.mark.parametrize( "n_m_ratio, dim, weight_threshold, use_yaml", itertools.product( [[1, 1], (2, 3)], [0, 1], [1024, None], [True, False], ), ) def test_magnitude_nm_pruner_config_load_stress( self, n_m_ratio, dim, weight_threshold, use_yaml ): config_dict = { "n_m_ratio": n_m_ratio, "dim": dim, "weight_threshold": weight_threshold, } if use_yaml and not isinstance(n_m_ratio, tuple): config_dict = self.load_to_yaml(config_dict) config = quantization.OpMagnitudePrunerConfig._from_dict(config_dict) expected_config = quantization.OpMagnitudePrunerConfig( n_m_ratio=tuple(n_m_ratio), dim=dim, weight_threshold=weight_threshold, ) assert config == expected_config @pytest.mark.parametrize( "target_sparsity, block_size, dim, 
weight_threshold, use_yaml", itertools.product( [0.0, 1.0], [None, 2], [None, 0, 1], [None, 1024], [True, False], ), ) def test_magnitude_block_sparsity_pruner_config_load_stress( self, target_sparsity, block_size, dim, weight_threshold, use_yaml ): if block_size is None and dim is not None: return config_dict = { "target_sparsity": target_sparsity, "block_size": block_size, "dim": dim, "weight_threshold": weight_threshold, } if use_yaml: config_dict = self.load_to_yaml(config_dict) config = quantization.OpMagnitudePrunerConfig._from_dict(config_dict) expected_config = quantization.OpMagnitudePrunerConfig( target_sparsity=target_sparsity, block_size=block_size, dim=dim, weight_threshold=weight_threshold, ) assert config == expected_config @pytest.mark.parametrize( "mode, nbits, granularity, group_size, channel_axis, weight_threshold, num_kmeans_workers, use_yaml", itertools.product( ["kmeans", "uniform"], [1, 2, 3, 4, 6, 8], ["per_tensor", "per_grouped_channel"], [0, 1, 32], [None, 0, 1], [1024, None], [1, 4], [True, False], ), ) def test_palettizer_config_load_stress( self, mode, nbits, granularity, group_size, channel_axis, weight_threshold, num_kmeans_workers, use_yaml, ): config_dict = { "mode": mode, "nbits": nbits, "granularity": granularity, "group_size": group_size, "channel_axis": channel_axis, "weight_threshold": weight_threshold, "num_kmeans_workers": num_kmeans_workers, } if use_yaml: config_dict = self.load_to_yaml(config_dict) config = quantization.OpPalettizerConfig._from_dict(config_dict) expected_config = quantization.OpPalettizerConfig( mode=mode, nbits=nbits, granularity=granularity, group_size=group_size, channel_axis=channel_axis, weight_threshold=weight_threshold, num_kmeans_workers=num_kmeans_workers, ) assert config == expected_config @pytest.mark.parametrize( "from_yaml, yaml_as_string", itertools.product( [True, False], [True, False], ), ) def test_optimization_config_load_corner_cases(self, from_yaml, yaml_as_string): config_dict = { "bobby_joe": 56, } with pytest.raises( ValueError, match="Invalid key bobby_joe to construct an OptimizationConfig object." ): self.get_opt_config(config_dict, from_yaml, yaml_as_string) config_dict = { "global_config": None, } with pytest.raises(ValueError, match="config_type must be provided with type of string."): self.get_opt_config(config_dict, from_yaml, yaml_as_string) config_dict = { "config_type": "OpLinearQuantizerConfig", "op_type_configs": 123, } with pytest.raises(ValueError, match="op_type_configs must be type of dict. Got"): self.get_opt_config(config_dict, from_yaml, yaml_as_string) config_dict = { "config_type": "OpLinearQuantizerConfig", "op_name_configs": "eric", } with pytest.raises(ValueError, match="op_name_configs must be type of dict. 
Got"): self.get_opt_config(config_dict, from_yaml, yaml_as_string) # check that the value of the dictionary can be None or not provided config_dict = { "config_type": "OpLinearQuantizerConfig", } config = self.get_opt_config(config_dict, from_yaml, yaml_as_string) assert config.global_config is None assert config.op_type_configs == {} assert config.op_name_configs == {} config_dict = { "config_type": "OpLinearQuantizerConfig", "global_config": None, "op_type_configs": { "conv": None, }, "op_name_configs": { "op_1": None, }, } config = self.get_opt_config(config_dict, from_yaml, yaml_as_string) assert config.global_config is None assert config.op_type_configs["conv"] is None assert config.op_name_configs["op_1"] is None @pytest.mark.parametrize( "from_yaml, yaml_as_string", itertools.product( [True, False], [True, False], ), ) def test_optimization_config_load_linear_quantizer(self, from_yaml, yaml_as_string): config_dict = { "config_type": "OpLinearQuantizerConfig", "global_config": { "mode": "linear", "dtype": "int8", "weight_threshold": None, }, "op_type_configs": { "linear": { "mode": "linear_symmetric", "dtype": "uint8", "weight_threshold": None, }, }, "op_name_configs": { "op_1": { "mode": "linear_symmetric", "dtype": "int8", "weight_threshold": 2047, }, "op_2": { "mode": "linear", "dtype": "uint8", "weight_threshold": 1, }, }, } config = self.get_opt_config(config_dict, from_yaml, yaml_as_string) expected_global_config = quantization.OpLinearQuantizerConfig( mode="linear", dtype=np.int8, weight_threshold=None, ) assert config.global_config == expected_global_config expected_config = quantization.OpLinearQuantizerConfig( mode="linear_symmetric", dtype=np.uint8, weight_threshold=None, ) assert config.op_type_configs["linear"] == expected_config expected_config = quantization.OpLinearQuantizerConfig( mode="linear_symmetric", dtype=np.int8, weight_threshold=2047, ) assert config.op_name_configs["op_1"] == expected_config expected_config = quantization.OpLinearQuantizerConfig( mode="linear", dtype=np.uint8, weight_threshold=1, ) assert config.op_name_configs["op_2"] == expected_config @pytest.mark.parametrize( "from_yaml, yaml_as_string", itertools.product( [True, False], [True, False], ), ) def test_optimization_config_load_pruner(self, from_yaml, yaml_as_string): """ This test also checking the override of the config_type """ config_dict = { "config_type": "OpThresholdPrunerConfig", "global_config": { "config_type": "OpMagnitudePrunerConfig", "target_sparsity": 0.3, }, "op_type_configs": { "linear": { "config_type": "OpMagnitudePrunerConfig", "n_m_ratio": [4, 5], "dim": 0, "weight_threshold": 2, }, "conv": { "threshold": 0.01, "minimum_sparsity_percentile": 0.01, "weight_threshold": 45, }, }, "op_name_configs": { "op_1": { "threshold": 0.1, "minimum_sparsity_percentile": 0.1, "weight_threshold": 1, }, "op_2": { "config_type": "OpMagnitudePrunerConfig", "target_sparsity": 0.5, "block_size": 100, }, }, } config = self.get_opt_config(config_dict, from_yaml, yaml_as_string) expected_global_config = quantization.OpMagnitudePrunerConfig( target_sparsity=0.3, ) assert config.global_config == expected_global_config expected_config = quantization.OpMagnitudePrunerConfig( n_m_ratio=(4, 5), dim=0, weight_threshold=2, ) assert config.op_type_configs["linear"] == expected_config expected_config = quantization.OpThresholdPrunerConfig( threshold=0.01, minimum_sparsity_percentile=0.01, weight_threshold=45, ) assert config.op_type_configs["conv"] == expected_config expected_config = 
quantization.OpThresholdPrunerConfig( threshold=0.1, minimum_sparsity_percentile=0.1, weight_threshold=1, ) assert config.op_name_configs["op_1"] == expected_config expected_config = quantization.OpMagnitudePrunerConfig( target_sparsity=0.5, block_size=100, ) assert config.op_name_configs["op_2"] == expected_config @pytest.mark.parametrize( "from_yaml, yaml_as_string", itertools.product( [True, False], [True, False], ), ) def test_optimization_config_load_palettizer(self, from_yaml, yaml_as_string): config_dict = { "config_type": "OpPalettizerConfig", "global_config": { "mode": "kmeans", "nbits": 1, "weight_threshold": 2, }, "op_type_configs": { "linear": { "mode": "uniform", "nbits": 6, "weight_threshold": None, }, }, "op_name_configs": { "op_1": { "config_type": "OpPalettizerConfig", "mode": "unique", }, }, } config = self.get_opt_config(config_dict, from_yaml, yaml_as_string) expected_global_config = quantization.OpPalettizerConfig( mode="kmeans", nbits=1, weight_threshold=2, ) assert config.global_config == expected_global_config expected_config = quantization.OpPalettizerConfig( mode="uniform", nbits=6, weight_threshold=None, ) assert config.op_type_configs["linear"] == expected_config expected_config = quantization.OpPalettizerConfig( mode="unique", ) assert config.op_name_configs["op_1"] == expected_config class TestLinearActivationQuantizer(TestCompressionPasses): @pytest.mark.parametrize( "mode, dtype, weight_threshold", itertools.product( ["LINEAR_SYMMETRIC"], [np.int8, types.int8], [1000], ), ) def test_global_config_activation_quantizer_on_pattern_1(self, mode, dtype, weight_threshold): """ Global config would compress all operations with the same config Valid patterns: - conv - conv + relu """ # Insert prefix quantize/dequantize pairs op_config = cto.coreml.experimental.OpActivationLinearQuantizerConfig( mode=mode, dtype=dtype, weight_threshold=weight_threshold ) config = cto.coreml.OptimizationConfig(global_config=op_config) graph_pass_1 = _insert_prefix_quantize_dequantize_pair(config) # Insert suffix quantize/dequantize pairs graph_pass_2 = PASS_REGISTRY["compression::insert_suffix_quantize_dequantize_pair"] graph_pass_2.set_options([PassOption("config", config)]) # Test case: conv prog = self._get_test_program_conv() apply_pass_and_basic_check(prog, graph_pass_1) apply_pass_and_basic_check(prog, graph_pass_2) assert get_op_types_in_program(prog) == [ "cast", "quantize", "dequantize", "conv", "quantize", "dequantize", "cast", ] # Test case: conv + relu prog = self._get_test_program_conv_relu() apply_pass_and_basic_check(prog, graph_pass_1) apply_pass_and_basic_check(prog, graph_pass_2) assert get_op_types_in_program(prog) == [ "cast", "quantize", "dequantize", "conv", "relu", "quantize", "dequantize", "cast", ] @pytest.mark.parametrize( "mode, dtype, weight_threshold", itertools.product( ["LINEAR_SYMMETRIC"], [np.int8, types.int8], [1000], ), ) def test_global_config_activation_quantizer_on_pattern_2(self, mode, dtype, weight_threshold): """ Global config would compress all operations with the same config Valid patterns: add """ # Insert prefix quantize/dequantize pairs op_config = cto.coreml.experimental.OpActivationLinearQuantizerConfig( mode=mode, dtype=dtype, weight_threshold=weight_threshold ) config = cto.coreml.OptimizationConfig(global_config=op_config) graph_pass_1 = _insert_prefix_quantize_dequantize_pair(config) # Insert suffix quantize/dequantize pairs graph_pass_2 = PASS_REGISTRY["compression::insert_suffix_quantize_dequantize_pair"] 
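# The suffix pass is fetched from the pass registry and, just below, receives the
# same OptimizationConfig through PassOption, so the prefix and suffix
# quantize/dequantize insertions are driven by a single config.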
graph_pass_2.set_options([PassOption("config", config)]) # Test case: add prog = self._get_test_program_add() apply_pass_and_basic_check(prog, graph_pass_1) apply_pass_and_basic_check(prog, graph_pass_2) assert get_op_types_in_program(prog) == [ "cast", "cast", "quantize", "dequantize", "quantize", "dequantize", "add", "quantize", "dequantize", "cast", ] @pytest.mark.parametrize( "mode, dtype, weight_threshold", itertools.product( ["LINEAR_SYMMETRIC"], [np.int8, types.int8], [1000], ), ) def test_global_config_activation_quantizer_on_pattern_3(self, mode, dtype, weight_threshold): """ Global config would compress all operations with the same config Valid pattern: pooling (avg_pool, max_pool) """ # Insert prefix quantize/dequantize pairs op_config = cto.coreml.experimental.OpActivationLinearQuantizerConfig( mode=mode, dtype=dtype, weight_threshold=weight_threshold ) config = cto.coreml.OptimizationConfig(global_config=op_config) graph_pass_1 = _insert_prefix_quantize_dequantize_pair(config) # Insert suffix quantize/dequantize pairs graph_pass_2 = PASS_REGISTRY["compression::insert_suffix_quantize_dequantize_pair"] graph_pass_2.set_options([PassOption("config", config)]) # Test case: avg_pool prog = self._get_test_program_avgpool() apply_pass_and_basic_check(prog, graph_pass_1) apply_pass_and_basic_check(prog, graph_pass_2) assert get_op_types_in_program(prog) == [ "cast", "quantize", "dequantize", "avg_pool", "quantize", "dequantize", "cast", ] # Test case: max_pool prog = self._get_test_program_maxpool() apply_pass_and_basic_check(prog, graph_pass_1) apply_pass_and_basic_check(prog, graph_pass_2) assert get_op_types_in_program(prog) == [ "cast", "quantize", "dequantize", "max_pool", "quantize", "dequantize", "cast", ] class TestGetActivationStats(TestCompressionPasses): def test_get_activation_calibration_stats_basic(self): """ Calibration a floating point model with sample data. """ # Prepare sample data sample_data = [] for _ in range(3): input_data = np.random.rand(5, 10, 4, 4) sample_data.append({"data": input_data}) # Loading a floating point mlmodel mlmodel = self._get_test_mlmodel_conv_relu() activation_stats = _get_activation_calibration_stats(mlmodel, sample_data) def test_get_activation_calibration_stats_skip_invalid_ops(self): """ Calibration a floating point model with sample data. rdar://130623705 A unit test for model with boolean type intermediate tensor. """ # Prepare sample data sample_data = [] for _ in range(3): input_data = np.random.rand(1, 28 * 28, 1) sample_data.append({"data": input_data}) # Loading a floating point mlmodel mlmodel = self._get_test_mlmodel_boolean_type() activation_stats = _get_activation_calibration_stats(mlmodel, sample_data) def test_get_activation_calibration_stats_concat_surrounding_ops(self): """ Calibration a floating point model with sample data. rdar://132017374 A unit test for model with concat would be surrounded by quantize/dequantize pairs after activation quantization. The activation_stats of concat surrounding nodes should be the same, so quantize/dequantize pairs could share same scale/zp. 
""" # Prepare sample data sample_data = [] for _ in range(3): input_data = np.random.rand(5, 10, 4, 4) sample_data.append({"data_0": input_data}) # Loading a floating point mlmodel mlmodel = self._get_test_mlmodel_conv_concat() activation_stats = _get_activation_calibration_stats(mlmodel, sample_data) activation_stats_unique = set() for value in activation_stats.values(): activation_stats_unique.add((value["rmin"], value["rmax"])) # Since mlmodel has a concat with 2 inputs and 1 output, we should see at least 3 rmin/rmax pairs are identical in activation_stats. # If we dedup rmin/rmax pairs with identical values, the length of unique values should at least reduced by 2 compared with original one. assert len(activation_stats) - len(activation_stats_unique) >= 2 ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508431.0 coremltools-8.0/coremltools/test/optimize/coreml/test_post_training_quantization.py0000644000000000000000000032636314672066617030256 0ustar00rootroot# Copyright (c) 2023, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import logging import re import shutil import tempfile from typing import Tuple import numpy as np import pytest import torch import coremltools as ct import coremltools.optimize as cto from coremltools._deps import _HAS_SKLEARN from coremltools.converters.mil.frontend.torch.test.test_torch_conversion_api import ( TestPyTorchConverterExamples, ) from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.ops.tests.iOS18 import backends from coremltools.converters.mil.testing_reqs import compute_units from coremltools.converters.mil.testing_utils import compute_snr_and_psnr, get_op_types_in_program from coremltools.models.utils import MultiFunctionDescriptor, _macos_version, save_multifunction from coremltools.optimize.coreml import _utils as optimize_utils from coremltools.optimize.coreml._post_training_quantization import CoreMLWeightMetaData from coremltools.test.ml_program.test_compression import get_test_model_and_data # Wrapper functions that create the optimization config and call ct.optimize.coreml APIs def linear_quantize_weights(mlmodel, mode="linear", dtype=np.int8): op_config = cto.coreml.OpLinearQuantizerConfig(mode=mode, dtype=dtype) config = cto.coreml.OptimizationConfig(global_config=op_config) return cto.coreml.linear_quantize_weights(mlmodel, config) def palettize_weights(mlmodel, nbits=None, mode="kmeans", lut_function=None): op_config = cto.coreml.OpPalettizerConfig(mode=mode, nbits=nbits, lut_function=lut_function) config = cto.coreml.OptimizationConfig(global_config=op_config) return cto.coreml.palettize_weights(mlmodel, config) def prune_weights( mlmodel, mode="threshold_based", threshold=1e-3, target_sparsity=1.0, block_size=-1, n_m_ratio=(), ): if mode == "threshold_based": op_config = cto.coreml.OpThresholdPrunerConfig( threshold=threshold, minimum_sparsity_percentile=0.0, ) elif mode == "percentile_based": op_config = cto.coreml.OpMagnitudePrunerConfig( target_sparsity=target_sparsity, ) elif mode == "block_sparsity": op_config = cto.coreml.OpMagnitudePrunerConfig( target_sparsity=target_sparsity, block_size=block_size, ) else: assert mode == "n_m_pruning" op_config = cto.coreml.OpMagnitudePrunerConfig( n_m_ratio=n_m_ratio, ) config = 
cto.coreml.OptimizationConfig(global_config=op_config) return cto.coreml.prune_weights(mlmodel, config) def decompress_weights(mlmodel): return cto.coreml.decompress_weights(mlmodel) # Utility functions for testing def get_test_model_and_data_complex(): inputs = [ct.TensorType(name="data", shape=(1, 64, 10, 10))] torch_input_values = [torch.rand(*i.shape.to_list()) for i in inputs] coreml_input_values = { i.name: val.detach().numpy() for i, val in zip(inputs, torch_input_values) } class Model(torch.nn.Module): def __init__(self): super(Model, self).__init__() self.conv_1 = torch.nn.Conv2d(in_channels=64, out_channels=32, kernel_size=2) self.conv_2 = torch.nn.Conv2d(in_channels=32, out_channels=64, kernel_size=2) self.linear_1 = torch.nn.Linear(64, 128) self.linear_2 = torch.nn.Linear(128, 256) self.lstm = torch.nn.LSTM(256, 80) def forward(self, x): conv_1 = self.conv_1(x) conv_2 = self.conv_2(conv_1) reshape = torch.reshape(conv_2, (1, 64, 64)) linear_1 = self.linear_1(reshape) linear_2 = self.linear_2(linear_1) lstm = self.lstm(linear_2) return lstm return Model().eval(), inputs, torch_input_values, coreml_input_values def get_test_model_and_data_conv_transpose(): """Two conv transpose layer which share the same weight.""" inputs = [ct.TensorType(name="data", shape=(1, 64, 5, 5))] torch_input_values = [torch.rand(*i.shape.to_list()) for i in inputs] coreml_input_values = { i.name: val.detach().numpy() for i, val in zip(inputs, torch_input_values) } class Model(torch.nn.Module): def __init__(self): super(Model, self).__init__() self.conv_transpose1 = torch.nn.ConvTranspose2d( in_channels=64, out_channels=32, kernel_size=2 ) self.conv_transpose2 = torch.nn.ConvTranspose2d( in_channels=64, out_channels=32, kernel_size=2 ) self.conv_transpose1.weight = self.conv_transpose2.weight def forward(self, x): return self.conv_transpose1(x) + self.conv_transpose2(x) return Model().eval(), inputs, torch_input_values, coreml_input_values def create_unique_weight(weight, nbits, vector_size=1, vector_axis=None): shape = list(weight.detach().numpy().shape) unique_number = 1 << nbits if vector_size == 1: weight = np.random.randint(low=0, high=unique_number, size=shape) else: if shape[vector_axis] % vector_size != 0: raise ValueError( f"weight's dim at {vector_axis}th axis must be divisible by " f"vector_size {vector_size}" ) # Swap the dim size of vector_axis with last dim. 
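# e.g. for a weight of shape [32, 64, 2, 2] with vector_axis=0 and vector_size=4: random values are drawn in shape [2, 64, 2, 8], repeated to [2, 64, 2, 32], then swapped back to [32, 64, 2, 2], so every group of 4 consecutive entries along axis 0 shares the same value (i.e. the unique palette entries are length-4 vectors).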
shape[vector_axis], shape[-1] = shape[-1], shape[vector_axis] shape[-1] //= vector_size weight = np.random.randint(low=0, high=unique_number, size=shape) weight = np.repeat(weight, vector_size, axis=-1) weight = np.swapaxes(weight, -1, vector_axis) return weight.astype(np.float32) def create_sparse_weight(weight, target_sparsity): shape = list(weight.shape) size = np.prod(shape) weight = 100 * np.random.rand(size) num_of_zeros = int(size * target_sparsity) weight[:num_of_zeros] = 0 return np.reshape(weight, shape).astype(np.float32) def create_quantize_friendly_weight( weight: np.ndarray, nbits: int, signed: bool ) -> Tuple[np.ndarray, np.ndarray, np.ndarray]: """Create a quantization-friendly weight by first quantizing and then de-quantizing the weight.""" axes = tuple(axis for axis in range(len(weight.shape)) if axis != 0) quantized_weight, scale, zero_point = optimize_utils.quantize_weight( weight, axes, nbits, signed, quantization_mode="LINEAR", dtype=np.int8 if signed else np.uint8, ) scale_shape = scale.shape + tuple([1] * len(axes)) scale = scale.reshape(scale_shape) zero_point = zero_point.reshape(scale_shape) dequantized_weight = scale * ( quantized_weight.astype(np.float32) - zero_point.astype(np.float32) ) return dequantized_weight, scale, zero_point def verify_model_outputs(model, compressed_model, input_values, rtol=1e-7, atol=0.0): """ This utility function does the following checks: (1) Verify the output of the compressed model has the same shape / type as the original model (2) The decompressed and compressed model have the same numerical outputs """ # Make sure the model can be decompressed decompressed_model = decompress_weights(compressed_model) # Validate the output shape / type ref_outputs = model._mil_program.functions["main"].outputs outputs = compressed_model._mil_program.functions["main"].outputs assert len(ref_outputs) == len(outputs) for a, b in zip(ref_outputs, outputs): assert a.name == b.name assert a.shape == b.shape assert a.dtype == b.dtype if ct.utils._macos_version() < (13, 0): return # Validate that the compressed model and the decompressed model produce the same outputs output_dict = compressed_model.predict(input_values) de_output_dict = decompressed_model.predict(input_values) for k, v in de_output_dict.items(): assert k in output_dict np.testing.assert_allclose(v, output_dict[k], rtol=rtol, atol=atol) class TestLinearQuantizeWeights: @staticmethod def test_linear_quantization_with_classifier(): traced_model, example_input = TestPyTorchConverterExamples._get_classifier_model() for class_type in ("str", "int"): mlmodel = TestPyTorchConverterExamples._convert_classifier_model( traced_model, example_input, class_type ) config = cto.coreml.OptimizationConfig() global_config = cto.coreml.OpLinearQuantizerConfig( mode="linear_symmetric", dtype=np.int8, weight_threshold=0 ) config.set_global(global_config) mlmodel = cto.coreml.linear_quantize_weights(mlmodel, config) expected_ops = [ "cast", "reshape", "constexpr_affine_dequantize", "linear", "relu", "constexpr_affine_dequantize", "linear", "relu", "constexpr_affine_dequantize", "linear", "cast", "classify", ] assert get_op_types_in_program(mlmodel._mil_program) == expected_ops @staticmethod def test_linear_quantization(): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data_complex() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram", compute_precision=ct.precision.FLOAT32) config = cto.coreml.OptimizationConfig()
conv_config = cto.coreml.OpLinearQuantizerConfig(mode="linear_symmetric", dtype=np.int8, weight_threshold=500) lstm_config = cto.coreml.OpLinearQuantizerConfig(mode="linear", dtype=np.uint8, weight_threshold=4800) config.set_op_type("conv", conv_config) config.set_op_type("lstm", lstm_config) config.set_op_name("conv_2_1", None) mlmodel = cto.coreml.linear_quantize_weights(mlmodel, config) expected_ops = [ "constexpr_affine_dequantize", "conv", "conv", "reshape", "linear", "linear", "constexpr_affine_dequantize", "constexpr_affine_dequantize", "constexpr_affine_dequantize", "lstm", "expand_dims", "expand_dims" ] prog = mlmodel._mil_program assert get_op_types_in_program(prog) == expected_ops assert prog.find_ops(op_type="conv")[1].weight.op.op_type == "const" expected_dtype = [np.int8, np.uint8, np.uint8, np.uint8, np.uint8] affine_ops = prog.find_ops(op_type="constexpr_affine_dequantize") for dtype, op in zip(expected_dtype, affine_ops): assert op.quantized_data.val.dtype == dtype @staticmethod @pytest.mark.parametrize( "mode, dtype", itertools.product( ("linear", "linear_symmetric"), (np.int8, np.uint8, types.int8, types.uint8), ), ) def test_linear_quanitzation_stress(mode, dtype): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram") mlmodel_quantized = linear_quantize_weights(mlmodel, mode=mode, dtype=dtype) # validate parameters expected_ops = ['constexpr_affine_dequantize', 'cast', 'conv', 'cast'] assert get_op_types_in_program(mlmodel_quantized._mil_program) == expected_ops quanitze_op = mlmodel_quantized._mil_program.functions["main"].find_ops(op_type="constexpr_affine_dequantize")[0] assert model.weight.detach().numpy().shape == quanitze_op.quantized_data.shape verify_model_outputs(mlmodel, mlmodel_quantized, coreml_input_values) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_blockwise_quantization(self, compute_unit, backend): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data_complex() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", minimum_deployment_target=backend.opset_version, compute_precision=ct.precision.FLOAT16 if backend.precision == "fp16" else ct.precision.FLOAT32, compute_units=compute_unit, ) config = cto.coreml.OptimizationConfig() conv_config = cto.coreml.OpLinearQuantizerConfig( mode="linear_symmetric", dtype="int4", granularity="per_block", block_size=2, weight_threshold=500, ) lstm_config = cto.coreml.OpLinearQuantizerConfig( mode="linear", dtype="int4", granularity="per_block", block_size=2, weight_threshold=4800, ) config.set_op_type("conv", conv_config) config.set_op_type("lstm", lstm_config) # Set a specific conv's config to None to prevent it being compressed. 
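# With fp16 compute precision the conversion appends a "_cast_fp16" suffix to the op name, so the name-based override below must target the renamed op.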
conv_not_to_compress_name = "conv_2_1" if backend.precision == "fp16": conv_not_to_compress_name += "_cast_fp16" config.set_op_name(conv_not_to_compress_name, None) mlmodel_quantized = cto.coreml.linear_quantize_weights(mlmodel, config) expected_ops = [ "constexpr_blockwise_shift_scale", "conv", "conv", "reshape", "linear", "linear", "constexpr_blockwise_shift_scale", "constexpr_blockwise_shift_scale", "constexpr_blockwise_shift_scale", "lstm", "expand_dims", "expand_dims", ] prog = mlmodel_quantized._mil_program assert get_op_types_in_program(prog) == expected_ops assert prog.find_ops(op_type="conv")[1].weight.op.op_type == "const" quantize_ops = prog.find_ops(op_type="constexpr_blockwise_shift_scale") for quantize_op in quantize_ops: assert quantize_op.data.dtype == types.int4 assert types.builtin_to_string(quantize_op.scale.dtype) == backend.precision if _macos_version() >= (15, 0): verify_model_outputs( mlmodel, mlmodel_quantized, coreml_input_values, rtol=1e-2, atol=4e-2 ) @staticmethod @pytest.mark.parametrize( "compute_unit, backend, mode, nbits, signed, block_size", itertools.product( compute_units, backends, ("linear", "linear_symmetric"), (4, 8), (True, False), (0, 1, 2, 4), ), ) def test_blockwise_quanitzation_stress(compute_unit, backend, mode, nbits, signed, block_size): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", minimum_deployment_target=backend.opset_version, compute_precision=ct.precision.FLOAT16 if backend.precision == "fp16" else ct.precision.FLOAT32, compute_units=compute_unit, ) dtype_str = types.builtin_to_string(types.get_nbits_int_builtin_type(nbits, signed)) op_config = cto.coreml.OpLinearQuantizerConfig( mode=mode, dtype=dtype_str, granularity="per_block", block_size=block_size ) config = cto.coreml.OptimizationConfig(global_config=op_config) mlmodel_quantized = cto.coreml.linear_quantize_weights(mlmodel, config) # Verify ops. if backend.precision == "fp16": # For fp16 precision there is no extra cast op inserted. expected_ops = ["constexpr_blockwise_shift_scale", "conv"] else: expected_ops = ["constexpr_blockwise_shift_scale", "cast", "conv", "cast"] assert get_op_types_in_program(mlmodel_quantized._mil_program) == expected_ops quantize_op = mlmodel_quantized._mil_program.functions["main"].find_ops( op_type="constexpr_blockwise_shift_scale" )[0] assert types.builtin_to_string(quantize_op.data.dtype) == dtype_str # For sub-byte dtype, we still use np.int8/uint8 to store the data. assert quantize_op.data.val.dtype == np.int8 if signed else np.uint8 assert model.weight.detach().numpy().size == quantize_op.data.val.size # Weight shape is [32, 64, 2, 2]. The scale's shape reflects number of blocks on each axis. assert quantize_op.scale.shape == (32, 64 // block_size if block_size > 0 else 1, 1, 1) if _macos_version() >= (15, 0): verify_model_outputs(mlmodel, mlmodel_quantized, coreml_input_values) # The verify_model_outputs only check compressed and decompressed consistency. # Also need to compare original and compressed model. original_output = mlmodel.predict(coreml_input_values) quantized_output = mlmodel_quantized.predict(coreml_input_values) for k, v in quantized_output.items(): if nbits <= 4 and block_size != 1: # Low-bit has too much info lost when block size is not 1. continue # When nbits is larger and block_size is smaller, the info lost is less. 
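# Smaller blocks mean each scale covers fewer weights and more bits mean finer quantization levels, hence the tighter tolerance below for block_size == 1 with nbits > 4.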
atol, rtol = 0.4, 0.4 if block_size == 1 and nbits > 4: atol, rtol = 1e-2, 1e-2 np.testing.assert_allclose(v, original_output[k], atol=atol, rtol=rtol) @staticmethod @pytest.mark.parametrize( "compute_unit, backend, mode, nbits", itertools.product( compute_units, backends, ("linear", "linear_symmetric"), (4, 8), ), ) def test_per_tensor_quantization_with_blockwise_op(compute_unit, backend, mode, nbits): op_config = cto.coreml.OpLinearQuantizerConfig( mode=mode, dtype=f"int{nbits}", granularity="per_tensor" ) config = cto.coreml.OptimizationConfig(global_config=op_config) model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data( quantize_config=op_config ) torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", minimum_deployment_target=backend.opset_version, compute_precision=ct.precision.FLOAT16 if backend.precision == "fp16" else ct.precision.FLOAT32, compute_units=compute_unit, ) mlmodel_quantized = cto.coreml.linear_quantize_weights(mlmodel, config) # Verify ops. if backend.precision == "fp16": # For fp16 precision there is no extra cast op inserted. expected_ops = ["constexpr_blockwise_shift_scale", "conv"] else: expected_ops = ["constexpr_blockwise_shift_scale", "cast", "conv", "cast"] assert get_op_types_in_program(mlmodel_quantized._mil_program) == expected_ops quantize_op = mlmodel_quantized._mil_program.functions["main"].find_ops( op_type="constexpr_blockwise_shift_scale" )[0] assert types.builtin_to_string(quantize_op.data.dtype) == f"int{nbits}" if mode == "linear": assert types.builtin_to_string(quantize_op.offset.dtype) == f"int{nbits}" # For int4, we still use np.int8 to store the data. assert quantize_op.data.val.dtype == np.int8 assert model.weight.detach().numpy().size == quantize_op.data.val.size if _macos_version() >= (15, 0): verify_model_outputs(mlmodel, mlmodel_quantized, coreml_input_values) @staticmethod @pytest.mark.parametrize( "compute_unit, backend, mode, nbits, granularity", itertools.product( compute_units, backends, ("linear", "linear_symmetric"), (4, 8), ("per_tensor", "per_channel", "per_block"), ), ) def test_quantization_conv_transpose_axis(compute_unit, backend, mode, nbits, granularity): """The conv_transpose has [Cin, Cout, ...], which is different from conv.""" ( model, inputs, torch_input_values, coreml_input_values, ) = get_test_model_and_data_conv_transpose() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", minimum_deployment_target=backend.opset_version, compute_precision=ct.precision.FLOAT16 if backend.precision == "fp16" else ct.precision.FLOAT32, compute_units=compute_unit, ) dtype_str = f"int{nbits}" op_config = cto.coreml.OpLinearQuantizerConfig( mode=mode, dtype=dtype_str, granularity=granularity ) config = cto.coreml.OptimizationConfig(global_config=op_config) mlmodel_quantized = cto.coreml.linear_quantize_weights(mlmodel, config) # Verify ops. if backend.precision == "fp16": # For fp16 precision there is no extra cast op inserted. expected_ops = [ "constexpr_blockwise_shift_scale", "conv_transpose", "conv_transpose", "add", ] else: expected_ops = [ "constexpr_blockwise_shift_scale", "cast", "conv_transpose", "conv_transpose", "add", "cast", ] assert get_op_types_in_program(mlmodel_quantized._mil_program) == expected_ops # Verify quantization ops are on the expected axis. 
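# conv_transpose weights are laid out [Cin, Cout, kH, kW] (here [64, 32, 2, 2]), so the output-channel axis is axis 1, unlike conv where it is axis 0.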
quantize_op = mlmodel_quantized._mil_program.functions["main"].find_ops( op_type="constexpr_blockwise_shift_scale" )[0] assert types.builtin_to_string(quantize_op.data.dtype) == dtype_str if granularity == "per_tensor": expected_scale_shape = (1, 1, 1, 1) elif granularity == "per_channel": # The weight has shape [64, 32, 2, 2], and the second axis is output channel. expected_scale_shape = (1, 32, 1, 1) else: # The per_block has default block_size 32. expected_scale_shape = (64 // 32, 32, 1, 1) assert quantize_op.scale.shape == expected_scale_shape if _macos_version() >= (15, 0): verify_model_outputs(mlmodel, mlmodel_quantized, coreml_input_values, atol=2e-2) @staticmethod @pytest.mark.parametrize( "backend, skip_model_load", itertools.product(backends, (True, False)), ) def test_skip_model_load_in_compression_pass(backend, skip_model_load): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data_complex() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", minimum_deployment_target=backend.opset_version, compute_precision=ct.precision.FLOAT16, skip_model_load=skip_model_load, ) config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpLinearQuantizerConfig( mode="linear_symmetric", dtype="int4", granularity="per_block", block_size=2, weight_threshold=500, ) ) mlmodel_quantized = cto.coreml.linear_quantize_weights(mlmodel, config) if skip_model_load: # If the mlmodel before compression is not compiled and loaded, the compression pass # should keep the model skip_model_load. with pytest.raises(Exception, match="Cannot make predictions"): mlmodel_quantized.predict(coreml_input_values) else: mlmodel_quantized.predict(coreml_input_values) class TestPalettizeWeights: @staticmethod def test_palettization_with_classifier(): traced_model, example_input = TestPyTorchConverterExamples._get_classifier_model() for class_type in ("str", "int"): mlmodel = TestPyTorchConverterExamples._convert_classifier_model( traced_model, example_input, class_type ) config = cto.coreml.OptimizationConfig() global_config = cto.coreml.OpPalettizerConfig( nbits=8, mode="kmeans", weight_threshold=2 ) config.set_global(global_config) mlmodel = cto.coreml.palettize_weights(mlmodel, config) expected_ops = [ "cast", "reshape", "constexpr_lut_to_dense", "linear", "relu", "constexpr_lut_to_dense", "linear", "relu", "constexpr_lut_to_dense", "linear", "cast", "classify", ] assert get_op_types_in_program(mlmodel._mil_program) == expected_ops @staticmethod def test_palettization(): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data_complex() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram", compute_precision=ct.precision.FLOAT32) config = cto.coreml.OptimizationConfig() global_config = cto.coreml.OpPalettizerConfig(nbits=8, mode="kmeans", weight_threshold=500) conv_config = cto.coreml.OpPalettizerConfig(nbits=6, mode="kmeans", weight_threshold=500) conv_2_config = cto.coreml.OpPalettizerConfig(nbits=4, mode="kmeans", weight_threshold=500) linear_1_config = cto.coreml.OpPalettizerConfig(nbits=2, mode="kmeans", weight_threshold=500) config.set_global(global_config) config.set_op_type("conv", conv_config) config.set_op_name("conv_2_1", conv_2_config) config.set_op_name("linear_0", linear_1_config) mlmodel = cto.coreml.palettize_weights(mlmodel, config) expected_ops = [ "constexpr_lut_to_dense", "constexpr_lut_to_dense", 
"constexpr_lut_to_dense", "constexpr_lut_to_dense", "conv", "conv", "reshape", "linear", "linear", "constexpr_lut_to_dense", "constexpr_lut_to_dense", "constexpr_lut_to_dense", "lstm", "expand_dims", "expand_dims" ] prog = mlmodel._mil_program assert get_op_types_in_program(prog) == expected_ops expected_nbits = [6, 4, 2, 8, 8, 8, 8, 8] lut_ops = prog.find_ops(op_type="constexpr_lut_to_dense") for nbits, op in zip(expected_nbits, lut_ops): assert op.lut.val.shape == (2**nbits,) @staticmethod @pytest.mark.parametrize( "mode", ("uniform", "kmeans") if _HAS_SKLEARN else ("uniform",) ) def test_weight_palettization_stress(mode): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram") mlmodel_palettized = palettize_weights(mlmodel, nbits=4, mode=mode) # validate parameters expected_ops = ['constexpr_lut_to_dense', 'cast', 'conv', 'cast'] assert get_op_types_in_program(mlmodel_palettized._mil_program) == expected_ops main_func = mlmodel_palettized._mil_program.functions["main"] lut_to_dense_op = main_func.find_ops(op_type="constexpr_lut_to_dense")[0] assert lut_to_dense_op.shape.val.tolist() == list(model.weight.detach().numpy().shape) # validate the model verify_model_outputs(mlmodel, mlmodel_palettized, coreml_input_values) @staticmethod def test_weight_palettization_unique_case_1(): # In this model, both conv weights can be palettized model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data(multi_layer=True) weight_1_unique = create_unique_weight(model.conv_1.weight, nbits=2) weight_2_unique = create_unique_weight(model.conv_2.weight, nbits=6) with torch.no_grad(): model.conv_1.weight = torch.nn.Parameter(torch.Tensor(weight_1_unique)) model.conv_2.weight = torch.nn.Parameter(torch.Tensor(weight_2_unique)) torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram") # validate parameters mlmodel_palettized = palettize_weights(mlmodel, mode="unique") expected_ops = ['constexpr_lut_to_dense', 'cast', 'conv', 'constexpr_lut_to_dense', 'conv', 'cast'] assert get_op_types_in_program(mlmodel_palettized._mil_program) == expected_ops main_func = mlmodel_palettized._mil_program.functions["main"] lut_to_dense_op_1 = main_func.find_ops(op_type="constexpr_lut_to_dense")[0] lut_to_dense_op_2 = main_func.find_ops(op_type="constexpr_lut_to_dense")[1] assert lut_to_dense_op_1.shape.val.tolist() == list(model.conv_1.weight.detach().numpy().shape) assert lut_to_dense_op_2.shape.val.tolist() == list(model.conv_2.weight.detach().numpy().shape) # validate the model verify_model_outputs(mlmodel, mlmodel_palettized, coreml_input_values) def test_weight_palettization_unique_case_2(self, caplog): # In this model, only one conv weights can be palettized, the converter should warn the users that one weight is skipped model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data(multi_layer=True) weight_1_unique = create_unique_weight(model.conv_1.weight, nbits=2) with torch.no_grad(): model.conv_1.weight = torch.nn.Parameter(torch.Tensor(weight_1_unique)) torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram") # validate parameters # converter should warn the user that one weight is not compressed mlmodel_palettized = palettize_weights(mlmodel, mode="unique") warning_msg = "Unique values 
in weight cannot be represented by 8 bits palettization." assert any([warning_msg in rec.message for rec in caplog.records]) expected_ops = ['constexpr_lut_to_dense', 'cast', 'conv', 'conv', 'cast'] assert get_op_types_in_program(mlmodel_palettized._mil_program) == expected_ops main_func = mlmodel_palettized._mil_program.functions["main"] lut_to_dense_op_1 = main_func.find_ops(op_type="constexpr_lut_to_dense")[0] assert lut_to_dense_op_1.shape.val.tolist() == list(model.conv_1.weight.detach().numpy().shape) # validate the model verify_model_outputs(mlmodel, mlmodel_palettized, coreml_input_values) @staticmethod def test_weight_palettization_custom(): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram") def lut_function(weight): nbits = 4 weight = weight.flatten() unique_elements = np.unique(weight) k = (1 << nbits) - 1 top_k = np.partition(weight, -k)[-k:] np.sort(top_k) lut = np.array([0.] + top_k.tolist()).astype(weight.dtype) mapping = {v: idx for idx, v in enumerate(lut)} indices = np.array([mapping[v] if v in mapping else 0 for v in weight]).astype(np.uint8) return lut, indices mlmodel_palettized = palettize_weights(mlmodel, mode="custom", lut_function=lut_function) # validate parameters expected_ops = ['constexpr_lut_to_dense', 'cast', 'conv', 'cast'] assert get_op_types_in_program(mlmodel_palettized._mil_program) == expected_ops main_func = mlmodel_palettized._mil_program.functions["main"] lut_to_dense_op = main_func.find_ops(op_type="constexpr_lut_to_dense")[0] assert lut_to_dense_op.shape.val.tolist() == list(model.weight.detach().numpy().shape) # validate the model verify_model_outputs(mlmodel, mlmodel_palettized, coreml_input_values) @staticmethod def test_convert_palettized_source_model_default(): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data_complex() weight_1_unique = create_unique_weight(model.conv_1.weight, nbits=2) weight_2_unique = create_unique_weight(model.conv_2.weight, nbits=6) linear_1_unique = create_unique_weight(model.linear_1.weight, nbits=4) with torch.no_grad(): model.conv_1.weight = torch.nn.Parameter(torch.Tensor(weight_1_unique)) model.conv_2.weight = torch.nn.Parameter(torch.Tensor(weight_2_unique)) model.linear_1.weight = torch.nn.Parameter(torch.Tensor(linear_1_unique)) torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", pass_pipeline=ct.PassPipeline.DEFAULT_PALETTIZATION, compute_precision=ct.precision.FLOAT32, minimum_deployment_target=ct.target.iOS16, ) expected_ops = [ "constexpr_lut_to_dense", "constexpr_lut_to_dense", "constexpr_lut_to_dense", "conv", "conv", "reshape", "linear", "linear", "constexpr_lut_to_dense", "squeeze", "lstm", "expand_dims", "expand_dims", ] prog = mlmodel._mil_program assert get_op_types_in_program(prog) == expected_ops expected_nbits = [2, 6, 4, 1, 1] lut_ops = prog.find_ops(op_type="constexpr_lut_to_dense") for nbits, op in zip(expected_nbits, lut_ops): assert op.lut.val.shape == (2**nbits,) @staticmethod def test_convert_palettized_source_model_custom(): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data_complex() weight_1_unique = create_unique_weight(model.conv_1.weight, nbits=2) weight_2_unique = create_unique_weight(model.conv_2.weight, nbits=6) linear_1_unique = create_unique_weight(model.linear_1.weight, nbits=4) with 
torch.no_grad(): model.conv_1.weight = torch.nn.Parameter(torch.Tensor(weight_1_unique)) model.conv_2.weight = torch.nn.Parameter(torch.Tensor(weight_2_unique)) model.linear_1.weight = torch.nn.Parameter(torch.Tensor(linear_1_unique)) pipeline = ct.PassPipeline.DEFAULT_PALETTIZATION config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig(mode="unique"), op_type_configs={ "conv": None, "linear": cto.coreml.OpPalettizerConfig(nbits=1, mode="kmeans"), } ) pipeline.set_options("compression::palettize_weights", {"config": config}) torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", pass_pipeline=pipeline, compute_precision=ct.precision.FLOAT32, minimum_deployment_target=ct.target.iOS16, ) expected_ops = [ "constexpr_lut_to_dense", "constexpr_lut_to_dense", "conv", "conv", "reshape", "linear", "linear", "constexpr_lut_to_dense", "squeeze", "lstm", "expand_dims", "expand_dims", ] prog = mlmodel._mil_program assert get_op_types_in_program(prog) == expected_ops expected_nbits = [1, 1, 1, 1] lut_ops = prog.find_ops(op_type="constexpr_lut_to_dense") for nbits, op in zip(expected_nbits, lut_ops): assert op.lut.val.shape == (2**nbits,) conv_ops = prog.find_ops(op_type="conv") assert conv_ops[0].weight.op.op_type == "const" assert conv_ops[1].weight.op.op_type == "const" linear_ops = prog.find_ops(op_type="linear") assert linear_ops[0].weight.op.op_type == "constexpr_lut_to_dense" assert linear_ops[1].weight.op.op_type == "constexpr_lut_to_dense" @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_channelwise_palettization(self, compute_unit, backend): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data_complex() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", minimum_deployment_target=backend.opset_version, compute_precision=ct.precision.FLOAT16 if backend.precision == "fp16" else ct.precision.FLOAT32, compute_units=compute_unit, ) config = cto.coreml.OptimizationConfig() conv_config = cto.coreml.OpPalettizerConfig( mode="kmeans", nbits=8, granularity="per_grouped_channel", group_size=1, weight_threshold=500, ) lstm_config = cto.coreml.OpPalettizerConfig( mode="kmeans", nbits=4, granularity="per_grouped_channel", group_size=1, weight_threshold=4800, ) config.set_op_type("conv", conv_config) config.set_op_type("lstm", lstm_config) # Set a specific conv's config to None to prevent it being compressed. 
conv_not_to_compress_name = "conv_2_1" if backend.precision == "fp16": conv_not_to_compress_name += "_cast_fp16" config.set_op_name(conv_not_to_compress_name, None) mlmodel_palettized = cto.coreml.palettize_weights(mlmodel, config) expected_ops = [ "constexpr_lut_to_dense", "conv", "conv", "reshape", "linear", "linear", "constexpr_lut_to_dense", "constexpr_lut_to_dense", "constexpr_lut_to_dense", "lstm", "expand_dims", "expand_dims", ] prog = mlmodel_palettized._mil_program assert get_op_types_in_program(prog) == expected_ops assert prog.find_ops(op_type="conv")[1].weight.op.op_type == "const" palettize_ops = prog.find_ops(op_type="constexpr_lut_to_dense") for quantize_op in palettize_ops: assert types.builtin_to_string(quantize_op.lut.dtype) == backend.precision assert types.builtin_to_string(palettize_ops[0].indices.dtype) == "uint8" assert types.builtin_to_string(palettize_ops[3].indices.dtype) == "uint4" if _macos_version() >= (15, 0): verify_model_outputs(mlmodel, mlmodel_palettized, coreml_input_values) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_channelwise_palettization_unique_skip_op(self, compute_unit, backend, caplog): """Test where mode is unique and can't use nbits to represent the weight""" model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data_complex() traced_model = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( traced_model, inputs=inputs, convert_to="mlprogram", minimum_deployment_target=backend.opset_version, compute_precision=ct.precision.FLOAT16 if backend.precision == "fp16" else ct.precision.FLOAT32, compute_units=compute_unit, ) config = cto.coreml.OptimizationConfig() global_config = cto.coreml.OpPalettizerConfig( mode="unique", granularity="per_grouped_channel", group_size=1, weight_threshold=100, ) # For conv weight in the whole tensor cannot be represented by 2**8 unique values. conv_config = cto.coreml.OpPalettizerConfig( mode="unique", granularity="per_tensor", weight_threshold=100, ) config.set_global(global_config) config.set_op_type("conv", conv_config) mlmodel_palettized = cto.coreml.palettize_weights(mlmodel, config) assert any( [ "Unique values in weight cannot be represented by 8 bits palettization." in rec.message for rec in caplog.records ] ) # There is no constexpr for the conv weight. for conv_op in mlmodel_palettized._mil_program.find_ops(op_type="conv"): assert conv_op.weight.op.op_type == "const" # There are still constexpr ops for linear and lstm weights. 
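# 2 linear weights + 3 lstm weights = 5 palettized constants expected below.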
assert len(mlmodel_palettized._mil_program.find_ops(op_type="constexpr_lut_to_dense")) == 5 if _macos_version() >= (15, 0): verify_model_outputs(mlmodel, mlmodel_palettized, coreml_input_values) @staticmethod @pytest.mark.parametrize( "compute_unit, backend, mode, nbits, channel_axis, channel_group_size", itertools.product( compute_units, backends, ("kmeans", "uniform"), (1, 2, 3, 4, 6, 8), (0, 1), (0, 1, 2), ), ) def test_channelwise_palettization_stress( compute_unit, backend, mode, nbits, channel_axis, channel_group_size ): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", minimum_deployment_target=backend.opset_version, compute_precision=ct.precision.FLOAT16 if backend.precision == "fp16" else ct.precision.FLOAT32, compute_units=compute_unit, ) op_config = cto.coreml.OpPalettizerConfig( mode=mode, nbits=nbits, granularity="per_grouped_channel", group_size=channel_group_size, channel_axis=channel_axis, ) config = cto.coreml.OptimizationConfig(global_config=op_config) mlmodel_palettized = cto.coreml.palettize_weights(mlmodel, config) # Verify ops. if backend.precision == "fp16": # For fp16 precision there is no extra cast op inserted. expected_ops = ["constexpr_lut_to_dense", "conv"] else: expected_ops = ["constexpr_lut_to_dense", "cast", "conv", "cast"] assert get_op_types_in_program(mlmodel_palettized._mil_program) == expected_ops palettize_op = mlmodel_palettized._mil_program.functions["main"].find_ops( op_type="constexpr_lut_to_dense" )[0] assert types.builtin_to_string(palettize_op.indices.dtype) == f"uint{nbits}" # For uint4, we still use np.uint8 to store the data. assert palettize_op.indices.val.dtype == np.uint8 assert model.weight.detach().numpy().shape == palettize_op.indices.val.shape if _macos_version() >= (15, 0): verify_model_outputs(mlmodel, mlmodel_palettized, coreml_input_values) # The verify_model_outputs compares the decompressed model with compressed model. # We further compare the compressed model with original model. ref_output_dict = mlmodel.predict(coreml_input_values) output_dict = mlmodel_palettized.predict(coreml_input_values) for k, v in output_dict.items(): assert k in ref_output_dict if nbits == 1: continue # nbits=1 numerical loss is too significant. 
elif nbits <= 3: large_diff_count = np.sum((v - ref_output_dict[k]) > 0.2) threshold = 0.15 if channel_group_size != 0 else 0.5 assert large_diff_count / v.size < threshold elif nbits < 8: np.testing.assert_almost_equal(v, ref_output_dict[k], decimal=1) else: err_tol = 1e-5 if mode == "kmeans" and channel_group_size == 1 else 1e-2 np.testing.assert_allclose(v, ref_output_dict[k], atol=err_tol, rtol=err_tol) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_grouped_channelwise_palettization_better_than_per_tensor(self, compute_unit, backend): """The grouped channelwise lut should be better than per-tensor lut.""" model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", minimum_deployment_target=backend.opset_version, compute_precision=ct.precision.FLOAT16 if backend.precision == "fp16" else ct.precision.FLOAT32, compute_units=compute_unit, ) per_tensor_config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig( mode="kmeans", nbits=4, granularity="per_tensor", ) ) grouped_channelwise_config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig( mode="kmeans", nbits=4, granularity="per_grouped_channel", group_size=1, ) ) if _macos_version() < (15, 0): pytest.skip("Channelwise palettization prediction only support in iOS18+") mlmodel_per_tensor_palettized = cto.coreml.palettize_weights(mlmodel, per_tensor_config) mlmodel_grouped_channelwise_palettized = cto.coreml.palettize_weights( mlmodel, grouped_channelwise_config ) output_ref = mlmodel.predict(coreml_input_values) output_per_tensor = mlmodel_per_tensor_palettized.predict(coreml_input_values) output_grouped_channelwise = mlmodel_grouped_channelwise_palettized.predict( coreml_input_values ) for k_ref, v_ref in output_ref.items(): snr_per_tensor = compute_snr_and_psnr(v_ref, output_per_tensor[k_ref])[0] snr_grouped_channelwise = compute_snr_and_psnr( v_ref, output_grouped_channelwise[k_ref] )[0] assert snr_grouped_channelwise > snr_per_tensor def test_channelwise_palettization_invalid_config(self): with pytest.raises(ValueError, match='Invalid value of "nbits" \(7\) for palettization'): cto.coreml.OpPalettizerConfig( mode="kmeans", nbits=7, granularity="per_tensor", weight_threshold=500, ) @pytest.mark.parametrize( "compute_unit, backend, group_size", itertools.product(compute_units, backends, [1, 16]), ) def test_convert_palettized_model_with_pipeline(self, compute_unit, backend, group_size): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data( multi_layer=True ) with torch.no_grad(): model.conv_1.weight = torch.nn.Parameter( torch.Tensor(create_unique_weight(model.conv_1.weight, nbits=2)) ) model.conv_2.weight = torch.nn.Parameter( torch.Tensor(create_unique_weight(model.conv_2.weight, nbits=6)) ) torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", minimum_deployment_target=backend.opset_version, compute_precision=ct.precision.FLOAT16 if backend.precision == "fp16" else ct.precision.FLOAT32, compute_units=compute_unit, ) pass_pipeline = ct.PassPipeline.DEFAULT_PALETTIZATION pass_pipeline.set_options( "compression::palettize_weights", { "config": cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig( mode="unique", granularity="per_grouped_channel", 
group_size=group_size ) ) }, ) mlmodel_palettized = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", minimum_deployment_target=backend.opset_version, compute_precision=ct.precision.FLOAT16 if backend.precision == "fp16" else ct.precision.FLOAT32, compute_units=compute_unit, pass_pipeline=pass_pipeline, ) expected_ops = ["constexpr_lut_to_dense", "constexpr_lut_to_dense", "conv", "conv"] assert get_op_types_in_program(mlmodel_palettized._mil_program) == expected_ops palettize_ops = mlmodel_palettized._mil_program.functions["main"].find_ops( op_type="constexpr_lut_to_dense" ) assert types.builtin_to_string(palettize_ops[0].indices.dtype) == "uint2" assert palettize_ops[0].lut.shape == (32 // group_size, 1, 1, 1, 4, 1) assert types.builtin_to_string(palettize_ops[1].indices.dtype) == "uint6" assert palettize_ops[1].lut.shape == (64 // group_size, 1, 1, 1, 64, 1) if _macos_version() >= (15, 0): verify_model_outputs(mlmodel, mlmodel_palettized, coreml_input_values) @pytest.mark.xfail(reason="rdar://131511244 Investigate Why Palettization is Failing on BNNS") @pytest.mark.parametrize( "compute_unit, backend, mode, cluster_dim", itertools.product(compute_units, backends, ("kmeans", "unique"), (2, 4)), ) def test_vector_palettization(self, compute_unit, backend, mode, cluster_dim): """Test the vector palettization (cluster_dim > 1).""" model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data() if mode == "unique": weight_unique = create_unique_weight( model.weight, nbits=4, vector_size=cluster_dim, vector_axis=0 ) with torch.no_grad(): model.weight = torch.nn.Parameter(torch.Tensor(weight_unique)) torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", minimum_deployment_target=backend.opset_version, compute_precision=ct.precision.FLOAT16 if backend.precision == "fp16" else ct.precision.FLOAT32, compute_units=compute_unit, ) vector_lut_config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig( mode=mode, nbits=4 if mode == "kmeans" else None, granularity="per_grouped_channel", group_size=0, cluster_dim=cluster_dim, weight_threshold=500, ) ) mlmodel_palettized = cto.coreml.palettize_weights(mlmodel, vector_lut_config) # Verify ops. palettize_op = mlmodel_palettized._mil_program.functions["main"].find_ops( op_type="constexpr_lut_to_dense" )[0] produced_nbits = 4 assert types.builtin_to_string(palettize_op.indices.dtype) == f"uint{produced_nbits}" # The shape on the Cout (0th for conv) should match after multiplying cluster_dim. assert model.weight.shape[0] == palettize_op.indices.val.shape[0] * cluster_dim # The last dim of lut should match cluster_dim. 
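# i.e. the lut's last two dims are (2**nbits, cluster_dim): each of the 2**nbits palette entries is a vector of cluster_dim scalars.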
assert palettize_op.lut.shape[-2:] == (2**produced_nbits, cluster_dim) if _macos_version() >= (15, 0): verify_model_outputs(mlmodel, mlmodel_palettized, coreml_input_values) @pytest.mark.parametrize( "compute_unit, backend, mode, cluster_dim, op_type", itertools.product( compute_units, backends, ("kmeans", "unique"), (2, 4), ("conv", "conv_transpose") ), ) def test_vector_palettization_skip_conv( self, compute_unit, backend, mode, cluster_dim, op_type, caplog ): """Test grouped conv/conv_transpose where effective dim size is not divisible by cluster_dim.""" inputs = [ct.TensorType(name="data", shape=(1, 32, 10, 10))] torch_input_values = [torch.rand(*i.shape.to_list()) for i in inputs] if op_type == "conv": model = torch.nn.Conv2d(in_channels=32, out_channels=32, kernel_size=1, groups=32) else: model = torch.nn.ConvTranspose2d( in_channels=32, out_channels=32, kernel_size=1, groups=32 ) model.eval() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", minimum_deployment_target=backend.opset_version, compute_precision=ct.precision.FLOAT16 if backend.precision == "fp16" else ct.precision.FLOAT32, compute_units=compute_unit, ) vector_lut_config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig( mode=mode, nbits=4 if mode == "kmeans" else None, granularity="per_grouped_channel", group_size=0, cluster_dim=cluster_dim, weight_threshold=30, # The weight shape is [32, 1, 1, 1]. ) ) with caplog.at_level(logging.WARNING): mlmodel_palettized = cto.coreml.palettize_weights(mlmodel, vector_lut_config) # As the effective dim size (1) is not divisible by cluster_dim, the op won't be palettized. warning_msg = "The `cluster_dim` is invalid for .* Skipped this op." assert any([re.match(warning_msg, rec.message) for rec in caplog.records]) assert get_op_types_in_program(mlmodel._mil_program) == get_op_types_in_program( mlmodel_palettized._mil_program ) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_palettization_pcs(self, compute_unit, backend): """Test the palettization with per-channel-scale.""" model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", minimum_deployment_target=backend.opset_version, compute_precision=ct.precision.FLOAT16 if backend.precision == "fp16" else ct.precision.FLOAT32, compute_units=compute_unit, ) vector_lut_config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig( mode="kmeans", nbits=4, granularity="per_grouped_channel", group_size=0, enable_per_channel_scale=True, weight_threshold=500, ) ) mlmodel_palettized = cto.coreml.palettize_weights(mlmodel, vector_lut_config) # Verify ops. palettize_op = mlmodel_palettized._mil_program.functions["main"].find_ops( op_type="constexpr_lut_to_dense" )[0] assert types.builtin_to_string(palettize_op.indices.dtype) == "uint4" # The per-channel-scale is represented by a quant op to do scaling. quantize_ops = mlmodel_palettized._mil_program.functions["main"].find_ops( op_type="constexpr_blockwise_shift_scale" ) assert len(quantize_ops) > 0 # Order of quant and lut op is determined by canonicalize_quantized_lut_pattern graph pass. 
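# Here that pass places the constexpr_blockwise_shift_scale op (holding the per-channel scales) so that its output is consumed by the constexpr_lut_to_dense op, as asserted below.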
assert quantize_ops[0].outputs[0].child_ops[0].op_type == "constexpr_lut_to_dense" if _macos_version() >= (15, 0): verify_model_outputs(mlmodel, mlmodel_palettized, coreml_input_values) class TestPruneWeights: @staticmethod def test_pruning_with_classifier(): traced_model, example_input = TestPyTorchConverterExamples._get_classifier_model() for class_type in ("str", "int"): mlmodel = TestPyTorchConverterExamples._convert_classifier_model( traced_model, example_input, class_type ) config = cto.coreml.OptimizationConfig() global_config = cto.coreml.OpMagnitudePrunerConfig( target_sparsity=0.9, weight_threshold=2 ) config.set_global(global_config) mlmodel = cto.coreml.prune_weights(mlmodel, config) expected_ops = [ "cast", "reshape", "constexpr_sparse_to_dense", "linear", "relu", "constexpr_sparse_to_dense", "linear", "relu", "constexpr_sparse_to_dense", "linear", "cast", "classify", ] assert get_op_types_in_program(mlmodel._mil_program) == expected_ops @staticmethod def test_pruning(): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data_complex() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram", compute_precision=ct.precision.FLOAT32) config = cto.coreml.OptimizationConfig() global_config = cto.coreml.OpMagnitudePrunerConfig(target_sparsity=0.9, weight_threshold=500) config.set_global(global_config) config.set_op_type("lstm", None) config.set_op_name("linear_0", None) mlmodel = cto.coreml.prune_weights(mlmodel, config) expected_ops = [ "constexpr_sparse_to_dense", "constexpr_sparse_to_dense", "constexpr_sparse_to_dense", "conv", "conv", "reshape", "linear", "linear", "lstm", "expand_dims", "expand_dims" ] prog = mlmodel._mil_program assert get_op_types_in_program(prog) == expected_ops assert prog.find_ops(op_type="linear")[0].weight.op.op_type == "const" @staticmethod @pytest.mark.parametrize( "threshold", (0.0, 0.001, 1e2), ) def test_weight_pruning_threshold_based(threshold): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data() with torch.no_grad(): model.weight[0][0][0][0] = 101 torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram") mlmodel_sparsified = prune_weights(mlmodel, mode="threshold_based", threshold=threshold) # validate parameters expected_ops = ['constexpr_sparse_to_dense', 'cast', 'conv', 'cast'] assert get_op_types_in_program(mlmodel_sparsified._mil_program) == expected_ops main_func = mlmodel_sparsified._mil_program.functions["main"] sparse_to_dense_op = main_func.find_ops(op_type="constexpr_sparse_to_dense")[0] non_sparse_data = sparse_to_dense_op.nonzero_data if threshold != 1e2: assert np.min(np.absolute(non_sparse_data.val)) >= threshold else: assert non_sparse_data.val.size == 1 assert sparse_to_dense_op.shape.val.tolist() == list(model.weight.detach().numpy().shape) # validate the model verify_model_outputs(mlmodel, mlmodel_sparsified, coreml_input_values) @staticmethod @pytest.mark.parametrize( "percentile", (0., 0.5, 1.0), ) def test_weight_pruning_percentile_based(percentile): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data() # Make sure no weight element is randomed to 0, to eliminate testing noise # e.g. 
in percentile 0 test case, we would expect no element gets pruned # if there is no 0 in initial weight with torch.no_grad(): non0_weight = torch.where(torch.abs(model.weight) > 1e-6, model.weight, 1e-6) model.weight.copy_(non0_weight) torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram") mlmodel_sparsified = prune_weights(mlmodel, mode="percentile_based", target_sparsity=percentile) # validate parameters expected_ops = ['constexpr_sparse_to_dense', 'cast', 'conv', 'cast'] assert get_op_types_in_program(mlmodel_sparsified._mil_program) == expected_ops main_func = mlmodel_sparsified._mil_program.functions["main"] sparse_to_dense_op = main_func.find_ops(op_type="constexpr_sparse_to_dense")[0] non_sparse_data = sparse_to_dense_op.nonzero_data weight = model.weight.detach().numpy() if percentile == 0.: assert non_sparse_data.val.size == weight.size elif percentile == 0.5: lower = 0.49 * weight.size upper = 0.51 * weight.size actual = non_sparse_data.val.size assert lower <= actual and actual <= upper else: assert non_sparse_data.val.size == 0 assert sparse_to_dense_op.shape.val.tolist() == list(model.weight.detach().numpy().shape) # validate the model verify_model_outputs(mlmodel, mlmodel_sparsified, coreml_input_values) def test_weight_pruning_block_sparsity(self): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram") mlmodel_sparsified = prune_weights(mlmodel, mode="block_sparsity", target_sparsity=0.3, block_size=5) # validate parameters expected_ops = ['constexpr_sparse_to_dense', 'cast', 'conv', 'cast'] assert get_op_types_in_program(mlmodel_sparsified._mil_program) == expected_ops def test_weight_pruning_n_m(self): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram") mlmodel_sparsified = prune_weights(mlmodel, mode="n_m_pruning", n_m_ratio=(2, 3)) # validate parameters expected_ops = ['constexpr_sparse_to_dense', 'cast', 'conv', 'cast'] assert get_op_types_in_program(mlmodel_sparsified._mil_program) == expected_ops def test_convert_sparse_source_model_default(self): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data_complex() weight_1_sparse = create_sparse_weight(model.conv_1.weight, 0.5) weight_2_sparse = create_sparse_weight(model.conv_2.weight, 0.1) linear_1_sparse = create_sparse_weight(model.linear_1.weight, 0.9) with torch.no_grad(): model.conv_1.weight = torch.nn.Parameter(torch.Tensor(weight_1_sparse)) model.conv_2.weight = torch.nn.Parameter(torch.Tensor(weight_2_sparse)) model.linear_1.weight = torch.nn.Parameter(torch.Tensor(linear_1_sparse)) torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", pass_pipeline=ct.PassPipeline.DEFAULT_PRUNING, compute_precision=ct.precision.FLOAT32, minimum_deployment_target=ct.target.iOS16, ) prog = mlmodel._mil_program # The default minimum_sparsity_percentile is 0.3, so only conv1, linear1, and two initialize states of lstm # are compressed expected_ops = [ "constexpr_sparse_to_dense", "constexpr_sparse_to_dense", "conv", "conv", "reshape", "linear", "linear", "constexpr_sparse_to_dense", "squeeze", "lstm", "expand_dims", "expand_dims" ] assert 
get_op_types_in_program(prog) == expected_ops conv_ops = prog.find_ops(op_type="conv") assert conv_ops[0].weight.op.op_type == "constexpr_sparse_to_dense" assert conv_ops[1].weight.op.op_type == "const" linear_ops = prog.find_ops(op_type="linear") assert linear_ops[0].weight.op.op_type == "constexpr_sparse_to_dense" assert linear_ops[1].weight.op.op_type == "const" def test_convert_sparse_source_model_custom(self): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data_complex() weight_1_sparse = create_sparse_weight(model.conv_1.weight, 0.5) weight_2_sparse = create_sparse_weight(model.conv_2.weight, 0.1) linear_1_sparse = create_sparse_weight(model.linear_1.weight, 0.9) with torch.no_grad(): model.conv_1.weight = torch.nn.Parameter(torch.Tensor(weight_1_sparse)) model.conv_2.weight = torch.nn.Parameter(torch.Tensor(weight_2_sparse)) model.linear_1.weight = torch.nn.Parameter(torch.Tensor(linear_1_sparse)) torchmodel = torch.jit.trace(model, torch_input_values) pipeline = ct.PassPipeline.DEFAULT_PRUNING config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpThresholdPrunerConfig( threshold=1e-12, minimum_sparsity_percentile=0.05 ), op_type_configs={"conv": None}, ) pipeline.set_options("compression::prune_weights", {"config": config}) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", pass_pipeline=pipeline, compute_precision=ct.precision.FLOAT32, minimum_deployment_target=ct.target.iOS16, ) prog = mlmodel._mil_program expected_ops = [ "constexpr_sparse_to_dense", "conv", "conv", "reshape", "linear", "linear", "constexpr_sparse_to_dense", "squeeze", "lstm", "expand_dims", "expand_dims" ] assert get_op_types_in_program(prog) == expected_ops conv_ops = prog.find_ops(op_type="conv") assert conv_ops[0].weight.op.op_type == "const" assert conv_ops[1].weight.op.op_type == "const" linear_ops = prog.find_ops(op_type="linear") assert linear_ops[0].weight.op.op_type == "constexpr_sparse_to_dense" assert linear_ops[1].weight.op.op_type == "const" @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_default_prune_pipeline_ios18(self, compute_unit, backend): """Make sure the new iOS18 op is used for DEFAULT_PRUNING pass pipeline.""" # Make the weight size not divisible by 8, to make sure the internal conversion to ios18 # sparse_to_dense op handles sub-byte masks correctly. 
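# The iOS18 sparse mask is stored as uint1 (8 elements packed per byte), so this weight's 121 * 21 = 2541 elements leave a partially used last byte.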
model = torch.nn.Linear(21, 121) model.eval() weight_sparse = create_sparse_weight(model.weight, 0.7) with torch.no_grad(): model.weight = torch.nn.Parameter(torch.Tensor(weight_sparse)) inputs = [ct.TensorType(name="data", shape=(4, 21))] torch_input_values = [torch.rand(*i.shape.to_list()) for i in inputs] coreml_input_values = { i.name: val.detach().numpy() for i, val in zip(inputs, torch_input_values) } torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", minimum_deployment_target=backend.opset_version, compute_precision=ct.precision.FLOAT16 if backend.precision == "fp16" else ct.precision.FLOAT32, compute_units=compute_unit, ) mlmodel_pruned = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", minimum_deployment_target=backend.opset_version, compute_precision=ct.precision.FLOAT16 if backend.precision == "fp16" else ct.precision.FLOAT32, compute_units=compute_unit, pass_pipeline=ct.PassPipeline.DEFAULT_PRUNING, ) sparse_ops = mlmodel_pruned._mil_program.find_ops(op_type="constexpr_sparse_to_dense") assert len(sparse_ops) > 0 for sparse_op in sparse_ops: assert types.builtin_to_string(sparse_op.nonzero_data.dtype) == backend.precision if backend.opset_version >= ct.target.iOS18: assert types.builtin_to_string(sparse_op.mask.dtype) == "uint1" else: assert types.builtin_to_string(sparse_op.mask.dtype) == "uint8" assert types.builtin_to_string(sparse_op.shape.dtype) == "uint32" if _macos_version() >= (15, 0): verify_model_outputs(mlmodel, mlmodel_pruned, coreml_input_values, rtol=3e-3, atol=2e-3) class TestJointCompressWeights: """Test using coremltools PTQ to do joint compression.""" @pytest.mark.xfail( reason="rdar://131511244 Investigate Why Joint Prune x Anything are Failing on BNNS" ) @pytest.mark.parametrize( "compute_unit, backend, dtype, block_size, output_channel_block_size, prune_first", itertools.product( compute_units, backends, ("int4", "int8", "uint4", "uint8"), (0, 1, 2), (0, 1), (True, False), ), ) def test_joint_prune_quantize_weights( self, compute_unit, backend, dtype, block_size, output_channel_block_size, prune_first ): """Jointly prune and quantize the model, where non-sparse entries are quantized.""" model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data_complex() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", minimum_deployment_target=backend.opset_version, compute_precision=ct.precision.FLOAT16 if backend.precision == "fp16" else ct.precision.FLOAT32, compute_units=compute_unit, ) prune_config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpMagnitudePrunerConfig( target_sparsity=0.5, weight_threshold=500 ) ) quant_config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpLinearQuantizerConfig( mode="linear", dtype=dtype, granularity="per_block", block_size=[0, block_size] if output_channel_block_size == 0 else block_size, weight_threshold=500, ), op_type_configs={ "conv": cto.coreml.OpLinearQuantizerConfig( mode="linear", dtype=dtype, granularity="per_block", block_size=[0, block_size, 0, 0] if output_channel_block_size == 0 else block_size, weight_threshold=500, ), }, ) if prune_first: mlmodel_pruned = cto.coreml.prune_weights(mlmodel, prune_config) mlmodel_joint_pruned_quantized = cto.coreml.linear_quantize_weights( mlmodel_pruned, quant_config, joint_compression=True ) else: mlmodel_quantized = cto.coreml.linear_quantize_weights(mlmodel, 
quant_config) mlmodel_joint_pruned_quantized = cto.coreml.prune_weights( mlmodel_quantized, prune_config, joint_compression=True ) # If run prune first, the all-zero const for lstm won't have nonzero-data, so it won't be # further quantized. lstm_weight_compression_ops = ( ["constexpr_sparse_to_dense"] if prune_first else ["constexpr_sparse_blockwise_shift_scale", "constexpr_sparse_to_dense"] ) expected_ops = ( ["constexpr_sparse_blockwise_shift_scale", "constexpr_sparse_to_dense", "conv"] * 2 + ["reshape"] + ["constexpr_sparse_blockwise_shift_scale", "constexpr_sparse_to_dense", "linear"] * 2 + lstm_weight_compression_ops + ["constexpr_sparse_blockwise_shift_scale", "constexpr_sparse_to_dense"] * 2 + ["lstm", "expand_dims", "expand_dims"] ) prog = mlmodel_joint_pruned_quantized._mil_program assert get_op_types_in_program(prog) == expected_ops for linear_op in prog.find_ops(op_type="linear"): assert linear_op.weight.op.op_type == "constexpr_sparse_to_dense" for conv_op in prog.find_ops(op_type="conv"): assert conv_op.weight.op.op_type == "constexpr_sparse_to_dense" sparse_quantize_ops = prog.find_ops(op_type="constexpr_sparse_blockwise_shift_scale") assert len(sparse_quantize_ops) > 0 for sparse_quantize_op in sparse_quantize_ops: assert types.builtin_to_string(sparse_quantize_op.nonzero_data.dtype) == dtype assert sparse_quantize_op.data_mask.dtype == types.uint1 assert sparse_quantize_op.scale.dtype == types.fp16 assert types.builtin_to_string(sparse_quantize_op.offset.dtype) == dtype assert sparse_quantize_op.outputs[1].child_ops[0].op_type == "constexpr_sparse_to_dense" # As both quantization and pruning is on the original weight, the shape of scale should # match the original weight's shape except on the input/output channel. weight_shape = sparse_quantize_op.outputs[1].child_ops[0].outputs[0].shape expected_scale_shape = [1] * len(weight_shape) if block_size > 0: expected_scale_shape[1] = weight_shape[1] // block_size if output_channel_block_size > 0: expected_scale_shape[0] = weight_shape[0] // output_channel_block_size assert sparse_quantize_op.scale.shape == tuple(expected_scale_shape) sparse_ops = prog.find_ops(op_type="constexpr_sparse_to_dense") assert len(sparse_ops) > 0 for sparse_op in sparse_ops: assert sparse_op.mask.dtype == types.uint1 assert sparse_op.nonzero_data.dtype == types.fp16 if _macos_version() >= (15, 0): atol = 5e-4 if compute_unit == ct.ComputeUnit.CPU_AND_GPU else 1e-6 verify_model_outputs( mlmodel, mlmodel_joint_pruned_quantized, coreml_input_values, atol=atol ) @pytest.mark.xfail( reason="rdar://131511244 Investigate Why Joint Prune x Anything are Failing on BNNS" ) @pytest.mark.parametrize( "compute_unit, backend, nbits, channel_group_size, prune_first", itertools.product( compute_units, backends, (3, 4, 8), (0, 1, 2), (True, False), ), ) def test_joint_prune_palettize_weights( self, compute_unit, backend, nbits, channel_group_size, prune_first ): """Jointly prune and palettize the model, where non-sparse entries are palettized.""" model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data_complex() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", minimum_deployment_target=backend.opset_version, compute_precision=ct.precision.FLOAT16 if backend.precision == "fp16" else ct.precision.FLOAT32, compute_units=compute_unit, ) prune_config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpMagnitudePrunerConfig( target_sparsity=0.2, 
weight_threshold=500, ) ) palettize_config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig( mode="uniform", nbits=nbits, granularity="per_grouped_channel", group_size=channel_group_size, weight_threshold=500, ) ) if prune_first: mlmodel_pruned = cto.coreml.prune_weights(mlmodel, prune_config) mlmodel_joint_pruned_palettized = cto.coreml.palettize_weights( mlmodel_pruned, palettize_config, joint_compression=True ) else: mlmodel_palettized = cto.coreml.palettize_weights(mlmodel, palettize_config) mlmodel_joint_pruned_palettized = cto.coreml.prune_weights( mlmodel_palettized, prune_config, joint_compression=True ) # If run prune first, the all-zero const for lstm won't have nonzero-data, so it won't be # further quantized. lstm_weight_compression_ops = ( ["constexpr_sparse_to_dense"] if prune_first else ["constexpr_lut_to_sparse", "constexpr_sparse_to_dense"] ) expected_ops = ( ["constexpr_lut_to_sparse", "constexpr_sparse_to_dense", "conv"] * 2 + ["reshape"] + ["constexpr_lut_to_sparse", "constexpr_sparse_to_dense", "linear"] * 2 + lstm_weight_compression_ops + ["constexpr_lut_to_sparse", "constexpr_sparse_to_dense"] * 2 + ["lstm", "expand_dims", "expand_dims"] ) prog = mlmodel_joint_pruned_palettized._mil_program assert get_op_types_in_program(prog) == expected_ops for linear_op in prog.find_ops(op_type="linear"): assert linear_op.weight.op.op_type == "constexpr_sparse_to_dense" for conv_op in prog.find_ops(op_type="conv"): assert conv_op.weight.op.op_type == "constexpr_sparse_to_dense" sparse_palettize_ops = prog.find_ops(op_type="constexpr_lut_to_sparse") assert len(sparse_palettize_ops) > 0 for sparse_palettize_op in sparse_palettize_ops: assert sparse_palettize_op.indices_nonzero_data.dtype == types.string_to_builtin( f"uint{nbits}" ) assert sparse_palettize_op.indices_mask.dtype == types.uint1 assert sparse_palettize_op.lut.dtype == types.fp16 assert ( sparse_palettize_op.outputs[1].child_ops[0].op_type == "constexpr_sparse_to_dense" ) # As both palettization and pruning is on the original weight, the shape of lut should # match the original weight's shape except on the output channel. weight_shape = sparse_palettize_op.outputs[1].child_ops[0].outputs[0].shape expected_lut_shape = [1] * len(weight_shape) + [2**nbits] + [1] if channel_group_size > 0: expected_lut_shape[0] = weight_shape[0] // channel_group_size assert sparse_palettize_op.lut.shape == tuple(expected_lut_shape) sparse_ops = prog.find_ops(op_type="constexpr_sparse_to_dense") assert len(sparse_ops) > 0 for sparse_op in sparse_ops: assert sparse_op.mask.dtype == types.uint1 assert sparse_op.nonzero_data.dtype == types.fp16 if _macos_version() >= (15, 0): atol = 5e-4 if compute_unit == ct.ComputeUnit.CPU_AND_GPU else 1e-6 verify_model_outputs( mlmodel, mlmodel_joint_pruned_palettized, coreml_input_values, atol=atol ) @pytest.mark.parametrize( "compute_unit, backend, nbits, channel_group_size, quantize_first", itertools.product( compute_units, backends, (3, 4, 8), (0, 1, 2), (True, False), ), ) def test_joint_palettize_quantize_weights( self, compute_unit, backend, nbits, channel_group_size, quantize_first ): """ If quantize_first is True: First quantize to get int8 weight, and then palettize to n-bit lut with int8 entries. If quantize_first is False: First palettize to get fp16 lut, and then quantize the lut to make int8 lut. Notice no matter applies which one first, the final output model's op order is guaranteed to be consistent by the common::canonicalize_quantized_lut_pattern graph pass. 
""" model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data_complex() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", minimum_deployment_target=backend.opset_version, compute_precision=ct.precision.FLOAT16 if backend.precision == "fp16" else ct.precision.FLOAT32, compute_units=compute_unit, ) palettize_config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig( mode="uniform", nbits=nbits, granularity="per_grouped_channel", group_size=channel_group_size, weight_threshold=500, ) ) quant_config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpLinearQuantizerConfig( # Quantize the whole lut tensor as the lut usually is not huge. mode="linear", dtype="int8", granularity="per_tensor", weight_threshold=500, ) ) if quantize_first: mlmodel_quantized = cto.coreml.linear_quantize_weights(mlmodel, quant_config) mlmodel_joint_palettized_quantized = cto.coreml.palettize_weights( mlmodel_quantized, palettize_config, joint_compression=True ) else: mlmodel_palettized = cto.coreml.palettize_weights(mlmodel, palettize_config) mlmodel_joint_palettized_quantized = cto.coreml.linear_quantize_weights( mlmodel_palettized, quant_config, joint_compression=True ) expected_ops = ( ["constexpr_blockwise_shift_scale", "constexpr_lut_to_dense", "conv"] * 2 + ["reshape"] + ["constexpr_blockwise_shift_scale", "constexpr_lut_to_dense", "linear"] * 2 + ["constexpr_blockwise_shift_scale", "constexpr_lut_to_dense"] * 3 + ["lstm", "expand_dims", "expand_dims"] ) prog = mlmodel_joint_palettized_quantized._mil_program if channel_group_size == 0: # When doing lut first with per-tensor lut, the lut size is too small, so it's stored as ImmediateValue # which won't be quantized. 
ops_in_prog = get_op_types_in_program(prog) if nbits < 4 and not quantize_first: assert ops_in_prog.count("constexpr_blockwise_shift_scale") == 0 else: assert ops_in_prog.count("constexpr_blockwise_shift_scale") >= 6 else: assert get_op_types_in_program(prog) == expected_ops for linear_op in prog.find_ops(op_type="linear"): assert linear_op.weight.op.op_type == "constexpr_lut_to_dense" for conv_op in prog.find_ops(op_type="conv"): assert conv_op.weight.op.op_type == "constexpr_lut_to_dense" for quantize_op in prog.find_ops(op_type="constexpr_blockwise_shift_scale"): assert quantize_op.data.dtype == types.int8 assert quantize_op.scale.dtype == types.fp16 assert quantize_op.offset.dtype == types.int8 assert quantize_op.outputs[0].child_ops[0].op_type == "constexpr_lut_to_dense" for palettize_op in prog.find_ops(op_type="constexpr_lut_to_dense"): assert palettize_op.lut.dtype == types.fp16 assert palettize_op.indices.dtype == types.string_to_builtin(f"uint{nbits}") if _macos_version() >= (15, 0): verify_model_outputs(mlmodel, mlmodel_joint_palettized_quantized, coreml_input_values) @pytest.mark.parametrize( "compute_unit, backend", itertools.product(compute_units, backends), ) def test_joint_palettize_quantize_weights_invalid(self, compute_unit, backend): """Only support per-tensor quantization for this case.""" model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data_complex() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", minimum_deployment_target=backend.opset_version, compute_precision=ct.precision.FLOAT16 if backend.precision == "fp16" else ct.precision.FLOAT32, compute_units=compute_unit, ) palettize_config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig( mode="uniform", nbits=4, granularity="per_grouped_channel", group_size=1, weight_threshold=500, ) ) quant_config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpLinearQuantizerConfig( mode="linear", block_size=1, weight_threshold=500, ) ) mlmodel_palettized = cto.coreml.palettize_weights(mlmodel, palettize_config) with pytest.raises( NotImplementedError, match="When use joint compression for palettization-quantization, " "please make sure to use per-tensor quantization", ): cto.coreml.linear_quantize_weights( mlmodel_palettized, quant_config, joint_compression=True ) @pytest.mark.xfail( reason="rdar://131511244 Investigate Why Joint Prune x Anything are Failing on BNNS" ) @pytest.mark.parametrize( "compute_unit, backend, nbits, channel_group_size, target_sparsity", itertools.product( compute_units, backends, (3, 4, 8), (0, 1, 2), (0.2, 0.8), ), ) def test_joint_prune_palettize_quantize_weights( self, compute_unit, backend, nbits, channel_group_size, target_sparsity ): """ First prune to get sparse weight, and then palettize the non-sparse entries to get fp16 lut, and then quantize the lut to make int8 lut. 
""" model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data_complex() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", minimum_deployment_target=backend.opset_version, compute_precision=ct.precision.FLOAT16 if backend.precision == "fp16" else ct.precision.FLOAT32, compute_units=compute_unit, ) prune_config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpMagnitudePrunerConfig( target_sparsity=target_sparsity, weight_threshold=500 ) ) palettize_config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig( mode="kmeans", nbits=nbits, granularity="per_grouped_channel", group_size=channel_group_size, weight_threshold=500, ) ) quant_config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpLinearQuantizerConfig( mode="linear", dtype="int8", granularity="per_tensor", weight_threshold=200, # Need to be smaller than entries in lut (2**8=256). ) ) mlmodel_pruned = cto.coreml.prune_weights(mlmodel, prune_config) mlmodel_joint_pruned_palettized = cto.coreml.palettize_weights( mlmodel_pruned, palettize_config, joint_compression=True ) mlmodel_joint_pruned_palettized_quantized = cto.coreml.linear_quantize_weights( mlmodel_joint_pruned_palettized, quant_config, joint_compression=True ) expected_ops = ( [ "constexpr_blockwise_shift_scale", "constexpr_lut_to_sparse", "constexpr_sparse_to_dense", "conv", ] * 2 + ["reshape"] + [ "constexpr_blockwise_shift_scale", "constexpr_lut_to_sparse", "constexpr_sparse_to_dense", "linear", ] * 2 + ["constexpr_sparse_to_dense"] + [ "constexpr_blockwise_shift_scale", "constexpr_lut_to_sparse", "constexpr_sparse_to_dense", ] * 2 + ["lstm", "expand_dims", "expand_dims"] ) if nbits < 4 and channel_group_size == 0: # The lut tensor is too small, which is stored as immediate values. 
expected_ops = [ expected_op for expected_op in expected_ops if expected_op != "constexpr_blockwise_shift_scale" ] prog = mlmodel_joint_pruned_palettized_quantized._mil_program assert get_op_types_in_program(prog) == expected_ops for linear_op in prog.find_ops(op_type="linear"): assert linear_op.weight.op.op_type == "constexpr_sparse_to_dense" for conv_op in prog.find_ops(op_type="conv"): assert conv_op.weight.op.op_type == "constexpr_sparse_to_dense" for quantize_op in prog.find_ops(op_type="constexpr_blockwise_shift_scale"): assert types.builtin_to_string(quantize_op.data.dtype) == "int8" assert types.builtin_to_string(quantize_op.scale.dtype) == backend.precision assert types.builtin_to_string(quantize_op.offset.dtype) == "int8" assert quantize_op.outputs[0].child_ops[0].op_type == "constexpr_lut_to_sparse" for sparse_palettize_op in prog.find_ops(op_type="constexpr_lut_to_sparse"): assert ( types.builtin_to_string(sparse_palettize_op.indices_nonzero_data.dtype) == f"uint{nbits}" ) assert sparse_palettize_op.indices_mask.dtype == types.uint1 assert ( sparse_palettize_op.outputs[1].child_ops[0].op_type == "constexpr_sparse_to_dense" ) for sparse_op in prog.find_ops(op_type="constexpr_sparse_to_dense"): assert sparse_op.mask.dtype == types.uint1 assert types.builtin_to_string(sparse_op.nonzero_data.dtype) == backend.precision if _macos_version() >= (15, 0): atol = 5e-4 if compute_unit == ct.ComputeUnit.CPU_AND_GPU else 1e-6 verify_model_outputs( mlmodel, mlmodel_joint_pruned_palettized_quantized, coreml_input_values, atol=atol, ) class TestDecompressWeights: @staticmethod def test_weight_decopmression_coreml_optimize(): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data_complex() weight_1_sparse = create_sparse_weight(model.conv_1.weight, 0.5) weight_2_sparse = create_sparse_weight(model.conv_2.weight, 0.1) linear_1_unique = create_unique_weight(model.linear_1.weight, nbits=4) with torch.no_grad(): model.conv_1.weight = torch.nn.Parameter(torch.Tensor(weight_1_sparse)) model.conv_2.weight = torch.nn.Parameter(torch.Tensor(weight_2_sparse)) model.linear_1.weight = torch.nn.Parameter(torch.Tensor(linear_1_unique)) torchmodel = torch.jit.trace(model, torch_input_values) pipeline = ct.PassPipeline.DEFAULT_PRUNING # Add a palettization pass after the pruning pass. 
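# Inserting it directly after compression::prune_weights makes the converted model carry both sparse
# and palettized constexpr weights, which decompress_weights is then expected to remove.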
prune_pass_idx = pipeline.passes.index("compression::prune_weights") pipeline.insert_pass(prune_pass_idx + 1, "compression::palettize_weights") config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig(mode="unique"), ) pipeline.set_options("compression::palettize_weights", {"config": config}) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", pass_pipeline=pipeline, compute_precision=ct.precision.FLOAT32, minimum_deployment_target=ct.target.iOS16, ) decompressed_model = cto.coreml.decompress_weights(mlmodel) prog = decompressed_model._mil_program op_types = get_op_types_in_program(prog) for val in op_types: assert "constexpr" not in val if ct.utils._macos_version() < (13, 0): return # compared the numerical outputs output_dict = mlmodel.predict(coreml_input_values) de_output_dict = decompressed_model.predict(coreml_input_values) for k, v in output_dict.items(): assert k in de_output_dict np.testing.assert_allclose(v, de_output_dict[k]) class TestConvertMixedCompression: @staticmethod def test_convert_sparse_and_palettized_source_model_custom(): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data_complex() weight_1_sparse = create_sparse_weight(model.conv_1.weight, 0.5) weight_2_sparse = create_sparse_weight( model.conv_2.weight, 0.1 ) # the sparsity of 0.1 is filtered out by the minimum_sparsity_percentile linear_1_unique = create_unique_weight(model.linear_1.weight, nbits=4) with torch.no_grad(): model.conv_1.weight = torch.nn.Parameter(torch.Tensor(weight_1_sparse)) model.conv_2.weight = torch.nn.Parameter(torch.Tensor(weight_2_sparse)) model.linear_1.weight = torch.nn.Parameter(torch.Tensor(linear_1_unique)) torchmodel = torch.jit.trace(model, torch_input_values) pipeline = ct.PassPipeline.DEFAULT_PRUNING # Add a palettization pass after the pruning pass. prune_pass_idx = pipeline.passes.index("compression::prune_weights") pipeline.insert_pass(prune_pass_idx + 1, "compression::palettize_weights") config = cto.coreml.OptimizationConfig( global_config=cto.coreml.OpPalettizerConfig(mode="unique"), ) pipeline.set_options("compression::palettize_weights", {"config": config}) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", pass_pipeline=pipeline, compute_precision=ct.precision.FLOAT32, minimum_deployment_target=ct.target.iOS16, ) prog = mlmodel._mil_program expected_ops = [ "constexpr_sparse_to_dense", "constexpr_lut_to_dense", "conv", "conv", "reshape", "linear", "linear", "constexpr_sparse_to_dense", "squeeze", "lstm", "expand_dims", "expand_dims" ] assert get_op_types_in_program(prog) == expected_ops conv_ops = prog.find_ops(op_type="conv") assert conv_ops[0].weight.op.op_type == "constexpr_sparse_to_dense" assert conv_ops[1].weight.op.op_type == "const" linear_ops = prog.find_ops(op_type="linear") assert linear_ops[0].weight.op.op_type == "constexpr_lut_to_dense" assert linear_ops[1].weight.op.op_type == "const" class TestErrorHandling: @staticmethod def test_error_handling(): model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram") # Test invalid mode for affine quantization expected_err_str = "supported for weight affine quantization. 
Got mode" with pytest.raises(ValueError, match=expected_err_str): linear_quantize_weights(mlmodel, mode="invalid_mode") # Test invalid dtype for affine quantization expected_err_str = "Should be int4/8 or uint4/8, but got int32" with pytest.raises(ValueError, match=expected_err_str): linear_quantize_weights(mlmodel, dtype=np.int32) expected_err_str = "Should be int4/8 or uint4/8, but got int32" with pytest.raises(ValueError, match=expected_err_str): linear_quantize_weights(mlmodel, dtype="int32") # Test invalid threshold for weight sparsification expected_err_str = 'Invalid value of "threshold": \-1.0. Needs to be in \[0, inf\)' with pytest.raises(ValueError, match=expected_err_str): prune_weights(mlmodel, mode="threshold_based", threshold=-1.0) # Test invalid percentile for weight sparsification expected_err_str = "Invalid value of \"target_sparsity\": 1.2. Needs to be in \[0, 1\]" with pytest.raises(ValueError, match=expected_err_str): prune_weights(mlmodel, mode="percentile_based", target_sparsity=1.2) # Test invalid mode for weight palettization expected_err_str = "supported for weight palettization. Got \"mode\"" with pytest.raises(ValueError, match=expected_err_str): palettize_weights(mlmodel, mode="invalid_mode") # Test nbits must be provided for kmeans, uniform mode for weight palettization expected_err_str = "\"nbits\" must be provided for" with pytest.raises(ValueError, match=expected_err_str): palettize_weights(mlmodel, mode="kmeans") with pytest.raises(ValueError, match=expected_err_str): palettize_weights(mlmodel, mode="uniform") # Test nbits must not be provided for unique, custom mode for weight palettization expected_err_str = "\"nbits\" must NOT be provided for" with pytest.raises(ValueError, match=expected_err_str): palettize_weights(mlmodel, mode="unique", nbits=2) with pytest.raises(ValueError, match=expected_err_str): palettize_weights(mlmodel, mode="custom", nbits=2) # Test lut_function must be provided for custom mode, and must not be provided otherwise with pytest.raises(ValueError, match="\"lut_function\" can not be None, if \"mode\" is \"custom\"."): palettize_weights(mlmodel, mode="custom") with pytest.raises(ValueError, match="\"lut_function\" must be None, if \"mode\" is not \"custom\"."): palettize_weights(mlmodel, mode="unique", lut_function=lambda op: True) # Test lut_function must be a function object expected_err_str = "A function object must be provided as \"lut_function\"" with pytest.raises(ValueError, match=expected_err_str): palettize_weights(mlmodel, mode="custom", lut_function=1) @staticmethod def test_error_out_multifunction(): # prepare a mlmodel from a torch model model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data() torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert(torchmodel, inputs=inputs, convert_to="mlprogram") # make a multifunction model package_path = tempfile.mkdtemp(suffix=".mlpackage") mlmodel.save(package_path) desc = MultiFunctionDescriptor(package_path) desc.default_function_name = "main" multifunction_path = tempfile.mkdtemp(suffix=".mlpackage") save_multifunction(desc, multifunction_path) multifunction_mlmodel = ct.models.MLModel(multifunction_path) # all PTQ API should error out, until the radar is fixed: # rdar://126084385 ([Infra] Figure out the story of PTQ or other passes operate on loaded Mutli-function model) def run_palettization(mlmodel): return palettize_weights(mlmodel, nbits=2) for func in [ linear_quantize_weights, prune_weights, run_palettization, 
decompress_weights, ct.optimize.coreml.get_weights_metadata, ]: with pytest.raises(ValueError, match="is not supported for a multifunction model"): func(multifunction_mlmodel) # cleanup shutil.rmtree(package_path) shutil.rmtree(multifunction_path) class TestCoreMLWeightMetaData: """ This test includes unit tests for: 1. CoreMLWeightMetaData 2. coremltools.optimize.coreml.get_weights_metadata """ @staticmethod def test_coreml_weight_metadata_api(): """ Test the example in the CoreMLWeightMetaData api doc string. """ data = np.array([[1.0, 0.0], [0.0, 6.0]], dtype=np.float32) meta_data = CoreMLWeightMetaData(data) assert meta_data.val is data assert meta_data.sparsity == 0.5 assert meta_data.unique_values == 3 @staticmethod def test_get_weights_metadata(): """ Test the example in the get_weights_metadata functionality with op_type is None. """ model, inputs, torch_input_values, coreml_input_values = get_test_model_and_data_complex() weight_1_sparse = create_sparse_weight(model.conv_1.weight, 0.5) weight_2_sparse = create_sparse_weight(model.conv_2.weight, 0.8) linear_1_palettized = create_unique_weight(model.linear_1.weight, 2) linear_2_palettized = create_unique_weight(model.linear_2.weight, 4) with torch.no_grad(): model.conv_1.weight = torch.nn.Parameter(torch.Tensor(weight_1_sparse)) model.conv_2.weight = torch.nn.Parameter(torch.Tensor(weight_2_sparse)) model.linear_1.weight = torch.nn.Parameter(torch.Tensor(linear_1_palettized)) model.linear_2.weight = torch.nn.Parameter(torch.Tensor(linear_2_palettized)) torchmodel = torch.jit.trace(model, torch_input_values) mlmodel = ct.convert( torchmodel, inputs=inputs, convert_to="mlprogram", compute_precision=ct.precision.FLOAT32, minimum_deployment_target=ct.target.iOS16, ) # test the weight_threshold can filter out weights with size weight_threshold = 10 weight_metadata_dict = ct.optimize.coreml.get_weights_metadata( mlmodel, weight_threshold=weight_threshold ) for v in weight_metadata_dict.values(): assert v.val.size >= weight_threshold # test the functionality of using the returned meta data weight_metadata_dict = ct.optimize.coreml.get_weights_metadata(mlmodel) # get the weight names with size > 25600 large_weights = [] for k, v in weight_metadata_dict.items(): if v.val.size >= 25600: large_weights.append(k) # get the weight names with sparsity >= 50% sparse_weights = [] for k, v in weight_metadata_dict.items(): if v.sparsity >= 0.5: sparse_weights.append(k) # get the weight names with unique elements <= 16 palettized_weights = [] for k, v in weight_metadata_dict.items(): if v.unique_values <= 16: palettized_weights.append(k) meta_data_1 = weight_metadata_dict["conv_1_weight"] # testing expected_large_weights = [ "linear_2_weight", "concat_1", "concat_2", ] assert large_weights == expected_large_weights expected_sparse_weights = [ "conv_1_weight", "conv_2_weight", "op_59_lstm_h0_squeeze", ] assert sparse_weights == expected_sparse_weights expected_palettized_weights = [ "linear_1_weight", "linear_2_weight", "op_59_lstm_h0_squeeze", ] assert palettized_weights == expected_palettized_weights @staticmethod def test_get_weights_metadata_shared_weight(): """ Test the get_weights_metadata functionality for models with weight-sharing layers. 
""" def _test_child_ops(child_ops): assert len(child_ops) == 2 assert child_ops[0].name == "add_1" assert child_ops[0].op_type == "add" assert child_ops[0].params_name_mapping["y"] == "w_1" assert child_ops[1].name == "add_2" assert child_ops[1].op_type == "add" assert child_ops[1].params_name_mapping["y"] == "w_1" @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 30, 10, 10)), mb.TensorSpec(shape=(1, 30, 10, 10)), ], ) def prog(x, y): shared_weight = mb.const( val=np.random.rand(1, 30, 10, 10).astype(np.float32), name="w_1" ) x = mb.add(x=x, y=shared_weight, name="add_1") y = mb.add(x=y, y=shared_weight, name="add_2") return x, y mlmodel = ct.convert( prog, convert_to="mlprogram", compute_precision=ct.precision.FLOAT32, ) ops_metadata_dict = ct.optimize.coreml.get_weights_metadata( mlmodel, weight_threshold=100, ) assert len(ops_metadata_dict) == 1 child_ops = ops_metadata_dict["w_1"].child_ops _test_child_ops(child_ops) @staticmethod def test_get_weights_metadata_op_var_different_name(): """ For several rare corner cases, the const var and op have different names. Test that the API is correctly using the op's name. """ @mb.program( input_specs=[ mb.TensorSpec(shape=(1, 30, 10, 10)), ], ) def prog(x): shared_weight = mb.const( val=np.random.rand(1, 30, 10, 10).astype(np.float32), name="w_1" ) shared_weight.name = "w_1_new" x = mb.add(x=x, y=shared_weight, name="add_1") return x mlmodel = ct.convert( prog, convert_to="mlprogram", compute_precision=ct.precision.FLOAT32, ) ops_metadata_dict = ct.optimize.coreml.get_weights_metadata( mlmodel, weight_threshold=100, ) assert "w_1" in ops_metadata_dict assert ops_metadata_dict["w_1"].child_ops[0].params_name_mapping["y"] == "w_1" ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/coreml/test_utils.py0000644000000000000000000002123714672066616023717 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import numpy as np import pytest from coremltools.converters.mil.mil import types from coremltools.converters.mil.mil.ops.defs.iOS18.compression import constexpr_lut_to_dense from coremltools.optimize.coreml import _utils as optimize_utils class TestComputeQuantizationParams: @pytest.mark.parametrize( "quant_mode, rank, block_size", itertools.product( ["LINEAR", "LINEAR_SYMMETRIC"], [1, 2, 3], [0, 1, 2], ), ) def test_compute_qparams(self, quant_mode, rank, block_size): weight_shape = [10] * rank weight = np.random.randn(*weight_shape) ret = optimize_utils.compute_qparams( weight, nbits=8, signed=True, quantization_mode=quant_mode, dtype=np.int8, block_sizes=[block_size] * rank, ) if quant_mode == "LINEAR_SYMMETRIC": assert ret[-1] is None else: assert ret[-1] is not None assert ret[0].shape == weight.shape @pytest.mark.parametrize( "quant_mode, block_sizes", itertools.product( ["LINEAR", "LINEAR_SYMMETRIC"], [ [0], [4, 5], [3, 9], [4, 5, 6], ], ), ) def test_compute_qparams_failure(self, block_sizes, quant_mode): weight = np.random.randn(10, 10) with pytest.raises(AssertionError): ret = optimize_utils.compute_qparams( weight, nbits=8, signed=True, quantization_mode=quant_mode, dtype=np.int8, block_sizes=block_sizes, ) assert ret is not None class TestFindIndicesForLut: def test_basic(self): """ data: [3.01, -7.99, -8.01, 3.02, 3.89, -1.88, -2.02, -6.98] lut: [-8, -7, 3, 4, -2] expected indices: [2, 0, 0, 2, 3, 4, 4, 1] """ data = np.array([3.01, -7.99, -8.01, 3.02, 3.89, 0.98, 1.98, -6.98], dtype=np.float16) lut = np.array([-8, -7, 3, 4], dtype=np.int8).reshape((1, 4, 1)) expected_indices = np.array([2, 0, 0, 2, 3, 2, 2, 1], dtype=np.uint8) indices = optimize_utils.find_indices_for_lut(data, lut) np.testing.assert_array_equal(indices, expected_indices) assert types.builtin_to_string(types.numpy_type_to_builtin_type(indices.dtype)) == "uint2" @pytest.mark.parametrize( "nbits, block_sizes", itertools.product( (2, 3, 4, 8), ( [0], [1], [2], [2, 2], [1, 2, 1], [0, 2, 2], [4, 0, 0, 1], [8, 4, 2, 3], ), ), ) def test_stress(self, nbits, block_sizes): """ As finding indices is the reverse progress of generating data from lut, we first manually construct indices and lut, and then generate data from lut and salt it, and finally check if the restored indices are identical to the original indices. """ data_shape = [8, 4, 2, 3] lut_shape = data_shape + [2**nbits, 1] for idx, dim_size in enumerate(data_shape): if idx < len(block_sizes): lut_shape[idx] = 1 if block_sizes[idx] == 0 else data_shape[idx] // block_sizes[idx] nbits_range = types.type_mapping.builtin_to_range(types.string_to_builtin(f"uint{nbits}")) lut = np.arange(np.prod(lut_shape)).reshape(lut_shape).astype(np.float32) expected_indices = np.random.randint( low=nbits_range.low, high=nbits_range.high + 1, size=data_shape, dtype=np.uint8 ) data = constexpr_lut_to_dense.decompress(expected_indices, lut, vector_axis=None) # Salting the data to manually introduce numerical instability. 
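# The perturbation is at most 0.01 per element, far smaller than the spacing between lut entries
# (consecutive values from np.arange), so the nearest lut entry, and hence the recovered index,
# stays the same.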
data += np.random.randint(low=0, high=2, size=data.shape) * 0.01 data -= np.random.randint(low=0, high=2, size=data.shape) * 0.01 indices = optimize_utils.find_indices_for_lut(data, lut) np.testing.assert_array_equal(indices, expected_indices) assert ( types.builtin_to_string(types.numpy_type_to_builtin_type(indices.dtype)) == f"uint{nbits}" ) def test_vector_basic(self): """ data: [[3.01, -7.99, 2.02, -7.05], [3.02, -8.01, 1.89, -6.88]] lut: [[2, -7], [3, -8]] expected indices: [[1, 0], [0, 1]] """ data = np.array([[3.01, -7.99, 2.02, -7.05], [1.89, -6.88, 3.02, -8.01]], dtype=np.float16) lut = np.array([[2, -7], [3, -8]], dtype=np.int8).reshape((1, 1, 2, 2)) expected_indices = np.array([[1, 0], [0, 1]], dtype=np.uint8) indices = optimize_utils.find_indices_for_lut(data, lut, vector_axis=-1) np.testing.assert_array_equal(indices, expected_indices) assert types.builtin_to_string(types.numpy_type_to_builtin_type(indices.dtype)) == "uint1" @pytest.mark.parametrize( "nbits, vector_size, vector_axis, group_size", itertools.product( (2, 3, 4, 8), (1, 2, 4), (0, 1, -1), (0, 4), ), ) def test_vector_stress(self, nbits, vector_size, vector_axis, group_size): data_shape = [8, 16, 32] lut_shape = [1] * len(data_shape) if group_size > 0: lut_shape[vector_axis] = data_shape[vector_axis] // group_size lut_shape += [2**nbits, vector_size] nbits_range = types.type_mapping.builtin_to_range(types.string_to_builtin(f"uint{nbits}")) lut = np.arange(np.prod(lut_shape)).reshape(lut_shape).astype(np.float16) indices_shape = list(data_shape) indices_shape[vector_axis] //= vector_size expected_indices = np.random.randint( low=nbits_range.low, high=nbits_range.high + 1, size=indices_shape, dtype=np.uint8 ) data = constexpr_lut_to_dense.decompress(expected_indices, lut, vector_axis=vector_axis) # Salting the data to manually introduce numerical instability. data += np.random.randint(low=0, high=2, size=data.shape) * 0.01 data -= np.random.randint(low=0, high=2, size=data.shape) * 0.01 indices = optimize_utils.find_indices_for_lut(data, lut, vector_axis=vector_axis) np.testing.assert_array_equal(indices, expected_indices) assert ( types.builtin_to_string(types.numpy_type_to_builtin_type(indices.dtype)) == f"uint{nbits}" ) class TestPackUnpackBits: def test_pack_basic(self): """ Original data: [-8, 7, 3, 4, -2]. The 4-bit binary representation for those elements are: -8: 1000; 7: 0111; 3: 0011 4: 0100 -2: 1110 Hence the packed quantized_data will be 3 bytes long, i.e., 24 bits long, which is: 0111 1000 0100 0011 0000 1110 So the packed data is represented by 3 uint8 values: [120, 67, 14]. 
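The same scheme applied to [1, 2, 3, 4, 5] (see test_pack_basic_2 below) packs into
0010 0001, 0100 0011, 0000 0101, i.e. the three uint8 values [33, 67, 5].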
""" original_data = np.array([-8, 7, 3, 4, -2], dtype=np.int8) expected_packed_data = np.array([120, 67, 14], dtype=np.uint8) packed_data = optimize_utils.pack_elements_into_bits(original_data, nbits=4) np.testing.assert_array_equal(packed_data, expected_packed_data) def test_pack_basic_2(self): original_data = np.array([1, 2, 3, 4, 5], dtype=np.int8) expected_packed_data = np.array([33, 67, 5], dtype=np.uint8) packed_data = optimize_utils.pack_elements_into_bits(original_data, nbits=4) np.testing.assert_array_equal(packed_data, expected_packed_data) @pytest.mark.parametrize( "nbits, data_dtype, element_num", itertools.product(list(range(1, 9)), [np.int8, np.uint8], [1, 3, 20]), ) def test_round_trip_pack_unpack(self, nbits, data_dtype, element_num): is_data_signed = np.issubdtype(data_dtype, np.signedinteger) low, high = 0, 2**nbits if is_data_signed: low, high = -(2 ** (nbits - 1)), 2 ** (nbits - 1) original_data = np.random.randint(low=low, high=high, size=(element_num,)).astype( data_dtype ) packed_data = optimize_utils.pack_elements_into_bits(original_data, nbits) restored_data = optimize_utils.restore_elements_from_packed_bits( packed_data, nbits, element_num, are_packed_values_signed=is_data_signed ) np.testing.assert_array_equal(restored_data, original_data) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2815473 coremltools-8.0/coremltools/test/optimize/torch/0000755000000000000000000000000014672075535020777 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/__init__.py0000644000000000000000000000033314672066616023107 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/conftest.py0000644000000000000000000000613714672066616023205 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os import shutil import pytest from coremltools.test.optimize.torch.models.mnist import ( mnist_dataset, mnist_example_input, mnist_example_output, mnist_model, mnist_model_conv_transpose, mnist_model_large, mnist_model_quantization, residual_mnist_model, ) from coremltools.test.optimize.torch.pruning.pruning_utils import get_model_and_pruner # dummy function to use the imported fixtures so that linter # does not remove them as unused imports def _dummy( mnist_dataset, mnist_example_input, mnist_example_output, mnist_model, residual_mnist_model, mnist_model_large, mnist_model_quantization, get_model_and_pruner, mnist_model_conv_transpose, ): return ( mnist_dataset, mnist_example_input, mnist_example_output, mnist_model, residual_mnist_model, mnist_model_large, mnist_model_quantization, get_model_and_pruner, mnist_model_conv_transpose, ) def _datadir(request): # When using this fixture with parametrized tests, we end up with '[' and ']' characters in the pathname, which TF # is not happy with. Thus we should substitute these characters with a more universally accepted path character. 
safe_name = request.node.name.replace("[", "___").replace("]", "___") dir = test_data_path() / safe_name # noqa: F821 shutil.rmtree(str(dir), ignore_errors=True) os.makedirs(str(dir)) return dir @pytest.fixture def datadir(request): """ Directory for storing test data for latter inspection. """ return _datadir(request) @pytest.fixture def mock_name_main(monkeypatch): monkeypatch.setattr(__import__("__main__"), "__name__", "__main__") def pytest_addoption(parser): """ Adds command line option --runopt to the pytest parser By default, evaluates to False. If command line option passed, evaluates to True """ parser.addoption("--runopt", action="store_true", default=False, help="run optional tests") def pytest_configure(config): """ Adds info about optional marker to pytest config """ config.addinivalue_line("markers", "optional: mark test run as optional") def marker_names(item): """ Returns set containing the name of each marker associated with the given test item """ return set(m.name for m in item.iter_markers()) def pytest_collection_modifyitems(config, items): """ Modifies the test items so that items marked optional are skipped when the --runopt command line option is not provided. Otherwise, will not perform any modifications. """ # No modifications required if config.getoption("--runopt"): return skip_opt = pytest.mark.skip(reason="need --runopt option to run") for item in items: markers = marker_names(item) if "optional" in markers: item.add_marker(skip_opt) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2815473 coremltools-8.0/coremltools/test/optimize/torch/conversion/0000755000000000000000000000000014672075535023164 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/conversion/__init__.py0000644000000000000000000000033314672066616025274 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/conversion/conversion_utils.py0000644000000000000000000000625514672066616027153 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import sys import numpy as np import torch import coremltools as ct def convert_and_verify( pytorch_model, input_data, input_as_shape=False, pass_pipeline=None, minimum_deployment_target=ct.target.iOS18, expected_ops=None, ): """ Utility to: 1) Convert a PyTorch model to coreml format 2) Compare their outputs for numerical equivalence 3) Verify the converted model contains expected ops Args: input_as_shape: If true generates random input data with shape. 
expected_ops: List of MIL ops expected in the converted model Returns: Converted coreml model """ if input_as_shape: example_input = torch.rand(input_data) else: example_input = input_data # Generate converted model coreml_model = get_converted_model( pytorch_model, example_input, pass_pipeline, minimum_deployment_target ) assert coreml_model is not None # Verify converted model output matches torch model verify_model_outputs(pytorch_model, coreml_model, example_input) # Verify desired ops are present verify_ops(coreml_model, expected_ops) return coreml_model def get_converted_model( pytorch_model, input_data, pass_pipeline=None, minimum_deployment_target=ct.target.iOS17, ): """ Utility that takes a PyTorch model and converts it to a coreml model """ traced_model = torch.jit.trace(pytorch_model, example_inputs=(input_data,)) coreml_model = None try: coreml_model = ct.convert( traced_model, inputs=[ct.TensorType(shape=input_data.shape)], pass_pipeline=pass_pipeline, minimum_deployment_target=minimum_deployment_target, ) except Exception as err: print(f"Conversion Error: {err}") return coreml_model def verify_model_outputs(pytorch_model, coreml_model, input_value): """ This utility functions does the following checks: (1) Verify the output of the coreml model has the same shape of the PyTorch model (2) The PyTorch and coreml model have the same numerical outputs """ # Validate the output shape / type ref_output = pytorch_model(input_value) output = coreml_model._mil_program.functions["main"].outputs[0] assert ref_output.shape == output.shape # Cannot run predict on linux if sys.platform == "linux": return # Validate that the coreml model produces correct outputs pytorch_model.eval() ref_output_dict = pytorch_model(input_value) coreml_input_value = {"input_1": input_value.detach().numpy()} output_dict = coreml_model.predict(coreml_input_value) for k, v in output_dict.items(): np.testing.assert_allclose(v, output_dict[k]) def verify_ops(coreml_model, expected_ops): if not expected_ops: return for op in expected_ops: compressed_ops = coreml_model._mil_program.functions["main"].find_ops(op_type=op) assert len(compressed_ops) >= 1 ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2855473 coremltools-8.0/coremltools/test/optimize/torch/conversion/joint/0000755000000000000000000000000014672075535024307 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/conversion/joint/__init__.py0000644000000000000000000000033314672066616026417 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000020500000000000010212 xustar00111 path=coremltools-8.0/coremltools/test/optimize/torch/conversion/joint/test_joint_compression_conversion.py 22 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/conversion/joint/test_joint_compression_conversion.p0000644000000000000000000000661214672066616033545 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import pytest ct = pytest.importorskip("coremltools") import coremltools.test.optimize.torch.conversion.conversion_utils as util from coremltools.optimize.torch.layerwise_compression import ( LayerwiseCompressor, LayerwiseCompressorConfig, ) from coremltools.optimize.torch.palettization import DKMPalettizer, DKMPalettizerConfig from coremltools.optimize.torch.pruning import MagnitudePruner, MagnitudePrunerConfig from coremltools.optimize.torch.quantization import LinearQuantizer, LinearQuantizerConfig @pytest.mark.skipif(ct.utils._macos_version() < (15, 0), reason="Only supported on macOS 15+") def test_joint_pruning_quantization(mnist_model, mnist_example_input): example_input = mnist_example_input quant_config = LinearQuantizerConfig.from_dict( { "global_config": { "milestones": [0, 0, 10, 10], } } ) prune_config = MagnitudePrunerConfig.from_dict({"global_config": {"target_sparsity": 0.5}}) quantizer = LinearQuantizer(mnist_model, quant_config) quant_model = quantizer.prepare(example_inputs=(example_input,)) pruner = MagnitudePruner(quant_model, prune_config) pruned_quant_model = pruner.prepare(inplace=True) quantizer.step() pruner.step() # Do a forward pass for pruner mask to be applied # Alternatively can set initial sparsity to target sparsity pruned_quant_model(example_input) quant_finalized_model = quantizer.finalize(inplace=True) finalized_model = pruner.finalize(quant_finalized_model) util.convert_and_verify( finalized_model, example_input, minimum_deployment_target=ct.target.iOS18, expected_ops=[ "constexpr_sparse_to_dense", "constexpr_sparse_blockwise_shift_scale", ], ) @pytest.mark.skipif(ct.utils._macos_version() < (15, 0), reason="Only supported on macOS 15+") @pytest.mark.parametrize( "config, expected_ops", [ pytest.param( {"global_config": {"algorithm": "sparse_gpt"}}, ["constexpr_sparse_to_dense"], id="pruning", ), pytest.param( {"global_config": {"algorithm": "sparse_gpt", "weight_dtype": "uint4"}}, ["constexpr_sparse_to_dense", "constexpr_sparse_blockwise_shift_scale"], id="joint_pruning_quantization", ), pytest.param( { "global_config": { "algorithm": "sparse_gpt", "weight_dtype": "uint4", "enable_normal_float": True, } }, ["constexpr_sparse_to_dense", "constexpr_lut_to_sparse"], id="joint_pruning_palettization", ), ], ) def test_sparsegpt(config, mnist_model, mnist_example_input, expected_ops): compressor_config = LayerwiseCompressorConfig.from_dict(config) compressor = LayerwiseCompressor(mnist_model, compressor_config) def calibration_loader(): yield mnist_example_input compressed_model = compressor.compress(calibration_loader(), device="cpu") util.convert_and_verify( compressed_model, mnist_example_input, expected_ops=expected_ops, ) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2855473 coremltools-8.0/coremltools/test/optimize/torch/conversion/palettization/0000755000000000000000000000000014672075535026053 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/conversion/palettization/__init__.py0000644000000000000000000000033314672066616030163 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000021100000000000010207 xustar00115 path=coremltools-8.0/coremltools/test/optimize/torch/conversion/palettization/test_palettization_conversion.py 22 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/conversion/palettization/test_palettization_conversi0000644000000000000000000003424414672066616033643 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed, by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import pytest import torch import torch.nn as nn import coremltools as ct import coremltools.test.optimize.torch.conversion.conversion_utils as util from coremltools.optimize.torch.palettization import ( DKMPalettizer, DKMPalettizerConfig, PostTrainingPalettizer, PostTrainingPalettizerConfig, SKMPalettizer, SKMPalettizerConfig, ) from coremltools.test.optimize.torch.utils import count_unique_params ct = pytest.importorskip("coremltools") cto = pytest.importorskip("coremltools.optimize") # region per_tensor @pytest.mark.parametrize( "config, lut_shape_map", [ # Exclude testing for 8/6 bits since all ops in MNIST get skipped for 8/6-bit palettization. pytest.param( { "global_config": {"n_bits": 4, "granularity": "per_tensor"}, }, { "conv1": (1, 1, 1, 1, 16, 1), "conv2": (1, 1, 1, 1, 16, 1), "dense1": (1, 1, 16, 1), "dense2": (1, 1, 16, 1), }, id="4bits", ), pytest.param( { "global_config": {"n_bits": 2, "granularity": "per_tensor"}, }, { "conv1": (1, 1, 1, 1, 4, 1), "conv2": (1, 1, 1, 1, 4, 1), "dense1": (1, 1, 4, 1), "dense2": (1, 1, 4, 1), }, id="2bits", ), ], ) @pytest.mark.parametrize("algorithm", ["SKM", "PTP", "DKM"]) def test_palettization_per_tensor( mnist_model, mnist_example_input, mnist_example_output, config, lut_shape_map, algorithm, ): if algorithm == "DKM": # Skip compressing all layers for DKM to reduce test time config["module_name_configs"] = {"conv1": None, "dense1": None} compressed_model = get_compressed_model( algorithm, mnist_model, mnist_example_input, mnist_example_output, config ) weight_sample = compressed_model.conv2.weight.detach() # per tensor # Validate on torch model. _n_bits = config["global_config"]["n_bits"] max_unique_values = 2**_n_bits assert count_unique_params(torch.unique(weight_sample)) <= max_unique_values if ct.utils._macos_version() < (15, 0): return # Convert and validate on coreml model. compressed_model_coreml = util.convert_and_verify( compressed_model, mnist_example_input, expected_ops=["constexpr_lut_to_dense"], ) verify_op_constexpr_lut_to_dense(compressed_model_coreml, lut_shape_map) # endregion # region per_channel_scale @pytest.mark.parametrize( "config, lut_shape_map", [ # Exclude testing for 8/6 bits since all ops in MNIST get skipped for 8/6-bit palettization. 
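# The expected lut shape is (groups per weight dimension ..., 2**n_bits, vector_size); with
# group_size=1 and per-channel scales enabled, every output channel gets its own 2**n_bits-entry
# table (e.g. 32 tables for conv1, as listed below).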
pytest.param( { "global_config": { "n_bits": 4, "granularity": "per_grouped_channel", "group_size": 1, "enable_per_channel_scale": True, }, }, { "conv1": (32, 1, 1, 1, 16, 1), "conv2": (64, 1, 1, 1, 16, 1), "dense1": (1024, 1, 16, 1), "dense2": (10, 1, 16, 1), }, id="4bits", ), pytest.param( { "global_config": { "n_bits": 2, "granularity": "per_grouped_channel", "group_size": 1, "enable_per_channel_scale": True, }, }, { "conv1": (32, 1, 1, 1, 4, 1), "conv2": (64, 1, 1, 1, 4, 1), "dense1": (1024, 1, 4, 1), "dense2": (10, 1, 4, 1), }, id="2bits", ), ], ) @pytest.mark.parametrize("algorithm", ["SKM", "PTP", "DKM"]) def test_palettization_per_channel_scale( mnist_model, mnist_example_input, mnist_example_output, config, lut_shape_map, algorithm, ): if algorithm == "DKM": # Skip compressing all layers for DKM to reduce test time config["module_name_configs"] = {"conv1": None, "dense1": None} compressed_model = get_compressed_model( algorithm, mnist_model, mnist_example_input, mnist_example_output, config ) # Validate on torch model. for i in range(32): weight_sample = compressed_model.conv2.weight[i].detach() # per channel _n_bits = config["global_config"]["n_bits"] max_unique_values = 2**_n_bits assert count_unique_params(torch.unique(weight_sample)) <= max_unique_values if ct.utils._macos_version() < (15, 0): return compressed_model_coreml = util.convert_and_verify( compressed_model, mnist_example_input, expected_ops=["constexpr_lut_to_dense"], ) verify_op_constexpr_lut_to_dense(compressed_model_coreml, lut_shape_map) # endregion # region grouped_channelwise @pytest.mark.parametrize( "config, lut_shape_map", [ pytest.param( { "global_config": { "n_bits": 4, "granularity": "per_grouped_channel", "group_size": 16, "channel_axis": 0, }, }, { "conv1": (2, 1, 1, 1, 16, 1), "conv2": (4, 1, 1, 1, 16, 1), "dense1": (64, 1, 16, 1), }, id="4bits_group_size_16_axis_0", ), pytest.param( { "global_config": { "n_bits": 4, "granularity": "per_grouped_channel", "group_size": 16, "channel_axis": 1, }, }, { "conv2": (1, 2, 1, 1, 16, 1), "dense1": (1, 196, 16, 1), "dense2": (1, 64, 16, 1), }, id="4bits_group_size_16_axis_1", ), ], ) @pytest.mark.parametrize("algorithm", ["SKM", "PTP", "DKM"]) def test_palettization_grouped_channelwise( mnist_model, mnist_example_input, mnist_example_output, config, lut_shape_map, algorithm, ): if algorithm == "DKM": # DKM API currently does not support channel_axis. by default axis is 0 if config["global_config"]["channel_axis"] == 1: # skip test return else: # remove channel_axis key, which will default to axis 0 del config["global_config"]["channel_axis"] # Skip compressing all layers for DKM to reduce test time config["module_name_configs"] = {"conv1": None, "dense1": None} compressed_model = get_compressed_model( algorithm, mnist_model, mnist_example_input, mnist_example_output, config ) # Validate on torch model. 
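# Each group of `group_size` channels along the configured axis should contain at most
# 2**n_bits distinct weight values after palettization.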
_group_size = config["global_config"]["group_size"] _axis = config["global_config"]["channel_axis"] if algorithm != "DKM" else 0 for i in range(0, _group_size, 32): if _axis == 1: # blocking along input channel axis weight_sample = compressed_model.conv2.weight[:, i : i + _group_size].detach() else: # blocking along output channel axis weight_sample = compressed_model.conv2.weight[i : i + _group_size].detach() _n_bits = config["global_config"]["n_bits"] max_unique_values = 2**_n_bits assert count_unique_params(torch.unique(weight_sample)) <= max_unique_values if ct.utils._macos_version() < (15, 0): return compressed_model_coreml = util.convert_and_verify( compressed_model, mnist_example_input, expected_ops=["constexpr_lut_to_dense"], ) verify_op_constexpr_lut_to_dense(compressed_model_coreml, lut_shape_map) # endregion # region vector @pytest.mark.parametrize( "config, lut_shape_map", [ pytest.param( { "module_name_configs": { "conv2": { "n_bits": 4, "granularity": "per_tensor", "cluster_dim": 4, } }, }, { "conv2": (1, 1, 1, 1, 16, 4), }, marks=pytest.mark.xfail( reason="rdar://124474258 ([Compression] Support Vector Palettization in coremltools)" ), id="4bits_vector_4", ), ], ) @pytest.mark.parametrize("algorithm", ["SKM", "PTP", "DKM"]) def test_palettization_vector( mnist_model, mnist_example_input, mnist_example_output, config, lut_shape_map, algorithm, ): compressed_model = get_compressed_model( algorithm, mnist_model, mnist_example_input, mnist_example_output, config ) # Validate on torch model. _cluster_dim = config["module_name_configs"]["conv2"]["cluster_dim"] weight_sample = ( compressed_model.conv2.weight.flatten(1).transpose(0, 1).reshape(-1, _cluster_dim) ) _n_bits = config["module_name_configs"]["conv2"]["n_bits"] max_unique_values = 2**_n_bits assert len(torch.unique(weight_sample, dim=0)) <= max_unique_values # test compression metadata is available assert getattr(compressed_model.conv2, "_COREML_/weight/vector_axis") == torch.tensor(0) if ct.utils._macos_version() < (15, 0): return compressed_model_coreml = util.convert_and_verify( compressed_model, mnist_example_input, expected_ops=["constexpr_lut_to_dense"], ) verify_op_constexpr_lut_to_dense(compressed_model_coreml, lut_shape_map) # endregion @pytest.mark.parametrize( "config, lut_shape_map", [ pytest.param( { "global_config": { "n_bits": 4, "granularity": "per_tensor", }, }, { "conv1": (1, 1, 1, 1, 16, 1), "conv2": (1, 1, 1, 1, 16, 1), "dense1": (1, 1, 16, 1), "dense2": (1, 1, 16, 1), }, id="4bits_per_tensor", ), pytest.param( { "global_config": { "n_bits": 4, "granularity": "per_grouped_channel", "group_size": 16, }, }, { "conv1": (2, 1, 1, 1, 16, 1), "conv2": (4, 1, 1, 1, 16, 1), "dense1": (64, 1, 16, 1), }, id="4bits_group_size_16_axis_0", ), ], ) @pytest.mark.parametrize("lut_dtype", ["int8", "uint8"]) @pytest.mark.parametrize("algorithm", ["SKM", "PTP", "DKM"]) @pytest.mark.xfail( reason="rdar://126355261 ([Compression] Support LUT with 8bit values Model Conversion)", ) def test_palettization_int8_lut( mnist_model, mnist_example_input, mnist_example_output, config, lut_shape_map, lut_dtype, algorithm, ): config["global_config"]["lut_dtype"] = lut_dtype if algorithm == "DKM": # Skip compressing all layers for DKM to reduce test time config["module_name_configs"] = {"conv1": None, "dense1": None} compressed_model = get_compressed_model( algorithm, mnist_model, mnist_example_input, mnist_example_output, config ) compressed_model_coreml = util.convert_and_verify( compressed_model, mnist_example_input, 
expected_ops=["constexpr_lut_to_dense"], ) verify_op_constexpr_lut_to_dense(compressed_model_coreml, lut_shape_map) # endregion # region HelperMethods def get_compressed_model(algorithm, mnist_model, mnist_example_input, mnist_example_output, config): if algorithm == "DKM": return get_compressed_model_for_dkm(mnist_model, mnist_example_input, config) elif algorithm == "SKM": return get_compressed_model_for_skm( mnist_model, mnist_example_input, mnist_example_output, config ) elif algorithm == "PTP": return get_compressed_model_for_ptp(mnist_model, config) else: print("Unsupported compression algorithm: ", algorithm) return None # Get a compressed MNIST model with DKMPalettizer and sample data. def get_compressed_model_for_dkm(mnist_model, mnist_example_input, config): palettizer_config = DKMPalettizerConfig.from_dict(config) palettizer = DKMPalettizer(mnist_model, palettizer_config) prepared_model = palettizer.prepare(inplace=True) palettizer.step() prepared_model(mnist_example_input) model = palettizer.finalize() return model # Get a compressed MNIST model with SKMPalettizer and calibration data. def get_compressed_model_for_skm(mnist_model, mnist_example_input, mnist_example_output, config): palettizer_config = SKMPalettizerConfig.from_dict(config) def calibration_loader(): yield mnist_example_input def loss_fn(mnist_model, mnist_example_input): out = mnist_model(mnist_example_input) return nn.functional.mse_loss(out, mnist_example_output) compressor = SKMPalettizer(mnist_model, palettizer_config) compressed_model = compressor.compress(dataloader=calibration_loader(), loss_fn=loss_fn) return compressed_model # Get a compressed MNIST model with PostTrainingPalettization def get_compressed_model_for_ptp(mnist_model, config): palettizer_config = PostTrainingPalettizerConfig.from_dict(config) compressor = PostTrainingPalettizer(mnist_model, palettizer_config) compressed_model = compressor.compress() return compressed_model def verify_op_constexpr_lut_to_dense(coreml_model, per_layer_lut_shape): compressed_ops = coreml_model._mil_program.functions["main"].find_ops( op_type="constexpr_lut_to_dense" ) assert len(compressed_ops) >= 1 # Verify if number of bits is correct. # For palettization, it's hidden in the shape of LUT. for compressed_op in compressed_ops: layer_name = compressed_op.name.split("_weight")[0] assert compressed_op.lut.shape == per_layer_lut_shape[layer_name] # endregion ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2855473 coremltools-8.0/coremltools/test/optimize/torch/conversion/pruning/0000755000000000000000000000000014672075535024646 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/conversion/pruning/__init__.py0000644000000000000000000000033314672066616026756 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/conversion/pruning/test_pruning_conversion.py0000644000000000000000000000453214672066616032212 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import pytest ct = pytest.importorskip("coremltools") import coremltools.test.optimize.torch.conversion.conversion_utils as util from coremltools.optimize.torch.pruning import MagnitudePruner, MagnitudePrunerConfig # region MagnitudePruner @pytest.mark.parametrize( "config", [ pytest.param( { "global_config": { "initial_sparsity": 0.5, "target_sparsity": 0.5, } }, id="unstructured_sparsity", ), pytest.param( { "global_config": { "initial_sparsity": 0.5, "target_sparsity": 0.5, "block_size": 2, } }, id="block_structured_sparsity", ), pytest.param( { "global_config": { "initial_sparsity": 0.5, "target_sparsity": 0.5, "n_m_ratio": (1, 2), } }, id="n_m_structured_sparsity", ), pytest.param( { "global_config": { "initial_sparsity": 0.5, "target_sparsity": 0.5, "granularity": "per_channel", } }, id="general_structured_sparsity", ), ], ) @pytest.mark.skipif(ct.utils._macos_version() < (15, 0), reason="Only supported on macOS 15+") def test_magnitude_pruner(config, mnist_model, mnist_example_input): pruner_config = MagnitudePrunerConfig.from_dict(config) pruner = MagnitudePruner(mnist_model, pruner_config) pruned_model = get_pruned_model(pruner) util.convert_and_verify( pruned_model, mnist_example_input, pass_pipeline=ct.PassPipeline.DEFAULT_PRUNING, expected_ops=["constexpr_sparse_to_dense"], ) # endregion # region GlobalUnstructuredPruner # endregion # region STRPruner # endregion # region HelperMethods def get_pruned_model(pruner): pruner.prepare(inplace=True) pruner.step() return pruner.finalize() # endregion ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2855473 coremltools-8.0/coremltools/test/optimize/torch/conversion/quantization/0000755000000000000000000000000014672075535025712 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/conversion/quantization/__init__.py0000644000000000000000000000033314672066616030022 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000020700000000000010214 xustar00113 path=coremltools-8.0/coremltools/test/optimize/torch/conversion/quantization/test_quantization_conversion.py 22 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/conversion/quantization/test_quantization_conversion0000644000000000000000000001170514672066616033673 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import pytest ct = pytest.importorskip("coremltools") import coremltools.test.optimize.torch.conversion.conversion_utils as util from coremltools.optimize.torch.layerwise_compression import ( LayerwiseCompressor, LayerwiseCompressorConfig, ) from coremltools.optimize.torch.quantization import LinearQuantizer, LinearQuantizerConfig # region LinearQuantizer @pytest.mark.parametrize( "config", [ pytest.param( {"global_config": {"quantization_scheme": "symmetric"}}, id="symmetric_per_tensor", ), pytest.param({"global_config": {"quantization_scheme": "affine"}}, id="affine_per_tensor"), pytest.param( { "global_config": { "weight_dtype": "qint4", "quantization_scheme": "symmetric", } }, id="4bit_symmetric_per_tensor", ), pytest.param( { "global_config": { "weight_dtype": "qint4", "quantization_scheme": "affine", } }, id="4bit_affine_per_tensor", ), ], ) @pytest.mark.skipif(ct.utils._macos_version() < (15, 0), reason="Only supported on macOS 15+") @pytest.mark.parametrize("model", ["mnist_model", "mnist_model_conv_transpose"]) def test_linear_quantizer(config, model, mnist_example_input, request): quantizer_config = LinearQuantizerConfig.from_dict(config) quantizer = LinearQuantizer(request.getfixturevalue(model), quantizer_config) quantized_model = get_quantized_model(quantizer, mnist_example_input) util.convert_and_verify( quantized_model, mnist_example_input, expected_ops=["constexpr_blockwise_shift_scale"], ) # endregion # region GPTQ @pytest.mark.parametrize( "config", [ pytest.param( {"global_config": {"algorithm": "gptq", "weight_dtype": "uint4"}}, id="4bit", ), pytest.param( { "global_config": { "algorithm": "gptq", "weight_dtype": "uint8", "block_size": 32, "granularity": "per_block", } }, id="blockwise", ), pytest.param( { "global_config": { "algorithm": "gptq", "weight_dtype": "uint4", "block_size": 32, "granularity": "per_block", } }, id="4bit_blockwise", ), ], ) @pytest.mark.skipif(ct.utils._macos_version() < (15, 0), reason="Only supported on macOS 15+") def test_gptq(config, mnist_model, mnist_example_input): compressor_config = LayerwiseCompressorConfig.from_dict(config) compressor = LayerwiseCompressor(mnist_model, compressor_config) def calibration_loader(): yield mnist_example_input compressed_model = compressor.compress(calibration_loader(), device="cpu") util.convert_and_verify( compressed_model, mnist_example_input, expected_ops=["constexpr_blockwise_shift_scale"], ) # endregion # region PTQ @pytest.mark.parametrize( "config", [ pytest.param( {"global_config": {"weight_dtype": "int4", "granularity": "per_tensor"}}, id="4bit_per_tensor", ), pytest.param( {"global_config": {"weight_dtype": "int4", "granularity": "per_channel"}}, id="4bit_per_channel", ), pytest.param( { "global_config": { "weight_dtype": "int4", "granularity": "per_block", "block_size": 16, } }, id="4bit_per_block", ), ], ) @pytest.mark.skipif(ct.utils._macos_version() < (15, 0), reason="Only supported on macOS 15+") def test_ptq(mnist_model, mnist_example_input, config): pytest.importorskip("coremltools.optimize.coreml._utils") from coremltools.optimize.torch.quantization.post_training_quantization import ( PostTrainingQuantizer, PostTrainingQuantizerConfig, ) model = mnist_model ptq_config = PostTrainingQuantizerConfig.from_dict(config) ptq = PostTrainingQuantizer(model, ptq_config) compressed_model = ptq.compress() util.convert_and_verify( 
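# Post-training quantized weights are expected to lower to constexpr_blockwise_shift_scale ops.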
compressed_model, mnist_example_input, expected_ops=["constexpr_blockwise_shift_scale"], ) # endregion # region HelperMethods def get_quantized_model(quantizer, example_input): quantizer.prepare(example_inputs=(example_input,), inplace=True) quantizer.step() model = quantizer.finalize() return model # endregion ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2855473 coremltools-8.0/coremltools/test/optimize/torch/layerwise_compression/0000755000000000000000000000000014672075535025424 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/layerwise_compression/__init__.py0000644000000000000000000000033314672066616027534 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/layerwise_compression/test_algorithms.py0000644000000000000000000002666614672066616031226 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from contextlib import nullcontext as does_not_raise import pytest import torch import torch.nn as nn from attr import define, field, validators from coremltools.optimize.torch._utils.metadata_utils import ( METADATA_VERSION, METADATA_VERSION_BUFFER, CompressionMetadata, CompressionType, ) from coremltools.optimize.torch.layerwise_compression import ( LayerwiseCompressor, LayerwiseCompressorConfig, ) from coremltools.optimize.torch.layerwise_compression._quant import Quantizer from coremltools.optimize.torch.layerwise_compression.algorithms import ( GPTQ, LayerwiseCompressionAlgorithmConfig, ModuleGPTQConfig, ModuleSparseGPTConfig, ) @pytest.mark.parametrize( "global_config_and_class", [ ({"algorithm": "gptq", "weight_dtype": "uint4"}, ModuleGPTQConfig), ( { "algorithm": "sparse_gpt", "weight_dtype": "uint8", "target_sparsity": 0.25, }, ModuleSparseGPTConfig, ), ], ) def test_obs_compression_algorithm_config(global_config_and_class): """ Test the registry-based configuration of the :py:class:`LayerwiseCompressionAlgorithmConfig` using :py:class:`LayerwiseCompressorConfig` """ global_config, class_type = global_config_and_class # compress config = LayerwiseCompressorConfig.from_dict( { "global_config": global_config, "input_cacher": "default", "calibration_nsamples": 128, } ) algo = global_config.get("algorithm") algo_class = LayerwiseCompressionAlgorithmConfig.get_class(algo) assert algo_class == class_type assert isinstance(config.global_config, class_type) def test_custom_obs_compression_algorithm_config(): @LayerwiseCompressionAlgorithmConfig.register("foo") @define class FooConfig(LayerwiseCompressionAlgorithmConfig): bar: str = field(default=None, validator=validators.instance_of(str)) algorithm: str = field(default="foo", validator=validators.instance_of(str)) config = LayerwiseCompressorConfig.from_dict( {"global_config": {"algorithm": "foo", "bar": "baz"}} ) assert isinstance(config.global_config, FooConfig) assert config.global_config.bar == "baz" @pytest.mark.parametrize( "input_size, expectation", [ (512, does_not_raise()), (1024, 
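# With block_size=128: 512 and 1024 divide evenly and are accepted; 480 and 960 do not and should raise.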
does_not_raise()), (480, pytest.raises(ValueError)), (960, pytest.raises(ValueError)), ], ) def test_block_size_validation_gptq(input_size, expectation): """ Test handling of block_size configuration for GPTQ algorithm """ config = ModuleGPTQConfig.from_dict( { "algorithm": "gptq", "weight_dtype": "uint8", "block_size": 128, "granularity": "per_block", } ) _model = nn.Transformer(d_model=input_size, nhead=8) layer = _model.encoder.layers.get_submodule("0.linear1") with expectation: gptq = GPTQ(layer, config) assert gptq is not None @pytest.mark.parametrize("block_size", [32, 1024]) def test_blockwise_compression_gptq(block_size): model = nn.Sequential(nn.Linear(256, 100)) config = ModuleGPTQConfig.from_dict( { "algorithm": "gptq", "weight_dtype": "uint8", "block_size": block_size, "granularity": "per_block", } ) compressor_config = LayerwiseCompressorConfig().set_global(global_config=config) compressor = LayerwiseCompressor(model, compressor_config) def dataloader(): yield torch.rand(10, 256) compressed_model = compressor.compress(dataloader=dataloader(), device="cpu") if model[0].weight.shape[1] % block_size != 0: # No compression; layer skipped assert torch.equal(compressed_model[0].weight, model[0].weight) else: # Compression should have occurred assert not torch.equal(compressed_model[0].weight, model[0].weight) @pytest.mark.parametrize( "config", [ {"global_config": {"algorithm": "gptq", "weight_dtype": "uint4"}}, { "global_config": { "algorithm": "gptq", "weight_dtype": "uint8", "block_size": 16, "granularity": "per_block", } }, { "global_config": { "algorithm": "gptq", "weight_dtype": "uint4", "enable_normal_float": True, } }, { "global_config": { "algorithm": "gptq", "weight_dtype": "uint3", "enable_normal_float": True, } }, ], ) @pytest.mark.parametrize( "model, input_shape", [ (nn.Sequential(nn.Linear(4096, 1024)), (1, 4096)), (nn.Sequential(nn.Conv2d(32, 64, 3)), (1, 32, 224, 224)), ], ) def test_gptq_metadata(config, model, input_shape): """ Test registration of metadata buffers for GPTQ algorithm """ # Setup to get compressed model compressor_config = LayerwiseCompressorConfig.from_dict(config) compressor = LayerwiseCompressor(model, compressor_config) def calibration_loader(): yield torch.rand(input_shape) compressed_model = compressor.compress(calibration_loader(), device="cpu") # Extract registered metadata from state_dict state_dict = compressed_model[0].state_dict() metadata_dict = CompressionMetadata.from_state_dict(state_dict) assert len(metadata_dict) == 1 assert "weight" in metadata_dict # Verification metadata = metadata_dict["weight"] if compressor_config.global_config.enable_normal_float: assert metadata.compression_type == [CompressionType.palettization.value] assert metadata.lut.shape == (1,) * state_dict["weight"].dim() + ( 2**compressor_config.global_config.weight_n_bits, 1, ) assert metadata.palettization_scale.shape == (state_dict["weight"].shape[0], 1) else: assert metadata.compression_type == [CompressionType.quantization.value] assert metadata.quantization_n_bits == compressor_config.global_config.weight_n_bits assert metadata.zero_point.shape == metadata.quantization_scale.shape assert metadata.quantization_scale.shape[0] == state_dict["weight"].shape[0] block_size = compressor_config.global_config.block_size if block_size is None: assert metadata.quantization_scale.shape[1] == 1 else: assert ( metadata.quantization_scale.shape[1] == state_dict["weight"].shape[1] / block_size ) assert METADATA_VERSION_BUFFER in compressed_model.state_dict() assert 
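# Recompute the reference scale for this block; it should match the corresponding column of the stored quantization_scale buffer.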
torch.equal(compressed_model.state_dict()[METADATA_VERSION_BUFFER], METADATA_VERSION) @pytest.mark.parametrize( "config", [ pytest.param({"global_config": {"algorithm": "sparse_gpt"}}, id="pruning"), pytest.param( {"global_config": {"algorithm": "sparse_gpt", "weight_dtype": "uint8"}}, id="pruning_quantization", ), pytest.param( { "global_config": { "algorithm": "sparse_gpt", "weight_dtype": "uint4", "enable_normal_float": True, } }, id="pruning_palettization", ), ], ) def test_sparse_gpt_metadata(config): """ Test registration of metadata buffers for GPTQ algorithm """ # Setup to get compressed model model = nn.Sequential(nn.Linear(4096, 1024)) compressor_config = LayerwiseCompressorConfig.from_dict(config) compressor = LayerwiseCompressor(model, compressor_config) def calibration_loader(): yield torch.rand(1, 4096) compressed_model = compressor.compress(calibration_loader(), device="cpu") # Extract registered metadata from state_dict state_dict = compressed_model[0].state_dict() metadata_dict = CompressionMetadata.from_state_dict(state_dict) assert len(metadata_dict) == 1 assert "weight" in metadata_dict # Verification metadata = metadata_dict["weight"] if compressor_config.global_config.enable_normal_float: assert metadata.compression_type == [ CompressionType.pruning.value, CompressionType.palettization.value, ] assert metadata.lut.shape == ( 1, 1, 2**compressor_config.global_config.weight_n_bits, 1, ) assert metadata.palettization_scale.shape == (state_dict["weight"].shape[0], 1) elif ( compressor_config.global_config.weight_n_bits is not None and compressor_config.global_config.weight_n_bits < 16 ): assert metadata.compression_type == [ CompressionType.pruning.value, CompressionType.quantization.value, ] assert metadata.quantization_n_bits == compressor_config.global_config.weight_n_bits assert metadata.zero_point.shape == metadata.quantization_scale.shape assert METADATA_VERSION_BUFFER in compressed_model.state_dict() assert torch.equal(compressed_model.state_dict()[METADATA_VERSION_BUFFER], METADATA_VERSION) @pytest.mark.parametrize( "config", [ { "global_config": { "algorithm": "gptq", "weight_dtype": "uint8", "block_size": 16, "granularity": "per_block", } }, { "global_config": { "algorithm": "gptq", "weight_dtype": "uint8", "block_size": None, "granularity": "per_block", } }, ], ) def test_gptq_block_size_configs(config): model = nn.Sequential(nn.Linear(4096, 1024)) compressor_config = LayerwiseCompressorConfig.from_dict(config) compressor = LayerwiseCompressor(model, compressor_config) def calibration_loader(): yield torch.rand(1, 4096) compressed_model = compressor.compress(calibration_loader(), device="cpu") def test_gptq_static_blocking(): model = nn.Sequential(nn.Linear(6, 8)) compressor_config = LayerwiseCompressorConfig.from_dict( { "global_config": { "algorithm": "gptq", "weight_dtype": "uint8", "block_size": 2, "granularity": "per_block", "use_activation_order_heuristic": True, } } ) compressor = LayerwiseCompressor(model, compressor_config) def calibration_loader(): yield torch.randn(1, 6) compressed_model = compressor.compress(calibration_loader(), device="cpu") block_size = compressor_config.global_config.block_size quantizer = Quantizer( n_bits=8, per_channel=True, symmetric=True, enable_normal_float=False, ) expected_scale = compressed_model[0]._buffers["_COREML_/weight/quantization_scale"] with torch.no_grad(): for block_idx in range(3): start_idx = block_size * block_idx block = model[0].weight[:, start_idx : start_idx + block_size] quantizer.find_params(block, 
weight=True) assert torch.all(quantizer.scale.flatten() == expected_scale[:, block_idx]) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/layerwise_compression/test_quant.py0000644000000000000000000000543014672066616030167 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import pytest import torch from coremltools.optimize.torch.layerwise_compression._quant import Quantizer @pytest.mark.parametrize( "quantizer, expected_scale, expected_zp", [ ( Quantizer(n_bits=4, per_channel=True, symmetric=True), torch.tensor([[0.48], [1.4]]), torch.tensor([[8.0], [8.0]]), ), ( Quantizer(n_bits=4, per_channel=False, symmetric=True), torch.tensor([[1.4], [1.4]]), torch.tensor([[8.0], [8.0]]), ), ( Quantizer(n_bits=4, per_channel=True, symmetric=False), torch.tensor([[0.32], [0.76]]), torch.tensor([[11.0], [1.0]]), ), ( Quantizer(n_bits=4, per_channel=False, symmetric=False), torch.tensor([[0.94], [0.94]]), torch.tensor([[4.0], [4.0]]), ), ( Quantizer(n_bits=8, per_channel=True, symmetric=True), torch.tensor([[0.028], [0.0824]]), torch.tensor([[128.0], [128.0]]), ), ( Quantizer(n_bits=8, per_channel=False, symmetric=True), torch.tensor([[0.0824], [0.0824]]), torch.tensor([[128.0], [128.0]]), ), ( Quantizer(n_bits=8, per_channel=True, symmetric=False), torch.tensor([[0.0188], [0.0447]]), torch.tensor([[191.0], [20.0]]), ), ( Quantizer(n_bits=8, per_channel=False, symmetric=False), torch.tensor([[0.0553], [0.0553]]), torch.tensor([[65.0], [65.0]]), ), ], ) def test_find_params(quantizer, expected_scale, expected_zp): # input x = torch.tensor([[1.2, -3.6, 0.4], [-0.9, 1.5, 10.5]]) # fine quantization params quantizer.find_params(x, weight=True) # compare assert torch.all( torch.isclose(quantizer.scale, expected_scale, rtol=0.001, atol=0.001) ) assert torch.all(torch.isclose(quantizer.zero_point, expected_zp)) @pytest.mark.parametrize( "input, weight, expected_shape", [ (torch.rand(2, 3), True, (2, 1)), (torch.rand(2, 3, 4), True, (2, 1, 1)), (torch.rand(2, 3, 4, 5), True, (2, 1, 1, 1)), (torch.rand(2, 3), False, (1, 3)), (torch.rand(2, 3, 4), False, (1, 1, 4)), (torch.rand(2, 3, 4, 5), False, (1, 3, 1, 1)), ], ) def test_find_params_reshape(input, weight, expected_shape): quantizer = Quantizer(n_bits=4, per_channel=True, symmetric=True) quantizer.find_params(input, weight=weight) assert quantizer.scale.shape == expected_shape ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2855473 coremltools-8.0/coremltools/test/optimize/torch/models/0000755000000000000000000000000014672075535022262 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/models/__init__.py0000644000000000000000000000033314672066616024372 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/models/mnist.py0000644000000000000000000002051214672066616023766 0ustar00rootroot# Copyright (c) 2024, Apple Inc. 
All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause # type: ignore import os from collections import OrderedDict import pytest import torch import torch.nn as nn from filelock import FileLock from torchvision import datasets, transforms from coremltools.test.optimize.torch.utils import test_data_path # IMPORTANT: DO NOT import these fixtures in your tests. # That leads pytest to run the fixtures (even session-scoped) multiple times. # These have been imported into conftest.py, which makes them available for all # tests within the test/ folder. num_classes = 10 @pytest.fixture() def mnist_example_input(): return torch.rand(1, 1, 28, 28) @pytest.fixture() def mnist_example_output(): return torch.rand(1, num_classes) @pytest.fixture def mnist_model(): return nn.Sequential( OrderedDict( [ ("conv1", nn.Conv2d(1, 32, (5, 5), padding=2)), ("relu1", nn.ReLU()), ("pool1", nn.MaxPool2d(2, stride=2, padding=0)), ("bn1", nn.BatchNorm2d(32, eps=0.001, momentum=0.01)), ("conv2", nn.Conv2d(32, 64, (5, 5), padding=2)), ("relu2", nn.ReLU()), ("pool2", nn.MaxPool2d(2, stride=2, padding=0)), ("flatten", nn.Flatten()), ("dense1", nn.Linear(3136, 1024)), ("relu3", nn.ReLU()), ("dropout", nn.Dropout(p=0.4)), ("dense2", nn.Linear(1024, num_classes)), ("softmax", nn.LogSoftmax()), ] ) ) @pytest.fixture def mnist_model_conv_transpose(): # this method will be removed once conv_transpose is integrated for pruning and palettization return nn.Sequential( OrderedDict( [ ("conv1", nn.Conv2d(1, 32, (5, 5), padding=2)), ("relu1", nn.ReLU()), ("pool1", nn.MaxPool2d(2, stride=2, padding=0)), ("bn1", nn.BatchNorm2d(32, eps=0.001, momentum=0.01)), ("conv2", nn.Conv2d(32, 64, (5, 5), padding=2)), ("relu2", nn.ReLU()), ("pool2", nn.MaxPool2d(2, stride=2, padding=0)), ( "conv_transpose1", nn.ConvTranspose2d(64, 32, stride=1, kernel_size=3, padding=1), ), ("conv4", nn.Conv2d(32, 64, stride=1, kernel_size=1, padding=0)), ("flatten", nn.Flatten()), ("dense1", nn.Linear(3136, 1024)), ("relu3", nn.ReLU()), ("dropout", nn.Dropout(p=0.4)), ("dense2", nn.Linear(1024, 10)), ("softmax", nn.LogSoftmax()), ] ) ) @pytest.fixture def mnist_model_quantization(): # String padding mode like "same" or "valid" is not supported # for quantized models: https://github.com/pytorch/pytorch/issues/76304 return nn.Sequential( OrderedDict( [ ("conv1", nn.Conv2d(1, 32, (5, 5), padding=2)), ("bn1", nn.BatchNorm2d(32, eps=0.001, momentum=0.01)), ("relu1", nn.ReLU6()), ("pool1", nn.MaxPool2d(2, stride=2, padding=0)), ("conv2", nn.Conv2d(32, 64, (5, 5), padding=2)), ("relu2", nn.ReLU6()), ("pool2", nn.MaxPool2d(2, stride=2, padding=0)), ("conv_transpose1", nn.ConvTranspose2d(64, 128, 3, padding=1)), ("bn3", nn.BatchNorm2d(128, eps=0.001, momentum=0.01)), ("relu3", nn.ReLU6()), ("pool3", nn.MaxPool2d(1, stride=1, padding=0)), ("conv_transpose2", nn.ConvTranspose2d(128, 64, 3, padding=1)), ("relu4", nn.GELU()), ("flatten", nn.Flatten()), ("dense1", nn.Linear(3136, 1024)), ("relu5", nn.ReLU6()), ("dropout", nn.Dropout(p=0.4)), ("dense2", nn.Linear(1024, num_classes)), ("softmax", nn.LogSoftmax()), ] ) ) class Residual(nn.Module): def __init__(self, module): super().__init__() self.module = module def forward(self, inputs): return self.module(inputs) + inputs @pytest.fixture def residual_mnist_model(): return nn.Sequential( OrderedDict( [ ( "conv1", nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False), ), ("bn1", 
nn.BatchNorm2d(64)), ("relu1", nn.ReLU()), ("pool1", nn.MaxPool2d(kernel_size=3, stride=2, padding=1)), ( "add1", Residual( nn.Sequential( OrderedDict( [ ( "conv2", nn.Conv2d(64, 64, kernel_size=1, stride=1, bias=False), ), ("bn2", nn.BatchNorm2d(64)), ("relu2", nn.ReLU()), ( "conv3", nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1), ), ("bn3", nn.BatchNorm2d(64)), ] ) ) ), ), ("relu3", nn.ReLU()), ("flatten", nn.Flatten()), ("dense1", nn.Linear(3136, 1024)), ("relu4", nn.ReLU()), ("dropout", nn.Dropout(p=0.4)), ("dense2", nn.Linear(1024, num_classes)), ("softmax", nn.LogSoftmax()), ] ) ) @pytest.fixture def mnist_model_large(): """ MNIST model with redundant layers for testing pruning algorithm """ return nn.Sequential(OrderedDict([ ('conv1', nn.Conv2d(1, 32, (5, 5), padding='same')), ('relu1', nn.ReLU()), ('pool1', nn.MaxPool2d(2, stride=2, padding=0)), ('bn1', nn.BatchNorm2d(32, eps=0.001, momentum=0.01)), ('conv2', nn.Conv2d(32, 64, (5, 5), padding='same')), ('relu2', nn.ReLU()), ('pool2', nn.MaxPool2d(2, stride=2, padding=0)), ('conv3', nn.Conv2d(64, 64, (5, 5), padding='same')), ('relu3', nn.ReLU()), ('conv4', nn.Conv2d(64, 64, (5, 5), padding='same')), ('relu4', nn.ReLU()), ('flatten', nn.Flatten()), ('dense1', nn.Linear(3136, 1024)), ('relu5', nn.ReLU()), ('dropout', nn.Dropout(p=0.4)), ('dense2', nn.Linear(1024, num_classes)), ('softmax', nn.LogSoftmax())])) def LeNet5(): """ Original LeNet5 model for MNIST with sigmoid activations. """ return nn.Sequential( OrderedDict( [ ("conv1", nn.Conv2d(1, 6, 5, 1, 2)), ("sigmoid1", nn.Sigmoid()), ("pool1", nn.AvgPool2d(2, 2)), ("conv2", nn.Conv2d(6, 16, 5, 1, 0)), ("sigmoid2", nn.Sigmoid()), ("pool2", nn.AvgPool2d(2, 2)), ("flatten", nn.Flatten()), ("dense1", nn.Linear(5 * 5 * 16, 120)), ("sigmoid3", nn.Sigmoid()), ("dense2", nn.Linear(120, 84)), ("sigmoid4", nn.Sigmoid()), ("dense3", nn.Linear(84, num_classes)), ("softmax", nn.LogSoftmax(dim=1)), ] ) ) @pytest.fixture(scope="session") def mnist_dataset(): transform = transforms.Compose([ transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,)) ]) data_path = os.path.join(test_data_path(), 'mnist') os.makedirs(data_path, exist_ok=True) with FileLock(os.path.join(data_path, 'data.lock')): train = datasets.MNIST(data_path, train=True, download=True, transform=transform) test = datasets.MNIST(data_path, train=False, download=True, transform=transform) return train, test ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2855473 coremltools-8.0/coremltools/test/optimize/torch/palettization/0000755000000000000000000000000014672075535023666 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/palettization/__init__.py0000644000000000000000000000033314672066616025776 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/palettization/palettization_utils.py0000644000000000000000000000221614672066616030350 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from torch.ao.quantization import quantization_mappings def _assert_changes_post_attach(module, n_bits, cluster_dim): assert hasattr(module, 'qconfig') assert module.qconfig.weight.p.keywords["n_bits"] == n_bits assert module.qconfig.weight.p.keywords["cluster_dim"] == cluster_dim def _assert_changes_post_prepare( original_module, palettized_module, n_bits, cluster_dim, kmeans_max_iter ): assert ( type(palettized_module) == quantization_mappings.DEFAULT_QAT_MODULE_MAPPINGS[type(original_module)] ) assert palettized_module.weight_fake_quant.n_clusters == 2**n_bits assert palettized_module.weight_fake_quant.cluster_dim == cluster_dim assert palettized_module.weight_fake_quant.kmeans_max_iter == kmeans_max_iter def _get_max_unique_weights_in_module_post_conversion(config, module): return (2 ** config[type(module)]["n_bits"]) \ * config[type(module)]["cluster_dim"] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/palettization/test_palettization_api.py0000644000000000000000000007233714672066616031033 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy import pytest import torch import torch.nn as nn import torch.nn.functional as F from coremltools.optimize.torch.palettization import ( DKMPalettizer, DKMPalettizerConfig, ModuleDKMPalettizerConfig, ) from coremltools.optimize.torch.palettization.palettization_config import ( DEFAULT_PALETTIZATION_SCHEME, ) from coremltools.test.optimize.torch.palettization.palettization_utils import ( _assert_changes_post_attach, _assert_changes_post_prepare, ) from coremltools.test.optimize.torch.utils import get_logging_capture_context_manager REGEX_YAML = """ module_name_configs: conv\d+: - n_bits: 4 weight_threshold: 400 palett_tau: 0.000004 - n_bits: 2 weight_threshold: 1000 palett_tau: 0.000004 """ def _create_simple_model(): class Net(nn.Module): def __init__(self): super().__init__() self.conv1 = nn.Conv2d(3, 6, 5) self.pool = nn.MaxPool2d(2, 2) self.conv2 = nn.Conv2d(6, 16, 5) self.fc1 = nn.Linear(16 * 5 * 5, 120) self.fc2 = nn.Linear(120, 84) self.fc3 = nn.Linear(84, 10) def forward(self, x): x = self.pool(F.relu(self.conv1(x))) x = self.pool(F.relu(self.conv2(x))) x = torch.flatten(x, 1) # flatten all dimensions except batch x = F.relu(self.fc1(x)) x = F.relu(self.fc2(x)) x = self.fc3(x) return x return Net() @pytest.fixture def simple_model(): return _create_simple_model() def test_inplace_false_attach_config(simple_model): palettizer = DKMPalettizer(simple_model) prepared_model = palettizer.prepare() assert not hasattr(simple_model.conv1, "qconfig") assert not hasattr(simple_model.conv2, "qconfig") assert not hasattr(simple_model.fc1, "qconfig") assert not hasattr(simple_model.fc2, "qconfig") assert not hasattr(simple_model.fc3, "qconfig") _assert_changes_post_attach( prepared_model.conv2, DEFAULT_PALETTIZATION_SCHEME[type(simple_model.conv2)]["n_bits"], DEFAULT_PALETTIZATION_SCHEME[type(simple_model.conv2)]["cluster_dim"], ) _assert_changes_post_attach( prepared_model.fc1, DEFAULT_PALETTIZATION_SCHEME[type(simple_model.fc1)]["n_bits"], DEFAULT_PALETTIZATION_SCHEME[type(simple_model.fc1)]["cluster_dim"], ) 
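# fc2 should pick up the same default Linear palettization scheme.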
_assert_changes_post_attach( prepared_model.fc2, DEFAULT_PALETTIZATION_SCHEME[type(simple_model.fc2)]["n_bits"], DEFAULT_PALETTIZATION_SCHEME[type(simple_model.fc2)]["cluster_dim"], ) def test_empty_dict_for_config(simple_model): ## This test should behave the same as that when a None config is passed to DKMPalettizer config = DKMPalettizerConfig.from_dict({}) palettizer = DKMPalettizer(simple_model, config) prepared_model = palettizer.prepare() assert not hasattr(simple_model.conv1, "qconfig") assert not hasattr(simple_model.conv2, "qconfig") assert not hasattr(simple_model.fc1, "qconfig") assert not hasattr(simple_model.fc2, "qconfig") assert not hasattr(simple_model.fc3, "qconfig") _assert_changes_post_attach( prepared_model.conv2, DEFAULT_PALETTIZATION_SCHEME[type(simple_model.conv2)]["n_bits"], DEFAULT_PALETTIZATION_SCHEME[type(simple_model.conv2)]["cluster_dim"], ) _assert_changes_post_attach( prepared_model.fc1, DEFAULT_PALETTIZATION_SCHEME[type(simple_model.fc1)]["n_bits"], DEFAULT_PALETTIZATION_SCHEME[type(simple_model.fc1)]["cluster_dim"], ) _assert_changes_post_attach( prepared_model.fc2, DEFAULT_PALETTIZATION_SCHEME[type(simple_model.fc2)]["n_bits"], DEFAULT_PALETTIZATION_SCHEME[type(simple_model.fc2)]["cluster_dim"], ) @pytest.fixture(scope="session") def test_empty_yaml_for_config(simple_model, tmp_path_factory): ## This test should behave the same as that when a None config is passed to DKMPalettizer fname = tmp_path_factory.mktemp("test_configs") / "empty_config.yaml" with open(fname, "w") as file: file.write("\n") config = DKMPalettizerConfig.from_yaml(fname) palettizer = DKMPalettizer(simple_model, config) prepared_model = palettizer.prepare() assert not hasattr(simple_model.conv1, "qconfig") assert not hasattr(simple_model.conv2, "qconfig") assert not hasattr(simple_model.fc1, "qconfig") assert not hasattr(simple_model.fc2, "qconfig") assert not hasattr(simple_model.fc3, "qconfig") _assert_changes_post_attach( prepared_model.conv2, DEFAULT_PALETTIZATION_SCHEME[type(simple_model.conv2)]["n_bits"], DEFAULT_PALETTIZATION_SCHEME[type(simple_model.conv2)]["cluster_dim"], ) _assert_changes_post_attach( prepared_model.fc1, DEFAULT_PALETTIZATION_SCHEME[type(simple_model.fc1)]["n_bits"], DEFAULT_PALETTIZATION_SCHEME[type(simple_model.fc1)]["cluster_dim"], ) _assert_changes_post_attach( prepared_model.fc2, DEFAULT_PALETTIZATION_SCHEME[type(simple_model.fc2)]["n_bits"], DEFAULT_PALETTIZATION_SCHEME[type(simple_model.fc2)]["cluster_dim"], ) @pytest.fixture(scope="session") def test_regex_module_name_configs(simple_model, tmp_path_factory): fname = tmp_path_factory.mktemp("test_configs") / "regex_config.yaml" with open(fname, "w") as file: file.write(REGEX_YAML) config = DKMPalettizerConfig.from_yaml(fname) palettizer = DKMPalettizer(simple_model, config) palettizer.prepare(inplace=True) assert hasattr(simple_model.fc1, "qconfig") and simple_model.fc1.qconfig is None _assert_changes_post_attach(simple_model.conv1, 4, 1) _assert_changes_post_attach(simple_model.conv2, 2, 1) def test_attach_config_simple_model_uniform_palettization_config(simple_model): config = DKMPalettizerConfig.from_dict({"global_config": {"n_bits": 4}}) palettizer = DKMPalettizer(simple_model, config) palettizer.prepare(inplace=True) n_bits = config.global_config.n_bits _assert_changes_post_attach(simple_model.conv2, n_bits, 1) _assert_changes_post_attach(simple_model.fc1, n_bits, 1) _assert_changes_post_attach(simple_model.fc2, n_bits, 1) def 
test_attach_config_simple_model_custom_palettization_config(simple_model): custom_config = { nn.Conv2d: {"n_bits": 2, "cluster_dim": 2}, nn.Linear: {"n_bits": 4}, } config = DKMPalettizerConfig.from_dict( {"module_type_configs": custom_config, "module_name_configs": {'conv2': {"n_bits": 3, "cluster_dim": 2}}} ) palettizer = DKMPalettizer(simple_model, config) palettizer.prepare(inplace=True) _assert_changes_post_attach(simple_model.conv2, 3, 2) _assert_changes_post_attach(simple_model.fc1, custom_config[nn.Linear]["n_bits"], 1) _assert_changes_post_attach(simple_model.fc2, custom_config[nn.Linear]["n_bits"], 1) def test_attach_config_simple_model_weight_threshold_test(simple_model): custom_config = {nn.Conv2d: {"n_bits": 2, "cluster_dim": 2, "weight_threshold": 1000}} config = DKMPalettizerConfig.from_dict( {"module_type_configs": custom_config} ) palettizer = DKMPalettizer(simple_model, config) palettizer.prepare(inplace=True) # For the below two assertions, prepare_qat would propagate a None qconfig throughout the supported modules in # the model assert hasattr(simple_model.conv1, "qconfig") and simple_model.conv1.qconfig is None assert hasattr(simple_model.fc1, "qconfig") and simple_model.fc1.qconfig is None _assert_changes_post_attach( simple_model.conv2, custom_config[nn.Conv2d]["n_bits"], custom_config[nn.Conv2d]["cluster_dim"], ) def test_attach_config_simple_model_weight_threshold_range_test(simple_model): custom_config = { nn.Conv2d: [ {"n_bits": 4, "cluster_dim": 1, "weight_threshold": 1000}, {"n_bits": 2, "cluster_dim": 1, "weight_threshold": 400}, ] } config = DKMPalettizerConfig.from_dict({"module_type_configs": custom_config}) palettizer = DKMPalettizer(simple_model, config) palettizer.prepare(inplace=True) # For the below assertion, prepare_qat would propagate a None qconfig throughout the supported modules in # the model assert hasattr(simple_model.fc1, "qconfig") and simple_model.fc1.qconfig is None _assert_changes_post_attach( simple_model.conv1, custom_config[nn.Conv2d][1]["n_bits"], custom_config[nn.Conv2d][1]["cluster_dim"], ) _assert_changes_post_attach( simple_model.conv2, custom_config[nn.Conv2d][0]["n_bits"], custom_config[nn.Conv2d][0]["cluster_dim"], ) def test_attach_config_only_on_specified_modules_conv(simple_model): """ If there is a module type specified in the palettization_config, qconfigs should only be applied to modules of those types not to modules of other type. For eg: If palettization_config only contains Conv2d, we should not attach a qconfig to nn.Linear despite it being supported by palettization. 
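Here only nn.Conv2d is configured, so prepare should leave fc1 with a None qconfig.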
""" custom_config = {nn.Conv2d: {"n_bits": 2, "cluster_dim": 2}} config = DKMPalettizerConfig.from_dict({"module_type_configs": custom_config}) palettizer = DKMPalettizer(simple_model, config) palettizer.prepare(inplace=True) # For the below assertion, prepare_qat would propagate a None qconfig throughout the supported modules in # the model assert hasattr(simple_model.fc1, "qconfig") and simple_model.fc1.qconfig is None _assert_changes_post_attach( simple_model.conv2, custom_config[nn.Conv2d]["n_bits"], custom_config[nn.Conv2d]["cluster_dim"], ) def test_attach_config_only_on_specified_modules_linear(simple_model): custom_config = {nn.Linear: {"n_bits": 2, "cluster_dim": 2}} config = DKMPalettizerConfig.from_dict( {"module_type_configs": custom_config} ) palettizer = DKMPalettizer(simple_model, config) palettizer.prepare(inplace=True) # For the below two assertions, prepare_qat would propagate a None qconfig throughout the supported modules in # the model assert hasattr(simple_model.conv1, "qconfig") and simple_model.conv1.qconfig is None assert hasattr(simple_model.conv2, "qconfig") and simple_model.conv2.qconfig is None _assert_changes_post_attach( simple_model.fc1, custom_config[nn.Linear]["n_bits"], custom_config[nn.Linear]["cluster_dim"], ) def test_prepare_palettizer_simple_model_custom_palettization_config(simple_model): simple_model_copy = copy.deepcopy(simple_model) custom_config = {nn.Conv2d: {"n_bits": 2, "cluster_dim": 2, "kmeans_max_iter": 4}, nn.Linear: {"n_bits": 4, "cluster_dim": 1, "kmeans_max_iter": 5}} config = DKMPalettizerConfig.from_dict( {"module_type_configs": custom_config} ) palettizer = DKMPalettizer(simple_model, config) prepared_model = palettizer.prepare(inplace=True) num_epochs = 1 for epoch in range(num_epochs): palettizer.step() _assert_changes_post_prepare( simple_model_copy.conv2, prepared_model.conv2, custom_config[nn.Conv2d]["n_bits"], custom_config[nn.Conv2d]["cluster_dim"], custom_config[nn.Conv2d]["kmeans_max_iter"], ) _assert_changes_post_prepare( simple_model_copy.fc1, prepared_model.fc1, custom_config[nn.Linear]["n_bits"], custom_config[nn.Linear]["cluster_dim"], custom_config[nn.Linear]["kmeans_max_iter"], ) _assert_changes_post_prepare( simple_model_copy.fc2, prepared_model.fc2, custom_config[nn.Linear]["n_bits"], custom_config[nn.Linear]["cluster_dim"], custom_config[nn.Linear]["kmeans_max_iter"], ) @pytest.mark.parametrize( "cluster_dim_expected_std_outputs", [ (4, None), ( 5, [ "WARNING:coremltools.optimize.torch._utils.validation_utils:conv2.weight: The number of channels in channel axis dimension: " "0, 16 is not divisible by cluster_dim=5" ], ), ], ) def test_prepare_palettizer_simple_model_cluster_dim_mil_check( simple_model, cluster_dim_expected_std_outputs ): cluster_dim, expected_std_outputs = cluster_dim_expected_std_outputs custom_config = {nn.Conv2d: {"n_bits": 2, "cluster_dim": cluster_dim}} config = DKMPalettizerConfig.from_dict({"module_type_configs": custom_config}) palettizer = DKMPalettizer(simple_model, config) logging_context_manager = get_logging_capture_context_manager() with logging_context_manager( "coremltools.optimize.torch._utils.validation_utils" ) as log_capture: simple_model = palettizer.prepare() output_capture = log_capture.getvalue() if expected_std_outputs: assert not hasattr(simple_model.conv2, "weight_fake_quant") for expected_std_output in expected_std_outputs: assert expected_std_output in output_capture else: assert hasattr(simple_model.conv2, "weight_fake_quant") @pytest.mark.parametrize( 
"block_size_expected_std_outputs", [ ( 5, [ "WARNING:coremltools.optimize.torch._utils.validation_utils:conv2.weight: axis_length=16 is not divisible by group_size=5", "INFO:coremltools.optimize.torch._utils.validation_utils:Skipping compression for conv2.weight", ], ), (4, None), ], ) def test_prepare_palettizer_simple_model_block_size_mil_check( simple_model, block_size_expected_std_outputs ): curr_block_size, expected_std_outputs = block_size_expected_std_outputs custom_config = { nn.Conv2d: { "n_bits": 2, "cluster_dim": 4, "granularity": "per_grouped_channel", "group_size": curr_block_size, } } config = DKMPalettizerConfig.from_dict({"module_type_configs": custom_config}) palettizer = DKMPalettizer(simple_model, config) logging_context_manager = get_logging_capture_context_manager() with logging_context_manager( "coremltools.optimize.torch._utils.validation_utils" ) as log_capture: simple_model = palettizer.prepare() output_capture = log_capture.getvalue() if expected_std_outputs: assert not hasattr(simple_model.conv2, "weight_fake_quant") for expected_std_output in expected_std_outputs: assert expected_std_output in output_capture else: assert hasattr(simple_model.conv2, "weight_fake_quant") def test_inplace_true_prepare_palettizer(simple_model): simple_model_copy = copy.deepcopy(simple_model) custom_config = { nn.Conv2d: { "n_bits": 2, "cluster_dim": 2, "kmeans_max_iter": 4, "milestone": 1, }, nn.Linear: { "n_bits": 4, "cluster_dim": 1, "kmeans_max_iter": 5, "milestone": 1, }, } config = DKMPalettizerConfig.from_dict({"module_type_configs": custom_config}) palettizer = DKMPalettizer(simple_model, config) palettizer.prepare(inplace=True) num_steps = 2 for step in range(num_steps): palettizer.step() if step == 0: assert palettizer._model.fc1.weight_fake_quant.fake_palett_enabled[0] == 0 else: assert palettizer._model.fc1.weight_fake_quant.fake_palett_enabled[0] == 1 _assert_changes_post_prepare( simple_model_copy.conv2, simple_model.conv2, custom_config[nn.Conv2d]["n_bits"], custom_config[nn.Conv2d]["cluster_dim"], custom_config[nn.Conv2d]["kmeans_max_iter"], ) _assert_changes_post_prepare( simple_model_copy.fc1, simple_model.fc1, custom_config[nn.Linear]["n_bits"], custom_config[nn.Linear]["cluster_dim"], custom_config[nn.Linear]["kmeans_max_iter"], ) _assert_changes_post_prepare( simple_model_copy.fc2, simple_model.fc2, custom_config[nn.Linear]["n_bits"], custom_config[nn.Linear]["cluster_dim"], custom_config[nn.Linear]["kmeans_max_iter"], ) def test_prepare_palettizer_simple_model_custom_palettization_config_milestone_1(simple_model): custom_config = {nn.Conv2d: {"n_bits": 2, "cluster_dim": 2, "kmeans_max_iter": 4, "milestone": 1}, nn.Linear: {"n_bits": 4, "cluster_dim": 1, "kmeans_max_iter": 5, "milestone": 1}} config = DKMPalettizerConfig.from_dict( {"module_type_configs": custom_config} ) palettizer = DKMPalettizer(simple_model, config) prepared_model = palettizer.prepare() num_steps = 2 for step in range(num_steps): palettizer.step() if step == 0: assert palettizer._model.fc1.weight_fake_quant.fake_palett_enabled[0] == 0 else: assert palettizer._model.fc1.weight_fake_quant.fake_palett_enabled[0] == 1 _assert_changes_post_prepare(simple_model.conv2, prepared_model.conv2, custom_config[nn.Conv2d]["n_bits"], custom_config[nn.Conv2d]["cluster_dim"], custom_config[nn.Conv2d]["kmeans_max_iter"]) _assert_changes_post_prepare(simple_model.fc1, prepared_model.fc1, custom_config[nn.Linear]["n_bits"], custom_config[nn.Linear]["cluster_dim"], custom_config[nn.Linear]["kmeans_max_iter"]) 
_assert_changes_post_prepare(simple_model.fc2, prepared_model.fc2, custom_config[nn.Linear]["n_bits"], custom_config[nn.Linear]["cluster_dim"], custom_config[nn.Linear]["kmeans_max_iter"]) def test_prepare_palettizer_different_milestone_per_module_type(simple_model): custom_config = { nn.Conv2d: { "n_bits": 2, "cluster_dim": 2, "kmeans_max_iter": 4, "milestone": 1, }, nn.Linear: {"n_bits": 4, "kmeans_max_iter": 5, "milestone": 2}, } config = DKMPalettizerConfig.from_dict({"module_type_configs": custom_config}) palettizer = DKMPalettizer(simple_model, config) orig_conv_mods = [simple_model.conv2] orig_fc_mods = [simple_model.fc1, simple_model.fc2] prepared_model = palettizer.prepare() prepared_conv_mods = [prepared_model.conv2] prepared_fc_mods = [prepared_model.fc1, prepared_model.fc2] num_steps = 3 for step in range(num_steps): palettizer.step() if step == 0: for mod in prepared_conv_mods + prepared_fc_mods: assert mod.weight_fake_quant.fake_palett_enabled[0] == 0 elif step == 1: for mod in prepared_conv_mods: assert mod.weight_fake_quant.fake_palett_enabled[0] == 1 for mod in prepared_fc_mods: assert mod.weight_fake_quant.fake_palett_enabled[0] == 0 else: for mod in prepared_conv_mods: assert mod.weight_fake_quant.fake_palett_enabled[0] == 1 for mod in prepared_fc_mods: assert mod.weight_fake_quant.fake_palett_enabled[0] == 1 for orig, prep in zip(orig_conv_mods, prepared_conv_mods): _assert_changes_post_prepare(orig, prep, custom_config[nn.Conv2d]["n_bits"], custom_config[nn.Conv2d]["cluster_dim"], custom_config[nn.Conv2d]["kmeans_max_iter"]) for orig, prep in zip(orig_fc_mods, prepared_fc_mods): _assert_changes_post_prepare( orig, prep, custom_config[nn.Linear]["n_bits"], 1, custom_config[nn.Linear]["kmeans_max_iter"], ) def test_attach_config_weight_threshold_range_different_milestone(simple_model): custom_config = { nn.Conv2d: [ {"n_bits": 4, "cluster_dim": 2, "weight_threshold": 1000, "milestone": 2}, {"n_bits": 2, "cluster_dim": 1, "weight_threshold": 400, "milestone": 1}, ] } config = DKMPalettizerConfig.from_dict({"module_type_configs": custom_config}) palettizer = DKMPalettizer(simple_model, config) prepared_model = palettizer.prepare() # configs should get sorted automatically assert hasattr(prepared_model.fc1, "qconfig") and prepared_model.fc1.qconfig is None num_steps = 3 for step in range(num_steps): palettizer.step() if step == 0: assert prepared_model.conv2.weight_fake_quant.fake_palett_enabled[0] == 0 elif step == 1: assert prepared_model.conv2.weight_fake_quant.fake_palett_enabled[0] == 0 else: assert prepared_model.conv2.weight_fake_quant.fake_palett_enabled[0] == 1 _assert_changes_post_attach( prepared_model.conv2, custom_config[nn.Conv2d][0]["n_bits"], custom_config[nn.Conv2d][0]["cluster_dim"], ) def test_prepare_palettizer_simple_model_custom_palettization_config_none_module(simple_model): custom_config = {nn.Conv2d: {"n_bits": 2, "cluster_dim": 2, "kmeans_max_iter": 4}, nn.Linear: {"n_bits": 4, "cluster_dim": 1, "kmeans_max_iter": 5}} config = DKMPalettizerConfig.from_dict( {"module_name_configs": {"conv1": None}, "module_type_configs": custom_config} ) palettizer = DKMPalettizer(simple_model, config) prepared_model = palettizer.prepare() num_epochs = 1 for epoch in range(num_epochs): palettizer.step() assert type(prepared_model.conv1) == nn.Conv2d # Means that if None was provided, it wasn't prepared. 
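# The remaining configured layers should still be prepared with their palettization fake-quant observers.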
_assert_changes_post_prepare(simple_model.conv2, prepared_model.conv2, custom_config[nn.Conv2d]["n_bits"], custom_config[nn.Conv2d]["cluster_dim"], custom_config[nn.Conv2d]["kmeans_max_iter"]) _assert_changes_post_prepare(simple_model.fc1, prepared_model.fc1, custom_config[nn.Linear]["n_bits"], custom_config[nn.Linear]["cluster_dim"], custom_config[nn.Linear]["kmeans_max_iter"]) _assert_changes_post_prepare(simple_model.fc2, prepared_model.fc2, custom_config[nn.Linear]["n_bits"], custom_config[nn.Linear]["cluster_dim"], custom_config[nn.Linear]["kmeans_max_iter"]) def test_prepare_palettizer_simple_model_custom_palettization_config_none_conv2d(simple_model): custom_config = {nn.Conv2d: None, nn.Linear: {"n_bits": 4, "cluster_dim": 1, "kmeans_max_iter": 5}} config = DKMPalettizerConfig.from_dict( {"module_type_configs": custom_config} ) palettizer = DKMPalettizer(simple_model, config) prepared_model = palettizer.prepare() num_epochs = 1 for epoch in range(num_epochs): palettizer.step() assert type(prepared_model.conv1) == nn.Conv2d # Means that if None was provided, it wasn't prepared. assert type(prepared_model.conv2) == nn.Conv2d assert not hasattr(prepared_model.conv1, "weight_fake_quant") assert not hasattr(prepared_model.conv2, "weight_fake_quant") _assert_changes_post_prepare(simple_model.fc1, prepared_model.fc1, custom_config[nn.Linear]["n_bits"], custom_config[nn.Linear]["cluster_dim"], custom_config[nn.Linear]["kmeans_max_iter"]) _assert_changes_post_prepare(simple_model.fc2, prepared_model.fc2, custom_config[nn.Linear]["n_bits"], custom_config[nn.Linear]["cluster_dim"], custom_config[nn.Linear]["kmeans_max_iter"]) def test_prepare_palettizer_simple_model_custom_palettization_config_linear_default(simple_model): custom_config = {nn.Conv2d: {"n_bits": 2, "cluster_dim": 2, "kmeans_max_iter": 4}, nn.Linear: {"n_bits": 4, "cluster_dim": 1}} config = DKMPalettizerConfig.from_dict( {"module_type_configs": custom_config} ) palettizer = DKMPalettizer(simple_model, config) prepared_model = palettizer.prepare() num_epochs = 1 for epoch in range(num_epochs): palettizer.step() _assert_changes_post_prepare( simple_model.conv2, prepared_model.conv2, custom_config[nn.Conv2d]["n_bits"], custom_config[nn.Conv2d]["cluster_dim"], custom_config[nn.Conv2d]["kmeans_max_iter"], ) _assert_changes_post_prepare( simple_model.fc1, prepared_model.fc1, custom_config[nn.Linear]["n_bits"], custom_config[nn.Linear]["cluster_dim"], DEFAULT_PALETTIZATION_SCHEME[nn.Linear]["kmeans_max_iter"], ) _assert_changes_post_prepare( simple_model.fc2, prepared_model.fc2, custom_config[nn.Linear]["n_bits"], custom_config[nn.Linear]["cluster_dim"], DEFAULT_PALETTIZATION_SCHEME[nn.Linear]["kmeans_max_iter"], ) def test_inplace_true_attach_config(simple_model): simple_model_copy = copy.deepcopy(simple_model) palettizer = DKMPalettizer(simple_model) palettizer.prepare(inplace=True) _assert_changes_post_attach( simple_model.conv2, DEFAULT_PALETTIZATION_SCHEME[type(simple_model_copy.conv2)]["n_bits"], DEFAULT_PALETTIZATION_SCHEME[type(simple_model_copy.conv2)]["cluster_dim"], ) _assert_changes_post_attach( simple_model.fc1, DEFAULT_PALETTIZATION_SCHEME[type(simple_model_copy.fc1)]["n_bits"], DEFAULT_PALETTIZATION_SCHEME[type(simple_model_copy.fc1)]["cluster_dim"], ) _assert_changes_post_attach( simple_model.fc2, DEFAULT_PALETTIZATION_SCHEME[type(simple_model_copy.fc2)]["n_bits"], DEFAULT_PALETTIZATION_SCHEME[type(simple_model_copy.fc2)]["cluster_dim"], ) def test_inplace_false_attach_config(simple_model): palettizer = 
DKMPalettizer(simple_model) prepared_model = palettizer.prepare() assert not hasattr(simple_model.conv1, "qconfig") assert not hasattr(simple_model.conv2, "qconfig") assert not hasattr(simple_model.fc1, "qconfig") assert not hasattr(simple_model.fc2, "qconfig") assert not hasattr(simple_model.fc3, "qconfig") _assert_changes_post_attach( prepared_model.conv2, DEFAULT_PALETTIZATION_SCHEME[type(simple_model.conv2)]["n_bits"], DEFAULT_PALETTIZATION_SCHEME[type(simple_model.conv2)]["cluster_dim"], ) _assert_changes_post_attach( prepared_model.fc1, DEFAULT_PALETTIZATION_SCHEME[type(simple_model.fc1)]["n_bits"], DEFAULT_PALETTIZATION_SCHEME[type(simple_model.fc1)]["cluster_dim"], ) _assert_changes_post_attach( prepared_model.fc2, DEFAULT_PALETTIZATION_SCHEME[type(simple_model.fc2)]["n_bits"], DEFAULT_PALETTIZATION_SCHEME[type(simple_model.fc2)]["cluster_dim"], ) def test_inplace_true_prepare_palettizer(simple_model): simple_model_copy = copy.deepcopy(simple_model) custom_config = { nn.Conv2d: { "n_bits": 2, "cluster_dim": 2, "kmeans_max_iter": 4, "milestone": 1, }, nn.Linear: { "n_bits": 4, "cluster_dim": 1, "kmeans_max_iter": 5, "milestone": 1, }, } config = DKMPalettizerConfig.from_dict({"module_type_configs": custom_config}) palettizer = DKMPalettizer(simple_model, config) palettizer.prepare(inplace=True) num_steps = 2 for step in range(num_steps): palettizer.step() if step == 0: assert palettizer._model.fc1.weight_fake_quant.fake_palett_enabled[0] == 0 else: assert palettizer._model.fc1.weight_fake_quant.fake_palett_enabled[0] == 1 _assert_changes_post_prepare( simple_model_copy.conv2, simple_model.conv2, custom_config[nn.Conv2d]["n_bits"], custom_config[nn.Conv2d]["cluster_dim"], custom_config[nn.Conv2d]["kmeans_max_iter"], ) _assert_changes_post_prepare( simple_model_copy.fc1, simple_model.fc1, custom_config[nn.Linear]["n_bits"], custom_config[nn.Linear]["cluster_dim"], custom_config[nn.Linear]["kmeans_max_iter"], ) _assert_changes_post_prepare( simple_model_copy.fc2, simple_model.fc2, custom_config[nn.Linear]["n_bits"], custom_config[nn.Linear]["cluster_dim"], custom_config[nn.Linear]["kmeans_max_iter"], ) def test_quantize_activations_flag(simple_model): config = DKMPalettizerConfig.from_dict( {"global_config": {"n_bits": 2, "cluster_dim": 1, "quantize_activations": True}} ) palettizer = DKMPalettizer(simple_model, config) palettizer.prepare() for _ in range(3): palettizer.step() assert not isinstance(palettizer._model.conv2.activation_post_process, torch.nn.Identity) def test_finalize_without_forward(simple_model): config = DKMPalettizerConfig.from_dict({"global_config": {"n_bits": 2, "cluster_dim": 1}}) palettizer = DKMPalettizer(simple_model, config) prepared_model = palettizer.prepare() palettizer.step() finalized_model = palettizer.finalize(prepared_model) assert torch.equal(simple_model.fc2.weight, finalized_model.fc2.weight) def test_deprecated_api(): with pytest.raises(DeprecationWarning): config = DKMPalettizerConfig.from_dict({"global_config": {"partition_size": 100}}) config = DKMPalettizerConfig(global_config=ModuleDKMPalettizerConfig()) with pytest.raises(DeprecationWarning): config.global_config.partition_size = 100 ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/palettization/test_palettization_utils.py0000644000000000000000000000354514672066616031415 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy import pytest import torch from coremltools.optimize.torch.palettization._utils import devectorize, vectorize @pytest.mark.parametrize( "cluster_dim_reshape_expected_shape_expected_first_row", [ (2, (2, 3, 4), (12, 2), torch.tensor([0, 12])), (2, (4, 3, 2), (12, 2), torch.tensor([0, 6])), (3, (2, 3, 4), (8, 3), torch.tensor([0, 12, 4])), (3, (4, 3, 2), (8, 3), torch.tensor([0, 6, 12])), ], ) def test_vectorize_cluster_dim_gt_1( cluster_dim_reshape_expected_shape_expected_first_row, ): ( cluster_dim, reshape, expected_shape, expected_first_row, ) = cluster_dim_reshape_expected_shape_expected_first_row partition_weight_tensor = torch.arange(24).reshape(reshape) vectorized_weight_tensor, _ = vectorize(partition_weight_tensor, cluster_dim) assert tuple(vectorized_weight_tensor.shape), expected_shape assert torch.equal(vectorized_weight_tensor[0], expected_first_row) @pytest.mark.parametrize( "cluster_dim_reshape", [ (2, (2, 3, 4)), (2, (4, 3, 2)), (3, (2, 3, 4)), (3, (4, 3, 2)), ], ) def test_devectorize_cluster_dim_gt_1(cluster_dim_reshape): cluster_dim, reshape = cluster_dim_reshape partition_weight_tensor = torch.arange(24).reshape(reshape) pwt_copy = copy.deepcopy(partition_weight_tensor) vectorized_weight_tensor, _ = vectorize(partition_weight_tensor, cluster_dim) devectorized_partition_weight_tensor = devectorize( vectorized_weight_tensor, None, torch.Size(reshape), cluster_dim ) assert torch.equal(pwt_copy, devectorized_partition_weight_tensor) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/palettization/test_palettizer.py0000644000000000000000000001005514672066616027463 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import pytest import torch from coremltools.optimize.torch.palettization import ( DKMPalettizer, DKMPalettizerConfig, FakePalettize, ModuleDKMPalettizerConfig, ) @pytest.fixture def palettizer_config(): return DKMPalettizerConfig( global_config=ModuleDKMPalettizerConfig(n_bits=4, cluster_dim=1, weight_threshold=0) ) @pytest.mark.parametrize( "module", [ torch.nn.Conv1d(2, 10, (1,)), torch.nn.Conv2d(2, 10, (2, 2)), torch.nn.Conv3d(2, 10, (2, 2, 2)), torch.nn.Linear(10, 20), torch.nn.LayerNorm(10), torch.nn.Embedding(10, 20), ], ) def test_fake_palettize_insertion_weighted_modules(module, palettizer_config): wrapped_module = torch.nn.Sequential(module) palettizer = DKMPalettizer(wrapped_module, palettizer_config) palettized_module = palettizer.prepare() assert isinstance(palettized_module[0].weight_fake_quant, FakePalettize) @pytest.mark.parametrize("kdim,vdim", [(None, None), (1, 1)]) @pytest.mark.parametrize("batch_first", [True, False]) def test_fake_palettize_insertion_multihead_attention(kdim, vdim, batch_first, palettizer_config): attention_module = torch.nn.MultiheadAttention( bias=True, embed_dim=6, num_heads=3, add_bias_kv=True, kdim=kdim, vdim=vdim, batch_first=batch_first, ) class WrappedModule(torch.nn.Sequential): def __init__(self, module): super().__init__(module) def forward(self, query, key, value): return self[0](query, key, value) wrapped_module = WrappedModule(attention_module) palettizer = DKMPalettizer(wrapped_module, palettizer_config) palettized_module = palettizer.prepare(inplace=False) palettizer.enable_fake_palett(True) query_shape = (2, 3, 6) assert isinstance(palettized_module[0].out_proj.weight_fake_quant, FakePalettize) assert palettized_module[0].out_proj.weight_fake_quant.fake_palett_enabled if kdim is None and vdim is None: assert isinstance(palettized_module[0].in_proj_weight_fake_quant, FakePalettize) assert palettized_module[0].in_proj_weight_fake_quant.fake_palett_enabled data_q = data_k = data_v = torch.randn(query_shape) else: assert isinstance(palettized_module[0].q_proj_weight_fake_quant, FakePalettize) assert palettized_module[0].q_proj_weight_fake_quant.fake_palett_enabled assert isinstance(palettized_module[0].k_proj_weight_fake_quant, FakePalettize) assert palettized_module[0].k_proj_weight_fake_quant.fake_palett_enabled assert isinstance(palettized_module[0].v_proj_weight_fake_quant, FakePalettize) assert palettized_module[0].v_proj_weight_fake_quant.fake_palett_enabled data_q = torch.randn(query_shape) data_k = data_v = torch.randn(2, 3, 1) palettizer.enable_fake_palett(False) output, _ = palettized_module(data_q, data_k, data_v) if batch_first: assert output.shape[0] == query_shape[0] else: assert output.shape[1] == query_shape[1] palettizer.finalize() assert torch.all(palettized_module[0].out_proj.bias == attention_module.out_proj.bias) assert torch.all(palettized_module[0].in_proj_bias == attention_module.in_proj_bias) assert torch.all(palettized_module[0].bias_k == attention_module.bias_k) assert torch.all(palettized_module[0].bias_v == attention_module.bias_v) # assert hasattr() @pytest.mark.parametrize("module", [torch.nn.Conv1d(2, 10, (1,))]) def test_fake_palettize_train_no_grad_fwd(module, palettizer_config): wrapped_module = torch.nn.Sequential(module) palettizer = DKMPalettizer(wrapped_module, palettizer_config) palettized_module = palettizer.prepare() 
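# A training-mode forward pass under torch.no_grad() should run without errors after stepping the palettizer.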
    palettized_module.train()
    palettizer.step()
    with torch.no_grad():
        palettized_module(torch.randn(3, 2, 10))
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/palettization/test_post_training_palettization.py0000644000000000000000000001651614672066616033137 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved.
#
# Use of this source code is governed by a BSD-3-clause license that can be
# found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause

import copy

import pytest
import torch
import torch.nn as nn
import torch.nn.functional as F

from coremltools.optimize.torch._utils.metadata_utils import CompressionMetadata
from coremltools.optimize.torch.palettization import (
    PostTrainingPalettizer,
    PostTrainingPalettizerConfig,
    SKMPalettizer,
    SKMPalettizerConfig,
)


@pytest.fixture
def simple_model():
    class Net(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv1 = nn.Conv2d(3, 6, 5)
            self.pool = nn.MaxPool2d(2, 2)
            self.conv2 = nn.Conv2d(6, 16, 5)
            self.fc1 = nn.Linear(16 * 5 * 5, 120)
            self.fc2 = nn.Linear(120, 84)
            self.fc3 = nn.Linear(84, 10)

        def forward(self, x):
            x = self.pool(F.relu(self.conv1(x)))
            x = self.pool(F.relu(self.conv2(x)))
            x = torch.flatten(x, 1)  # flatten all dimensions except batch
            x = F.relu(self.fc1(x))
            x = F.relu(self.fc2(x))
            x = self.fc3(x)
            return x

    return Net()


def test_no_config(simple_model):
    # Would do a 4-bit kmeans for all supported modules after giving a warning
    ptpalettizer = PostTrainingPalettizer(simple_model)
    palettized_model = ptpalettizer.compress()
    assert palettized_model.conv1.weight.unique().size()[0] == 16
    assert palettized_model.conv2.weight.unique().size()[0] == 16
    assert palettized_model.fc1.weight.unique().size()[0] == 16
    assert palettized_model.fc2.weight.unique().size()[0] == 16
    assert palettized_model.fc3.weight.unique().size()[0] == 16


@pytest.mark.parametrize(
    "config_dict,expected_output",
    [
        (
            {"global_config": {"n_bits": 4}},
            ["==16", "==16", "==16", "==16", "==16"],
        ),
        (
            {
                "module_name_configs": {
                    "conv1": {"n_bits": 4},
                    "fc1": {"n_bits": 2},
                },
            },
            ["==16", ">16", "==4", ">4", ">4"],
        ),
        (
            {
                "module_type_configs": {
                    nn.Conv2d: {"n_bits": 4},
                    nn.Linear: {"n_bits": 2},
                },
            },
            ["==16", "==16", "==4", "==4", "==4"],
        ),
        (
            {
                "module_type_configs": {
                    # Invalid cluster_dim gets ignored.
# Conv2d should be skipped nn.Conv2d: {"n_bits": 4, "cluster_dim": 5}, nn.Linear: {"n_bits": 2}, }, }, [">16", ">16", "==4", "==4", "==4"], ), ], ) def test_post_training_palettization_dict_config(simple_model, config_dict, expected_output): dict_config = PostTrainingPalettizerConfig.from_dict(config_dict) ptpalettizer = PostTrainingPalettizer(simple_model, dict_config) palettized_model = ptpalettizer.compress() i = 0 for name, mod in palettized_model.named_modules(): if hasattr(mod, "weight"): assert eval(f"mod.weight.unique().size()[0] {expected_output[i]}") i += 1 @pytest.mark.parametrize( "config_dict,expected_output", [ ( { "module_name_configs": { "conv1": { "n_bits": 4, "granularity": "per_tensor", "cluster_dim": 3, }, "conv2": { "n_bits": 4, "granularity": "per_tensor", "cluster_dim": 3, }, "fc3": { "n_bits": 2, "granularity": "per_tensor", "cluster_dim": 2, }, }, }, ["==16", ">16", "==4"], ), ], ) def test_post_training_vector_palettization_dict_config(simple_model, config_dict, expected_output): dict_config = PostTrainingPalettizerConfig.from_dict(config_dict) ptpalettizer = PostTrainingPalettizer(simple_model, dict_config) palettized_model = ptpalettizer.compress() i = 0 for name, mod in palettized_model.named_modules(): # Only validate the layers that get palettized. if name in config_dict["module_name_configs"] and hasattr(mod, "weight"): _cluster_dim = config_dict["module_name_configs"][name]["cluster_dim"] weight_reshaped = mod.weight.flatten(1).transpose(0, 1).reshape(-1, _cluster_dim) unique_vector = torch.unique(weight_reshaped, dim=0) assert eval(f"len(unique_vector) {expected_output[i]}") i += 1 @pytest.mark.parametrize( "config_dict", [ { "n_bits": 4, "granularity": "per_tensor", }, { "n_bits": 4, "granularity": "per_grouped_channel", "group_size": 4, }, { "n_bits": 4, "cluster_dim": 4, }, { "n_bits": 4, "granularity": "per_grouped_channel", "group_size": 4, "enable_per_channel_scale": True, }, ], ) @pytest.mark.parametrize( "lut_dtype", [torch.int8, torch.uint8], ) @pytest.mark.parametrize( "layer", ["conv2", "fc2"], ) def test_ptp_int_lut(simple_model, config_dict, lut_dtype, layer): config_dict["lut_dtype"] = lut_dtype module_config = {"module_name_configs": {layer: config_dict}} config = PostTrainingPalettizerConfig.from_dict(module_config) ptpalettizer = PostTrainingPalettizer(simple_model, config) palettized_model = ptpalettizer.compress() submodule = palettized_model.get_submodule(layer) metadata_dict = CompressionMetadata.from_state_dict(submodule.state_dict()) metadata = metadata_dict["weight"] assert metadata.quantization_n_bits == 8 scale = metadata.quantization_scale zp = metadata.zero_point lut = metadata.lut if lut_dtype == torch.int8: assert zp is None lut_quant = lut / scale assert torch.min(lut_quant).int() >= -127 assert torch.max(lut_quant).int() <= 128 else: assert zp is not None lut_quant = lut / scale + zp assert torch.min(lut_quant).int() >= 0 assert torch.max(lut_quant).int() <= 254 def loss_fn(model, input): out = model(input) return nn.functional.mse_loss(out, torch.rand(1, 10)) def test_compute_sensitivity_single_worker_mutability(mnist_model, mnist_example_input): config = {"global_config": {"n_bits": 4}} skm_config = SKMPalettizerConfig.from_dict(config) palettizer = SKMPalettizer(mnist_model, skm_config) state_dict_before = copy.deepcopy(palettizer._model.state_dict()) def calibration_loader(): yield mnist_example_input palettizer.compute_sensitivity( dataloader=calibration_loader(), loss_fn=loss_fn, num_sensitivity_workers=1 ) 
state_dict_after = palettizer._model.state_dict() assert len(state_dict_before) == len(state_dict_after) for key in state_dict_before: assert torch.equal(state_dict_before[key], state_dict_after[key]) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/palettization/test_sensitive_k_means.py0000644000000000000000000002713214672066616031012 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from collections import OrderedDict from contextlib import nullcontext from typing import Any, Dict from unittest.mock import ANY, Mock, patch import pytest import torch from coremltools.optimize.torch._utils.fsdp_utils import ( FSDPAutoWrapPolicy, ModuleWrapPolicy, SizeBasedWrapPolicy, ) from coremltools.optimize.torch._utils.k_means import KMeansConfig from coremltools.optimize.torch.palettization.sensitive_k_means import ( ModuleSKMPalettizerConfig, SKMPalettizer, SKMPalettizerConfig, ) @pytest.mark.parametrize( "auto_wrap_policy", [ ModuleWrapPolicy(module_classes=torch.nn.Linear), SizeBasedWrapPolicy(min_num_params=1000), None, ], ) @pytest.mark.parametrize("num_sensitivity_workers", [1, 8]) @pytest.mark.parametrize("num_kmeans_workers", [1, 8]) def test_fsdp_auto_wrap_policy_compress_call( mocker, num_kmeans_workers, num_sensitivity_workers, auto_wrap_policy ): """ Test compress passes fsdp_auto_wrap_policy argument correctly to compute_sensitivity method. """ mock_compute_sensitivity = Mock(return_value={"weight": None}) mocker.patch.object(SKMPalettizer, "compute_sensitivity", mock_compute_sensitivity) mocker.patch("coremltools.optimize.torch.palettization.sensitive_k_means._ParallelKMeans") mocker.patch("coremltools.optimize.torch.palettization.sensitive_k_means._SequentialKMeans") model = torch.nn.Linear(5, 10) palettizer = SKMPalettizer(model) palettizer.compress( num_sensitivity_workers=num_sensitivity_workers, fsdp_auto_wrap_policy=auto_wrap_policy, num_kmeans_workers=num_kmeans_workers, ) mock_compute_sensitivity.assert_called_once_with( None, None, None, num_sensitivity_workers, fsdp_auto_wrap_policy=auto_wrap_policy, ) @pytest.mark.parametrize( "auto_wrap_policy", [ ModuleWrapPolicy(module_classes=torch.nn.Linear), SizeBasedWrapPolicy(min_num_params=1000), None, ], ) @pytest.mark.parametrize("num_sensitivity_workers", [1, 8]) def test_fsdp_auto_wrap_policy_compute_sensitivity_call( mocker, num_sensitivity_workers, auto_wrap_policy ): """ Test compute_sensitivity passes fsdp_auto_wrap_policy argument correctly to impl methods """ model = torch.nn.Linear(5, 10) mocker.patch("coremltools.optimize.torch.palettization.sensitive_k_means._torch.save") mocker.patch( "coremltools.optimize.torch.palettization.sensitive_k_means._torch.load", Mock(return_value=model.state_dict()), ) mocker.patch( "coremltools.optimize.torch.palettization.sensitive_k_means._torch.cuda.is_available", Mock(return_value=True), ) mock_ctx = Mock() mocker.patch( "coremltools.optimize.torch.palettization.sensitive_k_means._mp.get_context", Mock(return_value=mock_ctx), ) mock_compute_sen_single_worker = Mock() mocker.patch.object( SKMPalettizer, "_compute_sensitivity_impl_single_worker", mock_compute_sen_single_worker, ) mock_dataset = Mock() mocker.patch.object(SKMPalettizer, "_get_dataset", Mock(return_value=mock_dataset)) mocker.patch.object(SKMPalettizer, 
"_process_sensitivity") palettizer = SKMPalettizer(model) dataloader = [] loss_fn = lambda mod, dat: mod(dat) palettizer.compute_sensitivity( dataloader=dataloader, loss_fn=loss_fn, sensitivity_path=None, num_sensitivity_workers=num_sensitivity_workers, fsdp_auto_wrap_policy=auto_wrap_policy, ) if num_sensitivity_workers > 1: for rank in range(num_sensitivity_workers): mock_ctx.Process.assert_any_call( target=palettizer._compute_sensitivity_impl_multiple_workers, args=( rank, num_sensitivity_workers, mock_dataset, loss_fn, None, auto_wrap_policy, ), name=f"Process-{rank}", ) else: mock_compute_sen_single_worker.assert_called_once_with(mock_dataset, loss_fn, None) @pytest.mark.parametrize("auto_wrap_policy", [Mock(spec=FSDPAutoWrapPolicy), None]) def test_fsdp_auto_wrap_policy_multi_worker_compute_sensitivity_call(mocker, auto_wrap_policy): """ Test _compute_sensitivity_impl_multiple_workers passes correct value of fsdp auto wrap policy to FSDP call """ model = torch.nn.Linear(5, 10) mocker.patch("coremltools.optimize.torch.palettization.sensitive_k_means._torch") mocker.patch("coremltools.optimize.torch.palettization.sensitive_k_means._ddp_setup") mocker.patch( "coremltools.optimize.torch.palettization.sensitive_k_means._is_leader", Mock(return_value=True), ) mocker.patch.object( SKMPalettizer, "_register_grad_square_hooks", Mock(return_value=nullcontext()) ) if auto_wrap_policy is not None: expected_auto_wrap_policy = Mock() auto_wrap_policy.get_policy.return_value = expected_auto_wrap_policy else: expected_auto_wrap_policy = None with patch( "coremltools.optimize.torch.palettization.sensitive_k_means._FSDP", autospec=True ) as mock_fsdp: mock_fsdp.state_dict_type.return_value = nullcontext() palettizer = SKMPalettizer(model) palettizer._compute_sensitivity_impl_multiple_workers( rank=0, num_workers=1, dataset=[None], loss_fn=Mock(), sensitivity_path=None, fsdp_auto_wrap_policy=auto_wrap_policy, ) # test FSDP either gets None or output of get_policy method on the # FSDPAutoWrapPolicy object mock_fsdp.assert_called_with( module=palettizer._model, auto_wrap_policy=expected_auto_wrap_policy, sharding_strategy=ANY, use_orig_params=False, device_id=ANY, sync_module_states=True, ) @pytest.fixture() def model_for_compression() -> torch.nn.Module: return torch.nn.Sequential( OrderedDict( [ ("modconv", torch.nn.Conv2d(3, 10, (3, 3))), ("modlinear", torch.nn.Linear(2, 5)), ("multihead", torch.nn.MultiheadAttention(10, 5)), ("embedding", torch.nn.Embedding(100, 10)), ] ) ) @pytest.fixture() def sensitvity_dict_for_compression() -> Dict[str, Any]: return { "modconv.weight": Mock(), "modlinear.weight": Mock(), "multihead.in_proj_weight": Mock(), "multihead.out_proj.weight": Mock(), "embedding.weight": Mock(), } @pytest.fixture() def model_for_compression_custom_module() -> torch.nn.Module: class MyModule(torch.nn.Module): def __init__(self): super().__init__() self.weight = torch.nn.Parameter(data=torch.randn(5, 10)) return torch.nn.Sequential( OrderedDict( [ ("modconv", torch.nn.Conv2d(3, 10, (3, 3))), ("modlinear", torch.nn.Linear(2, 5)), ("multihead", torch.nn.MultiheadAttention(10, 5)), ("custom", MyModule()), ] ) ) @pytest.mark.parametrize( "model,sensitivity_dict,config,kmeans_keys", [ (torch.nn.Linear(5, 10), {"weight": None}, None, {"": "weight"}), ( "model_for_compression", "sensitvity_dict_for_compression", SKMPalettizerConfig( global_config=ModuleSKMPalettizerConfig(), module_name_configs={ "modconv": None, }, ), { "modlinear": "weight", "multihead": "in_proj_weight", "multihead.out_proj": 
"weight", "embedding": "weight", }, ), ( "model_for_compression", "sensitvity_dict_for_compression", SKMPalettizerConfig( global_config=ModuleSKMPalettizerConfig(), module_name_configs={ "mod.*": None, }, ), { "multihead": "in_proj_weight", "multihead.out_proj": "weight", "embedding": "weight", }, ), ( "model_for_compression", "sensitvity_dict_for_compression", SKMPalettizerConfig( global_config=ModuleSKMPalettizerConfig(), module_type_configs={torch.nn.Embedding: None}, ), { "modconv": "weight", "modlinear": "weight", "multihead": "in_proj_weight", "multihead.out_proj": "weight", }, ), ( "model_for_compression", "sensitvity_dict_for_compression", SKMPalettizerConfig( global_config=ModuleSKMPalettizerConfig(), module_type_configs={"MultiheadAttention": None}, module_name_configs={"multihead.out_proj": None}, ), {"modconv": "weight", "modlinear": "weight", "embedding": "weight"}, ), ( "model_for_compression_custom_module", "sensitvity_dict_for_compression", None, { "modconv": "weight", "modlinear": "weight", "multihead": "in_proj_weight", "multihead.out_proj": "weight", }, ), ], ) @pytest.mark.parametrize("num_kmeans_workers", [1, 8]) def test_compress_cluster_weights_call( mocker, num_kmeans_workers, model, sensitivity_dict, config, kmeans_keys, request ): """ Test ParallelKMeans/SequentialKMeans are called with correct arguments """ if isinstance(model, str): model = request.getfixturevalue(model) if isinstance(sensitivity_dict, str): sensitivity_dict = request.getfixturevalue(sensitivity_dict) mocker.patch.object(SKMPalettizer, "compute_sensitivity", Mock(return_value=sensitivity_dict)) mock_parallel = mocker.patch( "coremltools.optimize.torch.palettization.sensitive_k_means._ParallelKMeans" ) mock_sequential = mocker.patch( "coremltools.optimize.torch.palettization.sensitive_k_means._SequentialKMeans" ) palettizer = SKMPalettizer(model, config) palettizer.compress( num_sensitivity_workers=1, fsdp_auto_wrap_policy=None, num_kmeans_workers=num_kmeans_workers, ) k_means_config_dict = {} for key, val in kmeans_keys.items(): sensitivity_key = f"{key}.{val}" if len(key) > 0 else val k_means_config_dict[key] = { val: KMeansConfig( n_bits=ModuleSKMPalettizerConfig().n_bits, axis=0, block_size=None, cluster_dim=1, importance=sensitivity_dict[sensitivity_key], enable_per_channel_scale=ModuleSKMPalettizerConfig().enable_per_channel_scale, ) } if num_kmeans_workers > 1: mock_parallel.cluster_weights.assert_called_once_with( palettizer._model, k_means_config_dict, num_workers=num_kmeans_workers ) else: mock_sequential.cluster_weights.assert_called_once_with( palettizer._model, k_means_config_dict, ) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2855473 coremltools-8.0/coremltools/test/optimize/torch/pruning/0000755000000000000000000000000014672075535022461 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/pruning/__init__.py0000644000000000000000000000033314672066616024571 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/pruning/pruning_utils.py0000644000000000000000000000570314672066616025742 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os import numpy as np import torch import coremltools.test.optimize.torch.utils as utils batch_size = 128 def verify_global_pruning_amount(supported_modules, model, expected_sparsity): total_params = 0 unpruned_params = 0 for name, module in model.named_modules(): if type(module) in supported_modules: total_params += module.weight.numel() if hasattr(module, "weight_mask"): unpruned_params += torch.nonzero(module.weight_mask, as_tuple=False).size(0) else: unpruned_params += torch.nonzero(module.weight, as_tuple=False).size(0) actual_global_sparsity = 1 - unpruned_params / total_params np.testing.assert_allclose(actual_global_sparsity, expected_sparsity, atol=0.02) def train_and_eval_model(model, dataset, pruner, num_epochs, pass_loss=False): train_loader, test_loader = utils.setup_data_loaders(dataset, batch_size) optimizer = torch.optim.Adam(model.parameters(), eps=1e-07, weight_decay=1e-4) # train the model for epoch in range(num_epochs): model.train() for batch_idx, (data, target) in enumerate(train_loader): loss = utils.train_step(model, optimizer, train_loader, data, target, batch_idx, epoch) if pass_loss: pruner.step(epoch, loss) else: pruner.step() accuracy = utils.eval_model(model, test_loader) return accuracy def get_compression_ratio(model, pruner): # export the model import coremltools as ct model.eval() pruner.finalize(inplace=True) example_input = torch.rand(1, 1, 28, 28) traced_model = torch.jit.trace(model, example_input) converted_model = ct.convert( traced_model, convert_to="mlprogram", inputs=[ct.TensorType(shape=example_input.shape)], ) # save and get size converted_model.save("/tmp/converted_model_unpruned.mlpackage") unpruned_model_size = os.path.getsize( "/tmp/converted_model_unpruned.mlpackage/Data/com.apple.CoreML/weights/weight.bin") # compress the model pruned_model = ct.compression_utils.sparsify_weights(converted_model, mode="threshold_based", threshold=1e-12) # save and get size pruned_model.save("/tmp/converted_model_pruned.mlpackage") pruned_model_size = os.path.getsize( "/tmp/converted_model_pruned.mlpackage/Data/com.apple.CoreML/weights/weight.bin") compression_ratio = pruned_model_size/unpruned_model_size print(f"Compression ratio: {compression_ratio}") return compression_ratio def get_model_and_pruner(mnist_model, pruner_cls, pruner_config): model = mnist_model pruner = pruner_cls(model, pruner_config) pruner.prepare(inplace=True) return model, pruner ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/pruning/test_base_pruner.py0000644000000000000000000000323614672066616026403 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
#
# Use of this source code is governed by a BSD-3-clause license that can be
# found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause

from collections import OrderedDict

import pytest
import torch.nn as nn

from coremltools.optimize.torch._utils.metadata_utils import CompressionMetadata, CompressionType
from coremltools.optimize.torch.pruning import MagnitudePruner, MagnitudePrunerConfig


@pytest.mark.parametrize(
    "algorithm, config",
    [
        (MagnitudePruner, MagnitudePrunerConfig()),
    ],
)
def test_compression_metadata(algorithm, config):
    """
    Test that calling finalize on the module leads to compression metadata
    being added to the model
    """
    model = nn.Sequential(
        OrderedDict([("conv1", nn.Conv2d(3, 32, 3)), ("fc1", nn.Linear(32, 100))])
    )
    # Disable compression for Linear layer
    config = config.set_module_name("fc1", None)
    pruner = algorithm(model, config)
    pruner.prepare(inplace=True)
    pruner.step()
    pruner.finalize(inplace=True)
    # Verify metadata version is added to model
    assert "_COREML_/metadata_version" in model.state_dict()
    # Verify compression metadata is added for conv1
    metadata_dict = CompressionMetadata.from_state_dict(model.conv1.state_dict())
    assert len(metadata_dict) == 1
    assert "weight" in metadata_dict
    metadata = metadata_dict["weight"]
    assert metadata.compression_type == [CompressionType.pruning.value]
    # Verify no compression metadata is added for fc1
    metadata_dict = CompressionMetadata.from_state_dict(model.fc1.state_dict())
    assert len(metadata_dict) == 0
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/pruning/test_magnitude_pruner.py0000644000000000000000000005250314672066616027447 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved.
#
# Use of this source code is governed by a BSD-3-clause license that can be
# found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause

import copy
from collections import OrderedDict

import numpy as np
import pytest
import torch
import torch.nn as nn

from coremltools.optimize.torch._utils.metadata_utils import CompressionMetadata, CompressionType
from coremltools.optimize.torch.pruning import (
    MagnitudePruner,
    MagnitudePrunerConfig,
    ModuleMagnitudePrunerConfig,
)
from coremltools.optimize.torch.pruning._utils import n_m_mask


def _zero_loss(x, y):
    return torch.sum(x) * 0.0


def _mock_initializer(shape, dtype):
    # Each output channel is (entirely) an integer, increasing. This makes it so
    # that we know what to expect from the magnitude pruner.
output_channel_index = 0 num_output_channels = shape[output_channel_index] output_channel_values = np.arange(1, num_output_channels + 1, dtype=dtype) broadcast_shape = tuple(-1 if i == output_channel_index else 1 for i, _ in enumerate(shape)) output_channel_values_reshaped = np.reshape(output_channel_values, broadcast_shape) return torch.tensor(np.broadcast_to(output_channel_values_reshaped, shape)) def _create_module(): conv2d = torch.nn.Conv2d(in_channels=3, out_channels=4, kernel_size=(3, 3), bias=False, groups=1) conv2d.weight = torch.nn.Parameter(_mock_initializer(conv2d.weight.shape, np.float32)) activation = torch.nn.ReLU() return torch.nn.Sequential(OrderedDict([ ('conv2d', conv2d), ('activation', activation)])) def _create_large_module(): def _conv2d(): return torch.nn.Conv2d(8, 8, (3, 3), bias=False, groups=1) return torch.nn.Sequential(OrderedDict([ ('conv1', _conv2d()), ('conv2', _conv2d()), ('conv3', _conv2d()), ('flatten', torch.nn.Flatten()), ('linear1', torch.nn.Linear(2592, 100)), ('linear2', torch.nn.Linear(100, 10))])) @pytest.fixture def simple_module(): return _create_module() @pytest.fixture def large_module(): return _create_large_module() @pytest.fixture(scope="module") def sample_data(): X = np.asarray([np.random.uniform(0.0, 1.0, size=(3, 8, 8)).astype(np.float32) for _ in range(4)]) Y = np.asarray([np.random.uniform(0.0, 1.0, size=(4, 6, 6)).astype(np.float32) for _ in range(4)]) X, Y = torch.tensor(X), torch.tensor(Y) return X, Y @pytest.mark.parametrize("out_channels", [17, 127]) @pytest.mark.parametrize("block_size", [2, 3, 4]) def test_magnitude_pruner_nondivisible_block_size(out_channels, block_size): """ Test block sparsity when the number of channels is not divisible by block size """ conv2d = torch.nn.Conv2d(in_channels=3, out_channels=out_channels, kernel_size=(3, 3), bias=False, groups=1) weight_shape = tuple(conv2d.weight.shape) weight_tensor = torch.abs(torch.randn(*weight_shape)) weight_tensor[weight_tensor == 0] = 1.0 conv2d.weight = torch.nn.Parameter(weight_tensor) config = MagnitudePrunerConfig.from_dict( {"global_config": { "scheduler": {"update_steps": [1, 2]}, "initial_sparsity": 0.0, "target_sparsity": 0.5, "block_size": block_size, }}, ) pruner = MagnitudePruner(conv2d, config) conv2d = pruner.prepare() for _ in range(4): pruner.step() if block_size > 1: block_sparse_channels = out_channels - out_channels % block_size for idx in range(0, block_sparse_channels, block_size): for jdx in range(1, block_size): assert torch.all(conv2d.weight_mask[idx] == conv2d.weight_mask[idx + jdx]) sparsity = conv2d.weight_mask.eq(0).sum() / conv2d.weight_mask.numel() np.testing.assert_array_almost_equal(sparsity, 0.5, decimal=2) @pytest.mark.parametrize("out_channels", [8]) @pytest.mark.parametrize("block_size", [5, 8, 9]) def test_magnitude_pruner_morethanhalf_block_size(out_channels, block_size): """ Test block sparsity when the block size is greater than half the number of channels """ conv2d = torch.nn.Conv2d( in_channels=3, out_channels=out_channels, kernel_size=(3, 3), bias=False, groups=1, ) weight_tensor = torch.rand_like(conv2d.weight) weight_tensor[weight_tensor == 0] = 1.0 conv2d.weight.data = weight_tensor config = MagnitudePrunerConfig.from_dict( { "global_config": { "scheduler": {"update_steps": [1, 2]}, "initial_sparsity": 0.0, "target_sparsity": 0.5, "block_size": block_size, } }, ) pruner = MagnitudePruner(conv2d, config) conv2d = pruner.prepare() for _ in range(4): pruner.step() if block_size > 1: block_sparse_channels = out_channels - 
out_channels % block_size for idx in range(0, block_sparse_channels, block_size): for jdx in range(1, block_size): assert torch.all(conv2d.weight_mask[idx] == conv2d.weight_mask[idx + jdx]) sparsity = conv2d.weight_mask.eq(0).sum() / conv2d.weight_mask.numel() assert np.isclose(sparsity, 0.5, rtol=0.05) @pytest.mark.parametrize( "options", [("block_size", 2), ("granularity", "per_channel")], ) def test_magnitude_pruner_n_m_ratio_param_usage(options): param_name, val = options with pytest.raises(Exception): MagnitudePrunerConfig.from_dict( {"global_config": { "n_m_ratio": [3, 4], param_name: val}}, ) @pytest.mark.parametrize('config_dict', [ {"module_type_configs": {"Linear": {"block_size": 2}}}, {"module_name_configs": {"conv2d": {"block_size": 2}}}, {"global_config": {"block_size": 2}}, {}, ]) def test_magnitude_pruner_config_global_config_set(config_dict): config = MagnitudePrunerConfig.from_dict(config_dict) if len(config_dict) == 0: assert config.global_config == ModuleMagnitudePrunerConfig() else: keys = ["global_config", "module_type_configs", "module_name_configs"] for key in keys: if key not in config_dict: param_in_config = getattr(config, key) assert param_in_config is None or len(param_in_config) == 0 if "global_config" in config_dict: assert config.global_config.block_size == config_dict["global_config"]["block_size"] if "module_name_configs" in config_dict: for key in config_dict["module_name_configs"]: assert config.module_name_configs[key].block_size == \ config_dict["module_name_configs"][key]["block_size"] if "module_type_configs" in config_dict: for key in config_dict["module_type_configs"]: assert config.module_type_configs[key].block_size == \ config_dict["module_type_configs"][key]["block_size"] @pytest.mark.parametrize('out_channels', [16, 64]) @pytest.mark.parametrize('block_size', [1, 4, 8]) def test_magnitude_pruner_block_sparsity(out_channels, block_size): """ Test block sparsity structure is obtained by MagnitudePruner when block_size > 1 """ conv2d = torch.nn.Conv2d(in_channels=3, out_channels=out_channels, kernel_size=(3, 3), bias=False, groups=1) weight_shape = tuple(conv2d.weight.shape) weight_tensor = torch.abs(torch.randn(*weight_shape)) weight_tensor[weight_tensor == 0] = 1.0 conv2d.weight = torch.nn.Parameter(weight_tensor) config = MagnitudePrunerConfig.from_dict( {"global_config": { "scheduler": {"update_steps": [1, 2]}, "initial_sparsity": 0.0, "target_sparsity": 0.5, "block_size": block_size, }}, ) pruner = MagnitudePruner(conv2d, config) conv2d = pruner.prepare() for _ in range(4): pruner.step() if block_size > 1: for idx in range(0, out_channels, block_size): for jdx in range(1, block_size): assert torch.all(conv2d.weight_mask[idx] == conv2d.weight_mask[idx + jdx]) assert torch.sum(conv2d.weight_mask == 0).item() == int(0.5 * torch.numel(conv2d.weight)) def test_finalize(simple_module): """ Test that calling finalize on the module leads to param being replaced with param_orig * param_mask. 
""" config = MagnitudePrunerConfig.from_dict( {"global_config": { "scheduler": {"update_steps": [1, 2]}, "initial_sparsity": 0.0, "target_sparsity": 0.5, "granularity": "per_channel" }}, ) pruner = MagnitudePruner(simple_module, config) simple_module = pruner.prepare() for _ in range(4): pruner.step() pruner.finalize(inplace=True) assert torch.sum(simple_module.conv2d.weight[:2] == 0).item() == 54 assert torch.sum(simple_module.conv2d.weight[2] == 3).item() == 27 assert torch.sum(simple_module.conv2d.weight[3] == 4).item() == 27 def test_magnitude_pruning_correctness(simple_module): """ Test correctness of magnitude pruning. Initialize convolution weight with 4 output channels, with weights associated with channel `k` having integer value k+1 (k=0,...,3). We test that pruning twice indeed zeros out 3 output channels. """ config = MagnitudePrunerConfig.from_dict( {"global_config": { "scheduler": {"update_steps": [2, 3]}, "initial_sparsity": 0.0, "target_sparsity": 0.75, "granularity": "per_channel" }}, ) pruner = MagnitudePruner(simple_module, config) simple_module = pruner.prepare() # Perform 4 iterations: pruning should happen on steps 2 and 3 # step 1: No pruning pruner.step() np.testing.assert_equal(simple_module.conv2d.weight_mask.numpy(), np.array([1, 1, 1, 1], dtype=np.int32).reshape((4, 1, 1, 1))) # step 2: prune once, polynomial schedule will give new sparsity as 0.0, still no pruning pruner.step() np.testing.assert_equal(simple_module.conv2d.weight_mask.numpy(), np.array([1, 1, 1, 1], dtype=np.int32).reshape((4, 1, 1, 1))) # step 3: prune once again, polynomial schedule will give new sparsity as 1.0, 75% = 3 out of 4 # channels with least magnitude (first three channels) will be pruned out pruner.step() np.testing.assert_equal(simple_module.conv2d.weight_mask.numpy(), np.array([0, 0, 0, 1], dtype=np.int32).reshape((4, 1, 1, 1))) # step 4: prune once again, polynomial schedule sparsity stays at 0.75, no further pruning pruner.step() np.testing.assert_equal(simple_module.conv2d.weight_mask.numpy(), np.array([0, 0, 0, 1], dtype=np.int32).reshape((4, 1, 1, 1))) def test_magnitude_pruning_training_and_validation(simple_module, sample_data): """ Tests pruned weights are used for computing forward pass pruned module. Also demonstrates how pruner can be combined with training code in PyTorch, i.e, pruning can be done at a schedule different from training. Note: No actual training happens here because loss function is a no-op. 
""" config = MagnitudePrunerConfig.from_dict( {"global_config": { "scheduler": {"update_steps": [2, 3]}, "initial_sparsity": 0.0, "target_sparsity": 0.75, "granularity": "per_channel" }}, ) pruner = MagnitudePruner(simple_module, config) simple_module = pruner.prepare() # Train the model for 4 epochs num_epochs = 4 X, Y = sample_data simple_module.train() optimizer = torch.optim.Adam(params=simple_module.parameters(), lr=0.001) for _ in range(num_epochs): for inp, label in zip(X, Y): inp = inp.view(1, *X.shape[1:]) label = label.view(1, *Y.shape[1:]) output = simple_module(inp) loss = _zero_loss(output, label) optimizer.zero_grad() loss.backward() optimizer.step() pruner.step() # Test inference # After 4 iterations, pruner will zero out first 3 layers of conv2d layer in simple_module simple_module.eval() with torch.no_grad(): x_test = torch.tensor(np.random.uniform(0.0, 1.0, size=(1, 3, 8, 8)).astype(np.float32)) y_test = simple_module(x_test).detach().numpy() zero_output = y_test[:, :3, :, :] nonzero_output = y_test[:, 3:, :, :] np.testing.assert_equal(zero_output, np.zeros_like(zero_output)) assert np.any(np.abs(nonzero_output) > 0.0) @pytest.mark.parametrize('granularity', ["per_scalar", "per_channel"]) def test_magnitude_pruning_granularity_parameter_usage(simple_module, granularity): """ Tests MagnitudePruner creates mask of the correct shape depending on the granularity parameter. We set target sparsity to 1.0 so the mask should be all zeros after 4 iterations. """ config = MagnitudePrunerConfig.from_dict( {"global_config": { "scheduler": {"update_steps": [2, 3]}, "initial_sparsity": 0.5, "target_sparsity": 1.0, "granularity": granularity }}, ) pruner = MagnitudePruner(simple_module, config) simple_module = pruner.prepare() # Perform 4 iterations for _ in range(4): pruner.step() mask_data = simple_module.conv2d.weight_mask.numpy() # Pruning mask should be all zeros since the pruner should be at 100% sparsity. 
if granularity == "per_scalar": expected_mask_shape = (4, 3, 3, 3) else: assert granularity == "per_channel" expected_mask_shape = (4, 1, 1, 1) np.testing.assert_equal(mask_data, np.zeros(expected_mask_shape)) @pytest.mark.parametrize('granularity', ["per_scalar", "per_channel"]) def test_pruner_finalize(simple_module, granularity): config = MagnitudePrunerConfig.from_dict( {"global_config": { "scheduler": {"update_steps": [2, 3]}, "initial_sparsity": 0.5, "target_sparsity": 1.0, "granularity": granularity }}, ) pruner = MagnitudePruner(simple_module, config) simple_module = pruner.prepare() assert hasattr(simple_module.conv2d, "weight_mask") assert hasattr(simple_module.conv2d, "weight_orig") # Perform 4 iterations for _ in range(4): pruner.step() pruner.finalize(inplace=True) assert not hasattr(simple_module.conv2d, "weight_mask") assert not hasattr(simple_module.conv2d, "weight_orig") weight_data = simple_module.conv2d.weight.detach().numpy() np.testing.assert_equal(weight_data, np.zeros_like(weight_data)) # calling finalize again is a no-op pruner.finalize(inplace=True) @pytest.mark.parametrize("block_size", [1, 2]) @pytest.mark.parametrize("granularity", ["per_scalar", "per_channel"]) def test_sparsity_report_method(large_module, block_size, granularity): model = large_module target_sparsity = 0.5 config = MagnitudePrunerConfig.from_dict( {"global_config": { "scheduler": {"update_steps": [2, 3]}, "block_size": block_size, "initial_sparsity": 0.0, "target_sparsity": target_sparsity, "granularity": granularity }}, ) pruner = MagnitudePruner(model, config) pruner.prepare(inplace=True) inp = torch.ones(1, 8, 24, 24) for _ in range(4): model(inp) pruner.step() report = pruner.report() assert len(report) == 6 for sparsity in [val["unstructured_weight_sparsity"] for _, val in report.items()]: assert sparsity == pytest.approx(target_sparsity, 0.1) if block_size == 2: for sparsity in [val["block2_weight_sparsity"] for _, val in report.items()]: assert sparsity == pytest.approx(target_sparsity, 0.1) if granularity == "per_channel": for sparsity in [ val["structured_weight_sparsity"] for _, val in report.items() ][:3]: # only conv layers assert sparsity == pytest.approx(target_sparsity, 0.1) def test_sparsity_report_block2_sparsity_not_applicable(): model = torch.nn.Sequential(torch.nn.Conv2d(1, 31, 2, 1), torch.nn.Conv2d(31, 21, 2, 1)) target_sparsity = 0.5 config = MagnitudePrunerConfig.from_dict( {"global_config": { "scheduler": {"begin_step": 0}, "initial_sparsity": 0.0, "target_sparsity": target_sparsity, }}, ) pruner = MagnitudePruner(model, config) pruner.prepare(inplace=True) inp = torch.ones(1, 1, 28, 28) for _ in range(2): pruner.step() model(inp) report = pruner.report() assert len(report) == 3 for sparsity in [val["block2_weight_sparsity"] for _, val in report.items()]: assert sparsity == -1 def test_magnitude_pruner_cloning(simple_module): model = simple_module config = MagnitudePrunerConfig.from_dict( {"global_config": { "scheduler": {"update_steps": [0, 1]}, }}, ) pruner = MagnitudePruner(model, config) pruner.prepare(inplace=True) model_copy = copy.deepcopy(model) assert hasattr(model_copy.conv2d, "pruning_method") assert torch.all(model_copy.conv2d.weight_orig == model.conv2d.weight_orig) assert torch.all(model_copy.conv2d.weight_mask == model.conv2d.weight_mask) pruner.finalize(inplace=True) model_copy_finalize = copy.deepcopy(model) assert not hasattr(model_copy_finalize.conv2d, "pruning_method") assert torch.all(model_copy_finalize.conv2d.weight == model.conv2d.weight) 
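# ---------------------------------------------------------------------------
# Illustrative sketch (not part of the test suite): the docstring of
# test_magnitude_pruning_training_and_validation above describes combining the
# pruner with an ordinary PyTorch training loop, stepping the sparsity
# schedule alongside the optimizer. The helper below condenses that flow into
# one self-contained function, reusing the imports at the top of this file.
# The model, optimizer settings, and update_steps are hypothetical values
# chosen only for illustration, not a reference configuration.
# ---------------------------------------------------------------------------
def _example_magnitude_pruning_training_loop(num_steps=6):
    model = torch.nn.Sequential(
        OrderedDict(
            [
                ("conv", torch.nn.Conv2d(3, 16, (3, 3))),
                ("act", torch.nn.ReLU()),
            ]
        )
    )
    config = MagnitudePrunerConfig.from_dict(
        {
            "global_config": {
                "scheduler": {"update_steps": [2, 4]},  # hypothetical schedule
                "initial_sparsity": 0.0,
                "target_sparsity": 0.5,
            }
        }
    )
    pruner = MagnitudePruner(model, config)
    model = pruner.prepare()  # attaches weight_orig / weight_mask to prunable layers
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    for _ in range(num_steps):
        loss = model(torch.randn(1, 3, 8, 8)).abs().mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        pruner.step()  # advance the sparsity schedule once per optimizer step
    # Keeping pruner.step() separate from optimizer.step() is what lets pruning
    # run on its own schedule; finalize() bakes weight_orig * weight_mask into weight.
    pruner.finalize(inplace=True)
    return model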
@pytest.mark.parametrize('weights_shape', [[2, 8], [2, 8, 1, 1]]) @pytest.mark.parametrize('dim', [1, 0]) def test_nm_pruner_mask_computation(weights_shape, dim): weights = torch.tensor( [ [2, 3, 0, 4, 5, 9, 1, 1], [3, 6, 1, 0, 2, 3, 8, 9] ] ) if dim == 1: expected_mask = torch.tensor( [ [0, 1, 0, 1, 1, 1, 0, 0], [1, 1, 0, 0, 0, 0, 1, 1] ] ) nm = (2, 4) else: expected_mask = torch.tensor( [ [0, 0, 0, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 0, 1, 1] ] ) nm = (1, 2) if weights_shape == [2, 8, 1, 1]: weights = weights.reshape([2, 8, 1, 1]) expected_mask = expected_mask.reshape([2, 8, 1, 1]) mask = n_m_mask(weights, nm, dim=dim) np.testing.assert_array_equal(mask, expected_mask) @pytest.mark.parametrize("range_str", ["range(0, 25000, 100)", "range(0)"]) def test_polynomial_scheduler_range_str(range_str): pruner_config = MagnitudePrunerConfig.from_dict( {"global_config": {"scheduler": {"update_steps": range_str}}} ) update_steps_tensor = torch.tensor(list(eval(range_str))) assert torch.all( pruner_config.global_config.scheduler.update_steps == update_steps_tensor ) def test_nm_pruner_polynomial_scheduler(): model = torch.nn.Linear(8, 2) weights = torch.tensor( [[2, 3, 7, 4, 5, 8, 1, 6], [4, 5, 1, 6, 2, 3, 7, 8]], dtype=torch.float ) model.weight.data = weights data = torch.randn(1, 8) config = MagnitudePrunerConfig.from_dict( { "global_config": { "scheduler": {"update_steps": range(8), "power": 1}, "n_m_ratio": (7, 8), } } ) pruner = MagnitudePruner(model, config) model = pruner.prepare() for idx in range(7): pruner.step() model(data) for row in range(2): assert torch.count_nonzero(model.weight_mask[row]) == (7 - idx) def test_compression_metadata(): """ Test that calling finalize on the module leads to compression metadata being added to the model """ model = nn.Sequential( OrderedDict([("conv1", nn.Conv2d(3, 32, 3)), ("fc1", nn.Linear(32, 100))]) ) # Disable compression for Linear layer config = MagnitudePrunerConfig().set_module_name("fc1", None) pruner = MagnitudePruner(model, config) pruner.prepare(inplace=True) pruner.step() pruner.finalize(inplace=True) # Verify metadata version is added to model assert "_COREML_/metadata_version" in model.state_dict() # Verify compression metadata is added for conv1 metadata_dict = CompressionMetadata.from_state_dict(model.conv1.state_dict()) assert len(metadata_dict) == 1 assert "weight" in metadata_dict metadata = metadata_dict["weight"] assert metadata.compression_type == [CompressionType.pruning.value] # Verify no compression metadata is added for fc1 metadata_dict = CompressionMetadata.from_state_dict(model.fc1.state_dict()) assert len(metadata_dict) == 0 ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/pruning/test_pruning_scheduler.py0000644000000000000000000000653514672066616027623 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import sys import pytest import torch from coremltools.optimize.torch.pruning import ( ConstantSparsityScheduler, MagnitudePruner, MagnitudePrunerConfig, ModuleMagnitudePrunerConfig, PolynomialDecayScheduler, ) @pytest.fixture def simple_module(): return torch.nn.Conv2d(3, 3, (3, 3), bias=False, groups=1) @pytest.mark.skipif(sys.platform == "darwin", reason="temporarily disabled.") @pytest.mark.parametrize('steps_and_expected', [[[4, 7, 9], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.875, 0.875, 1.0, 1.0]], [[3], [0.0, 0.0, 1.0, 1.0]]]) def test_polynomial_decay_correctness(simple_module, steps_and_expected): """ Tests correctness of polynomial decay schedule. Note: Schedule can be stepped beyond the maximum step specified in update_steps. Beyond the max update step, the sparsity stays at the target sparsity. For example, in the first test case, we step 10 times, whereas max step is 9. At the 10th call to schedule.step, sparsity remains at 1.0. """ update_steps, expected_sparsitys = steps_and_expected config = MagnitudePrunerConfig().set_global( ModuleMagnitudePrunerConfig( scheduler=PolynomialDecayScheduler(update_steps=update_steps), initial_sparsity=0.0, target_sparsity=1.0, ) ) pruner = MagnitudePruner(simple_module, config) pruner.prepare(inplace=True) for expected in expected_sparsitys: pruner.step() assert pruner._pruner_info[''].sparsity_level == expected @pytest.mark.parametrize('steps', [[2.5, 6.5, 3.3], [[2, 3], [3, 5]], [-2, 0, 2]]) def test_polynomial_decay_initialization_failure(steps): with pytest.raises(Exception): PolynomialDecayScheduler(update_steps=steps) with pytest.raises(Exception): PolynomialDecayScheduler(update_steps=torch.tensor(steps)) @pytest.mark.skipif(sys.platform == "darwin", reason="temporarily disabled.") @pytest.mark.parametrize('step_and_target', [(4, 0.5), (0, 0.8)]) def test_constant_sparsity_correctness(simple_module, step_and_target): """ Tests correctness of spline schedule. Note: Schedule can be stepped beyond the maximum step specified in update_steps. Beyond the max update step, the sparsity stays at the target sparsity. For example, in the first test case, we step 10 times, whereas max step is 9. At the 10th call to schedule.step, sparsity remains at 1.0. """ begin_step, target_sparsity = step_and_target initial_sparsity = target_sparsity if begin_step == 0 else 0.0 config = MagnitudePrunerConfig().set_global( ModuleMagnitudePrunerConfig( scheduler=ConstantSparsityScheduler(begin_step=begin_step), initial_sparsity=initial_sparsity, target_sparsity=target_sparsity, ) ) pruner = MagnitudePruner(simple_module, config) pruner.prepare(inplace=True) for _ in range(begin_step): assert pruner._pruner_info[''].sparsity_level == initial_sparsity pruner.step() assert pruner._pruner_info[''].sparsity_level == target_sparsity ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2895474 coremltools-8.0/coremltools/test/optimize/torch/quantization/0000755000000000000000000000000014672075535023525 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/quantization/__init__.py0000644000000000000000000000033314672066616025635 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/quantization/test_configure.py0000644000000000000000000011633014672066616027123 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import operator from collections import OrderedDict from typing import List import pytest import torch import torch.ao.nn.quantized.reference import torch.ao.quantization import torch.nn as nn import torch.nn.intrinsic import torch.nn.intrinsic.qat import torch.nn.qat import torch.nn.quantized from coremltools.optimize.torch.quantization import ( LinearQuantizer, LinearQuantizerConfig, ModuleLinearQuantizerConfig, ) from coremltools.optimize.torch.quantization._backend_config import _mod_activations from coremltools.optimize.torch.quantization._qconfig_mapping import _QConfigMappingBuilder from coremltools.optimize.torch.quantization._utils import ( find_module, get_quant_range, is_activation_post_process, ) from coremltools.optimize.torch.quantization.modules import fused_modules as _fused from coremltools.optimize.torch.quantization.modules import qat_modules as _qat from coremltools.optimize.torch.quantization.modules import quantized_modules as _quantized from coremltools.optimize.torch.quantization.quantization_config import QuantizationScheme def get_configs_for_qscheme( activation_dtype=torch.quint8, weight_per_channel=True, weight_dtype=torch.qint8, ) -> List[LinearQuantizerConfig]: return [ LinearQuantizerConfig.from_dict( { "global_config": { "quantization_scheme": QuantizationScheme.symmetric, "milestones": [0, 0, 10, 10], "weight_dtype": weight_dtype, "activation_dtype": activation_dtype, "weight_per_channel": weight_per_channel, } } ), LinearQuantizerConfig.from_dict( { "global_config": { "quantization_scheme": QuantizationScheme.affine, "milestones": [0, 0, 10, 10], "weight_dtype": weight_dtype, "activation_dtype": activation_dtype, "weight_per_channel": weight_per_channel, } } ), ] def quantize_model(model, data, config=None): quantizer = LinearQuantizer(model, config) prepared_model = quantizer.prepare(example_inputs=(data,), inplace=False) quantizer.step() prepared_model(data) return prepared_model, quantizer def _verify_quant_range(fake_quant, weight_n_bits, weight_dtype): quant_min, quant_max = get_quant_range(n_bits=weight_n_bits, dtype=weight_dtype) assert fake_quant.quant_min == quant_min assert fake_quant.quant_max == quant_max @pytest.mark.parametrize( "model_config", [ ( nn.Sequential( OrderedDict( { "conv": nn.Conv2d(1, 20, (3, 3)), "relu": nn.ReLU(), } ) ), True, torch.nn.intrinsic.qat.ConvReLU2d, torch.ao.nn.intrinsic.ConvReLU2d, torch.ao.nn.quantized.reference.Conv2d, ), ( nn.Sequential( OrderedDict( { "conv": nn.ConvTranspose2d(1, 20, (3, 3)), "relu": nn.ReLU(), } ) ), False, _qat.ConvTransposeAct2d, _quantized.QuantizedConvTransposeAct2d, torch.ao.nn.quantized.reference.ConvTranspose2d, ), ], ) @pytest.mark.parametrize( "config", get_configs_for_qscheme() + get_configs_for_qscheme(weight_per_channel=False) + get_configs_for_qscheme(weight_dtype="qint8") + get_configs_for_qscheme(weight_dtype=torch.quint8), ) def test_conv_relu_fusion(config, model_config): 
( model, pytorch_builtin_mod, qat_mod_type, fused_quant_mod_type, ref_quant_mod_type, ) = model_config data = torch.randn(1, 1, 28, 28) prepared_model, quantizer = quantize_model(model, data, config) assert isinstance(prepared_model.conv, qat_mod_type) _verify_quant_range( prepared_model.conv.weight_fake_quant, weight_n_bits=config.global_config.weight_n_bits, weight_dtype=config.global_config.weight_dtype, ) converted_model = quantizer.finalize(inplace=False) assert isinstance(converted_model.conv, fused_quant_mod_type) assert isinstance( converted_model.conv[0] if pytorch_builtin_mod else converted_model.conv.conv, ref_quant_mod_type, ) @pytest.mark.parametrize( "config", get_configs_for_qscheme() + get_configs_for_qscheme(weight_per_channel=False) + get_configs_for_qscheme(weight_dtype="qint4") + get_configs_for_qscheme(weight_dtype="quint4"), ) @pytest.mark.parametrize("activation_fn", list(_mod_activations)) @pytest.mark.parametrize("conv_transpose", [False, True]) def test_conv_act_fusion(config, activation_fn, conv_transpose): model = nn.Sequential( OrderedDict( { "conv": ( nn.Conv2d(1, 20, (3, 3)) if not conv_transpose else nn.ConvTranspose2d(1, 20, (3, 3)) ), "act": activation_fn(), } ) ) data = torch.randn(1, 1, 28, 28) prepared_model, quantizer = quantize_model(model, data, config) if not conv_transpose: assert isinstance(prepared_model.conv, _qat.ConvAct2d) else: assert isinstance(prepared_model.conv, _qat.ConvTransposeAct2d) assert isinstance(prepared_model.conv.act, activation_fn) _verify_quant_range( prepared_model.conv.conv.weight_fake_quant, weight_n_bits=config.global_config.weight_n_bits, weight_dtype=config.global_config.weight_dtype, ) converted_model = quantizer.finalize(inplace=False) if not conv_transpose: assert isinstance(converted_model.conv, _quantized.QuantizedConvAct2d) else: assert isinstance(converted_model.conv, _quantized.QuantizedConvTransposeAct2d) assert isinstance(converted_model.conv.act, activation_fn) @pytest.mark.parametrize( "config", get_configs_for_qscheme() + get_configs_for_qscheme(weight_per_channel=False) + get_configs_for_qscheme(weight_dtype="qint4") + get_configs_for_qscheme(weight_dtype="quint4"), ) @pytest.mark.parametrize( "model_config", [ ( nn.Sequential( OrderedDict( { "conv": nn.Conv2d(1, 20, (3, 3)), "bn": nn.BatchNorm2d(20), "relu": nn.ReLU(), } ) ), True, torch.nn.intrinsic.qat.ConvBnReLU2d, torch.ao.nn.intrinsic.ConvReLU2d, torch.ao.nn.quantized.reference.Conv2d, ), ( nn.Sequential( OrderedDict( { "conv": nn.ConvTranspose2d(1, 20, (3, 3)), "bn": nn.BatchNorm2d(20), "relu": nn.ReLU(), } ) ), False, _qat.ConvTransposeBnAct2d, _quantized.QuantizedConvTransposeAct2d, torch.ao.nn.quantized.reference.ConvTranspose2d, ), ], ) def test_conv_bn_relu_fusion(config, model_config): ( model, pytorch_builtin_mod, qat_mod_type, fused_quant_mod_type, ref_quant_mod_type, ) = model_config data = torch.randn(1, 1, 28, 28) prepared_model, quantizer = quantize_model(model, data, config) assert isinstance(prepared_model.conv, qat_mod_type) _verify_quant_range( prepared_model.conv.weight_fake_quant, weight_n_bits=config.global_config.weight_n_bits, weight_dtype=config.global_config.weight_dtype, ) converted_model = quantizer.finalize(inplace=False) assert isinstance(converted_model.conv, fused_quant_mod_type) assert isinstance( converted_model.conv[0] if pytorch_builtin_mod else converted_model.conv.conv, ref_quant_mod_type, ) @pytest.mark.parametrize( "config", get_configs_for_qscheme() + get_configs_for_qscheme(weight_per_channel=False) + 
get_configs_for_qscheme(weight_dtype="qint4") + get_configs_for_qscheme(weight_dtype="quint4"), ) @pytest.mark.parametrize("activation_fn", list(_mod_activations)) @pytest.mark.parametrize("conv_transpose", [False, True]) def test_conv_bn_act_fusion(config, activation_fn, conv_transpose): model = nn.Sequential( OrderedDict( { "conv": ( nn.Conv2d(1, 20, (3, 3)) if not conv_transpose else nn.ConvTranspose2d(1, 20, (3, 3)) ), "bn": nn.BatchNorm2d(20), "act": activation_fn(), } ) ) data = torch.randn(1, 1, 28, 28) prepared_model, quantizer = quantize_model(model, data, config) if not conv_transpose: assert isinstance(prepared_model.conv, _qat.ConvBnAct2d) else: assert isinstance(prepared_model.conv, _qat.ConvTransposeBnAct2d) assert isinstance(prepared_model.conv.act, activation_fn) _verify_quant_range( prepared_model.conv.conv.weight_fake_quant, weight_n_bits=config.global_config.weight_n_bits, weight_dtype=config.global_config.weight_dtype, ) converted_model = quantizer.finalize(inplace=False) if not conv_transpose: assert isinstance(converted_model.conv, _quantized.QuantizedConvAct2d) else: assert isinstance(converted_model.conv, _quantized.QuantizedConvTransposeAct2d) assert isinstance(converted_model.conv.act, activation_fn) @pytest.mark.parametrize( "config", get_configs_for_qscheme() + get_configs_for_qscheme(weight_per_channel=False) + get_configs_for_qscheme(weight_dtype="qint4") + get_configs_for_qscheme(weight_dtype="quint4"), ) def test_linear_relu_fusion(config): model = nn.Sequential(OrderedDict({"linear": nn.Linear(20, 100), "act": nn.ReLU()})) data = torch.randn(1, 20) prepared_model, quantizer = quantize_model(model, data, config) assert isinstance(prepared_model.linear, torch.nn.intrinsic.qat.LinearReLU) _verify_quant_range( prepared_model.linear.weight_fake_quant, weight_n_bits=config.global_config.weight_n_bits, weight_dtype=config.global_config.weight_dtype, ) converted_model = quantizer.finalize(inplace=False) assert isinstance(converted_model.linear, torch.ao.nn.intrinsic.LinearReLU) assert isinstance(converted_model.linear[0], torch.ao.nn.quantized.reference.Linear) @pytest.mark.parametrize( "config", get_configs_for_qscheme() + get_configs_for_qscheme(weight_per_channel=False) + get_configs_for_qscheme(weight_dtype="qint4") + get_configs_for_qscheme(weight_dtype="quint4"), ) @pytest.mark.parametrize("activation_fn", list(_mod_activations)) def test_linear_act_fusion(config, activation_fn): model = nn.Sequential(OrderedDict({ 'linear': nn.Linear(20, 100), 'act': activation_fn(), })) data = torch.randn(1, 20) prepared_model, quantizer = quantize_model(model, data, config) assert isinstance(prepared_model.linear, _qat.LinearAct) assert isinstance(prepared_model.linear.act, activation_fn) _verify_quant_range( prepared_model.linear.linear.weight_fake_quant, weight_n_bits=config.global_config.weight_n_bits, weight_dtype=config.global_config.weight_dtype, ) converted_model = quantizer.finalize(inplace=False) assert isinstance(converted_model.linear, _quantized.QuantizedLinearAct) assert isinstance(converted_model.linear.act, activation_fn) @pytest.mark.parametrize("activation_fn", [torch.nn.ReLU, torch.nn.ReLU6]) @pytest.mark.parametrize( "layer_and_data", [ [nn.Conv2d(1, 20, (3, 3)), torch.randn(1, 1, 28, 28)], [nn.ConvTranspose2d(1, 20, (3, 3)), torch.randn(1, 1, 28, 28)], [nn.Linear(20, 100), torch.randn(1, 20)], ], ) @pytest.mark.parametrize("bn", [nn.BatchNorm2d(20), None]) def test_single_act_qscheme_for_symmetric(activation_fn, layer_and_data, bn): """ Tests that when 
qscheme is symmetric, always affine layers have affine qscheme """ layer, data = layer_and_data if (isinstance(layer, nn.Conv2d) or isinstance(layer, nn.ConvTranspose2d)) and bn is not None: model = nn.Sequential( OrderedDict( { "layer": layer, "bn": bn, "act": activation_fn(), } ) ) else: model = nn.Sequential(OrderedDict({ 'layer': layer, 'act': activation_fn(), })) prepared_model, _ = quantize_model(model, data) assert prepared_model.activation_post_process_0.qscheme == torch.per_tensor_symmetric assert prepared_model.activation_post_process_1.qscheme == torch.per_tensor_affine @pytest.mark.parametrize( "activation_fn", [torch.nn.Hardsigmoid, torch.nn.Sigmoid, torch.nn.Softmax, torch.nn.Tanh], ) @pytest.mark.parametrize( "layer_and_data", [ [nn.Conv2d(1, 20, (3, 3)), torch.randn(1, 1, 28, 28)], [nn.ConvTranspose2d(1, 20, (3, 3)), torch.randn(1, 1, 28, 28)], [nn.Linear(20, 100), torch.randn(1, 20)], ], ) @pytest.mark.parametrize("bn", [nn.BatchNorm2d(20), None]) @pytest.mark.parametrize("config", get_configs_for_qscheme()) def test_single_fixed_qparams_act_for_symmetric( activation_fn, layer_and_data, bn, config ): """ Tests that when qscheme is symmetric, the qparams of fixed qparam ops are maintained """ layer, data = layer_and_data if (isinstance(layer, nn.Conv2d) or isinstance(layer, nn.ConvTranspose2d)) and bn is not None: model = nn.Sequential( OrderedDict( { "layer": layer, "bn": bn, "act": activation_fn(), } ) ) else: model = nn.Sequential(OrderedDict({ 'layer': layer, 'act': activation_fn(), })) prepared_model, _ = quantize_model(model, data) builder = _QConfigMappingBuilder() qconfig = builder.get_default_qconfig_mapping( QuantizationScheme.symmetric, ModuleLinearQuantizerConfig(), ).object_type_qconfigs[activation_fn] assert prepared_model.activation_post_process_1.scale == qconfig.activation().scale assert prepared_model.activation_post_process_1.zero_point == qconfig.activation().zero_point @pytest.mark.parametrize("activation_fn", [nn.ReLU, nn.ReLU6]) @pytest.mark.parametrize("conv_transpose", [False, True]) def test_dropout_affine_input(activation_fn, conv_transpose): model = nn.Sequential( OrderedDict( { "conv": ( nn.Conv2d(1, 20, (3, 3)) if not conv_transpose else nn.ConvTranspose2d(1, 20, (3, 3)) ), "relu": activation_fn(), "dropout": nn.Dropout2d(), "leaky_relu": nn.LeakyReLU(), } ) ) data = torch.randn(1, 1, 28, 28) prepared_model, _ = quantize_model(model, data) assert prepared_model.activation_post_process_1.qscheme == torch.per_tensor_affine assert not hasattr(prepared_model, "activation_post_process_2") assert prepared_model.activation_post_process_3.qscheme == torch.per_tensor_symmetric def test_sequential_network_config_for_symmetric(mnist_model_quantization): """ Tests a sequential network with multiple modules is configured correctly. This network has layers where input and output observers are shared. 
We test that for these layers, we set acitvation quantizer correctly for always affine layers """ data = torch.randn(1, 1, 28, 28) prepared_model, quantizer = quantize_model(mnist_model_quantization, data) # verify module fusion assert isinstance(prepared_model.conv1, _qat.ConvBnAct2d) assert isinstance(prepared_model.conv2, _qat.ConvAct2d) assert isinstance(prepared_model.conv_transpose1, _qat.ConvTransposeBnAct2d) assert isinstance(prepared_model.conv_transpose2, _qat.ConvTransposeAct2d) assert isinstance(prepared_model.dense1, _qat.LinearAct) assert isinstance(prepared_model.dense2, _qat.LinearAct) # verify activation quantizers # after input assert prepared_model.activation_post_process_0.qscheme == torch.per_tensor_symmetric # after conv1 assert prepared_model.activation_post_process_1.qscheme == torch.per_tensor_affine # after pool, this is shared with output of conv1 assert id(prepared_model.activation_post_process_1) == id(prepared_model.activation_post_process_2) # after conv2 assert prepared_model.activation_post_process_3.qscheme == torch.per_tensor_affine # after pool, shared with output of conv2 assert id(prepared_model.activation_post_process_3) == id(prepared_model.activation_post_process_4) # after conv_transpose1 assert prepared_model.activation_post_process_5.qscheme == torch.per_tensor_affine # after pool, shared with output of conv_transpose1 assert id(prepared_model.activation_post_process_5) == id( prepared_model.activation_post_process_6 ) # after conv_transpose2 assert prepared_model.activation_post_process_7.qscheme == torch.per_tensor_symmetric # after flatten, shared with the output of conv_transpose2 assert id(prepared_model.activation_post_process_7) == id( prepared_model.activation_post_process_8 ) # after linear1 assert prepared_model.activation_post_process_9.qscheme == torch.per_tensor_affine # after dropout # we remove activation post process after dropout layer assert not hasattr(prepared_model, "activation_post_process_10") # after linear2, logsoftmax assert prepared_model.activation_post_process_11.qscheme == torch.per_tensor_symmetric # convert model and test fusion converted_model = quantizer.finalize(inplace=False) # assert converted module fusion assert isinstance(converted_model.conv1, _quantized.QuantizedConvAct2d) assert isinstance(converted_model.conv2, _quantized.QuantizedConvAct2d) assert isinstance(converted_model.conv_transpose1, _quantized.QuantizedConvTransposeAct2d) assert isinstance(converted_model.conv_transpose2, _quantized.QuantizedConvTransposeAct2d) assert isinstance(converted_model.dense1, _quantized.QuantizedLinearAct) assert isinstance(converted_model.dense2, _quantized.QuantizedLinearAct) class ConvBlock(nn.Module): def __init__(self, conv_transpose, activation): super().__init__() if conv_transpose: self.conv = nn.ConvTranspose2d(1, 20, (3, 3), padding=1) else: self.conv = nn.Conv2d(1, 20, (3, 3), padding="same") self.activation = activation def forward(self, x): return self.activation(self.conv(x)) class ResidualBlock(nn.Module): def __init__(self, conv_transpose: bool, activation: nn.Module): super().__init__() self.conv = ConvBlock(conv_transpose, activation) def forward(self, x): return x + self.conv(x) @pytest.mark.parametrize("activation_fn", [torch.nn.functional.relu, torch.nn.functional.relu_]) @pytest.mark.parametrize("conv_transpose", [False, True]) def test_functional_relu_qscheme_for_symmetric(activation_fn, conv_transpose): class Model(nn.Module): def __init__(self, conv_transpose): super().__init__() if not 
conv_transpose: self.conv1 = nn.Conv2d(1, 20, (3, 3), padding="same") self.conv2 = nn.Conv2d(20, 20, (3, 3), padding="same") else: self.conv1 = nn.Conv2d(1, 20, (3, 3), padding=1) self.conv2 = nn.Conv2d(20, 20, (3, 3), padding=1) def forward(self, x): return self.conv2(activation_fn(self.conv1(x))) model = Model(conv_transpose) data = torch.randn(1, 1, 28, 28) prepared_model, _ = quantize_model(model, data) if activation_fn == torch.nn.functional.relu: assert prepared_model.activation_post_process_1.qscheme == torch.per_tensor_affine else: assert prepared_model.activation_post_process_2.qscheme == torch.per_tensor_affine @pytest.mark.parametrize("activation_fn", [torch.nn.ReLU, torch.nn.ReLU6]) @pytest.mark.parametrize("conv_transpose", [False, True]) def test_addition_of_uint_and_uint_for_symmetric(activation_fn, conv_transpose): model = nn.Sequential( OrderedDict( { "previous_activation": nn.ReLU(), "res_block": ResidualBlock(conv_transpose, activation_fn()), } ) ) data = torch.randn(1, 1, 28, 28) prepared_model, _ = quantize_model(model, data) assert prepared_model.activation_post_process_0.qscheme == torch.per_tensor_symmetric affine_acts = [prepared_model.activation_post_process_1, prepared_model.activation_post_process_2, prepared_model.activation_post_process_3] for act in affine_acts: assert act.qscheme == torch.per_tensor_affine @pytest.mark.parametrize("activation_fn", [torch.nn.ReLU, torch.nn.ReLU6]) @pytest.mark.parametrize("conv_transpose", [False, True]) def test_addition_of_int_and_uint_for_symmetric(activation_fn, conv_transpose): model = nn.Sequential( OrderedDict( { "previous_activation": nn.LeakyReLU(), "res_block": ResidualBlock(conv_transpose, activation_fn()), } ) ) data = torch.randn(1, 1, 28, 28) prepared_model, _ = quantize_model(model, data) # relu shares observer with input, so input is affine as well symmetric_acts = [prepared_model.activation_post_process_0, prepared_model.activation_post_process_1, prepared_model.activation_post_process_3] for act in symmetric_acts: assert act.qscheme == torch.per_tensor_symmetric # output of conv block is still affine assert prepared_model.activation_post_process_2.qscheme == torch.per_tensor_affine class ComplexAdd(nn.Module): """ a (affine) + -> c (symmetric) b (symmetric) + -> g (symmetric) d (affine) + -> f (affine) e (affine) """ def __init__(self, activation_fn): super().__init__() self.lrelu = nn.LeakyReLU() self.relu1 = activation_fn() self.relu2 = activation_fn() self.relu3 = activation_fn() def forward(self, x): a = self.relu1(x) b = self.lrelu(x) d = self.relu2(x) e = self.relu3(x) c = a + b f = d + e g = c + f return g @pytest.mark.parametrize("activation_fn", [torch.nn.ReLU, torch.nn.ReLU6]) def test_complex_add(activation_fn): model = ComplexAdd(activation_fn) data = torch.randn(1, 1, 28, 28) prepared_model, _ = quantize_model(model, data) symmetric_acts = [prepared_model.activation_post_process_0, prepared_model.activation_post_process_2, prepared_model.activation_post_process_5, prepared_model.activation_post_process_7] for act in symmetric_acts: assert act.qscheme == torch.per_tensor_symmetric affine_acts = [prepared_model.activation_post_process_1, prepared_model.activation_post_process_3, prepared_model.activation_post_process_4, prepared_model.activation_post_process_6] for act in affine_acts: assert act.qscheme == torch.per_tensor_affine class ComplexConcatAdd(nn.Module): """ conv_c (uint) --- c. 
.`-- concat .--a2 conv_a (uint) ` `--a1 `-- add conv_b (int) ---- b ` """ def __init__(self, conv_transpose, activation_fn): super().__init__() self.conv_a = ConvBlock(conv_transpose, activation_fn()) self.conv_b = ConvBlock(conv_transpose, nn.LeakyReLU()) self.conv_c = ConvBlock(conv_transpose, activation_fn()) def forward(self, x): a1 = self.conv_a(x) b = self.conv_b(x) ab = a1 + b c = self.conv_c(x) ac = torch.cat([a1, c]) return ab, ac @pytest.mark.parametrize("activation_fn", [torch.nn.ReLU, torch.nn.ReLU6]) @pytest.mark.parametrize("conv_transpose", [False, True]) def test_complex_concat_add(activation_fn, conv_transpose): model = ComplexConcatAdd(conv_transpose, activation_fn) data = torch.randn(1, 1, 28, 28) prepared_model, _ = quantize_model(model, data) symmetric_acts = [prepared_model.activation_post_process_0, prepared_model.activation_post_process_2, prepared_model.activation_post_process_3] for act in symmetric_acts: assert act.qscheme == torch.per_tensor_symmetric affine_acts = [prepared_model.activation_post_process_1, prepared_model.activation_post_process_4, prepared_model.activation_post_process_5] for act in affine_acts: assert act.qscheme == torch.per_tensor_affine class ConcatBlock(nn.Module): def __init__(self, conv_transpose: bool, *activations: torch.nn.Module): super().__init__() self.branches = nn.ModuleList(ConvBlock(conv_transpose, act) for act in activations) def forward(self, x): return torch.cat(list(f(x) for f in self.branches)) @pytest.mark.parametrize("activation_fn", [torch.nn.ReLU, torch.nn.ReLU6, torch.nn.LeakyReLU]) @pytest.mark.parametrize("conv_transpose", [False, True]) def test_concat_uint_and_int(activation_fn, conv_transpose): model = ConcatBlock(conv_transpose, activation_fn(), nn.Identity()) data = torch.randn(1, 1, 28, 28) prepared_model, _ = quantize_model(model, data) symmetric_acts = [prepared_model.activation_post_process_0, prepared_model.activation_post_process_2] for act in symmetric_acts: assert act.qscheme == torch.per_tensor_symmetric # these are inputs and output of cat layer, they all share same activation quantization other_acts = [prepared_model.activation_post_process_1, prepared_model.activation_post_process_3, prepared_model.activation_post_process_4] for act in other_acts: if isinstance(activation_fn(), (torch.nn.ReLU, torch.nn.ReLU6)): assert act.qscheme == torch.per_tensor_affine else: assert act.qscheme == torch.per_tensor_symmetric assert id(prepared_model.activation_post_process_1) == id(prepared_model.activation_post_process_3) assert id(prepared_model.activation_post_process_3) == id(prepared_model.activation_post_process_4) @pytest.mark.parametrize( "config", get_configs_for_qscheme(activation_dtype=torch.float32) ) @pytest.mark.parametrize("activation_fn", list(_mod_activations) + [nn.ReLU]) @pytest.mark.parametrize("bn", [nn.BatchNorm2d(20), None]) @pytest.mark.parametrize("conv_transpose", [False, True]) def test_conv_weight_only_quantization(config, activation_fn, bn, conv_transpose): if bn is not None: model = nn.Sequential( OrderedDict( { "layer": ( nn.Conv2d(1, 20, (3, 3)) if not conv_transpose else nn.ConvTranspose2d(1, 20, (3, 3)) ), "bn": bn, "act": activation_fn(), } ) ) else: model = nn.Sequential( OrderedDict( { "layer": ( nn.Conv2d(1, 20, (3, 3)) if not conv_transpose else nn.ConvTranspose2d(1, 20, (3, 3)) ), "act": activation_fn(), } ) ) data = torch.randn(1, 1, 28, 28) prepared_model, quantizer = quantize_model(model, data, config) if bn is not None: if conv_transpose: assert 
isinstance(prepared_model.layer, _qat.ConvTransposeBnAct2d) else: assert isinstance(prepared_model.layer, _qat.ConvBnAct2d) or isinstance( prepared_model.layer, torch.nn.intrinsic.qat.ConvBnReLU2d ) else: if conv_transpose: assert isinstance(prepared_model.layer, _qat.ConvTransposeAct2d) else: assert isinstance(prepared_model.layer, _qat.ConvAct2d) or isinstance( prepared_model.layer, torch.nn.intrinsic.qat.ConvReLU2d ) assert len(list(prepared_model.children())) == 1 converted_model = quantizer.finalize(inplace=False) if conv_transpose: assert isinstance(converted_model.layer, _quantized.QuantizedConvTransposeAct2d) else: assert isinstance(converted_model.layer, _quantized.QuantizedConvAct2d) or isinstance( converted_model.layer[0], torch.ao.nn.quantized.reference.Conv2d ) @pytest.mark.parametrize("config", get_configs_for_qscheme(weight_dtype=torch.float32)) @pytest.mark.parametrize("activation_fn", list(_mod_activations) + [nn.ReLU]) @pytest.mark.parametrize("bn", [nn.BatchNorm2d(20), None]) def test_conv_activation_only_quantization(config, activation_fn, bn): if bn is not None: model = nn.Sequential( OrderedDict( { "layer": nn.Conv2d(1, 20, (3, 3)), "bn": bn, "act": activation_fn(), } ) ) else: model = nn.Sequential( OrderedDict( { "layer": nn.Conv2d(1, 20, (3, 3)), "act": activation_fn(), } ) ) data = torch.randn(1, 1, 28, 28) prepared_model, quantizer = quantize_model(model, data, config) if bn is not None: assert isinstance(prepared_model.layer, _qat.ConvBnAct2d) or isinstance( prepared_model.layer, torch.nn.intrinsic.qat.ConvBnReLU2d ) else: assert isinstance(prepared_model.layer, _qat.ConvAct2d) or isinstance( prepared_model.layer, torch.nn.intrinsic.qat.ConvReLU2d ) assert len(list(prepared_model.children())) == 3 assert isinstance( prepared_model.get_submodule("activation_post_process_0"), torch.ao.quantization.FakeQuantize, ) assert isinstance( prepared_model.get_submodule("activation_post_process_1"), torch.ao.quantization.FakeQuantize, ) converted_model = quantizer.finalize(inplace=False) assert isinstance(converted_model.layer, _quantized.QuantizedConvAct2d) or isinstance( converted_model.layer[0], torch.nn.Conv2d ) @pytest.mark.parametrize( "config", get_configs_for_qscheme(activation_dtype=torch.float32) ) @pytest.mark.parametrize("activation_fn", list(_mod_activations) + [nn.ReLU]) def test_linear_weight_only_quantization(config, activation_fn): model = nn.Sequential( OrderedDict( { "layer": nn.Linear(20, 100), "act": activation_fn(), } ) ) data = torch.randn(1, 20) prepared_model, quantizer = quantize_model(model, data, config) assert isinstance(prepared_model.layer, _qat.LinearAct) or isinstance( prepared_model.layer, torch.nn.intrinsic.qat.LinearReLU ) assert len(list(prepared_model.children())) == 1 converted_model = quantizer.finalize(inplace=False) assert isinstance( converted_model.layer, _quantized.QuantizedLinearAct ) or isinstance(converted_model.layer[0], torch.ao.nn.quantized.reference.Linear) @pytest.mark.parametrize("config", get_configs_for_qscheme(weight_dtype=torch.float32)) @pytest.mark.parametrize("activation_fn", list(_mod_activations) + [nn.ReLU]) def test_linear_activation_only_quantization(config, activation_fn): model = nn.Sequential( OrderedDict( { "layer": nn.Linear(20, 100), "act": activation_fn(), } ) ) data = torch.randn(1, 20) prepared_model, quantizer = quantize_model(model, data, config) assert isinstance(prepared_model.layer, _qat.LinearAct) or isinstance( prepared_model.layer, torch.nn.intrinsic.qat.LinearReLU ) assert 
len(list(prepared_model.children())) == 3 assert isinstance( prepared_model.get_submodule("activation_post_process_0"), torch.ao.quantization.FakeQuantize, ) assert isinstance( prepared_model.get_submodule("activation_post_process_1"), torch.ao.quantization.FakeQuantize, ) converted_model = quantizer.finalize(inplace=False) assert isinstance(converted_model.layer, _quantized.QuantizedLinearAct) or isinstance( converted_model.layer[0], torch.nn.Linear ) # @pytest.mark.parametrize("activation_dtype", [torch.float32, torch.quint8]) # TODO: Fix quantization of embedding layer when activation dtype is quint8 @pytest.mark.parametrize("activation_dtype", [torch.float32]) def test_embedding_layer_quantization(activation_dtype): model = nn.Sequential( OrderedDict( { "embedding": nn.Embedding(10, 10), "linear": nn.Linear(10, 10), } ) ) data = torch.randint(0, 10, (1, 10)) configs = get_configs_for_qscheme(activation_dtype) for config in configs: prepared_model, quantizer = quantize_model(model, data, config) assert isinstance(prepared_model.embedding, torch.nn.qat.Embedding) if activation_dtype == torch.float32: assert len(list(prepared_model.children())) == 2 else: assert len(list(prepared_model.children())) == 4 assert prepared_model.activation_post_process_0.dtype == torch.quint8 assert prepared_model.activation_post_process_1.dtype == torch.quint8 if config.global_config.quantization_scheme == QuantizationScheme.symmetric: assert ( prepared_model.embedding.weight_fake_quant.qscheme == torch.per_channel_symmetric ) else: assert ( prepared_model.embedding.weight_fake_quant.qscheme == torch.per_channel_affine ) converted_model = quantizer.finalize(inplace=False) assert isinstance( converted_model.embedding, torch.ao.nn.quantized.reference.Embedding ) assert isinstance( converted_model.linear, torch.ao.nn.quantized.reference.Linear ) @pytest.mark.parametrize("config", get_configs_for_qscheme()) @pytest.mark.parametrize("activation_fn", list(_mod_activations) + [nn.ReLU]) @pytest.mark.parametrize( "elementwise_op", [operator.add, torch.add, operator.mul, torch.mul, torch.matmul, torch.einsum], ) @pytest.mark.parametrize("conv_transpose", [False, True]) def test_elementwise_op_act_fusion(config, activation_fn, elementwise_op, conv_transpose): class ElementWiseActModule(torch.nn.Module): def __init__(self, conv_transpose): super().__init__() if conv_transpose: self.conv1 = torch.nn.ConvTranspose2d(48, 48, (3, 3), (1, 1), padding=(1, 1)) else: self.conv1 = torch.nn.Conv2d(48, 48, (3, 3), (1, 1), padding=(1, 1)) self.act = activation_fn() def forward(self, x): if elementwise_op == torch.einsum: return self.act( elementwise_op("bkhq,bchk->bchq", x.transpose(1, 3), self.conv1(x)) ) return self.act(elementwise_op(x, self.conv1(x))) model = ElementWiseActModule(conv_transpose) data = torch.randn(1, 48, 224, 224) prepared_model, quantizer = quantize_model(model, data, config) for node in prepared_model.graph.nodes: if node.op == "call_function": assert isinstance(find_module(prepared_model, node.next), activation_fn) assert is_activation_post_process( find_module(prepared_model, node.next.next) ) @pytest.mark.parametrize("quantization_scheme", ["symmetric", "affine"]) @pytest.mark.parametrize( "skipped_layers", [ ["conv1", "pool1"], ["conv2", "pool1", "pool2"], ["dense1", "flatten", "dropout"], ["dense2", "dropout"], ["conv_transpose1", "pool2", "pool3"], ["conv_transpose2", "pool3", "flatten"], ], ) def test_skipping_quantization_for_layers( mnist_model_quantization, quantization_scheme, skipped_layers ): 
config_s = LinearQuantizerConfig.from_dict( { "global_config": { "quantization_scheme": quantization_scheme, "milestones": [0, 0, 100, 100], }, "module_name_configs": { skipped_layer: None for skipped_layer in skipped_layers }, } ) config_f = LinearQuantizerConfig.from_dict( { "global_config": { "quantization_scheme": quantization_scheme, "milestones": [0, 0, 100, 100], } } ) data = torch.randn(1, 1, 28, 28) prepared_model_s, quantizer_s = quantize_model( mnist_model_quantization, data, config_s ) prepared_model_f, quantizer_f = quantize_model( mnist_model_quantization, data, config_f ) skipped_mod_name = skipped_layers[0] skipped_mod = mnist_model_quantization.get_submodule(skipped_mod_name) if isinstance(skipped_mod, nn.Conv2d): submod_s = prepared_model_s.get_submodule(skipped_mod_name) submod_f = prepared_model_f.get_submodule(skipped_mod_name) assert isinstance(submod_s, _fused.ConvBnAct2d) or isinstance( submod_s, _fused.ConvAct2d ) assert not hasattr(submod_s.conv, "weight_fake_quant") assert isinstance(submod_f, _qat.ConvBnAct2d) or isinstance( submod_f, _qat.ConvAct2d ) assert hasattr(submod_f.conv, "weight_fake_quant") elif isinstance(skipped_mod, nn.Linear): submod_s = prepared_model_s.get_submodule(skipped_mod_name) submod_f = prepared_model_f.get_submodule(skipped_mod_name) assert isinstance(submod_s, _fused.LinearAct) assert not hasattr(submod_s.linear, "weight_fake_quant") assert isinstance(submod_f, _qat.LinearAct) assert hasattr(submod_f.linear, "weight_fake_quant") for node in prepared_model_s.graph.nodes: if node.target == skipped_mod_name: for consumer in node.users: assert "activation_post_process" not in consumer.target for producer in node.args: assert "activation_post_process" not in producer.target for node in prepared_model_f.graph.nodes: if node.target == skipped_mod_name: for consumer in node.users: assert "activation_post_process" in consumer.target for producer in node.args: if producer.target != "dropout": # for some nodes, if producer is dropout, we won't have activation post process assert "activation_post_process" in producer.target ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/quantization/test_coreml_quantizer.py0000644000000000000000000001732714672066616030533 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from collections import OrderedDict from typing import Dict, Optional import pytest import torch import torch.nn as nn from torch.fx import Node from coremltools.optimize.torch.quantization.quantization_config import ( LinearQuantizerConfig, QuantizationScheme, ) from coremltools._deps import _HAS_TORCH_EXPORT_API if _HAS_TORCH_EXPORT_API: from torch._export import capture_pre_autograd_graph from torch.ao.quantization.quantize_pt2e import ( convert_pt2e, prepare_pt2e, prepare_qat_pt2e, ) _TORCH_VERSION = torch.__version__ _EXPECTED_TORCH_VERSION = '2.2.0' if _TORCH_VERSION >= _EXPECTED_TORCH_VERSION: from coremltools.optimize.torch.quantization._coreml_quantizer import CoreMLQuantizer activations = { nn.ReLU: { True: torch.ops.aten.relu_.default, False: torch.ops.aten.relu.default, }, nn.ReLU6: { True: torch.ops.aten.hardtanh_.default, False: torch.ops.aten.hardtanh.default, }, nn.LeakyReLU: { True: torch.ops.aten.leaky_relu_.default, False: torch.ops.aten.leaky_relu.default, }, nn.SiLU: { True: torch.ops.aten.silu_.default, False: torch.ops.aten.silu.default, }, nn.ELU: { True: torch.ops.aten.elu_.default, False: torch.ops.aten.elu.default, }, nn.CELU: { True: torch.ops.aten.celu_.default, False: torch.ops.aten.celu.default, }, nn.SELU: { True: torch.ops.aten.selu_.default, False: torch.ops.aten.selu.default, }, nn.Mish: { True: torch.ops.aten.mish_.default, False: torch.ops.aten.mish.default, }, nn.Hardtanh: { True: torch.ops.aten.hardtanh_.default, False: torch.ops.aten.hardtanh.default, }, nn.Hardswish: { True: torch.ops.aten.hardswish_.default, False: torch.ops.aten.hardswish.default, }, nn.Hardsigmoid: { True: torch.ops.aten.hardsigmoid_.default, False: torch.ops.aten.hardsigmoid.default, }, nn.GELU: { False: torch.ops.aten.gelu.default, }, nn.Sigmoid: { False: torch.ops.aten.sigmoid.default, }, nn.LogSigmoid: { False: torch.ops.aten.log_sigmoid.default, }, nn.Tanh: { False: torch.ops.aten.tanh.default, }, } @pytest.fixture(scope="module") def model_for_quant() -> torch.nn.Module: model_dict = OrderedDict() activation_dict = {} idx = 0 start_idx = idx for act_fn in activations: for inplace in activations[act_fn].keys(): inp_channels = 1 if idx == start_idx else 20 model_dict[f"conv_{idx}"] = torch.nn.Conv2d( inp_channels, 20, (3, 3), padding=(1, 1) ) model_dict[f"act_{idx}"] = act_fn(inplace=inplace) if inplace else act_fn() activation_dict[idx] = activations[act_fn][inplace] idx += 1 model_dict[f"conv_{idx}"] = torch.nn.Conv2d(20, 20, (3, 3), padding=(1, 1)) model_dict[f"bn_{idx}"] = nn.BatchNorm2d(20) model_dict[f"act_{idx}"] = act_fn(inplace=inplace) if inplace else act_fn() activation_dict[idx] = activations[act_fn][inplace] idx += 1 model_dict["flatten"] = torch.nn.Flatten(start_dim=2) start_idx = idx for act_fn in activations: for inplace in activations[act_fn].keys(): inp_channels = 784 if idx == start_idx else 20 model_dict[f"lin_{idx}"] = nn.Linear(inp_channels, 20) model_dict[f"act_{idx}"] = act_fn(inplace=inplace) if inplace else act_fn() activation_dict[idx] = activations[act_fn][inplace] idx += 1 model_dict[f"lin_{idx}"] = nn.Linear(20, 20) model_dict[f"bn_{idx}"] = nn.BatchNorm1d(20) model_dict[f"act_{idx}"] = act_fn(inplace=inplace) if inplace else act_fn() activation_dict[idx] = activations[act_fn][inplace] idx += 1 return nn.Sequential(model_dict) def get_node_map(model: torch.fx.GraphModule) -> Dict[str, 
Node]: """ Return a dictionary of node name to node """ node_map = {} for node in model.graph.nodes: node_map[node.name] = node return node_map @pytest.fixture(scope="module") def config(request) -> LinearQuantizerConfig: quantization_scheme, weight_per_channel, activation_dtype = request.param return LinearQuantizerConfig.from_dict( { "global_config": { "quantization_scheme": quantization_scheme, "milestones": [0, 0, 10, 10], "activation_dtype": activation_dtype, "weight_dtype": torch.qint8, "weight_per_channel": weight_per_channel, } } ) def quantize_model( model: nn.Module, data: torch.Tensor, quantization_config: Optional[LinearQuantizerConfig] = None, is_qat: bool = True, ): quantizer = CoreMLQuantizer(quantization_config) exported_model = capture_pre_autograd_graph(model, (data,)) if is_qat: prepared_model = prepare_qat_pt2e(exported_model, quantizer) else: prepared_model = prepare_pt2e(exported_model, quantizer) prepared_model(data) converted_model = convert_pt2e(prepared_model, use_reference_representation=False) return converted_model @pytest.mark.parametrize( "config", [ (QuantizationScheme.symmetric, True, torch.quint8), (QuantizationScheme.symmetric, True, torch.float32), ], indirect=True, ) @pytest.mark.parametrize("is_qat", [True, False]) @pytest.mark.skipif(not _HAS_TORCH_EXPORT_API or _TORCH_VERSION < _EXPECTED_TORCH_VERSION, reason="This test requires PyTorch Export APIs and PyTorch >= 2.2.0.") def test_weight_module_act_fusion(model_for_quant, is_qat, config): model = model_for_quant data = torch.randn(2, 1, 28, 28) converted_model = quantize_model(model, data, config, is_qat=is_qat) node_map = get_node_map(converted_model) mod_nodes = [torch.ops.aten.conv2d.default, torch.ops.aten.linear.default] activation_dtype = config.global_config.activation_dtype for node_name, node in node_map.items(): if node.target in mod_nodes: if activation_dtype == torch.float32: assert ( node.args[0].target != torch.ops.quantized_decomposed.dequantize_per_tensor.default ) else: assert ( node.args[0].target == torch.ops.quantized_decomposed.dequantize_per_tensor.default ) assert ( node.args[1].target == torch.ops.quantized_decomposed.dequantize_per_channel.default ) assert len(node.users) == 1 act_node = list(node.users.keys())[0] if act_node.target == torch.ops.aten._native_batch_norm_legit.default: act_node = act_node.next.next assert len(act_node.users) == 1 post_act_node = list(act_node.users.keys())[0] if activation_dtype == torch.float32: assert ( post_act_node.target != torch.ops.quantized_decomposed.quantize_per_tensor.default ) else: assert ( post_act_node.target == torch.ops.quantized_decomposed.quantize_per_tensor.default ) # necessary to clear cache, otherwise tests fail with cache_size_limit reached torch._dynamo.reset() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/quantization/test_post_training_quantization.py0000644000000000000000000002164714672066616032636 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import numpy as np import pytest import torch ct = pytest.importorskip("coremltools") pytest.importorskip("coremltools.optimize.coreml._utils") from coremltools.optimize.torch.optimization_config import QuantizationGranularity from coremltools.optimize.torch.quantization import ( PostTrainingQuantizer, PostTrainingQuantizerConfig, QuantizationScheme, ) np.random.seed(0) torch.manual_seed(0) def get_rmse(a, b): return torch.norm(torch.abs(a - b)) def get_atol_rtol(block_size, weight_n_bits): if block_size is None: block_size = 0 if block_size == 1: # With block_size == 1, the information loss is minimum. atol, rtol = 1e-02, 1e-02 elif weight_n_bits >= 4 and block_size < 3: # When block size is small and nbits is large, the information loss is limited. atol, rtol = 3e-02, 3e-02 elif weight_n_bits <= 2 and block_size >= 2: atol, rtol = 0.5, 0.5 else: atol, rtol = 0.4, 0.4 return (atol, rtol) def test_ptq_default_config(): config = PostTrainingQuantizerConfig() ptq = PostTrainingQuantizer(torch.nn.Identity(), config) assert ptq is not None assert config.global_config.block_size is None assert config.global_config.weight_dtype == torch.int8 assert config.global_config.quantization_scheme == QuantizationScheme.symmetric assert config.global_config.weight_dtype == torch.int8 assert config.global_config.granularity == QuantizationGranularity.per_channel @pytest.mark.parametrize( "module", [ torch.nn.Linear(10, 10), torch.nn.Conv2d(10, 10, 3, 3), torch.nn.ConvTranspose2d(10, 20, 3, 3), torch.nn.Conv2d(20, 10, 3, 3), torch.nn.MultiheadAttention( bias=True, embed_dim=6, num_heads=3, add_bias_kv=True, kdim=1, vdim=1, ), torch.nn.MultiheadAttention( bias=True, embed_dim=6, num_heads=3, add_bias_kv=True, kdim=None, vdim=None, ), ], ) @pytest.mark.parametrize( "granularity_block_size", [ ("per_channel", None), ("per_tensor", None), ("per_block", 2), ("per_block", 5), ("per_block", (2,)), ("per_block", (5,)), ("per_block", (5, 2)), ("per_block", (2, 5)), ], ) @pytest.mark.parametrize("quantization_scheme", ["symmetric", "affine"]) @pytest.mark.parametrize("weight_dtype", ["int8", "int4", "uint8", "uint4"]) def test_ptq_compress_all_combinations( module, quantization_scheme, granularity_block_size, weight_dtype, ): granularity, block_size = granularity_block_size config = PostTrainingQuantizerConfig.from_dict( { "global_config": { "quantization_scheme": quantization_scheme, "granularity": granularity, "weight_dtype": weight_dtype, "block_size": block_size, } } ) ptq = PostTrainingQuantizer(module, config) module = ptq.compress() @pytest.mark.parametrize("quantization_scheme", ["symmetric", "affine"]) @pytest.mark.parametrize( "granularity_block_size", [ ("per_channel", None), ("per_tensor", None), ("per_block", 2), ("per_block", 5), ], ) @pytest.mark.parametrize("weight_dtype", ["int4", "int8"]) @pytest.mark.parametrize( "module", [ torch.nn.Conv1d(10, 10, 3, 3), torch.nn.Conv2d(10, 10, 3, 3), torch.nn.Conv3d(10, 10, 3, 3), torch.nn.Linear(10, 10), torch.nn.ConvTranspose1d(10, 20, 3, 3), torch.nn.ConvTranspose2d(10, 20, 3, 3), torch.nn.ConvTranspose3d(10, 20, 3, 3), ], ) def test_ptq_post_compress_conv_linear( quantization_scheme, granularity_block_size, weight_dtype, module ): granularity, block_size = granularity_block_size orig_weight = module.weight.clone() config = PostTrainingQuantizerConfig.from_dict( { "global_config": { "weight_dtype": 
weight_dtype, "granularity": granularity, "block_size": block_size, "quantization_scheme": quantization_scheme, } } ) ptq = PostTrainingQuantizer(module, config) module = ptq.compress() if isinstance( module, ( torch.nn.ConvTranspose1d, torch.nn.ConvTranspose2d, torch.nn.ConvTranspose3d, ), ): ch_axis = 1 block_axis = 0 elif isinstance( module, ( torch.nn.Linear, torch.nn.Conv1d, torch.nn.Conv2d, torch.nn.Conv3d, ), ): ch_axis = 0 block_axis = 1 else: raise NotImplementedError assert hasattr(module, "_COREML_/weight/quantization_scale") if quantization_scheme == "affine": assert hasattr(module, "_COREML_/weight/zero_point") if granularity in ["per_channel", "per_block"]: assert ( getattr(module, "_COREML_/weight/quantization_scale").shape[ch_axis] == module.weight.shape[ch_axis] ) if quantization_scheme == "affine": assert ( getattr(module, "_COREML_/weight/zero_point").shape[ch_axis] == module.weight.shape[ch_axis] ) if granularity == "per_block": assert ( getattr(module, "_COREML_/weight/quantization_scale").shape[block_axis] == module.weight.shape[block_axis] / block_size ) if quantization_scheme == "affine": assert ( getattr(module, "_COREML_/weight/zero_point").shape[block_axis] == module.weight.shape[block_axis] / block_size ) assert not torch.equal(orig_weight, module.weight) atol, rtol = get_atol_rtol(block_size, config.global_config.weight_n_bits) np.testing.assert_allclose( orig_weight.detach().numpy(), module.weight.detach().numpy(), atol=atol, rtol=rtol, ) @pytest.mark.parametrize("quantization_scheme", ["symmetric", "affine"]) @pytest.mark.parametrize( "granularity_block_size", [ ("per_channel", None), ("per_tensor", None), ("per_block", 2), ("per_block", 5), ], ) @pytest.mark.parametrize("weight_dtype", ["int4", "int8"]) def test_ptq_post_compress_multihead( quantization_scheme, granularity_block_size, weight_dtype, ): granularity, block_size = granularity_block_size module = torch.nn.MultiheadAttention( bias=True, embed_dim=10, num_heads=10, add_bias_kv=True, kdim=None, vdim=None, ) assert hasattr(module, "in_proj_weight") assert hasattr(module.out_proj, "weight") orig_in_proj_weight = module.in_proj_weight.clone() orig_out_proj_weight = module.out_proj.weight.clone() config = PostTrainingQuantizerConfig.from_dict( { "global_config": { "weight_dtype": weight_dtype, "quantization_scheme": quantization_scheme, "granularity": granularity, "block_size": block_size, } } ) ptq = PostTrainingQuantizer(module, config) module = ptq.compress() assert hasattr(module, "_COREML_/in_proj_weight/quantization_scale") assert hasattr(module.out_proj, "_COREML_/weight/quantization_scale") if quantization_scheme == "affine": assert hasattr(module, "_COREML_/in_proj_weight/zero_point") assert hasattr(module.out_proj, "_COREML_/weight/zero_point") assert not torch.equal(orig_in_proj_weight, module.in_proj_weight) assert not torch.equal(orig_out_proj_weight, module.out_proj.weight) atol, rtol = get_atol_rtol(block_size, config.global_config.weight_n_bits) np.testing.assert_allclose( orig_in_proj_weight.detach().numpy(), module.in_proj_weight.detach().numpy(), atol=atol, rtol=rtol, ) np.testing.assert_allclose( orig_out_proj_weight.detach().numpy(), module.out_proj.weight.detach().numpy(), atol=atol, rtol=rtol, ) def test_ptq_compression_metadata(): config = PostTrainingQuantizerConfig() ptq = PostTrainingQuantizer(torch.nn.Linear(10, 10), config) model = ptq.compress() from coremltools.optimize.torch._utils.metadata_utils import CompressionType assert hasattr(model, 
"_COREML_/weight/compression_type") assert torch.IntTensor([CompressionType.quantization.value]) == getattr( model, "_COREML_/weight/compression_type" ) assert torch.IntTensor([8]) == getattr(model, "_COREML_/weight/quantization_n_bits") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/quantization/test_quantizer.py0000644000000000000000000004076314672066616027172 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from collections import OrderedDict import cattrs import pytest import torch import torch.ao.quantization import torch.nn as nn import torch.nn.intrinsic import torch.nn.intrinsic.qat import torch.nn.quantized import torch.nn.quantized.modules.utils import coremltools.optimize.torch.quantization.modules.qat_modules as _qat from coremltools.optimize.torch._utils.metadata_utils import CompressionMetadata, CompressionType from coremltools.optimize.torch.quantization import ( LinearQuantizer, LinearQuantizerConfig, ModuleLinearQuantizerConfig, QuantizationScheme, ) @pytest.mark.parametrize( "option_and_value", [ ("weight_dtype", torch.int32), ("activation_dtype", torch.int8), ("milestones", [0, 2]) ] ) def test_config_illegal_options(option_and_value): option, value = option_and_value with pytest.raises(cattrs.errors.ClassValidationError): LinearQuantizerConfig.from_dict({"global_config": {option: value}}) @pytest.mark.parametrize( "config_dict", [ {"module_type_configs": {nn.Linear: {"weight_dtype": torch.quint8}}}, {"module_type_configs": {nn.ConvTranspose2d: {"weight_dtype": torch.quint8}}}, {"module_name_configs": {"conv2d": {"weight_dtype": torch.quint8}}}, {"global_config": {"weight_dtype": torch.quint8}}, {}, ], ) def test_linear_quantizer_config_global_config_set(config_dict): config = LinearQuantizerConfig.from_dict(config_dict) if len(config_dict) == 0: assert config.global_config == ModuleLinearQuantizerConfig() else: keys = ["global_config", "module_type_configs", "module_name_configs"] for key in keys: if key not in config_dict: param_in_config = getattr(config, key) assert param_in_config is None or len(param_in_config) == 0 if "global_config" in config_dict: assert config.global_config.weight_dtype == config_dict["global_config"]["weight_dtype"] if "module_name_configs" in config_dict: for key in config_dict["module_name_configs"]: assert config.module_name_configs[key].weight_dtype == \ config_dict["module_name_configs"][key]["weight_dtype"] if "module_type_configs" in config_dict: for key in config_dict["module_type_configs"]: assert config.module_type_configs[key].weight_dtype == \ config_dict["module_type_configs"][key]["weight_dtype"] @pytest.mark.parametrize( "config_dict", [ { "global_config": {"quantization_scheme": "affine"}, "module_name_configs": {"conv1": {"quantization_scheme": "symmetric"}}, }, { "global_config": {"quantization_scheme": "affine"}, "module_type_configs": {nn.Linear: {"quantization_scheme": "symmetric"}}, }, { "module_name_configs": { "conv1": {"quantization_scheme": "affine"}, "conv2": {"quantization_scheme": "symmetric"}, } }, { "module_type_configs": { nn.Linear: {"quantization_scheme": "symmetric"}, "Conv2d": {"quantization_scheme": "affine"}, "ConvTranspose2d": {"quantization_scheme": "affine"}, } }, { "module_type_configs": {nn.Linear: {"quantization_scheme": 
"symmetric"}}, "module_name_configs": {"conv1": {"quantization_scheme": "affine"}}, }, ], ) def test_linear_quantizer_config_failure_modes(config_dict): with pytest.raises(Exception): LinearQuantizerConfig.from_dict(config_dict) def test_linear_quantizer_config_different_config_success(): config_dict = { "global_config": {"quantization_scheme": "affine"}, "module_name_configs": { "conv1": {"quantization_scheme": "affine"}, "conv2": {"quantization_scheme": "affine"}, }, "module_type_configs": { nn.Linear: {"quantization_scheme": "affine"}, nn.ConvTranspose2d: {"quantization_scheme": "affine"}, }, } LinearQuantizerConfig.from_dict(config_dict) @pytest.mark.parametrize( "config_dict", [ { "global_config": {"quantization_scheme": "affine"}, "module_name_configs": { "conv1": {"quantization_scheme": "affine"}, "conv2": {"quantization_scheme": "affine"}, }, "module_type_configs": {nn.Linear: {"quantization_scheme": "affine"}}, }, { "module_name_configs": { "conv1": {"quantization_scheme": "affine"}, "conv2": {"quantization_scheme": "affine"}, } }, {"module_type_configs": {nn.Linear: {"quantization_scheme": "affine"}}}, {}, ], ) def test_linear_quantizer_quantization_scheme_setting(config_dict): model = nn.Sequential(OrderedDict({ 'conv': nn.Conv2d(1, 20, (3, 3)), 'relu': nn.ReLU(), })) config = LinearQuantizerConfig.from_dict(config_dict) quantizer = LinearQuantizer(model, config) def_quantization_scheme = ModuleLinearQuantizerConfig().quantization_scheme.value quantization_scheme = quantizer._quantization_scheme.value if len(config_dict) == 0: assert def_quantization_scheme == quantization_scheme else: assert quantization_scheme == "affine" @pytest.mark.parametrize( "model_config", [ ( nn.Sequential( OrderedDict( { "conv": nn.Conv2d(1, 20, (3, 3)), "relu": nn.ReLU(), } ) ), torch.nn.intrinsic.qat.ConvReLU2d, ), ( nn.Sequential( OrderedDict( { "conv": nn.ConvTranspose2d(1, 20, (3, 3)), "relu": nn.ReLU(), } ) ), _qat.ConvTransposeAct2d, ), ], ) @pytest.mark.parametrize("quantization_scheme", ["symmetric", "affine"]) def test_activation_defaults(quantization_scheme, model_config): config = LinearQuantizerConfig.from_dict( {"global_config": { "quantization_scheme": quantization_scheme, "milestones": [0, 2, 3, 3], }} ) model, model_conv_instance = model_config quantizer = LinearQuantizer(model, config) model = quantizer.prepare(example_inputs=(torch.randn(1, 1, 28, 28),)) assert isinstance(model.conv, model_conv_instance) assert model.activation_post_process_0.dtype == torch.quint8 if quantization_scheme == "symmetric": assert model.activation_post_process_0.qscheme == torch.per_tensor_symmetric else: assert model.activation_post_process_0.qscheme == torch.per_tensor_affine assert model.activation_post_process_1.dtype == torch.quint8 assert model.activation_post_process_1.qscheme == torch.per_tensor_affine @pytest.mark.parametrize( "model_config", [ ( nn.Sequential( OrderedDict( { "conv": nn.Conv2d(1, 20, (3, 3)), "bn": nn.BatchNorm2d(20), "relu": nn.ReLU(), } ) ), True, ), ( nn.Sequential( OrderedDict( { "conv": nn.ConvTranspose2d(1, 20, (3, 3)), "bn": nn.BatchNorm2d(20), "relu": nn.ReLU(), } ) ), False, ), ], ) @pytest.mark.parametrize("quantization_scheme", ["symmetric", "affine"]) def test_quantizer_step_mechanism(quantization_scheme, model_config): config = LinearQuantizerConfig.from_dict( {"global_config": { "quantization_scheme": quantization_scheme, "milestones": [0, 1, 2, 3], }} ) model, pytorch_builtin_mod = model_config quantizer = LinearQuantizer(model, config) model = 
quantizer.prepare(example_inputs=(torch.randn(1, 1, 28, 28),)) if pytorch_builtin_mod: bn_module_to_check = model.conv else: bn_module_to_check = model.conv.conv assert not model.activation_post_process_0.observer_enabled assert not model.activation_post_process_0.fake_quant_enabled assert not model.activation_post_process_1.observer_enabled assert not model.activation_post_process_1.fake_quant_enabled for idx in range(4): quantizer.step() if idx == 0: assert not getattr(bn_module_to_check, "freeze_bn") assert model.activation_post_process_0.observer_enabled assert not model.activation_post_process_0.fake_quant_enabled assert model.activation_post_process_1.observer_enabled assert not model.activation_post_process_1.fake_quant_enabled if idx == 1: assert not getattr(bn_module_to_check, "freeze_bn") assert model.activation_post_process_0.observer_enabled assert model.activation_post_process_0.fake_quant_enabled assert model.activation_post_process_1.observer_enabled assert model.activation_post_process_1.fake_quant_enabled if idx == 2: assert not getattr(bn_module_to_check, "freeze_bn") assert not model.activation_post_process_0.observer_enabled assert model.activation_post_process_0.fake_quant_enabled assert not model.activation_post_process_1.observer_enabled assert model.activation_post_process_1.fake_quant_enabled if idx == 3: assert getattr(bn_module_to_check, "freeze_bn") assert not model.activation_post_process_0.observer_enabled assert model.activation_post_process_0.fake_quant_enabled assert not model.activation_post_process_1.observer_enabled assert model.activation_post_process_1.fake_quant_enabled @pytest.mark.parametrize( "model_dict", [ OrderedDict( { "conv": nn.Conv2d(1, 20, (3, 3)), "bn": nn.BatchNorm2d(20), "relu": nn.ReLU(), } ), OrderedDict( { "conv": nn.ConvTranspose2d(1, 20, (3, 3)), "bn": nn.BatchNorm2d(20), "relu": nn.ReLU(), } ), ], ) def test_preserved_attributes(model_dict): """ Test if methods and attributes on the model are preserved by passing preserved_attributes to the config. 
""" class MyModel(nn.Sequential): def __init__(self, model_dict): super().__init__(model_dict) self.conv.weight.data = torch.ones_like(self.conv.weight.data) def my_method(self): return self.weight + torch.ones_like(self.weight) @property def weight(self): return ( self.conv.weight if hasattr(self.conv, "weight") else self.conv.get_submodule("0").weight ) preserved_attrs = ["key_1", "key_2", "my_method", "weight"] model = MyModel(model_dict) model.key_1 = 5 model.key_2 = torch.tensor(5) config = LinearQuantizerConfig.from_dict( { "global_config": { "milestones": [0, 3, 4, 5], }, "preserved_attributes": preserved_attrs, } ) quantizer_1 = LinearQuantizer(model, LinearQuantizerConfig()) prepared_model = quantizer_1.prepare(example_inputs=(torch.randn(1),), inplace=False) for attr in preserved_attrs: assert not hasattr(prepared_model, attr) quantizer_2 = LinearQuantizer(model, config) prepared_model = quantizer_2.prepare(example_inputs=(torch.randn(1),), inplace=False) for attr in preserved_attrs: assert hasattr(prepared_model, attr) assert torch.all( prepared_model.my_method() == 2 * torch.ones_like(prepared_model.conv.weight.data) ) quantizer_2.step() prepared_model(torch.randn(2, 1, 28, 28)) final_model = quantizer_2.finalize() for attr in preserved_attrs: assert hasattr(final_model, attr) assert torch.all( final_model.my_method() == final_model.weight.data + torch.ones_like(prepared_model.weight.data) ) @pytest.mark.optional @pytest.mark.parametrize("algorithm", ["vanilla", "learnable"]) @pytest.mark.parametrize("weight_dtype", ["qint8", "quint8", "qint4", "quint4"]) @pytest.mark.parametrize("weight_per_channel", [True, False]) @pytest.mark.parametrize( "quantization_scheme", [QuantizationScheme.symmetric, QuantizationScheme.affine] ) def test_linear_quantizer_report( mnist_model_conv_transpose, algorithm, weight_dtype, weight_per_channel, quantization_scheme, ): print("\nTESTING REPORT WITH") print("ALGORITHM", algorithm) print("WEIGHT_DTYPE", weight_dtype) print("WEIGHT_PER_CHANNEL", weight_per_channel) print("QUANTIZATION_SCHEME", quantization_scheme) config = LinearQuantizerConfig.from_dict( { "global_config": { "milestones": [0, 1, 1, 3], "algorithm": algorithm, "weight_dtype": weight_dtype, "weight_per_channel": weight_per_channel, "quantization_scheme": quantization_scheme, }, "module_name_configs": { "dense2": { "milestones": [0, 1, 1, 3], "activation_dtype": torch.float32, "algorithm": algorithm, "weight_dtype": weight_dtype, "weight_per_channel": weight_per_channel, "quantization_scheme": quantization_scheme, } }, } ) quantizer = LinearQuantizer(mnist_model_conv_transpose, config) prepared_model = quantizer.prepare(example_inputs=(torch.randn(1, 1, 28, 28),)) report = quantizer.report() print("\nREPORT\n" + str(report)) @pytest.mark.parametrize( "dtype", [ pytest.param("qint4", marks=pytest.mark.xfail(reason="rdar://134169158")), "qint8", ], ) @pytest.mark.parametrize("scheme", ["symmetric", "affine"]) @pytest.mark.parametrize("conv_transpose", [False, True]) def test_compression_metadata(dtype, scheme, conv_transpose): """ Test that calling finalize on the module leads to compression metadata being added to the model """ model = nn.Sequential( OrderedDict( [ ( "conv1", (nn.Conv2d(1, 20, 3) if not conv_transpose else nn.ConvTranspose2d(1, 20, 3)), ), ("relu", nn.ReLU()), ("fc1", nn.Linear(20, 100)), ] ) ) config = LinearQuantizerConfig.from_dict( { "module_name_configs": { "conv1": { "weight_dtype": dtype, "quantization_scheme": scheme, }, "fc1": None, }, } ) quantizer = 
LinearQuantizer(model, config) quantizer.prepare(inplace=True, example_inputs=(torch.randn(1, 1, 28, 28),)) for _ in range(4): quantizer.step() model = quantizer.finalize() # Verify metadata version is added to model assert "_COREML_/metadata_version" in model.state_dict() # Verify compression metadata is added for conv1 metadata_dict = CompressionMetadata.from_state_dict(model.conv1[0].state_dict()) assert len(metadata_dict) == 1 assert "weight" in metadata_dict metadata = metadata_dict["weight"] assert metadata.compression_type == [CompressionType.quantization.value] assert metadata.quantization_n_bits == 4 if dtype == "qint4" else 8 scale_zero_point_shape = (20, 1) if not conv_transpose else (1, 20) assert metadata.quantization_scale.shape == scale_zero_point_shape assert metadata.zero_point.shape == scale_zero_point_shape if scheme == "symmetric": assert torch.all(metadata.zero_point == 0) # Verify no compression metadata is added for fc1 metadata_dict = CompressionMetadata.from_state_dict(model.fc1.state_dict()) assert len(metadata_dict) == 0 ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/quantization/test_utils.py0000644000000000000000000000310114672066616026271 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import pytest import torch from coremltools.optimize.torch.quantization._utils import get_n_bits_from_range, get_quant_range @pytest.mark.parametrize("n_bits", list(range(2, 8))) @pytest.mark.parametrize("dtype", [torch.quint8, torch.uint8, torch.qint8, torch.int8]) def test_quant_range(dtype, n_bits): quant_min, quant_max = get_quant_range(n_bits, dtype) signed_expected_values = { 2: [-2, 1], 3: [-4, 3], 4: [-8, 7], 5: [-16, 15], 6: [-32, 31], 7: [-64, 63], 8: [-128, 127], } unsigned_expected_values = { 2: [0, 3], 3: [0, 7], 4: [0, 15], 5: [0, 31], 6: [0, 63], 7: [0, 127], 8: [0, 255], } if dtype in [torch.quint8, torch.uint8]: assert quant_min == unsigned_expected_values[n_bits][0] assert quant_max == unsigned_expected_values[n_bits][1] else: assert quant_min == signed_expected_values[n_bits][0] assert quant_max == signed_expected_values[n_bits][1] @pytest.mark.parametrize("n_bits", list(range(2, 8))) @pytest.mark.parametrize("dtype", [torch.quint8, torch.uint8, torch.qint8, torch.int8]) def test_n_bits_from_range(dtype, n_bits): quant_min, quant_max = get_quant_range(n_bits, dtype) output_n_bits = get_n_bits_from_range(quant_min, quant_max) assert output_n_bits == n_bits ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/smoke_test.py0000644000000000000000000000245214672066616023531 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import torch class TestSmokeTest: def test_coremltools_optimize_torch_import(self): import coremltools.optimize.torch def test_model_optimizations(self): from coremltools.optimize.torch.palettization import DKMPalettizer, DKMPalettizerConfig from coremltools.optimize.torch.pruning import MagnitudePruner, MagnitudePrunerConfig from coremltools.optimize.torch.quantization import LinearQuantizer, LinearQuantizerConfig for OptCls, OptConfig, args in [ (MagnitudePruner, MagnitudePrunerConfig, None), (DKMPalettizer, DKMPalettizerConfig, None), (LinearQuantizer, LinearQuantizerConfig, torch.randn(100)), ]: obj = OptCls(torch.nn.Identity(), OptConfig()) obj.prepare(args) obj.finalize() def test_model_conversion(self, mnist_model, mnist_example_input): import coremltools.test.optimize.torch.conversion.conversion_utils as util converted_model = util.get_converted_model(mnist_model, mnist_example_input) assert converted_model is not None ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/test_api_surface.py0000644000000000000000000001310314672066616024667 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from typing import List import coremltools.optimize.torch def _get_visible_items(d): return [x for x in dir(d) if not x.startswith("_")] def _check_visible_modules(actual: List[str], expected: List[str]): assert set(actual) == set(expected), "API mis-matched. 
Got %s, expected %s" % ( actual, expected, ) class TestApiVisibilities: """Test APIs visible to users""" def test_top_level(self): # coremltools.optimize.torch.* expected = [ "base_model_optimizer", "optimization_config", "palettization", "pruning", "quantization", "layerwise_compression", ] visible_modules = _get_visible_items(coremltools.optimize.torch) _check_visible_modules(visible_modules, expected) def test_base_model_optimizer_module(self): # coremltools.optimize.torch.base_model_optimizer.* expected = [ "BaseModelOptimizer", "BaseTrainingTimeModelOptimizer", "BasePostTrainingModelOptimizer", "BaseDataCalibratedModelOptimizer", ] visible_modules = _get_visible_items(coremltools.optimize.torch.base_model_optimizer) _check_visible_modules(visible_modules, expected) def test_optimization_config_module(self): # coremltools.optimize.torch.optimization_config.* expected = [ "PalettizationGranularity", "QuantizationGranularity", "ModuleOptimizationConfig", "OptimizationConfig", ] visible_modules = _get_visible_items(coremltools.optimize.torch.optimization_config) _check_visible_modules(visible_modules, expected) def test_palettization_module(self): # coremltools.optimize.torch.palettization.* expected = [ "FakePalettize", "DKMPalettizer", "DKMPalettizerConfig", "ModuleDKMPalettizerConfig", "palettization_config", "fake_palettize", "palettizer", "post_training_palettization", "ModulePostTrainingPalettizerConfig", "PostTrainingPalettizer", "PostTrainingPalettizerConfig", "sensitive_k_means", "ModuleSKMPalettizerConfig", "SKMPalettizer", "SKMPalettizerConfig", ] visible_modules = _get_visible_items(coremltools.optimize.torch.palettization) _check_visible_modules(visible_modules, expected) # coremltools.optimize.torch.palettization.palettizer.* expected = [ "Palettizer", "DKMPalettizer", ] visible_modules = _get_visible_items(coremltools.optimize.torch.palettization.palettizer) _check_visible_modules(visible_modules, expected) def test_pruning_module(self): # coremltools.optimize.torch.pruning.* expected = [ "ConstantSparsityScheduler", "MagnitudePruner", "MagnitudePrunerConfig", "ModuleMagnitudePrunerConfig", "PolynomialDecayScheduler", "magnitude_pruner", "pruning_scheduler", ] visible_modules = _get_visible_items(coremltools.optimize.torch.pruning) _check_visible_modules(visible_modules, expected) def test_quantization_module(self): # coremltools.optimize.torch.quantization.* expected = [ "LinearQuantizer", "LinearQuantizerConfig", "ModuleLinearQuantizerConfig", "ObserverType", "QuantizationScheme", "quantizer", "quantization_config", "modules", "ModulePostTrainingQuantizerConfig", "PostTrainingQuantizer", "PostTrainingQuantizerConfig", "post_training_quantization", ] visible_modules = _get_visible_items(coremltools.optimize.torch.quantization) _check_visible_modules(visible_modules, expected) # coremltools.optimize.torch.quantization.LinearQuantizer.* expected = [ "finalize", "prepare", "step", "report", "supported_modules", ] visible_modules = _get_visible_items( coremltools.optimize.torch.quantization.LinearQuantizer ) _check_visible_modules(visible_modules, expected) # coremltools.optimize.torch.quantization.quantizer.* expected = [ "Quantizer", "LinearQuantizer", ] visible_modules = _get_visible_items(coremltools.optimize.torch.quantization.quantizer) _check_visible_modules(visible_modules, expected) def test_layerwise_compression_module(self): expected = [ "algorithms", "LayerwiseCompressionAlgorithm", "LayerwiseCompressionAlgorithmConfig", "SparseGPT", "GPTQ", "ModuleGPTQConfig", 
"ModuleSparseGPTConfig", "input_cacher", "FirstLayerInputCacher", "DefaultInputCacher", "GPTFirstLayerInputCacher", "layerwise_compressor", "LayerwiseCompressor", "LayerwiseCompressorConfig", ] visible_modules = _get_visible_items(coremltools.optimize.torch.layerwise_compression) _check_visible_modules(visible_modules, expected) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/test_base_optimizer.py0000644000000000000000000000446314672066616025433 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import pytest import torch from coremltools.optimize.torch.base_model_optimizer import ( BaseDataCalibratedModelOptimizer, BasePostTrainingModelOptimizer, ) from coremltools.optimize.torch.palettization import DKMPalettizer from coremltools.optimize.torch.pruning import MagnitudePruner from coremltools.optimize.torch.quantization import LinearQuantizer @pytest.mark.parametrize("optimizer", [MagnitudePruner, LinearQuantizer, DKMPalettizer]) @pytest.mark.parametrize("inplace", [True, False]) def test_report_model_train_state(optimizer, inplace): model = torch.nn.Sequential(torch.nn.Conv2d(1, 31, 2, 1), torch.nn.Conv2d(31, 21, 2, 1)) opt = optimizer(model) if optimizer == LinearQuantizer: p_model = opt.prepare(inplace=inplace, example_inputs=(torch.randn(1),)) else: p_model = opt.prepare(inplace=inplace) p_model.train() opt.report() assert p_model.training p_model.eval() opt.report() assert not p_model.training @pytest.mark.parametrize( "optimizer", [BasePostTrainingModelOptimizer, BaseDataCalibratedModelOptimizer] ) @pytest.mark.parametrize("inplace", [True, False]) def test_inplace_behavior_for_optimizers(optimizer, inplace): def create_model(): return torch.nn.Sequential(torch.nn.Conv2d(1, 31, 2, 1), torch.nn.Conv2d(31, 21, 2, 1)) class DummyOptimizer(optimizer): def report(self): return None @torch.no_grad() def compress(self, *args, inplace, **kwargs): super().compress(*args, inplace=inplace, **kwargs) self._model[0].weight.data = torch.ones_like(self._model[0].weight.data) return self._model model = create_model() opt = DummyOptimizer(model) opt.compress(dataloader=None, inplace=inplace) if inplace: assert id(opt._model) == id(model) assert id(opt._uncompressed_model) != id(model) else: assert id(opt._model) != id(model) assert id(opt._uncompressed_model) == id(model) assert torch.all(opt._model[0].weight == 1) assert not torch.all(opt._uncompressed_model[0].weight == 1) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2895474 coremltools-8.0/coremltools/test/optimize/torch/test_utils/0000755000000000000000000000000014672075535023176 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/test_utils/__init__.py0000644000000000000000000000033314672066616025306 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/test_utils/test_fsdp_utils.py0000644000000000000000000000172314672066616026766 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import torch from coremltools.optimize.torch._utils.fsdp_utils import ModuleWrapPolicy, SizeBasedWrapPolicy def test_module_wrap_policy(): """ Test constructor for underlying FSDP policy is called with correct arguments """ module_classes = [torch.nn.Linear, torch.nn.Conv2d] policy = ModuleWrapPolicy(module_classes=module_classes) policy = policy.get_policy() assert policy._module_classes == set(module_classes) def test_size_based_policy(): """ Test constructor for underlying FSDP policy is called with correct arguments """ min_num_params = 100 policy = SizeBasedWrapPolicy(min_num_params=min_num_params) policy = policy.get_policy() assert policy.keywords["min_num_params"] == min_num_params ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/test_utils/test_k_means.py0000644000000000000000000002376014672066616026234 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import pytest import torch from coremltools.optimize.torch._utils.k_means import ( KMeansConfig, KMeansSupportedModulesRegistry, ParallelKMeans, SequentialKMeans, ) from coremltools.test.optimize.torch.utils import count_unique_params @pytest.mark.parametrize( "config", [ KMeansConfig(n_bits=2, enable_per_channel_scale=False), { "conv1": {"weight": KMeansConfig(n_bits=4, enable_per_channel_scale=False)}, "dense1": {"weight": KMeansConfig(n_bits=2, enable_per_channel_scale=True)}, }, ], ) @pytest.mark.parametrize( "kmeans_cls", [SequentialKMeans, ParallelKMeans], ) def test_k_means_mnist_per_weight(mock_name_main, mnist_model, config, kmeans_cls): model = kmeans_cls.cluster_weights(mnist_model, config=config, num_workers=4) layers = [ ("conv1", model.conv1), ("conv2", model.conv2), ("dense1", model.dense1), ("dense2", model.dense2), ] with torch.no_grad(): for layer_name, layer in layers: if isinstance(config, dict): if layer_name in config: for param_name, layer_config in config[layer_name].items(): param = getattr(layer, param_name) if layer_config.enable_per_channel_scale: per_channel_scale_key = f"_COREML_/{param_name}/palettization_scale" assert per_channel_scale_key in layer.state_dict() per_channel_scale = layer.state_dict()[per_channel_scale_key] param = param / per_channel_scale assert count_unique_params(torch.unique(param)) == 2**layer_config.n_bits else: assert len(torch.unique(layer.weight)) > 16 else: assert len(torch.unique(layer.weight)) == 2**config.n_bits @pytest.mark.parametrize( "config", [ KMeansConfig(n_bits=4, block_size=4, axis=0, enable_per_channel_scale=False), KMeansConfig(n_bits=4, block_size=4, axis=0, enable_per_channel_scale=True), KMeansConfig(n_bits=4, block_size=4, axis=1, enable_per_channel_scale=False), KMeansConfig(n_bits=4, 
block_size=4, axis=1, enable_per_channel_scale=True), ], ) @pytest.mark.parametrize( "kmeans_cls", [SequentialKMeans, ParallelKMeans], ) def test_k_means_block_wise(mock_name_main, config, kmeans_cls): model = torch.nn.Conv2d(12, 32, (2, 2)) model = kmeans_cls.cluster_weights(model, config=config, num_workers=4) block_size = config.block_size with torch.no_grad(): weight = model.weight if config.enable_per_channel_scale: per_channel_scale_key = "_COREML_/weight/palettization_scale" assert per_channel_scale_key in model.state_dict() per_channel_scale = model.state_dict()[per_channel_scale_key] weight = weight / per_channel_scale if config.axis == 0: weight_flat = weight.flatten(1) else: weight_flat = weight.transpose(0, 1).flatten(1).transpose(0, 1) weight_shape = weight_flat.shape[config.axis] if config.axis == 0: for block_idx in range(0, weight_shape, block_size): assert ( count_unique_params( torch.unique(weight_flat[block_idx : block_idx + block_size, :]) ) == 2**config.n_bits ) else: for block_idx in range(0, weight_shape, block_size): assert ( count_unique_params( torch.unique(weight_flat[:, block_idx : block_idx + block_size]) ) == 2**config.n_bits ) @pytest.mark.parametrize( "config", [ KMeansConfig(n_bits=4, cluster_dim=4, axis=0, enable_per_channel_scale=False), KMeansConfig(n_bits=4, cluster_dim=4, axis=1, enable_per_channel_scale=False), KMeansConfig(n_bits=2, cluster_dim=2, axis=0, enable_per_channel_scale=False), KMeansConfig(n_bits=2, cluster_dim=2, axis=1, enable_per_channel_scale=False), ], ) @pytest.mark.parametrize( "kmeans_cls", [SequentialKMeans, ParallelKMeans], ) def test_k_means_vector_wise(mock_name_main, config, kmeans_cls): model = torch.nn.Conv2d(16, 8, (2, 2)) model = kmeans_cls.cluster_weights(model, config=config, num_workers=4) cluster_dim = config.cluster_dim with torch.no_grad(): weight = model.weight if config.axis == 0: weight_reshaped = weight.flatten(1).transpose(0, 1).reshape(-1, cluster_dim) elif config.axis == 1: weight_reshaped = ( weight.transpose(0, 1).flatten(1).transpose(0, 1).reshape(-1, cluster_dim) ) else: raise ValueError("axis must be 0 or 1.") unique_vector = torch.unique(weight_reshaped, dim=0) assert len(unique_vector) == 2**config.n_bits @pytest.mark.parametrize("importance", [True, False]) @pytest.mark.parametrize("config", [tuple((4, 4, 0)), tuple((4, 4, 1))]) @pytest.mark.parametrize( "kmeans_cls", [ SequentialKMeans, ParallelKMeans, ], ) def test_k_means_masked(mock_name_main, importance, config, kmeans_cls): model = torch.nn.Linear(32, 32) block_size = config[1] axis = config[2] weight_mask = torch.ones_like(model.weight.data, dtype=torch.bool) for idx in range(32): if axis == 0: weight_mask[idx, torch.randperm(32)[:4]] = False else: weight_mask[torch.randperm(32)[:4], idx] = False importance = torch.abs(torch.randn(model.weight.shape)) if importance else None config = KMeansConfig( n_bits=config[0], block_size=block_size, enable_per_channel_scale=False, axis=axis, mask=weight_mask, importance=importance, ) weight_clone = model.weight.clone() model = kmeans_cls.cluster_weights(model, config=config, num_workers=4) with torch.no_grad(): model_weight = model.weight weight_shape = model_weight.shape[config.axis] for block_idx in range(0, weight_shape, block_size): if config.axis == 0: mask_block = weight_mask[block_idx : block_idx + block_size, :] weight_block_masked = model_weight[block_idx : block_idx + block_size, :][ mask_block ] weight_unmasked = model_weight[block_idx : block_idx + block_size, :][~mask_block] 
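                # Entries where the mask is False are excluded from clustering and are
                # expected to keep their original values; the assertions below compare
                # them against the pre-clustering weights, while the masked-in entries
                # must collapse to exactly 2**n_bits unique values per block.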
weight_orig_unmasked = weight_clone[block_idx : block_idx + block_size, :][ ~mask_block ] else: mask_block = weight_mask[:, block_idx : block_idx + block_size] weight_block_masked = model_weight[:, block_idx : block_idx + block_size][ mask_block ] weight_unmasked = model_weight[:, block_idx : block_idx + block_size][~mask_block] weight_orig_unmasked = weight_clone[:, block_idx : block_idx + block_size][ ~mask_block ] assert len(torch.unique(weight_block_masked)) == 2**config.n_bits assert torch.all(weight_orig_unmasked == weight_unmasked) # region KMeansModule Tests @pytest.mark.parametrize( "layer, layer_config", [ ( torch.nn.Linear(10, 100), {"weight": KMeansConfig(n_bits=4, enable_per_channel_scale=True)}, ), ], ) @torch.no_grad() def test_zero_per_channel_scale(layer, layer_config): k_means_module_cls = KMeansSupportedModulesRegistry.get_kmeans_module(layer) k_means_module = k_means_module_cls(layer, layer_config) # Set one output chanel to zero so its per_channel_scale is 0 layer.weight[0] = 0 orig_weight = layer.weight.clone() # Scale weights scaled_weight = k_means_module._scale_by_per_channel_scale("weight", layer.weight) # Verify no NaN values are introduced assert not torch.any(torch.isnan(scaled_weight)) # Confirm layer weights for corresponding channel remain 0 assert torch.all(scaled_weight[0] == 0) # Unscale weights unscaled_weight = k_means_module._unscale_by_per_channel_scale("weight", layer.weight) # Verify no NaN values are introduced assert not torch.any(torch.isnan(unscaled_weight)) # Confirm unscaled weights match original weights within tolerance assert torch.all(torch.isclose(unscaled_weight, orig_weight)) @pytest.mark.parametrize( "layer, param_name, axis, expected_shape", [ (torch.nn.Conv2d(16, 32, 5), "weight", 0, (32, 16 * 5 * 5)), (torch.nn.Conv2d(16, 32, 5), "weight", 1, (32 * 5 * 5, 16)), (torch.nn.Linear(1024, 10), "weight", 0, (10, 1024)), (torch.nn.Linear(1024, 10), "weight", 1, (10, 1024)), (torch.nn.Embedding(50000, 256), "weight", 0, (50000, 256)), (torch.nn.Embedding(50000, 256), "weight", 1, (50000, 256)), (torch.nn.MultiheadAttention(256, 4), "in_proj_weight", 0, (3 * 256, 256)), (torch.nn.MultiheadAttention(256, 4), "in_proj_weight", 1, (3 * 256, 256)), ], ) def test_parameter_reshaping(layer, param_name, axis, expected_shape): config = {param_name: KMeansConfig(n_bits=4, block_size=8, axis=axis)} k_means_module_cls = KMeansSupportedModulesRegistry.get_kmeans_module(layer) k_means_module = k_means_module_cls(layer, config) # reshape for kmeans param = getattr(layer, param_name) new_param = k_means_module._reshape_for_kmeans(param_name, param) assert new_param.shape == expected_shape # reshape back to original weight shape reshaped_param = k_means_module._reshape_to_original(param_name, new_param) assert reshaped_param.shape == param.shape # endregion ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/test_utils/test_metadata_utils.py0000644000000000000000000001237514672066616027617 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from contextlib import nullcontext as does_not_raise import pytest import torch from coremltools.optimize.torch._utils.metadata_utils import ( METADATA_VERSION, METADATA_VERSION_BUFFER, CompressionMetadata, CompressionType, register_metadata_version, ) @pytest.mark.parametrize( "metadata_dict, expectation", [ ( { "param_name": "weight", "quantization_scale": torch.rand(3, 1), "quantization_n_bits": 4, "compression_type": ["pruning", "quantization"], }, does_not_raise(), ), ( { "param_name": "weight", "quantization_scale": torch.rand(3, 1), "quantization_n_bits": 4, "compression_type": ["pruning", "quantizatoin"], # mis-spelled }, pytest.raises(KeyError), ), ], ) def test_metadata_from_dict(metadata_dict, expectation): with expectation: metadata = CompressionMetadata.from_dict(metadata_dict) assert torch.equal(metadata.quantization_scale, metadata_dict["quantization_scale"]) assert metadata.quantization_n_bits == metadata_dict["quantization_n_bits"] assert metadata.compression_type == [ CompressionType[x].value for x in metadata_dict["compression_type"] ] for key, value in metadata.as_dict().items(): if key not in metadata_dict: assert value is None @pytest.mark.parametrize( "state_dict", [ { "_COREML_/weight/quantization_scale": torch.rand(3, 1), "_COREML_/weight/quantization_n_bits": torch.tensor(4), "_COREML_/weight/compression_type": torch.tensor([1, 2]), "_COREML_/bias/quantization_scale": torch.rand(3, 1), "_COREML_/bias/quantization_n_bits": torch.tensor(8), "_COREML_/bias/compression_type": torch.tensor([1, 3]), } ], ) def test_metadata_from_state_dict(state_dict): metadata_dict = CompressionMetadata.from_state_dict(state_dict) print(metadata_dict) assert len(metadata_dict) == 2 assert "weight" in metadata_dict assert "bias" in metadata_dict for param in ["weight", "bias"]: metadata = metadata_dict[param] assert metadata.param_name == param assert torch.equal( metadata.quantization_scale, state_dict[f"_COREML_/{param}/quantization_scale"], ) assert ( metadata.quantization_n_bits == state_dict[f"_COREML_/{param}/quantization_n_bits"].item() ) assert ( metadata.compression_type == state_dict[f"_COREML_/{param}/compression_type"].tolist() ) non_none_keys = [ "quantization_n_bits", "quantization_scale", "param_name", "compression_type", ] for key, value in metadata.as_dict().items(): if key not in non_none_keys: assert value is None @pytest.mark.parametrize( "metadata_dict", [ { "param_name": "weight", "zero_point": torch.rand(3, 1), "compression_type": ["pruning", "quantization"], }, ], ) def test_register(metadata_dict): module = torch.nn.Conv2d(3, 32, 3) metadata = CompressionMetadata.from_dict(metadata_dict) state_dict = module.state_dict() for key in metadata_dict: assert metadata._get_metadata_buffer_name(key) not in state_dict metadata.register(module) state_dict = module.state_dict() for key, value in metadata_dict.items(): if key != "param_name": metadata_key = metadata._get_metadata_buffer_name(key) if key == "compression_type": metadata_value = torch.tensor([CompressionType[x].value for x in value]) else: metadata_value = torch.tensor(value) assert metadata_key in state_dict assert torch.equal(state_dict[metadata_key], metadata_value) def test_chaining_compression_type(): module = torch.nn.Conv2d(3, 32, 3) metadata = CompressionMetadata(param_name="weight") metadata.compression_type = ["pruning"] 
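    # Registering a second CompressionMetadata for the same parameter is expected to
    # chain compression types rather than overwrite them: the buffer checked below
    # grows from [1] (pruning) to [1, 2] (pruning followed by palettization).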
metadata.register(module) buffer_name = metadata._get_metadata_buffer_name("compression_type") assert buffer_name in module.state_dict() assert torch.equal(module.state_dict()[buffer_name], torch.tensor([1])) metadata2 = CompressionMetadata(param_name="weight") metadata2.compression_type = ["palettization"] metadata2.register(module) assert buffer_name in module.state_dict() assert torch.equal(module.state_dict()[buffer_name], torch.tensor([1, 2])) def test_register_metadata_version(): model = torch.nn.Sequential(torch.nn.Conv2d(3, 32, 3), torch.nn.ReLU()) assert METADATA_VERSION_BUFFER not in model.state_dict() register_metadata_version(model) assert METADATA_VERSION_BUFFER in model.state_dict() assert torch.equal(model.state_dict()[METADATA_VERSION_BUFFER], METADATA_VERSION) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/test_utils/test_report_utils.py0000644000000000000000000002404514672066616027347 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause from collections import OrderedDict from typing import Tuple import pytest import torch from coremltools.optimize.torch.layerwise_compression import ( LayerwiseCompressor, LayerwiseCompressorConfig, ) from coremltools.optimize.torch.palettization import ( PostTrainingPalettizer, PostTrainingPalettizerConfig, ) from coremltools.optimize.torch.palettization.sensitive_k_means import ( SKMPalettizer, SKMPalettizerConfig, ) from coremltools.optimize.torch.quantization import ( PostTrainingQuantizer, PostTrainingQuantizerConfig, ) @pytest.fixture() def model_for_compression(request) -> torch.nn.Module: decomposed_multihead_forward = request.param class ProjectionModule(torch.nn.Module): def __init__(self, embed_dim: int, hidden_dim: int): super().__init__() self.query = torch.nn.Linear(embed_dim, hidden_dim) self.key = torch.nn.Linear(embed_dim, hidden_dim) self.value = torch.nn.Linear(embed_dim, hidden_dim) def forward(self, x: torch.Tensor): return self.query(x), self.key(x), self.value(x) if decomposed_multihead_forward: class MultiheadWrapper(torch.nn.Module): def __init__(self, multihead_layer): super().__init__() self.layer = multihead_layer def forward(self, q, k, v): return self.layer(q, k, v, need_weights=False)[0] else: class MultiheadWrapper(torch.nn.Module): def __init__(self, multihead_layer): super().__init__() self.layer = multihead_layer def forward(self, x: Tuple[torch.Tensor]): return self.layer(x[0], x[1], x[2], need_weights=False)[0] class LinearWrapper(torch.nn.Module): def __init__(self, linear_layer): super().__init__() self.layer = linear_layer def forward(self, x): out = self.layer(x) return out.reshape(-1, 100, 10, 10) return torch.nn.Sequential( OrderedDict( [ ("embedding", torch.nn.Embedding(100, 100)), ("projection", ProjectionModule(100, 100)), ( "multihead", MultiheadWrapper(torch.nn.MultiheadAttention(100, 5, batch_first=True)), ), ("linear", LinearWrapper(torch.nn.Linear(100, 100))), ("conv", torch.nn.Conv2d(100, 100, (3, 3), padding=(1, 1))), ] ) ) @pytest.mark.parametrize( "config, expected_num_columns", [ ( { "global_config": {"algorithm": "gptq", "weight_dtype": "uint4"}, "module_name_configs": {"multihead.layer.out_proj": None}, "input_cacher": "default", "calibration_nsamples": 128, }, 3, ), ( { "global_config": { "algorithm": "gptq", 
"weight_dtype": "uint4", "enable_normal_float": True, }, "module_name_configs": {"multihead.layer.out_proj": None}, "input_cacher": "default", "calibration_nsamples": 128, }, 3, ), ( { "global_config": {"algorithm": "gptq", "weight_dtype": "uint8"}, "module_name_configs": { "projection.*": { "algorithm": "sparse_gpt", "weight_dtype": "uint8", "target_sparsity": 0.25, }, "multihead.layer.out_proj": None, }, "input_cacher": "default", "calibration_nsamples": 128, }, 6, ), ], ) @pytest.mark.parametrize("model_for_compression", [True], indirect=True) def test_report_layerwise_compressor(model_for_compression, config, expected_num_columns): config = LayerwiseCompressorConfig.from_dict(config) compressor = LayerwiseCompressor(model_for_compression, config) def compression_loader(): dataset = torch.utils.data.TensorDataset(torch.randint(0, high=100, size=(100, 100))) loader = torch.utils.data.DataLoader(dataset, batch_size=10) for data in loader: yield data[0] compressor.compress(compression_loader(), device="cpu") report = compressor.report() print(report) assert (len(report)) == 5 expected_params = [ "projection.query.weight", "projection.key.weight", "projection.value.weight", "linear.layer.weight", "conv.weight", ] for param_name in expected_params: assert param_name in report param_report = report[param_name] assert len(param_report) == expected_num_columns if not config.global_config.enable_normal_float: assert param_report["dtype"] == f"dtype=int{config.global_config.weight_n_bits}" else: assert ( param_report["palettization_mode"] == f"num_clusters={2 ** config.global_config.weight_n_bits}, cluster_dim=1" ) @pytest.mark.parametrize("quantization_scheme", ["symmetric", "affine"]) @pytest.mark.parametrize( "granularity_block_size", [ ("per_channel", None), ("per_tensor", None), ("per_block", 5), ], ) @pytest.mark.parametrize("weight_dtype", ["int4", "int8"]) @pytest.mark.parametrize("model_for_compression", [True], indirect=True) def test_report_post_training_quantization( model_for_compression, quantization_scheme, granularity_block_size, weight_dtype, ): granularity, block_size = granularity_block_size config = PostTrainingQuantizerConfig.from_dict( { "global_config": { "weight_dtype": weight_dtype, "granularity": granularity, "block_size": block_size, "quantization_scheme": quantization_scheme, } } ) compressor = PostTrainingQuantizer(model_for_compression, config) model = compressor.compress() report = compressor.report() assert (len(report)) == 7 for param_name, param in model.named_parameters(): if "embedding" not in param_name and "bias" not in param_name: assert param_name in report param_report = report[param_name] assert len(param_report) == 3 assert param_report["dtype"] == f"dtype=int{config.global_config.weight_n_bits}" @pytest.mark.parametrize( "config", [ { "global_config": {"granularity": "per_tensor", "n_bits": 4}, }, { "global_config": { "granularity": "per_grouped_channel", "n_bits": 4, "group_size": 1, }, }, { "global_config": { "granularity": "per_grouped_channel", "n_bits": 4, "group_size": 5, }, }, { "global_config": {"granularity": "per_tensor", "n_bits": 4}, "module_name_configs": { "linear.layer": { "n_bits": 4, "granularity": "per_tensor", "cluster_dim": 5, }, "conv": { "n_bits": 4, "granularity": "per_tensor", "cluster_dim": 4, }, }, }, ], ) @pytest.mark.parametrize("model_for_compression", [True], indirect=True) def test_report_post_training_palettization(model_for_compression, config): config = PostTrainingPalettizerConfig.from_dict(config) compressor = 
PostTrainingPalettizer(model_for_compression, config) model = compressor.compress(num_kmeans_workers=1) report = compressor.report() assert (len(report)) == 8 for param_name, param in model.named_parameters(): if "bias" not in param_name: assert param_name in report param_report = report[param_name] assert len(param_report) == 3 assert "num_clusters=16" in param_report["palettization_mode"] @pytest.mark.parametrize( "config", [ { "global_config": {"granularity": "per_tensor", "n_bits": 6}, }, { "global_config": { "granularity": "per_grouped_channel", "n_bits": 8, "group_size": 1, }, }, { "global_config": { "granularity": "per_grouped_channel", "n_bits": 4, "group_size": 5, }, }, ], ) @pytest.mark.parametrize("model_for_compression", [False], indirect=True) def test_report_skm_palettizer(model_for_compression, config): config = SKMPalettizerConfig.from_dict(config) compressor = SKMPalettizer(model_for_compression, config) def compression_loader(): dataset = torch.utils.data.TensorDataset(torch.randint(0, high=100, size=(100, 100))) loader = torch.utils.data.DataLoader(dataset, batch_size=10) for data in loader: yield data[0] def loss_fn(model, data): out = model(data) return torch.sum(out) model = compressor.compress( dataloader=compression_loader(), loss_fn=loss_fn, ) report = compressor.report() assert (len(report)) == 8 for param_name, param in model.named_parameters(): if "bias" not in param_name: assert param_name in report param_report = report[param_name] assert len(param_report) == 3 assert ( param_report["palettization_mode"] == f"num_clusters={2 ** config.global_config.n_bits}, cluster_dim=1" ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/test_utils/test_validation_utils.py0000644000000000000000000001174114672066616030165 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import pytest import torch from coremltools.optimize.torch._utils.validation_utils import ( ConfigValidator, validate_param_config, ) from coremltools.optimize.torch.palettization import ( ModulePostTrainingPalettizerConfig, ModuleSKMPalettizerConfig, ) from coremltools.optimize.torch.quantization import ModulePostTrainingQuantizerConfig @pytest.mark.parametrize( "config, expectation", [ ( ModulePostTrainingPalettizerConfig( n_bits=4, granularity="per_grouped_channel", group_size=4 ), True, ), ( ModulePostTrainingPalettizerConfig(n_bits=4, granularity="per_tensor", cluster_dim=3), False, ), ( ModulePostTrainingPalettizerConfig(n_bits=4, granularity="per_tensor", cluster_dim=4), True, ), ], ) def test_validate_param_config(config, expectation): module = torch.nn.Conv2d(16, 32, 5) result = validate_param_config( "weight", module.weight, module, config, ["palettization_group_size", "palettization_cluster_dim"], ) if expectation: assert result is not None else: assert result is None def test_validate_no_check(): module = torch.nn.Conv2d(3, 16, 5) config = ModuleSKMPalettizerConfig() validator = ConfigValidator("weight", module.weight, module, config) with pytest.raises(AssertionError): validator.validate(["invalid_check"]) @pytest.mark.parametrize( "group_size, channel_axis, expectation", [ pytest.param(4, None, True, id="default_axis"), pytest.param(4, 0, True, id="axis_0"), pytest.param(4, 1, True, id="axis_1"), pytest.param(5, None, False, id="default_indivisible_group_size"), pytest.param(5, 0, False, id="axis_0_indivisible_group_size"), pytest.param(5, 1, False, id="axis_1_indivisible_group_size"), ], ) def test_validate_palettization_group_size(group_size, channel_axis, expectation): module = torch.nn.Conv2d(16, 32, 5) if channel_axis: config = ModuleSKMPalettizerConfig( n_bits=4, granularity="per_grouped_channel", group_size=group_size, channel_axis=channel_axis, ) else: config = ModuleSKMPalettizerConfig( n_bits=4, granularity="per_grouped_channel", group_size=group_size, ) validator = ConfigValidator("weight", module.weight, module, config) assert validator.validate(["palettization_group_size"]) == expectation @pytest.mark.parametrize( "block_size, sanitized_block_size, expectation", [ pytest.param(4, (1, 4), True, id="default_axis_int_block_size"), pytest.param((1, 4), (1, 4), True, id="tuple_with_per_channel"), pytest.param((4, 16), (4, 16), True, id="tuple_block_size"), pytest.param((4, 16, 5, 5), (4, 16), True, id="tuple_block_size_greater_than_ndim"), pytest.param((0, 16), -1, False, id="per_block_without_per_channel"), pytest.param((0, 0), -1, False, id="no_blocking_tuple"), pytest.param(0, -1, False, id="no_blocking_int"), pytest.param(5, -1, False, id="non_divisible_block_size_int"), pytest.param((5, 5), -1, False, id="non_divisible_block_size_tuple"), pytest.param((5, 16), -1, False, id="non_divisible_block_size_tuple_axis_0"), pytest.param((4, 5), -1, False, id="non_divisible_block_size_tuple_axis_1"), ], ) def test_validate_quantization_block_size(block_size, sanitized_block_size, expectation): module = torch.nn.Conv2d(16, 32, 5) config = ModulePostTrainingQuantizerConfig( weight_dtype="int4", granularity="per_block", block_size=block_size ) validator = ConfigValidator("weight", module.weight, module, config) assert validator.validate(["quantization_block_size"]) == expectation if expectation is True: assert 
validator.config.block_size == sanitized_block_size @pytest.mark.parametrize( "cluster_dim, expectation", [ pytest.param(None, True, id="cluster_dim_unspecified"), pytest.param(1, True, id="cluster_dim_scalar"), pytest.param(4, True, id="cluster_dim_valid_1"), pytest.param(8, True, id="cluster_dim_valid_2"), pytest.param(3, False, id="cluster_dim_invalid_1"), pytest.param(5, False, id="cluster_dim_invalid_1"), ], ) def test_validate_palettization_cluster_dim(cluster_dim, expectation): module = torch.nn.Conv2d(3, 16, 5) config = ModulePostTrainingPalettizerConfig( n_bits=4, granularity="per_tensor", cluster_dim=cluster_dim ) validator = ConfigValidator("weight", module.weight, module, config) assert validator.validate(["palettization_cluster_dim"]) == expectation ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/optimize/torch/utils.py0000644000000000000000000001100314672066616022504 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import contextlib import io import logging import pathlib import sys import torch import torch.nn.functional as F import torch.utils.data from packaging import version # region version_utils def _python_version(): """ Return python version as a tuple of integers """ version = sys.version.split(" ")[0] version = list(map(int, list(version.split(".")))) return tuple(version) def _macos_version(): """ Returns macOS version as a tuple of integers, making it easy to do proper version comparisons. On non-Macs, it returns an empty tuple. """ if sys.platform == "darwin": try: import subprocess ver_str = ( subprocess.run(["sw_vers", "-productVersion"], stdout=subprocess.PIPE) .stdout.decode("utf-8") .strip("\n") ) return tuple([int(v) for v in ver_str.split(".")]) except: raise Exception("Unable to detemine the macOS version") return () def count_unique_params(tensor): """ Returns number of unique parameters in the same tensor. Set a defaulted absolute tolerance, so that very close values can be treated as identical in palletization. 
""" unique_set = {tensor[0]} for elem in tensor[1:]: if all(not torch.isclose(elem, uelem, atol=1e-6) for uelem in unique_set): unique_set.add(elem) return len(unique_set) def version_ge(module, target_version): """ Example usage: >>> import torch # v1.5.0 >>> version_ge(torch, '1.6.0') # False """ return version.parse(module.__version__) >= version.parse(target_version) def version_lt(module, target_version): """See version_ge""" return version.parse(module.__version__) < version.parse(target_version) # endregion # region path_utils def test_data_path(): return pathlib.Path(__file__).parent.absolute() / "_test_data" # endregion # region train_utils def setup_data_loaders(dataset, batch_size): train, test = dataset train_loader = torch.utils.data.DataLoader(train, batch_size=batch_size, shuffle=True) test_loader = torch.utils.data.DataLoader(test, batch_size=batch_size) return train_loader, test_loader def train_step(model, optimizer, train_loader, data, target, batch_idx, epoch): optimizer.zero_grad() output = model(data) loss = F.nll_loss(output, target) loss.backward() optimizer.step() if batch_idx % 100 == 0: print( "Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}".format( epoch, batch_idx * len(data), len(train_loader.dataset), 100.0 * batch_idx / len(train_loader), loss.item(), ) ) return loss def eval_model(model, test_loader): model.eval() test_loss = 0 correct = 0 accuracy = 0.0 with torch.no_grad(): for data, target in test_loader: output = model(data) test_loss += F.nll_loss(output, target, reduction="sum").item() pred = output.argmax(dim=1, keepdim=True) correct += pred.eq(target.view_as(pred)).sum().item() test_loss /= len(test_loader.dataset) accuracy = 100.0 * correct / len(test_loader.dataset) print("\nTest set: Average loss: {:.4f}, Accuracy: {:.0f}%\n".format(test_loss, accuracy)) return accuracy def get_logging_capture_context_manager(): @contextlib.contextmanager def capture_logs(logger_name): # Create a StringIO object to capture the log output log_capture = io.StringIO() # Get the logger logger = logging.getLogger(logger_name) # Save the current handlers original_handlers = logger.handlers # Create a custom logging handler that writes to the StringIO object string_io_handler = logging.StreamHandler(log_capture) formatter = logging.Formatter("%(levelname)s:%(name)s:%(message)s") string_io_handler.setFormatter(formatter) # Clear existing handlers and add the custom handler logger.handlers = [string_io_handler] # Capture the logs try: yield log_capture finally: # Restore original handlers logger.handlers = original_handlers return capture_logs # endregion ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2895474 coremltools-8.0/coremltools/test/pipeline/0000755000000000000000000000000014672075535017625 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/pipeline/__init__.py0000644000000000000000000000032714672066616021740 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/pipeline/test_model_updatable.py0000644000000000000000000006723114672066616024370 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os import shutil import tempfile import unittest import numpy as _np import coremltools.models.datatypes as datatypes from coremltools.models import MLModel from coremltools.models.neural_network import (AdamParams, NeuralNetworkBuilder, SgdParams, quantization_utils) from coremltools.models.pipeline import PipelineClassifier, PipelineRegressor from coremltools.models.utils import save_spec class LayerSelector(quantization_utils.QuantizedLayerSelector): def __init__(self, layer_name): super(LayerSelector, self).__init__() self.layer_name = layer_name def do_quantize(self, layer, weight_param="bias"): ret = super(LayerSelector, self).do_quantize(layer) if not ret or layer.name == self.layer_name: return False return True class MLModelUpdatableTest(unittest.TestCase): @classmethod def setUpClass(self): self.model_dir = tempfile.mkdtemp() @classmethod def tearDownClass(self): if os.path.exists(self.model_dir): shutil.rmtree(self.model_dir) def create_base_builder(self, is_updatable=True): self.input_features = [("input", datatypes.Array(3))] self.output_features = [("output", None)] self.output_names = ["output"] builder = NeuralNetworkBuilder(self.input_features, self.output_features) W1 = _np.random.uniform(-0.5, 0.5, (3, 3)) W2 = _np.random.uniform(-0.5, 0.5, (3, 3)) builder.add_inner_product( name="ip1", W=W1, b=None, input_channels=3, output_channels=3, has_bias=False, input_name="input", output_name="hidden", ) builder.add_inner_product( name="ip2", W=W2, b=None, input_channels=3, output_channels=3, has_bias=False, input_name="hidden", output_name="output", ) if is_updatable: builder.make_updatable(["ip1", "ip2"]) return builder def test_updatable_model_creation_ce_sgd(self): builder = self.create_base_builder() builder.add_softmax( name="softmax", input_name="output", output_name="softmax_output" ) builder.set_categorical_cross_entropy_loss( name="cross_entropy", input="softmax_output" ) builder.set_sgd_optimizer(SgdParams(lr=1e-2, batch=10, momentum=0.0)) builder.set_epochs(20, allowed_set=[10, 20, 30, 40]) model_path = os.path.join(self.model_dir, "updatable_creation.mlmodel") save_spec(builder.spec, model_path) mlmodel = MLModel(model_path) self.assertTrue(mlmodel is not None) spec = mlmodel.get_spec() self.assertTrue(spec.isUpdatable) self.assertTrue(spec.neuralNetwork.layers[0].isUpdatable) self.assertTrue(spec.neuralNetwork.layers[0].innerProduct.weights.isUpdatable) self.assertTrue(spec.neuralNetwork.layers[1].isUpdatable) self.assertTrue(spec.neuralNetwork.layers[1].innerProduct.weights.isUpdatable) self.assertTrue( spec.neuralNetwork.updateParams.lossLayers[ 0 ].categoricalCrossEntropyLossLayer is not None ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.sgdOptimizer is not None ) self.assertTrue( _np.isclose( spec.neuralNetwork.updateParams.optimizer.sgdOptimizer.learningRate.defaultValue, 1e-2, atol=1e-4, ) ) self.assertTrue( _np.isclose( spec.neuralNetwork.updateParams.optimizer.sgdOptimizer.miniBatchSize.defaultValue, 10, atol=1e-4, ) ) self.assertTrue( _np.isclose( spec.neuralNetwork.updateParams.optimizer.sgdOptimizer.momentum.defaultValue, 0, atol=1e-8, ) ) self.assertTrue( _np.isclose( spec.neuralNetwork.updateParams.epochs.defaultValue, 20, atol=1e-4 ) ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.sgdOptimizer.learningRate.range.minValue == 0 ) self.assertTrue( 
spec.neuralNetwork.updateParams.optimizer.sgdOptimizer.learningRate.range.maxValue == 1 ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.sgdOptimizer.miniBatchSize.set.values == [10] ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.sgdOptimizer.momentum.range.minValue == 0 ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.sgdOptimizer.momentum.range.maxValue == 1 ) def test_updatable_model_creation_ce_adam(self): builder = self.create_base_builder() builder.add_softmax( name="softmax", input_name="output", output_name="softmax_output" ) builder.set_categorical_cross_entropy_loss( name="cross_entropy", input="softmax_output" ) adam_params = AdamParams() adam_params.set_batch(value=10, allowed_set=[10, 20]) builder.set_adam_optimizer(adam_params) builder.set_epochs(20) model_path = os.path.join(self.model_dir, "updatable_creation.mlmodel") print(model_path) save_spec(builder.spec, model_path) mlmodel = MLModel(model_path) self.assertTrue(mlmodel is not None) spec = mlmodel.get_spec() self.assertTrue(spec.isUpdatable) self.assertTrue(spec.neuralNetwork.layers[0].isUpdatable) self.assertTrue(spec.neuralNetwork.layers[0].innerProduct.weights.isUpdatable) self.assertTrue(spec.neuralNetwork.layers[1].isUpdatable) self.assertTrue(spec.neuralNetwork.layers[1].innerProduct.weights.isUpdatable) self.assertTrue( spec.neuralNetwork.updateParams.lossLayers[ 0 ].categoricalCrossEntropyLossLayer is not None ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.adamOptimizer is not None ) self.assertTrue( _np.isclose( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.learningRate.defaultValue, 1e-2, atol=1e-4, ) ) self.assertTrue( _np.isclose( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.miniBatchSize.defaultValue, 10, atol=1e-4, ) ) self.assertTrue( _np.isclose( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.beta1.defaultValue, 0.9, atol=1e-4, ) ) self.assertTrue( _np.isclose( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.beta2.defaultValue, 0.999, atol=1e-4, ) ) self.assertTrue( _np.isclose( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.eps.defaultValue, 1e-8, atol=1e-8, ) ) self.assertTrue( _np.isclose( spec.neuralNetwork.updateParams.epochs.defaultValue, 20, atol=1e-4 ) ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.learningRate.range.minValue == 0 ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.learningRate.range.maxValue == 1 ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.miniBatchSize.set.values == [10, 20] ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.beta1.range.minValue == 0 ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.beta1.range.maxValue == 1 ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.beta2.range.minValue == 0 ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.beta2.range.maxValue == 1 ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.eps.range.minValue == 0 ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.eps.range.maxValue == 1 ) self.assertTrue(spec.neuralNetwork.updateParams.epochs.set.values == [20]) def test_updatable_model_creation_mse_sgd(self): builder = self.create_base_builder() builder.set_mean_squared_error_loss( name="mse", input_feature=("output", datatypes.Array(3)) ) builder.set_sgd_optimizer(SgdParams(lr=1e-2, batch=10, momentum=0.0)) 
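        # The SGD parameters configured above become the update-parameter defaults in the
        # saved spec: learningRate defaults to 1e-2 with range [0, 1], miniBatchSize to the
        # allowed set [10], and momentum to 0 with range [0, 1]; the assertions below check
        # each of these after the model round-trips through save_spec/MLModel.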
builder.set_epochs(20) model_path = os.path.join(self.model_dir, "updatable_creation.mlmodel") print(model_path) save_spec(builder.spec, model_path) mlmodel = MLModel(model_path) self.assertTrue(mlmodel is not None) spec = mlmodel.get_spec() self.assertTrue(spec.isUpdatable) self.assertTrue(spec.neuralNetwork.layers[0].isUpdatable) self.assertTrue(spec.neuralNetwork.layers[0].innerProduct.weights.isUpdatable) self.assertTrue(spec.neuralNetwork.layers[1].isUpdatable) self.assertTrue(spec.neuralNetwork.layers[1].innerProduct.weights.isUpdatable) self.assertTrue( spec.neuralNetwork.updateParams.lossLayers[ 0 ].categoricalCrossEntropyLossLayer is not None ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.sgdOptimizer is not None ) self.assertTrue( _np.isclose( spec.neuralNetwork.updateParams.optimizer.sgdOptimizer.learningRate.defaultValue, 1e-2, atol=1e-4, ) ) self.assertTrue( _np.isclose( spec.neuralNetwork.updateParams.optimizer.sgdOptimizer.miniBatchSize.defaultValue, 10, atol=1e-4, ) ) self.assertTrue( _np.isclose( spec.neuralNetwork.updateParams.optimizer.sgdOptimizer.momentum.defaultValue, 0, atol=1e-8, ) ) self.assertTrue( _np.isclose( spec.neuralNetwork.updateParams.epochs.defaultValue, 20, atol=1e-4 ) ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.sgdOptimizer.learningRate.range.minValue == 0 ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.sgdOptimizer.learningRate.range.maxValue == 1 ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.sgdOptimizer.miniBatchSize.set.values == [10] ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.sgdOptimizer.momentum.range.minValue == 0 ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.sgdOptimizer.momentum.range.maxValue == 1 ) def test_updatable_model_creation_mse_adam(self): builder = self.create_base_builder() builder.set_mean_squared_error_loss( name="mse", input_feature=("output", datatypes.Array(3)) ) builder.set_adam_optimizer( AdamParams(lr=1e-2, batch=10, beta1=0.9, beta2=0.999, eps=1e-8) ) builder.set_epochs(20, allowed_set=[10, 20, 30]) model_path = os.path.join(self.model_dir, "updatable_creation.mlmodel") print(model_path) save_spec(builder.spec, model_path) mlmodel = MLModel(model_path) self.assertTrue(mlmodel is not None) spec = mlmodel.get_spec() self.assertTrue(spec.isUpdatable) self.assertTrue(spec.neuralNetwork.layers[0].isUpdatable) self.assertTrue(spec.neuralNetwork.layers[0].innerProduct.weights.isUpdatable) self.assertTrue(spec.neuralNetwork.layers[1].isUpdatable) self.assertTrue(spec.neuralNetwork.layers[1].innerProduct.weights.isUpdatable) self.assertTrue( spec.neuralNetwork.updateParams.lossLayers[ 0 ].categoricalCrossEntropyLossLayer is not None ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.adamOptimizer is not None ) self.assertTrue( _np.isclose( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.learningRate.defaultValue, 1e-2, atol=1e-4, ) ) self.assertTrue( _np.isclose( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.miniBatchSize.defaultValue, 10, atol=1e-4, ) ) self.assertTrue( _np.isclose( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.beta1.defaultValue, 0.9, atol=1e-4, ) ) self.assertTrue( _np.isclose( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.beta2.defaultValue, 0.999, atol=1e-4, ) ) self.assertTrue( _np.isclose( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.eps.defaultValue, 1e-8, atol=1e-8, ) ) self.assertTrue( _np.isclose( spec.neuralNetwork.updateParams.epochs.defaultValue, 20, 
atol=1e-4 ) ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.learningRate.range.minValue == 0 ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.learningRate.range.maxValue == 1 ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.miniBatchSize.set.values == [10] ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.beta1.range.minValue == 0 ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.beta1.range.maxValue == 1 ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.beta2.range.minValue == 0 ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.beta2.range.maxValue == 1 ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.eps.range.minValue == 0 ) self.assertTrue( spec.neuralNetwork.updateParams.optimizer.adamOptimizer.eps.range.maxValue == 1 ) self.assertTrue( spec.neuralNetwork.updateParams.epochs.set.values == [10, 20, 30] ) def test_nn_set_cce_without_softmax_fail(self): nn_builder = self.create_base_builder() # fails since adding CCE without softmax must raise error with self.assertRaises(ValueError): nn_builder.set_categorical_cross_entropy_loss( name="cross_entropy", input="output" ) def test_nn_set_cce_invalid(self): nn_builder = self.create_base_builder() nn_builder.add_softmax( name="softmax", input_name="output", output_name="softmax_output" ) # fails since CCE input must be softmax output with self.assertRaises(ValueError): nn_builder.set_categorical_cross_entropy_loss( name="cross_entropy", input="output" ) def test_nn_set_softmax_updatable_invalid(self): nn_builder = self.create_base_builder() nn_builder.add_softmax( name="softmax", input_name="output", output_name="softmax_output" ) # fails since marking softmax as updatable layer is not allowed with self.assertRaises(ValueError): nn_builder.make_updatable(["softmax"]) def test_nn_set_training_input(self): builder = self.create_base_builder() builder.set_mean_squared_error_loss( name="mse", input_feature=("output", datatypes.Array(3)) ) builder.set_adam_optimizer( AdamParams(lr=1e-2, batch=10, beta1=0.9, beta2=0.999, eps=1e-8) ) builder.set_epochs(20, allowed_set=[10, 20, 30]) model_path = os.path.join(self.model_dir, "updatable_creation.mlmodel") print(model_path) save_spec(builder.spec, model_path) mlmodel = MLModel(model_path) self.assertTrue(mlmodel is not None) spec = mlmodel.get_spec() self.assertEqual(spec.description.trainingInput[0].name, "input") self.assertEqual( spec.description.trainingInput[0].type.WhichOneof("Type"), "multiArrayType" ) self.assertEqual(spec.description.trainingInput[1].name, "output_true") self.assertEqual( spec.description.trainingInput[1].type.WhichOneof("Type"), "multiArrayType" ) def test_nn_builder_with_training_features(self): input_features = [("input", datatypes.Array(3))] output_features = [("output", datatypes.Array(3))] builder = NeuralNetworkBuilder(input_features, output_features) W1 = _np.random.uniform(-0.5, 0.5, (3, 3)) W2 = _np.random.uniform(-0.5, 0.5, (3, 3)) builder.add_inner_product( name="ip1", W=W1, b=None, input_channels=3, output_channels=3, has_bias=False, input_name="input", output_name="hidden", ) builder.add_inner_product( name="ip2", W=W2, b=None, input_channels=3, output_channels=3, has_bias=False, input_name="hidden", output_name="output", ) builder.make_updatable(["ip1", "ip2"]) # or a dict for weightParams builder.set_mean_squared_error_loss( name="mse", input_feature=("output", 
datatypes.Array(3)) ) builder.set_adam_optimizer( AdamParams(lr=1e-2, batch=10, beta1=0.9, beta2=0.999, eps=1e-8) ) builder.set_epochs(20, allowed_set=[10, 20, 30]) model_path = os.path.join(self.model_dir, "updatable_creation.mlmodel") print(model_path) save_spec(builder.spec, model_path) mlmodel = MLModel(model_path) self.assertTrue(mlmodel is not None) spec = mlmodel.get_spec() self.assertEqual(spec.description.trainingInput[0].name, "input") self.assertEqual( spec.description.trainingInput[0].type.WhichOneof("Type"), "multiArrayType" ) self.assertEqual(spec.description.trainingInput[1].name, "output_true") self.assertEqual( spec.description.trainingInput[1].type.WhichOneof("Type"), "multiArrayType" ) def test_nn_fp16_make_updatable_fail(self): nn_builder = self.create_base_builder(is_updatable=False) model_path = os.path.join(self.model_dir, "updatable_creation.mlmodel") save_spec(nn_builder.spec, model_path) mlmodel = MLModel(model_path) quantized_result = quantization_utils.quantize_weights(mlmodel, 16, "linear") q_nn_builder = NeuralNetworkBuilder(spec=quantized_result._spec) # fails since an FP16 model cannot be marked updatable with self.assertRaises(ValueError): q_nn_builder.make_updatable(["ip1", "ip2"]) def test_nn_partial_fp16_make_updatable_fail(self): nn_builder = self.create_base_builder(is_updatable=False) model_path = os.path.join(self.model_dir, "updatable_creation.mlmodel") print(model_path) save_spec(nn_builder.spec, model_path) mlmodel = MLModel(model_path) selector = LayerSelector(layer_name='ip1') quantized_model = quantization_utils.quantize_weights(mlmodel, 16, "linear", selector=selector) q_nn_builder = NeuralNetworkBuilder(spec=quantized_model._spec) # fails since model has a layer with FP16 bias with self.assertRaises(ValueError): q_nn_builder.make_updatable(["ip2"]) def test_nn_partial_fp16_make_updatable_quantized_layer_fail(self): nn_builder = self.create_base_builder(is_updatable=False) model_path = os.path.join(self.model_dir, "updatable_creation.mlmodel") print(model_path) save_spec(nn_builder.spec, model_path) mlmodel = MLModel(model_path) selector = LayerSelector(layer_name='ip2') quantized_result = quantization_utils.quantize_weights(mlmodel, 16, "linear", selector=selector) quantized_spec = quantized_result._spec q_nn_builder = NeuralNetworkBuilder(spec=quantized_spec) # fails since model has a layer with FP16 bias with self.assertRaises(ValueError): q_nn_builder.make_updatable(["ip2"]) def test_nn_partial_fp16_make_updatable_fail(self): nn_builder = self.create_base_builder() model_path = os.path.join(self.model_dir, "updatable_creation.mlmodel") print(model_path) save_spec(nn_builder.spec, model_path) mlmodel = MLModel(model_path) # fails since updatable models cannot get quantized to FP16 with self.assertRaises(Exception): quantization_utils.quantize_weights(mlmodel, 16, "linear") def test_pipeline_regressor_make_updatable(self): builder = self.create_base_builder() builder.spec.isUpdatable = False training_input = [("input", datatypes.Array(3)), ("target", "Double")] # fails due to missing sub-models p_regressor = PipelineRegressor( self.input_features, self.output_names, training_input ) with self.assertRaises(ValueError): p_regressor.make_updatable() self.assertEqual(p_regressor.spec.isUpdatable, False) # fails due to sub-model being not updatable p_regressor.add_model(builder.spec) with self.assertRaises(ValueError): p_regressor.make_updatable() self.assertEqual(p_regressor.spec.isUpdatable, False) builder.spec.isUpdatable = True 
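        # With the sub-model now marked updatable, adding it allows the pipeline itself to
        # be made updatable; make_updatable() is also expected to populate the trainingInput
        # description ("input" as a multi-array, "target" as a double), which the
        # assertions below verify.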
p_regressor.add_model(builder.spec) self.assertEqual(p_regressor.spec.isUpdatable, False) p_regressor.make_updatable() self.assertEqual(p_regressor.spec.isUpdatable, True) self.assertEqual(p_regressor.spec.description.trainingInput[0].name, "input") self.assertEqual( p_regressor.spec.description.trainingInput[0].type.WhichOneof("Type"), "multiArrayType", ) self.assertEqual(p_regressor.spec.description.trainingInput[1].name, "target") self.assertEqual( p_regressor.spec.description.trainingInput[1].type.WhichOneof("Type"), "doubleType", ) # fails since once updatable does not allow adding new models with self.assertRaises(ValueError): p_regressor.add_model(builder.spec) self.assertEqual(p_regressor.spec.isUpdatable, True) def test_pipeline_classifier_make_updatable(self): builder = self.create_base_builder() builder.spec.isUpdatable = False training_input = [("input", datatypes.Array(3)), ("target", "String")] # fails due to missing sub-models p_classifier = PipelineClassifier( self.input_features, self.output_names, training_features=training_input ) with self.assertRaises(ValueError): p_classifier.make_updatable() self.assertEqual(p_classifier.spec.isUpdatable, False) # fails due to sub-model being not updatable p_classifier.add_model(builder.spec) with self.assertRaises(ValueError): p_classifier.make_updatable() self.assertEqual(p_classifier.spec.isUpdatable, False) builder.spec.isUpdatable = True p_classifier.add_model(builder.spec) self.assertEqual(p_classifier.spec.isUpdatable, False) p_classifier.make_updatable() self.assertEqual(p_classifier.spec.isUpdatable, True) self.assertEqual(p_classifier.spec.description.trainingInput[0].name, "input") self.assertEqual( p_classifier.spec.description.trainingInput[0].type.WhichOneof("Type"), "multiArrayType", ) self.assertEqual(p_classifier.spec.description.trainingInput[1].name, "target") self.assertEqual( p_classifier.spec.description.trainingInput[1].type.WhichOneof("Type"), "stringType", ) # fails since once updatable does not allow adding new models with self.assertRaises(ValueError): p_classifier.add_model(builder.spec) self.assertEqual(p_classifier.spec.isUpdatable, True) def test_pipeline_classifier_set_training_inputs(self): builder = self.create_base_builder() builder.spec.isUpdatable = False training_input = [("input", datatypes.Array(3)), ("target", "String")] # fails due to missing sub-models p_classifier = PipelineClassifier(self.input_features, self.output_names) p_classifier.set_training_input(training_input) with self.assertRaises(ValueError): p_classifier.make_updatable() self.assertEqual(p_classifier.spec.isUpdatable, False) # fails due to sub-model being not updatable p_classifier.add_model(builder.spec) with self.assertRaises(ValueError): p_classifier.make_updatable() self.assertEqual(p_classifier.spec.isUpdatable, False) builder.spec.isUpdatable = True p_classifier.add_model(builder.spec) self.assertEqual(p_classifier.spec.isUpdatable, False) p_classifier.make_updatable() self.assertEqual(p_classifier.spec.isUpdatable, True) self.assertEqual(p_classifier.spec.description.trainingInput[0].name, "input") self.assertEqual( p_classifier.spec.description.trainingInput[0].type.WhichOneof("Type"), "multiArrayType", ) self.assertEqual(p_classifier.spec.description.trainingInput[1].name, "target") self.assertEqual( p_classifier.spec.description.trainingInput[1].type.WhichOneof("Type"), "stringType", ) # fails since once updatable does not allow adding new models with self.assertRaises(ValueError): 
p_classifier.add_model(builder.spec) self.assertEqual(p_classifier.spec.isUpdatable, True) def test_shuffle_on_by_default(self): builder = self.create_base_builder() # base builder already marks two layers as updatable self.assertTrue( builder.nn_spec.updateParams.shuffle.defaultValue, "Shuffle not turned on by default for updatable models", ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/pipeline/test_pipeline.py0000644000000000000000000003537714672066616023062 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import tempfile import unittest import numpy as np import pytest from ..utils import load_boston import coremltools as ct from coremltools._deps import _HAS_LIBSVM, _HAS_SKLEARN from coremltools.converters.mil import mil from coremltools.converters.mil.mil import Builder as mb from coremltools.converters.mil.mil import Function, Program from coremltools.models.pipeline import PipelineClassifier, PipelineRegressor from coremltools.models.utils import _is_macos if _HAS_SKLEARN: from sklearn.linear_model import LinearRegression from sklearn.pipeline import Pipeline from sklearn.preprocessing import OneHotEncoder from coremltools.converters import sklearn as converter if _HAS_LIBSVM: from libsvm import svmutil from coremltools.converters import libsvm as libsvm_converter @unittest.skipIf(not _HAS_SKLEARN, "Missing scikit-learn. Skipping tests.") @unittest.skipIf(not _HAS_LIBSVM, "Missing libsvm. Skipping tests.") class LinearRegressionPipelineCreationTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ scikit_data = load_boston() feature_names = scikit_data["feature_names"] scikit_model = LinearRegression() scikit_model.fit(scikit_data["data"], scikit_data["target"]) scikit_spec = converter.convert( scikit_model, feature_names, "target" ).get_spec() # Save the data and the model self.scikit_data = scikit_data self.scikit_model = scikit_model self.scikit_spec = scikit_spec def test_pipeline_regression_creation(self): input_names = self.scikit_data["feature_names"] output_name = "target" p_regressor = PipelineRegressor(input_names, "target") p_regressor.add_model(self.scikit_spec) self.assertIsNotNone(p_regressor.spec) self.assertEqual(len(p_regressor.spec.pipelineRegressor.pipeline.models), 1) # Test the model class of the linear regressor model spec = p_regressor.spec.pipelineRegressor.pipeline.models[0] self.assertIsNotNone(spec.description) # Test the interface class self.assertEqual(spec.description.predictedFeatureName, "target") # Test the inputs and outputs self.assertEqual(len(spec.description.output), 1) self.assertEqual(spec.description.output[0].name, "target") self.assertEqual( spec.description.output[0].type.WhichOneof("Type"), "doubleType" ) for input_type in spec.description.input: self.assertEqual(input_type.type.WhichOneof("Type"), "doubleType") self.assertEqual( sorted(input_names), sorted(map(lambda x: x.name, spec.description.input)) ) @unittest.skipIf(not _HAS_SKLEARN, "Missing scikit-learn. Skipping tests.") @unittest.skipIf(not _HAS_LIBSVM, "Missing libsvm. 
Skipping tests.") class LibSVMPipelineCreationTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ if not _HAS_SKLEARN: return if not _HAS_LIBSVM: return scikit_data = load_boston() prob = svmutil.svm_problem( scikit_data["target"] > scikit_data["target"].mean(), scikit_data["data"].tolist(), ) param = svmutil.svm_parameter() param.svm_type = svmutil.C_SVC param.kernel_type = svmutil.LINEAR param.eps = 1 libsvm_model = svmutil.svm_train(prob, param) libsvm_spec = libsvm_converter.convert( libsvm_model, scikit_data["feature_names"], "target" ).get_spec() # Save the data and the model self.scikit_data = scikit_data self.libsvm_spec = libsvm_spec def test_pipeline_classifier_creation(self): input_names = self.scikit_data["feature_names"] p_classifier = PipelineClassifier(input_names, [1, 0]) p_classifier.add_model(self.libsvm_spec) self.assertIsNotNone(p_classifier.spec) self.assertEqual(len(p_classifier.spec.pipelineClassifier.pipeline.models), 1) # Test the model class of the svm model spec = p_classifier.spec.pipelineClassifier.pipeline.models[0] self.assertIsNotNone(spec.description) # Test the interface class self.assertEqual(spec.description.predictedFeatureName, "target") # Test the inputs and outputs self.assertEqual(len(spec.description.output), 1) self.assertEqual(spec.description.output[0].name, "target") self.assertEqual( spec.description.output[0].type.WhichOneof("Type"), "int64Type" ) for input_type in spec.description.input: self.assertEqual(input_type.type.WhichOneof("Type"), "doubleType") self.assertEqual( sorted(input_names), sorted(map(lambda x: x.name, spec.description.input)) ) @unittest.skipIf(not _HAS_SKLEARN, "Missing scikit-learn. Skipping tests.") class LinearRegressionPipeline(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. 
""" scikit_data = load_boston() scikit_model = Pipeline(steps=[("linear", LinearRegression())]) scikit_model.fit(scikit_data["data"], scikit_data["target"]) # Save the data and the model self.scikit_data = scikit_data self.scikit_model = scikit_model def test_pipeline_regression_creation(self): input_names = self.scikit_data["feature_names"] output_name = "target" p_regressor_model = converter.convert(self.scikit_model, input_names, "target") x = dict(zip(self.scikit_data["feature_names"], self.scikit_data["data"][0])) y = p_regressor_model.predict(x) self.assertIsNotNone(y) with tempfile.TemporaryDirectory() as save_dir: p_regressor_model.save(save_dir + "/test.mlmodel") p_regressor = p_regressor_model.get_spec() self.assertIsNotNone(p_regressor) self.assertEqual(len(p_regressor.pipelineRegressor.pipeline.models), 2) # Test the model class of the linear regressor model spec = p_regressor.pipelineRegressor.pipeline.models[-1] self.assertIsNotNone(spec.description) # Test the interface class self.assertEqual(spec.description.predictedFeatureName, "target") # Test the inputs and outputs self.assertEqual(len(spec.description.output), 1) self.assertEqual(spec.description.output[0].name, "target") self.assertEqual( spec.description.output[0].type.WhichOneof("Type"), "doubleType" ) for input_type in p_regressor.description.input: self.assertEqual(input_type.type.WhichOneof("Type"), "doubleType") self.assertEqual( sorted(input_names), sorted(map(lambda x: x.name, p_regressor.description.input)), ) def test_conversion_bad_inputs(self): """ Failure testing for bad conversion. """ # Error on converting an untrained model with self.assertRaises(TypeError): model = OneHotEncoder() spec = converter.convert(model, "data", "out", "regressor") class TestMakePipeline: @staticmethod def _make_model(input_name, input_length, output_name, output_length, convert_to='mlprogram', compute_units=ct.ComputeUnit.ALL): weight_tensor = np.arange(input_length * output_length, dtype='float32') weight_tensor = weight_tensor.reshape(output_length, input_length) prog = mil.Program() func_inputs = {input_name: mb.placeholder(shape=(input_length,))} with Function(func_inputs) as ssa_fun: input = ssa_fun.inputs[input_name] y = mb.linear(x=input, weight=weight_tensor, name=output_name) ssa_fun.set_outputs([y]) prog.add_function("main", ssa_fun) return ct.convert(prog, convert_to=convert_to, compute_units=compute_units) @staticmethod @pytest.mark.parametrize( "model1_backend, model2_backend", itertools.product(["mlprogram", "neuralnetwork"], ["mlprogram", "neuralnetwork"]), ) def test_simple(model1_backend, model2_backend): # Create models m1 = TestMakePipeline._make_model("x", 20, "y1", 10, model1_backend) m2 = TestMakePipeline._make_model("y1", 10, "y2", 2, model2_backend) # Get non-pipeline result x = np.random.rand(20) if _is_macos(): y1 = m1.predict({"x": x})["y1"] y2 = m2.predict({"y1": y1}) pipeline_model = ct.utils.make_pipeline(m1, m2) if _is_macos(): y_pipeline = pipeline_model.predict({"x": x}) assert(len(y_pipeline) == 1) np.testing.assert_allclose(y2["y2"], y_pipeline["y2"]) # Check save/load with tempfile.TemporaryDirectory() as save_dir: # Save pipeline save_path = save_dir + "/test.mlpackage" pipeline_model.save(save_path) # Check loading from a mlpackage path p2 = ct.models.MLModel(save_path) if _is_macos(): y_pipeline = p2.predict({"x": x}) np.testing.assert_allclose(y2["y2"], y_pipeline["y2"]) # Check loading from spec and weight dir p3 = ct.models.MLModel(p2.get_spec(), weights_dir=p2.weights_dir) if 
_is_macos(): y_pipeline = p3.predict({"x": x}) np.testing.assert_allclose(y2["y2"], y_pipeline["y2"]) @staticmethod def test_compute_unit(): # Case 1 - Inferring compute_unit m1 = TestMakePipeline._make_model("x", 20, "y1", 10, compute_units=ct.ComputeUnit.CPU_ONLY) m2 = TestMakePipeline._make_model("y1", 10, "y2", 2, compute_units=ct.ComputeUnit.CPU_ONLY) pipeline_model = ct.utils.make_pipeline(m1, m2) assert pipeline_model.compute_unit is ct.ComputeUnit.CPU_ONLY # Case 2 - Specifying compute_unit pipeline_model = ct.utils.make_pipeline(m1, m2, compute_units=ct.ComputeUnit.ALL) assert pipeline_model.compute_unit is ct.ComputeUnit.ALL # Case 3 (error case) - No compute_unit specified and the two models don't agree m2 = TestMakePipeline._make_model("y1", 10, "y2", 2, compute_units=ct.ComputeUnit.ALL) with pytest.raises(ValueError, match='"compute_units" parameter must be specified.'): pipeline_model = ct.utils.make_pipeline(m1, m2) # Case 4 (error case) - Garbage compute_unit input with pytest.raises(TypeError, match='"compute_units" parameter must'): pipeline_model = ct.utils.make_pipeline(m1, m2, compute_units="Garbage!") @staticmethod def test_second_model_needs_pipeline_input(): # First model takes one parameter p1 = mil.Program() func_inputs = {'x1': mb.placeholder(shape=(2,))} with Function(func_inputs) as ssa_fun: x1 = ssa_fun.inputs['x1'] y1 = mb.add(x=x1, y=[0., 1.], name='y1') ssa_fun.set_outputs([y1]) p1.add_function("main", ssa_fun) m1 = ct.convert(p1) # Second model takes two parameters. One will be from previous model in pipeline. # The other as pipeline input. p2 = mil.Program() func_inputs = { 'y1': mb.placeholder(shape=(2,)), 'x2': mb.placeholder(shape=(2,)), } with Function(func_inputs) as ssa_fun: x2, y1 = ssa_fun.inputs['x2'], ssa_fun.inputs['y1'] y2 = mb.add(x=x2, y=y1, name='y2') ssa_fun.set_outputs([y2]) p2.add_function("main", ssa_fun) m2 = ct.convert(p2) # Get predictions without a pipeline x1 = [1.,2.] y1 = m1.predict({'x1': x1})['y1'] x2 = [4., 9.] y2 = m2.predict({'x2': x2, 'y1': y1})['y2'] # Make a pipeline and get predictions from it pipeline = ct.utils.make_pipeline(m1, m2) y_pipeline = pipeline.predict({'x1': x1, 'x2': x2}) assert len(y_pipeline) == 1 np.testing.assert_allclose(y2, y_pipeline['y2']) @staticmethod def test_pipeline_outputs_from_multiple_models(): # Create models m1 = TestMakePipeline._make_model("x", 20, "y1", 10) m2 = TestMakePipeline._make_model("y1", 10, "y2", 2) m3 = TestMakePipeline._make_model("y1", 10, "y3", 4) # Get non-pipeline results x = np.random.rand(20) if _is_macos(): y1 = m1.predict({"x": x})["y1"] y2 = m2.predict({"y1": y1}) y3 = m3.predict({"y1": y1}) pipeline_model = ct.utils.make_pipeline(m1, m2, m3) if _is_macos(): y_pipeline = pipeline_model.predict({"x": x}) assert(len(y_pipeline) == 2) np.testing.assert_allclose(y2["y2"], y_pipeline["y2"]) np.testing.assert_allclose(y3["y3"], y_pipeline["y3"]) @staticmethod def test_pipeline_input_goes_to_multiple_models(): # Create the first two models that take the same input m1 = TestMakePipeline._make_model("x", 20, "y1", 10) m2 = TestMakePipeline._make_model("x", 20, "y2", 10) # Create the last models which add the output from the other two models. 
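        # Rough shape of the intended graph (m3, built by hand below, sums the two branches):
        #     x -> m1 -> y1
        #     x -> m2 -> y2
        #     (y1, y2) -> m3 -> y3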
p3 = Program() func_inputs = {'y1': mb.placeholder(shape=(10,)), 'y2': mb.placeholder(shape=(10,)),} with Function(func_inputs) as ssa_fun: y1, y2 = ssa_fun.inputs['y1'], ssa_fun.inputs['y2'] y3 = mb.add(x=y1, y=y2, name='y3') ssa_fun.set_outputs([y3]) p3.add_function("main", ssa_fun) m3 = ct.convert(p3) # Get non-pipeline result x = np.random.rand(20) if _is_macos(): y1 = m1.predict({"x": x})["y1"] y2 = m2.predict({"x": x})["y2"] y3 = m3.predict({"y1": y1, "y2": y2}) pipeline_model = ct.utils.make_pipeline(m1, m2, m3) if _is_macos(): y_pipeline = pipeline_model.predict({"x": x}) assert(len(y_pipeline) == 1) np.testing.assert_allclose(y3['y3'], y_pipeline["y3"]) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2935474 coremltools-8.0/coremltools/test/sklearn_tests/0000755000000000000000000000000014672075535020701 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/__init__.py0000644000000000000000000000032714672066616023014 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_NuSVC.py0000644000000000000000000002706014672066616023255 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import random import tempfile import unittest import pandas as pd import pytest from coremltools._deps import (_HAS_LIBSVM, _HAS_SKLEARN, _SKLEARN_VERSION, MSG_LIBSVM_NOT_FOUND, MSG_SKLEARN_NOT_FOUND) from coremltools.models.utils import (_is_macos, _macos_version, evaluate_classifier, evaluate_classifier_with_probabilities) if _HAS_LIBSVM: from libsvm import svmutil from svmutil import svm_predict, svm_train from coremltools.converters import libsvm if _HAS_SKLEARN: from packaging.version import Version from sklearn.preprocessing import OneHotEncoder from sklearn.svm import NuSVC from coremltools.converters import sklearn as scikit_converter @unittest.skipIf(not _HAS_SKLEARN, MSG_SKLEARN_NOT_FOUND) class NuSvcScikitTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ def _evaluation_test_helper( self, class_labels, use_probability_estimates, allow_slow, allowed_prob_delta=0.00001, ): # Parameters to test kernel_parameters = [ {}, {"kernel": "rbf", "gamma": 1.2}, {"kernel": "linear"}, {"kernel": "poly"}, {"kernel": "poly", "degree": 2}, {"kernel": "poly", "gamma": 0.75}, ] # sklearn version > 0.22 NuSVC introduced finiteness checks that fail for # the 'sigmoid' and one 'poly' kernel test cases. Avoid those. 
# See https://github.com/scikit-learn/scikit-learn/issues/17925 if _SKLEARN_VERSION <= Version("0.22"): kernel_parameters += [ {"kernel": "poly", "degree": 0, "gamma": 0.9, "coef0": 2}, {"kernel": "sigmoid"}, {"kernel": "sigmoid", "gamma": 1.3}, {"kernel": "sigmoid", "coef0": 0.8}, {"kernel": "sigmoid", "coef0": 0.8, "gamma": 0.5}, ] non_kernel_parameters = [ {}, {"nu": 0.75}, {"nu": 0.25, "shrinking": True}, {"shrinking": False}, ] # Generate some random data x, y = [], [] random.seed(42) for _ in range(50): x.append( [random.gauss(200, 30), random.gauss(-100, 22), random.gauss(100, 42)] ) y.append(random.choice(class_labels)) column_names = ["x1", "x2", "x3"] # make sure first label is seen first, second is seen second, and so on. for i, val in enumerate(class_labels): y[i] = val df = pd.DataFrame(x, columns=column_names) # Test for param1 in non_kernel_parameters: for param2 in kernel_parameters: cur_params = param1.copy() cur_params.update(param2) cur_params["probability"] = use_probability_estimates cur_params["max_iter"] = 10 # Don't want test to take too long cur_model = NuSVC(**cur_params) cur_model.fit(x, y) spec = scikit_converter.convert(cur_model, column_names, "target") if _is_macos() and _macos_version() >= (10, 13): if use_probability_estimates: probability_lists = cur_model.predict_proba(x) df["classProbability"] = [ dict(zip(cur_model.classes_, cur_vals)) for cur_vals in probability_lists ] metrics = evaluate_classifier_with_probabilities( spec, df, probabilities="classProbability" ) self.assertEqual(metrics["num_key_mismatch"], 0) self.assertLess( metrics["max_probability_error"], allowed_prob_delta ) else: df["target"] = cur_model.predict(x) metrics = evaluate_classifier(spec, df, verbose=False) self.assertEqual(metrics["num_errors"], 0) if not allow_slow: break if not allow_slow: break @pytest.mark.slow def test_binary_class_int_label_without_probability_stress_test(self): self._evaluation_test_helper([1, 3], False, allow_slow=True) def test_binary_class_int_label_without_probability(self): self._evaluation_test_helper([1, 3], False, allow_slow=False) @pytest.mark.slow def test_binary_class_string_label_with_probability_stress_test(self): # Scikit Learn uses technique to normalize pairwise probabilities even for binary classification. # This leads to difference in probabilities. self._evaluation_test_helper( ["foo", "bar"], True, allow_slow=True, allowed_prob_delta=0.005 ) def test_binary_class_string_label_with_probability(self): # Scikit Learn uses technique to normalize pairwise probabilities even for binary classification. # This leads to difference in probabilities. 
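        # Hence the looser tolerance below: allowed_prob_delta=0.005 instead of the
        # helper's default of 0.00001.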
self._evaluation_test_helper( ["foo", "bar"], True, allow_slow=False, allowed_prob_delta=0.005 ) @pytest.mark.slow def test_multi_class_int_label_without_probability_stress_test(self): self._evaluation_test_helper([12, 33, -1, 1234], False, allow_slow=True) def test_multi_class_int_label_without_probability(self): self._evaluation_test_helper([12, 33, -1, 1234], False, allow_slow=False) @pytest.mark.slow def test_multi_class_string_label_with_probability_stress_test(self): self._evaluation_test_helper(["X", "Y", "z"], True, allow_slow=True) def test_multi_class_string_label_with_probability(self): self._evaluation_test_helper(["X", "Y", "z"], True, allow_slow=False) def test_conversion_bad_inputs(self): # Error on converting an untrained model with self.assertRaises(TypeError): model = NuSVC() spec = scikit_converter.convert(model, "data", "out") # Check the expected class during conversion with self.assertRaises(TypeError): model = OneHotEncoder() spec = scikit_converter.convert(model, "data", "out") @unittest.skipIf(not _HAS_LIBSVM, MSG_LIBSVM_NOT_FOUND) @unittest.skipIf(not _HAS_SKLEARN, MSG_SKLEARN_NOT_FOUND) class NuSVCLibSVMTest(unittest.TestCase): # Model parameters for testing base_param = "-s 1 -q" # model type C-SVC and quiet mode non_kernel_parameters = ["", "-n 0.6 -p 0.5 -h 1", "-c 0.5 -p 0.5 -h 0"] kernel_parameters = [ "-t 0", # linear kernel "", "-t 2 -g 1.2", # rbf kernel "-t 1", "-t 1 -d 2", "-t 1 -g 0.75", "-t 1 -d 0 -g 0.9 -r 2", # poly kernel "-t 3", "-t 3 -g 1.3", "-t 3 -r 0.8", "-t 3 -r 0.8 -g 0.5", # sigmoid kernel ] """ Unit test class for testing the libsvm sklearn converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ if not _HAS_LIBSVM: # setUpClass is still called even if class is skipped. return # Generate some random data. # This unit test should not rely on scikit learn for test data. 
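        # Two gaussian features with binary labels drawn from {1, 2}; the first two
        # samples are pinned below so the label ordering seen by libsvm is deterministic.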
self.x, self.y = [], [] random.seed(42) for _ in range(50): self.x.append([random.gauss(200, 30), random.gauss(-100, 22)]) self.y.append(random.choice([1, 2])) self.y[0] = 1 # Make sure 1 is always the first label it sees self.y[1] = 2 self.column_names = ["x1", "x2"] self.prob = svmutil.svm_problem(self.y, self.x) param = svmutil.svm_parameter() param.svm_type = svmutil.NU_SVC param.kernel_type = svmutil.LINEAR param.eps = 1 param.probability = 1 # Save the data and the model self.libsvm_model = svmutil.svm_train(self.prob, param) self.df = pd.DataFrame(self.x, columns=self.column_names) def _test_prob_model(self, param1, param2): probability_param = "-b 1" df = self.df param_str = " ".join([self.base_param, param1, param2, probability_param]) param = svmutil.svm_parameter(param_str) model = svm_train(self.prob, param) # Get predictions with probabilities as dictionaries (df["prediction"], _, probability_lists) = svm_predict( self.y, self.x, model, probability_param + " -q" ) probability_dicts = [ dict(zip([1, 2], cur_vals)) for cur_vals in probability_lists ] df["probabilities"] = probability_dicts spec = libsvm.convert(model, self.column_names, "target", "probabilities") if _is_macos() and _macos_version() >= (10, 13): metrics = evaluate_classifier_with_probabilities(spec, df, verbose=False) self.assertEqual(metrics["num_key_mismatch"], 0) self.assertLess(metrics["max_probability_error"], 0.00001) @pytest.mark.slow def test_binary_classificaiton_with_probability_stress_test(self): for param1 in self.non_kernel_parameters: for param2 in self.kernel_parameters: self._test_prob_model(param1, param2) def test_binary_classificaiton_with_probability(self): param1 = self.non_kernel_parameters[0] param2 = self.kernel_parameters[0] self._test_prob_model(param1, param2) @pytest.mark.slow @unittest.skip( "LibSVM's Python library is broken for NuSVC without probabilities. It always segfaults during prediction time." ) def test_multi_class_without_probability(self): # Generate some random data. # This unit test should not rely on scikit learn for test data. x, y = [], [] for _ in range(50): x.append( [random.gauss(200, 30), random.gauss(-100, 22), random.gauss(100, 42)] ) y.append(random.choice([1, 2, 10, 12])) y[0], y[1], y[2], y[3] = 1, 2, 10, 12 column_names = ["x1", "x2", "x3"] prob = svmutil.svm_problem(y, x) df = pd.DataFrame(x, columns=column_names) for param1 in self.non_kernel_parameters: for param2 in self.kernel_parameters: param_str = " ".join([self.base_param, param1, param2]) param = svmutil.svm_parameter(param_str) model = svm_train(prob, param) # Get predictions with probabilities as dictionaries (df["prediction"], _, _) = svm_predict(y, x, model, " -q") spec = libsvm.convert(model, column_names, "target") metrics = evaluate_classifier(spec, df, verbose=False) self.assertEqual(metrics["num_errors"], 0) def test_conversion_from_filesystem(self): libsvm_model_path = tempfile.mktemp(suffix="model.libsvm") svmutil.svm_save_model(libsvm_model_path, self.libsvm_model) spec = libsvm.convert(libsvm_model_path, "data", "target") def test_conversion_bad_inputs(self): # Check the expected class during conversion. with self.assertRaises(TypeError): model = OneHotEncoder() spec = libsvm.convert(model, "data", "out") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_NuSVR.py0000644000000000000000000001622714672066616023277 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import random import tempfile import unittest import pandas as pd import pytest from ..utils import load_boston from coremltools._deps import (_HAS_LIBSVM, _HAS_SKLEARN, MSG_LIBSVM_NOT_FOUND, MSG_SKLEARN_NOT_FOUND) from coremltools.models.utils import (_is_macos, _macos_version, evaluate_regressor) if _HAS_LIBSVM: from libsvm import svmutil from svmutil import svm_predict, svm_train from coremltools.converters import libsvm if _HAS_SKLEARN: from sklearn.preprocessing import OneHotEncoder from sklearn.svm import NuSVR from coremltools.converters import sklearn as scikit_converter @unittest.skipIf(not _HAS_SKLEARN, MSG_SKLEARN_NOT_FOUND) class NuSVRScikitTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ self.scikit_model = NuSVR(kernel="linear") self.data = load_boston() self.scikit_model.fit(self.data["data"], self.data["target"]) def test_conversion_bad_inputs(self): # Error on converting an untrained model with self.assertRaises(TypeError): model = NuSVR() spec = scikit_converter.convert(model, "data", "out") # Check the expected class during conversion. with self.assertRaises(TypeError): model = OneHotEncoder() spec = scikit_converter.convert(model, "data", "out") @pytest.mark.slow def test_evaluation_stress_test(self): self._test_evaluation(allow_slow=True) def test_evaluation(self): self._test_evaluation(allow_slow=False) def _test_evaluation(self, allow_slow): """ Test that the same predictions are made """ # Generate some smallish (some kernels take too long on anything else) random data x, y = [], [] for _ in range(50): cur_x1, cur_x2 = random.gauss(2, 3), random.gauss(-1, 2) x.append([cur_x1, cur_x2]) y.append(1 + 2 * cur_x1 + 3 * cur_x2) input_names = ["x1", "x2"] df = pd.DataFrame(x, columns=input_names) # Parameters to test kernel_parameters = [ {}, {"kernel": "rbf", "gamma": 1.2}, {"kernel": "linear"}, {"kernel": "poly"}, {"kernel": "poly", "degree": 2}, {"kernel": "poly", "gamma": 0.75}, {"kernel": "poly", "degree": 0, "gamma": 0.9, "coef0": 2}, {"kernel": "sigmoid"}, {"kernel": "sigmoid", "gamma": 1.3}, {"kernel": "sigmoid", "coef0": 0.8}, {"kernel": "sigmoid", "coef0": 0.8, "gamma": 0.5}, ] non_kernel_parameters = [ {}, {"C": 1}, {"C": 1.5, "shrinking": True}, {"C": 0.5, "shrinking": False, "nu": 0.9}, ] # Test for param1 in non_kernel_parameters: for param2 in kernel_parameters: cur_params = param1.copy() cur_params.update(param2) cur_model = NuSVR(**cur_params) cur_model.fit(x, y) df["target"] = cur_model.predict(x) spec = scikit_converter.convert(cur_model, input_names, "target") if _is_macos() and _macos_version() >= (10, 13): metrics = evaluate_regressor(spec, df) self.assertAlmostEqual(metrics["max_error"], 0) if not allow_slow: break if not allow_slow: break @unittest.skipIf(not _HAS_LIBSVM, MSG_LIBSVM_NOT_FOUND) @unittest.skipIf(not _HAS_SKLEARN, MSG_SKLEARN_NOT_FOUND) class NuSVRLibSVMTest(unittest.TestCase): """ Unit test class for testing the libsvm sklearn converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. 
""" if not _HAS_SKLEARN: return if not _HAS_LIBSVM: return scikit_data = load_boston() prob = svmutil.svm_problem(scikit_data["target"], scikit_data["data"].tolist()) param = svmutil.svm_parameter() param.svm_type = svmutil.NU_SVR param.kernel_type = svmutil.LINEAR param.eps = 1 self.libsvm_model = svmutil.svm_train(prob, param) def test_conversion(self): spec = libsvm.convert(self.libsvm_model, "data", "target") def test_conversion_from_filesystem(self): libsvm_model_path = tempfile.mktemp(suffix="model.libsvm") svmutil.svm_save_model(libsvm_model_path, self.libsvm_model) spec = libsvm.convert(libsvm_model_path, "data", "target") def test_conversion_bad_inputs(self): # Check the expected class during conversion. with self.assertRaises(TypeError): model = OneHotEncoder() spec = libsvm.convert(model, "data", "out") @pytest.mark.slow def test_evaluation_stress_test(self): self._test_evaluation(allow_slow=True) def test_evaluation(self): self._test_evaluation(allow_slow=False) def _test_evaluation(self, allow_slow): """ Test that the same predictions are made """ # Generate some smallish (poly kernels take too long on anything else) random data x, y = [], [] for _ in range(50): cur_x1, cur_x2 = random.gauss(2, 3), random.gauss(-1, 2) x.append([cur_x1, cur_x2]) y.append(1 + 2 * cur_x1 + 3 * cur_x2) input_names = ["x1", "x2"] df = pd.DataFrame(x, columns=input_names) prob = svmutil.svm_problem(y, x) # Parameters base_param = "-s 4" # model type is nu-SVR non_kernel_parameters = ["", "-c 1.5 -p 0.5 -h 1", "-c 0.5 -p 0.5 -h 0"] kernel_parameters = [ "", "-t 2 -g 1.2", # rbf kernel "-t 0", # linear kernel "-t 1", "-t 1 -d 2", "-t 1 -g 0.75", "-t 1 -d 0 -g 0.9 -r 2", # poly kernel "-t 3", "-t 3 -g 1.3", "-t 3 -r 0.8", "-t 3 -r 0.8 -g 0.5", # sigmoid kernel ] for param1 in non_kernel_parameters: for param2 in kernel_parameters: param_str = " ".join([base_param, param1, param2]) param = svmutil.svm_parameter(param_str) model = svm_train(prob, param) (df["target"], _, _) = svm_predict(y, x, model) spec = libsvm.convert(model, input_names, "target") if _is_macos() and _macos_version() >= (10, 13): metrics = evaluate_regressor(spec, df) self.assertAlmostEqual(metrics["max_error"], 0) if not allow_slow: break if not allow_slow: break ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_SVC.py0000644000000000000000000003364314672066616022756 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import copy import random import tempfile import unittest import numpy as np import pandas as pd import pytest from coremltools._deps import _HAS_LIBSVM, _HAS_SKLEARN from coremltools.models.utils import (_is_macos, _macos_version, evaluate_classifier, evaluate_classifier_with_probabilities) if _HAS_SKLEARN: from sklearn.preprocessing import OneHotEncoder from sklearn.svm import SVC from coremltools.converters import sklearn as scikit_converter if _HAS_LIBSVM: import svmutil from svm import svm_parameter from svmutil import svm_predict, svm_train from coremltools.converters import libsvm @unittest.skipIf(not _HAS_SKLEARN, "Missing scikit-learn. Skipping tests.") class SvcScikitTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter. 
""" def _evaluation_test_helper( self, class_labels, use_probability_estimates, allow_slow, allowed_prob_delta=0.00001, ): # Parameters to test kernel_parameters = [ {}, {"kernel": "rbf", "gamma": 1.2}, {"kernel": "linear"}, {"kernel": "poly"}, {"kernel": "poly", "degree": 2}, {"kernel": "poly", "gamma": 0.75}, {"kernel": "poly", "degree": 0, "gamma": 0.9, "coef0": 2}, {"kernel": "sigmoid"}, {"kernel": "sigmoid", "gamma": 1.3}, {"kernel": "sigmoid", "coef0": 0.8}, {"kernel": "sigmoid", "coef0": 0.8, "gamma": 0.5}, ] non_kernel_parameters = [ {}, {"C": 1}, {"C": 1.5, "shrinking": True}, {"C": 0.5, "shrinking": False}, ] # Generate some random data x, y = [], [] random.seed(42) for _ in range(50): x.append( [random.gauss(200, 30), random.gauss(-100, 22), random.gauss(100, 42)] ) y.append(random.choice(class_labels)) column_names = ["x1", "x2", "x3"] # make sure first label is seen first, second is seen second, and so on. for i, val in enumerate(class_labels): y[i] = val df = pd.DataFrame(x, columns=column_names) # Test for param1 in non_kernel_parameters: for param2 in kernel_parameters: cur_params = param1.copy() cur_params.update(param2) cur_params["probability"] = use_probability_estimates cur_params["max_iter"] = 10 # Don't want test to take too long cur_model = SVC(**cur_params) cur_model.fit(x, y) spec = scikit_converter.convert(cur_model, column_names, "target") if _is_macos() and _macos_version() >= (10, 13): if use_probability_estimates: probability_lists = cur_model.predict_proba(x) df["classProbability"] = [ dict(zip(cur_model.classes_, cur_vals)) for cur_vals in probability_lists ] metrics = evaluate_classifier_with_probabilities( spec, df, probabilities="classProbability", verbose=True ) self.assertEqual(metrics["num_key_mismatch"], 0) self.assertLess( metrics["max_probability_error"], allowed_prob_delta ) else: df["target"] = cur_model.predict(x) metrics = evaluate_classifier(spec, df, verbose=False) self.assertEqual(metrics["num_errors"], 0) if not allow_slow: break if not allow_slow: break @pytest.mark.slow def test_binary_class_string_label_without_probability_stress_test(self): self._evaluation_test_helper(["A", "B"], False, allow_slow=True) def test_binary_class_string_label_without_probability(self): self._evaluation_test_helper(["A", "B"], False, allow_slow=False) @pytest.mark.slow def test_binary_class_string_label_with_probability_stress_test(self): # Scikit Learn uses technique to normalize pairwise probabilities even for binary classification. # This leads to difference in probabilities. self._evaluation_test_helper( ["foo", "bar"], True, allow_slow=True, allowed_prob_delta=0.005 ) def test_binary_class_string_label_with_probability(self): # Scikit Learn uses technique to normalize pairwise probabilities even for binary classification. # This leads to difference in probabilities. 
self._evaluation_test_helper( ["foo", "bar"], True, allow_slow=False, allowed_prob_delta=0.005 ) @pytest.mark.slow def test_multi_class_int_label_without_probability_stress_test(self): self._evaluation_test_helper([12, 33, -1, 1234], False, allow_slow=True) def test_multi_class_int_label_without_probability(self): self._evaluation_test_helper([12, 33, -1, 1234], False, allow_slow=False) @pytest.mark.slow def test_multi_class_int_label_with_probability_stress_test(self): self._evaluation_test_helper([1, 2, 3], True, allow_slow=True) def test_multi_class_int_label_with_probability(self): self._evaluation_test_helper([1, 2, 3], True, allow_slow=False) def test_conversion_bad_inputs(self): # Error on converting an untrained model with self.assertRaises(TypeError): model = SVC() spec = scikit_converter.convert(model, "data", "out") # Check the expected class during conversion with self.assertRaises(TypeError): model = OneHotEncoder() spec = scikit_converter.convert(model, "data", "out") @unittest.skipIf(not _HAS_LIBSVM, "Missing libsvm. Skipping tests.") class CSVCLibSVMTest(unittest.TestCase): # Model parameters for testing base_param = "-s 0 -q " # model type C-SVC and quiet mode non_kernel_parameters = ["", "-c 1.5 -p 0.5 -h 1", "-c 0.5 -p 0.5 -h 0"] kernel_parameters = [ "-t 0", # linear kernel "", "-t 2 -g 1.2", # rbf kernel "-t 1", "-t 1 -d 2", "-t 1 -g 0.75", "-t 1 -d 0 -g 0.9 -r 2", # poly kernel "-t 3", "-t 3 -g 1.3", "-t 3 -r 0.8", "-t 3 -r 0.8 -g 0.5", # sigmoid kernel ] # XXX: wi params? """ Unit test class for testing the libsvm converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ if not _HAS_LIBSVM: # setUpClass is still called even if class is skipped. return # Generate some random data. # This unit test should not rely on scikit learn for test data. 
self.x, self.y = [], [] random.seed(42) for _ in range(50): self.x.append([random.gauss(200, 30), random.gauss(-100, 22)]) self.y.append(random.choice([1, 2])) self.y[0] = 1 # Make sure 1 is always the first label it sees self.y[1] = 2 self.column_names = ["x1", "x2"] self.prob = svmutil.svm_problem(self.y, self.x) param = svmutil.svm_parameter() param.svm_type = svmutil.C_SVC param.kernel_type = svmutil.LINEAR param.eps = 1 param.probability = 1 self.libsvm_model = svmutil.svm_train(self.prob, param) def test_default_names(self): df = pd.DataFrame({"input": self.x}) df["input"] = df["input"].apply(np.array) # Test with probabilities spec = libsvm.convert(self.libsvm_model).get_spec() if _is_macos() and _macos_version() >= (10, 13): (_, _, probability_lists) = svm_predict( self.y, self.x, self.libsvm_model, "-b 1 -q" ) probability_dicts = [ dict(zip([1, 2], cur_vals)) for cur_vals in probability_lists ] df["classProbability"] = probability_dicts metrics = evaluate_classifier_with_probabilities( spec, df, verbose=False, probabilities="classProbability" ) self.assertLess(metrics["max_probability_error"], 0.00001) # Test model without probabilities no_probability_model = svmutil.svm_train(self.prob, svmutil.svm_parameter()) spec = libsvm.convert(no_probability_model).get_spec() self.assertEqual(len(spec.description.output), 1) self.assertEqual(spec.description.output[0].name, u"target") if _is_macos() and _macos_version() >= (10, 13): (df["target"], _, _) = svm_predict( self.y, self.x, no_probability_model, " -q" ) metrics = evaluate_classifier(spec, df, verbose=False) self.assertEqual(metrics["num_errors"], 0) # LibSVM only supports string labels @pytest.mark.slow def test_binary_class_without_probability_stress_test(self): self._evaluation_test_helper_no_probability([0, 1], allow_slow=True) @pytest.mark.slow def test_binary_class_with_probability_stress_test(self): self._evaluation_test_helper_with_probability([-1, 90], allow_slow=True) @pytest.mark.slow def test_multi_class_without_probability_stress_test(self): self._evaluation_test_helper_no_probability([12, 33, 12341], allow_slow=True) @pytest.mark.slow def test_multi_class_with_probability_stress_test(self): self._evaluation_test_helper_with_probability([1, 2, 3], allow_slow=True) # LibSVM only supports string labels def test_binary_class_without_probability(self): self._evaluation_test_helper_no_probability([0, 1], allow_slow=False) def test_binary_class_with_probability(self): self._evaluation_test_helper_with_probability([-1, 90], allow_slow=False) def test_multi_class_without_probability(self): self._evaluation_test_helper_no_probability([12, 33, 12341], allow_slow=False) def test_multi_class_with_probability(self): self._evaluation_test_helper_with_probability([1, 2, 3], allow_slow=False) def _evaluation_test_helper_with_probability(self, labels, allow_slow): df = pd.DataFrame(self.x, columns=self.column_names) y = copy.copy(self.y) for i, val in enumerate(labels): y[i] = val probability_param = "-b 1" for param1 in self.non_kernel_parameters: for param2 in self.kernel_parameters: param_str = " ".join( [self.base_param, param1, param2, probability_param] ) param = svm_parameter(param_str) model = svm_train(self.prob, param) # Get predictions with probabilities as dictionaries (df["target"], _, probability_lists) = svm_predict( y, self.x, model, probability_param + " -q" ) probability_dicts = [ dict(zip([1, 2], cur_vals)) for cur_vals in probability_lists ] df["probabilities"] = probability_dicts spec = libsvm.convert( model, 
self.column_names, "target", "probabilities" ) if _is_macos() and _macos_version() >= (10, 13): metrics = evaluate_classifier_with_probabilities( spec, df, verbose=False ) self.assertEqual(metrics["num_key_mismatch"], 0) self.assertLess(metrics["max_probability_error"], 0.00001) if not allow_slow: break if not allow_slow: break def _evaluation_test_helper_no_probability(self, labels, allow_slow): # Generate some random data. # This unit test should not rely on scikit learn for test data. x, y = [], [] random.seed(42) for _ in range(50): x.append( [random.gauss(200, 30), random.gauss(-100, 22), random.gauss(100, 42)] ) y.append(random.choice(labels)) # make sure first label is seen first, second is seen second, and so on. for i, val in enumerate(labels): y[i] = val column_names = ["x1", "x2", "x3"] prob = svmutil.svm_problem(y, x) df = pd.DataFrame(x, columns=column_names) for param1 in self.non_kernel_parameters: for param2 in self.kernel_parameters: param_str = " ".join([self.base_param, param1, param2]) param = svm_parameter(param_str) model = svm_train(prob, param) # Get predictions with probabilities as dictionaries (df["target"], _, _) = svm_predict(y, x, model, " -q") spec = libsvm.convert(model, column_names, "target") if _is_macos() and _macos_version() >= (10, 13): metrics = evaluate_classifier(spec, df, verbose=False) self.assertEqual(metrics["num_errors"], 0) if not allow_slow: break if not allow_slow: break def test_conversion_from_filesystem(self): libsvm_model_path = tempfile.mktemp(suffix="model.libsvm") svmutil.svm_save_model(libsvm_model_path, self.libsvm_model) # libsvm's save(...) truncates floating points. So it's not going to match self.libsvm_model any more. spec = libsvm.convert(libsvm_model_path, self.column_names, "target") self.assertIsNotNone(spec) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_SVR.py0000644000000000000000000002103414672066616022764 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import random import tempfile import unittest import numpy as np import pandas as pd import pytest from ..utils import load_boston from coremltools._deps import (_HAS_LIBSVM, _HAS_SKLEARN, MSG_LIBSVM_NOT_FOUND, MSG_SKLEARN_NOT_FOUND) from coremltools.models.utils import (_is_macos, _macos_version, evaluate_regressor) if _HAS_LIBSVM: import svmutil from coremltools.converters import libsvm if _HAS_SKLEARN: from sklearn.preprocessing import OneHotEncoder from sklearn.svm import SVR from coremltools.converters import sklearn as sklearn_converter @unittest.skipIf(not _HAS_SKLEARN, MSG_SKLEARN_NOT_FOUND) class SvrScikitTest(unittest.TestCase): """ Unit test class for testing scikit-learn sklearn_converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ if not _HAS_SKLEARN: return scikit_data = load_boston() scikit_model = SVR(kernel="linear") scikit_model.fit(scikit_data["data"], scikit_data["target"]) # Save the data and the model self.scikit_data = scikit_data self.scikit_model = scikit_model def test_conversion_bad_inputs(self): # Error on converting an untrained model with self.assertRaises(TypeError): model = SVR() spec = sklearn_converter.convert(model, "data", "out") # Check the expected class during conversion. 
with self.assertRaises(TypeError): model = OneHotEncoder() spec = sklearn_converter.convert(model, "data", "out") @pytest.mark.slow def test_evaluation_stress_test(self): self._test_evaluation(allow_slow=True) def test_evaluation(self): self._test_evaluation(allow_slow=False) def _test_evaluation(self, allow_slow): """ Test that the same predictions are made """ # Generate some smallish (some kernels take too long on anything else) random data x, y = [], [] for _ in range(50): cur_x1, cur_x2 = random.gauss(2, 3), random.gauss(-1, 2) x.append([cur_x1, cur_x2]) y.append(1 + 2 * cur_x1 + 3 * cur_x2) input_names = ["x1", "x2"] df = pd.DataFrame(x, columns=input_names) # Parameters to test kernel_parameters = [ {}, {"kernel": "rbf", "gamma": 1.2}, {"kernel": "linear"}, {"kernel": "poly"}, {"kernel": "poly", "degree": 2}, {"kernel": "poly", "gamma": 0.75}, {"kernel": "poly", "degree": 0, "gamma": 0.9, "coef0": 2}, {"kernel": "sigmoid"}, {"kernel": "sigmoid", "gamma": 1.3}, {"kernel": "sigmoid", "coef0": 0.8}, {"kernel": "sigmoid", "coef0": 0.8, "gamma": 0.5}, ] non_kernel_parameters = [ {}, {"C": 1}, {"C": 1.5, "epsilon": 0.5, "shrinking": True}, {"C": 0.5, "epsilon": 1.5, "shrinking": False}, ] # Test for param1 in non_kernel_parameters: for param2 in kernel_parameters: cur_params = param1.copy() cur_params.update(param2) print("cur_params=" + str(cur_params)) cur_model = SVR(**cur_params) cur_model.fit(x, y) df["target"] = cur_model.predict(x) spec = sklearn_converter.convert(cur_model, input_names, "target") if _is_macos() and _macos_version() >= (10, 13): metrics = evaluate_regressor(spec, df) self.assertAlmostEqual(metrics["max_error"], 0) if not allow_slow: break if not allow_slow: break @unittest.skipIf(not _HAS_LIBSVM, MSG_LIBSVM_NOT_FOUND) @unittest.skipIf(not _HAS_SKLEARN, MSG_SKLEARN_NOT_FOUND) class EpsilonSVRLibSVMTest(unittest.TestCase): """ Unit test class for testing the libsvm sklearn converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ if not _HAS_SKLEARN: return if not _HAS_LIBSVM: return scikit_data = load_boston() prob = svmutil.svm_problem(scikit_data["target"], scikit_data["data"].tolist()) param = svmutil.svm_parameter() param.svm_type = svmutil.EPSILON_SVR param.kernel_type = svmutil.LINEAR param.eps = 1 self.libsvm_model = svmutil.svm_train(prob, param) def test_input_names(self): data = load_boston() df = pd.DataFrame({"input": data["data"].tolist()}) df["input"] = df["input"].apply(np.array) # Default values spec = libsvm.convert(self.libsvm_model) if _is_macos() and _macos_version() >= (10, 13): (df["target"], _, _) = svmutil.svm_predict( data["target"], data["data"].tolist(), self.libsvm_model ) metrics = evaluate_regressor(spec, df) self.assertAlmostEqual(metrics["max_error"], 0) # One extra parameters. This is legal/possible. num_inputs = len(data["data"][0]) spec = libsvm.convert(self.libsvm_model, input_length=num_inputs + 1) # Not enough input names. 
input_names = ["this", "is", "not", "enough", "names"] with self.assertRaises(ValueError): libsvm.convert(self.libsvm_model, input_names=input_names) with self.assertRaises(ValueError): libsvm.convert(self.libsvm_model, input_length=num_inputs - 1) def test_conversion_from_filesystem(self): libsvm_model_path = tempfile.mktemp(suffix="model.libsvm") svmutil.svm_save_model(libsvm_model_path, self.libsvm_model) spec = libsvm.convert( libsvm_model_path, input_names="data", target_name="target" ) def test_conversion_bad_inputs(self): # Check the expected class during conversion. with self.assertRaises(TypeError): model = OneHotEncoder() spec = libsvm.convert(model, "data", "out") @pytest.mark.slow def test_evaluation_stress_test(self): self._test_evaluation(allow_slow=True) def test_evaluation(self): self._test_evaluation(allow_slow=False) def _test_evaluation(self, allow_slow): """ Test that the same predictions are made """ from svm import svm_parameter, svm_problem from svmutil import svm_predict, svm_train # Generate some smallish (poly kernels take too long on anything else) random data x, y = [], [] for _ in range(50): cur_x1, cur_x2 = random.gauss(2, 3), random.gauss(-1, 2) x.append([cur_x1, cur_x2]) y.append(1 + 2 * cur_x1 + 3 * cur_x2) input_names = ["x1", "x2"] df = pd.DataFrame(x, columns=input_names) prob = svm_problem(y, x) # Parameters base_param = "-s 3" # model type is epsilon SVR non_kernel_parameters = ["", "-c 1.5 -p 0.5 -h 1", "-c 0.5 -p 0.5 -h 0"] kernel_parameters = [ "", "-t 2 -g 1.2", # rbf kernel "-t 0", # linear kernel "-t 1", "-t 1 -d 2", "-t 1 -g 0.75", "-t 1 -d 0 -g 0.9 -r 2", # poly kernel "-t 3", "-t 3 -g 1.3", "-t 3 -r 0.8", "-t 3 -r 0.8 -g 0.5", # sigmoid kernel ] for param1 in non_kernel_parameters: for param2 in kernel_parameters: param_str = " ".join([base_param, param1, param2]) print(param_str) param = svm_parameter(param_str) model = svm_train(prob, param) (df["target"], _, _) = svm_predict(y, x, model) spec = libsvm.convert( model, input_names=input_names, target_name="target" ) if _is_macos() and _macos_version() >= (10, 13): metrics = evaluate_regressor(spec, df) self.assertAlmostEqual(metrics["max_error"], 0) if not allow_slow: break if not allow_slow: break ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_categorical_imputer.py0000644000000000000000000000501114672066616026331 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import unittest import numpy as np from packaging.version import Version from ..utils import load_boston from coremltools._deps import _HAS_SKLEARN, _SKLEARN_VERSION if _HAS_SKLEARN: import sklearn from coremltools.converters import sklearn as converter try: # scikit-learn >= 0.21 from sklearn.impute import SimpleImputer as Imputer sklearn_class = sklearn.impute.SimpleImputer except ImportError: # scikit-learn < 0.21 from sklearn.preprocessing import Imputer sklearn_class = sklearn.preprocessing.Imputer @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class ImputerTestCase(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. 
""" scikit_data = load_boston() # axis parameter deprecated in SimpleImputer >= 0.22. which now imputes # only along columns as desired here. if _SKLEARN_VERSION >= Version("0.22"): scikit_model = Imputer(strategy="most_frequent") else: scikit_model = Imputer(strategy="most_frequent", axis=0) scikit_data["data"][1, 8] = np.nan input_data = scikit_data["data"][:, 8].reshape(-1, 1) scikit_model.fit(input_data, scikit_data["target"]) # Save the data and the model self.scikit_data = scikit_data self.scikit_model = scikit_model def test_conversion(self): spec = converter.convert(self.scikit_model, "data", "out").get_spec() self.assertIsNotNone(spec) # Test the model class self.assertIsNotNone(spec.description) # Test the interface self.assertTrue(spec.pipeline.models[-1].HasField("imputer")) def test_conversion_bad_inputs(self): # Error on converting an untrained model with self.assertRaises(Exception): model = Imputer() spec = converter.convert(model, "data", "out") # Check the expected class during conversion. with self.assertRaises(Exception): from sklearn.linear_model import LinearRegression model = LinearRegression() spec = converter.convert(model, "data", "out") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_composite_pipelines.py0000644000000000000000000000573114672066616026372 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import unittest import pandas as pd from packaging.version import Version from ..utils import load_boston from coremltools._deps import _HAS_SKLEARN, _SKLEARN_VERSION from coremltools.converters.sklearn import convert from coremltools.models.utils import (_is_macos, _macos_version, evaluate_regressor, evaluate_transformer) if _HAS_SKLEARN: from sklearn.ensemble import GradientBoostingRegressor from sklearn.pipeline import Pipeline from sklearn.preprocessing import OneHotEncoder, StandardScaler @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class GradientBoostingRegressorBostonHousingScikitNumericTest(unittest.TestCase): @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") @unittest.skipIf(_SKLEARN_VERSION >= Version("0.22"), "categorical_features parameter to OneHotEncoder() deprecated after SciKit Learn 0.22." ) def test_boston_OHE_plus_normalizer(self): data = load_boston() pl = Pipeline( [ ("OHE", OneHotEncoder(categorical_features=[8], sparse=False)), ("Scaler", StandardScaler()), ] ) pl.fit(data.data, data.target) # Convert the model spec = convert(pl, data.feature_names, "out") if _is_macos() and _macos_version() >= (10, 13): input_data = [dict(zip(data.feature_names, row)) for row in data.data] output_data = [{"out": row} for row in pl.transform(data.data)] result = evaluate_transformer(spec, input_data, output_data) assert result["num_errors"] == 0 @unittest.skipIf(_SKLEARN_VERSION >= Version("0.22"), "categorical_features parameter to OneHotEncoder() deprecated after SciKit Learn 0.22." 
) def _test_boston_OHE_plus_trees(self, loss='ls'): data = load_boston() pl = Pipeline( [ ("OHE", OneHotEncoder(categorical_features=[8], sparse=False)), ("Trees", GradientBoostingRegressor(random_state=1, loss=loss)), ] ) pl.fit(data.data, data.target) # Convert the model spec = convert(pl, data.feature_names, "target") if _is_macos() and _macos_version() >= (10, 13): # Get predictions df = pd.DataFrame(data.data, columns=data.feature_names) df["target"] = pl.predict(data.data) # Evaluate it result = evaluate_regressor(spec, df, "target", verbose=False) assert result["max_error"] < 0.0001 def test_boston_OHE_plus_trees(self): self._test_boston_OHE_plus_trees() def test_boston_OHE_plus_trees_with_huber_loss(self): self._test_boston_OHE_plus_trees(loss='huber') ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_dict_vectorizer.py0000644000000000000000000000643614672066616025522 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import unittest import numpy as np import numpy.random as rn import pandas as pd import coremltools from coremltools._deps import _HAS_SKLEARN from coremltools.models.utils import (_is_macos, _macos_version, evaluate_classifier, evaluate_transformer) if _HAS_SKLEARN: from sklearn.feature_extraction import DictVectorizer from sklearn.linear_model import LogisticRegression from sklearn.pipeline import Pipeline from coremltools.converters import sklearn @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class DictVectorizerScikitTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ def _test_conversion(self, data, trained_dict_vectorizer): X = trained_dict_vectorizer.transform(data) m = sklearn.convert( trained_dict_vectorizer, input_features="features", output_feature_names="output", ) if _is_macos() and _macos_version() >= (10, 13): ret = evaluate_transformer( m, [{"features": row} for row in data], [{"output": x_r} for x_r in X], True, ) assert ret["num_errors"] == 0 def test_dictvectorizer(self): D = [ {"foo": 1, "bar": 3}, {"bar": 4, "baz": 2}, {"bar": 1, "quux": 1, "quuux": 2}, ] for sparse in (True, False): for dtype in (int, np.float32, np.int16): for sort in (True, False): v = DictVectorizer(sparse=sparse, dtype=dtype, sort=sort) v = v.fit(D) self._test_conversion(D, v) def test_unseen_or_no_features(self): D1 = [{"camelot": 0, "spamalot": 1}] D2 = [{}, {"nothing": 21}] for sparse in (True, False): for dtype in (int, np.float32, np.int16): for sort in (True, False): v = DictVectorizer(sparse=sparse, dtype=dtype, sort=sort) v = v.fit(D1) self._test_conversion(D2, v) def test_int_features_in_pipeline(self): rn.seed(0) x_train_dict = [ dict((rn.randint(100), 1) for i in range(20)) for j in range(100) ] y_train = [0, 1] * 50 # multi_class default changed in version >= 0.22 from ‘ovr’ to ‘auto’. # Specify explicitly to match < 0.22 behavior. 
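        # Pipeline under test: dict features -> DictVectorizer -> LogisticRegression;
        # the converted Core ML model is expected to reproduce its predictions exactly
        # (num_errors == 0) when evaluated on macOS 10.13+.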
pl = Pipeline([("dv", DictVectorizer()), ("lm", LogisticRegression(multi_class='ovr'))]) pl.fit(x_train_dict, y_train) model = coremltools.converters.sklearn.convert( pl, input_features="features", output_feature_names="target" ) if _is_macos() and _macos_version() >= (10, 13): x = pd.DataFrame( {"features": x_train_dict, "target": pl.predict(x_train_dict)} ) cur_eval_metics = evaluate_classifier(model, x) self.assertEqual(cur_eval_metics["num_errors"], 0) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_feature_names.py0000644000000000000000000000200714672066616025127 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import unittest import coremltools.models._feature_management as fm import coremltools.models.datatypes as dt from coremltools._deps import _HAS_SKLEARN @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class FeatureManagementTests(unittest.TestCase): def test_all_strings(self): features = ["a", "b", "c"] processed_features = [ ("a", dt.Double()), ("b", dt.Double()), ("c", dt.Double()), ] out = fm.process_or_validate_features(features) self.assertEqual(out, processed_features) self.assertTrue(fm.is_valid_feature_list(out)) def test_single_array(self): self.assertEqual( fm.process_or_validate_features("a", num_dimensions=10), [("a", dt.Array(10))], ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_glm_classifier.py0000644000000000000000000001046014672066616025276 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import random import unittest import pandas as pd from coremltools._deps import _HAS_SKLEARN from coremltools.converters.sklearn import convert from coremltools.models.utils import (_is_macos, _macos_version, evaluate_classifier, evaluate_classifier_with_probabilities) if _HAS_SKLEARN: from sklearn.linear_model import LogisticRegression from sklearn.svm import LinearSVC @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. 
Skipping tests.") class GlmCassifierTest(unittest.TestCase): def test_logistic_regression_binary_classification_with_string_labels(self): self._conversion_and_evaluation_helper_for_logistic_regression(["Foo", "Bar"]) def test_logistic_regression_multiclass_classification_with_int_labels(self): self._conversion_and_evaluation_helper_for_logistic_regression([1, 2, 3, 4]) @staticmethod def _generate_random_data(labels): random.seed(42) # Generate some random data x, y = [], [] for _ in range(100): x.append([random.gauss(2, 3), random.gauss(-1, 2)]) y.append(random.choice(labels)) return x, y def _conversion_and_evaluation_helper_for_logistic_regression(self, class_labels): options = { "C": (0.1, 1.0, 2.0), "fit_intercept": (True, False), "class_weight": ("balanced", None), "solver": ("newton-cg", "lbfgs", "liblinear", "sag"), } # Generate a list of all combinations of options and the default parameters product = itertools.product(*options.values()) args = [{}] + [dict(zip(options.keys(), p)) for p in product] x, y = GlmCassifierTest._generate_random_data(class_labels) column_names = ["x1", "x2"] df = pd.DataFrame(x, columns=column_names) for cur_args in args: # multi_class default changed in version 0.22 from ‘ovr’ to ‘auto’ in 0.22. # Specify explicitly to match <0.22 behavior. cur_model = LogisticRegression(**cur_args, multi_class='ovr') cur_model.fit(x, y) spec = convert( cur_model, input_features=column_names, output_feature_names="target" ) if _is_macos() and _macos_version() >= (10, 13): probability_lists = cur_model.predict_proba(x) df["classProbability"] = [ dict(zip(cur_model.classes_, cur_vals)) for cur_vals in probability_lists ] metrics = evaluate_classifier_with_probabilities( spec, df, probabilities="classProbability", verbose=False ) self.assertEqual(metrics["num_key_mismatch"], 0) self.assertLess(metrics["max_probability_error"], 0.00001) def test_linear_svc_binary_classification_with_string_labels(self): self._conversion_and_evaluation_helper_for_linear_svc(["Foo", "Bar"]) def test_linear_svc_multiclass_classification_with_int_labels(self): self._conversion_and_evaluation_helper_for_linear_svc([1, 2, 3, 4]) def _conversion_and_evaluation_helper_for_linear_svc(self, class_labels): ARGS = [ {}, {"C": 0.75, "loss": "hinge"}, {"penalty": "l1", "dual": False}, {"tol": 0.001, "fit_intercept": False}, {"intercept_scaling": 1.5}, ] x, y = GlmCassifierTest._generate_random_data(class_labels) column_names = ["x1", "x2"] df = pd.DataFrame(x, columns=column_names) for cur_args in ARGS: cur_model = LinearSVC(**cur_args) cur_model.fit(x, y) spec = convert( cur_model, input_features=column_names, output_feature_names="target" ) if _is_macos() and _macos_version() >= (10, 13): df["target"] = cur_model.predict(x) cur_eval_metics = evaluate_classifier(spec, df, verbose=False) self.assertEqual(cur_eval_metics["num_errors"], 0) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_imputer.py0000644000000000000000000000473614672066616024011 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import unittest import numpy as np import numpy.random as rn from packaging.version import Version from ..utils import load_boston from coremltools._deps import _HAS_SKLEARN, _SKLEARN_VERSION from coremltools.models.utils import (_is_macos, _macos_version, evaluate_transformer) if _HAS_SKLEARN: import sklearn try: # scikit-learn >= 0.21 from sklearn.impute import SimpleImputer as Imputer sklearn_class = sklearn.impute.SimpleImputer except ImportError: # scikit-learn < 0.21 from sklearn.preprocessing import Imputer sklearn_class = sklearn.preprocessing.Imputer from coremltools.converters import sklearn as converter @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class NumericalImputerTestCase(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ def test_conversion_boston(self): scikit_data = load_boston() sh = scikit_data["data"].shape rn.seed(0) missing_value_indices = [ (rn.randint(sh[0]), rn.randint(sh[1])) for k in range(sh[0]) ] for strategy in ["mean", "median", "most_frequent"]: for missing_value in [0, "NaN", -999]: # SimpleImputer >=0.22 does not accept missing values encoded as NaN. if _SKLEARN_VERSION >= Version("0.22"): if missing_value == "NaN": continue X = np.array(scikit_data["data"]).copy() for i, j in missing_value_indices: X[i, j] = missing_value model = Imputer(missing_values=missing_value, strategy=strategy) model = model.fit(X) tr_X = model.transform(X.copy()) spec = converter.convert(model, scikit_data["feature_names"], "out") input_data = [dict(zip(scikit_data["feature_names"], row)) for row in X] output_data = [{"out": row} for row in tr_X] result = evaluate_transformer(spec, input_data, output_data) assert result["num_errors"] == 0 ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_io_types.py0000644000000000000000000003412414672066616024151 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import unittest import numpy as np import PIL.Image from ..utils import load_boston import coremltools from coremltools._deps import _HAS_SKLEARN, MSG_SKLEARN_NOT_FOUND from coremltools.models.utils import _is_macos, _macos_version if _HAS_SKLEARN: from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor from sklearn.linear_model import LinearRegression from sklearn.svm import SVC, SVR from sklearn.tree import DecisionTreeRegressor def create_model(spec): """ Create MLModel with specified types Parameters ---------- spec: Pb spec from 3rd party converted model Returns ------- MLModel """ return coremltools.models.MLModel(spec) @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) @unittest.skipIf(not _HAS_SKLEARN, MSG_SKLEARN_NOT_FOUND) class TestIODataTypes(unittest.TestCase): """ This class tests for different I/O feature data types for an .mlmodel It will cover the following areas to test for: - All features must have a valid type - Multiarrays must have a valid dataType. Inputs must specify shape. 
Shape must have >= 0 elements - Images must have a valid colorspace. width & height have to be >= 0 - Dictionaries must have a valid key type """ @property def scikit_data(self): return load_boston() def _feature_data_type(self, dtype): feature_dict = {np.int32: "INT32", np.float32: "FLOAT32", np.float64: "DOUBLE"} return feature_dict[dtype] @property def number_data_type(self): return dict( int8=np.int8, int16=np.int16, int32=np.int32, uint8=np.uint8, uint16=np.uint16, uint32=np.uint32, float=np.float32, double=np.double, ) def _sklearn_setup(self, model, dtype, data, target): model.fit(data, target) spec = coremltools.converters.sklearn.convert( model, "data", "target" ).get_spec() return model, spec def _check_tree_model(self, spec, inputType, outputType, n_out): self.assertIsNotNone(spec) # Test the model class self.assertIsNotNone(spec.description) self.assertIsNotNone(spec.treeEnsembleClassifier) # Test the interface class self.assertEqual(spec.description.predictedFeatureName, "target") # Test the inputs and outputs self.assertEqual(len(spec.description.output), n_out) self.assertEqual(spec.description.output[0].name, "target") self.assertEqual(spec.description.output[0].type.WhichOneof("Type"), outputType) self.assertEqual(spec.description.input[0].name, "data") self.assertEqual(spec.description.input[0].type.WhichOneof("Type"), inputType) def test_tree_regressor(self): for dtype in self.number_data_type.keys(): scikit_model = DecisionTreeRegressor(random_state=1) data = self.scikit_data["data"].astype(dtype) target = self.scikit_data["target"].astype(dtype) scikit_model, spec = self._sklearn_setup(scikit_model, dtype, data, target) test_data = data[0].reshape(1, -1) self._check_tree_model(spec, "multiArrayType", "doubleType", 1) coreml_model = create_model(spec) try: self.assertEqual( scikit_model.predict(test_data)[0].dtype, type(coreml_model.predict({"data": test_data})["target"]), ) self.assertEqual( scikit_model.predict(test_data)[0], coreml_model.predict({"data": test_data})["target"], msg="{} != {} for Dtype: {}".format( scikit_model.predict(test_data)[0], coreml_model.predict({"data": test_data})["target"], dtype, ), ) except RuntimeError: print("{} not supported. ".format(dtype)) def test_random_forest_classifier(self): for dtype in self.number_data_type.keys(): # n_estimators default changed >= 0.22. Specify explicitly to match <0.22 behavior. scikit_model = RandomForestClassifier(random_state=1, n_estimators=10) data = self.scikit_data["data"].astype(dtype) target = ( self.scikit_data["target"].astype(dtype) > self.scikit_data["target"].astype(dtype).mean() ) scikit_model, spec = self._sklearn_setup(scikit_model, dtype, data, target) test_data = data[0].reshape(1, -1) self._check_tree_model(spec, "multiArrayType", "int64Type", 2) coreml_model = create_model(spec) try: self.assertEqual( scikit_model.predict(test_data)[0], bool(int(coreml_model.predict({"data": test_data})["target"])), msg="{} != {} for Dtype: {}".format( scikit_model.predict(test_data)[0], bool(int(coreml_model.predict({"data": test_data})["target"])), dtype, ), ) except RuntimeError: print("{} not supported. ".format(dtype)) def test_random_forest_regressor(self): for dtype in self.number_data_type.keys(): # n_estimators default changed >= 0.22. Specify explicitly to match <0.22 behavior. 
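            # (scikit-learn 0.22 changed the default from n_estimators=10 to 100, so the
            # old value is pinned here to keep results comparable across versions.)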
scikit_model = RandomForestRegressor(random_state=1, n_estimators=10) data = self.scikit_data["data"].astype(dtype) target = self.scikit_data["target"].astype(dtype) scikit_model, spec = self._sklearn_setup(scikit_model, dtype, data, target) test_data = data[0].reshape(1, -1) self._check_tree_model(spec, "multiArrayType", "doubleType", 1) coreml_model = create_model(spec) try: self.assertEqual( scikit_model.predict(test_data)[0].dtype, type(coreml_model.predict({"data": test_data})["target"]), ) self.assertAlmostEqual( scikit_model.predict(test_data)[0], coreml_model.predict({"data": test_data})["target"], msg="{} != {} for Dtype: {}".format( scikit_model.predict(test_data)[0], coreml_model.predict({"data": test_data})["target"], dtype, ), ) except RuntimeError: print("{} not supported. ".format(dtype)) def test_support_vector_classifier(self): for dtype in self.number_data_type.keys(): scikit_model = SVC(kernel="rbf", gamma=1.2, C=1) data = self.scikit_data["data"].astype(dtype) target = ( self.scikit_data["target"].astype(dtype) > self.scikit_data["target"].astype(dtype).mean() ) scikit_model, spec = self._sklearn_setup(scikit_model, dtype, data, target) coreml_model = create_model(spec) for idx in range(0, 10): test_data = data[idx].reshape(1, -1) try: self.assertEqual( scikit_model.predict(test_data)[0], bool(int(coreml_model.predict({"data": test_data})["target"])), msg="{} != {} for Dtype: {}".format( scikit_model.predict(test_data)[0], bool( int(coreml_model.predict({"data": test_data})["target"]) ), dtype, ), ) except RuntimeError: print("{} not supported. ".format(dtype)) def test_support_vector_regressor(self): for dtype in self.number_data_type.keys(): scikit_model = SVR(kernel="rbf") data = self.scikit_data["data"].astype(dtype) target = self.scikit_data["target"].astype(dtype) scikit_model, spec = self._sklearn_setup(scikit_model, dtype, data, target) test_data = data[0].reshape(1, -1) coreml_model = create_model(spec) try: self.assertAlmostEqual( scikit_model.predict(test_data)[0], coreml_model.predict({"data": test_data})["target"], msg="{} != {} for Dtype: {}".format( scikit_model.predict(test_data)[0], coreml_model.predict({"data": test_data})["target"], dtype, ), ) except RuntimeError: print("{} not supported. ".format(dtype)) def test_linear_regressor(self): for dtype in self.number_data_type.keys(): scikit_model = LinearRegression() data = self.scikit_data["data"].astype(dtype) target = self.scikit_data["target"].astype(dtype) scikit_model, spec = self._sklearn_setup(scikit_model, dtype, data, target) test_data = data[0].reshape(1, -1) coreml_model = create_model(spec) try: self.assertEqual( scikit_model.predict(test_data)[0].dtype, type(coreml_model.predict({"data": test_data})["target"]), ) self.assertAlmostEqual( scikit_model.predict(test_data)[0], coreml_model.predict({"data": test_data})["target"], msg="{} != {} for Dtype: {}".format( scikit_model.predict(test_data)[0], coreml_model.predict({"data": test_data})["target"], dtype, ), ) except RuntimeError: print("{} not supported. 
".format(dtype)) def test_image_output_rgb(self): input_shape = (3, 10, 20) input_features = [("data", coremltools.models.datatypes.Array(*input_shape))] output_features = [("target", coremltools.models.datatypes.Array(*input_shape))] builder = coremltools.models.neural_network.NeuralNetworkBuilder( input_features, output_features ) builder.add_elementwise( "Identity", input_names=["data"], output_name="target", mode="ADD", alpha=0.0, ) spec = builder.spec output = spec.description.output[0] output.type.imageType.colorSpace = coremltools.proto.FeatureTypes_pb2.ImageFeatureType.ColorSpace.Value( "RGB" ) output.type.imageType.height = input_shape[1] output.type.imageType.width = input_shape[2] coreml_model = coremltools.models.MLModel(spec) input_data = np.floor(np.random.rand(*input_shape) * 255) coreml_out = coreml_model.predict({"data": input_data})["target"] self.assertEqual(PIL.Image.Image, type(coreml_out)) self.assertEqual("RGBA", coreml_out.mode) np.testing.assert_equal( np.uint8(input_data), np.array(coreml_out).transpose(2, 0, 1)[:3, :] ) @unittest.skip("rdar://71638164") def test_image_output_bgr(self): input_shape = (3, 15, 25) input_features = [("data", coremltools.models.datatypes.Array(*input_shape))] output_features = [("target", coremltools.models.datatypes.Array(*input_shape))] builder = coremltools.models.neural_network.NeuralNetworkBuilder( input_features, output_features ) builder.add_elementwise( "Identity", input_names=["data"], output_name="target", mode="ADD", alpha=0.0, ) spec = builder.spec output = spec.description.output[0] output.type.imageType.colorSpace = coremltools.proto.FeatureTypes_pb2.ImageFeatureType.ColorSpace.Value( "BGR" ) output.type.imageType.height = input_shape[1] output.type.imageType.width = input_shape[2] coreml_model = coremltools.models.MLModel(spec) input_data = np.floor(np.random.rand(*input_shape) * 255) coreml_out = coreml_model.predict({"data": input_data})["target"] self.assertEqual(PIL.Image.Image, type(coreml_out)) self.assertEqual("RGBA", coreml_out.mode) np.testing.assert_equal( np.uint8(input_data), np.array(coreml_out)[:, :, ::-1].transpose(2, 0, 1)[1:, :], ) def test_image_output_grayscale(self): input_shape = (1, 20, 30) input_features = [("data", coremltools.models.datatypes.Array(*input_shape))] output_features = [("target", coremltools.models.datatypes.Array(*input_shape))] builder = coremltools.models.neural_network.NeuralNetworkBuilder( input_features, output_features ) builder.add_elementwise( "Identity", input_names=["data"], output_name="target", mode="ADD", alpha=0.0, ) spec = builder.spec output = spec.description.output[0] output.type.imageType.colorSpace = coremltools.proto.FeatureTypes_pb2.ImageFeatureType.ColorSpace.Value( "GRAYSCALE" ) output.type.imageType.height = input_shape[1] output.type.imageType.width = input_shape[2] coreml_model = coremltools.models.MLModel(spec) input_data = np.floor(np.random.rand(*input_shape) * 255) coreml_out = coreml_model.predict({"data": input_data})["target"] self.assertEqual(PIL.Image.Image, type(coreml_out)) self.assertEqual("L", coreml_out.mode) np.testing.assert_equal(np.uint8(input_data)[0], np.array(coreml_out)) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_k_neighbors_classifier.py0000644000000000000000000002625114672066616027016 0ustar00rootroot# Copyright (c) 2019, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import unittest from scipy import sparse from coremltools._deps import _HAS_SKLEARN if _HAS_SKLEARN: from sklearn.datasets import load_iris from sklearn.neighbors import KNeighborsClassifier from coremltools.converters import sklearn @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class KNeighborsClassifierScikitTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ print("Setting up KNeighborsClassifier converter tests") iris_samples = load_iris() self.iris_X = iris_samples.data self.iris_y = iris_samples.target def test_conversion_unfitted(self): """Tests conversion failure for an unfitted scikit model.""" scikit_model = KNeighborsClassifier() self.assertRaises(TypeError, sklearn.convert, scikit_model) def test_conversion_brute_algorithm(self): """Tests conversion of a scikit KNeighborsClassifier using the brute force algorithm.""" scikit_model = KNeighborsClassifier(algorithm="brute", n_neighbors=42) scikit_model.fit(self.iris_X, self.iris_y) coreml_model = sklearn.convert(scikit_model, "single_input", "single_output") coreml_spec = coreml_model.get_spec() self.assertIsNotNone(coreml_spec) self.assertTrue(coreml_spec.HasField("kNearestNeighborsClassifier")) self.assertEqual( coreml_spec.kNearestNeighborsClassifier.numberOfNeighbors.defaultValue, 42 ) self.assertEqual( coreml_spec.kNearestNeighborsClassifier.numberOfNeighbors.range.minValue, 1 ) self.assertEqual( coreml_spec.kNearestNeighborsClassifier.numberOfNeighbors.range.maxValue, len(self.iris_X), ) self.assertTrue( coreml_spec.kNearestNeighborsClassifier.HasField("uniformWeighting") ) self.assertEqual( coreml_spec.kNearestNeighborsClassifier.nearestNeighborsIndex.numberOfDimensions, len(self.iris_X[0]), ) self.assertTrue( coreml_spec.kNearestNeighborsClassifier.nearestNeighborsIndex.HasField( "linearIndex" ) ) self.assertTrue( coreml_spec.kNearestNeighborsClassifier.nearestNeighborsIndex.HasField( "squaredEuclideanDistance" ) ) self.validate_labels(coreml_spec, self.iris_y) self.validate_float_samples(coreml_spec, self.iris_X) def test_conversion_kd_tree_algorithm(self): """Tests conversion of a scikit KNeighborsClassifier using the brute force algorithm.""" test_leaf_size = 23 test_n_neighbors = 42 scikit_model = KNeighborsClassifier( algorithm="kd_tree", leaf_size=test_leaf_size, n_neighbors=test_n_neighbors ) scikit_model.fit(self.iris_X, self.iris_y) coreml_model = sklearn.convert(scikit_model, "single_input", "single_output") coreml_spec = coreml_model.get_spec() self.assertIsNotNone(coreml_spec) self.assertTrue(coreml_spec.HasField("kNearestNeighborsClassifier")) self.assertEqual( coreml_spec.kNearestNeighborsClassifier.numberOfNeighbors.defaultValue, test_n_neighbors, ) self.assertEqual( coreml_spec.kNearestNeighborsClassifier.numberOfNeighbors.range.minValue, 1 ) self.assertEqual( coreml_spec.kNearestNeighborsClassifier.numberOfNeighbors.range.maxValue, len(self.iris_X), ) self.assertTrue( coreml_spec.kNearestNeighborsClassifier.HasField("uniformWeighting") ) self.assertEqual( coreml_spec.kNearestNeighborsClassifier.nearestNeighborsIndex.numberOfDimensions, len(self.iris_X[0]), ) self.assertTrue( coreml_spec.kNearestNeighborsClassifier.nearestNeighborsIndex.HasField( "singleKdTreeIndex" ) ) 
self.assertEqual( test_leaf_size, coreml_spec.kNearestNeighborsClassifier.nearestNeighborsIndex.singleKdTreeIndex.leafSize, ) self.assertTrue( coreml_spec.kNearestNeighborsClassifier.nearestNeighborsIndex.HasField( "squaredEuclideanDistance" ) ) self.validate_labels(coreml_spec, self.iris_y) self.validate_float_samples(coreml_spec, self.iris_X) def test_conversion_auto_algorithm(self): """Tests conversion of a scikit KNeighborsClassifier using the brute force algorithm.""" test_n_neighbors = 42 scikit_model = KNeighborsClassifier( algorithm="auto", n_neighbors=test_n_neighbors ) scikit_model.fit(self.iris_X, self.iris_y) coreml_model = sklearn.convert(scikit_model, "single_input", "single_output") coreml_spec = coreml_model.get_spec() self.assertIsNotNone(coreml_spec) def test_conversion_unsupported_algorithm(self): """Test a scikit KNeighborsClassifier with an invalid algorithm.""" scikit_model = KNeighborsClassifier(algorithm="ball_tree") self.assertRaises(TypeError, sklearn.convert, scikit_model) def test_conversion_weight_function_good(self): scikit_model = KNeighborsClassifier(weights="uniform") scikit_model.fit(self.iris_X, self.iris_y) coreml_model = sklearn.convert(scikit_model, "single_input", "single_output") coreml_spec = coreml_model.get_spec() self.assertIsNotNone(coreml_spec) self.assertTrue( coreml_spec.kNearestNeighborsClassifier.HasField("uniformWeighting") ) def test_conversion_unsupported_weight_function(self): scikit_model = KNeighborsClassifier(algorithm="brute", weights="distance") scikit_model.fit(self.iris_X, self.iris_y) self.assertRaises(TypeError, sklearn.convert, scikit_model) def callable_weight_function(): print("Inside callable_weight_function") scikit_model = KNeighborsClassifier( algorithm="brute", weights=callable_weight_function ) scikit_model.fit(self.iris_X, self.iris_y) self.assertRaises(TypeError, sklearn.convert, scikit_model) def test_conversion_distance_function_good(self): """Tests conversion of a scikit KNeighborsClassifier with a valid distance metric.""" scikit_model = KNeighborsClassifier(algorithm="brute", metric="euclidean") scikit_model.fit(self.iris_X, self.iris_y) coreml_model = sklearn.convert(scikit_model, "single_input", "single_output") coreml_spec = coreml_model.get_spec() self.assertIsNotNone(coreml_spec) self.assertTrue( coreml_spec.kNearestNeighborsClassifier.nearestNeighborsIndex.HasField( "squaredEuclideanDistance" ) ) # Minkowski metric with p=2 is equivalent to the squared Euclidean distance scikit_model = KNeighborsClassifier(algorithm="brute", metric="minkowski", p=2) scikit_model.fit(self.iris_X, self.iris_y) coreml_spec = coreml_model.get_spec() self.assertIsNotNone(coreml_spec) self.assertTrue( coreml_spec.kNearestNeighborsClassifier.nearestNeighborsIndex.HasField( "squaredEuclideanDistance" ) ) def test_conversion_unsupported_distance_function(self): """Tests conversion of a scikit KNeighborsClassifier with an invalid distance metric.""" # There are many possible distance functions for a brute force neighbors function, but these 3 should give us # coverage over the converter code. 
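# (Per the assertions in the tests above, the converted spec only ever carries a squaredEuclideanDistance index, so metrics other than "euclidean" or "minkowski" with p=2 are expected to be rejected with a TypeError below.)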
scikit_model = KNeighborsClassifier(algorithm="brute", metric="manhattan") scikit_model.fit(self.iris_X, self.iris_y) self.assertRaises(TypeError, sklearn.convert, scikit_model) scikit_model = KNeighborsClassifier(algorithm="kd_tree", metric="chebyshev") scikit_model.fit(self.iris_X, self.iris_y) self.assertRaises(TypeError, sklearn.convert, scikit_model) scikit_model = KNeighborsClassifier(algorithm="brute", metric="minkowski", p=3) scikit_model.fit(self.iris_X, self.iris_y) self.assertRaises(TypeError, sklearn.convert, scikit_model) def callable_distance_function(): print("Inside callable_distance_function") scikit_model = KNeighborsClassifier( algorithm="brute", metric=callable_distance_function ) scikit_model.fit(self.iris_X, self.iris_y) self.assertRaises(TypeError, sklearn.convert, scikit_model) def test_conversion_with_sparse_X(self): """Tests conversion of a model that's fitted with sparse data.""" num_samples = 100 num_dims = 64 sparse_X = sparse.rand( num_samples, num_dims, format="csr" ) # KNeighborsClassifier only supports CSR format y = self.iris_y[ 0:num_samples ] # the labels themselves don't matter - just use 100 of the Iris ones sklearn_model = KNeighborsClassifier(algorithm="brute") sklearn_model.fit(sparse_X, y) coreml_model = sklearn.convert(sklearn_model) coreml_spec = coreml_model.get_spec() self.assertIsNotNone(coreml_spec) def test_conversion_with_sparse_y(self): """Tests conversion of a model that's fitted with y values in a sparse format.""" from sklearn.model_selection import train_test_split X_train, X_test, y_train, y_test = train_test_split( self.iris_X, self.iris_y, test_size=0.2, train_size=0.8 ) from sklearn import preprocessing lb = preprocessing.LabelBinarizer(sparse_output=True) binarized_y = lb.fit_transform(y_train) sklearn_model = KNeighborsClassifier(algorithm="brute") sklearn_model.fit(X_train, binarized_y) self.assertRaises(ValueError, sklearn.convert, sklearn_model) def validate_labels(self, spec, expected): """Validate the labels returned from the converted scikit KNeighborsClassifier""" self.assertTrue(spec.kNearestNeighborsClassifier.HasField("int64ClassLabels")) for index, label in enumerate( spec.kNearestNeighborsClassifier.int64ClassLabels.vector ): self.assertEqual(label, expected[index]) def validate_float_samples(self, spec, expected): """Validate the float samples returned from the converted scikit KNeighborsClassifier""" num_dimensions = ( spec.kNearestNeighborsClassifier.nearestNeighborsIndex.numberOfDimensions ) for index, sample in enumerate( spec.kNearestNeighborsClassifier.nearestNeighborsIndex.floatSamples ): for dim in range(0, num_dimensions): self.assertAlmostEqual( sample.vector[dim], expected[index][dim], places=6 ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_linear_regression.py0000644000000000000000000001154214672066616026027 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import unittest import pandas as pd from ..utils import load_boston from coremltools._deps import _HAS_SKLEARN from coremltools.models.utils import (_is_macos, _macos_version, evaluate_regressor) if _HAS_SKLEARN: from sklearn.linear_model import LinearRegression from sklearn.preprocessing import OneHotEncoder from sklearn.svm import LinearSVR from coremltools.converters.sklearn import convert @unittest.skipIf(not _HAS_SKLEARN, "Missing scikitlearn. Skipping tests.") class LinearRegressionScikitTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ scikit_data = load_boston() scikit_model = LinearRegression() scikit_model.fit(scikit_data["data"], scikit_data["target"]) # Save the data and the model self.scikit_data = scikit_data self.scikit_model = scikit_model def test_conversion(self): input_names = self.scikit_data["feature_names"] spec = convert(self.scikit_model, input_names, "target").get_spec() self.assertIsNotNone(spec) # Test the model class self.assertIsNotNone(spec.description) # Test the interface class self.assertEqual(spec.description.predictedFeatureName, "target") # Test the inputs and outputs self.assertEqual(len(spec.description.output), 1) self.assertEqual(spec.description.output[0].name, "target") self.assertEqual( spec.description.output[0].type.WhichOneof("Type"), "doubleType" ) for input_type in spec.description.input: self.assertEqual(input_type.type.WhichOneof("Type"), "doubleType") self.assertEqual( sorted(input_names), sorted(map(lambda x: x.name, spec.description.input)) ) # Test the linear regression parameters. self.assertTrue( spec.pipelineRegressor.pipeline.models[-1].HasField("glmRegressor") ) lr = spec.pipelineRegressor.pipeline.models[-1].glmRegressor self.assertEqual(lr.offset, self.scikit_model.intercept_) self.assertEqual(len(lr.weights), 1) self.assertEqual(len(lr.weights[0].value), 13) i = 0 for w in lr.weights[0].value: self.assertAlmostEqual(w, self.scikit_model.coef_[i]) i = i + 1 def test_conversion_bad_inputs(self): # Error on converting an untrained model with self.assertRaises(TypeError): model = LinearRegression() spec = convert(model, "data", "out") # Check the expected class during conversion. 
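# (A OneHotEncoder is not a regression model, so the regression converter is expected to reject it with a TypeError as well.)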
with self.assertRaises(TypeError): model = OneHotEncoder() spec = convert(model, "data", "out") @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) def test_linear_regression_evaluation(self): """ Check that the evaluation results are the same in scikit learn and coremltools """ input_names = self.scikit_data["feature_names"] df = pd.DataFrame(self.scikit_data["data"], columns=input_names) cur_model = LinearRegression() cur_model.fit(self.scikit_data["data"], self.scikit_data["target"]) spec = convert(cur_model, input_names, "target") df["target"] = cur_model.predict(self.scikit_data["data"]) metrics = evaluate_regressor(spec, df) self.assertAlmostEqual(metrics["max_error"], 0) @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) def test_linear_svr_evaluation(self): """ Check that the evaluation results are the same in scikit learn and coremltools """ ARGS = [ {}, {"C": 0.5, "epsilon": 0.25}, {"dual": False, "loss": "squared_epsilon_insensitive"}, {"tol": 0.005}, {"fit_intercept": False}, {"intercept_scaling": 1.5}, ] input_names = self.scikit_data["feature_names"] df = pd.DataFrame(self.scikit_data["data"], columns=input_names) for cur_args in ARGS: cur_model = LinearSVR(**cur_args) cur_model.fit(self.scikit_data["data"], self.scikit_data["target"]) spec = convert(cur_model, input_names, "target") df["target"] = cur_model.predict(self.scikit_data["data"]) metrics = evaluate_regressor(spec, df) self.assertAlmostEqual(metrics["max_error"], 0) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_nearest_neighbors_builder.py0000644000000000000000000003733114672066616027530 0ustar00rootroot# Copyright (c) 2019, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os import shutil import unittest from coremltools._deps import _HAS_SKLEARN from coremltools.models import MLModel from coremltools.models.nearest_neighbors import \ KNearestNeighborsClassifierBuilder from coremltools.models.utils import _is_macos if _HAS_SKLEARN: from sklearn.datasets import load_iris @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class NearestNeighborsBuilderTest(unittest.TestCase): """ Unit tests for the nearest neighbors builder class. 
""" def setUp(self): iris_samples = load_iris() self.iris_X = iris_samples.data self.iris_y = iris_samples.target self.training_X = self.iris_X[-30:] self.training_y = self.iris_y[-30:] def tearDown(self): # Do any cleanup here pass def create_builder(self, default_class_label="default_label"): builder = KNearestNeighborsClassifierBuilder( input_name="input", output_name="output", number_of_dimensions=4, default_class_label=default_class_label, ) return builder def test_builder_output_types(self): builder = self.create_builder(default_class_label="default") self.assertIsNotNone(builder) self.assertTrue( builder.spec.kNearestNeighborsClassifier.HasField("stringClassLabels") ) builder = self.create_builder(default_class_label=12) self.assertIsNotNone(builder) self.assertTrue( builder.spec.kNearestNeighborsClassifier.HasField("int64ClassLabels") ) with self.assertRaises(TypeError): bad_default_label = float(21.32) self.create_builder(default_class_label=bad_default_label) def test_builder_training_input(self): builder = self.create_builder(default_class_label="default") self.assertIsNotNone(builder) self.assertTrue( builder.spec.kNearestNeighborsClassifier.HasField("stringClassLabels") ) self.assertEqual(builder.spec.description.trainingInput[0].name, "input") self.assertEqual( builder.spec.description.trainingInput[0].type.WhichOneof("Type"), "multiArrayType", ) self.assertEqual(builder.spec.description.trainingInput[1].name, "output") self.assertEqual( builder.spec.description.trainingInput[1].type.WhichOneof("Type"), "stringType", ) def test_make_updatable(self): builder = self.create_builder() self.assertIsNotNone(builder) self.assertTrue(builder.spec.isUpdatable) builder.is_updatable = False self.assertFalse(builder.spec.isUpdatable) builder.is_updatable = True self.assertTrue(builder.spec.isUpdatable) def test_author(self): builder = self.create_builder() self.assertIsNotNone(builder) self.assertEqual(builder.spec.description.metadata.author, "") builder.author = "John Doe" self.assertEqual(builder.author, "John Doe") self.assertEqual(builder.spec.description.metadata.author, "John Doe") def test_description(self): builder = self.create_builder() self.assertIsNotNone(builder) self.assertEqual(builder.spec.description.metadata.shortDescription, "") builder.description = "This is a description" self.assertEqual(builder.description, "This is a description") self.assertEqual( builder.spec.description.metadata.shortDescription, "This is a description" ) def test_weighting_scheme(self): builder = self.create_builder() self.assertIsNotNone(builder) builder.weighting_scheme = "uniform" self.assertEqual(builder.weighting_scheme, "uniform") builder.weighting_scheme = "inverse_distance" self.assertEqual(builder.weighting_scheme, "inverse_distance") builder.weighting_scheme = "unIfOrM" self.assertEqual(builder.weighting_scheme, "uniform") builder.weighting_scheme = "InVerSE_DISTance" self.assertEqual(builder.weighting_scheme, "inverse_distance") with self.assertRaises(TypeError): builder.weighting_scheme = "test" def test_index_type(self): builder = self.create_builder() self.assertIsNotNone(builder) self.assertEqual(builder.index_type, "linear") self.assertEqual(builder.leaf_size, 0) builder.set_index_type("kd_tree") self.assertEqual(builder.index_type, "kd_tree") # test default value self.assertEqual(builder.leaf_size, 30) builder.set_index_type("linear") self.assertEqual(builder.index_type, "linear") self.assertEqual(builder.leaf_size, 0) builder.set_index_type("kd_tree", leaf_size=45) # test 
user-defined value self.assertEqual(builder.index_type, "kd_tree") self.assertEqual(builder.leaf_size, 45) builder.set_index_type("linear", leaf_size=37) self.assertEqual(builder.index_type, "linear") self.assertEqual(builder.leaf_size, 0) builder.set_index_type("KD_TrEe", leaf_size=22) # test user-defined value self.assertEqual(builder.index_type, "kd_tree") self.assertEqual(builder.leaf_size, 22) builder.set_index_type("linEAR") self.assertEqual(builder.index_type, "linear") self.assertEqual(builder.leaf_size, 0) with self.assertRaises(TypeError): builder.set_index_type("unsupported_index") with self.assertRaises(TypeError): builder.set_index_type("kd_tree", -10) with self.assertRaises(TypeError): builder.set_index_type("kd_tree", 0) def test_leaf_size(self): builder = self.create_builder() self.assertIsNotNone(builder) builder.set_index_type("kd_tree", leaf_size=45) # test user-defined value self.assertEqual(builder.index_type, "kd_tree") self.assertEqual(builder.leaf_size, 45) builder.leaf_size = 12 self.assertEqual(builder.index_type, "kd_tree") self.assertEqual(builder.leaf_size, 12) def test_set_number_of_neighbors_with_bounds(self): builder = self.create_builder() self.assertIsNotNone(builder) self.assertEqual(builder.number_of_neighbors, 5) (min_value, max_value) = builder.number_of_neighbors_allowed_range() self.assertEqual(min_value, 1) self.assertEqual(max_value, 1000) builder.set_number_of_neighbors_with_bounds(12, allowed_range=(2, 24)) (min_value, max_value) = builder.number_of_neighbors_allowed_range() self.assertEqual(builder.number_of_neighbors, 12) self.assertEqual(min_value, 2) self.assertEqual(max_value, 24) allowed_values = builder.number_of_neighbors_allowed_set() self.assertIsNone(allowed_values) test_set = {3, 5, 7, 9} builder.set_number_of_neighbors_with_bounds(7, allowed_set=test_set) self.assertEqual(builder.number_of_neighbors, 7) allowed_values = builder.number_of_neighbors_allowed_set() self.assertIsNotNone(allowed_values) self.assertEqual(allowed_values, test_set) def test_set_number_of_neighbors_with_bounds_error_conditions(self): builder = self.create_builder() self.assertIsNotNone(builder) with self.assertRaises(ValueError): builder.set_number_of_neighbors_with_bounds(3) test_range = (3, 15) test_set = {1, 3, 5} with self.assertRaises(ValueError): builder.set_number_of_neighbors_with_bounds( 3, allowed_range=test_range, allowed_set=test_set ) with self.assertRaises(ValueError): builder.set_number_of_neighbors_with_bounds(3, allowed_range=(-5, 5)) with self.assertRaises(ValueError): builder.set_number_of_neighbors_with_bounds(3, allowed_range=(5, 1)) with self.assertRaises(ValueError): builder.set_number_of_neighbors_with_bounds( 3, allowed_range=test_range, allowed_set=test_set ) with self.assertRaises(ValueError): builder.set_number_of_neighbors_with_bounds(2, allowed_range=test_range) with self.assertRaises(TypeError): builder.set_number_of_neighbors_with_bounds(5, allowed_set={5, -3, 7}) with self.assertRaises(ValueError): builder.set_number_of_neighbors_with_bounds(4, allowed_set=test_set) with self.assertRaises(ValueError): builder.set_number_of_neighbors_with_bounds(4, allowed_set=test_set) with self.assertRaises(TypeError): builder.set_number_of_neighbors_with_bounds(2, allowed_set=[1, 2, 3]) with self.assertRaises(TypeError): builder.set_number_of_neighbors_with_bounds(4, allowed_range={2, 200}) with self.assertRaises(TypeError): builder.set_number_of_neighbors_with_bounds(4, allowed_range=(2, 10, 20)) with self.assertRaises(TypeError): 
builder.set_number_of_neighbors_with_bounds(4, allowed_set=set()) with self.assertRaises(TypeError): builder.set_number_of_neighbors_with_bounds(4, allowed_range=[]) def test_set_number_of_neighbors(self): builder = self.create_builder() self.assertIsNotNone(builder) builder.set_number_of_neighbors_with_bounds(12, allowed_range=(2, 24)) self.assertEqual(builder.number_of_neighbors, 12) with self.assertRaises(ValueError): builder.set_number_of_neighbors_with_bounds(1, allowed_range=(2, 24)) builder.set_number_of_neighbors_with_bounds(4, allowed_range=(2, 24)) self.assertEqual(builder.number_of_neighbors, 4) test_set = {3, 5, 7, 9} builder.set_number_of_neighbors_with_bounds(7, allowed_set=test_set) with self.assertRaises(ValueError): builder.set_number_of_neighbors_with_bounds(4, allowed_set=test_set) builder.set_number_of_neighbors_with_bounds(5, allowed_set=test_set) self.assertEqual(builder.number_of_neighbors, 5) def test_add_samples_invalid_data(self): builder = self.create_builder() self.assertIsNotNone(builder) invalid_X = [[1.0, 2.4]] with self.assertRaises(TypeError): builder.add_samples(invalid_X, self.training_y) with self.assertRaises(TypeError): builder.add_samples(self.training_X, self.training_y[:3]) with self.assertRaises(TypeError): builder.add_samples([], self.training_y) with self.assertRaises(TypeError): builder.add_samples(self.training_X, []) def test_add_samples_int_labels(self): builder = self.create_builder(default_class_label=12) self.assertIsNotNone(builder) some_X = self.training_X[:10] some_y = self.training_y[:10] builder.add_samples(some_X, some_y) self._validate_samples(builder.spec, some_X, some_y) addl_X = self.training_X[10:20] addl_y = self.training_y[10:20] builder.add_samples(addl_X, addl_y) self._validate_samples(builder.spec, self.training_X[:20], self.training_y[:20]) def test_add_samples_string_labels(self): builder = self.create_builder(default_class_label="default") self.assertIsNotNone(builder) some_X = self.training_X[:3] some_y = ["one", "two", "three"] builder.add_samples(some_X, some_y) self._validate_samples(builder.spec, some_X, some_y) addl_X = self.training_X[3:6] addl_y = ["four", "five", "six"] builder.add_samples(addl_X, addl_y) self._validate_samples(builder.spec, self.training_X[0:6], some_y + addl_y) def test_add_samples_invalid_label_types(self): builder_int_labels = self.create_builder(default_class_label=42) self.assertIsNotNone(builder_int_labels) some_X = self.training_X[:3] invalid_int_y = [0, "one", 2] with self.assertRaises(TypeError): builder_int_labels.add_samples(some_X, invalid_int_y) builder_string_labels = self.create_builder(default_class_label="default") self.assertIsNotNone(builder_string_labels) invalid_string_y = ["zero", "one", 2] with self.assertRaises(TypeError): builder_string_labels.add_samples(some_X, invalid_string_y) @unittest.skipUnless(_is_macos(), "Only supported on MacOS platform.") def test_can_init_and_save_model_from_builder_with_updated_spec(self): builder = KNearestNeighborsClassifierBuilder( input_name="input", output_name="output", number_of_dimensions=10, default_class_label="defaultLabel", k=3, weighting_scheme="inverse_distance", index_type="kd_tree", leaf_size=50, ) builder.author = "CoreML Team" builder.license = "MIT" builder.description = "test_builder_with_validation" # Save the updated spec coreml_model = MLModel(builder.spec) self.assertIsNotNone(coreml_model) coreml_model_path = "/tmp/__test_builder_with_validation.mlmodel" try: coreml_model.save(coreml_model_path) 
self.assertTrue(os.path.isfile(coreml_model_path)) finally: self._delete_mlmodel_and_mlmodelc(coreml_model_path) @unittest.skipUnless(_is_macos(), "Only supported on MacOS platform.") def test_can_init_and_save_model_from_builder_default_parameters(self): builder = KNearestNeighborsClassifierBuilder( input_name="input", output_name="output", number_of_dimensions=4, default_class_label="defaultLabel", ) # Save the updated spec coreml_model = MLModel(builder.spec) self.assertIsNotNone(coreml_model) coreml_model_path = "/tmp/__test_builder_with_validation.mlmodel" try: coreml_model.save(coreml_model_path) self.assertTrue(os.path.isfile(coreml_model_path)) finally: self._delete_mlmodel_and_mlmodelc(coreml_model_path) def _validate_samples(self, spec, expected_X, expected_y): """Validate the float samples returned from the converted scikit KNeighborsClassifier""" num_dimensions = ( spec.kNearestNeighborsClassifier.nearestNeighborsIndex.numberOfDimensions ) for index, sample in enumerate( spec.kNearestNeighborsClassifier.nearestNeighborsIndex.floatSamples ): for dim in range(0, num_dimensions): self.assertAlmostEqual( sample.vector[dim], expected_X[index][dim], places=6 ) if spec.kNearestNeighborsClassifier.HasField("int64ClassLabels"): for index, label in enumerate( spec.kNearestNeighborsClassifier.int64ClassLabels.vector ): self.assertEqual(label, expected_y[index]) elif spec.kNearestNeighborsClassifier.HasField("stringClassLabels"): for index, label in enumerate( spec.kNearestNeighborsClassifier.stringClassLabels.vector ): self.assertEqual(label, expected_y[index]) @staticmethod def _delete_mlmodel_and_mlmodelc(path_to_mlmodel): """Delete the .mlmodel and .mlmodelc for the given .mlmodel.""" if os.path.exists(path_to_mlmodel): os.remove(path_to_mlmodel) path_to_mlmodelc = "{}c".format(path_to_mlmodel) if os.path.exists(path_to_mlmodelc): shutil.rmtree(path_to_mlmodelc) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_normalizer.py0000644000000000000000000000357014672066616024501 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import unittest import numpy as _np from ..utils import load_boston from coremltools._deps import _HAS_SKLEARN from coremltools.models.utils import (_is_macos, _macos_version, evaluate_transformer) if _HAS_SKLEARN: from sklearn.preprocessing import Normalizer from coremltools.converters import sklearn as converter @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class NormalizerScikitTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter. 
""" def test_random(self): # Generate some random data_imputeValue.multiArrayValue[i] X = _np.random.random(size=(50, 3)) for param in ("l1", "l2", "max"): cur_model = Normalizer(norm=param) output = cur_model.fit_transform(X) spec = converter.convert(cur_model, ["a", "b", "c"], "out") evaluate_transformer( spec, [dict(zip(["a", "b", "c"], row)) for row in X], [{"out": row} for row in output], ) def test_boston(self): scikit_data = load_boston() scikit_model = Normalizer(norm="l2").fit(scikit_data["data"]) spec = converter.convert(scikit_model, scikit_data["feature_names"], "out") input_data = [ dict(zip(scikit_data["feature_names"], row)) for row in scikit_data["data"] ] output_data = [{"out": row} for row in scikit_model.transform(scikit_data["data"])] evaluate_transformer(spec, input_data, output_data) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_one_hot_encoder.py0000644000000000000000000002410714672066616025450 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import unittest from copy import copy import numpy as np from packaging.version import Version from ..utils import load_boston from coremltools._deps import _HAS_SKLEARN, _SKLEARN_VERSION from coremltools.models.utils import (_is_macos, _macos_version, evaluate_transformer) if _HAS_SKLEARN: from sklearn.pipeline import Pipeline from sklearn.preprocessing import Normalizer, OneHotEncoder from coremltools.converters import sklearn from coremltools.models.datatypes import Array @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class OneHotEncoderScikitTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. 
""" scikit_data = [[0], [1], [2], [4], [3], [2], [4], [5], [6], [7]] scikit_data_multiple_cols = [[0, 1], [1, 0], [2, 2], [3, 3], [4, 4]] scikit_model = OneHotEncoder() scikit_model.fit(scikit_data) # Save the data and the model self.scikit_data = np.asarray(scikit_data, dtype="d") self.scikit_data_multiple_cols = np.asarray( scikit_data_multiple_cols, dtype="d" ) self.scikit_model = scikit_model @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) def test_conversion_one_column(self): # Fit a single OHE scikit_model = OneHotEncoder() scikit_model.fit(self.scikit_data) spec = sklearn.convert(scikit_model, "single_feature", "out").get_spec() test_data = [{"single_feature": row} for row in self.scikit_data] scikit_output = [ {"out": row} for row in scikit_model.transform(self.scikit_data).toarray() ] metrics = evaluate_transformer(spec, test_data, scikit_output) self.assertIsNotNone(spec) self.assertIsNotNone(spec.description) self.assertEqual(metrics["num_errors"], 0) @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) def test_conversion_many_columns(self): scikit_model = OneHotEncoder() scikit_model.fit(self.scikit_data_multiple_cols) spec = sklearn.convert( scikit_model, ["feature_1", "feature_2"], "out" ).get_spec() test_data = [ {"feature_1": row[0], "feature_2": row[1]} for row in self.scikit_data_multiple_cols ] scikit_output = [ {"out": row} for row in scikit_model.transform(self.scikit_data_multiple_cols).toarray() ] metrics = evaluate_transformer(spec, test_data, scikit_output) self.assertIsNotNone(spec) self.assertIsNotNone(spec.description) self.assertEqual(metrics["num_errors"], 0) @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) def test_conversion_one_column_of_several(self): if _SKLEARN_VERSION >= Version("0.22"): scikit_model = OneHotEncoder() else: scikit_model = OneHotEncoder(categorical_features=[0]) scikit_model.fit(copy(self.scikit_data_multiple_cols)) spec = sklearn.convert( scikit_model, ["feature_1", "feature_2"], "out" ).get_spec() test_data = [ {"feature_1": row[0], "feature_2": row[1]} for row in self.scikit_data_multiple_cols ] scikit_output = [ {"out": row} for row in scikit_model.transform(self.scikit_data_multiple_cols).toarray() ] metrics = evaluate_transformer(spec, test_data, scikit_output) self.assertIsNotNone(spec) self.assertIsNotNone(spec.description) self.assertEqual(metrics["num_errors"], 0) @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) @unittest.skipIf(_SKLEARN_VERSION >= Version("0.22"), "categorical_features parameter to OneHotEncoder() deprecated after SciKit Learn 0.22." 
) def test_boston_OHE(self): data = load_boston() for categorical_features in [[3], [8], [3, 8], [8, 3]]: model = OneHotEncoder( categorical_features=categorical_features, sparse=False ) model.fit(data.data, data.target) # Convert the model spec = sklearn.convert(model, data.feature_names, "out").get_spec() input_data = [dict(zip(data.feature_names, row)) for row in data.data] output_data = [{"out": row} for row in model.transform(data.data)] result = evaluate_transformer(spec, input_data, output_data) assert result["num_errors"] == 0 @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) @unittest.skipIf(_SKLEARN_VERSION >= Version("0.22"), "categorical_features parameter to OneHotEncoder() deprecated after SciKit Learn 0.22." ) def test_boston_OHE_pipeline(self): data = load_boston() for categorical_features in [[3], [8], [3, 8], [8, 3]]: # Put it in a pipeline so that we can test whether the output dimension # handling is correct. model = Pipeline( [ ("OHE", OneHotEncoder(categorical_features=categorical_features)), ("Normalizer", Normalizer()), ] ) model.fit(data.data.copy(), data.target) # Convert the model spec = sklearn.convert(model, data.feature_names, "out").get_spec() input_data = [dict(zip(data.feature_names, row)) for row in data.data] output_data = [{"out": row} for row in model.transform(data.data.copy())] result = evaluate_transformer(spec, input_data, output_data) assert result["num_errors"] == 0 @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) @unittest.skipIf(_SKLEARN_VERSION >= Version("0.22"), "categorical_features parameter to OneHotEncoder() deprecated after SciKit Learn 0.22." ) def test_random_sparse_data(self): n_columns = 8 n_categories = 20 import numpy.random as rn rn.seed(0) categories = rn.randint(50000, size=(n_columns, n_categories)) for dt in ["int32", "float32", "float64"]: _X = np.array( [ [categories[j, rn.randint(n_categories)] for j in range(n_columns)] for i in range(100) ], dtype=dt, ) # Test this data on a bunch of possible inputs. for sparse in (True, False): for categorical_features in [ "all", [3], [4], range(2, 8), range(0, 4), range(0, 8), ]: X = _X.copy() # This appears to be the only type now working. 
assert X.dtype == np.dtype(dt) model = OneHotEncoder( categorical_features=categorical_features, sparse=sparse ) model.fit(X) # Convert the model spec = sklearn.convert(model, [("data", Array(n_columns))], "out") X_out = model.transform(X) if sparse: X_out = X_out.todense() input_data = [{"data": row} for row in X] output_data = [{"out": row} for row in X_out] result = evaluate_transformer(spec, input_data, output_data) assert result["num_errors"] == 0 # Test normal data inside a pipeline for sparse in (True, False): for categorical_features in [ "all", [3], [4], range(2, 8), range(0, 4), range(0, 8), ]: X = _X.copy() model = Pipeline( [ ( "OHE", OneHotEncoder( categorical_features=categorical_features, sparse=sparse, ), ), ("Normalizer", Normalizer()), ] ) model.fit(X) # Convert the model spec = sklearn.convert( model, [("data", Array(n_columns))], "out" ).get_spec() X_out = model.transform(X) if sparse: X_out = X_out.todense() input_data = [{"data": row} for row in X] output_data = [{"out": row} for row in X_out] result = evaluate_transformer(spec, input_data, output_data) assert result["num_errors"] == 0 def test_conversion_bad_inputs(self): # Error on converting an untrained model with self.assertRaises(TypeError): model = OneHotEncoder() spec = sklearn.convert(model, "data", "out") # Check the expected class during conversion. with self.assertRaises(TypeError): from sklearn.linear_model import LinearRegression model = LinearRegression() spec = sklearn.convert(model, "data", "out") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_random_forest_classifier.py0000644000000000000000000001421514672066616027363 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import unittest import numpy as np from ..utils import load_boston from coremltools._deps import _HAS_SKLEARN if _HAS_SKLEARN: from sklearn.ensemble import RandomForestClassifier from coremltools.converters import sklearn as skl_converter @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class RandomForestBinaryClassifierScikitTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ scikit_data = load_boston() # n_estimators default changed >= 0.22. Specify explicitly to match <0.22 behavior. 
scikit_model = RandomForestClassifier(random_state=1, n_estimators=10) target = 1 * (scikit_data["target"] > scikit_data["target"].mean()) scikit_model.fit(scikit_data["data"], target) self.scikit_model_node_count = sum(map(lambda e: e.tree_.node_count, scikit_model.estimators_)) # Save the data and the model self.scikit_data = scikit_data self.scikit_model = scikit_model def test_conversion(self): input_names = self.scikit_data["feature_names"] output_name = "target" spec = skl_converter.convert( self.scikit_model, input_names, "target" ).get_spec() self.assertIsNotNone(spec) # Test the model class self.assertIsNotNone(spec.description) # Test the interface class self.assertEqual(spec.description.predictedFeatureName, "target") # Test the inputs and outputs self.assertEqual(len(spec.description.output), 2) self.assertEqual(spec.description.output[0].name, "target") self.assertEqual( spec.description.output[0].type.WhichOneof("Type"), "int64Type" ) for input_type in spec.description.input: self.assertEqual(input_type.type.WhichOneof("Type"), "doubleType") self.assertEqual( sorted(input_names), sorted(map(lambda x: x.name, spec.description.input)) ) self.assertEqual(len(spec.pipelineClassifier.pipeline.models), 2) tr = spec.pipelineClassifier.pipeline.models[ -1 ].treeEnsembleClassifier.treeEnsemble self.assertIsNotNone(tr) self.assertEqual(len(tr.nodes), self.scikit_model_node_count) def test_conversion_bad_inputs(self): # Error on converting an untrained model with self.assertRaises(Exception): # n_estimators default changed >= 0.22. Specify explicitly to match <0.22 behavior. model = RandomForestClassifier(n_estimators=10) spec = skl_converter.convert(model, "data", "out") # Check the expected class during conversion. from sklearn.preprocessing import OneHotEncoder with self.assertRaises(Exception): model = OneHotEncoder() spec = skl_converter.convert(model, "data", "out") @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class RandomForestMultiClassClassifierScikitTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ # n_estimators default changed >= 0.22. Specify explicitly to match <0.22 behavior. 
scikit_model = RandomForestClassifier(random_state=1, n_estimators=10) scikit_data = load_boston() t = scikit_data["target"] target = np.digitize(t, np.histogram(t)[1]) - 1 scikit_model.fit(scikit_data["data"], target) self.scikit_model_node_count = sum(map(lambda e: e.tree_.node_count, scikit_model.estimators_)) # Save the data and the model self.scikit_data = scikit_data self.target = target self.scikit_model = scikit_model def test_conversion(self): input_names = self.scikit_data["feature_names"] output_name = "target" spec = skl_converter.convert( self.scikit_model, input_names, "target" ).get_spec() self.assertIsNotNone(spec) # Test the model class self.assertIsNotNone(spec.description) self.assertIsNotNone(spec.treeEnsembleClassifier) # Test the interface class self.assertEqual(spec.description.predictedFeatureName, "target") # Test the inputs and outputs self.assertEqual(len(spec.description.output), 2) self.assertEqual(spec.description.output[0].name, "target") self.assertEqual( spec.description.output[0].type.WhichOneof("Type"), "int64Type" ) for input_type in spec.description.input: self.assertEqual(input_type.type.WhichOneof("Type"), "doubleType") self.assertEqual( sorted(input_names), sorted(map(lambda x: x.name, spec.description.input)) ) self.assertEqual(len(spec.pipelineClassifier.pipeline.models), 2) tr = spec.pipelineClassifier.pipeline.models[ -1 ].treeEnsembleClassifier.treeEnsemble self.assertIsNotNone(tr) self.assertEqual(len(tr.nodes), self.scikit_model_node_count) def test_conversion_bad_inputs(self): # Error on converting an untrained model with self.assertRaises(Exception): # n_estimators default changed >= 0.22. Specify explicitly to match <0.22 behavior. model = RandomForestClassifier(n_estimators=10) spec = skl_converter.convert(model, "data", "out") # Check the expected class during conversion. with self.assertRaises(Exception): from sklearn.preprocessing import OneHotEncoder model = OneHotEncoder() spec = skl_converter.convert(model, "data", "out") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_random_forest_classifier_numeric.py0000644000000000000000000001165714672066616031114 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import unittest import numpy as np import pandas as pd import pytest from packaging.version import Version from ..utils import load_boston from coremltools._deps import _HAS_SKLEARN, _SKLEARN_VERSION from coremltools.models.utils import (_is_macos, _macos_version, evaluate_classifier) if _HAS_SKLEARN: from sklearn.ensemble import RandomForestClassifier from coremltools.converters import sklearn as skl_converter @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class RandomForestClassificationBostonHousingScikitNumericTest(unittest.TestCase): def _check_metrics(self, metrics, params={}): self.assertEqual( metrics["num_errors"], 0, msg="Failed case %s. 
Results %s" % (params, metrics), ) def _train_convert_evaluate_assert(self, **scikit_params): scikit_model = RandomForestClassifier(random_state=1, **scikit_params) scikit_model.fit(self.X, self.target) # Convert the model spec = skl_converter.convert(scikit_model, self.feature_names, self.output_name) if _is_macos() and _macos_version() >= (10, 13): # Get predictions df = pd.DataFrame(self.X, columns=self.feature_names) df["target"] = scikit_model.predict(self.X) # Evaluate it metrics = evaluate_classifier(spec, df, verbose=False) self._check_metrics(metrics, scikit_params) @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class RandomForestBinaryClassifierBostonHousingScikitNumericTest( RandomForestClassificationBostonHousingScikitNumericTest ): @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ # Load data and train model scikit_data = load_boston() self.X = scikit_data["data"].astype("f").astype( "d" ) ## scikit-learn downcasts data self.target = 1 * (scikit_data["target"] > scikit_data["target"].mean()) self.feature_names = scikit_data["feature_names"] self.output_name = "target" self.scikit_data = scikit_data def test_simple_binary_classifier(self): self._train_convert_evaluate_assert(max_depth=13) @pytest.mark.slow def test_binary_classifier_stress_test(self): options = dict( n_estimators=[1, 5, 10], max_depth=[1, 5, None], min_samples_split=[2, 10, 0.5], min_samples_leaf=[1, 5], min_weight_fraction_leaf=[0.0, 0.5], max_leaf_nodes=[None, 20], ) if _SKLEARN_VERSION >= Version("0.19"): options["min_impurity_decrease"] = [1e-07, 0.1] # Make a cartesian product of all options product = itertools.product(*options.values()) args = [dict(zip(options.keys(), p)) for p in product] print("Testing a total of %s cases. This could take a while" % len(args)) for it, arg in enumerate(args): self._train_convert_evaluate_assert(**arg) @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class RandomForestMultiClassClassificationBostonHousingScikitNumericTest( RandomForestClassificationBostonHousingScikitNumericTest ): @classmethod def setUpClass(self): # Load data and train model scikit_data = load_boston() self.X = scikit_data["data"].astype("f").astype( "d" ) ## scikit-learn downcasts data t = scikit_data["target"] num_classes = 3 target = np.digitize(t, np.histogram(t, bins=num_classes - 1)[1]) - 1 # Save the data and the model self.scikit_data = scikit_data self.target = target self.feature_names = scikit_data["feature_names"] self.output_name = "target" def test_simple_multiclass(self): self._train_convert_evaluate_assert() @pytest.mark.slow def test_multiclass_stress_test(self): options = dict( n_estimators=[1, 5, 10], max_depth=[1, 5, None], min_samples_split=[2, 10, 0.5], min_samples_leaf=[1, 5], min_weight_fraction_leaf=[0.0, 0.5], max_leaf_nodes=[None, 20], ) if _SKLEARN_VERSION >= Version("0.19"): options["min_impurity_decrease"] = [1e-07, 0.1] # Make a cartesian product of all options product = itertools.product(*options.values()) args = [dict(zip(options.keys(), p)) for p in product] print("Testing a total of %s cases. 
This could take a while" % len(args)) for it, arg in enumerate(args): self._train_convert_evaluate_assert(**arg) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_random_forest_regression.py0000644000000000000000000000631314672066616027417 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import unittest from ..utils import load_boston from coremltools._deps import _HAS_SKLEARN if _HAS_SKLEARN: from sklearn.ensemble import RandomForestRegressor from coremltools.converters import sklearn as skl_converter @unittest.skipIf(not _HAS_SKLEARN, "Missing scikit-learn. Skipping tests.") class RandomForestRegressorScikitTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ scikit_data = load_boston() # n_estimators default changed >= 0.22. Specify explicitly to match <0.22 behavior. scikit_model = RandomForestRegressor(random_state=1, n_estimators=10) scikit_model.fit(scikit_data["data"], scikit_data["target"]) self.scikit_model_node_count = sum(map(lambda e: e.tree_.node_count, scikit_model.estimators_)) # Save the data and the model self.scikit_data = scikit_data self.scikit_model = scikit_model def test_conversion(self): input_names = self.scikit_data["feature_names"] output_name = "target" spec = skl_converter.convert( self.scikit_model, input_names, "target" ).get_spec() self.assertIsNotNone(spec) # Test the model class self.assertIsNotNone(spec.description) # Test the interface class self.assertEqual(spec.description.predictedFeatureName, "target") # Test the inputs and outputs self.assertEqual(len(spec.description.output), 1) self.assertEqual(spec.description.output[0].name, "target") self.assertEqual( spec.description.output[0].type.WhichOneof("Type"), "doubleType" ) for input_type in spec.description.input: self.assertEqual(input_type.type.WhichOneof("Type"), "doubleType") self.assertEqual( sorted(input_names), sorted(map(lambda x: x.name, spec.description.input)) ) # Test the linear regression parameters. self.assertEqual(len(spec.pipelineRegressor.pipeline.models), 2) tr = spec.pipelineRegressor.pipeline.models[ -1 ].treeEnsembleRegressor.treeEnsemble self.assertIsNotNone(tr) self.assertEqual(len(tr.nodes), self.scikit_model_node_count) def test_conversion_bad_inputs(self): # Error on converting an untrained model with self.assertRaises(Exception): # n_estimators default changed >= 0.22. Specify explicitly to match <0.22 behavior. model = RandomForestRegressor(n_estimators=10) spec = skl_converter.convert(model, "data", "out") # Check the expected class during conversion. from sklearn.preprocessing import OneHotEncoder with self.assertRaises(Exception): model = OneHotEncoder() spec = skl_converter.convert(model, "data", "out") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_random_forest_regression_numeric.py0000644000000000000000000000715714672066616031150 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import unittest import pandas as pd import pytest from ..utils import load_boston from coremltools._deps import _HAS_SKLEARN from coremltools.models.utils import (_is_macos, _macos_version, evaluate_regressor) if _HAS_SKLEARN: from sklearn.ensemble import RandomForestRegressor from coremltools.converters import sklearn as skl_converter @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class RandomForestRegressorBostonHousingScikitNumericTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter and running both models """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ # Load data and train model scikit_data = load_boston() self.scikit_data = scikit_data self.X = scikit_data["data"].astype("f").astype( "d" ) ## scikit-learn downcasts data self.target = scikit_data["target"] self.feature_names = scikit_data["feature_names"] self.output_name = "target" def _check_metrics(self, metrics, params={}): """ Check the metrics """ self.assertAlmostEqual( metrics["rmse"], 0.0, delta=1e-5, msg="Failed case %s. Results %s" % (params, metrics), ) self.assertAlmostEqual( metrics["max_error"], 0.0, delta=1e-5, msg="Failed case %s. Results %s" % (params, metrics), ) def _train_convert_evaluate_assert(self, **scikit_params): """ Train a scikit-learn model, convert it and then evaluate it with CoreML """ scikit_model = RandomForestRegressor(random_state=1, **scikit_params) scikit_model.fit(self.X, self.target) # Convert the model spec = skl_converter.convert(scikit_model, self.feature_names, self.output_name) if _is_macos() and _macos_version() >= (10, 13): # Get predictions df = pd.DataFrame(self.X, columns=self.feature_names) df["target"] = scikit_model.predict(self.X) # Evaluate it metrics = evaluate_regressor(spec, df, verbose=False) self._check_metrics(metrics, scikit_params) def test_boston_housing_simple_regression(self): self._train_convert_evaluate_assert() def test_boston_housing_float_double_corner_case(self): self._train_convert_evaluate_assert(max_depth=13) @pytest.mark.slow def test_boston_housing_parameter_stress_test(self): ## These are all the options in decision tree regression of scikit-learn options = dict( criterion=["friedman_mse"], n_estimators=[1, 5, 10], max_depth=[1, 5], min_samples_split=[2, 10, 0.5], min_samples_leaf=[1, 5], min_weight_fraction_leaf=[0.0, 0.5], max_leaf_nodes=[None, 20], min_impurity_decrease=[1e-07, 0.1, 0.0], ) # Make a cartesian product of all options product = itertools.product(*options.values()) args = [dict(zip(options.keys(), p)) for p in product] print("Testing a total of %s cases. This could take a while" % len(args)) for it, arg in enumerate(args): self._train_convert_evaluate_assert(**arg) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_ridge_regression.py0000644000000000000000000000745314672066616025655 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import unittest import pandas as pd from ..utils import load_boston from coremltools._deps import _HAS_SKLEARN from coremltools.models.utils import (_is_macos, _macos_version, evaluate_regressor) if _HAS_SKLEARN: from sklearn.linear_model import Ridge from sklearn.preprocessing import OneHotEncoder from coremltools.converters.sklearn import convert @unittest.skipIf(not _HAS_SKLEARN, "Missing scikitlearn. Skipping tests.") class RidgeRegressionScikitTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ scikit_data = load_boston() scikit_model = Ridge() scikit_model.fit(scikit_data["data"], scikit_data["target"]) # Save the data and the model self.scikit_data = scikit_data self.scikit_model = scikit_model def test_conversion(self): input_names = self.scikit_data["feature_names"] spec = convert(self.scikit_model, input_names, "target").get_spec() self.assertIsNotNone(spec) # Test the model class self.assertIsNotNone(spec.description) # Test the interface class self.assertEqual(spec.description.predictedFeatureName, "target") # Test the inputs and outputs self.assertEqual(len(spec.description.output), 1) self.assertEqual(spec.description.output[0].name, "target") self.assertEqual( spec.description.output[0].type.WhichOneof("Type"), "doubleType" ) for input_type in spec.description.input: self.assertEqual(input_type.type.WhichOneof("Type"), "doubleType") self.assertEqual( sorted(input_names), sorted(map(lambda x: x.name, spec.description.input)) ) # Test the ridge regression parameters. self.assertTrue( spec.pipelineRegressor.pipeline.models[-1].HasField("glmRegressor") ) lr = spec.pipelineRegressor.pipeline.models[-1].glmRegressor self.assertEqual(lr.offset, self.scikit_model.intercept_) self.assertEqual(len(lr.weights), 1) self.assertEqual(len(lr.weights[0].value), 13) i = 0 for w in lr.weights[0].value: self.assertAlmostEqual(w, self.scikit_model.coef_[i]) i = i + 1 def test_conversion_bad_inputs(self): # Error on converting an untrained model with self.assertRaises(TypeError): model = Ridge() spec = convert(model, "data", "out") # Check the expected class during conversion. with self.assertRaises(TypeError): model = OneHotEncoder() spec = convert(model, "data", "out") @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) def test_ridge_regression_evaluation(self): """ Check that the evaluation results are the same in scikit learn and coremltools """ input_names = self.scikit_data["feature_names"] df = pd.DataFrame(self.scikit_data["data"], columns=input_names) for normalize_value in (True, False): cur_model = Ridge() cur_model.fit(self.scikit_data["data"], self.scikit_data["target"]) spec = convert(cur_model, input_names, "target") df["target"] = cur_model.predict(self.scikit_data["data"]) metrics = evaluate_regressor(spec, df) self.assertAlmostEqual(metrics["max_error"], 0) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_standard_scalar.py0000644000000000000000000000364314672066616025445 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import unittest import numpy as np from ..utils import load_boston from coremltools._deps import _HAS_SKLEARN from coremltools.models.utils import (_is_macos, _macos_version, evaluate_transformer) if _HAS_SKLEARN: from sklearn.preprocessing import StandardScaler from coremltools.converters import sklearn as converter @unittest.skipUnless( _is_macos() and _macos_version() >= (10, 13), "Only supported on macOS 10.13+" ) @unittest.skipIf(not _HAS_SKLEARN, "Missing scikit-learn. Skipping tests.") class StandardScalerTestCase(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ def test_random(self): # Generate some random data X = np.random.random(size=(50, 3)) cur_model = StandardScaler() output = cur_model.fit_transform(X) spec = converter.convert(cur_model, ["a", "b", "c"], "out").get_spec() metrics = evaluate_transformer( spec, [dict(zip(["a", "b", "c"], row)) for row in X], [{"out": row} for row in output], ) assert metrics["num_errors"] == 0 def test_boston(self): scikit_data = load_boston() scikit_model = StandardScaler().fit(scikit_data["data"]) spec = converter.convert( scikit_model, scikit_data["feature_names"], "out" ).get_spec() input_data = [ dict(zip(scikit_data["feature_names"], row)) for row in scikit_data["data"] ] output_data = [{"out": row} for row in scikit_model.transform(scikit_data["data"])] metrics = evaluate_transformer(spec, input_data, output_data) assert metrics["num_errors"] == 0 ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/sklearn_tests/test_utils.py0000644000000000000000000000351014672066616023451 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import unittest from ..utils import load_boston from coremltools._deps import _HAS_SKLEARN from coremltools.models import MLModel from coremltools.models.utils import _is_macos, _macos_version, rename_feature if _HAS_SKLEARN: from sklearn.linear_model import LinearRegression from coremltools.converters import sklearn as converter @unittest.skipIf(not _HAS_SKLEARN, "Missing scikit-learn. 
Skipping tests.") class PipeLineRenameTests(unittest.TestCase): @classmethod def setUpClass(self): scikit_data = load_boston() feature_names = scikit_data["feature_names"] scikit_model = LinearRegression() scikit_model.fit(scikit_data["data"], scikit_data["target"]) # Save the data and the model self.scikit_data = scikit_data self.scikit_model = scikit_model @unittest.skip("rdar://71638164") def test_pipeline_rename(self): # Convert scikit_spec = converter.convert(self.scikit_model).get_spec() model = MLModel(scikit_spec) sample_data = self.scikit_data.data[0] # Rename rename_feature(scikit_spec, "input", "renamed_input") renamed_model = MLModel(scikit_spec) # Check the predictions if _is_macos() and _macos_version() >= (10, 13): out_dict = model.predict({"input": sample_data}) out_dict_renamed = renamed_model.predict({"renamed_input": sample_data}) self.assertAlmostEqual(list(out_dict.keys()), list(out_dict_renamed.keys())) self.assertAlmostEqual( list(out_dict.values()), list(out_dict_renamed.values()) ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/utils.py0000644000000000000000000000200114672066616017523 0ustar00rootroot# Copyright (c) 2024, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import os.path import numpy as np import pandas as pd import requests def load_boston(): DATA_URL = "http://lib.stat.cmu.edu/datasets/boston" LOCAL_FILE = "/tmp/bostonHousingData.txt" if not os.path.isfile(LOCAL_FILE): r = requests.get(DATA_URL) with open(LOCAL_FILE, 'w') as f: f.write(r.text) raw_df = pd.read_csv(LOCAL_FILE, sep="\s+", skiprows=22, header=None) data = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]]) data = np.array(data, order='C') target = raw_df.values[1::2, 2] feature_names = np.array(['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT'], dtype='<U7') return {"data": data, "target": target, "feature_names": feature_names} ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/xgboost_tests/test_boosted_trees_classifier.py # Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import json import tempfile import unittest import numpy as np from ..utils import load_boston from coremltools._deps import _HAS_SKLEARN, _HAS_XGBOOST if _HAS_SKLEARN: from sklearn.ensemble import GradientBoostingClassifier from coremltools.converters import sklearn as skl_converter if _HAS_XGBOOST: import xgboost from coremltools.converters import xgboost as xgb_converter @unittest.skipIf(not _HAS_SKLEARN, "Missing scikit-learn. Skipping tests.") class GradientBoostingBinaryClassifierScikitTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ scikit_data = load_boston() scikit_model = GradientBoostingClassifier(random_state=1) target = scikit_data["target"] > scikit_data["target"].mean() scikit_model.fit(scikit_data["data"], target) s = 0 for est in scikit_model.estimators_: for e in est: s = s + e.tree_.node_count self.scikit_model_node_count = s # Save the data and the model self.scikit_data = scikit_data self.scikit_model = scikit_model def test_conversion(self): input_names = self.scikit_data["feature_names"] output_name = "target" spec = skl_converter.convert( self.scikit_model, input_names, "target" ).get_spec() self.assertIsNotNone(spec) # Test the model class self.assertIsNotNone(spec.description) self.assertIsNotNone(spec.treeEnsembleClassifier) # Test the interface class self.assertEqual(spec.description.predictedFeatureName, "target") # Test the inputs and outputs self.assertEqual(len(spec.description.output), 2) self.assertEqual(spec.description.output[0].name, "target") self.assertEqual( spec.description.output[0].type.WhichOneof("Type"), "int64Type" ) for input_type in spec.description.input: self.assertEqual(input_type.type.WhichOneof("Type"), "doubleType") self.assertEqual( sorted(input_names), sorted(map(lambda x: x.name, spec.description.input)) ) # Test the tree ensemble parameters.
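# Descriptive note: the scikit-learn converter emits tree classifiers as a small pipeline (a feature vectorizer followed by the tree ensemble model), so the ensemble parameters are read from the pipeline model below rather than from the top-level spec.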
tr = spec.pipelineClassifier.pipeline.models[ 1 ].treeEnsembleClassifier.treeEnsemble self.assertIsNotNone(tr) self.assertEqual(len(tr.nodes), self.scikit_model_node_count) def test_conversion_bad_inputs(self): # Error on converting an untrained model with self.assertRaises(Exception): model = GradientBoostingClassifier() spec = skl_converter.convert(model, "data", "out") # Check the expected class during conversion. from sklearn.preprocessing import OneHotEncoder with self.assertRaises(Exception): model = OneHotEncoder() spec = skl_converter.convert(model, "data", "out") class GradientBoostingMulticlassClassifierScikitTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ scikit_data = load_boston() scikit_model = GradientBoostingClassifier(random_state=1) t = scikit_data["target"] target = np.digitize(t, np.histogram(t)[1]) - 1 scikit_model.fit(scikit_data["data"], target) self.target = target s = 0 for est in scikit_model.estimators_: for e in est: s = s + e.tree_.node_count self.scikit_model_node_count = s # Save the data and the model self.scikit_data = scikit_data self.scikit_model = scikit_model def test_conversion(self): input_names = self.scikit_data["feature_names"] output_name = "target" spec = skl_converter.convert( self.scikit_model, input_names, "target" ).get_spec() self.assertIsNotNone(spec) # Test the model class self.assertIsNotNone(spec.description) self.assertEqual(spec.description.predictedFeatureName, "target") # Test the inputs and outputs self.assertEqual(len(spec.description.output), 2) self.assertEqual(spec.description.output[0].name, "target") self.assertEqual( spec.description.output[0].type.WhichOneof("Type"), "int64Type" ) for input_type in spec.description.input: self.assertEqual(input_type.type.WhichOneof("Type"), "doubleType") self.assertEqual( sorted(input_names), sorted(map(lambda x: x.name, spec.description.input)) ) self.assertEqual(len(spec.pipelineClassifier.pipeline.models), 2) tr = spec.pipelineClassifier.pipeline.models[ -1 ].treeEnsembleClassifier.treeEnsemble self.assertIsNotNone(tr) self.assertEqual(len(tr.nodes), self.scikit_model_node_count) def test_conversion_bad_inputs(self): # Error on converting an untrained model with self.assertRaises(Exception): model = GradientBoostingClassifier() spec = skl_converter.convert(model, "data", "out") # Check the expected class during conversion. from sklearn.preprocessing import OneHotEncoder with self.assertRaises(Exception): model = OneHotEncoder() spec = skl_converter.convert(model, "data", "out") @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") @unittest.skipIf(not _HAS_XGBOOST, "Skipping, no xgboost") class GradientBoostingBinaryClassifierXGboostTest(unittest.TestCase): """ Unit test class for testing xgboost converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. 
""" scikit_data = load_boston() self.xgb_model = xgboost.XGBClassifier() target = scikit_data["target"] > scikit_data["target"].mean() self.xgb_model.fit(scikit_data["data"], target) # Save the data and the model self.scikit_data = scikit_data def test_conversion(self): input_names = self.scikit_data["feature_names"] output_name = "target" spec = xgb_converter.convert( self.xgb_model, input_names, output_name, mode="classifier" ).get_spec() self.assertIsNotNone(spec) # Test the model class self.assertIsNotNone(spec.description) self.assertIsNotNone(spec.treeEnsembleClassifier) # Test the interface class self.assertEqual(spec.description.predictedFeatureName, output_name) # Test the inputs and outputs self.assertEqual(len(spec.description.output), 2) self.assertEqual(spec.description.output[0].name, output_name) self.assertEqual( spec.description.output[0].type.WhichOneof("Type"), "int64Type" ) for input_type in spec.description.input: self.assertEqual(input_type.type.WhichOneof("Type"), "doubleType") self.assertEqual( sorted(input_names), sorted(map(lambda x: x.name, spec.description.input)) ) # Test the linear regression parameters. tr = spec.treeEnsembleClassifier.treeEnsemble self.assertIsNotNone(tr) def test_conversion_bad_inputs(self): # Error on converting an untrained model with self.assertRaises(Exception): model = xgboost.XGBClassifier() spec = xgb_converter.convert(model, "data", "out", mode="classifier") # Check the expected class during conversion. with self.assertRaises(Exception): model = xgboost.XGBRegressor() spec = xgb_converter.convert(model, "data", "out", mode="classifier") @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") @unittest.skipIf(not _HAS_XGBOOST, "Skipping, no xgboost") class GradientBoostingMulticlassClassifierXGboostTest(unittest.TestCase): """ Unit test class for testing xgboost converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. 
""" scikit_data = load_boston() t = scikit_data["target"] target = np.digitize(t, np.histogram(t)[1]) - 1 dtrain = xgboost.DMatrix( scikit_data["data"], label=target, feature_names=scikit_data["feature_names"] ) self.xgb_model = xgboost.train({}, dtrain) self.target = target # Save the data and the model self.scikit_data = scikit_data self.n_classes = len(np.unique(self.target)) # train a booster with special characters in feature names x = scikit_data['data'] # prepare feature names with special chars self.feature_names_special_chars = [f'\t"{i}"\n' for i in range(x.shape[1])] # create training dmatrix dm = xgboost.DMatrix(x, label=target, feature_names=self.feature_names_special_chars) # train booster self.xgb_model_special_chars = xgboost.train({}, dm) # create XGBClassifier from a copy of trainer booster self.xgb_classifier_special_chars = \ xgboost.XGBClassifier(xgb_model=self.xgb_model_special_chars.copy()) self.xgb_classifier_special_chars.fit(x, target) def test_conversion(self): input_names = self.scikit_data["feature_names"] output_name = "target" spec = xgb_converter.convert( self.xgb_model, input_names, output_name, mode="classifier", n_classes=self.n_classes, ).get_spec() self.assertIsNotNone(spec) # Test the model class self.assertIsNotNone(spec.description) self.assertEqual(spec.description.predictedFeatureName, output_name) # Test the inputs and outputs self.assertEqual(len(spec.description.output), 2) self.assertEqual(spec.description.output[0].name, output_name) self.assertEqual( spec.description.output[0].type.WhichOneof("Type"), "int64Type" ) for input_type in spec.description.input: self.assertEqual(input_type.type.WhichOneof("Type"), "doubleType") self.assertEqual( sorted(input_names), sorted(map(lambda x: x.name, spec.description.input)) ) # Test the linear regression parameters. tr = spec.treeEnsembleClassifier.treeEnsemble self.assertIsNotNone(tr) def test_conversion_from_file(self): import numpy as np output_name = "target" feature_names = self.scikit_data["feature_names"] xgb_model_json = tempfile.mktemp("xgb_tree_model_classifier.json") xgb_json_out = self.xgb_model.get_dump(with_stats=True, dump_format="json") with open(xgb_model_json, "w") as f: json.dump(xgb_json_out, f) spec = xgb_converter.convert( xgb_model_json, feature_names, output_name, mode="classifier", n_classes=self.n_classes, ).get_spec() self.assertIsNotNone(spec) # Test the model class self.assertIsNotNone(spec.description) self.assertIsNotNone(spec.treeEnsembleRegressor) # Test the interface class self.assertEqual(spec.description.predictedFeatureName, output_name) # Test the inputs and outputs self.assertEqual(len(spec.description.output), 2) self.assertEqual(spec.description.output[0].name, output_name) self.assertEqual( spec.description.output[0].type.WhichOneof("Type"), "int64Type" ) for input_type in spec.description.input: self.assertEqual(input_type.type.WhichOneof("Type"), "doubleType") self.assertEqual( sorted(self.scikit_data["feature_names"]), sorted(map(lambda x: x.name, spec.description.input)), ) # Test the linear regression parameters. 
tr = spec.treeEnsembleClassifier.treeEnsemble self.assertIsNotNone(tr) def test_conversion_special_characters_in_feature_names(self): # this test should fail if conversion function does not implement the # special characters in feature names fix # test both sklearn wrapper and raw booster for model in [self.xgb_model_special_chars, self.xgb_classifier_special_chars]: # process as usual output_name = "target" spec = xgb_converter.convert( model, self.feature_names_special_chars, output_name, mode="classifier", n_classes=self.n_classes, ).get_spec() self.assertIsNotNone(spec) # Test the model class self.assertIsNotNone(spec.description) self.assertEqual(spec.description.predictedFeatureName, output_name) # Test the inputs and outputs self.assertEqual(len(spec.description.output), 2) self.assertEqual(spec.description.output[0].name, output_name) self.assertEqual( spec.description.output[0].type.WhichOneof("Type"), "int64Type" ) for input_type in spec.description.input: self.assertEqual(input_type.type.WhichOneof("Type"), "doubleType") self.assertEqual( sorted(self.feature_names_special_chars), sorted(map(lambda x: x.name, spec.description.input)) ) # Test the linear regression parameters. tr = spec.treeEnsembleClassifier.treeEnsemble self.assertIsNotNone(tr) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/xgboost_tests/test_boosted_trees_classifier_numeric.py0000644000000000000000000002303214672066616031127 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import unittest import numpy as np import pandas as pd import pytest from ..utils import load_boston from coremltools._deps import _HAS_SKLEARN, _HAS_XGBOOST from coremltools.models.utils import (_is_macos, _macos_version, evaluate_classifier, evaluate_classifier_with_probabilities) if _HAS_SKLEARN: from sklearn.ensemble import GradientBoostingClassifier from coremltools.converters import sklearn as skl_converter if _HAS_XGBOOST: import xgboost from coremltools.converters import xgboost as xgb_converter @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class BoostedTreeClassificationBostonHousingScikitNumericTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter and running both models """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ # Load data and train model scikit_data = load_boston() self.scikit_data = scikit_data self.X = scikit_data["data"].astype("f").astype( "d" ) ## scikit-learn downcasts data self.target = 1 * (scikit_data["target"] > scikit_data["target"].mean()) self.feature_names = scikit_data["feature_names"] self.output_name = "target" def _check_metrics(self, metrics, params={}): self.assertEqual( metrics["num_errors"], 0, msg="Failed case %s. 
Results %s" % (params, metrics), ) def _train_convert_evaluate_assert(self, **scikit_params): """ Train a scikit-learn model, convert it and then evaluate it with CoreML """ scikit_model = GradientBoostingClassifier(random_state=1, **scikit_params) scikit_model.fit(self.X, self.target) # Convert the model spec = skl_converter.convert(scikit_model, self.feature_names, self.output_name) if hasattr(scikit_model, '_init_decision_function') and scikit_model.n_classes_ > 2: # fix initial default prediction for multiclass classification # https://github.com/scikit-learn/scikit-learn/pull/12983 assert hasattr(scikit_model, 'init_') assert hasattr(scikit_model.init_, 'priors') scikit_model.init_.priors = np.log(scikit_model.init_.priors) if _is_macos() and _macos_version() >= (10, 13): # Get predictions df = pd.DataFrame(self.X, columns=self.feature_names) df["target"] = scikit_model.predict(self.X) # Evaluate it metrics = evaluate_classifier(spec, df) self._check_metrics(metrics) @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class BoostedTreeBinaryClassificationBostonHousingScikitNumericTest( BoostedTreeClassificationBostonHousingScikitNumericTest ): def test_simple_binary_classifier(self): self._train_convert_evaluate_assert() @pytest.mark.slow def test_binary_classifier_stress_test(self): options = dict( max_depth=[1, 10, None], min_samples_split=[2, 0.5], min_samples_leaf=[1, 5], min_weight_fraction_leaf=[0.0, 0.5], max_features=[None, 1], max_leaf_nodes=[None, 20], ) # Make a cartesian product of all options product = itertools.product(*options.values()) args = [dict(zip(options.keys(), p)) for p in product] print("Testing a total of %s cases. This could take a while" % len(args)) for it, arg in enumerate(args): self._train_convert_evaluate_assert(**arg) @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class BoostedTreeMultiClassClassificationBostonHousingScikitNumericTest( BoostedTreeClassificationBostonHousingScikitNumericTest ): @classmethod def setUpClass(self): # Load data and train model scikit_data = load_boston() num_classes = 3 self.X = scikit_data["data"].astype("f").astype( "d" ) ## scikit-learn downcasts data t = scikit_data["target"] target = np.digitize(t, np.histogram(t, bins=num_classes - 1)[1]) - 1 # Save the data and the model self.scikit_data = scikit_data self.target = target self.feature_names = scikit_data["feature_names"] self.output_name = "target" def test_simple_multiclass(self): self._train_convert_evaluate_assert() @pytest.mark.slow def test_multiclass_stress_test(self): options = dict( max_depth=[1, 10, None], min_samples_split=[2, 0.5], min_samples_leaf=[1, 5], min_weight_fraction_leaf=[0.0, 0.5], max_features=[None, 1], max_leaf_nodes=[None, 20], ) # Make a cartesian product of all options product = itertools.product(*options.values()) args = [dict(zip(options.keys(), p)) for p in product] print("Testing a total of %s cases. This could take a while" % len(args)) for it, arg in enumerate(args): self._train_convert_evaluate_assert(**arg) @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") @unittest.skipIf(not _HAS_XGBOOST, "Skipping, no xgboost") class BoostedTreeClassificationBostonHousingXGboostNumericTest(unittest.TestCase): """ Unit test class for testing xgboost converter and running both models """ def _check_metrics(self, metrics, params={}): self.assertEqual( metrics["num_errors"], 0, msg="Failed case %s. 
Results %s" % (params, metrics), ) def _train_convert_evaluate_assert(self, **xgboost_params): """ Train a scikit-learn model, convert it and then evaluate it with CoreML """ xgb_model = xgboost.XGBClassifier(**xgboost_params) xgb_model.fit(self.X, self.target) # Convert the model spec = xgb_converter.convert( xgb_model, self.feature_names, self.output_name, mode="classifier" ) if _is_macos() and _macos_version() >= (10, 13): # Get predictions df = pd.DataFrame(self.X, columns=self.feature_names) probabilities = xgb_model.predict_proba(self.X) df["classProbability"] = [ dict(zip(xgb_model.classes_, cur_vals)) for cur_vals in probabilities ] metrics = evaluate_classifier_with_probabilities( spec, df, probabilities="classProbability", verbose=False ) self.assertEqual(metrics["num_key_mismatch"], 0) self.assertLess(metrics["max_probability_error"], 1e-3) def _classifier_stress_test(self): options = dict( max_depth=[1, 10], min_child_weight=[2, 0.5], max_delta_step=[1, 5], ) # Make a cartesian product of all options product = itertools.product(*options.values()) args = [dict(zip(options.keys(), p)) for p in product] print("Testing a total of %s cases. This could take a while" % len(args)) for it, arg in enumerate(args): self._train_convert_evaluate_assert(**arg) @unittest.skipIf(_macos_version() >= (10, 16), "rdar://problem/84898245") @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") @unittest.skipIf(not _HAS_XGBOOST, "Skipping, no xgboost") class BoostedTreeBinaryClassificationBostonHousingXGboostNumericTest( BoostedTreeClassificationBostonHousingXGboostNumericTest ): @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ # Load data and train model scikit_data = load_boston() self.scikit_data = scikit_data self.X = scikit_data["data"].astype("f").astype( "d" ) ## scikit-learn downcasts data self.target = 1 * (scikit_data["target"] > scikit_data["target"].mean()) self.feature_names = scikit_data["feature_names"] self.output_name = "target" def test_simple_binary_classifier(self): self._train_convert_evaluate_assert() @pytest.mark.slow def test_binary_classifier_stress_test(self): self._classifier_stress_test() @unittest.skipIf(_macos_version() >= (12, 0), "rdar://problem/84898245") @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") @unittest.skipIf(not _HAS_XGBOOST, "Skipping, no xgboost") class BoostedTreeMultiClassClassificationBostonHousingXGboostNumericTest( BoostedTreeClassificationBostonHousingXGboostNumericTest ): @classmethod def setUpClass(self): scikit_data = load_boston() num_classes = 3 self.X = scikit_data["data"].astype("f").astype( "d" ) ## scikit-learn downcasts data t = scikit_data["target"] target = np.digitize(t, np.histogram(t, bins=num_classes - 1)[1]) - 1 # Save the data and the model self.scikit_data = scikit_data self.target = target self.feature_names = scikit_data["feature_names"] self.output_name = "target" def test_simple_multiclass(self): self._train_convert_evaluate_assert() @pytest.mark.slow def test_multiclass_stress_test(self): self._classifier_stress_test() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/xgboost_tests/test_boosted_trees_regression.py0000644000000000000000000002402714672066616027446 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. 
# # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import json import tempfile import unittest from ..utils import load_boston from coremltools._deps import _HAS_SKLEARN, _HAS_XGBOOST from coremltools.models.utils import _macos_version if _HAS_XGBOOST: import xgboost from coremltools.converters import xgboost as xgb_converter if _HAS_SKLEARN: from sklearn.ensemble import GradientBoostingRegressor from sklearn.preprocessing import OneHotEncoder from coremltools.converters import sklearn as skl_converter @unittest.skipIf(not _HAS_SKLEARN, "Missing scikit-learn. Skipping tests.") class GradientBoostingRegressorScikitTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ @classmethod def setUpClass(cls): """ Set up the unit test by loading the dataset and training a model. """ scikit_data = load_boston() scikit_model = GradientBoostingRegressor(random_state=1) scikit_model.fit(scikit_data["data"], scikit_data["target"]) s = 0 for est in scikit_model.estimators_: for e in est: s = s + e.tree_.node_count cls.scikit_model_node_count = s # Save the data and the model cls.scikit_data = scikit_data cls.scikit_model = scikit_model def test_conversion(self): input_names = self.scikit_data["feature_names"] output_name = "target" spec = skl_converter.convert( self.scikit_model, input_names, "target" ).get_spec() self.assertIsNotNone(spec) # Test the model class self.assertIsNotNone(spec.description) # Test the interface class self.assertEqual(spec.description.predictedFeatureName, "target") # Test the inputs and outputs self.assertEqual(len(spec.description.output), 1) self.assertEqual(spec.description.output[0].name, "target") self.assertEqual( spec.description.output[0].type.WhichOneof("Type"), "doubleType" ) for input_type in spec.description.input: self.assertEqual(input_type.type.WhichOneof("Type"), "doubleType") self.assertEqual( sorted(input_names), sorted(map(lambda x: x.name, spec.description.input)) ) tr = spec.pipelineRegressor.pipeline.models[ -1 ].treeEnsembleRegressor.treeEnsemble self.assertIsNotNone(tr) self.assertEqual(len(tr.nodes), self.scikit_model_node_count) def test_conversion_bad_inputs(self): # Error on converting an untrained model with self.assertRaises(Exception): model = GradientBoostingRegressor() spec = skl_converter.convert(model, "data", "out") # Check the expected class during conversion. with self.assertRaises(Exception): model = OneHotEncoder() spec = skl_converter.convert(model, "data", "out") @unittest.skipIf(_macos_version() >= (10, 16), "rdar://problem/84898245") @unittest.skipIf(not _HAS_SKLEARN, "Missing scikit-learn. Skipping tests.") @unittest.skipIf(not _HAS_XGBOOST, "Skipping, no xgboost") class BoostedTreeRegressorXGboostTest(unittest.TestCase): @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. 
""" if not _HAS_XGBOOST: return if not _HAS_SKLEARN: return scikit_data = load_boston() dtrain = xgboost.DMatrix( scikit_data["data"], label=scikit_data["target"], feature_names=scikit_data["feature_names"], ) xgb_model = xgboost.train({}, dtrain, 1) # Save the data and the model self.scikit_data = scikit_data self.xgb_model = xgb_model self.feature_names = self.scikit_data["feature_names"] # train a booster with special characters in feature names x = scikit_data['data'] # prepare feature names with special chars self.feature_names_special_chars = [f'\t"{i}"\n' for i in range(x.shape[1])] # create training dmatrix dm = xgboost.DMatrix(x, label=scikit_data["target"], feature_names=self.feature_names_special_chars) # train booster self.xgb_model_special_chars = xgboost.train({}, dm, 1) # create XGBClassifier from a copy of trainer booster self.xgb_regressor_special_chars = xgboost.XGBRegressor(xgb_model=self.xgb_model_special_chars.copy(), n_estimators=1) self.xgb_regressor_special_chars.fit(x, scikit_data["target"]) def test_conversion(self): feature_names = self.scikit_data["feature_names"] output_name = "target" spec = xgb_converter.convert(self.xgb_model, feature_names, "target").get_spec() self.assertIsNotNone(spec) # Test the model class self.assertIsNotNone(spec.description) self.assertIsNotNone(spec.treeEnsembleRegressor) # Test the interface class self.assertEqual(spec.description.predictedFeatureName, "target") # Test the inputs and outputs self.assertEqual(len(spec.description.output), 1) self.assertEqual(spec.description.output[0].name, "target") self.assertEqual( spec.description.output[0].type.WhichOneof("Type"), "doubleType" ) for input_type in spec.description.input: self.assertEqual(input_type.type.WhichOneof("Type"), "doubleType") self.assertEqual( sorted(self.feature_names), sorted(map(lambda x: x.name, spec.description.input)), ) # Test the linear regression parameters. tr = spec.treeEnsembleRegressor.treeEnsemble self.assertIsNotNone(tr) self.assertEqual(len(tr.nodes), 23) def test_conversion_from_file(self): output_name = "target" feature_names = self.feature_names xgb_model_json = tempfile.mktemp("tree_model.json") xgb_json_out = self.xgb_model.get_dump(dump_format="json") with open(xgb_model_json, "w") as f: json.dump(xgb_json_out, f) spec = xgb_converter.convert(xgb_model_json, feature_names, "target").get_spec() self.assertIsNotNone(spec) # Test the model class self.assertIsNotNone(spec.description) self.assertIsNotNone(spec.treeEnsembleRegressor) # Test the interface class self.assertEqual(spec.description.predictedFeatureName, "target") # Test the inputs and outputs self.assertEqual(len(spec.description.output), 1) self.assertEqual(spec.description.output[0].name, "target") self.assertEqual( spec.description.output[0].type.WhichOneof("Type"), "doubleType" ) for input_type in spec.description.input: self.assertEqual(input_type.type.WhichOneof("Type"), "doubleType") self.assertEqual( sorted(self.feature_names), sorted(map(lambda x: x.name, spec.description.input)), ) # Test the linear regression parameters. 
tr = spec.treeEnsembleRegressor.treeEnsemble self.assertIsNotNone(tr) self.assertEqual(len(tr.nodes), 23) def test_unsupported_conversion(self): feature_names = self.scikit_data["feature_names"] output_name = "target" xgb_model = xgboost.XGBRegressor(objective="reg:gamma") xgb_model.fit(self.scikit_data["data"], self.scikit_data["target"]) with self.assertRaises(ValueError): spec = xgb_converter.convert(xgb_model, feature_names, "target") xgb_model = xgboost.XGBRegressor(objective="reg:tweedie") xgb_model.fit(self.scikit_data["data"], self.scikit_data["target"]) with self.assertRaises(ValueError): spec = xgb_converter.convert(xgb_model, feature_names, "target") def test_conversion_bad_inputs(self): # Error on converting an untrained model with self.assertRaises(TypeError): model = GradientBoostingRegressor() spec = xgb_converter.convert(model, "data", "out") # Check the expected class during conversion with self.assertRaises(TypeError): model = OneHotEncoder() spec = xgb_converter.convert(model, "data", "out") def test_conversion_special_characters_in_feature_names(self): # this test should fail if conversion function does not implement the # special characters in feature names fix # test both sklearn wrapper and raw booster for model in [self.xgb_model_special_chars, self.xgb_regressor_special_chars]: # process as usual output_name = "target" spec = xgb_converter.convert( model, self.feature_names_special_chars, "target")\ .get_spec() self.assertIsNotNone(spec) # Test the model class self.assertIsNotNone(spec.description) self.assertIsNotNone(spec.treeEnsembleRegressor) # Test the interface class self.assertEqual(spec.description.predictedFeatureName, "target") # Test the inputs and outputs self.assertEqual(len(spec.description.output), 1) self.assertEqual(spec.description.output[0].name, "target") self.assertEqual( spec.description.output[0].type.WhichOneof("Type"), "doubleType" ) for input_type in spec.description.input: self.assertEqual(input_type.type.WhichOneof("Type"), "doubleType") self.assertEqual( sorted(self.feature_names_special_chars), sorted(map(lambda x: x.name, spec.description.input)), ) # Test the linear regression parameters. tr = spec.treeEnsembleRegressor.treeEnsemble self.assertIsNotNone(tr) self.assertEqual(len(tr.nodes), 23) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/xgboost_tests/test_boosted_trees_regression_numeric.py0000644000000000000000000002540314672066616031167 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import unittest import pandas as pd import pytest from ..utils import load_boston from coremltools._deps import _HAS_SKLEARN, _HAS_XGBOOST from coremltools.models.utils import (_is_macos, _macos_version, evaluate_regressor) if _HAS_XGBOOST: import xgboost from coremltools.converters import xgboost as xgb_converter if _HAS_SKLEARN: from sklearn.ensemble import GradientBoostingRegressor from coremltools.converters import sklearn as skl_converter @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. 
Skipping tests.") class GradientBoostingRegressorBostonHousingScikitNumericTest(unittest.TestCase): @classmethod def setUpClass(self): # Load data and train model scikit_data = load_boston() self.scikit_data = scikit_data self.X = scikit_data["data"] self.target = scikit_data["target"] self.feature_names = scikit_data["feature_names"] self.output_name = "target" def _check_metrics(self, metrics, params={}): self.assertAlmostEqual( metrics["rmse"], 0, delta=1e-5, msg="Failed case %s. Results %s" % (params, metrics), ) self.assertAlmostEqual( metrics["max_error"], 0, delta=1e-5, msg="Failed case %s. Results %s" % (params, metrics), ) def _train_convert_evaluate_assert(self, **scikit_params): scikit_model = GradientBoostingRegressor(random_state=1, **scikit_params) scikit_model.fit(self.X, self.target) # Convert the model spec = skl_converter.convert(scikit_model, self.feature_names, self.output_name) if _is_macos() and _macos_version() >= (10, 13): # Get predictions df = pd.DataFrame(self.X, columns=self.feature_names) df["target"] = scikit_model.predict(self.X) # Evaluate it metrics = evaluate_regressor(spec, df, "target", verbose=False) self._check_metrics(metrics, scikit_params) def test_boston_housing_simple_regression(self): self._train_convert_evaluate_assert() @pytest.mark.slow def test_boston_housing_parameter_stress_test(self): options = dict( max_depth=[1, 10, None], min_samples_split=[2, 0.5], min_samples_leaf=[1, 5], min_weight_fraction_leaf=[0.0, 0.5], max_features=[None, 1], max_leaf_nodes=[None, 20], ) # Make a cartesian product of all options product = itertools.product(*options.values()) args = [dict(zip(options.keys(), p)) for p in product] print("Testing a total of %s cases. This could take a while" % len(args)) for it, arg in enumerate(args): self._train_convert_evaluate_assert(**arg) @unittest.skipIf(_macos_version() >= (12, 0), "rdar://problem/84898245") @unittest.skipIf(not _HAS_XGBOOST, "Missing xgboost. Skipping") @unittest.skipIf(not _HAS_SKLEARN, "Missing scikit-learn. Skipping tests.") class XgboostBoosterBostonHousingNumericTest(unittest.TestCase): @classmethod def setUpClass(self): if not _HAS_XGBOOST: return if not _HAS_SKLEARN: return # Load data and train model scikit_data = load_boston() self.X = scikit_data["data"].astype("f").astype("d") self.dtrain = xgboost.DMatrix( scikit_data["data"], label=scikit_data["target"], feature_names=scikit_data["feature_names"], ) self.feature_names = scikit_data["feature_names"] self.output_name = "target" def _check_metrics(self, metrics, allowed_error={}, params={}): """ Check the metrics """ self.assertAlmostEqual( metrics["rmse"], allowed_error.get("rmse", 0), delta=1e-2, msg="Failed case %s. Results %s" % (params, metrics), ) self.assertAlmostEqual( metrics["max_error"], allowed_error.get("max_error", 0), delta=1e-2, msg="Failed case %s. Results %s" % (params, metrics), ) def _train_convert_evaluate_assert(self, bt_params={}, allowed_error={}, **params): """ Set up the unit test by loading the dataset and training a model. 
""" # Train a model xgb_model = xgboost.train(bt_params, self.dtrain, **params) # Convert the model spec = xgb_converter.convert( xgb_model, self.feature_names, self.output_name, force_32bit_float=False ) if _is_macos() and _macos_version() >= (10, 13): # Get predictions df = pd.DataFrame(self.X, columns=self.feature_names) df["target"] = xgb_model.predict(self.dtrain) # Evaluate it metrics = evaluate_regressor(spec, df, target="target", verbose=False) self._check_metrics(metrics, allowed_error, bt_params) def test_boston_housing_simple_decision_tree_regression(self): self._train_convert_evaluate_assert(num_boost_round=1) def test_boston_housing_simple_boosted_tree_regression(self): self._train_convert_evaluate_assert(num_boost_round=10) def test_boston_housing_simple_random_forest_regression(self): self._train_convert_evaluate_assert(bt_params={"subsample": 0.5}, allowed_error={"rmse": 0.004, "max_error": 0.09}) def test_boston_housing_float_double_corner_case(self): self._train_convert_evaluate_assert( { "colsample_bytree": 1, "colsample_bylevel": 1, "scale_pos_weight": 1, "learning_rate": 0.5, "max_delta_step": 0, "min_child_weight": 1, "n_estimators": 1, "subsample": 0.5, "objective": "reg:linear", "max_depth": 5, }, num_boost_round=2, ) @pytest.mark.slow def test_boston_housing_parameter_stress_test(self): options = dict( max_depth=[1, 5], learning_rate=[0.1, 0.5], n_estimators=[1, 10], min_child_weight=[1, 2], max_delta_step=[0, 0.1], colsample_bytree=[1, 0.5], colsample_bylevel=[1, 0.5], scale_pos_weight=[1], objective=["reg:linear"], ) # Make a cartesian product of all options product = itertools.product(*options.values()) args = [dict(zip(options.keys(), p)) for p in product] print("Testing a total of %s cases. This could take a while" % len(args)) for it, arg in enumerate(args): self._train_convert_evaluate_assert(arg) @unittest.skipIf(_macos_version() >= (12, 0), "rdar://problem/84898245") @unittest.skipIf(not _HAS_XGBOOST, "Missing xgboost. Skipping") @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class XGboostRegressorBostonHousingNumericTest(unittest.TestCase): @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ # Load data and train model scikit_data = load_boston() self.X = scikit_data["data"] self.scikit_data = self.X self.target = scikit_data["target"] self.feature_names = scikit_data["feature_names"] self.output_name = "target" def _check_metrics(self, metrics, params={}, allowed_error={}): self.assertAlmostEqual( metrics["rmse"], allowed_error.get("rmse", 0), delta=1e-2, msg="Failed case %s. Results %s" % (params, metrics), ) self.assertAlmostEqual( metrics["max_error"], allowed_error.get("max_error", 0), delta=1e-2, msg="Failed case %s. Results %s" % (params, metrics), ) def _train_convert_evaluate_assert(self, bt_params={}, allowed_error={}, **params): """ Set up the unit test by loading the dataset and training a model. 
""" # Train a model xgb_model = xgboost.XGBRegressor(**params) xgb_model.fit(self.X, self.target) # Convert the model (feature_names can't be given because of XGboost) spec = xgb_converter.convert( xgb_model, self.feature_names, self.output_name, force_32bit_float=False ) if _is_macos() and _macos_version() >= (10, 13): # Get predictions df = pd.DataFrame(self.X, columns=self.feature_names) df["target"] = xgb_model.predict(self.X) # Evaluate it metrics = evaluate_regressor(spec, df, target="target", verbose=False) self._check_metrics(metrics, bt_params, allowed_error) def test_boston_housing_simple_boosted_tree_regression(self): self._train_convert_evaluate_assert() def test_boston_housing_simple_random_forest_regression(self): self._train_convert_evaluate_assert( allowed_error={"rmse": 0.05, "max_error": 0.81}, subsample=0.5 ) def test_boston_housing_simple_decision_tree_regression(self): self._train_convert_evaluate_assert(n_estimators=1) def test_boston_housing_float_double_corner_case(self): self._train_convert_evaluate_assert( { "colsample_bytree": 1, "colsample_bylevel": 1, "scale_pos_weight": 1, "learning_rate": 0.1, "max_delta_step": 0, "min_child_weight": 1, "n_estimators": 10, "subsample": 0.3, "objective": "reg:linear", "max_depth": 1, } ) @pytest.mark.slow def test_boston_housing_parameter_stress_test(self): options = dict( max_depth=[1, 5], learning_rate=[0.1, 0.5], n_estimators=[1, 10], objective=["reg:linear"], min_child_weight=[1, 2], max_delta_step=[0, 0.1], subsample=[1, 0.5, 0.3], colsample_bytree=[1, 0.5], colsample_bylevel=[1, 0.5], scale_pos_weight=[1], ) # Make a cartesian product of all options product = itertools.product(*options.values()) args = [dict(zip(options.keys(), p)) for p in product] print("Testing a total of %s cases. This could take a while" % len(args)) for it, arg in enumerate(args): self._train_convert_evaluate_assert(arg) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/xgboost_tests/test_decision_tree_classifier.py0000644000000000000000000001177214672066616027370 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import unittest import numpy as np from ..utils import load_boston from coremltools._deps import _HAS_SKLEARN, _HAS_XGBOOST if _HAS_SKLEARN: from sklearn.tree import DecisionTreeClassifier from coremltools.converters.sklearn import convert as skl_converter if _HAS_XGBOOST: pass @unittest.skipIf(not _HAS_SKLEARN, "Missing scikit-learn. Skipping tests.") class DecisionTreeBinaryClassifierScikitTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. 
""" scikit_data = load_boston() scikit_model = DecisionTreeClassifier(random_state=1) target = scikit_data["target"] > scikit_data["target"].mean() scikit_model.fit(scikit_data["data"], target) # Save the data and the model self.scikit_data = scikit_data self.scikit_model = scikit_model def test_conversion(self): output_name = "target" spec = skl_converter(self.scikit_model, "data", "target").get_spec() self.assertIsNotNone(spec) # Test the model class self.assertIsNotNone(spec.description) self.assertIsNotNone(spec.treeEnsembleClassifier) # Test the interface class self.assertEqual(spec.description.predictedFeatureName, "target") # Test the inputs and outputs self.assertEqual(len(spec.description.output), 2) self.assertEqual(spec.description.output[0].name, "target") self.assertEqual( spec.description.output[0].type.WhichOneof("Type"), "int64Type" ) self.assertEqual(len(spec.description.input), 1) input_type = spec.description.input[0] self.assertEqual(input_type.type.WhichOneof("Type"), "multiArrayType") self.assertEqual(input_type.name, "data") # Test the linear regression parameters. tr = spec.treeEnsembleClassifier.treeEnsemble self.assertIsNotNone(tr) self.assertEqual(len(tr.nodes), self.scikit_model.tree_.node_count) def test_conversion_bad_inputs(self): # Error on converting an untrained model with self.assertRaises(Exception): model = DecisionTreeClassifier() spec = skl_converter(model, "data", "out") # Check the expected class during conversion. from sklearn.preprocessing import OneHotEncoder with self.assertRaises(Exception): model = OneHotEncoder() spec = skl_converter(model, "data", "out") @unittest.skipIf(not _HAS_SKLEARN, "Missing scikit-learn. Skipping tests.") class DecisionTreeMultiClassClassifierScikitTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ scikit_data = load_boston() scikit_model = DecisionTreeClassifier(random_state=1) t = scikit_data["target"] target = np.digitize(t, np.histogram(t)[1]) - 1 scikit_model.fit(scikit_data["data"], target) # Save the data and the model self.scikit_data = scikit_data self.target = target self.scikit_model = scikit_model def test_conversion(self): output_name = "target" spec = skl_converter(self.scikit_model, "data", "target").get_spec() self.assertIsNotNone(spec) # Test the model class self.assertIsNotNone(spec.description) self.assertIsNotNone(spec.treeEnsembleClassifier) # Test the interface class self.assertEqual(spec.description.predictedFeatureName, "target") # Test the inputs and outputs self.assertEqual(len(spec.description.output), 2) self.assertEqual(spec.description.output[0].name, "target") self.assertEqual( spec.description.output[0].type.WhichOneof("Type"), "int64Type" ) self.assertEqual(spec.description.input[0].name, "data") self.assertEqual( spec.description.input[0].type.WhichOneof("Type"), "multiArrayType" ) tr = spec.treeEnsembleClassifier.treeEnsemble self.assertIsNotNone(tr) self.assertEqual(len(tr.nodes), self.scikit_model.tree_.node_count) def test_conversion_bad_inputs(self): # Error on converting an untrained model with self.assertRaises(Exception): model = DecisionTreeClassifier() spec = skl_converter(model, "data", "out") # Check the expected class during conversion. 
from sklearn.preprocessing import OneHotEncoder with self.assertRaises(Exception): model = OneHotEncoder() spec = skl_converter(model, "data", "out") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/xgboost_tests/test_decision_tree_classifier_numeric.py0000644000000000000000000001164114672066616031105 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import unittest import numpy as np import pandas as pd import pytest from packaging.version import Version from ..utils import load_boston from coremltools._deps import _HAS_SKLEARN, _SKLEARN_VERSION from coremltools.models.utils import (_is_macos, _macos_version, evaluate_classifier) if _HAS_SKLEARN: from sklearn.tree import DecisionTreeClassifier from coremltools.converters import sklearn as skl_converter @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class DecisionTreeClassificationBostonHousingScikitNumericTest(unittest.TestCase): def _check_metrics(self, metrics, params={}): self.assertEqual( metrics["num_errors"], 0, msg="Failed case %s. Results %s" % (params, metrics), ) def _train_convert_evaluate_assert(self, **scikit_params): scikit_model = DecisionTreeClassifier(random_state=1, **scikit_params) scikit_model.fit(self.X, self.target) # Convert the model spec = skl_converter.convert(scikit_model, self.feature_names, self.output_name) if _is_macos() and _macos_version() >= (10, 13): # Get predictions df = pd.DataFrame(self.X, columns=self.feature_names) df["target"] = scikit_model.predict(self.X) # Evaluate it metrics = evaluate_classifier(spec, df) self._check_metrics(metrics, scikit_params) @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class DecisionTreeBinaryClassificationBostonHousingScikitNumericTest( DecisionTreeClassificationBostonHousingScikitNumericTest ): @classmethod def setUpClass(self): # Load data and train model scikit_data = load_boston() self.scikit_data = scikit_data self.X = scikit_data["data"].astype("f").astype( "d" ) ## scikit-learn downcasts data self.target = 1 * (scikit_data["target"] > scikit_data["target"].mean()) self.feature_names = scikit_data["feature_names"] self.output_name = "target" def test_simple_binary_classifier(self): self._train_convert_evaluate_assert() @pytest.mark.slow def test_binary_classifier_stress_test(self): options = dict( splitter=["best"], max_depth=[1, 10, None], min_samples_split=[2, 10, 0.5], min_samples_leaf=[1, 5], min_weight_fraction_leaf=[0.0, 0.5], max_features=[None, 1, 5], max_leaf_nodes=[None, 20], ) if _SKLEARN_VERSION < Version("0.22"): # 'presort' option deprecated >=0.22 options["presort"] = [False, True] # Make a cartesian product of all options product = itertools.product(*options.values()) args = [dict(zip(options.keys(), p)) for p in product] print("Testing a total of %s cases. This could take a while" % len(args)) for it, arg in enumerate(args): self._train_convert_evaluate_assert(**arg) @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. 
Skipping tests.") class DecisionTreeMultiClassClassificationBostonHousingScikitNumericTest( DecisionTreeClassificationBostonHousingScikitNumericTest ): @classmethod def setUpClass(self): # Load data and train model scikit_data = load_boston() num_classes = 3 self.X = scikit_data["data"].astype("f").astype( "d" ) ## scikit-learn downcasts data t = scikit_data["target"] target = np.digitize(t, np.histogram(t, bins=num_classes - 1)[1]) - 1 # Save the data and the model self.scikit_data = scikit_data self.target = target self.feature_names = scikit_data["feature_names"] self.output_name = "target" def test_simple_multiclass(self): self._train_convert_evaluate_assert() @pytest.mark.slow def test_multiclass_stress_test(self): options = dict( splitter=["best"], max_depth=[1, 10, None], min_samples_split=[2, 10, 0.5], min_samples_leaf=[1, 5], min_weight_fraction_leaf=[0.0, 0.5], max_features=[None, 1, 5], max_leaf_nodes=[None, 20], ) if _SKLEARN_VERSION < Version("0.22"): # 'presort' option deprecated >=0.22 options["presort"] = [False, True] # Make a cartesian product of all options product = itertools.product(*options.values()) args = [dict(zip(options.keys(), p)) for p in product] print("Testing a total of %s cases. This could take a while" % len(args)) for it, arg in enumerate(args): self._train_convert_evaluate_assert(**arg) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/xgboost_tests/test_decision_tree_regression.py0000644000000000000000000000552614672066616027424 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import unittest from ..utils import load_boston from coremltools._deps import _HAS_SKLEARN, _HAS_XGBOOST if _HAS_SKLEARN: from sklearn.tree import DecisionTreeRegressor from coremltools.converters import sklearn as skl_converter @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class DecisionTreeRegressorScikitTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter. """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ scikit_data = load_boston() scikit_model = DecisionTreeRegressor(random_state=1) scikit_model.fit(scikit_data["data"], scikit_data["target"]) # Save the data and the model self.scikit_data = scikit_data self.scikit_model = scikit_model def test_conversion(self): feature_names = self.scikit_data["feature_names"] output_name = "target" spec = skl_converter.convert( self.scikit_model, feature_names, "target" ).get_spec() self.assertIsNotNone(spec) # Test the model class self.assertIsNotNone(spec.description) self.assertIsNotNone(spec.treeEnsembleRegressor) # Test the interface class self.assertEqual(spec.description.predictedFeatureName, "target") # Test the inputs and outputs self.assertEqual(len(spec.description.output), 1) self.assertEqual(spec.description.output[0].name, "target") self.assertEqual( spec.description.output[0].type.WhichOneof("Type"), "doubleType" ) for input_type in spec.description.input: self.assertEqual(input_type.type.WhichOneof("Type"), "doubleType") self.assertEqual( sorted(feature_names), sorted(map(lambda x: x.name, spec.description.input)) ) # Test the linear regression parameters. 
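# Descriptive note: the assertions below check that conversion preserves the full tree structure; the Core ML ensemble should contain exactly one node for every node of the fitted scikit-learn tree (tree_.node_count).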
tr = spec.pipelineRegressor.pipeline.models[ 1 ].treeEnsembleRegressor.treeEnsemble self.assertIsNotNone(tr) self.assertEqual(len(tr.nodes), self.scikit_model.tree_.node_count) def test_conversion_bad_inputs(self): # Error on converting an untrained model with self.assertRaises(Exception): model = DecisionTreeRegressor() spec = skl_converter.convert(model, "data", "out") # Check the expected class during conversion. from sklearn.preprocessing import OneHotEncoder with self.assertRaises(Exception): model = OneHotEncoder() spec = skl_converter.convert(model, "data", "out") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508430.0 coremltools-8.0/coremltools/test/xgboost_tests/test_decision_tree_regression_numeric.py0000644000000000000000000000723114672066616031141 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import itertools import unittest import pandas as pd import pytest from packaging.version import Version from ..utils import load_boston from coremltools._deps import _HAS_SKLEARN, _SKLEARN_VERSION from coremltools.models.utils import (_is_macos, _macos_version, evaluate_regressor) @unittest.skipIf(not _HAS_SKLEARN, "Missing sklearn. Skipping tests.") class DecisionTreeRegressorBostonHousingScikitNumericTest(unittest.TestCase): """ Unit test class for testing scikit-learn converter and running both models """ @classmethod def setUpClass(self): """ Set up the unit test by loading the dataset and training a model. """ # Load data and train model scikit_data = load_boston() self.scikit_data = scikit_data self.X = scikit_data["data"] self.target = scikit_data["target"] self.feature_names = scikit_data["feature_names"] self.output_name = "target" def _check_metrics(self, metrics, params={}): """ Check the metrics """ self.assertAlmostEqual( metrics["rmse"], 0, delta=1e-5, msg="Failed case %s. Results %s" % (params, metrics), ) self.assertAlmostEqual( metrics["max_error"], 0, delta=1e-5, msg="Failed case %s. 
Results %s" % (params, metrics), ) def _train_convert_evaluate_assert(self, **scikit_params): """ Train a scikit-learn model, convert it and then evaluate it with CoreML """ from sklearn.tree import DecisionTreeRegressor from coremltools.converters import sklearn as skl_converter scikit_model = DecisionTreeRegressor(random_state=1, **scikit_params) scikit_model.fit(self.X, self.target) # Convert the model spec = skl_converter.convert(scikit_model, self.feature_names, self.output_name) if _is_macos() and _macos_version() >= (10, 13): # Get predictions df = pd.DataFrame(self.X, columns=self.feature_names) df["target"] = scikit_model.predict(self.X) # Evaluate it metrics = evaluate_regressor(spec, df, target="target", verbose=False) self._check_metrics(metrics, scikit_params) def test_boston_housing_simple_regression(self): self._train_convert_evaluate_assert(max_depth=20) @pytest.mark.slow def test_boston_housing_parameter_stress_test(self): ## These are all the options in decision tree regression of scikit-learn options = dict( criterion=["friedman_mse"], splitter=["best"], max_depth=[1, 10, None], min_samples_split=[2, 10, 0.5], min_samples_leaf=[1, 5], min_weight_fraction_leaf=[0.0, 0.5], max_features=[None, 1, 5], max_leaf_nodes=[None, 20], min_impurity_decrease=[0.0, 1e-07, 0.1], ) if _SKLEARN_VERSION < Version("0.22"): # 'presort' option deprecated >=0.22 options["presort"] = [False, True] # Make a cartesian product of all options product = itertools.product(*options.values()) args = [dict(zip(options.keys(), p)) for p in product] print("Testing a total of %s cases. This could take a while" % len(args)) for it, arg in enumerate(args): self._train_convert_evaluate_assert(**arg) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508431.0 coremltools-8.0/coremltools/version.py0000644000000000000000000000037714672066617017110 0ustar00rootroot# Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause __version__ = "8.0" # VERSION_STRING ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2935474 coremltools-8.0/coremltools.egg-info/0000755000000000000000000000000014672075535016533 5ustar00rootroot././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726511965.0 coremltools-8.0/coremltools.egg-info/PKG-INFO0000644000000000000000000000460314672075535017633 0ustar00rootrootMetadata-Version: 2.1 Name: coremltools Version: 8.0 Summary: Community Tools for Core ML Home-page: https://github.com/apple/coremltools Author: Apple Inc. 
Author-email: coremltools@apple.com License: BSD Classifier: Development Status :: 5 - Production/Stable Classifier: Intended Audience :: Developers Classifier: Operating System :: MacOS :: MacOS X Classifier: Operating System :: POSIX :: Linux Classifier: Programming Language :: Python :: 3.7 Classifier: Programming Language :: Python :: 3.8 Classifier: Programming Language :: Python :: 3.9 Classifier: Programming Language :: Python :: 3.10 Classifier: Programming Language :: Python :: 3.11 Classifier: Topic :: Scientific/Engineering Classifier: Topic :: Software Development License-File: LICENSE.txt License-File: NOTICE.txt Requires-Dist: numpy>=1.14.5 Requires-Dist: protobuf>=3.1.0 Requires-Dist: sympy Requires-Dist: tqdm Requires-Dist: packaging Requires-Dist: attrs>=21.3.0 Requires-Dist: cattrs Requires-Dist: pyaml coremltools =========== `Core ML `_ is an Apple framework that allows developers to easily integrate machine learning (ML) models into apps. Core ML is available on iOS, iPadOS, watchOS, macOS, and tvOS. Core ML introduces a public file format (.mlmodel) for a broad set of ML methods including deep neural networks (convolutional and recurrent), tree ensembles (boosted trees, random forest, decision trees), and generalized linear models. Core ML models can be directly integrated into apps within Xcode. :code:`coremltools` is a Python package for creating, examining, and testing models in the .mlmodel format. In particular, it can be used to: - Convert trained models from popular machine learning tools into Core ML format (.mlmodel). - Write models to Core ML format with a simple API. - Make predictions using the Core ML framework (on select platforms) to verify conversion. More Information ---------------- - `coremltools user guide and examples `_ - `Core ML framework documentation `_ - `Machine learning at Apple `_ License ------- Copyright (c) 2020, Apple Inc. All rights reserved. Use of this source code is governed by the `3-Clause BSD License `_ that can be found in the LICENSE.txt file. 
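The conversion and verification workflow described in the bullets above can be sketched as follows. This is an illustrative example rather than part of the packaged metadata: the toy training data, the feature names ``x1``/``x2``, the output name ``target``, and the saved filename are placeholders, and it assumes scikit-learn is installed alongside :code:`coremltools`. It uses the same :code:`coremltools.converters.sklearn.convert` entry point exercised by the decision tree tests elsewhere in this distribution::

    # Convert a fitted scikit-learn decision tree regressor to a Core ML model.
    from sklearn.tree import DecisionTreeRegressor

    from coremltools.converters import sklearn as skl_converter

    # Toy training data; the feature and output names below are placeholders.
    X = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [3.0, 1.0]]
    y = [0.5, 1.5, 2.5, 3.5]

    scikit_model = DecisionTreeRegressor(random_state=1)
    scikit_model.fit(X, y)

    # convert(model, input feature names, output feature name) returns an MLModel.
    mlmodel = skl_converter.convert(scikit_model, ["x1", "x2"], "target")
    mlmodel.save("DecisionTree.mlmodel")

    # On macOS, predictions made through the Core ML framework can be compared with
    # scikit-learn's predictions to verify the conversion.
    print(mlmodel.get_spec().description)

On platforms where the Core ML framework is unavailable, the returned spec can still be inspected and saved; only prediction requires macOS.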
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726511965.0 coremltools-8.0/coremltools.egg-info/SOURCES.txt0000644000000000000000000010377114672075535020430 0ustar00rootrootLICENSE.txt MANIFEST.in NOTICE.txt README.md setup.py coremltools/__init__.py coremltools/version.py coremltools.egg-info/PKG-INFO coremltools.egg-info/SOURCES.txt coremltools.egg-info/dependency_links.txt coremltools.egg-info/requires.txt coremltools.egg-info/top_level.txt coremltools/_deps/__init__.py coremltools/converters/__init__.py coremltools/converters/_converters_entry.py coremltools/converters/_profile_utils.py coremltools/converters/libsvm/__init__.py coremltools/converters/libsvm/_libsvm_converter.py coremltools/converters/libsvm/_libsvm_util.py coremltools/converters/mil/__init__.py coremltools/converters/mil/_deployment_compatibility.py coremltools/converters/mil/conftest.py coremltools/converters/mil/converter.py coremltools/converters/mil/debugging_utils.py coremltools/converters/mil/input_types.py coremltools/converters/mil/test_inputs_outputs_shape.py coremltools/converters/mil/testing_reqs.py coremltools/converters/mil/testing_utils.py coremltools/converters/mil/backend/__init__.py coremltools/converters/mil/backend/backend_helper.py coremltools/converters/mil/backend/mil/__init__.py coremltools/converters/mil/backend/mil/helper.py coremltools/converters/mil/backend/mil/load.py coremltools/converters/mil/backend/mil/test_helper.py coremltools/converters/mil/backend/mil/test_load.py coremltools/converters/mil/backend/mil/passes/__init__.py coremltools/converters/mil/backend/mil/passes/adjust_io_to_supported_types.py coremltools/converters/mil/backend/mil/passes/fuse_activation_silu.py coremltools/converters/mil/backend/mil/passes/fuse_pow2_sqrt.py coremltools/converters/mil/backend/mil/passes/insert_image_preprocessing_op.py coremltools/converters/mil/backend/mil/passes/sanitize_name_strings.py coremltools/converters/mil/backend/mil/passes/test_passes.py coremltools/converters/mil/backend/nn/__init__.py coremltools/converters/mil/backend/nn/load.py coremltools/converters/mil/backend/nn/mil_to_nn_mapping_registry.py coremltools/converters/mil/backend/nn/op_mapping.py coremltools/converters/mil/backend/nn/passes/__init__.py coremltools/converters/mil/backend/nn/passes/alert_return_type_cast.py coremltools/converters/mil/backend/nn/passes/commingle_loop_vars.py coremltools/converters/mil/backend/nn/passes/conv1d_decomposition.py coremltools/converters/mil/backend/nn/passes/handle_return_inputs_as_outputs.py coremltools/converters/mil/backend/nn/passes/handle_return_unused_inputs.py coremltools/converters/mil/backend/nn/passes/handle_unused_inputs.py coremltools/converters/mil/backend/nn/passes/mlmodel_passes.py coremltools/converters/mil/backend/nn/passes/test_mlmodel_passes.py coremltools/converters/mil/backend/nn/passes/test_passes.py coremltools/converters/mil/experimental/__init__.py coremltools/converters/mil/experimental/passes/__init__.py coremltools/converters/mil/experimental/passes/generic_conv_batchnorm_fusion.py coremltools/converters/mil/experimental/passes/generic_conv_bias_fusion.py coremltools/converters/mil/experimental/passes/generic_conv_scale_fusion.py coremltools/converters/mil/experimental/passes/generic_layernorm_instancenorm_pattern_fusion.py coremltools/converters/mil/experimental/passes/generic_linear_bias_fusion.py coremltools/converters/mil/experimental/passes/generic_pass_infrastructure.py coremltools/converters/mil/frontend/__init__.py 
coremltools/converters/mil/frontend/_utils.py coremltools/converters/mil/frontend/milproto/__init__.py coremltools/converters/mil/frontend/milproto/helper.py coremltools/converters/mil/frontend/milproto/load.py coremltools/converters/mil/frontend/milproto/test_load.py coremltools/converters/mil/frontend/tensorflow/__init__.py coremltools/converters/mil/frontend/tensorflow/basic_graph_ops.py coremltools/converters/mil/frontend/tensorflow/convert_utils.py coremltools/converters/mil/frontend/tensorflow/converter.py coremltools/converters/mil/frontend/tensorflow/dialect_ops.py coremltools/converters/mil/frontend/tensorflow/dot_visitor.py coremltools/converters/mil/frontend/tensorflow/load.py coremltools/converters/mil/frontend/tensorflow/naming_utils.py coremltools/converters/mil/frontend/tensorflow/ops.py coremltools/converters/mil/frontend/tensorflow/parse.py coremltools/converters/mil/frontend/tensorflow/parsed_tf_node.py coremltools/converters/mil/frontend/tensorflow/tf_op_registry.py coremltools/converters/mil/frontend/tensorflow/tfssa.py coremltools/converters/mil/frontend/tensorflow/ssa_passes/__init__.py coremltools/converters/mil/frontend/tensorflow/ssa_passes/backfill_make_list_elem_type.py coremltools/converters/mil/frontend/tensorflow/ssa_passes/expand_tf_lstm.py coremltools/converters/mil/frontend/tensorflow/ssa_passes/test_passes.py coremltools/converters/mil/frontend/tensorflow/ssa_passes/tf_lstm_to_core_lstm.py coremltools/converters/mil/frontend/tensorflow/test/__init__.py coremltools/converters/mil/frontend/tensorflow/test/test_composite_ops.py coremltools/converters/mil/frontend/tensorflow/test/test_custom_ops.py coremltools/converters/mil/frontend/tensorflow/test/test_graphs.py coremltools/converters/mil/frontend/tensorflow/test/test_load.py coremltools/converters/mil/frontend/tensorflow/test/test_ops.py coremltools/converters/mil/frontend/tensorflow/test/test_parse.py coremltools/converters/mil/frontend/tensorflow/test/test_parsed_tf_node.py coremltools/converters/mil/frontend/tensorflow/test/test_tf_conversion_api.py coremltools/converters/mil/frontend/tensorflow/test/testing_utils.py coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/__init__.py coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/cond_to_where.py coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/constant_propagation.py coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/delete_asserts.py coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/delete_constant.py coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/delete_disconnected_nodes.py coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/functionalize_loops.py coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/fuse_dilation_conv.py coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/insert_get_tuple.py coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/quantization_pass.py coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/tensor_array_transform.py coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/variable_node_transform.py coremltools/converters/mil/frontend/tensorflow/tf_graph_pass/visitors.py coremltools/converters/mil/frontend/tensorflow2/__init__.py coremltools/converters/mil/frontend/tensorflow2/converter.py coremltools/converters/mil/frontend/tensorflow2/load.py coremltools/converters/mil/frontend/tensorflow2/ops.py coremltools/converters/mil/frontend/tensorflow2/ssa_passes/__init__.py 
coremltools/converters/mil/frontend/tensorflow2/ssa_passes/remove_vacuous_cond.py coremltools/converters/mil/frontend/tensorflow2/ssa_passes/test_v2_passes.py coremltools/converters/mil/frontend/tensorflow2/test/__init__.py coremltools/converters/mil/frontend/tensorflow2/test/test_tf2_conversion_api.py coremltools/converters/mil/frontend/tensorflow2/test/test_v2_load.py coremltools/converters/mil/frontend/tensorflow2/test/test_v2_ops.py coremltools/converters/mil/frontend/tensorflow2/test/test_v2_ops_tf_keras.py coremltools/converters/mil/frontend/tensorflow2/test/testing_utils.py coremltools/converters/mil/frontend/tensorflow2/tf_graph_pass/__init__.py coremltools/converters/mil/frontend/tensorflow2/tf_graph_pass/rewrite_control_flow_functions.py coremltools/converters/mil/frontend/torch/__init__.py coremltools/converters/mil/frontend/torch/converter.py coremltools/converters/mil/frontend/torch/dialect_ops.py coremltools/converters/mil/frontend/torch/exir_utils.py coremltools/converters/mil/frontend/torch/internal_graph.py coremltools/converters/mil/frontend/torch/load.py coremltools/converters/mil/frontend/torch/ops.py coremltools/converters/mil/frontend/torch/quantization_ops.py coremltools/converters/mil/frontend/torch/torch_op_registry.py coremltools/converters/mil/frontend/torch/torchir_passes.py coremltools/converters/mil/frontend/torch/torchscript_utils.py coremltools/converters/mil/frontend/torch/utils.py coremltools/converters/mil/frontend/torch/ssa_passes/__init__.py coremltools/converters/mil/frontend/torch/ssa_passes/torch_tensor_assign_to_core.py coremltools/converters/mil/frontend/torch/ssa_passes/torch_upsample_to_core_upsample.py coremltools/converters/mil/frontend/torch/test/__init__.py coremltools/converters/mil/frontend/torch/test/test_custom_ops.py coremltools/converters/mil/frontend/torch/test/test_examples.py coremltools/converters/mil/frontend/torch/test/test_internal_graph.py coremltools/converters/mil/frontend/torch/test/test_passes.py coremltools/converters/mil/frontend/torch/test/test_torch_conversion_api.py coremltools/converters/mil/frontend/torch/test/test_torch_export_conversion_api.py coremltools/converters/mil/frontend/torch/test/test_torch_export_quantization.py coremltools/converters/mil/frontend/torch/test/test_torch_ops.py coremltools/converters/mil/frontend/torch/test/test_torch_quantization_ops.py coremltools/converters/mil/frontend/torch/test/test_torch_stateful_model.py coremltools/converters/mil/frontend/torch/test/testing_utils.py coremltools/converters/mil/mil/__init__.py coremltools/converters/mil/mil/block.py coremltools/converters/mil/mil/builder.py coremltools/converters/mil/mil/input_type.py coremltools/converters/mil/mil/operation.py coremltools/converters/mil/mil/program.py coremltools/converters/mil/mil/scope.py coremltools/converters/mil/mil/utils.py coremltools/converters/mil/mil/var.py coremltools/converters/mil/mil/ops/__init__.py coremltools/converters/mil/mil/ops/helper.py coremltools/converters/mil/mil/ops/registry.py coremltools/converters/mil/mil/ops/defs/__init__.py coremltools/converters/mil/mil/ops/defs/_op_reqs.py coremltools/converters/mil/mil/ops/defs/_utils.py coremltools/converters/mil/mil/ops/defs/complex_dialect_ops.py coremltools/converters/mil/mil/ops/defs/coreml_dialect/__init__.py coremltools/converters/mil/mil/ops/defs/coreml_dialect/ops.py coremltools/converters/mil/mil/ops/defs/iOS15/__init__.py coremltools/converters/mil/mil/ops/defs/iOS15/activation.py coremltools/converters/mil/mil/ops/defs/iOS15/classify.py 
coremltools/converters/mil/mil/ops/defs/iOS15/control_flow.py coremltools/converters/mil/mil/ops/defs/iOS15/conv.py coremltools/converters/mil/mil/ops/defs/iOS15/elementwise_binary.py coremltools/converters/mil/mil/ops/defs/iOS15/elementwise_unary.py coremltools/converters/mil/mil/ops/defs/iOS15/image_resizing.py coremltools/converters/mil/mil/ops/defs/iOS15/linear.py coremltools/converters/mil/mil/ops/defs/iOS15/normalization.py coremltools/converters/mil/mil/ops/defs/iOS15/pool.py coremltools/converters/mil/mil/ops/defs/iOS15/random.py coremltools/converters/mil/mil/ops/defs/iOS15/recurrent.py coremltools/converters/mil/mil/ops/defs/iOS15/reduction.py coremltools/converters/mil/mil/ops/defs/iOS15/scatter_gather.py coremltools/converters/mil/mil/ops/defs/iOS15/tensor_operation.py coremltools/converters/mil/mil/ops/defs/iOS15/tensor_transformation.py coremltools/converters/mil/mil/ops/defs/iOS16/__init__.py coremltools/converters/mil/mil/ops/defs/iOS16/constexpr_ops.py coremltools/converters/mil/mil/ops/defs/iOS16/image_resizing.py coremltools/converters/mil/mil/ops/defs/iOS16/scatter_gather.py coremltools/converters/mil/mil/ops/defs/iOS16/tensor_operation.py coremltools/converters/mil/mil/ops/defs/iOS16/tensor_transformation.py coremltools/converters/mil/mil/ops/defs/iOS17/__init__.py coremltools/converters/mil/mil/ops/defs/iOS17/activation.py coremltools/converters/mil/mil/ops/defs/iOS17/conv.py coremltools/converters/mil/mil/ops/defs/iOS17/elementwise_unary.py coremltools/converters/mil/mil/ops/defs/iOS17/image_resizing.py coremltools/converters/mil/mil/ops/defs/iOS17/linear.py coremltools/converters/mil/mil/ops/defs/iOS17/normalization.py coremltools/converters/mil/mil/ops/defs/iOS17/quantization_ops.py coremltools/converters/mil/mil/ops/defs/iOS17/recurrent.py coremltools/converters/mil/mil/ops/defs/iOS17/reduction.py coremltools/converters/mil/mil/ops/defs/iOS17/scatter_gather.py coremltools/converters/mil/mil/ops/defs/iOS17/tensor_operation.py coremltools/converters/mil/mil/ops/defs/iOS17/tensor_transformation.py coremltools/converters/mil/mil/ops/defs/iOS18/__init__.py coremltools/converters/mil/mil/ops/defs/iOS18/compression.py coremltools/converters/mil/mil/ops/defs/iOS18/recurrent.py coremltools/converters/mil/mil/ops/defs/iOS18/states.py coremltools/converters/mil/mil/ops/defs/iOS18/tensor_transformation.py coremltools/converters/mil/mil/ops/defs/iOS18/transformers.py coremltools/converters/mil/mil/ops/tests/__init__.py coremltools/converters/mil/mil/ops/tests/test_utils.py coremltools/converters/mil/mil/ops/tests/testing_utils.py coremltools/converters/mil/mil/ops/tests/coreml_dialect/__init__.py coremltools/converters/mil/mil/ops/tests/coreml_dialect/test_coreml_dialect.py coremltools/converters/mil/mil/ops/tests/iOS14/__init__.py coremltools/converters/mil/mil/ops/tests/iOS14/test_activation.py coremltools/converters/mil/mil/ops/tests/iOS14/test_control_flow.py coremltools/converters/mil/mil/ops/tests/iOS14/test_conv.py coremltools/converters/mil/mil/ops/tests/iOS14/test_elementwise_binary.py coremltools/converters/mil/mil/ops/tests/iOS14/test_elementwise_unary.py coremltools/converters/mil/mil/ops/tests/iOS14/test_image_resizing.py coremltools/converters/mil/mil/ops/tests/iOS14/test_linear.py coremltools/converters/mil/mil/ops/tests/iOS14/test_normalization.py coremltools/converters/mil/mil/ops/tests/iOS14/test_pool.py coremltools/converters/mil/mil/ops/tests/iOS14/test_random.py coremltools/converters/mil/mil/ops/tests/iOS14/test_recurrent.py 
coremltools/converters/mil/mil/ops/tests/iOS14/test_reduction.py coremltools/converters/mil/mil/ops/tests/iOS14/test_scatter_gather.py coremltools/converters/mil/mil/ops/tests/iOS14/test_tensor_operation.py coremltools/converters/mil/mil/ops/tests/iOS14/test_tensor_transformation.py coremltools/converters/mil/mil/ops/tests/iOS15/__init__.py coremltools/converters/mil/mil/ops/tests/iOS15/test_elementwise_unary.py coremltools/converters/mil/mil/ops/tests/iOS15/test_image_resizing.py coremltools/converters/mil/mil/ops/tests/iOS15/test_tensor_transformation.py coremltools/converters/mil/mil/ops/tests/iOS16/__init__.py coremltools/converters/mil/mil/ops/tests/iOS16/test_constexpr_ops.py coremltools/converters/mil/mil/ops/tests/iOS16/test_conv.py coremltools/converters/mil/mil/ops/tests/iOS16/test_image_resizing.py coremltools/converters/mil/mil/ops/tests/iOS16/test_scatter_gather.py coremltools/converters/mil/mil/ops/tests/iOS16/test_tensor_operation.py coremltools/converters/mil/mil/ops/tests/iOS16/test_tensor_transformation.py coremltools/converters/mil/mil/ops/tests/iOS17/__init__.py coremltools/converters/mil/mil/ops/tests/iOS17/test_activation.py coremltools/converters/mil/mil/ops/tests/iOS17/test_conv.py coremltools/converters/mil/mil/ops/tests/iOS17/test_elementwise_unary.py coremltools/converters/mil/mil/ops/tests/iOS17/test_image_resizing.py coremltools/converters/mil/mil/ops/tests/iOS17/test_linear.py coremltools/converters/mil/mil/ops/tests/iOS17/test_normalization.py coremltools/converters/mil/mil/ops/tests/iOS17/test_quantization.py coremltools/converters/mil/mil/ops/tests/iOS17/test_recurrent.py coremltools/converters/mil/mil/ops/tests/iOS17/test_reduction.py coremltools/converters/mil/mil/ops/tests/iOS17/test_scatter_gather.py coremltools/converters/mil/mil/ops/tests/iOS17/test_tensor_operation.py coremltools/converters/mil/mil/ops/tests/iOS17/test_tensor_transformation.py coremltools/converters/mil/mil/ops/tests/iOS18/__init__.py coremltools/converters/mil/mil/ops/tests/iOS18/test_compression.py coremltools/converters/mil/mil/ops/tests/iOS18/test_recurrent.py coremltools/converters/mil/mil/ops/tests/iOS18/test_states.py coremltools/converters/mil/mil/ops/tests/iOS18/test_tensor_transformation.py coremltools/converters/mil/mil/ops/tests/iOS18/test_transformers.py coremltools/converters/mil/mil/passes/__init__.py coremltools/converters/mil/mil/passes/graph_pass.py coremltools/converters/mil/mil/passes/helper.py coremltools/converters/mil/mil/passes/pass_pipeline.py coremltools/converters/mil/mil/passes/pass_registry.py coremltools/converters/mil/mil/passes/defs/__init__.py coremltools/converters/mil/mil/passes/defs/lower_complex_dialect_ops.py coremltools/converters/mil/mil/passes/defs/optimize_activation.py coremltools/converters/mil/mil/passes/defs/optimize_activation_quantization.py coremltools/converters/mil/mil/passes/defs/optimize_conv.py coremltools/converters/mil/mil/passes/defs/optimize_elementwise_binary.py coremltools/converters/mil/mil/passes/defs/optimize_linear.py coremltools/converters/mil/mil/passes/defs/optimize_normalization.py coremltools/converters/mil/mil/passes/defs/optimize_quantization.py coremltools/converters/mil/mil/passes/defs/optimize_repeat_ops.py coremltools/converters/mil/mil/passes/defs/optimize_state.py coremltools/converters/mil/mil/passes/defs/optimize_tensor_operation.py coremltools/converters/mil/mil/passes/defs/preprocess.py coremltools/converters/mil/mil/passes/defs/quantization.py coremltools/converters/mil/mil/passes/defs/randomize.py 
coremltools/converters/mil/mil/passes/defs/symbol_transform.py coremltools/converters/mil/mil/passes/defs/cleanup/__init__.py coremltools/converters/mil/mil/passes/defs/cleanup/const_deduplication.py coremltools/converters/mil/mil/passes/defs/cleanup/const_elimination.py coremltools/converters/mil/mil/passes/defs/cleanup/dead_code_elimination.py coremltools/converters/mil/mil/passes/defs/cleanup/dedup_op_and_var_names.py coremltools/converters/mil/mil/passes/defs/cleanup/expand_dynamic_linear.py coremltools/converters/mil/mil/passes/defs/cleanup/fuse_reduce_mean.py coremltools/converters/mil/mil/passes/defs/cleanup/loop_invariant_elimination.py coremltools/converters/mil/mil/passes/defs/cleanup/noop_elimination.py coremltools/converters/mil/mil/passes/defs/cleanup/remove_redundant_ops.py coremltools/converters/mil/mil/passes/defs/cleanup/remove_symbolic_reshape.py coremltools/converters/mil/mil/passes/defs/cleanup/topological_reorder.py coremltools/converters/mil/mil/passes/tests/__init__.py coremltools/converters/mil/mil/passes/tests/test_cleanup_passes.py coremltools/converters/mil/mil/passes/tests/test_lower_complex_dialect_ops.py coremltools/converters/mil/mil/passes/tests/test_optimize_linear_passes.py coremltools/converters/mil/mil/passes/tests/test_pass_pipeline.py coremltools/converters/mil/mil/passes/tests/test_passes.py coremltools/converters/mil/mil/passes/tests/test_quantization_passes.py coremltools/converters/mil/mil/passes/tests/test_reduce_transposes_pass.py coremltools/converters/mil/mil/passes/tests/test_state_passes.py coremltools/converters/mil/mil/passes/tests/test_symbol_transform.py coremltools/converters/mil/mil/tests/__init__.py coremltools/converters/mil/mil/tests/test_block.py coremltools/converters/mil/mil/tests/test_debug.py coremltools/converters/mil/mil/tests/test_programs.py coremltools/converters/mil/mil/tests/test_types.py coremltools/converters/mil/mil/types/__init__.py coremltools/converters/mil/mil/types/annotate.py coremltools/converters/mil/mil/types/get_type_info.py coremltools/converters/mil/mil/types/global_methods.py coremltools/converters/mil/mil/types/symbolic.py coremltools/converters/mil/mil/types/type_bool.py coremltools/converters/mil/mil/types/type_complex.py coremltools/converters/mil/mil/types/type_dict.py coremltools/converters/mil/mil/types/type_double.py coremltools/converters/mil/mil/types/type_globals_pseudo_type.py coremltools/converters/mil/mil/types/type_int.py coremltools/converters/mil/mil/types/type_list.py coremltools/converters/mil/mil/types/type_mapping.py coremltools/converters/mil/mil/types/type_spec.py coremltools/converters/mil/mil/types/type_state.py coremltools/converters/mil/mil/types/type_str.py coremltools/converters/mil/mil/types/type_tensor.py coremltools/converters/mil/mil/types/type_tuple.py coremltools/converters/mil/mil/types/type_unknown.py coremltools/converters/mil/mil/types/type_void.py coremltools/converters/mil/mil/visitors/__init__.py coremltools/converters/mil/mil/visitors/dot_visitor.py coremltools/converters/sklearn/_LinearSVC.py coremltools/converters/sklearn/_LinearSVR.py coremltools/converters/sklearn/_NuSVC.py coremltools/converters/sklearn/_NuSVR.py coremltools/converters/sklearn/_SVC.py coremltools/converters/sklearn/_SVR.py coremltools/converters/sklearn/__init__.py coremltools/converters/sklearn/_converter.py coremltools/converters/sklearn/_converter_internal.py coremltools/converters/sklearn/_decision_tree_classifier.py coremltools/converters/sklearn/_decision_tree_regressor.py 
coremltools/converters/sklearn/_dict_vectorizer.py coremltools/converters/sklearn/_gradient_boosting_classifier.py coremltools/converters/sklearn/_gradient_boosting_regressor.py coremltools/converters/sklearn/_imputer.py coremltools/converters/sklearn/_k_neighbors_classifier.py coremltools/converters/sklearn/_linear_regression.py coremltools/converters/sklearn/_logistic_regression.py coremltools/converters/sklearn/_normalizer.py coremltools/converters/sklearn/_one_hot_encoder.py coremltools/converters/sklearn/_random_forest_classifier.py coremltools/converters/sklearn/_random_forest_regressor.py coremltools/converters/sklearn/_ridge_regression.py coremltools/converters/sklearn/_sklearn_util.py coremltools/converters/sklearn/_standard_scaler.py coremltools/converters/sklearn/_svm_common.py coremltools/converters/sklearn/_tree_ensemble.py coremltools/converters/xgboost/__init__.py coremltools/converters/xgboost/_tree.py coremltools/converters/xgboost/_tree_ensemble.py coremltools/models/__init__.py coremltools/models/_compiled_model.py coremltools/models/_deprecation.py coremltools/models/_feature_management.py coremltools/models/_interface_management.py coremltools/models/array_feature_extractor.py coremltools/models/datatypes.py coremltools/models/feature_vectorizer.py coremltools/models/model.py coremltools/models/pipeline.py coremltools/models/tree_ensemble.py coremltools/models/utils.py coremltools/models/ml_program/__init__.py coremltools/models/ml_program/compression_utils.py coremltools/models/nearest_neighbors/__init__.py coremltools/models/nearest_neighbors/builder.py coremltools/models/neural_network/__init__.py coremltools/models/neural_network/builder.py coremltools/models/neural_network/flexible_shape_utils.py coremltools/models/neural_network/optimization_utils.py coremltools/models/neural_network/printer.py coremltools/models/neural_network/quantization_utils.py coremltools/models/neural_network/spec_inspection_utils.py coremltools/models/neural_network/update_optimizer_utils.py coremltools/models/neural_network/utils.py coremltools/optimize/__init__.py coremltools/optimize/coreml/__init__.py coremltools/optimize/coreml/_config.py coremltools/optimize/coreml/_post_training_quantization.py coremltools/optimize/coreml/_quantization_passes.py coremltools/optimize/coreml/_utils.py coremltools/optimize/coreml/experimental/__init__.py coremltools/optimize/coreml/experimental/_config.py coremltools/optimize/coreml/experimental/_model_debugger.py coremltools/optimize/coreml/experimental/_post_training_quantization.py coremltools/optimize/coreml/experimental/_quantization_passes.py coremltools/optimize/torch/__init__.py coremltools/optimize/torch/_logging.py coremltools/optimize/torch/_typing.py coremltools/optimize/torch/base_model_optimizer.py coremltools/optimize/torch/optimization_config.py coremltools/optimize/torch/_utils/__init__.py coremltools/optimize/torch/_utils/dist_utils.py coremltools/optimize/torch/_utils/fsdp_utils.py coremltools/optimize/torch/_utils/graph_utils.py coremltools/optimize/torch/_utils/k_means.py coremltools/optimize/torch/_utils/math_utils.py coremltools/optimize/torch/_utils/metadata_utils.py coremltools/optimize/torch/_utils/python_utils.py coremltools/optimize/torch/_utils/registry.py coremltools/optimize/torch/_utils/report_utils.py coremltools/optimize/torch/_utils/state_dict_utils.py coremltools/optimize/torch/_utils/torch_utils.py coremltools/optimize/torch/_utils/transforms.py coremltools/optimize/torch/_utils/validation_utils.py 
coremltools/optimize/torch/_utils/version_utils.py coremltools/optimize/torch/layerwise_compression/__init__.py coremltools/optimize/torch/layerwise_compression/_quant.py coremltools/optimize/torch/layerwise_compression/algorithms.py coremltools/optimize/torch/layerwise_compression/input_cacher.py coremltools/optimize/torch/layerwise_compression/layerwise_compressor.py coremltools/optimize/torch/palettization/__init__.py coremltools/optimize/torch/palettization/_custom_conversion.py coremltools/optimize/torch/palettization/_efficient_kmeans.py coremltools/optimize/torch/palettization/_fake_palettizer_tensor_hook.py coremltools/optimize/torch/palettization/_partitioner.py coremltools/optimize/torch/palettization/_supported_modules.py coremltools/optimize/torch/palettization/_utils.py coremltools/optimize/torch/palettization/fake_palettize.py coremltools/optimize/torch/palettization/palettization_config.py coremltools/optimize/torch/palettization/palettizer.py coremltools/optimize/torch/palettization/post_training_palettization.py coremltools/optimize/torch/palettization/sensitive_k_means.py coremltools/optimize/torch/pruning/__init__.py coremltools/optimize/torch/pruning/_base_pruner.py coremltools/optimize/torch/pruning/_base_pruning_method.py coremltools/optimize/torch/pruning/_utils.py coremltools/optimize/torch/pruning/magnitude_pruner.py coremltools/optimize/torch/pruning/pruning_scheduler.py coremltools/optimize/torch/quantization/__init__.py coremltools/optimize/torch/quantization/_annotation_config.py coremltools/optimize/torch/quantization/_backend_config.py coremltools/optimize/torch/quantization/_backend_config_utils.py coremltools/optimize/torch/quantization/_configure.py coremltools/optimize/torch/quantization/_coreml_quantizer.py coremltools/optimize/torch/quantization/_coreml_quantizer_utils.py coremltools/optimize/torch/quantization/_qconfig_mapping.py coremltools/optimize/torch/quantization/_utils.py coremltools/optimize/torch/quantization/post_training_quantization.py coremltools/optimize/torch/quantization/quantization_config.py coremltools/optimize/torch/quantization/quantizer.py coremltools/optimize/torch/quantization/modules/__init__.py coremltools/optimize/torch/quantization/modules/conv_transpose.py coremltools/optimize/torch/quantization/modules/conv_transpose_fused.py coremltools/optimize/torch/quantization/modules/fused_modules.py coremltools/optimize/torch/quantization/modules/observers.py coremltools/optimize/torch/quantization/modules/qat_modules.py coremltools/optimize/torch/quantization/modules/quantized_modules.py coremltools/proto/ArrayFeatureExtractor_pb2.py coremltools/proto/AudioFeaturePrint_pb2.py coremltools/proto/BayesianProbitRegressor_pb2.py coremltools/proto/CategoricalMapping_pb2.py coremltools/proto/ClassConfidenceThresholding_pb2.py coremltools/proto/CustomModel_pb2.py coremltools/proto/DataStructures_pb2.py coremltools/proto/DictVectorizer_pb2.py coremltools/proto/FeatureTypes_pb2.py coremltools/proto/FeatureVectorizer_pb2.py coremltools/proto/GLMClassifier_pb2.py coremltools/proto/GLMRegressor_pb2.py coremltools/proto/Gazetteer_pb2.py coremltools/proto/Identity_pb2.py coremltools/proto/Imputer_pb2.py coremltools/proto/ItemSimilarityRecommender_pb2.py coremltools/proto/LinkedModel_pb2.py coremltools/proto/MIL_pb2.py coremltools/proto/Model_pb2.py coremltools/proto/NamedParameters_pb2.py coremltools/proto/NearestNeighbors_pb2.py coremltools/proto/NeuralNetwork_pb2.py coremltools/proto/NonMaximumSuppression_pb2.py 
coremltools/proto/Normalizer_pb2.py coremltools/proto/OneHotEncoder_pb2.py coremltools/proto/Parameters_pb2.py coremltools/proto/SVM_pb2.py coremltools/proto/Scaler_pb2.py coremltools/proto/SoundAnalysisPreprocessing_pb2.py coremltools/proto/TextClassifier_pb2.py coremltools/proto/TreeEnsemble_pb2.py coremltools/proto/VisionFeaturePrint_pb2.py coremltools/proto/WordEmbedding_pb2.py coremltools/proto/WordTagger_pb2.py coremltools/proto/__init__.py coremltools/test/__init__.py coremltools/test/utils.py coremltools/test/api/__init__.py coremltools/test/api/test_api_examples.py coremltools/test/api/test_api_visibilities.py coremltools/test/blob/__init__.py coremltools/test/blob/test_weights.py coremltools/test/ml_program/__init__.py coremltools/test/ml_program/test_compression.py coremltools/test/ml_program/test_utils.py coremltools/test/modelpackage/__init__.py coremltools/test/modelpackage/test_mlmodel.py coremltools/test/modelpackage/test_modelpackage.py coremltools/test/neural_network/__init__.py coremltools/test/neural_network/test_compiled_model.py coremltools/test/neural_network/test_custom_neural_nets.py coremltools/test/neural_network/test_model.py coremltools/test/neural_network/test_neural_networks.py coremltools/test/neural_network/test_nn_builder.py coremltools/test/neural_network/test_numpy_nn_layers.py coremltools/test/neural_network/test_quantization.py coremltools/test/neural_network/test_simple_nn_inference.py coremltools/test/neural_network/test_tf_numeric.py coremltools/test/optimize/__init__.py coremltools/test/optimize/api/__init__.py coremltools/test/optimize/api/test_optimize_api.py coremltools/test/optimize/coreml/__init__.py coremltools/test/optimize/coreml/test_passes.py coremltools/test/optimize/coreml/test_post_training_quantization.py coremltools/test/optimize/coreml/test_utils.py coremltools/test/optimize/torch/__init__.py coremltools/test/optimize/torch/conftest.py coremltools/test/optimize/torch/smoke_test.py coremltools/test/optimize/torch/test_api_surface.py coremltools/test/optimize/torch/test_base_optimizer.py coremltools/test/optimize/torch/utils.py coremltools/test/optimize/torch/conversion/__init__.py coremltools/test/optimize/torch/conversion/conversion_utils.py coremltools/test/optimize/torch/conversion/joint/__init__.py coremltools/test/optimize/torch/conversion/joint/test_joint_compression_conversion.py coremltools/test/optimize/torch/conversion/palettization/__init__.py coremltools/test/optimize/torch/conversion/palettization/test_palettization_conversion.py coremltools/test/optimize/torch/conversion/pruning/__init__.py coremltools/test/optimize/torch/conversion/pruning/test_pruning_conversion.py coremltools/test/optimize/torch/conversion/quantization/__init__.py coremltools/test/optimize/torch/conversion/quantization/test_quantization_conversion.py coremltools/test/optimize/torch/layerwise_compression/__init__.py coremltools/test/optimize/torch/layerwise_compression/test_algorithms.py coremltools/test/optimize/torch/layerwise_compression/test_quant.py coremltools/test/optimize/torch/models/__init__.py coremltools/test/optimize/torch/models/mnist.py coremltools/test/optimize/torch/palettization/__init__.py coremltools/test/optimize/torch/palettization/palettization_utils.py coremltools/test/optimize/torch/palettization/test_palettization_api.py coremltools/test/optimize/torch/palettization/test_palettization_utils.py coremltools/test/optimize/torch/palettization/test_palettizer.py 
coremltools/test/optimize/torch/palettization/test_post_training_palettization.py coremltools/test/optimize/torch/palettization/test_sensitive_k_means.py coremltools/test/optimize/torch/pruning/__init__.py coremltools/test/optimize/torch/pruning/pruning_utils.py coremltools/test/optimize/torch/pruning/test_base_pruner.py coremltools/test/optimize/torch/pruning/test_magnitude_pruner.py coremltools/test/optimize/torch/pruning/test_pruning_scheduler.py coremltools/test/optimize/torch/quantization/__init__.py coremltools/test/optimize/torch/quantization/test_configure.py coremltools/test/optimize/torch/quantization/test_coreml_quantizer.py coremltools/test/optimize/torch/quantization/test_post_training_quantization.py coremltools/test/optimize/torch/quantization/test_quantizer.py coremltools/test/optimize/torch/quantization/test_utils.py coremltools/test/optimize/torch/test_utils/__init__.py coremltools/test/optimize/torch/test_utils/test_fsdp_utils.py coremltools/test/optimize/torch/test_utils/test_k_means.py coremltools/test/optimize/torch/test_utils/test_metadata_utils.py coremltools/test/optimize/torch/test_utils/test_report_utils.py coremltools/test/optimize/torch/test_utils/test_validation_utils.py coremltools/test/pipeline/__init__.py coremltools/test/pipeline/test_model_updatable.py coremltools/test/pipeline/test_pipeline.py coremltools/test/sklearn_tests/__init__.py coremltools/test/sklearn_tests/test_NuSVC.py coremltools/test/sklearn_tests/test_NuSVR.py coremltools/test/sklearn_tests/test_SVC.py coremltools/test/sklearn_tests/test_SVR.py coremltools/test/sklearn_tests/test_categorical_imputer.py coremltools/test/sklearn_tests/test_composite_pipelines.py coremltools/test/sklearn_tests/test_dict_vectorizer.py coremltools/test/sklearn_tests/test_feature_names.py coremltools/test/sklearn_tests/test_glm_classifier.py coremltools/test/sklearn_tests/test_imputer.py coremltools/test/sklearn_tests/test_io_types.py coremltools/test/sklearn_tests/test_k_neighbors_classifier.py coremltools/test/sklearn_tests/test_linear_regression.py coremltools/test/sklearn_tests/test_nearest_neighbors_builder.py coremltools/test/sklearn_tests/test_normalizer.py coremltools/test/sklearn_tests/test_one_hot_encoder.py coremltools/test/sklearn_tests/test_random_forest_classifier.py coremltools/test/sklearn_tests/test_random_forest_classifier_numeric.py coremltools/test/sklearn_tests/test_random_forest_regression.py coremltools/test/sklearn_tests/test_random_forest_regression_numeric.py coremltools/test/sklearn_tests/test_ridge_regression.py coremltools/test/sklearn_tests/test_standard_scalar.py coremltools/test/sklearn_tests/test_utils.py coremltools/test/xgboost_tests/__init__.py coremltools/test/xgboost_tests/test_boosted_trees_classifier.py coremltools/test/xgboost_tests/test_boosted_trees_classifier_numeric.py coremltools/test/xgboost_tests/test_boosted_trees_regression.py coremltools/test/xgboost_tests/test_boosted_trees_regression_numeric.py coremltools/test/xgboost_tests/test_decision_tree_classifier.py coremltools/test/xgboost_tests/test_decision_tree_classifier_numeric.py coremltools/test/xgboost_tests/test_decision_tree_regression.py coremltools/test/xgboost_tests/test_decision_tree_regression_numeric.py././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726511965.0 coremltools-8.0/coremltools.egg-info/dependency_links.txt0000644000000000000000000000000114672075535022601 0ustar00rootroot ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 
mtime=1726511965.0 coremltools-8.0/coremltools.egg-info/requires.txt0000644000000000000000000000011614672075535021131 0ustar00rootrootnumpy>=1.14.5 protobuf>=3.1.0 sympy tqdm packaging attrs>=21.3.0 cattrs pyaml ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726511965.0 coremltools-8.0/coremltools.egg-info/top_level.txt0000644000000000000000000000001414672075535021260 0ustar00rootrootcoremltools ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1726511965.2935474 coremltools-8.0/setup.cfg0000644000000000000000000000004614672075535014320 0ustar00rootroot[egg_info] tag_build = tag_date = 0 ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1726508431.0 coremltools-8.0/setup.py0000755000000000000000000000671414672066617014225 0ustar00rootroot#!/usr/bin/env python # # Copyright (c) 2017, Apple Inc. All rights reserved. # # Use of this source code is governed by a BSD-3-clause license that can be # found in the LICENSE.txt file or at https://opensource.org/licenses/BSD-3-Clause import importlib.util import os from setuptools import setup, find_packages # Get the coremltools version string coremltools_dir = os.path.join(os.path.dirname(__file__), "coremltools") version_file = os.path.join(coremltools_dir, "version.py") spec = importlib.util.spec_from_file_location("coremltools.version", version_file) version_module = importlib.util.module_from_spec(spec) spec.loader.exec_module(version_module) __version__ = version_module.__version__ README = os.path.join(os.getcwd(), "README.md") long_description = """coremltools =========== `Core ML `_ is an Apple framework that allows developers to easily integrate machine learning (ML) models into apps. Core ML is available on iOS, iPadOS, watchOS, macOS, and tvOS. Core ML introduces a public file format (.mlmodel) for a broad set of ML methods including deep neural networks (convolutional and recurrent), tree ensembles (boosted trees, random forest, decision trees), and generalized linear models. Core ML models can be directly integrated into apps within Xcode. :code:`coremltools` is a python package for creating, examining, and testing models in the .mlmodel format. In particular, it can be used to: - Convert trained models from popular machine learning tools into Core ML format (.mlmodel). - Write models to Core ML format with a simple API. - Making predictions using the Core ML framework (on select platforms) to verify conversion. More Information ---------------- - `coremltools user guide and examples `_ - `Core ML framework documentation `_ - `Machine learning at Apple `_ License ------- Copyright (c) 2020, Apple Inc. All rights reserved. Use of this source code is governed by the `3-Clause BSD License `_ that can be found in the LICENSE.txt file. 
""" setup( name="coremltools", version=__version__, description="Community Tools for Core ML", long_description=long_description, author="Apple Inc.", author_email="coremltools@apple.com", url="https://github.com/apple/coremltools", packages=find_packages(), package_data={ "": [ "_core.*.so", # kmeans1d "libcoremlpython.so", "libmilstoragepython.so", "libmodelpackage.so", "LICENSE.txt", "README.md", ] }, install_requires=[ "numpy >= 1.14.5", "protobuf >= 3.1.0", "sympy", "tqdm", "packaging", "attrs>=21.3.0", "cattrs", "pyaml", ], classifiers=[ "Development Status :: 5 - Production/Stable", "Intended Audience :: Developers", "Operating System :: MacOS :: MacOS X", "Operating System :: POSIX :: Linux", "Programming Language :: Python :: 3.7", "Programming Language :: Python :: 3.8", "Programming Language :: Python :: 3.9", "Programming Language :: Python :: 3.10", "Programming Language :: Python :: 3.11", "Topic :: Scientific/Engineering", "Topic :: Software Development", ], license="BSD", )